Download 2003 Tissue Microarray Specification:
The tissue microarray data exchange specification: A community-based, open source tool for sharing tissue microarray data.
BMC Med Inform Decis Making. Accepted 23 May 2003
*This letter summarizes the Standards Session of the May 30, 2001 AIMCL
Tissue Microarray Infostructure Workshop*
The AIMCL ("Automated Information Management in the Clinical Laboratory") is
hosted each year by the University of Michigan and is organized by Dr. Bruce
Friedman. This year, Bruce hosted a special workshop on Tissue Microarray
(TMA) Infostructure on May 30. I didn't do a head-count, but there seemed
to be at least 50 (maybe as many as 70) people in attendance. Many of the
attendees represented commercial interests.
During the morning session, several TMA investigators described the ongoing
research projects in their laboratories (Mark Rubin, U of Michigan; Steve
Bova, Johns Hopkins; David Rimm, Yale; Matt van de Rijn, Stanford).
The afternoon discussion shifted to TMA data exchange standards. I gave a
short lecture on the importance of standards for data exchange and argued
that, at present, each laboratory handles its tissue microarray data
idiosyncratically, leaving researchers unable to exchange TMA datasets.
If there were a standard for the exchange of TMA data, researchers could:
1. Share their TMA data with collaborators in other laboratories.
2. Submit their TMA data to journals or to data repositories in a standard
format.
3. Merge their TMA data with other TMA datasets.
4. Update their TMA datasets with accrued data from related datasets (e.g.
specimen repositories that have collected patient follow-up data related to
tissue cores included in the TMA)
5. Validate TMA experimental data (by comparing experimental data elements
with the corresponding data elements produced by other researchers using the
same TMA block or the same TMA tissue cores).
6. Support distributed dataset queries (by searching over common data
elements found in different available datasets, including tissue repository
datasets and gene array datasets)
7. Extend the value of any dataset by allowing a community of researchers to
perform data analyses that may not have been anticipated at the time that
the TMA was designed.
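
As a rough illustration of items 3 and 4 above, here is a minimal Python
sketch of merging two datasets that share a core identifier. The field names
(core_id, stain_a, followup_months) and the sample records are hypothetical,
made up only for this example; no standard discussed at the workshop defines
them.

```python
# Hypothetical records from two sources that refer to cores from the
# same TMA block. Field names and values are illustrative assumptions.
lab_a = [
    {"core_id": "blockX-01", "stain_a": "positive"},
    {"core_id": "blockX-02", "stain_a": "negative"},
]
repository = [
    {"core_id": "blockX-02", "followup_months": 36},
    {"core_id": "blockX-03", "followup_months": 12},
]


def merge_by_core(*datasets):
    """Combine records that share a core_id into one record per core."""
    merged = {}
    for dataset in datasets:
        for record in dataset:
            # Start a new entry for an unseen core, then add this
            # record's fields to whatever is already known about it.
            merged.setdefault(record["core_id"], {}).update(record)
    return merged


merged = merge_by_core(lab_a, repository)
```

The merge only works because both sources use the same name and the same
values for the core identifier, which is the kind of agreement a common
standard would provide.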
The key to all of these efforts is the concept of DATA SHARING. Sharing may
be altruistic (placing your TMA data into the public domain), commercial (I
will share my data with you if you pay me the required fee), or professional
(you can have my data if I'm included as an author on the paper).
During my presentation, and in the group discussion afterward, the following
general qualities of a TMA standard were discussed:
1. The standard should be free and non-proprietary.
2. The standard should be self-descriptive. Anyone reviewing a TMA file
should be able to precisely determine how the data is organized by reading
the data tags included in the file.
3. The standard should, when feasible, use publicly available common data
elements linked to a web site that fully defines each common data element
included in the standard (needed to support dataset-independent distributed
network queries). This means that the committee that creates the TMA
standard must work with other standards committees to ensure cross-database
compatibility of common data elements.
4. The standard should be generic (able to describe any laboratory's TMA
data).
5. The standard should be extensible. This means that there will need to be
a standards committee that can make changes in the standard over time and
that can keep a documented history of modifications in the standard.
6. The standard should be easy to implement. It should be relatively easy
for a programmer to translate any commercial TMA dataset into the TMA
standard (and to reverse the process)
7. The standard should not be a requirement. The committee that creates the
standard should take no measure to require laboratories to implement the
standard. Those using the standard would be able to choose the data that
is included in their shared datasets (e.g. they may choose to withhold or
encrypt patient
8. The standard should have community buy-in. Laboratories, commercial
vendors, pathology organizations, government agencies, and other standards
committees should all have the opportunity to comment on the standards.
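
Item 7 above could look something like the following Python sketch, in which
a laboratory drops some fields and disguises others before sharing a record.
This is only one possible approach under stated assumptions: the function
name, field names, and salt are hypothetical, and a salted one-way hash
(SHA-256) stands in here for the encryption mentioned in the discussion.

```python
import hashlib


def prepare_for_sharing(record, withheld, hashed, salt):
    """Return a copy of record that is safer to share.

    Fields named in `withheld` are dropped. Fields named in `hashed` are
    replaced by a salted SHA-256 digest, so the same input still maps to
    the same token without exposing the original value.
    """
    shared = {}
    for field, value in record.items():
        if field in withheld:
            continue
        if field in hashed:
            data = (salt + str(value)).encode("utf-8")
            shared[field] = hashlib.sha256(data).hexdigest()
        else:
            shared[field] = value
    return shared


# Hypothetical record; none of these field names come from a standard.
record = {
    "patient_name": "Jane Doe",
    "patient_id": "12345",
    "stain_result": "positive",
}
shared = prepare_for_sharing(
    record,
    withheld={"patient_name"},
    hashed={"patient_id"},
    salt="demo-salt",
)
```

The choice of what to withhold stays entirely with the laboratory; the
standard would only describe whatever fields remain.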
Some of the data elements included in the TMA file standard might be:
Tissue Microarray Header Data
3. Lab of origin
4. Creation date
5. Modification dates
6. Unique identifier
Specific core data:
8. Numbers of cores in array
9. Size of cores in array
10. Core coordinate system
11. Coordinates of cores
14. Core data
Data element (1 of 1000)
a. Surgical pathology specimen (de-identified)
This may link to patient identifier in another dataset
This may link to another core of the same specimen in another
tissue microarray file
Patient identifier may link to clinical/demographic data in one or
b. Code for particular core
d. Results of stain (method identifier)
e. Image of data element
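
To give a feel for how a few of these elements might look in a
self-descriptive file, here is a minimal Python sketch that builds and then
re-reads a small XML document with the standard library. Every tag name,
attribute name, and value in it is a hypothetical placeholder; the workshop
did not define a schema.

```python
import xml.etree.ElementTree as ET

# All tag and attribute names below are assumptions for illustration.
tma = ET.Element("tissue_microarray")

# Header data describing the array as a whole.
header = ET.SubElement(tma, "header")
ET.SubElement(header, "lab_of_origin").text = "Example Lab"
ET.SubElement(header, "creation_date").text = "2001-05-30"
ET.SubElement(header, "unique_identifier").text = "TMA-0001"

# Core data: one core, located by its coordinates in the array.
cores = ET.SubElement(tma, "cores", number_of_cores="1")
core = ET.SubElement(cores, "core", row="1", column="1")
ET.SubElement(core, "specimen_id").text = "SP-DEIDENTIFIED-1"
ET.SubElement(core, "stain_result", method="method-1").text = "positive"

xml_text = ET.tostring(tma, encoding="unicode")

# Reading the file back: the tags alone say what each value means.
parsed = ET.fromstring(xml_text)
described = [(el.tag, el.text) for el in parsed.iter() if el.text]
```

Because each value travels with its tag, a recipient can work out the
organization of the file without separate documentation from the sender.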
There was enthusiasm among the group to continue work towards developing the
standard. There was consensus for writing the standard in XML. There were
no dissenters to the idea of working on drafts of the TMA standard through a
listserver consisting of all the TMA workshop registrants.
The next TMA standards workshop has been organized by Dr. Mary Edgerton
(Vanderbilt) and will be held Oct. 6, 2001 in conjunction with the APIII
meeting in Pittsburgh. If all goes well, before the Pittsburgh meeting, we
should have a rudimentary draft of an XML standard, including comments from
the Ann Arbor registrants.
Jules Berman, Ph.D., M.D.
Program Director, Pathology Informatics
Cancer Diagnosis Program, DCTD, NCI, NIH