Informatics for Pathology-based Specimen Resources. Workshop Summary.

On May 13 and 14, 1999, the Resources Development Branch held a workshop in Annapolis, Maryland, entitled "Informatics for Pathology-based Specimen Resources." Participants included pathologists, informaticians, computer scientists, lawyers, software vendors, statisticians, and government scientists from NIH, the VA, the Armed Forces Institute of Pathology, and the CDC. The meeting began with three lectures designed to focus discussion and provide common concepts and language for the participants: an overview of medical informatics, by Dr. James Cimino; an update on current activities in pathology informatics, by Dr. Michael Becich; and a discussion of distributed network query models, by Joe Futrelle. Breakout groups addressed the following areas: uses for pathology informatics systems; legal and ethical issues; network technical issues; political, marketing, and economic issues; model systems for pathology informatics; and data and tissue availability. Each breakout group presented a summary of its discussion to the assembled participants, and time was allotted for participants to discuss the issues raised by the individual breakout groups.


Pathology informatics has served primarily to provide clinicians with access to electronic pathology reports for their patients. Participants appeared to agree implicitly that additional efforts are needed to expand the role of pathology informatics so that pathology data and specimens become more available to support research. Pathologists were enthusiastic about becoming involved in this effort and about providing pathology resources for it. Themes that emerged repeatedly during the workshop were how to share specimens and data, how to standardize data (so that information from multiple sources can be usefully merged), how to ensure patient confidentiality, and how to provide appropriate incentives and compensation.


There was agreement on the desirability of institutions using a common or standardized report format, or at least preparing reports that include all the data necessary for critical text elements to be automatically extracted and reassembled into a standard report format. Reports should contain the information indicated by practice guidelines (such as those published by professional organizations), and data should be entered using common terms. Furthermore, a method (perhaps automated) is needed to convert the data from standard reports into a standard coding nomenclature (e.g., SNOMED or CPT).

The participants cautioned that standardization does not always achieve the desired results. The Bethesda reporting system for cytopathology was cited as an example. Despite its simple text approach and wide use, individual institutions have not entered data uniformly or have entered it inaccurately, producing datasets that cannot be easily used for interlaboratory comparisons. Large multi-institutional systems, such as that used by the VA, offer further examples of data structure without data rigor. While the VA has one of the best structured medical data systems in existence, with uniform data dictionaries implemented in over 160 institutions, data input at the local level is sometimes incomplete or inaccurate (e.g., uncorrected misspellings and incomplete or absent text field entries).

Despite their reservations, participants enthusiastically endorsed the development of reporting standards, to provide common entry of text at all institutions. There was further consensus that most stakeholders in the field of informatics (professional societies, accrediting agencies, researchers, etc.) would support standardization of pathology reports. As an incentive, third party payers, including the government, might choose to provide a bonus for pathology reports adhering to a standard reporting format.


The breakout session on "Legal and ethical issues" agreed that, while the legal and ethical issues related to pathology informatics are beginning to be addressed, many remain unresolved. These include specimen ownership, data ownership, and permissible charges for data or tissue access. A further complication is that the requirements for use of specimens and data by grant-supported researchers differ from those for commercial research using the same materials. Many of the ethical and legal guidelines currently under consideration apply only to federally funded researchers. Arrangements between commercial organizations are currently free from certain restrictive regulations that apply to federal agencies and federally funded researchers.

The group suggested that it would be important to capture consent status, along with any specific restrictions on that consent, in the electronic pathology record. Informatics initiatives should tie every specimen and data record to the current and past conditions placed on use of the specimen and data. Most importantly, there must be a way to tag specimen reports to indicate that consent has been withdrawn. This alone is a new informatics activity that should be considered in developing standardized electronic pathology reports.
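
The consent-tracking requirement described above can be illustrated with a minimal sketch. The workshop did not propose a schema; the class names, field names, and status values below are hypothetical. The sketch keeps every consent event attached to the specimen record, so that both current and past conditions remain available and a withdrawal overrides earlier permission.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class ConsentEvent:
    """One change in the conditions placed on a specimen (hypothetical schema)."""
    effective: date
    status: str  # assumed values: "granted" or "withdrawn"
    restrictions: List[str] = field(default_factory=list)


@dataclass
class SpecimenRecord:
    """A specimen report tied to its full consent history."""
    specimen_id: str
    consent_history: List[ConsentEvent] = field(default_factory=list)

    def record(self, event: ConsentEvent) -> None:
        # Append rather than overwrite, so past conditions stay attached.
        self.consent_history.append(event)

    def current(self) -> Optional[ConsentEvent]:
        # The most recent event by effective date governs present use.
        if not self.consent_history:
            return None
        return max(self.consent_history, key=lambda e: e.effective)

    def may_use(self, purpose: str) -> bool:
        # No consent on file, or withdrawn consent, blocks every use.
        latest = self.current()
        if latest is None or latest.status != "granted":
            return False
        return purpose not in latest.restrictions


rec = SpecimenRecord("S-001")
rec.record(ConsentEvent(date(1998, 3, 1), "granted", ["commercial"]))
print(rec.may_use("academic"))    # permitted under the original consent
rec.record(ConsentEvent(date(1999, 5, 1), "withdrawn"))
print(rec.may_use("academic"))    # blocked once consent is withdrawn
```

Storing the history as an append-only list is one possible design; it preserves the audit trail that the group asked for, at the cost of having to compute the current status on each lookup.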

The potential usefulness of cancer registries was discussed at length. Forty-nine of the 50 states have cancer registries, and cancer registries are required by law to collect patient data and to make it available for research. In the past, pathology data was summarized by the Registrars during chart review and entered into cancer registry records. Sufficient informatics support may allow Registrars to automatically acquire complete pathology data linked to specimens. Because that data would be collected under the legal mandate of the registries, many restrictions that usually apply to the collection of patient data would be obviated.

There was considerable enthusiasm for an initiative by the NCI to involve the SEER Program and the CDC in developing new mechanisms to help cancer registries collect information, including pathology reports as well as information generated outside of the hospital setting, such as recurrence and treatment data. This would greatly expand the value of both existing cancer registries and existing pathology data by improving the clinical information that is collected by registrars. Efforts in this area should focus on developing standardized report formats and the software tools needed to merge cancer registry data with the large pathology datasets.


Joe Futrelle opened this discussion with a lecture on network query models. In general, a network query involves client software that requests information via the internet and middleware that receives the query and translates it into a standard format, with tags showing the identity or institution of origin, or the security clearance, of the client that initiated the query. The client query, processed by the middleware, is delivered to a target institution. The institution reads the query and determines whether or not to answer it and what, if any, information to include in the reply. If the query is approved, the institution's own search engine requests the data from the institutional database, and a reply is created and formatted according to a standardized reply format (such as XML). This process is repeated at each institution that has agreed to participate. All replies are transmitted to the middleware query agent, with patient identifiers removed or encrypted. The query agent prepares a report and delivers it to the client. These steps can be done automatically and virtually instantaneously.
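
The query flow described above can be sketched in a few lines. This is an illustration only, not the system presented at the workshop: the institution names, record fields, clearance labels, and function names are all hypothetical, and in-memory dictionaries stand in for institutional databases and network transport. A truncated SHA-256 hash stands in for "removed or encrypted" identifiers; a real system would need a keyed or salted scheme, since plain hashes of short identifiers can be guessed.

```python
import hashlib
import xml.etree.ElementTree as ET

# Hypothetical in-memory stand-ins for institutional databases.
INSTITUTIONS = {
    "hospital_a": {
        "accepts": {"academic"},
        "records": [
            {"patient_id": "A-17", "diagnosis": "adenocarcinoma", "site": "lung"},
            {"patient_id": "A-22", "diagnosis": "melanoma", "site": "skin"},
        ],
    },
    "hospital_b": {
        "accepts": {"academic", "commercial"},
        "records": [
            {"patient_id": "B-03", "diagnosis": "adenocarcinoma", "site": "colon"},
        ],
    },
}


def tag_query(diagnosis, origin, clearance):
    """Middleware step: put the client request into a standard tagged form."""
    return {"diagnosis": diagnosis, "origin": origin, "clearance": clearance}


def answer(name, institution, query):
    """Institution step: decide whether to reply, then search locally."""
    if query["clearance"] not in institution["accepts"]:
        return None  # the institution declines to answer this client
    reply = ET.Element("reply", institution=name)
    for rec in institution["records"]:
        if rec["diagnosis"] == query["diagnosis"]:
            case = ET.SubElement(reply, "case", site=rec["site"])
            # Replace the patient identifier with an opaque token.
            digest = hashlib.sha256(rec["patient_id"].encode()).hexdigest()
            case.set("token", digest[:12])
    return reply


def run_query(diagnosis, origin, clearance):
    """Query agent step: poll each institution and assemble one report."""
    query = tag_query(diagnosis, origin, clearance)
    report = ET.Element("report", origin=origin)
    for name, institution in INSTITUTIONS.items():
        reply = answer(name, institution, query)
        if reply is not None:
            report.append(reply)
    return report


report = run_query("adenocarcinoma", "univ_x", "academic")
print(ET.tostring(report, encoding="unicode"))
```

Each institution keeps control of its own decision to answer, which mirrors the model described in the lecture: the middleware only carries the tagged query and collects the replies.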

Standards already exist for conducting queries among multiple databases over the internet. Two of these query models were discussed in some detail: Z39.50 and XML, the latter of which is likely to emerge as the best format for conducting such queries. The workshop participants agreed that defining software requirements was not as important as defining the data structure. They noted that developing standard data models is the most important thing that the pathology community could do. This would require community agreement not only on the names of the data objects, but also on the meanings of those objects. For instance, when "Cause of Death" is selected as the data object, those entering the data must agree on the meaning of that concept. If a man who smoked for 30 years develops lung cancer and dies with metastatic disease, the cause of death could be "lung cancer," or it could be "long-term cigarette abuse." While the choices are completely different, both apply, and either might legitimately appear as the Cause of Death. Consistent rules for assigning data elements will be needed.
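
One way to picture a data object that carries an agreed meaning as well as a name is the sketch below. The definition text and the list of allowed values are invented for illustration; the workshop did not settle on any particular definition of "Cause of Death." The point is only that the definition and the assignment rule travel with the element, so that every institution applies the same rule at data entry.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class DataElement:
    """A named data object with its agreed meaning (hypothetical structure)."""
    name: str
    definition: str
    allowed: FrozenSet[str]

    def validate(self, value: str) -> str:
        # Reject entries that fall outside the agreed vocabulary.
        if value not in self.allowed:
            raise ValueError(f"{value!r} is not a valid {self.name}")
        return value


# Assumed rule for illustration: record the disease, not the exposure.
CAUSE_OF_DEATH = DataElement(
    name="Cause of Death",
    definition="The disease that led to death; exposures are recorded elsewhere.",
    allowed=frozenset({"lung cancer", "colon cancer"}),
)

print(CAUSE_OF_DEATH.validate("lung cancer"))
try:
    CAUSE_OF_DEATH.validate("long-term cigarette abuse")
except ValueError as err:
    print(err)
```

Under a different community agreement, the same structure could just as well admit the exposure and exclude the disease; what matters is that the choice is made once and shared.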


Pathology departments offering new services, such as informatics and specimen-related support, should be compensated for providing those services. As an example, one primary reason for the long and dramatic decline in autopsy rates has been the inability of most pathology departments to find mechanisms for adequate monetary compensation for autopsies. There was general agreement that economic considerations have made it increasingly difficult for pathologists to provide tissue specimens to researchers. In addition, some pathology departments are discarding archived tissues after 3-5 years of storage. Since pathology departments are reluctant to undertake new responsibilities without a defined compensation mechanism, they are unlikely to assume the burden of providing archived tissue with accompanying clinical data reported in a standardized format without an additional economic incentive.

In addition to specimen handling costs, preparation of reports in a way that supports informatics initiatives will also require incentives. Third parties might pay a bonus for reports that are well-coded using CPT, SNOMED or other standard terminologies that permit more accurate billing, better data analysis, improved outcome studies, etc. Advocacy groups might also be motivated by the potential to improve research and patient care to provide additional support to the development of the informatics needed to improve pathology reports. It was also suggested that laboratory information system vendors might invest in informatics initiatives that appear likely to add value to their services.

There was enthusiastic agreement that pathology informatics could add significant value to pathology reports as well as facilitate better research support. Some examples of improved reporting include: adding images in reports, coding data to make reports more useful to third parties, using the Bethesda system and other reporting standards in a more consistent manner, and permanently and securely associating tissue specimens with pathology data.


1. There needs to be a way to associate specimens with clinical data, and to do so at multiple institutions.

There was general consensus that pathology data has value, that the value of data cannot be realized until the data is queried, and that data from multiple institutions is needed to provide the demographic richness and specimen quality needed for translational research efforts. Any such system must securely protect patient privacy and data confidentiality.

2. NCI needs to convey its research priorities to pathology departments.

There was a strong feeling among pathologists that NCI could play a valuable role in helping to define those research areas that would benefit from specimen-based informatics efforts. There was some reluctance to embark on a massive informatics undertaking unless pathologists knew that their efforts coincided with NCI's interests.

3. There is a need for further discussion of the professional contributions of pathologists and informaticians to research.

There was concern that collaborative research activities might undervalue the contribution of those pathologists or informaticians who prepare specimens or data for researchers. Researchers, institutions, and pathology departments should re-examine the professional role played by specimen providers and informatics experts. Involvement of specimen providers and informatics experts in the design of experimental protocols will be increasingly necessary as specimen-related pathology datasets become available.

4. Standardized pathology reports are needed.

There was complete agreement that NCI should be involved in efforts to define common data elements, to standardize pathology reports, and to link the data to the specimens that are needed by cancer researchers. Further, NCI should cooperate with advocacy groups, professional organizations, and laboratory information system vendors to help devise appropriate incentives for pathology departments to cooperate in report standardization and the development of specimen-based databases.

5. Cancer Registrars need to include pathology report data in Registry records.

There was considerable enthusiasm for the NCI to cooperate with other governmental efforts, such as the registry programs of SEER and the CDC, to improve the data that is routinely collected by cancer registries to support research.

Jules Berman, Ph.D., M.D.
Program Director, Pathology Informatics
Resources Development Branch
Cancer Diagnosis Program, DCTD, NCI, NIH
EPN 700
6130 Executive Blvd
Rockville, MD 20892
voice: 301-496-7147
fax: 301-402-7819