Berman JJ, Moore GW. SNOMED-Encoded surgical pathology databases:
a tool for epidemiologic investigation. Modern Pathology
This article was reviewed in an editorial in the following
issue of Modern Pathology.
Silverberg SG. SNOMED-Encoded surgical pathology databases:
's no big deal - or is it?. Modern Pathology 9:953-954, 1996
Jules J. Berman and G. William Moore
Running Title: SNOMED Databases
Pathology departments have invested considerable energy, sometimes extending over several decades, toward coding their anatomic pathology reports. As a result of these labors, there is now a vast amount of electronically coded data from surgical pathology reports, holding a wealth of information relevant to virtually every recognized pathologic entity. The original intent of SNOMED was to prepare population-based disease data from pathology reports, but no such studies have emerged in the medical literature. This is due in part to the non-uniform, idiosyncratic and incomplete manner in which most SNOMED databases are constructed. Automatic (computer-driven) coding provides uniformity and completeness of SNOMED databases, and offers the possibility of customized recoding for an entire collection of reports using any nomenclature and any set of coding algorithms. In prior investigations, we have described a computer program that SNOMED-codes surgical pathology reports, and we have provided an analysis of a large surgical pathology SNOMED database. In this report, we describe the importance of coded surgical pathology databases for research, teaching, hospital administration, and public health, and we explain the functional differences between coded databases and free text collections of surgical pathology data. Surgical pathology departments and vendors of laboratory information systems can ensure that surgical report files can be automatically coded or recoded with any chosen nomenclature by adhering to simple guidelines.
key words: database, SNOMED, epidemiology, nomenclature, coding,
Pathologists constantly rely on epidemiologic data to supplement morphologic data. In many cases, simple demographics point so closely to a lesion's identity that most pathologists would be reluctant to make a diagnosis when a case does not conform to an expected presentation. For instance, a pathologist might hesitate before making the diagnosis of Ewing's tumor in a black patient (1), follicular carcinoma of the thyroid in a high school student (2), or liposarcoma in an infant (3). In general, pathologists represent incidence data as a percentage of received specimens. Examples are: glioblastomas comprise 55% of intracranial gliomas occurring in patients of all ages (4); synovial sarcoma comprises 5% to 10% of soft tissue malignancies in patients under the age of 20 years (5); and medullary carcinoma of thyroid comprises 10% of all primary thyroid tumors (2). Epidemiologists report incidence data as the number of new cases diagnosed in a given year per age-adjusted population, often stratifying their data by age and sex (6). The difference between data collected by a pathologist and data collected by an epidemiologist is that the pathologist's data relate the incidence of one lesion to the incidence of other lesions, whereas an epidemiologist's data relate the number of patients with a lesion to the number of persons in a population group.
The disadvantages of traditionally acquired epidemiologic data have received considerable attention, and introductory texts in epidemiology often discuss these problems. Annual data for the entire U.S. population have been collected since 1935 by the Vital Statistics Program of the National Center for Health Statistics. The data are taken from death certificates, and account for more than 99% of U.S. deaths (7). Death certificate data are notoriously error-prone, and the problems seem to extend beyond national borders, as a similar set of complaints has been voiced in the United States and the United Kingdom (8-12). The most common error occurs when a mode of death is listed as the cause of death (e.g., cardiac arrest, cardiopulmonary arrest), thus nullifying the potential value of the death certificate (13). Another problem related to the use of death certificates is their proven lack of agreement with autopsy data (10).
A recent survey of 49 national and international health atlases has shown that there is virtually no consistency in the way that death data are presented (14). Perhaps the best epidemiologic data are derived from registries targeted for a specific disease, such as the NCDB (National Cancer Database) and SEER (Surveillance, Epidemiology and End Results) databases. The NCDB data consist of tumor registry data collected from 1071 hospitals, and include 409,000 cases from 1990, comprising 39% of the total expected U.S. incidence of cancer (15). SEER data are collected from tumor registries and account for about 10% of the U.S. population (7). Tumor registrars follow the cases, so that clinical outcome can be correlated with specific treatment protocols. Needless to say, disease registries entail enormous expense, and produce data related to a narrow disease definition that may ignore common morphologic and conceptual themes that are important in pathogenesis. For instance, the researcher interested in demographic information pertaining to small cell carcinoma of lung may not be satisfied with demographics that blend information on all cancer types arising in the lung. Furthermore, a researcher interested in small cell carcinoma as a pathologic concept (i.e., as an entity that can arise from many different anatomic sites, including lung) would have little use for epidemiologic data limited exclusively to pulmonary cases. Finally, a researcher interested in obtaining tissue (e.g., from paraffin blocks) needs to examine a database that links diagnoses to surgical pathology reports, and gains nothing from examining population statistics.
THE CODED SURGICAL PATHOLOGY DATABASE
Surgical pathology databases offer many advantages over other forms of disease registries: 1) all biopsied disease entities are included in the database, representing every category of biopsied disease (e.g., metabolic/toxic, traumatic, genetic/congenital, neoplastic, degenerative, inflammatory, infectious); 2) specimens can be characterized not only by diagnosis but also by descriptive terminology that may relate to prognostic or treatment categories; 3) database entries correspond to archived material (glass slides and paraffin blocks) that can be recovered for research purposes; and 4) preparing and coding reports is an established activity of surgical pathology departments, and the resulting coded databases do not require a large additional investment in personnel or funding.
Most pathology departments use their coded databases for the sole purpose of case look-up. Arguably, case look-up could be done better through a simple natural language search (word match) than through a SNOMED search (16). It should be noted that there are fundamental differences between searching an electronic listing of surgical pathology reports for a diagnostic entry and performing a database analysis of a SNOMED coded surgical pathology database.
A SNOMED database might be contained in an electronic file consisting of thousands of records, each containing a surgical pathology report number followed by a private and unique patient identifier (encrypted by the institution or by the pathology department), by a listing of SNOMED numbers, and by demographic information. Such databases could be distributed and merged with other databases without violating patient privacy. In such a database, it would be possible to perform searches that relate sets of data with other sets of data. Examples might include compiling the occurrence frequency of every pathologic entity in the database, compiling a list of every pathologic entity not included in the database, or stratifying every neoplastic entity in the database by age of incidence. Studies that relate classes of data with other classes of data simply cannot be done by natural language searches. In addition, free text collections of surgical pathology reports cannot be distributed without violating patient privacy, even when the patient identifiers are stripped or encrypted. This is because a free text report contains highly specific information that may permit an unscrupulous person to disclose a patient's name (e.g., specific time and date of the procedure, date of birth of the patient, clinical history, name of the performing surgeon, and precise descriptions of the specimen that might include identifying features).
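As an illustration only, the record layout and the set-against-set queries described above might be sketched in Python as follows. The field names, the encrypted-key format, and the code values (for example "M-EX1") are placeholders invented for this sketch; they are not taken from any SNOMED edition or from any existing database.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class CodedRecord:
    report_id: str       # surgical pathology report number
    patient_key: str     # encrypted patient identifier (hypothetical format)
    codes: tuple         # codes assigned to the report (placeholder values)
    birth_year: int
    sex: str
    procedure_year: int


def entity_frequencies(records):
    """Count the number of reports that carry each code."""
    counts = Counter()
    for rec in records:
        counts.update(set(rec.codes))  # count a code once per report
    return counts


def absent_entities(records, nomenclature):
    """List the codes of a nomenclature that never occur in the database."""
    seen = {code for rec in records for code in rec.codes}
    return sorted(set(nomenclature) - seen)


def stratify_by_age(records, code, width=10):
    """Tally reports carrying `code` by age band at the time of the procedure."""
    bands = defaultdict(int)
    for rec in records:
        if code in rec.codes:
            age = rec.procedure_year - rec.birth_year
            lower = (age // width) * width
            bands[(lower, lower + width - 1)] += 1
    return dict(bands)


# Three made-up records, for demonstration only.
records = [
    CodedRecord("S94-0001", "k1", ("T-EX1", "M-EX1"), 1930, "M", 1994),
    CodedRecord("S94-0002", "k2", ("T-EX1", "M-EX2"), 1951, "F", 1994),
    CodedRecord("S94-0003", "k3", ("T-EX2", "M-EX1"), 1938, "F", 1994),
]
print(entity_frequencies(records))
print(absent_entities(records, ["M-EX1", "M-EX2", "M-EX3"]))
print(stratify_by_age(records, "M-EX1"))
```

Each query operates on the whole set of records at once, which is the kind of analysis that a word-match search over individual free text reports does not provide.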
THE SYSTEMATIZED NOMENCLATURE OF MEDICINE (SNOMED) AND THE INTERNATIONAL CLASSIFICATION OF DISEASES (ICD)
SNOMED consists of a list of terms and concepts in medicine and pathology (19). The original intent of SNOMED was to serve as a means of preparing population-based disease data (20). The most telling indicator of the difficulties associated with analyzing surgical pathology databases is the absence of published reports of organized data summaries encompassing all diagnostic entities encountered in a SNOMED database. The lack of such studies underscores the failure of pathology departments to satisfy the intended goals of indexed coding. According to Cote and Robboy, current systems of disease nomenclature and classification are directly descended from earlier classifications (beginning with the London Bills of Mortality in the early 1700's), created to determine the prevalence of diseases in a population (20). Cote, a principal investigator in the development of SNOMED, suggests that a coding system should serve the needs of the entire health care system and provide data for epidemiologic studies and medical audit (20, 21). To our knowledge, other than our own study (26), no SNOMED-encoded database has ever been used in any epidemiologic study.
The ICD (International Classification of Diseases) is the coding system of choice for the World Health Organization (WHO), the U.S. Government Health Care Finance Administration (HCFA), tumor registries, the SEER project, and the NCDB (22). ICD has been used in many important epidemiologic studies. These published studies have been produced by governmental agencies or by large, funded institutions and have not, for the most part, been the work of pathology departments.
The ICD is not copyrighted. The codes can be freely used and distributed in any publication. A CD-ROM copy of ICD is available through the U.S. Government printing office for $19 (to cover costs of copying and distribution). SNOMED is owned and copyrighted by the College of American Pathologists. SNOMED codes are nondistributable, and cannot be used without paying a license fee to the College of American Pathologists (19). SNOMED has recently come down in price. A single-user license now costs about $300.
The ICD is not designed by pathologists. It contains terms that pathologists would seldom use, and excludes terms that pathologists often use. ICD is a classification of disease with every disease entity fitting into one of 1000 categories. Within each category, diseases are further subclassified by qualifiers (e.g., tuberculosis with extrapulmonary involvement) that make computerized ICD encoding difficult. SNOMED is constructed by pathologists. The latest version of SNOMED is called SNOMED International, and corresponds to SNOMED version 3 (or the third edition of SNOMED). SNOMED International contains about 132,000 terms and includes human as well as comparative (veterinary) nomenclature (23). Computer algorithms for automatic SNOMED encoding are easy to write and modify, and are available from commercial vendors of anatomic pathology software and from the public domain (24-30).
SNOMED is used in a variety of dialects. SNOMED II is incompatible with SNOMED International, although translation tables between the two systems will soon be available on the Internet. The official SNOMED microglossaries lack most of the terms in the full nomenclature. Most vendors truncate their licensed versions without hesitation for memory and speed considerations. Since the ICD is a classification, not a list of terms, it does not lend itself to truncation, and registries tend to use the complete ICD classification.
Examples of other coding systems include CPT (Current Procedural Terminology, copyrighted by the American Medical Association), and the Read System, used primarily in Great Britain (31).
HUMANS CANNOT CODE CONSISTENTLY OR ACCURATELY
The reason that humans cannot code relates to the complexity of coding (SNOMED International has 18 axes and over 130,000 terms). Humans are inconsistent, idiosyncratic, and prone to errors. Studies of coding accuracy show human coding error rates in the range of 10%-15% (32). These studies have divided manual coding errors into five types: (1) Factually correct but unhelpful codes (e.g., coding all benign lesions as `negative for tumor'); (2) Inconsistent codes (coding `dysplasia' on Monday and `atypia' on Tuesday); (3) Idiosyncratic codes (using a mnemonic for a lesion, often inscrutable to other people, such as coding all fungal infections as "fungus ball," under the morphology axis, rather than taking the time to assign a specific code from the infection axis, and remembering that the now private code "fungus ball" must be used for any future fungal searches); (4) Entry errors (e.g., entering `lipoma' when one intends to enter `lymphoma' and accepting the wrong code matched by the software); (5) Incomplete coding due to impatience or laziness.
SNOMED contains ambiguities that impose inherent limitations in human coding. One example is discussed in the SNOMED International Microglossary introduction. The diagnosis of a lesion occurring in the knee can be assigned the topography of Knee Joint T-15720 or Knee T-D9200 (23). In the face of ambiguities, the CAP recommends that each institution establish its own coding conventions. The absence of shared rules of coding would invalidate database merges between different institutions. Another unresolved issue is the syntax of SNOMED itself. There is no stated set of rules for combining parts of a diagnosis that relate a pathologic concept to multiple topographies. Consider the diagnosis, "Adenocarcinoma of prostate metastatic to liver." Which organ (liver, prostate, or both) deserves the topographic code for the specimen? It has been our experience that even when rules of coding are specified and understood by the pathologists, discrepancies in human coding still occur, because different pathologists have different conceptual understandings of morphologic terminology.
AUTOMATIC ENCODING OF SURGICAL PATHOLOGY DATA IS CONSISTENT
The only way of overcoming problems related to the limitations of human coding and the incompatibilities in the wide variety of available nomenclatures is to have the capacity to recode an entire database with any chosen classification or coding algorithm recommended by the parties participating in the database merge. A hospital might wish to merge its data in a regional cooperative study of pathologic material. The member-institutions may decide that all reports be coded according to a specified convention, using the ICD classification. Hospitals that had been coding in SNOMED should be able to comply by automatically recoding their reports.
Automatic coding falls into the general realm of computer translation. Translating a surgical pathology report into SNOMED is not fundamentally different from translating natural languages (e.g., English into German) (24-25,28,30). Computer translators do a fairly good job of coding, and can automatically code any number of reports in a large computer file (24-27). In other words, a department that has never endeavored to code its reports can have all of its reports encoded in a single execution of an automatic encoder. Furthermore, the recall (an informatics term analogous to sensitivity) of the computer generated codes is high (27).
One of the greatest strengths of computer generated code databases is that the algorithms used to construct the databases can be tailored to the intended use of the database. For instance, if the intent of constructing a database is to produce an archive of retrievable cases, then the coding algorithms would be written to permit redundant coding so that cases of a particular kind would be included by almost any subsequent search strategy. For instance, a specimen of skin of face may be diagnosed as an actinic keratosis. For maximal inclusivity one would code under the topographies of skin, face, head and the morphologies of actinic keratosis, solar elastosis, dysplasia, reactive atypia, and perhaps even squamous carcinoma and carcinoma in situ. On the other hand, if a database were being prepared to establish incidence data, then constructing an algorithm that chooses a best-match single topography and single morphology for each case might be preferable. In this case, a report that always included the diagnosis (or diagnoses) of a specimen as the first line of the diagnostic field would greatly simplify the task of writing an algorithm for returning a highly accurate code. The key point is that there is no single coding strategy that is optimal for all purposes. It is unthinkable to expect human coders to recode entire databases, whereas database recoding is a simple task for a computer.
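The contrast between redundant coding for retrieval and best-match coding for incidence data might be sketched as follows. This is a toy phrase-matching example under stated assumptions, not the coding program described in our prior work: the lexicons, the table of related concepts, and every code value are hypothetical placeholders.

```python
# Hypothetical lexicons mapping phrases to placeholder codes.
TOPOGRAPHY = {"skin of face": "T-EX20", "skin": "T-EX21", "face": "T-EX22"}
MORPHOLOGY = {"actinic keratosis": "M-EX10", "dysplasia": "M-EX11"}

# Hypothetical broader or related concepts, added only in retrieval mode.
RELATED = {"T-EX20": ["T-EX21", "T-EX22"], "M-EX10": ["M-EX11"]}


def match_terms(text, lexicon):
    """Return codes for lexicon phrases found in text, longest phrase first."""
    text = text.lower()
    hits = [(len(phrase), code) for phrase, code in lexicon.items() if phrase in text]
    return [code for _, code in sorted(hits, reverse=True)]


def code_for_retrieval(diagnosis):
    """Redundant coding: every matching code plus its related codes."""
    codes = []
    for lexicon in (TOPOGRAPHY, MORPHOLOGY):
        for code in match_terms(diagnosis, lexicon):
            for candidate in [code] + RELATED.get(code, []):
                if candidate not in codes:
                    codes.append(candidate)
    return codes


def code_for_incidence(diagnosis):
    """Best-match coding: one topography and one morphology, taken from
    the first line of the diagnostic field."""
    lines = diagnosis.strip().splitlines()
    first_line = lines[0] if lines else ""
    topo = match_terms(first_line, TOPOGRAPHY)
    morph = match_terms(first_line, MORPHOLOGY)
    return (topo[0] if topo else None, morph[0] if morph else None)


report = "Skin of face, biopsy: actinic keratosis.\nSolar elastosis is present."
print(code_for_retrieval(report))
print(code_for_incidence(report))
```

Because both functions read the same stored report text, switching from one strategy to the other requires only rerunning the coder, not recoding by hand.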
PROBLEMS ENCOUNTERED WITH SURGICAL PATHOLOGY DATABASES
The most difficult problem encountered with analyzing data in a coded pathology database does not relate to the reliability of codes. The single most difficult obstacle involves distinguishing multiple specimens received for a single lesion. For instance, a basal cell carcinoma may be biopsied once and excised the following week. These two specimens and their corresponding reports should somehow reflect the fact that the patient had only one basal cell carcinoma. In the University of Florida database, there were more than 6500 patients with more than one surgical pathology report (in a database that included 29,127 patients) (26). Some of these multiple reports represented separate procedures accessioned for the same lesion (e.g., a diagnostic biopsy followed by a wide resection). In any database study, a mechanism must be established that eliminates redundant biopsies while preserving database entries for multiple different lesions in an individual. In our prior study, redundant codes were eliminated by preparing a list of all the topography and morphology codes for each patient and eliminating topography-morphology pairs that shared the same first two digits of their topography and morphology codes. The reason for matching only the first two digits of each code was to allow for differences among pathologists in their choice of topography and morphology codes (i.e., idiosyncratic differences in the last three digits). This solution for dealing with code redundancy is just one possibility. The best strategy for dealing with the problems created by report redundancy would be determined by the intended use of the database. Depending on the specific goals of a study, there might be widely different strategies for dealing with the problem of lesion redundancy.
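The two-digit matching rule described above can be sketched in a few lines. This is a reconstruction from the prose description rather than the program used in the prior study; the function names are hypothetical, and the five-digit code values are placeholders that do not come from any SNOMED edition.

```python
def two_digit_prefix(code):
    """'T-11200' -> 'T-11': the axis letter plus the first two digits."""
    axis, digits = code.split("-", 1)
    return f"{axis}-{digits[:2]}"


def distinct_lesions(reports):
    """Keep the first report for each (patient, topography prefix,
    morphology prefix) combination and drop later matches as redundant.

    reports: iterable of (patient_key, topography_code, morphology_code).
    """
    seen = set()
    kept = []
    for patient, topo, morph in reports:
        key = (patient, two_digit_prefix(topo), two_digit_prefix(morph))
        if key not in seen:
            seen.add(key)
            kept.append((patient, topo, morph))
    return kept


# Made-up example: a biopsy and an excision of one lesion in patient p1,
# a second, different lesion in p1, and an unrelated patient p2.
reports = [
    ("p1", "T-11200", "M-22103"),  # biopsy
    ("p1", "T-11950", "M-22100"),  # excision; same two-digit prefixes
    ("p1", "T-67000", "M-31403"),  # different lesion in the same patient
    ("p2", "T-11200", "M-22103"),  # same codes, different patient
]
print(distinct_lesions(reports))
```

In this example the excision is dropped as a repeat of the biopsy, while the second lesion in the same patient and the report for the other patient are preserved.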
Another important problem that would be faced by any user of a surgical pathology database is reliability of the data. For instance, is every lymphoma in the database diagnosed accurately? One of the advantages of drawing data from a large database is that resulting large numbers tend to smooth out inaccuracies from a small number of unreliable contributors. However, the issue of the diagnostic accuracy of database entries can be resolved by reviewing the actual cases. In studies whose conclusions rely on a relatively small number of cases, it might be practical to review the pathologic material for every case. Studies culled from hundreds or thousands of cases may choose sampling techniques to review a predetermined percentage of cases, and then adjust for the sample error rate in the final statistical analysis.
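One simple way to carry out such a sampled review is sketched below. It is an assumption-laden illustration: the function names are hypothetical, the adjustment is a plain point estimate that corrects only for coded cases that fail review (it does not account for missed cases or for sampling uncertainty), and a real study would choose its own statistical method.

```python
import random


def review_sample(case_ids, fraction, seed=0):
    """Draw a reproducible random sample of cases for slide review."""
    case_ids = list(case_ids)
    n = min(len(case_ids), max(1, round(len(case_ids) * fraction)))
    return random.Random(seed).sample(case_ids, n)


def adjusted_count(coded_count, n_reviewed, n_confirmed):
    """Scale the coded case count by the fraction confirmed on review."""
    if n_reviewed <= 0:
        raise ValueError("at least one case must be reviewed")
    return coded_count * n_confirmed / n_reviewed


# Hypothetical numbers: 500 coded cases, 10% reviewed, 45 of 50 confirmed.
to_review = review_sample(range(500), 0.10)
print(len(to_review))
print(adjusted_count(500, 50, 45))
```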
SIX GUIDELINES FOR SUCCESSFUL ENCODING OF DATA
In order to automatically code a collection of surgical pathology reports, six conditions must hold.
1. The department must be able to download all its reports. Many pathology departments cannot download a complete file of their own reports. Commercial laboratory information systems are geared toward showing specified reports (usually for a particular patient), and cannot always be modified to perform a complete data dump for a large segment of raw data.
2. The download must include every report once only. Every report must have a unique identifier, and the reports must be separated in an unambiguous manner. There is no guarantee that a laboratory information system maintains report data in a nonredundant or consecutive file. For instance, reports may be attached to the patient identifier. Patient identifiers may reside in different computer files, depending on the area of the hospital that treats the patient (oncology inpatient data may reside in a different file than dermatology outpatients).
3. Every patient must have a single, unique identifier. If the patient comes to the clinic, and his name is entered as Kieth Johnson, that patient may get a second identifier on his next visit when his name is entered as Keith Jonson. This separation of reports effectively ruins patient linkages in the hospital database. Every time a patient is admitted to the hospital, the admission clerk should ask: Have you been here before? If the patient's current name or other demographics are different from the previous visit, then the computer system should be able to handle aliases. The unique identification of a patient is one of the most important responsibilities of a medical center. Maintaining a single, unique identifier for each patient requires diligence, and it is often done inadequately.
4. Every report must be divided into unambiguous data fields. If the report is entirely free-text, without any computer-readable divider between the different types of information (history, preoperative diagnosis, microscopic diagnosis, age, etc.), then you cannot effectively do any computer-based analysis of the report.
5. Demographics must be included. The year of birth of the patient, the gender of the patient, the date of the procedure, the anatomic site of the tissue biopsied, and the microscopic diagnosis are the basic demographics and diagnostic information needed for any epidemiologic study.
6. Sentences should satisfy a few basic syntax conventions. Sentences should be short and declarative, with an unambiguous sentence terminator; and negated concepts should be unambiguous (28,30). Computer translators must be able to find the end of each sentence in order to separate distinct pathologic concepts. Unfortunately, the period is not an unambiguous marker for the end of a sentence. Periods appear all over reports in honorifics, abbreviations, numbers, etc. (Dr., Mr., Ph.D., P.I.N., $5.25, 4.2). Using the period as the sentence terminator would result in the abrupt separation and loss of terms that would otherwise be connected (e.g., P.I.N. of prostate would be parsed into four sentences: `P.', `I.', `N.', and `of prostate.'). We suggest that the sentence terminator be a period followed either by a carriage-return or by a double-space (33). In addition, negated concepts in a sentence should be unambiguous. We suggest that every sentence containing a negation-word should be entirely negated by the automatic coding algorithm. This would mean that you cannot combine an affirmative statement and a negative statement in the same sentence. For instance, the sentence "The margins are positive for tumor, and the lymph nodes are negative for metastatic carcinoma." should be expressed as, "The margins are positive for tumor. The lymph nodes are negative for metastatic carcinoma." This also emphasizes the importance of accurate spelling. If "negative" is spelled "negtive", then the computer may code the sentence "The lymph nodes are negtive for carcinoma." as a positive case.
Departments that comply with these guidelines need never worry that they have chosen a poor coding nomenclature or that they have devised rules of coding that are unacceptable to oversight agencies, or that they have purchased coding software of poor quality. As long as the electronic file of surgical pathology reports conforms to the described guidelines and is written in clear, unambiguous prose with correctly spelled words, then reports can be coded automatically with any coding algorithm or any code nomenclature. A similar set of guidelines has been proposed for a multi-institutional autopsy database (34).
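The sentence conventions of guideline 6 lend themselves to a short sketch. The regular expression below treats a period as a sentence terminator only when it is followed by a line break, by two or more spaces, or by the end of the text, and a sentence is treated as entirely negated if it contains any negation-word. The list of negation-words is an assumption made for this example, not a published vocabulary.

```python
import re

# Assumed negation-words; a production coder would use its own list.
NEGATION_WORDS = {"no", "not", "negative", "without", "absent"}

# A period ends a sentence only if followed by a line break, by two or
# more spaces, or by the end of the text.
SENTENCE_END = re.compile(r"\.(?:\r\n|\r|\n| {2,}|$)")


def split_sentences(text):
    """Split report text into sentences using the terminator rule above."""
    return [part.strip() for part in SENTENCE_END.split(text) if part.strip()]


def is_negated(sentence):
    """A sentence containing any negation-word is negated as a whole."""
    words = re.findall(r"[a-z]+", sentence.lower())
    return any(word in NEGATION_WORDS for word in words)


text = (
    "P.I.N. of prostate is present.  "
    "The margins are positive for tumor.\n"
    "The lymph nodes are negative for metastatic carcinoma."
)
for sentence in split_sentences(text):
    print(is_negated(sentence), sentence)
```

The abbreviation P.I.N. survives intact because its periods are followed by a letter or by a single space. A misspelled "negtive" would not match any negation-word, so the sentence would be coded as affirmative, which is the failure described in guideline 6.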
USES OF CODED SNOMED DATABASES
Anatomic pathologists are among the most prolific prose writers. Based on output (6 thousand pages (6 MB) each year based on 4000 reports of average length 1.5 pages), anatomic pathologists far exceed the productivity of virtually all professional writers. Before the advent of computer-based hospital information systems, reports were professionally bound into heavy volumes and placed in a prominent location, awaiting scholarly attention. Today, pathology archives are stored on magnetic disks and kept in an anonymous shelf in a computer facility. The complete collection of electronically archived reports is usually not accessible to pathologists. Individual reports can be retrieved, and designated SNOMED searches can be retrieved, but the intact collection of every report issued is often unobtainable. It is our opinion, based on numerous conversations with pathologists over the past several years, that the computer is often an obstacle that blocks pathologists from the products of their own work. This is particularly disturbing in view of the remarkable opportunities available to pathologists who wish to use computers to analyze their own data (26).
Using a SNOMED database, the pathologist can determine the exact occurrence of every entity encountered in the department, and can compare these data with published data, as a means of detecting inconsistent or anomalous data. For instance, upon review of all neoplasm diagnoses, it may be noted that 50% of the adenocarcinomas of lung are designated as bronchioloalveolar type. Comparison with published data would suggest that bronchioloalveolar carcinoma should account for only 4-8% (35) of lung cancer, prompting a review of the microscopic specimens. Specimen review might indicate that the diagnosis of bronchioloalveolar carcinoma is being overused, and that applying stricter diagnostic criteria would bring the incidence down to levels reported at other institutions. On the other hand, if a review of cases shows that the departmental diagnoses are accurate, then a high incidence of bronchioloalveolar carcinomas might become a valid public health issue. In the past, the observation of a higher-than-expected incidence of a particular type of cancer has prompted important epidemiologic discoveries, in the cases of angiosarcoma of liver in tire plant workers (36) and clear cell carcinoma of vagina in women exposed in utero to DES (37). The advantage of the SNOMED database in this instance is that it permits a listing for every pathologic entity, with frequency of occurrence in the overall population or in any subpopulation defined by the demographic information in the database. These demographics would always include gender and age. Some databases would also include race and occupation. The ability to scan every pathologic entity stratified by demographic data can only be achieved with a coded database.
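A departmental audit of this kind reduces to comparing each entity's observed share against a published range. The sketch below is illustrative only: the function name is hypothetical, the departmental counts are invented, and the single published range is the 4-8% figure cited above (35).

```python
def flag_anomalies(observed_counts, total, published_ranges):
    """Return entities whose observed share of `total` falls outside the
    published (low, high) range, together with the observed share."""
    flagged = {}
    for entity, (low, high) in published_ranges.items():
        share = observed_counts.get(entity, 0) / total
        if not (low <= share <= high):
            flagged[entity] = share
    return flagged


# Invented departmental figures: 30 of 100 lung cancers coded as
# bronchioloalveolar carcinoma, against a published range of 4-8%.
published = {"bronchioloalveolar carcinoma": (0.04, 0.08)}
observed = {"bronchioloalveolar carcinoma": 30}
print(flag_anomalies(observed, 100, published))
```

A flagged entity is a prompt for slide review, not a conclusion; as noted above, the discrepancy may reflect either overuse of a diagnosis or a genuine local excess.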
Patient identifiers in a coded pathology database can be encrypted so that the database can be publicly distributed without violating patient privacy. To our knowledge, there are no instances of public distribution of a coded pathology database, and this may relate to medicolegal issues that institutions would prefer to avoid. However, a multi-institutional autopsy database has been proposed, with the intention of depositing autopsy data (stripped of patient identifiers) onto the Internet (34). The research value of such databases could be enormous. For instance, if institutional pathology databases were available on the Internet (in a manner analogous to the availability of GenBank to molecular biologists), a researcher interested in a specific pathologic entity might query these databases to locate available cases. Once the cases were identified, the researcher might request that the institution prepare and mail unstained paraffin sections to a particular laboratory. The institution that provides the database and the glass slides would be permitted to bill the investigator on a per slide basis.
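One way an institution might produce a private and unique patient key is a keyed one-way hash. The sketch below uses HMAC-SHA256 from the Python standard library; this is a substitute technique chosen for illustration, not the encryption method of any particular institution, and the identifier format and key shown are hypothetical. It does not by itself guarantee anonymity, since demographic fields left in a record can still narrow down an individual.

```python
import hashlib
import hmac


def pseudonymize(patient_id: str, secret_key: bytes) -> str:
    """Map a patient identifier to a stable, non-reversible key.

    The same identifier and secret always yield the same key, so a
    patient's reports remain linked; without the secret, the key cannot
    be recomputed from a known identifier.
    """
    digest = hmac.new(secret_key, patient_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()


# Hypothetical identifier and key, for demonstration only.
key = b"institution-held secret"
print(pseudonymize("MRN-0001", key))
```

The secret would be held by the institution, which could then re-identify a case (for example, to retrieve paraffin blocks) by recomputing keys from its own patient list.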
In conclusion, virtually all pathology departments currently save their reports in electronic form. It has been demonstrated that automatic coding of reports is feasible, and that the performance of automatic coders can be improved by using unambiguous prose with simple syntax and correct spelling. In this investigation, we list six guidelines essential for preparing coded pathology databases intended for automatic coding and inter-institutional sharing. Vendors can readily design their anatomic pathology reporting packages in conformity with these guidelines. The availability of coded surgical pathology databases to pathologists and other health care professionals would represent a major advancement for pathologists, and can be attained with a minimal effort.
1. Fraumeni JF, Boice JD: Bone. In: Schottenfeld D, Fraumeni JF (eds), Cancer Epidemiology and Prevention. Saunders, Philadelphia, p 814, 1982
2. Johnson RL, Hartmann WH: The thyroid. In: Silverberg SG (ed), Principles and Practice of Surgical Pathology. Wiley, New York, p 1415, 1983
3. Chung EB: Pitfalls in diagnosing benign soft tissue tumors in infancy and childhood. Pathol Annu 20:323, 1985
4. Rubinstein LJ: Tumors of the nervous system. Armed Forces Institute of Pathology, Washington, D.C. p. 2, 1972
5. Malone M: Soft tissue tumours of childhood. Histopathol 23:203, 1993
6. Bailar JC, Smith EM: Progress against cancer? New Engl J Medicine 314:1226, 1986
7. Frey CM, McMillen MM, Cowan CD, Horm JW, Kessler LG: Representativeness of the surveillance, epidemiology, and end results program data: recent trends in cancer mortality rate. JNCI 84:872, 1992
8. Ashworth TG: Inadequacy of death certification: proposal for change. J Clin Pathol 44:265, 1991
9. Bjornsson J, Jonasson JG, Nielsen GP: The accuracy of death certificates. Lab Inv 66:106A, 1992
10. Kircher T, Nelson J, Burdo H: The autopsy as a measure of accuracy of the death certificate. New Engl J Med 313:1263, 1985
11. Kircher T, Anderson RE: Cause of death: proper completion of the death certificate. JAMA 258:349, 1987
12. Erlander D: Computer data processing of medical diagnoses in pathology. Am J Clin Pathol 63:538, 1975
13. Slater DN: Certifying the cause of death: an audit of wording inaccuracies. J Clin Pathol 46:232, 1993
14. Walter SD, Birnie SE: Mapping mortality and morbidity patterns: an international comparison. Intl J Epidemiology 20:678, 1991
15. Steele GD, Winchester DP, Menck HR: The National Cancer Data Base. Cancer 73: 499, 1994
16. Friedman BA: The impact of new features of laboratory information systems on quality assurance in anatomic pathology. Arch Pathol Lab Med 112:1189, 1988
19. College of American Pathologists: Systematized nomenclature of medicine (SNOMED). College of American Pathologists, Skokie, 1976
20. Cote RA, Robboy S: Progress in Medical Information Management: systematized nomenclature of medicine (SNOMED). JAMA 243:756, 1980
21. Cote RA, Rothwell DJ: The classification-nomenclature issues in medicine: a return to natural language. Medical Informatics 14:25, 1989
22. The International Classification of Diseases, 9th Revision: ICD-9CM, Fourth Edition, U.S. Department of Health and Human Services, Public Health Service, Health Care Financing Administration, U.S. Government Printing Office, 1991
23. Rothwell DJ, Cote RA, Brochu L: The systematized nomenclature of human and veterinary medicine, SNOMED International Microglossary for pathology. College of American Pathologists, Northfield, IL, p 8, 1993
24. Moore GW, Berman JJ: Object oriented controlled vocabulary translator using TRANSOFT + HyperPAD. Symposium for Computer Applications in Medicine 15:973, 1991
25. Moore GW, Berman JJ: Automatic SNOMED Coding. Journal of the American Medical Informatics Association (JAMIA) 1:(Supplement) 225, 1994
26. Berman JJ, Moore GW, Donnelly WH, Massey JK, Craig B: A SNOMED analysis of three years' accessioned cases (40,124) of a surgical pathology department: implications for pathology-based demographic studies. Journal of the American Medical Informatics Association (JAMIA) 1:(Supplement) 188, 1994
27. Moore GW, Berman JJ: Performance analysis of manual and automated Systematized Nomenclature of Medicine (SNOMED) coding. Am J Clin Pathol 101:253, 1994
28. Giere W, Moore GW: Translating English into German using VA File Manager. M Computing, 1:16, 1993
29. Moore GW, Boitnott JK, Miller RE, Eggleston JC, Hutchins GM: Integrated pathology reporting, indexing, and retrieval system using natural language diagnoses. Modern Pathol. 1:44, 1988
30. Moore GW, Wakai I, Satomura Y, and Giere W: TRANSOFT: Medical translation expert system. Artif Intell Med 1:149, 1989
31. Dodd W: Korner, nomenclature and SNOMED. Brit Med J 296:1198, 1988
32. Hall PA, Lemoine NR: Comparison of manual data coding errors in 2 hospitals. J Clin Pathol 39:622, 1986
33. Hutchins GM and the Autopsy Committee of the College of American Pathologists: Practice guidelines for autopsy pathology: autopsy reporting. Arch Pathol Lab Med 119:123, 1995
34. Moore GW, Berman JJ, Hanzlick RL, Buchino JJ, Hutchins GM: A prototype national autopsy data bank: 1625 consecutive fetal and neonatal autopsy facesheets spanning twenty years. Exhibit at College of American Pathologists Conference XXIX: Restructuring Autopsy Practice for Health Care Reform. May 25-26, 1995, Washington, DC.
35. Daly RC, Trastek VF, Pairolero PC, Murtaugh PA, Huang MS, Allen MS, Colby TV: Bronchoalveolar carcinoma: factors affecting survival. Ann Thorac Surg 51:368, 1991
36. Hill RB, Anderson RE: Pathologists and the autopsy. Am J Clin Pathol 95:(Suppl) 42, 1991
37. Herbst AL, Ulfelder H, Poskanzer DC: Association of maternal stilbestrol therapy and tumor appearance in young women. New Engl J Med 284:878, 1971