Books by Jules J. Berman, covers

THE DIRTY LITTLE SECRET
OF THE HEALTH INFORMATICS INDUSTRY:
Without a reliable patient identifier system,
electronic health records (EHRs) are worthless;
And most systems are unreliable

© 2011 Jules J. Berman
Created June 27, 2011


Billions, possibly trillions of dollars will be spent over the next few decades on health information systems that produce electronic health records for patients (replacing paper-based medical charts). The precise cost of the process is impossible to estimate with any certainty, because the effort is so vast. Computer-based record systems are currently being installed in doctors' offices, clinics, hospitals, academic medical centers, and military/government healthcare institutions. Costs mount after a system has been installed, and expensive systems are often replaced when an institution is merged with another institution or when the original system is deemed obsolete or otherwise inadequate. To give you some idea of how expensive these systems are, I was recently told that a certain prominent academic medical institution was considering replacing its old system, for about $600 million!

The short-term goal for the national effort toward healthcare computerization is to provide an electronic health record (EHR) for each treated patient. The long-term goal is inter-operability (merging records across institutions). The assumption is that these goals are desirable for patients, and for society. I understand that there may be people out there who might disagree with this basic assumption, but I can't deal with that issue here. In this article, I'm just going to examine one essential problem: patient identifiers.

When you walk into your doctor's office complaining of a sore throat, you don't really need to identify yourself in any rigorous way. The doctor performs a brief physical exam, asks you a few pertinent questions about your symptoms, drug allergies, and past history of illnesses, and proceeds to the next step in your treatment ("just a viral syndrome, go home and rest", or "take these antibiotics," or "let's wait for the result of this swab"). The doctor knows who you are; you're the patient in the exam room.

The same is true for lab tests. If you want to know if you carry the gene for Huntington's disease, you might prefer to pay for the test in cash, and to tell the laboratory that your name is James T. Kirk, and that your occupation is starship commander. Providing a false ID can preserve the anonymity of a patient. In this case, the primary purpose of establishing an identity for the patient is to distinguish the patient from other patients. It may not make much difference if the provided identity is bogus, just as long as it is temporarily unique (i.e., no other patient named James T. Kirk asks for the same test, the same day, at the same laboratory).

When you're dealing with non-historical clinical transactions, valid patient identification is not always of paramount importance. When you're dealing with historical treatment transactions, wherein the details of the transaction must be communicated to paying agencies, public health agencies, and for which the transaction is added to a patient's permanent EHR, then reliable patient identification is crucial.

Imagine this scenario. You show up for treatment in the hospital where you were born, and in which you have been seen for various ailments over the past three decades. One of the following events transpires:
1. The hospital has a medical record of someone with your name, but it's not you. After much effort, they find another medical record with your name. Once again, it's the wrong person. After much time and effort, you are told that the hospital has no record for you.

2. The hospital has a medical record of someone with your name, but it's not you. Neither you nor your doctor are aware of the identity error. The doctor provides inappropriate treatment based on information that is accurate for someone else, but not for you. As a result of this error, you die.

3. The hospital has your medical record. After a few minutes with your doctor, it becomes obvious to both of you that the record is missing a great deal of information, relating to tests and procedures done recently and in the distant past. Nobody can find these missing records. You ask your doctor whether your records may have been inserted into the electronic chart of another patient or of multiple patients. The doctor does not answer your question.

4. The hospital has your medical record, but after a few moments, it becomes obvious that the record includes a variety of tests done on patients other than yourself. Some of the other patients have your name. Others have a different name. Nobody seems to understand how these records got into your chart.

5. You are informed that the hospital has changed its hospital information system, and your old electronic records are no longer available. You are asked to answer a long list of questions concerning your medical history. Your answers will be added to your new medical chart. You can't answer any of the questions with much certainty.

6. You are told that your electronic record was transferred to the hospital information system of a large multi-hospital system. This occurred as a consequence of a complex acquisition and merger. The hospital in which you are seeking care has not yet been deployed within the information structure of the multi-hospital system and has no access to your record. You are assured that the record has not been lost and will be accessible within the decade.

7. You arrive at your hospital to find that it has been demolished and replaced by a shopping center. Your electronic records are gone forever.

Here's the problem: The much-praised EHR will have no value if you can't guarantee that every patient has a single record that is unique, accessible, complete, uncontaminated (with records of other patients), permanent, and confidential. This cannot be accomplished without a patient identifier system. Hospital information systems alone (even systems that cost $600 million) cannot provide an adequate patient identifier system.

Establishing a reliable patient identifier system is an important task for hospital-based informaticians. The purpose of this opinion piece is to list the features of a patient identifier system, emphasizing the essential role of identifiers in healthcare and biomedical research.

A patient identifier is an alphanumeric string that uniquely identifies the patient. A useful system that employs patient identifiers should have all of the following features.

1. Completeness. Every patient encountered by the hospital*, living or deceased, must be provided with an identifier.

2. Exclusivity. Each patient's identifier must belong to no other patient. Without exclusivity, information on a patient can be scattered among the records of other patients.

3. Uniqueness. A patient can only be assigned one identifier. This means that each patient is registered into the system one time only.

4. Authenticity. It is a terrible mistake to assume that all patients wish to comply with the registration process. A patient may be highly motivated to provide false information to a registrar, or to acquire several different registration identifiers, or to seek a false registration under another person's identity (i.e., commit fraud), or to forgo the registration process entirely. In addition, it is a mistake to believe that honest patients are able to fully comply with the registration process. Language barriers, cultural barriers, poor memory, poor spelling, and a host of errors and misunderstandings can lead to duplicative or otherwise erroneous identifiers. It is the job of the registrar to follow hospital policies that overcome these difficulties.

Registration should be conducted by a trained registrar who is well-versed in the registration policies established by the institution. Registrars may require patients to provide a full legal name, any previously held names (e.g., maiden name), date of birth, and a government-issued photo ID card (e.g., driver's license or photo ID card issued by the department of motor vehicles). In my opinion, registration should require a biometric identifier (e.g., fingerprints, retina scan, iris scan, voice recording, photograph). If you accept the premise that hospitals have the responsibility of knowing who it is that they are treating, then obtaining a sample of DNA from every patient, at the time of registration, is reasonable. The DNA can be used to create a unique patient profile from a chosen set of informative loci, a procedure used by the CODIS system developed for law enforcement agencies. The registrar should document any distinguishing and permanent physical features that are plainly visible (e.g., scars, eye color, colobomas, tattoos).

The common policy of requiring patients to provide a social security number is of questionable merit. It is likely to produce many false numbers, due to entry error, false memory or the intention to deceive. Efforts to reduce errors by requiring patients to produce their original social security cards put an unreasonable burden on the honest patient (who does not happen to carry his/her card) and provide an advantage to the dishonest patient (who can easily forge a card).

Hospitals that compel patients to provide a social security number have dubious legal standing. The social security number was originally intended as a device for validating a person's standing in the social security system. More recently, the purpose of the social security number has been expanded to track taxable transactions (i.e., bank accounts, salaries). Other uses of the social security number are not protected by law. The Social Security Act (Section 208 of Title 42 U.S. Code 408) prohibits most entities from compelling anyone to divulge his/her social security number. Legislation or judicial action may one day stop healthcare institutions from compelling patients to divulge their social security numbers as a condition for providing medical care. Considering the unreliability of social security numbers in the hospital setting, and considering the tenuous legitimacy of requiring patients to divulge their social security numbers as a condition for providing care, a prudently designed medical identifier system will limit its reliance on these numbers.

Neonatal and pediatric identifiers pose a special set of problems for registrars. It is quite possible that a patient born in a hospital and provided with an identifier will return, after a long hiatus, as an adult. An adult should not be given a new identifier when a pediatric identifier was issued in the remote past. Every patient who comes for registration should be matched against a database of biometric data that does not change from birth to death (e.g., fingerprints, DNA).

There are many reasons why identifiers should only be assigned by a trained registrar; authentication tops the list.
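The matching step described above (look for an existing registration before issuing a new identifier) can be sketched in a few lines. This is only an illustration under loud assumptions: the `Registry` class, the `register` method, and the `biometric_key` strings are all hypothetical, and an exact-match key stands in for real biometric matching, which is approximate and far more involved.

```python
import secrets


class Registry:
    """Toy in-memory registry (hypothetical), keyed by a biometric key."""

    def __init__(self):
        # Maps a biometric key to the one identifier issued for that person.
        self._by_biometric = {}

    def register(self, biometric_key):
        """Return (identifier, is_new) for the person with this biometric key."""
        existing = self._by_biometric.get(biometric_key)
        if existing is not None:
            # Returning patient: reuse the identifier issued earlier.
            return existing, False
        # New patient: draw a random identifier no other patient holds.
        identifier = secrets.token_hex(8)
        while identifier in self._by_biometric.values():
            identifier = secrets.token_hex(8)
        self._by_biometric[biometric_key] = identifier
        return identifier, True


registry = Registry()
newborn_id, newborn_is_new = registry.register("fingerprint-key-1")
adult_id, adult_is_new = registry.register("fingerprint-key-1")  # same person, decades later
other_id, other_is_new = registry.register("fingerprint-key-2")
```

In this sketch the returning adult receives the identifier that was issued at birth, and a different patient never receives it.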

5. Aggregation. All documents related to the patient must be included in the patient's electronic chart (i.e., linked to the patient's identifier). This requirement seems obvious, but most institutions have a variety of different services that are not fully integrated into a central information system. It is a disservice to the patient and a corruption of the information system when a bone marrow biopsy report, rendered by a pathologist, and linked to the patient's identifier, becomes part of the patient's electronic chart; but a blood smear, reviewed by hematologists within the department of medicine, does not.

6. Permanence. The identifiers and the associated clinical data must be permanent. When the patient returns to your hospital after 30 years of absence, the record system must be able to access his identifier. When a patient dies, the patient's identifier must be saved. When a new information system is deployed, the old identifiers must be ported into the new system. When a hospital is bought by a multi-institutional healthcare organization, the integrity of the identifiers must be maintained. Even when a hospital ceases to exist entirely, provisions must be made to preserve the medical identifiers and their links to clinical data. In my opinion, a plan to preserve identifiers and records should be part of every patient identifier system; every hospital should set aside money, in an escrow fund, for this purpose.

7. Confidentiality. The alphanumeric character string composing the identifier should not expose the patient's identity. For example, a character string consisting of a concatenation of the patient's name, birthdate, and social security number might serve to identify a patient. It would also serve as a device that could be used to steal a person's identity. Patient identifiers should carry no embedded information, patient-related or otherwise. The best patient identifier is a random string of alphanumerics. An addendum to the patient identifier may consist of another alphanumeric string that identifies the hospital.
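As a minimal sketch of such an identifier, the following draws a random alphanumeric string and appends a separate hospital string. The function names, the 16-character length, the hyphen separator, and the hospital code "H0042" are all assumptions made for illustration; none comes from any standard.

```python
import secrets
import string

# Uppercase letters and digits; nothing here is derived from patient data.
ALPHABET = string.ascii_uppercase + string.digits


def random_identifier(length=16):
    """Return a random alphanumeric string with no embedded information."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def with_hospital_addendum(patient_identifier, hospital_code):
    """Append a separate string that identifies the hospital, not the patient."""
    return f"{patient_identifier}-{hospital_code}"


patient_identifier = random_identifier()
full_identifier = with_hospital_addendum(patient_identifier, "H0042")
```

Randomness alone does not guarantee exclusivity; a registrar's system would still need to confirm that a newly drawn string has not already been issued.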

8. De-identification. Where there is no identification, there can be no de-identification. Scientific research conducted on hospital data almost always requires the preparation of a dataset that is either anonymized (wherein all links to the patient are removed), de-identified (wherein identifiers are coded or removed), or conformant to "minimum necessary" guidelines (wherein private data and data irrelevant to the study are removed).

Attempts to de-identify a poorly identified dataset of clinical information will result in replicative records (multiple records for one patient), mixed-in records (single records composed of information on multiple patients), and missing records (unidentified records lost in the de-identification process).
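The replicative-record problem can be illustrated with a toy coding step. The sketch below is not a real de-identification protocol; the `deidentify` function, the record layout, and the sample identifiers are hypothetical. It simply replaces each identifier with a random study code and shows what happens when one patient holds two identifiers.

```python
import secrets


def deidentify(records):
    """Replace each patient identifier with a random study code.

    `records` is a list of (patient_identifier, finding) pairs. The same
    identifier always maps to the same code, so a patient's records stay
    together only if that patient has exactly one identifier.
    """
    codes = {}
    coded = []
    for identifier, finding in records:
        if identifier not in codes:
            codes[identifier] = secrets.token_hex(8)
        coded.append((codes[identifier], finding))
    return coded


# One patient, registered once.
clean = [("ID-A", "biopsy"), ("ID-A", "blood smear")]
# The same patient, registered twice under different identifiers.
dirty = [("ID-A", "biopsy"), ("ID-B", "blood smear")]

clean_patient_count = len({code for code, _ in deidentify(clean)})
dirty_patient_count = len({code for code, _ in deidentify(dirty)})
```

With clean identifiers the coded dataset shows one patient; with the duplicated registration, the same person appears as two patients and no downstream analysis can tell.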

In selected cases, an Institutional Review Board or Privacy Board may approve research on identified medical data. Nonetheless, research has no value unless the primary data that supports the research conclusions can be freely distributed to the scientific community. The preparation of clinical datasets for public inspection requires a rigorous de-identification process. De-identification would seem to be a necessary step for all medical record research, either as a pre-analytic step or a post-analytic step. Because de-identification requires identification, it seems that research on clinical records cannot proceed in the absence of reliable identifiers.

9. Immutability. The patient identifier must never change. In the event that legacy data is merged into another information system, a new identifier may be assigned, but the original identifiers must be preserved with the new identifier (see 6. Permanence).

Immutability is jeopardized by data tampering. In a cyber attack, the identifiers may be altered, exchanged, erased, or otherwise corrupted. Unanticipated system errors may produce the same effects. A good hospital information system maintains back-up copies and continually tests the information system to detect any discrepancies between the back-up and real-time identifiers.
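A comparison of this kind might look like the following sketch. It assumes, purely for illustration, that both the live system and the back-up can be exported as a mapping from identifier to a digest of the registration data; the `find_discrepancies` function and the sample values are hypothetical.

```python
def find_discrepancies(live, backup):
    """Compare live and back-up maps of identifier -> registration-data digest."""
    live_ids, backup_ids = set(live), set(backup)
    return {
        # In the back-up but gone from the live system (possibly erased).
        "missing": sorted(backup_ids - live_ids),
        # In the live system only; may be legitimate registrations made
        # after the back-up, so these need review rather than alarm.
        "unexpected": sorted(live_ids - backup_ids),
        # Present in both, but the linked registration data differs.
        "altered": sorted(i for i in live_ids & backup_ids if live[i] != backup[i]),
    }


backup = {"K7Q2": "d1", "M9X4": "d2", "T3B8": "d3"}
live = {"K7Q2": "d1", "M9X4": "dX", "Z0Z0": "d9"}
report = find_discrepancies(live, backup)
```

Here `report` flags one erased identifier, one identifier whose linked data changed, and one identifier that the back-up has never seen.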

10. Reconciliation. The process of identifier registration produces patient-related data that can be used to determine whether an identifier from another institution links to the same patient. When the two identifiers are reconciled, the records associated with each identifier can be merged.

In the U.S., reconciliation is needed because U.S. citizens are not provided with a National Patient Identifier. Though the Health Insurance Portability and Accountability Act of 1996 mandated a Unique Individual Identifier for healthcare purposes, the U.S. Congress has since opposed this provision. Specifically, the 1998 Omnibus Appropriations Act (Public Law 105-277) blocked funding for a final standard on unique health identifiers. It seems unlikely that a national patient identifier system will be deployed in the U.S. anytime soon.

In the absence of a national patient identifier, institutions that need to merge records may achieve some measure of success if the institutions have a patient identifier system that complies with the principles described herein. In this case, an identified record in one institution can be unmistakably merged with an identified record in another institution, when both records belong to the same patient. Reconciliation is also useful when an institution is acquired by another institution or when the legacy data from an abandoned information system is merged with the data in a newly deployed information system. In all these instances, reconciliation can only work if all of the institutional systems have good patient identifier systems.

The use of reconciliation techniques has been adapted, quite perversely, for instances wherein an institution seeks to merge records contained in dispersed information systems that have poor patient identifier systems. Despite claims to the contrary, there is no possible way by which information systems with poor identifier systems can be sensibly reconciled.

Consider this example. A hospital has two separate registry systems: one for dermatology cases and another for psychiatry cases. The hospital would like to merge records from the two services. Because of sloppy identifier practices, a sample patient has been registered 10 times in the dermatology system, and 6 times in the psychiatry system, each time with different addresses, social security numbers, birthdates and spellings of the name. A reconciliation algorithm is applied, and one of the identifiers from the dermatology service is matched positively against one of the records from the psychiatry service. Performance studies on the algorithm indicate that the merged records have a 99.8% chance of belonging to the same patient. So what? Though the two merged identifiers correctly point to the same patient, there are 14 (9 + 5) residual identifiers for the patient still unmatched. The patient's merged record will not contain his complete clinical history. Furthermore, in this hypothetical instance, analyses of patient population data will mistakenly attribute one patient's clinical findings to as many as 15 different patients, and the set of 15 records in the corrupted de-identified dataset may contain mixed-in information from an indeterminate number of additional patients! If my analysis seems harsh, consider these words from the HIMSS White Paper (see the reference below):

"A local system with a poorly maintained or "dirty" master person index (MPI) will only proliferate and contaminate all of the other systems to which it links."

11. Documentation and Quality Assurance. A system should be in place to find and correct errors in the patient identifier system. Review procedures should determine whether the errors were corrected properly and measures should be taken to continually improve the patient identifier system. All registration procedures, all reviews of the patient identifier system, all actions taken, and all modifications of the system should be thoroughly documented.

12. Centrality. Whether the information system belongs to a savings bank, an airline, a prison system, or a hospital, identifiers play the central role. You can think of these information systems as a scaffold of identifiers annotated with documents. In the case of a hospital information system, every patient transaction is provided with a patient identifier, and the patient identifier is linked to the original set of validating data that was collected and certified by the registrar.

13. Autonomy. It is tempting to believe that a reliable patient identifier system is something that vendors of information systems provide to clients, as a basic software feature. This is not true. The patient identifier system may interface with the information system, but the identifier process must be designed, deployed, tested and improved by well-trained and vigilant hospital staff. The procedures for deploying the patient identifier system can be put in place before the acquisition of a hospital information system, and they should stay in place after the hospital information system is abandoned. The patient identifier system should work just as effectively with a paper-based record system as with a computerized record system. In my opinion, it is far better to have a good patient identifier system with no computerized records (i.e., a paper-based record system), than to have a computerized record system with corrupted patient identifiers.

Summary. Developing a reliable patient identifier system requires enormous work and diligence, but the effort is justified. Until you have solved the patient identifier problem, everything in medical informatics is difficult. After you have solved the patient identifier problem, medical informatics becomes a bit easier.

For further reading, I recommend the following white paper written by the Healthcare Information and Management Systems Society (HIMSS):

Patient Identity Integrity. A White Paper by the HIMSS Patient Identity Integrity Work Group, December 2009.

* The word "hospital" is used throughout, as a generic term to indicate any healthcare service or organization.

About the author

After receiving two bachelor of science degrees (mathematics and earth sciences from MIT), I entered the graduate program in pathology at Temple University, where I began my thesis work within the Fels Cancer Research Institute. I spent the final year of my graduate studies at American Health Foundation in Valhalla, New York, before beginning my post-doctoral studies in the Laboratory of Experimental Pathology at the U.S. National Cancer Institute. I earned a medical degree from the University of Miami, followed by a pathology residency at George Washington University Medical Center in Washington, D.C. I became Board Certified in Anatomic Pathology and in Cytopathology, and served as the chief of Anatomic Pathology, Surgical Pathology and Cytopathology at the Veterans Administration Medical Center in Baltimore, Maryland. While working at the Baltimore VA Medical Center, I held appointments at the University of Maryland Medical Center and at the Johns Hopkins Medical Institutions. In 1998, I became the Program Director for Pathology Informatics in the Cancer Diagnosis Program at the U.S. National Cancer Institute. In 2006, I became President of the Association for Pathology Informatics. My name has appeared as a co-author on hundreds of scientific contributions, and I have written, as first author, more than 100 publications. Today I am a free-lance author and have written extensively in my three areas of expertise: medical informatics, computer programming, and cancer biology.

A list of my publications is available.

Also, some of my previously published works on the topic of patient identification or de-identification are as follows:

My panelist presentation to HHS workshop on patient de-identification

My post-workshop panelist statement, submitted to HHS

My paper describing a protocol for reconciling patient identifiers without violating patient privacy

My paper on data scrubbing

Key words: hospital information systems, electronic medical record, electronic health record, emr, ehr, medical errors, hospital errors, electronic medical chart, medical informatics, pathology informatics, clinical information systems, healthcare interoperability, healthcare management, medical reports, patient identification, patient privacy, patient confidentiality




Web site: http://www.julesberman.info/
Machiavelli's Laboratory blog site: http://machiavelli-lab.blogspot.com/

