
Biomedical Informatics books
by Jules J. Berman

  • Jones & Bartlett sales and informational website for Biomedical Informatics
  • Amazon.com U.S. book site for Biomedical Informatics
  • Full Table of Contents from Library of Congress for Biomedical Informatics
  • List of book-related resources
  • Brief author biography on Association for Pathology Informatics Website
  • Quick link to PubMed listing for Jules J. Berman
  • Full list of Publications for Jules J. Berman
  • Dr. Bruce Friedman's review of Biomedical Informatics
  • Perl Programming for Medicine and Biology companion site
  • Author's blog on data specifications
  • Contact author




  • Abstract submitted to: The Center for Open Source in Government

    Conference: Open Source for National and local eGovernment Programs in the U.S. and EU

    January 2, 2003

    Title: Open Source Confidentiality Methods

    Scientific progress requires the free exchange of research data. Because medical research is often conducted using confidential records, medical researchers have historically refused to share their primary data, thus denying other scientists the opportunity to use these data sets for further research.

    Pressured by federal regulations restricting the use of identified medical records (HIPAA and the Common Rule), and by recent data-sharing proposals from NIH and from publishers, researchers have devised a variety of innovative technical solutions that make it possible to obtain and share large data sets derived from medical records without breaching patient confidentiality. Some of the methods used are: one-way hashing of patient identification fields (such as name and social security number), data scrubbing (removing private information from free-text), threshold splitting (dividing text into multiple files, any one of which can be shared and used for scientific purposes without breaching confidentiality), and data ambiguating (ensuring non-uniqueness of records). With these methods, large medical data sets can be used safely for research without obtaining patient consent and can be shared by the scientific community. These methods and their available open source implementations will be discussed.
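
As an illustration of the first method listed above, one-way hashing of identification fields, here is a minimal Python sketch. It is not taken from the talk or from any of the open source implementations the abstract refers to. The field names (`name`, `ssn`), the `SECRET_KEY` value, the normalization step, and the choice of keyed HMAC-SHA256 are all assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical secret key, assumed to be held only by the data provider.
SECRET_KEY = b"replace-with-a-private-key"


def normalize(value: str) -> str:
    """Collapse case and whitespace so trivial variants hash identically."""
    return " ".join(value.lower().split())


def one_way_hash(identifier: str, key: bytes = SECRET_KEY) -> str:
    """Return a hex digest that stands in for the identifier."""
    message = normalize(identifier).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def deidentify(record: dict, id_fields=("name", "ssn")) -> dict:
    """Copy a record, replacing each identifying field with its hash."""
    return {
        field: one_way_hash(value) if field in id_fields else value
        for field, value in record.items()
    }


# Illustrative record with made-up values.
record = {
    "name": "Jane Q. Public",
    "ssn": "000-00-0000",
    "diagnosis": "adenocarcinoma",
}
shared = deidentify(record)
```

Because the same identifier always yields the same digest, records belonging to one patient can still be linked across data sets after the name and number have been replaced. A key is used in this sketch because an unkeyed hash of a short, structured value such as a social security number could be reversed by hashing every possible value and comparing the results.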

    Speaker Biography:

    Jules Berman is a pathologist/Perl programmer and program director for pathology informatics in the National Cancer Institute's Cancer Diagnosis Program. For the past decade, he has been developing ways to organize, index and share free-text medical data and large heterogeneous biomedical data sets.


    Jules J. Berman, Ph.D., M.D.
    Program Director, Pathology Informatics
    Cancer Diagnosis Program, DCTD, NCI, NIH