Support continues to broaden for the work of unlocking the potential of clinical record data for research and knowledge building. Recently, it was announced that two separate $100K grants were awarded to UCSF investigative teams through the Silicon Valley Community Foundation (SVCF) to do just that. Under the supervision of ICHS Director Atul Butte, the teams will work to build open source software that would enrich both the information in the clinical records and access to it.
One team, led by Jason Crane, aims to develop software that will ultimately help make MRI images more accessible for inquiry via automated methods. Specifically, the software will take existing information that can be extracted from an image and use it to identify similar images in an image archive. This would automate what is now a time-consuming manual process, with two benefits. First, it would give clinicians an efficient workflow for visualizing longitudinal changes in medical images. Second, it would enrich the information in an image archive through automatic labeling, opening up a whole new capability to perform data-intensive studies on a massive medical image archive. Fortunately, the SVCF saw the potentially transformative impact on the utility of medical images for research, and chose to lend its support.
The other team, led by Beau Norgeot, aims to make all of the information currently stored in electronic health records (EHR) openly available to investigators for research. The key to this effort is de-identification. If all of the individual patients’ personally identifiable information could be removed from the records, while maintaining the essential connections between symptoms, diagnoses, treatments, lab results, and non-identifiable patient descriptors, the EHR could present a treasure trove of knowledge in the form of Big Data.
However, Norgeot’s software development efforts will go beyond simple de-identification to provide access. This team also aims to augment the EHR by extracting information from the free-form text in physicians’ clinical notes, and structuring the information so that it can be utilized in large-scale computational analyses. It is estimated that 80% of patient information is stored in unstructured formats within the EHR, so efforts to extract and provide structure to the data have incredible potential to augment this resource and enable an entirely new realm of biomedical inquiry.