OptumLabs Data Science Internship Program

University of California, San Francisco
Institute for Computational Health Sciences
Summer 2018

Program summary

This program will appeal to students and trainees who wish to apply their data science training to biomedicine and health care, an area ripe for innovation.

The UCSF Institute for Computational Health Sciences (ICHS) administers the OptumLabs Data Science Internship Program (“Program”). The Program aligns with UCSF’s core Research Computing Capability initiative, in particular by providing Natural Language Processing (NLP) tools to retrieve information from free-form text in clinical notes, and by performing and optimizing that information retrieval. Clinical notes and free-form textual publications are increasingly seen as critical sources of data for biomedical and health research, practice, and training.

Some of the challenges that must be addressed to unlock the full potential of these data sources are:

  • Ensuring patient privacy via data de-identification
  • Extracting and structuring information into computable data
  • Standardizing data formats to ensure interoperability
  • Integrating other specialty data capture systems
  • Automating pipelines for data extraction

UCSF is currently in the second year of a five-year program to build this infrastructure and extract information from unstructured data sources in support of our mission.

Intern Opportunities

UCSF seeks two to three students from across the UC campuses to participate in the Program. During the summer internship period (extendable part-time into the fall by mutual agreement), students will carry out specific projects that build on existing efforts to create the infrastructure supporting this information extraction, and will perform that extraction on real clinical notes. Specific tasks include:

  • Evaluate NLP tools to extract information from free-form text in the clinical notes
  • Enhance, as needed, these NLP tools
  • Mine the clinical records of our 54M patients
  • Develop a standardized structure and pipeline for the extracted information

Desired Background

  • Substantive coursework in Data Science and Computer Science, specifically related to Information Retrieval at scale
  • Coursework in or demonstrated interest in human biology or pre-med fields
  • Undergraduate rising juniors and seniors, and graduate students, will be given preference

Internship Details

  • Interns are expected to work full-time (40 hrs/wk) for three to four months between June and September (specific schedule and dates flexible)
  • Interns will be based at University of California, San Francisco, Mission Bay Campus
  • Some work may be done remotely, but interns are expected to be physically present for meetings throughout the internship period
  • Housing will not be provided
  • Interns must be eligible to work in the US during the internship period
  • Interns will be paid at a rate of $29/hr plus benefits

To Apply

Please send a cover letter and CV to the Program Contact, and include “OptumLabs Internship” in the subject line:

Angela Rizk-Jackson, PhD
Director of Operations
UCSF Institute for Computational Health Sciences