Postdoctoral Research Position: Data Scientist - Biomedical Informatics and Electronic Health Records-Based Research

Location: 
United States
Job Posted Date: 
December 1, 2021
Opportunities: 
Postdoc Positions at UCSF
Population: 
Medicine

The Bakar Computational Health Sciences Institute (BCHSI) is recruiting a Postdoctoral Research Scholar with expertise in biomedical informatics and the analysis of Electronic Health Records (EHR) data. The scholar will oversee the application of modern methodologies to retrospective and prospective datasets to uncover ‘real-world evidence’ on a range of diseases and treatments. The scholar will use several informatics tools, as well as statistical and machine learning methods, to address research questions in the context of ongoing collaborations with industry sponsors and the FDA. Candidates will be expected to perform innovative research using both structured and unstructured EHR data, and may also use publicly available data and resources such as biomedical ontologies.

Successful candidates will be working on real-world data/evidence projects funded by the US Food and Drug Administration (FDA), NIH, and by pharmaceutical and biotech companies.

A primary appointment in BCHSI will be provided, along with an affiliation in the Department of Pediatrics.

BCHSI and UCSF took a leadership role in 2015 in helping organize a central health data warehouse across the entire University of California. The University of California Health System, with 20 health professional schools (6 medical schools), 6 academic health centers, and 12 hospitals, is the tenth largest health system in the United States by revenue. It has now built a secure central data warehouse (UC Health Data Warehouse, or UCHDW) to support operational improvement, promote quality patient care, and enable the next generation of clinical research.

The repository currently securely holds data on over 7 million patients seen since 2012, treated by nearly 100,000 health care providers in over 200 million encounters, with over 560 million procedures, more than 760 million medication orders and prescriptions, and over 2 billion vital signs measurements and test results. Over 600,000 of these patients are primary care patients. The majority of these patients live in California, the most diverse state in the United States.

The data has already been de-identified to enable clinical research projects, under guidance from UC campus institutional review boards, privacy and compliance officers, and information security officers. The data is stored in the Observational Medical Outcomes Partnership (OMOP) vendor-neutral open data model, so a wide range of software tools and computational methods can be used consistently with other state and national efforts. The data is currently centralized and available in a secure Microsoft Azure Databricks cloud environment that enables research and development in artificial intelligence and machine learning. It supports a community of more than 400 researchers in machine learning and artificial intelligence at all levels across the 10 campuses and 3 national labs of the University of California. The successful candidate will work with this unique database.

 

Job Requirements: 

REQUIREMENTS

  • PhD in informatics, computer science, epidemiology, biostatistics, or a related discipline with demonstrated experience or expertise in informatics or data science; or a medical, dental, nursing, or pharmacy degree with demonstrated experience or training in informatics or data science.
  • Strong background in clinical informatics and/or network/mathematical modeling, with a demonstrated interest in health research.

 

PREFERRED QUALIFICATIONS

  • Excellent communication skills and collegiality.
  • Record of peer-reviewed first-author publications in scientific literature.
  • Experience with the use of natural language processing methods, particularly in the clinical domain. Examples include cTAKES, bag-of-words models, and deep learning (e.g. BERT).
  • Experience with the OMOP common data model, SNOMED vocabularies, and cohort study design. Working knowledge of OHDSI is a great advantage. Examples include cohort building using SQL or OHDSI ATLAS, and comprehensive knowledge of R and Spark for statistical computing and large-scale data analysis.
  • Academic software engineering abilities (for example, with Python or R).
  • Experience with the use of clinical vocabularies and ontologies (UMLS, RxNorm, LOINC, SNOMED-CT).
  • Leadership and mentorship qualities, especially with undergraduate and graduate students.
How to Apply: 

Qualified candidates should e-mail a cover letter, curriculum vitae, and contact information for three references to Dr. Atul Butte at [email protected] with “post-doc application” in the subject line of the e-mail.

Location: 
San Francisco
Greater Bay Area
Peninsula
California