I am currently a senior research scientist at DeepMind. I work on deciphering the human genome with machine learning. I lead the AlphaMissense project at DeepMind. My previous work involved modeling RNA splicing and degradation, as well as predicting variant effects for coding and non-coding variants. I received my PhD from the Technical University of Munich (TUM) in Julien Gagneur’s computational biology lab. Please refer to my Google Scholar for a complete list of my publications.

Email: s6juncheng [at] gmail [dot] com


Publications

Co-first and co-corresponding authors are indicated with + and * respectively.

Genetic variant interpretation

Biological Discoveries

Computational Immunology

Bioinformatics & Machine learning


Software

Here is a list of open-source software that I developed or made major contributions to. These tools are typically implementations of machine learning models that originated from research projects.


AlphaMissense

Implementation of the AlphaMissense model.

MMSplice & MTSplice

Predict variant effects on splicing. MMSplice is the winning model of the CAGI5 splicing challenge. MMSplice is also integrated into the popular general-purpose variant effect predictor CADD. MTSplice enhances MMSplice by predicting tissue-specific variant effects. Currently, Muhammed Hasan Çelik and I are maintaining the tool.

ggpval

An R package to add statistical test and P-value annotations to ggplot2 plots. Currently, the user community and I are maintaining the tool.

BERTMHC

A Python package to retrain and predict with the BERTMHC model, a transformer model that predicts binding and presentation of peptides by MHC class II.

DCC

A Python package to detect circRNAs from next-generation sequencing data. Currently, the Dieterich lab is maintaining the tool.


I contributed to the following projects:

kipoi

Kipoi (pronounced: kípi; from the Greek κήποι: gardens) is an API and a repository of ready-to-use trained models for genomics. It currently contains 2133 different models, covering canonical predictive tasks in transcriptional and post-transcriptional gene regulation.

Invited talks

  • Keynote speaker at CHIL 2023: Biological Sequence Modeling in Research and Applications
  • Guest Lecture at Imperial College, December 2023
  • Guest Lecture at Cambridge University Genomic Medicine Society, March 2023
  • Guest Lecture at Vrije Universiteit Amsterdam, Nov 2023
  • Models, Inference & Algorithms (MIA) seminar at Broad Institute, March 2024
  • Oxford ML School, July 2024
  • Keynote at ISMB VarICOSI 2024
  • CZI workshop: Applications of AI to Rare Disease Diagnosis, October 2024