I develop methods to analyze genome sequencing data in the context of other ‘omics and clinical health data to prioritize and functionally interpret genetic variants with roles in human disease. See my CV for details.

Integration of patient phenotype in variant prioritization. Image from our Nat Comm, 2023 paper.

Functional genomics for case-based, “N-of-1” analyses

The genome is a big space, and accurately pinpointing variants that underlie specific human health conditions is a formidable challenge. Traditionally, genes have been treated as black-box functional units, but we now know that individual variants within and between genes can have wildly different impacts on phenotypes and disease. Because comprehensive in vivo (in a living system) functional assessment of all possible genetic variants is often infeasible, we instead turn to in silico (computational) predictions of variant function. We develop integrative tools for assessing the functionality of specific genomic positions and are interested in leveraging multimodal biological and biomedical data to derive new insights into the functional impact of genetic variants. [30535108, 33580225]

Integration of clinical phenotyping

Clinical phenotyping data from patients is essential for interpreting the impact of genetic variants on human health. Phenotyping data can be noisy, unstructured, and difficult to obtain, and using this information often requires deep clinical intuition. We are interested in developing computational approaches that streamline the extraction and curation of standardized phenotype data and leverage this data to improve diagnostic gene prioritization and interpretation. [37828001, medRxiv, bioRxiv]

Deriving insights from population-level analyses

Even though the genome is a big space, it is also a finite space with respect to simple (single nucleotide and short insertion/deletion) variants, which are the most easily detected and interpreted variant class. This means that as the number of sequenced genomes continues to grow, we will begin to see a saturation of all possible simple variants that are compatible with life, as well as recurrence of disease-relevant variants in phenotypically matched patient cohorts. Indeed, the number of sequenced tumor genomes has surpassed tens of thousands; collective cohorts of sequenced Mendelian patients now exceed hundreds of thousands; and ancestrally diverse control cohorts of healthy sequenced individuals are set to pass a million. By jointly analyzing these sequenced cohorts and integrating predicted and experimentally derived variant functionality information, evolutionary signals of selection and constraint, and accurate mutational models, we will have the power to detect extremely rare variants that play roles in human cancers and other diseases. [32711844, bioRxiv]

Publications

:star: = project lead, :love_letter: = corresponding author, :busts_in_silhouette: = team science