Annotating Genetic Variants for Precision Medicine

Dr. Junwen Wang

Professor of Biomedical Informatics

Center for Individualized Medicine

Department of Health Sciences Research

Mayo Clinic



In the past ten years, genome-wide association studies (GWAS), whole genome sequencing (WGS), and whole exome sequencing (WES) have identified tens of thousands of genetic variants (GVs) that are associated with human diseases. Surprisingly, the majority (>90%) of these GVs are located in regions that do not code for proteins and are currently annotated as variants of unknown significance (VUS). Many of them lie in regions that harbor regulatory elements (such as promoters and enhancers), which affect target gene expression through interaction with transcription factors (TFs). Developing novel methods to detect these GVs and interpret their functions is a major challenge for precision medicine. Mayo Clinic established the Center for Individualized Medicine (CIM) in 2008 to bring genomic discoveries to patient care, and was funded by NIH for the first Precision Medicine Initiative (PMI) project.


I will introduce CIM initiatives in metagenomics, bioinformatics, and pharmacogenomics and how they impact patients. I will then discuss several methods we recently developed for detecting and prioritizing genetic variants:

1) An ensemble method for GV function annotation. The method integrates eight different tools, including CADD, GWAVA, Funseq, GWAS3D, SuRFR, DANN, and fathmm-MKL, using a Bayes factor composite model.

2) A model that combines several epigenetic/chromatin features to improve GV function prediction in a tissue/cell type-specific manner. Benchmarking against multiple independent causal variant datasets, we demonstrated that both methods improve prediction performance.


The two methods are publicly available at




975 West Walnut Street | Medical Research and Library Building, IB 130 | Indianapolis, IN 46202 | (317) 944-3966