STEM I Subpage

Implementation of the Graphlet Screening Method in Genomic Analysis Using Hail

Project Description

This project aims to improve Genome-Wide Association Studies (GWAS), which connect traits (such as eye color or height) to the genes that influence them. GWAS has been used to save lives (National Human Genome Research Institute, 2020). GWAS uses a technique called linear regression to identify the links between traits and genes. Imagine regression as drawing a line of best fit through a set of points on a graph: the goal is to identify a general trend. However, existing gold-standard methods (such as LASSO and L0-regularization) are not accurate enough to fully link traits to their genes. This project therefore seeks to add another algorithm, one better suited to genomic data, to the analysis pipeline. The method is called Graphlet Screening, as described by Jin et al. (2014). Graphlet Screening uses a two-step screen-and-clean procedure to identify these trait-gene associations more accurately. It is being integrated with Hail, a software tool that has already been used to analyze genomic data. The goal is a more accurate understanding of complex diseases, which could in turn help save lives.
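To make the screen-and-clean idea concrete, here is a minimal sketch in plain Python on synthetic data. It is a simplified stand-in, not the Graphlet Screening method of Jin et al. (2014), which screens small groups of correlated variables rather than one variable at a time, and it does not use Hail. The function names, the thresholds (`screen_cut`, `clean_cut`), and the simulated genotypes are all assumptions made for illustration. Variants 3 and 11 truly affect the trait, while variant 4 is a decoy that mostly copies variant 3.

```python
import random
import statistics


def standardize(values):
    """Center a list to mean 0 and scale it to standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return [(v - mean) / sd for v in values]


def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def screen_and_clean(columns, trait, screen_cut=0.15, clean_cut=0.3):
    """Simplified two-stage selection (hypothetical helper, not Graphlet Screening).

    Expects standardized variant columns and a standardized trait.
    """
    n = len(trait)
    # Screen: keep variants whose one-at-a-time correlation with the trait is large.
    screened = [j for j, col in enumerate(columns)
                if abs(dot(col, trait) / n) > screen_cut]
    # Clean: refit the survivors together (least squares via the normal
    # equations) and drop those whose joint coefficient is small.
    gram = [[dot(columns[i], columns[j]) for j in screened] for i in screened]
    rhs = [dot(columns[i], trait) for i in screened]
    coefs = solve_linear_system(gram, rhs)
    selected = [j for j, b in zip(screened, coefs) if abs(b) > clean_cut]
    return screened, selected


def draw_genotype(freq=0.3):
    """Simulated count of minor alleles (0, 1, or 2) for one person."""
    return (random.random() < freq) + (random.random() < freq)


random.seed(0)
n_samples, n_variants = 400, 20
genotypes = [[draw_genotype() for _ in range(n_samples)]
             for _ in range(n_variants)]
# Variant 4 copies variant 3 about 80% of the time, so it looks associated
# with the trait even though it has no effect of its own.
genotypes[4] = [g if random.random() < 0.8 else draw_genotype()
                for g in genotypes[3]]
columns = [standardize(col) for col in genotypes]
# The trait depends only on variants 3 and 11, plus random noise.
trait = standardize([columns[3][i] + columns[11][i] + random.gauss(0, 1)
                     for i in range(n_samples)])

screened, selected = screen_and_clean(columns, trait)
print("passed screen:", screened)
print("kept after clean:", selected)
```

The decoy shows why the second step matters: tested on its own, variant 4 tracks the trait because it resembles variant 3, but once both are fitted together its coefficient shrinks toward zero and the clean step discards it.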

Grant Proposal / Research Plan



Project Notes

