Image Credit: Science Learning Hub – Pokapū Akoranga Pūtaiao, University of Waikato, www.sciencelearn.org.nz
Background: Traditional genome-wide association studies (GWAS) look for relationships between a phenotype and genotypes. They have usually assayed only a small number of phenotypes and at most a small number of environmental factors to use as covariates. Some covariates are easy to measure, such as age, gender, and income. Others are much more challenging, such as diet, exercise, or exposure to poor air quality, so they often go unmeasured and contribute noise to the GWAS signals.
Hypothesis: With the growing availability of databases such as the UK Biobank or university hospital systems, many more phenotypes and traits are assayed for each individual. While environmental exposures are not measured directly, it may be possible to infer environmental components using unimportant traits, and then use the inferred components as covariates for a target trait or phenotype. Doing this should reduce environmental noise and increase power.
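To make the hypothesis concrete, here is a minimal simulation sketch. It is not the inference method the project developed; it substitutes plain principal component analysis (via SVD) as a stand-in, and every name and parameter in it (simulate, infer_components, snp_effect, the effect sizes, the sample size) is a hypothetical choice made for illustration. It also assumes, as a simplification, that the auxiliary traits have no genetic component of their own.

```python
import numpy as np


def simulate(n=2000, n_aux=50, seed=0):
    """Simulate one SNP, one hidden environmental factor, and many traits.

    All effect sizes below are arbitrary illustrative values.
    """
    rng = np.random.default_rng(seed)
    g = rng.binomial(2, 0.3, size=n).astype(float)  # genotype (0, 1, or 2 copies)
    env = rng.normal(size=n)                        # environment, never observed
    loadings = rng.normal(size=n_aux)               # how strongly each trait feels it
    aux = np.outer(env, loadings) + rng.normal(size=(n, n_aux))  # auxiliary traits
    y = 0.1 * g + 1.0 * env + rng.normal(size=n)    # target trait
    return g, env, aux, y


def infer_components(aux, k=1):
    """Top-k principal component scores of the auxiliary traits (PCA stand-in)."""
    centered = aux - aux.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :k] * s[:k]


def snp_effect(y, g, covariates=None):
    """Ordinary least squares estimate and standard error of the SNP effect."""
    cols = [np.ones_like(g), g]
    if covariates is not None:
        cols.extend(covariates.T)  # one column per inferred component
    X = np.column_stack(cols)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    cov = sigma2 * np.linalg.inv(X.T @ X)
    return beta[1], np.sqrt(cov[1, 1])


g, env, aux, y = simulate()
pcs = infer_components(aux, k=1)

beta_raw, se_raw = snp_effect(y, g)       # no environmental covariate
beta_adj, se_adj = snp_effect(y, g, pcs)  # inferred component as covariate

print(f"unadjusted: beta={beta_raw:.3f}, se={se_raw:.3f}")
print(f"adjusted:   beta={beta_adj:.3f}, se={se_adj:.3f}")
```

In this toy setting the inferred component tracks the hidden environmental factor closely, so including it as a covariate shrinks the standard error of the SNP effect, which is the sense in which power should increase.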
I mentored two BIG Summer students at UCLA on this project in Summer 2019. They developed the inference method and produced simulation studies showing that the method should work. The next steps are to scale the simulations up to realistic sizes, and then to apply the method to real data from the UK Biobank or the Finnish cohort.