Department of Computer Science
 Rutgers University

Mixture model based group inference in fused genotype and phenotype data

B. Georgi, M.A. Spence, P. Flodman and A. Schliep

In Studies in Classification, Data Analysis, and Knowledge Organization, Springer, 2007.

The analysis of genetic diseases has classically been directed towards establishing direct links between cause, a genetic variation, and effect, the observable deviation in phenotype. For complex diseases, which are caused by multiple factors and show a wide spread of phenotypic variation, this approach is unlikely to succeed. One example is Attention Deficit Hyperactivity Disorder (ADHD), where phenotypic variations are expected to be caused by the overlapping effects of several distinct genetic mechanisms. The classical statistical models for coping with overlapping subgroups are mixture models, essentially convex combinations of density functions, which allow the inference of descriptive models from data as well as the deduction of groups. An extension of conventional mixtures with attractive properties for clustering is the context-specific independence (CSI) framework. CSI allows for an automatic adaptation of model complexity to avoid overfitting and yields a highly descriptive model.
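
To make the phrase "convex combinations of density functions" concrete, the following is a minimal sketch of a two-component univariate Gaussian mixture. It is an illustration only, not the model from the paper: the weights, means, standard deviations, and the function names `normal_pdf`, `mixture_pdf`, `posteriors`, and `assign_group` are all assumptions chosen for the example, and the CSI extension is not shown.

```python
import math


def normal_pdf(x, mean, std):
    """Density of a univariate normal distribution at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def mixture_pdf(x, weights, means, stds):
    """Mixture density: a convex combination of component densities.

    The weights are assumed to be non-negative and to sum to one.
    """
    return sum(w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds))


def posteriors(x, weights, means, stds):
    """Posterior probability of each component given the observation x."""
    joint = [w * normal_pdf(x, m, s) for w, m, s in zip(weights, means, stds)]
    total = sum(joint)
    return [j / total for j in joint]


def assign_group(x, weights, means, stds):
    """Index of the component with the highest posterior probability."""
    post = posteriors(x, weights, means, stds)
    return max(range(len(post)), key=lambda k: post[k])


# Hypothetical parameters for two overlapping subgroups (illustration only).
WEIGHTS = [0.6, 0.4]
MEANS = [0.0, 4.0]
STDS = [1.0, 1.0]

if __name__ == "__main__":
    for x in (0.5, 2.0, 3.5):
        print(x, mixture_pdf(x, WEIGHTS, MEANS, STDS),
              assign_group(x, WEIGHTS, MEANS, STDS))
```

Assigning each observation to the component with the largest posterior is one common way to deduce groups from a fitted mixture. In practice the parameters would be estimated from data, for example with the EM algorithm, rather than fixed by hand as they are here.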