Department of Computer Science
 Rutgers University

CSIMixtures: Context-specific independence mixture modeling for sequence motifs

The modeling and analysis of sequence motifs is a central task in the elucidation of biological processes such as gene regulation. The choice of model class is crucial for obtaining a representation of the motif that suits the biological application. For instance, previous studies showed that for transcription factors which bind to divergent binding sites, mixtures of multiple position weight matrices (PWMs) increase performance. However, estimating a conventional mixture, with a separate distribution for every component at every position, will in many cases cause overfitting. We avoid this problem by employing a context-specific independence (CSI) framework. In CSI mixtures, model complexity is automatically adapted to match the variability found in a given data set.
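The idea of adapting complexity per position can be illustrated with a minimal sketch. It assumes that a CSI structure can be represented, for each motif position, as a grouping of mixture components in which components of the same group share one distribution. All names and numbers here (`weights`, `csi_structure`, `emission`, `likelihood`, `free_parameters`, and the toy probabilities) are hypothetical and chosen for illustration; this is not the CSIMixtures implementation, and it does not show how the structure is learned from data.

```python
ALPHABET = "ACGT"

# Hypothetical two-component mixture over a motif of length 3.
weights = [0.6, 0.4]

# One entry per motif position. Each entry lists (component_group, distribution)
# pairs; components in the same group share that distribution at that position.
csi_structure = [
    # Position 0: both components share one distribution.
    [((0, 1), {"A": 0.7, "C": 0.1, "G": 0.1, "T": 0.1})],
    # Position 1: each component has its own distribution.
    [((0,), {"A": 0.1, "C": 0.7, "G": 0.1, "T": 0.1}),
     ((1,), {"A": 0.1, "C": 0.1, "G": 0.1, "T": 0.7})],
    # Position 2: shared again.
    [((0, 1), {"A": 0.1, "C": 0.1, "G": 0.7, "T": 0.1})],
]


def emission(component, position, symbol):
    """Probability of `symbol` at `position` under the given component."""
    for group, dist in csi_structure[position]:
        if component in group:
            return dist[symbol]
    raise ValueError("component not covered by the structure at this position")


def likelihood(seq):
    """Mixture likelihood of a sequence whose length equals the motif length."""
    total = 0.0
    for k, w in enumerate(weights):
        p = w
        for j, s in enumerate(seq):
            p *= emission(k, j, s)
        total += p
    return total


def free_parameters():
    """Emission parameters of the CSI structure vs. a conventional mixture.

    Mixture weights are excluded because they are the same in both cases.
    """
    per_dist = len(ALPHABET) - 1
    csi = sum(len(groups) for groups in csi_structure) * per_dist
    full = len(weights) * len(csi_structure) * per_dist
    return csi, full
```

In this toy structure only the middle position distinguishes the two components, so the CSI model needs fewer emission parameters than a conventional mixture of the same size, which is the sense in which sharing distributions reduces the risk of overfitting.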

Another application of the CSI mixture framework is the clustering of protein families: subgroups are inferred and specificity-determining residues are predicted simultaneously from a multiple sequence alignment of the family. A Dirichlet mixture prior based on nine basic chemical properties of the standard amino acids is used to regularize the structure learning for protein domain data. Evaluation of the method on several well-studied families revealed good clustering performance and ample biological support for the predicted positions.



Publications

Georgi, B., Schultz, J., and Schliep, A. Partially-supervised protein subclass discovery with simultaneous annotation of functional residues (2009)

Georgi, B. and Schliep, A. Partially-supervised context-specific independence mixture modeling (2007)

Georgi, B., Spence, M. A., Flodman, P., and Schliep, A. Mixture model based group inference in fused genotype and phenotype data (2007)

Georgi, B., Schultz, J., and Schliep, A. Context-Specific Independence Mixture Modelling for Protein Families (2007)

Georgi, B. and Schliep, A. Context-specific independence mixture modeling for positional weight matrices (2006)

Contact: Benjamin Georgi (georgi@molgen.mpg.de).