Department of Computer Science
 Rutgers University


SLiQ: Simple linear inequalities based Mate-Pair reads filtering and scaffolding

Scaffolding is an important subproblem in de novo genome assembly, in which mate-pair data are used to construct a linear sequence of contigs separated by gaps. A set of simple linear inequalities (SLIQ), derived from the geometry of contigs on the line, can be used to predict the relative positions and orientations of contigs from individual mate-pair reads and thus produce a contig digraph. The SLIQ inequalities can also filter out unreliable mate pairs, so they can serve as a pre-processing step for any scaffolding algorithm. This tool filters mate pairs and then produces a directed contig graph (contig digraph). We also provide a naive scaffolder that produces scaffolds from the contig digraph.
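To make the filter-then-graph idea concrete, here is a minimal Python sketch. It is not the SLIQ inequalities themselves, which this page does not state, and it is not the interface of the distributed scripts. As a stand-in it assumes a forward-reverse ("innie") library, rejects a mate pair when the gap it implies between two contigs is implausibly negative, and keeps an edge only when enough surviving pairs agree on it. All names and parameters (MatePair, predict_edge, build_contig_digraph, insert_size, read_len, slack, min_support) are hypothetical.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Optional, Tuple

class MatePair(NamedTuple):
    contig_a: str  # contig hit by read 1
    pos_a: int     # 0-based leftmost coordinate of read 1 on contig_a
    fwd_a: bool    # True if read 1 aligns to the forward strand
    contig_b: str  # contig hit by read 2
    pos_b: int     # 0-based leftmost coordinate of read 2 on contig_b
    fwd_b: bool    # True if read 2 aligns to the forward strand

Node = Tuple[str, str]    # (contig name, '+' or '-')
Edge = Tuple[Node, Node]  # first oriented contig precedes the second
FLIP = {"+": "-", "-": "+"}

def canonical(edge: Edge) -> Edge:
    """An edge and its reverse complement describe the same layout; pick one."""
    (a, oa), (b, ob) = edge
    return min(edge, ((b, FLIP[ob]), (a, FLIP[oa])))

def predict_edge(mp: MatePair, lengths: Dict[str, int], insert_size: int,
                 read_len: int, slack: int) -> Optional[Edge]:
    """Return the edge implied by one mate pair, or None if it is filtered out."""
    if mp.contig_a == mp.contig_b:
        return None  # pairs inside one contig say nothing about scaffolding
    # Part of the fragment lying on each contig once both contigs are oriented
    # so that read 1 points forward and read 2 points backward.
    tail_a = lengths[mp.contig_a] - mp.pos_a if mp.fwd_a else mp.pos_a + read_len
    head_b = mp.pos_b + read_len if not mp.fwd_b else lengths[mp.contig_b] - mp.pos_b
    gap = insert_size - tail_a - head_b
    if gap < -slack:
        return None  # assumed filter: contigs would have to overlap too much
    oa = "+" if mp.fwd_a else "-"
    ob = "-" if mp.fwd_b else "+"
    return canonical(((mp.contig_a, oa), (mp.contig_b, ob)))

def build_contig_digraph(pairs: Iterable[MatePair], lengths: Dict[str, int],
                         insert_size: int, read_len: int, slack: int,
                         min_support: int = 2) -> Dict[Edge, int]:
    """Count votes per edge and keep edges with at least min_support pairs."""
    votes: Counter = Counter()
    for mp in pairs:
        edge = predict_edge(mp, lengths, insert_size, read_len, slack)
        if edge is not None:
            votes[edge] += 1
    return {edge: n for edge, n in votes.items() if n >= min_support}

if __name__ == "__main__":
    lengths = {"c1": 1000, "c2": 800}
    pairs = [
        MatePair("c1", 800, True, "c2", 100, False),
        MatePair("c1", 750, True, "c2", 50, False),
        MatePair("c2", 100, False, "c1", 800, True),  # same layout, seen from the other strand
        MatePair("c1", 100, True, "c2", 700, False),  # implied gap far too negative
        MatePair("c1", 900, True, "c2", 700, True),   # lone conflicting vote
    ]
    print(build_contig_digraph(pairs, lengths, insert_size=500, read_len=50, slack=100))
```

In this toy input, three pairs agree that c1 precedes c2 in the same orientation, one pair is rejected by the gap check, and one conflicting pair falls below the support threshold. A real filter would use the inequalities described in the publication listed below rather than this simplified check.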

The Python scripts and a 'readme' file containing the instructions are available for download here.

Publications

Roy, Rajat S. Improving genome assembly by identifying reliable sequencing data (2014).