Reconstructing tumor phylogenies from heterogeneous single-cell data.

Abstract or Description

Studies of gene expression in cancerous tumors have revealed that tumors presenting indistinguishable symptoms in the clinic can be substantially different entities at the molecular level. The ability to distinguish between these genetically distinct cancers will make possible more accurate prognoses and more finely targeted therapeutics, provided we can characterize commonly occurring cancer sub-types and the specific molecular abnormalities that produce them. We develop a new method for identifying these common tumor progression pathways by applying phylogeny inference algorithms to single-cell assays, taking advantage of information on tumor heterogeneity lost to prior microarray-based approaches. We combine this approach with expectation maximization to infer unknown parameters used in the phylogeny construction. We further develop new algorithms to merge inferred trees across different assays. We validate the expectation maximization method on simulated data and demonstrate the combined approach on a set of fluorescence in situ hybridization (FISH) data measuring cell-by-cell gene and chromosome copy numbers in a large sample of breast cancers. The results further validate the proposed computational methods by showing consistency with several previous findings on these cancers and provide novel insights into the mechanisms of tumor progression in these patients.




Published In

Journal of Bioinformatics and Computational Biology, 5, 2a, 407-427.