Date of Original Version
IEEE/ACM TRANSACTIONS ON COMPUTATIONAL BIOLOGY AND BIOINFORMATICS, VOL. 5, NO. 2, APRIL-JUNE 2008
Abstract
We consider a combinatorial problem derived from haplotyping a population with respect to a genetic disease, either recessive or dominant. Given a set of individuals, partitioned into healthy and diseased, and the corresponding sets of genotypes, we want to infer “bad” and “good” haplotypes to account for these genotypes and for the disease. Assume, for example, that the disease is recessive. Then, the resolving haplotypes must consist of bad and good haplotypes so that 1) each genotype belonging to a diseased individual is explained by a pair of bad haplotypes and 2) each genotype belonging to a healthy individual is explained by a pair of haplotypes of which at least one is good. We prove that the associated decision problem is NP-complete. However, we also prove that there is a simple solution, provided that the data satisfy a very weak requirement.
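As an illustration of the recessive case described above, the two conditions can be checked for a candidate resolution with a minimal sketch such as the following. The encoding is an assumption not stated in the abstract: haplotypes are strings over '0'/'1', and genotypes are strings over '0'/'1'/'2', where '2' marks a heterozygous site. The function names `explains` and `is_valid_recessive` are hypothetical. The sketch only verifies a given labeling; it does not solve the inference problem.

```python
def explains(h1, h2, genotype):
    """Return True if the haplotype pair (h1, h2) yields the genotype.

    Assumed encoding (not from the source): haplotype sites are '0' or '1';
    genotype sites are '0' or '1' (homozygous) or '2' (heterozygous).
    """
    if not (len(h1) == len(h2) == len(genotype)):
        return False
    for a, b, g in zip(h1, h2, genotype):
        if g == "2":
            # Heterozygous site: the two haplotypes must differ here.
            if a == b:
                return False
        elif not (a == b == g):
            # Homozygous site: both haplotypes must carry the genotype's allele.
            return False
    return True


def is_valid_recessive(individuals, bad):
    """Check conditions 1) and 2) for a recessive disease.

    individuals: list of (genotype, is_diseased, (h1, h2)) tuples.
    bad: set of haplotypes labeled "bad"; every other haplotype is "good".
    """
    for genotype, is_diseased, (h1, h2) in individuals:
        if not explains(h1, h2, genotype):
            return False
        both_bad = h1 in bad and h2 in bad
        # Diseased: both haplotypes bad. Healthy: at least one good.
        if is_diseased != both_bad:
            return False
    return True


# Hypothetical toy data: one diseased and one healthy individual.
individuals = [
    ("12", True, ("10", "11")),
    ("02", False, ("00", "01")),
]
print(is_valid_recessive(individuals, bad={"10", "11"}))
```

Under the same assumptions, the dominant case would swap the roles: a diseased individual needs at least one bad haplotype, and a healthy individual needs two good ones.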