Abstract or Description

The arrival of publicly available genome-wide variation data is creating new opportunities for reconciling model-based methods for associating genotypes and phenotypes with the complexities of real genome data. Such data are particularly valuable for testing the utility of models of conserved haplotype structure in association studies. While there is much interest in "haplotype block" models that assume population-wide regions of low diversity, there is also evidence that such models eliminate correlations potentially useful to association studies. We investigate the value of relaxing the rigidity of block models by developing an association testing method based on the previously developed "haplotype motif" model, which retains the notion of representing haploid sequences as concatenations of conserved haplotypes but abandons the assumption of population-wide block boundaries. We compare the effectiveness of motif, block, and single-variant models at finding associations with simulated phenotypes using real and simulated data. We conclude that the benefits of haplotype models in any form are modest, but that haplotype models in general, and block-free models in particular, are useful for picking up correlations near the limit of detectability.

Published In

Pacific Symposium on Biocomputing, 454-466.