Date of Original Version
Bioinformatics, Vol. 19, No. 9 (2003), 1147-1152.
Abstract or Table of Contents
Motivation: One approach to inferring genetic regulatory structure from microarray measurements of mRNA transcript hybridization is to estimate the associations of gene expression levels measured in repeated samples. The associations may be estimated by correlation coefficients, by conditional frequencies (for discretized measurements), or by some other statistic. Although these procedures have been applied successfully in other areas, their validity when applied to microarray measurements has yet to be tested.
Results: This paper describes an elementary statistical difficulty for all such procedures, whether they are based on Bayesian updating, conditional independence testing, or other machine learning procedures such as simulated annealing or neural net pruning. The difficulty arises when a number of cells from a common population are aggregated in a single measurement of expression levels. Although there are special cases in which conditional associations are preserved under aggregation, in general the inference of genetic regulatory structure from conditional association is unwarranted.
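The aggregation difficulty can be illustrated with a small simulation. The sketch below is not taken from the paper: the toy model (a regulator y driving two targets x and z through the same quadratic response), the noise level, the function names, and the binning estimator are all assumptions chosen for illustration. Within a single cell, x and z are independent given y by construction. The code compares the association of x and z conditional on y when each sample is one cell with the same quantity when each sample sums the levels of 10 cells.

```python
import random
from math import sqrt


def simulate_measurement(rng, cells_per_sample):
    """Return summed (x, y, z) levels over the cells pooled into one sample.

    Hypothetical toy model: in every cell a regulator y drives x and z
    through the same nonlinear response, so x and z are independent
    given y within a single cell.
    """
    x_sum = y_sum = z_sum = 0.0
    for _ in range(cells_per_sample):
        y = rng.uniform(-1.0, 1.0)            # regulator level (centered scale)
        x_sum += y * y + rng.gauss(0.0, 0.3)  # first target plus noise
        z_sum += y * y + rng.gauss(0.0, 0.3)  # second target plus noise
        y_sum += y
    return x_sum, y_sum, z_sum


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / sqrt(var_a * var_b)


def conditional_correlation(samples, n_bins=20):
    """Crude estimate of the x-z association given y.

    Samples are sorted by y, cut into equal-sized bins, and the
    correlation of x and z is averaged over the bins.
    """
    ordered = sorted(samples, key=lambda s: s[1])
    size = len(ordered) // n_bins
    per_bin = []
    for i in range(n_bins):
        chunk = ordered[i * size:(i + 1) * size]
        per_bin.append(pearson([s[0] for s in chunk], [s[2] for s in chunk]))
    return sum(per_bin) / n_bins


if __name__ == "__main__":
    rng = random.Random(0)
    single = [simulate_measurement(rng, 1) for _ in range(10000)]
    pooled = [simulate_measurement(rng, 10) for _ in range(10000)]
    print("one cell per sample: ", round(conditional_correlation(single), 3))
    print("ten cells per sample:", round(conditional_correlation(pooled), 3))
```

Under these assumptions the single-cell estimate should stay close to zero, while the pooled estimate should be clearly positive: conditioning on the summed regulator level no longer fixes each cell's regulator level, so the shared nonlinear response leaves x and z associated. If the quadratic response is replaced by a linear one, the pooled estimate should also fall to about zero, which is an instance of the special cases in which aggregation preserves conditional association.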