Rights Management

All Rights Reserved

Abstract or Description

Various algorithms have been proposed for learning (partial) genetic regulatory networks through systematic measurements of differential expression in wild type versus strains in which expression of specific genes has been suppressed or enhanced, as well as for determining the most informative next experiment in a sequence. While the behavior of these algorithms has been investigated for toy examples, the full computational complexity of the problem has not received sufficient attention. We show that finding the true regulatory network requires (in the worst case) exponentially many experiments (in the number of genes). Perhaps more importantly, we provide an algorithm for determining the set of regulatory networks consistent with the observed data. We then show that this algorithm is infeasible for realistic data (specifically, nine genes and ten experiments). This infeasibility is not due to an algorithmic flaw, but rather to the fact that there are far too many networks consistent with the data (10^18 in the provided example). We conclude that gene perturbation experiments are useful in confirming regulatory network models discovered by other techniques, but not a feasible search strategy.


Proceedings of the IJCAI-2003 Workshop on Learning Graphical Models for Computational Genomics (2003), 22-31.

Included in

Philosophy Commons