Tabu Search Enhanced Markov Blanket Classifier for High Dimensional Data Sets

Working Paper

Abstract or Description

Data sets with many discrete variables and relatively few cases arise in health care, e-commerce, national security, and many other domains. Learning effective and efficient prediction models from such data sets is a challenging task. In this paper, we propose a Tabu Search enhanced Markov Blanket (TS/MB) procedure to learn a graphical Markov Blanket classifier from data. The TS/MB procedure is based on the use of restricted neighborhoods in a general Bayesian network constrained by the Markov condition, called Markov Equivalent Neighborhoods. Computational results on real-world data sets drawn from the health care domain indicate that the TS/MB procedure converges quickly, finds a parsimonious model with substantially fewer predictor variables than the full data sets contain, achieves prediction performance comparable to or better than that of several machine learning methods, and provides insight into possible causal relations among the variables.
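
For readers unfamiliar with the term, the Markov blanket of a target variable in a Bayesian network consists of its parents, its children, and the other parents of its children. The following minimal sketch illustrates only that standard definition on a small toy graph. It is not the TS/MB procedure described above, and the function name, the graph representation, and the variable names are hypothetical choices made for illustration.

```python
def markov_blanket(parents, target):
    """Return the Markov blanket of `target` in a directed acyclic graph.

    `parents` maps each node to the set of its parent nodes
    (an assumed representation chosen for this sketch).
    """
    blanket = set(parents.get(target, set()))  # parents of the target
    for node, node_parents in parents.items():
        if target in node_parents:             # node is a child of the target
            blanket.add(node)
            blanket |= node_parents            # co-parents of that child
    blanket.discard(target)                    # the target is not in its own blanket
    return blanket


# Hypothetical toy network; the variables are illustrative, not from the paper.
toy_parents = {
    "smoking": set(),
    "age": set(),
    "income": set(),
    "disease": {"smoking"},
    "symptom": {"disease", "age"},
}

mb = markov_blanket(toy_parents, "disease")
print(sorted(mb))
```

In this toy graph, "income" falls outside the blanket of "disease", which is the sense in which a Markov Blanket classifier can use substantially fewer predictor variables than the full data set contains.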