Scaling Heterogeneous Distributed Databases and the Design of Disco

Access to large numbers of data sources introduces new problems for users of heterogeneous distributed databases. End users and application programmers must deal with unavailable data sources. Database administrators must deal with incorporating new sources into the model. Database implementors must deal with the translation of queries between query languages and schemas. The Distributed Information Search COmponent (Disco) addresses these problems. Its query processing semantics define how to process queries over data sources that do not return answers. Its data modeling techniques manage connections to data sources. Its component interface to data sources flexibly handles different query languages and translates queries. This paper describes (a) the distributed mediator architecture of Disco, (b) its query processing semantics, (c) the data model and its modeling of data source connections, and (d) the interface to underlying data sources.