Query Processing and Inverted Indices in Distributed Text Document Retrieval Systems

Date of Original Version



Working Paper

Rights Management

All Rights Reserved

Abstract or Description

The performance of distributed text document retrieval systems is strongly influenced by the organization of the inverted index. This paper compares how various physical organizations of inverted lists affect query processing performance. We present a new probabilistic model of the database and queries. Simulation experiments identify the variables that most strongly influence response time and throughput. The results yield a set of design trade-offs across a wide range of hardware configurations, as well as new parallel query processing strategies.