Date of Original Version
Copyright © 1996 by the VLDB Endowment. Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the VLDB copyright notice and the title of the publication and its date appear, and notice is given that copying is by the permission of the Very Large Data Base Endowment. To copy otherwise, or to republish, requires a fee and/or special permission from the Endowment.
Abstract or Description
There is mounting evidence [Man77, Sch91] that real datasets are statistically self-similar, and thus, `fractal'. This is an important insight since it permits a compact statistical description of spatial datasets; subsequently, as we show, it also forms the basis for the theoretical analysis of spatial access methods, without using the typical, but unrealistic, uniformity assumption.
In this paper, we focus on the estimation of the number of number of quadtree blocks that a real, spatial dataset will require. Using the the well-known Hausdorff fractal dimension, we derive some closed formulas which allow us to predict the number of quadtree blocks, given some few parameters. Using our formulas, it is possible to predict the space overhead and the response time of linear quadtrees/z-orderings [OM88], which are widely used in practice. In order to verify our analytical model, we performed an extensive experimental investigation using several real datasets coming from different domains. In these experiments, we found that our analytical model agrees well with our experiments as well as with older empirical observations on 2-d [Gae95b] and 3-d [ACF+94] data.
Proceedings of 22th International Conference on Very Large Data Bases.