Multi-dimensional Selectivity Estimation Using Compressed Histogram Information

Ju Hong Lee, Deok Hwan Kim, Chin Wan Chung

Research output: Contribution to journalArticlepeer-review

118 Scopus citations

Abstract

The database query optimizer requires the estimation of the query selectivity to find the most efficient access plan. For queries referencing multiple attributes from the same relation, we need a multi-dimensional selectivity estimation technique when the attributes are dependent each other because the selectivity is determined by the joint data distribution of the attributes. Additionally, for multimedia databases, there are intrinsic requirements for the multi-dimensional selectivity estimation because feature vectors are stored in multi-dimensional indexing trees. In the 1-dimensional case, a histogram is practically the most preferable. In the multi-dimensional case, however, a histogram is not adequate because of high storage overhead and high error rates. In this paper, we propose a novel approach for the multi-dimensional selectivity estimation. Compressed information from a large number of small-sized histogram buckets is maintained using the discrete cosine transform. This enables low error rates and low storage overheads even in high dimensions. In addition, this approach has the advantage of supporting dynamic data updates by eliminating the overhead for periodical reconstructions of the compressed information. Extensive experimental results show advantages of the proposed approach.

Original languageEnglish
Pages (from-to)205-213
Number of pages9
JournalSIGMOD Record
Volume28
Issue number2
DOIs
StatePublished - Jun 1999

Fingerprint

Dive into the research topics of 'Multi-dimensional Selectivity Estimation Using Compressed Histogram Information'. Together they form a unique fingerprint.

Cite this