Search results

Filters

  • Journals

Search results

Number of results: 1
items per page: 25 50 75
Sort by:
Download PDF Download RIS Download Bibtex

Abstract

The problem of query selectivity estimation for database queries is critical for efficientquery execution by database management systems. A query execution method strongly depends on earlyestimated size of a query result. This estimation determines a data access method used later during thequery execution. The selectivity parameter is a fraction of table rows that satisfy a single-table querycondition. For a selection condition of a range query where an attribute has a continuous domain, theselectivity is equivalent to a definite integral form probability density function (PDF) of attribute valuesdistribution. For a compound selection condition based on many attributes we need a multidimensionalspace-efficient non-parametric estimator of multivariate PDF of attribute values distribution. A knownapproach based on Discrete Cosine Transform (DCT) spectrum as an representation of multidimensionalPDF is considered. The energy compaction property of DCT lets omit a region of spectrum coefficientswith small absolute values without significant losing an accuracy of selectivity estimation. An area ofrelevant spectrum coefficients is called a sampling zone. Results of experiments from previous worksshows that applying the reciprocal shape of the sampling zone gives the least selectivity estimation errorsubject to a predetermined size of the zone. The main result of this work is a theoretical confirmation of onlyexperimental results from previous works. The paper presents the proof of the theorem that the reciprocalshape of the sampling zone is asymptotically error-optimal. The proof is based on calculus of variationsand the isoperimetric problem.

Go to article

Authors and Affiliations

Dariusz R. Augustyn

This page uses 'cookies'. Learn more