![]() |
CiteULike | ![]() |
lemire's CiteULike | ![]() |
![]() |
|
![]() |
Register | ![]() |
Log in | ![]() |
Reordering Columns for Smaller Indexesby: Daniel Lemire, Owen Kaser
|
Reviews
[Write a review of this article]
Find related articles from these CiteULike users
Find related articles with these CiteULike tags
Posting History
AbstractColumn-oriented indexes-such as projection or bitmap indexes-are compressed by run-length encoding to reduce storage and increase speed. Sorting the tables improves compression. On realistic data sets, permuting the columns in the right order before sorting can reduce the number of runs by a factor of two or more. For many cases, we prove that the number of runs in table columns is minimized if we sort columns by increasing cardinality. Yet-maybe surprisingly-we must sometimes maximize the number of runs to minimize the index size. Experimentally, sorting based on Hilbert space-filling curves is poor at minimizing the number of runs.
BibTeX record
RIS record