This volume tackles the conflicting requirements of data compression and indexing in massive datasets by using optimization techniques to improve compression and by redesigning data structures to make pattern-matching queries faster and more efficient.
Data compression is mandatory for managing massive datasets, and indexing is fundamental for querying them. However, their goals appear to be opposed: the former aims at minimizing data redundancy, whereas the latter augments the dataset with auxiliary information to speed up query resolution. In this monograph we introduce solutions that overcome this dichotomy. We start by presenting the use of optimization techniques to improve the compression achieved by classical data compression algorithms, then we move to the design of compressed data structures that provide fast random access or efficient pattern-matching queries on the compressed dataset. These theoretical studies are supported by experimental evidence of their impact in practical scenarios.