Using Additional Information in Streaming Algorithms

Leseprobe

Fotogalerie

Raffael Buff

Using Additional Information in Streaming Algorithms

Broschiertes Buch

Jetzt bewerten Jetzt bewerten

Weitere Ausgabe:
eBook, PDF

Leseprobe

Andere Kunden interessierten sich auch für

Raffael Buff
Using Additional Information in Streaming Algorithms

39,99 €
Sayan Putatunda
Practical Machine Learning for Streaming Data with Python

45,99 €
Automata, Languages, and Programming

39,99 €
Automata, Languages, and Programming

77,99 €
Toby Segaran
Beautiful Data

43,99 €
Joseph Richards
Real-World Machine Learning

47,99 €
Anatoly Zelenin
Apache Kafka

44,99 €

Produktbeschreibung

Streaming problems are algorithmic problems that are mainly characterized by their massive input streams. Because of these data streams, the algorithms for these problems are forced to be space-efficient, as the input stream length generally exceeds the available storage. In this thesis, the two streaming problems most frequent item and number of distinct items are studied in detail relating to their algorithmic complexities, and it is compared whether the verification of solution hypotheses has lower algorithmic complexity than computing a solution from the data stream. For this analysis, we introduce some concepts to prove space complexity lower bounds for an approximative setting and for hypothesis verification. For the most frequent item problem which consists in identifying the item which has the highest occurrence within the data stream, we can prove a linear space complexity lower bound for the deterministic and probabilistic setting. This implies that, in practice, this streaming problem cannot be solved in a satisfactory way since every algorithm has to exceed any reasonable storage limit. For some settings, the upper and lower bounds are almost tight, which implies that we have designed an almost optimal algorithm. Even for small approximation ratios, we can prove a linear lower bound, but not for larger ones. Nevertheless, we are not able to design an algorithm that solves the most frequent item problem space-efficiently for large approximation ratios. Furthermore, if we want to verify whether a hypothesis of the highest frequency count is true or not, we get exactly the same space complexity lower bounds, which leads to the conclusion that we are likely not able to profit from a stated hypothesis. The number of distinct items problem counts all different elements of the input stream. If we want to solve this problem exactly (in a deterministic or probabilistic setting) or approximately with a deterministic algorithm, we require once again linear storage size which is tight to the upper bound. However, for the approximative and probabilistic setting, we can enhance an already known space-efficient algorithm such that it is usable for arbitrarily small approximation ratios and arbitrarily good success probabilities. The hypothesis verification leads once again to the same lower bounds. However, there are some streaming problems that are able to profit from additional information such as hypotheses, as e.g., the median problem.

Produktdetails

Produktdetails
Verlag: Diplom.de
Seitenzahl: 132
Erscheinungstermin: 5. Oktober 2016
Englisch
Abmessung: 210mm x 148mm x 10mm
Gewicht: 202g
ISBN-13: 9783961165421
ISBN-10: 3961165424
Artikelnr.: 46170163

Herstellerkennzeichnung
Diplomica Verlag
Hermannstal 119k
22119 Hamburg
info@bedey-media.de

Produktdetails

Verlag: Diplom.de
Seitenzahl: 132
Erscheinungstermin: 5. Oktober 2016
Englisch
Abmessung: 210mm x 148mm x 10mm
Gewicht: 202g
ISBN-13: 9783961165421
ISBN-10: 3961165424
Artikelnr.: 46170163

Herstellerkennzeichnung
Diplomica Verlag
Hermannstal 119k
22119 Hamburg
info@bedey-media.de