Information Retrieval (IR) models are a core component of IR research and IR systems. The past decade brought a consolidation of the family of IR models, which by 2000 consisted of relatively isolated views on TF-IDF (Term-Frequency times Inverse-Document-Frequency) as the weighting scheme in the vector-space model (VSM), the probabilistic relevance framework (PRF), the binary independence retrieval (BIR) model, BM25 (Best-Match Version 25, the main instantiation of the PRF/BIR), and language modelling (LM). Also, the early 2000s saw the arrival of divergence from randomness (DFR). Regarding intuition and simplicity, though LM is clear from a probabilistic point of view, several people stated: "It is easy to understand TF-IDF and BM25. For LM, however, we understand the math, but we do not fully understand why it works." This book takes a horizontal approach, gathering the foundations of TF-IDF, PRF, BIR, Poisson, BM25, LM, probabilistic inference networks (PINs), and divergence-based models. The aim is to create a consolidated and balanced view of the main models. A particular focus of this book is on the "relationships between models." This includes an overview of the main frameworks (PRF, logical IR, VSM, generalized VSM) and a pairing of TF-IDF with other models. It becomes evident that TF-IDF and LM measure the same thing, namely the dependence (overlap) between document and query. The Poisson probability helps to establish probabilistic, non-heuristic roots for TF-IDF, and the Poisson parameter, average term frequency, is a binding link between several retrieval models and model parameters. Table of Contents: List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index
Product details
Synthesis Lectures on Information Concepts, Retrieval, and Services
Thomas Roelleke holds a Dr rer nat (Ph.D.) and a Diplom der Ingenieur-Informatik (MSc in Engineering & Computer Science) of the University of Dortmund. After school education in Meschede, Germany, he attended the b.i.b., the Nixdorf Computer school for professions in informatics, in Paderborn. Nixdorf Computer awarded him a sales and management trainee program, after which he was appointed as product consultant in the Unix/DB/4GL marketing of Nixdorf Computer. He studied Diplom-Ingenieur-Informatik at the University of Dortmund (UniDo), and was later a lecturer/researcher at UniDo. His research focused on probabilistic reasoning and knowledge representations, hypermedia retrieval, and the integration of retrieval and database technologies. His lecturing included information/database systems, object-oriented design and programming, and software engineering. He obtained his Ph.D. in 1999 for the thesis titled "POOL: A probabilistic object-oriented logic for the representation and retrieval of complex objects - a model for hypermedia retrieval." Since 1999, he has been working as a strategic IT consultant, founder and director of small businesses, research fellow, and lecturer at the Queen Mary University of London (QMUL). Research contributions include a probabilistic relational algebra (PRA), a probabilistic object-oriented logic (POOL), the relational Bayes, a matrix-based framework for IR, a parallel derivation of IR models, a probabilistic interpretation of the BM25-TF based on "semi-subsumed" event occurrences, and theoretical studies of retrieval models. Thomas Roelleke lives in England, in a village in the middle between buzzy London and beautiful East Anglia.
Table of contents
List of Figures / Preface / Acknowledgments / Introduction / Foundations of IR Models / Relationships Between IR Models / Summary & Research Outlook / Bibliography / Author's Biography / Index