Statistical topic models are a class of probabilistic latent variable models for textual data that represent text documents as distributions over topics. These models have been shown to produce interpretable summarization of documents in the form of topics. In this book, we describe how the statistical topic modeling framework can be used for information retrieval tasks and for the integration of background knowledge in the form of semantic concepts. We first describe the special-words topic models in which a document is represented as a distribution of (i) a mixture of shared topics, (ii) a special-words distribution specific to the document, and (iii) a corpus-level background distribution. We describe the utility of the special-words topic models for information retrieval tasks. We next describe the problem of integrating background knowledge in the form of semantic concepts into the topic modeling framework. To combine data-driven topics and semantic concepts, we describe theconcept-topic model and the hierarchical concept-topic model which represent a document as a distribution over data-driven topics and semantic concepts.
Hinweis: Dieser Artikel kann nur an eine deutsche Lieferadresse ausgeliefert werden.
Hinweis: Dieser Artikel kann nur an eine deutsche Lieferadresse ausgeliefert werden.