Machine Learning Methods for Stylometry

Authorship Attribution and Author Profiling

Jacques Savoy

Hardcover

Other edition:
eBook, PDF

Product description

This book presents methods and approaches used to identify the true author of a doubtful document or text excerpt. It provides a broad introduction to text categorization problems grounded in stylistic features, such as authorship attribution, identifying psychological traits of the author, and detecting fake news. Specifically, machine learning models are presented in detail as valuable tools for verifying hypotheses or revealing significant patterns hidden in datasets. Stylometry is a multi-disciplinary field combining linguistics with both statistics and computer science.

The content is divided into three parts. The first, which consists of the first three chapters, offers a general introduction to stylometry, its potential applications and limitations. Further, it introduces the ongoing example used to illustrate the concepts discussed throughout the remainder of the book. The four chapters of the second part are devoted more to computer science, with a focus on machine learning models. Their main aim is to explain machine learning models for solving stylometric problems. Several general strategies used to identify, extract, select, and represent stylistic markers are explained. As deep learning represents an active field of research, information on neural network models and word embeddings applied to stylometry is provided, as well as a general introduction to the deep learning approach to solving stylometric questions. In turn, the third part illustrates the application of the previously discussed approaches in real cases: an authorship attribution problem, seeking to discover the secret hand behind the nom de plume Elena Ferrante, an Italian writer known worldwide for her My Brilliant Friend saga; author profiling in order to identify whether a set of tweets was generated by a bot or a human being and, in the latter case, whether the author is a man or a woman; and an exploration of stylistic variations over time using US political speeches covering a period of ca. 230 years.
A solutions-based approach is adopted throughout the book, and explanations are supported by examples written in R. To complement the main content and discussions on stylometric models and techniques, examples and datasets are freely available on the author's GitHub site.

Product details

Publisher: Springer / Springer International Publishing / Springer, Berlin
Publisher's item no.: 978-3-030-53359-5
1st ed. 2020
Pages: 308
Publication date: 29 September 2020
Language: English
Dimensions: 241mm x 160mm x 23mm
Weight: 619g
ISBN-13: 9783030533595
ISBN-10: 303053359X
Item no.: 59588510

About the author

Jacques Savoy is a Full Professor of Computer Science at the University of Neuchatel (Switzerland). His research interests mainly include natural language processing and particularly information retrieval for languages other than English (European, Asian, and Indian) as well as multilingual and cross-lingual information retrieval. For many years he has participated in various evaluation campaigns (TREC, CLEF, NTCIR, FIRE) dealing with these questions. His current research interests focus on the statistical modeling and evaluation of natural language processing tasks such as text clustering and categorization, as well as authorship attribution.

Table of contents

Part I: Fundamental Concepts and Models
1. Introduction to Stylistic Models and Applications
2. Basic Lexical Concepts and Measurements
3. Distance-Based Approaches

Part II: Advanced Models and Evaluation
4. Evaluation Methodology and Test Corpora
5. Features Identification and Selection
6. Machine Learning Models
7. Advanced Models for Stylometric Applications

Part III: Cases Studies
8. Elena Ferrante: A Case Study in Authorship Attribution
9. Author Profiling of Tweets
10. Applications to Political Speeches
11. Conclusions
