48,99 €
incl. VAT
Free shipping*
Ready to ship in 6-10 days
  • Paperback

Product description
This work is one of the first experiments toward building an automatic part-of-speech (POS) tagger for the Bhojpuri language. Bhojpuri is a low-resource language with little language technology available. The work therefore presents the first large, representative Bhojpuri corpus, approximately 267,000 tokens drawn from different domains, and a Support Vector Machine (SVM) based POS tagger trained on this corpus. The tagger achieves an accuracy of approximately 87% in this experiment. The work also provides detailed guidelines for annotating a Bhojpuri corpus following the BIS scheme, and a comparative analysis of the performance of Bhojpuri and Hindi POS taggers trained with the SVM model.
About the author
The author is a research scholar at Jawaharlal Nehru University, New Delhi. Her area of expertise is computational linguistics. Her main interests are natural language processing (NLP), corpus collection, and resource creation for lesser-resourced languages, which are also the main objective of the present work, an M.Phil. dissertation submitted in 2015.