AUTOMATIC EXTRACTION OF LEMMA-BASED BILINGUAL DICTIONARIES

A CASE STUDY OF MORPHOLOGICALLY RICH LANGUAGES LIKE ARABIC

Ibrahim M. H. Saleh

Paperback

Other customers were also interested in

Ventsislav Zhechev
Automatic Generation of Parallel Treebanks
42,99 €

Michael Fell
Verbal Irony: Theories and Automatic Detection
17,95 €

Irina Nikolaeva
Automatic symbolic information processing
56,99 €

Zukile Ndyalivana
Development of an automatic news summarizer for isiXhosa language
39,99 €

Xing Zhang
Contributions of Syntactic Functions towards Term Extraction
56,99 €

Leo Weisgerber
Grundformen sprachlicher Weltgestaltung
54,99 €

Jan van Bakel
Automatic Semantic Interpretation
109,95 €

Product description

This academic work presents an approach to automatically extracting and filtering a lemma-based Arabic-English dictionary from parallel corpora. To this end, the approach uses Machine Learning algorithms to filter out Arabic-English lemma pairs that were wrongly extracted from the parallel corpus as good translation pairs. It also uses highly accurate morphological analyzers and generators for Arabic to overcome the morphological ambiguity of Arabic words. A comparison of the automatically generated dictionary with a manually built dictionary widely used in Arabic Computational Linguistics applications shows a high degree of coverage complementarity on the part of the automatically generated dictionary. The comparison also shows that the generated dictionary (1) has reasonable recall and high precision, (2) is significantly more comprehensive in terms of the Arabic-English lemma pairs it covers, and (3) has high potential for future improvement.
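The comparison described above can be illustrated with a minimal sketch. It assumes that both dictionaries are represented as sets of (Arabic lemma, English lemma) pairs and that precision and recall are measured against the manually built dictionary. The function name, the data layout, and the toy entries are all hypothetical and are not taken from the book.

```python
from typing import Dict, Set, Tuple

# Hypothetical representation: (Arabic lemma in transliteration, English lemma).
LemmaPair = Tuple[str, str]


def compare_dictionaries(
    generated: Set[LemmaPair], reference: Set[LemmaPair]
) -> Dict[str, object]:
    """Compare an automatically generated dictionary with a reference one."""
    shared = generated & reference
    # Precision: share of generated pairs that the reference confirms.
    precision = len(shared) / len(generated) if generated else 0.0
    # Recall: share of reference pairs that the generated dictionary found.
    recall = len(shared) / len(reference) if reference else 0.0
    return {
        "precision": precision,
        "recall": recall,
        # Pairs found in only one dictionary indicate complementary coverage.
        "only_in_generated": generated - reference,
        "only_in_reference": reference - generated,
    }


# Toy data, invented purely for illustration.
generated = {
    ("kitAb", "book"),
    ("kataba", "write"),
    ("bayt", "house"),
    ("qalam", "pen"),
}
reference = {
    ("kitAb", "book"),
    ("kataba", "write"),
    ("madrasa", "school"),
}

report = compare_dictionaries(generated, reference)
print(report["precision"], report["recall"])
```

Pairs that appear only in the generated dictionary are not necessarily wrong. They may be valid translations that the reference dictionary lacks, which is one way to read the coverage complementarity mentioned in the description.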

Product details

Publisher: LAP Lambert Academic Publishing
Number of pages: 72
Publication date: 5 May 2010
Language: English
Dimensions: 220 mm x 150 mm x 5 mm
Weight: 125 g
ISBN-13: 9783838357522
ISBN-10: 3838357523
Item no.: 29763644

Manufacturer information
Books on Demand GmbH
In de Tarpen 42
22848 Norderstedt
info@bod.de
040 53433511

About the author

Ibrahim Saleh has an MS in Computational Linguistics from Georgetown University (GU) and is currently completing a PhD in General Linguistics at GU. He has published two papers, at MT Summit XII (Canada, 2009) and LREC (Malta, 2010). He was an intern at the CCLS, Columbia University, where he participated in building a morphological analyzer for Arabic dialects.