Natural language processing (NLP) is a scientific discipline at the interface of computer science, artificial intelligence and cognitive psychology. Providing an overview of international work in this interdisciplinary field, this book gives the reader a panoramic view of both early and current research in NLP. Carefully chosen multilingual examples present the state of the art of a mature field that is constantly evolving. In four chapters, the book presents the fundamental concepts of phonetics and phonology and the two most important applications in the field of speech processing: recognition and synthesis. It also presents the fundamental concepts of corpus linguistics, and the basic concepts of morphology together with its NLP applications such as stemming and part-of-speech tagging. The fundamental notions and the most important syntactic theories are presented, as well as the different approaches to syntactic parsing, with reference to cognitive models, algorithms and computer applications.
Note: This item can only be shipped to a delivery address in Germany.
Mohamed Zakaria Kurdi is Assistant Professor at the CS Department of Lynchburg College in Virginia, USA. His research interests include natural language processing, robust parsing, text mining and intelligent computer-assisted language learning.
Table of contents
Introduction
Chapter 1. Linguistic Resources for NLP
  1.1. The concept of a corpus
  1.2. Corpus taxonomy
    1.2.1. Written versus spoken
    1.2.2. The historical point of view
    1.2.3. The language of corpora
    1.2.4. Thematic representativity
    1.2.5. Age range of speakers
  1.3. Who collects and distributes corpora?
    1.3.1. The Gutenberg project
    1.3.2. The linguistic data consortium
    1.3.3. European language resource agency
    1.3.4. Open language archives community
    1.3.5. Miscellaneous
  1.4. The lifecycle of a corpus
    1.4.1. Needs analysis
    1.4.2. Design of scenarios to collect data for the corpus
    1.4.3. Collection of the corpus
    1.4.4. Transcription
    1.4.5. Corpus annotation
    1.4.6. Corpus documentation
    1.4.7. Statistical analysis of data
    1.4.8. The use of corpora in NLP
  1.5. Examples of existing corpora
    1.5.1. American National Corpus
    1.5.2. Oxford English Corpus
    1.5.3. The Grenoble Tourism Office Corpus
Chapter 2. The Sphere of Speech
  2.1. Linguistic studies of speech
    2.1.1. Phonetics
    2.1.2. Phonology
  2.2. Speech processing
    2.2.1. Automatic speech recognition
    2.2.2. Speech synthesis
Chapter 3. Morphology Sphere
  3.1. Elements of morphology
    3.1.1. Morphological typology
    3.1.2. Morphology of English
    3.1.3. Parts of speech
    3.1.4. Terms, collocations and colligations
  3.2. Automatic morphological analysis
    3.2.1. Stemming
    3.2.2. Regular expressions for morphological analysis
    3.2.3. Informal introduction to finite-state machines
    3.2.4. Two-level morphology and FST
    3.2.5. Part-of-speech tagging
Chapter 4. Syntax Sphere
  4.1. Basic syntactic concepts
    4.1.1. Delimitation of the field of syntax
    4.1.2. The concept of grammaticality
    4.1.3. Syntactic constituents
    4.1.4. Syntactic typology of topology and agreement
    4.1.5. Syntactic ambiguity
    4.1.6. Syntactic specificities of spontaneous oral language
  4.2. Elements of formal syntax
    4.2.1. Syntax trees and rewrite rules
    4.2.2. Languages and formal grammars
    4.2.3. Hierarchy of languages (Chomsky-Schützenberger)
    4.2.4. Feature structures and unification
    4.2.5. Definite clause grammar
  4.3. Syntactic formalisms
    4.3.1. X-bar
    4.3.2. Head-driven phrase structure grammar
    4.3.3. Lexicalized tree-adjoining grammar
  4.4. Automatic parsing
    4.4.1. Finite-state automata
    4.4.2. Recursive transition networks
    4.4.3. Top-down approach
    4.4.4. Bottom-up approach
    4.4.5. Mixed approach: left-corner
    4.4.6. Tabular parsing (chart)
    4.4.7. Probabilistic parsing
    4.4.8. Neural network
    4.4.9. Parsing algorithms for unification-based grammars
    4.4.10. Robust parsing approaches
    4.4.11. Generation algorithms
Bibliography
Index