Bachelor Thesis from the year 2009 in the subject English Language and Literature Studies - Other, grade: 1,0, Bielefeld University, language: English, abstract: This Bachelor of Arts thesis contributes to the CREAM project, a collaboration between Novartis Pharma AG and Bielefeld University. The thesis explores a method called n-gram modeling, which provides its user with information about how frequently words are used. This information is needed to improve a database that the CREAM project works on. The improvement concerns the calculation of probabilities in search queries sent to the database. The thesis consists of five chapters. The first chapter introduces the CREAM project and the database. The second chapter describes the current state of n-gram modeling and where it appears in contemporary literature. The third chapter deals extensively with how corpora have to be prepared for analysis and how n-gram models can be computed from the frequency distribution of words. Chapter four introduces a computer program that uses a corpus to obtain certain n-grams. Finally, chapter five evaluates the information retrieved by the computer program(s) and gives an outlook on future work. The appendix is not part of the thesis because it contains copyright-protected material.
Note: This item can only be delivered to a German shipping address.