Product description
In digital libraries, ambiguous author names arise from identical names, misspellings, and pseudonyms. Disambiguating these names is a major problem in data integration and document retrieval. In this study, we assume that an individual tends to produce a distinctively coherent body of work, which can therefore form a single cluster containing all of his or her articles while separating them from those of everyone else with the same name. However, we believe the information contained in a digital library may not be sufficient to detect such clusters automatically. We therefore use topic models extracted from Wikipedia to enrich record metadata, and apply agglomerative clustering to disambiguate author names by grouping similar records together; records in different clusters are assumed to have been written by different people.
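The clustering step described above can be illustrated with a minimal sketch. It is not the author's implementation: the description does not state the similarity measure, linkage rule, or stopping criterion, so cosine similarity, average linkage, and a fixed similarity threshold are assumptions made here. The topic vectors are hand-made toy values standing in for topic distributions inferred from a Wikipedia-trained topic model, and the names `cosine`, `agglomerative`, and `records` are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two topic-weight vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def agglomerative(vectors, threshold):
    """Bottom-up clustering with average linkage (an assumed choice).

    Starts with one cluster per record and repeatedly merges the two most
    similar clusters until the best similarity falls below `threshold`.
    Returns a list of clusters, each a sorted list of record indices.
    """
    clusters = [[i] for i in range(len(vectors))]

    def avg_sim(a, b):
        total = sum(cosine(vectors[i], vectors[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > 1:
        best, pair = float("-inf"), None
        for x in range(len(clusters)):
            for y in range(x + 1, len(clusters)):
                s = avg_sim(clusters[x], clusters[y])
                if s > best:
                    best, pair = s, (x, y)
        if pair is None or best < threshold:
            break  # no remaining pair is similar enough to merge
        x, y = pair
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return [sorted(c) for c in clusters]


# Toy topic distributions for four records that share one author name.
# Records 0 and 1 lean on topic 0; records 2 and 3 lean on topic 2.
records = [
    [0.9, 0.1, 0.0],
    [0.8, 0.2, 0.0],
    [0.0, 0.1, 0.9],
    [0.1, 0.0, 0.9],
]

clusters = agglomerative(records, threshold=0.8)
print(clusters)
```

Each resulting cluster is read as one presumed individual. The threshold controls how readily records are attributed to the same person: a lower value merges more aggressively, and a higher value splits one name into more authors.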
About the author
Dieu-Thu Le is a PhD candidate at the University of Trento, Italy. Her research interests include natural language processing, text mining, probabilistic modeling, and digital libraries. She received her Master's degree cum laude from the European Masters Program in Language & Communication Technology, France and Italy.