Lidong Bing, Wai Lam

Information Discovery from Semi-structured Record Sets on the Web

From Web Pages to Knowledge

Paperback

Product description

In this book, we develop two frameworks for extracting semi-structured data records from Web pages. The first is a record segmentation search tree framework: we design a new search structure, the Record Segmentation Tree (RST), and propose several efficient pruning strategies on it to identify the records in a given Web page. The second is the DOM Structure Knowledge Oriented Global Analysis (Skoga) framework, which robustly detects different kinds of data records and record regions by conducting a global analysis of the DOM structure. Finally, we present a framework that uses the detected data records to automatically populate existing Wikipedia categories. It takes as seed input a few existing entities collected automatically from a particular Wikipedia category and explores their attribute infoboxes for clues that help discover more entities for this category, along with the attribute content of the newly discovered entities.

Product details

Publisher: LAP Lambert Academic Publishing
Pages: 124
Publication date: 28 February 2014
Language: English
Dimensions: 220 mm x 150 mm x 8 mm
Weight: 203 g
ISBN-13: 9783659206115
ISBN-10: 3659206113
Item no.: 40622903

Manufacturer information
Books on Demand GmbH
In de Tarpen 42
22848 Norderstedt
info@bod.de
040 53433511

About the author

He is currently a postdoctoral fellow at The Chinese University of Hong Kong, where he received his PhD in 2012. Before that, he obtained his MPhil degree from Peking University and his BSc degree from Northeast Normal University. His research interests include Information Extraction, Information Retrieval, Web Mining, and Natural Language Processing.