SharpSpider: A Continuous, Parallel and Distributed Spider

Design and implementation of software to navigate the Web autonomously

Marco Palomino

Paperback

Product description

Search engines have become so indispensable that
they rank second only to e-mail as the most popular
online activity. To respond to queries in a timely
fashion, search engines make use of large indices of
word occurrences on Web pages to cross-reference
websites to keywords. Such indices are maintained by
spiders, a special kind of computer program that
browses the Web autonomously. However, due to a
variety of technological limitations, a single
spider has proven insufficient to maintain a search
engine's index. Hence, in this book, we review
several alternatives for splitting a spider's work into
multiple processes, and define a methodology for
keeping an index of the Web up to date.
SharpSpider, our prototype spider, has been
evaluated using the resources of PlanetLab, a
globally distributed platform for developing and
deploying planetary-scale services. Despite the
utilisation of very modest equipment, we have
performed large crawls of the Web, distributing the
workload amongst various computers spread across
different continents. The statistics derived from
our research offer valuable insight into the nature
of educational Web resources.
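
One common way to split a spider's work into multiple processes is to assign each URL to a process by hashing its host name, so that all pages of one site are fetched by the same process. The following is a minimal sketch of that general idea, not necessarily the scheme used by SharpSpider; the function names `assign_crawler` and `partition` and the choice of SHA-256 are assumptions made for illustration.

```python
import hashlib
from urllib.parse import urlsplit


def assign_crawler(url: str, num_crawlers: int) -> int:
    """Map a URL to a crawler index in [0, num_crawlers) by hashing its host."""
    # urlsplit().hostname is lower-cased, so the mapping ignores host case.
    host = urlsplit(url).hostname or ""
    # A cryptographic hash is used only because it is stable across runs
    # and machines (unlike Python's built-in hash() for strings).
    digest = hashlib.sha256(host.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_crawlers


def partition(urls, num_crawlers):
    """Group URLs into one work list per crawler process."""
    buckets = [[] for _ in range(num_crawlers)]
    for url in urls:
        buckets[assign_crawler(url, num_crawlers)].append(url)
    return buckets
```

Because the assignment depends only on the host name, two URLs from the same site always land in the same work list, which keeps per-site politeness rules local to one process. A drawback of this simple modulo scheme is that changing the number of processes reassigns most hosts.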

Product details

Publisher: VDM Verlag Dr. Müller
Pages: 160
Language: English
ISBN-13: 9783639148862
ISBN-10: 363914886X
Item no.: 26760740

About the author

After concluding his PhD in Computer Science at the University
of Cambridge, Marco Palomino worked as a software consultant in
London, and then joined the Information Retrieval Group of the
University of Sunderland in 2007. Currently, Marco works as a
research associate, and his work focuses on the automatic
indexing of multimedia collections.
