Data Mining the Web (eBook, PDF)

Uncovering Patterns in Web Content, Structure, and Usage

Buy as download

96,99 €

incl. VAT

Available immediately for download

Zdravko Markov, Daniel T. Larose

Format: PDF

This book introduces the reader to methods of data mining on the web, including uncovering patterns in web content (classification, clustering, language processing), structure (graphs, hubs, metrics), and usage (modeling, sequence analysis, performance).

Devices: PC
With copy protection
Size: 4.91 MB

Other customers were also interested in

Roger Bilisoly
Practical Text Mining with Perl (eBook, PDF)
112,99 €

Darius M. Dziuda
Data Mining for Genomics and Proteomics (eBook, PDF)
87,99 €

Knowledge Discovery in Bioinformatics (eBook, PDF)
122,99 €

Roger Bilisoly
Practical Text Mining with Perl (eBook, ePUB)
112,99 €

Claudio Carpineto
Concept Data Analysis (eBook, PDF)
107,99 €

Daniel T. Larose
Data Mining and Predictive Analytics (eBook, PDF)
111,99 €

Paulraj Ponniah
Data Modeling Fundamentals (eBook, PDF)
136,99 €

Product description

For legal reasons, this download can only be delivered to customers with a billing address in A, B, BG, CY, CZ, D, DK, EW, E, FIN, F, GR, HR, H, IRL, I, LT, L, LR, M, NL, PL, P, R, S, SLO, SK.

Product details

Publisher: John Wiley & Sons
Pages: 240
Publication date: August 20, 2007
Language: English
ISBN-13: 9780470108086
Item no.: 37290395

About the authors

Zdravko Markov, PhD, is Associate Professor of Computer Science at Central Connecticut State University. The author of three textbooks, Dr. Markov teaches undergraduate and graduate courses in computer science and artificial intelligence. He is currently a Principal Investigator (PI) in a National Science Foundation-funded project designed to introduce machine learning to undergraduates.

Daniel T. Larose, PhD, is Professor of Statistics in the Department of Mathematical Sciences at Central Connecticut State University. He is the author of three data mining books and a forthcoming textbook in undergraduate statistics. He developed and directs CCSU's DataMining@CCSU programs.

Table of contents

PREFACE.
PART I: WEB STRUCTURE MINING.
1 INFORMATION RETRIEVAL AND WEB SEARCH.
Web Challenges.
Web Search Engines.
Topic Directories.
Semantic Web.
Crawling the Web.
Web Basics.
Web Crawlers.
Indexing and Keyword Search.
Document Representation.
Implementation Considerations.
Relevance Ranking.
Advanced Text Search.
Using the HTML Structure in Keyword Search.
Evaluating Search Quality.
Similarity Search.
Cosine Similarity.
Jaccard Similarity.
Document Resemblance.
References.
Exercises.
2 HYPERLINK-BASED RANKING.
Introduction.
Social Networks Analysis.
PageRank.
Authorities and Hubs.
Link-Based Similarity Search.
Enhanced Techniques for Page Ranking.
References.
Exercises.
PART II: WEB CONTENT MINING.
3 CLUSTERING.
Introduction.
Hierarchical Agglomerative Clustering.
k-Means Clustering.
Probability-Based Clustering.
Finite Mixture Problem.
Classification Problem.
Clustering Problem.
Collaborative Filtering (Recommender Systems).
References.
Exercises.
4 EVALUATING CLUSTERING.
Approaches to Evaluating Clustering.
Similarity-Based Criterion Functions.
Probabilistic Criterion Functions.
MDL-Based Model and Feature Evaluation.
Minimum Description Length Principle.
MDL-Based Model Evaluation.
Feature Selection.
Classes-to-Clusters Evaluation.
Precision, Recall, and F-Measure.
Entropy.
References.
Exercises.
5 CLASSIFICATION.
General Setting and Evaluation Techniques.
Nearest-Neighbor Algorithm.
Feature Selection.
Naive Bayes Algorithm.
Numerical Approaches.
Relational Learning.
References.
Exercises.
PART III: WEB USAGE MINING.
6 INTRODUCTION TO WEB USAGE MINING.
Definition of Web Usage Mining.
Cross-Industry Standard Process for Data Mining.
Clickstream Analysis.
Web Server Log Files.
Remote Host Field.
Date/Time Field.
HTTP Request Field.
Status Code Field.
Transfer Volume (Bytes) Field.
Common Log Format.
Identification Field.
Authuser Field.
Extended Common Log Format.
Referrer Field.
User Agent Field.
Example of a Web Log Record.
Microsoft IIS Log Format.
Auxiliary Information.
References.
Exercises.
7 PREPROCESSING FOR WEB USAGE MINING.
Need for Preprocessing the Data.
Data Cleaning and Filtering.
Page Extension Exploration and Filtering.
De-Spidering the Web Log File.
User Identification.
Session Identification.
Path Completion.
Directories and the Basket Transformation.
Further Data Preprocessing Steps.
References.
Exercises.
8 EXPLORATORY DATA ANALYSIS FOR WEB USAGE MINING.
Introduction.
Number of Visit Actions.
Session Duration.
Relationship between Visit Actions and Session Duration.
Average Time per Page.
Duration for Individual Pages.
References.
Exercises.
9 MODELING FOR WEB USAGE MINING: CLUSTERING, ASSOCIATION, AND CLASSIFICATION.
Introduction.
Modeling Methodology.
Definition of Clustering.
The BIRCH Clustering Algorithm.
Affinity Analysis and the A Priori Algorithm.
Discretizing the Numerical Variables: Binning.
Applying the A Priori Algorithm to the CCSU Web Log Data.
Classification and Regression Trees.
The C4.5 Algorithm.
References.
Exercises.
INDEX.

Reviews

"I can say I really enjoyed reading this book...a great educational resource for students and teachers." (Information Retrieval, 2008)
