Main description:
As with any burgeoning technology that enjoys commercial attention, the use of data mining is surrounded by a great deal of hype. Exaggerated reports tell of secrets that can be uncovered by setting algorithms loose on oceans of data. But there is no magic in machine learning, no hidden power, no alchemy. Instead there is an identifiable body of practical techniques that can extract useful information from raw data. This book describes these techniques and shows how they work.
The book is a major revision of the first edition that appeared in 1999. While the basic core remains the same, it has been updated to reflect the changes that have taken place over five years, and now has nearly double the references. The highlights for the new edition include thirty new technique sections; an enhanced Weka machine learning workbench, which now features an interactive interface; comprehensive information on neural networks; a new section on Bayesian networks; plus much more.
- Algorithmic methods at the heart of successful data mining, including tried and true techniques as well as leading edge methods
- Performance improvement techniques that work by transforming the input or output
- Downloadable Weka, a collection of machine learning algorithms for data mining tasks, including tools for data pre-processing, classification, regression, clustering, association rules, and visualization, in a new, interactive interface
Review quote:
This book presents this new discipline in a very accessible form: both as a text to train the next generation of practitioners and researchers, and to inform lifelong learners like myself. Witten and Frank have a passion for simple and elegant solutions. They approach each topic with this mindset, grounding all concepts in concrete examples, and urging the reader to consider the simple techniques first, and then progress to the more sophisticated ones if the simple ones prove inadequate. If you have data that you want to analyze and understand, this book and the associated Weka toolkit are an excellent way to start.
From the foreword by Jim Gray, Microsoft Research
It covers cutting-edge data mining technology that forward-looking organizations use to successfully tackle problems that are complex, highly dimensional, chaotic, non-stationary (changing over time), or plagued by. The writing style is well-rounded and engaging without subjectivity, hyperbole, or ambiguity. I consider this book a classic already!
Dr. Tilmann Bruckhaus, StickyMinds.com
Table of contents:
Preface
1. What's it all about?
2. Input: Concepts, instances, attributes
3. Output: Knowledge representation
4. Algorithms: The basic methods
5. Credibility: Evaluating what's been learned
6. Implementations: Real machine learning schemes
7. Transformations: Engineering the input and output
8. Moving on: Extensions and applications
Part II: The Weka machine learning workbench
9. Introduction to Weka
10. The Explorer
11. The Knowledge Flow interface
12. The Experimenter
13. The command-line interface
14. Embedded machine learning
15. Writing new learning schemes
References
Index