This textbook considers statistical learning applications when interest centers on the conditional distribution of a response variable, given a set of predictors, and in the absence of a credible model that can be specified before the data analysis begins. Consistent with modern data analytics, it emphasizes that a proper statistical learning data analysis depends in an integrated fashion on sound data collection, intelligent data management, appropriate statistical procedures, and an accessible interpretation of results. The unifying theme is that supervised learning can properly be seen as a form of regression analysis. Key concepts and procedures are illustrated with a large number of real applications and their associated code in R, with an eye toward practical implications. The growing integration of computer science and statistics is well represented, including the occasional, but salient, tensions that result. Throughout, there are links to the big picture.
The third edition considers significant advances in recent years, among which are:
the development of overarching, conceptual frameworks for statistical learning;
the impact of "big data" on statistical learning;
the nature and consequences of post-model selection statistical inference;
deep learning in various forms;
the special challenges to statistical inference posed by statistical learning;
the fundamental connections between data collection and data analysis;
interdisciplinary ethical and political issues surrounding the application of algorithmic methods in a wide variety of fields, each linked to concerns about transparency, fairness, and accuracy.
This edition features new sections on accuracy, transparency, and fairness, as well as a new chapter on deep learning. Precursors to deep learning get an expanded treatment. The connections between fitting and forecasting are considered in greater depth. Discussion of the estimation targets for algorithmic methods is revised and expanded throughout to reflect the latest research. Resampling procedures are emphasized. The material is written for upper undergraduate and graduate students in the social, psychological and life sciences and for researchers who want to apply statistical learning procedures to scientific and policy problems.
"It could readily be a textbook for an applications-focused course at the graduate level as each chapter comes with exercises ... . Examples with accompanying code also appear throughout the chapters which provide a scaffold for getting started ... . Berk's pragmatic advice will serve a wide audience from practitioners to educators to students." (Sara Stoudt, MAA Reviews, December 12, 2021)
From the reviews:
"I believe that the practical utility of statistical learning over more traditional non- and parametric regression approaches has yet to be truly demonstrate but the procedures presented in this text do show considerable potential.... The mathematical prerequisites for using this book are minimal.... Some familiarity with using a computer is necessary in order to gain the most benefit for the text, and some previous experience of using a statistical software package would be advantageous." (C.M. O'Brien, International Statistical Review, 2009, 77, 1)
"The readers of this book will obtain the knowledge of the dialectic of the regression modeling problems arising in the study of predictor -response. A large percent of the contents is devoted to discuss how to understand phenomena through regression equation fitting. ... I recommend it for practitioners and professors have the responsibility of teaching on the subject, the book gives an interesting perspective for dealing with regression." (Sovandep.H. Kumar, Revista Investigación Operacional, Vol. 30 (2), 2009)
"On the positive side, SLRP is a nice addition to the data mining literature, more accessible than ESL. It gives good references and provides statistical detail. In general, I enjoyed the philosophical discussions about how statistical learning fits in with statistical inference. I may not hand it over to my colleagues in Biology and Sociology, but I will seriously consider recommending it to the undergraduates in my data mining seminar." (Richard D. DE VEAUX, The American Statistician, Novemeber 2009, Volume 63, Number 4, pp. 297-411)
"...The strength of this book is its extensive discussion of practical issues. Algorithmic details are a starting point for discussing why and how methods work, comparison with other methodologies, limitations and strengths, and so on. Throughout the book, examples are worked through in detail. Each chapter except the first and the last end with a section headed 'Software Considerations', followed by 'Summary and Conclusions' and data analysis exercises. ...Regression methods, both the theory and the practice, remain a work in progress... .Berk has made a good start in pulling together commentary on issues of major importance." (Journal of Statistical Software, Vol. 29, Book Review 12, February 2009)
"This book is unique in that statistical learning is discussed by a sociology-PhD scientist, Professor Richard Berk, who has extensive research accomplishments in the intersection of social science and statistics. ...The key strength of this book is in its emphasis on practical applications and hands-on learning of the statistical learning methods. Each chapter has real data examples ... and goes through their analyses using statistical software R (2009). This design effectively illustrates the use of the methods in practice. 'Software consideration' given at the end of each chapter provides discussions oncurrently available computational tools, both functions/packages of R and other software, and is useful in practice. Emphasis on using R that is freely available worldwide is a major advantage in terms of readers' accessibility to the methods. Furthermore, each chapter contains exercises for practicing different aspects of the methods in the chapter. The solutions and R codes of these exercises are provided at the author's website...: this is another useful feature enhancing the hands-on learning. ...A notable difference...is that this book is written with little mathematics. ...Consequently, emphasis is not to understand the statistical-learning methods mathematically. Rather, the methods are explained mostly algorithmically in English, providing readers story-like descriptions of them. This would appeal to readers who are users of the statistical-learning methods but are not mathematically oriented. ..." (Biometrics 65, 1309-1310, December 2009)
"The author covers a remarkable terrainin a relatively short book. Up-to-date methods are presented, and their main features are explained with a minimum of mathematical notation. ... The problems at the end of each chapter are a real jewel: they lead the reader to a clear understanding of the issues treated in the chapter ... . The book will no doubt be useful for the intended readership. Even the mathematically trained reader ... may find useful ideas in it." (Ricardo Maronna, Statistical Papers, Vol. 52, 2011)
"I believe that the practical utility of statistical learning over more traditional non- and parametric regression approaches has yet to be truly demonstrate but the procedures presented in this text do show considerable potential.... The mathematical prerequisites for using this book are minimal.... Some familiarity with using a computer is necessary in order to gain the most benefit for the text, and some previous experience of using a statistical software package would be advantageous." (C.M. O'Brien, International Statistical Review, 2009, 77, 1)
"The readers of this book will obtain the knowledge of the dialectic of the regression modeling problems arising in the study of predictor -response. A large percent of the contents is devoted to discuss how to understand phenomena through regression equation fitting. ... I recommend it for practitioners and professors have the responsibility of teaching on the subject, the book gives an interesting perspective for dealing with regression." (Sovandep.H. Kumar, Revista Investigación Operacional, Vol. 30 (2), 2009)
"On the positive side, SLRP is a nice addition to the data mining literature, more accessible than ESL. It gives good references and provides statistical detail. In general, I enjoyed the philosophical discussions about how statistical learning fits in with statistical inference. I may not hand it over to my colleagues in Biology and Sociology, but I will seriously consider recommending it to the undergraduates in my data mining seminar." (Richard D. DE VEAUX, The American Statistician, Novemeber 2009, Volume 63, Number 4, pp. 297-411)
"...The strength of this book is its extensive discussion of practical issues. Algorithmic details are a starting point for discussing why and how methods work, comparison with other methodologies, limitations and strengths, and so on. Throughout the book, examples are worked through in detail. Each chapter except the first and the last end with a section headed 'Software Considerations', followed by 'Summary and Conclusions' and data analysis exercises. ...Regression methods, both the theory and the practice, remain a work in progress... .Berk has made a good start in pulling together commentary on issues of major importance." (Journal of Statistical Software, Vol. 29, Book Review 12, February 2009)
"This book is unique in that statistical learning is discussed by a sociology-PhD scientist, Professor Richard Berk, who has extensive research accomplishments in the intersection of social science and statistics. ...The key strength of this book is in its emphasis on practical applications and hands-on learning of the statistical learning methods. Each chapter has real data examples ... and goes through their analyses using statistical software R (2009). This design effectively illustrates the use of the methods in practice. 'Software consideration' given at the end of each chapter provides discussions oncurrently available computational tools, both functions/packages of R and other software, and is useful in practice. Emphasis on using R that is freely available worldwide is a major advantage in terms of readers' accessibility to the methods. Furthermore, each chapter contains exercises for practicing different aspects of the methods in the chapter. The solutions and R codes of these exercises are provided at the author's website...: this is another useful feature enhancing the hands-on learning. ...A notable difference...is that this book is written with little mathematics. ...Consequently, emphasis is not to understand the statistical-learning methods mathematically. Rather, the methods are explained mostly algorithmically in English, providing readers story-like descriptions of them. This would appeal to readers who are users of the statistical-learning methods but are not mathematically oriented. ..." (Biometrics 65, 1309-1310, December 2009)
"The author covers a remarkable terrainin a relatively short book. Up-to-date methods are presented, and their main features are explained with a minimum of mathematical notation. ... The problems at the end of each chapter are a real jewel: they lead the reader to a clear understanding of the issues treated in the chapter ... . The book will no doubt be useful for the intended readership. Even the mathematically trained reader ... may find useful ideas in it." (Ricardo Maronna, Statistical Papers, Vol. 52, 2011)