This book provides an undergraduate introduction to analysing data for data science, computer science, and quantitative social science students. It uniquely combines a hands-on approach to data analysis, supported by numerous real data examples and reusable R code, with a rigorous treatment of probability and statistical principles.
Where contemporary undergraduate textbooks in probability theory or statistics often lack applications and an introductory treatment of modern methods (bootstrapping, Bayes, etc.), and where applied data analysis books often lack a rigorous theoretical treatment, this book combines the two viewpoints in an accessible but thorough introduction to data analysis using statistical methods. The book further focuses on methods for dealing with large data sets and streaming data, and hence provides a single-course introduction to statistical methods for data science.
"Having taught data analytics at the introductory graduate level, I welcome the authors' textbook as an essential resource for training well-grounded entry-level data scientists. ... A data scientist shall provide competent data science professional services to a client. ... Training in both the theory and practice of data analytics is a requirement for such competence. The authors' textbook definitely provides a valuable resource for such training." (Harry J. Foxwell, Computing Reviews, July 7, 2022)