- Format: ePub
Hands-On Machine Learning with R provides a practical, applied approach to learning and developing intuition for today's most popular machine learning methods. The book serves as a practitioner's guide to the machine learning process and is meant to help the reader apply the machine learning stack within R, using packages such as glmnet, h2o, ranger, xgboost, keras, and others to effectively model and gain insight from data. It favors a hands-on approach, providing an intuitive understanding of machine learning concepts through concrete…
- Devices: eReader
- No copy protection
- Size: 7.54 MB
Throughout this book, the reader is exposed to the entire machine learning process, including feature engineering, resampling, hyperparameter tuning, model evaluation, and interpretation, as well as to powerful algorithms such as regularized regression, random forests, gradient boosting machines, deep learning, generalized low rank models, and more. By favoring a hands-on approach and using real-world data, the reader will gain an intuitive understanding of the architectures and engines that drive these algorithms and packages, learn when and how to tune the various hyperparameters, and be able to interpret model results. By the end of this book, the reader should have a firm grasp of R's machine learning stack and be able to implement a systematic approach for producing high-quality modeling results.
Features:
· Offers a practical and applied introduction to the most popular machine learning methods.
· Topics covered include feature engineering, resampling, deep learning and more.
· Uses a hands-on approach and real-world data.
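The book's own examples are written in R (glmnet, ranger, xgboost, etc.), but the modeling process it describes — split, resample, tune, evaluate — is stack-agnostic. As a rough, illustrative sketch only (not an excerpt from the book), here is that process with scikit-learn in Python; the dataset and parameter grid are arbitrary choices for demonstration:

```python
# Illustrative sketch: the generic modeling workflow the book walks through.
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.linear_model import Ridge

X, y = load_diabetes(return_X_y=True)

# Data splitting (cf. book section 2.2)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Hyperparameter tuning via k-fold cross-validation (cf. sections 2.4-2.5)
search = GridSearchCV(Ridge(), {"alpha": [0.01, 0.1, 1.0, 10.0]}, cv=5)
search.fit(X_train, y_train)

# Model evaluation on held-out data (cf. section 2.6)
print(round(search.score(X_test, y_test), 3))
```

The same steps appear in R with, e.g., `rsample` for splitting and `glmnet`'s built-in cross-validation for tuning.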
For legal reasons, this download can only be delivered to billing addresses in A, B, BG, CY, CZ, D, DK, EW, E, FIN, F, GR, HR, H, IRL, I, LT, L, LR, M, NL, PL, P, R, S, SLO, SK.
- Product details
- Publisher: Taylor & Francis
- Number of pages: 488
- Publication date: 7 November 2019
- Language: English
- ISBN-13: 9781000730432
- Item no.: 58105287
Brandon Greenwell is a data scientist at 84.51° where he works on a diverse team to enable, empower, and encourage others to successfully apply machine learning to solve real business problems. He's part of the Adjunct Graduate Faculty at Wright State University, an Adjunct Instructor at the University of Cincinnati, and the author of several R packages available on CRAN.
1.1.1 Regression problems 1.1.2 Classification problems 1.2 Unsupervised learning 1.3 Roadmap 1.4 The data sets
2. Modeling Process: 2.1 Prerequisites 2.2 Data splitting 2.2.1 Simple random sampling 2.2.2 Stratified sampling 2.2.3 Class imbalances 2.3 Creating models in R 2.3.1 Many formula interfaces 2.3.2 Many engines 2.4 Resampling methods 2.4.1 k-fold cross validation 2.4.2 Bootstrapping 2.4.3 Alternatives 2.5 Bias-variance trade-off 2.5.1 Bias 2.5.2 Variance 2.5.3 Hyperparameter tuning 2.6 Model evaluation 2.6.1 Regression models 2.6.2 Classification models 2.7 Putting the processes together
3. Feature & Target Engineering: 3.1 Prerequisites 3.2 Target engineering 3.3 Dealing with missingness 3.3.1 Visualizing missing values 3.3.2 Imputation 3.4 Feature filtering 3.5 Numeric feature engineering 3.5.1 Skewness 3.5.2 Standardization 3.6 Categorical feature engineering 3.6.1 Lumping 3.6.2 One-hot & dummy encoding 3.6.3 Label encoding 3.6.4 Alternatives 3.7 Dimension reduction 3.8 Proper implementation 3.8.1 Sequential steps 3.8.2 Data leakage 3.8.3 Putting the process together
II SUPERVISED LEARNING
4. Linear Regression: 4.1 Prerequisites 4.2 Simple linear regression 4.2.1 Estimation 4.2.2 Inference 4.3 Multiple linear regression 4.4 Assessing model accuracy 4.5 Model concerns 4.6 Principal component regression 4.7 Partial least squares 4.8 Feature interpretation 4.9 Final thoughts
5. Logistic Regression: 5.1 Prerequisites 5.2 Why logistic regression 5.3 Simple logistic regression 5.4 Multiple logistic regression 5.5 Assessing model accuracy 5.6 Model concerns 5.7 Feature interpretation 5.8 Final thoughts
6. Regularized Regression: 6.1 Prerequisites 6.2 Why regularize? 6.2.1 Ridge penalty 6.2.2 Lasso penalty 6.2.3 Elastic nets 6.3 Implementation 6.4 Tuning 6.5 Feature interpretation 6.6 Attrition data 6.7 Final thoughts
7. Multivariate Adaptive Regression Splines: 7.1 Prerequisites 7.2 The basic idea 7.2.1 Multivariate regression splines 7.3 Fitting a basic MARS model 7.4 Tuning 7.5 Feature interpretation 7.6 Attrition data 7.7 Final thoughts
8. K-Nearest Neighbors: 8.1 Prerequisites 8.2 Measuring similarity 8.2.1 Distance measures 8.2.2 Pre-processing 8.3 Choosing k 8.4 MNIST example 8.5 Final thoughts
9. Decision Trees: 9.1 Prerequisites 9.2 Structure 9.3 Partitioning 9.4 How deep? 9.4.1 Early stopping 9.4.2 Pruning 9.5 Ames housing example 9.6 Feature interpretation 9.7 Final thoughts
10. Bagging: 10.1 Prerequisites 10.2 Why and when bagging works 10.3 Implementation 10.4 Easily parallelize 10.5 Feature interpretation 10.6 Final thoughts
11. Random Forests: 11.1 Prerequisites 11.2 Extending bagging 11.3 Out-of-the-box performance 11.4 Hyperparameters 11.4.1 Number of trees 11.4.2 mtry 11.4.3 Tree complexity 11.4.4 Sampling scheme 11.4.5 Split rule 11.5 Tuning strategies 11.6 Feature interpretation 11.7 Final thoughts
12. Gradient Boosting: 12.1 Prerequisites 12.2 How boosting works 12.2.1 A sequential ensemble approach 12.2.2 Gradient descent 12.3 Basic GBM 12.3.1 Hyperparameters 12.3.2 Implementation 12.3.3 General tuning strategy 12.4 Stochastic GBMs 12.4.1 Stochastic hyperparameters 12.4.2 Implementation 12.5 XGBoost 12.5.1 XGBoost hyperparameters 12.5.2 Tuning strategy 12.6 Feature interpretation 12.7 Final thoughts
13. Deep Learning: 13.1 Prerequisites 13.2 Why deep learning 13.3 Feedforward DNNs 13.4 Network architecture 13.4.1 Layers and nodes 13.4.2 Activation 13.5 Backpropagation 13.6 Model training 13.7 Model tuning 13.7.1 Model capacity 13.7.2 Batch normalization 13.7.3 Regularization 13.7.4 Adjust learning rate 13.8 Grid search 13.9 Final thoughts
14. Support Vector Machines: 14.1 Prerequisites 14.2 Optimal separating hyperplanes 14.2.1 The hard margin classifier 14.2.2 The soft margin classifier 14.3 The support vector machine 14.3.1 More than two classes 14.3.2 Support vector regression 14.4 Job attrition example 14.4.1 Class weights 14.4.2 Class probabilities 14.5 Feature interpretation 14.6 Final thoughts
15. Stacked Models: 15.1 Prerequisites 15.2 The idea 15.2.1 Common ensemble methods 15.2.2 Super learner algorithm 15.2.3 Available packages 15.3 Stacking existing models 15.4 Stacking a grid search 15.5 Automated machine learning 15.6 Final thoughts
16. Interpretable Machine Learning: 16.1 Prerequisites 16.2 The idea 16.2.1 Global interpretation 16.2.2 Local interpretation 16.2.3 Model-specific vs. model-agnostic 16.3 Permutation-based feature importance 16.3.1 Concept 16.3.2 Implementation 16.4 Partial dependence 16.4.1 Concept 16.4.2 Implementation 16.4.3 Alternative uses 16.5 Individual conditional expectation 16.5.1 Concept 16.5.2 Implementation 16.6 Feature interactions 16.6.1 Concept 16.6.2 Implementation 16.6.3 Alternatives 16.7 Local interpretable model-agnostic explanations 16.7.1 Concept 16.7.2 Implementation 16.7.3 Tuning 16.7.4 Alternative uses 16.8 Shapley values 16.8.1 Concept 16.8.2 Implementation 16.8.3 XGBoost and built-in Shapley values 16.9 Localized step-wise procedure 16.9.1 Concept 16.9.2 Implementation 16.10 Final thoughts
III DIMENSION REDUCTION
17. Principal Components Analysis: 17.1 Prerequisites 17.2 The idea 17.3 Finding principal components 17.4 Performing PCA in R 17.5 Selecting the number of principal components 17.5.1 Eigenvalue criterion 17.5.2 Proportion of variance explained criterion 17.5.3 Scree plot criterion 17.6 Final thoughts
18. Generalized Low Rank Models: 18.1 Prerequisites 18.2 The idea 18.3 Finding the lower ranks 18.3.1 Alternating minimization 18.3.2 Loss functions 18.3.3 Regularization 18.3.4 Selecting k 18.4 Fitting GLRMs in R 18.4.1 Basic GLRM model 18.4.2 Tuning to optimize for unseen data 18.5 Final thoughts
19. Autoencoders: 19.1 Prerequisites 19.2 Undercomplete autoencoders 19.2.1 Comparing PCA to an autoencoder 19.2.2 Stacked autoencoders 19.2.3 Visualizing the reconstruction 19.3 Sparse autoencoders 19.4 Denoising autoencoders 19.5 Anomaly detection 19.6 Final thoughts
IV CLUSTERING
20. K-means Clustering: 20.1 Prerequisites 20.2 Distance measures 20.3 Defining clusters 20.4 k-means algorithm 20.5 Clustering digits 20.6 How many clusters? 20.7 Clustering with mixed data 20.8 Alternative partitioning methods 20.9 Final thoughts
21. Hierarchical Clustering: 21.1 Prerequisites 21.2 Hierarchical clustering algorithms 21.3 Hierarchical clustering in R 21.3.1 Agglomerative hierarchical clustering 21.3.2 Divisive hierarchical clustering 21.4 Determining optimal clusters 21.5 Working with dendrograms 21.6 Final thoughts
22. Model-based Clustering: 22.1 Prerequisites 22.2 Measuring probability and uncertainty 22.3 Covariance types 22.4 Model selection 22.5 My basket example 22.6 Final thoughts
Bibliography
Index