A new, full-color, completely updated edition of the key practical guide to chemometrics.

This new edition of the practical guide to chemometrics emphasizes the principles and applications behind the main ideas in the field using numerical and graphical examples, which can then be applied to a wide variety of problems in chemistry, biology, chemical engineering, and allied disciplines. Presented in full color, it features expanded sections on principal component analysis, classification, multivariate evolutionary signals, and statistical distributions, and new case studies in metabolomics, as well as extensive updates throughout. Aimed at the large number of users of chemometrics, it includes extensive worked problems and chapters explaining how to analyze datasets, in addition to updated descriptions of how to apply Excel and Matlab for chemometrics.

Chemometrics: Data Driven Extraction for Science, Second Edition offers chapters covering experimental design, signal processing, pattern recognition, calibration, and evolutionary data. The pattern recognition chapter from the first edition is divided into two separate chapters: Principal Component Analysis/Cluster Analysis, and Classification. The book also includes new descriptions of Alternating Least Squares (ALS) and Iterative Target Transformation Factor Analysis (ITTFA), along with updated descriptions of wavelets and Bayesian methods.

* Includes updated chapters on the classic chemometric methods (e.g. experimental design, signal processing)
* Introduces metabolomics-type examples alongside those from analytical chemistry
* Features problems at the end of each chapter to illustrate the broad applicability of the methods in different fields
* Supplemented with data sets and solutions to the problems on a dedicated website

Chemometrics: Data Driven Extraction for Science, Second Edition is recommended for post-graduate students of chemometrics as well as applied scientists (e.g. chemists, biochemists, engineers, statisticians) working in all areas of data analysis.
RICHARD G. BRERETON is Director of Brereton Consultancy and Emeritus Professor at the University of Bristol, UK. He is a Fellow of the Royal Society of Chemistry, the Royal Statistical Society and the Royal Society of Medicine. He has applied chemometrics in a wide variety of areas including pharmaceuticals, materials, metabolomics, heritage studies and forensics, and has published over 400 articles and written or edited eight books.
Table of Contents
Preface to Second Edition
Preface to First Edition
Acknowledgements
About the Companion Website

1 Introduction
  1.1 Historical Parentage
    1.1.1 Applied Statistics
    1.1.2 Statistics in Analytical and Physical Chemistry
    1.1.3 Scientific Computing
  1.2 Developments since the 1970s
  1.3 Software and Calculations
  1.4 Further Reading
    1.4.1 General
    1.4.2 Specific Areas
  References

2 Experimental Design
  2.1 Introduction
  2.2 Basic Principles
    2.2.1 Degrees of Freedom
    2.2.2 Analysis of Variance
    2.2.3 Design Matrices and Modelling
    2.2.4 Assessment of Significance
    2.2.5 Leverage and Confidence in Models
  2.3 Factorial Designs
    2.3.1 Full Factorial Designs
    2.3.2 Fractional Factorial Designs
    2.3.3 Plackett-Burman and Taguchi Designs
    2.3.4 Partial Factorials at Several Levels: Calibration Designs
  2.4 Central Composite or Response Surface Designs
    2.4.1 Setting up the Design
    2.4.2 Degrees of Freedom
    2.4.3 Axial Points
    2.4.4 Modelling
    2.4.5 Statistical Factors
  2.5 Mixture Designs
    2.5.1 Mixture Space
    2.5.2 Simplex Centroid
    2.5.3 Simplex Lattice
    2.5.4 Constraints
    2.5.5 Process Variables
  2.6 Simplex Optimisation
    2.6.1 Fixed Sized Simplex
    2.6.2 Elaborations
    2.6.3 Modified Simplex
    2.6.4 Limitations
  Problems

3 Signal Processing
  3.1 Introduction
    3.1.1 Environmental and Geological Processes
    3.1.2 Industrial Process Control
    3.1.3 Chromatograms and Spectra
    3.1.4 Fourier Transforms
    3.1.5 Advanced Methods
  3.2 Basics
    3.2.1 Peak Shapes
    3.2.2 Digitisation
    3.2.3 Noise
    3.2.4 Cyclicity
  3.3 Linear Filters
    3.3.1 Smoothing Functions
    3.3.2 Derivatives
    3.3.3 Convolution
  3.4 Correlograms and Time Series Analysis
    3.4.1 Auto-correlograms
    3.4.2 Cross-correlograms
    3.4.3 Multivariate Correlograms
  3.5 Fourier Transform Techniques
    3.5.1 Fourier Transforms
    3.5.2 Fourier Filters
    3.5.3 Convolution Theorem
  3.6 Additional Methods
    3.6.1 Kalman Filters
    3.6.2 Wavelet Transforms
    3.6.3 Bayes' Theorem
    3.6.4 Maximum Entropy
  Problems

4 Principal Component Analysis and Unsupervised Pattern Recognition
  4.1 Introduction
    4.1.1 Exploratory Data Analysis
    4.1.2 Cluster Analysis
  4.2 The Concept and Need for Principal Components Analysis
    4.2.1 History
    4.2.2 Multivariate Data Matrices
    4.2.3 Case Studies
    4.2.4 Aims of PCA
  4.3 Principal Components Analysis: The Method
    4.3.1 Scores and Loadings
    4.3.2 Rank and Eigenvalues
  4.4 Factor Analysis
  4.5 Graphical Representation of Scores and Loadings
    4.5.1 Scores Plots
    4.5.2 Loadings Plots
  4.6 Pre-processing
    4.6.1 Transforming Individual Elements of a Matrix
    4.6.2 Row Scaling
    4.6.3 Mean Centring
    4.6.4 Standardisation
    4.6.5 Further Methods
  4.7 Comparing Multivariate Patterns
    4.7.1 Biplots
    4.7.2 Procrustes Analysis
  4.8 Unsupervised Pattern Recognition: Cluster Analysis
    4.8.1 Similarity
    4.8.2 Linkage
    4.8.3 Next Steps
    4.8.4 Dendrograms
  4.9 Multi-way Pattern Recognition
    4.9.1 Tucker3 Models
    4.9.2 Parallel Factor Analysis (PARAFAC)
    4.9.3 Unfolding
  Problems

5 Classification and Supervised Pattern Recognition
  5.1 Introduction
    5.1.1 Background
    5.1.2 Case Study
  5.2 Two-Class Classifiers
    5.2.1 Distance-Based Methods
    5.2.2 Partial Least-Squares Discriminant Analysis
    5.2.3 K Nearest Neighbours
  5.3 One-Class Classifiers
    5.3.1 Quadratic Discriminant Analysis
    5.3.2 Disjoint PCA and SIMCA
  5.4 Multi-Class Classifiers
  5.5 Optimisation and Validation
    5.5.1 Validation
    5.5.2 Optimisation
  5.6 Significant Variables
    5.6.1 Partial Least-Squares Discriminant Loadings and Weights
    5.6.2 Univariate Statistical Indicators
    5.6.3 Variable Selection for SIMCA
  Problems

6 Calibration
  6.1 Introduction
    6.1.1 History, Usage and Terminology
    6.1.2 Case Study
  6.2 Univariate Calibration
    6.2.1 Classical Calibration
    6.2.2 Inverse Calibration
    6.2.3 Intercept and Centring
  6.3 Multiple Linear Regression
    6.3.1 Multi-detector Advantage
    6.3.2 Multi-wavelength Equations
    6.3.3 Multivariate Approaches
  6.4 Principal Components Regression
    6.4.1 Regression
    6.4.2 Quality of Prediction
  6.5 Partial Least Squares Regression
    6.5.1 PLS1
    6.5.2 PLS2
    6.5.3 Multi-way PLS
  6.6 Model Validation and Optimisation
    6.6.1 Auto-prediction
    6.6.2 Cross-validation
    6.6.3 Independent Test Sets
  Problems

7 Evolutionary Multivariate Signals
  7.1 Introduction
  7.2 Exploratory Data Analysis and Pre-processing
    7.2.1 Baseline Correction
    7.2.2 Principal Component-Based Plots
    7.2.3 Scaling the Data after PCA
    7.2.4 Scaling the Data before PCA
    7.2.5 Variable Selection
  7.3 Determining Composition
    7.3.1 Composition
    7.3.2 Univariate Methods
    7.3.3 Correlation- and Similarity-Based Methods
    7.3.4 Eigenvalue-Based Methods
    7.3.5 Derivatives
  7.4 Resolution
    7.4.1 Selectivity for All Components
    7.4.2 Partial Selectivity
    7.4.3 Incorporating Constraints: ITTFA, ALS and MCR
  Problems

A Appendix
  A.1 Vectors and Matrices
    A.1.1 Notation and Definitions
    A.1.2 Matrix and Vector Operations
  A.2 Algorithms
    A.2.1 Principal Components Analysis
    A.2.2 PLS1
    A.2.3 PLS2
    A.2.4 Tri-Linear PLS1
  A.3 Basic Statistical Concepts
    A.3.1 Descriptive Statistics
    A.3.2 Normal Distribution
    A.3.3 χ²-Distribution
    A.3.4 t-Distribution
    A.3.5 F-Distribution
  A.4 Excel for Chemometrics
    A.4.1 Names and Addresses
    A.4.2 Equations and Functions
    A.4.3 Add-Ins
    A.4.4 Charts
    A.4.5 Downloadable Macros
  A.5 Matlab for Chemometrics
    A.5.1 Getting Started
    A.5.2 File Types
    A.5.3 Matrices
    A.5.4 Importing and Exporting Data
    A.5.5 Introduction to Programming and Structure
    A.5.6 Graphics

Answers to the Multiple Choice Questions
Index