This introduction to visualization techniques and statistical models for second language research focuses on three types of data (continuous, binary, and scalar), helping readers to understand regression models fully and to apply them in their work. Garcia offers advanced coverage of Bayesian analysis, simulated data, exercises, implementable script code, and practical guidance on the latest R software packages. The book also demonstrates the benefits of this type of statistical work to the L2 field and is a resource for graduate students and researchers in second language acquisition, applied linguistics, and corpus linguistics who are interested in quantitative data analysis.
Note: This item can only be shipped to a German delivery address.
Guilherme D. Garcia is Assistant Professor of Linguistics at Ball State University, USA.
Table of contents
Contents
List of figures
List of tables
List of code blocks
Acknowledgments
Preface
Part I Getting ready
1 Introduction
1.1 Main objectives of this book
1.2 A logical series of steps
1.2.1 Why focus on data visualization techniques?
1.2.2 Why focus on full-fledged statistical models?
1.3 Statistical concepts
1.3.1 p-values
1.3.2 Effect sizes
1.3.3 Confidence intervals
1.3.4 Standard errors
1.3.5 Further reading
2 R basics
2.1 Why R?
2.2 Fundamentals
2.2.1 Installing R and RStudio
2.2.2 Interface
2.2.3 R basics
2.3 Data frames
2.4 Reading your data
2.4.1 Is your data file ready?
2.4.2 R Projects
2.4.3 Importing your data
2.5 The tidyverse package
2.5.1 Wide-to-long transformation
2.5.2 Grouping, filtering, changing, and summarizing data
2.6 Figures
2.6.1 Using ggplot2
2.6.2 General guidelines for data visualization
2.7 Basic statistics in R
2.7.1 What's your research question?
2.7.2 t-tests and ANOVAs in R
2.7.3 A post-hoc test in R
2.8 More packages
2.9 Additional readings on R
2.10 Summary
2.11 Exercises
Part II Visualizing the data
3 Continuous data
3.1 Importing your data
3.2 Preparing your data
3.3 Histograms
3.4 Scatter plots
3.5 Box plots
3.6 Bar plots and error bars
3.7 Line plots
3.8 Additional readings on data visualization
3.9 Summary
3.10 Exercises
4 Categorical data
4.1 Binary data
4.2 Ordinal data
4.3 Summary
4.4 Exercises
5 Aesthetics: optimizing your figures
5.1 More on aesthetics
5.2 Exercises
Part III Analyzing the data
6 Linear regression
6.1 Introduction
6.2 Examples and interpretation
6.2.1 Does Hours affect scores?
6.2.2 Does Feedback affect scores?
6.2.3 Do Feedback and Hours affect scores?
6.2.4 Do Feedback and Hours interact?
6.3 Beyond the basics
6.3.1 Comparing models and plotting estimates
6.3.2 Scaling variables
6.4 Summary
6.5 Exercises
7 Logistic regression
7.1 Introduction
7.1.1 Defining the best curve in a logistic model
7.1.2 A family of models
7.2 Examples and interpretation
7.2.1 Can reaction time differentiate learners and native speakers?
7.2.2 Does Condition affect responses?
7.2.3 Do Proficiency and Condition affect responses?
7.2.4 Do Proficiency and Condition interact?
7.3 Summary
7.4 Exercises
8 Ordinal regression
8.1 Introduction
8.2 Examples and interpretation
8.2.1 Does Condition affect participants' certainty?
8.2.2 Do Condition and L1 interact?
8.3 Summary
8.4 Exercises
9 Hierarchical models
9.1 Introduction
9.2 Examples and interpretation
9.2.1 Random-intercept model
9.2.2 Random-slope and random-intercept model
9.3 Additional readings on regression models
9.4 Summary
9.5 Exercises
10 Going Bayesian
10.1 Introduction to Bayesian data analysis
10.1.1 Sampling from the posterior
10.2 The RData format
10.3 Getting ready
10.4 Bayesian models: linear and logistic examples
10.4.1 Bayesian model A: Feedback
10.4.2 Bayesian model B: Relative clauses with prior specifications
10.5 Additional readings on Bayesian inference
10.6 Summary
10.7 Exercises
11 Final remarks
Appendix A: Troubleshooting
Appendix B: RStudio shortcuts
Appendix C: Symbols and acronyms
Appendix D: Files used in this book
Appendix E: Contrast coding
Appendix F: Models and nested data
Glossary
References
Subject index
Function index
Reviews
Highly recommended as an accessible introduction to the use of R for analysis of second language data. Readers will come away with an understanding of why and how to use statistical models and data visualization techniques in their research.
Lydia White, McGill University, Canada.
Curious where the field's quantitative methods are headed? The answer is in your hands right now! Whether we knew it or not, this is the book that many of us have been waiting for. From scatter plots to standard errors and from beta values to Bayes' theorem, Garcia provides us with all the tools we need, both conceptual and practical, to statistically and visually model the complexities of L2 development.
Luke Plonsky, Northern Arizona University, USA.
This volume is a timely and must-have addition to any quantitative SLA researcher's data analysis arsenal, whether you are downloading R for the first time or a seasoned user ready to dive into Bayesian analysis. Guilherme Garcia's accessible, conversational writing style and uncanny ability to provide answers to questions right as you're about to ask them will give new users the confidence to make the move to R and will serve as an invaluable resource for students and instructors alike for years to come.
Jennifer Cabrelli, University of Illinois at Chicago, USA.