This book teaches readers to integrate data analysis techniques into humanities research practices using the R programming language. Methods for general-purpose visualization and analysis are introduced first, followed by domain-specific techniques for working with networks, text, geospatial data, temporal data, and images. The book is designed to be a bridge between quantitative and qualitative methods, individual and collaborative work, and the humanities and social sciences. The second edition is a significant revision, with almost every part of the text rewritten in some way. The most notable difference is the incorporation of new R packages such as ggplot2 and dplyr that center broad data-science concepts.
This 2nd edition of Humanities Data with R does not presuppose background programming experience. Early chapters take readers from R set-up to exploratory data analysis, with one chapter dedicated to each stage of the data-science pipeline (data collection, visualization, manipulation, and relational joins). Following this, text analysis, networks, temporal data, geospatial data, and image analysis each have a dedicated chapter. These are grounded in examples to move readers beyond the intimidation of adding new tools to their research. The final section of the book extends the core material with additional computer science techniques for processing large datasets.
Everything is hands-on: image analysis is explained using digitized photographs from the 1930s, and networks are applied to page links on Wikipedia. After working through these examples with the provided data, code and book website, readers are prepared to apply new methods to their own work. The open source R programming language, with its myriad packages and popularity within the sciences and social sciences, is particularly well-suited to working with humanities data. R packages are also highlighted in an appendix.
The methodology will have wide application in classrooms and self-study for the humanities, but also for use in linguistics, anthropology, and political science. Outside the classroom, this intersection of humanities and computing is particularly relevant for research and new modes of dissemination across archives, museums and libraries.
Arnold and Tilton are a brilliant team, and this highly accessible book will appeal to a wide range of digital humanists. The text analysis chapters are very good, and the authors' work to develop an R package for interacting with the Stanford CoreNLP Java library fills a huge hole in the R text processing landscape.
Matthew L. Jockers, University of Nebraska-Lincoln; author of Text Analysis with R for Students of Literature (Springer, 2014)
This is the first book that covers analysis of all main parts of humanities data: texts, images, geospatial data, and networks. Now digital humanities finally has its perfect textbook. This is the book many of us were awaiting for years. It teaches you R (the most widely used open source data analysis platform today worldwide) using many examples. The writing is very clear, and information is organized in a logical and easy to follow manner. Whether you are just considering working with humanities data or already have experience, this is the must read book.
Lev Manovich, The Graduate Center, City University of New York; author of The Language of New Media (MIT, 2001)
This book gives a concise yet broadly accessible introduction to R, through the lens of exploratory data analysis, coupled with well-planned forays into key humanities data types and their analysis -- including a nice primer on network analysis.
Eric D. Kolaczyk, Boston University; author of Statistical Analysis of Network Data with R (Springer, 2014)