David M. Diez, Mine Çetinkaya-Rundel, Christopher D. Barr
Digital versions | Two PDF versions: full screen or tablet |
LaTeX source available | Yes |
Exercises | Yes |
Solutions | Odd numbered problems |
Solution Manual | Available to verified teachers |
License | Creative Commons |
- Fourth edition (May 2019)
- Black-and-white paperback version available from Amazon for $20
- Free desk copy on request to verified teachers
- The text has been used at Duke, Princeton, and dozens of other schools and colleges, and it is the text for the Coursera course taught by the second author
- Companion data sets available on website
- Labs based on freely available R and RStudio
- Short videos for about 75% of book sections
- For more information and to download
- Wholesale bookstore options
As the authors write in the preface, “Data is messy, and statistical tools are imperfect. But, when you understand the strengths and weaknesses of these tools, you can use them to learn about the real world.” This book is full of examples and exercises on topics of current interest pulled from the popular media and published research.
In addition to the exercises at the end of each section and chapter, a novel feature is the incorporation of in-chapter exercises, meant to be done immediately, with answers given in the footnotes.
Chapter Summaries
- Introduction to data. Data structures, variables, summaries, graphics, and basic data collection techniques.
- Summarizing data. Data summaries, graphics, and a teaser of inference using randomization.
- Probability. The basic principles of probability.
- Distributions of random variables. The normal model and other key distributions.
- Foundations for inference. General ideas for statistical inference in the context of estimating the population proportion.
- Inference for categorical data. Inference for proportions and tables using the normal and chi-square distributions.
- Inference for numerical data. Inference for one or two sample means using the t-distribution, statistical power for comparing two groups, and comparisons of many means using ANOVA.
- Introduction to linear regression. Regression for a numerical outcome with one predictor variable. Most of this chapter could be covered after Chapter 1.
- Multiple and logistic regression. Regression for numerical and categorical data using many predictors.