NEW: The entire book is now available for free in PDF format.
The Preface and Table of Contents are reproduced below. See also the selected answers and the authors' errata page, and the Spanish edition. (See also Rosenthal's graduate-level probability book and his probability book for the general public.)
"The authors do an admirable job supplementing theoretical results with numerical examples and potential simulation studies. ... Ample homework problems are provided. ... The authors' organization is logical, with essential ideas from probability placed at the beginning followed by one-sample inference and then regression problems. The authors succeed in unifying a number of seemingly disparate ideas. ... The exposition is clear and uncluttered. ... In addition to an ambitious topic list and numerous examples throughout, the authors provide off-the-cuff remarks to help the reader assimilate information. ... This is a quality text."
-- Dave H. Annis, in The American Statistician 59(3), August 2005.
The text can be used with or without a statistical computer package. It is our opinion that students should see the importance of various computational techniques in applications, and the book attempts to show this. Accordingly, we feel that computational aspects of the subject, such as Monte Carlo, should be covered, even if a statistical package is not used. All of the computations in this text were carried out using Minitab. Minitab is a suitable computational platform to accompany the text, but others could be used. There is a Computations appendix that contains the Minitab code for those computations that are slightly involved (for example, if looping is required); these can be used by students as templates for their own calculations. If a software package like Minitab is used with the course, then students need no programming to do the problems.
We have organized the exercises in the book into groups, as an aid to users. Exercises are suitable for all students and are there to give practice in applying the concepts discussed in a particular section. Problems require greater understanding, and a student can expect to spend more thinking time on these. If a problem is marked (MV), then it will require some facility with multivariable calculus beyond the first calculus course, although these problems are not necessarily hard. Challenges are problems that most students will find difficult. The Challenges are only for students who have no difficulty with the Exercises and the Problems. There are also Computer Exercises and Computer Problems, where it is expected that students will make use of a statistical package in deriving solutions.
We have also included a number of Discussion Topics that are designed to promote critical thinking in students. Throughout the book we try to point students beyond the mastery of technicalities to think of the subject in a larger frame of reference. It is important that students acquire a sound mathematical foundation in the basic techniques of probability and statistics. We believe that this book will help students accomplish this. Ultimately, however, these subjects are applied in real-world contexts, so it is equally important that students understand how to go about their application and understand what issues arise. Often there are no right answers to Discussion Topics. Their purpose is to get a student thinking about the subject matter. If these were to be used for evaluation, then they would be answered in essay format and graded on the maturity the student showed with respect to the issues involved. Discussion Topics are probably most suitable for smaller classes, but there will also be benefit to students if these are simply read over and thought about.
Some sections of the book are labelled Advanced. This material is aimed at students who are more mathematically mature (for example, they are taking, or have taken, a second course in calculus). All of the Advanced material can be skipped, with no loss of continuity, by an instructor who wishes to do so. In particular, the final chapter of the text is labelled Advanced and would only be taught in a high-level introductory course aimed at specialists. Also, many proofs are put in a final section of each chapter, labelled Further Proofs (Advanced). An instructor can choose which (if any) of these proofs they wish to present to their students. As such, we feel that the material in the text is presented in a flexible way that allows the instructor to find an appropriate level for the students they are teaching. There is a Mathematical Background appendix that reviews some mathematical concepts that students may be rusty on from a first course in calculus, as well as brief introductions to partial derivatives, double integrals, etc.
Chapter 1 introduces the probability model and provides motivation for the study of probability. The basic properties of a probability measure are developed.
Chapter 2 deals with discrete, continuous, and joint distributions, and with the effects of a change of variable. The multivariate change of variable is developed in an Advanced section. The topic of simulating from a probability distribution is introduced in this chapter.
Chapter 3 introduces expectation. The probability-generating function is introduced as well as the moments and the moment-generating function of a random variable. This chapter develops some of the major inequalities used in probability. There is a section available on characteristic functions as an Advanced topic.
Chapter 4 deals with sampling distributions and limits. Convergence in probability, convergence with probability 1, the weak and strong laws of large numbers, convergence in distribution, and the central limit theorem are all introduced along with various applications such as Monte Carlo. The normal distribution theory, necessary for many statistical applications, is also dealt with here.
As mentioned, Chapters 1 through 4 include material on Monte Carlo techniques. Simulation is a key aspect of the application of probability theory, and it is our view that its teaching should be integrated with the theory right from the start. This reveals the power of probability to solve real-world problems and helps convince students that it is far more than just an interesting mathematical theory. No practitioner divorces himself from the theory when using the computer for computations or vice versa. We believe this is a more modern way of teaching the subject. This material can be skipped, however, if an instructor doesn't agree with this, or feels they do not have enough time to cover it effectively.
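As a rough illustration of the kind of Monte Carlo calculation referred to above, the following sketch approximates an expectation by simulation and uses the central limit theorem to attach a standard error. It is written in Python rather than the Minitab used in the text, and the function name `monte_carlo_mean`, the choice of E[X^2] for a standard normal X, and the sample size are all assumptions made for illustration, not taken from the book.

```python
import math
import random


def monte_carlo_mean(h, sampler, n, seed=0):
    """Approximate E[h(X)] from n simulated draws of X.

    Returns the sample mean and its estimated standard error,
    sqrt(s^2 / n), where s^2 is the sample variance of the h(X) values.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = [h(sampler(rng)) for _ in range(n)]
    mean = sum(values) / n
    sample_var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(sample_var / n)


# Hypothetical example: X ~ N(0, 1), h(x) = x^2, so the exact answer is 1.
estimate, std_err = monte_carlo_mean(
    h=lambda x: x * x,
    sampler=lambda rng: rng.gauss(0.0, 1.0),
    n=100_000,
)

# Approximate 95% interval justified by the central limit theorem.
lower, upper = estimate - 1.96 * std_err, estimate + 1.96 * std_err
print(f"estimate = {estimate:.4f}, 95% interval = ({lower:.4f}, {upper:.4f})")
```

The interval shrinks at rate 1/sqrt(n), so roughly four times as many simulated values are needed to halve the error.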
Chapter 5 is an introduction to statistical inference. For the most part this is concerned with laying the groundwork for the development of more formal methodology in later chapters. So practical issues -- such as proper data collection, presenting data via graphical techniques, and informal inference methods like descriptive statistics -- are discussed here.
Chapter 6 deals with many of the standard methods of inference for one-sample problems. The theoretical justification for these methods is developed primarily through the likelihood function, but the treatment is still fairly informal. Basic methods of inference, such as the standard error of an estimate, confidence intervals, and P-values, are introduced. There is also a section devoted to distribution-free (nonparametric) methods like the bootstrap.
Chapter 7 involves many of the same problems discussed in Chapter 6 but now from a Bayesian perspective. The point of view adopted here is not that Bayesian methods are better or, for that matter, worse than those of Chapter 6. Rather, we take the view that Bayesian methods arise naturally when the statistician adds another ingredient -- the prior -- to the model. The appropriateness of this, or the sampling model for the data, is resolved through the model-checking methods of Chapter 9. It is not our intention to have students adopt a particular philosophy. Rather, the text introduces students to a broad spectrum of statistical thinking.
Subsequent chapters deal with both frequentist and Bayesian approaches to the various problems discussed. The Bayesian material is in clearly labelled sections and can be skipped with no loss of continuity, if so desired. It has become apparent in recent years, however, that Bayesian methodology is widely used in applications. As such, we feel that it is important for students to be exposed to this, as well as to the frequentist approaches, early in their statistical education.
Chapter 8 deals with the traditional optimality justifications offered for some statistical inferences. In particular, some aspects of optimal unbiased estimation and the Neyman-Pearson theorem are discussed in this chapter. There is also a brief introduction to decision theory. This chapter is more formal and mathematical than Chapters 5, 6, and 7, and it can be skipped, with no loss of continuity, if an instructor wants to emphasize methods and applications.
Chapter 9 is on model checking. We placed model checking in a separate chapter to emphasize its importance in applications. In practice, model checking is the way statisticians justify the methods of inference they use. So this is a very important topic.
Chapter 10 is concerned with the statistical analysis of relationships among variables. This includes material on simple linear and multiple regression, ANOVA, the design of experiments, and contingency tables. The emphasis in this chapter is on applications.
Chapter 11 is concerned with stochastic processes. In particular, Markov chains and Markov chain Monte Carlo are covered in this chapter, as are Brownian motion and its relevance to finance. Fairly sophisticated topics are introduced, but the treatment is entirely elementary. Chapter 11 depends only on the material in Chapters 1 through 4.
A one-semester course on probability would cover Chapters 1-4 and perhaps some of Chapter 11. A one-semester, follow-up course on statistics would cover Chapters 5-7 and 9-10. Chapter 8 is not necessary, but some parts, such as the theory of unbiased estimation and optimal testing, are suitable for a more theoretical course.
A basic two-semester course in probability and statistics would cover Chapters 1-6 and 9-10. Such a course covers all the traditional topics, including basic probability theory, basic statistical inference concepts, and the usual introductory applied statistics topics. To cover the entire book would take three semesters, which could be organized in a variety of ways.
The Advanced sections can be skipped or included, depending on the level of the students, with no loss of continuity. A similar comment applies to Chapters 7, 8, and 11.
Students who have already taken an introductory noncalculus-based, applied statistics course will also benefit from a course based on this text. While similar topics are covered, they are presented with more depth and rigor here. For example, Introduction to the Practice of Statistics, Fourth Edition, by D. Moore and G. McCabe (W. H. Freeman, 2003) is an excellent text, and we feel that this book will serve as the basis for a good follow-up course.
Many thanks to the reviewers and class testers for their comments: Michelle Baillargeon (McMaster University), Lisa A. Bloomer (Middle Tennessee State University), Eugene Demidenko (Dartmouth College), Robert P. Dobrow (Carleton College), John Ferdinands (Calvin College), Soledad A. Fernandez (The Ohio State University), Dr. Paramjit Gill (Okanagan University College), Ellen Gundlach (Purdue University), Paul Gustafson (University of British Columbia), Jan Hannig (Colorado State University), Susan Herring (Sonoma State University), George F. Hilton, Ph.D., (Pacific Union College), Paul Joyce (University of Idaho), Hubert Lilliefors (George Washington University), Phil McDonnough (University of Toronto), Julia Morton (Nipissing University), Randall H. Rieger (West Chester University), Robert L. Schaefer (Miami University), Osnat Stramer (University of Iowa), Tim B. Swartz (Simon Fraser University), Glen Takahara (Queen's University), Robert D. Thompson (Hunter College), Dr. David C. Vaughan (Wilfrid Laurier University), Joseph J. Walker (Georgia State University), Dongfeng Wu (Mississippi State University), Yuehua Wu (York University), Nicholas Zaino (University of Rochester).
The authors would also like to thank many who have assisted in the development of this project. In particular, our colleagues and students at the University of Toronto have been very supportive. Hadas Moshonov, Aysha Hashim, and Natalia Cheredeko of the University of Toronto helped in many ways. A number of the data sets in Chapter 10 have been used in courses at the University of Toronto for many years and were, we believe, compiled through the work of the late Professor Daniel B. DeLury. Professor David Moore of Purdue University was of assistance in providing several of the tables at the back of the text. Patrick Farace, Chris Spavins, and Danielle Swearengin of W. H. Freeman provided much support and encouragement. Our families helped us with their patience and care while we worked at what seemed at times an unending task; many thanks to Rosemary and Heather Evans and Margaret Fulford.
Preface

1 Probability Models
1.1 Probability: A Measure of Uncertainty
1.1.1 Why Do We Need Probability Theory?
1.2 Probability Models
1.2.1 Venn Diagrams and Subsets
1.3 Properties of Probability Models
1.4 Uniform Probability on Finite Spaces
1.4.1 Combinatorial Principles
1.5 Conditional Probability and Independence
1.5.1 Conditional Probability
1.5.2 Independence of Events
1.6 Continuity of P
1.7 Further Proofs (Advanced)

2 Random Variables and Distributions
2.1 Random Variables
2.2 Distributions of Random Variables
2.3 Discrete Distributions
2.3.1 Important Discrete Distributions
2.4 Continuous Distributions
2.4.1 Important Absolutely Continuous Distributions
2.5 Cumulative Distribution Functions
2.5.1 Properties of Distribution Functions
2.5.2 Cdfs of Discrete Distributions
2.5.3 Cdfs of Absolutely Continuous Distributions
2.5.4 Mixture Distributions
2.5.5 Distributions Neither Discrete Nor Continuous
2.6 One-Dimensional Change of Variable
2.6.1 The Discrete Case
2.6.2 The Continuous Case
2.7 Joint Distributions
2.7.1 Joint Cumulative Distribution Functions
2.7.2 Marginal Distributions
2.7.3 Joint Probability Functions
2.7.4 Joint Density Functions
2.8 Conditioning and Independence
2.8.1 Conditioning on Discrete Random Variables
2.8.2 Conditioning on Continuous Random Variables
2.8.3 Independence of Random Variables
2.8.4 Order Statistics
2.9 Multidimensional Change of Variable
2.9.1 The Discrete Case
2.9.2 The Continuous Case (Advanced)
2.9.3 Convolution
2.10 Simulating Probability Distributions
2.10.1 Simulating Discrete Distributions
2.10.2 Simulating Continuous Distributions
2.11 Further Proofs (Advanced)

3 Expectation
3.1 The Discrete Case
3.2 The Absolutely Continuous Case
3.3 Variance, Covariance, and Correlation
3.4 Generating Functions
3.4.1 Characteristic Functions (Advanced)
3.5 Conditional Expectation
3.5.1 Discrete Case
3.5.2 Absolutely Continuous Case
3.5.3 Double Expectations
3.5.4 Conditional Variance (Advanced)
3.6 Inequalities
3.6.1 Jensen's Inequality (Advanced)
3.7 General Expectations (Advanced)
3.8 Further Proofs (Advanced)

4 Sampling Distributions and Limits
4.1 Sampling Distributions
4.2 Convergence in Probability
4.2.1 The Weak Law of Large Numbers
4.3 Convergence with Probability 1
4.3.1 The Strong Law of Large Numbers
4.4 Convergence in Distribution
4.4.1 The Central Limit Theorem
4.4.2 The Central Limit Theorem and Assessing Error
4.5 Monte Carlo Approximations
4.6 Normal Distribution Theory
4.6.1 The Chi-Squared Distribution
4.6.2 The t Distribution
4.6.3 The F Distribution
4.7 Further Proofs (Advanced)

5 Statistical Inference
5.1 Why Do We Need Statistics?
5.2 Inference Using a Probability Model
5.3 Statistical Models
5.4 Data Collection
5.4.1 Finite Populations
5.4.2 Simple Random Sampling
5.4.3 Histograms
5.4.4 Survey Sampling
5.5 Some Basic Inferences
5.5.1 Descriptive Statistics
5.5.2 Plotting Data
5.5.3 Types of Inference

6 Likelihood Inference
6.1 The Likelihood Function
6.1.1 Sufficient Statistics
6.2 Maximum Likelihood Estimation
6.2.1 The Multidimensional Case (Advanced)
6.3 Inferences Based on the MLE
6.3.1 Standard Errors and Bias
6.3.2 Confidence Intervals
6.3.3 Testing Hypotheses and P-Values
6.3.4 Sample Size Calculations: Confidence Intervals
6.3.5 Sample Size Calculations: Power
6.4 Distribution-Free Methods
6.4.1 Method of Moments
6.4.2 Bootstrapping
6.4.3 The Sign Statistic and Inferences about Quantiles
6.5 Large Sample Behavior of the MLE (Advanced)

7 Bayesian Inference
7.1 The Prior and Posterior Distributions
7.2 Inferences Based on the Posterior
7.2.1 Estimation
7.2.2 Credible Intervals
7.2.3 Hypothesis Testing and Bayes Factors
7.2.4 Prediction
7.3 Bayesian Computations
7.3.1 Asymptotic Normality of the Posterior
7.3.2 Sampling from the Posterior
7.3.3 Sampling from the Posterior Using Gibbs Sampling (Advanced)
7.4 Choosing Priors
7.5 Further Proofs (Advanced)

8 Optimal Inferences
8.1 Optimal Unbiased Estimation
8.1.1 The Cramer-Rao Inequality (Advanced)
8.2 Optimal Hypothesis Testing
8.2.1 Likelihood Ratio Tests (Advanced)
8.3 Optimal Bayesian Inferences
8.4 Decision Theory (Advanced)
8.5 Further Proofs (Advanced)

9 Model Checking
9.1 Checking the Sampling Model
9.1.1 Residual and Probability Plots
9.1.2 The Chi-Squared Goodness of Fit Test
9.1.3 Prediction and Cross-Validation
9.1.4 What Do We Do When a Model Fails?
9.2 Checking the Bayesian Model
9.3 The Problem with Multiple Checks

10 Relationships Among Variables
10.1 Related Variables
10.1.1 Cause-Effect Relationships and Experiments
10.1.2 Design of Experiments
10.2 Categorical Response and Predictors
10.2.1 Random Predictor
10.2.2 Deterministic Predictor
10.2.3 Bayesian Formulation
10.3 Quantitative Response and Predictors
10.3.1 The Method of Least Squares
10.3.2 The Simple Linear Regression Model
10.3.3 Bayesian Simple Linear Model (Advanced)
10.3.4 The Multiple Linear Regression Model (Advanced)
10.4 Quantitative Response and Categorical Predictors
10.4.1 One Categorical Predictor (One-Way ANOVA)
10.4.2 Repeated Measures (Paired Comparisons)
10.4.3 Two Categorical Predictors (Two-Way ANOVA)
10.4.4 Randomized Blocks
10.4.5 One Categorical and One Quantitative Predictor
10.5 Categorical Response and Quantitative Predictors
10.6 Further Proofs (Advanced)

11 Advanced Topic -- Stochastic Processes
11.1 Simple Random Walk
11.1.1 The Distribution of the Fortune
11.1.2 The Gambler's Ruin Problem
11.2 Markov Chains
11.2.1 Examples of Markov Chains
11.2.2 Computing with Markov Chains
11.2.3 Stationary Distributions
11.2.4 Markov Chain Limit Theorem
11.3 Markov Chain Monte Carlo
11.3.1 The Metropolis-Hastings Algorithm
11.3.2 The Gibbs Sampler
11.4 Martingales
11.4.1 Definition of a Martingale
11.4.2 Expected Values
11.4.3 Stopping Times
11.5 Brownian Motion
11.5.1 Faster and Faster Random Walks
11.5.2 Brownian Motion as a Limit
11.5.3 Diffusions and Stock Prices
11.6 Poisson Processes
11.7 Further Proofs

A Mathematical Background
A.1 Derivatives
A.2 Integrals
A.3 Infinite Series
A.4 Matrix Multiplication
A.5 Partial Derivatives
A.6 Multivariable Integrals
A.6.1 Nonrectangular Regions

B Computations

C Common Distributions
C.1 Discrete Distributions
C.2 Absolutely Continuous Distributions

D Tables
D.1 Random Numbers
D.2 Standard Normal Cdf
D.3 Chi-Squared Distribution Quantiles
D.4 t Distribution Quantiles
D.5 F Distribution Quantiles
D.6 Binomial Distribution Probabilities

Index