The entire book is now available for free in PDF format (including errata).

The Preface and Table
of Contents are reproduced below.
(A **Spanish** edition was published by
Editorial Reverté in 2005.
See also Rosenthal's graduate-level
probability book and his probability book for the
general public.)

"The authors do an admirable job supplementing theoretical results with numerical examples and potential simulation studies. ... Ample homework problems are provided. ... The authors' organization is logical, with essential ideas from probability placed at the beginning followed by one-sample inference and then regression problems. The authors succeed in unifying a number of seemingly disparate ideas. ... The exposition is clear and uncluttered. ... In addition to an ambitious topic list and numerous examples throughout, the authors provide off-the-cuff remarks to help the reader assimilate information. ... This is a quality text." -- Dave H. Annis, in *The American Statistician* 59(3), August 2005.

The text can be used with or without a statistical computer package. It is
our opinion that students should see the importance of various computational
techniques in applications, and the book attempts to show this. Accordingly,
we feel that computational aspects of the subject, such as Monte Carlo,
should be covered, even if a statistical package is not used. All of the
computations in this text were carried out using Minitab. Minitab is a
suitable computational platform to accompany the text, but others could be
used. There is a **Computations** appendix that contains the Minitab
code for those computations that are slightly involved (for example, if
looping is required); these can be used by students as templates for their
own calculations. If a software package like Minitab is used with the
course, then students need not do any programming to solve the problems.

We have organized the exercises in the book into groups, as an aid to users.
**Exercises** are suitable for all students and are there to give
practice in applying the concepts discussed in a particular section.
**Problems** require greater understanding, and a student can expect to spend
more thinking time on these. If a problem is marked (MV), then it will
require some facility with multivariable calculus beyond the first calculus
course, although these problems are not necessarily hard. **Challenges**
are problems that most students will find difficult. The **Challenges**
are only for students who have no difficulty with the **Exercises** and
the **Problems**. There are also **Computer Exercises** and
**Computer Problems**, where it is expected that students will make use
of a statistical package in deriving solutions.

We have also included a number of **Discussion Topics** that are
designed to promote critical thinking in students. Throughout the book we
try to point students beyond the mastery of technicalities to think of the
subject in a larger frame of reference. It is important that students
acquire a sound mathematical foundation in the basic techniques of
probability and statistics. We believe that this book will help students
accomplish this. Ultimately, however, these subjects are applied in
real-world contexts, so it is equally important that students understand how
to go about their application and understand what issues arise. Often there
are no right answers to **Discussion Topics**. Their purpose is to get
a student thinking about the subject matter. If these were to be used for
evaluation, then they would be answered in essay format and graded on the
maturity the student showed with respect to the issues involved.
**Discussion Topics** are probably most suitable for smaller classes,
but students will also benefit from simply reading them over and thinking
about them.

Some sections of the book are labelled **Advanced**. This material is
aimed at students who are more mathematically mature (for example, they are
taking, or have taken, a second course in calculus). All of the
**Advanced** material can be skipped, with no loss of continuity, by an
instructor who wishes to do so. In particular, the final chapter of the text
is labelled **Advanced** and would only be taught in a high-level
introductory course aimed at specialists. Also, many proofs are put in a
final section of each chapter, labelled **Further Proofs (Advanced)**.
An instructor can choose which (if any) of these proofs they wish to present
to their students. As such, we feel that the material in the text is
presented in a flexible way that allows the instructor to find an
appropriate level for the students they are teaching. There is a
**Mathematical Background** appendix that reviews some mathematical
concepts from a first course in calculus that students may be rusty on
and gives brief introductions to partial derivatives, double integrals, etc.

**Chapter 1** introduces the probability model and provides motivation
for the study of probability. The basic properties of a probability measure
are developed.

**Chapter 2** deals with discrete, continuous, and joint distributions, and
the effects of a change of variable. The multivariate change of variable is
developed in an Advanced section. The topic of simulating from a probability
distribution is introduced in this chapter.

**Chapter 3** introduces expectation. The probability-generating
function is introduced as well as the moments and the moment-generating
function of a random variable. This chapter develops some of the major
inequalities used in probability. There is a section available on
characteristic functions as an Advanced topic.

**Chapter 4** deals with sampling distributions and limits. Convergence
in probability, convergence with probability 1, the weak and strong laws of
large numbers, convergence in distribution, and the central limit theorem
are all introduced along with various applications such as Monte Carlo. The
normal distribution theory, necessary for many statistical applications, is
also dealt with here.

As mentioned, Chapters 1 through 4 include material on Monte Carlo techniques. Simulation is a key aspect of the application of probability theory, and it is our view that its teaching should be integrated with the theory right from the start. This reveals the power of probability to solve real-world problems and helps convince students that it is far more than just an interesting mathematical theory. No practitioner divorces himself from the theory when using the computer for computations or vice versa. We believe this is a more modern way of teaching the subject. This material can be skipped, however, if an instructor doesn't agree with this, or feels they do not have enough time to cover it effectively.
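
As an illustration of the kind of Monte Carlo computation referred to here, the following is a minimal sketch in Python rather than Minitab; it is not taken from the book. The function name `monte_carlo_mean`, the example event, and the sample size are assumptions chosen for illustration. It approximates an expectation by averaging simulated values and uses the central limit theorem to attach a standard error to the estimate.

```python
import math
import random


def monte_carlo_mean(g, sampler, n, seed=0):
    """Approximate E[g(X)] by averaging g over n simulated draws of X.

    Returns the estimate together with its estimated standard error
    s / sqrt(n), where s is the sample standard deviation of the g values.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    values = [g(sampler(rng)) for _ in range(n)]
    mean = sum(values) / n
    var = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(var / n)


def in_quarter_disc(point):
    """Indicator of the event X^2 + Y^2 <= 1."""
    x, y = point
    return 1.0 if x * x + y * y <= 1.0 else 0.0


# Hypothetical example: X and Y independent Uniform[0, 1].
# The exact probability of the event is pi / 4.
estimate, std_error = monte_carlo_mean(
    in_quarter_disc,
    lambda rng: (rng.random(), rng.random()),
    n=100_000,
)
print(f"estimate = {estimate:.4f}, std. error = {std_error:.4f}")
print(f"exact    = {math.pi / 4:.4f}")
```

By the central limit theorem, the estimate should typically fall within about three standard errors of the exact value, which is how the accuracy of such an approximation can be assessed without knowing the answer in advance.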

**Chapter 5** is an introduction to statistical inference. For the most
part this is concerned with laying the groundwork for the development of
more formal methodology in later chapters. So practical issues -- such as
proper data collection, presenting data via graphical techniques, and
informal inference methods like descriptive statistics -- are discussed
here.

**Chapter 6** deals with many of the standard methods of inference for
one-sample problems. The theoretical justification for these methods is
developed primarily through the likelihood function, but the treatment is
still fairly informal. Basic methods of inference, such as the standard
error of an estimate, confidence intervals, and P-values, are introduced.
There is also a section devoted to distribution-free (nonparametric) methods
like the bootstrap.
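
To give a flavour of the bootstrap mentioned above, here is a minimal sketch in Python, not taken from the book (whose computations use Minitab). The function name `bootstrap_std_error`, the number of resamples, and the data values are all assumptions made up for illustration. It estimates the standard error of a statistic by resampling the observed data with replacement.

```python
import random
import statistics


def bootstrap_std_error(data, statistic, n_resamples=2000, seed=0):
    """Estimate the standard error of `statistic` by the bootstrap.

    Each resample draws len(data) values from `data` with replacement;
    the standard deviation of the recomputed statistic across resamples
    serves as the standard error estimate.
    """
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    n = len(data)
    replicates = [
        statistic(rng.choices(data, k=n)) for _ in range(n_resamples)
    ]
    return statistics.stdev(replicates)


# Made-up sample, used only to exercise the function.
sample = [2.1, 3.4, 1.9, 5.6, 4.2, 3.8, 2.7, 4.9, 3.1, 3.6]
se_median = bootstrap_std_error(sample, statistics.median)
print(f"sample median = {statistics.median(sample):.2f}")
print(f"bootstrap standard error of the median = {se_median:.3f}")
```

The same function works for any statistic that maps a list of numbers to a number, which is the appeal of the method: no formula for the sampling distribution of the statistic is needed.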

**Chapter 7** involves many of the same problems discussed in Chapter 6
but now from a Bayesian perspective. The point of view adopted here is not
that Bayesian methods are better or, for that matter, worse than those of
Chapter 6. Rather, we take the view that Bayesian methods arise naturally
when the statistician adds another ingredient -- the prior -- to the
model. The appropriateness of the prior, or of the sampling model for the data, is
resolved through the model-checking methods of Chapter 9. It is not our
intention to have students adopt a particular philosophy. Rather, the text
introduces students to a broad spectrum of statistical thinking.

Subsequent chapters deal with both frequentist and Bayesian approaches to the various problems discussed. The Bayesian material is in clearly labelled sections and can be skipped with no loss of continuity, if so desired. It has become apparent in recent years, however, that Bayesian methodology is widely used in applications. As such, we feel that it is important for students to be exposed to this, as well as to the frequentist approaches, early in their statistical education.

**Chapter 8** deals with the traditional optimality justifications
offered for some statistical inferences. In particular, some aspects of
optimal unbiased estimation and the Neyman-Pearson theorem are discussed in
this chapter. There is also a brief introduction to decision theory. This
chapter is more formal and mathematical than Chapters 5, 6, and 7, and it
can be skipped, with no loss of continuity, if an instructor wants to
emphasize methods and applications.

**Chapter 9** is on model checking. We placed model checking in a
separate chapter to emphasize its importance in applications. In practice,
model checking is the way statisticians justify the methods of inference
they use. So this is a very important topic.

**Chapter 10** is concerned with the statistical analysis of
relationships among variables. This includes material on simple linear and
multiple regression, ANOVA, the design of experiments, and contingency
tables. The emphasis in this chapter is on applications.

**Chapter 11** is concerned with stochastic processes. In particular,
Markov chains and Markov chain Monte Carlo are covered in this chapter, as
are Brownian motion and its relevance to finance. Fairly sophisticated
topics are introduced, but the treatment is entirely elementary. Chapter 11
depends only on the material in Chapters 1 through 4.

A one-semester course on probability would cover Chapters 1-4 and perhaps some of Chapter 11. A one-semester, follow-up course on statistics would cover Chapters 5-7 and 9-10. Chapter 8 is not necessary, but some parts, such as the theory of unbiased estimation and optimal testing, are suitable for a more theoretical course.

A basic two-semester course in probability and statistics would cover Chapters 1-6 and 9-10. Such a course covers all the traditional topics, including basic probability theory, basic statistical inference concepts, and the usual introductory applied statistics topics. To cover the entire book would take three semesters, which could be organized in a variety of ways.

The Advanced sections can be skipped or included, depending on the level of the students, with no loss of continuity. A similar comment applies to Chapters 7, 8, and 11.

Students who have already taken an introductory noncalculus-based, applied
statistics course will also benefit from a course based on this text. While
similar topics are covered, they are presented with more depth and rigor
here. For example, *Introduction to the Practice of Statistics*,
Fourth Edition, by D. Moore and G. McCabe (W. H. Freeman, 2003) is an
excellent text, and we feel that this book will serve as the basis for a
good follow-up course.

Many thanks to the reviewers and class testers for their comments: Michelle Baillargeon (McMaster University), Lisa A. Bloomer (Middle Tennessee State University), Eugene Demidenko (Dartmouth College), Robert P. Dobrow (Carleton College), John Ferdinands (Calvin College), Soledad A. Fernandez (The Ohio State University), Dr. Paramjit Gill (Okanagan University College), Ellen Gundlach (Purdue University), Paul Gustafson (University of British Columbia), Jan Hannig (Colorado State University), Susan Herring (Sonoma State University), George F. Hilton, Ph.D. (Pacific Union College), Paul Joyce (University of Idaho), Hubert Lilliefors (George Washington University), Phil McDonnough (University of Toronto), Julia Morton (Nipissing University), Randall H. Rieger (West Chester University), Robert L. Schaefer (Miami University), Osnat Stramer (University of Iowa), Tim B. Swartz (Simon Fraser University), Glen Takahara (Queen's University), Robert D. Thompson (Hunter College), Dr. David C. Vaughan (Wilfrid Laurier University), Joseph J. Walker (Georgia State University), Dongfeng Wu (Mississippi State University), Yuehua Wu (York University), Nicholas Zaino (University of Rochester).

The authors would also like to thank the many people who have assisted in the development of this project. In particular, our colleagues and students at the University of Toronto have been very supportive. Hadas Moshonov, Aysha Hashim, and Natalia Cheredeko of the University of Toronto helped in many ways. A number of the data sets in Chapter 10 have been used in courses at the University of Toronto for many years and were, we believe, compiled through the work of the late Professor Daniel B. DeLury. Professor David Moore of Purdue University was of assistance in providing several of the tables at the back of the text. Patrick Farace, Chris Spavins, and Danielle Swearengin of W. H. Freeman provided much support and encouragement. Our families helped us with their patience and care while we worked at what seemed at times an unending task; many thanks to Rosemary and Heather Evans and Margaret Fulford.

- Preface
- 1 Probability Models
  - 1.1 Probability: A Measure of Uncertainty
    - 1.1.1 Why Do We Need Probability Theory?
  - 1.2 Probability Models
    - 1.2.1 Venn Diagrams and Subsets
  - 1.3 Properties of Probability Models
  - 1.4 Uniform Probability on Finite Spaces
    - 1.4.1 Combinatorial Principles
  - 1.5 Conditional Probability and Independence
    - 1.5.1 Conditional Probability
    - 1.5.2 Independence of Events
  - 1.6 Continuity of P
  - 1.7 Further Proofs (Advanced)
- 2 Random Variables and Distributions
  - 2.1 Random Variables
  - 2.2 Distributions of Random Variables
  - 2.3 Discrete Distributions
    - 2.3.1 Important Discrete Distributions
  - 2.4 Continuous Distributions
    - 2.4.1 Important Absolutely Continuous Distributions
  - 2.5 Cumulative Distribution Functions
    - 2.5.1 Properties of Distribution Functions
    - 2.5.2 Cdfs of Discrete Distributions
    - 2.5.3 Cdfs of Absolutely Continuous Distributions
    - 2.5.4 Mixture Distributions
    - 2.5.5 Distributions Neither Discrete Nor Continuous
  - 2.6 One-Dimensional Change of Variable
    - 2.6.1 The Discrete Case
    - 2.6.2 The Continuous Case
  - 2.7 Joint Distributions
    - 2.7.1 Joint Cumulative Distribution Functions
    - 2.7.2 Marginal Distributions
    - 2.7.3 Joint Probability Functions
    - 2.7.4 Joint Density Functions
  - 2.8 Conditioning and Independence
    - 2.8.1 Conditioning on Discrete Random Variables
    - 2.8.2 Conditioning on Continuous Random Variables
    - 2.8.3 Independence of Random Variables
    - 2.8.4 Order Statistics
  - 2.9 Multidimensional Change of Variable
    - 2.9.1 The Discrete Case
    - 2.9.2 The Continuous Case (Advanced)
    - 2.9.3 Convolution
  - 2.10 Simulating Probability Distributions
    - 2.10.1 Simulating Discrete Distributions
    - 2.10.2 Simulating Continuous Distributions
  - 2.11 Further Proofs (Advanced)
- 3 Expectation
  - 3.1 The Discrete Case
  - 3.2 The Absolutely Continuous Case
  - 3.3 Variance, Covariance, and Correlation
  - 3.4 Generating Functions
    - 3.4.1 Characteristic Functions (Advanced)
  - 3.5 Conditional Expectation
    - 3.5.1 Discrete Case
    - 3.5.2 Absolutely Continuous Case
    - 3.5.3 Double Expectations
    - 3.5.4 Conditional Variance (Advanced)
  - 3.6 Inequalities
    - 3.6.1 Jensen's Inequality (Advanced)
  - 3.7 General Expectations (Advanced)
  - 3.8 Further Proofs (Advanced)
- 4 Sampling Distributions and Limits
  - 4.1 Sampling Distributions
  - 4.2 Convergence in Probability
    - 4.2.1 The Weak Law of Large Numbers
  - 4.3 Convergence with Probability 1
    - 4.3.1 The Strong Law of Large Numbers
  - 4.4 Convergence in Distribution
    - 4.4.1 The Central Limit Theorem
    - 4.4.2 The Central Limit Theorem and Assessing Error
  - 4.5 Monte Carlo Approximations
  - 4.6 Normal Distribution Theory
    - 4.6.1 The Chi-Squared Distribution
    - 4.6.2 The t Distribution
    - 4.6.3 The F Distribution
  - 4.7 Further Proofs (Advanced)
- 5 Statistical Inference
  - 5.1 Why Do We Need Statistics?
  - 5.2 Inference Using a Probability Model
  - 5.3 Statistical Models
  - 5.4 Data Collection
    - 5.4.1 Finite Populations
    - 5.4.2 Simple Random Sampling
    - 5.4.3 Histograms
    - 5.4.4 Survey Sampling
  - 5.5 Some Basic Inferences
    - 5.5.1 Descriptive Statistics
    - 5.5.2 Plotting Data
    - 5.5.3 Types of Inference
- 6 Likelihood Inference
  - 6.1 The Likelihood Function
    - 6.1.1 Sufficient Statistics
  - 6.2 Maximum Likelihood Estimation
    - 6.2.1 The Multidimensional Case (Advanced)
  - 6.3 Inferences Based on the MLE
    - 6.3.1 Standard Errors and Bias
    - 6.3.2 Confidence Intervals
    - 6.3.3 Testing Hypotheses and P-Values
    - 6.3.4 Sample Size Calculations: Confidence Intervals
    - 6.3.5 Sample Size Calculations: Power
  - 6.4 Distribution-Free Methods
    - 6.4.1 Method of Moments
    - 6.4.2 Bootstrapping
    - 6.4.3 The Sign Statistic and Inferences about Quantiles
  - 6.5 Large Sample Behavior of the MLE (Advanced)
- 7 Bayesian Inference
  - 7.1 The Prior and Posterior Distributions
  - 7.2 Inferences Based on the Posterior
    - 7.2.1 Estimation
    - 7.2.2 Credible Intervals
    - 7.2.3 Hypothesis Testing and Bayes Factors
    - 7.2.4 Prediction
  - 7.3 Bayesian Computations
    - 7.3.1 Asymptotic Normality of the Posterior
    - 7.3.2 Sampling from the Posterior
    - 7.3.3 Sampling from the Posterior Using Gibbs Sampling (Advanced)
  - 7.4 Choosing Priors
  - 7.5 Further Proofs (Advanced)
- 8 Optimal Inferences
  - 8.1 Optimal Unbiased Estimation
    - 8.1.1 The Cramer-Rao Inequality (Advanced)
  - 8.2 Optimal Hypothesis Testing
    - 8.2.1 Likelihood Ratio Tests (Advanced)
  - 8.3 Optimal Bayesian Inferences
  - 8.4 Decision Theory (Advanced)
  - 8.5 Further Proofs (Advanced)
- 9 Model Checking
  - 9.1 Checking the Sampling Model
    - 9.1.1 Residual and Probability Plots
    - 9.1.2 The Chi-Squared Goodness of Fit Test
    - 9.1.3 Prediction and Cross-Validation
    - 9.1.4 What Do We Do When a Model Fails?
  - 9.2 Checking the Bayesian Model
  - 9.3 The Problem with Multiple Checks
- 10 Relationships Among Variables
  - 10.1 Related Variables
    - 10.1.1 Cause-Effect Relationships and Experiments
    - 10.1.2 Design of Experiments
  - 10.2 Categorical Response and Predictors
    - 10.2.1 Random Predictor
    - 10.2.2 Deterministic Predictor
    - 10.2.3 Bayesian Formulation
  - 10.3 Quantitative Response and Predictors
    - 10.3.1 The Method of Least Squares
    - 10.3.2 The Simple Linear Regression Model
    - 10.3.3 Bayesian Simple Linear Model (Advanced)
    - 10.3.4 The Multiple Linear Regression Model (Advanced)
  - 10.4 Quantitative Response and Categorical Predictors
    - 10.4.1 One Categorical Predictor (One-Way ANOVA)
    - 10.4.2 Repeated Measures (Paired Comparisons)
    - 10.4.3 Two Categorical Predictors (Two-Way ANOVA)
    - 10.4.4 Randomized Blocks
    - 10.4.5 One Categorical and One Quantitative Predictor
  - 10.5 Categorical Response and Quantitative Predictors
  - 10.6 Further Proofs (Advanced)
- 11 Advanced Topic -- Stochastic Processes
  - 11.1 Simple Random Walk
    - 11.1.1 The Distribution of the Fortune
    - 11.1.2 The Gambler's Ruin Problem
  - 11.2 Markov Chains
    - 11.2.1 Examples of Markov Chains
    - 11.2.2 Computing with Markov Chains
    - 11.2.3 Stationary Distributions
    - 11.2.4 Markov Chain Limit Theorem
  - 11.3 Markov Chain Monte Carlo
    - 11.3.1 The Metropolis-Hastings Algorithm
    - 11.3.2 The Gibbs Sampler
  - 11.4 Martingales
    - 11.4.1 Definition of a Martingale
    - 11.4.2 Expected Values
    - 11.4.3 Stopping Times
  - 11.5 Brownian Motion
    - 11.5.1 Faster and Faster Random Walks
    - 11.5.2 Brownian Motion as a Limit
    - 11.5.3 Diffusions and Stock Prices
  - 11.6 Poisson Processes
  - 11.7 Further Proofs
- A Mathematical Background
  - A.1 Derivatives
  - A.2 Integrals
  - A.3 Infinite Series
  - A.4 Matrix Multiplication
  - A.5 Partial Derivatives
  - A.6 Multivariable Integrals
    - A.6.1 Nonrectangular Regions
- B Computations
- C Common Distributions
  - C.1 Discrete Distributions
  - C.2 Absolutely Continuous Distributions
- D Tables
  - D.1 Random Numbers
  - D.2 Standard Normal Cdf
  - D.3 Chi-Squared Distribution Quantiles
  - D.4 t Distribution Quantiles
  - D.5 F Distribution Quantiles
  - D.6 Binomial Distribution Probabilities
- Index