AP Statistics Premium Practice Exam

📖 Concept Review

Essential formulas and memory keys for each unit

Exploring One-Variable Data

Unit 1

Key Concepts

Describe distributions using SOCS: Shape, Outliers, Center, Spread. Identify skewness direction by the tail, not the peak.

Mean: x̄ = Σxᵢ / n
Median: middle value when sorted
IQR = Q3 − Q1
Outlier rule: < Q1 − 1.5·IQR or > Q3 + 1.5·IQR
Standard deviation: s = √[Σ(xᵢ−x̄)² / (n−1)]

SOCS for distributions · Skew → tail direction · IQR fence for outliers · Mean > Median → right skew

Worked Example

Q: Data set: {2, 4, 5, 7, 100}. Is 100 an outlier?
Q1=3, Q3=53.5, IQR=50.5 → Upper fence = 53.5+75.75 = 129.25

→ 100 < 129.25, so 100 is NOT an outlier by the 1.5×IQR rule.
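The fence calculation above can be checked with a short sketch. It assumes the common AP convention that Q1 and Q3 are the medians of the lower and upper halves, with the overall median excluded when n is odd; statistical software often uses other quartile conventions and may give different fences. The function names `quartiles` and `outliers` are illustrative, not part of any library.

```python
from statistics import median

def quartiles(data):
    """Q1 and Q3 as medians of the lower and upper halves of the sorted data."""
    xs = sorted(data)
    half = len(xs) // 2
    lower = xs[:half]
    upper = xs[half + (len(xs) % 2):]  # skip the middle value when n is odd
    return median(lower), median(upper)

def outliers(data):
    """Return values outside the 1.5*IQR fences, plus the fences themselves."""
    q1, q3 = quartiles(data)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    flagged = [x for x in data if x < low or x > high]
    return flagged, (low, high)

flagged, fences = outliers([2, 4, 5, 7, 100])
print(flagged, fences)  # no values flagged; upper fence is 129.25
```

With so few data points the IQR is inflated by the extreme value itself, which is why 100 escapes the rule here.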

Exploring Two-Variable Data

Unit 2

Key Concepts

Correlation r measures linear association (−1 to 1). The LSRL minimizes the sum of squared residuals. Residual = Observed − Predicted.

ŷ = a + bx, where b = r·(sᵧ/sₓ), a = ȳ − b·x̄
r² = coefficient of determination (% variation explained)
Residual = y − ŷ

r closer to ±1 = stronger · LSRL always passes through (x̄, ȳ) · r² = explained variation
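As a minimal sketch of the slope and intercept formulas, the code below fits the LSRL to a small made-up data set (the values in `xs` and `ys` are hypothetical) and confirms two facts from this unit: the residuals sum to zero, and the line passes through (x̄, ȳ).

```python
from math import sqrt

def lsrl(xs, ys):
    """Return intercept a, slope b, and correlation r using b = r*(sy/sx)."""
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    sx = sqrt(sum((x - x_bar) ** 2 for x in xs) / (n - 1))
    sy = sqrt(sum((y - y_bar) ** 2 for y in ys) / (n - 1))
    r = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / ((n - 1) * sx * sy)
    b = r * sy / sx
    a = y_bar - b * x_bar
    return a, b, r

# Hypothetical data, chosen only for easy arithmetic
xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]

a, b, r = lsrl(xs, ys)
residuals = [y - (a + b * x) for x, y in zip(xs, ys)]  # observed minus predicted

print(f"y-hat = {a:.2f} + {b:.2f}x, r^2 = {r**2:.2f}")
```

For this data the fitted line is ŷ = 2.20 + 0.60x with r² = 0.60, so 60% of the variation in y is explained by the linear relationship with x.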

Collecting Data

Unit 3

Key Concepts

SRS: every possible sample of size n is equally likely to be chosen. Stratified: divide into strata, then take a random sample from each. Cluster: randomly select clusters, then take a census within each. Well-designed experiments use comparison (control), random assignment, and replication to establish causation; blinding, where feasible, guards against bias.

Confounding variable ≠ lurking variable · Only experiments → causation · Blocking reduces variability

Probability & Random Variables

Unit 4

Key Concepts

Law of Large Numbers: long-run frequencies approach true probability. Expected value = Σ(x · P(x)). Variance of sum of independent RVs: σ²(X+Y) = σ²(X) + σ²(Y).

P(A ∪ B) = P(A) + P(B) − P(A ∩ B)
P(A | B) = P(A ∩ B) / P(B)
Independent: P(A ∩ B) = P(A)·P(B)
E(X) = μ = Σ[x · P(x)]
σ²(X) = Σ[(x−μ)² · P(x)]
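The expected value and variance formulas, and the rule for the variance of a sum of independent random variables, can be illustrated with a fair six-sided die. The die example and the helper name `mean_var` are assumptions made for this sketch.

```python
from itertools import product

def mean_var(dist):
    """Mean and variance of a discrete random variable given as {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    var = sum((x - mu) ** 2 * p for x, p in dist.items())
    return mu, var

# One fair die: each face has probability 1/6
die = {face: 1 / 6 for face in range(1, 7)}
mu, var = mean_var(die)

# Distribution of the sum of two independent dice, built from the joint probabilities
total = {}
for (x, px), (y, py) in product(die.items(), die.items()):
    total[x + y] = total.get(x + y, 0) + px * py
mu_sum, var_sum = mean_var(total)

print(mu, var)          # mean 3.5, variance 35/12
print(mu_sum, var_sum)  # mean 7, variance twice that of one die
```

Note that variances add for independent variables, but standard deviations do not.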

Sampling Distributions

Unit 5

Key Concepts

Central Limit Theorem: for large n, the sampling distribution of x̄ is approximately normal regardless of population shape. Standard error of x̄ = σ/√n.

x̄ ~ N(μ, σ/√n) when n ≥ 30 (CLT)
p̂ ~ N(p, √[p(1−p)/n]) when np ≥ 10, n(1−p) ≥ 10
SE(x̄) = σ/√n or s/√n

Larger n → smaller SE · CLT: n ≥ 30 rule of thumb · np ≥ 10 & n(1−p) ≥ 10 for p̂

Inference for Proportions

Unit 6

Key Concepts

Confidence intervals and hypothesis tests for one proportion and two proportions. Margin of error = z* · SE.

1-prop z-interval: p̂ ± z*·√[p̂(1−p̂)/n]
1-prop z-test: z = (p̂ − p₀) / √[p₀(1−p₀)/n]
2-prop z-test: z = (p̂₁−p̂₂) / √[p̂c(1−p̂c)(1/n₁+1/n₂)]
p̂c = (X₁+X₂)/(n₁+n₂) [pooled proportion]
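The proportion formulas above translate directly into code. This is a sketch using only Python's standard library; the function names and the sample counts in the usage lines are hypothetical, and z* is obtained from `statistics.NormalDist` rather than a table.

```python
from math import sqrt
from statistics import NormalDist

def one_prop_interval(x, n, conf=0.95):
    """1-prop z-interval: p-hat +/- z* * sqrt(p-hat(1 - p-hat)/n)."""
    p_hat = x / n
    z_star = NormalDist().inv_cdf(1 - (1 - conf) / 2)
    me = z_star * sqrt(p_hat * (1 - p_hat) / n)  # margin of error
    return p_hat - me, p_hat + me

def one_prop_z(x, n, p0):
    """1-prop z-test statistic; the SE uses the null value p0, not p-hat."""
    p_hat = x / n
    return (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

def two_prop_z(x1, n1, x2, n2):
    """2-prop z-test statistic using the pooled proportion."""
    p_c = (x1 + x2) / (n1 + n2)
    return (x1 / n1 - x2 / n2) / sqrt(p_c * (1 - p_c) * (1 / n1 + 1 / n2))

# Hypothetical counts
print(one_prop_interval(100, 200))
print(one_prop_z(60, 100, 0.5))
print(two_prop_z(60, 100, 40, 100))
```

The interval uses p̂ in the standard error, while the test uses p₀ (or the pooled p̂c), because a test computes the standard error under the assumption that H₀ is true.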

Inference for Means

Unit 7

Key Concepts

Use the t-distribution when σ is unknown (almost always). df = n−1 for 1-sample; for 2-sample, use technology (or the conservative choice: the smaller of n₁−1 and n₂−1).

t = (x̄ − μ₀) / (s/√n), df = n−1
CI: x̄ ± t*·(s/√n)
Paired t-test: d̄ = mean of differences, t = d̄ / (sᵈ/√n)

σ unknown → t-test · Paired = one-sample on differences · Larger df → closer to z
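To illustrate that a paired t-test is a one-sample t-test on the differences, the sketch below uses a small made-up set of before/after measurements (the data and the function name `one_sample_t` are hypothetical).

```python
from math import sqrt
from statistics import mean, stdev

def one_sample_t(values, mu0=0.0):
    """t = (x-bar - mu0) / (s / sqrt(n)), with df = n - 1."""
    n = len(values)
    t = (mean(values) - mu0) / (stdev(values) / sqrt(n))
    return t, n - 1

# Hypothetical paired measurements on the same four subjects
before = [140, 150, 135, 160]
after = [135, 142, 133, 150]

# Reduce the paired data to one list of differences (after - before)
diffs = [a - b for a, b in zip(after, before)]
d_bar, s_d = mean(diffs), stdev(diffs)

t, df = one_sample_t(diffs)  # H0: mean difference = 0
print(d_bar, s_d, round(t, 2), df)
```

Here d̄ = −6.25 and the standard deviation of the differences is 3.5, giving t ≈ −3.57 with df = 3.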

Chi-Square Tests & Regression Inference

Units 8–9

Key Concepts

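Chi-square GOF: test if observed counts match expected counts. Chi-square homogeneity: test if the distribution of a categorical variable is the same across populations. Chi-square independence: test if two categorical variables are associated within one population. Regression t-test: test if slope β ≠ 0.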

χ² = Σ[(Observed − Expected)² / Expected]
GOF: df = categories − 1
Two-way table: df = (rows−1)(cols−1)
Expected cell = (row total × col total) / grand total
Regression t: t = b / SEb, df = n−2
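The expected-count and χ² formulas for a two-way table can be sketched as follows. The 2×2 table of observed counts is hypothetical, and `chi_square_two_way` is an illustrative name rather than a library function.

```python
def chi_square_two_way(table):
    """Return the chi-square statistic, df, and expected counts for a two-way table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    # Expected cell = (row total * column total) / grand total
    expected = [[r * c / grand for c in col_totals] for r in row_totals]
    stat = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df, expected

# Hypothetical observed counts
observed = [[30, 20],
            [20, 30]]

stat, df, expected = chi_square_two_way(observed)
print(stat, df, expected)  # every expected count is 25, statistic is 4.0, df is 1
```

Checking that every entry of `expected` is large enough is a quick way to verify the conditions before trusting the test.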

✏️ Practice Exam

Unit 1 · Distributions

★★☆ Medium

A distribution of exam scores has a mean of 72 and a median of 68. Which of the following best describes the shape of the distribution?

✓ CORRECT ANSWER: D
When mean > median, the distribution is most likely right-skewed. High values in the right tail pull the mean upward above the median. The tail stretches toward higher values (to the right). Key memory: "Mean is pulled toward the tail."

Unit 1 · Outliers

★★☆ Medium

For the data set {14, 17, 19, 21, 23, 25, 62}, what is the interquartile range (IQR), and is 62 an outlier?

✓ CORRECT ANSWER: C
Sorted: {14, 17, 19, 21, 23, 25, 62}. Q1 = 17, Q3 = 25. IQR = 25 − 17 = 8.
Upper fence: 25 + 1.5(8) = 25 + 12 = 37. Since 62 > 37, 62 is an outlier.

Unit 2 · Regression

★★☆ Medium

A least-squares regression line is ŷ = 4.2 + 1.8x. The correlation coefficient is r = 0.87. What percent of the variation in y is not explained by the linear relationship with x?

✓ CORRECT ANSWER: B
r² = (0.87)² = 0.7569, so 75.69% of variation IS explained.
Variation NOT explained = 1 − 0.7569 = 0.2431 = 24.31%.
This unexplained portion is the residual variation.

Unit 2 · Residuals

★★★ Hard

A residual plot for a linear regression shows a clear curved (U-shaped) pattern. What does this indicate?

✓ CORRECT ANSWER: C
A residual plot should show random scatter around zero if the linear model is appropriate. A curved (U-shaped) pattern is systematic, meaning the linear model fails to capture the true relationship. A non-linear (e.g., quadratic) model should be considered.

Unit 3 · Sampling

★★☆ Medium

A school divides students into freshmen, sophomores, juniors, and seniors, then randomly selects 25 students from each group. This is an example of which sampling method?

✓ CORRECT ANSWER: D
Stratified sampling: divide the population into non-overlapping groups (strata) based on a characteristic (grade level), then take a random sample from each stratum. This differs from cluster sampling, where entire clusters are selected and all members within are surveyed.

Unit 3 · Experiments

★★★ Hard

Researchers want to test whether a new drug reduces blood pressure. Subjects are randomly assigned to receive either the drug or a placebo, and neither the subjects nor the evaluators know which treatment each subject received. This study design is best described as a:

✓ CORRECT ANSWER: E
Key features: (1) Random assignment to treatments → it's an experiment, not observational. (2) Neither subjects nor evaluators know the treatment → double-blind. (3) Includes a placebo → controlled. All three features together define a double-blind randomized controlled experiment.

Unit 4 · Probability

★★☆ Medium

Events A and B are independent. P(A) = 0.4 and P(B) = 0.5. What is P(A ∪ B)?

✓ CORRECT ANSWER: C
Independence → P(A ∩ B) = P(A)·P(B) = 0.4 × 0.5 = 0.20.
Addition rule: P(A ∪ B) = 0.4 + 0.5 − 0.20 = 0.70.

Unit 4 · Random Variables

★★★ Hard

Let X be a random variable with E(X) = 5 and SD(X) = 3. Let Y = 2X − 1. What are E(Y) and SD(Y)?

✓ CORRECT ANSWER: B
Linear transformation rules:
E(aX + b) = a·E(X) + b = 2(5) − 1 = 9
SD(aX + b) = |a|·SD(X) = 2 × 3 = 6
Constants shift the mean but do NOT affect spread; the multiplier scales both mean and SD.
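The transformation rules can be verified on a concrete distribution. The two-point distribution below is an assumption chosen only because it has E(X) = 5 and SD(X) = 3; the question itself does not specify the distribution of X.

```python
from math import sqrt

def mean_sd(dist):
    """Mean and standard deviation of a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in dist.items())
    sd = sqrt(sum((x - mu) ** 2 * p for x, p in dist.items()))
    return mu, sd

# Hypothetical X: values 2 and 8, each with probability 0.5 (mean 5, SD 3)
X = {2: 0.5, 8: 0.5}

# Y = 2X - 1 keeps the same probabilities but transforms each value
Y = {2 * x - 1: p for x, p in X.items()}

print(mean_sd(X))  # (5.0, 3.0)
print(mean_sd(Y))  # (9.0, 6.0)
```

Subtracting 1 moved the mean from 10 to 9 but left the SD at 6, matching the rule that adding a constant does not change spread.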

Unit 4 · Binomial

★★★ Hard

A fair coin is flipped 10 times. What are the mean and standard deviation of the number of heads?

✓ CORRECT ANSWER: A
Binomial: n = 10, p = 0.5.
μ = np = 10 × 0.5 = 5
σ = √(np(1−p)) = √(10 × 0.5 × 0.5) = √2.5 ≈ 1.581
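As a cross-check, the shortcut formulas μ = np and σ = √(np(1−p)) can be compared against the general definitions applied to the full binomial distribution. This is a sketch using only the standard library.

```python
from math import comb, sqrt

n, p = 10, 0.5

# Binomial probabilities P(X = k) for k = 0..n
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# General definitions: mu = sum of k*P(k), sigma^2 = sum of (k - mu)^2 * P(k)
mu = sum(k * pk for k, pk in enumerate(pmf))
sigma = sqrt(sum((k - mu) ** 2 * pk for k, pk in enumerate(pmf)))

print(mu, round(sigma, 3))                      # from the definitions
print(n * p, round(sqrt(n * p * (1 - p)), 3))   # from the shortcut formulas
```

Both lines print a mean of 5 and a standard deviation of about 1.581.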

Unit 5 · CLT

★★☆ Medium

A population has mean μ = 80 and standard deviation σ = 20. A random sample of n = 100 is taken. What is the probability that the sample mean x̄ is greater than 82?

✓ CORRECT ANSWER: B
SE = σ/√n = 20/√100 = 2.
z = (82 − 80)/2 = 1.0.
P(x̄ > 82) = P(Z > 1.0) = 1 − 0.8413 = 0.1587.
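The same calculation can be reproduced without a z-table. This sketch uses `statistics.NormalDist` from the Python standard library and assumes, as the solution does, that the sampling distribution of x̄ is approximately normal for n = 100.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 80, 20, 100

se = sigma / sqrt(n)             # standard error of the sample mean
z = (82 - mu) / se               # standardize the cutoff
p = 1 - NormalDist().cdf(z)      # upper-tail probability P(Z > z)

print(se, z, round(p, 4))        # 2.0 1.0 0.1587

# Equivalent: model x-bar directly as N(mu, se)
p_direct = 1 - NormalDist(mu, se).cdf(82)
```

Both approaches give the same tail probability; standardizing first is what the z-table method requires.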

Unit 5 · Sampling Distribution

★★☆ Medium

Which of the following best describes the effect of increasing sample size from n = 25 to n = 100 on the sampling distribution of x̄?

✓ CORRECT ANSWER: D
SE = σ/√n. When n increases from 25 to 100 (×4), √n increases from 5 to 10 (×2), so SE is divided by 2 (cut in half). The mean of the sampling distribution always equals the population mean μ — it does not change.

Unit 6 · Confidence Interval

★★☆ Medium

A 95% confidence interval for a population proportion is (0.42, 0.58). Which of the following is a correct interpretation?

✓ CORRECT ANSWER: C
The correct interpretation of a CI uses the phrase "We are X% confident that the true population parameter lies in this interval."
Common errors: (A) is wrong because the true proportion is fixed — it's not random; (B) is about sampling distribution, not the CI; (D) and (E) are fabricated claims.

Unit 6 · Hypothesis Testing

★★★ Hard

A researcher conducts a one-proportion z-test with H₀: p = 0.30 and Hₐ: p > 0.30, obtaining a p-value of 0.032. At α = 0.05, what is the correct conclusion?

✓ CORRECT ANSWER: B
p-value (0.032) < α (0.05) → Reject H₀. Conclusion: there is sufficient evidence at α = 0.05 to support the claim that p > 0.30. We NEVER "accept H₀" or "accept Hₐ" — we only reject or fail to reject H₀.

Unit 6 · Type I / II Error

★★★ Hard

A researcher sets α = 0.01 instead of α = 0.05 for a hypothesis test. Which of the following best describes the consequence?

✓ CORRECT ANSWER: E
α = P(Type I error). Lowering α from 0.05 to 0.01 decreases the probability of a Type I error. But it also makes H₀ harder to reject, so the probability of failing to reject a false H₀ (Type II error, β) increases. Power = 1 − β, so power decreases. For a fixed sample size, there is always a trade-off between Type I and Type II errors.

Unit 7 · t-Procedures

★★☆ Medium

A random sample of 16 students has a mean test score of 78 with a sample standard deviation of 8. Assuming the population is approximately normal, which test statistic should be used to test H₀: μ = 80?

✓ CORRECT ANSWER: C
Since σ is unknown (we only have s = 8), we use the t-distribution. t = (x̄ − μ₀)/(s/√n) = (78−80)/(8/4) = −2/2 = −1.00 with df = n−1 = 15.

Unit 7 · Paired t-Test

★★★ Hard

A researcher measures blood pressure of 12 patients before and after a treatment. The mean of the differences (after − before) is d̄ = −5.2 mmHg with standard deviation sᵈ = 3.8. What is the test statistic for H₀: μᵈ = 0?

✓ CORRECT ANSWER: D
Paired t-test: t = d̄ / (sᵈ/√n) = −5.2 / (3.8/√12) = −5.2 / (3.8/3.464) = −5.2/1.097 ≈ −4.74.
df = n − 1 = 12 − 1 = 11. Note: choices A and D appear identical, so both are marked correct; the paired t-statistic is t ≈ −4.74 with df = 11.

Unit 8 · Chi-Square

★★☆ Medium

A chi-square goodness-of-fit test is performed with 5 categories. The test statistic is χ² = 9.2. At α = 0.05, what is the correct decision? (Critical value: χ²₀.₀₅,₄ = 9.488)

✓ CORRECT ANSWER: B
df = 5 − 1 = 4. Critical value at α = 0.05 with df = 4 is 9.488. Since χ² = 9.2 < 9.488, we fail to reject H₀. There is not sufficient evidence at α = 0.05 that the distribution differs from expected. We never "accept H₀."

Unit 8 · Two-Way Tables

★★★ Hard

In a chi-square test for independence, the expected count for a cell is calculated as:

✓ CORRECT ANSWER: C
Expected cell count = (Row total × Column total) / Grand total. This formula assumes independence: if rows and columns were independent, the joint proportion would equal the product of marginal proportions. Note: all expected counts should be ≥ 5 for the chi-square approximation to be valid.

Unit 9 · Regression Inference

★★★ Hard

A regression analysis of n = 22 data points yields a slope of b = 3.14 with SEb = 1.05. Which of the following correctly tests H₀: β = 0 vs. Hₐ: β ≠ 0?

✓ CORRECT ANSWER: B
For regression inference: t = b / SEb = 3.14/1.05 ≈ 2.99, with df = n − 2 = 22 − 2 = 20. Use t-distribution (not z or χ²). At α = 0.05 two-tailed with df = 20, t* ≈ 2.086. Since 2.99 > 2.086, we reject H₀ and conclude the slope is significantly different from 0.

Unit 3 · Bias

★★☆ Medium

A radio station asks listeners to call in to vote on whether they support a new policy. Of 2,000 callers, 78% say yes. Why is this result likely biased?

✓ CORRECT ANSWER: D
This is a voluntary response sample. People who feel strongly (positively or negatively) are more likely to call in. This creates systematic bias — the sample is not representative of the broader population regardless of sample size. Increasing sample size does NOT fix voluntary response bias.

📋 Answer Key & Explanations

Complete solutions for all 20 questions

Official Answer Key

AP STATISTICS · PREMIUM PRACTICE EXAM · © 2024–2025