Variance Study Guide
📖 Core Concepts
Variance – expected squared deviation from the mean: \(\operatorname{Var}(X)=\operatorname{E}[(X-\mu)^2]\).
Standard deviation – positive square root of the variance; it keeps the data's original units.
Second central moment – variance is the second central moment of a distribution.
Population vs. Sample – population variance (\(\sigma^2\)) uses all observations; sample variance (\(s^2\)) estimates \(\sigma^2\) from a subset.
Unbiased estimator – using Bessel’s correction \((n-1)\) makes the expected value of \(s^2\) equal to \(\sigma^2\).
Law of Total Variance – \(\displaystyle\operatorname{Var}(X)=\operatorname{E}[\operatorname{Var}(X\mid Y)]+\operatorname{Var}(\operatorname{E}[X\mid Y])\).
Additivity (Bienaymé) – for uncorrelated (or independent) variables, variances add: \(\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)\).
Scaling & Translation – adding a constant leaves variance unchanged; multiplying by \(a\) scales variance by \(a^2\).
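The translation and scaling rules can be checked exactly on a small discrete distribution. A minimal Python sketch using a fair die (the distribution is purely illustrative):

```python
from fractions import Fraction

def var(pmf):
    """Variance of a discrete distribution given as {value: probability}."""
    mean = sum(x * p for x, p in pmf.items())
    return sum((x - mean) ** 2 * p for x, p in pmf.items())

# A fair six-sided die: mean 7/2, variance 35/12.
die = {x: Fraction(1, 6) for x in range(1, 7)}
v = var(die)

shifted = {x + 10: p for x, p in die.items()}  # X + 10
scaled = {3 * x: p for x, p in die.items()}    # 3X

assert var(shifted) == v       # translation leaves variance unchanged
assert var(scaled) == 9 * v    # scaling by a multiplies variance by a^2
print(v)  # prints 35/12
```

Using `Fraction` keeps the arithmetic exact, so the two rules hold as identities rather than up to rounding.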
---
📌 Must Remember
Variance formulas
\(\operatorname{Var}(X)=\operatorname{E}[X^2]-(\operatorname{E}[X])^2\).
Discrete: \(\displaystyle\operatorname{Var}(X)=\sum_{x}(x-\mu)^2\,p(x)\).
Continuous: \(\displaystyle\operatorname{Var}(X)=\int_{-\infty}^{\infty}(x-\mu)^2f(x)\,dx\).
Population variance \(\displaystyle\sigma^2=\frac{1}{N}\sum_{i=1}^{N}(x_i-\mu)^2\).
Unbiased sample variance \(\displaystyle s^2=\frac{1}{n-1}\sum_{i=1}^{n}(x_i-\bar{x})^2\).
Biased sample variance uses divisor \(n\); its expected value is \(\frac{n-1}{n}\sigma^2\), so it systematically underestimates \(\sigma^2\).
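The formulas above can be tied together on made-up data (the eight values below are illustrative) and cross-checked against Python's `statistics` module:

```python
import statistics

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(data)
mean = sum(data) / n  # 5.0

# Definition: average squared deviation (population form, divisor N).
pop_var = sum((x - mean) ** 2 for x in data) / n

# Shortcut: E[X^2] - (E[X])^2 gives the same value.
shortcut = sum(x * x for x in data) / n - mean ** 2

# Unbiased sample variance: divisor n - 1 (Bessel's correction).
samp_var = sum((x - mean) ** 2 for x in data) / (n - 1)

assert abs(pop_var - shortcut) < 1e-9
assert abs(pop_var - statistics.pvariance(data)) < 1e-12
assert abs(samp_var - statistics.variance(data)) < 1e-12
# Biased (divisor n) = ((n-1)/n) * unbiased, exactly.
assert abs(samp_var * (n - 1) / n - pop_var) < 1e-12
```

Here `pop_var` is 4 and `samp_var` is 32/7 ≈ 4.571, which makes the Bessel-correction gap visible for a small sample.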
Common distribution variances
Bernoulli(p): \(p(1-p)\)
Binomial(n,p): \(np(1-p)\)
Poisson(\(\lambda\)): \(\lambda\)
Exponential(λ): \(1/\lambda^{2}\).
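These closed forms can be verified directly from the pmf definitions. A sketch with arbitrarily chosen parameters (`n=10, p=0.3, lam=4` are assumptions for illustration, and the Poisson sum is truncated):

```python
from math import comb, exp

def binom_var(n, p):
    """Variance of Binomial(n, p) computed directly from its pmf."""
    pmf = [comb(n, k) * p**k * (1 - p)**(n - k) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

def poisson_var(lam, terms=60):
    """Variance of Poisson(lam); the infinite sum is truncated (fine for small lam)."""
    pmf = [exp(-lam)]
    for k in range(1, terms):
        pmf.append(pmf[-1] * lam / k)  # recurrence avoids huge factorials
    mean = sum(k * pk for k, pk in enumerate(pmf))
    return sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

assert abs(binom_var(10, 0.3) - 10 * 0.3 * 0.7) < 1e-9  # np(1-p) = 2.1
assert abs(poisson_var(4.0) - 4.0) < 1e-9               # variance equals lambda
```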
Variance of a linear combination
\[
\operatorname{Var}\!\Big(\sum_i a_i X_i\Big)=\sum_i a_i^{2}\operatorname{Var}(X_i)+2\!\sum_{i<j} a_i a_j\operatorname{Cov}(X_i,X_j).
\]
Matrix shortcut \(\operatorname{Var}(\mathbf{a}^{\mathsf{T}}\mathbf{X})=\mathbf{a}^{\mathsf{T}}\Sigma\mathbf{a}\).
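The matrix shortcut expands to a double sum over covariance entries. A minimal pure-Python sketch (the 2×2 covariance matrix and weights are illustrative):

```python
def var_linear_combo(a, cov):
    """Var(a^T X) = a^T Sigma a, expanded as sum_{i,j} a_i a_j Cov(X_i, X_j)."""
    return sum(a[i] * a[j] * cov[i][j]
               for i in range(len(a)) for j in range(len(a)))

# Two correlated variables: Var(X1)=4, Var(X2)=9, Cov(X1,X2)=3.
cov = [[4.0, 3.0],
       [3.0, 9.0]]
a = [2.0, -1.0]

# By hand: 2^2*4 + (-1)^2*9 + 2*(2)(-1)*3 = 16 + 9 - 12 = 13.
assert var_linear_combo(a, cov) == 13.0

# With zero covariance, the result reduces to Bienaymé additivity.
assert var_linear_combo([1.0, 1.0], [[4.0, 0.0], [0.0, 9.0]]) == 13.0
```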
Recursive update (add one observation to a sample of size \(n\))
\[
s_{\text{new}}^{2}= \frac{(n-1)s^{2}+ (x_{\text{new}}-\bar{x})(x_{\text{new}}-\bar{x}_{\text{new}})}{n},
\]
where \(\bar{x}\) is the mean before the update and \(\bar{x}_{\text{new}}\) the mean after it.
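One numerically stable way to maintain a running variance one observation at a time is Welford's online algorithm. A sketch (the data values are illustrative):

```python
import statistics

class RunningVariance:
    """Welford's online algorithm: update mean and variance per observation."""
    def __init__(self):
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0  # running sum of squared deviations from the mean

    def push(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)  # uses old AND updated mean

    def sample_variance(self):
        return self.m2 / (self.n - 1)  # unbiased, divisor n - 1

rv = RunningVariance()
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
for x in data:
    rv.push(x)

assert abs(rv.sample_variance() - statistics.variance(data)) < 1e-12
```

The `delta * (x - self.mean)` product mixes the old and updated means, which is what keeps the update exact and avoids the catastrophic cancellation of the naive `E[X^2] - (E[X])^2` approach.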
---
🔄 Key Processes
Compute population variance
Find population mean \(\mu\).
Subtract \(\mu\) from each observation, square, sum, divide by \(N\).
Compute unbiased sample variance
Find sample mean \(\bar{x}\).
Subtract \(\bar{x}\), square, sum.
Divide by \(n-1\).
Apply Law of Total Variance
Compute \(\operatorname{Var}(X\mid Y)\) for each conditioning level, take expectation.
Compute variance of the conditional means \(\operatorname{Var}(\operatorname{E}[X\mid Y])\).
Add the two pieces.
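The three steps can be carried out exactly on a small two-group mixture (the group values and weights below are illustrative):

```python
# Mixture: Y picks group "A" or "B" with equal probability,
# then X is uniform over that group's values.
groups = {
    "A": [1.0, 3.0],   # conditional mean 2, conditional variance 1
    "B": [6.0, 10.0],  # conditional mean 8, conditional variance 4
}
weights = {"A": 0.5, "B": 0.5}

def pvar(xs):
    m = sum(xs) / len(xs)
    return sum((x - m) ** 2 for x in xs) / len(xs)

cond_means = {g: sum(xs) / len(xs) for g, xs in groups.items()}
cond_vars = {g: pvar(xs) for g, xs in groups.items()}

# Step 1: E[Var(X|Y)] — weighted average of within-group variances.
expected_cond_var = sum(weights[g] * cond_vars[g] for g in groups)
# Step 2: Var(E[X|Y]) — spread of the group means around the overall mean.
overall_mean = sum(weights[g] * cond_means[g] for g in groups)
var_of_cond_means = sum(weights[g] * (cond_means[g] - overall_mean) ** 2
                        for g in groups)

# Direct computation over the full mixture distribution for comparison.
pmf = {x: weights[g] / len(xs) for g, xs in groups.items() for x in xs}
mu = sum(x * p for x, p in pmf.items())
total_var = sum((x - mu) ** 2 * p for x, p in pmf.items())

# Step 3: the two pieces add up to the total variance.
assert abs(total_var - (expected_cond_var + var_of_cond_means)) < 1e-12
```

Here the within-group piece is 2.5 and the between-group piece is 9, summing to the total variance 11.5.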
Variance of a weighted sum of uncorrelated variables
Square each weight, multiply by each variable’s variance, sum the results.
Update variance when a new data point arrives (recursive formula above).
---
🔍 Key Comparisons
Population variance vs. Sample variance
Denominator: \(N\) vs. \(n-1\).
Goal: exact dispersion vs. unbiased estimate of the true dispersion.
Biased vs. Unbiased sample variance
Divisor: \(n\) vs. \(n-1\).
Bias: biased version underestimates variance by \((n-1)/n\).
Independent vs. Uncorrelated
Independence ⇒ uncorrelated (covariance = 0).
Uncorrelated does not guarantee independence (except for jointly normal variables).
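The classic counterexample — \(X\) uniform on \(\{-1,0,1\}\) and \(Y=X^2\) — can be checked in a few lines:

```python
# X uniform on {-1, 0, 1}, Y = X^2: Cov(X, Y) = 0, yet Y is a
# deterministic function of X, so they are maximally dependent.
xs = [-1, 0, 1]
p = 1.0 / 3.0

ex = sum(x * p for x in xs)             # E[X]  = 0
ey = sum(x * x * p for x in xs)         # E[Y]  = E[X^2] = 2/3
exy = sum(x * (x * x) * p for x in xs)  # E[XY] = E[X^3] = 0

cov = exy - ex * ey
assert abs(cov) < 1e-12  # uncorrelated

# Not independent: P(Y=0 | X=0) = 1, but P(Y=0) = 1/3.
```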
Additivity vs. General linear combination
Additivity holds only when covariances are zero.
General formula includes covariance terms.
---
⚠️ Common Misunderstandings
“Variance is in the same units as the data.”
False – variance units are squared (e.g., m²). Use standard deviation for original units.
“Using \(n\) instead of \(n-1\) is just a rounding issue.”
Wrong – it introduces systematic downward bias, especially for small \(n\).
“Zero variance means the variable is always exactly zero.”
Zero variance means the variable is constant; the constant could be any value.
“Variances always add: \(\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)\).”
Only when \(\operatorname{Cov}(X,Y)=0\); in general \(\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)+2\operatorname{Cov}(X,Y)\). Being uncorrelated is sufficient (full independence is not required), but dependent variables usually have non-zero covariance, and then the cross term cannot be dropped.
---
🧠 Mental Models / Intuition
Spread as “energy” – Think of variance like kinetic energy: squaring amplifies larger deviations, so outliers dominate the value.
Bessel’s correction as “extra room” – When estimating from a sample, you lose one degree of freedom (the sample mean), so you must stretch the denominator to avoid “squeezing” the estimate.
Scaling rule – Multiplying data by \(a\) stretches the “space” by \(|a|\); area (variance) expands by \(a^2\).
---
🚩 Exceptions & Edge Cases
Infinite variance – Distributions without finite second moments (e.g., Cauchy, Pareto with \(\alpha\le2\)) have undefined or infinite variance.
Finite mean but infinite variance – Pareto with \(1<\alpha\le2\).
Non‑normal data & variance tests – F‑test and chi‑square assume normality; otherwise use Levene, Bartlett, or Brown–Forsythe.
Covariance matrix singularity – If variables are linearly dependent, \(\Sigma\) is singular and \(\det(\Sigma)=0\) (generalized variance collapses).
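A quick numeric illustration: if \(X_2 = 2X_1\), every entry of \(\Sigma\) is determined by \(\operatorname{Var}(X_1)\), and the determinant vanishes (the value `v = 3.0` is arbitrary):

```python
# Linear dependence X2 = 2*X1 with Var(X1) = v:
# Var(X2) = 4v and Cov(X1, X2) = 2v, so Sigma has rank 1.
v = 3.0
cov = [[v,     2 * v],
       [2 * v, 4 * v]]

det = cov[0][0] * cov[1][1] - cov[0][1] * cov[1][0]  # v*4v - 2v*2v
assert det == 0.0  # generalized variance collapses to zero
```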
---
📍 When to Use Which
Population variance formula – when you truly have the entire population (e.g., census data).
Unbiased sample variance – standard choice for estimating \(\sigma^2\) from a random sample.
Biased variance – rarely needed; useful only for certain maximum‑likelihood contexts where bias is acceptable.
Law of Total Variance – when variance is needed conditional on another variable (hierarchical models, mixture distributions).
Additivity (Bienaymé) – for sums of independent or uncorrelated variables (e.g., measurement error propagation).
Full linear‑combination formula – when variables are correlated; you must include covariance terms.
Matrix form – in multivariate problems or when using vector notation for compactness.
F‑test / chi‑square – compare variances only if normality holds.
Levene / Bartlett / Brown–Forsythe – when normality is questionable or sample sizes differ.
---
👀 Patterns to Recognize
“\(a^2\) factor” – anytime a constant multiplies a variable, look for variance multiplied by the square of that constant.
“Zero covariance” – if problem states “uncorrelated” or “independent,” drop the covariance terms in the linear‑combination formula.
“Sum of squares / degrees of freedom” – numerator always contains \(\sum (x_i-\bar{x})^2\); denominator tells you which variance (biased vs. unbiased).
“Variance of a sum vs. sum of variances” – check for independence/uncorrelatedness before simplifying.
---
🗂️ Exam Traps
Using \(n\) instead of \(n-1\) – many multiple‑choice options will present both; the correct unbiased answer uses \(n-1\).
Confusing units – a choice may list variance in original units; the correct answer should be in squared units.
Mistaking independence for zero covariance – some items say “uncorrelated” but not “independent”; that is still enough to drop the covariance term, since variance additivity requires only zero covariance, not full independence.
Applying the F‑test to non‑normal data – distractors may ignore the normality assumption; the correct response will mention the assumption or suggest Levene’s test.
Ignoring the scaling rule – when a problem multiplies data by a factor, answer choices that forget to square the factor are wrong.
Mix‑up between population mean (\(\mu\)) and sample mean (\(\bar{x}\)) – formulas using the wrong mean lead to incorrect variance calculations.
---