Variance - Core Concepts and Computation
Understand the definition, computation methods, and key properties of variance, including population vs. sample formulas and how it behaves under scaling and addition.
Summary
Understanding Variance
What is Variance?
Variance measures how spread out a set of values is from their average. Imagine measuring heights in a classroom: if all students are roughly the same height, the variance is small. If heights range from very short to very tall students, the variance is large.
More formally, variance is the expected value of the squared deviation of a random variable from its mean. We square the deviations to ensure that distances above and below the mean are treated equally (and to eliminate the problem of deviations canceling each other out).
The standard deviation is simply the positive square root of variance. We use standard deviation in practice because it's in the same units as the original data, while variance is in squared units.
Picture two distributions with the same mean (100) but different standard deviations: a narrow one (SD = 10) clusters tightly around the mean and has low variance, while a broad one (SD = 50) spreads out much more and has high variance.
The Mathematical Definition
For a random variable $X$ with mean $\mu = \operatorname{E}[X]$, variance is defined as:
$$\operatorname{Var}(X) = \operatorname{E}\big[(X - \mu)^2\big]$$
This formula captures our intuition: we measure how far each value is from the mean (the deviation $X - \mu$), square it, and then take the average of those squared deviations.
However, in practice, we often use an alternative formula that's easier for computation:
$$\operatorname{Var}(X) = \operatorname{E}[X^2] - (\operatorname{E}[X])^2$$
This version says that variance equals the expected value of the square minus the square of the expected value. You can verify these are equivalent algebraically, and the second form is often faster to calculate.
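To see the equivalence on a concrete example, here is a short Python sketch (a fair six-sided die, chosen purely for illustration) that computes the variance both ways:

```python
# Verify Var(X) = E[(X - mu)^2] = E[X^2] - (E[X])^2 for a fair die.
# Each face 1..6 occurs with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
p = 1 / 6

mean = sum(x * p for x in faces)                           # E[X] = 3.5
var_definition = sum((x - mean) ** 2 * p for x in faces)   # E[(X - mu)^2]
var_shortcut = sum(x ** 2 * p for x in faces) - mean ** 2  # E[X^2] - (E[X])^2

print(var_definition, var_shortcut)  # both equal 35/12 ~ 2.9167
```

Both expressions give 35/12, confirming the algebraic identity numerically.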
Population Variance vs Sample Variance: A Critical Distinction
This is where many students get confused, so pay close attention.
Population Variance
When you have the entire population of values available, the population variance is:
$$\sigma^2 = \frac{1}{N}\sum_{i=1}^{N}(x_i - \mu)^2$$
where $N$ is the number of observations and $\mu$ is the population mean. You simply divide by $N$—the total count.
Sample Variance: Why Do We Divide by n−1?
In reality, we usually work with a sample drawn from a larger population. If we naively computed variance using the same formula (dividing by $n$), we would systematically underestimate the true population variance. This is because the sample mean $\bar{x}$ is, by construction, the value that minimizes the sum of squared deviations for that particular sample, so deviations measured from $\bar{x}$ are on average smaller than deviations measured from the true population mean $\mu$.
To fix this problem, statisticians use Bessel's correction: divide by $n-1$ instead of $n$. This gives the unbiased sample variance:
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$$
where $\bar{x}$ is the sample mean and $n$ is the sample size.
Why $n-1$ specifically? Because the sample mean itself was calculated from the data, we've "used up" one degree of freedom. We have $n$ observations, but only $n-1$ independent deviations (the last one is determined by the constraint that all deviations must sum to zero). Dividing by $n-1$ corrects the bias introduced by using the sample mean instead of the population mean.
The biased version (dividing by $n$) would underestimate the population variance by the factor $(n-1)/n$—a small correction that becomes negligible as $n$ grows, but crucial for small samples.
In practice: Always use $s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2$ when estimating variance from sample data. Most software defaults to this.
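A minimal Python sketch of the difference, with made-up sample values for illustration (NumPy exposes the same choice through the `ddof` argument of `np.var`: `ddof=0` is biased, `ddof=1` applies Bessel's correction):

```python
# Compare the biased (divide by n) and unbiased (divide by n-1)
# sample variance on a small sample.
sample = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
n = len(sample)
xbar = sum(sample) / n                     # sample mean = 5.0

ss = sum((x - xbar) ** 2 for x in sample)  # sum of squared deviations
var_biased = ss / n                        # underestimates sigma^2
var_unbiased = ss / (n - 1)                # Bessel's correction

# The biased estimate is exactly (n-1)/n times the unbiased one.
print(var_biased, var_unbiased, var_biased / var_unbiased)
```

For these eight values the biased estimate is 4.0, the unbiased one is 32/7, and their ratio is exactly $(n-1)/n = 7/8$.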
Key Properties of Variance
Non-negativity and Zero Variance
Variance is always non-negative, since it is an average of squared terms. It equals zero exactly when the random variable is constant: if $\operatorname{Var}(X) = 0$, then $X$ is almost surely equal to a single value.
Units of Measurement
If your data is measured in meters, variance is in square meters. If your data is in dollars, variance is in dollars squared. This is why standard deviation (the square root) is often more interpretable than variance in practice.
How Variance Transforms
Understanding how variance changes when you transform your data is essential:
Adding a constant does nothing: $$\operatorname{Var}(X + c) = \operatorname{Var}(X)$$
Shifting all values by a constant amount doesn't change how spread out they are.
Multiplying by a constant scales variance by the square of that constant: $$\operatorname{Var}(aX) = a^2\operatorname{Var}(X)$$
If you double all your values, variance increases by a factor of 4 (because $2^2 = 4$). This makes sense: every deviation from the mean doubles, so every squared deviation quadruples.
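Both transformation rules are easy to confirm by simulation. The sketch below draws standard normal values (the sample size and shift/scale constants are arbitrary choices for illustration):

```python
# Check: adding a constant leaves variance unchanged;
# scaling by a multiplies variance by a^2.
import random

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(100_000)]

def var(data):
    m = sum(data) / len(data)
    return sum((v - m) ** 2 for v in data) / len(data)

v = var(xs)
v_shift = var([x + 10 for x in xs])  # same as v: shift does nothing
v_scale = var([3 * x for x in xs])   # 9 * v: scaling by 3 gives factor 3^2
print(v, v_shift, v_scale)
```

The shift result matches exactly (every deviation is unchanged), and the scaled result is exactly nine times the original, up to floating-point rounding.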
The Law of Total Variance
This is a powerful decomposition formula. For any two random variables $X$ and $Y$:
$$\operatorname{Var}(X) = \operatorname{E}\big[\operatorname{Var}(X \mid Y)\big] + \operatorname{Var}\big(\operatorname{E}[X \mid Y]\big)$$
What this means: The total variance in $X$ can be split into two parts:
The expected value of the variance of $X$ given $Y$ (the variance that remains even after conditioning on $Y$)
The variance of the conditional expectation (how much $X$ varies on average as $Y$ changes)
This is useful when you can explain part of the variance of $X$ by conditioning on another variable $Y$.
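The decomposition can be checked numerically. In this sketch, $Y$ is a coin flip choosing between two groups with different means (the group means and spread are invented for illustration); the within-group and between-group pieces then add up to the total variance:

```python
# Monte Carlo check of the law of total variance for a two-group mixture:
# Y picks a group, and X | Y is normal with a group-specific mean.
import random

random.seed(1)
N = 200_000
data = []
for _ in range(N):
    y = random.random() < 0.5           # two equally likely groups
    mu_y = 10.0 if y else 0.0           # conditional mean E[X | Y]
    data.append((random.gauss(mu_y, 2.0), y))

def var(vals):
    m = sum(vals) / len(vals)
    return sum((v - m) ** 2 for v in vals) / len(vals)

xs = [x for x, _ in data]
g0 = [x for x, y in data if not y]
g1 = [x for x, y in data if y]
m0, m1 = sum(g0) / len(g0), sum(g1) / len(g1)

total = var(xs)
within = (len(g0) * var(g0) + len(g1) * var(g1)) / N  # E[Var(X | Y)]
between = var([m0] * len(g0) + [m1] * len(g1))        # Var(E[X | Y])
print(total, within + between)  # the two sides agree
```

With empirical group means, the split is the classic within-group plus between-group (ANOVA-style) decomposition, so the two sides match to floating-point precision; the theoretical total here is $4 + 25 = 29$.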
The Bienaymé Formula: Adding Variances
If $X$ and $Y$ are uncorrelated (or independent), then:
$$\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y)$$
This is intuitively appealing: if two variables are unrelated, their variances add when you combine them. However, if $X$ and $Y$ are correlated, you must include a covariance term: $\operatorname{Var}(X + Y) = \operatorname{Var}(X) + \operatorname{Var}(Y) + 2\operatorname{Cov}(X, Y)$.
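Both versions of the formula can be verified by simulation. Note that the version with the covariance term is an exact algebraic identity for empirical moments, while the no-covariance version holds only approximately for independent draws (the distributions below are arbitrary illustrations):

```python
# Var(X+Y) = Var(X) + Var(Y) + 2 Cov(X, Y); the covariance term
# is near zero when X and Y are sampled independently.
import random

random.seed(2)
n = 100_000
xs = [random.gauss(0, 1) for _ in range(n)]  # Var(X) ~ 1
ys = [random.gauss(0, 2) for _ in range(n)]  # Var(Y) ~ 4

def var(d):
    m = sum(d) / len(d)
    return sum((v - m) ** 2 for v in d) / len(d)

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

sums = [x + y for x, y in zip(xs, ys)]
print(var(sums), var(xs) + var(ys) + 2 * cov(xs, ys))  # near-identical
```

Here $\operatorname{Var}(X+Y) \approx 5$, close to $\operatorname{Var}(X) + \operatorname{Var}(Y)$ because the sampled covariance is tiny.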
Computing Variance for Different Types of Variables
Discrete Random Variables
For a discrete random variable with probability mass function $p(x)$:
$$\operatorname{Var}(X) = \sum_x (x - \mu)^2 p(x)$$
You calculate the squared deviation for each possible value, weight it by its probability, and sum across all values.
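As a worked example, here is that weighted sum for a hypothetical loaded die (the pmf values are invented: face 6 is three times as likely as any other face):

```python
# Variance of a discrete random variable directly from its pmf.
pmf = {1: 1/8, 2: 1/8, 3: 1/8, 4: 1/8, 5: 1/8, 6: 3/8}
assert abs(sum(pmf.values()) - 1.0) < 1e-12  # sanity check: pmf sums to 1

mu = sum(x * p for x, p in pmf.items())                  # E[X] = 4.125
variance = sum((x - mu) ** 2 * p for x, p in pmf.items())
print(mu, variance)  # 4.125, 3.359375
```

Loading the die toward 6 pulls the mean above 3.5; the variance is the probability-weighted average of the squared deviations from that mean.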
Continuous Random Variables
For a continuous random variable with probability density function $f(x)$:
$$\operatorname{Var}(X) = \int_{-\infty}^{\infty} (x - \mu)^2 f(x) \, dx$$
This is the continuous analog: instead of summing over discrete values, you integrate over the entire range.
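The integral can be approximated numerically. This sketch uses a simple midpoint rule for an Exponential distribution with rate 2 (the rate and the truncation of the infinite upper limit at 40 are illustrative choices) and compares against the closed form $1/\lambda^2$:

```python
# Numerically integrate (x - mu)^2 f(x) dx for an Exponential(lambda=2)
# density and compare with the known variance 1/lambda^2 = 0.25.
import math

lam = 2.0
f = lambda x: lam * math.exp(-lam * x)  # density on [0, infinity)
mu = 1 / lam                            # known mean of Exponential(lam)

# Midpoint rule on [0, 40]; the tail beyond 40 is negligible (~e^{-80}).
n_steps, upper = 400_000, 40.0
h = upper / n_steps
variance = sum(((i + 0.5) * h - mu) ** 2 * f((i + 0.5) * h) * h
               for i in range(n_steps))
print(variance)  # ~ 0.25
```

The numerical answer matches $1/\lambda^2 = 0.25$ to several decimal places, illustrating that the continuous formula is just a probability-weighted integral of squared deviations.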
Common Distributions
Memorizing the variance formulas for common distributions is helpful:
Bernoulli(p): $\operatorname{Var}(X) = p(1 - p)$
Binomial(n, p): $\operatorname{Var}(X) = np(1 - p)$
Poisson($\lambda$): $\operatorname{Var}(X) = \lambda$
Exponential($\lambda$): $\operatorname{Var}(X) = \frac{1}{\lambda^2}$
Notice that for a Poisson distribution, the mean and variance are equal (both $\lambda$). This is a distinctive feature of the Poisson distribution.
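All four formulas can be sanity-checked by sampling. The sketch below uses NumPy's random generators (the specific parameter values are arbitrary illustrations):

```python
# Empirical check of the textbook variance formulas by simulation.
import numpy as np

rng = np.random.default_rng(0)
n = 500_000

checks = {
    "Bernoulli(0.3)":  (rng.binomial(1, 0.3, n),   0.3 * 0.7),
    "Binomial(10,.5)": (rng.binomial(10, 0.5, n),  10 * 0.5 * 0.5),
    "Poisson(4)":      (rng.poisson(4.0, n),       4.0),
    # NumPy's exponential sampler takes the scale 1/lambda, not the rate.
    "Exponential(2)":  (rng.exponential(1 / 2, n), 1 / 2 ** 2),
}
results = {name: (float(draws.var()), target)
           for name, (draws, target) in checks.items()}
for name, (est, target) in results.items():
    print(f"{name}: empirical {est:.3f} vs theoretical {target:.3f}")
```

With half a million draws, each empirical variance lands within sampling error of its formula, including the Poisson case where the empirical mean and variance both hover near $\lambda = 4$.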
<extrainfo>
Additional Concepts
Variance as a Central Moment: Variance is formally the second central moment of a probability distribution. The first central moment is always zero (by definition), and the second central moment is variance. Higher central moments describe other properties like skewness and kurtosis.
Finiteness Issues: Not all distributions have finite variance. The Cauchy distribution does not even have a well-defined mean, so its variance is undefined. Others, such as the Pareto distribution with shape parameter $1 < \alpha \le 2$, have a finite mean but infinite variance. This is an important edge case to be aware of when working with theoretical distributions.
Variance Notation: Variance is technically the covariance of a random variable with itself: $\operatorname{Var}(X) = \operatorname{Cov}(X, X)$. This connection to covariance helps explain why variance shares many properties with covariance.
</extrainfo>
Flashcards
What is the conceptual definition of variance in terms of expected value?
The expected value of the squared deviation of a random variable from its mean.
What does variance measure regarding a set of numbers?
The dispersion of the numbers around their average value.
What is the formal equation for the variance of a random variable $X$ with mean $\mu$?
$\operatorname{Var}(X)=\operatorname{E}\big[(X-\mu)^{2}\big]$ (where $\operatorname{E}$ is the expected value).
What is the alternative computational formula for variance involving $\operatorname{E}[X^{2}]$ and $(\operatorname{E}[X])^{2}$?
$\operatorname{Var}(X)=\operatorname{E}[X^{2}] - (\operatorname{E}[X])^{2}$.
How is variance defined in terms of covariance?
It is the covariance of a random variable with itself.
Which specific central moment of a probability distribution corresponds to the variance?
The second central moment.
Why is the variance of a random variable always non‑negative?
Because it is a sum of squares.
What is the variance of a constant random variable?
Zero.
How do the units of measurement for variance relate to the units of the original variable?
They are the square of the original units (e.g., meters squared).
What happens to the variance if a distribution lacks a finite expected value?
The variance also fails to be finite (it is undefined or infinite).
What is the variance of a random variable $X$ after adding a constant $c$?
$\operatorname{Var}(X+c)=\operatorname{Var}(X)$ (it remains unchanged).
What is the effect of scaling a random variable $X$ by a constant $a$ on its variance?
The variance is scaled by $a^{2}$: $\operatorname{Var}(aX)=a^{2}\operatorname{Var}(X)$.
What is the variance formula for a discrete random variable with probability mass function $p(x)$?
$\operatorname{Var}(X)=\sum_{x} (x-\mu)^{2}p(x)$.
What is the variance formula for a continuous random variable with probability density function $f(x)$?
$\operatorname{Var}(X)=\int_{-\infty}^{\infty}(x-\mu)^{2}f(x)\,dx$.
How is standard deviation mathematically related to variance?
It is the positive square root of the variance.
Under what condition is population variance calculated?
When all possible observations of a random variable are available.
What is the formula for population variance $\sigma^{2}$ for a finite population of size $N$?
$\sigma^{2}= \frac{1}{N}\sum_{i=1}^{N}(x_{i}-\mu)^{2}$ (where $\mu$ is the population mean).
What is the purpose of computing sample variance?
To estimate the population variance from a subset of observations.
What is the formula for the unbiased sample variance $s^2$ using Bessel’s correction?
$s^{2}= \frac{1}{n-1}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}$ (where $n$ is sample size and $\bar{x}$ is sample mean).
By what factor is the sample variance estimator biased if $n$ is used as the divisor instead of $n-1$?
It is biased downward by a factor of $(n-1)/n$.
What is the decomposition formula for $\operatorname{Var}(X)$ given random variables $X$ and $Y$?
$\operatorname{Var}(X)=\operatorname{E}\big[\operatorname{Var}(X\mid Y)\big] + \operatorname{Var}\big(\operatorname{E}[X\mid Y]\big)$.
What is the relationship between the variance of the sum $\operatorname{Var}(X+Y)$ and individual variances for uncorrelated variables?
$\operatorname{Var}(X+Y)=\operatorname{Var}(X)+\operatorname{Var}(Y)$.
What are the variances for the Bernoulli, Binomial, and Poisson distributions?
Bernoulli(p): $p(1-p)$
Binomial(n, p): $np(1-p)$
Poisson($\lambda$): $\lambda$
What is the mean and variance of an exponential distribution with rate parameter $\lambda$?
Mean is $1/\lambda$; Variance is $1/\lambda^{2}$.
Quiz
Variance - Core Concepts and Computation Quiz Question 1: What is the formula for the variance of a random variable \(X\) in terms of its mean \(\mu\)?
- \(\displaystyle\operatorname{Var}(X)=\operatorname{E}\big[(X-\mu)^{2}\big]\) (correct)
- \(\displaystyle\operatorname{Var}(X)=\big(\operatorname{E}[X]\big)^{2}-\operatorname{E}[X^{2}]\)
- \(\displaystyle\operatorname{Var}(X)=\operatorname{E}[X]^{2}\)
- \(\displaystyle\operatorname{Var}(X)=(X-\mu)^{2}\)
Variance - Core Concepts and Computation Quiz Question 2: How is the unbiased sample variance calculated from a sample of size \(n\) with observations \(x_{i}\) and sample mean \(\bar{x}\)?
- \(\displaystyle s^{2}= \frac{1}{\,n-1\,}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}\) (correct)
- \(\displaystyle s^{2}= \frac{1}{\,n\,}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}\)
- \(\displaystyle s^{2}= \frac{1}{\,n+1\,}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}\)
- \(\displaystyle s^{2}= \frac{1}{\,n-1.5\,}\sum_{i=1}^{n}(x_{i}-\bar{x})^{2}\)
Variance - Core Concepts and Computation Quiz Question 3: What does the variance of a random variable measure?
- The expected squared deviation from its mean (correct)
- The probability that the variable equals its mean
- The average value of the variable
- The standard deviation of the variable
Variance - Core Concepts and Computation Quiz Question 4: Which formula correctly gives the population variance \(\sigma^{2}\) for a finite population of size \(N\) with values \(x_{i}\) and population mean \(\mu\)?
- \(\displaystyle\sigma^{2}= \frac{1}{N}\sum_{i=1}^{N}(x_{i}-\mu)^{2}\) (correct)
- \(\displaystyle\sigma^{2}= \frac{1}{\,N-1\,}\sum_{i=1}^{N}(x_{i}-\mu)^{2}\)
- \(\displaystyle\sigma^{2}= \sum_{i=1}^{N}(x_{i}-\mu)^{2}\)
- \(\displaystyle\sigma^{2}= \frac{1}{N}\sum_{i=1}^{N}\lvert x_{i}-\mu\rvert\)
Variance - Core Concepts and Computation Quiz Question 5: Why can the variance of a random variable never be negative?
- Because it is defined as a sum of squared terms (correct)
- Because probabilities are always positive
- Because the mean is always non‑negative
- Because standard deviation is always larger than variance
Variance - Core Concepts and Computation Quiz Question 6: Using divisor n instead of n‑1 in the sample variance formula biases the estimator downward by which factor?
- (n − 1) / n (correct)
- n / (n − 1)
- (n − 1)² / n²
- 1 / n
Variance - Core Concepts and Computation Quiz Question 7: If a random variable X is measured in meters, in what units is Var(X) expressed?
- meters squared (correct)
- meters
- square meters per second
- meters cubed
Variance - Core Concepts and Computation Quiz Question 8: In terms of covariance, how can the variance of a random variable \(X\) be expressed?
- \(\operatorname{Cov}(X, X)\) (correct)
- \(\operatorname{Cov}(X, Y)\)
- \(\operatorname{Cov}(Y, Y)\)
- \(\operatorname{Cov}(X, X) + \operatorname{Cov}(Y, Y)\)
Variance - Core Concepts and Computation Quiz Question 9: Which expression gives the variance of a continuous random variable with density \(f(x)\) and mean \(\mu\)?
- \(\displaystyle\int_{-\infty}^{\infty}(x-\mu)^{2}f(x)\,dx\) (correct)
- \(\displaystyle\int_{-\infty}^{\infty}|x-\mu|\,f(x)\,dx\)
- \(\displaystyle\int_{-\infty}^{\infty}(x-\mu)f(x)\,dx\)
- \(\displaystyle\int_{-\infty}^{\infty}(x+\mu)^{2}f(x)\,dx\)
Variance - Core Concepts and Computation Quiz Question 10: What is the variance of a Binomial\((n,\,p)\) random variable?
- \(np(1-p)\) (correct)
- \(np\)
- \(n^{2}p(1-p)\)
- \(p(1-p)\)
Key Concepts
Variance Concepts
Variance
Standard deviation
Population variance
Sample variance
Unbiased estimator (Bessel’s correction)
Law of total variance
Bienaymé formula (Additivity of variance)
Scaling property of variance
Units of variance
Infinite variance
Central moment
Covariance
Definitions
Variance
The expected value of the squared deviation of a random variable from its mean, measuring dispersion.
Standard deviation
The positive square root of the variance, expressing spread in the original units of the variable.
Population variance
The variance computed using all members of a finite population, dividing by the population size \(N\).
Sample variance
An estimator of population variance calculated from a sample, typically dividing by \(n-1\) to correct bias.
Unbiased estimator (Bessel’s correction)
The adjustment of the sample variance divisor from \(n\) to \(n-1\) that makes the estimator unbiased for the true population variance.
Law of total variance
A decomposition stating \(\operatorname{Var}(X)=\operatorname{E}[\operatorname{Var}(X\mid Y)]+\operatorname{Var}(\operatorname{E}[X\mid Y])\).
Bienaymé formula (Additivity of variance)
The result that the variance of the sum of uncorrelated (or independent) variables equals the sum of their variances.
Scaling property of variance
Multiplying a random variable by a constant \(a\) scales its variance by \(a^{2}\); adding a constant leaves variance unchanged.
Units of variance
The square of the units of the original measurement (e.g., meters squared for a length variable).
Infinite variance
A situation where a distribution’s variance does not exist or is unbounded, as with the Cauchy or certain Pareto distributions.
Central moment
The \(k\)th moment of a distribution about its mean; variance is the second central moment.
Covariance
A measure of joint variability of two random variables; variance is the covariance of a variable with itself.