Standard Deviation (σ) is a statistical measure that quantifies the amount of variation or dispersion in a dataset. It indicates how spread out the data points are from the mean.
Definition
The standard deviation is the square root of variance and is given by:
- Population Standard Deviation (σ):
  - σ = √[(1/n) * Σ (xᵢ - μ)²]
- Sample Standard Deviation (s):
  - s = √[(1/(n-1)) * Σ (xᵢ - x̄)²]
where:
- xᵢ – Each data point.
- μ – Population mean.
- x̄ – Sample mean.
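- n – Number of data points. The sample formula divides by n - 1 instead of n (Bessel's correction) because x̄ is estimated from the same data, which would otherwise bias the variance estimate downward.
- Σ – Summation over all data points (i = 1 to n).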
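The two formulas differ only in the divisor, which a short sketch can make explicit. The function names `population_std` and `sample_std` are illustrative choices, not part of any library, and the input checks are an added assumption about how empty or single-value input should be handled.

```python
import math


def population_std(values):
    """Population standard deviation: squared deviations averaged over n."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one data point")
    mu = sum(values) / n
    return math.sqrt(sum((x - mu) ** 2 for x in values) / n)


def sample_std(values):
    """Sample standard deviation: squared deviations divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two data points")
    x_bar = sum(values) / n
    return math.sqrt(sum((x - x_bar) ** 2 for x in values) / (n - 1))
```

For the same data, `sample_std` returns a slightly larger value than `population_std`, and the gap shrinks as n grows.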
Interpretation
- Low Standard Deviation – Data points are close to the mean.
- High Standard Deviation – Data points are spread far from the mean.
- Zero Standard Deviation – All values are identical.
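The three cases can be compared on small made-up datasets that share a mean of 50. The values below are invented for illustration, and `statistics.pstdev` from the Python standard library computes the population standard deviation.

```python
import statistics

tight = [49, 50, 50, 51]      # values cluster near the mean
spread = [10, 30, 70, 90]     # same mean, values far from it
constant = [50, 50, 50, 50]   # every value identical

tight_sd = statistics.pstdev(tight)
spread_sd = statistics.pstdev(spread)
constant_sd = statistics.pstdev(constant)

print(f"tight: {tight_sd:.2f}, spread: {spread_sd:.2f}, constant: {constant_sd:.2f}")
```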
Example Calculation
Given dataset: [5, 7, 9, 10, 14]
- Calculate the mean:
  - x̄ = (5 + 7 + 9 + 10 + 14) / 5 = 45 / 5 = 9
- Compute squared differences:
  - (5 - 9)² = 16
  - (7 - 9)² = 4
  - (9 - 9)² = 0
  - (10 - 9)² = 1
  - (14 - 9)² = 25
- Population variance:
  - σ² = (16 + 4 + 0 + 1 + 25) / 5 = 46 / 5 = 9.2
- Population standard deviation:
  - σ = √9.2 ≈ 3.03
- Sample variance and standard deviation (treating the same five values as a sample):
  - s² = (16 + 4 + 0 + 1 + 25) / (5-1) = 46 / 4 = 11.5
  - s = √11.5 ≈ 3.39
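The hand calculation above can be cross-checked with Python's standard `statistics` module, where `pvariance` and `pstdev` use the population formulas and `variance` and `stdev` use the sample formulas. The variable names are illustrative.

```python
import statistics

data = [5, 7, 9, 10, 14]

mean = statistics.mean(data)             # arithmetic mean
pop_var = statistics.pvariance(data)     # divides by n
pop_sd = statistics.pstdev(data)         # square root of pop_var
sample_var = statistics.variance(data)   # divides by n - 1
sample_sd = statistics.stdev(data)       # square root of sample_var

print(mean, pop_var, round(pop_sd, 2), sample_var, round(sample_sd, 2))
```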
Properties
- Always non-negative – σ ≥ 0, with equality only when all values are identical.
- Same units as the data – Unlike variance, which is expressed in squared units.
- Sensitive to outliers – Extreme values can significantly impact standard deviation.
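A small experiment illustrates two of these properties. The outlier value of 100 and the rescaling factor of 100 (for example, metres to centimetres) are hypothetical choices made only for the demonstration.

```python
import statistics

data = [5, 7, 9, 10, 14]

# Outlier sensitivity: one extreme value inflates the standard deviation.
with_outlier = data + [100]
base_sd = statistics.pstdev(data)
outlier_sd = statistics.pstdev(with_outlier)

# Units: multiplying the data by 100 multiplies the standard deviation
# by 100, while the variance grows by 100 squared.
scaled = [x * 100 for x in data]
scaled_sd = statistics.pstdev(scaled)
scaled_var = statistics.pvariance(scaled)

print(f"base: {base_sd:.2f}, with outlier: {outlier_sd:.2f}, scaled: {scaled_sd:.2f}")
```

Because deviations are squared before averaging, a single distant value contributes disproportionately to the result.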
Applications
- Finance – Measures risk (volatility of asset returns).
- Machine Learning – Summarizes how widely a feature's values are spread and serves as the scaling factor when normalizing features.
- Quality Control – Assesses consistency in production processes.
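As one illustration of the machine learning use, the sketch below standardizes a feature to z-scores, assuming that "normalization" here refers to subtracting the mean and dividing by the standard deviation. The function name `standardize` is hypothetical, and the choice of the population formula is an assumption.

```python
import statistics


def standardize(values):
    """Return z-scores: (x - mean) / population standard deviation."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    if sigma == 0:
        # Constant data has no spread, so z-scores are undefined.
        raise ValueError("cannot standardize data with zero standard deviation")
    return [(x - mu) / sigma for x in values]


z = standardize([5, 7, 9, 10, 14])
print([round(v, 2) for v in z])
```

After this transformation the values have mean 0 and standard deviation 1, so features measured on different scales become comparable.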
Standard Deviation vs. Variance
Measure | Formula | Interpretation |
---|---|---|
Variance (σ²) | (1/n) * Σ (xᵢ - μ)² | Average squared deviation from the mean, in squared units. |
Standard Deviation (σ) | √Variance | Dispersion in the data's original units. |