IT용어위키



Variance

Variance is a statistical measure that quantifies the dispersion of a dataset relative to its mean. It indicates how much the values in a dataset deviate from the average value.

Definition

The variance (σ²) of a dataset with n elements is calculated as:

  • Population Variance (σ²):
    • σ² = (1/n) * Σ (xᵢ - μ)²
  • Sample Variance (s²):
    • s² = (1/(n-1)) * Σ (xᵢ - x̄)²

where:

  • xᵢ – Each data point.
  • μ – Population mean.
  • – Sample mean.
  • n – Number of data points.
  • Σ – Summation notation.

Interpretation

  • Low Variance – Data points are close to the mean.
  • High Variance – Data points are spread out from the mean.
  • Zero Variance – All values are identical.

Example Calculation

Given dataset: [5, 7, 9, 10, 14]

  1. Calculate the mean:
    • x̄ = (5 + 7 + 9 + 10 + 14) / 5 = 9
  2. Compute squared differences:
    • (5 - 9)² = 16
    • (7 - 9)² = 4
    • (9 - 9)² = 0
    • (10 - 9)² = 1
    • (14 - 9)² = 25
  3. Population variance:
    • σ² = (16 + 4 + 0 + 1 + 25) / 5 = 9.2
  4. Sample variance:
    • s² = (16 + 4 + 0 + 1 + 25) / (5-1) = 11.5

Properties

  • Always non-negative – Squaring deviations ensures variance is ≥ 0.
  • Measured in squared units – Different from the original data scale.
  • Used to compute standard deviation – Standard deviation (σ) is the square root of variance.

Applications

  • Finance – Measuring risk (volatility of asset returns).
  • Machine Learning – Evaluating feature distribution in datasets.
  • Quality Control – Assessing variability in production processes.

Variance vs. Standard Deviation

Measure Formula Interpretation
Variance (σ²) (1/n) * Σ (xᵢ - μ)² Average squared deviation from mean.
Standard Deviation (σ) √Variance Measures dispersion in original units.

See Also


  출처: IT위키(IT위키에서 최신 문서 보기)
  * 본 페이지는 공대위키에서 미러링된 페이지입니다. 일부 오류나 표현의 누락이 있을 수 있습니다. 원본 문서는 공대위키에서 확인하세요!