Variance


For a single variate $X$ having a distribution $P(x)$ with known population mean $\mu$, the population variance $\operatorname{var}(X)$, commonly also written $\sigma^2$, is defined as

$$\sigma^2 = \langle (X - \mu)^2 \rangle, \tag{1}$$



where $\mu$ is the population mean and $\langle X \rangle$ denotes the expectation value of $X$. For a discrete distribution with $N$ possible values $x_i$, the population variance is therefore

$$\sigma^2 = \sum_{i=1}^{N} P(x_i)\,(x_i - \mu)^2, \tag{2}$$



whereas for a continuous distribution, it is given by

$$\sigma^2 = \int P(x)\,(x - \mu)^2\,dx. \tag{3}$$



The variance is therefore equal to the second central moment $\mu_2$.
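As a small illustration of the discrete formula (2), the sketch below computes the population variance of a fair six-sided die. The function name `population_variance` and the choice of a die are assumptions made for this example; exact rational arithmetic is used so the result is not blurred by rounding.

```python
from fractions import Fraction


def population_variance(values, probs):
    """Equation (2): sum of P(x_i) * (x_i - mu)^2 over all possible values."""
    # Population mean: mu = sum of P(x_i) * x_i
    mu = sum(p * x for x, p in zip(values, probs))
    # Second central moment about mu
    return sum(p * (x - mu) ** 2 for x, p in zip(values, probs))


# Hypothetical example distribution: a fair die, each face with probability 1/6.
faces = [1, 2, 3, 4, 5, 6]
probs = [Fraction(1, 6)] * 6

die_variance = population_variance(faces, probs)
print(die_variance)  # 35/12
```

The same function applies to any finite distribution, provided the probabilities sum to 1.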

Note that some care is needed in interpreting $\sigma^2$ as a variance, since the symbol $\sigma$ is also commonly used as a parameter related to but not equivalent to the square root of the variance, for example in the log normal distribution, Maxwell distribution, and Rayleigh distribution.
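The Rayleigh case can be checked numerically. The sketch below applies the continuous formula (3) to the Rayleigh density $P(x) = (x/\sigma^2)\,e^{-x^2/(2\sigma^2)}$ on $x \ge 0$, whose variance is $(4-\pi)\sigma^2/2$ rather than $\sigma^2$. The helper names, the midpoint-rule integrator, the value $\sigma = 2$, and the truncation of the integral at $12\sigma$ are all assumptions chosen for illustration.

```python
import math


def rayleigh_pdf(x, sigma):
    """Rayleigh density with scale parameter sigma, defined for x >= 0."""
    return (x / sigma**2) * math.exp(-x**2 / (2 * sigma**2))


def integrate(f, lo, hi, steps=100_000):
    """Composite midpoint rule; adequate for this smooth integrand."""
    h = (hi - lo) / steps
    return h * sum(f(lo + (k + 0.5) * h) for k in range(steps))


sigma = 2.0
upper = 12 * sigma  # the tail beyond this point is negligible

# Mean, then the second central moment as in equation (3).
mean = integrate(lambda x: x * rayleigh_pdf(x, sigma), 0.0, upper)
variance = integrate(lambda x: (x - mean) ** 2 * rayleigh_pdf(x, sigma), 0.0, upper)

print(variance)                          # numerical variance
print((4 - math.pi) / 2 * sigma**2)      # closed form for the Rayleigh variance
print(sigma**2)                          # the parameter squared, which differs
```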

If the underlying distribution is not known, then the sample variance may be computed as

$$s_N^2 = \frac{1}{N} \sum_{i=1}^{N} (x_i - \bar{x})^2, \tag{4}$$



where $\bar{x}$ is the sample mean.

Note that the sample variance $s_N^2$ defined above is not an unbiased estimator for the population variance $\sigma^2$. In order to obtain an unbiased estimator for $\sigma^2$, it is necessary to instead define a "bias-corrected sample variance"

$$s_{N-1}^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2. \tag{5}$$



The distinction between $s_N^2$ and $s_{N-1}^2$ is a common source of confusion, and extreme care should be exercised when consulting the literature to determine which convention is in use, especially since the uninformative notation $s^2$ is commonly used for both. The bias-corrected sample variance $s_{N-1}^2$ for a list of data is implemented as Variance[list].
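The two conventions are easy to mix up in software as well. As a hedged illustration in Python (not the implementation referred to above), the standard library's `statistics.pvariance` divides by $N$ and `statistics.variance` divides by $N-1$; the data list and variable names below are made up for the example.

```python
import statistics

# Hypothetical sample data.
data = [2, 4, 4, 4, 5, 5, 7, 9]
n = len(data)
xbar = sum(data) / n

# Sum of squared deviations from the sample mean.
ss = sum((x - xbar) ** 2 for x in data)

s2_N = ss / n          # equation (4): divide by N
s2_Nm1 = ss / (n - 1)  # equation (5): divide by N - 1

# The standard library exposes both conventions under different names.
print(s2_N, statistics.pvariance(data))
print(s2_Nm1, statistics.variance(data))
```

Other libraries make the same choice through a parameter rather than a function name, so it is worth checking the default before comparing results across tools.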

The square root of the variance is known as the standard deviation.

The reason that $s_N^2$ gives a biased estimator of the population variance is that two free parameters $\mu$ and $\sigma^2$ are actually being estimated from the data itself. In such cases, it is appropriate to use a Student's t-distribution instead of a normal distribution as a model since, very loosely speaking, Student's t-distribution is the "best" that can be done without knowing $\sigma^2$.

Formally, in order to estimate the population variance $\sigma^2$ from a sample of $N$ elements with a priori unknown mean (i.e., the mean is estimated from the sample itself), we need an unbiased estimator for $\sigma^2$. This is given by the k-statistic $k_2$, where

$$k_2 = \frac{N}{N-1}\, m_2 \tag{6}$$



and $m_2 = s_N^2$ is the sample variance uncorrected for bias.
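The bias can be verified exactly on a small case. The sketch below enumerates every equally likely sample of size $N = 3$ drawn independently from a fair die (population variance 35/12) and averages both estimators over all samples. The setup, the helper `sample_variance`, and its `ddof` parameter (a name borrowed from common numerical-library usage) are assumptions for illustration.

```python
from fractions import Fraction
from itertools import product


def sample_variance(sample, ddof=0):
    """Sum of squared deviations from the sample mean, divided by (N - ddof)."""
    n = len(sample)
    xbar = Fraction(sum(sample), n)
    return sum((x - xbar) ** 2 for x in sample) / (n - ddof)


faces = range(1, 7)
N = 3
samples = list(product(faces, repeat=N))  # all 6**3 equally likely samples

# Expectation of each estimator, taken over every possible sample.
mean_biased = sum(sample_variance(s) for s in samples) / len(samples)
mean_corrected = sum(sample_variance(s, ddof=1) for s in samples) / len(samples)

print(mean_biased)     # (N - 1)/N times the population variance
print(mean_corrected)  # the population variance itself
```

The uncorrected estimator averages to $(N-1)/N$ times the population variance, which is exactly the factor that the correction $N/(N-1)$ removes.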

It turns out that, for normally distributed data, the quantity $N s_N^2/\sigma^2$ has a chi-squared distribution with $N-1$ degrees of freedom.

For a set of data $X$, the variance of the data obtained by a linear transformation $aX + b$ is given by

\begin{align}
\operatorname{var}(aX + b) &= \langle [(aX + b) - \langle aX + b \rangle]^2 \rangle \tag{7} \\
&= \langle (aX + b - a\langle X \rangle - b)^2 \rangle \tag{8} \\
&= \langle (aX - a\mu)^2 \rangle \tag{9} \\
&= \langle a^2 (X - \mu)^2 \rangle \tag{10} \\
&= a^2 \langle (X - \mu)^2 \rangle \tag{11} \\
&= a^2 \operatorname{var}(X). \tag{12}
\end{align}
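The result that a linear transformation scales the variance by $a^2$ and ignores the offset $b$ can be checked on concrete numbers. This is a minimal sketch with made-up data and a hypothetical helper `pvar` that uses the divide-by-$N$ convention; the identity holds equally for the $N-1$ convention.

```python
from fractions import Fraction


def pvar(xs):
    """Variance with the divide-by-N convention, in exact arithmetic."""
    n = len(xs)
    mu = Fraction(sum(xs), n)
    return sum((x - mu) ** 2 for x in xs) / n


# Hypothetical data and transformation coefficients.
data = [1, 2, 3, 4, 10]
a, b = -3, 7

transformed = [a * x + b for x in data]

print(pvar(data))         # variance of the original data
print(pvar(transformed))  # variance after x -> a*x + b
print(a**2 * pvar(data))  # a^2 * var(X), matching the line above
```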


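For multiple variables, the variance is given using the definition of covariance,

\begin{align}
\operatorname{var}\left(\sum_{i=1}^{n} X_i\right) &= \operatorname{cov}\left(\sum_{i=1}^{n} X_i, \sum_{j=1}^{n} X_j\right) \tag{13} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} \operatorname{cov}(X_i, X_j) \tag{14} \\
&= \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j=i}}^{n} \operatorname{cov}(X_i, X_j) + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \operatorname{cov}(X_i, X_j) \tag{15} \\
&= \sum_{i=1}^{n} \operatorname{cov}(X_i, X_i) + \sum_{i=1}^{n} \sum_{\substack{j=1 \\ j \ne i}}^{n} \operatorname{cov}(X_i, X_j) \tag{16} \\
&= \sum_{i=1}^{n} \operatorname{var}(X_i) + 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} \operatorname{cov}(X_i, X_j). \tag{17}
\end{align}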
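The decomposition of the variance of a sum into individual variances plus twice the pairwise covariances can be confirmed on a small data set. In this sketch each inner list holds the observations of one variable; the data and the helper names `mean` and `cov` are assumptions for illustration, and the divide-by-$N$ convention is used throughout.

```python
from fractions import Fraction


def mean(xs):
    return Fraction(sum(xs), len(xs))


def cov(xs, ys):
    """Covariance with the divide-by-N convention; cov(xs, xs) is the variance."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical observations of three variables X_1, X_2, X_3.
columns = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [5, 3, 1, 1],
]
n = len(columns)

# Left-hand side: variance of the observation-wise sum X_1 + X_2 + X_3.
total = [sum(row) for row in zip(*columns)]
lhs = cov(total, total)

# Right-hand side: sum of variances plus twice the covariances with j > i.
rhs = sum(cov(c, c) for c in columns) + 2 * sum(
    cov(columns[i], columns[j]) for i in range(n) for j in range(i + 1, n)
)

print(lhs, rhs)
```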


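A linear sum has a similar form:

\begin{align}
\operatorname{var}\left(\sum_{i=1}^{n} a_i X_i\right) &= \operatorname{cov}\left(\sum_{i=1}^{n} a_i X_i, \sum_{j=1}^{n} a_j X_j\right) \tag{18} \\
&= \sum_{i=1}^{n} \sum_{j=1}^{n} a_i a_j \operatorname{cov}(X_i, X_j) \tag{19} \\
&= \sum_{i=1}^{n} a_i^2 \operatorname{var}(X_i) + 2 \sum_{i=1}^{n} \sum_{j=i+1}^{n} a_i a_j \operatorname{cov}(X_i, X_j). \tag{20}
\end{align}

These equations can be expressed using the covariance matrix.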
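One way to read the covariance-matrix form is as a quadratic form: if $\Sigma$ is the matrix with entries $\operatorname{cov}(X_i, X_j)$ and $a$ is the vector of coefficients, the double sum over $a_i a_j \operatorname{cov}(X_i, X_j)$ is $a^{\mathsf T} \Sigma\, a$. The sketch below checks this on made-up data using plain lists; the data, coefficients, and helper names are assumptions for illustration.

```python
from fractions import Fraction


def mean(xs):
    return Fraction(sum(xs), len(xs))


def cov(xs, ys):
    """Covariance with the divide-by-N convention."""
    mx, my = mean(xs), mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)


# Hypothetical observations of three variables and coefficients a_i.
columns = [
    [1, 2, 3, 4],
    [2, 1, 4, 3],
    [5, 3, 1, 1],
]
a = [2, -1, 3]
n = len(columns)

# Covariance matrix: Sigma[i][j] = cov(X_i, X_j).
Sigma = [[cov(columns[i], columns[j]) for j in range(n)] for i in range(n)]

# Quadratic form a^T Sigma a, i.e. the double sum over a_i * a_j * cov(X_i, X_j).
quadratic_form = sum(a[i] * Sigma[i][j] * a[j] for i in range(n) for j in range(n))

# Direct computation: variance of the weighted sum, observation by observation.
weighted = [sum(ai * x for ai, x in zip(a, row)) for row in zip(*columns)]
direct = cov(weighted, weighted)

print(quadratic_form, direct)
```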

SEE ALSO: Central Moment, Charlier's Check, Covariance, Covariance Matrix, Error Propagation, k-Statistic, Mean, Moment, Raw Moment, Sample Variance, Sample Variance Computation, Sample Variance Distribution, Sigma, Standard Error, Statistical Correlation


 


REFERENCES:

Kenney, J. F. and Keeping, E. S. Mathematics of Statistics, Pt. 2, 2nd ed. Princeton, NJ: Van Nostrand, 1951.

Papoulis, A. Probability, Random Variables, and Stochastic Processes, 2nd ed. New York: McGraw-Hill, pp. 144-145, 1984.

Press, W. H.; Flannery, B. P.; Teukolsky, S. A.; and Vetterling, W. T. "Moments of a Distribution: Mean, Variance, Skewness, and So Forth." §14.1 in Numerical Recipes in FORTRAN: The Art of Scientific Computing, 2nd ed. Cambridge, England: Cambridge University Press, pp. 604-609, 1992.

Roberts, M. J. and Riccardo, R. A Student's Guide to Analysis of Variance. London: Routledge, 1999.

 
