December 24, 2024
Variance and Standard Deviation: In statistics, variance and standard deviation are two of the most widely used measures of dispersion. The main difference between them is in the units they use: the variance is expressed in squared units, while the standard deviation is expressed in the same units as the data. Here, we discuss the relationship between variance and standard deviation through their definitions, and derive the formulas for computing both for individual observations and for frequency distributions.
The variance measures how far the data points deviate, on average, from the mean (in squared units), whereas the standard deviation expresses that same spread in the original units of the data.
The variance of a variate \(X\) is the arithmetic mean of the squares of all deviations of \(X\) from the arithmetic mean of the observations and is denoted by \({\mathop{\rm Var}\nolimits} (X)\) or \({\sigma ^2}\).
The standard deviation of a variate \(X\) is the positive square root of its variance. Thus, Standard deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} \)
We shall discuss the calculation of variance and standard deviation in the following three cases:
If \({x_1},\,{x_2},\, \ldots ,\,{x_n}\) are n values of a variate \(X\), then
\({\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} } \right\}\) or, \({\sigma ^2} = \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} } \right\}\)
And, Standard deviation \((\sigma ) = \sqrt {\operatorname{Var} (X)} = \sqrt {\frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{{\left( {{x_i} – \bar X} \right)}^2}} } \right\}} \,\,\,\,….(i)\)
Now,
\({\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} } \right\}\)
\( = \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {\left( {x_i^2 – 2{x_i}\overline X + {{\overline X }^2}} \right)} } \right\}\)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} – \frac{1}{n}\sum\limits_{i = 1}^n 2 {x_i}\overline X + \frac{1}{n}\sum\limits_{i = 1}^n {{{\overline X }^2}} \)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} – 2\bar X\left\{ {\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} } \right\} + \frac{{n{{\bar X}^2}}}{n}\)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} - 2{\overline X ^2} + {\overline X ^2}\) \(\left[ {\because \,\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} = \overline X } \right]\)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} – {\overline X ^2}\)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} - {\left\{ {\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} } \right\}^2}\) \(\left[ {\because \,\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} = \overline X } \right]\)
Thus, Standard deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^n {x_i^2} - {{\left\{ {\frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} } \right\}}^2}} \,\,\,\,….(ii)\)
If the values of variate \(X\) are large, the calculation of variance from the above formulas is quite tedious and time-consuming.
In that case we take deviations from an arbitrary point \(A\) (say).
If \({d_i} = {x_i} - A,\,i = 1,\,2,\, \ldots ,\,n\), then
\(\sum\limits_{i = 1}^n {{d_i}} = \sum\limits_{i = 1}^n {\left( {{x_i} – A} \right)} = \sum\limits_{i = 1}^n {{x_i}} – nA\)
\( \Rightarrow \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{d_i}} } \right\} = \frac{1}{n}\sum\limits_{i = 1}^n {{x_i}} – A\)
\( \Rightarrow \overline d = \overline X - A\), where \(\overline d = \frac{1}{n}\sum\limits_{i = 1}^n {{d_i}} \)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} \)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left( {{x_i} – A + A – \overline X } \right)}^2}} \)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {{{\left( {{d_i} – \overline d } \right)}^2}} \)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {\left( {d_i^2 – 2{d_i}\overline d + {{\overline d }^2}} \right)} \)
\( = \frac{1}{n}\sum\limits_{i = 1}^n {d_i^2} – {\overline d ^2} = \frac{1}{n}\sum\limits_{i = 1}^n {d_i^2} – {\left( {\frac{1}{n}\sum\limits_{i = 1}^n {{d_i}} } \right)^2}\)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\sum\limits_{i = 1}^n {d_i^2} – {\left( {\frac{1}{n}\sum\limits_{i = 1}^n {{d_i}} } \right)^2}\)
Standard deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {\frac{1}{n}\sum\limits_{i = 1}^n {d_i^2} – {{\left( {\frac{1}{n}\sum\limits_{i = 1}^n {{d_i}} } \right)}^2}} \,\,\,\,…..(iii)\)
Thus, variance and standard deviation of Individual Observations can be computed by applying any of the formulas given as equations \(\left( i \right),\,\left( {ii} \right)\), or \(\left( {iii} \right)\).
Step 1: Compute the mean \(\overline X \) of the given observations \({x_1},\,{x_2},\, \ldots ,\,{x_n}\).
Step 2: Take the deviations of the observations from the mean i.e. find \({x_i} – \overline X ;\,i = 1,\,2,\, \ldots ,\,n\).
Step 3: Square the deviations obtained in step \(2\) and obtain the sum i.e., find \(\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} \).
Step 4: Divide the sum obtained in step \(3\) by \(n\). This gives the value of variance of \(X\).
\({\mathop{\rm Var}\nolimits} (X) = \frac{{\sum\limits_{i = 1}^n {{{\left( {{x_i} – \overline X } \right)}^2}} }}{n}\)
Therefore, Standard deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {\frac{{\sum\limits_{i = 1}^n {{{\left( {{x_i} – \bar X} \right)}^2}} }}{n}} \)
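The four steps above can be sketched in Python. This is a minimal illustration, not a reference implementation: the function names `variance_direct` and `std_dev_direct` are hypothetical, and the sample data are the ten observations used in Q.1 of this article.

```python
import math


def variance_direct(values):
    """Population variance by the direct method (divides by n).

    The function name is illustrative only.
    """
    n = len(values)
    mean = sum(values) / n                       # Step 1: mean of the observations
    deviations = [x - mean for x in values]      # Step 2: deviations from the mean
    sum_sq = sum(d * d for d in deviations)      # Step 3: sum of squared deviations
    return sum_sq / n                            # Step 4: divide by n


def std_dev_direct(values):
    """Standard deviation as the positive square root of the variance."""
    return math.sqrt(variance_direct(values))


# Observations taken from Q.1 of this article.
data = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]
print(variance_direct(data))
print(std_dev_direct(data))
```

Each line of `variance_direct` maps onto one step, so the intermediate lists can be printed to reproduce a hand-worked table.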
Step 1: Choose an assumed mean, say, \(A\).
Step 2: Take the deviations \({d_i}\) of the observations from an assumed mean, i.e. obtain \({d_i} = {x_i} – A,\,i = 1,\,2,\, \ldots ,\,n\).
Step 3: Take the total of these deviations i.e. \(\sum\limits_{i = 1}^n {{d_i}} \).
Step 4: Square the deviations obtained in step \(2\) and calculate \(\sum\limits_{i = 1}^n {d_i^2} \).
Step 5: Substitute the values of \(\sum\limits_{i = 1}^n {d_i^2} ,\,\sum\limits_{i = 1}^n {{d_i}} \) and \(n\) in the formula,
\({\mathop{\rm Var}\nolimits} (X) = {\sigma ^2} = \frac{1}{n}\left( {\sum\limits_{i = 1}^n {d_i^2} } \right) – {\left( {\frac{1}{n}\sum\limits_{i = 1}^n {{d_i}} } \right)^2}\)
Therefore, Standard deviation \((\sigma ) = \sqrt {\frac{1}{n}\left( {\sum\limits_{i = 1}^n \, d_i^2} \right) – {{\left( {\frac{1}{n}\sum\limits_{i = 1}^n \, {d_i}} \right)}^2}} \)
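A short Python sketch of the assumed-mean steps follows. The function name `variance_assumed_mean` is hypothetical, and the data are again the observations from Q.1. The loop tries several assumed means to illustrate that, up to floating-point rounding, the result does not depend on the choice of \(A\).

```python
def variance_assumed_mean(values, assumed_mean):
    """Population variance via deviations from an assumed mean A.

    The function name is illustrative only.
    """
    n = len(values)
    d = [x - assumed_mean for x in values]       # Step 2: d_i = x_i - A
    sum_d = sum(d)                               # Step 3: sum of d_i
    sum_d_sq = sum(di * di for di in d)          # Step 4: sum of d_i^2
    return sum_d_sq / n - (sum_d / n) ** 2       # Step 5: substitute in the formula


# Observations taken from Q.1 of this article.
data = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]
for a in (50, 56, 60):
    print(a, variance_assumed_mean(data, a))
```

Choosing \(A\) close to the true mean keeps the deviations small, which is the whole point of the method when working by hand.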
Remarks:
If a variate \(X\) takes the values \({x_i}\) with frequencies \({f_i}\), \(i = 1,\,2,\, \ldots ,\,n\), and \(N = \sum\limits_{i = 1}^n {{f_i}} \) is the total frequency, then
\({\mathop{\rm Var}\nolimits} (X) = \frac{1}{N}\left\{ {\sum\limits_{i = 1}^n {{f_i}} {{\left( {{x_i} – \overline X } \right)}^2}} \right\}\, \ldots \ldots (iv)\)
\( = \frac{1}{N}\left[ {\sum\limits_{i = 1}^n {{f_i}} \left( {x_i^2 – 2{x_i}\overline X + {{\overline X }^2}} \right)} \right]\)
\( = \frac{1}{N}\left( {\sum\limits_{i = 1}^n {{f_i}} x_i^2} \right) – 2\overline X \left( {\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}} {x_i}} \right) + \frac{{N{{\overline X }^2}}}{N}\)
\( = \frac{1}{N}\left( {\sum\limits_{i = 1}^n {{f_i}} x_i^2} \right) - 2{\overline X ^2} + {\overline X ^2}\) \(\left[ {\because \,\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}} {x_i} = \overline X } \right]\)
\( = \frac{1}{N}\left( {\sum\limits_{i = 1}^n {{f_i}} x_i^2} \right) – {\overline X ^2}\)
\( = \frac{1}{N}\left( {\sum\limits_{i = 1}^n \, {f_i}x_i^2} \right) – {\left( {\frac{1}{N}\sum\limits_{i = 1}^n \, {f_i}{x_i}} \right)^2}\, \ldots \ldots (v)\)
If the values \({x_i}\) of the variate \(X\) and/or the frequencies \({f_i}\) are large, calculating the variance with formulas \((iv)\) and \((v)\) is tedious and time-consuming.
In such a case, we take deviations of the values of variable \(X\) from an arbitrary point \(A\) (say). If \({d_i} = {x_i} – A,\,i = 1,\,2,\, \ldots ,\,n\), then the above formula reduces to
\({\mathop{\rm Var}\nolimits} (X) = \frac{1}{N}\left( {\Sigma {f_i}d_i^2} \right) – {\left( {\frac{1}{N}\sum\limits_{i = 1}^n \, {f_i}{d_i}} \right)^2} \ldots \ldots (vi)\)
Sometimes the deviations \({d_i} = {x_i} - A\) are all divisible by a common number \(h\) (say). If we define \({u_i} = \frac{{{x_i} - A}}{h} = \frac{{{d_i}}}{h},\,i = 1,\,2,\, \ldots ,\,n\), then we obtain the following formula for the variance.
\({\mathop{\rm Var}\nolimits} (X) = {h^2}\left[ {\left( {\frac{1}{N}\sum\limits_{i = 1}^n \, {f_i}u_i^2} \right) – {{\left( {\frac{1}{N}\sum\limits_{i = 1}^n \, {f_i}{u_i}} \right)}^2}} \right]\, \ldots .(vii)\)
Thus, any of the formulas \((iv),\,(v),\,(vi)\), and \((vii)\) can be used to find the variance of a discrete frequency distribution.
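Formula \((vii)\) is stated above without proof. A short derivation, assuming only the substitution \({d_i} = h{u_i}\) in formula \((vi)\), runs as follows:
\(\Sigma {f_i}d_i^2 = \Sigma {f_i}{\left( {h{u_i}} \right)^2} = {h^2}\,\Sigma {f_i}u_i^2\) and \(\Sigma {f_i}{d_i} = h\,\Sigma {f_i}{u_i}\)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = \frac{{{h^2}}}{N}\Sigma {f_i}u_i^2 - {\left( {\frac{h}{N}\Sigma {f_i}{u_i}} \right)^2} = {h^2}\left[ {\frac{1}{N}\Sigma {f_i}u_i^2 - {{\left( {\frac{1}{N}\Sigma {f_i}{u_i}} \right)}^2}} \right]\)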
Here, we use the formula, \({\mathop{\rm Var}\nolimits} (X) = \frac{1}{N}\left[ {\sum\limits_{i = 1}^n {{f_i}} {{\left( {{x_i} – \overline X } \right)}^2}} \right]\), and Standard Deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {\frac{1}{N}\left[ {\sum\limits_{i = 1}^n {{f_i}} {{\left( {{x_i} – \overline X } \right)}^2}} \right]} \)
Step 1: Write the given frequency distribution.
Step 2: Find the mean \(\overline X \) of the given frequency distribution.
Step 3: Compute deviations \(\left( {{x_i} – \overline X } \right)\) from the mean \(\overline X \).
Step 4: Find the squares of deviations obtained in step \(3\).
Step 5: Multiply the squared deviations by respective frequencies and obtain the total \(\Sigma {f_i}{\left( {{x_i} – \overline X } \right)^2}\).
Step 6: Divide the total obtained in step \(5\) by \(N = \Sigma {f_i}\) to obtain the variance.
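The six steps above might be coded as follows. This is a sketch under stated assumptions: the function name `freq_variance_direct` is hypothetical, and the values and frequencies are those of Q.3 in this article.

```python
import math


def freq_variance_direct(values, freqs):
    """Population variance of a discrete frequency distribution (direct method).

    The function name is illustrative only.
    """
    total = sum(freqs)                                         # N = sum of f_i
    mean = sum(f * x for x, f in zip(values, freqs)) / total   # Step 2: mean
    # Steps 3-5: squared deviations from the mean, weighted by frequency
    weighted_sq = sum(f * (x - mean) ** 2 for x, f in zip(values, freqs))
    return weighted_sq / total                                 # Step 6: divide by N


# Step 1: the frequency distribution from Q.3 of this article.
x = [2, 4, 6, 8, 10, 12, 14, 16]
f = [4, 4, 5, 15, 8, 5, 4, 5]
var = freq_variance_direct(x, f)
print(var, math.sqrt(var))
```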
Here, we use the formula, \({\mathop{\rm Var}\nolimits} (X) = \left[ {\left( {\frac{1}{N}\Sigma {f_i}d_i^2} \right) – {{\left( {\frac{1}{N}\Sigma {f_i}{d_i}} \right)}^2}} \right]\) and Standard Deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {\left( {\frac{1}{N}\Sigma {f_i}d_i^2} \right) – {{\left( {\frac{1}{N}\Sigma {f_i}{d_i}} \right)}^2}} \)
Step 1: Choose an assumed mean \(A\) and calculate the deviations \({d_i} = {x_i} - A\) of the observations from \(A\).
Step 2: Multiply each deviation by its frequency and add the products to find \(\Sigma {f_i}{d_i}\).
Step 3: Calculate the squares of deviations obtained in step \(1\) i.e., \(d_i^2\).
Step 4: Multiply the squared deviations by respective frequencies and obtain the total i.e., \(\Sigma {f_i}d_i^2\).
Step 5: Substitute the values in the formula, \({\mathop{\rm Var}\nolimits} (X) = \left( {\frac{1}{N}\Sigma {f_i}d_i^2} \right) – {\left( {\frac{1}{N}\Sigma {f_i}{d_i}} \right)^2}\) and simplify.
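As a rough illustration of this shortcut, the sketch below follows the five steps in Python. The function name `freq_variance_assumed_mean` is hypothetical; the data and the assumed mean \(A = 6.5\) come from Q.4 of this article.

```python
def freq_variance_assumed_mean(values, freqs, assumed_mean):
    """Population variance of a frequency distribution via an assumed mean A.

    The function name is illustrative only.
    """
    total = sum(freqs)                                        # N = sum of f_i
    d = [x - assumed_mean for x in values]                    # Step 1: d_i = x_i - A
    sum_fd = sum(f * di for f, di in zip(freqs, d))           # Step 2: sum of f_i d_i
    sum_fd_sq = sum(f * di * di for f, di in zip(freqs, d))   # Steps 3-4: sum of f_i d_i^2
    return sum_fd_sq / total - (sum_fd / total) ** 2          # Step 5: substitute


# Data and assumed mean from Q.4 of this article.
sizes = [3.5, 4.5, 5.5, 6.5, 7.5, 8.5, 9.5]
counts = [3, 7, 22, 60, 85, 32, 8]
print(freq_variance_assumed_mean(sizes, counts, 6.5))
```

Because the function works with unrounded intermediate sums, its output can differ in the last digit from a hand calculation that rounds each term first.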
Any of the strategies outlined above for a discrete frequency distribution may be applied in a grouped or continuous frequency distribution. We use the following algorithm for computing variance of a grouped or continuous frequency distribution.
Step 1: Find the mid-points of various classes.
Step 2: Take the deviations of these mid-points from an assumed mean. Denote these deviations by \({d_i}\)
Step 3: Divide the deviations obtained in step \(2\) by the class interval (width) \(h\) and denote the results by \({u_i}\), i.e., \({u_i} = \frac{{{d_i}}}{h}\).
Step 4: Multiply the frequency of each class with the corresponding \({u_i}\) and obtain \(\Sigma {f_i}{u_i}\).
Step 5: Square the values of \({u_i}\) and multiply them with the corresponding frequencies and obtain \(\Sigma {f_i}u_i^2\).
Step 6: Substitute the values of \(\Sigma {f_i}{u_i},\,\Sigma {f_i}u_i^2,\,h\) and \(N = \Sigma {f_i}\) in the formula \({\mathop{\rm Var}\nolimits} (X) = {h^2}\left\{ {\frac{1}{N}\Sigma {f_i}u_i^2 - {{\left( {\frac{1}{N}\Sigma {f_i}{u_i}} \right)}^2}} \right\}\) and simplify.
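The step-deviation algorithm above can be sketched in Python as shown below. The function name `grouped_mean_variance` and its parameter names are hypothetical. The class limits, frequencies, assumed mean \(A = 55\) and width \(h = 10\) are those of Q.5 in this article, and the mean is computed with the formula \(\overline X = A + h\left( {\frac{1}{N}\Sigma {f_i}{u_i}} \right)\) used there.

```python
import math


def grouped_mean_variance(class_limits, freqs, assumed_mean, width):
    """Mean and population variance of a grouped distribution (step-deviation method).

    The function and parameter names are illustrative only.
    """
    mids = [(lo + hi) / 2 for lo, hi in class_limits]          # Step 1: mid-points
    u = [(m - assumed_mean) / width for m in mids]             # Steps 2-3: u_i = (x_i - A) / h
    total = sum(freqs)                                         # N = sum of f_i
    sum_fu = sum(f * ui for f, ui in zip(freqs, u))            # Step 4: sum of f_i u_i
    sum_fu_sq = sum(f * ui * ui for f, ui in zip(freqs, u))    # Step 5: sum of f_i u_i^2
    mean = assumed_mean + width * sum_fu / total
    variance = width ** 2 * (sum_fu_sq / total - (sum_fu / total) ** 2)  # Step 6
    return mean, variance


# Classes and frequencies from Q.5 of this article.
classes = [(20, 30), (30, 40), (40, 50), (50, 60), (60, 70), (70, 80), (80, 90)]
students = [3, 6, 13, 15, 14, 5, 4]
mean, var = grouped_mean_variance(classes, students, assumed_mean=55, width=10)
print(mean, var, math.sqrt(var))
```

The sketch treats every observation in a class as if it sat at the class mid-point, which is the same approximation the hand method makes.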
The standard deviation is the square root of the arithmetic mean of the squared deviations of the data from their arithmetic mean. The variance is that mean of squared deviations itself.
So, mathematically, we can say that the square root of variance is standard deviation, and the square of standard deviation is variance.
i.e., Standard deviation \((\sigma ) = \sqrt {{\mathop{\rm Var}\nolimits} (X)} \)
Q.1. Find the variance and standard deviation for the data:
\({\rm{65,}}\,{\rm{68,}}\,{\rm{58,}}\,{\rm{44,}}\,{\rm{48,}}\,{\rm{45,}}\,{\rm{60,}}\,{\rm{62,}}\,{\rm{60,}}\,{\rm{50}}\)
Ans: Let \(\overline X \) be the mean of the given set of observations. Then,
\(\overline X = \frac{{65 + 68 + 58 + 44 + 48 + 45 + 60 + 62 + 60 + 50}}{{10}} = \frac{{560}}{{10}} = 56\)
\({x_i}\) | \({x_i} – \overline X = {x_i} – 56\) | \({\left( {{x_i} – \overline X } \right)^2}\) |
\(65\) | \(9\) | \(81\) |
\(58\) | \(2\) | \(4\) |
\(68\) | \(12\) | \(144\) |
\(44\) | \(−12\) | \(144\) |
\(48\) | \(−8\) | \(64\) |
\(45\) | \(−11\) | \(121\) |
\(60\) | \(4\) | \(16\) |
\(62\) | \(6\) | \(36\) |
\(60\) | \(4\) | \(16\) |
\(50\) | \(−6\) | \(36\) |
\(\Sigma {\left( {{x_i} – \overline X } \right)^2} = 662\) |
Here,
\(n = 10\)
\(\Sigma {\left( {{x_i} – \overline X } \right)^2} = 662\)
\(\therefore \) Variance, \({\sigma ^2} = \frac{1}{n}\Sigma {\left( {{x_i} – \overline X } \right)^2} = \frac{{662}}{{10}} = 66.2\)
Hence, standard deviation \( = \sqrt {{\rm{ Variance }}} = \sqrt {66.2} \)
\(\therefore \,\sigma \approx 8.14\)
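A hand calculation like this one can be cross-checked with Python's standard `statistics` module. The sketch below assumes the Q.1 data. Note that `pvariance` and `pstdev` divide by \(n\), matching the formulas in this article, whereas `variance` divides by \(n - 1\) (the sample variance) and therefore gives a different number.

```python
import statistics

# Observations from Q.1 of this article.
data = [65, 68, 58, 44, 48, 45, 60, 62, 60, 50]

pop_var = statistics.pvariance(data)     # population variance: divides by n
pop_sd = statistics.pstdev(data)         # population standard deviation
sample_var = statistics.variance(data)   # sample variance: divides by n - 1

print(pop_var, pop_sd, sample_var)
```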
Q.2. For a group of \(200\) candidates the mean and S.D. were found to be \(40\) and \(15\) respectively. Later on, it was found that the score \(43\) was misread as \(34\). Find the correct mean and correct S.D.
Ans: Given, \(n = 200,\,\overline X = 40,\,\sigma = 15\)
\(\overline X = \frac{1}{n}\Sigma {x_i} \Rightarrow \Sigma {x_i} = n\overline X = 200 \times 40 = 8000\)
Now, \({\rm{Correct}}{\mkern 1mu} {\rm{ed}}\,\Sigma {x_i} = {\rm{Incorrect}}\,\Sigma {x_i} – \left( {{\rm{Sum}}\,{\rm{of}}\,{\rm{incorrect}}\,{\rm{values}}} \right) + \left( {{\rm{Sum}}\,{\rm{of}}\,{\rm{correct}}\,{\rm{values}}} \right)\)
\({\rm{ = 8000 - 34 + 43 = 8009}}\)
\(\therefore \) Corrected mean \( = \frac{{8009}}{{200}} = 40.045\)
And, \(\sigma = 15\)
\( \Rightarrow {15^2} = \)Variance
\( \Rightarrow {15^2} = \frac{1}{{200}}\left( {\Sigma x_i^2} \right) – {\left( {\frac{1}{{200}}\Sigma {x_i}} \right)^2}\)
\( \Rightarrow 225 = \frac{1}{{200}}\left( {\Sigma x_i^2} \right) – {\left( {\frac{{8000}}{{200}}} \right)^2}\)
\( \Rightarrow 225 = \frac{1}{{200}}\left( {\sum {x_i^2} } \right) – 1600\)
\( \Rightarrow \Sigma x_i^2 = 200 \times 1825 = 365000\)
\( \Rightarrow {\rm{Incorrect}}\,\Sigma x_i^2 = 365000\)
\({\rm{Correct}}{\mkern 1mu} {\rm{ed}}\,{\rm{ }}\Sigma x_i^2 = \left( {{\rm{Incorrect}}\,\Sigma x_i^2} \right) – \left( {{\rm{Sum}}\,{\rm{of}}\,{\rm{squares}}\,{\rm{of}}\,{\rm{incorrect}}\,{\rm{values}}} \right) + \left( {{\rm{Sum}}\,{\rm{of}}\,{\rm{squares}}\,{\rm{of}}\,{\rm{correct}}\,{\rm{values}}} \right)\)
\( = 365000 – {(34)^2} + {(43)^2} = 365693\)
\({\rm{Correct}}{\mkern 1mu} {\rm{ed}}\,\sigma = \sqrt {\frac{1}{n}{\rm{ Correct}}{\mkern 1mu} {\rm{ed}}\,\Sigma x_i^2 – {{\left( {\frac{1}{n}{\rm{ Correct}}{\mkern 1mu} {\rm{ed}}\,\Sigma {x_i}} \right)}^2}} \)
\( = \sqrt {\frac{{365693}}{{200}} – {{\left( {\frac{{8009}}{{200}}} \right)}^2}} \)
\( = \sqrt {1828.465 – 1603.602} \)
\(\therefore \,\sigma = 14.995\).
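The correction procedure in this answer generalises to any single misread value. A minimal Python sketch, with a hypothetical function name `corrected_mean_sd` and hypothetical parameter names, recovers \(\Sigma {x_i}\) and \(\Sigma x_i^2\) from the reported mean and S.D., swaps the wrong value for the right one, and recomputes both statistics.

```python
import math


def corrected_mean_sd(n, mean, sd, wrong, right):
    """Correct a reported mean and population S.D. after one misread value.

    The function and parameter names are illustrative only.
    """
    sum_x = n * mean                          # incorrect sum of x_i
    sum_x_sq = n * (sd ** 2 + mean ** 2)      # incorrect sum of x_i^2, from Var = sum(x^2)/n - mean^2
    sum_x += right - wrong                    # replace the misread value
    sum_x_sq += right ** 2 - wrong ** 2       # replace its square
    new_mean = sum_x / n
    new_var = sum_x_sq / n - new_mean ** 2
    return new_mean, math.sqrt(new_var)


# Figures from Q.2: 200 candidates, mean 40, S.D. 15, score 43 misread as 34.
new_mean, new_sd = corrected_mean_sd(200, 40, 15, wrong=34, right=43)
print(new_mean, new_sd)
```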
Q.3. Find the variance and standard deviation of the following frequency distribution.
Variable \(\left( {{x_i}} \right)\) | \(2\) | \(4\) | \(6\) | \(8\) | \(10\) | \(12\) | \(14\) | \(16\) |
Frequency \(\left( {{f_i}} \right)\) | \(4\) | \(4\) | \(5\) | \(15\) | \(8\) | \(5\) | \(4\) | \(5\) |
Ans: Calculation of Variance and Standard Deviation
Variable \(\left( {{x_i}} \right)\) | Frequency \(\left( {{f_i}} \right)\) | \({f_i}{x_i}\) | \({x_i} - \overline X = {x_i} - 9\) | \({\left( {{x_i} - \overline X } \right)^2}\) | \({f_i}{\left( {{x_i} - \overline X } \right)^2}\) |
\(2\) | \(4\) | \(8\) | \(−7\) | \(49\) | \(196\) |
\(4\) | \(4\) | \(16\) | \(−5\) | \(25\) | \(100\) |
\(6\) | \(5\) | \(30\) | \(−3\) | \(9\) | \(45\) |
\(8\) | \(15\) | \(120\) | \(−1\) | \(1\) | \(15\) |
\(10\) | \(8\) | \(80\) | \(1\) | \(1\) | \(8\) |
\(12\) | \(5\) | \(60\) | \(3\) | \(9\) | \(45\) |
\(14\) | \(4\) | \(56\) | \(5\) | \(25\) | \(100\) |
\(16\) | \(5\) | \(80\) | \(7\) | \(49\) | \(245\) |
\(N = \Sigma {f_i} = 50\) | \(\Sigma {f_i}{x_i} = 450\) | \(\Sigma {f_i}{\left( {{x_i} – \overline X } \right)^2} = 754\) |
Here,
\(N = 50\)
\(\Sigma {f_i}{x_i} = 450\)
\(\Sigma {f_i}{\left( {{x_i} – \overline X } \right)^2} = 754\)
\(\therefore \,\overline X = \frac{1}{N}\Sigma {f_i}{x_i} = \frac{{450}}{{50}} = 9\)
\({\text{Var}}(X) = \frac{1}{N}\left\{ {{ {\Sigma }}{f_i}{{\left( {{x_i} – \overline X } \right)}^2}} \right\} = \frac{{754}}{{50}}\)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = 15.08\)
Hence, \({\rm{ S}}{\rm{. D}}{\rm{. }} = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {15.08} \)
\(\therefore \,\sigma = 3.88\)
Q.4. Calculate the variance and standard deviation from the data given below:
Size of item | \(3.5\) | \(4.5\) | \(5.5\) | \(6.5\) | \(7.5\) | \(8.5\) | \(9.5\) |
Frequency | \(3\) | \(7\) | \(22\) | \(60\) | \(85\) | \(32\) | \(8\) |
Ans: Let the assumed mean be \(A = 6.5\)
Size of item \({x_i}\) | \({f_i}\) | \({d_i}{\rm{ = }}{x_i}{\rm{ – 6}}{\rm{.5}}\) | \(d_i^2\) | \({f_i}{d_i}\) | \({f_i}d_i^2\) |
\(3.5\) | \(3\) | \(−3\) | \(9\) | \(−9\) | \(27\) |
\(4.5\) | \(7\) | \(−2\) | \(4\) | \(−14\) | \(28\) |
\(5.5\) | \(22\) | \(−1\) | \(1\) | \(−22\) | \(22\) |
\(6.5\) | \(60\) | \(0\) | \(0\) | \(0\) | \(0\) |
\(7.5\) | \(85\) | \(1\) | \(1\) | \(85\) | \(85\) |
\(8.5\) | \(32\) | \(2\) | \(4\) | \(64\) | \(128\) |
\(9.5\) | \(8\) | \(3\) | \(9\) | \(24\) | \(72\) |
\(N = \Sigma {f_i} = 217\) | \(\Sigma {f_i}{d_i} = 128\) | \(\Sigma {f_i}d_i^2 = 362\) |
Here,
\(N = 217\)
\(\Sigma {f_i}{d_i} = 128\)
\(\Sigma {f_i}d_i^2 = 362\)
\(\therefore \,{\mathop{\rm Var}\nolimits} (X) = \left( {\frac{1}{N}\Sigma {f_i}d_i^2} \right) – {\left( {\frac{1}{N}\Sigma {f_i}{d_i}} \right)^2}\)
\( = \frac{{362}}{{217}} – {\left( {\frac{{128}}{{217}}} \right)^2}\)
\( = 1.668 - 0.348\)
\({\sigma ^2} = 1.320\)
Hence, Standard Deviation \( = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {1.320} \)
\(\therefore \,\sigma = 1.149\)
Q.5. Calculate the mean and standard deviation for the following distribution:
Marks | \(20 − 30\) | \(30 − 40\) | \(40 − 50\) | \(50 − 60\) | \(60 − 70\) | \(70 − 80\) | \(80 − 90\) |
Number of students | \(3\) | \(6\) | \(13\) | \(15\) | \(14\) | \(5\) | \(4\) |
Ans:
Class-interval | Frequency \(\left( {{f_i}} \right)\) | Mid-values \(\left( {{x_i}} \right)\) | \({u_i} = \frac{{{x_i} – 55}}{{10}}\) | \({f_i}{u_i}\) | \(u_i^2\) | \({f_i}u_i^2\) |
\(20 − 30\) | \(3\) | \(25\) | \(−3\) | \(−9\) | \(9\) | \(27\) |
\(30 − 40\) | \(6\) | \(35\) | \(−2\) | \(−12\) | \(4\) | \(24\) |
\(40 − 50\) | \(13\) | \(45\) | \(−1\) | \(−13\) | \(1\) | \(13\) |
\(50 − 60\) | \(15\) | \(55\) | \(0\) | \(0\) | \(0\) | \(0\) |
\(60 − 70\) | \(14\) | \(65\) | \(1\) | \(14\) | \(1\) | \(14\) |
\(70 − 80\) | \(5\) | \(75\) | \(2\) | \(10\) | \(4\) | \(20\) |
\(80 − 90\) | \(4\) | \(85\) | \(3\) | \(12\) | \(9\) | \(36\) |
\(N = \Sigma {f_i} = 60\) | \(\Sigma {f_i}{u_i} = 2\) | \(\Sigma {f_i}u_i^2 = 134\) |
Here,
\(N = 60\)
\(\Sigma {f_i}{u_i} = 2\)
\(\Sigma {f_i}u_i^2 = 134\)
\(A = 55\)
\(h = 10\)
\(\therefore \) Mean, \(\overline X = A + h\left( {\frac{1}{N}\Sigma {f_i}{u_i}} \right)\)
\( = 55 + 10\left( {\frac{2}{{60}}} \right)\)
\(\bar X = 55.333\)
\({\mathop{\rm Var}\nolimits} (X) = {h^2}\left\{ {\left( {\frac{1}{N}\Sigma {f_i}u_i^2} \right) – {{\left( {\frac{1}{N}\Sigma {f_i}{u_i}} \right)}^2}} \right\}\)
\( = 100\left[ {\frac{{134}}{{60}} – {{\left( {\frac{2}{{60}}} \right)}^2}} \right]\)
\({\sigma ^2} \approx 223.22\)
Standard Deviation \( = \sqrt {{\mathop{\rm Var}\nolimits} (X)} = \sqrt {223.22} \)
\(\therefore \,\sigma \approx 14.94\)
Standard deviation and variance are statistical quantities that measure dispersion around a measure of central tendency, in most cases the arithmetic mean. The higher the standard deviation and variance of a collection of scores, the more the observations (or data points) are spread out around the mean. This article explains and derives the formulas for calculating them for three types of data: individual observations, discrete frequency distributions, and continuous or grouped frequency distributions. Although they appear to be distinct values, the two are directly related: the standard deviation is the positive square root of the variance.
Q.1. How do you calculate variance and standard deviation?
Ans: If \({x_1},\,{x_2},\, \ldots ,\,{x_n}\) are \(n\) values of a variable \(X\), then
\({\mathop{\rm Var}\nolimits} (X) = \frac{1}{n}\left\{ {\sum\limits_{i = 1}^n {{{\left( {{x_i} - \overline X } \right)}^2}} } \right\}\), and Standard deviation \( = \sqrt {{\mathop{\rm Var}\nolimits} (X)} \)
To calculate the variance, follow these steps:
Step 1: Find the mean \((\overline X )\) of the given observations
Step 2: Subtract \(\overline X \) from each observation
Step 3: Find the square of each result
Step 4: Find the average of all squared values, which is the required variance.
Q.2. Which is better: standard deviation or variance?
Ans: Standard deviation and variance are statistical quantities that measure dispersion around the arithmetic mean. The higher the standard deviation and variance of the data, the more the observations (or data points) are spread out around the mean.
Although standard deviation and variance are closely related descriptive statistics, the standard deviation is more commonly used because it is more intuitive in terms of units of measurement: the variance is reported in the squared units of measurement, whereas the standard deviation is reported in the same units as the data.
Q.3. What is the relation between standard deviation and variance?
Ans: The variance is the square of the standard deviation. In other words, the standard deviation is the positive square root of variance.
Q.4. Why do we need variance and standard deviation?
Ans: Variance and standard deviation both describe how widely the data in a population are spread around the mean. The standard deviation is usually the easier of the two to interpret, because it is expressed in the same units as the data.
Q.5. What does the variance tell you?
Ans: The variance measures how much the values of a variable vary. It is computed as the average of the squared deviations from the mean, so it quantifies the degree of dispersion in a data set: the larger the variance, the more spread out the data are about the mean.
Hope this detailed article on Variance and Standard Deviation helps you in your preparation. In case of any query, reach out to us in the comment section and we will get back to you at the earliest.