November 10, 2024
Mean of Grouped Data: Statistics is the study of numerical data. It deals with the collection, classification and analysis of numerical data. Data can also be classified as grouped and ungrouped data. Mean is one of the important parameters in statistics that measure the central tendency of data. It is the simplest and most widely used measure of the average of a collective data set.
In this article, we will learn how to find the arithmetic mean of grouped data, or a discrete frequency distribution. Three methods are used to solve such questions: the direct method, the short-cut method, and the step-deviation method. We will discuss each of them in this article.
Arithmetic Mean in statistics is used for the measurement of average and for denoting the central tendency of data. Mathematically, it is equal to the ratio of the sum of numbers in a given set to the total number of values present in the set. In other words, to find the mean of a set of data, add up all the values and then divide this total by the number of values. The mathematical symbol or notation for average is \(\overline X \) read as “\(x\) bar”.
The mean of \(n\) observations (variables) \({x_1},\,{x_2},\,{x_3},\,{x_4},\,….,\,{x_n}\) is given by the formula:
Mean \( = \frac{{{x_1} + {x_2} + {x_3} + … + {x_n}}}{n} = \frac{{\sum {{x_i}} }}{n},\) where \(\sum {{x_i}} = {x_1} + {x_2} + {x_3} + {x_4} + … + {x_n}.\)
Thus, mean \(= \frac{{{\rm{sum}}\,{\rm{of}}\,{\rm{all}}\,{\rm{observations}}}}{{{\rm{Total}}\,{\rm{number}}\,{\rm{of}}\,{\rm{observations}}}}\)
The Greek letter \({\rm{\Sigma }}\) represents the sum.
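The definition translates directly into code. The following Python sketch is only an illustration: the function name `arithmetic_mean` and the sample marks are made up for this example.

```python
def arithmetic_mean(values):
    """Return the sum of the observations divided by their count."""
    if not values:
        raise ValueError("mean of an empty data set is undefined")
    return sum(values) / len(values)


# Hypothetical marks of five students (illustrative values only).
marks = [4, 8, 15, 16, 22]
print(arithmetic_mean(marks))  # 65 / 5 = 13.0
```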
Grouped data is data in which the observations are arranged into groups, so that the groups summarize the data in a more meaningful way. This helps us perceive certain salient features of the data at a glance.
To understand the concept of grouped data, let us take the following example:
The following table shows how many students in a school fall into each marks group.
Marks group | Number of Students |
\({\rm{30 – 40}}\) | \({\rm{13}}\) |
\({\rm{40 – 50}}\) | \({\rm{15}}\) |
\({\rm{50 – 60}}\) | \({\rm{5}}\) |
\({\rm{60 – 70}}\) | \({\rm{2}}\) |
The data are summarized in groups with equal intervals. These groups are known as class intervals.
Here, the class intervals are \(30 – 40,\,40 – 50,\,50 – 60,\,60 – 70.\)
From the table, we get to know that the marks of \(13\) students lie in the class interval \(30 – 40,\) and the marks of \(15\) students lie in the class interval \(40 – 50.\)
Similarly, the marks of \(5\) students lie in the class interval \(50 – 60,\) and the marks of \(2\) students lie in the class interval \(60 – 70.\)
There are three methods to find the mean of grouped data. They are,
(a) Direct Method
(b) Shortcut Method
(c) Step Deviation Method
The arithmetic mean of a grouped data can be obtained through the direct method.
The formula to find the arithmetic mean with the help of the direct method is as follows:
Let \({x_1},\,{x_2},\,{x_3},\,{x_4},\,…,\,{x_n}\) be the observations with the frequency \({f_1},\,{f_2},\,{f_3},\,{f_4},\,….,\,{f_n}.\)
Then, the mean is calculated using the formula:
\(\overline x = \frac{{{x_1}{f_1} + {x_2}{f_2} + {x_3}{f_3} + {x_4}{f_4} + …. + {x_n}{f_n}}}{{\sum {{f_i}} }}\)
Here, \({x_1}{f_1} + {x_2}{f_2} + {x_3}{f_3} + {x_4}{f_4} + …. + {x_n}{f_n} = \sum {{x_i}{f_i}} \) is the sum of the products of each observation and its frequency, and \(\sum {{f_i}} \) is the sum of all frequencies.
In generalized form, we can write the arithmetic mean direct method formula as,
\(\overline x = \frac{{\sum {{x_i}{f_i}} }}{{\sum {{f_i}} }}\)
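As a rough illustration, the direct method can be sketched in Python. The function name `grouped_mean_direct` is an assumption of this sketch, and the data is the marks table shown earlier, with each class represented by its midpoint (the class mark used in the solved examples).

```python
def grouped_mean_direct(x, f):
    """Direct method: sum(x_i * f_i) / sum(f_i)."""
    if len(x) != len(f):
        raise ValueError("x and f must have the same length")
    total_frequency = sum(f)
    if total_frequency == 0:
        raise ValueError("total frequency must be positive")
    return sum(xi * fi for xi, fi in zip(x, f)) / total_frequency


# Marks table from earlier: classes 30-40, 40-50, 50-60, 60-70.
class_marks = [35, 45, 55, 65]   # midpoints, used as x_i
frequencies = [13, 15, 5, 2]     # number of students, f_i

mean = grouped_mean_direct(class_marks, frequencies)
print(round(mean, 2))  # 1535 / 35, about 43.86
```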
If the values of \(x\) and \(f\) are very large, calculating the mean with the direct method is time-consuming. To simplify the arithmetic, we use the short-cut method for large values.
We take deviations from an arbitrary point, known as the assumed mean.
In the assumed mean method, we assume a certain number within the data as the mean. We then calculate the deviation of each class from the assumed mean, take the weighted average of these deviations with the frequencies as weights, and add this average to the assumed mean.
Thus, the mean is calculated by the formula: \(\overline x = A + \frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{d_i}} ,\) where \({d_i} = {x_i} – A.\)
Here, \(A\) is the assumed mean, \({f_i}\) denotes the frequency of the \({i^{{\rm{th}}}}\) class, which has the deviation \({d_i}\) from the assumed mean, and \(N = \sum {{f_i}} \) is the total frequency.
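The choice of \(A\) does not change the result, because \(A + \frac{1}{N}\sum {{f_i}\left( {{x_i} – A} \right)} = \frac{1}{N}\sum {{f_i}{x_i}} \) once \(N = \sum {{f_i}} \) is substituted. A minimal Python sketch of the method follows; the function name `grouped_mean_assumed` is hypothetical, and the data is the marks table shown earlier with class midpoints as \({x_i}.\)

```python
def grouped_mean_assumed(x, f, assumed_mean):
    """Short-cut method: A + sum(f_i * d_i) / N, with d_i = x_i - A."""
    n_total = sum(f)
    deviations = [xi - assumed_mean for xi in x]
    weighted = sum(fi * di for fi, di in zip(f, deviations))
    return assumed_mean + weighted / n_total


class_marks = [35, 45, 55, 65]   # midpoints of 30-40, 40-50, 50-60, 60-70
frequencies = [13, 15, 5, 2]

# Try several assumed means; each one leads to the same answer.
for a in (35, 45, 55):
    print(a, round(grouped_mean_assumed(class_marks, frequencies, a), 2))
# every line shows a mean of about 43.86
```

Picking \(A\) near the middle of the data keeps the deviations small, which is the point of the method when working by hand.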
To calculate the arithmetic mean using the step deviation method, we first find the deviations \({d_i} = {x_i} – A\) and a common number \(h\) (say) that divides all of them, typically the class width. The arithmetic mean is then given by \(\overline x = A + h\left( {\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{u_i}} } \right),\) where \({u_i} = \frac{{{x_i} – A}}{h}.\)
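The step deviation formula can be sketched the same way. In the Python example below, the function name `grouped_mean_step_deviation` is an assumption of this sketch, and the data is again the marks table shown earlier, with \(A = 45\) and \(h = 10\) chosen for illustration.

```python
def grouped_mean_step_deviation(x, f, assumed_mean, h):
    """Step-deviation method: A + h * sum(f_i * u_i) / N, u_i = (x_i - A) / h."""
    if h == 0:
        raise ValueError("h must be non-zero")
    n_total = sum(f)
    u = [(xi - assumed_mean) / h for xi in x]
    return assumed_mean + h * sum(fi * ui for fi, ui in zip(f, u)) / n_total


class_marks = [35, 45, 55, 65]   # midpoints of 30-40, 40-50, 50-60, 60-70
frequencies = [13, 15, 5, 2]

# With A = 45 and h = 10 the u_i are -1, 0, 1, 2, so sum(f_i * u_i) = -4.
print(round(grouped_mean_step_deviation(class_marks, frequencies, 45, 10), 2))
# 45 + 10 * (-4 / 35), about 43.86
```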
Q.1. Calculate the mean from the following data using the assumed mean method.
Marks: | \(0 – 10\) | \(10 – 20\) | \(20 – 30\) | \(30 – 40\) | \(40 – 50\) |
Number of students: | \(5\) | \(2\) | \(3\) | \(8\) | \(2\) |
Ans: When the data is presented in the form of class intervals, we must find the mid-point of each class, which is known as the Class Mark.
Class Mark\( = \frac{{{\rm{Upper}}\,{\rm{limit}} + {\rm{Lower}}\,{\rm{limit}}}}{2}\)
Let us say the assumed mean \(\left( A \right)\) is \(25.\)
Marks | Midpoint \(\left( {{x_i}} \right)\) | Number of students \(\left( {{f_i}} \right)\) | \({d_i} = {x_i} – A\) | \({f_i}{d_i}\) |
\(0 – 10\) | \(5\) | \(5\) | \(-20\) | \( – 100\) |
\(10 – 20\) | \(15\) | \(2\) | \(-10\) | \( – 20\) |
\(20 – 30\) | \(25\) | \(3\) | \(0\) | \( 0\) |
\(30 – 40\) | \(35\) | \(8\) | \(10\) | \( 80\) |
\(40 – 50\) | \(45\) | \(2\) | \(20\) | \( 40\) |
\(\sum {{f_i}} = 20\) | \(\sum {{f_i}{d_i}} = 0\) |
We have,
\(N = \sum {{f_i}} = 20,\,A = 25,\,\sum\limits_{i = 1}^n {{f_i}{d_i}} = 0.\)
\(\overline x = A + \left\{ {\frac{1}{N}\,\sum\limits_{i = 1}^n {{f_i}{d_i}} } \right\} = 25 + \left\{ {\frac{1}{{20}} \times 0} \right\} = 25 + 0 = 25.\)
Hence, the mean is \(25.\)
Q.2. Find the mean for the following data set using the direct method.
Class Interval | Frequency |
\(0 – 2\) | \(4\) |
\(2 – 4\) | \(3\) |
\(4 – 6\) | \(5\) |
\(6 – 8\) | \(7\) |
Ans:
Class Interval | Class mark/midpoint \(\left( {{x_i}} \right)\) | Frequency \(\left( {{f_i}} \right)\) | \({x_i}{f_i}\) |
\(0 – 2\) | \(1\) | \(4\) | \(4\) |
\(2 – 4\) | \(3\) | \(3\) | \(9\) |
\(4 – 6\) | \(5\) | \(5\) | \(25\) |
\(6 – 8\) | \(7\) | \(7\) | \(49\) |
\(\sum {{f_i}} = 19\) | \(\sum {{x_i}{f_i}} = 87\) |
(Mean) \(\overline x = \frac{{\sum {{x_i}{f_i}} }}{{\sum {{f_i}} }} \Rightarrow \overline x = \frac{{87}}{{19}} \Rightarrow \overline x = 4.58\) (approx)
Hence, the mean of the given data is \(4.58\) (approx.)
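The arithmetic in this answer can be reproduced with exact fractions, which makes the rounding step explicit. This is a small Python sketch using the standard `fractions` module; the variable names are chosen for this example only.

```python
from fractions import Fraction

# Class intervals and frequencies from Q.2.
intervals = [(0, 2), (2, 4), (4, 6), (6, 8)]
frequencies = [4, 3, 5, 7]

# Class mark = (lower limit + upper limit) / 2, kept exact.
class_marks = [Fraction(lo + hi, 2) for lo, hi in intervals]

sum_xf = sum(x * f for x, f in zip(class_marks, frequencies))
sum_f = sum(frequencies)
mean = sum_xf / sum_f

print(mean)                   # 87/19
print(round(float(mean), 2))  # 4.58
```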
Q.3. Find the mean of the following frequency distribution using the step deviation method.
Class \(\left( {{x_i}} \right)\) | \(0 – 20\) | \(20 – 40\) | \(40 – 60\) | \(60 – 80\) | \(80 – 100\) |
Frequency \(\left( {{f_i}} \right)\) | \(15\) | \(18\) | \(21\) | \(29\) | \(17\) |
Ans:
The assumed mean \(\left( A \right)\) is \(50\) and \(h\) is \(20.\)
Class intervals | Mid-Value \(\left( {{x_i}} \right)\) | \(\left( {{f_i}} \right)\) | \({d_i} = {x_i} – A\) | \({u_i} = \frac{{{x_i} – A}}{h}\) | \({f_i}{u_i}\) |
\(0 – 20\) | \(10\) | \(15\) | \(-40\) | \(-2\) | \( – 30\) |
\(20 – 40\) | \(30\) | \(18\) | \(-20\) | \(-1\) | \( – 18\) |
\(40 – 60\) | \(50\) | \(21\) | \(0\) | \(0\) | \( 0\) |
\(60 – 80\) | \(70\) | \(29\) | \(20\) | \(1\) | \( 29\) |
\(80 – 100\) | \(90\) | \(17\) | \(40\) | \(2\) | \( 34\) |
\(\sum {{f_i}} = 100\) | \(\sum\limits_{i = 1}^n {{f_i}{u_i}} = 15\) |
We have,
\(N = \sum {{f_i}} = 100,\,A = 50,\,\sum\limits_{i = 1}^n {{f_i}{u_i}} = 15,\,h = 20.\)
\(\overline x = A + h\left\{ {\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{u_i}} } \right\} = 50 + 20\left\{ {\frac{1}{{100}} \times 15} \right\} = 53.\)
Hence, the mean is \(53.\)
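As a cross-check, the same table can be worked through in Python and compared with the direct method. This is only a sketch; the variable names are chosen for this example.

```python
# Class intervals and frequencies from Q.3, with A = 50 and h = 20.
intervals = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [15, 18, 21, 29, 17]
A, h = 50, 20

class_marks = [(lo + hi) / 2 for lo, hi in intervals]   # 10, 30, 50, 70, 90
u = [(x - A) / h for x in class_marks]                  # -2, -1, 0, 1, 2
sum_fu = sum(f * ui for f, ui in zip(frequencies, u))
n_total = sum(frequencies)

step_mean = A + h * sum_fu / n_total
direct_mean = sum(x * f for x, f in zip(class_marks, frequencies)) / n_total

print(sum_fu, step_mean, direct_mean)  # 15.0 53.0 53.0
```

Both methods give the same mean, as expected, since the step deviation method only rescales the deviations before averaging.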
Q.4. Find the mean of the following distribution using direct method.
Class-Interval | \(15 – 25\) | \(25 – 35\) | \(35 – 45\) | \(45 – 55\) | \(55 – 65\) | \(65 – 75\) | \(75 – 85\) |
Frequency | \(6\) | \(7\) | \(6\) | \(4\) | \(4\) | \(2\) | \(1\) |
Ans:
When the data is presented in the form of class intervals, we must find the mid-point of each class, which is known as the Class Mark.
Class Mark\( = \frac{{{\rm{Upper}}\,{\rm{limit}} + {\rm{Lower}}\,{\rm{limit}}}}{2}\)
Class- Interval | Class Mark\(\left( {{x_i}} \right)\) | Frequency \(\left( {{f_i}} \right)\) | \({x_i}{f_i}\) |
\(15 – 25\) | \(20\) | \(6\) | \(120\) |
\(25 – 35\) | \(30\) | \(7\) | \(210\) |
\(35 – 45\) | \(40\) | \(6\) | \(240\) |
\(45 – 55\) | \(50\) | \(4\) | \(200\) |
\(55 – 65\) | \(60\) | \(4\) | \(240\) |
\(65 – 75\) | \(70\) | \(2\) | \(140\) |
\(75 – 85\) | \(80\) | \(1\) | \(80\) |
Total | \(30\) | \(1230\) |
Now, the mean formula is
\(\overline x = \frac{{\sum {{x_i}{f_i}} }}{{\sum {{f_i}} }} \Rightarrow \overline x = \frac{{1230}}{{30}} \Rightarrow \overline x = 41\)
Hence, the required mean is \(41.\)
Q.5. The following table shows the class intervals and frequencies. Find the mean using the short-cut method.
Class-Interval | \(0 – 20\) | \(20 – 40\) | \(40 – 60\) | \(60 – 80\) | \(80 – 100\) |
Frequencies | \(7\) | \(10\) | \(10\) | \(8\) | \(10\) |
Ans:
Let the assumed mean \(\left( A \right)\) be \(50.\)
Classes | Midpoint \(\left( {{x_i}} \right)\) | \(\left( {{f_i}} \right)\) | \({d_i} = {x_i} – A\) | \({f_i}{d_i}\) |
\(0 – 20\) | \(10\) | \(7\) | \(-40\) | \(-280\) |
\(20 – 40\) | \(30\) | \(10\) | \(-20\) | \(-200\) |
\(40 – 60\) | \(50\) | \(10\) | \(0\) | \(0\) |
\(60 – 80\) | \(70\) | \(8\) | \(20\) | \(160\) |
\(80 – 100\) | \(90\) | \(10\) | \(40\) | \(400\) |
\(\sum {{f_i}} = 45\) | \(\sum {{f_i}{d_i}} = 80\) |
We have,
\(N = \sum {{f_i}} = 45,\,A = 50,\,\sum\limits_{i = 1}^n {{f_i}{d_i}} = 80.\)
\(\overline x = A + \left\{ {\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{d_i}} } \right\} = 50 + \left\{ {\frac{1}{{45}} \times 80} \right\} = 50 + 1.78 = 51.78\) (approx.)
Hence, the mean is \(51.78\) (approx.)
In this article, we learned about the arithmetic mean, the need for it, and its formula. We also learned the methods of calculating the mean when large data sets are given. We discussed the formulas for the mean of grouped data: the direct method, the assumed mean method, and the step-deviation method. These notes on the mean of grouped data will help in solving questions quickly.
Frequently asked questions related to the mean of grouped data are listed as follows:
Q.1. What is the mean of data?
Ans: In statistics, the mean is a measure of the average. The arithmetic mean of a data set is the sum of all the values divided by the total number of values. It denotes the central tendency of the data.
Q.2. What do you understand by grouped data?
Ans: To present the data in a more meaningful way, we condense the data into a convenient number of classes or groups, generally not exceeding \(10\) and not less than \(5.\) This helps us in deriving the salient features of data. For grouped data, we use the concept of class interval and class mark.
Q.3. How do you find the mean of data?
Ans: The mean of \(n\) observations (variables) \({x_1},\,{x_2},\,{x_3},\,{x_4},\,…,\,{x_n}\) is given by the formula:
Mean \( = \frac{{{x_1} + {x_2} + {x_3} + {x_4} + … + {x_n}}}{n} = \frac{{\sum {{x_i}} }}{n}\)
where \(\sum {{x_i}} = {x_1} + {x_2} + {x_3} + {x_4} + …. + {x_n}.\)
Thus, mean \( = \frac{{{\rm{Sum}}\,{\rm{of}}\,{\rm{all}}\,{\rm{observations}}}}{{{\rm{Total}}\,{\rm{number}}\,{\rm{of}}\,{\rm{observations}}}}\)
Q.4. How do you find the mean, median and mode of the grouped data?
Ans: The formula of finding the mean of grouped data using direct method is
\(\overline x = \frac{{\sum {{x_i}{f_i}} }}{{\sum {{f_i}} }}\)
The formula of finding the mean of grouped data using short-cut method is
\(\overline x = A + \frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{d_i}} \)
The formula of finding the mean of grouped data using step deviation method is
\(\overline x = A + h\left\{ {\frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{u_i}} } \right\}.\)
The formula of finding the median of grouped data is \(l + \left( {\frac{{\frac{n}{2} – cf}}{f}} \right) \times h,\) where \(l\) is the lower limit of the median class, \(n\) is the sum of the frequencies, \(f\) is the frequency of the median class, \(cf\) is the cumulative frequency of the class preceding the median class, and \(h\) is the class width.
The formula of finding the mode of grouped data is \(l + \frac{{f – {f_1}}}{{2\,f – {f_1} – {f_2}}} \times h\) where, \(l\) is the lower limit of the modal class, \(f\) is the frequency of the modal class, \(h\) is the width of the modal class, \({f_1}\) is the frequency of the class preceding the modal class, \({f_2}\) is the frequency of the class succeeding the modal class.
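The median and mode formulas can also be sketched in Python. The function names below are hypothetical, and two conventions are assumptions of this sketch rather than part of the formulas: the median class is taken as the first class whose cumulative frequency reaches \(\frac{n}{2},\) and a missing neighbour of the modal class is treated as having frequency \(0.\) The data is the frequency distribution from Q.3 above.

```python
def grouped_median(intervals, freqs):
    """Median = l + ((n/2 - cf) / f) * h, evaluated for the median class."""
    n = sum(freqs)
    cumulative = 0  # cumulative frequency before the current class (cf)
    for (lower, upper), f in zip(intervals, freqs):
        # Assumed convention: first class whose cumulative frequency reaches n/2.
        if f > 0 and cumulative + f >= n / 2:
            return lower + ((n / 2 - cumulative) / f) * (upper - lower)
        cumulative += f
    raise ValueError("no median class found; are all frequencies zero?")


def grouped_mode(intervals, freqs):
    """Mode = l + (f - f1) / (2f - f1 - f2) * h, evaluated for the modal class."""
    i = max(range(len(freqs)), key=lambda k: freqs[k])  # first class with the highest frequency
    f = freqs[i]
    f1 = freqs[i - 1] if i > 0 else 0               # preceding class (assumed 0 if none)
    f2 = freqs[i + 1] if i + 1 < len(freqs) else 0  # succeeding class (assumed 0 if none)
    lower, upper = intervals[i]
    denominator = 2 * f - f1 - f2
    if denominator == 0:
        raise ValueError("mode formula undefined: 2f - f1 - f2 is zero")
    return lower + (f - f1) / denominator * (upper - lower)


intervals = [(0, 20), (20, 40), (40, 60), (60, 80), (80, 100)]
frequencies = [15, 18, 21, 29, 17]

# Median class is 40-60 (cf = 33, f = 21); modal class is 60-80 (f1 = 21, f2 = 17).
print(round(grouped_median(intervals, frequencies), 2))  # 40 + (17 / 21) * 20, about 56.19
print(round(grouped_mode(intervals, frequencies), 2))    # 60 + (8 / 20) * 20 = 68.0
```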
Q.5. Which method is known as the short-cut method for finding mean?
Ans: The assumed mean method is known as the short-cut method for finding the mean. In the assumed mean method, we assume a certain number within the data as the mean. We then calculate the deviation of each class from the assumed mean, take the weighted average of these deviations with the frequencies as weights, and finally add this average to the assumed mean.
Thus, the mean is calculated by the formula: \(\overline x = A + \frac{1}{N}\sum\limits_{i = 1}^n {{f_i}{d_i}} .\)