Grouped data

From Wikipedia, the free encyclopedia

Grouped data is a term used in statistics and data analysis. Raw data can be organized by sorting similar measurements into groups (classes) and counting how many measurements fall in each group. The resulting frequency table is called grouped data[1].

Example

Suppose a group of 20 students was given a simple math question, and the time each student took to answer it was recorded. The times are shown below:

Table 1: Time taken (in seconds) to answer a simple math question

20 25 24 33 13
26 8 19 31 11
16 21 17 11 34
14 15 21 18 17

The shortest time was 8 seconds, and the longest was 34 seconds. One way to analyze the times is to group nearby values together. To keep the comparison fair, every group covers the same number of seconds. We can then count how many students fall in each group. For example, if the times are organized into 5-second ranges:

Table 2: Frequency distribution of the time taken (in seconds) to answer a simple math question

Time taken          Frequency
5 to 9 seconds      1 student
10 to 14 seconds    4 students
15 to 19 seconds    6 students
20 to 24 seconds    4 students
25 to 29 seconds    2 students
30 to 34 seconds    3 students
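
The counting step behind Table 2 can be sketched in Python. This is only an illustration: the names `times`, `WIDTH`, `class_start` and `frequency_table` are made up for this example, and the sketch assumes the times are whole numbers of seconds so that integer division finds the class.

```python
from collections import Counter

# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]

WIDTH = 5  # class width in seconds (the 5-second ranges used in Table 2)


def class_start(t, width=WIDTH):
    """Return the lower limit of the class that contains time t."""
    return (t // width) * width


# Count how many times fall in each class, keyed by the class's lower limit.
counts = Counter(class_start(t) for t in times)

# Build rows of (lower limit, upper limit, frequency), from the class that
# holds the smallest time to the class that holds the largest time.
first = class_start(min(times))
last = class_start(max(times))
frequency_table = [
    (start, start + WIDTH - 1, counts[start])
    for start in range(first, last + WIDTH, WIDTH)
]

for low, high, freq in frequency_table:
    print(f"{low} to {high} seconds: {freq}")
```

Changing `WIDTH` gives a different grouping of the same raw data, which is why the choice of class width affects how the table looks.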


Another way to group the data is to sort the students into categories based on how quickly they answered. Suppose there are three types of students:

  • Smart (5 to 14 seconds)
  • Normal (15 to 24 seconds)
  • Below average (25 or more seconds)

then the grouped data looks like the following:

Table 3: Frequency distribution of the three types of students

Type of student    Frequency
Smart              5
Normal             10
Below average      5
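
Grouping by category instead of by equal-width ranges can be sketched the same way. The function name `category` and the variable `category_counts` are invented for this illustration. As a simplifying assumption, any time under 15 seconds is counted as "Smart", including times below 5 seconds (none occur in Table 1).

```python
from collections import Counter

# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]


def category(t):
    """Map a time in seconds to one of the three types of students."""
    if t < 15:       # 5 to 14 seconds (assumption: shorter times land here too)
        return "Smart"
    if t < 25:       # 15 to 24 seconds
        return "Normal"
    return "Below average"  # 25 or more seconds


# Count the students in each category.
category_counts = Counter(category(t) for t in times)

for name in ("Smart", "Normal", "Below average"):
    print(name, category_counts[name])
```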

Mean of grouped data

An estimate, \bar{x}, of the mean can be calculated from grouped data.

\bar{x}=\frac{\sum{f*\,x}}{\sum{f}},
where x is the midpoint of a class interval and f is the frequency of that class.

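Note that this estimated mean may differ from the sample mean of the ungrouped data, because grouping replaces every value in a class with the midpoint of that class. The mean of the grouped data in the above example can be calculated as follows, where each midpoint is taken as the lower limit of the class plus 2.5 seconds (half of the 5-second class width):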

Class interval      Frequency (f)   Midpoint (x)   f*x
5 to 9 seconds      1               7.5            7.5
10 to 14 seconds    4               12.5           50
15 to 19 seconds    6               17.5           105
20 to 24 seconds    4               22.5           90
25 to 29 seconds    2               27.5           55
30 to 34 seconds    3               32.5           97.5
TOTAL               20                             405


Therefore, the estimated mean of the grouped data, in seconds, is

\bar{x}=\frac{\sum{f*\,x}}{\sum{f}} = \frac{405}{20} = 20.25
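
The calculation can be checked with a short Python sketch. The variable names are chosen only for this illustration, and the midpoints follow the table above (lower limit plus half the class width). The sketch also computes the mean of the raw times from Table 1 to show how the two values can differ.

```python
# Times from Table 1, in seconds.
times = [20, 25, 24, 33, 13,
         26, 8, 19, 31, 11,
         16, 21, 17, 11, 34,
         14, 15, 21, 18, 17]

WIDTH = 5  # class width in seconds

# (lower limit of class, frequency f), taken from Table 2.
classes = [(5, 1), (10, 4), (15, 6), (20, 4), (25, 2), (30, 3)]

# Midpoint x of each class, as in the table above: lower limit + WIDTH / 2.
midpoints = [low + WIDTH / 2 for low, _ in classes]
frequencies = [f for _, f in classes]

sum_fx = sum(f * x for f, x in zip(frequencies, midpoints))  # sum of f*x
sum_f = sum(frequencies)                                     # sum of f

grouped_mean = sum_fx / sum_f        # estimate from the grouped data
raw_mean = sum(times) / len(times)   # mean of the ungrouped data

print("sum of f*x:", sum_fx)
print("sum of f:", sum_f)
print("grouped mean:", grouped_mean)
print("raw mean:", raw_mean)
```

For this data set the grouped estimate is 20.25 seconds, while the mean of the raw times is 19.7 seconds, so the two values are close but not equal.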

Notes

  1. Newbold et al., 2009, pages 14 to 17

References

  • Newbold, P., W. Carlson and B. Thorne (2009) Statistics for Business and Economics, Seventh edition, Pearson Education. ISBN 9780135072486.
