## Grouping Data Into Classes

When you start dealing with large amounts of data, it gets a bit overwhelming to try and deal with every individual data point.  What you can do is make up a few classes and classify the data points into each class.  For instance, if you were dealing with the weights of all 700 students at your school, you might make up 40 – 49 kg, 50 – 59 kg, 60 – 69 kg and 70 – 79 kg classes.  Then you could work out which class each student belonged to.  By the end of it, each class would have a certain number of students in it – with 4 classes, you’d have 4 numbers, instead of the original 700 individual weights.  Much simpler!

Of course, most of the problems you’d get in an exam wouldn’t have 700 pieces of information; they’re more likely to have 20 or 30.  Like in this question:

Bernie is a teacher.  He’s just finished marking the maths exam for his class, which was scored out of 20.  The marks were:

14, 7, 17, 15, 19, 13, 8, 9, 20, 3, 15, 17, 12, 6, 19, 15, 11, 10, 19, 14

He wanted to group the data so it would be easier to look at.  He decided to group it into a 1 – 5 category, 6 – 10 category and so on.  Draw up a table showing each class and the number of marks in it.

Solution

So first up, before we do any counting, we need a table.  The table should have two parts – one column or row showing each of the classes, and next to this, a row or column where we can write a number showing how many marks fit into that class.  We describe the number of times that something fits into a class as the frequency:

| Mark Class | 1 – 5 | 6 – 10 | 11 – 15 | 16 – 20 |
|---|---|---|---|---|
| Frequency | | | | |

OR

| Mark Class | Frequency |
|---|---|
| 1 – 5 | |
| 6 – 10 | |
| 11 – 15 | |
| 16 – 20 | |

Once we’ve got the table, we just need to fill in the empty boxes.  We do this by sorting through the list and counting the number of marks that fit into each class.  Once you’ve done this, you should get a final table like this:

| Mark Class | Frequency |
|---|---|
| 1 – 5 | 1 |
| 6 – 10 | 5 |
| 11 – 15 | 8 |
| 16 – 20 | 6 |
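If you'd rather let a computer do the counting, here's a minimal Python sketch of the same tally. The names `marks`, `classes` and `count_frequencies` are made up for this illustration; the marks and the classes come straight from the question.

```python
# Bernie's 20 exam marks, copied from the question.
marks = [14, 7, 17, 15, 19, 13, 8, 9, 20, 3,
         15, 17, 12, 6, 19, 15, 11, 10, 19, 14]

# Each class is a (lowest mark, highest mark) pair, with both ends included.
classes = [(1, 5), (6, 10), (11, 15), (16, 20)]


def count_frequencies(values, class_list):
    """Return a dict mapping each class to the number of values that fall in it."""
    frequencies = {c: 0 for c in class_list}
    for value in values:
        for low, high in class_list:
            if low <= value <= high:
                frequencies[(low, high)] += 1
                break  # a value can only belong to one class
    return frequencies


frequency_table = count_frequencies(marks, classes)
for (low, high), frequency in frequency_table.items():
    print(f"{low}-{high}: {frequency}")
# prints:
# 1-5: 1
# 6-10: 5
# 11-15: 8
# 16-20: 6
```

Note that a value which doesn't fit any class is silently skipped here, so it's still worth checking the total afterwards.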

Be very careful when you’re counting the number of scores in each class.  Just now I had to do the count several times to get it right!

One way to quickly check whether your frequency column is reasonable is to add up the total of the column:

1 + 5 + 8 + 6 = 20

If the total is the same as the total number of items (exam scores in this case), then your answer is probably OK.  We’ve got 20 different scores, and our total frequency added up to 20, so this is a good indication we’ve counted correctly.
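The same check can be sketched in a couple of lines of Python. The variable names are just for illustration, and the frequencies are typed in from the finished table. Keep in mind that matching totals only tell you the count is plausible: two mistakes that cancel each other out would still pass.

```python
# Bernie's marks from the question and the frequencies from the finished table.
marks = [14, 7, 17, 15, 19, 13, 8, 9, 20, 3,
         15, 17, 12, 6, 19, 15, 11, 10, 19, 14]
frequencies = {"1-5": 1, "6-10": 5, "11-15": 8, "16-20": 6}

total_frequency = sum(frequencies.values())   # add up the frequency column
number_of_marks = len(marks)                  # how many marks we started with

totals_match = total_frequency == number_of_marks
print(total_frequency, number_of_marks, totals_match)  # prints: 20 20 True
```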

### Choosing classes

Some questions will ask you to select some suitable classes to group the data into.  There are some things you should think about when you’re selecting the classes:

- You need to be able to put every single piece of data into a class – the classes must cover the entire range of data values.

- There should be a reasonable number of classes.  What’s reasonable?  For most high school questions, you’re looking at somewhere between say 4 and 10 classes.  Sometimes the question will say how many classes you need, but will leave you to work out exactly what each class will cover.

- There should be no gaps between your classes.  For instance, if one class was 10 – 14, and the next class was 16 – 20, where would you put ‘15’?

- Try to pick classes which all have the same ‘width’.  For instance, 1 – 5, and 6 – 10 both have a ‘width’ of 5 – each class can contain 5 different possible values.  1 – 5 can contain 1, 2, 3, 4, and 5. 6 – 10 can contain 6, 7, 8, 9, and 10.

- Usually in maths questions there’s an obvious way you can set up the classes.  If you had data values varying between say 4 and 49, an easy way to set up your classes would be by ‘10’s: 1 – 10, 11 – 20, 21 – 30, 31 – 40, 41 – 50.  If you had values varying between 35 and 245, you might choose your classes by ‘50’s: 1 – 50, 51 – 100, 101 – 150, 151 – 200, 201 – 250.  Both times you get 5 classes – which is a nice number to work with, and the classes each cover a nice ‘regular’ range.
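Three of the points in this list (covering the whole range, no gaps, equal widths) can be checked mechanically. Here's a rough Python sketch; `check_classes` is a made-up helper, and it assumes whole-number data with each class written as a (lowest, highest) pair. It doesn't judge whether the number of classes is reasonable, and it doesn't look for overlapping classes.

```python
def check_classes(class_list, lowest, highest):
    """Return a list of problems found; an empty list means all checks passed.

    `lowest` and `highest` are the smallest and largest data values
    the classes need to handle.
    """
    problems = []
    ordered = sorted(class_list)

    # Every data value must fit somewhere: compare the ends of the range.
    if ordered[0][0] > lowest or ordered[-1][1] < highest:
        problems.append("classes do not cover the whole range of data")

    # Each class should start right after the previous one ends.
    for (low_a, high_a), (low_b, high_b) in zip(ordered, ordered[1:]):
        if low_b > high_a + 1:
            problems.append(f"gap between {low_a}-{high_a} and {low_b}-{high_b}")

    # Width = number of different whole values a class can contain.
    widths = {high - low + 1 for low, high in ordered}
    if len(widths) > 1:
        problems.append("classes do not all have the same width")

    return problems


tens = [(1, 10), (11, 20), (21, 30), (31, 40), (41, 50)]
print(check_classes(tens, 4, 49))                   # prints: []
print(check_classes([(10, 14), (16, 20)], 10, 20))  # reports the gap where 15 should go
```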

In the maths exam question, the test was marked out of 20, so any mark from 0 to 20 was possible.  The classes chosen for the question were 1 – 5, 6 – 10, 11 – 15 and 16 – 20.  This means that a mark of ‘0’ would not fit into any class!  Luckily, the lowest mark was 3, so every mark fit into a class.

You could change the classes to be 0 – 4, 5 – 9, 10 – 14 and 15 – 19.  That way you would be able to handle a mark of ‘0’.  But you would not be able to handle a mark of 20 now!

How many different possible marks are there?  Well, if you can get anywhere from 0 to 20, that means there are 21 different possible marks.  We could choose a class width of 3, then our classes would nicely span the entire range of possible marks:

0 – 2, 3 – 5, 6 – 8, 9 – 11, 12 – 14, 15 – 17, 18 – 20
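The reason a width of 3 works so neatly is that 21 divides evenly by 3, giving exactly 7 classes. As a small sketch (the helper `make_classes` is invented for this example and assumes whole-number marks), you could build the classes like this:

```python
def make_classes(lowest, highest, width):
    """Build equal-width classes, starting at `lowest`, until `highest` is covered."""
    class_list = []
    low = lowest
    while low <= highest:
        class_list.append((low, low + width - 1))
        low += width
    return class_list


possible_marks = 20 - 0 + 1      # marks 0 to 20 inclusive, so 21 possible marks
print(possible_marks % 3)        # prints: 0  (no remainder, so width 3 fits exactly)

width_three = make_classes(0, 20, 3)
print(width_three)
# prints: [(0, 2), (3, 5), (6, 8), (9, 11), (12, 14), (15, 17), (18, 20)]
```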

You won’t always be able to find a class width that nicely fits all the data.  Sometimes you’ll have to just add one more class.  For this example, we could add a 20 – 24 class, so that all up we’d have:

0 – 4, 5 – 9, 10 – 14, 15 – 19 and 20 – 24
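To see why the extra class is needed, you can round up when dividing the number of possible marks by the class width. This is only an illustrative sketch with made-up variable names, assuming whole-number marks from 0 to 20 and a width of 5:

```python
import math

lowest, highest, width = 0, 20, 5

possible_marks = highest - lowest + 1                 # 21 possible marks
classes_needed = math.ceil(possible_marks / width)    # 21 / 5 = 4.2, rounded up to 5

class_list = [(lowest + i * width, lowest + (i + 1) * width - 1)
              for i in range(classes_needed)]
print(class_list)
# prints: [(0, 4), (5, 9), (10, 14), (15, 19), (20, 24)]

# Which marks in the last class can actually be scored?
last_low, last_high = class_list[-1]
reachable = [mark for mark in range(last_low, last_high + 1) if mark <= highest]
print(reachable)  # prints: [20]
```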

You’d just have to remember that the last class mostly represents marks that are impossible to get – the only mark that would get you into the last class is ‘20’.  For a question like this, unless you had marks of both 0 and 20, you’d be best off just adopting whichever of these two arrangements covers every mark in your data:

1 – 5, 6 – 10, 11 – 15, 16 – 20

or

0 – 4, 5 – 9, 10 – 14, 15 – 19