Topic

Stories in numbers: Averages and percentage change

Topic Progress:

Click on the numbered tabs to navigate this lesson.

Stories in numbers: Averages and percentage change

This lesson has been adapted from materials prepared for the Sudan Evidence Base Programme, created by Eva Constantaras with support from the World Bank and the UK Department for International Development (DFID).

Spreadsheets are the single most useful tool you have for working with data. Not only are they vital for reading and analysing information on your desktop, they can be used to produce simple charts and graphics as either images or embedded content in a story with interactive elements.

For the purposes of this lesson, we’re going to be using the Google Sheets application. It’s free to use so long as you have a Google (ie Gmail) account, and runs inside a web browser. Any changes you make are automatically saved on Google’s servers and can be accessed from any networked PC or mobile device.

Other spreadsheet software with similar features that can be used offline include Microsoft Excel (paid for) and Libre Office Calc (free).

This lesson follows on from Stories in numbers: Sums and rates, Calculating percentages and Cleaning, filtering and sorting data. It is highly recommended that you read through those lessons first.

In this lesson, you will learn:

  • How to calculate averages in spreadsheets
  • How to add a new worksheet
  • How to copy formulae in spreadsheets
  • How to calculate percentage change

Calculating averages

In this task, we will find the average (mean) rate of Under five mortality for Sudan and all its neighbours for each year from 2003 – 2013. This will help us understand the trend in under five mortality for the entire region over this decade.

Sheets provides you a simple formula to calculate the numerical average or ‘mean=AVERAGE(First cell # : Last cell#).

In the last lesson, you filtered data for the Mortality rate, under-5 (per 1,000) indicator in the CMR Data Sheets workbook. Do the same again now.

Now select all the filtered data and copy it. You can do this by using the keyboard commands CTRL+A to select all, and CTRL+C to copy.

Add a new worksheet

A single spreadsheet can contain multiple “worksheets”, which is useful for keeping all data about a subject in one place. It’s also useful because cell forumlae can link to cells on different sheets.

To create a new worksheet in our CMR Data file, click on the plus icon in the bottom left of the screen. This will create a new sheet, Sheet 1.

 

Now place your cursor in box A1 and right-click. Choose Paste special>Values only from the menu that appears.

Now we have a sheet which only contains the data for Mortality rate, under-5 (per 1,000).

Click on the arrow next to the Sheet1 tab, and rename this Sheet Under 5.

Getting the average

Now select cell A15 and enter the word Mean. In Cell E15, enter the formula =AVERAGE(E2:E14). Now you have the average under-five mortality rate for Sudan and its neighbours in 2006.

Do you remember how to copy this formula to the rest of the cells in row 15?

Copying formulae

Don’t worry if you don’t remember how to copy the formula. Just select cell A15, and hover your mouse cursor over the bottom right-hand corner until it turns into a black cross. Now drag this cross along the rest of the row to the end of the table (row O).

Why is row O showing an error?Hint – an average is the sum of the numbers in a group, divided by the number of numbers in a group. So the average of 1,2,3,4,5,6,7 is (1+2+3+4+5+6+7)÷7=4.

Your table should look like this, showing the average mortality rate of children under five for Sudan and its neighbours for the last decade.

Calculating percentage change

The rate of change of a statistic is one of the important numbers a journalist has to understand. Crime stats are up, but by how much? What does it mean if there were 13% fewer deaths on the road this year.

The rate of change is often expressed as a percentage: there were 23% more strike, tax returns are down 4%, and so on.

In our story about child mortality, we want to know how each country has improved on reducing under five mortality between 2006 and 2015. In this case, you will look at the percentage change in the under five mortality between these two years.

In general to calculate the percentage change between two numbers – you need to first work out the change between the two numbers you are comparing.

Then, divide the changed amount by the original number and multiply the answer by 100.

If your answer is a negative number then this is a percentage decrease. In our scenario, a negative percentage change would indicate that the under five mortality rate has decreased over the decade.

Percentage change in Sheets

Start by deleting all the data in columns O and P. We don’t have any information for 2016, so we could have taken this out in the data cleaning stage.

  • In cell O1 of the “Under 5” sheet that you created, type Percentage Change 2006 to 2015.
  • In cell O2 enter =(N2-E2)/E2, and then press Enter.
  • Now copy that formula down all the rows in column O (use the black cross method).
  • Click on the letter at the top of column O to select all cells in that column.
  • Finally, go to the Format menu and select Number>Percentage.

Percentage change

Does your sheet look like this?

If you followed those instructions carefully, your sheet should look like this. Well done!

Now ask yourself what are the stories in this data that need further investigation?

If the first two stories that you can see are “How did Rwanda do so well?” or “Why is Chad not reducing child mortality at the rate of its neighbours?” – congratulations. You’re starting to think like a data journalist.

The important stories are almost always in the extremes of the data, any outliers which don’t follow the same pattern as other observations in a group are a good starting point for further investigation. But remember, the data doesn’t tell the whole story, most of the time it just tells you where to start looking.

In statistics, an outlier is an observation point that is distant from other observations. It means that there could be something unusual or different about that data point worthy of further investigation.