Monday to Thursday, 10am-4pm, EST
1.855.281.5499 (toll free)

Data analysis is for everyone! A short intro to pivot tables

(Cliquez ici pour la version française)

There are a lot of fancy tools out there for data analysis - some of which are extremely complex, and some of which are actually fairly simple to use and set up. If you are looking to take the next baby step along the road to using your data to make better decisions, a good place to start might be to take a second look at the good old spreadsheet. Spreadsheet programs have a lot of data analysis tools built right in, and today I want to discuss a lesser-known (in my experience) but extremely powerful feature: pivot tables! 

Pivot tables are available in most spreadsheet programs you might use (they even were recently added to Google Spreadsheets), but for this post I'll focus on Excel 2010 since that's what I'm using most at the moment. Where pivot tables really shine is when you have a big, straightforward table full of data and you want to make some links between different types of data (beyond what you can do with filters and sorting). In particular, pivot tables can be a great way to explore your data if you're not quite sure what you're looking for yet.

The Context: tracking donation requests

To show you what I'm talking about, I'll give you an example that is directly relevant to our work here at TechSoup Canada: tracking requests of products in our donations program. I can do an export from my database to get a nice big spreadsheet where each row represents one product that was ordered.

Obviously I've stripped out a lot of data (such as the organization's name and info, and the product they ordered) but I have left in some of the other information such as the type of organization who ordered the product (i.e. what is their primary mission), the province where the organization is from, the number of licenses of the product they ordered (for example, one order might include 5 licenses of Microsoft Office) and the date they ordered.

Spreadsheet of order data

So if I want to find out the number of orders we had in a given month, that's pretty simple. I would just use Excel's auto-filter to filter the month and year and then count up all the rows.

Question 1: from which province do we get the most orders?

But what if I want to do something more complex? Say I want to find out which province we get the most orders from. I could do some sorting and counting, but it would involve a lot of manual work. This is where pivot tables come to the rescue! I select the data I want (which is my entire table of data, all rows and columns) and go to Insert -> Pivot Table.

Now if you've never used pivot tables before, the screen that you get can be pretty confusing:

New pivot table

Don't be intimidated! What you need to do is to try to visualize the report that you want, and then drag your fields from the list on the right into their spot in the table or the boxes on the right.

If you're struggling to figure out what goes where, just try it out and see if it makes sense or not. In this case, let me rephrase my question in terms of my data: I want to know which province had the highest sum of licenses ordered overall.

I'll start by dragging "Province" to the "Row Labels", so I get a list of all the provinces, and then I will drag "Quantity" to "Values". By default it is doing a count of the quantities, so I right-click go into Field Value Settings and choose sum. Now I have a list of all the provinces and the total number of licenses that have been ordered from each one.

This is great, but it's hard to see which one is most popular. So what I'm going to do is to click on the little arrow by "Province" (the header for the table), pick "More Sort Options". After a bit of playing around, I found that I can pick "Descending (Z to A)" and choose "Sum of QTY" and it sorts the provinces in decreasing order by number of license ordered. In other words, the province that ordered the most licenses is at the top:

Number of licenses ordered by province

Not surprisingly, we get the most orders from Ontario. This makes sense because there are quite a lot of charities, nonprofits and libraries in Ontario. However, it also means that we need to be doing more outreach to other provinces to spread the word about the donations program. So we can use this data to inform our outreach efforts and ensure that our time is used most effectively, where it is most needed.

Question 2: in which month do we get the most orders?

So since this is so much fun, let's do one more example. This time I want to see a trend in the number of orders by month, to see which month we get the most orders. Again, I'm going to select all my data in the main spreadsheet, but this time I'll make use of Excel's newer "Pivot Chart" feature and pick Insert -> Pivot Chart.

As before, I'm going to build the table to show my data in the way I want it, but since I picked the Pivot Chart option, Excel is also going to make me a chart of my data which will make it easier to visualize the trend.

New pivot chart

Since I want to see time along the horizontal axis of my chart, I'll drag "Year" and "Month" to the "Axis Field" section. The data that I want to see is the number of orders, so I'll drag "Quantity" to "Values". I notice that it's automatically set to count the values instead of summing them, which is what I want in this case (I want to know about number of orders, not individual licenses). So now I have a table with my values, and a chart as well:

Bar chart showing orders by month

Now this looks good, but a bar chart - which is the default - isn't really the best option for showing a trend over time. Instead, I'm going to change the chart type to be a line chart. As well, I'll do a bit of clean up so the chart is easier to read.

Line chart of orders by month

Now, I can clearly see some trends in my data. For example, I can see that March and June have spikes in all 3 years shown here. If I link this to my general knowledge about the sector and the donations program, I know that March is high because many nonprofits' fiscal year ends on March 31, and that June is high because many of our donor programs reset their allotment limits at the end of June. Now that I'm informed by this data, I can start making decisions such as dedicating additional staff time to customer service during these months.

Of course, all this is only just scratching the surface, there are many other ways I could slice and dice this data. As I mentioned at the beginning of this post, being able to easily play around with your data will help you uncover trends and find out which views of your data are most valuable. At the end of the day, this will help you to be more data-informed and to better tell your organization's story through data.

Have you used pivot tables? What ways/tools for analyzing your data have you found helpful?