Demographic Data Discovery - Learning Qlik Sense The Official Guide (2015)


Chapter 12. Demographic Data Discovery

In this final chapter, we shall finish our exploration of real data with Qlik Sense by moving beyond the standard structures of the office and showing the full possibilities of the software for analysis of almost any kind of data imaginable. We'll therefore be looking at applying Qlik Sense to demographic data. As before, this example and many others are available for you to explore at http://sense-demo.qlik.com.

This chapter will cover the aspects necessary for demographic data discovery, including:

· General information about common KPIs

· Examples showing how to use the lasso selection in maps and scatter charts

· Examples of dimensions and measures

The problem analysis

With Qlik Sense, it is possible to analyze not only business data but any data. One great example is demographic data: statistics of countries and regions on anything from age and gender to income and life expectancy.

Such data can be found on a number of Internet sites and downloaded for your convenience, for example, from the following websites:

· United Nations (https://data.un.org)

· Federal government of the United States (www.data.gov)

· European Union (http://ec.europa.eu/eurostat)

Demographic data is used and analyzed as is by a number of nongovernmental organizations that need it for their activities. The common measures required are GDP per capita, population, unemployment rate, inflation, life expectancy, happiness, trade balance, labor cost, national debt, election results, and so on.

Often, interesting questions about correlations are asked; for example, how does happiness correlate with material standards and health? How is the population growth and number of children affected by factors such as life expectancy, poverty, and average salary? How has life expectancy improved over the years? If you haven't seen Hans Rosling's presentations on the Internet on this topic, we strongly recommend them. They show that data analysis is both important and fun.

Common dimensions in demographic data are country, region, gender, age group, ethnicity, and so on. An example can be seen in the following graph, where you can see life expectancy and per capita GDP for different countries. Many developing countries are found in the lower-left quadrant, whereas the richer countries usually are found in the upper-right quadrant.


Life expectancy versus per capita GDP

You can clearly see that the two numbers are highly correlated. The higher the GDP, the higher the life expectancy.

These measures can often be linked to your business data as well, to enable a deeper understanding of your data. For instance, you can divide your country sales by the population of the country, thereby getting a relative sales number, which tells you how well you sell in that country. Or if you assume that the market space in the country is roughly proportional to the GDP, you can divide your sales by the GDP and use this number to compare market penetration between countries.

These numbers will answer questions such as, "How well are we selling in this country, given the potential?"
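The two ratios described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not Qlik Sense syntax; the country names, the field names, and all figures are hypothetical and do not come from the Happiness app.

```python
# Hypothetical figures for illustration only; none come from the Happiness app.
countries = [
    {"country": "A", "sales": 1_200_000.0, "population": 10_000_000, "gdp": 4.0e11},
    {"country": "B", "sales": 3_000_000.0, "population": 60_000_000, "gdp": 2.4e12},
]


def relative_sales(rows):
    """Return, per country, sales per capita and sales per unit of GDP."""
    result = {}
    for row in rows:
        result[row["country"]] = {
            "sales_per_capita": row["sales"] / row["population"],
            "sales_per_gdp": row["sales"] / row["gdp"],
        }
    return result


ratios = relative_sales(countries)
for name, values in ratios.items():
    print(name, values)
```

In this made-up example, country B has the higher absolute sales, but country A sells more per inhabitant and per unit of GDP, which is exactly the kind of insight the relative numbers are meant to give.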

Application features

On our demo site, we have an app with a number of demographic measures per country. You can find it at http://sense-demo.qlik.com under the name Happiness. It analyzes, among other demographic indexes, the Happy Planet Index (HPI) in a number of countries. You can learn more about this index at www.happyplanetindex.org.

This index measures the sustainable well-being of 151 countries across the globe, focusing not on their abilities to produce material goods and services, but rather on their abilities to produce long, happy, and sustainable lives for the people who live in them. A happy life doesn't have to come at the expense of our environment, and the HPI is used to promote a policy that puts the well-being of people and the planet first.


The app overview of the Happiness application

Below this overview, you will see a number of sheets. The leftmost sheet is an introduction, whereas the other sheets are prepared for analysis and detailed information.

If you click on the Stories button to the left, you will see that the app also contains one story—a story that can be used to present data in the app. It can also be used as an introduction to the app the first time you open it.


The sheets on the app overview page

The first sheet with charts is called Happy Planet Index (HPI). On it, you will see the happiness index for all countries, first on a map, and then in a table.

The countries in the map are colored according to the happiness index. The darker the color, the higher the index.


Map showing the happiness index per country

Below the map, there are three scatter charts showing the happiness index per country, plotted against the life expectancy, GDP per capita, and total population. These three charts are excellent tools to analyze any correlation between happiness and the mentioned demographic measures.


Scatter charts that show the correlation (or lack of correlation) between happiness and other demographic measures

Finally, at the bottom, you have three filter panes, allowing the user to choose a single region, subregion, or country and zoom in on the numbers for that specific area.

The other sheets contain additional and more detailed information, ordered by topics. The final sheet contains a table showing the details, should the user be interested in drilling down to the lowest level.

Analysis

When looking at data in this app, the first question that pops up in the user's mind is usually, "Is there any correlation between happiness and x?" To get a qualitative answer to this, you only need to browse through the scatter charts.

On the Happy Planet Index (HPI) sheet, you have three scatter charts. In the leftmost chart, HPI vs Life Expectancy, you can see a correlation between the two measures, at least for lower life expectancies. In the other two charts, however, there is no clear correlation.

On the HPI Comparison sheet, you have three additional scatter charts. In the leftmost chart, HPI vs Happy Life Years, you can see a weak correlation between the two measures. The same is true for the rightmost chart, HPI vs Global Footprint, but in the chart in the middle (HPI vs Governance), there is no clear correlation.

However, as in all of statistics, you have to be careful with your conclusions. Correlation does not imply causation. You have to look at many factors and use common sense to find the true cause and effect. In this case, the explanation is simply that the happiness index is an artificial index calculated from, among other things, life expectancy and ecological footprint; hence the correlation.
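The scatter charts give a qualitative impression of correlation. If you want a number to go with it, the Pearson correlation coefficient is one common choice. The following Python sketch shows how it is calculated; it is our own illustration and not part of the app, and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Return the Pearson correlation coefficient of two equally long sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length with at least 2 values")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined when one sequence is constant")
    return covariance / math.sqrt(spread_x * spread_y)


# Hypothetical sample values, one entry per country.
life_expectancy = [50.0, 60.0, 70.0, 80.0]
happiness_index = [30.0, 42.0, 48.0, 61.0]

r = pearson(life_expectancy, happiness_index)
print(f"r = {r:.3f}")
```

A value close to 1 or -1 indicates a strong linear relationship, and a value close to 0 indicates none. As noted above, even a strong coefficient says nothing about cause and effect.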

Using the lasso selector to make selections

Now let's explore the data. One question could be, "Where in the world do we find the countries with a low average life expectancy?" To answer this, you need to make a selection in the scatter chart showing life expectancy:

1. First, maximize the scatter chart by clicking on the Full screen arrow in the upper-right corner of the object.

2. Then click on the chart so that the chart controls, including the lasso symbol, appear in the upper-right corner. Next, click on the Turn on lasso selection option. Now you can draw a line around the points you want to select. Finally, confirm your selection by clicking on the green tick mark in the upper-right corner.


Lasso selection in the scatter chart

If you now look at the map, you will see where these countries appear in the world. It's predominantly Africa and South Asia. If you click on the map, you can zoom in using the scroll wheel of the mouse. You can also pan the map.


Countries with low life expectancy

Of course, you can also make a selection the other way around. Use the lasso selector in the map and see how the selected countries are distributed in the scatter chart. The way to do this is as follows:

1. Zoom in on the map.

2. Click on the object.

3. Click on the Turn on lasso selection option and encircle the part of the world you want to explore.

4. Finally, confirm your selection.


Making a lasso selection of America on the map

Using the global selector to make selections

You can also use the global selector to make selections. If you click on the global selector (to the right in the toolbar with the Selections tool as a popup), you can make selections directly in the fields.

For instance, you may have a question like this: "Where in the world do I find the richest countries?" In such a case, perform the following steps:

1. Open the global selector and find a field called GDP/capita ($PPP). To do this, you first need to check Show fields in the global selector.

2. Once you have found this field, you can investigate it just by scrolling. You will then see that there are some countries with less than $400 in GDP per capita, while the richest countries have more than $80,000 in GDP per capita.

If you want to find the countries where the GDP per capita is greater than $10,000, perform the following steps:

1. Click on the listbox and type >10000.

2. Confirm the search by pressing Enter, and confirm the selection by clicking on the green tick mark.


Selecting the countries in the world with the highest GDP

If you now close the global selector and go back to the map and the scatter charts, you will be able to see where you find the richest countries, both on the map and in the scatter charts.
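Conceptually, a numeric search such as >10000 compares each field value with the threshold and keeps the values that match. The Python sketch below mimics that logic. It is not how Qlik Sense is implemented; the function name, the set of operators handled, and the sample data are assumptions made for illustration.

```python
import operator

# Comparison prefixes handled by this sketch. Two-character symbols come first
# so that ">=" is not mistaken for ">".
COMPARISONS = [
    (">=", operator.ge),
    ("<=", operator.le),
    (">", operator.gt),
    ("<", operator.lt),
]


def numeric_search(field_values, query):
    """Return the keys whose numeric value satisfies a query such as '>10000'."""
    query = query.strip()
    for symbol, compare in COMPARISONS:
        if query.startswith(symbol):
            threshold = float(query[len(symbol):])
            return [key for key, value in field_values.items()
                    if compare(value, threshold)]
    raise ValueError(f"Unsupported search string: {query!r}")


# Hypothetical GDP per capita values; not taken from the app.
gdp_per_capita = {
    "Country A": 380.0,
    "Country B": 9500.0,
    "Country C": 12000.0,
    "Country D": 81000.0,
}

richest = numeric_search(gdp_per_capita, ">10000")
print(richest)
```

The result of the search plays the same role as the confirmed selection in the app: every other chart would then show only the matching countries.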

How the application was developed

The data model of the Happiness application is shown in the following screenshot:


This is an extremely simple data model that contains only one table of real data, Happy Planet Index, and an additional table listing all countries, World.shhp/Features. The second table has one record per country and holds the map information (the shape of each country) used by the map object in the user interface.

In this app, the data table has exactly one record per country—a record that contains the relevant information for a given country at a given moment. However, this is not always the situation. More often, the data table contains data for many countries over many points in time, for example, one record per combination of a country and a year. This results in several lines per country.

Dimensions

There are not many fields that can be used as dimensions. The three available fields are region, subregion, and country. The world is split into 7 regions and 19 subregions. A country can only belong to one subregion and one region. These fields have been added to Library. In addition, a drill-down dimension has been created from the three fields.

One way of adding dimensions could be by creating "buckets" based on one of the measures, for example, population. Countries could then be grouped under Large, Medium, and Small classes, which would be stored in a new field, Population Class.


The dimensions in Library
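The idea of creating "buckets" from a measure, mentioned earlier in this section, can be sketched as follows. The example is written in Python rather than as a Qlik Sense expression, and the two thresholds are hypothetical; the app does not define a Population Class field, so you would choose limits that suit your own data.

```python
def population_class(population, medium_from=10_000_000, large_from=100_000_000):
    """Classify a population count as 'Small', 'Medium', or 'Large'.

    The default thresholds are illustrative assumptions, not values from the app.
    """
    if population < 0:
        raise ValueError("population cannot be negative")
    if population >= large_from:
        return "Large"
    if population >= medium_from:
        return "Medium"
    return "Small"


# Hypothetical population figures.
populations = {"Country A": 350_000, "Country B": 45_000_000, "Country C": 1_200_000_000}
classes = {name: population_class(count) for name, count in populations.items()}
print(classes)
```

The resulting class could then be used as a dimension in the same way as region or subregion.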

Measures

A number of measures have also been defined, for example, GDP, happiness index, global footprint, life expectancy, and so on.

It is important that the app developer defines the expressions correctly, since this is something that could be difficult for the business user. The business user doesn't always have knowledge about the data model, which is something you need in order to get all the expressions right.

In the following table, you can find some of the measures defined in this app:

Measure | Definition
GDP per Capita | Avg([GDP/capita])
Global Footprint | Avg([Footprint])
Governance Rank | Only([Governance Rank])
Happy Life Years | Only([Happy Life Years])
Happy Planet Index | Only([Happy Planet Index])
HPI Rank | Only([HPI Rank])
Population | Only(Population)

Several of these measures can be defined differently. How you do this is very much a matter of taste. For instance, the measures where the Only() function is used can also be defined using Sum() or Avg(). As long as you only have a single number, all three functions will return the same answer.

But how do you want Qlik Sense to behave when there are several countries, for example, a region that should be represented by one value? For the Population measure, the obvious function to use would be Sum(). The total population of the region will then be shown.

However, for a rank, you won't want to use Sum() because it would show an incorrect number. You could use Avg(), which would give the average rank across the countries. An average is clearly better, but it is still not mathematically correct. Then it might be better to use Only(), which doesn't return an answer at all when more than one country is involved.
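The difference between the three aggregations can be illustrated with a small Python sketch. These are our own stand-in functions, not Qlik Sense code; in particular, we assume here that Only() looks at distinct values, so that repeated identical values still count as a single value. The rank numbers are hypothetical.

```python
def only(values):
    """Return the value if there is exactly one distinct value, otherwise None."""
    distinct = set(values)
    return distinct.pop() if len(distinct) == 1 else None


def avg(values):
    """Return the arithmetic mean, or None for an empty input."""
    values = list(values)
    return sum(values) / len(values) if values else None


# Hypothetical HPI ranks: one selected country versus a region of three countries.
single_country = [17]
region = [3, 17, 42]

print(only(single_country), sum(single_country), avg(single_country))
print(only(region), sum(region), avg(region))
```

For the single country, all three functions agree. For the region, the sum is meaningless as a rank, the average is a number that no country actually holds, and only() returns nothing, which signals that the question itself has no single answer.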

Summary

The analysis of demographic data is easy when you use Qlik Sense. Obviously, this analysis can also be made with a number of other tools, since the data model is very simple. However, with Qlik Sense, it is easy to build further. Qlik's associative indexing engine powers the analysis and ensures that you can develop or change your apps quickly and easily. With Qlik Sense, data discovery and analysis are made easy.

With the end of this chapter, we have also reached the end of the book. We took you from the history of Qlik to how to develop applications, and finally gave you some examples of how applications can look.

We hope that after reading this book, you have acquired some skills that will be useful when you develop your own Qlik Sense applications. We also think you now have a better understanding of the thoughts behind Qlik Sense, and wish you good luck in your endeavors.

Welcome to the community of Qlik users!