Income Mountains Dataset — Documentation

Version 3 – November 9 2023

This gives a quick overview of the sources and methods used to create Gapminder’s dataset behind our Income Mountains, with data for the period 1800 to 2100. 

This dataset is available to download for free at this GitHub repository.

The mountains show the number of people at different income levels, measured as mean household income (or consumption) per person per day, in dollars adjusted for inflation over time and for price differences between countries, expressed in 2017 prices (PPP 2017). The global curve is constructed by stacking all countries’ bell-curves on top of each other. Each country’s bell-curve is calculated using three numbers: 1. Mean income; 2. Gini; 3. Population. The World Bank has Gini and income data for countries for the year 2020 in its income database, PovcalNet, which it uses for estimating the number of people below the extreme poverty line. In addition to the years when PovcalNet has data, Gapminder has gathered data from a wide range of historic sources, as documented for each of the three indicators in the latest versions of our datasets: Gini v4, Household Income v4 and Population v7. Conceptually, the average global household income should be the same as the global GDP per capita, but they are a bit different, and the reasons are outlined here.
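To make the "three numbers" idea concrete, here is a minimal sketch of how a mean income, a Gini and a population can define one country's curve, and how such curves stack into a combined shape. It assumes a log-normal income distribution, which is the assumption behind the older synthetic method described further down and not the current survey-based method. The function names, the bracket edges and the example countries are all hypothetical and exist only for illustration.

```python
import math
from statistics import NormalDist


def lognormal_params(mean_income, gini):
    """Return (mu, sigma) of a log-normal with the given mean and Gini.

    Uses two standard log-normal identities:
      Gini = 2 * Phi(sigma / sqrt(2)) - 1
      mean = exp(mu + sigma**2 / 2)
    """
    sigma = math.sqrt(2) * NormalDist().inv_cdf((gini + 1) / 2)
    mu = math.log(mean_income) - sigma ** 2 / 2
    return mu, sigma


def people_per_bracket(mean_income, gini, population, edges):
    """Number of people in each income bracket [edges[i], edges[i + 1])."""
    mu, sigma = lognormal_params(mean_income, gini)
    log_income = NormalDist(mu, sigma)  # distribution of ln(income)
    cdf = [log_income.cdf(math.log(edge)) for edge in edges]
    return [population * (hi - lo) for lo, hi in zip(cdf, cdf[1:])]


def stack_curves(curves):
    """Add several country curves bracket by bracket."""
    return [sum(column) for column in zip(*curves)]


# Hypothetical bracket edges in dollars per day, evenly spaced on a log scale.
edges = [2 ** (k / 4) for k in range(-16, 41)]

# Two made-up countries: (mean income, Gini, population).
country_a = people_per_bracket(10.0, 0.40, 1_000_000, edges)
country_b = people_per_bracket(50.0, 0.30, 2_000_000, edges)
combined = stack_curves([country_a, country_b])
```

Plotting `combined` against the bracket edges on a logarithmic income axis would give a small two-country "mountain".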

The trend beyond 2020 is hypothetical. It was generated only to show a likely “IF-scenario”: how people’s incomes would change if the world’s countries continued to have a modest version of their recent economic growth and inequality within each country remained as in 2020. These assumptions are not meant to say that this is going to happen. Nobody can know. Instead, they are useful only to see what the world would look like IF that happened, which might be quite likely. The growth forecast of income is documented on the GDP per capita documentation page.

The trends of mean household income per capita use the growth rates from our GDP per capita dataset (version 29) to estimate income levels backwards to the year 1800. The income trends are extended into the future with the modest future growth rates of the GDP series, based on the IMF’s growth rates for countries up to 2020, after which all countries converge to a modest global growth of 2.2%. It’s worth noting that this is a highly hypothetical forecast, and the future picture would change a lot, depending primarily on what happens in the largest countries.
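The arithmetic behind this kind of extension can be sketched as follows. This is only an illustration under stated assumptions: growth rates are treated as annual, the path towards 2.2% is assumed to be linear over a hypothetical `converge_years` window (the text above does not say how the convergence is shaped), and all function and parameter names are invented for this example.

```python
def backcast_levels(anchor_year, anchor_income, gdp_growth, first_year):
    """Extend an income level backwards using GDP per capita growth rates.

    gdp_growth[y] is the growth from year y - 1 to year y (0.02 means 2%).
    """
    levels = {anchor_year: anchor_income}
    for year in range(anchor_year, first_year, -1):
        # Undo one year of growth to get the previous year's level.
        levels[year - 1] = levels[year] / (1 + gdp_growth[year])
    return levels


def forecast_levels(last_year, last_income, recent_growth, end_year,
                    target_growth=0.022, converge_years=10):
    """Extend an income level forwards, moving growth towards a common rate.

    The linear convergence and converge_years are assumptions for this sketch.
    """
    levels = {last_year: last_income}
    for step, year in enumerate(range(last_year + 1, end_year + 1), start=1):
        weight = min(step / converge_years, 1.0)  # share of the target rate
        growth = (1 - weight) * recent_growth + weight * target_growth
        levels[year] = levels[year - 1] * (1 + growth)
    return levels


# Made-up numbers: 121 dollars/day in 2020 after two years of 10% growth.
past = backcast_levels(2020, 121.0, {2019: 0.10, 2020: 0.10}, 2018)
future = forecast_levels(2020, 121.0, recent_growth=0.05, end_year=2035)
```

Because only growth rates are borrowed from the GDP series, the income level itself stays anchored to the household income data in the anchor year.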

The uncertainty of this data is large. (We hope in the future to be able to visualise the doubt we have as a blurred outer border of the shape, to remind users of the high uncertainty.) We still dared to compile a consistent dataset and fill in all the gaps even if the uncertainty is high, because the ignorance about global development is even higher. We believe that we can change that ignorance by showing visually what it looked like when billions of people left extreme poverty behind. For such an image to be easy to understand, we cannot let the image flicker because of missing data. Without clear visual impressions like our animated income mountains, we know that people instead end up imagining a world that has only changed a little bit. Or they might even imagine the majority still being stuck in extreme poverty, and think the world is still as bad as it always was (see the destiny instinct).

QUICK DESCRIPTION OF OUR METHOD

During the past 20 years, Gapminder has used a “synthetic method” to draw income mountains for all countries and stack them into a global mountain. It was synthetic because it didn’t use the actual “raw material” from surveys, but replaced the data with a synthetic formula. Today, in 2021, the data situation is very different. Lots of income surveys from many countries and years have been standardised to become comparable, thanks to the LIS project and the World Bank’s PovcalNet. Almost all countries have some comparable data for some recent year, so we have decided to abandon the synthetic method based on the log-normal assumption. Instead, we use actual survey data from PovcalNet whenever it is available, and a new method to generate the shapes for years where we have Ginis but no comparable income data. The steps below show how we generate curves that are not log-normal, based on Ginis.

  • Step 1. We divide the income dimension into many groups (income brackets), and get the population of each income bracket from PovcalNet. If we plot these data points (x = income bracket, y = population) for a given country and year, we get the curve for that country and year. Whenever PovcalNet has data, we use a smoothed version of it.
  • Step 2. We assume that countries with similar income and Gini will have similar shapes. So to extend the income mountain into the past and the future, we first find the “neighbours” of the years outside PovcalNet periods, based on Gapminder’s long time series of mean income per capita and Gini data. Then we calculate the average curve of the neighbours for every (country, year) pair.
  • Step 3. Then we blend the nearest PovcalNet curve and the average curve from the previous step. How much we take from each depends on how many years away we are from the nearest PovcalNet curve of that country. The result is the estimated curves for years outside PovcalNet periods.
  • Step 4. We pile the shapes of the country bell-curves on top of each other for each region, and then we pile all the regions together, giving a single aggregated global shape showing where people are on different income levels.
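Steps 2 to 4 can be sketched in a few lines of code. This is a rough illustration, not Gapminder's actual implementation: the distance measure between (mean income, Gini) points, the number of neighbours `k`, the linear blending weight and its `horizon` are all assumptions made up for this example, since the steps above do not specify them. Curves are represented as lists of population shares per income bracket.

```python
import math


def distance(a, b):
    """Hypothetical distance between two (mean_income, gini) points."""
    return math.hypot(math.log(a[0]) - math.log(b[0]), a[1] - b[1])


def average_neighbour_curve(target, known, k=3):
    """Step 2: average the shapes of the k most similar surveyed cases.

    known is a list of ((mean_income, gini), shares) with survey-based shapes.
    """
    nearest = sorted(known, key=lambda item: distance(target, item[0]))[:k]
    shapes = [shares for _, shares in nearest]
    return [sum(column) / len(shapes) for column in zip(*shapes)]


def blend(survey_curve, neighbour_curve, years_away, horizon=20):
    """Step 3: lean more on the neighbour average the further we are in
    time from the country's nearest survey-based curve."""
    weight = min(years_away / horizon, 1.0)  # weight on the neighbour average
    return [(1 - weight) * s + weight * n
            for s, n in zip(survey_curve, neighbour_curve)]


def stack_people(curves):
    """Step 4: turn shares into head counts and add them bracket by bracket.

    curves is a list of (population, shares).
    """
    n_brackets = len(curves[0][1])
    return [sum(pop * shares[i] for pop, shares in curves)
            for i in range(n_brackets)]


# Made-up surveyed shapes over three income brackets.
known = [
    ((10.0, 0.40), [0.2, 0.5, 0.3]),
    ((12.0, 0.38), [0.1, 0.5, 0.4]),
    ((100.0, 0.30), [0.0, 0.1, 0.9]),
]
neighbour_avg = average_neighbour_curve((11.0, 0.39), known, k=2)
estimate = blend([0.2, 0.5, 0.3], neighbour_avg, years_away=5)
world = stack_people([(1000, estimate), (2000, [0.0, 0.1, 0.9])])
```

Because both inputs to `blend` are shares that sum to one, the blended curve also sums to one, so multiplying by population in the last step keeps each country's head count intact.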

If you have questions about our data please contact us.

Previous Versions

Version 2