In today’s digital economy, companies in virtually every sector—from retail to manufacturing to tourism—collect a vast amount of information about their operations that could be used to better measure the economy. But, for the most part, government statistical agencies have not changed the methods they use for economic forecasting to make use of this data, instead relying on slow and expensive surveys. To better understand how to modernize government statistics, the Center for Data Innovation hosted a policy forum at the European Parliament with MEP Miapetra Kumpula-Natri to discuss the opportunities and challenges involved in using non-traditional data sources to improve the quality and timeliness of official government data.
To begin the discussion, Joonas Tuhkuri, a researcher at ETLA, the Research Institute of the Finnish Economy, provided an overview of ETLAnow, a tool developed in partnership with 29 European research institutes to forecast unemployment levels in every EU country by analyzing real-time Google search data alongside official data from Eurostat. The estimates are updated automatically each day and include forecasts for the next three months.
ETLA has also used Internet search data to forecast immigration to Finland during Europe’s recent refugee crisis, and it is working with a start-up in Helsinki to combine information from Finland’s business register and information on the Internet to create a better picture of business activity in the economy. ETLA researchers stressed that this type of modeling requires that a country’s business register be available as open data—but this is lacking in some countries. As a result, creating consistency between European countries will be a challenge.
Following the presentation, Petri Rouvinen, another researcher at ETLA, and Lucy Sioli, the Head of the European Semester and Knowledge Base Unit at DG Connect, joined the panel discussion. The panelists agreed that official government statistics could be improved by using alternative data sources, but that European statistical agencies are not yet ready to take full advantage of this type of approach to economic forecasting. One problem is consistency. European countries need to develop consistent methods for gathering and analyzing supplemental data, a particular challenge when data is gathered in different languages. Another problem is availability. Countries need to ensure that data used to measure the economy, especially government data, remains available over time to support longitudinal analysis.
In particular, statistical agencies may be reluctant to rely on alternative data sources provided by the private sector if they are not confident they will be able to use this data in the future. Private sector organizations share this data voluntarily and are generally under no obligation to provide it in any regular or consistent form. As a result, a company's changing business needs may lead it to collect different data, or to stop collecting certain data entirely, and the government would have no recourse.
That does not mean government statistical agencies should not try. For example, Eurostat has an agreement with a number of retailers to obtain data from retail scanners. However, this type of data sharing could be greatly expanded, and panelists noted that it is difficult for government agencies to gain access to similar retail data collected by banks and credit card companies. Similarly, while the European Commission already analyzes online job advertisements to assess the European labor market at a regional level, policymakers could get a better picture of how well the skills of job-seekers are matching the demands of employers if they had access to data from employment agencies and professional networks, such as LinkedIn. Policymakers should explore how to form public-private partnerships that could unlock some of these opportunities.
Both the public sector and the private sector rely heavily on economic data for key decisions, and government statistical agencies should be encouraged to continue modernizing their methods so that they can provide this data as quickly and accurately as possible.