Saving economic data from Covid-19
Claudia Biancotti, Alfonso Rosolia, Fabrizio Venditti, Giovanni Veronese 12 April 2020
Statistics are to policymakers, markets and the public at large what a compass is to a sailor. However, the inevitable disruption from the coronavirus-induced lockdowns worldwide is hitting not only businesses and households but also that very compass that should guide policymaking through the uncharted waters lying ahead.
We face the serious risk of losing touch with the fast-evolving developments on the ground precisely when we most urgently need these numbers. Designing the fiscal and monetary packages to sail through the crisis without the proper data-navigation tools will be difficult. This is further complicated by the fact that price discovery in financial markets may be equally impaired, with spikes in financial volatility adding to the Covid-19 shock. More importantly, however, large information gaps are a formidable weapon for those who want to tear at the fabric of our democracies. In the absence of reliable data anchoring the public debate, disinformation thrives. It becomes easier to circulate inaccurate information on crucial variables such as the economic and human costs of the pandemic, exaggerating or minimising them depending on specific agendas or political goals.
This is an unprecedented challenge that requires unprecedented synergies. Everybody needs to do their part. We see three actors in this play.
First, national statistical institutes and other public authorities, including central banks, will play a key role as producers of official statistics. They must strive to keep the flow of information as intact as possible, even in this challenging environment. Perhaps more importantly, they must offer more guidance than usual in interpreting statistics produced at a time of potential disruption. Take, for instance, the collection of data for compiling price indices. In most countries, this involves a mix of fieldwork and digital sources. As the shutdown hampers field activities, users may come to wonder (i) how extensive the loss of information is; (ii) whether missing items are imputed; and (iii) if so, how. These are all legitimate questions. Transparent answers, going beyond the compilation rules and manuals that suffice in standard times, would allow analysts and policymakers to properly interpret new data releases. Also, whenever possible, producers of official statistics should try to increase the frequency and scope of the data provided. Provided that producers are sufficiently transparent, users will cope with the inevitable drawbacks of new data releases, even if these fall short of the quality standards deemed acceptable in normal times.
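To make the three questions above concrete, here is a minimal sketch in Python of the kind of disclosure a producer could attach to a price index release. It is not any statistical office's actual method. The `Item` record, the two function names, the sample basket and the imputation rule (a missing price relative is replaced by the weighted mean of the observed relatives in the same product group, falling back to the overall observed mean) are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Item:
    name: str
    group: str                 # product group used for imputation
    weight: float              # expenditure weight in the basket
    relative: Optional[float]  # current price / base price; None if not collected


def missing_weight_share(items):
    """Question (i): share of basket weight for which no price was collected."""
    total = sum(i.weight for i in items)
    missing = sum(i.weight for i in items if i.relative is None)
    return missing / total


def imputed_index(items):
    """Questions (ii) and (iii): weighted index (base = 100) in which each
    missing relative is imputed with the weighted mean of observed relatives
    in the same group (assumed rule), or the overall observed mean if the
    whole group is missing. Also returns the names of the imputed items."""
    observed = [i for i in items if i.relative is not None]
    if not observed:
        raise ValueError("no prices collected; nothing to impute from")

    # Weighted mean of observed relatives, by group and overall.
    sums = {}
    for i in observed:
        s, w = sums.get(i.group, (0.0, 0.0))
        sums[i.group] = (s + i.weight * i.relative, w + i.weight)
    group_mean = {g: s / w for g, (s, w) in sums.items()}
    overall = (sum(i.weight * i.relative for i in observed)
               / sum(i.weight for i in observed))

    total = sum(i.weight for i in items)
    acc = 0.0
    imputed = []
    for i in items:
        if i.relative is None:
            rel = group_mean.get(i.group, overall)
            imputed.append(i.name)
        else:
            rel = i.relative
        acc += i.weight * rel
    return 100.0 * acc / total, imputed


# Hypothetical basket: two items could not be surveyed during the lockdown.
basket = [
    Item("bread", "food", 30.0, 1.02),
    Item("milk", "food", 20.0, None),
    Item("haircut", "services", 10.0, None),
    Item("rent", "services", 40.0, 1.00),
]

if __name__ == "__main__":
    index, imputed = imputed_index(basket)
    print(f"missing weight share: {missing_weight_share(basket):.0%}")
    print(f"index: {index:.2f}, imputed items: {imputed}")
```

In this made-up basket, 30% of the weight is not observed. Publishing that share and the list of imputed items alongside the headline number would let users judge how much of a release rests on collected prices and how much on assumptions.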
Central banks should also do their part, by increasing the breadth of their data dissemination and by providing updates on the state of the economy more often than in standard times. The New York Fed, for instance, has recently started publishing a weekly assessment of economic activity based on retail sales, commodity production, energy consumption and unemployment claims (Lewis et al. 2020). Many other central banks regularly maintain models of this type. Within the perimeter of public agencies, those not directly engaged in statistical production can also contribute to this effort, for instance by making available to a broader audience some of the data that they collect for administrative and regulatory purposes. The list includes, to mention just a few, social security administrations, fiscal agencies, labour offices, transport authorities and energy authorities. Suitable aggregations of the data they collect would greatly help policymaking and, at the same time, comply with privacy requirements.
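As an illustration of what a "suitable aggregation" of administrative records might look like, the following Python sketch publishes counts and means by group and suppresses any cell with too few records. The function name, the field names, the sample records and the threshold of 10 are hypothetical. Small-cell suppression is only one common disclosure-control technique and does not by itself guarantee compliance with any particular privacy rule.

```python
from collections import defaultdict

MIN_CELL_SIZE = 10  # hypothetical disclosure threshold, not a legal standard


def aggregate_with_suppression(records, key, value, min_cell_size=MIN_CELL_SIZE):
    """Group records by `key` and report count and mean of `value` per cell.
    Cells with fewer than `min_cell_size` records are suppressed (None)."""
    cells = defaultdict(list)
    for record in records:
        cells[record[key]].append(record[value])

    published = {}
    for cell, values in cells.items():
        if len(values) < min_cell_size:
            published[cell] = None  # too few records to publish safely
        else:
            published[cell] = {
                "count": len(values),
                "mean": sum(values) / len(values),
            }
    return published


# Made-up benefit claims: 12 in one region, 3 in another.
records = (
    [{"region": "North", "amount": 100.0} for _ in range(12)]
    + [{"region": "South", "amount": 250.0} for _ in range(3)]
)
summary = aggregate_with_suppression(records, key="region", value="amount")

if __name__ == "__main__":
    for region, stats in summary.items():
        print(region, stats if stats is not None else "suppressed")
```

With these made-up records, the region with 12 claims is published and the region with 3 is withheld, which is the trade-off between timeliness, granularity and confidentiality that agencies would need to set for themselves.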
The second group of actors is private data brokers, which specialise in producing extremely granular datasets on phenomena that have proven to be relevant to economic forecasting. These information providers have flourished in recent years in the wake of increasing demand from hedge funds and large investors, who search for data that lead official statistics. As an example, information on trade obtained by tracking ships around the globe can be used to nowcast US economic activity on a daily basis. Some of these datasets are very costly and affordable only to a few large players. Ironically, they are often sourced from raw administrative information produced by governments or public agencies themselves, which have for too long been oblivious to its potential. Oftentimes, these same datasets have turned out to be very useful for investors as well as for policymakers themselves (Einav and Levin 2014). This calls for a renewed effort on the part of governments to improve their administrative data collection and to make it available online promptly and effectively. The worldwide effort by epidemiologists in today’s fight against the pandemic is an unprecedented example of sharing health statistics (one instance is the Centre for Humanitarian Data), although serious information gaps remain (Stock 2020).
Last, but by no means least, there are the Big Tech and telecommunication companies. There are billions of devices in the world running only a handful of operating systems, and at least 2.4 billion monthly active users of Facebook. Amazon has stepped up its delivery potential worldwide. Devices and platforms collect massive amounts of data on their owners and users. This information can be usefully put to work to sail through the crisis. Over time, a number of proposals have been formulated to leverage at least part of Big Tech’s information trove for the public good. It is now time to look at them in the light of the Covid-19 emergency and see what is viable. Fortuitously, on 19 February 2020 the European Commission released the European Data Strategy, the result of a process started years ago. The strategy illustrates possible models of cooperation between public and private sector data collectors, so as to unlock the re-use potential of different types of data. With timely foresight, it suggested that “The use of aggregated and anonymised social media data can for example be an effective way of complementing the reports of general practitioners in case of an epidemic”. Another recent milestone was reached with the agreement last March between Eurostat (the EU statistical office) and Airbnb, Booking, Expedia and Tripadvisor to permit access to reliable and unique data about holiday and other short-stay accommodation offered on these platforms. In the time of Covid-19, such partnerships should be reaffirmed and extended to include the largest tech companies.
There is a continuum of options. Which options are selected is a political choice that must balance, among other things, antitrust considerations, privacy issues and the risk of capture by companies that are bigger than most governments. But a way must be found. We lay out just a few obvious examples. Simple improvements in the quality and breadth of some of the aggregate data that tech giants already provide would in themselves be extremely beneficial. Google Trends, for instance, an index of the popularity of a given search query on Google, has been found to have predictive power for the US unemployment rate (D’Amuri and Marcucci 2017) and for US GDP (Ferrara and Simoni 2019), and has been extensively used in other social sciences. Prime Now, Amazon’s grocery delivery arm, is booming; its data offer a unique window on expenditure decisions and current prices at a time when standard statistical sources are impaired. Gig economy platforms such as Uber could offer guidance in interpreting labour supply and demand, albeit in specific sectors and occupations (Mas and Pallais 2019, Angrist et al. 2017). Telecommunication networks could offer much-needed information on mobility flows to monitor the pandemic and the effectiveness of containment measures (Pepe et al. 2020). There are, of course, well-known limitations to the full-scale use of these data in statistical offices (Stephens-Davidowitz and Varian 2015, Algan et al. 2019). Big Data is not the output of instruments designed to produce valid and reliable data amenable to scientific analysis (Lazer et al. 2014), nor of instruments designed for statistical offices, but tools and scientific methods exist to bridge these gaps (Gelman and Hennig 2017). This is where partnerships between public institutions and private companies at the time of Covid-19 can make a difference.
The world is sailing through unprecedentedly difficult times. Perhaps for the first time in history, we all face the same imminent objective threat. Every person and every institution can contribute to making it to shore as safely as possible. Many can help in designing a reliable map. Without it, policy will falter. Right now, we cannot afford that.
Authors’ note: The opinions expressed are those of the authors and do not necessarily reflect the views of the Bank of Italy or the European Central Bank.
Algan, Y, F Murtin, E Beasley, K Higa and C Senik (2019), “Well-being through the lens of the internet”, PLoS ONE 14(1).
Angrist, J, S Caldwell and J Hall (2017), “Uber vs Taxi: a driver’s eye view”, VoxEU.org, 8 December.
D’Amuri, F, and J Marcucci (2017), “The predictive power of Google searches in forecasting US unemployment”, International Journal of Forecasting 33(4).
Einav, L, and J Levin (2013), “The Data Revolution and Economic Analysis”, NBER Chapters, in: Innovation Policy and the Economy, Volume 14, pages 1-24, National Bureau of Economic Research.
European Commission (2020), “Communication from the Commission to the European Parliament, the Council, the European Economic and Social Committee and the Committee of the Regions”, 19 February 2020.
Ferrara, L, and A Simoni (2019), “When are Google data useful to nowcast GDP? An approach via preselection and shrinkage”, BdF WP No. 717.
Gelman, A, and C Hennig (2017), “Beyond subjective and objective in statistics”, Journal of the Royal Statistical Society 180: 967-1033.
Lazer, D, R Kennedy, G King and A Vespignani (2014), “The Parable of Google Flu: Traps in Big Data Analysis”, Science 343(6176): 1203-1205.
Mas, A, and A Pallais (2019), “Labor Supply and the Value of Non-Work Time: Experimental Estimates from the Field”, American Economic Review: Insights 1(1), June 2019.
Narita, F, and R Yin (2018), “In Search of Information: Use of Google Trends’ Data to Narrow Information Gaps for Low-income Developing Countries”, IMF Working Paper No. 18/286.
Lewis, D, K Mertens and J Stock (2020), “Monitoring Real Activity in Real Time: The Weekly Economic Index”, Liberty Street Economics, Federal Reserve Bank of New York.
Stephens-Davidowitz, S, and H Varian (2014), “A Hands-on Guide to Google Data”. Further details on the construction can be found on the Google Trends page.
Pepe, E, P Bajardi, L Gauvin, F Privitera, C Cattuto and M Tizzoni (2020), “COVID-19 outbreak response: First assessment of mobility changes in Italy following lockdown”, ISI Foundation.
Stock, J H (2020), “Data Gaps and the Policy Response to the Novel Coronavirus,” NBER Working Paper 26902.
1 A first step in this direction is Google’s release of anonymised insights based on Google Maps to help monitor how containment measures are shaping mobility patterns.