The workshop is scheduled for February 19th, 2024, from 16:00 to 18:30 CET, and will be conducted via Zoom.
Please register below if you would like to attend the workshop and have not yet received a meeting link.
The format of the workshop includes six talks, each lasting about 20 minutes + 5 minutes for questions. If time permits, we will close the workshop with a brief discussion, focusing on the key insights presented during the various talks.
The workshop will be moderated by Professor Gregory Wheeler (Frankfurt School).
16:00 - 16:20
“Forecasting Heuristics: Definition, History and a Success in Flu Prediction”
Konstantinos V. Katsikopoulos (University of Southampton)
Forecasting can be pursued with complex models and big data. Alternatively, it can be done with heuristics that employ a few pieces of information and combine them in simple ways. Even though simple models, such as tallying variables, have been known since the 1930s and 1970s to sometimes predict as well as or even better than “optimal” linear regression and other relatively complex schemes, simple has been hard to accept, as Robin Hogarth wrote in 2012. With the advent of big data, the most recent AI summer, and the triumphs of machine learning, heuristics remain largely unappreciated, often even losing their place as “naïve” benchmarks. Still, forecasting heuristics based on psychological theory (say, of how people deal with rapidly changing situations) have outperformed big-data analytics in predicting outcomes such as U.S. presidential elections, consumer purchases, patient hospitalizations, and terrorist attacks. Consider the prediction of flu incidence: the recency heuristic, an instance of people’s “naïve forecasting,” predicts that this week’s proportion of flu-related doctor visits equals the proportion from the most recent week. It forecasted CDC data more accurately than Google Flu Trends, which used big data and a black-box algorithm; the recency heuristic is also transparent. A natural way forward might be to benchmark big-data/machine-learning models against simple heuristics, or to combine the two approaches. Nevertheless, when we published the flu research in 2021, most commentaries on the article appeared unconvinced of the value of simple heuristics, as many others have been for decades.
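The recency heuristic described above is simple enough to state in a few lines of code. The following sketch is illustrative only; the weekly values are invented, not CDC data:

```python
def recency_forecast(history):
    """Recency heuristic: forecast the next period's value as the most
    recently observed value, ignoring all earlier observations."""
    if not history:
        raise ValueError("need at least one observation")
    return history[-1]

# Hypothetical weekly proportions of flu-related doctor visits (percent)
weekly_ili = [1.2, 1.5, 2.1, 2.8]
print(recency_forecast(weekly_ili))  # forecast for next week: 2.8
```

The heuristic uses a single data point and is fully transparent: the forecast is the observation itself.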
16:25 - 16:45
"Recency: Prediction with a Single Data Point"
Florian M. Artinger (Berlin International University of Applied Sciences and Max Planck Institute for Human Development)
Recency, that is, the time since an event last occurred, is commonly used to predict future events when people make decisions. However, relying solely on recency and ignoring all other information is commonly regarded as a mistake. We develop two sufficient conditions under which recency can predict future events at least as well as integrating all available information: unexpected changes over time and a dominant attribute. In an extensive empirical study with 58 prediction tasks, we analyze these conditions and show that they are central to the performance of the recency-based hiatus heuristic compared with standard prediction models. We conclude that relying on recency can be ecologically rational and a powerful basis for decision making.
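As an illustration of a recency-based rule, the hiatus heuristic classifies a customer as inactive if the time since their last purchase exceeds a fixed hiatus. The sketch below is hypothetical; the nine-month threshold is an assumed value for illustration, not one taken from the talk:

```python
def hiatus_heuristic(months_since_last_purchase, hiatus=9):
    """Classify a customer from recency alone: predict 'inactive' if the
    last purchase is longer ago than the hiatus threshold (assumed: 9 months)."""
    return "inactive" if months_since_last_purchase > hiatus else "active"

print(hiatus_heuristic(3))   # active
print(hiatus_heuristic(14))  # inactive
```

Note that the rule uses a single data point per customer and no model fitting at all, which is exactly why its competitive performance against standard prediction models is striking.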
16:50 - 17:10
“Accuracy of US CDC COVID-19 Forecasting Models”
Aviral Chharia (Carnegie Mellon University)
Accurate predictive modeling of pandemics is essential for optimally distributing resources and setting policy. Dozens of case-prediction models have been proposed, but their accuracy over time and by model type remains unclear. In this study, we analyze all US CDC COVID-19 forecasting models by first categorizing them and then calculating their mean absolute percent error, both wave by wave and over the complete timeline. We compare their estimates to government-reported case numbers, to one another, and to two baseline models in which case counts either remain static or follow a simple linear trend. The comparison reveals that more than one-third of models fail to outperform the static-case baseline and two-thirds fail to outperform the linear-trend forecast. A wave-by-wave comparison shows that no overall modeling approach, including ensemble models, was superior to the others, and that modeling error has increased over the course of the pandemic. This study raises concerns about hosting these models on the official public platforms of health organizations, including the US CDC, which risks giving them an official imprimatur, and further concerns if they are used to formulate policy. By offering a universal evaluation method for pandemic forecasting models, we expect this work to serve as a starting point for the development of more accurate models.
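The evaluation described above can be sketched in a few lines; the function names and toy case counts below are illustrative assumptions, not the study's actual pipeline:

```python
def mape(actual, predicted):
    """Mean absolute percent error between reported and forecast case counts."""
    return 100.0 * sum(abs(a - p) / a for a, p in zip(actual, predicted)) / len(actual)

def static_baseline(history, horizon):
    """Baseline 1: case counts remain at the last observed value."""
    return [history[-1]] * horizon

def linear_trend_baseline(history, horizon):
    """Baseline 2: case counts follow the most recent linear trend."""
    slope = history[-1] - history[-2]
    return [history[-1] + slope * (h + 1) for h in range(horizon)]

cases = [100, 120, 140]        # toy reported weekly case counts
actual_future = [160, 180]     # toy "ground truth" for the next two weeks
print(mape(actual_future, static_baseline(cases, 2)))        # ~17.36
print(mape(actual_future, linear_trend_baseline(cases, 2)))  # 0.0
```

A model that cannot beat either baseline on this metric adds no predictive value over naive extrapolation, which is the study's central yardstick.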
17:15 - 17:35
"On the Accuracy of Short-Term COVID-19 Fatality Forecasts"
Lucas Böttcher (Frankfurt School and University of Florida)
Forecasting new cases, hospitalizations, and disease-induced deaths is an important part of infectious-disease surveillance and helps guide health officials in implementing effective countermeasures. For disease surveillance in the US, the Centers for Disease Control and Prevention (CDC) combined more than 65 individual forecasts of these numbers into an ensemble forecast at both national and state levels. A similar initiative was launched by the European CDC (ECDC) in the second half of 2021. We collected data on CDC and ECDC ensemble forecasts of COVID-19 fatalities and compare them with easily interpretable “Euler” forecasts, which serve as a model-free benchmark based on the local rate of change of the incidence curve. The term “Euler method” is motivated by the eponymous numerical integration scheme, which calculates the value of a function at a future time step from the current rate of change. Our results show that simple and easily interpretable “Euler” forecasts can compete favorably with both CDC and ECDC ensemble forecasts on short-term horizons of one week, whereas ensemble forecasts perform better on longer horizons. Using the current rate of change in incidence as an estimate of future incidence changes is thus useful for epidemic forecasting on short time horizons. An advantage of the proposed method over other forecasting approaches is that it can be implemented with very little work and without relying on additional data (e.g., data on human mobility and contact patterns) or high-performance computing systems.
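In the spirit of the forward Euler integration step described above, the benchmark extrapolates the current local rate of change of the incidence curve. A minimal sketch, with invented fatality counts:

```python
def euler_forecast(cumulative_deaths, horizon=1):
    """Extrapolate a cumulative incidence curve using its current local
    rate of change, analogous to a forward Euler integration step."""
    rate = cumulative_deaths[-1] - cumulative_deaths[-2]  # finite-difference slope
    return [cumulative_deaths[-1] + rate * (h + 1) for h in range(horizon)]

# Hypothetical weekly cumulative fatality counts
deaths = [1000, 1100, 1250]
print(euler_forecast(deaths, horizon=2))  # [1400, 1550]
```

The forecast needs only the last two points of the reported curve, which is why it requires essentially no computation and no auxiliary data.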
17:40 - 18:00
“Bayesian Inference in Networked Systems and Applications in Infectious Disease Modeling”
Sen Pei (Columbia University)
Network models are widely used in infectious disease modeling, and many real-world problems require calibrating high-dimensional network models to observational data. By coupling dynamical network models with real-world data, model-inference systems can support real-time disease forecasting, epidemiological parameter inference, and estimation of unobserved variables. I will introduce data assimilation methods for metapopulation models through applications to influenza and COVID-19.
18:05 - 18:25
“Forecasting Drug Overdose Deaths in The United States”
Maria D'Orsogna (California State University, Northridge and University of California, Los Angeles)
The United States is in the midst of a dramatic drug-epidemic crisis. Despite decades of research and the implementation of policies ranging from harm reduction to punitive measures, drug overdose deaths surpassed 112,000 fatalities in 2023, and annual expenditures related to drug control have reached 40 billion USD. There are many open questions about whether to intervene on the supply or the consumption side, and about which strategies are most promising for helping vulnerable persons avoid addiction or regain sobriety. One of the most important building blocks, however, is understanding the evolution of the drug overdose crisis and which communities will be most affected, so that resource allocation for prevention, treatment, and survival is most effective. The CDC has been tracking drug overdose deaths in the United States for several decades, stratifying fatalities by race, gender, age, drug type, and geography. In this talk we discuss the use of heuristic forecasting based on this body of data to predict drug overdose deaths during the latest "synthetic opioid" crisis, which began in 2013, fueled by the introduction of fentanyl into the country. We find that heuristic methods applied to specific subpopulations, in this case quadratic regressions on specific racial and geographic groups, offer good insight into future drug overdose trends and can help identify anomalies, such as the large increases observed during the COVID-19 pandemic. Being relatively simple to implement, these heuristics can be quite valuable for on-the-ground practitioners, who can adapt them to the available spatio-temporal data, using their insight and practical knowledge as guidance.
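A quadratic-regression heuristic of the kind described can be sketched as follows. The year offsets and death counts are invented, and NumPy's least-squares polynomial fit stands in for whatever fitting procedure the speakers actually use:

```python
import numpy as np

def quadratic_forecast(years, deaths, future_year):
    """Fit a quadratic trend to a subgroup's annual overdose deaths and
    extrapolate it to a future year (least-squares fit via numpy.polyfit)."""
    coeffs = np.polyfit(years, deaths, deg=2)
    return float(np.polyval(coeffs, future_year))

# Hypothetical annual deaths for one racial/geographic subgroup,
# with years expressed as offsets from a base year for better conditioning
years = [0, 1, 2, 3]
deaths = [5000, 5600, 6400, 7400]
print(round(quadratic_forecast(years, deaths, 4)))  # ~8600
```

Fitting each subgroup separately, as the abstract suggests, also makes anomalies visible: a year whose reported deaths sit far above the subgroup's extrapolated curve, as during the COVID-19 pandemic, stands out immediately.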