Research on forecasting economic time series points to an interesting principle: simple, robust statistical models tend to match or beat more complex models in forecast accuracy. The core finding traces back to early work by Dawes (1979) and Makridakis and Hibon (1979), who showed that simple linear models and naive benchmarks frequently outperformed elaborate statistical methods. This result has proven remarkably robust across decades of replication (Armstrong, 2001). Of course, this is not a hard-and-fast rule. In very data-rich environments with stable data-generating processes, complex models, such as random forests or neural networks, can outperform simple benchmarks. But this is more common in engineering and natural science applications than in economics.
This was my frame of reference a few years ago when I was asked to review a paper for the Proceedings of the National Academy of Sciences (PNAS) that purported to show that a sophisticated, complex model based on USDA crop conditions was a more accurate predictor of US corn and soybean yields than simpler models. Market analysts and traders have long used simple regression models based on crop conditions to predict corn and soybean yields, and I have co-authored several farmdoc daily articles investigating the use of these models. I thought the paper was interesting, even though it went against the aforementioned principle, and I recommended that PNAS publish it. The editor agreed, and it came out in 2020 with the impressive-sounding title “Qualitative Crop Condition Survey Reveals Spatiotemporal Production Patterns and Allows Early Yield Prediction.”
While reviewing the paper, it occurred to me that it would be interesting to see whether the results held up under the type of out-of-sample testing that is standard in the time-series forecasting literature. And the idea for another academic paper was born. Fortunately, a talented PhD student, Jiarui Li, and a colleague, Todd Hubbs from Oklahoma State, also thought this was an interesting idea. I am pleased to report that the resulting paper was recently published in the Journal of Agricultural and Resource Economics. The title of the paper is “Does Complexity Pay? Forecasting Corn and Soybean Yields Using Crop Condition Ratings.” Here is a summary of what we found: “…the results of this study are consistent with the conventional wisdom in the forecasting literature that complex models generally do not outperform simpler models in terms of forecast accuracy. Complexity does not pay when forecasting corn and soybean yields based on crop condition ratings.” So everyone using simple crop condition models in the grain trade can relax. I think you will find the paper to be interesting reading.
Our paper does not delve into the reasons why complex models generally fail to beat simpler models in forecast accuracy, but it is certainly a relevant question. Three factors are most commonly cited in the literature. The first is overfitting. Complex models fit in-sample noise that does not repeat out-of-sample. This is especially damaging when the data-generating process is unstable or sample sizes are small, both common in economics and finance. The second is parameter estimation error. Armstrong (2001) emphasized that each additional parameter introduces estimation uncertainty that can swamp any theoretical gain from better specification. In essence, estimation error compounds as model complexity grows. The third is structural instability. In economic time series, relationships estimated in one regime may not hold in the next. Simpler models are more robust to regime changes because they have less structure to break down.
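The overfitting mechanism is easy to see in a toy simulation. The sketch below is purely illustrative and not from our paper: the true relationship is a simple linear trend plus noise (a stand-in for any stable yield–condition relationship), the "simple" model is a two-parameter least-squares line, and the "complex" model is a 1-nearest-neighbor rule that memorizes the training data exactly. The complex model has zero error in-sample but loses out-of-sample.

```python
import random

random.seed(0)

# Illustrative data-generating process: linear trend plus noise.
def draw(n):
    xs = [random.uniform(0, 10) for _ in range(n)]
    ys = [2.0 * x + random.gauss(0, 2.0) for x in xs]
    return xs, ys

train_x, train_y = draw(20)   # small sample, as with annual crop data
test_x, test_y = draw(200)    # out-of-sample evaluation data

# Simple model: ordinary least squares with one slope and one intercept.
def ols(xs, ys):
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    b = (sum((x - mx) * (y - my) for x, y in zip(xs, ys))
         / sum((x - mx) ** 2 for x in xs))
    a = my - b * mx
    return lambda x: a + b * x

# "Complex" model: 1-nearest-neighbor, which memorizes the training
# noise exactly -- an extreme form of overfitting.
def one_nn(xs, ys):
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]

def rmse(model, xs, ys):
    return (sum((model(x) - y) ** 2
                for x, y in zip(xs, ys)) / len(xs)) ** 0.5

simple = ols(train_x, train_y)
complex_ = one_nn(train_x, train_y)

print("in-sample  RMSE:",
      round(rmse(simple, train_x, train_y), 2),
      round(rmse(complex_, train_x, train_y), 2))
print("out-sample RMSE:",
      round(rmse(simple, test_x, test_y), 2),
      round(rmse(complex_, test_x, test_y), 2))
```

The memorizing model scores a perfect fit in-sample yet carries the training noise into every out-of-sample prediction, so its test error exceeds the simple line's. That gap is the overfitting penalty the forecasting literature keeps rediscovering.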
In closing, I want to re-emphasize that simple models do not always beat complex models in forecasting. The search for better models will never end. But it is always a good idea to keep this general principle in mind when searching, especially given all the hype about AI right now.

Laurence J. Norton Chair of Agricultural Marketing
University of Illinois at Urbana-Champaign