Working Paper
Total Recall? Evaluating the Macroeconomic Knowledge of Large Language Models
Abstract: We evaluate the ability of large language models (LLMs) to estimate historical macroeconomic variables and data release dates. We find that LLMs have precise knowledge of some recent statistics, but performance degrades as we go farther back in history. We highlight two particularly important kinds of recall error: mixing first-print data with subsequent revisions (i.e., smoothing across vintages) and mixing data for past and future reference periods (i.e., smoothing within vintages). We also find that LLMs can often recall individual data release dates accurately, but aggregating across series shows that on any given day the LLM is likely to believe it has in hand data that have not yet been released. Our results indicate that while LLMs have impressively accurate recall, their errors point to limitations when they are used for historical analysis or to mimic real-time forecasters.
JEL Classification: C53; C80; E37
https://doi.org/10.17016/FEDS.2025.044
Full text (PDF): https://www.federalreserve.gov/econres/feds/files/2025044pap.pdf
Bibliographic Information
Provider: Board of Governors of the Federal Reserve System (U.S.)
Part of Series: Finance and Economics Discussion Series
Publication Date: 2025-06-25
Number: 2025-044