
James R. Fischer
— Managing Director, Financial Analytics
March 30, 2024
10 min read
Every major CFO we speak to in 2024 is being asked by their board: are we using AI in our forecasting? The honest answer, for most organizations, should be: not yet, and here is why that is probably the right call.
94% of CFOs surveyed report board pressure to adopt AI in finance function (Gartner, 2024)
23% have deployed any form of AI in their core forecasting process
8% report positive ROI from AI-assisted forecasting after 12 months
The use cases where generative AI provides genuine, measurable lift in financial forecasting are narrower than the vendor pitch decks suggest — but they are real and they are significant for the organizations that find them.
1. Narrative interpretation: parsing earnings call transcripts, analyst reports, and news feeds to extract forward-looking signals that traditional time-series models miss
2. Scenario narration: translating quantitative scenario outputs into coherent, boardroom-ready prose that finance teams can actually use
3. Anomaly hypothesis generation: when a forecast diverges unexpectedly, LLMs can rapidly generate candidate explanations from external data sources
4. Cross-functional synthesis: aggregating inputs from disparate planning systems (HR, supply chain, sales) into a coherent narrative forecast
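The third use case begins with noticing that a forecast has diverged at all. A minimal sketch of that trigger follows, assuming forecast errors are reasonably stable from period to period: a period is flagged when its error sits far outside the errors that preceded it, and the relevant context is then assembled for an LLM to explain. The function names, the threshold of three standard deviations, the minimum history of four periods, and the prompt wording are all illustrative assumptions rather than a description of any FischerJordan tool, and no LLM is actually called.

```python
from statistics import mean, stdev


def flag_divergence(actuals, forecasts, threshold=3.0, min_history=4):
    """Return indices of periods whose forecast error is an outlier
    relative to the errors that came before it."""
    errors = [a - f for a, f in zip(actuals, forecasts)]
    flagged = []
    # min_history must be at least 2 so a standard deviation can be computed.
    for i in range(min_history, len(errors)):
        prior = errors[:i]
        spread = stdev(prior)
        if spread == 0:
            continue  # no variation in prior errors, so a z-score is undefined
        z = (errors[i] - mean(prior)) / spread
        if abs(z) > threshold:
            flagged.append(i)
    return flagged


def build_hypothesis_prompt(period, actual, forecast, snippets):
    """Assemble the text an LLM would receive; the wording is illustrative."""
    lines = [
        f"Period {period}: actual {actual}, forecast {forecast}.",
        "List candidate explanations for the gap, citing only the sources below.",
    ]
    lines += [f"- {s}" for s in snippets]
    return "\n".join(lines)
```

Keeping the detection step statistical and handing the LLM only the explanation step means the model is asked why a gap exists, never whether one exists.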
The performance gap between LLM-based forecasting and established statistical methods is most pronounced in structured, numerical, high-frequency time-series prediction. For monthly or quarterly financial forecasting, the bread and butter of FP&A, classical methods including ARIMA, gradient-boosted trees, and Bayesian structural models consistently outperform GPT-class models in both accuracy and interpretability.
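One way to test a claim like this on your own data is a rolling-origin backtest: each candidate forecasts one period ahead using only the history available at that point, and the errors are averaged. The sketch below is a simplified illustration, not the evaluation method behind the figures in this article. It uses a seasonal-naive forecaster as a deliberately simple stand-in for the classical methods named above, and the function names and the `min_train` default are assumptions. Any other forecaster, including a wrapper around an LLM, could be passed in as a callable with the same signature.

```python
def seasonal_naive(history, season_length=4):
    """Forecast the next value as the value one season earlier
    (season_length=4 assumes quarterly data)."""
    return history[-season_length]


def last_value(history):
    """Forecast the next value as the most recent observation."""
    return history[-1]


def rolling_backtest(series, forecaster, min_train=8):
    """One-step-ahead rolling-origin backtest.

    Returns mean absolute percentage error (MAPE) in percent.
    Assumes no actual value in the test window is zero.
    """
    abs_pct_errors = []
    for t in range(min_train, len(series)):
        prediction = forecaster(series[:t])  # only data before period t is visible
        actual = series[t]
        abs_pct_errors.append(abs(actual - prediction) / abs(actual))
    return 100 * sum(abs_pct_errors) / len(abs_pct_errors)
```

Running every candidate through the same backtest, on the same periods, keeps the comparison between methods like for like.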
The Hallucination Problem
In internal testing across six FP&A deployments, we observed LLM-generated forecasts that appeared statistically plausible but were based on fabricated historical analogues. Unlike a classical model whose failure modes are transparent, an LLM's errors can be confidently stated and difficult to detect without domain expertise.
31% average accuracy improvement achieved by using LLMs for narrative inputs alongside classical numerical models. The hybrid approach outperforms either alone.
The organizations that are extracting genuine value from AI in finance are not replacing their forecasting models. They are augmenting them — using LLMs at the data intake and output narration layers while preserving classical methods at the numerical prediction core. This hybrid architecture is less exciting as a board story, but it works.
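The layering described here can be sketched in a few functions. This is a hedged illustration under stated assumptions, not a reference implementation: the `Signal` schema, the `beta` coefficient (which in practice would be estimated by the classical model from history), and the seasonal-naive core are all hypothetical, and the `extract` and `narrate` callables stand in for LLM calls that are not made here. The structural point is that the number is produced only by the numerical core.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Sequence, Tuple


@dataclass
class Signal:
    """Structured output of the intake layer (hypothetical schema)."""
    source: str
    score: float  # -1.0 (negative outlook) to +1.0 (positive outlook)


def intake_layer(documents: Dict[str, str],
                 extract: Callable[[str], float]) -> List[Signal]:
    """LLM layer 1: turn unstructured text into structured signals.
    `extract` stands in for an LLM call."""
    return [Signal(source=name, score=extract(text))
            for name, text in documents.items()]


def numerical_core(history: Sequence[float], signals: List[Signal],
                   season_length: int = 4, beta: float = 0.0) -> float:
    """Classical layer: seasonal-naive baseline, adjusted by the average
    signal score through an assumed, separately estimated coefficient."""
    baseline = history[-season_length]
    avg_score = sum(s.score for s in signals) / len(signals) if signals else 0.0
    return baseline * (1 + beta * avg_score)


def narration_layer(forecast: float, signals: List[Signal],
                    narrate: Callable[[dict], str]) -> str:
    """LLM layer 2: describe the result. `narrate` stands in for an LLM call
    and receives only figures the core already produced."""
    facts = {"forecast": round(forecast, 2),
             "signals": [(s.source, s.score) for s in signals]}
    return narrate(facts)


def hybrid_forecast(history, documents, extract, narrate,
                    beta: float = 0.0) -> Tuple[float, str]:
    signals = intake_layer(documents, extract)
    forecast = numerical_core(history, signals, beta=beta)
    return forecast, narration_layer(forecast, signals, narrate)
```

With `beta` left at zero the pipeline reduces to the classical forecast plus narration, which makes it straightforward to measure whether the narrative signals add anything before relying on them.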
The FJ AI Readiness Diagnostic
Before deploying any AI in your finance function, FischerJordan recommends a 30-day AI Readiness Diagnostic that audits data quality, model governance infrastructure, and organizational capability. Teams that skip this step spend significantly more on course correction than the diagnostic costs.
This analysis is based on FischerJordan's proprietary evaluation of six client FP&A AI deployments and a review of 23 published studies on LLM performance in time-series forecasting tasks.
