Financial Instruction Following Evaluation (FIFE)
NLP
Finance
Evaluation
A novel, high-difficulty benchmark designed to assess LM instruction-following capabilities for financial analysis tasks.
Publication Details
arXiv: 2512.08965
Abstract
This work introduces FIFE, a novel, high-difficulty benchmark for assessing how well language models follow complex, domain-specific instructions in financial analysis tasks. We evaluate 53 models (proprietary, open-weight, and open-source) in a zero-shot setting, providing a rigorous picture of instruction-following capability in finance.
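To make the evaluation setup concrete, the sketch below shows one common way instruction-following benchmarks score responses: a set of programmatic verifiers, each checking compliance with a single instruction constraint. This is a hypothetical illustration; the constraint types, function names, and scoring rule here are assumptions, not FIFE's actual rubric.

```python
# Hypothetical rule-based instruction-following checks, in the spirit of
# benchmarks like FIFE (the real FIFE scoring rules are not reproduced here).
# Each verifier inspects a model response for one constraint.
import re

def follows_bullet_count(response: str, n: int) -> bool:
    """Check that the response contains exactly n bullet lines."""
    bullets = [line for line in response.splitlines()
               if line.strip().startswith(("-", "*"))]
    return len(bullets) == n

def mentions_required_term(response: str, term: str) -> bool:
    """Check that a required keyword (e.g. a ticker symbol) appears."""
    return re.search(re.escape(term), response, re.IGNORECASE) is not None

def score(response: str, checks) -> float:
    """Fraction of instruction constraints the response satisfies."""
    results = [check(response) for check in checks]
    return sum(results) / len(results)

response = "- Revenue rose 8%\n- Margins held at 21%\n- AAPL guidance unchanged"
checks = [
    lambda r: follows_bullet_count(r, 3),
    lambda r: mentions_required_term(r, "AAPL"),
]
print(score(response, checks))  # → 1.0
```

A zero-shot evaluation would run each model once per prompt with no in-context examples and aggregate these per-constraint scores across the benchmark.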