Publication
A comprehensive evaluation framework comparing language models with reasoning-reinforced LMs across 20+ core NLP tasks in finance.
Published
June 18, 2025
Authors
Glenn Matlin, Mika Okamoto, Huzaifa Pardawala, Yang Yang, Sudheer Chava
Venue
Findings of the Association for Computational Linguistics: ACL 2025
This is the first research paper to comprehensively compare language models with reasoning-reinforced LMs, presenting an empirical study of 23 foundation LMs across 20 core NLP tasks in finance. FLaME provides a grounded evaluation framework that reveals significant differences in financial reasoning, with leading models reaching accuracy near 80%.