Large Language Models (LLMs) in Finance – A Survey of Applications, Methods, and Challenges

Abstract:

Financial institutions have long employed data analytics and machine learning to gain informational advantages in areas such as risk assessment, trading, portfolio management, and forecasting. Historically, these approaches focused on numerical data, such as market prices, macroeconomic indicators, or company fundamentals. However, the proliferation of textual sources (annual reports, filings, news articles, social media, regulatory documents) has prompted analysts to seek advanced natural language processing (NLP) methods that can extract insights from unstructured text (Gentzkow et al., 2019). Similarly, in the accounting domain, researchers have explored text-based approaches to analyze corporate disclosures (Bochkay et al., 2023), complementing the broader shift toward textual analytics in finance. Early work featured statistical techniques (e.g., n-gram models, latent topic models, supervised classification; see Das et al., 2014), while the subsequent adoption of deep learning, such as LSTMs and convolutional neural networks, further advanced performance.

An especially impactful milestone came with the transformer architecture, exemplified by BERT (Devlin et al., 2019), which enables contextual understanding of language. In finance, models like FinBERT demonstrated how domain-specific training on financial corpora can boost accuracy in sentiment analysis of corporate disclosures, market news, and related text (Araci, 2019; Huang et al., 2023). More recently, a new generation of Large Language Models (LLMs)—including GPT-3, GPT-4 (OpenAI, 2023), and other massive transformer-based systems (Brown et al., 2020)—has emerged. These models are trained on vast corpora comprising billions or even trillions of tokens, showing remarkable abilities to generate fluent text, solve complex language tasks, and perform new tasks from few or no labeled examples (few-shot or zero-shot learning).
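To make the few-shot idea concrete in a financial setting, the sketch below assembles a few-shot prompt for sentiment classification of financial statements, of the kind one might send to a general-purpose LLM. The prompt template, the labeled example statements, and the function name are illustrative assumptions, not drawn from any of the cited works, and no model call is made here.

```python
# Hypothetical sketch: building a few-shot prompt for financial sentiment
# classification. The examples and template below are illustrative only.

FEW_SHOT_EXAMPLES = [
    ("The company beat earnings estimates and raised full-year guidance.",
     "positive"),
    ("Regulators opened an investigation into the bank's lending practices.",
     "negative"),
    ("The firm will report quarterly results next Tuesday.",
     "neutral"),
]

def build_sentiment_prompt(text: str) -> str:
    """Assemble a few-shot prompt asking a model to label financial sentiment."""
    lines = [
        "Classify the sentiment of each financial statement as "
        "positive, negative, or neutral.",
        "",
    ]
    # Each labeled example demonstrates the task before the new input.
    for example, label in FEW_SHOT_EXAMPLES:
        lines.append(f"Statement: {example}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The new statement is appended with an open "Sentiment:" slot
    # for the model to complete.
    lines.append(f"Statement: {text}")
    lines.append("Sentiment:")
    return "\n".join(lines)

prompt = build_sentiment_prompt("Shares fell 8% after the profit warning.")
print(prompt)
```

In a zero-shot variant, the example list would simply be empty, leaving only the task instruction and the new statement; the empirical comparisons surveyed later speak to how much those in-context examples help on financial text.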

In finance, where vast volumes of documents must be processed daily—ranging from regulatory filings to social media posts—the advent of LLMs may drive powerful automation: from summarizing corporate reports and market news to powering intelligent advisory chatbots or facilitating regulatory compliance checks. Indeed, early deployments have been documented, such as internal bank chatbots that respond to routine client inquiries and systems that analyze real-time social media sentiment about stocks (Li et al., 2023; Nie et al., 2024). Meanwhile, domain experts highlight potential concerns. First, can a general model (e.g., GPT-4) handle specialized financial language better than a custom-trained domain model (e.g., FinBERT, BloombergGPT)? Second, do LLMs present novel compliance or legal risks, especially in heavily regulated areas? Third, how can we mitigate the phenomenon of “hallucinations,” where a model generates convincing but factually incorrect answers?

This paper aims to provide a holistic survey of LLM applications in the financial sector. We review how these models assist in tasks such as sentiment detection (including market and news sentiment), textual data processing for ESG analysis, and compliance or regulatory tasks. We compare general-purpose vs. domain-specific models, citing relevant empirical benchmarks from the literature. We then examine key challenges for the financial industry, focusing on trustworthiness, interpretability, data protection, and alignment with regulatory demands. We conclude by underscoring the vast potential of LLMs to transform finance—yet emphasize the need for further research to ensure safe, effective, and responsible deployment.