NitiBench: A Comprehensive Study of LLM Framework Capabilities for Thai Legal Question Answering, by Pawitsapak Akarajaradwong and 6 other authors
Abstract: The application of large language models (LLMs) in the legal domain holds significant potential for information retrieval and question answering, yet Thai legal QA systems face challenges due to a lack of standardized evaluation benchmarks and the complexity of Thai legal structures. This paper introduces NitiBench, a benchmark comprising two datasets: NitiBench-CCL, covering general Thai financial law, and NitiBench-Tax, which includes real-world tax law cases requiring advanced legal reasoning. We evaluate retrieval-augmented generation (RAG) and long-context LLM-based approaches to address three key research questions: the impact of domain-specific components like section-based chunking and cross-referencing, the comparative performance of different retrievers and LLMs, and the viability of long-context LLMs as an alternative to RAG. Our results show that section-based chunking significantly improves retrieval and end-to-end performance, that current retrievers struggle with complex queries, and that long-context LLMs still underperform RAG-based systems in Thai legal QA. To support fair evaluation, we propose tailored multi-label retrieval metrics and the use of an LLM-as-judge for coverage and contradiction detection. These findings highlight the limitations of current Thai legal NLP solutions and provide a foundation for future research in the field. We have also open-sourced our code and datasets, which are publicly available.
Submission history
From: Pawitsapak Akarajaradwong
[v1] Sat, 15 Feb 2025 17:52:14 UTC (2,800 KB)
[v2] Tue, 4 Mar 2025 06:45:23 UTC (2,802 KB)
[v3] Sat, 8 Mar 2025 05:11:53 UTC (2,803 KB)
[v4] Thu, 21 Aug 2025 21:51:12 UTC (1,419 KB)