Understanding and Generation in Multimodal Reasoning with the MAIA Benchmark

[Submitted on 24 Feb 2025 (v1), last revised 22 Sep 2025 (this version, v3)]
[2506.09627] Benchmarking Debiasing Methods for LLM-based Parameter Estimates

[Submitted on 11 Jun 2025 (v1), last revised 19 Sep 2025 (this version, v2)]
Structured Cross-Source Enhanced Large Language Model Reasoning

[Submitted on 23 May 2025 (v1), last revised 19 Sep 2025 (this version, v4)]
Multilingual Gender-Neutral Translation Evaluation with mGeNTE

[Submitted on 16 Jan 2025 (v1), last revised 18 Sep 2025 (this version, v3)]
Enhancing Web Agent Reasoning by Reconstructing Chain-of-Thought in Reflection, Branching, and Rollback

[Submitted on 26 May 2025 (v1), last revised 18 Sep 2025 (this version, v2)]
A System for Discourse Relation Classification

[Submitted on 15 Sep 2025 (v1), last revised 16 Sep 2025 (this version, v2)]
[2508.19594] Understanding and Leveraging the Expert Specialization of Context Faithfulness in Mixture-of-Experts LLMs

[Submitted on 27 Aug 2025 (v1), last revised 16 Sep 2025 (this version, v2)]
A Supervised Pre-training Framework for Multimodal ECG Representation Learning

[Submitted on 27 Feb 2025 (v1), last revised 16 Sep 2025 (this version, v3)]
MillStone: How Open-Minded Are LLMs?

arXiv:2509.11967v1 Announce Type: cross Abstract: Large language models equipped with Web search, information retrieval tools, and other agentic capabilities are ...
Cross-Layer Attention Probing for Fine-Grained Hallucination Detection

arXiv:2509.09700v1 Announce Type: new Abstract: With the large-scale adoption of Large Language Models (LLMs) in various applications, there is a ...