Towards Stable and Efficient Transformer Training via Hybrid Normalization

[Submitted on 6 Mar 2025 (v1), last revised 8 Dec 2025 (this version, v4)]
[2409.17120] Deep Learning and Machine Learning, Advancing Big Data Analytics and Management: Handy Appetizer

[Submitted on 25 Sep 2024 (v1), last revised 8 Dec 2025 (this version, v2)] Authors: Benji Peng, Xuanhe Pan, Yizhu Wen, ...
Benchmarking Language Models on Multi-turn Mental Health Support

[Submitted on 23 Nov 2025 (v1), last revised 5 Dec 2025 (this version, v3)]
The Failure of Instruction Hierarchies in Large Language Models

[Submitted on 21 Feb 2025 (v1), last revised 4 Dec 2025 (this version, v4)]
AdaptVision: Efficient Vision-Language Models via Adaptive Visual Acquisition

arXiv:2512.03794v1 (cross-listed). Abstract: Vision-Language Models (VLMs) have achieved remarkable success in visual question answering tasks, but their reliance ...
Fine-Tuning Language Models to Resist Hallucination in Retrieval-Augmented Generation

[Submitted on 16 May 2025 (v1), last revised 3 Dec 2025 (this version, v3)]
Contextual Image Attack: How Visual Context Exposes Multimodal Safety Vulnerabilities

arXiv:2512.02973v1 (cross-listed). Abstract: While Multimodal Large Language Models (MLLMs) show remarkable capabilities, their safety alignments are susceptible to ...
Efficient Distillation of Multi-task Speech Models via Language-Specific Experts

[Submitted on 2 Nov 2023 (v1), last revised 29 Nov 2025 (this version, v4)]
H-Neurons: On the Existence, Impact, and Origin of Hallucination-Associated Neurons

arXiv:2512.01797v1 (cross-listed). Abstract: Large language models (LLMs) frequently generate hallucinations (plausible but factually incorrect outputs), undermining ...
SO-Bench: A Structural Output Evaluation of Multimodal LLMs

arXiv:2511.21750v1 (cross-listed). Abstract: Multimodal large language models (MLLMs) are increasingly deployed in real-world, agentic settings where outputs must ...