Combining Base and Instruction-Tuned Language Models for Better Synthetic Data Generation

[Submitted on 3 Feb 2025 (v1), last revised 5 Feb 2025 (this version, v2)]

Rule-Guided Retrieval-Augmented Generation with Language Models for Question Answering

[Submitted on 15 Oct 2024 (v1), last revised 5 Feb 2025 (this version, v2)]

An Integrated Toolkit for Evaluating Jailbreak Attempts Against Large Language Models

[Submitted on 13 Jun 2024 (v1), last revised 4 Feb 2025 (this version, v2)]

Exploring the Role of Punctuation in Semantic Processing

[Submitted on 10 Jan 2025 (v1), last revised 2 Feb 2025 (this version, v3)]

[2501.07927] Gandalf the Red: Adaptive Security for LLMs

[Submitted on 14 Jan 2025 (v1), last revised 2 Feb 2025 (this version, v2)] Authors: Niklas Pfister, Václav Volhejn, Manuel ...

Language Bias in Self-Supervised Learning For Automatic Speech Recognition

arXiv:2501.19321v1 Announce Type: cross Abstract: Self-supervised learning (SSL) is used in deep learning to train on large datasets without ...

Representation Space Guided Reinforcement Learning for Interpretable LLM Jailbreaking

[Submitted on 28 Jan 2025 (v1), last revised 30 Jan 2025 (this version, v2)]

DFPE: A Diverse Fingerprint Ensemble for Enhancing LLM Performance

arXiv:2501.17479v1 Announce Type: cross Abstract: Large Language Models (LLMs) have shown remarkable capabilities across various natural language processing tasks ...

Virus: Harmful Fine-tuning Attack for Large Language Models Bypassing Guardrail Moderation

arXiv:2501.17433v1 Announce Type: cross Abstract: Recent research shows that Large Language Models (LLMs) are vulnerable to harmful fine-tuning attacks ...

Direct Schema Linking via Question Enrichment in Text-to-SQL

[Submitted on 25 Sep 2024 (v1), last revised 28 Jan 2025 (this version, v2)]