Multi-Agent LLM Defense against Jailbreak Attacks

[Submitted on 2 Mar 2024 (v1), last revised 14 Nov 2024 (this version, v2)]
[2407.15339] Deep Learning for Economists

[Submitted on 22 Jul 2024 (v1), last revised 13 Nov 2024 (this version, v3)]
[2411.07820] Query Optimization for Parametric Knowledge Refinement in Retrieval-Augmented Large Language Models

[Submitted on 12 Nov 2024 (v1), last revised 13 Nov 2024 (this version, v2)]
[2405.00722] LLMs for Generating and Evaluating Counterfactuals: A Comprehensive Study

[Submitted on 26 Apr 2024 (v1), last revised 12 Nov 2024 (this version, v2)]
[2311.07468] An Analysis and Mitigation of the Reversal Curse

[Submitted on 13 Nov 2023 (v1), last revised 10 Nov 2024 (this version, v3)]
A Technology Probe for Resolving Value Conflicts through Expert-Driven and User-Driven Strategies in AI Companion Applications

LLMs as Research Tools: A Large Scale Survey of Researchers' Usage and Perceptions

arXiv:2411.05025v1 Announce Type: new Abstract: The rise of large language models (LLMs) has led many researchers to consider their ...
[2406.11944] Transcoders Find Interpretable LLM Feature Circuits

[Submitted on 17 Jun 2024 (v1), last revised 6 Nov 2024 (this version, v2)]
A Chinese Dialogue Dataset Towards Multi-turn Topic-driven Conversation

[Submitted on 3 Mar 2021 (v1), last revised 7 Nov 2024 (this version, v3)]
How Transformers Solve Propositional Logic Problems: A Mechanistic Analysis

arXiv:2411.04105v1 Announce Type: cross Abstract: Large language models (LLMs) have shown excellent performance on tasks that require planning and reasoning. ...