View a PDF of the paper titled Beyond Jailbreaking: Auditing Contextual Privacy in LLM Agents, by Saswat Das and 2 other authors
Abstract:LLM agents have begun to appear as personal assistants, customer service bots, and clinical aides. While these applications deliver substantial operational benefits, they also require continuous access to sensitive data, which increases the likelihood of unauthorized disclosures. Moreover, these disclosures go beyond mere explicit disclosure, leaving open avenues for gradual manipulation or sidechannel information leakage. This study proposes an auditing framework for conversational privacy that quantifies an agent’s susceptibility to these risks. The proposed Conversational Manipulation for Privacy Leakage (CMPL) framework is designed to stress-test agents that enforce strict privacy directives against an iterative probing strategy. Rather than focusing solely on a single disclosure event or purely explicit leakage, CMPL simulates realistic multi-turn interactions to systematically uncover latent vulnerabilities. Our evaluation on diverse domains, data modalities, and safety configurations demonstrates the auditing framework’s ability to reveal privacy risks that are not deterred by existing single-turn defenses, along with an in-depth longitudinal study of the temporal dynamics of leakage, strategies adopted by adaptive adversaries, and the evolution of adversarial beliefs about sensitive targets. In addition to introducing CMPL as a diagnostic tool, the paper delivers (1) an auditing procedure grounded in quantifiable risk metrics and (2) an open benchmark for evaluation of conversational privacy across agent implementations.
Submission history
From: Saswat Das [view email]
[v1]
Wed, 11 Jun 2025 20:47:37 UTC (712 KB)
[v2]
Sat, 14 Jun 2025 01:16:24 UTC (712 KB)
[v3]
Sat, 27 Sep 2025 20:28:18 UTC (2,755 KB)
Source link
#Auditing #Contextual #Privacy #LLM #Agents