...

Causal Scan for LLM Misbehavior Detection


LLMScan: Causal Scan for LLM Misbehavior Detection, by Mengdi Zhang and three other authors


Abstract: Despite the success of Large Language Models (LLMs) across various fields, their potential to generate untruthful, biased and harmful responses poses significant risks, particularly in critical applications. This highlights the urgent need for systematic methods to detect and prevent such misbehavior. While existing approaches target specific issues such as harmful responses, this work introduces LLMScan, an innovative LLM monitoring technique based on causality analysis, offering a comprehensive solution. LLMScan systematically monitors the inner workings of an LLM through the lens of causal inference, operating on the premise that the LLM's 'brain' behaves differently when misbehaving. By analyzing the causal contributions of the LLM's input tokens and transformer layers, LLMScan effectively detects misbehavior. Extensive experiments across various tasks and models reveal clear distinctions in the causal distributions between normal behavior and misbehavior, enabling the development of accurate, lightweight detectors for a variety of misbehavior detection tasks.
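The idea of measuring causal contributions by intervention can be illustrated with a minimal sketch. This is not the authors' implementation: the toy "model", the layer functions, the zero-vector token baseline, and the names `forward` and `causal_map` are all assumptions made for illustration. The sketch intervenes on one input token or one layer at a time and records how far the output moves from the unmodified run.

```python
import math

def toy_layers():
    # Three hypothetical residual "layers" (assumed for illustration only).
    # Each maps a hidden state (list of floats) to an additive update.
    return [
        lambda h: [0.5 * x for x in h],
        lambda h: [math.tanh(x) for x in h],
        lambda h: [0.1 * sum(h)] * len(h),
    ]

def forward(tokens, layers, skip_layer=None):
    # Hidden state starts as the mean of the token vectors.
    h = [sum(col) / len(tokens) for col in zip(*tokens)]
    for i, layer in enumerate(layers):
        if i == skip_layer:
            continue  # intervention: bypass this layer entirely
        h = [x + u for x, u in zip(h, layer(h))]
    return h

def distance(a, b):
    # Euclidean distance between two output vectors.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def causal_map(tokens, layers):
    # Effect of an intervention = shift in output relative to the clean run.
    base = forward(tokens, layers)
    zero = [0.0] * len(tokens[0])  # assumed neutral replacement token
    token_effects = [
        distance(base, forward(tokens[:i] + [zero] + tokens[i + 1:], layers))
        for i in range(len(tokens))
    ]
    layer_effects = [
        distance(base, forward(tokens, layers, skip_layer=i))
        for i in range(len(layers))
    ]
    return token_effects + layer_effects

# Example input: three 2-dimensional token vectors (made-up values).
tokens = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
features = causal_map(tokens, toy_layers())
print(features)
```

In a setup like the one the abstract describes, a feature vector of this kind, computed per prompt, could serve as input to a small classifier that separates normal behavior from misbehavior. The choice of classifier and of distance measure here is left open and is not taken from the paper.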

Submission history

From: Mengdi Zhang
[v1] Tue, 22 Oct 2024 02:27:57 UTC (6,054 KB)
[v2] Wed, 23 Oct 2024 03:41:49 UTC (6,054 KB)
