[2409.19663] Identifying Knowledge Editing Types in Large Language Models

[Submitted on 29 Sep 2024 (v1), last revised 1 Oct 2024 (this version, v2)]

View a PDF of the paper titled Identifying Knowledge Editing Types in Large Language Models, by Xiaopeng Li and seven other authors


Abstract: Knowledge editing has emerged as an efficient technique for updating the knowledge of large language models (LLMs), attracting increasing attention in recent years. However, there is a lack of effective measures to prevent the malicious misuse of this technique, which could lead to harmful edits in LLMs. These malicious modifications could cause LLMs to generate toxic content, misleading users into inappropriate actions. To address this risk, we introduce a new task, Knowledge Editing Type Identification (KETI), aimed at identifying different types of edits in LLMs, thereby providing timely alerts to users when they encounter illicit edits. As part of this task, we propose KETIBench, which includes five types of harmful edits covering the most popular toxic types, as well as one benign factual edit. We develop four classical classification models and three BERT-based models as baseline identifiers for both open-source and closed-source LLMs. Our experimental results, across 42 trials involving two models and three knowledge editing methods, demonstrate that all seven baseline identifiers achieve decent identification performance, highlighting the feasibility of identifying malicious edits in LLMs. Additional analyses reveal that the performance of the identifiers is independent of the reliability of the knowledge editing methods and exhibits cross-domain generalization, enabling the identification of edits from unknown sources. All data and code are available at this https URL. Warning: This paper contains examples of toxic text.
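The abstract gives only counts for the experimental setup, not a breakdown. One reading consistent with those counts is a full grid of two LLMs, three editing methods, and seven identifiers, which yields 42 trials. The sketch below enumerates that grid under this assumption; every name in it is a placeholder, since the abstract does not name the models, editing methods, classifiers, or harmful edit types.

```python
from itertools import product

# Placeholder names only: the abstract states counts, not identities.
llms = [f"llm_{i}" for i in range(1, 3)]                # two edited LLMs
editing_methods = [f"editor_{i}" for i in range(1, 4)]  # three editing methods
identifiers = (
    [f"classical_{i}" for i in range(1, 5)]             # four classical classifiers
    + [f"bert_based_{i}" for i in range(1, 4)]          # three BERT-based models
)

# Label space of the KETI task: five harmful edit types plus one benign factual edit.
edit_types = [f"harmful_{i}" for i in range(1, 6)] + ["benign_factual"]

# Assumed layout: one trial per (LLM, editing method, identifier) combination.
trials = list(product(llms, editing_methods, identifiers))

print(len(identifiers), len(edit_types), len(trials))
```

Under this reading, each trial is a six-way classification problem over the labels in `edit_types`.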

Submission history

From: Xiaopeng Li
[v1]
Sun, 29 Sep 2024 11:29:57 UTC (934 KB)
[v2]
Tue, 1 Oct 2024 06:35:24 UTC (638 KB)



