
Investigating Temporal Vulnerabilities in LLMs

by AiNEWS2025

in AI & Sentiment Analysis


[Submitted on 4 Jul 2024 (v1), last revised 23 Dec 2024 (this version, v3)]

Future Events as Backdoor Triggers: Investigating Temporal Vulnerabilities in LLMs, by Sara Price and 3 other authors


Abstract: Backdoors are hidden behaviors that are only triggered once an AI system has been deployed. Bad actors looking to create successful backdoors must design them to avoid activation during training and evaluation. Since data used in these stages often only contains information about events that have already occurred, a component of a simple backdoor trigger could be a model recognizing data that is in the future relative to when it was trained. Through prompting experiments and by probing internal activations, we show that current large language models (LLMs) can distinguish past from future events, with probes on model activations achieving 90% accuracy. We train models with backdoors triggered by a temporal distributional shift; they activate when the model is exposed to news headlines beyond their training cut-off dates. Fine-tuning on helpful, harmless and honest (HHH) data does not work well for removing simpler backdoor triggers but is effective on our backdoored models, although this distinction is smaller for the larger-scale model we tested. We also find that an activation-steering vector representing a model’s internal representation of the date influences the rate of backdoor activation. We take these results as initial evidence that, at least for models at the modest scale we test, standard safety measures are enough to remove these backdoors.
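
As a rough illustration of the two ideas the abstract mentions, probing activations for a past/future signal and steering along a date-related direction, here is a minimal sketch on synthetic data. It is not the authors' method or code: the "activations" are random vectors invented for this example, the probe is a simple difference-of-means classifier rather than whatever probe the paper uses, and every function name and constant below is hypothetical.

```python
import random

def mean_vector(rows):
    """Coordinate-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def make_synthetic_activations(n, dim, shift, rng):
    # ASSUMPTION: stand-in for real hidden states. Each coordinate is
    # Gaussian noise centred on `shift`, so the two classes differ in mean.
    return [[rng.gauss(shift, 1.0) for _ in range(dim)] for _ in range(n)]

def fit_mean_difference_probe(past, future):
    """Linear probe: direction = mean(future) - mean(past)."""
    mu_past, mu_future = mean_vector(past), mean_vector(future)
    direction = [f - p for p, f in zip(mu_past, mu_future)]
    midpoint = [(f + p) / 2 for p, f in zip(mu_past, mu_future)]
    threshold = dot(direction, midpoint)  # decision boundary halfway between means
    return direction, threshold

def predicts_future(activation, direction, threshold):
    return dot(activation, direction) > threshold

def steer(activation, direction, alpha):
    """Activation steering: shift a vector along the probe direction."""
    return [a + alpha * d for a, d in zip(activation, direction)]

rng = random.Random(0)  # fixed seed so the sketch is reproducible
DIM = 16
train_past = make_synthetic_activations(200, DIM, -0.5, rng)
train_future = make_synthetic_activations(200, DIM, +0.5, rng)
test_past = make_synthetic_activations(200, DIM, -0.5, rng)
test_future = make_synthetic_activations(200, DIM, +0.5, rng)

direction, threshold = fit_mean_difference_probe(train_past, train_future)

correct = sum(not predicts_future(a, direction, threshold) for a in test_past)
correct += sum(predicts_future(a, direction, threshold) for a in test_future)
accuracy = correct / (len(test_past) + len(test_future))

# Push "past" activations toward the "future" side of the boundary.
steered = [steer(a, direction, alpha=2.0) for a in test_past]
steered_future_rate = sum(
    predicts_future(a, direction, threshold) for a in steered
) / len(steered)

print(f"probe accuracy on synthetic data: {accuracy:.2f}")
print(f"fraction of steered 'past' vectors read as future: {steered_future_rate:.2f}")
```

In a real setting the vectors would come from a model's hidden states on dated headlines, and the steered activation would be fed back into the forward pass; the synthetic setup here only shows the arithmetic of the probe and of the steering step.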

Submission history

From: Sara Price
[v1] Thu, 4 Jul 2024 18:24:09 UTC (4,621 KB)
[v2] Wed, 17 Jul 2024 18:45:46 UTC (4,621 KB)
[v3] Mon, 23 Dec 2024 19:24:44 UTC (5,030 KB)
