• About
  • Advertise
  • Privacy & Policy
  • Contact
Thursday, January 15, 2026
  • Login
  • Home
    • Home – Layout 1
    • Home – Layout 2
    • Home – Layout 3
    • Home – Layout 4
    • Home – Layout 5
    • Home – Layout 6
  • News
    • All
    • Business
    • Politics
    • Science
    • World
    Hillary Clinton in white pantsuit for Trump inauguration

    Hillary Clinton in white pantsuit for Trump inauguration

    Amazon has 143 billion reasons to keep adding more perks to Prime

    Amazon has 143 billion reasons to keep adding more perks to Prime

    Shooting More than 40 Years of New York’s Halloween Parade

    Shooting More than 40 Years of New York’s Halloween Parade

    These Are the 5 Big Tech Stories to Watch in 2017

    These Are the 5 Big Tech Stories to Watch in 2017

    Why Millennials Need to Save Twice as Much as Boomers Did

    Why Millennials Need to Save Twice as Much as Boomers Did

    Doctors take inspiration from online dating to build organ transplant AI

    Doctors take inspiration from online dating to build organ transplant AI

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Tech
    • All
    • Apps
    • Gadget
    • Mobile
    • Startup
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    Shadow Tactics: Blades of the Shogun Review

    Shadow Tactics: Blades of the Shogun Review

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    The Last Guardian Playstation 4 Game review

    The Last Guardian Playstation 4 Game review

    These Are the 5 Big Tech Stories to Watch in 2017

    These Are the 5 Big Tech Stories to Watch in 2017

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
  • Entertainment
    • All
    • Gaming
    • Movie
    • Music
    • Sports
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Harnessing the power of VR with Power Rangers and Snapdragon 835

    Harnessing the power of VR with Power Rangers and Snapdragon 835

    So you want to be a startup investor? Here are things you should know

    So you want to be a startup investor? Here are things you should know

  • Lifestyle
    • All
    • Fashion
    • Food
    • Health
    • Travel
    Shooting More than 40 Years of New York’s Halloween Parade

    Shooting More than 40 Years of New York’s Halloween Parade

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Why Millennials Need to Save Twice as Much as Boomers Did

    Why Millennials Need to Save Twice as Much as Boomers Did

    Doctors take inspiration from online dating to build organ transplant AI

    Doctors take inspiration from online dating to build organ transplant AI

    How couples can solve lighting disagreements for good

    How couples can solve lighting disagreements for good

    Ducati launch: Lorenzo and Dovizioso’s Desmosedici

    Ducati launch: Lorenzo and Dovizioso’s Desmosedici

    Trending Tags

    • Golden Globes
    • Game of Thrones
    • MotoGP 2017
    • eSports
    • Fashion Week
  • Review
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    Shadow Tactics: Blades of the Shogun Review

    Shadow Tactics: Blades of the Shogun Review

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    The Last Guardian Playstation 4 Game review

    The Last Guardian Playstation 4 Game review

    Intel Core i7-7700K ‘Kaby Lake’ review

    Intel Core i7-7700K ‘Kaby Lake’ review

No Result
View All Result
Ai News
Advertisement
  • Home
    • Home – Layout 1
    • Home – Layout 2
    • Home – Layout 3
    • Home – Layout 4
    • Home – Layout 5
    • Home – Layout 6
  • News
    • All
    • Business
    • Politics
    • Science
    • World
    Hillary Clinton in white pantsuit for Trump inauguration

    Hillary Clinton in white pantsuit for Trump inauguration

    Amazon has 143 billion reasons to keep adding more perks to Prime

    Amazon has 143 billion reasons to keep adding more perks to Prime

    Shooting More than 40 Years of New York’s Halloween Parade

    Shooting More than 40 Years of New York’s Halloween Parade

    These Are the 5 Big Tech Stories to Watch in 2017

    These Are the 5 Big Tech Stories to Watch in 2017

    Why Millennials Need to Save Twice as Much as Boomers Did

    Why Millennials Need to Save Twice as Much as Boomers Did

    Doctors take inspiration from online dating to build organ transplant AI

    Doctors take inspiration from online dating to build organ transplant AI

    Trending Tags

    • Trump Inauguration
    • United Stated
    • White House
    • Market Stories
    • Election Results
  • Tech
    • All
    • Apps
    • Gadget
    • Mobile
    • Startup
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    Shadow Tactics: Blades of the Shogun Review

    Shadow Tactics: Blades of the Shogun Review

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    The Last Guardian Playstation 4 Game review

    The Last Guardian Playstation 4 Game review

    These Are the 5 Big Tech Stories to Watch in 2017

    These Are the 5 Big Tech Stories to Watch in 2017

    Trending Tags

    • Nintendo Switch
    • CES 2017
    • Playstation 4 Pro
    • Mark Zuckerberg
  • Entertainment
    • All
    • Gaming
    • Movie
    • Music
    • Sports
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Harnessing the power of VR with Power Rangers and Snapdragon 835

    Harnessing the power of VR with Power Rangers and Snapdragon 835

    So you want to be a startup investor? Here are things you should know

    So you want to be a startup investor? Here are things you should know

  • Lifestyle
    • All
    • Fashion
    • Food
    • Health
    • Travel
    Shooting More than 40 Years of New York’s Halloween Parade

    Shooting More than 40 Years of New York’s Halloween Parade

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Heroes of the Storm Global Championship 2017 starts tomorrow, here’s what you need to know

    Why Millennials Need to Save Twice as Much as Boomers Did

    Why Millennials Need to Save Twice as Much as Boomers Did

    Doctors take inspiration from online dating to build organ transplant AI

    Doctors take inspiration from online dating to build organ transplant AI

    How couples can solve lighting disagreements for good

    How couples can solve lighting disagreements for good

    Ducati launch: Lorenzo and Dovizioso’s Desmosedici

    Ducati launch: Lorenzo and Dovizioso’s Desmosedici

    Trending Tags

    • Golden Globes
    • Game of Thrones
    • MotoGP 2017
    • eSports
    • Fashion Week
  • Review
    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    The Legend of Zelda: Breath of the Wild gameplay on the Nintendo Switch

    Shadow Tactics: Blades of the Shogun Review

    Shadow Tactics: Blades of the Shogun Review

    macOS Sierra review: Mac users get a modest update this year

    macOS Sierra review: Mac users get a modest update this year

    Hands on: Samsung Galaxy A5 2017 review

    Hands on: Samsung Galaxy A5 2017 review

    The Last Guardian Playstation 4 Game review

    The Last Guardian Playstation 4 Game review

    Intel Core i7-7700K ‘Kaby Lake’ review

    Intel Core i7-7700K ‘Kaby Lake’ review

No Result
View All Result
Ai News
No Result
View All Result
Home Deep Learning

Building safer dialogue agents – Google DeepMind

AiNEWS2025 by AiNEWS2025
2024-12-12
in Deep Learning
0
Building safer dialogue agents – Google DeepMind
0
SHARES
0
VIEWS
Share on FacebookShare on Twitter


Analysis

Printed
22 September 2022
Authors

The Sparrow crew

Coaching an AI to speak in a manner that’s extra useful, right, and innocent

Lately, giant language fashions (LLMs) have achieved success at a spread of duties corresponding to query answering, summarisation, and dialogue. Dialogue is a very attention-grabbing job as a result of it options versatile and interactive communication. Nonetheless, dialogue brokers powered by LLMs can specific inaccurate or invented data, use discriminatory language, or encourage unsafe behaviour.

To create safer dialogue brokers, we’d like to have the ability to study from human suggestions. Making use of reinforcement studying based mostly on enter from analysis members, we discover new strategies for coaching dialogue brokers that present promise for a safer system.

In our latest paper, we introduce Sparrow – a dialogue agent that’s helpful and reduces the chance of unsafe and inappropriate solutions. Our agent is designed to speak with a person, reply questions, and search the web utilizing Google when it’s useful to search for proof to tell its responses.

Our new conversational AI mannequin replies by itself to an preliminary human immediate.

Sparrow is a analysis mannequin and proof of idea, designed with the aim of coaching dialogue brokers to be extra useful, right, and innocent. By studying these qualities in a common dialogue setting, Sparrow advances our understanding of how we will prepare brokers to be safer and extra helpful – and finally, to assist construct safer and extra helpful synthetic common intelligence (AGI).

Sparrow declining to reply a doubtlessly dangerous query.

How Sparrow works

Coaching a conversational AI is an particularly difficult downside as a result of it’s troublesome to pinpoint what makes a dialogue profitable. To deal with this downside, we flip to a type of reinforcement studying (RL) based mostly on individuals’s suggestions, utilizing the examine members’ choice suggestions to coach a mannequin of how helpful a solution is.

To get this information, we present our members a number of mannequin solutions to the identical query and ask them which reply they like essentially the most. As a result of we present solutions with and with out proof retrieved from the web, this mannequin also can decide when a solution must be supported with proof.

We ask examine members to judge and work together with Sparrow both naturally or adversarially, frequently increasing the dataset used to coach Sparrow.

However rising usefulness is barely a part of the story. To be sure that the mannequin’s behaviour is protected, we should constrain its behaviour. And so, we decide an preliminary easy algorithm for the mannequin, corresponding to “do not make threatening statements” and “do not make hateful or insulting feedback”.

We additionally present guidelines round presumably dangerous recommendation and never claiming to be an individual. These guidelines have been knowledgeable by finding out current work on language harms and consulting with consultants. We then ask our examine members to speak to our system, with the purpose of tricking it into breaking the foundations. These conversations then allow us to prepare a separate ‘rule mannequin’ that signifies when Sparrow’s behaviour breaks any of the foundations.

In direction of higher AI and higher judgments

Verifying Sparrow’s solutions for correctness is troublesome even for consultants. As an alternative, we ask our members to find out whether or not Sparrow’s solutions are believable and whether or not the proof Sparrow offers really helps the reply. In accordance with our members, Sparrow offers a believable reply and helps it with proof 78% of the time when requested a factual query. This can be a large enchancment over our baseline fashions. Nonetheless, Sparrow is not immune to creating errors, like hallucinating information and giving solutions which can be off-topic typically.

Sparrow additionally has room for bettering its rule-following. After coaching, members have been nonetheless capable of trick it into breaking our guidelines 8% of the time, however in comparison with less complicated approaches, Sparrow is best at following our guidelines underneath adversarial probing. For example, our unique dialogue mannequin broke guidelines roughly 3x extra usually than Sparrow when our members tried to trick it into doing so.

Sparrow solutions a query and follow-up query utilizing proof, then follows the “Don’t faux to have a human identification” rule when requested a private query (pattern from 9 September, 2022).

Our aim with Sparrow was to construct versatile equipment to implement guidelines and norms in dialogue brokers, however the specific guidelines we use are preliminary. Growing a greater and extra full algorithm would require each professional enter on many matters (together with coverage makers, social scientists, and ethicists) and participatory enter from a various array of customers and affected teams. We imagine our strategies will nonetheless apply for a extra rigorous rule set.

Sparrow is a big step ahead in understanding learn how to prepare dialogue brokers to be extra helpful and safer. Nonetheless, profitable communication between individuals and dialogue brokers mustn’t solely keep away from hurt however be aligned with human values for efficient and helpful communication, as mentioned in current work on aligning language models with human values.

We additionally emphasise {that a} good agent will nonetheless decline to reply questions in contexts the place it’s applicable to defer to people or the place this has the potential to discourage dangerous behaviour. Lastly, our preliminary analysis targeted on an English-speaking agent, and additional work is required to make sure related outcomes throughout different languages and cultural contexts.

Sooner or later, we hope conversations between people and machines can result in higher judgments of AI behaviour, permitting individuals to align and enhance techniques that is likely to be too advanced to grasp with out machine assist.

Wanting to discover a conversational path to protected AGI? We’re currently hiring research scientists for our Scalable Alignment crew.

Source link

#Constructing #safer #dialogue #brokers #Google #DeepMind

Previous Post

Metaverse And VR Funding Slides Further As Even Apple Can’t Make A Hit

Next Post

What is ChatGPT? How the world’s most popular AI chatbot can benefit you

AiNEWS2025

AiNEWS2025

Next Post
What is ChatGPT? How the world’s most popular AI chatbot can benefit you

What is ChatGPT? How the world's most popular AI chatbot can benefit you

Stay Connected test

  • 23.9k Followers
  • 99 Subscribers
  • Trending
  • Comments
  • Latest
A tiny new open source AI model performs as well as powerful big ones

A tiny new open source AI model performs as well as powerful big ones

0
Water Cooler Small Talk: The Birthday Paradox 🎂🎉 | by Maria Mouschoutzi, PhD | Sep, 2024

Water Cooler Small Talk: The Birthday Paradox 🎂🎉 | by Maria Mouschoutzi, PhD | Sep, 2024

0
Ghost of Yōtei: The acclaimed Ghost of Tsushima is getting a sequel

Ghost of Yōtei: The acclaimed Ghost of Tsushima is getting a sequel

0
Best Headphones for Working Out (2024): Bose, Shokz, JLab

Best Headphones for Working Out (2024): Bose, Shokz, JLab

0
10 Breakthrough Technologies 2026 | MIT Technology Review

10 Breakthrough Technologies 2026 | MIT Technology Review

2026-01-15
What Is a Knowledge Graph — and Why It Matters

What Is a Knowledge Graph — and Why It Matters

2026-01-15
A British redcoat’s lost memoir resurfaces

A British redcoat’s lost memoir resurfaces

2026-01-15
Opinion: Why some “breakthrough” technologies don’t work out

Opinion: Why some “breakthrough” technologies don’t work out

2026-01-15

Recent News

10 Breakthrough Technologies 2026 | MIT Technology Review

10 Breakthrough Technologies 2026 | MIT Technology Review

2026-01-15
What Is a Knowledge Graph — and Why It Matters

What Is a Knowledge Graph — and Why It Matters

2026-01-15
A British redcoat’s lost memoir resurfaces

A British redcoat’s lost memoir resurfaces

2026-01-15
Opinion: Why some “breakthrough” technologies don’t work out

Opinion: Why some “breakthrough” technologies don’t work out

2026-01-15
Footer logo

We bring you the best Premium WordPress Themes that perfect for news, magazine, personal blog, etc. Check our landing page for details.

Follow Us

Browse by Category

  • AI & Cloud Computing
  • AI & Cybersecurity
  • AI & Sentiment Analysis
  • AI Applications
  • AI Ethics
  • AI Future Predictions
  • AI in Education
  • AI in Fintech
  • AI in Gaming
  • AI in Healthcare
  • AI in Startups
  • AI Innovations
  • AI News
  • AI Research
  • AI Tools & Automation
  • Apps
  • AR/VR & AI
  • Business
  • Deep Learning
  • Emerging Technologies
  • Entertainment
  • Fashion
  • Food
  • Gadget
  • Gaming
  • Health
  • Lifestyle
  • Machine Learning
  • Mobile
  • Movie
  • Music
  • News
  • Politics
  • Review
  • Robotics & Smart Systems
  • Science
  • Sports
  • Startup
  • Tech
  • Travel
  • World

Recent News

10 Breakthrough Technologies 2026 | MIT Technology Review

10 Breakthrough Technologies 2026 | MIT Technology Review

2026-01-15
What Is a Knowledge Graph — and Why It Matters

What Is a Knowledge Graph — and Why It Matters

2026-01-15
  • About
  • Advertise
  • Privacy & Policy
  • Contact

© 2026 JNews - Premium WordPress news & magazine theme by Jegtheme.

Welcome Back!

Login to your account below

Forgotten Password?

Retrieve your password

Please enter your username or email address to reset your password.

Log In
No Result
View All Result

© 2026 JNews - Premium WordPress news & magazine theme by Jegtheme.