AI will add to the e-waste problem. Here’s what we can do about it.

E-waste is the term to describe things like air conditioners, televisions, and personal electronic devices such as cell phones and ...

Open Source Replication of Anthropic’s Crosscoder paper for model-diffing — AI Alignment Forum

Intro Anthropic recently released an exciting mini-paper on crosscoders (Lindsey et al.). In this post, we open source a model-diffing ...

Video lectures on the learning-theoretic agenda — AI Alignment Forum

This is a YouTube playlist of recorded lectures on the learning-theoretic AI alignment agenda (LTA) I gave for my MATS ...

AI Safety Camp 10 — AI Alignment Forum

We are pleased to announce that the 10th version of the AI Safety Camp is now entering the team member ...

BIG-Bench Canary Contamination in GPT-4 — AI Alignment Forum

The BIG-Bench canary string is a unique string included in documents intended to be excluded from the training datasets of ...

How Wayve’s driverless cars will meet one of their biggest challenges yet

Figuring out why the model behaves as it does tells Wayve what kinds of scenarios require extra help. Using a ...

Self-prediction acts as an emergent regularizer — AI Alignment Forum

TL;DR: In our recent work with Professor Michael Graziano (arXiv, thread), we show that adding an auxiliary self-modeling objective to supervised ...

The Download: Wayve’s driverless ambitions, and AI models built by kids

The UK driverless-car startup Wayve is headed west. The firm’s cars learned to drive on the streets of London. But ...

Avoiding value decay in digital transformation

“Most implementations are viewed as IT projects,” says Tim Hertzig, a principal in Deloitte’s Technology practice and global product owner ...

A bird’s eye view of ARC’s research — AI Alignment Forum

This post includes a “flattened version” of an interactive diagram that cannot be displayed on this site. I recommend reading ...