AI will add to the e-waste problem. Here’s what we can do about it.
E-waste is the term to describe things like air conditioners, televisions, and personal electronic devices such as cell phones and ...
Open Source Replication of Anthropic’s Crosscoder paper for model-diffing — AI Alignment Forum
Intro Anthropic recently released an exciting mini-paper on crosscoders (Lindsey et al.). In this post, we open source a model-diffing ...
Video lectures on the learning-theoretic agenda — AI Alignment Forum
This is a YouTube playlist of recorded lectures on the learning-theoretic AI alignment agenda (LTA) I gave for my MATS ...
AI Safety Camp 10 — AI Alignment Forum
We are pleased to announce that the 10th version of the AI Safety Camp is now entering the team member ...
BIG-Bench Canary Contamination in GPT-4 — AI Alignment Forum
The BIG-Bench canary string is a unique string included in documents intended to be excluded from the training datasets of ...
How Wayve’s driverless cars will meet one of their biggest challenges yet
Figuring out why the model behaves as it does tells Wayve what kinds of scenarios require extra help. Using a ...
Self-prediction acts as an emergent regularizer — AI Alignment Forum
TL;DR: In our recent work with Professor Michael Graziano (arXiv, thread), we show that adding an auxiliary self-modeling objective to supervised ...
The Download: Wayve’s driverless ambitions, and AI models built by kids
The UK driverless-car startup Wayve is headed west. The firm’s cars learned to drive on the streets of London. But ...
Avoiding value decay in digital transformation
“Most implementations are viewed as IT projects,” says Tim Hertzig, a principal in Deloitte’s Technology practice and global product owner ...
A bird’s eye view of ARC’s research — AI Alignment Forum
This post includes a “flattened version” of an interactive diagram that cannot be displayed on this site. I recommend reading ...