Schelling game evaluations for AI control — AI Alignment Forum
Playing Schelling games is a key dangerous capability for schemers: it’s much harder to control AIs that are very capable ...
These are the best ways to measure your body fat
I, on the other hand, have never been all that muscular. I like to think I’m a healthy weight—but nurses ...
Behavioral red-teaming is unlikely to produce clear, strong evidence that models aren’t scheming — AI Alignment Forum
One strategy for mitigating risk from schemers (that is, egregiously misaligned models that intentionally try to subvert your safety measures) is ...
Roundtables: Producing Climate-Friendly Food | MIT Technology Review
The latest iteration of a legacy. Founded at the Massachusetts Institute of Technology in 1899, MIT Technology Review is a ...
Safe Predictive Agents with Joint Scoring Rules — AI Alignment Forum
Thanks to Evan Hubinger for funding this project and for introducing me to predictive models, Johannes Treutlein for many fruitful ...
Preventing Climate Change: A Team Sport
Read more from MIT Technology Review Insights & MEDC about addressing climate change impacts. About the speaker: Hilary Doe, Chief ...
The Download: Another Nobel Prize for AI, and Adobe’s anti-scraping tool
Google DeepMind founder Demis Hassabis has won a joint Nobel Prize for Chemistry for using artificial intelligence to predict the ...
Adobe wants to make it easier for artists to blacklist their work from AI scraping
Content credentials are based on C2PA, an internet protocol that uses cryptography to securely label images, video, and audio with ...
Two new datasets for evaluating political sycophancy in LLMs — AI Alignment Forum
TLDR: I created two datasets (154 and 759 statements) that can aid in measuring political sycophancy (in the US in ...
a plan to deal with AI extinction risk — AI Alignment Forum
We have published A Narrow Path: our best attempt to draw out a comprehensive plan to deal with AI extinction ...