AI Safety: Constitutional AI vs Human Feedback

Super Prompt: Generative AI

Examining generative AI—not to hype breakthroughs or warn of apocalypse, but to understand how things actually work. Mental models over hot takes. Technology specifics over marketing fog.

Welcome to Super Prompt. Hosted by Tony Wan, ex-Silicon Valley insider.

For The Independents—people who think for themselves, refuse narrative capture, and value depth over certainty.

Independent analysis. Unsponsored. Weekly.

The future belongs to better questions.

AI Safety: Constitutional AI vs Human Feedback

June 17, 2024 • Tony Wan • Season 1 • Episode 27

Duration: 16:38

With great power comes great responsibility. How do leading AI companies implement safety and ethics as language models scale? OpenAI combines its Model Spec with RLHF (Reinforcement Learning from Human Feedback); Anthropic uses Constitutional AI. This solo episode on AI alignment compares the two technical approaches to maximizing usefulness while minimizing harm.

REFERENCES

OpenAI Model Spec

https://cdn.openai.com/spec/model-spec-2024-05-08.html#overview

Anthropic Constitutional AI

https://www.anthropic.com/news/claudes-constitution

To stay in touch, sign up for our newsletter at https://www.superprompt.fm

Tony Wan

Host