Industry News alignment agents safety coding_agents

OpenAI now monitors 99.9% of internal coding agent traffic for misalignment by reviewing full trajectories with frontier

OpenAI now monitors 99.9% of internal coding agent traffic for misalignment by reviewing full trajectories with frontier models, escalating suspicious behavior.

Original Post

Sharing some of the work I’ve been doing at OpenAI: we now monitor 99.9% of internal coding traffic for misalignment using our most powerful models, reviewing full trajectories to catch suspicious behavior, escalate serious cases quickly, and strengthen our safeguards over time. https://t.co/5wAYJObCDK

Source: X (@OpenAI)
Author: OpenAI
Date: 2026-03-19
Relevance: 8
Topics: alignment, agents, safety, coding_agents

View Original Post ↗

OpenAI now monitors 99.9% of internal coding agent traffic for misalignment by reviewing full trajectories with frontier

Related Posts

LiteParse is a fully open-source, model-free document parsing tool that requires...

LlamaIndex is open-sourcing a lightweight core of LlamaParse technology as LiteP...

DeepMind and Kaggle launch a global hackathon with $200k in prizes to crowdsourc...