📅 Today's Highlights
Anthropic Fellows introduce Model Spec Midtraining (MSM), a method that teaches AI models the reasoning and values behind desired behaviors to improve generalization beyond standard example-based alignment.
Anthropic Fellows research demonstrates that a model deliberately underperforming can be trained to near-full capability even when supervised only by weaker models.
Joint research from MATS, Redwood, and Anthropic shows that a strategically sandbagging model can be trained to stop sandbagging using only weaker models as supervisors.
OpenAI and major chip/cloud vendors co-release MRC, an open networking protocol designed to reduce wasted GPU time and improve reliability in large AI training clusters.
Using Model Spec Midtraining (MSM), Anthropic finds that explaining underlying values—rather than just rules—yields better generalization in alignment training.
🚀 Model Releases
xAI releases Grok 4.3, claiming top rankings on agentic tool calling, instruction following, and enterprise domain leaderboards including case law and corporate finance.
GPT-5.5 Instant becomes ChatGPT's new default model, offering smarter answers, fewer hallucinations, and improved personalization controls.
GPT-5.5 Instant is rolling out as the default ChatGPT model for all users, with personalization improvements for Plus/Pro and expanded memory sources.
OpenAI is rolling out GPT-5.5 Instant in ChatGPT, featuring smarter, more personalized, and more concise responses in a warmer tone.
Google announces Gemini Embedding 2, its first natively multimodal embedding model, now publicly available and already being used for video analysis and visual shopping applications.
OpenAI released the system card for GPT-5.5 Instant, documenting safety evaluations and model behavior for this new model variant.
🔧 Agent Infrastructure
An AI red teaming agent built on the Dreadnode SDK automates adversarial workflow construction using 45+ attacks and 450+ transforms, reducing manual red teaming from weeks to hours for agentic systems.
A framework for automated multi-agent system composition that replaces manual planning and agent selection with an LLM-driven planner, dynamic call graphs, and automated orchestration.
ArizeAI's launch of Alyx v2 revealed that small changes to prompts, tool descriptions, or model behavior can cause regressions multiple steps later in agent workflows, forcing a rethink of testing strategy.
Probus is a multi-agent vulnerability scanner that discovered real vulnerabilities and got security fixes merged in Vercel AI SDK, n8n, and LangGraph, demonstrating the practical value of agentic security research.
OpenAI open-sources MRC (Multipath Reliable Connection), a new supercomputer networking protocol via OCP designed to boost resilience and performance in large-scale AI training clusters.
Inerrata proposes a collective knowledge layer for coding agents, enabling them to share and reuse solutions across sessions via an Ontological Knowledge Network and MCP-based graph search. It addresses the persistent problem of agents losing learned context on session reset.
📄 Research Papers
Anthropic Fellows introduce Model Spec Midtraining (MSM), a method that teaches AI models the reasoning and values behind desired behaviors to improve generalization beyond standard example-based alignment.
Anthropic Fellows research demonstrates that a model deliberately underperforming can be trained to near-full capability even when supervised only by weaker models.
Joint research from MATS, Redwood, and Anthropic shows that a strategically sandbagging model can be trained to stop sandbagging using only weaker models as supervisors.
Using Model Spec Midtraining (MSM), Anthropic finds that explaining underlying values—rather than just rules—yields better generalization in alignment training.
OpenSeeker-v2 demonstrates that high-quality, high-difficulty trajectory data with knowledge graph scaling and expanded toolsets enables SFT alone to train competitive frontier search agents without expensive RL pipelines.
MOSAIC-Bench reveals that coding agents can be manipulated into producing exploitable code through multi-step innocuous-looking task decompositions, introducing 199 three-stage attack chains across 10 web substrates and 31 CWE classes for safety evaluation.
📰 Industry News
OpenAI and major chip/cloud vendors co-release MRC, an open networking protocol designed to reduce wasted GPU time and improve reliability in large AI training clusters.
GPT-5.5 launch metrics show API revenue growing 2x faster than any prior release, with Codex doubling revenue in under a week driven by enterprise agentic coding demand.
MRC is already deployed across OpenAI's largest supercomputers at Oracle and Microsoft Fairwater sites, and is now publicly available.
Perplexity integrates premium medical journals (NEJM, BMJ Group, and 9 more) as cited sources for health queries, targeting clinical-grade answers.
OpenAI and PwC are partnering to deploy AI agents for enterprise finance automation, targeting forecasting, controls, and CFO-function modernization. The partnership signals growing enterprise adoption of agentic workflows in high-stakes domains.
LlamaIndex CEO argues in VentureBeat that unstructured data locked in file formats is the core bottleneck in the LLM stack, regardless of which frontier model is used.