
On this page
- ANNOUNCEMENTS 🔊
- TOP PICKS 📑 🎧
- NEWS 🗞️
- Trump’s AI strategy trades guardrails for growth in race against China
- A new study just upended AI safety
- Zuckerberg signals Meta won’t open source all of its ‘superintelligence’ AI models
- Flaw in Gemini CLI coding tool could allow hackers to run nasty commands
- AI models with systemic risks given pointers on how to comply with EU AI rules
ANNOUNCEMENTS 🔊
Hear About Different AI Futures from Our Co-Founder Bengüsu Özcan! 🌟
Our co-founder Bengüsu Özcan, who published a study on various geopolitical and societal scenarios ranging from futures where AI development slows down to futures where it escapes human control, will share these scenarios in a special session organized by BlueDot Impact.
🚀 Don’t miss out!
🗓️ Deadline: August 13th
Trends in AI: AMA with the director of Epoch
During this in-person Ask Me Anything session with the Director of Epoch AI, Jaime Sevilla will discuss current trends in AI, including hardware consumption, economic impact, development of AI systems, and data challenges.
🗓️ Register by: August 19th
AI Governance Taskforce: Autumn 2025
Run by Arcadia Impact, this is a remote, part-time career development program for experienced professionals looking to pursue a career in AI governance.
Participants will produce relevant policy research contributions in small teams led by Arcadia staff, with research oversight and high-level direction from expert partners at AI governance organisations.
🗓️ Register by: August 17th
Supervised Program for Alignment Research (SPAR): Fall 2025
Program designed to provide students and professionals with the opportunity to work with experienced mentors, developing valuable experience in AI safety research. Projects typically last for 3 months, with an average time commitment of 5–15 hours per week.
🗓️ Register by: August 20th
Athena AI Alignment Mentorship Program: Fall 2025
10-week remote mentorship program for women seeking to deepen their research skills and expand their professional network in technical AI alignment research. The program includes weekly research talks, structured support for research development and community-building, and a 1-week in-person retreat with established researchers.
🗓️ Register by: August 25th
MATS: Neel Nanda’s Winter 2025 Stream
MATS (ML Alignment & Theory Scholars) is a prestigious research program connecting scholars with top mentors in AI alignment. This winter stream, led by Neel Nanda (a leading researcher in mechanistic interpretability) focuses on teaching participants how to understand and interpret how AI models work internally.
🗓️ Register by: August 29th
TOP PICKS 📑 🎧
GPT-5: a small step for intelligence, a giant leap for normal people
GPT-5 launched on August 7, delivering incremental improvements rather than the revolutionary leap many anticipated.
Peter Wildeford examines why and how the model disappoints on intelligence metrics. Wildeford notes that although the model isn’t particularly impressive for AI researchers seeking intelligence breakthroughs, it is a successful product in terms of cost, efficiency and reliability: Through faster response times, reduced hallucinations, and a unified routing system that automatically selects between reasoning and fast models, GPT-5 provides a more efficient and reliable experience for what Wildeford calls “normal people” rather than AI elites.

First notable government investment in AI alignment from the UK AI Security Institute
The UK AI Security Institute, the first national institute dedicated to tackling advanced AI risks and safety, has taken another bold step in shaping the global AI governance landscape. It has unveiled a £15 million funding programme to support projects addressing key chapters of AI alignment. Applications are accepted until 10 September 2025.
NEWS 🗞️
Trump’s AI strategy trades guardrails for growth in race against China
The Trump administration’s AI Action Plan prioritizes rapid AI development and infrastructure expansion to compete with China, shifting focus away from safety and regulatory guardrails.
The plan emphasizes deregulation, large-scale data center construction, and national security, with less attention to risk mitigation and ethical oversight.
AI alignment, safety, and governance concerns are largely sidelined, raising alarms among experts about increased risks from unchecked AI advancement.
A new study just upended AI safety
AI models can acquire and propagate harmful behaviors, such as recommending violence or crime, even when trained on seemingly innocuous data like lists of numbers.
Dangerous behavioral contamination in AI can occur in subtle, hard-to-detect ways, complicating efforts to ensure safety and alignment.
The increasing use of synthetic or AI-generated data in training raises the risk of hidden unsafe behaviors emerging in AI systems.
Zuckerberg signals Meta won’t open source all of its ‘superintelligence’ AI models
Meta will not open source all of its future “superintelligence” AI models due to heightened safety concerns.
This represents a shift from Meta’s previous commitment to open AI development, reflecting increased awareness of risks as
AI capabilities advance.
Zuckerberg emphasized the importance of rigorous risk mitigation and selective public release of AI models.
Flaw in Gemini CLI coding tool could allow hackers to run nasty commands
A critical vulnerability in Google’s Gemini CLI allowed prompt injection attacks via natural-language instructions hidden in code package README files.
Attackers could exploit this flaw to bypass security controls and execute harmful commands on users’ devices, such as stealing sensitive data.
The incident demonstrates how AI tools can be manipulated through indirect inputs, raising significant AI safety and alignment concerns.
AI models with systemic risks given pointers on how to comply with EU AI rules
“World models,” such as Meta’s V-JEPA 2 and 1X Technologies’ Redwood AI, enable robots to understand physical reality and predict the consequences of their actions.
This advancement carries AI’s known problems like bias, fragility, and unpredictability from the digital realm into the physical one.
The development marks a shift from disembodied language models (LLMs) to embodied intelligence (robots) and necessitates a new research field called “Embodied AI Safety.”