AI Pulse

An industry pulse of what is happening in AI.

Sim2Reason: Solving Physics Olympiad via RL on Physics Simulators

x.com | Apr 16, 2026, 5:29 PM

Sim2Reason trains LLMs inside MuJoCo physics simulators, zero human annotation. Generate scenes, auto-label QA pairs, RL-train on synthetic data. Zero-shot: +5-10% IPhO, +17.9% JEEBench, +4.4% MATH 500. Outperforms models trained on curated real-world QA pairs. CMU + Lambda.

1 | Apr 16, 2026, 10:58 PM

The AI Compute Crisis, 2026

tomtunguz.com | Apr 16, 2026

Blackwell chips: $4.08/hr, up 48% from $2.75 just 2 months ago. CoreWeave +20% & extended minimum contracts from 1 to 3 years. Anthropic limits Mythos to ~40 orgs. OpenAI CFO: "We're making some very tough trades...because we don't have enough compute." The age of abundant AI is over.
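The quoted 48% increase can be checked directly from the two hourly rates in the summary. A minimal sketch; the helper name `percent_change` is ours for illustration, and the rates are taken as stated above:

```python
def percent_change(old: float, new: float) -> float:
    """Return the relative change from old to new, in percent."""
    return (new - old) / old * 100


# Hourly rates as quoted in the summary (USD per hour).
old_rate = 2.75
new_rate = 4.08

change = percent_change(old_rate, new_rate)
print(f"{old_rate:.2f} -> {new_rate:.2f} USD/hr is a {change:.1f}% increase")
```

Rounded to a whole number, the result agrees with the 48% figure in the summary.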

2 | Apr 16, 2026, 10:13 PM

Android CLI: Build Android apps 3x faster using any agent

android-developers.googleblog.com | Apr 16, 2026

Google launches Android CLI in preview: a terminal interface for agent-driven Android development. 70% fewer LLM tokens for project setup and 3x faster tasks in internal tests. Includes Android Skills (SKILL.md files) and Android Knowledge Base. Works with Claude Code, Codex, Gemini CLI, and others.

2 | Apr 16, 2026, 9:30 PM

A new way to explore the web with AI Mode in Chrome

blog.google | Apr 16, 2026

AI Mode in Chrome now opens web pages side-by-side with your search panel -- no more switching tabs to follow a link. A new plus menu lets you add open tabs, images, and PDFs as context for follow-up questions. Canvas and image creation tools are accessible from the plus menu anywhere in Chrome. Rolling out in the US today.

2 | Apr 16, 2026, 8:46 PM

The Genie and the Monkey's Paw

www.danshapiro.com | Apr 16, 2026

For a long time, GPT has been a monkey's paw. Claude has been a genie. Opus 4.7 changes that: now "substantially better at following instructions." Prompts for earlier models may now produce unexpected results. Anthropic built a genie. Today they shipped something closer to a paw.

1 | Apr 16, 2026, 8:05 PM

Best practices for using Claude Opus 4.7 with Claude Code

claude.com | Apr 16, 2026

Opus 4.7 in Claude Code defaults to a new xhigh effort level (between high and max). Fixed Extended Thinking is gone -- adaptive thinking lets the model decide when to reason. It calls tools less and spawns fewer subagents by default. Specify tasks fully in the first turn.

3 | Apr 16, 2026, 7:23 PM

Codex for (almost) everything

openai.com | Apr 16, 2026

Codex update: background computer use, in-app browser for frontend iteration, image generation via gpt-image-1.5, 90+ new plugins (Atlassian, GitLab, CircleCI, Microsoft Suite, Superpowers), memory across sessions, automations that can schedule work across days or weeks, and proactive task suggestions.

3 | Apr 16, 2026, 6:40 PM

Qwen3.6-35B-A3B on my laptop drew me a better pelican than Claude Opus 4.7

simonwillison.net | Apr 16, 2026, 5:16 PM

A 21GB quantized Qwen3.6-35B-A3B on a MacBook M5 beat Claude Opus 4.7 on the pelican-riding-a-bicycle SVG benchmark -- the flamingo-on-unicycle backup test went to Qwen too. This benchmark used to roughly track general model quality; that correlation may now be broken.
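For a rough sense of how aggressive that quantization is, the file size and the parameter count imply an average number of bits per weight. A back-of-the-envelope sketch, assuming 1 GB = 10^9 bytes, exactly 35 billion parameters, and that the whole 21GB file is weights (metadata and any mixed-precision layers are ignored); none of these assumptions come from the post itself:

```python
# Assumptions (not from the source): decimal gigabytes, a round 35e9
# parameter count, and a file that contains nothing but weights.
FILE_SIZE_BYTES = 21e9
TOTAL_PARAMS = 35e9

bits_per_param = FILE_SIZE_BYTES * 8 / TOTAL_PARAMS
print(f"Average storage: about {bits_per_param:.1f} bits per parameter")
```

Under these assumptions the average comes out just under 5 bits per parameter, compared with 16 bits for unquantized half-precision weights.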

4 | Apr 16, 2026, 5:58 PM

AI cybersecurity is not proof of work

antirez.com | Apr 16, 2026, 11:11 AM

The proof-of-work analogy for AI cybersecurity is wrong. Bug-finding is intelligence-capped, not compute-capped. Run an inferior model infinite times and it still won't find the OpenBSD SACK bug's multi-step chain. "More GPU wins" is the wrong frame. Better models win.

4 | Apr 16, 2026, 5:14 PM

Qwen3.6-35B-A3B: Agentic Coding Power, Now Open to All

qwen.ai | Apr 15, 2026

Qwen open-sources Qwen3.6-35B-A3B: a sparse MoE (35B total / 3B active) that rivals much larger dense models on agentic coding. SWE-bench Verified 73.4%, Terminal-Bench 2.0 51.5%. Natively multimodal. Open weights on Hugging Face.

5 | Apr 16, 2026, 4:31 PM