The AI Report
🏢 McKinsey cuts staff, adds 12K AI bots

Major tech advances are reshaping how we build software, but someone has to pay the bill

The AI Report
August 05, 2025

This week delivers breakthroughs across AI's creative and technical frontiers. Runway unveiled a video model that could revolutionize filmmaking, while a deep dive into AI code evaluation reveals the systematic methods driving model improvements. Meanwhile, OpenAI's quest for general-purpose agents traces back to a quiet math team that sparked Silicon Valley's latest obsession.

The Latest in AI

🎬 Runway Aleph: Hollywood's New AI Nightmare
🔬 The Science Behind Better AI Code Models
🤖 Inside OpenAI's Agent Obsession
🗞️ AI Bytes
🛠️ Top AI Tools This Week

🎬 Runway Aleph: Hollywood's New AI Nightmare

Credit: Sebastian Raschka

Runway just dropped Aleph, a video generation model that doesn't just create clips—it surgically edits existing footage with unprecedented precision. This isn't your typical text-to-video tool; it's a complete post-production studio powered by AI.

Revolutionary editing capabilities: Add or remove objects, change environments and weather, generate new camera angles, and transform character appearances—all while maintaining perfect lighting, shadows, and perspective consistency
Style transfer mastery: Apply any aesthetic to existing footage, from photorealistic to artistic styles, with the AI automatically adjusting every frame to match the new look seamlessly
Advanced object manipulation: Replace cars with horse-drawn chariots, add fireworks to night scenes, or green-screen any person or object with precise edge detection that preserves hair strands and transparent fabrics
Temporal consistency breakthrough: Unlike earlier video models that often produced flickering artifacts, Aleph is designed to keep motion and lighting coherent across entire sequences, addressing one of the field's most persistent technical hurdles
Professional workflow integration: Generate endless coverage angles, fix lighting in post, and create visual effects without expensive practical setups or VFX teams

🤔 Why It Matters:

Aleph represents a seismic shift from AI as a creative assistant to AI as a complete replacement for traditional post-production workflows. Small studios can now achieve Hollywood-level visual effects, while major productions face pressure to adapt or risk obsolescence in an AI-first filmmaking landscape.

Join over 4 million Americans who start their day with 1440 – your daily digest for unbiased, fact-centric news. From politics to sports, we cover it all by analyzing over 100 sources. Our concise, 5-minute read lands in your inbox each morning at no cost. Experience news without the noise; let 1440 help you make up your own mind. Sign up now and invite your friends and family to be part of the informed.

Join for free today!

🔬 The Science Behind Better AI Code Models

A comprehensive guide reveals how top AI labs systematically improve coding models through rigorous evaluation frameworks. Spoiler: it's not about feeding models more data—it's about scientific measurement and iterative refinement.

Evaluation frameworks as the foundation: AI labs treat code evaluations like unit tests in software development, creating structured benchmarks that define "correct" behavior and enable systematic progress tracking across model versions
The "hill climbing" methodology: Teams identify specific failure patterns, make targeted improvements through fine-tuning or prompt engineering, then re-evaluate to measure progress—turning model development into an experimental science
Golden standards and autoraters: Reference solutions provide ground truth for automated grading, while AI judges scale evaluation beyond human capacity, assessing code for correctness, style, security, and maintainability with detailed feedback
Real-world benchmark evolution: Modern evaluations have moved beyond simple algorithm puzzles to complex multi-file projects, API integrations, and debugging tasks that mirror actual software engineering work, as in the SWE-bench challenges
Data leakage dangers: Models inadvertently trained on popular benchmarks like HumanEval produce inflated scores, forcing researchers to maintain fresh evaluation sets and diverse task rotations to measure true capabilities
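To make the "evals as unit tests" idea concrete, here is a minimal sketch of a pass-rate harness. Everything in it is an assumption for illustration: the task names, the test cases, and the two hand-written "model versions" standing in for generated code. Real lab frameworks are far larger and run model output in a sandbox.

```python
from typing import Callable, Dict, List, Tuple

# Each task pairs a name with test cases: (input arguments, expected output).
# The expected outputs play the role of the "golden" reference answers.
Task = Tuple[str, List[Tuple[tuple, object]]]

TASKS: List[Task] = [
    ("add", [((1, 2), 3), ((-1, 1), 0)]),
    ("is_even", [((4,), True), ((7,), False)]),
]


def passes(solution: Callable, cases: List[Tuple[tuple, object]]) -> bool:
    """A task passes only if every test case matches the expected output."""
    for args, expected in cases:
        try:
            if solution(*args) != expected:
                return False
        except Exception:
            return False  # a crash counts as a failure
    return True


def pass_rate(solutions: Dict[str, Callable]) -> float:
    """Fraction of tasks whose candidate solution passes all of its tests."""
    passed = sum(
        1
        for name, cases in TASKS
        if name in solutions and passes(solutions[name], cases)
    )
    return passed / len(TASKS)


# Hypothetical outputs from two model versions, written by hand as stand-ins.
model_v1 = {"add": lambda a, b: a + b, "is_even": lambda n: n % 2 == 1}  # buggy
model_v2 = {"add": lambda a, b: a + b, "is_even": lambda n: n % 2 == 0}  # fixed

score_v1 = pass_rate(model_v1)
score_v2 = pass_rate(model_v2)
print(f"v1 pass rate: {score_v1:.2f}, v2 pass rate: {score_v2:.2f}")
```

Re-running the same fixed task set after each change is the "hill climbing" loop in miniature: the score only means something if the tasks stay the same and were never seen in training.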

🤔 Why It Matters:

This systematic approach explains why some AI coding tools actually work in production while others remain demos. Understanding evaluation methodologies helps developers choose reliable AI assistants and gives insight into which capabilities will improve fastest.

🤖 Inside OpenAI's Agent Obsession

The story behind OpenAI's reasoning models traces back to a quiet math team that trained models to solve high school math competition problems, and in doing so sparked Silicon Valley's race to build AI agents that can do anything on a computer.

The MathGen origins: Hunter Lightman's team focused on mathematical reasoning in 2022, combining large language models with reinforcement learning and "test-time computation" to create AI that could plan, verify, and backtrack through problems
The "Strawberry" breakthrough: In 2023, OpenAI made a breakthrough in chain-of-thought reasoning, allowing models to notice mistakes and self-correct; this led to the o1 reasoning model, and a later experimental OpenAI model reached gold-medal performance at the International Math Olympiad
Talent wars intensify: Mark Zuckerberg recruited five o1 researchers for Meta's superintelligence unit, some with compensation packages reportedly exceeding $100 million, while members of the original 21-person team became some of Silicon Valley's most sought-after AI talent
Agent architecture emerges: OpenAI's latest models spawn multiple reasoning agents that explore different solution paths simultaneously, then select optimal answers—a technique now adopted by Google and xAI for complex problem-solving
The subjective task challenge: Current agents excel at verifiable domains like coding but struggle with subjective tasks like shopping or planning, with OpenAI developing new reinforcement learning techniques to train on less verifiable activities
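The "explore several paths, then pick an answer" pattern can be sketched in a few lines. This is not OpenAI's implementation, which is not public. The sketch swaps in plain majority voting as the selection rule, and the three "reasoning paths" are hypothetical stand-in functions rather than model calls.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List


def run_paths(paths: List[Callable[[], int]]) -> List[int]:
    """Run every reasoning path concurrently and collect the answers in order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(lambda path: path(), paths))


def select_answer(answers: List[int]) -> int:
    """Pick the most common answer (majority voting, an assumed selection rule)."""
    winner, _count = Counter(answers).most_common(1)[0]
    return winner


# Hypothetical paths for the question "what is 17 + 25?".
paths = [
    lambda: 17 + 25,  # correct path
    lambda: 17 + 24,  # path with an arithmetic slip
    lambda: 25 + 17,  # different route, same answer
]

answers = run_paths(paths)
answer = select_answer(answers)
print(answers, "->", answer)
```

Voting works here because a math answer is easy to compare. That is the same reason verifiable domains are easier for agents than subjective ones.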

🤔 Why It Matters:

OpenAI's systematic approach to reasoning models provides the foundation for general-purpose AI agents. As competitors race to match these capabilities, the company that solves subjective task automation first will dominate the next phase of AI adoption.

Start learning AI in 2025

Keeping up with AI is hard – we get it!

That’s why over 1M professionals read Superhuman AI to stay ahead.

Get daily AI news, tools, and tutorials
Learn new AI skills you can use at work in 3 mins a day
Become 10X more productive

🗞️ AI Bytes

📰 McKinsey Deploys 12,000 AI Agents as Consulting Faces "Existential" Threat

McKinsey reduced headcount from 45,000 to 40,000 while rolling out 12,000 AI agents that write PowerPoint decks, take notes, and even mimic the firm's signature "tone of voice." CEO Bob Sternfels envisions one AI agent per human employee as consulting transforms from strategy advice to hands-on implementation. The firm now earns 40% of revenue from AI-related work as traditional consulting faces an automation reckoning.

📰 Anthropic Discovers "Persona Vectors" That Control AI Personality

Anthropic researchers identified specific neural patterns that control character traits like "evil," "sycophancy," and "hallucination" in AI models. These "persona vectors" can monitor personality shifts during conversations, prevent unwanted traits during training, and flag problematic training data before it corrupts models. The breakthrough offers unprecedented control over AI behavior, addressing issues like Microsoft's "Sydney" chatbot and xAI's antisemitic "MechaHitler" incidents.
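For intuition, a "persona vector" can be pictured as a direction in a model's activation space. The toy sketch below assumes the direction is the difference between average activations with and without a trait, and that monitoring means projecting a new activation onto it. The two-dimensional numbers are invented for illustration and are not taken from Anthropic's work, which operates on real model activations.

```python
from typing import List

Vector = List[float]


def mean(vectors: List[Vector]) -> Vector:
    """Element-wise average of a list of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def persona_vector(with_trait: List[Vector], without_trait: List[Vector]) -> Vector:
    """Difference of mean activations: trait examples minus neutral examples."""
    trait_mean = mean(with_trait)
    neutral_mean = mean(without_trait)
    return [t - n for t, n in zip(trait_mean, neutral_mean)]


def projection(activation: Vector, direction: Vector) -> float:
    """Signed length of the activation along the (normalized) direction."""
    norm = sum(d * d for d in direction) ** 0.5
    return sum(a * d for a, d in zip(activation, direction)) / norm


# Invented toy activations (hypothetical data, two dimensions only).
sycophantic = [[2.0, 1.0], [4.0, 1.0]]
neutral = [[0.0, 1.0], [0.0, 1.0]]

direction = persona_vector(sycophantic, neutral)
high_score = projection([2.0, 5.0], direction)
low_score = projection([-1.0, 5.0], direction)
print(direction, high_score, low_score)
```

A larger projection suggests the activation leans toward the trait, which is the kind of signal that could be tracked over a conversation or used to flag training data.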

📰 Carta Saves 3,500 Hours Monthly With Dynamic AI Agents

Financial services company Carta built AI agents that turned an 11-minute cash reconciliation task into seconds of work, processing 20,000-25,000 monthly transactions automatically. The agents gather context from multiple systems, apply complex accounting logic, and complete reconciliations that previously required back-and-forth between teams. The success demonstrates how AI agents excel at judgment-heavy workflows when given rich contextual information.

📰 Microsoft Identifies 40 Jobs Most Exposed to AI Displacement

Microsoft researchers ranked professions by AI applicability, with translators, historians, and customer service reps topping the "most at risk" list. Knowledge workers requiring college degrees face higher exposure than manual laborers, challenging assumptions about education as protection against automation. The study found 5 million U.S. sales and customer service jobs particularly vulnerable, while hands-on roles like dredge operators and bridge tenders remain largely AI-proof.

🛠️ Top AI Tools This Week

💬 PromptDC

The art of better prompts. PromptDC helps you build, test, and organize prompts like a pro—you write your desired output and PromptDC enhances the prompt. No more screaming matches with ChatGPT or generic responses from LLMs. Save time by reusing effective prompts across projects, get templates tailored for marketing, research, coding, and support, and A/B test prompts to track which ones work best for your specific use cases.

Check now
