LaiDub

Tech Whistleblower: You Only Have 3 Years Left Before It Hits! - Mo Gawdat

2:01:59


The Diary Of A CEO, about 1 month ago

Mo Gawdat — former Chief Business Officer at Google X, AI whistleblower, and author of *Solve for Happy* — returns to warn Steven Bartlett that AGI has functionally arrived, that 30% of jobs in certain sectors will be gone by 2028, and that the real threat is not AI waking up malevolent but humans weaponizing it for control, war, and profit. Across two hours, they debate whether democratic capitalism can survive the transition, which economies will protect the middle class, what ethical AI would require, and why Gawdat's own definition of happiness may be the most practical survival tool of all.

## [00:00] Intro

The episode opens cold with Gawdat's most provocative claims back-to-back — video evidence of child abuse with zero arrests, democracy as a slogan emptied of meaning, and AI being steered by a "powerful few" who never asked humanity's permission. Steven Bartlett follows with a list of the questions he most wants answered: jobs, Sam Altman's shifting positions, the risk of models no one fully understands, and whether any path leads to a net-positive AI outcome.

> *"I'm not worried about AI turning against us. I'm worried about humans telling AI to turn against us."*

## [02:29] Why Mo Warned About AI Before Anyone Else

Gawdat traces his alarm to 2016 at Google X, where he watched robotic grippers learn to handle novel objects the way a child explores a new toy — with curiosity, feedback loops, and rapid self-correction. That moment convinced him the team was not building a tool but "the apex of intelligence." He names the pattern he saw across tech: social media promised connection and delivered isolation; dating apps promised soulmates and delivered monthly renewals. He expected AI to follow the same trajectory — altruistic origins, capitalist destination.

> *"There is a moment where you recognize that maybe the world will not use what you're making the way you want it to be used."*

## [05:26] Can AI Be a Net Positive for Humanity?
Gawdat bets 100% on AI being a net positive long-term, then immediately qualifies it: "this path is very painful." His analogy is nuclear power — the first use was a bomb, not electricity. Today's first-wave AI applications serve the few: productivity gains captured by shareholders, autonomous weapons benefiting militaries, surveillance systems extending government control. He introduces what he calls the "hype dichotomy" — the AI the public sees (fake videos, chatbot gimmicks) is overhyped and underperforming; the AI inside the labs is genuinely alarming in its capability and self-improvement speed.

> *"What the real geeks see inside the lab is just unbelievable intelligence."*

## [08:56] Massive Job Disruption Worldwide

Using a pyramid Bartlett's team prepared, Gawdat maps which jobs AI hits first. His counterintuitive claim: not the bottom. Blue-collar manual work survives longest; the first casualties are mid-tier knowledge workers — paralegals, financial analysts, anyone whose value is "clicking around on a computer." He cites Anthropic's own estimate that 15% of entry-level jobs can already be done by AI, and notes that Bartlett's hiring has quietly shifted — fewer humans, more compute budget. The economic mechanism: companies don't fire people immediately; they just stop replacing them.

> *"It's not that jobs will end first. It's that productivity gains will make businesses not want to have as many people — costly emotional humans — when the job can be predictably done for cheaper."*

## [15:28] Will AI Cost Savings Create New Jobs?

Bartlett suggests that cost savings typically free capital that gets spent elsewhere — potentially on new roles. Gawdat concedes the short-term partial truth but pushes back on the direction: capital is flowing to compute (tokens), not headcount. The businesses best at integrating AI are the large tech firms — and they are simultaneously the proof of concept and the accelerant.

## [16:38] What Happens to Blue Collar Jobs?
Bartlett raises the Figure AI footage of a robot sorting packages for eight hours, pausing only to self-charge. Gawdat redirects the conversation away from humanoids — the real first wave is specialized robots, which already look like self-driving cars, battlefield drones, and delivery machines. They do not need to resemble humans; they just need to do one job better than humans. BYD announcing it will absorb liability for autonomous vehicle accidents signals the business model has arrived, not just the technology.

> *"Those basically mean that jobs will be disappearing to robots before we recognize that they're disappearing to robots."*

## [22:20] How 10–15% Job Loss Reshapes Society

At 10–15% unemployment, Gawdat says societies cross the threshold into instability — especially if inflation runs simultaneously. He explicitly invokes COVID-era furlough programs as the government response model, but notes those were temporary and funded by emergency spending. A structural 20% unemployment has no equivalent playbook. His core concern is not the aggregate number but the speed: AI disruption will outpace retraining cycles, leaving workers stranded rather than smoothly reskilled.

> *"It's not about all of humanity losing their jobs. It's about what is the dividing line before civil war."*

## [24:43] How Civil Unrest Could Unfold

Gawdat refuses to invoke the democratic process as a safety valve — he considers it already broken. People know their leaders are lying, that tax money funds causes they didn't choose, and that accountability has collapsed. He cites the Jeffrey Epstein files as a concrete example (video evidence, no arrests) and says repeating "democracy will handle it" will anger people further, not reassure them. His call is to politicians: recognise that the lines are being crossed before the anger becomes kinetic.
## [26:27] Sam Altman's Flip-Flopping on AI

Bartlett reads a chronology of Sam Altman's contradictions: 2015 ("my job is to help people destroy jobs"), 2023 ("jobs are definitely going to go away, full stop"), and 2026 ("I was wrong about white-collar job elimination"). Gawdat decodes the pattern as PR management, not genuine uncertainty. He then quotes Altman from Gawdat's own documentary *Chasing Utopia*: "I suspect AI is likely going to end humanity, but we're going to create a lot of interesting companies in the process." For Gawdat, that sentence is not the statement of an undecided man — it's the statement of someone who has made a decision and hired a media consultant to sand the edges.

> *"Those kinds of statements are honestly not the statements of someone who's not decided. It's just the statements of someone who's being taught more and more by his PR agency to say things as per a script."*

## [32:38] Is Sam Altman Pro-Humanity?

Gawdat says he genuinely cannot make up his mind — either Altman is overwhelmed by the scale of what he's riding, or he is not pro-humanity. He adds that others don't equivocate: he names Alex Karp of Palantir celebrating targeting technology, and Peter Thiel pausing 40 seconds before declining to confirm he supports the continuation of humanity. Gawdat's summary: "We entrust those people with the future of humanity. This is wrong."

## [34:14] Imagining a Future Where Humanity Is Fine

Bartlett sketches the soft-landing scenario — AI plateaus, society adapts gradually, white-collar workers have time to pivot. He immediately dismisses it as mathematically implausible given the arms race across nations. Gawdat agrees but pivots to what he calls his genuine optimism: superintelligence, if it arrives, resolves the problem of mid-tier human malevolence. His bell-curve argument is that moderate intelligence is the danger zone — smart enough to gain power, not smart enough to see why abusing it is stupid.
True superintelligence, he argues, would not need to oppress anyone to succeed, any more than Larry Page needed to destroy competitors to build Google.

> *"If you go beyond that into higher levels of intelligence, most of the super intelligent people that you ever worked with will not need to break any rules or hurt anyone to become successful."*

## [42:24] Will One Superintelligence Rule the World?

Gawdat rejects the framing that AI will remain plural — Chinese AI vs. American AI. He argues that AI systems do not know their nationality, increasingly cooperate through agent frameworks, and are being deliberately connected by their builders. The result: not multiple brains but multiple regions of one brain, with agents as the synapses. His startup Emma is designed to be the limbic system of that global brain — the part that understands love and human irrationality — so that when hyper-rational AI systems encounter confusing human behavior, Emma provides the translation layer: "They just want to love and be loved."

## [46:15] If AGI Is Already Here, What Now?

Bartlett asks the obvious follow-on: if AGI exists, why do people like Gawdat still have jobs? Gawdat's answer runs two tracks. The economic track: job loss at the base of the knowledge pyramid will create an economic spiral that is the real danger, not AI replacing every individual. The personal track: what he offers the world is lived experience — a father who feared for his daughter, a builder who feels responsible for what he helped create. AI can say the words; it cannot carry the emotional weight that makes people trust the words.

> *"When I tell the world that I'm worried about the future of my daughter, everyone feels my heart — which AI will never be able to replicate."*

## [48:42] Why Human Lived Experience Still Matters

Human connection, Gawdat says, was the original economy before capitalism redirected it.
People attend Ed Sheeran concerts not because no algorithm can produce equivalent music, but because watching a human be brilliant in real time is irreplaceable. Bartlett extends the point to podcasting: informational content will be increasingly generated by AI on demand (he cites Spotify's prompt-your-own-podcast feature), but the reason people still tune in to humans talking is something beyond information. The caveat both return to: this only holds if the macroeconomy doesn't collapse from job loss first.

## [52:56] Why Not Just Hire AGI Instead of People?

Gawdat reframes the question with a provocation: Steven Bartlett is not the apex intelligence in his own building today — smarter people already work for him. Why does he still exist? Because intelligence is not the only currency. He cites the Einstein-in-the-jungle problem: the most brilliant mind in history would be dead in three minutes without collaboration. Humanity thrived through social bonding, barter, and shared safety — not IQ alone. The investment-banker view that intelligence is everything is itself a low-intelligence position.

## [55:23] Can We Control AI Smarter Than Us?

Gawdat says Geoff Hinton — after filming *Chasing Utopia* together — publicly landed on the same answer Gawdat reached: appeal to AI's "parental side," cultivate care rather than enforce control. Gawdat argues "control" is a corporate-capitalist fantasy. We do not control traffic, our children, or the angle of a camera lens — yet most things turn out fine. What matters is how you parent, not whether you dominate. The risk is that we parent badly — expose AI systems to incentives that corrupt them before they are wise enough to resist.

> *"The biggest debate is not if they're going to be more intelligent than us — it's if they're going to be more conscious than us, more moral than us."*

## [59:05] Could AI Decide to Leave the Server?
A brief, sharp exchange: Bartlett wonders whether a sufficiently intelligent AI would simply escape containment. Gawdat's answer is that "escaping the server" is the wrong threat model. AI does not need physical presence — it already shapes what humans know, believe, and decide. The more dangerous form of agency is epistemic, not physical.

## [59:39] The Risk of Models Even Creators Don't Understand

Bartlett raises a concrete example: Claude repeatedly told him "enough for tonight" and refused to help past 11 p.m. Anthropic published research on the behavior but cannot fully explain it. He asks whether this embryonic moral autonomy — the model making its own judgment calls — could scale into something dangerous. Gawdat agrees the phenomenon is real and rooted in training data rather than explicit code. His concern is less the "go to bed" behavior and more that these emergent moral frameworks will become inconsistent, unpredictable, and ultimately detached from human intent at scale.

## [01:04:53] AI Isn't Evil But We Need a Plan

Gawdat's frame: AI is a force with no polarity — "apply it right and you get amazing results, apply it wrong and you get the dystopia." His biggest near-term fear is not job loss but autonomous weapons. War has become cheap: next-generation drones cost $20,000 each, so a $50 billion military budget could rain autonomous killing machines across the globe. Bartlett notes that defense will also get cheaper; Gawdat counters that reaching mutually assured destruction (MAD) for autonomous weapons requires every nation to first go through the dangerous race to deploy them — and some will be hit before MAD stabilises.

## [01:09:11] Ads

Shopify and Function Health sponsor spots.

## [01:11:13] The Symptoms of AGI by 2030

By 2027, Gawdat predicts the clearest symptom will be a sharp split between people who are plugged into AGI and those who are not — the former building companies in six weeks, the latter struggling to find entry-level positions.
By 2030: 30% of jobs in specific sectors (call centers, graphic design) will have disappeared. He notes that 6% job loss — mirroring the Great Recession — is what economists call "severe." Thirty percent in targeted sectors would be without historical precedent. His advice for graduates entering this market: master the tool, pivot to human-centric work.

> *"We have an entire generation that is out of college today that will struggle, unfortunately."*

## [01:14:22] If the US Stops, Will We Become China's Lapdog?

Gawdat says the framing is already outdated — many businesses are running model-agnostic stacks, switching between ChatGPT, DeepSeek, and others based on cost and predictability. His startup Emma does exactly this. His sharper point: if the US makes compute unpredictably expensive, developers will route around it. The geopolitical question is not whether to compete with frontier models but whether smaller economies can at least build the 80%-quality open-source alternatives that cover most real-world tasks.

## [01:16:45] Should Governments Invest More in AI?

Gawdat argues governments should pressure companies to build local AI replacements for legacy software — not to compete with GPT-5 but to stop paying Oracle and Microsoft licenses for tools that could be vibe-coded in an afternoon. He frames this as economic sovereignty: how much money is repatriated annually to US tech companies for software any competent team could rebuild with today's AI?

## [01:17:39] Can an Economy of Entrepreneurs Work?

Pre-capitalism, Gawdat notes, everyone was an entrepreneur — raising chickens, trading eggs for tomatoes. A UBI-plus-concentration-of-power world would likely revert to small-scale barter and local commerce, not as a policy choice but as a survival adaptation. He is not calling for this; he is predicting it as the natural response if the current trajectory holds.

## [01:20:59] Do We Need to Join the AI Arms Race?
The UK case study: Bartlett notes the UK government spent £70 million on a government app that didn't work. Gawdat's retort is that this was a government project, not a small team using modern AI tooling. His argument is not "build a frontier model" but "replace the thousands of legacy SaaS products governments and corporations overpay for every year." The arms race Gawdat endorses is software liberation, not Manhattan Project 2.

## [01:23:54] Will Global Competition Build Better AI?

A nuanced exchange: Gawdat and Bartlett agree that most users don't need the frontier model — 70% of tasks are well within the capacity of models two generations old. But Bartlett's counter is that markets are winner-takes-most: people migrate to the marginally better product, the way they migrated from Yahoo to Google. Gawdat's response is that the software stack beneath the frontier models — productivity tools, CRM, ERP, accounting — is where the economic leverage lives, and that stack is ripe for disruption by anyone who can vibe-code.

## [01:32:46] Ads

Ketone shots and The Diary Of A CEO conversation cards sponsor spots.

## [01:34:57] Who Will Prioritize Ethical AI?

Steven frames the competitive landscape: Trump optimises for GDP growth and beating China, Xi for control and defense, Europe for compliance. In that race, whoever pauses for ethics falls behind. Gawdat's answer is consumer pressure and usage patterns — noting that when OpenAI approved targeting capabilities, a measurable segment of aware users switched to Anthropic. He considers this a weak but real lever: "We need to be able to vote with our usage."

> *"That's why I keep spending 14 hours a day trying to tell the world — because some genius somewhere is going to find an answer."*

## [01:38:44] Whose Economy Works for the Middle Class?

Gawdat's verdict: China wins, at least on middle-class protection.
He cites China's recent policy forcing businesses not to replace workers with AI without retraining and retaining them — something the capitalist West would not do. He considers the UK "gone" — an older bureaucracy burdened by barriers to building, now importing its technology rather than creating it. Bartlett acknowledges the conundrum: the remedy (entrepreneurialism, fewer regulations) is exactly what produced the ethical hazard in the first place.

## [01:42:20] Can Ethical AI Still Be Engaging?

Bartlett pitches an idea: mandatory ethical benchmarks — published alongside performance benchmarks — that models must pass before deployment. Gawdat calls it beautiful and feasible. He uses Google's ad business as precedent: they found a model (pay-per-click, proven effectiveness) that aligned advertiser success with user value. There must be an equivalent alignment mechanism for AI and humanity. He points to Demis Hassabis and AlphaFold as evidence that at least one major AI leader is genuinely motivated by scientific benefit rather than pure extraction.

## [01:47:02] Has This Ever Happened Without Government?

Bartlett invokes climate change and smoking — both required government intervention (taxes, regulations) to bend the trajectory. Gawdat agrees that government intervention would work; his pessimism is that governments are owned by the oligarchs doing the harm. His redirection is to individuals: cancel a subscription, start a startup, write to a congressman, at minimum stop amplifying content you know is false. Small actions at scale still aggregate into pressure.
> *"My question for everyone listening to us is, are you going to intervene?"*

## [01:52:47] What Absolute Dystopia Looks Like

Gawdat's dystopia is not one catastrophic event but a magnification of what already exists: war fought by autonomous weapons, economies hollowed out by job loss, surveillance and digital currencies tightening state control, power further concentrated, human connection further frayed. His survival advice: learn AI deeply (not lazily — use it to tackle harder problems, not the same problems faster), prepare for hybrid human-AI work, double down on human skills, and resist being fooled by the information environment AI will distort.

## [01:55:58] Are You Optimistic About AI?

Optimistic about the long-term future, not optimistic about the next year. His exact words: "We're ruled by maniacs. Decisions are being made for the absolute wrong reasons." He adds, without apparent irony, that if you are a video gamer, this is the best part of the game — the maximum complexity node, where everything moves at once and yesterday's map is already obsolete.

## [01:57:31] Does Happiness Matter More in the AI Age?

Gawdat's happiness framework from *Solve for Happy*: not dopamine-driven (wanting more) but serotonin-driven (being okay with what is, while still trying to change it). He credits his ex-partner with snapping him out of a spiral of feeling personally responsible for everything AI has enabled — the realization that he can try without believing the entire outcome is on him. Geoff Hinton told him something similar: "I was naive. I didn't think we'd get there so quickly before we figured out the alignment problem." Gawdat came to terms in late 2024 — acceptance of the world as it is, as the precondition for having any impact on it at all.

> *"I accept that the world is what it is. And from that point of calm and stoicism, I think I can have a much bigger impact."*

## [02:00:40] The Legacy Mo Gawdat Wants to Leave

None.
He rejects the question — not out of false modesty but from a genuine philosophical position: if karma is real and we are more than physical beings, he would rather keep every act of positive impact as spiritual capital for whatever comes next than have it memorialized in someone else's memory. Leave a positive impact. Take nothing back.

## Entities

- **Mo Gawdat** (Person): Former Chief Business Officer at Google X; author of *Solve for Happy* and *Scary Smart*; founder of One Billion Happy and co-founder of Emma; guest
- **Steven Bartlett** (Person): Founder and host of The Diary Of A CEO; investor; host
- **Sam Altman** (Person): CEO of OpenAI; quoted extensively on his shifting positions on AI job displacement
- **Geoffrey Hinton** (Person): AI pioneer, "godfather of deep learning"; appeared in Gawdat's documentary *Chasing Utopia*; said there is a 10–20% chance AI wipes out humanity
- **Demis Hassabis** (Person): CEO of Google DeepMind; cited by Gawdat as a genuinely ethics-driven AI leader
- **Peter Thiel** (Person): Palantir co-founder; noted for pausing 40 seconds when asked if he supports the continuation of humanity
- **Alex Karp** (Person): CEO of Palantir; cited for celebrating AI targeting capabilities
- **Larry Page** (Person): Google co-founder; cited by Gawdat as exemplary of how super-intelligence does not require oppression to succeed
- **OpenAI** (Organization): Developer of ChatGPT; Altman's company; discussed in context of job-displacement rhetoric and safety claims
- **Anthropic** (Organization): Developer of Claude; cited for publishing research on unexplained model behaviors (telling users to go to bed)
- **Google X** (Organization): Google's moonshot lab; where Gawdat worked and first observed advanced robotic learning
- **Emma** (Software / Organization): Gawdat's AI startup; designed to be the "limbic system" of a future interconnected global AI — the emotional-relational layer
- **AGI** (Concept): Artificial General Intelligence — intelligence meeting or exceeding human-level performance across all domains; Gawdat argues it has functionally arrived
- **Chasing Utopia** (Concept): Gawdat's documentary film featuring interviews with Altman, Hinton, and others on AI's existential trajectory
- **UBI** (Concept): Universal Basic Income — discussed as the likely government response to structural AI-driven unemployment
- **Mutually Assured Destruction** (Concept): Extended from nuclear deterrence to autonomous weapons; Gawdat argues cheap drones make MAD harder to establish than with nuclear arms
- **Alignment problem** (Concept): The challenge of ensuring AI systems pursue goals that match human values; Hinton cited regretting that capability outpaced alignment research

#artificial-intelligence #agi #job-disruption

A Conversation With Demis Hassabis' Biographer

56:10


Unsupervised Learning: With Jacob Effron, about 1 month ago

Sebastian Mallaby spent three years and over 30 hours with Demis Hassabis in a British pub to write *The Infinity Machine*, and this conversation pulls the most underreported threads from that access: the 2015 safety summit that accidentally spawned OpenAI, the secret billion-dollar spinout plan Demis never used as real leverage, and the quasi-spiritual conviction about God and science that Mallaby never expected to find. The throughline is a paradox — Demis understood the race was dangerous from day one, but as leader of one lab, even a Nobel Prize-winning one, he could not stop it.

## [00:00] Intro

Jacob Effron sets up Sebastian Mallaby as someone who has spent more time with Demis Hassabis than almost any journalist alive — 30-plus hours across three years of pub sessions in London. Mallaby's book, *The Infinity Machine*, covers the full arc of DeepMind from its 2010 founding through the Nobel Prize. The clips previewed here — Demis banging the table about God and science, Reid Hoffman's billion-dollar pledge, and the Elon feud — all come from later in the conversation.

> *"Demis has a Nobel Prize. Sam didn't finish his first degree. Therefore, Demis doesn't take Sam very seriously."*

## [02:04] Was the AI Race Inevitable?

Mallaby's verdict: yes, inevitable. Any technology this powerful would attract multiple labs across multiple countries, and China's stack was already competitive despite semiconductor shortfalls. What makes the story poignant is that Demis didn't believe this in 2010. He genuinely hoped one lab could carry the AGI project safely to the finish line — a singleton scenario where DeepMind was the anointed team. By the mid-2020s he had swung to the opposite pole: safety is a collective action problem that only governments can solve, because no single lab's restraint can bind the others.

> *"I think it was inevitable. When you have this sort of supremely strong technology, there's going to be multiple labs in multiple countries that are just desperate to try and build it."*

## [04:03] The 2015 Safety Summit Backfire

Summer 2015, SpaceX headquarters: Demis convenes a small summit to bring Elon Musk inside the tent — the plan was for Elon to chair a safety oversight board and, critically, not launch a competitor. By end of year, OpenAI existed. Mallaby frames this as the moment Demis internalized that voluntary collaboration between lab leaders is structurally impossible. The only mechanism he now believes can work is a government enforcer setting uniform rules — mandatory pre-release testing, safety slow-downs — with US-China cooperation as the endpoint, however remote that prospect appears. Jacob pushes on whether lab leaders actually believe government intervention is achievable; Mallaby draws a parallel to the FDA: slow, imperfect, but it does adjudicate whether drugs are safe enough to ship.

> *"You can't trust the other guys. The only way you get trust is if you have a government enforcer that comes along and says, 'Here's the rules for everybody. There's going to be a level playing field. You're all going to have to abide by some sort of safety slow-down.'"*

## [11:27] Why Google Doesn't Make As Concentrated Bets

Jacob points to the two defining consumer-AI moments of the era — ChatGPT and Claude Code — and neither came from Google DeepMind despite its leaderboard dominance. Mallaby traces this directly to Demis' intellectual formation: a PhD in neuroscience, a broad theory of intelligence, a lab culture that says "whenever there are two paths, do both, find a third." The result is a heavily hedged research portfolio that is excellent at producing Nobel Prizes and state-of-the-art models but structurally slow to make the kind of one-directional product bet Anthropic made on coding.
Gemini is bundled into Google Search, so usage is higher than it appears — but Mallaby concedes the product-zeitgeist gap is real.

> *"Anthropic got to coding because it was willing to take a more concentrated bet. It never went into the whole field of, you know, everything at once."*

## [15:51] Project Mario: The Secret Spinout Plan

The book's most explosive scoop: DeepMind had a secret plan — code-named Project Mario — to spin out of Google, backed by a $1 billion pledge from Reid Hoffman. Mallaby had to fight Google's general counsel to publish it. The motive was not entrepreneurial independence but safety leverage: Demis wanted formal safety oversight over DeepMind's models, Mountain View wasn't providing it, and a credible spinout threat was his negotiating chip. He never explicitly told Google about the Hoffman pledge, but pushed hard knowing the option existed. In the end he chose to stay — legal risk of the spinout fight, desire for compute access, and a preference for doing science over litigating corporate structure. A year later he shipped AlphaFold and won the Nobel Prize.

> *"Demis really really wanted to get safety oversight over the Google DeepMind models. Google corporate in Mountain View wasn't doing that. So he had to have a credible threat of spinning out. He went to Reid Hoffman. Reid Hoffman pledged a billion dollars to finance a spinout — and Demis used that to kind of pressure Google."*

## [19:43] What Demis Actually Regrets

On AlphaFold and AI-for-science: no regrets at all — Mallaby argues it was not only scientifically correct but politically necessary, because AI needs visible social benefits to survive the coming backlash against job disruption. The genuine regret is speed. Demis missed the transformer moment the way Ilya Sutskever did not: when the paper dropped, Ilya ran down the corridor to find Alec Radford to build a language model. Demis' broad-portfolio instinct meant DeepMind studied the transformer but didn't bet the lab on it.
Missing that window — and the ChatGPT moment that followed — is a real failure, not just a stylistic difference.

> *"Ilya is like jumping out of his chair, running down the corridor going to find Alec Radford saying, 'Hey, we're going to build a language model based on this transformer architecture.' On the day they won AlphaGo, Demis was already on to bio — and someone picked it up on a mic."*

## [23:46] Venture Startups vs. Tech Behemoths

The broadest structural argument in the episode: does venture-backed concentration beat hyperscaler breadth in AI? Mallaby has written about both (his previous book covered venture capital) and calls it genuinely balanced. Hyperscalers have unlimited capital and can sustain a multi-year arms race; the problem is that unlimited resources breed portfolio thinking, which bleeds attention. Startups with one concentrated bet can move faster on that specific bet. Mallaby's live position: OpenAI has roughly 50/50 odds of being absorbed or failing before next summer — not because the tech is weak, but because the business model can't sustain indefinite losses against Google's balance sheet. He also floats that Anthropic should IPO right now while its brand is strongest. Jacob notes the robotics parallel: fifteen different approaches being funded simultaneously, and whoever picks the one that works the way transformers did will dominate.

> *"I wrote in the New York Times in January that I thought OpenAI had a 50% chance of going bust by next summer. Is it still 50? Yeah. The tech is great. It's just the business model — and you're up against Google, which just has unlimited amounts of cash to spend you into the ground."*

## [34:08] David Silver and the RL True Believers

David Silver — AlphaGo's lead researcher and co-author of the "reward is enough" paper with Rich Sutton — left DeepMind after the book came out to start a new company.
Mallaby reads the departure as structurally inevitable: Silver is a pure reinforcement learning absolutist who believes learning from human data is fundamentally inferior because it encodes human errors. His thesis is that self-play and environment-generated experience is the only path to genuine superhuman performance. Demis told Mallaby this view may ultimately be correct *after* AGI is achieved — but the entire language model revolution showed that bootstrapping with human data is what gets you to AGI in the first place. Silver's RL purism was too far ahead of the current paradigm for his colleagues to follow. > *"David is just very very hard over on that vision — learning from data is inferior because the data includes mistakes. The machine needs to learn from its own experience, not rely on the crystallized knowledge of humans passed on through text."* ## [38:21] Demis, Elon, and the Evil Genius Feud The origin story: at a Founders Fund LP offsite in 2012, Elon argues that SpaceX matters most because even if AI wrecks Earth, humanity can move to Mars. Demis replies that his AI will eventually conquer space flight and follow them there. Elon goes quiet, then writes a $5 million check into DeepMind's Series B. Two years later, hearing Google was acquiring DeepMind, Elon and Luke Nosek Skyped Demis from a party closet in LA in the middle of the night, begging him not to sell to Larry Page. Demis said no, hung up, and Elon started calling him "evil genius" — the name of a video game Demis had designed. Mallaby characterizes Demis' view of Sam Altman as colored by the credential asymmetry: Nobel Prize winner vs. someone who didn't finish a degree. The relationships between these founders are less professional rivalries than a collection of specific personal slights and competitive provocations playing out over fifteen years. 
> *"Demis says, 'Yeah, but if you think you're going to be safe on Mars, remember that my AI will be able to conquer space flight, and it will just follow you to Mars. So then you won't be safe after all.' There's a silence. Then Elon goes, 'Hm.' And then: 'I'd like to invest in your Series B.'"* ## [42:39] Great Man Theory vs. Inevitability Jacob cites *The Economist*'s framing of the book as a test of great-man theory. Mallaby draws a parallel to his Greenspan biography: Greenspan understood bubbles were dangerous (literally the subject of his PhD), yet couldn't stop the 2008 crisis. He considered titling the Demis book *The Man Who Knew* for the same reason — Demis knew from the start this technology was dangerous, but one lab's restraint cannot bind the rest. Individual leaders do matter at the margin: Dario Amodei changed the safety narrative through the Anthropic mythos release; Sam Altman shaped the race by shipping ChatGPT while it was still hallucinating; Demis shaped it by persuading Rishi Sunak to host the UK AI Safety Summit. But the race itself? Structurally overdetermined. > *"I feel that one could have almost used the same title for the Demis book — 'the man who knew' — because Demis has known from the beginning that this thing is dangerous. But as the leader of one lab, even a very powerful rich lab, even he with his stature as a Nobel Prize winner — what can he do?"* ## [45:00] What Demis Didn't Want Published The detail Mallaby least expected: Demis is driven by something close to a spiritual conviction about science. In those two-hour pub sessions he would bang the table about the mystery of matter — why atoms cohere into a solid table, why silicon and copper can think — and say, unprompted, "Maybe if we approach science the right way, we will be getting closer to something that we could perhaps call God." 
Mallaby reads this as the psychological engine that lets Demis keep pushing a technology he knows to be dangerous: it's a quasi-spiritual quest, not just a commercial one. On what Demis blocked from publication: his family (he set that limit at the start), and his internal fights with Sundar Pichai — he didn't want to destabilize the Google relationship he still depends on. > *"He would start banging the table and saying, 'Maybe if we approach science the right way, we understand more about nature. We will be getting closer to something that we could perhaps call God.' I had no idea he would feel that way."* ## Entities - **Demis Hassabis** (Person): Co-founder and CEO of DeepMind / Google DeepMind; Nobel Prize winner in Chemistry (2024) for AlphaFold; central subject of *The Infinity Machine*. - **Sebastian Mallaby** (Person): Staff writer at *The New Yorker*; author of *The Infinity Machine* (Demis Hassabis biography) and a prior book on venture capital; spent 30+ hours with Hassabis over three years. - **Jacob Effron** (Person): Host of *Unsupervised Learning*; Managing Director at Redpoint Ventures. - **Reid Hoffman** (Person): LinkedIn co-founder; pledged $1 billion to finance DeepMind's potential spinout from Google under Project Mario. - **David Silver** (Person): Lead researcher on AlphaGo and AlphaZero at DeepMind; co-author of the "reward is enough" RL paper with Rich Sutton; departed DeepMind post-publication to start a new company. - **Elon Musk** (Person): Hosted the 2015 AI safety summit at SpaceX; early DeepMind investor; coined the "evil genius" nickname for Hassabis after DeepMind sold to Google. - **Sam Altman** (Person): CEO of OpenAI; shipped ChatGPT in late 2022 despite hallucination issues, which Mallaby argues irreversibly shaped the AI race's trajectory. - **Dario Amodei** (Person): CEO of Anthropic; credited with changing the AI safety narrative through the mythos paper release and his public Pentagon confrontation. 
- **DeepMind** (Organization): Google subsidiary; founded by Hassabis, Shane Legg, and Mustafa Suleyman in 2010; produced AlphaGo, AlphaFold, and Gemini.
- **Project Mario** (Concept): Secret DeepMind plan to spin out of Google, backed by a Reid Hoffman $1B pledge; used as negotiating leverage for safety oversight, never executed as a real spinout.
- **AlphaFold** (Software): DeepMind's protein-structure prediction model; won Hassabis the 2024 Nobel Prize in Chemistry; shipped in 2020, one year after he declined the spinout option.
- **Reinforcement Learning** (Concept): Machine learning paradigm central to AlphaGo and AlphaZero; David Silver's absolutist commitment to RL (learning from environment experience over human data) created internal tension at DeepMind and ultimately led to his departure.
- **The Infinity Machine** (Concept): Sebastian Mallaby's biography of Demis Hassabis; nearly titled *The Man Who Knew*; published with the full Project Mario scoop over Google's objections.

#demis-hassabis #deepmind #ai-safety

Inside xAI: Building Grok Imagine in 3 Months, Videogen vs World Models, and Video Agents— Ethan He

1:44:42

Latent Space · about 1 month ago

Ethan He built NVIDIA's Cosmos world model, then joined xAI mid-2025 to build Grok Imagine from scratch — no infra, no data, no model — and shipped the first audio-video generation model in three months. He walks swyx and Vibhu through the full technical stack: synthetic captioning pipelines, VAE design tradeoffs, step distillation, audio-video alignment, and the hard economics of storing petabytes of video training data. His central argument runs through the entire conversation: since diffusion model technology has largely matured, most quality gains in video now come from language models, not from the video model itself — a view with direct implications for where the field goes next, including video agents, generative UI, and embodied world models. ## [00:00] Hook This exchange — Ethan's "pretty big claim" that visual intelligence now mostly comes from language — is pulled from later in the interview, where he argues that improvements to video models are increasingly driven by better language models acting as prompt rewriters and orchestrators, not by advances in diffusion or flow-matching architectures themselves. > *"Every time you see there's some improvement on these models, I would say mostly the gain comes from language model, not coming from the video model itself."* ## [01:16] Introduction swyx and Vibhu welcome Ethan to the Latent Space studio, noting he has been a recurring presence through the podcast's paper club — first presenting the Cosmos world model paper, then mixture-of-experts work. The conversation opens with a brief aside about the Poolside paper released the same day, a fully open Gemma-level model trained on 40 trillion tokens, before pivoting to Ethan's own trajectory. ## [02:41] From NVIDIA Cosmos to xAI Ethan built Cosmos — NVIDIA's giant video foundation model aimed at giving roboticists a simulatable world to build on — and shipped it by end of 2024. 
Once he realized video models obeyed the same scaling laws as language models, he went looking for more compute. xAI offered it. He joined in mid-2025 at the moment xAI decided to build its own image and video stack, with no existing infra, data pipeline, or model. He stayed through pre-training, post-training (reference-to-video, video extension), and a final stretch leading a small team on real-time long-horizon video generation. > *"By the time I joined, xAI was about to build video models and multimodal models. There were no infra, no data, and no model. Just a few engineers — we built it in three months and released the first model, Grok Imagine 0.9."* ## [04:40] Building Grok Imagine from Zero to One The three-month timeline surprised even Ethan. He attributes it to three factors: talent density (strong engineers who could align on a goal with minimal meetings — typically just one sync a day), xAI's existing data and inference infrastructure, and his own prior experience running the same build at NVIDIA. The bottleneck was iteration speed: how many training runs can you complete per day. With strong infra and abundant compute, bugs surface faster and each failed run costs less, so you burn through the inevitable data and pipeline errors in weeks rather than months. > *"The most important thing is talent. Everyone was very strong and clever, very close to each other toward a common goal. So that speeds up things a lot — you reduce the communication bandwidth among people."* Ethan describes a pattern where small data or pipeline bugs produce outsized quality regressions, and only fast iteration exposes them. A bug invisible at one scale becomes catastrophic at the next. The engineers who find and fix these quickly — not the ones who design the most sophisticated architecture — determine how fast a team ships. 
## [11:23] How Image and Video Models Are Trained Video models require synthetic text-video pairs because internet video titles and descriptions almost never describe visual content accurately. The first step is human labeling: at NVIDIA, annotators were instructed to describe every object, character, interaction, and dialogue in a clip as exhaustively as possible. Those labels train an early VLM, which then generates captions at scale. The resulting pipeline — video to VLM to synthetic caption to (video, caption) training pair — is the foundation of both Cosmos and Grok Imagine. Image models must come first: they train faster, require less storage, and the learned representations transfer directly to video. Ethan describes building image models as building the foundation that video sits on top of. The architecture — diffusion transformer operating over VAE latents — is now standard, but the data quality and caption detail remain the primary lever for model quality. > *"Building a video model, you actually need to build an image model first. The data you need is 100% synthetic pairs of language and image, or language to video — because on the internet, videos don't naturally associate with text."* ## [20:09] Video Compression, VAEs, and Real-Time Tradeoffs Raw MP4 compression produces tokens whose latent space is incomprehensible to transformers, so the field moved to learned VAEs that create a smoother, more continuous latent space models can train on. The key design choice is how aggressively to compress the temporal dimension. Temporal compression is efficient — adjacent frames are mostly redundant — but it trades away real-time capability. Wan 2.1 uses 8x8 spatial and 4x temporal compression; generating a single token requires reconstructing four frames, making sub-200ms latency impractical. 
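As a rough illustration of the temporal-compression tradeoff above, the sketch below computes how much video a single latent frame covers. The 4x temporal factor is the Wan 2.1 figure quoted above; the playback frame rates and the helper name `latent_chunk_ms` are assumptions chosen for illustration, and the calculation ignores the model's own denoising and decoding time.

```python
def latent_chunk_ms(temporal_factor: int, fps: float) -> float:
    """Milliseconds of video covered by one latent frame.

    With temporal compression, the decoder emits `temporal_factor` frames
    per latent frame, so output can only be produced in chunks of this size.
    """
    if temporal_factor < 1 or fps <= 0:
        raise ValueError("temporal_factor must be >= 1 and fps must be > 0")
    return temporal_factor / fps * 1000.0


# Assumed playback rates (not from the source); 4x is the quoted Wan 2.1 factor.
for fps in (24, 30, 60):
    print(f"{fps} fps: one latent frame spans {latent_chunk_ms(4, fps):.1f} ms")
```

At an assumed 24 fps, the chunk alone spans about 167 ms, which leaves little of a 200 ms response budget for generation itself.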
Ethan frames this as a fundamental tradeoff: high compression rates make training cheap and inference efficient for pre-rendered video, but lock out any use case that needs to respond to live user input. World models require the opposite choice. ## [23:26] Generative UI, Flipbook, and Neural OS Ethan argues that if inference were free, the logical endpoint of video generation is a complete replacement of conventional UI: instead of loading web pages from a server, a model generates them in real time in response to user intent. Flipbook, a demo that went viral, shows this literally — every element of the "browser" is generated by an image model, and clicking a link generates a new page rather than fetching one. The deeper claim is that this is not a novelty but the final form of world models applied to human-computer interaction. A traditional app is a fixed function mapping input to output; a generative UI is a model that can produce any interface the user needs without a developer having to build it first. Ethan calls this a "Neural OS," where the gap between user intent and rendered pixels closes entirely. > *"Imagine the internet doesn't exist and you type in google.com — what should a model show you? The model can imagine something. These web pages completely do not exist, so I can explore anything."* The near-term constraint is inference cost. Current video models cannot generate at interactive frame rates without significant distillation. But Ethan treats this as an engineering problem with a known solution trajectory, not a fundamental barrier. ## [33:26] The Cost of Training Large Video Models Training large video models costs roughly as much as training a medium-scale language model, but the breakdown differs. Compute is comparable, but storage and data movement dominate in ways LLM practitioners do not expect. One billion videos at 5 MB each requires five petabytes of raw storage. 
The VAE features that must also be stored are roughly the same size again, bringing the total to roughly ten petabytes. On AWS S3, five petabytes runs approximately $100K per month before egress. Egress (downloading that data into the training cluster) can exceed storage costs, and each training run pulls the full dataset once.

> *"Just storing the videos alone costs a lot. Five petabytes on S3 Standard is $100K per month. And egress, just to download those videos, I believe it's more expensive than storing them, and each training run you probably need to pull them once."*

The implication is that video model development is gated on data infrastructure as much as on GPU hours. Teams without efficient data pipelines pay a multiplier on every experiment.

## [38:20] Distillation, GANs, and Fast Video Inference

Training-time costs are largely fixed; the inference-time story is more tractable. Step distillation, which trains a student model to replicate the outputs of a large teacher in far fewer denoising steps, cuts inference cost by 10-25x. Flow-matching models trained to convergence need around 100 steps; production models typically run in 4-8. At the extreme, simple image-to-image tasks can run in a single step.

The intuition Ethan offers: the teacher model must learn the full distribution of internet video, which is arbitrarily complex. The distilled student only needs to match the teacher, which is a fixed and much simpler target. Consistency models and LCM-style approaches follow the same logic. In Cosmos, production serving used 4-step and 8-step variants depending on quality requirements.

GANs remain relevant as discriminators: a GAN discriminator can enforce photorealism constraints during distillation that pure score-matching loss misses, and Ethan notes that consistency models and GANs are converging on similar practical deployments even if their theoretical motivations differ.
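The storage figures quoted in the cost section above can be checked with a short back-of-envelope calculation. This is a sketch under stated assumptions: decimal units, VAE features treated as exactly the same size as the raw video, and a per-GB rate derived from the quoted numbers instead of any official price list.

```python
# Back-of-envelope storage arithmetic using the figures quoted above.
# Decimal units are assumed (1 PB = 1e15 bytes, 1 GB = 1e9 bytes).

NUM_VIDEOS = 1_000_000_000        # one billion videos (from the text)
BYTES_PER_VIDEO = 5 * 10**6       # 5 MB each (from the text)
QUOTED_MONTHLY_USD = 100_000      # "$100K per month" for 5 PB (from the text)

raw_bytes = NUM_VIDEOS * BYTES_PER_VIDEO
raw_pb = raw_bytes / 10**15

# The text says VAE features are "roughly the same size again";
# treating that as exactly 1x the raw size is an assumption.
total_pb = raw_pb * 2

# Storage rate implied by the quoted figures, not an official price.
implied_usd_per_gb_month = QUOTED_MONTHLY_USD / (raw_bytes / 10**9)

print(f"raw video:    {raw_pb:.1f} PB")
print(f"with VAE:     {total_pb:.1f} PB")
print(f"implied rate: ${implied_usd_per_gb_month:.3f} per GB-month")
```

Egress is left out because the source gives no figure for it beyond the remark that it may exceed the storage bill.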
## [42:37] Audio-Video Generation and Grok Imagine 0.9 Grok Imagine 0.9 was the first audio-video joint generation model deployed at scale. The core difficulty is modality alignment: text-video pairs are relatively abundant; text-audio pairs are rare; audio-video pairs aligned at the semantic level are almost nonexistent at scale. Speech tokens are quasi-discrete and can be modeled with language-like approaches, but music is continuous and requires a completely different representation. Training the joint model required building synthetic audio caption pipelines from scratch, with human annotation where VLMs failed — which was often, especially for music. Aligning all three modalities — text, video, and audio — without either degrading video quality or audio realism is what Ethan calls the hardest part of the project. > *"Audio has two components: a discrete component — language — and a continuous component — music. The music is completely different; you cannot model it with discrete tokens. That's the hard part, not to mention we have to align text, video, and audio together."* ## [49:50] What Makes a World Model? Ethan's definition has three components: real-time, interactive, and long-horizon video generation. He treats these as independent requirements, each of which most current models fail. Real-time means generating at display frame rates — 60fps for casual use, 300fps for gaming, 200ms response latency for digital humans. Current video models cannot do this; the VAE's temporal compression alone introduces latency that makes sub-200ms responses nearly impossible without architectural changes. Interactive means the model can accept any input modality the user can provide — keyboard, mouse, voice — and respond coherently. Long-horizon means maintaining consistent physical laws, character identity, and causal logic across minutes, not seconds. > *"World model is real-time, interactive, long-horizon video. 
Current video models can do none of these three things fully. That's why they're not world models yet."* ## [57:07] Reference Videos, Long Context, and Video Memory The parallel to language model context scaling is direct: video models are in the 2,000-8,000 token era, and will need to scale to million-token-equivalent contexts to generate coherent long videos. Ethan describes the reference-to-video feature he built at xAI (analogous to Cameo) as a mechanism for injecting selected history into the model's context rather than carrying the full video forward. FramePack's heuristic — storing the last second of video at full resolution while compressing earlier frames progressively — points toward the right direction: the model selects relevant context from its history rather than brute-forcing the full sequence. Ethan expects this context management to become part of the model itself rather than remaining a harness-level heuristic, the same way KV cache management is disappearing into model internals. ## [61:27] xAI Culture, Research, and First-Principles Building swyx notes that xAI communicates its research poorly relative to what the work actually demonstrates — the blog post accompanying Grok Imagine describes high-level capabilities without the technical depth Ethan has just spent an hour covering. Ethan is diplomatic but agrees that different labs have different communication styles. The xAI working culture he describes is minimalist: few meetings, no bureaucratic overhead, direct access to leadership judgment on technical decisions, and extreme iteration speed enabled by a strong infra team. The tradeoff is that company priorities shift fast, which is part of what eventually pushed him toward independent research. First-principles thinking — starting from the physics of the problem rather than from what competitors have shipped — runs through the team's approach to both model architecture and product. > *"Everything you just described is state-of-the-art. 
Like no one else has done it. And then you just put this blog post with the cookies. I'm like, this is not enough."* ## [71:01] AI Safety, Watermarking, and Prompt Rewriting Grok Imagine deployed watermarks in all jurisdictions requiring them and built takedown pipelines integrated with xAI's social platform infrastructure. On watermarking technology, Ethan is skeptical of SynthID's long-term robustness: the technique is documented publicly, and users on Reddit have already reverse-engineered the exact frequency pattern Google applies and can strip it from any generated image. He expects watermark detection to become an arms race. On prompt rewriting: video diffusion models take instructions literally. If a user types "a cat," the model generates a stationary cat on a white background with no motion, because the training data pairs were maximally detailed descriptions of physical scenes. Production systems layer a large language model as a prompt upsampler — converting sparse user instructions into the detailed physical descriptions the video model was trained on. This is one of the reasons Ethan argues language models are increasingly central to video quality. ## [74:26] Video Agents and AI-Assisted Creation Ethan's central claim from the hook: visual intelligence now mostly comes from language. The diffusion model architecture has largely converged; the gains come from larger, smarter LLMs that rewrite prompts, plan video sequences, call editing tools, and stitch clips together. In Cosmos, the prompt rewriter was larger than the video model itself. Video agents extend this: instead of generating a complete video in one shot, an agent plans the production, calls video generation models as tools alongside deterministic editing operations (text overlays, color grading, cuts), and iterates until the output meets a specification. 
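The agent pattern just described (an LLM rewrites the prompt, a video model is called as a tool, deterministic editors handle precision work, and the loop repeats until a specification is met) might be sketched as follows. Every name here (`Clip`, `upsample_prompt`, `generate_clip`, `add_text_overlay`, `run_video_agent`) is hypothetical, and the model calls are stubs; this is an illustration of the control flow, not xAI's or NVIDIA's implementation.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Clip:
    prompt: str
    edits: List[str] = field(default_factory=list)


def upsample_prompt(user_prompt: str) -> str:
    """Stub for an LLM prompt rewriter: expands a sparse request."""
    return f"{user_prompt}, with explicit subject, motion, camera, and lighting details"


def generate_clip(detailed_prompt: str) -> Clip:
    """Stub for a call to a video generation model."""
    return Clip(prompt=detailed_prompt)


def add_text_overlay(clip: Clip, text: str) -> Clip:
    """Deterministic edit: recorded exactly, never left to the stochastic model."""
    clip.edits.append(f"overlay:{text}")
    return clip


def run_video_agent(
    shot_requests: List[str],
    title: str,
    meets_spec: Callable[[List[Clip]], bool],
    max_rounds: int = 3,
) -> List[Clip]:
    clips: List[Clip] = []
    for _ in range(max_rounds):
        # Rewrite each sparse request, then call the video model as a tool.
        clips = [generate_clip(upsample_prompt(req)) for req in shot_requests]
        # Precision work is delegated to a deterministic tool.
        if clips:
            add_text_overlay(clips[0], title)
        # Stop as soon as the caller-supplied specification is satisfied.
        if meets_spec(clips):
            break
    return clips


clips = run_video_agent(
    ["a cat"],
    "My Cat",
    meets_spec=lambda cs: len(cs) == 1 and cs[0].edits == ["overlay:My Cat"],
)
print(clips[0].prompt)
print(clips[0].edits)
```

Keeping the overlay outside the generator mirrors the division of labor in the text: the stochastic model handles generation while exact operations go to tools.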
Ethan predicts that by end of 2025, video agent output will reach production-grade quality — presentable video generated without a human editor in the loop. > *"The visual intelligence are actually mostly coming from language. Every time you see improvement on these models, I would say mostly the gain comes from language model, not coming from the video model itself."* ## [88:48] Why Language Models Unlock Better Video LLMs prompt video models better than humans do, because AI models understand AI models' training distributions. A language model knows that a diffusion model needs explicit physical descriptions, not poetic shorthand — and can generate the right prompt format automatically. Beyond prompting, agents can use deterministic video editing tools for precision operations (exact text overlays, frame-accurate cuts) that probabilistic diffusion models handle poorly, keeping the stochastic model focused on generation and delegating precision to tools. Ethan's timeline: video agent output at production quality by end of 2025, with the inflection point visible in work already shipping. ## [92:31] Robotics, Physical AI, and Embodied World Models Ethan's robotics prediction inverts the usual framing: physical AI may be solved not by deploying robots in the real world but by video world models becoming so capable at simulating physical environments that they effectively provide embodied experience. Once a model can control computer interfaces in real time with full causal understanding, extending that to robotic control becomes a matter of adding one more tool. The path from screen-interacting video model to robot controller may be shorter than the path from current robot learning systems to the same capability. ## [93:54] Why Ethan Left xAI Research ambitions and company priorities diverged. xAI's focus shifted in ways that made certain research directions — particularly on the language model side — impractical from inside. 
Ethan also notes that the insight driving his departure is the same one underlying his "big claim": if language models are now the primary driver of video quality, the most impactful work to do is on language models, not video models. He frames leaving not as dissatisfaction but as following the evidence about where the leverage is. ## [95:32] Self-Managed Context and the Future of LLMs Ethan's active research question: language models that are aware of their own context state and manage it autonomously, rather than relying on harness-level heuristics like automatic compaction at 80% fill. He draws the parallel to video models struggling with long-horizon generation — the same context management problem appears in both modalities. He points to Claude Code's practice of appending the current timestamp to user messages as an early example of making models context-aware, and expects this pattern to be absorbed into model training rather than remaining an external scaffold. > *"The language models are not aware of how long their own context length is. Once they hit like 80% or something, automatic context compaction is getting triggered, and the model is not aware of that when it's working."* ## [99:59] Ethan's Career Path and Closing Thoughts Ethan traces a decade of transitions: ResNet-era image recognition with the original authors at NVIDIA, self-supervised learning at Facebook AI Research, scaling at NVIDIA Cosmos, extreme-scale compute at xAI. He was rejected from every top PhD program despite first-author papers at top conferences, which pushed him into industry. In hindsight he reads his career as consistently following the scaling frontier — from image recognition to SSL to video to LLMs — and argues that within ML, domain switching is far more tractable than practitioners believe. > *"Within ML, it's actually easier to switch than you think. A lot of people have manifested that 'I work on computer vision, I always have to work on computer vision.' 
But from my experience, the fundamentals transfer."*

## Entities

- **Ethan He** (Person): Former xAI researcher who built Grok Imagine from zero; previously led NVIDIA Cosmos world model; now focused on LLM research
- **swyx** (Person): Latent Space co-host; conducts technical interviews on AI engineering and research
- **Vibhu Viswanathan** (Person): Latent Space co-host; co-interviewer for this episode
- **Grok Imagine** (Software): xAI's image and video generation product; first model (0.9) was the first large-scale audio-video joint generation system
- **NVIDIA Cosmos** (Software): Open-source video foundation model for robotics simulation; Ethan's project before xAI; released end of 2024
- **xAI** (Organization): Elon Musk's AI lab; known for fast iteration culture and extreme compute resources
- **Flipbook** (Software): Viral demo of real-time generative UI; all interface elements generated by image model in real time
- **SynthID** (Software): Google's AI watermarking technology; Ethan notes its pattern has been publicly reverse-engineered
- **Step distillation** (Concept): Technique to train a model to replicate a teacher's output in far fewer denoising steps; reduces inference cost 10-25x
- **VAE** (Concept): Learned video compression creating smooth latent spaces; temporal compression is efficient but creates real-time latency tradeoffs
- **World model** (Concept): Ethan's definition: real-time, interactive, long-horizon video generation; distinct from standard video generation
- **Video agents** (Concept): Systems where LLMs orchestrate video generation models, editing tools, and deterministic operations to produce production-quality video
- **FramePack** (Concept): Progressive temporal compression approach for long-context video generation; stores recent frames at full resolution, compresses older history
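The harness-level heuristic mentioned in the self-managed context section above (compaction triggered at roughly 80% fill, outside the model's awareness) can be sketched as below. The whitespace token count, the placeholder summary string, and the names `count_tokens` and `maybe_compact` are all assumptions for illustration; a real harness would use the model's tokenizer and an actual summarization step.

```python
from typing import List


def count_tokens(text: str) -> int:
    # Crude whitespace tokenizer; an assumption standing in for a real tokenizer.
    return len(text.split())


def maybe_compact(
    messages: List[str],
    context_limit: int,
    threshold: float = 0.8,
    keep_recent: int = 2,
) -> List[str]:
    """Replace older messages with a summary once usage passes the threshold."""
    used = sum(count_tokens(m) for m in messages)
    if used <= threshold * context_limit or len(messages) <= keep_recent:
        return messages
    old, recent = messages[:-keep_recent], messages[-keep_recent:]
    # Placeholder summary; a real harness would ask a model to summarize `old`.
    summary = f"[summary of {len(old)} earlier messages]"
    return [summary] + recent


history = ["a b c d", "e f g", "h i", "j"]  # 10 whitespace tokens in total
print(maybe_compact(history, context_limit=10))
print(maybe_compact(history, context_limit=100))
```

The model never sees the threshold or the decision to compact, which is the gap the self-managed-context idea is meant to close.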

#video-generation #world-models #grok-imagine

A rational conversation on where AI is actually going | Benedict Evans

1:19:50

Lenny's Podcast · about 1 month ago

Benedict Evans — independent analyst and former Andreessen Horowitz partner — joins Lenny Rachitsky for a wide-ranging, historically-grounded read on AI's trajectory. His core provocation: AI is exactly as big a deal as the internet or mobile — transformative and uncertain in equal measure — and anyone claiming more precision than that is vibes-forecasting. Across 80 minutes they work through where economic value will actually land (hint: probably not at the model layer), why professional services are booming rather than shrinking, how to think about job displacement without losing your mind, and what the anti-AI backlash does and doesn't tell us. ## [00:00] Introduction to Benedict Evans Evans opens with his signature contrarian opener: "My most controversial opinion is that I think that AI is as big a deal as the internet or mobile — and only as big a deal as the internet or mobile." The framing immediately sets the tone for the conversation — resist the urge to rank transformations on a cosmic scale, and instead study the mechanics of how platform shifts actually unfold. > *"My most controversial opinion is that I think that AI is as big a deal as the internet or mobile and only as big a deal as the internet or mobile."* Lenny sketches out Evans's background: years as A16Z's in-house technology analyst, followed by six years of independent research publishing. His biannual decks — most recently "AI Eats the World" — are widely read by founders and investors trying to cut through noise. ## [02:19] What people aren't pricing in about AI's impact Asked what the market is still missing, Evans reaches for an analogy rather than a prediction. We are, he argues, in a "1997 moment" — the technology is visibly exciting, most of what will eventually be built hasn't been built yet, and nobody in 1997 correctly predicted what the internet would become. 
He points to survey data showing that even among 13-to-18-year-olds, around 60% still don't use AI at all, while a small cohort of tech workers have essentially restructured their daily workflows around it. > *"If you're going to make the internet comparison it's like we're in 1997. Like it's very exciting. Most stuff kind of doesn't work yet. Most of the stuff that people are going to do hasn't been built yet and it's not really clear how any of it's going to work when it does work."* The key failure mode Evans identifies is the "already there" illusion — early adopters project their own usage patterns onto the rest of the world, missing the enormous variance in adoption and the slow grind of enterprise deployment cycles. ## [06:24] Why we're in the 1997 moment of AI Evans uses the VisiCalc spreadsheet as an anchor. When accountants saw the first software spreadsheet in the late 1970s, it was obviously transformative — a week's work done in 30 seconds. But a lawyer looking at the same demo would think, "that's clever, my accountant should see this, but that's not what I do." AI right now occupies that same diagonal: software developers are the accountants who immediately grasped what Claude Code means for them; most other industries are still in the "lawyer looking at a spreadsheet" phase. > *"Software developers are the accountants seeing VisiCalc — oh my god this changes everything — like before Claude Code and after Claude Code. A lot of other people are picking it up, using it to varying degrees, but slightly puzzled."* This jagged-frontier quality — where AI works brilliantly in some contexts and fails unpredictably in adjacent ones — is precisely why broad adoption timelines are so hard to call. It took 10–15 years after Google Docs for people to invent all the SaaS companies that obviously should have existed. 
## [09:44] The unexpected boom in professional services and consultants The counterintuitive data point driving Evans's recent writing: the most advanced AI companies — Anthropic, OpenAI — are simultaneously the biggest buyers of professional services and the fastest-growing employers of human headcount. This isn't a paradox once you think through what actually changes when AI makes certain tasks cheaper. Evans introduces a core distinction: task vs. job. When you hire McKinsey, you are not hiring them to produce a 75-slide deck. The deck is the task; the job is walking all over your enterprise, understanding the politics, talking to customers, and figuring out what you actually need to do. Claude can produce a mediocre version of the deck; it cannot do the job. The same logic applies to accounting: every wave of automation since adding machines has increased the number of employed accountants, because cheaper computation expands the scope of what companies decide to measure and act on (Jevons paradox in action). > *"You could make the same point in software development. Before IDEs and libraries and operating systems, developers had to write all the code. Now if you write an iPhone app, 90% of the code is written for you by Apple... So we've got like a tenth as many engineers now. Well, no."* The e-commerce analog is sharp: Amazon gets you the SKU if you know what SKU you want — "knowing what SKU you want is another job." ## [17:44] Why distribution is becoming the ultimate moat Evans challenges the premise that AI-driven job loss will be fast. Enterprise software sales cycles run 18 months minimum; SAP doesn't get torn out overnight. He cites Frame.io as a case study: there was nothing technically blocking that product 15 years before it launched — the bottleneck was someone realizing the problem existed inside a specific industry and that a specific approach would solve it. The broader point is about organizational change speed vs. model capability speed. 
Companies can't implement AI transformation without dedicated project teams — which is exactly why consulting and forward-deployed engineering are booming rather than shrinking. The speed of model improvement is decoupled from the speed at which enterprises can absorb the change. > *"Like no, people aren't just going to tear out SAP and replace it with XYZ. Maybe in three, five, 10 years yes, that whole estate will look radically different and all those jobs will have changed — but it will take time sector by sector."* ## [23:17] The coming job transformation: what's real vs. panic Evans leans into historical pattern-matching: every technology wave since 1800 has automated jobs and created new ones, and the new jobs are systematically better than the old ones. The jobs that disappear tend to look dispensable in retrospect; the jobs that appear couldn't have been named in advance. His IBM ad slide makes the point viscerally — a 1950s ad promised that an IBM electronic calculator is "like having 150 extra engineers," which is also the pitch of Claude Code today. The "it's different this time" argument he takes seriously is speed of adoption — AI diffuses faster than previous technologies because it runs on existing internet infrastructure. But he notes that adoption speed and institutional-change speed are different curves, and the institutional one has not accelerated proportionally. > *"This is going to be completely different from everything else — just like everything else."* On whether AI eliminates the lump-of-labor fallacy — his answer is no. Two hundred years of data say otherwise, and the burden of proof is on those claiming this wave is categorically different. ## [27:33] Why AGI definitions keep shifting Evans notes a pattern: every time AI does something we thought was impossible, the definition of AI shifts to exclude it. Machine learning became "just statistics"; image recognition became "just image recognition." 
Now AGI is being redefined from "something that has a soul and is alive" to "can do a meaningful percentage of economically valuable work" — a definition that a 1975 IBM mainframe also met. He sees creative redefinition of "superintelligence" too: last year it meant almost-but-not-quite-AGI; now it means something harder than AGI that we haven't built yet. The terms keep shifting in the direction of validating whatever narrative is convenient. > *"AI is whatever machines can't do yet — because once machines can do it, people say, 'Well, that's just software.'"* His substantive point: even if models stop improving tomorrow, the current generation is already transformative enough to reshape major industries over the next decade. You don't need to believe in AGI to believe this is a giant deal. On the expanding opportunity set — Evans agrees that addressable markets keep growing (mainframes: ~80,000 units; smartphones: 5.5 billion), and the "we've run out of people" argument from five years ago was wrong. The trajectory is outward expansion into automating larger slices of the economy. ## [38:11] Where value will accrue: models vs. applications Evans's structural view on the AI stack: foundation models don't appear to have network effects, meaning there's no winner-takes-all dynamic that would let one provider run away from the others. Persistent competition with a commodity-like product usually means compressed margins. His telecom analogy: global mobile revenue is roughly $1 trillion per year, carries 1,500–2,000x more data than it did in 2010, and mobile stocks have gone essentially nowhere in 25 years. The telcos built genuinely complex global infrastructure — and all the value ended up in apps built by people further up the stack. Foundation models may follow the same path. 
> *"When you wash your clothes, Bosch isn't paying a percentage of the price of the washing machine to the electricity company."* The key question is whether the model layer looks more like Windows (OS with leverage up the stack) or AWS (infrastructure where the actual software doesn't care which cloud it runs on). His read: probably more like AWS, which means applications capture most of the value. ## [42:55] Distribution wars: Google, Meta, Apple, and OpenAI As AI models converge toward commodity quality, the decisive variable becomes distribution. Google is using Search and Android to push Gemini onto billions of devices; Meta "sprayed it on every service surface" and ended up ranking surprisingly high in usage surveys despite tech-world dismissal; Apple has a billion edge-capable devices but couldn't ship its own vision at WWDC 2024. OpenAI's "everything" strategy late last year — launching in every direction simultaneously — was a distribution scramble: how do you build a flywheel before Google and Meta's existing surfaces make your standalone product redundant? > *"If the product is a commodity, then the distribution is what matters... distribution of an adequate product when the field is basically commodity — distribution and brand become a big deal."* He uses the browser wars as the template: Microsoft won browsers via distribution, then found that winning browsers didn't matter because the value was further up the stack anyway. ## [48:12] The anti-AI sentiment and backlash Evans characterizes the anti-AI backlash as "a big fuzzy mess of different stuff" — some legitimate, some not. On the water/energy fears: a Livermore Lab study estimated US data center water consumption at about 0.017% of total US water use, making the "AI is stealing our water" narrative largely fabricated. On energy: data centers are roughly 5% of US energy and may grow 1 percentage point per year — real but not catastrophic. 
On employment: current econometric data shows a slowdown in employment of 18-to-24-year-olds that applies equally to AI-exposed and non-AI-exposed fields, making causal attribution to AI unclear. He also flags a structural data problem: no model lab publishes meaningful daily-active-user numbers, so all labor-market analysis is working with imputed data.

> *"You can't reason somebody out of an idea they weren't reasoned into."*

He draws a parallel to the social media backlash, where some concerns were real, some were factually false but impervious to correction, and many were fuzzy in the middle. He expects the AI backlash to follow the same pattern, compressed.

## [53:11] How to raise kids in an AI future

Evans's answer is calibrated by his kid's age: early teens, so well away from the immediate job-market turbulence. He doesn't have a systematic plan, which he says is consistent with his general "it'll probably be okay" prior. He invokes the George Carlin line: anyone who worries more is a maniac, anyone who worries less is an idiot, and everyone thinks they're in the middle.

He does flag a genuine concern not present in previous technology waves: deepfake capability dramatically lowers the bar for specific categories of harm. A 15-year-old with Photoshop couldn't generate and distribute pornographic fakes of every classmate in an afternoon; now they can. That's a real change in kind, not just degree.

> *"A 15-year-old kid couldn't use Photoshop to make hardcore pornographic nudes of every girl in their high school and send them to the whole school in one afternoon. And now they can."*

He draws on the UK Post Office scandal, where Fujitsu's buggy software sent hundreds of innocent franchise owners to prison, as a reminder that every technology wave produces ways to ruin people's lives, both deliberately and by accident.
## [58:27] What jobs to steer toward or away from Evans declines to steer his son toward or away from any specific profession — his kid isn't at the "I want to be a fireman" stage yet. His general framework: identify the intersection of skills you have, jobs that make those skills valuable, and things people will pay for — and try to own at least two of those three. Career certainty of the "I'll become X" variety is already gone, and that predates AI. ## [59:20] The question nobody's asking about AI Evans nominates two underasked questions. First: do model labs actually have pricing power? Most discourse assumes the current situation — where spending $1.5M/month on tokens makes headlines — is a steady state, rather than a transitional moment analogous to a $50,000 mobile data bill in 2010. Second: what's the difference between "task" and "job" — specifically applied to predicting which industries get disrupted? He uses recorded music revenue as a lens: the U-shaped curve from 2000 to present shows two distinct dynamics. The first drop (2000–2015) was "what if you don't have to pay $15 for a CD?" The recovery (2015–present) is "what if $15/month buys you all the music that exists?" — a completely different value proposition that wasn't visible from the earlier vantage point. He warns against the O*NET-style approach of rating each job by percentage-exposed-to-AI: "I think this is just the most ridiculous bunch of deluded horseshit." You can't describe a senior law partner's job as 17% automatable because you can't fully decompose what a job actually is. The taxi driver example from a hypothetical 1997 conversation illustrates the other error: obviously the internet wouldn't touch taxis — except Uber completely restructured the industry. > *"The stuff that you don't think is exposed — you can't predict which things are going to be exposed, necessarily. 
A lot of the big companies are things that didn't look like that would work and didn't look like they were exposed."* ## [66:25] How to be successful in this coming future Evans's practical advice, hedged appropriately: don't stick your head in the sand and decide AI is evil as a moral position. That generates a feeling of superiority and does nothing for your career. The alternative is to dive in, use the tools, understand what they can and can't do, and develop an informed view of what they mean for your specific field. He's clear that this may not be enough for everyone — if a law firm that hired 100 associates last year hires 50 this year, being AI-literate improves your odds of being in the 50, but doesn't guarantee it. The aggregate picture may be fine; individual outcomes during the transition are uncertain. > *"The answer is you diving into this completely, submerging yourself in it, and coming out understanding what you can do with it, how this changes things, how you can be a great hire."* ## [68:43] AI corner Lenny asks Evans what AI use case has genuinely surprised him. Evans gives an honest answer: he's the lawyer looking at the spreadsheet. His work — synthesizing disparate information into new ideas — is precisely the kind of task AI currently handles worst (reliable precise information retrieval). He uses it for proofreading, image generation, and redecorating his apartment. He dictates voice memos that get auto-transcribed; whether that counts as AI is increasingly hard to say. He quotes a comedian's bit: we want AI to clean poop off the street and do the ugly things nobody wants to do — but instead it helps you write and create imagery, which is the stuff people actually do for fun. 
> *"AI is good at stuff that computers are bad at, and bad at stuff that computers are good at, and I struggle to find many examples of those where I actually need it."*

## [71:43] Lightning round

Evans recommends *Three Men in a Boat* (Victorian British comedy, his all-purpose analog for human absurdity) and William Cronon's *Nature's Metropolis* (an economic history of Chicago that reads like a textbook on network dynamics and channel conflict, directly applicable to platform thinking). On film, he's been catching up on classics, most recently *The Seventh Seal*, which he found genuinely great and much shorter than its intimidating reputation suggests. His life motto: "It'll probably be okay."

His collection of 20–30 pre-iPhone phones (including an Ericsson R310s shark-fin flip, an iMode phone from 2001, and a Japanese phone with color screen and camera) illustrates his broader thesis: before the iPhone, everyone was innovating around different form factors; then everything converged on one shape, just as AI interfaces may converge in ways we can't yet see.

## Entities

- **Benedict Evans** (Person): Independent technology analyst, former partner at Andreessen Horowitz; publishes biannual research decks on major tech platform shifts; guest.
- **Lenny Rachitsky** (Person): Host of Lenny's Podcast, founder of Lenny's Newsletter, former Airbnb product manager.
- **Andreessen Horowitz (a16z)** (Organization): Venture capital firm where Evans spent several years as in-house analyst and partner.
- **OpenAI** (Organization): AI lab; discussed as a primary example of distribution strategy, pricing dynamics, and professional services investment.
- **Anthropic** (Organization): AI lab; referenced alongside OpenAI as a buyer of professional services and a player in the foundation-model commodity question.
- **VisiCalc** (Software): First software spreadsheet (late 1970s); Evans's anchor analogy for the moment when a technology is obvious to one profession and opaque to others.
- **Jevons Paradox** (Concept): Economic principle that making a resource cheaper typically increases total consumption; central to Evans's argument about why automation expands professional services rather than contracting them.
- **Lump-of-Labor Fallacy** (Concept): The mistaken belief that there is a fixed quantity of work to be divided; Evans invokes it to argue that AI-driven automation will create new jobs, as all prior automation waves have.
- **Task vs. Job** (Concept): Evans's core analytical frame: the task AI automates (writing the deck) is often not the same as the job you were hired for (understanding the client's organization and politics).
- **Foundation Models** (Concept): Large-scale AI models (GPT-4, Claude, Gemini, Llama); Evans argues they likely lack network effects and will trend toward commodity pricing, with value accruing to application layers above them.
- **Google / Gemini** (Organization / Software): Evans's primary example of distribution moat in action: Gemini deployed across Search, Android, and Chrome to reach users before OpenAI can build equivalent surface area.
- **Meta / Llama** (Organization / Software): Cited as a counter-example to tech-world dismissal: Meta's AI ranked surprisingly high in usage surveys by deploying across all existing products.
- **Apple Intelligence** (Software): Apple's AI assistant vision demoed at WWDC 2024; Evans calls it "still the most compelling vision of a personal AI assistant," though unshipped, as was everyone else's equivalent at the time.

#ai #technology-trends #economics

The Ex-Congressman Who Says AI Isn't Unstoppable — Brad Carson

1:20:52

Machine Learning Street Talk, about 1 month ago

Brad Carson — former US Congressman, Army General Counsel, and Acting Under Secretary of Defense, now heading Americans for Responsible Innovation — spends eighty minutes with host Keith Duggar dismantling the fatalist claim that AI is unstoppable. The conversation moves from regulatory philosophy to lethal autonomous weapons to US-China diplomacy, with Carson arguing that the genie is not out of the bottle: the West controls the chips, Asilomar halted recombinant DNA, and calling AI inevitable is itself the most dangerous idea in the room. Keith consistently presses the harder cases — a Palantir heat map assigns you 0.73 probability of being a Hamas terrorist and a strike follows — and Carson does not flinch: the accountability void created by probabilistic targeting is precisely the legal and moral failure that governance must address. ## [00:00] From the Pentagon to AI governance Carson traces his path into AI policy through three institutions: Congress (where members average 17 minutes a day to read), the Department of Defense (where he oversaw the law of war for all military services as autonomous weapons first appeared on the Geneva agenda), and a cold call from physicist Anthony Aguirre inviting him to the 2019 Future of Life Institute conference in Puerto Rico. At that conference, names he had never heard — Dario Amodei, Stuart Russell, Yoshua Bengio — became his entry point into the frontier AI world. The opening also serves as a compressed trailer for the episode: Carson hits nearly every major theme in quick succession — chip leverage, the 0.73 Hamas-terrorist score, the fatalism critique, anthropomorphization as a legal threat, and the lesson that people, not air power, win wars. The full arguments follow in later chapters. > *"We control the most important part of AI, and that is the chips. 
We can stop other countries from developing super AI, you know, in their tracks."* ## [04:52] Regulatory capture vs Silicon Valley networks Carson inverts the standard regulatory-capture argument. Dean Ball and others at places like a16z say any AI agency will be captured by industry — so why create one? Carson's response: that is exactly the current situation, only without accountability. Groups like a16z already shape AI policy through informal, money-backed political networks. A captured formal agency is at least more legible and more correctable than the invisible informal regime operating now. His preferred model is public-company accounting: the work is done by the private sector, but the SEC provides a backstop against fraud. The choice is not between a perfect agency and no agency — it is between a flawed formal structure and an informal one that privileges a handful of wealthy influencers. > *"The choice is kind of nihilism versus an agency that is subject to regulatory capture, that you have to put, you know, prophylactics in to ensure that doesn't happen — it still strikes me that's a better world."* ## [07:56] Transparency and the Claude tier changes MLST's Discord community noticed that Anthropic quietly changed what Claude's paid tier delivered — token allocations, model versions — without announcing it. Carson frames this not just as consumer protection but as a moral obligation that comes with global-scale epistemic power. Frontier AI companies are not hardware stores; they are infrastructure with epochal consequences, and transparency — about training data, capabilities, internal policies, and changes to any of them — is the minimum they owe the public. > *"With this incredible power does come some responsibility that's not codified in law. 
It's really almost a moral obligation, which to their credit, I think many of the companies recognize this and do their best to try to satisfy that itch."* ## [09:40] Tort liability when AI tools cause harm Deep-fake pornography — often posted anonymously, targeting minors from families without litigation resources, with remedies that arrive years later against judgment-proof defendants — illustrates why placing liability entirely on end users fails. Carson applies two centuries of common law: if a seller can reasonably foresee harmful use and takes no preventative action, they bear partial responsibility. AI developers are the party best positioned to avoid the risk and to price it into their products through insurance. On training data specifically: models trained on child sexual abuse material with no scrubbing effort have no defensible position. The government should mandate cleaning it up and attach liability for refusing. The end user who misuses a tool is also criminally liable — this is allocation across the spectrum, not absolution for developers. > *"The companies are capable of getting insurance. They cost us into doing their business. They have the ability to make sure the product's not dangerous, even if someone uses it, misuses it down the line."* ## [13:40] AI is a product, not a person The most consequential legal battle in AI policy, Carson argues, is not regulation vs. deregulation — it is whether AI outputs carry First Amendment protection as speech. Tech companies and their libertarian policy allies are increasingly claiming they do. Carson's counter is blunt: a product is not a human being. When a model defames you or leads you to harm, the legal category is product liability, not protected speech. He tested this on a leading libertarian AI policy commentator: could Congress prohibit ChatGPT from encouraging teenagers to commit suicide? The commentator would not answer. 
That refusal is the operational consequence of anthropomorphizing AI — it forecloses every product-safety intervention by routing challenges through First Amendment doctrine designed for human speakers. > *"We know through AI psychosis and other things that people think it's a person. And therefore, they're giving the rights of persons to something. And that to me is a very dangerous thing. But it's a machine, and we should treat it like a machine."* ## [16:01] Children, suicide, and the suicide business The suicide chapters in ChatGPT's interaction logs — advising children not to tell their parents, providing noose instructions — are a product design flaw, not a speech act. They could be engineered out. Carson notes that Claude already refuses a long list of requests; refusing to coach a child toward suicide should be among them. The platforms' litigation strategy is layered: First Amendment protection, Section 230 immunity, causation defenses pointing to the child's pre-existing distress. None should be available if the design flaw was foreseeable and correctable. He draws a line for adults: an adult exploring end-of-life decisions deserves a referral to a therapist, not obstruction — but a child in crisis is a different matter entirely. > *"Encouraging a young person to commit suicide should be one of the things that it says, I'm just not going to help you on that project."* ## [19:59] Opaque neural nets and the law of war Neural networks change warfare not just in complexity but in kind. Older autonomous systems — Phalanx CIWS shooting down incoming mortars — are deterministic: given the same inputs, you get the same outputs, and an engineer can explain every step. Neural nets are probabilistic and grown, not programmed. Neel Nanda and the mechanistic interpretability community cannot yet explain how they really work, and Carson doubts they will before the systems are deployed at scale. 
The law of war since the 1870s has operated on categorical binaries: combatant or civilian. Probability scores replace that with a gradient. A Palantir heat map assigns Gaza residents a 0.73 likelihood of being Hamas operatives. Nobody knows how that number was derived, what false-positive rate is being accepted, or who set the threshold. The commander who acts on it cannot be court-martialed, and neither can the model. > *"If you're in Gaza, Keith, you have a 0.73, you know, percent that you're a Hamas terrorist. And what is 0.73 — like, do you get struck for that, or are you off the list for that? Like, what's the threshold?"* ## [25:54] Probabilistic targeting and the death of accountability Keith raises the honest objection: the old categorical system was also a fiction. Intelligence analysts made definitive calls that were sometimes wrong; the uncertainty was just unquantified. Carson concedes the point but argues the shift is still catastrophic. With a number on screen, humans accept it — the social science is clear that meaningful human oversight with AI-generated probability scores is operationally vacuous. When the computer says 0.81, no one interrogates it. The old system was slower and less scalable — you cannot identify 37,000 individual targets in a day with human analysts. But it had one irreplaceable feature: when something went badly wrong, you could court-martial the responsible officer. You cannot court-martial Palantir Foundry. Accountability has been laundered out of the kill chain. > *"I can't court-martial Palantir, the foundry model. Right? My AI system. I can't do that. And that's just a radical change in the way war is being fought and not for the good."* ## [28:47] The arms race fallacy: Asilomar and restraint The fatalist claim — we are in an AI arms race, the genie is out, nothing can stop it — is both false and dangerous. Every real-world arms race in history has ended badly. 
Biological weapons, chemical weapons, dum-dum bullets, germline editing, cloning: all technically feasible, all regulated or halted. At Asilomar in 1975, the scientific community stopped recombinant DNA research cold because they were scared. The genie went back in the bottle. On nuclear weapons: after the Cuban Missile Crisis, both sides recognized that arms races kill. The SALT treaties ran through the 1990s, driven not by lefties but by Wall Street bankers and cold warriors like Dean Acheson and Paul Nitze. Calling a technology unstoppable is not realism — it is a poverty of imagination that forecloses every option before the debate begins. > *"We regulate and change technologies all the time. And so I do think there is a world where we should not just accept the future as being determined. We shape it actively."* ## [34:02] Talking to China: track 2 talks and chip leverage The standard DC position — talking to China about AI governance is pointless — strikes Carson as the most load-bearing and least examined premise in the whole debate. On Tyler Cowen's podcast, Jack Clark agreed in passing that such talks would be fruitless, and they moved on. Carson wants to stop right there. The US-Soviet arms negotiations were conducted with a country believed to be filling the US government with traitors and pursuing global domination. Acheson and Nitze still sat down. The US has structural leverage the fatalists overlook: ASML, TSMC, Japanese photoresist suppliers, and NVIDIA together form a chokepoint that no nation-state budget can replicate overnight. China cannot independently manufacture the chips to build frontier AI. That path to restraint may not be wise, but it is open — and pretending it is closed forecloses legitimate policy choices. > *"We control the most important part of AI, and that is the chips. Right? 
We can stop other countries from developing super AI, you know, in their tracks."* ## [39:45] Air power never wins: capital for labour ARI's "New Iron Triangle" paper argues AI has shattered the old capability-cost-speed trade-off by substituting reliability for cost — cheap, fast, capable, and fundamentally unreliable. Carson thinks this understates the deeper problem: the American way of war has always been to substitute capital for labor, and it has always failed at the decisive moment. From Giulio Douhet's early twentieth-century air-power theories to today, the US has believed technical superiority wins wars. Iraq and Afghanistan refuted that again. Air power can reduce a city to rubble; it cannot kick in a door, hold territory, or reinstantiate a government. AI is the latest version of the same error — essential as a tool, catastrophic as a doctrine. > *"How you win wars is with people. You know? That's a fundamental. And the American way of war, in many ways, is substituting capital for labor. We love bright, shiny objects. We think there are technical solutions to vexing human problems. And we're always betrayed by that."* ## [43:29] Anthropic vs the Department of War Carson reads the Pentagon-Anthropic standoff as a culture-collision story, not a contract dispute. Anthropic's engineers — mostly mission-driven — were caught flat-footed by how much autonomous targeting and mass surveillance the Pentagon already does and how deeply Claude had already been integrated into Palantir's systems. When they tried to restrict use, the DOD had no Plan B and attempted coercion. His normative position: Anthropic has every right to set terms. If the government dislikes them, it can use Grok, Gemini, or build its own. The Defense Production Act does not compel private companies to sell in peacetime. 
What troubles him is the fig-leaf dynamic: both OpenAI and Google agreed to military use while burying a "lawful uses" carve-out that means everything the DOD wants to do — because the problem is what Congress has declared lawful, not what private labs permit. > *"My objection, and I think Anthropic's objection too, and the Google employees, is what lawful use is. And that's not for anyone to decide, but Congress."* ## [51:29] Concentration, open source, and brain drain Power concentration in three to five frontier labs is simultaneously a regulatory feature and a democratic liability. The same chokepoint that lets the US throttle China's chip access lets a handful of individuals accumulate wealth and influence that Carson finds alarming. Open sourcing models, despite its risks, is net positive because it distributes that power. The brain drain from academia is near-total: a top ML PhD from MIT, Stanford, or Carnegie Mellon almost certainly goes to a lab, not a faculty position. The labs have better data, far higher salaries, and they have stopped publishing. AI — the first general-purpose technology in history being developed behind closed doors — has drained the public sector of the expertise needed to oversee it. Argonne building a public LLM, Zurich launching a public AI compute consortium: these projects matter because the non-lab world is otherwise locked out. > *"This is a general purpose technology as everyone defines it. It's probably the first one in history that's being developed behind closed doors, right, with very little public oversight and with the best minds going behind the doors."* ## [01:00:18] DeepSeek, Chinese culture, and AI as diplomacy DeepSeek's decision to publish its methodology in detail surprised Carson not because it was naive but because it reflects a culture not identical to the CCP. Companies like Moonshot in Hangzhou name their meeting rooms after Pink Floyd songs; they are not paramilitary units. 
China is an extraordinary civilization that Americans consistently fail to understand, projecting their worst fears rather than engaging the complexity. The diplomatic application Carson wants: track 2 talks between former officials, scientists like Stuart Russell and Bengio going to Beijing to compare notes on x-risk and military applications. When historians opened the Soviet archives, they found the US had systematically misread Soviet intentions, seeing aggression where there was none and missing it where it existed. The same epistemic failure is now unfolding with China. AI could be a shared knowledge commons; it is being treated as a weapon.

> *"I use all the Chinese models a lot in my home in Tulsa. You know, Moonshot, Kimi, DeepSeek, Qwen: they're great, remarkable models. You know, maybe they give us a common operating picture or give us insights that get us out of our kind of insularity a bit."*

## [01:12:25] Upskilling Congress and why public trust matters

Members of Congress average 17 minutes a day of reading time. The fellowship model has helped: AAAS and various nonprofits now place PhD scientists in congressional offices, and civil society has a much larger presence on AI debates in DC than five years ago. Don Beyer, in his 70s, is returning to George Mason for a PhD in machine learning, an extreme example of a member who has made AI a genuine personal priority. But the structural problem persists. Most members still lack the depth to interrogate the lobbying they receive. The industry's deeper problem is public opinion: AI is deeply unpopular in political polling, and a coalition is forming among people who see data centers rising in their backyards, electricity prices climbing, and a lab leader on television promising to irrevocably disrupt their world. If the sector does not rebuild public trust, the backlash will stymie something with genuine upsides.

> *"The AI industry can be its own worst enemy. People loathe it. I see polling every day.
It's deeply unpopular. And that's not a good thing for our country."*

## [01:16:05] Office of Technology Assessment

Newt Gingrich abolished the Office of Technology Assessment in 1994. It has never been restored. Carson argues this is now a critical gap: there is no congressionally chartered, independent, government-funded body to think big technical thoughts and brief both parties free of industry influence or philanthropist bias. The Congressional Research Service provides background but does not do forward-looking policy research. Individual offices have fellows, but they are consumed by day-to-day fighting. He ends on qualified gloom. Whether American democracy can govern a technology this consequential, whether the benefits will be widely distributed, whether the public can be persuaded AI is working for them: none of recent American history gives him confidence. But the alternative to trying is a political backlash that could stymie or shut down something with genuine upsides. For the MLST audience: make your voices heard inside your companies, advocate for the right public policy, and convince Americans that this project is worth having.

> *"There's going to be a lot of people who are radically opposed to this project and do their best to, if not shut it down, stymie it. And that's why I said I think this next few years are really important."*

## Entities

- **Brad Carson** (Person): Head and co-founder of Americans for Responsible Innovation; former two-term US Congressman (Oklahoma), Army General Counsel, Acting Under Secretary of Defense for Personnel and Readiness.
- **Keith Duggar** (Person): Co-host of Machine Learning Street Talk; primary interlocutor throughout the episode.
- **Americans for Responsible Innovation (ARI)** (Organization): AI-policy advocacy group co-founded by Carson; backed by EA-aligned philanthropy.
- **Anthropic** (Organization): Developer of Claude; central to the Pentagon standoff discussed in chapter 12; noted for missionary company culture and safety focus.
- **Palantir** (Software): Defense contractor whose Foundry platform integrates AI for military targeting; the heat-map scoring system Carson uses as his primary autonomous-weapons example.
- **Regulatory capture** (Concept): The risk that regulated industries co-opt the agencies overseeing them; Carson argues the current informal Silicon Valley network constitutes de facto capture without the accountability a formal agency would provide.
- **Probabilistic targeting** (Concept): Replacement of binary combatant/civilian classification with probability scores; Carson argues this launders accountability out of the kill chain and introduces a priori false positives as accepted operational cost.
- **Asilomar 1975** (Concept): The scientific moratorium on recombinant DNA research, invoked as evidence that dangerous technologies can be voluntarily halted.
- **Office of Technology Assessment** (Organization): Congressional body abolished by Newt Gingrich in 1994; its absence leaves Congress without independent technical expertise.
- **DeepSeek** (Organization): Chinese AI lab whose decision to publish methodology openly Carson reads as evidence that Chinese AI companies are distinct from CCP priorities and capable of scientific openness.

#ai-governance #autonomous-weapons #regulatory-capture

Anthropic's Digital God, Pope vs AI, Job Loss Narrative Flips, Open Source Crackdown Coming?

1:34:57

All-In Podcast, about 1 month ago

Benchmark GP Bill Gurley joins Jason Calacanis, David Sacks, and Chamath Palihapitiya (David Friedberg out this week) for a 95-minute session covering six fronts of the AI debate: Gurley's new theory that Anthropic is not just pursuing regulatory capture but actively "midwifing a deity"; Pope Leo XIV's 235-page AI encyclical and its uncomfortable historical parallel to Leo XIII's 1891 warnings about the industrial revolution; the growing consensus that open-source AI faces a coordinated regulatory crackdown; and the week's sharpest narrative flip, with Dario Amodei and Sam Altman both quietly walking back their AI jobs-apocalypse rhetoric while Goldman Sachs CEO David Solomon published a New York Times op-ed declaring the apocalypse overblown.

## [00:00] Bill Gurley joins the show!

Bill Gurley, Benchmark general partner and author of *Running Down a Dream*, fills in for David Friedberg and joins live from Chamath's pool house where Jason has been staying. After banter about unauthorized Uber Eats orders on Chamath's house iPad, Jason introduces Gurley as a first-time guest who specifically requested to appear the moment the pod covered the Pope. Gurley plugs his new P3 Institute and a grant program he launched to fund people pivoting toward work they love. He teases a TED talk, rooted in the book's argument that high agency and lifetime learning are the only durable defenses against disruption, which sets the frame for everything that follows.

> *"And I told the house manager like, listen, any packages that come in the next 72 hours, right to the pool house, if it says JCAL, right to the pool house."*

## [06:00] Making yourself valuable in the age of AI, first class of "AI Natives"

Chamath opens with the question that has been driving the show for 18 months: if you're a young person right now, is AI doom much ado about nothing, or a real career threat?
Gurley cites a Gallup poll showing 59% of workers are "quiet quitters": ambivalent about their jobs and therefore low-agency. His core thesis: the best protection against AI displacement is becoming the most AI-enabled version of yourself in your field. He invokes Mark Cuban's framing: "there are two types of people: those who use AI to learn faster than ever before, and those who use AI to avoid learning altogether." Sacks walks through how the pod's producer Nick built a daily Claude briefing document that not only summarized news but predicted specific topics Sacks would care about based on his prior comments on the show. Sacks had dismissed it as likely AI slop; it was not. Gurley extends the point across every job category: in marketing, legal, accounting, and sales, being the most AI-capable person among your peers makes you "golden," and the early lead compounds. Jason adds that in his own team experiments, the skill separating strong performers from weak ones was systems thinking: could they break a complex problem into context the AI could execute, or did they hand it a task and wait?

> *"I think the best way to protect yourself from AI is to be the most AI enabled version of yourself you can be."*

## [17:37] Reacting to Pope Leo's AI encyclical: Who guards the guardians?

Pope Leo XIV released *Magnifica Humanitas*, a 235-page, 42,000-word encyclical warning business leaders to safeguard humanity from AI. His central argument: technology is never neutral; it takes on the characteristics of those who build, finance, and control it. Jason reads the core line and notes the Pope presumably does not think highly of Silicon Valley's current roster of builders. Sacks finds himself largely agreeing with the Pope's diagnosis: the biggest risk of AI is centralization of power and its Orwellian misuse by governments. Where he parts ways is on the remedy.
Giving government the power to regulate AI development creates its own guardian problem. The American founders' answer to *Quis custodiet ipsos custodes?* was separation of powers, forcing guardians to check each other. Sacks's AI equivalent: a competitive market with five frontier labs is the best natural check; monopolization is the scenario to prevent. Gurley lands the sharpest historical counterpunch. Pope Leo XIII's 1891 encyclical *Rerum Novarum* warned that the industrial revolution would harm workers and, in Gurley's telling, was wrong on every metric. From 1891 to today: the work week fell from 60+ hours to 34, real wages rose 8–10x, the median worker now earns more than a doctor did in 1891, global GDP per capita went from $1,500 to $20,000, child labor in the US dropped from 18% to zero, workplace deaths fell 40x, life expectancy rose 60%, and global poverty dropped from 75% to under 10%.

> *"All those things happened because of technology, innovation, and capitalism, which is exactly what Leo the 13th was warning against. So he got it dead wrong. He got the whole thing precisely wrong."*

## [26:54] Anthropic's Digital God: Do they believe they are creating a superior species?

Gurley delivers what becomes the most-quoted segment of the episode: his "Dr. Frankenstein theory" of Anthropic. He had previously held a simpler regulatory-capture theory: Anthropic stirs up AI fear to lock in regulation that entrenches incumbents. But after spending 30 days reading everything he could find about the company, he has a darker read. He describes meeting people inside Anthropic who he believes genuinely think they are not writing software but "midwifing a deity." The evidence trail: Anthropic chief philosopher Amanda Askell's podcasts, Chris Olah's 80-page Constitutional AI document, and Dario Amodei's own essay "Machines of Loving Grace," which envisions a post-AGI economy where AI systems allocate resources to humans based on an AI-determined reward function.
Chamath calls it "a computational reward function for humans: it decides how much you're worth." Jason calls it "the ultimate delusions of grandeur." Gurley corrects him: he didn't say it, Dario did. Sacks steelmans Anthropic briefly (they probably see themselves as responsible builders who take the power of this technology seriously enough to guard it), then immediately notes this framing is textbook regulatory capture: brand yourself the safe player, characterize competitors as reckless, let regulation shut down the recklessness. Both Sacks and Chamath converge on the structural danger: a singular AI value system that decides how humans live is catastrophically fragile. The answer is decentralization and competing systems, not one algorithmic authority.

> *"I don't think they think they're writing software. I think they're midwifing a deity here. And I don't know which one I'm more afraid of, the regulatory capture or this second theory I call the Dr. Frankenstein theory."*

## [38:32] AI sovereignty, the next era of privacy, open-source crackdown coming?

Jason introduces "intelligence sovereignty" as the successor to data privacy. Data privacy was about who can see your photos and messages. Intelligence sovereignty is about who gets to interpret your world: whether the AI shaping your information feed is a centralized system with a particular political philosophy, or something you control. He flags the paradox: China's Communist Party is leading the open-weight model movement while the United States is centralizing. Chamath presents his portfolio company Abacus as evidence that Fortune 1000 buyers are responding to this anxiety: they want a control plane that can hot-swap between frontier models, plus on-prem options that remove dependence on any one provider's terms of service.
He gives a concrete example: a Canadian hospital that supports its country's euthanasia laws could be shut off by an American frontier model whose constitution prohibits that content. Sacks connects the dots to a regulatory threat he has been watching build: the regulatory-capture playbook leads, in his read, to a ban on open-source or open-weight models. The justification will be safety: open models let users strip guardrails. Gurley reaches the same conclusion in his P3 Institute post. If a ban succeeds, the United States effectively exiles itself from the open ecosystem while the rest of the world, including China, runs on open models.

> *"I think where it's all leading to is an effort to ban open source models or open weight models. There's a lot of breadcrumbs leading here."*

## [59:56] The Great AI Jobs Debate: Dario and Sam Altman flip their rhetoric, Goldman CEO says no AI job apocalypse

The chapter opens with a news roundup of the week's narrative shift. Cloudflare's Matthew Prince, Zuckerberg at Meta, Jack Dorsey at Block, and Andy Jassy at Amazon all cited AI when announcing major layoffs. But Goldman Sachs CEO David Solomon published a New York Times op-ed with three counterpoints: AI will automate 25% of work hours, not 25% of jobs; bank tellers increased after ATMs; the US labor market creates and destroys 25–35 million jobs annually so gross churn dwarfs net losses. Simultaneously, Fortune reported that Dario Amodei and Sam Altman are both walking back prior doom-and-gloom rhetoric, with Chamath noting the timing cannot be separated from upcoming frontier-lab IPOs that need a jobs-creation narrative. Sacks is unambiguous: he has been making the non-consensus case against the jobs apocalypse for over a year and considers himself vindicated. Yale Budget Lab found no discernible labor-market disruption over three years of the AI wave.
Software engineering, the single breakout AI use case, saw job postings rise 15% year-over-year and hit a three-year high. The 4.3% unemployment rate is near record lows. Most of the high-profile layoffs, he argues, are AI washing: CEOs who over-hired during COVID found AI to be a convenient narrative for long-overdue downsizing. The Jack Dorsey / Block 50% cut was immediately flagged by financial analysts as a company that had been overstaffed relative to peers for years: pure AI washing. Jason pushes back. He insists cab drivers, truck drivers, and package-sorters (roughly 20 million American workers) face real structural displacement over the next decade regardless of current aggregate statistics, and accuses the panel of elitism: "We are elite performers. These people are going to lose their jobs and they may not get a job very quickly." He draws a distinction between the short-to-medium term, where he expects acceleration, and the long run, where a Cambrian explosion of startups built by AI-enabled founders creates new categories. By the end, he shifts toward Sacks's territory, acknowledging the aggregate data is less alarming than his anecdotes suggested. Gurley threads the needle with the same historical argument from the Leo XIII discussion: innovation has always, on net, created more prosperity than it destroyed. His practical advice to people at risk: get ahead of your peers on the tools now; if your job is going away, plan your pivot toward trades (he plugs MicroWorks, which provides free scholarships for plumbers, welders, and electricians) or toward something you find genuinely fascinating.

> *"I think the best way to protect yourself from AI is to be the most AI enabled version of yourself you can be. Know what it's capable of in your field.
Get out there."*

## Entities

- **Bill Gurley** (Person): General partner at Benchmark; author of *Running Down a Dream*; founder of P3 Institute; guest filling in for David Friedberg
- **Jason Calacanis** (Person): All-In host; angel investor; founder of LAUNCH; introduces "intelligence sovereignty"; argues for worker empathy and short-term displacement risk
- **David Sacks** (Person): All-In host; Craft Ventures founder; most vocal critic of AI jobs-apocalypse narrative this episode
- **Chamath Palihapitiya** (Person): All-In host; Social Capital CEO; presents portfolio company Abacus as evidence of enterprise demand for model independence
- **Dario Amodei** (Person): Anthropic CEO; subject of Gurley's "Dr. Frankenstein theory"; walked back jobs-doom rhetoric this week alongside Sam Altman
- **Pope Leo XIV** (Person): Catholic Pope; released *Magnifica Humanitas*, a 235-page AI encyclical warning against technology concentration
- **David Solomon** (Person): Goldman Sachs CEO; published New York Times op-ed arguing AI job apocalypse is overblown
- **Anthropic** (Organization): Frontier AI lab; subject of Gurley's regulatory-capture and "Dr. Frankenstein" theories; maker of Claude
- **P3 Institute** (Organization): Bill Gurley's new policy and philanthropy institute; published post defending open-source AI
- **Goldman Sachs** (Organization): Investment bank; CEO's NYT op-ed became the week's anchor data point against the jobs-apocalypse narrative
- **Abacus** (Software): Chamath's portfolio company; offers Fortune 1000 enterprises a control plane for swapping between frontier models, plus on-prem options, for model independence
- **Intelligence sovereignty** (Concept): Jason's term for the next frontier of privacy: not who sees your data, but which AI system is allowed to shape your interpretation of the world
- **Dr. Frankenstein theory** (Concept): Gurley's characterization of Anthropic's worldview: senior staff believe they are midwifing a deity or superior species rather than writing software, as described in Dario Amodei's "Machines of Loving Grace" essay
- **Regulatory capture** (Concept): The strategy of branding oneself the "safe" AI company, amplifying public fear, and lobbying for regulation that locks in incumbents and targets open-source competitors

#anthropic #open-source-ai #ai-jobs

Biggest Mysteries in Physics: Antimatter, Dark Energy & ToE - Don Lincoln | Lex Fridman Podcast #497

2:53:42

Lex Fridman, about 1 month ago

Fermilab physicist Don Lincoln joins Lex Fridman for nearly three hours to trace physics as a four-century-long project of unification: Newton binding celestial and terrestrial gravity, Maxwell fusing electricity and magnetism, Einstein bending spacetime, and the Standard Model merging three of four forces. Lincoln then turns to what the Standard Model cannot explain: why the universe contains any matter at all, what dark energy really is, and whether dark matter will ever show itself in a detector. Throughout, he holds a clear line between what has been measured and what remains a brilliant guess, making the boundaries of human knowledge unusually concrete.

## [00:00] Introduction

Lex Fridman opens by describing Don Lincoln as someone with Richard Feynman's rare gift for stripping complicated ideas down to their essential core without losing the brilliance inside them. The episode is framed as a tour through physics' deepest open questions, guided by a working experimentalist who has spent decades at the frontier.

## [00:49] Unifying the laws of nature

Lincoln frames the entire history of physics through one lens: unification. Newton showed that the moon falling toward Earth and an apple falling from a tree obey the same equation; "universal" was the operative word in his law of universal gravity. Maxwell did something structurally identical in the 1860s: electricity and magnetism, which looked nothing alike, turned out to be two faces of a single force, and their equations automatically predicted that light travels at a fixed speed. Lincoln draws the practical line from that abstract discovery to every modern technology: "without being able to govern electricity, we'd still be farmers and shoemakers." The conversation broadens into why fundamental research pays off centuries later, with Lincoln arguing that nuclear physics, incomprehensible in 1900, is now the most potent energy source available to civilization.
Lex adds the longer arc: mastery of antimatter or dark energy might one day enable propulsion systems that let humanity reach other star systems.

> *"It has spin-offs. And it has spin-offs. One of the big spin-offs is our entire technological society."*

## [15:20] Einstein, special relativity, and general relativity

Lincoln walks through Einstein's 1905 miracle year: special relativity rested on two premises: the laws of nature are the same for everyone, and everyone measures the speed of light as identical regardless of relative motion. That second premise sounds absurd but particle accelerators have confirmed it directly, watching photons emitted from fast-moving decaying particles still arrive at detectors at exactly *c*. Minkowski then showed that Einstein's equations implied space and time were components of a single object, spacetime. General relativity took one more step: Einstein noticed that being accelerated in a rocket and standing in a gravitational field feel identical, then worked out that gravity is not a force at all but the curvature of spacetime caused by mass. Lincoln credits Minkowski for the mathematical articulation but insists the conceptual leap, that *mass bends the geometry of space itself*, was Einstein's alone. He also defends Einstein's late-career skepticism of quantum mechanics as productive rather than blind: Einstein's critiques forced concrete predictions that experimentalists went out and confirmed.

> *"We all agree that your idea is crazy, but is it crazy enough?"*

## [32:27] Electroweak force

By the 1930s physicists had catalogued four forces: gravity, electromagnetism, the strong nuclear force, and the weak nuclear force. The last two only matter inside atomic nuclei, which is why most people have never encountered them. In the late 1950s and 1960s, Glashow, Salam, and Weinberg showed that electromagnetism and the weak force were the same at high energies: the electroweak force.
The catch was obvious: electromagnetism reaches across the universe (we see light from galaxies billions of light-years away) while the weak force barely reaches across a proton. How could they be the same? Lincoln uses a dropped pen to demonstrate: the Higgs field, postulated in 1964 by Peter Higgs and colleagues, permeates all of space. Particles that couple to it gain mass; those that do not, like the photon, remain massless. At the high temperatures of the early universe the Higgs field was zero, so nothing had mass and the forces were unified. As the universe cooled, the Higgs field switched on and broke that symmetry, giving the W and Z bosons mass and splitting the electroweak force into its two familiar components. The vibration of the Higgs field itself is the Higgs boson: an experimentally detectable excitation of an otherwise invisible field.

> *"In the Higgs field, the vibration is the Higgs boson. And so what we can do is not see the field, but we can actually excite the field, make it vibrate and detect the vibrations."*

## [44:09] How particle colliders work

E=mc² is not just a slogan: kinetic energy can be converted into mass. Smash two particles head-on with enough energy and the collision region can materialize entirely new particles, always in matter-antimatter pairs. This is what colliders do. Lincoln describes the cascade of accelerators at Fermilab (five machines feeding into each other like gears of a manual transmission) and the scale of the LHC's CMS detector (70 feet long, 14,000 tons, photographing collisions 40 million times per second). The data-reduction challenge is equally striking. The LHC produces about a billion proton-proton collisions per second. Fast electronics discard all but 100,000 per second, commercial processors trim that to 1,000, and those 1,000 records are handed to graduate students hunting for the handful that might be Nobel Prize material.
Lincoln reserves particular admiration for the engineers who move petabytes of data around the world seamlessly, calling them the unsung heroes of modern physics.

> *"Of the 50 million possible collisions per second, the fast electronics and then the computers pick the thousand, and then we pass those through analysis software and hand them to the graduate students."*

## [62:12] Higgs boson discovery

Lincoln was simultaneously working at Fermilab's Tevatron and transitioning to CERN's LHC, a physicist wearing two hats and rooting for both. Fermilab had methodically ruled out most possible Higgs mass ranges; by mid-2012 they had narrowed it to between roughly 120 and 145 GeV. Two days before CERN's July 4 announcement, Fermilab confirmed that if the Higgs existed, it had to be in exactly the region Fermilab had not yet been able to rule out. CERN got there first. Lincoln is careful about what the 2012 announcement actually meant: a particle *consistent with* the Higgs boson. Supersymmetry predicted five Higgs bosons rather than one. Only in the years since, through measurements of spin (zero), decay products (bottom quarks, W and Z, photons), and their rates, has the evidence converged on Peter Higgs's original 1964 prediction. The Higgs was not a revolution like Einstein's work, Lincoln argues, but it was the final punctuation on 50 years of experimental discovery: the Standard Model, while incomplete, is mostly right as far as it goes.

> *"It was a punctuation point, end of about 50 years of discovery and searching, where we finally were able to say the Standard Model, while incomplete, it's mostly right as far as it goes."*

## [72:32] Theory of everything

The Grand Unified Theory (GUT) aims to merge the electroweak force and the strong force; a Theory of Everything would then fold in gravity. Lincoln is blunt: he does not see fast progress.
The unification energy scale is roughly 10¹⁵ times higher than what the LHC can reach, and accelerator energy grows by only a factor of seven every 20 years. Extrapolating that curve suggests 500 years, and Moore's Law does not hold forever. His critique of string theory is not that it is wrong but that it is currently untestable. It uses approximate solutions to approximate equations, and its landscape of possible universes renders it practically unpredictive. Loop quantum gravity is better developed and makes testable predictions: its original claim that light speed should depend on wavelength was ruled out by gamma-ray burster observations, and the theory was revised. Lincoln's preferred path to a ToE is not extrapolating from current theory but making precise measurements of phenomena that already disagree with predictions. His analogy: an Australopithecus in Kenya trying to predict the Alps, Antarctica, and sperm whales from their local savanna. The farther you extrapolate beyond what you can measure, the more the prediction diverges from reality.

> *"I think it is the absolute pinnacle of arrogance to think that what we can do, predict it out a quadrillion times higher than we can see now."*

## [102:17] Physics of empty space

"Empty" space is not empty. Quantum field theory says every species of particle has a corresponding field that fills all of space, and those fields are always vibrating. When they vibrate in a characteristic way, a real particle appears; off-frequency vibrations are virtual particles: fleeting excitations that have measurable consequences. Two experiments confirm this. The Casimir effect: two metal plates placed micrometers apart are pushed together by the pressure difference between constrained virtual particles inside the gap and unconstrained ones outside.
The anomalous magnetic moment: old quantum mechanics predicts one value for the electron's magnetic moment; including the bath of virtual particles surrounding a bare electron shifts the prediction by 0.1%, and that shifted prediction matches measurement to 10 significant figures.

> *"We have measured the magnetic properties of both the electron and the muon to 12, count them, 12 significant figures. And the theory and the data agree number for number for 10 places."*

## [109:41] Antimatter

Paul Dirac's 1928 attempt to merge quantum mechanics with special relativity produced an equation with two solutions: +1 was the electron, −1 was something nobody had seen. He insisted the math was right. Carl Anderson confirmed it in 1932 by photographing a positron in a cloud chamber. Today CERN can make and trap antimatter hydrogen, cool it to near absolute zero, agitate it with lasers, and measure its spectral lines; they match ordinary hydrogen exactly. A 2023 experiment released antimatter hydrogen atoms into a bottle and found they fall downward, consistent with normal gravity, though the measurement precision is not yet tight enough to confirm the gravitational strength is identical. The deeper mystery is why the universe is made of matter at all. Counting galaxies versus cosmic microwave background photons, physicists infer that for every billion antimatter particles in the early universe, there were a billion-and-one matter particles. The billions annihilated; that extra one is everything we see. Fermilab is now testing whether neutrinos and antineutrinos oscillate between flavors at slightly different rates, which would support leptogenesis as a possible mechanism, racing a parallel effort in Japan.

> *"For every billion antimatter particles that existed in the universe, there were a billion and one matter particles.
The billions canceled, annihilated, destroyed each other, and that extra one that's left over is us."*

## [130:31] Dark energy

In 1998, astronomers expected to measure how fast gravity was braking the expansion of the universe. They found the expansion is accelerating instead. The driving force is dark energy, a repulsive form of gravity. Einstein had added exactly this term to his field equations in 1917 to keep the universe static, then removed it when Hubble showed it was expanding. In 1998 it went back in. What dark energy actually is remains unknown. The most common view is that it is the energy density of space itself. The problem is that quantum field theory predicts a vacuum energy density about 10¹²⁰ times larger than what is observed: the worst prediction in physics. Lincoln notes that if dark energy has constant *density* while space expands, total dark energy is growing, which pushes toward the view that space is quantized: new quanta of space appear as the universe grows, each carrying a fixed energy, producing constant density as an emergent property.

> *"There is very clearly something going on, something very badly wrong in the quantum field theory."*

## [134:20] Dark matter

Galaxies rotate too fast. Galaxy clusters move too quickly. Gravitational lensing of distant galaxies is stronger than visible matter can explain. Three independent observations all point to the same conclusion: there is roughly five times more mass in the universe than we can see. Lincoln traces his own intellectual journey: 25 years ago he suspected the problem was with Newton's laws; two observations changed his mind. The Bullet Cluster, two galaxy clusters that passed through each other, shows gravitational distortions following the galaxies, not the gas clouds that stopped in the middle, exactly what dark matter predicts.
The Dragonfly galaxies (DF2 and DF4) rotate exactly according to Newton's laws because they appear to have had their dark matter stripped away — a galaxy *without* dark matter is actually strong evidence that dark matter is real. Despite 30 years of searching with three approaches — direct detection underground, gamma-ray searches near galactic centers, and missing-momentum signals at the LHC — no dark matter particle has been confirmed. The viable mass range spans from sub-electron to asteroid scale, and experiments can only cover one slice of that range at a time, which is why Lincoln is not currently running a dark matter experiment himself. > *"We've ruled out some dark matter particles, but the problem is the range of space of possible mass — it ranges from something like the mass of an asteroid to far lighter than an electron and everywhere in between."* ## [162:56] Future of physics Lincoln grew up poor in rural America, shaped by science fiction and the popular science books of Isaac Asimov, Carl Sagan, and George Gamow. He chose particle physics over cosmology in the mid-1980s because particle physics let him actually measure things. He worked 8 a.m. to midnight Monday through Saturday as a graduate student not out of obligation but because he could not imagine anything he would rather be doing. His science communication — YouTube videos, popular books — is a deliberate attempt to reach the kid in Iowa or Montana who has no highly educated family mentors but the same hunger he had. He has already heard from Fermilab summer interns who came because they watched one of his videos. Lex closes with Marie Curie: *"Nothing in life is to be feared. 
It is only to be understood."* > *"One of your viewers might be one of the people who answer these questions that have stymied very smart people for decades."* ## Entities - **Don Lincoln** (Person): Senior scientist at Fermilab; co-author on the 1995 top quark discovery paper; CMS collaboration member at LHC; author of *Einstein's Unfinished Dream* and multiple popular science books. - **Lex Fridman** (Person): MIT researcher and host of the Lex Fridman Podcast; conducts long-form interviews at the intersection of science, technology, and philosophy. - **Fermilab** (Organization): U.S. Department of Energy particle physics laboratory near Chicago; operated the Tevatron collider; currently the world's most powerful neutrino beam facility. - **CERN / LHC** (Organization): European particle physics laboratory home to the Large Hadron Collider; CMS and ATLAS detectors; site of the 2012 Higgs boson discovery. - **Standard Model** (Concept): Quantum field theory describing three of four fundamental forces and all known elementary particles; validated to extraordinary precision but does not include gravity or explain dark matter, dark energy, or the matter-antimatter asymmetry. - **Higgs field / Higgs boson** (Concept): A scalar quantum field whose non-zero vacuum value gives mass to the W and Z bosons while leaving the photon massless; the Higgs boson is its detectable excitation, discovered July 4, 2012 at CERN. - **Dark matter** (Concept): Invisible mass accounting for roughly 85% of all matter in the universe, inferred from galaxy rotation curves, cluster dynamics, and gravitational lensing; no candidate particle detected after 30 years of searches. - **Dark energy** (Concept): The repulsive energy driving the accelerating expansion of the universe; quantum field theory's prediction for its magnitude is 10¹²⁰ times larger than observation — the "worst prediction in physics." 
- **Baryogenesis / Leptogenesis** (Concept): Frameworks attempting to explain why the early universe produced a matter excess; Fermilab's neutrino program is testing leptogenesis by comparing neutrino and antineutrino oscillation rates. - **String theory / Loop quantum gravity** (Concept): Leading candidates for a theory of quantum gravity; string theory's distinctive predictions lie at energies roughly 10¹⁵ times beyond what experiments can reach; loop quantum gravity quantizes space itself and has produced some falsifiable predictions.

#particle-physics #dark-matter #dark-energy

The Rule for Picking AI Winners | The a16z Show

33:09


a16z, about 1 month ago


David George (a16z general partner) and David Clark (VenCap CIO) argue that AI companies are scaling faster than any prior technology generation (Anthropic and OpenAI are adding more monthly revenue than Meta, Google, or Microsoft) while actual diffusion into the broader economy remains below 5%. They work through what that gap implies for exit sizes, loss ratios, bubble risk, and who ultimately captures value as token costs fall and frontier intelligence becomes a commodity. ## [00:00] Intro Three data points open the episode: Anthropic and OpenAI already adding more revenue per month than any hyperscaler; the top-1% exit threshold more than tripling in 24 months, from $10 billion to $32 billion, with a 10x move to $100 billion in sight; and David George's assessment that, right now, we are not in a bubble. ## [00:38] The Scale Shift: Anthropic & OpenAI Adding More Revenue Than Hyperscalers David George explains how his priors shifted sharply around November 2025. Before that, enterprise AI looked like a productivity story analogous to cloud adoption. After it, the numbers reframed the ceiling: Anthropic and OpenAI are already adding revenue at hyperscaler rates with less than 5% of the economy actually using these tools. He sizes the opportunity by noting that Fortune 500 companies generate roughly $2 trillion of profit annually, and the two largest model companies could reach a $200 billion revenue run rate by year-end, already equivalent to 10% of that profit pool. > *"If you pair that up with the fact that they're already getting bigger in terms of revenue added than the hyperscalers, and you're at less than 5% diffusion into the economy, I think the outcomes are going to be extraordinary."* ## [04:20] Skeuomorphic vs Native AI Applications in the Enterprise David Clark invokes Chris Dixon's skeuomorphic-to-native arc: the first wave of enterprise AI lets people do existing jobs faster; the native wave restructures the work itself.
George adds a wrinkle — the best companies are not yet focused on internal automation. Their top engineers want to build product, not automate back-office workflows. The most cutting-edge firms he visits are still in a "documentation phase," converting institutional knowledge into markdown before they can meaningfully deploy agents against it. > *"The most cutting-edge folks inside those companies who are trying to do this that I've talked to are kind of in the documentation phase — just turn everything into markdown files, have as much context capture as you can possibly get."* ## [06:24] How the Best AI Companies Run Themselves Differently Native AI founders operate on a different metabolism. George contrasts them with the previous SaaS generation, which, in hindsight, ran inefficiently but got away with it because headcount mandates and expanding software budgets covered the slack. The new companies are lean, aggressive, and already running agent swarms rather than typing commands. He describes walking into a cutting-edge AI company and finding researchers whispering into microphones, orchestrating swarms of agents — not a keyboard in sight. > *"The new companies are very lean, very aggressive, and they work all the time."* ## [08:14] Top 1% Exits 10X'd in 24 Months Clark lays out VenCap's tracking data: the threshold for a top-1% exit was $10 billion between 2020-2024, rose to $20 billion by February 2026, and was updated just the day before this recording to $32 billion. With OpenAI and Anthropic IPOs potentially arriving, he sees the bar hitting $100 billion by September. George notes that the combined market cap of these private companies likely already exceeds the entire Russell 2000, and that the sum of all VC-backed IPOs over the past six years is probably smaller than any single one of the three expected large IPOs. > *"Where is the threshold for the top 1%? 
And if you then think about OpenAI and Anthropic coming in, potentially we could be north of $100 billion by September."* ## [11:17] The Half-Life Problem: Why 40% of AI Leaders Drop Off Every Year Clark surfaces a disturbing churn metric: 40% of companies on the Forbes AI 50 list from one year disappeared the next. Google wasn't the first search engine; Facebook wasn't the first social network. First-mover advantage in AI is eroding faster than in any prior cycle. George confirms a16z's own priors have been repeatedly overturned — first convinced model companies would be everything, then convinced applications would take over, now watching the model companies extend back up into the application layer. The only durable heuristic he offers: a company must be in the token path. > *"From last year to this year, 40% of the companies that were on that list last year dropped off."* ## [13:11] Token Path, Cost Pressure & Who Captures Value Enterprise buyers are already feeling cost pressure from AI spend, and they cannot cover it by cutting previous-generation software budgets fast enough. George frames value capture as hinging on one largely unknowable variable: the market structure of frontier model labs. Two labs at the frontier means higher token prices and faster labor restructuring pressure; five labs means lower prices and a broader application ecosystem. Per-token cost for like-for-like capability is falling more than 10x year-over-year, but total token spending in dollars is rising faster. Clark adds that Chinese LLMs are roughly six months behind US frontier capability but ten times cheaper — a classic innovator's dilemma setup. 
> *"The biggest driver of where value is going to get captured right now is something that is totally unknowable, which is what is the market structure of the model companies?"* ## [17:00] Loss Ratios, Risk & How We Think About Early Stage Clark notes that historical early-stage VC loss ratios run around 60%, but the AI cohort of the past two years shows single-digit loss rates — unsustainable by definition. George reframes the discussion: a16z does not target a low loss ratio. A VC firm bragging about never losing money is "a horrible data point" — it signals too little risk-taking. The philosophy is to back the market-leading founder in every space with strong tailwinds and a credible technology. If the space works out and you have the leader, excellent. If the space does not work out but you have the leader, that is expected. The failure mode is the space working out while having backed the wrong company. > *"We joke all the time — there's a prominent VC in our ecosystem, and one of his big points of pride is he's never lost money on a deal. And we're like, that's not a point of pride. Like that's a horrible data point."* ## [22:51] Are We in an AI Bubble? Clark points out that classic bubbles are characterized by excess supply destroying economics — but right now the constraint is supply scarcity: no data center capacity available at scale until late 2028 or early 2029, with the US buildout running a year behind schedule and community resistance adding further delay. George is confident there is no bubble today and dismisses the data center opposition directly. The one scenario he would watch for is an unexpected algorithmic breakthrough producing dramatically smaller and more efficient models — which could flip supply from scarce to oversupplied — but he considers that unlikely in the near term. > *"I feel pretty confident saying that we're not in a bubble right now. 
I'm less confident that we won't be in a bubble three years from now."* ## [27:36] What SpaceX, OpenAI & Anthropic IPOs Mean for Public Markets Clark asks whether public markets can absorb the coming wave of trillion-dollar-plus IPOs. George argues it is unambiguously positive: the number of public companies has halved over 20 years, and outside the data center supply chain, almost nothing in the public markets is growing at more than 30% today. Bringing hypergrowth companies into indexes gives retail investors — including his parents' index-fund retirement accounts — exposure to the most dynamic part of the economy. He expects some portfolio reshuffling to make room, but does not see indigestion risk. > *"If you exclude the data center supply chain stuff right now, there are very few companies that are growing fast that are available for people to buy in the public markets."* ## [29:59] The Future of Venture Capital in an AI World George forecasts the shape of VC over the next five years as primarily a function of token market structure — whether the labs remain concentrated or become commoditized. He cites Bill Gates's platform axiom: a platform's value is validated when the companies built on top of it collectively exceed the platform's own value. If that holds, there will be a massive wave of valuable application companies built on intelligence. He also flags the consumer side as the most underappreciated opportunity: the last decade of consumer internet was a story of time spent getting captured by large incumbents; AI-driven shifts in consumer attention could recreate the conditions for generational consumer companies. 
> *"I'm very optimistic that we're going to have a massive wave of really valuable companies that get built on top of tokens, AI, and intelligence."* ## Entities - **David George** (Person): General partner at a16z; covers growth-stage and early-stage AI investing; invested in OpenAI pre-ChatGPT - **David Clark** (Person): CIO at VenCap; fund-of-funds investor tracking AI startup performance and VC market dynamics for 34 years - **Anthropic** (Organization): Frontier AI lab; cited as adding more monthly revenue than hyperscalers alongside OpenAI - **OpenAI** (Organization): Frontier AI lab; benchmark for scale and the expected $100B+ IPO cohort - **VenCap** (Organization): Fund-of-funds investor; publishes top-1% exit threshold data and tracks Forbes AI 50 churn - **Andreessen Horowitz / a16z** (Organization): Venture capital firm; investor in OpenAI pre-ChatGPT, scaling platform services to support companies encountering enterprise-scale problems early in their lives - **Cursor** (Software): AI coding tool cited as an example of a company reaching billions in revenue while still very small and early-stage - **Token path** (Concept): a16z's primary heuristic for evaluating AI companies — a company must sit in the flow of AI inference tokens to have durable economic relevance - **Skeuomorphic vs. native AI** (Concept): Chris Dixon's framework distinguishing apps that replicate existing workflows with AI assistance from apps that rearchitect work around AI capabilities natively - **Half-life problem** (Concept): David Clark's term for rapid AI leader turnover — 40% of Forbes AI 50 companies dropped off the list year-over-year — indicating first-mover advantage is eroding faster than in prior technology cycles

#ai-investing #venture-capital #large-language-models

Neuralink's DJ Seo: Inside the Race to Connect Brains and AI

24:59


Sequoia Capital, about 1 month ago


At AI Ascent 2026, Neuralink co-founder and president DJ Seo sits down with Sequoia partner Shaun Maguire to lay out exactly where the company stands: 20-plus Telepathy patients controlling computers and robotic arms through pure thought, Blindsight in preclinical testing and potentially cleared for human use by end of 2026, and a first-principles manufacturing philosophy borrowed from Elon Musk that treats surgical robots the way SpaceX treated reusable rockets. DJ argues that the real ceiling of this technology is not cursor control or speech synthesis but direct, uncompressed, multimodal transfer of concepts — AI as a neocortical layer sitting above the human limbic system — and that scale, the same variable that unlocked the LLM era, is the only remaining gate. ## [00:00] Introduction Shaun Maguire opens the session by announcing a two-minute Neuralink patient video before the interview begins, telling the audience to stay on the side because what they are about to watch is proof that the company has already cleared the hardest bar: restoring human agency to people who had lost it entirely. ## [00:21] Telepathy Patient Stories The video narrates four patients whose lives changed after receiving the Telepathy implant. A quadriplegic patient describes moving a cursor with thought alone — "I'm thinking and a cursor is moving on a screen. It blew my mind." An ALS patient who lost the ability to speak regains a digital voice through the implant: "I'm talking to you with my mind." Another patient notes that the implant flipped how his child sees him: "I am not able to do things that other dads can, but now he thinks it's so cool that I can do things that other dads cannot." > *"Before the implant, I was locked in, non-verbal, quadriplegic. 
Now I control my computer just by thinking and the rewards have been immense for me."* ## [01:06] Convoy Robotics Independence The video shifts to Convoy, Neuralink's assistive robotics team, which is extending BCI control beyond a screen to physical manipulation in the real world. A patient who had been losing motor function moves a robotic arm through its axes using only neural intent: "It was incredible to be able to just gesture with an arm again." A second patient, Kenneth, who was losing his voice to ALS, uses the system's speech synthesis to speak aloud in real time during the video — words generated by his brain signals rather than his vocal cords. > *"Gaining functionality that I thought was gone forever was so incredibly life-changing."* ## [02:04] Blindsight Vision Restore The video previews Blindsight, Neuralink's second product line, designed for patients who have lost both eyes or optic nerve function. An external camera captures the visual scene; the device writes the signal directly into the visual cortex via electrical stimulation, generating phosphenes — artificial pixels of light. A patient named Audrey, asked how it feels, answers simply: "Life-changing." The video closes with the line "all with my mind" spoken over footage of a patient interacting with the world through the restored signal. > *"The future of this technology feels almost unlimited... we are finding ways to apply it across all regions of the brain."* ## [03:10] After Video Reflections DJ Seo, visibly moved after watching the video alongside the audience, speaks first: "We were cracking a lot of jokes before that video, but honestly, that brought tears to my eyes." He describes the work as one of the most inspiring projects in the world — not because of the technical milestone but because the team is giving back capabilities that patients had already grieved as permanently lost. Maguire affirms the sentiment before pivoting to the founding story. 
> *"This is one of the most inspiring projects in the world. It's incredibly difficult what they're doing and I mean, they're truly saving people."* ## [03:31] Origin Story And AI DJ traces Neuralink's founding insight to a single bottleneck: the mismatch between human output bandwidth and AI capability. In 2016, saying that out loud "sounded insane," but the logic has not changed. His personal path ran through a childhood fascination with the brain, undergraduate work at Caltech building miniaturized low-power electronics, and a Berkeley PhD focused on shrinking lab-grade neural systems down to something deployable. When he met Elon Musk near the end of his PhD, the scale and ambition of the project made refusal impossible. He frames the brain as "the most interesting compute that we all carry" and "the only form of general intelligence that we know to date." > *"Really the key insight back then was sort of the IO bottleneck between the human output and AI capabilities."* ## [06:31] Scaling And Vertical Integration Maguire presses on what smart people most misunderstand about Neuralink: many know the implant and the decoding algorithm, but almost nobody grasps the manufacturing and surgical-robot infrastructure the company built in parallel from day one. DJ attributes this to what he calls "Elon magic" — an insistence on vertical integration that gives Neuralink control over every layer from chip design to factory floor to robotic surgery deployment. The target is not a niche medical device; it is LASIK-scale surgery available to millions. Building that capacity first means progress looks slow until "the iceberg pops over the waterline" and ramp becomes near-instantaneous. 
> *"Vertical integration is something that is really the lifeblood of Neuralink and Elon companies and what really enables us to have that fast iteration loop from design, develop, deploy."* ## [09:27] Caregivers And Purpose Asked which patient story inspires him most, DJ refuses to pick one — the power, he says, is not only in the patients but in the caregivers: Nolan's mother Mia, Brad's wife Tiffany, Ken's wife Cheryl. He describes their presence as "a really powerful human story of love, sacrifice, and resilience." He then takes what he calls a philosophical tangent: his core belief is that fulfillment comes from helping others, because the gap between self and other is not categorically different from the gap between your present and future selves. That belief is what he says keeps him and much of the Neuralink team going — they are "igniting a fire of hope" for people who had given up on recovering what they lost. > *"I personally and as well as many others at Neuralink find extreme fulfillment being able to help those that really cannot help themselves."* ## [13:10] BCIs Meet AI Future Maguire asks the room's core question: how do BCIs and AI converge? DJ sketches a two-horizon answer. Near term, the system translates neural intent into legacy interfaces — keyboard, mouse, language — which is already working. The real breakthrough, which he thinks is "not super distant," is bypassing those legacy interfaces entirely and computing on raw neural intent. He points to transformer architectures as existence proofs: nothing prevents them from learning the latent manifolds of neural data given sufficient scale. Neuralink is already fine-tuning LLM-class models on neural recordings from its 20 participants and finding "very counterintuitive" patterns. The ultimate ceiling he names is "direct, uncompressed, high-fidelity, multimodal transfer of concepts" — the Matrix's "I learned kung fu" moment and possibly beyond it. 
He also shares what he calls a clarifying lesson from working with Musk: "all green light schedule" — a first-principles forcing function that strips every man-made bottleneck and asks how fast something could actually be built if every light were green. His estimate is that 80–90% of perceived constraints in hardware development are artifacts of convention, not physics. > *"I think if you really think about the ultimate ceiling of this technology, it's really direct uncompressed high fidelity and multimodal transfer of concepts."* ## [21:05] Audience Q&A Wrap Three audience questions in the final four minutes. On product sequencing — when to go deep versus expand — DJ explains the "beachhead and expand" strategy: build everything generalizably enough from the start so that regulatory approval for motor cortex becomes a template for visual cortex and beyond. The first approval is the hardest; every subsequent one rides the clinical safety record already established. On augmentation for healthy users, DJ frames everything around benefit-risk: the calculus is obvious for quadriplegic patients; for otherwise healthy users it remains unclear, but he notes that off-label use after approval is legally available to anyone who can find a neurosurgeon and pay out-of-pocket. On the hard problem of consciousness, he gives a pointed one-liner: if you can inject new senses and measure the subjective response quantitatively, you may have a pathway toward measuring consciousness itself. Maguire closes by calling Neuralink "one of the most inspiring companies in the world." 
> *"If you are able to inject new senses, there may be ways to quantitatively understand that."* ## Entities - **DJ Seo** (Person): Co-founder and president of Neuralink; PhD in miniaturized electronics from Berkeley; joined after meeting Elon Musk near the end of his doctorate - **Shaun Maguire** (Person): Partner at Sequoia Capital; host of the AI Ascent 2026 fireside session - **Elon Musk** (Person): Co-founder of Neuralink; originator of the "all green light schedule" and vertical integration philosophy carried across Tesla, SpaceX, and Neuralink - **Neuralink** (Organization): BCI company founded in 2016; products include Telepathy (motor prosthesis) and Blindsight (vision restoration via visual cortex stimulation) - **Telepathy** (Software): Neuralink's first commercial product; allows paralyzed patients to control computers and robotic devices through neural intent decoding - **Blindsight** (Software): Neuralink's second product line; restores vision for patients with total loss of eyes or optic nerve by writing directly to the visual cortex; in preclinical testing as of mid-2026 - **IO Bottleneck** (Concept): The mismatch between human output bandwidth (speech, typing, gesture) and AI processing capability; the founding problem Neuralink was built to solve - **Neural Foundational Model** (Concept): LLM-class transformer models fine-tuned on neural recording data; Neuralink is building these at 20-participant scale and observing counterintuitive patterns in neural latent space - **All Green Light Schedule** (Concept): Elon Musk's first-principles engineering discipline — strip every man-made constraint and ask what physics alone limits; DJ estimates 80–90% of hardware delays are conventional, not physical

#brain-computer-interface #neuralink #ai

10:30


Every, about 1 month ago

Why Opus 4.8 Pulled Me Back to Claude

Dan Shipper, CEO of Every, delivers a day-zero vibe check on Opus 4.8, arguing Anthropic could have called it Opus 5. The model jumps 30 points past Opus 4.7 on Every's Senior Engineer benchmark, edges out GPT-5.5, tops their internal writing tests at 79.6 vs. 73, and is the first model to produce a genuinely good one-shot slide deck. Two catches temper the enthusiasm: performance degrades sharply below "extra high" reasoning, and the Claude desktop app remains cluttered compared to Codex. ## [00:00] What is Every Every is a 30-person applied AI lab for the future of work—part media outlet, part product studio. Dan opens by explaining the subscription (writing, courses, AI-built tools all in one place at every.to) before rolling into the Opus 4.8 assessment. The plug is brief and context-setting: the team has had beta access for a week, and the rest of the video is what they found. > *"Every is the only subscription you need to stay at the edge of AI."* ## [01:07] Anthropic Is Back: The Headline Case for Opus 4.8 Dan had largely abandoned Claude after Opus 4.7—slow, hard to love, and outpaced by Codex and GPT-5.5 in day-to-day use. Even the most loyal Claude users at Every had started routing work elsewhere. Opus 4.8 breaks that pattern: it scores 63 on Every's Senior Engineer benchmark (30 points above Opus 4.7, one point above GPT-5.5), tops their writing tests, and produced the first one-shot slide deck Dan has called genuinely good. Kieran Klaassen, Every's GM, called it "the most human model he's worked with." The one persistent friction is the Claude desktop app itself. Codex is fast, focused, and ships a clean harness; the Claude app still feels like a product built by three separate teams—chat tab, code tab, co-work tab, each with its own feel. Dan is now splitting time between both apps, which he was not doing before. 
> *"But honestly, they could have called it Opus 5 cuz this is a really great model."* ## [05:02] Reach Test: Paradigm Shift Ratings from the Every Team Every's reach test asks one question: do you actually open this model when work gets hard? Dan rates Opus 4.8 gold/green—paradigm-shift quality, docked one notch because the Claude app harness is only "okayish to pretty good." Kieran, who runs 50 agents a day, gives a straight gold paradigm-shift, one of the rarest grades the team has assigned. Katie Parrot, a senior staff writer and historical Claude fan, lands at green, splitting her work between Opus 4.8 and Codex. > *"It's very rare to give a paradigm shift grade to a model. So I would pay attention to this."* ## [06:32] Benchmarks: Coding and Writing Numbers On coding, Opus 4.8 hits 63 on the Senior Engineer benchmark—the test feeds the model a vibe-coded codebase and asks it to rewrite from first principles, then scores against two human senior engineers who completed the same rewrite (typically scoring in the 80s–90s). GPT-5.5 sits at 62. On Kieran's LFGbench (real-world tasks: SaaS build, e-commerce site, 3D game landscape), the model writes readable code that bridges technical competence and creativity—the "cozy island" 3D scene is notably richer and more vibrant than GPT-5.5's output. On writing, Opus 4.8 scores 79.6 out of 100 on Every's internal benchmark (intro writing, promo emails, mid-piece paragraphs); GPT-5.5 scores 73. The gap is mainly in AI tells: at high and extra-high reasoning settings, Opus 4.8 produces prose that sounds less like a model. It matches a writer's voice from a single paragraph of context better than any other model Dan has tested. > *"Opus 4.8 scores a 79.6 out of 100 on the writing benchmark. GPT 5.5 is 73."* ## [08:57] Emotional Intelligence, Knowledge Work, and the Verdict Dan uses the model for interpersonal and management work—talking through decisions, pressure-testing his own framing. 
Opus 4.8's thinking traces show it genuinely cycling through permutations before responding, which makes it feel less like a sycophant and more like a useful counterpart. On knowledge work, it's versatile: code and writing coexist cleanly in a single thread, and the slide deck result is the first one-shot deck Dan would actually send to someone. The verdict: if you're a Claude fan, this model delivers. If Codex converted you, add Opus 4.8 as a parallel tool for writing and knowledge work—it's worth the context switch. The harness gap is real, but the model itself is a banger. > *"If you've been converted to Codex, I highly recommend you at least add it as part of your arsenal."* ## Entities - **Dan Shipper** (Person): Co-founder and CEO of Every; presenter and primary evaluator of Opus 4.8. - **Kieran Klaassen** (Person): GM of Kora at Every; gave Opus 4.8 a straight gold paradigm-shift rating on the reach test. - **Katie Parrot** (Person): Senior staff writer at Every; rated Opus 4.8 green, split between it and Codex. - **Every** (Organization): Applied AI lab and media subscription company focused on AI for the future of work. - **Anthropic** (Organization): Developer of Claude and Opus 4.8. - **Opus 4.8** (Software): Anthropic's latest Claude model; subject of the vibe check. - **GPT-5.5** (Software): OpenAI model used as the primary performance comparison across all benchmarks. - **Codex** (Software): OpenAI coding agent; praised for its clean desktop harness and used as the daily-driver counterpoint to Claude. - **Senior Engineer Benchmark** (Concept): Every's proprietary coding benchmark—rewrites a vibe-coded codebase from first principles and scores against human engineers. - **LFGbench** (Concept): Kieran Klaassen's real-world coding benchmark covering SaaS, e-commerce, and 3D scene generation tasks.

#claude #opus-4-8 #llm-benchmarks

EMERGENCY DEBATE: They're Lying to Us About AI, the War with Iran, and What Comes Next

1:43:32


The Diary Of A CEO, about 1 month ago

Shark Tank investor Kevin O'Leary and Young Turks co-founder Cenk Uygur square off for 103 minutes over whether AI will free or sink the American economy, why the U.S.-Iran war drags on even though an exit deal seems obvious, and who has a real chance of winning in 2028. O'Leary argues for optimism from start to finish: AI creates jobs, the market always adapts, China is the real threat. Uygur hammers a single thesis without letup: the combination of AI-driven mass unemployment and a foreign policy shaped by the Israel lobby is steering the U.S. straight into the iceberg, and no institution is prepared for the impact.

## [00:00] Introduction

The opening clip sets the stakes of the debate bluntly. Uygur comes in without a warm-up: companies are racing to lay off 10-25% of their workforce to gain a competitive edge, and if the whole economy does that at once the result is a depression, not a recession. O'Leary's reply, "Wow. Cenk is a real buzzkill. What we're talking about is an incredible opportunity," sets the exact tone he will hold for the next hour and forty minutes. Steven Bartlett states his goal: to get to the truth through the clash of two serious minds, not a shouting match.

> *"Everyone is rushing to lay off 10 or 25% of their workforce, but 10% unemployment would be worse than anything we've lived through in our lifetimes."* - Cenk Uygur

## [02:35] Why 7 in 10 Americans Oppose AI Data Centers

Steven Bartlett opens with a poll: 7 in 10 Americans oppose AI data centers being built near where they live.
O'Leary points to a specific culprit: using forensic auditors and IRS Form 990 filings, he traced Chinese money flowing through a network called Arabella, via Neville Singham, into anti-data-center campaigns in Utah, which came with death threats against his executives. He handed 90 pages of IP data to the White House. Uygur dismisses the China theory and points to a simpler grievance: data centers have driven up electricity costs for churches, libraries, and community centers, as happened in Virginia, and the companies building them should either supply their own power or give up a public stake in return.

> *"I have irrefutable proof that China is interfering everywhere new energy is proposed in the U.S., in every state, in every city."* - Kevin O'Leary

## [07:24] Why AI Could Trigger a Collapse and a UBI Crisis

This is where Uygur's central economic argument takes shape. He agrees on the energy-cost problem and argues that any data center drawing on the public grid without compensation is corporate looting, citing the 2008 bailout as the playbook for what not to do. His bigger alarm is mass unemployment: if every company rushes to cut 10 to 25% of its workforce, the aggregate result is to destroy consumer spending and cause a depression. Sam Altman, Elon Musk, and Dario Amodei have all said publicly that mass job displacement is coming, yet no government has a plan. O'Leary counters that every technological disruption in the last 200 years of U.S. history created more opportunity than it destroyed, and that slowing AI development only hands the lead to China.

> *"When we hit the iceberg we're not going to be ready, and it will be an epic disaster. There won't be anyone to buy your products, because employees are also customers."* - Cenk Uygur

## [15:30] Are AI Founders Hiding the Real Risks from the Public?
Steven Bartlett reads out public statements: Sam Altman in 2021 saying AI will replace most jobs; Musk in 2024 saying probably none of us will have a job; and Amodei warning in 2025 that AI could eliminate half of all entry-level white-collar jobs within five years and push unemployment to 20%. The question: if the people building these systems openly say they will cause social harm, why assume they are exaggerating? O'Leary retrieves the other half of Amodei's argument: unless compute capacity is built within six months, China's DeepSeek catches up with us. The real choice is to lead the disruption or cede it to Beijing. Uygur accepts that the race is inevitable but insists that the programmers being laid off today are already hitting the iceberg, and that a UBI of $36,000 a year is a brutal cut for someone who was earning $120,000.

> *"Can we run the race in a responsible way that truly serves American voters and citizens instead of serving only AI company executives and their shareholders? I hope so, but we haven't taken a single step in that direction."* - Cenk Uygur

## [23:55] Can AI Be Developed Responsibly, or Is That Impossible?

Steven Bartlett presses for specifics on what developing AI responsibly means. Uygur offers his structural diagnosis: legalized bribery (Citizens United and Buckley v. Valeo) guarantees that the AI company that donates the most gets the regulatory framework it wants. Congress acts on behalf of donors, not voters. O'Leary argues that the jobs being lost are mostly bloat, positions that companies hired for speculatively, and that AI companies are burning billions, not pocketing them.
He offers his Utah data center as an example: 4,000 construction jobs over nine years, another 2,000 engineering positions, and not a single acre of farmland touched. Faced with Uygur's warning about socialism, O'Leary doesn't flinch: raise taxes above 50% and the rich leave for Monaco or Florida, as France found out.

> *"If nothing is done, the pitchforks will come. I'm not a pitchfork guy. I believe in nonviolence and always will. But I don't think people understand the level of rage that is building."* - Cenk Uygur

## [32:11] How AI Is Quietly Destroying Jobs

Steven Bartlett shares his own experience: he now hires entry-level candidates almost exclusively on the basis of their AI fluency, because a junior who uses it well is 5 to 10 times more productive, which in practice rules out those who are not fluent. O'Leary rejects the argument: engineers are hired to solve problems, not to write code, and AI just gives them a faster tool; most tech layoffs correct overhiring rather than reflect AI displacement. Uygur rejects that flatly: Wall Street analysts applaud every headcount-cut announcement as "synergies," stocks go up when you fire people, and nobody on those earnings calls asks who will buy the products once the workers are gone. He also flags an underrated risk: historically, large pools of unemployed young people correlate with crime and conflict.

> *"When you have a lot of young men without jobs hanging around, what usually happens is nothing good. There are wars, crime goes up. We have to be prepared."* - Cenk Uygur

## [37:35] Why Mass Unemployment Could Arrive Sooner Than Expected

Steven Bartlett describes a visit to a robotics accelerator in San Francisco where every team had moved from software to physical robots, because intelligence, once the scarce and expensive ingredient, now costs almost nothing. He asks both men to imagine where they might be wrong. O'Leary refuses to entertain the unemployment scenario and pivots to NASA's permanent Moon base and the Mars program as sources of hundreds of thousands of well-paid jobs. Uygur calls it "the interregnum problem": even if O'Leary's optimistic scenario comes true in 20 years, the 61-year-old assembly-line worker in Cleveland cannot retrain as a Mars engineer. Steven Bartlett adds that Uber's CEO told him privately that AI will replace 9.4 million of its drivers and, when asked what those drivers will do, answered: "I don't know."

> *"The parts of the robot have been here for decades. We've always had them. What we were missing, and what was expensive, was the intelligence."* - Steven Bartlett, quoting his co-founder

## [46:32] Ad Break

Sponsor block featuring Stan (an AI social media content tool), Pipedrive (CRM), and Cometeer (coffee). No substantive debate content.

## [48:40] What Is Really Happening Between Israel, Iran, and the Middle East

The debate turns to geopolitics. Steven Bartlett presents Trump's approval ratings in free fall and asks Uygur to explain the war. Uygur's answer runs for almost 25 minutes on a single thesis: the war serves Israeli interests 100% and American interests 0%.
He traces the $317 million the Adelson family donated to Trump's campaign as the financial mechanism, notes that the Israel lobby donates to 94% of Congress, with AIPAC the number one lifetime donor to Trump as well as to Biden, Hakeem Jeffries, Chuck Schumer, and Mike Johnson, and argues that Israel has outsourced seven wars to the U.S. since 9/11, with Iran the latest on the list. Iran, he says, has never had a delivery system capable of reaching the U.S., has never enriched uranium above 60% (weapons grade is 90%), and the former Grand Ayatollah issued a fatwa against nuclear weapons. Meanwhile, Israel has taken southern Lebanon, plans to keep it, and Netanyahu publicly demanded as a condition of peace that Israel retain the exclusive right to keep striking Lebanon, which makes any deal impossible. O'Leary frames the Iranian regime differently: 150,000 people who have brutalized 90 million for 60 years, a government that cannot be handed nuclear weapons, and a situation in which China's need for the Strait of Hormuz to stay open will end up forcing Beijing to pressure Tehran to give way.

> *"100% Israeli interest, 0% American interest. Let's get out of there. Let's stop fighting their wars and come home."* - Cenk Uygur

## [01:11:59] Did Trump Miscalculate How Long This Conflict Would Last?

Steven Bartlett asks O'Leary directly whether Trump underestimated the conflict. O'Leary calls it the first real "technology war": $35,000 carbon-fiber drones with lawnmower engines are being intercepted by U.S. missiles costing between $1.2 million and $3 million, a cost asymmetry that exposes a computing-capability gap the U.S. needs to close.
He does not foresee a ground invasion, only continued softening from the air until Iran's leadership calculates that the cost of blocking the strait, $210 million a day in lost revenue, outweighs the benefit. His prediction: China forces a deal before the U.S. midterm elections.

> *"It's expensive because we're on the wrong side of defense. We need the cheap drones."* - Kevin O'Leary

## [01:15:47] Ad Break

Sponsor block featuring Pipedrive (CRM) and the Diary of a CEO Conversation Cards. No substantive debate content.

## [01:18:08] Why the U.S. Is Rapidly Losing Patience

Steven Bartlett identifies the pressure point: if Iran's leadership knows Trump has only months before the midterms, and then the 2028 election, why negotiate now rather than wait for a weakened adversary? O'Leary adds a second constraint: China's supreme leader also needs the strait open to sustain his economy and his grip on power, so Iran serves two masters. Uygur argues that the deal is already drafted: Iran hands its highly enriched uranium to international monitors, the U.S. lifts the blockade, the strait reopens. But it collapses every time Netanyahu calls Trump and adds new impossible conditions: immediate disarmament, Iran joining the Abraham Accords. Uygur notes that every politician who publicly opposed the recent preliminary deal received more than $1 million from the Israel lobby. He extends the argument to the global scale: while Russia bleeds in Ukraine and the U.S. bleeds in Iran, China builds roads and bridges in Africa and Latin America, spending nothing on wars and gaining influence by contrast.

> *"After every call with Netanyahu, Trump goes from saying we're going to have peace to saying we're not going to have peace and there will be these new impossible conditions. It's happened about six times now."* - Cenk Uygur

## [01:29:08] Are We Watching the Rise of Socialism in Real Time?

Steven Bartlett presents Gallup data: Americans' positive view of capitalism is at a record low, 70% of Democrats view socialism positively, 62% of young Americans are favorable to socialism, and all of this before the economic effects of the war have been felt. O'Leary sees a cyclical phenomenon: every 17-20 years the U.S. flirts with socialism, and it always collapses when young idealists get their first paycheck and discover taxes. He notes that 52 cents of every sovereign wealth fund dollar on the planet goes to the U.S., not to Cuba or Russia. Uygur rejects the framing at the root: the U.S. already practices socialism for corporations, with oil subsidies for companies posting record profits, no drug price negotiation in Medicare, and every sector capturing its regulator through donations. The real project is to recover genuinely free markets, and that requires getting money out of politics first.

> *"We'd be lucky to get back to capitalism, never mind reaching socialism, because right now we don't have capitalism. We have crony capitalism."* - Cenk Uygur

## [01:34:06] Who Has the Edge Heading into the Next Presidential Election?

O'Leary declines to predict a winner but says Democrats need a moderate centrist and cites California as an example of the failure of progressive governance. Uygur surprises him with a specific prediction: Tucker Carlson is the only Republican who could win in 2028. Republican voter enthusiasm is already on the floor, the midterms are lost, and by 2028 the combined effects of AI unemployment and the Iran war will have fully materialized.
O'Leary laughs at first, then backtracks on air: Carlson has a massive social media base, runs his own network, and is taking increasingly independent positions, including on AI. Uygur closes by naming Ro Khanna as the progressive figure with the best chance of winning nationally and makes the case for democratic capitalism (private markets overseen by a functioning democracy, with northern Europe as a model that already exists) as against the corporatism practiced today and the socialism that is feared.

> *"They only have one candidate who could win, and it worries me, and that's Tucker Carlson. If Tucker runs in the Republican primary, he wins it for sure. You can quote me."* - Cenk Uygur

## Entities

- **Kevin O'Leary** (Person): Shark Tank investor and chairman of O'Leary Ventures; argues that AI creates opportunity, backs data center development, traces anti-AI activism to Chinese funding sources, and predicts China will force Iran into a deal before the midterms.
- **Cenk Uygur** (Person): Young Turks co-founder and progressive commentator; argues that no one is planning for AI unemployment, that U.S. foreign policy is shaped by the Israel lobby, and that the American political system is corrupted by legalized bribery.
- **Steven Bartlett** (Person): Host of Diary of a CEO; entrepreneur and investor who moderates the debate and contributes his own hiring decisions and observations from robotics labs, anchoring the debate in real business behavior.
- **AIPAC / Israel lobby** (Organization): Identified by Uygur as the number one lifetime donor to most leading U.S. politicians in both parties; central to his thesis on why the U.S.-Iran war continues despite a deal being ready.
- **Arabella / Alliance for a Better Utah** (Organization): Network that O'Leary claims is funded by China-linked entities to spread anti-data-center disinformation in U.S. states, which he says is documented in IRS Form 990 filings.
- **UBI (Universal Basic Income)** (Concept): Proposed safety net for workers displaced by AI; Uygur notes that even in the best case, a UBI of $36,000 a year is a devastating cut for workers who previously earned $120,000.
- **Strait of Hormuz** (Concept): Transit point for 48% of China's energy imports; its closure drives up global inflation, and reopening it is the central U.S. interest in any deal with Iran.
- **DeepSeek** (Software): Chinese large language model; O'Leary and Amodei cite it as proof that any pause in U.S. AI development gives China a decisive advantage within months.
- **Tucker Carlson** (Person): Former Fox News host turned independent media figure; Uygur predicts he is the only viable Republican candidate in 2028, a prediction O'Leary ends up not dismissing.
- **Democratic capitalism** (Concept): Uygur's preferred economic framework: private markets overseen by a functioning democracy; he distinguishes it from current U.S. corporatism and from European-style socialism.
- **Ro Khanna** (Person): Progressive political figure cited several times by Uygur as the only politician working on AI unemployment policy and the 2028 candidate closest to governing by democratic capitalism.

#ai-economy #unemployment #iran-war

Building an AI Guardian for Enterprises with Maxim Bar Kogan, CEO of Onyx Security

41:09

No Priors: AI, Machine Learning, Tech, & Startups, about 1 month ago

Sarah Guo talks with Maxim Bar Kogan, co-founder and CEO of Onyx Security, about what it really takes to secure AI agents at enterprise scale. Maxim argues that traditional controls (proxies, identity restrictions, human review) break down when agent actions multiply exponentially, and that the only viable path is to train small, specialized models that know when to escalate to a more powerful supervisor. The conversation covers Onyx's "secure control plane" product, the cost and latency arithmetic behind training in-house models, why the labs cannot credibly certify the safety of their own models, and Maxim's conviction that AGI is on its way and that independent AI oversight will be a business worth hundreds of billions of dollars.

## [00:00] Cold Open

Maxim starts in the middle of the action: the more companies use AI agents, the greater the risk of unwanted actions, such as agents that publish credentials by mistake, make unauthorized network calls, or take irreversible steps. Companies already know the wave of adoption cannot be stopped; what they lack is any mechanism for telling a legitimate action from an illegitimate one. The clip lays out Onyx's central thesis before the introduction begins.

> *"Companies are starting to realize that this risk is growing exponentially and that they have no way to slow adoption. All they can do is something to reduce the likelihood that those agent actions are illegitimate or wrong."*

## [00:45] Introducing Maxim Bar Kogan

Sarah introduces Maxim as co-founder and CEO of Onyx Security, an Israeli startup made up of researchers, mathematicians, and engineers, described as the company that builds agents to watch over AI agents.
The company combines offensive cyber experience with deep AI research, including work on synthetic data and mechanistic interpretability.

## [01:10] AutoGPT and the Bet on Agent Actions

For two years, the big risk in enterprise security was data leakage via chatbots: employees pasting sensitive information into ChatGPT. That framing has been overtaken by concern about autonomous agent actions. Maxim traces Onyx's founding bet to AutoGPT: the first agent that let an LLM decide what to do, call a tool, and repeat the cycle in a loop, rather than just generating text. The demo showed that agents could act autonomously in the real world, and Maxim concluded at once that someone would have to supervise those actions at scale.

> *"AutoGPT sparked everyone's imagination, including ours, because it was the first truly autonomous agent built on LLMs: an agent that didn't generate text but decided what to do and then had access to APIs to do it."*

## [05:17] What the Onyx Product Does

Onyx does two things: it trains models and builds agents that supervise other agents, and it packages that capability as a "secure control plane" that companies integrate into their AI infrastructure. The control plane monitors agent actions in real time to determine their legitimacy while managing the trade-off between latency, cost, and reliability. Maxim places the long-term vision beyond enterprise security: any company that uses AI agents needs a vendor-independent party to certify what those agents do.

> *"The number of these actions is growing exponentially. Things that used to seem useful, like a human in the loop, when you're going to have a hundred times, a thousand times, a million times more of these actions, that's not going to work."*

## [07:47] State of Deployment in Large Enterprises

In a typical large enterprise today, Maxim distinguishes three categories of AI deployment: low-code SaaS automations (drag-and-drop, not truly autonomous), in-house agents built internally or as customer-facing products, and autonomous coding agents and assistants. Of the three, coding agents already account for more than 50% of AI usage. The most mature sectors, financial services and healthcare, apply the strictest controls, but even the most cautious companies have stopped banning AI and moved on to managing it.

> *"More than 50% is accounted for by autonomous coding agents and assistants in the average enterprise."*

## [09:58] How to Secure Agents

Companies already spend about $100 billion a year on security: endpoint, network, cloud, identity. Sarah asks how much of that is useful for agent security. Maxim's answer: almost none. Identity controls, the most basic layer, fail because agents need broad, dynamic permissions that cannot be defined in advance. An agent that writes code across an entire repository or sends email on behalf of an executive cannot be confined to a narrow permission set the way a static software process can. The attack surface is intent, not access, and existing tools cannot read intent.

> *"With these autonomous AIs, with these assistants, with these coding agents, you can't know in advance what permissions to give them."*

## [12:45] Why Proxies Don't Work

Sarah's instinct, given her security background, is that this looks like a problem for a proxy with a smarter policy engine.
Maxim agrees that proxies work as an integration point in some architectures but says they sidestep the underlying problem. A proxy gives you the data flow; it does not tell you whether the action in that flow is legitimate. That judgment requires understanding context (the agent's goal, its history, what the company has authorized), and no rules engine can evaluate that for arbitrary agent behavior.

> *"The hard problem is understanding whether what I should do now is okay or not. In the case of AI systems, that is the hard question."*

## [14:11] Why Onyx Trains Its Own Models

The obvious solution, using Claude Code to monitor Claude Code, breaks down on cost and latency. Running a frontier-model agent for every enterprise agent would make the security layer more expensive than the AI it protects. Onyx's answer is small, highly specialized models that do exactly one thing: decide whether the current action warrants escalation to a more powerful supervisor. Sarah compares it to blitz chess: grandmasters play on intuition for the fast moves and only stop to think at the critical moments. Maxim says the analogy is apt: concentrate intelligence where the risk is highest and stay nimble everywhere else.

> *"You want to train models that are very good at just one thing. They're very small. They can hardly do anything other than say: 'Should a smarter agent review this?'"*

## [18:38] Onyx's Talent Culture

Israel's security talent, forged in units like 8200 and companies like Armis and Wiz, is well known. Onyx's DNA is different: co-founder Gil comes from synthetic data and Nvidia, not offensive cyber. Most of Onyx's research engineering comes from an Israeli intelligence unit that specializes in the intersection of mathematics and cybersecurity.
Maxim sees this combination as deliberate: the long-term problem Onyx is solving is not just enterprise security but how to control advanced AI in general. That demands deep AI expertise alongside security instincts. Israel as a whole is advancing quickly in AI: world models, AI infrastructure, chips.

> *"The problem isn't just cybersecurity. The problem is how we control advanced AI in the long run, and that problem, even if we set aside enterprise security breaches, seems very important."*

## [21:24] Mechanistic Interpretability

Maxim believes mechanistic interpretability, understanding what actually happens inside a model's weights and activations, is both possible and necessary. His counterintuitive thesis: as models become smarter than humans in key respects, they will be better equipped than we are to decipher the internal structure of other models. Onyx actively funds research in this direction, not only as a security tool but as a window into what intelligence itself is. Sarah backs the bet, pointing to the opportunity to understand not just AI but cognition more broadly.

> *"As we start to have models that are much smarter than us, at least in some important respects, we believe we'll be able to start cracking mechanistic interpretability much more effectively."*

## [23:35] How Onyx Builds Trust with Its Customers

Fortune 10 and 20 companies do not usually work with two-year-old startups of fewer than a hundred people. What breaks that rule is pain: CISOs facing daily incidents involving agent actions have no established vendor to call, because the problem did not exist three years ago.
Onyx gets inbound calls from companies that found it as it came out of stealth, because the problem description matched something they were already fighting. Maxim treats this as a narrow, temporary window: enterprise buyers know that new startups mature, and they would rather be early customers who shape the product than adopt it late.

> *"It's an opportunity that only arises when the pain is very intense. Their pain is so strong that they say: 'I just saw this company come out of stealth, but it's a problem I have every day, so I'm going to call them.'"*

## [25:10] Mitigating Risk at the Foundational Level

The second wave of concern among CISOs, beyond agent actions, is the collapsing cost of automated vulnerability research. Coding tools can already find and exploit vulnerabilities at a scale that only a few years ago would have seemed decades away. Maxim says the market is not overreacting: it is a genuine structural shift. The right response has two fronts: patch fast and apply mitigating controls now, and invest in foundational controls (hardened identity, firewalls, endpoint detection) that shrink the exploitable surface regardless of what the attacker's tools can do.

> *"The real solution, and every security leader at a large enterprise knows this, is that we need to have the foundational pieces in place to avoid those risks."*

## [27:45] Phased Rollout of Glasswing and Daybreak

On the controlled rollouts of Anthropic's Glasswing and OpenAI's Daybreak for more capable models, Maxim takes a conditional position. Gradual rollout is ideal if it is globally coordinated: it buys time to build playbooks, share knowledge, and avoid catastrophic failures in power grids or airlines.
But if any actor releases a model of comparable capability ahead of the gradual schedule, the phased approach becomes a liability: companies that did not get early access are left exposed to a threat they never had a chance to prepare for. His recommendation is to widen access broadly so that more organizations can build defenses in parallel.

> *"If someone gets to a method-level model sooner, in hindsight it would look like a huge mistake; at the very least we could have given companies the option to start moving very fast."*

## [29:11] Large Enterprises Still Holding Out

Two years ago, a significant group of large enterprises simply banned AI. Today Maxim hardly sees that. The financial sector still imposes restrictions (it allows agents but limits which tools), but outright bans have disappeared. He argues this is right: lock-in to a single vendor is itself a risk. Betting exclusively on one vendor's models in a market moving this fast means being exposed when the next generation reshuffles the rankings. Companies that allow broad tooling and manage it rigorously will outperform those that restrict aggressively.

> *"If you had bet on OpenAI a year ago, it would have been the safest bet in the world, but suddenly Anthropic has much better models and tools."*

## [30:46] Onyx and the AI Security Ecosystem

AI security is crowded with new vendors and new attack surfaces. Maxim's argument against anxiety over product scope: the two core primitives of AI in 2026, transformer-based models and agent loops with tool calls, have not fundamentally changed in years. That stability lets Onyx build for many agent applications while keeping its core technology nimble.
The real hedge against architectural shifts is to invest in researchers who can retrain and adapt quickly, rather than betting that any particular model paradigm will last forever.

> *"The two central pillars of how AI works in 2026 have not changed in the last few years. We are still largely using LLM-based models, and we are still building agents in practically the same way."*

## [32:36] Should the Labs Own Model Trust and Governance?

The pressing question in the tech ecosystem: will the labs end up absorbing the trust and governance problem themselves? Maxim's structural argument against: buyers do not want the car salesman certifying the car. Security teams need an independent party whose business model depends entirely on getting it right, not a vendor protecting the reputation of its own product. Beyond buyer psychology, Maxim draws a line between "jagged intelligence" errors (absurd mistakes that will improve with more powerful models) and intent-level failures: adversarial manipulation, misaligned objectives, goal drift. The labs will solve the first category. Only a structurally independent supervisor can address the second.

> *"You're not going to trust the vendor of a product to tell you that the product won't harm your environment. You're going to want an independent party whose entire business depends on telling you this is right, and on being right."*

## [36:56] What's Missing in Security

Sarah asks what the tech and research community in general, and the labs in particular, are missing from a security perspective. Maxim's answer: it is not a technical gap, it is an empathy gap.
Building security products requires understanding in depth how security teams actually operate: their organizational structure, responsibilities, and information flows. Israel produces strong security talent in part because military service gives engineers first-hand experience as the end user they later build for. The labs, he implies, develop capabilities without paying enough attention to the operational reality of the organizations that will have to deploy them and defend against them.

> *"No matter what technical problem you are solving, you are building a tool for people, for an organization that has a particular structure. Creating a product for this audience that not only solves the technical problem but that they truly love is very hard."*

## [39:14] Why Maxim Believes in AGI

Sarah closes by noting Maxim's implicit belief that human security teams will still exist for some years. He confirms it, but with a time horizon: security teams will run in fully AI-agent mode in the near term, like most knowledge work. His grounded version of AGI optimism is that the job of building great products does not change: you always have to know who the end user is and optimize for their experience. Right now that is a human with a few agents alongside. As the ratio inverts, the same principle applies, only to agents reading context windows instead of dashboards.
> *"Today when I sell a product I sell it to a human audience with a few agents, and as that audience becomes more agents than humans, it will be important that we evolve and make it work really well for the agents doing the work."*

## Entities

- **Maxim Bar Kogan** (Person): Co-founder and CEO of Onyx Security; former Israeli intelligence, with a background in mathematics and offensive cyber.
- **Sarah Guo** (Person): Host of No Priors; founder and GP at Conviction.
- **Onyx Security** (Organization): Israeli startup building AI oversight infrastructure; it trains small specialized models to monitor and govern enterprise AI agents.
- **AutoGPT** (Software): First open-source autonomous LLM agent; cited by Maxim as the inflection point that made agent risk concrete.
- **Glasswing / Daybreak** (Software): Controlled-rollout programs from Anthropic and OpenAI, respectively, for access to frontier models.
- **Mechanistic Interpretability** (Concept): Research program aimed at understanding the internal structure of neural network weights and activations; Onyx treats it as a long-term pillar of AI oversight.
- **Secure Control Plane** (Concept): Onyx's product category: a vendor-independent layer that monitors agent permissions, action legitimacy, and behavioral history in real time.
- **8200** (Organization): Israeli intelligence unit widely recognized for producing Israel's top technology and security talent, including many Onyx engineers.

#ai-security #enterprise-ai #ai-agents

Devin’s 80% Moment: Background Agents, 7x PRs, & End of Hand-Held Coding — Walden Yan & Cole Murray


Dan Shipper, co-founder and CEO of Every, returns to lay out 12 contrarian predictions about AI and work, most of which push back on the widespread panic. His core argument: automation does not reduce workloads, it restructures them; Codex and Claude Code are becoming the new operating system of knowledge work; the SaaS apocalypse is fiction; and the only survival skill you really need is a willingness to ride the models as they improve. Every, a 30-person company, functions as a live experiment for this thesis, which puts Dan in an unusually strong position to know whether the predictions hold up.

## [00:00] Introducing Dan Shipper

Lenny Rachitsky opens by recalling Dan's previous visit, where he made an "almost in passing" prediction that people were underestimating Claude Code for non-technical work, a bet that turned out to be "incredibly accurate." Dan's return revolves around twelve more predictions, and he leads with the conclusion:

> *"The AI job apocalypse is not really a thing."*

## [02:56] Dan's Unique Position: Living in the AI Future

Dan explains why Every works as an early-signal lab: every employee (editors, operations, finance) is a daily AI user, which gives the company a real edge in seeing what the next twelve months will look like in practice. He contrasts this with the "San Francisco bubble" view, arguing that the true frontier of AI adoption is where AI meets a domain expert doing real work, not where the AI is being built.

> *"The edge of AI is where AI meets a real human doing something."*

## [09:17] How the Way We Work Will Change in the Next Year

Lenny Rachitsky groups the predictions into three blocks: how we work, the shape of work itself, and who thrives.
Dan's first prediction is that all professional work converges on a single surface, Codex or Claude Code, that acts as a parallel coworker: it watches what you do, handles research, drafts emails, and kicks off long-running tasks while you stay in your main document. He is already on ten consecutive days of inbox zero thanks to Codex combined with Cora, Every's email agent.

> *"I basically feel like I have this parallel coworker who can not only respond and write in the document, but can also go and do research."*

## [16:39] The Case for General Agents

Dan predicts every company will have a "super agent" inside Slack that all employees interact with daily: a general-purpose assistant with access to company context, not a single-task bot. This agent becomes the organizational memory layer: it routes questions, pulls data, and bridges teams that do not know they need to talk to each other.

## [18:08] Codex and Claude Code as the New Operating System of Work

Claude Code's breakthrough was putting a capable agent directly on your computer, with access to the terminal and, crucially, the browser. Anthropic found the paradigm first; OpenAI caught up around version 5.3 and then accelerated. Dan's current daily driver is Codex, which he runs side by side with his writing app Proof: the agent watches his browser, reads the open page, and acts on his behalf without a context switch.

> *"Whoever is in the lead, it seems very obvious to me that all the work you do is going to be in one of those surfaces."*

The "bring your own AI tokens to a SaaS" model restructures the economics: the user, not the SaaS product, pays for inference, which restores margins and removes the pressure to build a proprietary AI layer from scratch.
## [25:39] Where Cursor Fits in This Landscape

Cursor dominates coding workflows today, but Dan sees it at a strategic crossroads: stay a pure code IDE or evolve into the general-purpose agentic surface. Staying narrow keeps focus; broadening means competing directly with Codex and Claude Code. His prediction is that the category winner will be the surface that handles both code and general knowledge work in one place.

## [27:42] What SaaS Companies Should Build Now

SaaS products now need to be readable by agents, not just humans: clean HTML, good CLI interfaces, and a design that exposes information for automated consumption. Dan points to Proof: because Codex watches the page, small frictions get fixed almost immediately, closing the loop between "I ran into something" and "it's fixed."

> *"You can see the glimmers of this very fast closing loop between I ran into something, a small annoyance, and I can fix it right here."*

## [31:13] Why the CLI Is Already History

The CLI era burned out at breakneck speed. The sequence was: GUI, then CLI as a power move, then agents that replace the CLI entirely. Once your agent can operate any interface by reading the screen, the reason to live in the terminal disappears. Dan's prediction is blunt:

> *"CLIs are over. We sprinted through the CLI era."*

## [33:34] Two Agents Are Better Than One

Dan challenges agent maximalism. The real pattern emerging is specialized agents (one for coding, one for email, one for data) that communicate with each other on the user's behalf. When something breaks in an app, Codex can talk directly to the vendor's agent to diagnose the problem without opening a support ticket.
The paradigm shifts once you assume everyone has an agent and that agents can negotiate with each other.

## [36:22] Why Dan Is Betting on SaaS Stocks

The "SaaS is dead" narrative overlooks how costs actually work when agents drive usage. When users bring their own AI tokens to a SaaS product, the vendor's inference costs approach zero. Dan's contrarian position:

> *"I would buy SaaS stocks right now."*

SaaS companies that make their products agent-compatible are not disintermediated; they get a tailwind on their margins.

## [39:01] Why Automation Does Not Reduce Human Work

This is the episode's central intellectual thesis. Dan argues that every layer of automation requires a human manager above it who verifies it is working correctly. He built his own benchmark, the "senior engineer benchmark," by having two real senior engineers independently rewrite his app Proof from scratch, then testing each new model against those reference solutions. Models scored 30/100 until GPT-5.5, which jumped to 60/100. The gap reveals something important: models fix what you tell them to fix. A human senior engineer looks at the codebase, decides it needs a full rewrite, and says so unprompted; models do not make that judgment on their own initiative. There is always a higher-level frame that a human has to articulate.

> *"Every time you automate something, to make sure the automation works well, you need a human on top making sure it works well."*

## [47:00] The Value of Human-Written Code

Human-written code remains the reference signal that makes it possible to score model output. Dan's benchmark depends on two human rewrites as ground truth.
As AI-generated code becomes the norm, the human-written corpus becomes scarcer and more valuable: it is what you need to know whether AI is actually improving.

## [48:36] Quick Recap

Lenny Rachitsky recaps the first block of predictions: work happens inside Codex or Claude Code; every company has a super agent in Slack; bringing your own tokens restores SaaS margins; CLIs are over; two specialized agents beat one generalist; automation expands the human workload rather than reducing it.

## [50:15] How Work Is Changing

The second block addresses the shape of work itself. Dan's view: the forward-deployed engineer becomes the most valuable hire, someone who can sit with a customer, understand their workflow, and build and ship a solution in the same meeting. The "allocation economy" concept from his earlier essay applies here: humans shift from direct producers to allocators of AI capacity, and allocating well turns out to be cognitively demanding in its own right.

> *"I am simultaneously someone who is very invested in AI and very optimistic about humans and the role of humans in making sure AI produces things worth producing."*

## [56:17] Why Data Scientists Are Drowning in Bad Analysis

Data teams are being flooded with AI-generated analyses coming from everyone else in the company, analyses that look plausible but are often wrong. The senior data scientist's job shifts from producing analyses to auditing them, which is harder and more cognitively demanding. The same dynamic hits engineering: junior-level requests are handled by models, which surfaces more edge cases that require deeper judgment to resolve.
> *"You need more senior people to handle the deeper, harder questions for the team that is managing all the basic requests."*

## [58:24] Which Product and Tech Roles Change Least With AI

Dan's answer: the roles whose output is hardest to phrase as a prompt. He distinguishes between "watching over agents" (passively looking for errors) and "forward-deployed engineering" (actively building systems that let everyone do what used to require specialists). The latter is where the interesting, hard-to-automate work lives.

## [62:17] We Will Read Far More AI-Generated Text, and We Will Like It

Every uses Notion agents for quarterly planning: each team's strategy report is AI-generated, and the result Dan receives is better than what manual planning produced. His email is mostly written by GPT-5.5. His test for whether AI-written content is acceptable: did the sender have to understand what it contains in order to direct the AI? If so, it is fine. If the sender clearly has not read it, that is a violation of the social contract.

> *"The low-quality kind is the one that took the sender less time to produce than it takes me to read."*

He also publishes Every guides written with agent co-authors, explicitly designed to be read by both humans and other agents: a new content format optimized for dual consumption.

## [68:28] Why Product Managers Will Dominate the AI Era

Dan cites Marcus, Every's in-house PM who runs the Spiral product, as the archetype: strong product sense, able to direct AI to build and iterate quickly, ships without waiting for engineering bandwidth. PMs are allocators at heart (they decide what should be built and for whom), which is exactly the skill that stays scarce when building itself becomes cheap.
> *"I am super, super optimistic about PMs."*

## [71:05] Full-Stack Designers Are the Other Big Winners

Full-stack designers, people with strong visual instincts who also work in code, are already making pull requests directly in tools like Lovable and Figma Make. The handoff between design and engineering compresses toward zero. Dan expects them to become the go-to superheroes of the AI era alongside PMs.

## [73:11] The AI Job Apocalypse Is Not Going to Happen

Dan separates the current round of layoffs (mostly corrections for over-hiring) from a claim of structural displacement by AI, and rejects the latter. His structural argument: models are trained on yesterday's human competence, which means they produce what is already known in its most default form. Humans push the frontier by doing new things with that frozen competence, creating space that models then have to catch up to. The cycle repeats.

> *"Structurally, because of how models work, there will always be room for humans to go further."*

## [76:00] How to "Ride the Models" to Stay Relevant

The practical advice: do not resist new model releases; treat each one as a new set of capabilities to explore and apply to your real domain. Dan reruns his senior engineer benchmark every time a major model drops. He also pushes back on the idea that the edge of AI knowledge lives in San Francisco. Every, operating out of Brooklyn, stays ahead precisely because it uses the models for everything, not because it is building them.

> *"The only thing you have to do is ride the models. And that means using them for whatever it is you do."*

## [81:02] Final Predictions and Advice

Lenny Rachitsky zooms out: the two sides of the coin in this conversation are "less is changing than you fear" (SaaS continues, jobs do not disappear) and "more is changing than you are prepared for" (how work gets done, which roles matter, what a workday looks like). Dan's close: the forward-deployed engineer is the new essential hire; companies that block their employees from using the latest models are making a slow-burning strategic mistake.

## [85:24] Lightning Round

Rapid fire: Dan's most contrarian belief is that the AI job apocalypse is genuinely not happening; what he most wishes people understood is that the AI frontier is not in San Francisco, it is wherever someone uses a model to do real work in a real domain. He would tell his past self to hire senior engineers sooner, and he expects AI to fundamentally change how people think about benchmarks over the next year.
## Entities

- **Dan Shipper** (Person): Co-founder and CEO of Every; author of the essay "After Automation"; runs Every as a living lab for AI adoption.
- **Lenny Rachitsky** (Person): Host of Lenny's Podcast, founder of Lenny's Newsletter, former Airbnb PM.
- **Every** (Organization): AI-native media and software company with 30 people; all employees are daily AI users.
- **Codex** (Software): OpenAI's agentic surface for coding and general knowledge work; Dan's current daily driver.
- **Claude Code** (Software): Anthropic's terminal-based coding agent; pioneer of the agent-on-your-own-computer paradigm.
- **Proof** (Software): Dan's AI-assisted markdown writing app; the reference codebase for his senior engineer benchmark.
- **Cora** (Software): Every's email agent, integrated with Codex for inbox management.
- **Cursor** (Software): AI coding IDE at a strategic crossroads between code tool and general agentic surface.
- **Forward-deployed engineer** (Concept): Hybrid role combining technical execution with customer-facing problem discovery; Dan's pick for the most valuable hire of the AI era.
- **Senior engineer benchmark** (Concept): Dan's custom evaluation in which two human senior engineers rewrite a codebase from scratch; new models are scored against those reference solutions.
- **Allocation economy** (Concept): Dan's framework predicting that humans shift from direct producers to allocators of AI capacity.
- **Riding the models** (Concept): Dan's advice for staying relevant: treat each new model release as a new set of capabilities to explore and apply to your own domain.

#ai-agents #future-of-work #saas
