Nvidia usually sells the chips and software that go inside other people's robots. This week it skipped ahead and unveiled a complete humanoid of its own: a walking, two-armed machine (body built with partner Unitree) running Nvidia's most powerful robot computer and its full robot software, all in one package. It's meant for university labs, ships late this year, and the price isn't public.
Why care: this is Nvidia trying to become the default blueprint every robot-maker copies, the way its chips already power most AI. If labs everywhere build on this one kit, Nvidia ends up owning the foundation of the whole robot industry, not just selling it parts.
For years Nvidia sold the parts: chips to the robot-makers, simulators to the labs, models to whoever would download them. At GTC Taipei on June 1 it skipped to the end and shipped a complete humanoid. The Isaac GR00T Reference Humanoid Robot is a Unitree H2 Plus body (5'10", 150 lb, 31 degrees of freedom) with Sharpa Wave tactile hands at 22 DoF each, and a Jetson AGX Thor brain bolted in: a Blackwell GPU rated at 2,070 FP4 TOPS, 128 GB of unified memory, a 40–130 W power envelope. It's sold as a research platform, not a product you'll meet on the street, and the labs that buy it keep ownership of every byte it records.
The software is the actual moat. The robot ships running the Isaac stack end to end: Isaac Teleop to capture demonstrations, the open GR00T foundation models to act on them, Isaac Sim and Lab to train in simulation, Isaac ROS to deploy. Alongside it Nvidia pushed GR00T N1.7, a 3B vision-language-action model trained on 20,854 hours of human first-person video, with a scaling-law claim worth watching: going from 1k to 20k hours of plain human footage more than doubles average task completion. If that holds, cheap point-of-view video starts to substitute for expensive robot teleoperation data.
The catch is delivery. The robot ships "late 2026," pricing is undisclosed, the promised US, European, and South Korean body partners are unnamed, and every benchmark here is Nvidia grading its own homework. But the strategic move is plain: Nvidia wants to be the reference design every robot company builds against, the way its silicon already sits under most of AI.
A big selling point for robotaxis is that they'd ease traffic by driving smarter than humans. An MIT researcher checked the receipts, 86 million miles of Waymo's California data, and found the cars drive about 46% of their miles with nobody riding, once you count cruising between rides and driving to pickups. That's basically the same as Uber and Lyft drivers.
Why care: "robot cars will cut congestion" is a promise cities are planning around. The first hard look at real numbers says that, so far, they're no better than the human ride-hail they're meant to improve on. The empty-mile rate had been falling, but it's leveled off.
The pitch for robotaxis has always carried a congestion dividend: software dispatches better than people, so fewer empty miles. A new peer-reviewed analysis in Transport Findings holds that up to the meter and it doesn't survive. MIT's Awad Abdelhalim pulled Waymo's mandatory CPUC filings, 86.27 million miles across San Francisco and Los Angeles from August 2023 to December 2025, and found 46.4% of those miles carried no passenger, counting both repositioning and driving to a pickup.
That's exactly what the dividend was supposed to erase. Prior studies put Uber and Lyft deadheading north of 40% of miles, so Waymo isn't beating human ride-hail here, it's matching it or running slightly worse. The one hopeful thread is the trend: empty share fell from 64% at launch to about 44% by late 2025, then flattened. Per-trip empty distance dropped from 5.1 miles to 2.8.
46.4% of Waymo's California vehicle-miles were driven with no passenger aboard.
Caveats sit on the comparison. The 40% ride-hail figure comes from older studies, not this dataset, and the CPUC numbers are aggregated, so Abdelhalim can't slice by hour or neighborhood. The plateau may yet break as fleets densify. But the empirical claim is blunt, and it's the rare robotaxi number that didn't come from the company selling the rides.
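For scale, the percentages in this story convert into absolute miles with a few lines of arithmetic. This is a minimal sketch that uses only figures quoted in the piece. The function names are made up for illustration, and the empty-miles-per-passenger-mile ratio is derived here, not a number reported by the study.

```python
# Figures quoted in the article (Waymo CPUC filings, Aug 2023 to Dec 2025).
TOTAL_MILES_M = 86.27   # million vehicle-miles in the dataset
EMPTY_SHARE = 0.464     # share of miles driven with no passenger


def empty_miles(total_miles_m: float, empty_share: float) -> float:
    """Empty (deadhead) miles, in millions."""
    return total_miles_m * empty_share


def empty_per_passenger_mile(empty_share: float) -> float:
    """Empty miles driven for every mile that carries a passenger."""
    return empty_share / (1.0 - empty_share)


def relative_drop(before: float, after: float) -> float:
    """Fractional decrease from `before` to `after`."""
    return (before - after) / before


if __name__ == "__main__":
    print(f"empty miles: {empty_miles(TOTAL_MILES_M, EMPTY_SHARE):.1f}M")
    print(f"empty per passenger mile: {empty_per_passenger_mile(EMPTY_SHARE):.2f}")
    # Per-trip empty distance fell from 5.1 to 2.8 miles.
    print(f"per-trip empty distance drop: {relative_drop(5.1, 2.8):.0%}")
```

On the quoted figures that works out to about 40 million empty miles, roughly 0.87 empty miles for each passenger mile, and a per-trip empty distance down about 45%.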
Amazon showed a new version of its warehouse robot, Proteus, that you can just talk to: tell it "pick the items from the yellow bin and put them in the gray one," and it figures out how. The older model could only follow fixed routes near the loading dock; this one roams the whole warehouse. Amazon paired the reveal with a €10 billion plan to expand in Europe and 25,000 new jobs.
Why care: instructing a robot like a new coworker, instead of programming it, is the shift that could put robots into far more jobs. The catch: it's still a lab demo, and the real rollout isn't planned until 2027.
At a London event on June 3, Amazon showed the next version of Proteus, and the upgrade isn't the hardware. Today's Proteus runs at 25 US sites but stays penned near the dock on fixed paths. The new one roams the whole floor and takes instructions in conversational English: "pick all items in the yellow tote to your left and place them in the gray tote," "load the trailer with all totes in the loading area." A vision-language model parses the request; a policy layer turns it into motion. Amazon won't say which models.
The framing from robotics VP Scott Dresser:
"You tell it what needs to be done. It figures out the priority, the route, the timing."
Bundled with it: a €10 billion European fulfillment build-out, 25,000 new jobs, and two more machines, the touch-sensitive Vulcan and the tote-handling STARK (headed to 15 European sites by 2027). The honest line is that conversational Proteus is a lab pilot, not a deployment, with European rollout targeted for the first half of 2027, and every figure is Amazon's own. But "direct the robot like you'd direct a coworker" is the interface the whole warehouse-robotics field is chasing, and Amazon has the floor space to ship it first.
Six children with spinal muscular atrophy, a disease that weakens muscles so badly the kids had never stood on their own, trained with a small robot (about one kilogram) strapped to the knee. Instead of helping their legs move, it pushed back, forcing the muscles to work, all turned into a video game where kicking harder scored points. After a few weeks, all six could stand unaided, and the gains stuck even after the robot came off.
Why care: these kids are usually told standing isn't possible, and the available medicines rarely undo damage already done. Scans showed their muscles actually grew. It's only six children with no comparison group, so it's a promising first step, not proof, but a real one for families with almost no options.
Children with type II spinal muscular atrophy are usually told standing isn't coming. A study in Nature (May 20) pushes back with a small, striking result. Six children aged 6 to 10, none able to rise from a chair unaided, trained with a 0.96 kg wearable knee robot, and afterward all six could stand on their own, hands on their knees, no support.
The design inverts the usual assistive approach. Most rehab devices help the limb move; this one resists it, delivering speed-controlled load so the weakened muscle has to fight. The protocol ran in phases: six weeks of normal physiotherapy with no robot (and no improvement, which acts as a within-subject control), then six weeks of high-intensity resistance, then six lighter weeks, then 30-plus days back on conventional physio with the device gone. The gains held. From the team's measurements:
To keep kids training hard, every session was a game: extend your knee faster, kick a cartoon ball farther. The obvious limits: n=6, no control arm, no blinding, and a prototype with no manufacturer. The authors say exactly that and call for larger trials. But MRI-confirmed muscle growth that persisted after the robot came off is hard to write off as placebo, and for a population with almost no rehab options, it's a real signal.
Alongside the robot, Nvidia released Cosmos 3, a free-to-use AI aimed at robots and self-driving cars. The trick: older models could picture a scene or describe one, but this single model can also output the actual motion, the joint angles and gripper movements a robot needs to act, after training on a colossal pile of images, video, and movement data.
Why care: today, teams bolt together separate systems, one to imagine the world, another to decide how to move in it. Folding that into one model could make building capable robots a lot faster. Nvidia's "we're #1" benchmark claims are its own, though, so wait for outside testing.
The other half of Nvidia's June 1 push is a model, not a machine. Cosmos 3 is an open foundation model for physical AI, built as a Mixture-of-Transformers: one reasoning tower (an autoregressive VLM) and one diffusion tower for generation, trained together on 20 trillion tokens of multimodal data, nearly a billion images and 400 million real and synthetic videos among them. The leap from Cosmos 1 and 2 is that it generates action natively. The same model that imagines a scene can output joint angles, gripper positions, and trajectories to act in it, instead of handing video to a separate policy network.
It ships in two sizes now, Nano (16B) for a workstation GPU and Super (64B) for the datacenter, with an Edge variant promised. Both are available on Hugging Face and GitHub and through NIM microservices. Nvidia claims the top spot among open models across a stack of physical-AI benchmarks (world generation, action policy, spatial reasoning), all of them self-reported or on leaderboards it helps run. The genuinely interesting bet is the collapse: fold the world model, the video generator, and the action model into one network, and the simulate-then-control pipeline that every robotics team rebuilds by hand gets a great deal shorter.
Most "robot helper in your house" stories are staged for cameras. This is a real one. Hello Robot makes a wheeled robot with an arm, and Keith Platt, quadriplegic since 2021, has used one at home every day since 2024. He sends it across the house with a voice app, then steers the arm himself to make a shake, put on his glasses, or brush his teeth.
Why care: a robot doing real daily chores, unsupervised, in an actual home for two years is rare, and for people with severe disabilities it means more independence and less reliance on a caregiver. Reality check: he's one user (and a company investor), and the robot costs nearly $30,000 and isn't sold to regular households yet.
Most "robot in the home" stories are a staged demo. This one isn't. Hello Robot released Stretch 4, its fourth-generation mobile manipulator, and runs a program called Assist that places the robot in the homes of people with severe mobility impairments. Keith Platt, quadriplegic since 2021 and a board member at the company, has used one daily since 2024: a voice-driven iPhone app sends it to navigate the house on its own, then he takes direct control for the hands-on parts, making a protein shake, putting on his glasses, brushing his teeth.
The control split is the practical insight. Full autonomy isn't ready and pure teleoperation needs constant attention, so Stretch handles the navigation and the human handles the manipulation. The hardware is a 160 cm pole on an omnidirectional base, an Nvidia Jetson Orin NX for onboard vision, dual lidars, open ROS 2 and a Python SDK. The honest caveats: Platt is one user and also an investor, the wins are self-reported, and at $29,950 Stretch 4 sells to researchers and enterprises, not to individuals yet. But a general-purpose manipulator running unsupervised in a real house, for two years, for actual daily care, is further than most of this field has gotten.
JPMorgan has been down on Tesla for years. This week it softened, and the reason was telling: its analyst now thinks that by 2030, roughly half of Tesla's money could come from robotaxis, its Optimus humanoid, and licensing its self-driving software, rather than from selling cars.
Why care: when the bank that was most skeptical starts describing Tesla as a robotics-and-AI company, it shows how completely the story around Tesla has shifted to robots. But this is an analyst's projection, not Tesla's own forecast, and it assumes a lot of things go right that haven't yet.
JPMorgan has been the most stubbornly bearish big bank on Tesla for the better part of eight years. This week its new analyst Rajat Gupta blinked, upgrading the stock from Underweight to Neutral with a $475 December-2027 target. The reasoning is the story, not the rating: Gupta models Tesla's revenue roughly doubling to ~$203B by 2030, and reckons robotaxi service, Optimus, and FSD licensing together make up about half of that, near $100B, more than the cars on his own numbers.
He reached for the Amazon analogy, AWS and Kiva robotics turning a retailer into something else, and pointed at Tesla's fleet driving data as the moat. Keep the frame honest: this is a sell-side projection, not Tesla guidance; the "half from robotics and AI" lumps three still-unproven businesses into one bucket with no per-segment split; and the whole curve assumes regulators wave through unsupervised robotaxis and Optimus reaches manufacturing scale. None of that is banked. What changed isn't Tesla's fundamentals, it's that the last bank holdout reframed the entire bull case around robots.
Google released Magenta RealTime 2, an AI that generates music live as you play, fast enough (about a fifth of a second of lag) that it doesn't feel like waiting. It runs on a regular Apple laptop without the internet, you can steer it with a keyboard or a text description, and Google put it out free for anyone to use, including commercially.
Why care: most AI music tools live behind a website and feel sluggish. This one's quick enough to actually jam with, on hardware musicians already own, and it's open. Limits: it does instruments, not singing, and the bigger version needs a fairly powerful Mac.
Magenta RealTime 2 (June 4) is a real-time music model you can actually play with, and the number that matters is latency. MRT1 processed audio in two-second chunks and lagged about three seconds. MRT2 moves to frame-level autoregression at 40 ms frames and cuts end-to-end control latency to roughly 200 ms, the threshold below which a human can play along without fighting the delay. It does this on a stock Apple Silicon Mac, no cloud round-trip.
The other unlock is MIDI as a control input, which MRT1 lacked, so a keyboard player can feed chords and melody directly instead of nursing a text prompt mid-take. It comes in a 2.4B base and a 230M small model, weights under CC-BY-4.0 (commercial use with attribution), with a C++ engine for Apple Silicon and an AUv3 plugin so it drops into a DAW. The limits are real: the base model wants an M2 Max or better for live use, output is instrumental only (vocals come out as wordless texture), and there's no peer-reviewed eval yet, just a forthcoming report. Still, open-weight accompaniment at roughly 200 ms of latency, on hardware musicians already own, is a sharp break from the API-gated music models.
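To feel why roughly 200 ms versus three seconds matters, it helps to convert the latency into model frames and musical beats. This is a small sketch. The 120 BPM tempo is an assumption picked for illustration, not a figure from Google, and the helper names are hypothetical.

```python
def frames_per_second(frame_ms: float) -> float:
    """How many model frames cover one second of audio."""
    return 1000.0 / frame_ms


def latency_in_frames(latency_ms: float, frame_ms: float) -> float:
    """Latency expressed as a number of model frames."""
    return latency_ms / frame_ms


def latency_in_beats(latency_ms: float, bpm: float) -> float:
    """Latency expressed in beats at a given tempo."""
    beat_ms = 60_000.0 / bpm
    return latency_ms / beat_ms


if __name__ == "__main__":
    FRAME_MS = 40.0   # MRT2 frame size quoted in the article
    BPM = 120.0       # assumed tempo, for illustration only

    print("frames per second:", frames_per_second(FRAME_MS))
    print("MRT2 latency in frames:", latency_in_frames(200.0, FRAME_MS))
    print("MRT2 latency in beats:", latency_in_beats(200.0, BPM))
    print("MRT1 latency in beats:", latency_in_beats(3000.0, BPM))
```

At 40 ms per frame the model produces 25 frames a second, so a 200 ms budget is five frames. At the assumed 120 BPM that is 0.4 of a beat, while MRT1's three-second lag is six full beats, a bar and a half in 4/4.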
AI image and molecule generators usually build things in many small steps, which is slow. Researchers found a way to cut that to one or two steps, but unlike earlier speed-ups, theirs keeps the built-in randomness mathematically correct instead of quietly cutting a corner that can introduce errors.
Why care: most "make it faster" tricks subtly change what the model is really doing. This one claims to be fast and faithful, which matters a lot for science uses like simulating molecules, where the random details aren't decoration, they're the answer. It's an early paper, so the exact numbers still need checking.
Fast diffusion sampling almost always means a deterministic shortcut: consistency models and distillation collapse the smooth path to a few steps but drop the stochastic part. A preprint out of Bath and AITHYRA, Strong Stochastic Flow Maps (arXiv:2606.01086, May 31), goes after the harder target. It learns the strong solution map of a noisy SDE, the Itô map, so given a specific random path the model lands on the right pathwise endpoint, not just on a sample from the right distribution, which is where earlier stochastic flow-map methods stop.
Why that distinction earns its keep: weak methods recover the distribution at each step but drift off any single noise realization, so chaining steps compounds error and pathwise statistics come out wrong. SSFM stays on the actual trajectory, and trains without simulating the SDE, using a polynomial approximation to Brownian motion with proven convergence. The authors report state-of-the-art few-step stochastic image generation on CIFAR-10 and CelebA, and sampling of equilibrium molecular conformations in 1–2 network evaluations, which for molecular work is the headline if it survives scrutiny. It's a preprint, the full FID tables aren't public yet, and one protein dataset isn't released, so treat the figures as unverified. The idea, fast generation that doesn't lie about the randomness, is the part to track.
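The strong-versus-weak distinction is easy to see on a toy problem. The sketch below is not the SSFM method and does not come from the paper. It uses geometric Brownian motion, chosen here only because its strong solution has a closed form, and compares two one-step samplers against a fine-step reference trajectory. The "strong" sampler is fed the same noise path as the reference. The "weak" sampler is a stand-in that draws fresh noise, so it has the right distribution but no tie to the path. All names and parameter values are illustrative assumptions.

```python
import math
import random

# Toy SDE: dX = MU * X dt + SIGMA * X dW (geometric Brownian motion).
MU, SIGMA, X0, T = 0.05, 0.4, 1.0, 1.0


def brownian_increments(rng, n_steps, t_end):
    """Independent Gaussian increments of one Brownian path on [0, t_end]."""
    dt = t_end / n_steps
    return [rng.gauss(0.0, math.sqrt(dt)) for _ in range(n_steps)]


def euler_maruyama(increments, t_end):
    """Fine-step reference endpoint driven by the given noise path."""
    dt = t_end / len(increments)
    x = X0
    for dw in increments:
        x += MU * x * dt + SIGMA * x * dw
    return x


def one_step_map(w_end, t_end):
    """Closed-form strong solution of GBM given the path's endpoint W_T."""
    return X0 * math.exp((MU - 0.5 * SIGMA ** 2) * t_end + SIGMA * w_end)


def compare(n_paths=1000, n_steps=200, seed=0):
    """Mean pathwise error and sample mean for both one-step samplers."""
    rng = random.Random(seed)
    strong_err = weak_err = strong_sum = weak_sum = 0.0
    for _ in range(n_paths):
        dws = brownian_increments(rng, n_steps, T)
        reference = euler_maruyama(dws, T)
        # Strong: evaluated on the SAME noise path as the reference.
        strong = one_step_map(sum(dws), T)
        # Weak stand-in: fresh noise with the right law, unrelated to the path.
        weak = one_step_map(rng.gauss(0.0, math.sqrt(T)), T)
        strong_err += abs(strong - reference)
        weak_err += abs(weak - reference)
        strong_sum += strong
        weak_sum += weak
    return {
        "strong_pathwise_err": strong_err / n_paths,
        "weak_pathwise_err": weak_err / n_paths,
        "strong_mean": strong_sum / n_paths,
        "weak_mean": weak_sum / n_paths,
    }


if __name__ == "__main__":
    for name, value in compare().items():
        print(f"{name}: {value:.4f}")
```

Both samplers should report similar sample means, since they share the same distribution, but only the strong one stays close to the reference on each individual path. That per-path agreement is the property the preprint claims to learn for SDEs that have no closed form.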
When an AI helps build a website, it can look at the page and fix what's wrong. For iPhone apps it was working half-blind. OpenAI added a feature to its Codex coding tool that streams a live iPhone simulator into the browser and instantly updates the app's screens as the AI edits, so it can actually see what it's making.
Why care: seeing the result and correcting it is how good design gets built. Giving an AI that feedback loop for phone apps, not just websites, makes it meaningfully more useful to mobile developers. It runs on a Mac and leans on an outside open-source tool to do the streaming.
Coding agents have had a blind spot on mobile. A web agent renders a page, looks at it, fixes it, re-renders; a mobile agent had to infer the screen from accessibility trees and screenshots. OpenAI's curated build-ios-apps plugin for Codex closed some of that on June 3 with a new skill that streams a live iOS Simulator into the browser and hot-reloads SwiftUI previews as the agent edits.
The streaming leans on serve-sim, Evan Bacon's open tool that captures the simulator framebuffer via xcrun simctl io and serves it as an MJPEG feed plus a WebSocket control channel. The preview piece spins up a throwaway host project outside your source tree and hot-swaps views by dylib injection, without touching your .xcodeproj or schemes. So Codex finally gets a visual verify loop on iOS work, the same kind it already has for the web. The asterisks: it rides on a third-party tool, dylib injection is exactly the sort of thing Apple constrains in some contexts, and there are no reliability numbers. It also needs a real Mac with Xcode's command-line tools, so this is local-agent plumbing, not a cloud feature.
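For a sense of what consuming an MJPEG feed involves on the receiving end, here is a generic sketch that cuts a raw byte stream into individual JPEG frames by scanning for the standard start-of-image and end-of-image markers. It is not serve-sim's code or API, and the function name is hypothetical. Marker scanning is also a naive approach: a frame with an embedded thumbnail could trip it, so a real client would more likely parse the multipart boundaries and headers.

```python
from typing import List, Tuple

SOI = b"\xff\xd8"  # JPEG start-of-image marker
EOI = b"\xff\xd9"  # JPEG end-of-image marker


def split_jpeg_frames(buffer: bytes) -> Tuple[List[bytes], bytes]:
    """Return (complete_frames, leftover) from a chunk of an MJPEG stream.

    `leftover` holds any partial frame at the end of the buffer so the
    caller can prepend it to the next chunk read from the network.
    """
    frames: List[bytes] = []
    pos = 0
    while True:
        start = buffer.find(SOI, pos)
        if start == -1:
            # No new frame starts here; keep a trailing 0xFF in case the
            # SOI marker was split across two chunks.
            tail = buffer[pos:]
            return frames, (b"\xff" if tail.endswith(b"\xff") else b"")
        end = buffer.find(EOI, start + len(SOI))
        if end == -1:
            # Frame has started but is not complete yet.
            return frames, buffer[start:]
        frames.append(buffer[start:end + len(EOI)])
        pos = end + len(EOI)


if __name__ == "__main__":
    # Synthetic stand-in for a multipart MJPEG response body.
    part_header = b"--frame\r\nContent-Type: image/jpeg\r\n\r\n"
    frame_a = SOI + b"first" + EOI
    frame_b = SOI + b"second" + EOI
    chunk = part_header + frame_a + b"\r\n" + part_header + frame_b[:4]

    frames, rest = split_jpeg_frames(chunk)
    print(len(frames), "complete frame(s),", len(rest), "leftover byte(s)")
```

An agent-side viewer would run something like this in a loop, decode each frame, and hand the latest image to the model as its view of the simulator.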

This week the robots walked out of the lab, and the story of the day is the gap between how that looks and how it actually went. Nvidia set the tone by unveiling not chips but a whole humanoid, plus a model, Cosmos 3, that dreams up a scene and the motor commands to move through it in one breath. Amazon's warehouse robot started taking orders in plain English. A quadriplegic man has quietly run a robot in his house for two years. A one-kilogram knee robot got six children standing who had never stood on their own. Stack those together and the promise looks real, and close.
Then the same day hands you the reality. Waymo's robotaxis, the most mature autonomy we have, turn out to drive nearly half their miles empty, no better than the Uber they were meant to improve on. And the money is sprinting ahead of the machines: the bank that spent eight years doubting Tesla now frames it as a robot company and models about half its 2030 revenue coming from robotaxis, humanoids, and self-driving licenses that don't yet exist at scale. The distance between a polished demo and a shipped deployment keeps showing up in the footnotes: late-2026 availability, a 2027 rollout, a sample size of six.
Underneath all of it, the software that makes any of this possible kept getting quietly faster and stranger, Google's music model jamming live on a laptop, a sampling trick that's quick without cheating the math, a coding agent that can finally see the iPhone screen it's building. The physical-AI push is real and accelerating. What it hasn't done is escape the ordinary problems, empty miles, hype, the long road from a working demo to a working product, that everything else in the world still has.
