AI Notes — May 7

Claude rate limits raised

Anthropic announced a new compute partnership with SpaceX and immediately turned the extra capacity into higher product limits: Claude Code's 5-hour rate limits are doubled for Pro, Max, Team, and seat-based Enterprise; peak-hour limit reductions are removed for Pro and Max; Opus API rate limits are substantially increased.

Harvey ships LAB legal-agent benchmark

Harvey launched LAB, an open-source long-horizon legal-agent benchmark covering 1,200 tasks across 24 practice areas, with support and commentary from LangChain, Baseten, Artificial Analysis, and others. Law is one of the few domains that combines long tasks, long context, and high accuracy bars all at once, which makes it much harder than single-question QA.

Genesis AI: a robotics model built for human-level hand control

Genesis AI introduced GENE-26.5, a robotics model designed for tasks that require precise hand movements and coordination.

To train the system, the company also built a robotic hand and a motion-capture glove that lets humans generate training data simply by performing tasks while wearing the glove. Genesis says the hardware is around 100x cheaper than traditional systems and can collect data much faster.

Their demo showed robots cracking eggs, slicing tomatoes, conducting lab experiments, solving a Rubik's Cube, and playing piano with surprisingly fluid hand control.

Why this matters:

The biggest problem in robotics is increasingly becoming data — robots need massive amounts of real-world examples showing how humans move and manipulate objects
Genesis is trying to solve that by turning normal human activity into training data. If it works, every interaction becomes part of the learning process and robots improve much faster

Kanwas: an open-source second brain for teams

Kanwas pitches itself as "the open-source second brain built for modern teams." Worth keeping in mind as a reference design: how to build a team-level brain is a genuinely interesting question.

Figure tour notes

A walkthrough of humanoid robotics company Figure's headquarters, with the founder leading a tour of R&D, manufacturing, and testing.

Company snapshot

Four-building campus in California, around 500 employees (250-300 at HQ)
Less than 4 years old, hundreds of robots in fleet
Goal: more robots than employees
Recently became the first company to bring a humanoid robot into the White House

The product: Figure 3

About 40 joint motors, each able to rotate a full 360°
~135 lb (61 kg), 4-5 hour battery life
Wireless inductive charging through the soles (2 kW), 1 hour of charge → 4-5 hours of work
Cameras, IMU, Wi-Fi, 5G, Bluetooth
Removable fabric outer suit (zip in back), high-top sneakers
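
A quick sanity check on the charging figures above, as a back-of-envelope sketch. It assumes lossless charging (the efficiency parameter is my own addition, not a Figure spec), so the real numbers will be somewhat lower.

```python
# Back-of-envelope check of the charging figures quoted in the notes.
CHARGE_POWER_KW = 2.0      # from the notes: 2 kW inductive charging
CHARGE_TIME_H = 1.0        # from the notes: 1 hour of charge
RUNTIME_H = (4.0, 5.0)     # from the notes: 4-5 hours of work


def energy_delivered_kwh(power_kw: float, hours: float, efficiency: float = 1.0) -> float:
    """Energy that reaches the battery. efficiency=1.0 is an idealized assumption."""
    return power_kw * hours * efficiency


def average_draw_w(energy_kwh: float, runtime_h: float) -> float:
    """Mean power draw if the delivered energy lasts for runtime_h hours."""
    return energy_kwh * 1000.0 / runtime_h


energy = energy_delivered_kwh(CHARGE_POWER_KW, CHARGE_TIME_H)
draw_range = tuple(average_draw_w(energy, h) for h in RUNTIME_H)
print(f"delivered: {energy} kWh, implied average draw: {draw_range} W")
```

Under these assumptions one hour at 2 kW delivers 2 kWh, close to the 2.25 kWh pack size listed in the factory section, and the implied average draw is 400 to 500 W.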

The brain: Helix neural network

In-house Vision-Language-Action (VLA) model that fully replaces traditional control code
Inference runs on the robot's onboard GPU — no internet connection required
Outputs 50-200 full-body joint commands per second
Training data approaches one million hours, mostly in simulation via reinforcement learning, then zero-shot transferred to physical robots
"Never Fall" project: even with a missing joint (e.g. knee), the robot keeps a one-legged hop without falling

The factory: BotQ

In-house design and manufacturing of head, battery, limbs, fingers
2.25 kWh battery with structural design that prevents thermal runaway from spreading
March production hit a record high; the goal is a fully automated "lights-out" factory — robots building robots
Last year, Figure robots helped assemble the world's first car built by humanoid robots (BMW X3)

Design evolution

Figure 1 (2022): CNC-machined, hundreds of thousands of dollars per unit, tendon-driven hand (deprecated)
Figure 2: battery moved into the torso, 3x compute, ~50 units produced before retirement
Figure 3: ~90% cost reduction (under $100K per unit), thinner profile, soft-foam outer covering
Figure 4: in late development, expected to be the biggest jump yet

Go-to-market

Home: target $400-600/month lease pricing (similar to a car lease) — chores, laundry, organizing
Commercial: manufacturing, logistics, healthcare
Primary battlefield is the US; Europe is harder due to data privacy

Founder's perspective

Believes AGI may arrive first in embodied AI because real-world interaction data is the missing piece
Design philosophy leans Westworld (anthropomorphic) over Ex Machina
Data collection team wears spandex for motion capture training

Overall: a company that has vertically integrated hardware, AI, and manufacturing about as deeply as you can, sitting at the "flip phone to iPhone 1" transition point.

1. Data scale and tiers

Founder Brett describes training data in two layers:

Pre/mid-training: Helix used roughly one million hours. This is the foundation that determines whether the robot "understands the world"
Post-training: only on the order of thousands of hours. This fine-tunes for specific tasks (folding clothes, tidying a living room, moving packages)

The ratio is telling: roughly one million hours against thousands is a gap of about 100x to 1,000x, which suggests most of the value in data work sits in pre-training. Post-training is more like alignment. For anyone working in data, the long-term work is in pre-training: cleaning, deduplication, quality grading.

2. Three data collection paths

2.1 Human motion capture (the spandex crew)

The video mentions "people in spandex walking around the campus" — this is joint-level tracking. Markers or inertial sensors on tight clothing record the angle, velocity, and trajectory of every joint while a human does household tasks.

Why spandex? Loose clothing makes markers drift and joint-angle estimation unreliable. From a data perspective, the value of this kind of data is:

It provides a "human motion prior" — the robot doesn't have to learn from scratch how to swing its arm to wash a dish
Combined with video frames, it enables image conditioning: each frame pairs with a joint state, and that's the core training pair for a VLA model (vision → action)
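
A minimal sketch of how the frame-to-joint-state pairing could work, assuming video frames and mocap samples each carry a timestamp. The names (JointSample, TrainingPair, pair_frames_with_joints) and the 20 ms tolerance are hypothetical, not anything Figure has described.

```python
from bisect import bisect_left
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class JointSample:
    t: float             # capture time in seconds
    angles: List[float]  # joint angles in radians


@dataclass
class TrainingPair:
    frame_id: int
    frame_t: float
    angles: List[float]


def pair_frames_with_joints(
    frame_times: Sequence[float],
    joints: Sequence[JointSample],  # must be sorted by time
    max_gap: float = 0.02,          # hypothetical tolerance: 20 ms
) -> List[TrainingPair]:
    """Match each frame to the nearest mocap sample; drop frames with no close sample."""
    joint_times = [j.t for j in joints]
    pairs = []
    for i, ft in enumerate(frame_times):
        k = bisect_left(joint_times, ft)
        # The nearest sample is either just before or just after the frame time.
        candidates = [c for c in (k - 1, k) if 0 <= c < len(joints)]
        if not candidates:
            continue
        best = min(candidates, key=lambda c: abs(joint_times[c] - ft))
        if abs(joint_times[best] - ft) <= max_gap:
            pairs.append(TrainingPair(i, ft, joints[best].angles))
    return pairs


mocap = [
    JointSample(t=0.00, angles=[0.0, 0.1]),
    JointSample(t=0.01, angles=[0.1, 0.2]),
    JointSample(t=0.02, angles=[0.2, 0.3]),
]
frames = [0.0, 0.011, 0.5]  # the last frame has no mocap sample nearby
pairs = pair_frames_with_joints(frames, mocap)
print(pairs)
```

Dropping unmatched frames rather than interpolating is a deliberate simplification here. A real pipeline would likely also need clock synchronization between the cameras and the suit.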

2.2 Synthetic data from simulation (the main event)

Mortz, the controls lead, said it well:

"We spend a lot of time imagining everything that could happen in the real world, and then we make it all happen in simulation."

This is standard domain randomization. Inside the physics simulator, you randomize:

Gravity, friction coefficients
Joint damping, motor latency
Lighting, textures, camera noise
External disturbances (someone shoving the robot)

Simulation data requires no human labeling: states, actions, and rewards come straight from the simulator, so the supply is effectively unlimited. This is the fundamental reason Figure switched from hand-coded controllers to RL-based neural networks: the available data is orders of magnitude larger.
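
The randomization described in this section can be sketched as per-episode parameter sampling. All ranges, field names, and texture labels below are made up for illustration. The video gives no actual values, and a real setup would pass these parameters into a physics simulator rather than just collecting them.

```python
import random
from dataclasses import dataclass

# Hypothetical ranges; the video does not state any of these numbers.
RANGES = {
    "gravity": (9.6, 10.0),          # m/s^2
    "friction": (0.4, 1.2),          # coefficient
    "joint_damping": (0.05, 0.5),
    "motor_latency_ms": (2.0, 20.0),
    "light_intensity": (0.2, 1.5),
    "camera_noise_std": (0.0, 0.05),
    "push_force_n": (0.0, 150.0),    # external shove
}
TEXTURES = ["wood", "tile", "carpet"]  # hypothetical texture labels


@dataclass
class EpisodeParams:
    gravity: float
    friction: float
    joint_damping: float
    motor_latency_ms: float
    light_intensity: float
    camera_noise_std: float
    push_force_n: float
    texture: str


def sample_episode(rng: random.Random) -> EpisodeParams:
    """Draw one independent set of physics and rendering parameters."""
    values = {name: rng.uniform(lo, hi) for name, (lo, hi) in RANGES.items()}
    return EpisodeParams(texture=rng.choice(TEXTURES), **values)


rng = random.Random(0)  # seeded so runs are reproducible
episodes = [sample_episode(rng) for _ in range(1000)]
print(episodes[0])
```

Uniform sampling is the simplest choice. Curricula or adversarial sampling are common alternatives, but nothing in the video says which one Figure uses.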

2.3 Real-robot teleoperation + autonomous data upload

Brett denied that the robots are remote-controlled, but during training, teleoperation data is unavoidable — Tesla, 1X, Physical Intelligence all do the same. Robots deployed in homes upload their daily run data to a central training pipeline, which then OTA-updates model weights.

3. Sim-to-real zero-shot

The phrase shows up during the segment where the robot gets shoved:

"We trained the controller in simulation, and then we zero-shot it to this robot — load it onto the computer, and that's the performance you get."

The precise meaning: the model has never been trained on the real robot, and yet works on first deployment. To make this work you need:

A small sim-to-real gap — Figure designs its own motors and measures its own torque response curves, so the simulated motor model is very close to the real one
Wide-enough domain randomization: the disturbance space seen in training must cover (be a superset of) the disturbance space in the real world
Matching observation spaces — simulated camera intrinsics, FOV, and noise have to track the real cameras closely

For data work, this means: pure simulation tasks need very little human labeling (the simulator generates labels), but simulator tuning and real-robot regression testing grow as QA work — for example, labeling "which real-robot failures correspond to uncovered scenarios in simulation".
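
One way that QA labeling could look, as a sketch: compare the conditions measured during a real-robot failure against the ranges randomized in simulation, and flag whatever falls outside. The parameter names, ranges, and failure records are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical simulation ranges; not taken from the video.
SIM_RANGES: Dict[str, Tuple[float, float]] = {
    "friction": (0.4, 1.2),
    "push_force_n": (0.0, 150.0),
    "motor_latency_ms": (2.0, 20.0),
}


def uncovered_params(
    measured: Dict[str, float],
    sim_ranges: Dict[str, Tuple[float, float]],
) -> List[str]:
    """Return the measured parameters that simulation never covered."""
    out = []
    for name, value in measured.items():
        if name not in sim_ranges:
            out.append(name)  # never randomized at all
            continue
        lo, hi = sim_ranges[name]
        if not (lo <= value <= hi):
            out.append(name)  # randomized, but over too narrow a range
    return out


failures = [
    {"id": "f1", "measured": {"friction": 0.25, "push_force_n": 40.0}},
    {"id": "f2", "measured": {"friction": 0.8, "push_force_n": 60.0}},
]
labels = {f["id"]: uncovered_params(f["measured"], SIM_RANGES) for f in failures}
print(labels)
```

An empty list for a failure means the conditions were inside the randomized ranges, so the cause is more likely the model or the simulator's fidelity than a coverage gap.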

4. Data processing details

Privacy and anonymization

Brett's framing: "What we mostly care about is the robot's state — what it sees, and how that helps it generalize." In practice:

Faces, text, private information get masked or blurred
What actually enters the training set is the triple (pixels, robot state, action)
Europe deployment is on hold because of GDPR, which suggests their de-identification pipeline isn't yet EU-compliant
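
A minimal sketch of the masking step and the training triple, assuming an upstream detector has already produced bounding boxes for faces and text. The Sample class, the box format, and the plain-list image are illustrative choices only.

```python
from dataclasses import dataclass
from typing import List, Tuple

Image = List[List[int]]          # grayscale image, row-major
Box = Tuple[int, int, int, int]  # top, left, bottom, right (bottom/right exclusive)


@dataclass
class Sample:
    pixels: Image             # what the robot sees, after masking
    robot_state: List[float]  # e.g. joint positions
    action: List[float]       # command issued at this step


def mask_regions(image: Image, boxes: List[Box], fill: int = 0) -> Image:
    """Return a copy of the image with every box overwritten by a fill value."""
    masked = [row[:] for row in image]  # copy so the raw frame is left untouched
    for top, left, bottom, right in boxes:
        for r in range(max(top, 0), min(bottom, len(masked))):
            row = masked[r]
            for c in range(max(left, 0), min(right, len(row))):
                row[c] = fill
    return masked


frame = [[9] * 4 for _ in range(3)]
face_boxes = [(0, 1, 2, 3)]  # assumed output of a face detector
sample = Sample(
    pixels=mask_regions(frame, face_boxes),
    robot_state=[0.1, 0.2],
    action=[0.0, 0.05],
)
print(sample.pixels)
```

Hard masking is the bluntest option. Blurring keeps more visual context but is harder to argue is irreversible.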

The "positive transfer" assumption

Brett makes one key bet:

"We assume the data we're collecting now will produce massive positive transfer in almost any environment."

This is a bet on diversity over volume: collecting the same action (opening a drawer, picking something up) in different environments lets the model learn the essence of the action rather than the contingencies of any one environment. Implications for data strategy:

Diversity-axis labels (lighting, materials, layout, object types) are worth more than simply adding samples
Think in terms of a long-tail distribution: common scenarios saturate quickly, rare scenarios deserve heavier labeling
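
One common way to act on the long-tail point is inverse-frequency sampling: rare scenario labels get a higher per-sample weight. This is a generic technique, not something the video attributes to Figure, and the scenario labels are invented.

```python
from collections import Counter
from typing import List, Sequence


def sampling_weights(labels: Sequence[str], power: float = 1.0) -> List[float]:
    """Per-sample weights proportional to (1 / label count) ** power, normalized to sum to 1.

    power=1.0 gives every scenario the same total mass; power=0.0 is uniform sampling.
    """
    counts = Counter(labels)
    raw = {label: (1.0 / n) ** power for label, n in counts.items()}
    total = sum(raw[label] for label in labels)
    return [raw[label] / total for label in labels]


# Hypothetical dataset: 8 samples from a common scene, 2 from a rare one.
scenes = ["kitchen_day"] * 8 + ["garage_dim"] * 2
weights = sampling_weights(scenes)
print(weights[0], weights[-1])
```

With power=1.0 the two rare samples together carry as much weight as the eight common ones. Values between 0 and 1 soften that if the rare data is noisy.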

5. Training pipeline

Stated in the video:

Onboard GPU inference at 50-200 Hz
One model, multiple tasks (home + logistics + manufacturing)
Architecture is a transformer-based VLA

Industry context (not stated in the video but public): Helix is a dual-system architecture — a slow "System 2" handles semantic understanding and task planning at ~7-9 Hz, while a fast "System 1" generates actions at ~200 Hz. Task-level labels ("step three of folding a shirt") and action-level labels (per-frame joint state) are produced separately.
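
A toy sketch of the dual-rate idea: a slow planner refreshes a latent goal a few times per second while a fast policy emits a command on every tick. The two functions are placeholder stand-ins, 8 Hz is just a value picked from the 7-9 Hz range, and the single synchronous loop is a simplification; it is not how Helix is actually scheduled.

```python
from typing import List, Optional, Tuple

S1_HZ = 200                     # fast action rate quoted above
S2_HZ = 8                       # assumed value inside the quoted 7-9 Hz range
TICKS_PER_S2 = S1_HZ // S2_HZ   # fast ticks between slow-planner updates


def slow_planner(tick: int) -> str:
    """Stand-in for System 2: returns a latent goal for the fast policy."""
    return f"latent@{tick}"


def fast_policy(latent: Optional[str], tick: int) -> Tuple[Optional[str], int]:
    """Stand-in for System 1: one command per tick, conditioned on the latest latent."""
    return (latent, tick)


def run(seconds: int = 1) -> Tuple[int, List[Tuple[Optional[str], int]]]:
    latent = None
    s2_calls = 0
    commands = []
    for tick in range(S1_HZ * seconds):
        if tick % TICKS_PER_S2 == 0:  # time for the slow system to refresh
            latent = slow_planner(tick)
            s2_calls += 1
        commands.append(fast_policy(latent, tick))
    return s2_calls, commands


s2_calls, commands = run(1)
print(s2_calls, len(commands))
```

The point of the split is that the fast loop never waits on the slow one: between refreshes it keeps acting on the most recent latent.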

6. New hands = new data challenges

Brett emphasizes that the newly released high-DOF hand has "as many joints as a human hand," with the goal of being able to passively learn from human videos. This points to a new direction:

Use large amounts of internet human video (YouTube cooking, chores) for pre-training
The data emphasis shifts from "how the robot moves" to "how the human hand moves in the video"
High-precision hand pose estimation is needed, likely involving parametric hand representations like the MANO model

Reachy Mini: Hugging Face's App Store for desktop robots

One-line version: Hugging Face built an "App Store" for a small desktop robot called Reachy Mini. Just like the iPhone App Store — except this time the "phone" is a robot.

Some terms first

Hugging Face: originally a model hosting platform — most open-source models live there. Last year they acquired French robotics company Pollen Robotics, stepping from pure software into hardware.

Reachy Mini: a desktop-scale robot from Pollen. Important — it's not a humanoid like Figure. Specs:

28 cm tall, 1.5 kg, sits on a desk
One movable head (6 DOF) + two antennae + one camera + microphone
Price: $299-$449
Fully open source

Compared to Figure 3 (a $60K machine that can do laundry), it's a different species. This is closer to a "moving smart speaker + AI toy."