A Decade of Humanoid Robotics: From ZMP to VLA

The Paradigm Shift from Model-Based Control to Learned Policies

A decade of humanoid robotics (2015–2026): four catalysts, System 0/1/2, frontier companies, and Manufacturing Physical AI.

First published: 2026-04-24 | Last updated: 2026-06-18

16 Chapters, 5 Parts

From the old stack to VLA in one book.

Foundations + Modern Theory

LIPM/QP to PPO, Transformers, Diffusion policy, and VLA — theory bridges included.

Manufacturing Physical AI Lens

Where Korea's manufacturing strength meets global humanoid competition.

Part I: The Old Stack and Its Legacy

01

The Orthodox Stack (2003–2015): Glory and Limits

Kajita's LIPM, ZMP preview control, whole-body QP, and capture-point footstep planning. The ASIMO, HRP, and DRC-Atlas lineage. Why model uncertainty, contact, and latency made this paradigm brittle.

02

Foundations That Still Matter

Theoretical legacies from LIPM, ZMP, whole-body QP, and MPC that survived into hybrid controllers and System 0 PD loops. Foundational concepts readers need before Parts II–III.

03

Paradigm Shift Overview

A map of how the four catalysts depend on each other: quasi-direct-drive (QDD) actuators underwrite RL, GPU simulation underwrites domain randomization (DR), and so on. A roadmap for the rest of the book.

Part II: The Four Catalysts

04

Hardware: QDD Actuators

The MIT Cheetah (2017) lineage. Outer-rotor BLDC motors with low-ratio planetary gears enable backdrivability, high bandwidth, and proprioceptive sensing of ground reaction forces. The path to Unitree, Figure, and 1X.

05

GPU Massively Parallel Simulation

Isaac Gym (2021) as the inflection point and Rudin et al.'s ANYmal-in-minutes. Isaac Lab, MuJoCo MJX, Genesis, and Humanoid-Gym as the 2026 standard. The sample scale that made domain randomization practical.

06

The Learning Algorithm Canon

Hwangbo (2019) actuator network, Lee (2020) teacher-student, Kumar (2021) RMA, Siekmann (2021) Cassie, Radosavovic (2023) transformer policy on a full-size humanoid. The history-encoder progression: TCN → LSTM → Transformer.

07

Sim-to-Real: Three Strategies

Domain randomization, system ID with actuator networks, and residual corrections (ASAP-style delta action). Why reactive footsteps now emerge per control tick instead of being planned.

Part III: The 2026 Standard Stack

08

Modern Theory Primer

A theory bridge for digesting Parts II–III — RL and policy gradients (PPO/TD3), transformer history encoders and in-context adaptation, diffusion policy, and the VLA concept. The 'new foundations' paired with Ch 2's classical ones.

09

The 3-Layer System 0/1/2 Architecture

The industry lingua franca after Figure's naming. Three layers (System 0, 1, and 2, respectively) running at different frequencies (1 kHz / 100 Hz / 7–10 Hz) and parameter scales (10M / 1B / 7B), communicating asynchronously. How this differs fundamentally from the old decoupled pipeline.

10

VLA and Loco-Manipulation Integration

OpenVLA, GR00T N1/N1.5, Helix, GO-1/GO-2, and π0. Treating locomotion as a solved primitive beneath a VLA. The roles of diffusion policy and latent action.

Part IV: Frontier Company Analyses

11

The Incumbent: Boston Dynamics

Electric Atlas (56 DOF), the RAI Institute joint RL pipeline, and TRI's Large Behavior Model. Decades of MPC and simulation assets complementing — not replaced by — RL. The canonical hybrid MPC+RL system.

12

US Challengers: Figure AI and Agility Robotics (with Tesla Optimus Outlook)

Figure Helix 02's end-to-end 'pixels to whole body' with BotQ vertical integration vs. Agility's Motor Cortex and GXO deployment leadership. The chapter closes with what's publicly known about Tesla Optimus and the positions still unknown.

13

China's Leaders: Unitree and AgiBot

Unitree's G1 at $16K and unitree_rl_gym as the research-grade Android, versus AgiBot's GO-2, Genie Sim 3.0, and the million-trajectory AgiBot World. Two Chinese strategies: open hardware reference versus vertically integrated data-sim-deployment.

Part V: Korea's Opportunity and Future Scenarios

14

Korea's Position and the K-Humanoid Alliance

The technical positions of Korea's ecosystem — Hyundai Robotics, Rainbow Robotics, NAVER LABS, KAIST, SNU and others. Diagnosing the gap with frontier groups and assessing government initiatives such as the K-Humanoid Alliance.

15

The Four Differentiation Axes Through the Manufacturing Physical AI Lens

Manipulation data, onboard VLA, fleet learning, and cross-embodiment — how each axis intersects Korean manufacturing strengths (semiconductors, automotive, shipbuilding, batteries). The strategic requirements for a 'Manufacturing Physical AI' thrust.

16

Staged Diffusion: Manufacturing Physical AI Conquest and Industrial Spread

A conquest sequence in Manufacturing Physical AI — unlocking dexterous manipulation, then autonomous fixed-line automation, then flexible manufacturing — followed by domestic industrial spread (semiconductors, automotive, shipbuilding, batteries), overseas spread, and the final transition to services and homes. Energy, regulatory, and labor bottlenecks throughout.

Appendices

Glossary

Key term definitions

A

Consolidated References

Full bibliography