In a surprise move that could reshape the robotics industry, Xiaomi has officially announced Xiaomi-Robotics-0, its first-generation open-source robot large model. Known globally for smartphones and smart home gadgets, the company is now stepping deep into Embodied AI, and it’s not playing small.
The 4.7-billion-parameter Vision-Language-Action (VLA) model was unveiled on February 12, 2026. According to the company, it has already achieved state-of-the-art (SOTA) results across multiple robotics benchmarks. That’s not a minor claim; it’s a statement of intent.
Xiaomi-Robotics-0 Architecture
At its core, Xiaomi-Robotics-0 is built on a Mixture-of-Transformers (MoT) hybrid architecture. In simple terms, it splits intelligence into two main parts:
Vision-Language Model (VLM): Acts as the “brain.” It interprets human instructions and understands visual scenes.
Action Expert (Diffusion Transformer-based): Works like a “cerebellum,” generating smooth sequences of movements called “Action Chunks.”
This closed loop of perception, decision, and execution is what Xiaomi calls the foundation of “physical intelligence.”
And frankly, this is where many robotics models struggle. They either understand well but move poorly, or move well but lose reasoning ability. Xiaomi claims to have struck that balance.
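To make the two-part split concrete, here is a minimal Python sketch of a perception, decision, execution loop. It is a toy under stated assumptions, not Xiaomi's code: the names `vlm_plan`, `action_expert`, `run_closed_loop`, and `CHUNK_SIZE` are hypothetical, the "scene" is a single 1-D position, and simple arithmetic stands in for the real VLM and Diffusion Transformer.

```python
from dataclasses import dataclass
from typing import List

CHUNK_SIZE = 4  # assumed number of actions per chunk (illustrative only)


@dataclass
class Observation:
    position: float   # toy 1-D robot state standing in for camera input
    instruction: str  # natural-language command, e.g. "move to 2.0"


def vlm_plan(obs: Observation) -> float:
    """Toy "brain": read the instruction and return a target position.

    Hypothetical parsing rule: the last word of the instruction is the target.
    """
    return float(obs.instruction.rsplit(" ", 1)[-1])


def action_expert(position: float, target: float) -> List[float]:
    """Toy "cerebellum": emit one chunk of equal-sized steps toward the target."""
    step = (target - position) / CHUNK_SIZE
    return [step] * CHUNK_SIZE


def run_closed_loop(obs: Observation, max_cycles: int = 10, tol: float = 1e-6) -> float:
    """Repeat perceive -> decide -> execute until the target is reached."""
    position = obs.position
    for _ in range(max_cycles):
        # Perception + decision: re-read the current state and the instruction.
        target = vlm_plan(Observation(position, obs.instruction))
        if abs(target - position) < tol:
            break
        # Execution: play back the whole action chunk before re-planning.
        for delta in action_expert(position, target):
            position += delta
    return position


final_position = run_closed_loop(Observation(0.0, "move to 2.0"))
```

The point of the sketch is the division of labor: one component decides what to do, the other turns that decision into a short sequence of movements, and the loop feeds the new state back in.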
Xiaomi-Robotics-0 Training
One of the biggest issues in robotics AI is inference latency, the delay between thinking and acting. If processing takes too long, robots can seem slow or jerky.
With Xiaomi-Robotics-0, the company introduced:
Action Proposal mechanism during training
Asynchronous inference mode to decouple thinking from movement
Clean Action Prefix to ensure smooth motion
Λ-shaped Attention Mask to prioritize current visual input
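The announcement does not spell out how the Λ-shaped mask is laid out, so the following Python sketch only shows the generic pattern the name usually refers to: every query token can see a fixed block of leading tokens plus a short window of recent tokens. Treating the leading block as the visual tokens is an assumption made here for illustration, and `lambda_mask`, `n_global`, and `window` are hypothetical names.

```python
from typing import List


def lambda_mask(seq_len: int, n_global: int, window: int) -> List[List[bool]]:
    """Build a causal Λ-shaped mask; mask[q][k] is True if query q may attend to key k.

    n_global: number of leading tokens every query can see (assumed: visual tokens).
    window:   number of most recent tokens (including itself) each query can see.
    """
    mask = []
    for q in range(seq_len):
        row = []
        for k in range(seq_len):
            causal = k <= q               # never look at future tokens
            is_global = k < n_global      # left arm of the Λ: leading tokens
            is_local = q - k < window     # right arm of the Λ: recent tokens
            row.append(causal and (is_global or is_local))
        mask.append(row)
    return mask


mask = lambda_mask(seq_len=6, n_global=2, window=2)
for row in mask:
    print("".join("x" if allowed else "." for allowed in row))
```

Printed as a grid, the allowed positions form two arms: a vertical stripe over the leading tokens and a diagonal band over the recent ones. Tokens in the middle of the sequence are hidden from later queries, which keeps the always-visible leading block prominent.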
The result? Continuous movement even when the AI “thinks” longer. That’s a practical breakthrough, especially since Xiaomi says the model runs on consumer-grade GPUs.
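The idea of decoupling thinking from movement can be illustrated with a small, deterministic simulation in Python. This is a sketch under assumptions, not Xiaomi's implementation: `CHUNK`, `LATENCY`, `infer_chunk`, and `run` are hypothetical, actions are plain integers, and the "clean action prefix" is interpreted here as the already-committed actions that will execute while the next inference call is still running.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

CHUNK = 8    # actions per chunk (assumed)
LATENCY = 3  # control ticks one inference call takes (assumed)


def infer_chunk(prefix: List[int]) -> List[int]:
    """Toy policy: keep the committed prefix, then continue counting after it."""
    nxt = prefix[-1] + 1 if prefix else 0
    return prefix + list(range(nxt, nxt + CHUNK - len(prefix)))


def run(ticks: int) -> Tuple[List[int], int]:
    """Simulate a control loop where inference overlaps with execution."""
    queue: Deque[int] = deque(infer_chunk([]))  # first chunk, computed before motion starts
    pending: Optional[Tuple[int, List[int], int]] = None  # (ready_tick, chunk, prefix_len)
    executed: List[int] = []
    idle_ticks = 0
    for t in range(ticks):
        # A finished inference call delivers only the actions beyond its prefix.
        if pending is not None and pending[0] == t:
            _, chunk, prefix_len = pending
            queue.extend(chunk[prefix_len:])
            pending = None
        # Start the next call once the remaining actions just cover the latency.
        if pending is None and len(queue) <= LATENCY:
            prefix = list(queue)  # actions that will run while the model "thinks"
            pending = (t + LATENCY, infer_chunk(prefix), len(prefix))
        # The robot executes one action per tick, or idles if none is available.
        if queue:
            executed.append(queue.popleft())
        else:
            idle_ticks += 1
    return executed, idle_ticks


executed, idle_ticks = run(20)
```

In this toy setup the robot never idles, because each new chunk is requested early enough and is built to continue from the actions already committed. If inference started only after the queue ran empty, the loop would stall for `LATENCY` ticks between chunks.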
Xiaomi-Robotics-0 Benchmark Results
According to Xiaomi’s official release and stock exchange disclosure (XIAOMI-W 01810.HK), the model outperformed nearly 30 competing systems in:
LIBERO
CALVIN
SimplerEnv
More importantly, this wasn’t just simulation hype.
In real-world testing on a dual-arm robot platform, Xiaomi-Robotics-0 successfully handled long-horizon tasks like folding towels and disassembling building blocks.
It handled both rigid and flexible objects without breaking down, which is one indicator of generalized physical intelligence.
Why Xiaomi-Robotics-0 Matters
This is more than just a tech demo.
Xiaomi is open-sourcing Xiaomi-Robotics-0, which makes the company not only a hardware giant, but also an AI infrastructure player. In my opinion, this is a smart long-term bet. Robotics will shape the next decade of automation. Open-source ecosystems tend to succeed over time.
Xiaomi isn’t just competing with consumer electronics brands anymore. It’s stepping into territory dominated by AI labs and robotics startups.
And if the benchmark claims hold up under independent scrutiny, this could mark Xiaomi’s most ambitious pivot yet.
The robotics race just got more interesting, and Xiaomi-Robotics-0 might be the company’s boldest move of 2026 so far.