On Tuesday, Google DeepMind released a new language model called Gemini Robotics On-Device, which runs locally on robots and can perform tasks without an Internet connection.
Built on the previous Gemini Robotics model, released in March, Gemini Robotics On-Device can control a robot's movements. Developers can control and fine-tune the model to suit different needs using natural language prompts.
In benchmarks, Google claims the model performs at a level close to the cloud-based Gemini Robotics model, and says it outperforms other on-device models on common benchmarks, though it does not name those models.
In a demo, the company showed robots performing tasks such as unzipping bags and folding clothes. Google says that although the model was developed for ALOHA robots, it was later adapted to work with the bi-arm Franka FR3 robot and Apptronik's Apollo humanoid robot.
Google claims that the bi-arm Franka FR3 successfully handled scenarios and objects it had not “seen” before, such as performing assembly tasks on an industrial belt.
Google DeepMind is also releasing a Gemini Robotics SDK. The company said developers can show robots 50 to 100 demonstrations of a task to train them on new tasks using these models in the MuJoCo physics simulator.
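Google's announcement does not show the SDK's actual interface, but as a rough sketch of what collecting task demonstrations in MuJoCo can look like, the snippet below rolls out a scripted stand-in policy in a toy scene and logs state-action pairs. The scene XML, the `record_demo` helper, and the `zero_policy` controller are all hypothetical illustrations and not Gemini Robotics SDK calls; only the MuJoCo Python API itself is real.

```python
# Hypothetical sketch: logging demonstration trajectories in the MuJoCo simulator.
# The scene, the policy, and record_demo() are illustrative stand-ins, not the
# Gemini Robotics SDK workflow described by Google.
import mujoco
import numpy as np

SCENE_XML = """
<mujoco>
  <worldbody>
    <geom type="plane" size="1 1 0.1"/>
    <body name="box" pos="0 0 0.1">
      <joint type="free"/>
      <geom type="box" size="0.05 0.05 0.05" mass="0.1"/>
    </body>
  </worldbody>
</mujoco>
"""

def record_demo(model, data, policy, steps=500):
    """Roll out a policy (scripted here, teleoperated in practice) and log (state, action) pairs."""
    trajectory = []
    for _ in range(steps):
        action = policy(data)
        data.ctrl[:] = action[: model.nu]   # apply actuator commands, if the model has any
        mujoco.mj_step(model, data)         # advance the simulation by one timestep
        trajectory.append((data.qpos.copy(), action.copy()))
    return trajectory

model = mujoco.MjModel.from_xml_string(SCENE_XML)

# A trivial "policy" standing in for a human demonstration.
zero_policy = lambda d: np.zeros(max(model.nu, 1))

# Google says roughly 50 to 100 demonstrations per task are enough to adapt the model.
demos = [record_demo(model, mujoco.MjData(model), zero_policy) for _ in range(50)]
print(f"Collected {len(demos)} demonstrations of {len(demos[0])} steps each.")
```

In an actual workflow, the scripted policy would be replaced by teleoperated or kinesthetic demonstrations, and the logged trajectories would be fed to the SDK's fine-tuning tooling rather than kept as plain Python lists.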
Other AI model developers are also dipping their toes into robotics. Nvidia is building a platform to create foundation models for humanoids; Hugging Face is not only developing open models and datasets for robotics but also working on robots themselves; and Mirae Asset-backed Korean startup RLWRLD is working on foundation models for robots.









