News

It is built on 3D-LLM and uses interaction tokens to engage with the environment. Embodied diffusion models are trained and aligned with the LLM to predict goal images and point clouds. We will ...