We present SonoGym, a scalable simulation platform for robotic ultrasound that enables parallel simulation across tens to hundreds of environments. Our framework supports realistic, real-time simulation of ultrasound data from CT-derived 3D anatomy models through both a physics-based and a Generative Adversarial Network (GAN) approach. It enables the training of deep reinforcement learning (DRL) agents and recent imitation learning (IL) agents (vision transformers and diffusion policies) for ultrasound-guided navigation, anatomy reconstruction, and surgery. We believe our simulation can facilitate research in robot learning for such challenging robotic surgery applications. Future research directions include improving the quality and diversity of the ultrasound simulation, modeling soft-tissue deformation, scaling to larger patient populations, improving generalization across patients, and validation with real systems in clinical settings.
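Conceptually, training in SonoGym amounts to stepping a batch of environments in lockstep. The sketch below illustrates this pattern with a toy gymnasium-style batched interface; the class, observation shapes, and 6-DoF action layout are illustrative assumptions, not SonoGym's actual API.

```python
import numpy as np

class BatchedUltrasoundEnv:
    """Toy stand-in for N parallel ultrasound environments (placeholder,
    not SonoGym's real interface)."""

    def __init__(self, num_envs: int, img_shape=(64, 64)):
        self.num_envs = num_envs
        self.img_shape = img_shape

    def reset(self):
        # One simulated B-mode image per environment (here: random noise).
        return np.random.rand(self.num_envs, *self.img_shape).astype(np.float32)

    def step(self, actions: np.ndarray):
        assert actions.shape[0] == self.num_envs
        obs = np.random.rand(self.num_envs, *self.img_shape).astype(np.float32)
        rewards = np.zeros(self.num_envs, dtype=np.float32)
        dones = np.zeros(self.num_envs, dtype=bool)
        return obs, rewards, dones, {}

env = BatchedUltrasoundEnv(num_envs=128)  # tens to hundreds of environments
obs = env.reset()
for _ in range(10):
    # Hypothetical 6-DoF probe-motion deltas for every environment at once.
    actions = np.random.uniform(-1.0, 1.0, size=(env.num_envs, 6))
    obs, rewards, dones, info = env.step(actions)
```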
We demonstrate high-performing Proximal Policy Optimization (PPO), Action Chunking Transformer (ACT), and Diffusion Policy agents for the navigation task. The bottom of each video shows the learning-based ultrasound simulation from the first 8 environments. The goal plane for navigation is the transverse plane through the center of the L4 lumbar vertebra.
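As a rough illustration of how such a plane-reaching objective could be scored, the sketch below computes a dense reward from the probe's perpendicular distance to the goal plane and its angular misalignment with it. This is one plausible formulation, not necessarily the reward used in our framework; the function name and weights are hypothetical.

```python
import numpy as np

def plane_alignment_reward(probe_pos, probe_normal, plane_point, plane_normal,
                           w_dist=1.0, w_angle=0.5):
    """Toy dense reward for steering the probe toward a goal imaging plane.

    probe_pos:    (3,) probe position
    probe_normal: (3,) unit normal of the current imaging plane
    plane_point:  (3,) a point on the goal plane (e.g. the vertebra center)
    plane_normal: (3,) unit normal of the goal plane
    """
    # Perpendicular distance from the probe to the goal plane.
    dist = abs(np.dot(probe_pos - plane_point, plane_normal))
    # Angular misalignment between the current and goal imaging planes.
    cos_angle = np.clip(abs(np.dot(probe_normal, plane_normal)), 0.0, 1.0)
    angle = np.arccos(cos_angle)
    return -(w_dist * dist + w_angle * angle)

# Example call with made-up geometry (units: meters, radians).
r = plane_alignment_reward(
    probe_pos=np.array([0.01, 0.00, 0.12]),
    probe_normal=np.array([0.0, 0.0, 1.0]),
    plane_point=np.array([0.0, 0.0, 0.10]),
    plane_normal=np.array([0.0, 0.0, 1.0]),
)
```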
We demonstrate high-performing submodular Proximal Policy Optimization agents for the anatomy reconstruction task, alongside the reconstruction from a heuristic trajectory for comparison. The top-right corner shows the agent's real-time observation: the current surface reconstruction transformed into the ultrasound probe frame. The red vertebra model is shown only for visualization and is not part of the observation. The bottom-right corner shows the reconstruction status, with covered and uncovered surface points colored yellow and blue, respectively.
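The marginal-coverage structure behind such a submodular objective can be sketched as follows: each step is rewarded by the number of previously uncovered surface points that the new sweep covers, so every point pays off at most once (coverage is monotone submodular). The function, coverage radius, and point-cloud representation below are illustrative assumptions, not our exact implementation.

```python
import numpy as np

def coverage_gain_reward(surface_pts, covered_mask, scan_pts, radius=0.003):
    """Marginal-coverage reward: count previously uncovered surface points
    that fall within `radius` of any point imaged in the new sweep.

    surface_pts:  (N, 3) target surface point cloud
    covered_mask: (N,) bool, points covered so far
    scan_pts:     (M, 3) surface points seen in the current step
    """
    newly_covered = np.zeros(len(surface_pts), dtype=bool)
    for p in scan_pts:
        d = np.linalg.norm(surface_pts - p, axis=1)
        newly_covered |= d < radius
    gained = newly_covered & ~covered_mask      # marginal gain of this step
    covered_mask |= newly_covered               # update coverage in place
    return float(gained.sum()), covered_mask

# Toy usage: a random surface, then one sweep over part of it.
surface = np.random.rand(500, 3) * 0.05        # toy vertebra surface (m)
covered = np.zeros(500, dtype=bool)
sweep = surface[:20] + 1e-4                    # points imaged this step
reward, covered = coverage_gain_reward(surface, covered, sweep)
```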
We demonstrate high-performing safe Proximal Policy Optimization and Action Chunking Transformer agents for the ultrasound-guided surgery task. The right half of each video shows the trajectories (green) from 50 environments toward the target L4 vertebra (blue); the goal frame, i.e., the end point of the trajectory, is annotated in red. With PPO + safety filter, actions predicted to be unsafe are stopped before the instrument enters the target vertebra.
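A minimal sketch of such an action-level safety filter follows, assuming a point-cloud vertebra model and a simple one-step kinematic prediction; both are illustrative placeholders, not our actual filter.

```python
import numpy as np

def safety_filter(tool_tip, action, vertebra_pts, min_dist=0.002, dt=0.05):
    """Toy safety filter: predict the next tool-tip position under `action`
    and stop the motion if it would come closer than `min_dist` to the
    vertebra surface (represented here as a point cloud).

    tool_tip:     (3,) current tool-tip position
    action:       (6,) velocity command; first three entries are translation
    vertebra_pts: (K, 3) vertebra surface points
    """
    next_tip = tool_tip + dt * action[:3]       # one-step prediction (translation only)
    clearance = np.linalg.norm(vertebra_pts - next_tip, axis=1).min()
    if clearance < min_dist:
        return np.zeros_like(action)            # unsafe: stop the motion
    return action

# Toy usage with made-up geometry (units: meters).
filtered = safety_filter(
    tool_tip=np.array([0.0, 0.0, 0.05]),
    action=np.array([0.0, 0.0, -1.0, 0.0, 0.0, 0.0]),
    vertebra_pts=np.random.rand(1000, 3) * 0.04,
)
```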