Visuotactile sensors provide rich contact information and thus hold great potential for contact-rich manipulation tasks with reinforcement learning (RL) policies. Sim2Real techniques address RL's reliance on large amounts of interaction data. However, most Sim2Real methods for manipulation tasks with visuotactile sensors rely on rigid-body physics simulation, which fails to precisely reproduce the real elastic deformation of the sensor. Moreover, these methods do not exploit the characteristics of tactile signals when designing the network architecture.
In this article, we build a general-purpose Sim2Real protocol for manipulation policy learning with marker-based visuotactile sensors. To improve simulation fidelity, we employ an FEM-based physics simulator that can simulate the sensor deformation accurately and stably for arbitrary geometries. We further propose a novel tactile feature extraction network that directly processes the set of pixel coordinates of the tactile sensor markers, together with a self-supervised pretraining strategy, to improve the efficiency and generalizability of RL policies. We conduct extensive Sim2Real experiments on the peg-in-hole task to validate the effectiveness of our method, and we further show its generalizability on additional tasks including plug adjustment and lock opening.
We present a physics simulation method for marker-based visuotactile sensors using Incremental Potential Contact (IPC), which builds on the Finite Element Method (FEM). IPC accurately simulates the large deformations and dynamic properties of elastomers, employing a barrier energy for contact modeling and continuous collision detection to remain stable at large time steps. Our method models robot actions as Dirichlet boundary conditions on the elastomer mesh, simulating sensor deformation by prescribing the positions and velocities of the constrained vertices. This approach enables accurate and efficient simulation of visuotactile sensors, leading to a small domain gap between simulation and the real sensor.
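To make the boundary-condition treatment concrete, the following minimal Python/NumPy sketch shows how a robot action, given as a rigid transform, could be imposed as a Dirichlet boundary condition on the elastomer's mounting vertices before the solver computes the free vertices. Here `ipc_solve`, `constrained_ids`, and the transform representation are hypothetical placeholders for illustration, not the API of an actual IPC implementation.

```python
import numpy as np

def step_with_dirichlet_bc(vertices, constrained_ids, action_transform, dt, ipc_solve):
    """One hypothetical simulation step: the robot action rigidly moves the
    elastomer's mounting surface; this motion is imposed as a Dirichlet
    boundary condition, and the IPC solver finds the free vertex positions."""
    R, t = action_transform                       # rotation (3x3), translation (3,)
    # Prescribed positions of the constrained (boundary) vertices after the action.
    targets = vertices[constrained_ids] @ R.T + t
    # Prescribed velocities, consistent with moving to `targets` over one step.
    bc_velocity = (targets - vertices[constrained_ids]) / dt
    # Placeholder for the FEM/IPC solve: minimize elastic + barrier (contact)
    # energy subject to the constrained vertices exactly following `targets`.
    return ipc_solve(vertices, constrained_ids, targets, bc_velocity, dt)
```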
In this work, we use the marker flow as the tactile sensor signal and propose an efficient tactile feature extractor based on PointNet. The marker-based tactile representation and the point cloud learning architecture naturally handle the marker-position input and extract both global and local tactile features. Randomization further enhances generalizability and improves Sim2Real performance.
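To illustrate the idea, the minimal PointNet-style encoder below (PyTorch) treats each marker as a point whose features are its initial pixel coordinates and its displacement; a shared per-point MLP produces local features, and permutation-invariant max pooling aggregates them into a global feature. The 4-D input layout, layer sizes, and feature dimension are illustrative assumptions, not the exact architecture used in the paper.

```python
import torch
import torch.nn as nn

class MarkerFlowEncoder(nn.Module):
    """Minimal PointNet-style encoder for marker flow (a sketch).
    Each marker contributes a 4-D point: (u0, v0, du, dv) -- its initial
    pixel coordinates and its displacement."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        # Shared MLP, applied identically to every marker.
        self.point_mlp = nn.Sequential(
            nn.Linear(4, 64), nn.ReLU(),
            nn.Linear(64, 128), nn.ReLU(),
            nn.Linear(128, feat_dim),
        )

    def forward(self, markers: torch.Tensor) -> torch.Tensor:
        # markers: (B, N, 4); the marker ordering must not matter.
        local = self.point_mlp(markers)      # (B, N, feat_dim) local features
        global_feat, _ = local.max(dim=1)    # (B, feat_dim) permutation-invariant pooling
        return global_feat

# Example: a batch of 8 readings from a sensor with 63 markers.
enc = MarkerFlowEncoder()
z = enc(torch.randn(8, 63, 4))  # -> (8, 128)
```

Because the encoder operates on an unordered set of marker coordinates rather than a rendered image, it is insensitive to the marker ordering and avoids rendering the tactile image altogether.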
To enhance sample efficiency and training stability, we pretrain the tactile feature extractor with an autoencoder. We design a decoder that reconstructs all marker positions from the original marker positions and the latent feature.
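A minimal sketch of one such pretraining step is shown below, reusing the MarkerFlowEncoder sketch above. The decoder architecture, the plain MSE reconstruction loss, and all hyperparameters are illustrative assumptions, not the exact design from the paper.

```python
import torch
import torch.nn as nn

class MarkerDecoder(nn.Module):
    """Hypothetical decoder: given each marker's initial position and the
    global latent feature, predict that marker's current position."""

    def __init__(self, feat_dim: int = 128):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(2 + feat_dim, 128), nn.ReLU(),
            nn.Linear(128, 64), nn.ReLU(),
            nn.Linear(64, 2),
        )

    def forward(self, init_pos: torch.Tensor, z: torch.Tensor) -> torch.Tensor:
        # init_pos: (B, N, 2); z: (B, feat_dim), broadcast to every marker.
        z_tiled = z.unsqueeze(1).expand(-1, init_pos.shape[1], -1)
        return self.mlp(torch.cat([init_pos, z_tiled], dim=-1))  # (B, N, 2)

# One self-supervised pretraining step on stand-in data.
encoder, decoder = MarkerFlowEncoder(), MarkerDecoder()
opt = torch.optim.Adam([*encoder.parameters(), *decoder.parameters()], lr=1e-3)
init_pos, cur_pos = torch.rand(8, 63, 2), torch.rand(8, 63, 2)
flow = torch.cat([init_pos, cur_pos - init_pos], dim=-1)   # (B, N, 4) encoder input
loss = nn.functional.mse_loss(decoder(init_pos, encoder(flow)), cur_pos)
opt.zero_grad(); loss.backward(); opt.step()
```

After pretraining, the decoder is discarded and the encoder's weights initialize the RL policy's tactile branch.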
We achieve zero-shot Sim2Real transfer for high-precision, contact-rich manipulation tasks.
We design an ablation study comparing our proposed marker-based tactile representation with the conventional image-based tactile representation. With identical randomization parameters, the marker-based representation demonstrates clear advantages over the image-based one.
Here we demonstrate that the pretrained tactile encoder allows the policy to achieve a considerably high Sim2Real success rate, even at very early stages of training.