Wellcome / EPSRC Centre for Interventional and Surgical Sciences


Ex-vivo dVRK segmentation dataset with kinematic data


This dataset is associated with our MICCAI 2020 paper “Synthetic and Real Inputs for Tool Segmentation in Robotic Surgery” and consists of 14 + 6 videos of 300 frames each, with corresponding segmentation ground truth and robot kinematic data. Frames were initially recorded at 720x576 pixel resolution and then centrally cropped to 538x701 to remove side camera artefacts. The technique used to produce the segmentation labels, together with further details on the dataset, is described in our paper.

Dataset structure

The overall dataset is divided into training (Video_01 to Video_08), validation (Video_09 and Video_10) and test (Video_11 to Video_14) sets.
For each video we provide all the data we collected to produce our dataset. Each video folder is organized as follows:

  • images folder: set of 300 video frames, centrally cropped to 538x701. No processing other than the central crop has been applied to the frames.
  • green_screen folder: set of 300 images collected with a green background, used to produce our segmentation ground truth. Images have been cropped to 538x701 to be consistent with the frames from the previous folder.
  • ground_truth folder: set of 300 binary segmentation masks corresponding to the frames in the images folder.
  • background.png: image of the green background with no surgical tools, employed to produce the ground-truth masks via background removal.
  • kinematic.mat: MATLAB file containing the joint values of the Patient Side Manipulators (PSMs) and the Endoscope Control Manipulator (ECM) for each video. The file contains 5 arrays: joint_values1 (6x300), joint_values3 (6x300), jaw_values1 (1x300), jaw_values3 (1x300) and ecm_values (4x300), where joint_values1 and jaw_values1 refer to PSM 1, and joint_values3 and jaw_values3 refer to PSM 3.
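As a quick sanity check, the kinematic arrays can be read with SciPy's MAT-file loader. The snippet below builds a small synthetic stand-in for kinematic.mat using the array names and shapes documented above (the zero values are placeholders, not real joint data) and verifies the shapes after loading:

```python
import numpy as np
from scipy.io import loadmat, savemat

N = 300  # frames per video

# Synthetic stand-in for kinematic.mat: same array names and shapes
# as documented above, placeholder values only.
savemat("kinematic.mat", {
    "joint_values1": np.zeros((6, N)),  # PSM 1 joint values
    "joint_values3": np.zeros((6, N)),  # PSM 3 joint values
    "jaw_values1": np.zeros((1, N)),    # PSM 1 jaw values
    "jaw_values3": np.zeros((1, N)),    # PSM 3 jaw values
    "ecm_values": np.zeros((4, N)),     # ECM joint values
})

kin = loadmat("kinematic.mat")
assert kin["joint_values1"].shape == (6, N)
assert kin["jaw_values3"].shape == (1, N)
assert kin["ecm_values"].shape == (4, N)
```

Note that loadmat also returns MATLAB header entries (keys starting with "__"), which can be ignored.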

A fourth folder, named “others”, contains 6 more videos whose segmentation ground-truth masks are corrupted due to a failure in robot movement repeatability. Although we did not use these videos in our paper, we provide them with the same structure as the previous ones.
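The background-removal step used to produce the masks can be sketched as a per-pixel difference between a green-screen frame and background.png. This is an illustrative simplification only: the threshold value and the absence of any morphological clean-up are assumptions, not the exact pipeline from the paper.

```python
import numpy as np

def background_removal_mask(frame, background, threshold=30):
    """Binary tool mask via per-pixel difference against the empty
    green-screen background. `threshold` is an illustrative value."""
    diff = np.abs(frame.astype(np.int16) - background.astype(np.int16))
    # A pixel is labelled "tool" if it differs enough in any channel.
    return (diff.max(axis=-1) > threshold).astype(np.uint8)

# Toy example: a uniform green background with a grey square "tool".
bg = np.zeros((64, 64, 3), dtype=np.uint8)
bg[..., 1] = 200                     # green channel
frame = bg.copy()
frame[20:40, 20:40] = 80             # simulated instrument region
mask = background_removal_mask(frame, bg)
# mask is 1 inside the 20x20 instrument region and 0 elsewhere.
```

In practice, the real frames would be loaded from the green_screen folder and compared against background.png.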

We provide camera calibration parameters in "camera_calibration.mat". These were produced by moving a chessboard in front of the camera and processing the acquired images with MATLAB's Camera Calibrator app.
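The exact field names inside camera_calibration.mat depend on how the Camera Calibrator output was saved, so inspect the file's keys first (e.g. with scipy.io.loadmat). Once the intrinsic matrix is extracted, it can be used to project 3D points in the camera frame into the image. The intrinsic values below are placeholders, not the dataset's real calibration:

```python
import numpy as np

# Placeholder pinhole intrinsics (NOT the dataset's real calibration):
# fx, fy = focal lengths in pixels; cx, cy = principal point.
K = np.array([[800.0,   0.0, 269.0],
              [  0.0, 800.0, 350.5],
              [  0.0,   0.0,   1.0]])

def project(K, point_3d):
    """Project a 3D point in the camera frame to pixel coordinates."""
    p = K @ point_3d
    return p[:2] / p[2]  # perspective divide

uv = project(K, np.array([0.01, -0.02, 0.5]))  # point 0.5 m from camera
# uv -> array([285. , 318.5])
```

Combined with the kinematic data, such a projection lets PSM tool poses be related to image coordinates, as done in hybrid image/kinematics segmentation approaches.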

Finally, we provide the Python code as well as the weights of the proposed model. The code requires:

   - tensorflow == 1.14
   - keras == 2.3.0


The dataset can be downloaded here. Please cite our publication in any academic publication or research report that makes use of this dataset.


This dataset is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International (CC BY-NC-SA 4.0) license.



For comments, suggestions or feedback, or if you experience any problems with this website or the dataset, please contact Emanuele Colleoni at emanuele.colleoni.19@ucl.ac.uk.