The MVSEC Dataset

April 18, 2019: We have generated HDF5 files as a ROS-free alternative for the dataset. They are now available for download; please see the Downloads section.

Ground truth optical flow is now available, as presented in “EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras”. See the Downloads page for more info.

The Multi Vehicle Stereo Event Camera Dataset

Sensors mounted on multiple vehicles.

The Multi Vehicle Stereo Event Camera dataset is a collection of data designed for the development of novel 3D perception algorithms for event based cameras. Stereo event data is collected from car, motorbike, hexacopter and handheld platforms, and fused with lidar, IMU, motion capture and GPS to provide ground truth pose and depth images. In addition, we provide images from a standard stereo frame based camera pair for comparison with traditional techniques.

Event based cameras are a new asynchronous sensing modality that measures changes in image intensity. When the log intensity at a pixel changes by more than a set threshold, the camera immediately returns the pixel location of the change, along with a timestamp with microsecond accuracy and the direction of the change (up or down). This allows for sensing with extremely low latency. In addition, the cameras have extremely high dynamic range and low power usage.
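The event model described above can be sketched in a few lines. This is only a frame-differencing approximation, not how the DAVIS hardware works: a real sensor fires each pixel asynchronously, whereas this sketch compares two intensity frames and stamps every event with the same time. The function name `generate_events`, the threshold of 0.2, and the small offset `eps` are all illustrative assumptions.

```python
import math


def generate_events(prev_frame, next_frame, t_us, threshold=0.2, eps=1e-3):
    """Return (x, y, t_us, polarity) for every pixel whose log intensity
    changed by at least `threshold` between two frames.

    Frames are lists of rows of non-negative intensities. `threshold` and
    `eps` are hypothetical values chosen for illustration only.
    """
    events = []
    for y, (row_prev, row_next) in enumerate(zip(prev_frame, next_frame)):
        for x, (i_prev, i_next) in enumerate(zip(row_prev, row_next)):
            # eps avoids log(0) for completely dark pixels.
            delta = math.log(i_next + eps) - math.log(i_prev + eps)
            if abs(delta) >= threshold:
                polarity = 1 if delta > 0 else -1  # up or down
                events.append((x, y, t_us, polarity))
    return events


# Tiny 2x2 example: one pixel brightens, one darkens, two stay the same.
prev_frame = [[0.5, 0.5], [0.5, 0.5]]
next_frame = [[0.5, 1.0], [0.2, 0.5]]
events = generate_events(prev_frame, next_frame, t_us=1000)
```

Only the two pixels that changed produce events; static pixels produce no output at all, which is why event streams are sparse in static scenes.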

Sequences


Hexacopter Indoor	Hexacopter Outdoor	Handheld

Daytime Driving	Nighttime Driving

Data was collected from four different vehicles, in both indoor and outdoor environments, in day and night settings. All hexacopter sequences have motion capture ground truth from an indoor Vicon area and outdoor Qualisys area, while the other sequences have ground truth generated by fusing lidar information with IMU and GPS. The full list of sequences can be found below:

Vehicle	Sequences
Hexacopter	Indoor short, Indoor long, Outdoor afternoon, Outdoor evening
Handheld	Indoor-outdoor, Outdoor-indoor
Outdoor Car	Pennovations day, Pennovations evening, West Philadelphia day, West Philadelphia evening
Motorcycle	Motorcycle highway

Indoor Vicon motion capture area.

Outdoor Qualisys motion capture area.

Sensors

A number of different sensors and modalities are rigidly mounted to a stereo event camera pair, in order to generate accurate ground truth information, as well as to provide avenues for research in sensor fusion between modalities.

For events, two experimental DAVIS 346B cameras are mounted in a stereo (X axes aligned) configuration. Each camera has a resolution of 346x260 pixels, with a 4mm lens and roughly 70 degrees vertical field of view. The camera clocks are synchronized by a hardware trigger generated by the left camera and sent to the right camera. In addition to events, the cameras also each generate IMU and frame based image measurements.
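As a rough illustration of what the stereo configuration provides, the sketch below converts a disparity between the two cameras into a depth using the standard rectified pinhole relation Z = f · B / d. Only the 4mm focal length comes from the text above. The 18.5 µm pixel pitch and the 0.10 m baseline are assumptions made for this example, and lens distortion is ignored, so treat the numbers as indicative only.

```python
def focal_length_px(focal_mm, pixel_pitch_um):
    """Convert a focal length in millimetres to pixels."""
    return focal_mm * 1000.0 / pixel_pitch_um


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in metres for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        # Zero disparity corresponds to a point at infinity.
        return float("inf")
    return focal_px * baseline_m / disparity_px


FOCAL_MM = 4.0         # lens focal length, from the text
PIXEL_PITCH_UM = 18.5  # assumption: not stated on this page
BASELINE_M = 0.10      # hypothetical baseline, for illustration only

f_px = focal_length_px(FOCAL_MM, PIXEL_PITCH_UM)
depth_m = depth_from_disparity(10.0, f_px, BASELINE_M)
```

Under these assumptions the focal length is about 216 pixels, and a 10 pixel disparity corresponds to a depth of roughly 2.2 m. For real use, take the intrinsics and baseline from the calibration files distributed with the dataset.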

In addition, a Velodyne lidar and a stereo frame based camera with IMU (VI Sensor) are mounted with the DAVIS cameras. When available, ground truth pose is also captured using an indoor (Vicon, left) or outdoor (Qualisys, right) motion capture system.

The full set of sensor characteristics can be found below:

Sensor	Characteristics
DAVIS 346B	346x260 pixels; APS (Active Pixel Sensor, for frame based images); DVS (Dynamic Vision Sensor, for events); FOV: 67° vert., 83° horiz.; IMU: MPU 6150 at 1kHz
VI-Sensor	Skybotix integrated VI-sensor; stereo camera: 2 x Aptina MT9V034, gray, 2x752x480 at 20fps, global shutter; FOV: 57° vert., 2 x 80° horiz.; IMU: ADIS16488 at 200Hz
Velodyne Puck LITE	VLP-16 PUCK LITE; FOV: 30° vert., 360° horiz.; 16 channel; 20Hz; 100m range
GPS	UBLOX NEO-M8N; 72-channel u-blox M8 engine; position accuracy 2.0m CEP
Motion Capture	Indoor: Vicon, 88 x 22 x 15 ft, 20 Vicon Vantage VP-16 cameras, 100Hz pose updates; Outdoor: Qualisys, 100 x 50 x 50 ft, 34 Qualisys Oqus 700 cameras, 100Hz pose updates

Ground Truth

For most sequences, accurate pose and depths are provided from a fusion of the sensors onboard.

Reconstructed map with trajectory in green.

Depth map (red) overlaid on the APS image from the DAVIS; lighter is farther.
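To give a sense of how a depth image relates to 3D range data, here is a minimal sketch that projects 3D points into a 346x260 image with a pinhole model and keeps the nearest depth per pixel. It is not the pipeline used to build the dataset's ground truth, which fuses lidar with pose estimates. It assumes the points are already expressed in the camera frame (Z forward), and the function name and the intrinsics `fx`, `fy`, `cx`, `cy` are hypothetical values for illustration.

```python
def project_to_depth_image(points, fx, fy, cx, cy, width=346, height=260):
    """Project camera-frame 3D points (X, Y, Z) into a depth image.

    Returns a height x width list of lists holding the nearest depth in
    metres for each pixel, or None where no point landed.
    """
    depth = [[None] * width for _ in range(height)]
    for X, Y, Z in points:
        if Z <= 0:
            continue  # point is behind the camera
        u = int(round(fx * X / Z + cx))
        v = int(round(fy * Y / Z + cy))
        if 0 <= u < width and 0 <= v < height:
            # Keep the closest surface when several points share a pixel.
            if depth[v][u] is None or Z < depth[v][u]:
                depth[v][u] = Z
    return depth


# Hypothetical intrinsics and a handful of example points.
points = [
    (0.0, 0.0, 5.0),    # projects to the image centre
    (0.0, 0.0, 3.0),    # same pixel, closer, so it wins
    (1.0, 0.5, 4.0),    # lands off-centre
    (0.0, 0.0, -1.0),   # behind the camera, ignored
    (100.0, 0.0, 1.0),  # outside the field of view, ignored
]
depth = project_to_depth_image(points, fx=216.0, fy=216.0, cx=173.0, cy=130.0)
valid = sum(1 for row in depth for d in row if d is not None)
```

Most pixels stay empty after projecting a single sparse scan, so a dense depth image requires accumulating many scans with accurate poses.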

Citations

Please cite the following papers when using this work in an academic publication:

For the main dataset, please cite:

Zhu, A. Z., Thakur, D., Ozaslan, T., Pfrommer, B., Kumar, V., & Daniilidis, K. (2018). The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception. IEEE Robotics and Automation Letters, 3(3), 2032-2039.

An arXiv preprint is also available:

Zhu, A. Z., Thakur, D., Ozaslan, T., Pfrommer, B., Kumar, V., & Daniilidis, K. (2018). The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception. arXiv preprint arXiv:1801.10202.

For the ground truth optical flow, please cite:

Zhu, A. Z., Yuan, L., Chaney, K., & Daniilidis, K. (2018). EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras. Robotics: Science and Systems 2018.

License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.