
The MVSEC Dataset

Ground truth optical flow is now available, as presented in “EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras”. See the Downloads page for more info.

The Multi Vehicle Stereo Event Camera Dataset

Sensors mounted on multiple vehicles.

The Multi Vehicle Stereo Event Camera dataset is a collection of data designed for the development of novel 3D perception algorithms for event-based cameras. Stereo event data is collected from car, motorbike, hexacopter and handheld platforms, and fused with lidar, IMU, motion capture and GPS to provide ground truth pose and depth images. In addition, we provide images from a standard stereo frame-based camera pair for comparison with traditional techniques.

Event-based cameras are a new asynchronous sensing modality that measures changes in image intensity. When the log intensity at a pixel changes by more than a set threshold, the camera immediately returns the pixel location of the change, along with a timestamp with microsecond accuracy and the direction of the change (up or down). This allows for sensing with extremely low latency. In addition, the cameras have extremely high dynamic range and low power usage.
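The thresholding rule described above can be illustrated with a minimal sketch. It is an idealized, frame-to-frame approximation and not the sensor's actual circuitry, which operates asynchronously per pixel. The function name `generate_events`, the threshold value of 0.2 and the `(x, y, t, polarity)` row layout are assumptions made for illustration only.

```python
import numpy as np

def generate_events(log_ref, log_new, t_us, threshold=0.2):
    """Return one (x, y, t, polarity) row per pixel whose log intensity
    changed by more than `threshold` between two log-intensity images.

    Idealized model: the threshold value and the output layout are
    illustrative assumptions, not the DAVIS output format.
    """
    diff = log_new - log_ref
    ys, xs = np.nonzero(np.abs(diff) > threshold)
    polarity = np.sign(diff[ys, xs]).astype(np.int64)  # +1 = up, -1 = down
    t = np.full(xs.shape, t_us, dtype=np.int64)        # timestamp in microseconds
    return np.stack([xs, ys, t, polarity], axis=1)

# Two synthetic 346x260 frames: one pixel gets brighter, one gets darker.
prev = np.full((260, 346), 100.0)
curr = prev.copy()
curr[10, 20] = 200.0   # row (y) 10, column (x) 20: intensity doubles
curr[30, 40] = 50.0    # row (y) 30, column (x) 40: intensity halves
events = generate_events(np.log(prev), np.log(curr), t_us=1000)
```

Only the two changed pixels produce events; the unchanged pixels produce nothing, which is why the output of an event camera is sparse for a mostly static scene.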

Sequences

Hexacopter Indoor
Hexacopter Outdoor
Handheld
Daytime Driving
Nighttime Driving
Data was collected from four different vehicles, in both indoor and outdoor environments, in day and night settings. All hexacopter sequences have motion capture ground truth from an indoor Vicon area and outdoor Qualisys area, while the other sequences have ground truth generated by fusing lidar information with IMU and GPS. The full list of sequences can be found below:
Vehicle | Sequence
Hexacopter
  • Indoor short
  • Indoor long
  • Outdoor afternoon
  • Outdoor evening
Handheld
  • Indoor-outdoor
  • Outdoor-indoor
Outdoor Car
  • Pennovations day
  • Pennovations evening
  • West Philadelphia day
  • West Philadelphia evening
Motorcycle
  • Motorcycle highway
Indoor Vicon motion capture area.
Outdoor Qualisys motion capture area.

Sensors

Full sensor configuration.

A number of different sensors and modalities are rigidly mounted to a stereo event camera pair, in order to generate accurate ground truth information, as well as to provide avenues for research in sensor fusion between modalities.

For events, two experimental DAVIS 346B cameras are mounted in a stereo (X axes aligned) configuration. Each camera has a resolution of 346x260 pixels, with a 4mm lens and roughly 70 degrees vertical field of view. The camera clocks are synchronized by a hardware trigger generated by the left camera and sent to the right camera. In addition to events, the cameras also each generate IMU and frame-based image measurements.
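A common way to inspect event data at this resolution is to accumulate events into per-pixel count images. The sketch below assumes the events are already available as arrays of x coordinates, y coordinates and polarities. The function name `event_count_image` and the two-channel layout are illustrative choices, not part of the dataset's tooling.

```python
import numpy as np

WIDTH, HEIGHT = 346, 260  # DAVIS 346B resolution stated above

def event_count_image(xs, ys, polarities):
    """Count events per pixel, split by polarity.

    Returns an array of shape (2, HEIGHT, WIDTH): channel 1 holds "up"
    events (polarity > 0), channel 0 holds "down" events.
    """
    counts = np.zeros((2, HEIGHT, WIDTH), dtype=np.int32)
    channel = (np.asarray(polarities) > 0).astype(np.intp)
    # np.add.at accumulates correctly when the same pixel appears repeatedly,
    # which plain fancy-index assignment would not.
    np.add.at(counts, (channel, np.asarray(ys), np.asarray(xs)), 1)
    return counts

# Three hypothetical events: two "up" events at the same pixel, one "down".
counts = event_count_image(xs=[5, 5, 7], ys=[1, 1, 2], polarities=[1, 1, -1])
```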

In addition, a Velodyne lidar and a stereo frame-based camera with IMU (VI Sensor) are mounted with the DAVIS cameras. When available, ground truth pose is also captured using an indoor (Vicon, left) or outdoor (Qualisys, right) motion capture system.

The full set of sensor characteristics can be found below:
Sensor | Characteristics
DAVIS 346B
  • 346x260 pixels
  • APS (Active Pixel Sensor for frame based images)
  • DVS (Dynamic Vision Sensor for events)
  • FOV: 50° vert., 65° horiz.
  • IMU: MPU 6150 at 1kHz
VI-Sensor
  • Skybotix integrated VI-sensor
  • stereo camera: 2 x Aptina MT9V034
  • gray 2x752x480 at 20fps, global shutter
  • FOV: 57° vert., 2 x 80° horiz.
  • IMU: ADIS16488 at 200Hz
Velodyne Puck LITE
  • VLP-16 PUCK LITE
  • FOV: 30° vert., 360° horiz.
  • 16 channel
  • 20Hz
  • 100m range
GPS
  • UBLOX NEO-M8N
  • 72-channel u-blox M8 engine
  • Position accuracy 2.0m CEP
Motion Capture
  • Indoor Vicon: 88 x 22 x 15 ft, 20 Vicon Vantage VP-16 cameras, 100Hz pose updates
  • Outdoor Qualisys: 100 x 50 x 50 ft, 34 Qualisys Opus 700 cameras, 100Hz pose updates
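
Because the sensors listed above run at different rates, measurements from one stream usually have to be associated with the closest-in-time measurement from another. The sketch below pairs 20Hz lidar sweeps with 100Hz motion capture poses by nearest timestamp. It is a generic illustration under the assumption of sorted timestamps on a shared clock, not the procedure used to build the dataset's ground truth, and `nearest_indices` is a hypothetical helper name.

```python
import numpy as np

def nearest_indices(query_t, ref_t):
    """For each query timestamp, return the index of the closest
    reference timestamp. `ref_t` must be sorted and have >= 2 entries."""
    query_t = np.asarray(query_t, dtype=float)
    ref_t = np.asarray(ref_t, dtype=float)
    # Candidate on the right of each query, clipped so right - 1 stays valid.
    right = np.clip(np.searchsorted(ref_t, query_t), 1, len(ref_t) - 1)
    left = right - 1
    pick_right = (ref_t[right] - query_t) < (query_t - ref_t[left])
    return np.where(pick_right, right, left)

# One second of synthetic timestamps (seconds) at the rates listed above.
lidar_t = np.arange(20) / 20.0     # 20Hz lidar sweeps
mocap_t = np.arange(100) / 100.0   # 100Hz motion capture poses
idx = nearest_indices(lidar_t, mocap_t)
```

With these synthetic timestamps every lidar sweep lines up with every fifth pose. With real recordings the residual time offset `mocap_t[idx] - lidar_t` is worth checking before the pairing is trusted.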

Ground Truth

For most sequences, accurate pose and depths are provided from a fusion of the sensors onboard.


Reconstructed map with trajectory in green.
Depth map (red) overlaid on the APS image from the DAVIS; lighter is farther.

Citations

Please cite the following papers when using this work in an academic publication:

For the main dataset, please cite:

Zhu, A. Z., Thakur, D., Ozaslan, T., Pfrommer, B., Kumar, V., & Daniilidis, K. (2018). The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception. IEEE Robotics and Automation Letters, 3(3), 2032-2039.

An arXiv preprint is also available:

Zhu, A. Z., Thakur, D., Ozaslan, T., Pfrommer, B., Kumar, V., & Daniilidis, K. (2018). The Multi Vehicle Stereo Event Camera Dataset: An Event Camera Dataset for 3D Perception. arXiv preprint arXiv:1801.10202.

For the ground truth optical flow, please cite:

Zhu, A. Z., Yuan, L., Chaney, K., & Daniilidis, K. (2018). EV-FlowNet: Self-Supervised Optical Flow Estimation for Event-based Cameras. Robotics: Science and Systems 2018.