MOT16

This benchmark contains 14 challenging video sequences (7 training, 7 test) in unconstrained environments filmed with both static and moving cameras. Tracking and evaluation are done in image coordinates. All sequences have been annotated with high accuracy, strictly following a well-defined protocol.

Training Set

| Name | FPS | Resolution | Length (frames, mm:ss) | Tracks | Boxes | Density | Description | Source | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| MOT16-05 | 14 | 640x480 | 837 (01:00) | 125 | 6818 | 8.1 | Street scene from a moving platform | link | [1] |
| MOT16-02 | 30 | 1920x1080 | 600 (00:20) | 54 | 17833 | 29.7 | People walking around a large square. | link | [2] |
| MOT16-04 | 30 | 1920x1080 | 1050 (00:35) | 83 | 47557 | 45.3 | Pedestrian street at night, elevated viewpoint | link | [3] |
| MOT16-09 | 30 | 1920x1080 | 525 (00:18) | 25 | 5257 | 10.0 | A pedestrian street scene filmed from a low angle. | link | [2] |
| MOT16-10 | 30 | 1920x1080 | 654 (00:22) | 54 | 12318 | 18.8 | A pedestrian scene filmed at night by a moving camera | link | [2] |
| MOT16-11 | 30 | 1920x1080 | 900 (00:30) | 69 | 9174 | 10.2 | Forward moving camera in a busy shopping mall | link | [3] |
| MOT16-13 | 25 | 1920x1080 | 750 (00:30) | 107 | 11450 | 15.3 | Filmed from a bus on a busy intersection | link | [3] |
| Total | | | 5316 frames (215 s) | 517 | 110407 | 20.8 | | | |
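
The mm:ss values in the Length column appear to follow from the frame count and the frame rate. The sketch below recomputes them for the training sequences, assuming each duration is frames divided by FPS, rounded to the nearest whole second (half up). That rounding rule and the helper names `duration_seconds` and `mmss` are assumptions made for illustration, not something the benchmark page states.

```python
# (name, fps, frames) copied from the training table.
TRAIN = [
    ("MOT16-05", 14, 837),
    ("MOT16-02", 30, 600),
    ("MOT16-04", 30, 1050),
    ("MOT16-09", 30, 525),
    ("MOT16-10", 30, 654),
    ("MOT16-11", 30, 900),
    ("MOT16-13", 25, 750),
]


def duration_seconds(frames: int, fps: int) -> int:
    """Clip length in whole seconds, rounding half up (assumed rule)."""
    return int(frames / fps + 0.5)


def mmss(seconds: int) -> str:
    """Format a number of seconds as mm:ss."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


total_seconds = 0
for name, fps, frames in TRAIN:
    secs = duration_seconds(frames, fps)
    total_seconds += secs
    print(f"{name}: {frames} frames at {fps} FPS is {mmss(secs)}")

print(f"Total: {total_seconds} s")
```

Under this assumption the per-sequence values and the 215 s total match the table, which suggests the total is the sum of the rounded per-sequence durations.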

Test Set

| Name | FPS | Resolution | Length (frames, mm:ss) | Tracks | Boxes | Density | Description | Source | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| MOT16-06 | 14 | 640x480 | 1194 (01:25) | 221 | 11538 | 9.7 | Street scene from a moving platform | link | [1] |
| MOT16-01 | 30 | 1920x1080 | 450 (00:15) | 23 | 6395 | 14.2 | People walking around a large square. | link | [2] |
| MOT16-03 | 30 | 1920x1080 | 1500 (00:50) | 148 | 104556 | 69.7 | Pedestrian street at night, elevated viewpoint | link | [3] |
| MOT16-07 | 30 | 1920x1080 | 500 (00:17) | 54 | 16322 | 32.6 | A busy pedestrian street filmed at eye level by a moving camera | link | [2] |
| MOT16-08 | 30 | 1920x1080 | 625 (00:21) | 63 | 16737 | 26.8 | A crowded pedestrian street, stationary camera | link | [2] |
| MOT16-12 | 30 | 1920x1080 | 900 (00:30) | 86 | 8295 | 9.2 | Forward moving camera in a busy shopping mall | link | [3] |
| MOT16-14 | 25 | 1920x1080 | 750 (00:30) | 164 | 18483 | 24.6 | Filmed from a bus on a busy intersection | link | [3] |
| Total | | | 5919 frames (248 s) | 759 | 182326 | 30.8 | | | |
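
The Density column is consistent with the average number of annotated boxes per frame, that is, Boxes divided by Length in frames. The following sketch checks this against the two Total rows. The definition is an assumption inferred from the numbers, and the `density` helper is hypothetical.

```python
# Totals copied from the training and test tables.
# Assumption: Density = Boxes / Length (frames), i.e. mean boxes per frame.
TOTALS = {
    "train": {"frames": 5316, "boxes": 110407, "listed_density": 20.8},
    "test": {"frames": 5919, "boxes": 182326, "listed_density": 30.8},
}


def density(boxes: int, frames: int) -> float:
    """Average number of annotated boxes per frame, rounded to one decimal."""
    return round(boxes / frames, 1)


for split, totals in TOTALS.items():
    computed = density(totals["boxes"], totals["frames"])
    print(f"{split}: computed {computed}, listed {totals['listed_density']}")
```

The same function can be applied to a single row, for example `density(6818, 837)` for MOT16-05.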


Download

Get all data (1.9 GB)
Get detections and labels only (3.2 MB)
Get raw (no NMS) detections (38.4 MB)
Get development kit (0.5 MB)

References:

[1] Ess, A., Leibe, B. & Van Gool, L. Depth and Appearance for Mobile Scene Analysis. In Proceedings of the Eleventh IEEE International Conference on Computer Vision, 2007.
[2] Leal-Taixé, L., Milan, A., Reid, I., Roth, S. & Schindler, K. MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking. arXiv:1504.01942 [cs], 2015.
[3] Milan, A., Leal-Taixé, L., Reid, I., Roth, S. & Schindler, K. MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs], 2016.