MOT16

This benchmark contains 14 challenging video sequences (7 training, 7 test) in unconstrained environments filmed with both static and moving cameras. Tracking and evaluation are done in image coordinates. All sequences have been annotated with high accuracy, strictly following a well-defined protocol.
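To illustrate what working in image coordinates means in practice, here is a minimal sketch that reads one annotation line and converts the box to pixel corner coordinates. It assumes the comma-separated layout commonly used by the MOTChallenge files, where the first six fields are frame, track id, left, top, width and height in pixels; the development kit is the authoritative description of the format. The sample line and the names `Box` and `parse_line` are hypothetical.

```python
from typing import NamedTuple, Tuple


class Box(NamedTuple):
    """One bounding box in image (pixel) coordinates."""
    frame: int
    track_id: int
    left: float
    top: float
    width: float
    height: float

    def corners(self) -> Tuple[float, float, float, float]:
        # Convert (left, top, width, height) to (x1, y1, x2, y2).
        return (self.left, self.top,
                self.left + self.width, self.top + self.height)


def parse_line(line: str) -> Box:
    # Assumption: comma-separated fields, the first six being
    # frame, id, left, top, width, height. Later fields are ignored here.
    fields = line.strip().split(",")
    frame, track_id = int(fields[0]), int(fields[1])
    left, top, width, height = (float(v) for v in fields[2:6])
    return Box(frame, track_id, left, top, width, height)


# Made-up sample line, not taken from the dataset.
sample = "1,7,912,484,97,109,1,1,0.9"
box = parse_line(sample)
print(box.corners())
```

No camera calibration or ground-plane projection is involved: every quantity stays in pixels of the original frame.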

Training Set

| Name | FPS | Resolution | Length | Tracks | Boxes | Density | Description | Source | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| MOT16-13 | 25 | 1920x1080 | 750 (00:30) | 107 | 11450 | 15.3 | Filmed from a bus on a busy intersection | link | [1] |
| MOT16-11 | 30 | 1920x1080 | 900 (00:30) | 69 | 9174 | 10.2 | Forward moving camera in a busy shopping mall | link | [1] |
| MOT16-10 | 30 | 1920x1080 | 654 (00:22) | 54 | 12318 | 18.8 | A pedestrian scene filmed at night by a moving camera | link | [2] |
| MOT16-09 | 30 | 1920x1080 | 525 (00:18) | 25 | 5257 | 10.0 | A pedestrian street scene filmed from a low angle | link | [2] |
| MOT16-05 | 14 | 640x480 | 837 (01:00) | 125 | 6818 | 8.1 | Street scene from a moving platform | link | [3] |
| MOT16-04 | 30 | 1920x1080 | 1050 (00:35) | 83 | 47557 | 45.3 | Pedestrian street at night, elevated viewpoint | link | [1] |
| MOT16-02 | 30 | 1920x1080 | 600 (00:20) | 54 | 17833 | 29.7 | People walking around a large square | link | [2] |
| Total | | | 5316 frames (215 s) | 517 | 110407 | 20.8 | | | |
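The Density column of the training table is consistent with the number of boxes divided by the number of frames, rounded to one decimal. The following sketch recomputes it from the frame and box counts listed above; the interpretation of Density as boxes per frame is inferred from the numbers, and the names `TRAIN` and `density` are illustrative.

```python
# (frames, boxes, published density) copied from the training table.
TRAIN = {
    "MOT16-13": (750, 11450, 15.3),
    "MOT16-11": (900, 9174, 10.2),
    "MOT16-10": (654, 12318, 18.8),
    "MOT16-09": (525, 5257, 10.0),
    "MOT16-05": (837, 6818, 8.1),
    "MOT16-04": (1050, 47557, 45.3),
    "MOT16-02": (600, 17833, 29.7),
}


def density(frames: int, boxes: int) -> float:
    # Assumed definition: mean number of annotated boxes per frame.
    return round(boxes / frames, 1)


# Recompute the per-sequence densities and compare with the table.
recomputed = {name: density(f, b) for name, (f, b, _) in TRAIN.items()}
mismatches = [name for name, (_, _, d) in TRAIN.items()
              if recomputed[name] != d]

# Totals over the whole training set.
total_frames = sum(f for f, _, _ in TRAIN.values())
total_boxes = sum(b for _, b, _ in TRAIN.values())
total_density = density(total_frames, total_boxes)

print(mismatches, total_frames, total_boxes, total_density)
```

Note that the total density is computed over all frames, so it is weighted by sequence length rather than being the mean of the per-sequence values.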

Test Set

| Name | FPS | Resolution | Length | Tracks | Boxes | Density | Description | Source | Ref. |
|---|---|---|---|---|---|---|---|---|---|
| MOT16-14 | 25 | 1920x1080 | 750 (00:30) | 164 | 18483 | 24.6 | Filmed from a bus on a busy intersection | link | [1] |
| MOT16-12 | 30 | 1920x1080 | 900 (00:30) | 86 | 8295 | 9.2 | Forward moving camera in a busy shopping mall | link | [1] |
| MOT16-08 | 30 | 1920x1080 | 625 (00:21) | 63 | 16737 | 26.8 | A crowded pedestrian street, stationary camera | link | [2] |
| MOT16-07 | 30 | 1920x1080 | 500 (00:17) | 54 | 16322 | 32.6 | A busy pedestrian street filmed at eye level by a moving camera | link | [2] |
| MOT16-06 | 14 | 640x480 | 1194 (01:25) | 221 | 11538 | 9.7 | Street scene from a moving platform | link | [3] |
| MOT16-03 | 30 | 1920x1080 | 1500 (00:50) | 148 | 104556 | 69.7 | Pedestrian street at night, elevated viewpoint | link | [1] |
| MOT16-01 | 30 | 1920x1080 | 450 (00:15) | 23 | 6395 | 14.2 | People walking around a large square | link | [2] |
| Total | | | 5919 frames (248 s) | 759 | 182326 | 30.8 | | | |
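The durations in the Length column follow from the frame count and the frame rate. As a small sketch, the code below derives the mm:ss values for the test sequences; it assumes the durations are rounded to the nearest second, which matches the listed values, and the names `TEST`, `seconds` and `mmss` are illustrative.

```python
# (frames, fps) copied from the test table.
TEST = {
    "MOT16-14": (750, 25),
    "MOT16-12": (900, 30),
    "MOT16-08": (625, 30),
    "MOT16-07": (500, 30),
    "MOT16-06": (1194, 14),
    "MOT16-03": (1500, 30),
    "MOT16-01": (450, 30),
}


def seconds(frames: int, fps: int) -> int:
    # Duration in whole seconds, rounding half up (assumed convention).
    return int(frames / fps + 0.5)


def mmss(frames: int, fps: int) -> str:
    # Format the duration as mm:ss, as in the Length column.
    s = seconds(frames, fps)
    return f"{s // 60:02d}:{s % 60:02d}"


lengths = {name: mmss(f, fps) for name, (f, fps) in TEST.items()}
total_seconds = sum(seconds(f, fps) for f, fps in TEST.values())

print(lengths)
print(total_seconds)
```

The low frame rate of MOT16-06 (14 FPS) is why it is the longest test sequence in time despite not having the most frames.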


Download

Get all data (1.9 GB)
Get files (no img) only (3.2 MB)
Get raw (no NMS) detections (38.4 MB)
Development Kit

References:


[1] Milan, A., Leal-Taixé, L., Reid, I., Roth, S. & Schindler, K. MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs], 2016.
[2] Leal-Taixé, L., Milan, A., Reid, I., Roth, S. & Schindler, K. MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking. arXiv:1504.01942 [cs], 2015.
[3] Ess, A., Leibe, B. & Van Gool, L. Depth and Appearance for Mobile Scene Analysis. In Proceedings of the Eleventh IEEE International Conference on Computer Vision, 2007.