This benchmark contains 14 challenging video sequences (7 training, 7 test) in unconstrained environments filmed with both static and moving cameras. Tracking and evaluation are done in image coordinates. All sequences have been annotated with high accuracy, strictly following a well-defined protocol.
Training sequences
Name | FPS | Resolution | Length, frames (mm:ss) | Tracks | Boxes | Density (boxes/frame) | Description | Ref.
MOT16-13 | 25 | 1920x1080 | 750 (00:30) | 107 | 11450 | 15.3 | Filmed from a bus at a busy intersection | [1]
MOT16-11 | 30 | 1920x1080 | 900 (00:30) | 69 | 9174 | 10.2 | Forward-moving camera in a busy shopping mall | [1]
MOT16-10 | 30 | 1920x1080 | 654 (00:22) | 54 | 12318 | 18.8 | A pedestrian scene filmed at night by a moving camera | [2]
MOT16-09 | 30 | 1920x1080 | 525 (00:18) | 25 | 5257 | 10.0 | A pedestrian street scene filmed from a low angle | [2]
MOT16-05 | 14 | 640x480 | 837 (01:00) | 125 | 6818 | 8.1 | Street scene from a moving platform | [3]
MOT16-04 | 30 | 1920x1080 | 1050 (00:35) | 83 | 47557 | 45.3 | Pedestrian street at night, elevated viewpoint | [1]
MOT16-02 | 30 | 1920x1080 | 600 (00:20) | 54 | 17833 | 29.7 | People walking around a large square | [2]
Total | | | 5316 frames (215 s) | 517 | 110407 | 20.8 | |
Test sequences
Name | FPS | Resolution | Length, frames (mm:ss) | Tracks | Boxes | Density (boxes/frame) | Description | Ref.
MOT16-14 | 25 | 1920x1080 | 750 (00:30) | 164 | 18483 | 24.6 | Filmed from a bus at a busy intersection | [1]
MOT16-12 | 30 | 1920x1080 | 900 (00:30) | 86 | 8295 | 9.2 | Forward-moving camera in a busy shopping mall | [1]
MOT16-08 | 30 | 1920x1080 | 625 (00:21) | 63 | 16737 | 26.8 | A crowded pedestrian street, stationary camera | [2]
MOT16-07 | 30 | 1920x1080 | 500 (00:17) | 54 | 16322 | 32.6 | A busy pedestrian street filmed at eye level by a moving camera | [2]
MOT16-06 | 14 | 640x480 | 1194 (01:25) | 221 | 11538 | 9.7 | Street scene from a moving platform | [3]
MOT16-03 | 30 | 1920x1080 | 1500 (00:50) | 148 | 104556 | 69.7 | Pedestrian street at night, elevated viewpoint | [1]
MOT16-01 | 30 | 1920x1080 | 450 (00:15) | 23 | 6395 | 14.2 | People walking around a large square | [2]
Total | | | 5919 frames (248 s) | 759 | 182326 | 30.8 | |
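The Length and Density columns can be derived from the other figures in the tables: density matches the number of boxes divided by the number of frames, and the duration matches frames divided by FPS. The sketch below recomputes both for the seven sequences of the first table. The `Sequence` class and `format_mmss` helper are hypothetical names introduced for illustration, and rounding to the nearest second is an assumption that happens to reproduce the listed durations, not a documented convention of the benchmark.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sequence:
    """One table row (hypothetical container, not part of any benchmark toolkit)."""
    name: str
    fps: int
    frames: int
    boxes: int

    @property
    def density(self) -> float:
        # Mean number of annotated boxes per frame.
        return self.boxes / self.frames

    @property
    def duration_s(self) -> float:
        # Sequence duration in seconds.
        return self.frames / self.fps


def format_mmss(seconds: float) -> str:
    # Assumption: durations are rounded to the nearest whole second.
    total = int(seconds + 0.5)
    minutes, secs = divmod(total, 60)
    return f"{minutes:02d}:{secs:02d}"


# Name, FPS, frame count, and box count copied from the first table.
TRAIN = [
    Sequence("MOT16-13", 25, 750, 11450),
    Sequence("MOT16-11", 30, 900, 9174),
    Sequence("MOT16-10", 30, 654, 12318),
    Sequence("MOT16-09", 30, 525, 5257),
    Sequence("MOT16-05", 14, 837, 6818),
    Sequence("MOT16-04", 30, 1050, 47557),
    Sequence("MOT16-02", 30, 600, 17833),
]

total_frames = sum(s.frames for s in TRAIN)
total_boxes = sum(s.boxes for s in TRAIN)

for s in TRAIN:
    print(f"{s.name}: {format_mmss(s.duration_s)}, {s.density:.1f} boxes/frame")
print(f"Total: {total_frames} frames, {total_boxes / total_frames:.1f} boxes/frame")
```

Note that the total density is the ratio of the summed boxes to the summed frames, not the mean of the per-sequence densities, so long and crowded sequences weigh more heavily.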
[1] MOT16: A Benchmark for Multi-Object Tracking. arXiv:1603.00831 [cs], 2016.
[2] MOTChallenge 2015: Towards a Benchmark for Multi-Target Tracking. arXiv:1504.01942 [cs], 2015.
[3] Depth and Appearance for Mobile Scene Analysis. In Proceedings of the Eleventh IEEE International Conference on Computer Vision, 2007.