3D MOT 2015

Sequences with available camera calibration, which enables tracking in world coordinates. Evaluation is performed on the ground plane with a distance threshold of 1 m.
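As a rough illustration of what a 1 m ground-plane threshold means in practice, the sketch below pairs ground-truth and predicted positions in a single frame and accepts a pair only if the Euclidean distance between them is at most 1 m. The function name, the input layout (track ID mapped to an (x, y) position in metres), and the greedy nearest-first pairing are all assumptions made for this example; the benchmark's development kit may assign matches differently.

```python
import math

THRESHOLD_M = 1.0  # distance threshold on the ground plane, as stated above


def match_on_ground_plane(gt, pred, threshold=THRESHOLD_M):
    """Greedily pair ground-truth and predicted positions for one frame.

    gt and pred map a (hypothetical) track ID to an (x, y) world position
    in metres. Returns a list of (gt_id, pred_id, distance) tuples.
    """
    # Collect every candidate pair that lies within the threshold.
    candidates = []
    for g_id, (gx, gy) in gt.items():
        for p_id, (px, py) in pred.items():
            d = math.hypot(gx - px, gy - py)
            if d <= threshold:
                candidates.append((d, g_id, p_id))

    # Accept the closest pairs first; each ID is used at most once.
    candidates.sort(key=lambda c: c[0])
    matches, used_gt, used_pred = [], set(), set()
    for d, g_id, p_id in candidates:
        if g_id in used_gt or p_id in used_pred:
            continue
        matches.append((g_id, p_id, d))
        used_gt.add(g_id)
        used_pred.add(p_id)
    return matches


# Illustrative positions only, not taken from the dataset.
gt = {1: (0.0, 0.0), 2: (5.0, 5.0)}
pred = {"a": (0.3, 0.4), "b": (5.0, 6.5)}
matches = match_on_ground_plane(gt, pred)
print(matches)
```

Here prediction "a" lies 0.5 m from ground-truth track 1 and is accepted, while "b" lies 1.5 m from track 2 and is rejected. An optimal assignment (for example the Hungarian algorithm) could replace the greedy step without changing the threshold logic.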

Training Set

Sample | Name | FPS | Resolution | Length | Tracks | Boxes | Density | Description | Source | Ref.
1 | PETS09-S2L1 | 7 | 768x576 | 795 (01:54) | 19 | 4476 | 5.6 | A widely used sequence showing up to 8 walking pedestrians, partly in unusual patterns. | link | [1]
2 | TUD-Stadtmitte | 25 | 640x480 | 179 (00:07) | 10 | 1156 | 6.5 | A static camera at about 2 meters height shows walking people on the street. | link | [2]
Total | | | | 974 frames (121 s) | 29 | 5632 | 5.8 | | |
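The Density column appears to be the average number of boxes per frame, that is, Boxes divided by Length in frames. The source does not state this definition; it is inferred from the numbers, and the short check below reproduces the training-set values under that assumption. The variable and function names are chosen for illustration only.

```python
# (frames, boxes) per sequence, copied from the training-set table.
training = {
    "PETS09-S2L1": (795, 4476),
    "TUD-Stadtmitte": (179, 1156),
}


def density(frames, boxes):
    """Average number of annotated boxes per frame (assumed definition)."""
    return boxes / frames


# Per-sequence density, rounded to one decimal as in the table.
per_sequence = {name: round(density(f, b), 1) for name, (f, b) in training.items()}

# The Total row pools frames and boxes before dividing.
total_frames = sum(f for f, _ in training.values())
total_boxes = sum(b for _, b in training.values())
overall = round(density(total_frames, total_boxes), 1)

print(per_sequence, total_frames, total_boxes, overall)
```

Note that the overall density is computed from the pooled totals, not as the mean of the per-sequence densities.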

Test Set

Sample | Name | FPS | Resolution | Length | Tracks | Boxes | Density | Description | Source | Ref.
1 | AVG-TownCentre | 2.5 | 1920x1080 | 450 (03:45) | 226 | 7148 | 15.9 | A pedestrian street filmed from an elevated point. | link | [3]
2 | PETS09-S2L2 | 7 | 768x576 | 436 (01:02) | 42 | 9641 | 22.1 | A crowded scene shown from an elevated viewpoint. | link | [1]
Total | | | | 886 frames (287 s) | 268 | 16789 | 18.9 | | |


Get all data (406.9 MB)
Get detections and labels only (2.5 MB)
Get development kit (0.5 MB)


[1] Ferryman, J. & Shahrokni, A. PETS2009: Dataset and challenge. In 11th IEEE International Workshop on Performance Evaluation of Tracking and Surveillance (PETS), 2009.
[2] Andriluka, M., Roth, S. & Schiele, B. Monocular 3D Pose Estimation and Tracking by Detection. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2010.
[3] Benfold, B. & Reid, I. Guiding Visual Surveillance by Tracking Human Attention. In Proceedings of the British Machine Vision Conference, 2009.