CroHD provides tracking annotations of pedestrian heads in densely populated video sequences. It consists of 2,276,838 human heads in 11,463 frames across 9 sequences of Full-HD resolution. We built CroHD upon 5 sequences from the publicly available MOTChallenge CVPR19 benchmark so that trackers can be compared in the same scene under two paradigms: head tracking and pedestrian tracking. We further annotated 4 new sequences with higher crowd densities in two new scenarios: Shibuya train station and Shibuya Crossing, one of the busiest pedestrian crossings in the world. All sequences in CroHD have a frame rate of 25 fps and are captured from an elevated viewpoint. The sequences cover crowded indoor and outdoor scenes, recorded under different lighting and environmental conditions.
Name | FPS | Resolution | Length (frames, mm:ss) | Tracks | Boxes | Density | Description | Source | Ref. |
HT21-04 | 25 | 1920x1080 | 997 (00:40) | 580 | 175479 | 176.0 | Crowded outdoor train station. | link | [1] |
HT21-03 | 25 | 1920x1080 | 1000 (00:40) | 811 | 257939 | 257.9 | Crowded pedestrian crossing. | link | [1] |
HT21-02 | 25 | 1920x1080 | 3315 (02:13) | 1276 | 733622 | 221.3 | People leaving entrance of stadium by night time, elevated viewpoint. | link | [1] |
HT21-01 | 25 | 1920x1080 | 429 (00:17) | 85 | 21456 | 50.0 | Crowded indoor train station. | link | [1] |
Total | | | 5741 frames (230 s) | 2752 | 1188496 | 207.0 | | | |
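The Density and Length values in the table above are consistent with simple ratios: density matches the number of boxes divided by the number of frames, and each duration matches frames divided by 25 fps, rounded to the nearest second. The following is a minimal Python sketch under that assumption. The function names are illustrative and are not part of any CroHD tooling.

```python
# Sketch: reproduce the Density and Length columns from frame and box counts.
# Assumption: density = boxes / frames, duration = frames / fps (rounded).

FPS = 25  # frame rate stated for all CroHD sequences

# (name, frames, boxes) copied from the table above
SEQUENCES = [
    ("HT21-04", 997, 175479),
    ("HT21-03", 1000, 257939),
    ("HT21-02", 3315, 733622),
    ("HT21-01", 429, 21456),
]


def density(boxes: int, frames: int) -> float:
    """Average number of annotated head boxes per frame, one decimal."""
    return round(boxes / frames, 1)


def duration_seconds(frames: int, fps: int = FPS) -> int:
    """Sequence duration in whole seconds, rounded to the nearest second."""
    return round(frames / fps)


def format_mmss(seconds: int) -> str:
    """Format a duration in seconds as mm:ss."""
    return f"{seconds // 60:02d}:{seconds % 60:02d}"


def summarize(sequences):
    """Return per-sequence rows and the totals row."""
    rows = [
        (name, frames, format_mmss(duration_seconds(frames)), density(boxes, frames))
        for name, frames, boxes in sequences
    ]
    total_frames = sum(frames for _, frames, _ in sequences)
    total_boxes = sum(boxes for _, _, boxes in sequences)
    # The total duration is the sum of the rounded per-sequence durations.
    total_seconds = sum(duration_seconds(frames) for _, frames, _ in sequences)
    totals = (total_frames, total_seconds, total_boxes, density(total_boxes, total_frames))
    return rows, totals


if __name__ == "__main__":
    rows, totals = summarize(SEQUENCES)
    for row in rows:
        print(row)
    print("Total", totals)
```

Note that the total duration is taken as the sum of the rounded per-sequence durations, which is what matches the Total row.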
Name | FPS | Resolution | Length (frames, mm:ss) | Tracks | Boxes | Density | Description | Source | Ref. |
HT21-15 | 25 | 1920x734 | 1008 (00:40) | 321 | 149821 | 148.6 | A pedestrian street scene. | link | [1] |
HT21-14 | 25 | 1920x1080 | 1050 (00:42) | 1040 | 258227 | 245.9 | Crowded outdoor train station. | link | [1] |
HT21-13 | 25 | 1920x1080 | 1000 (00:40) | 734 | 259603 | 259.6 | Crowded pedestrian crossing. | link | [1] |
HT21-12 | 25 | 1920x1080 | 2080 (01:23) | 737 | 380647 | 183.0 | People leaving entrance of stadium by night time, elevated viewpoint. | link | [1] |
HT21-11 | 25 | 1920x1080 | 585 (00:23) | 133 | 38492 | 65.8 | Crowded indoor train station. | link | [1] |
Total | | | 5723 frames (228 s) | 2965 | 1086790 | 189.9 | | | |
[1] Tracking Pedestrian Heads in Dense Crowd. In Conference on Computer Vision and Pattern Recognition (CVPR), 2021.