The Segmenting and Tracking Every Pixel (STEP) benchmark consists of 2 training sequences and 2 test sequences. It builds on the MOTChallenge and Multi-Object Tracking and Segmentation (MOTS) benchmarks and extends their annotations to the STEP task. To this end, we added dense segmentation labels, so that every pixel has a semantic label. In addition, all pixels belonging to the most salient object class, pedestrian, are assigned a tracking ID that is unique to each pedestrian. We evaluate submitted results using the Segmentation and Tracking Quality (STQ) metric. This benchmark is part of the ICCV21 workshop Segmenting and Tracking Every Point and Pixel.
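As a rough illustration of how an STQ score is assembled, the sketch below combines an association quality (AQ) and a segmentation quality (SQ) by their geometric mean, which is how the STEP paper defines STQ. SQ is taken here as the mean IoU over semantic classes computed from a confusion matrix. The function names `mean_iou` and `stq` are hypothetical, AQ is assumed to be computed elsewhere and passed in as a number, and this is not the official evaluation code.

```python
import math


def mean_iou(confusion):
    """Mean IoU over classes from a square confusion matrix.

    confusion[g][p] counts pixels with ground-truth class g predicted as p.
    Classes that appear in neither ground truth nor prediction are skipped
    (an assumption of this sketch).
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(n)) - tp
        union = tp + fp + fn
        if union > 0:
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


def stq(aq, sq):
    """Geometric mean of association quality and segmentation quality."""
    return math.sqrt(aq * sq)


# Toy two-class example; the AQ value is made up for illustration only.
sq = mean_iou([[3, 1], [1, 3]])
score = stq(0.6, sq)
```

Because the two terms are multiplied, a method that segments well but tracks poorly (or the reverse) cannot reach a high score.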
Training sequences

| Name | FPS | Resolution | Length in frames (mm:ss) | Description |
|---|---|---|---|---|
| STEP-ICCV21-09 | 30 | 1920x1080 | 525 (00:18) | A pedestrian street scene filmed from a low angle. |
| STEP-ICCV21-02 | 30 | 1920x1080 | 600 (00:20) | People walking around a large square. |
| Total | | | 1125 | |

Test sequences

| Name | FPS | Resolution | Length in frames (mm:ss) | Description |
|---|---|---|---|---|
| STEP-ICCV21-07 | 30 | 1920x1080 | 500 (00:17) | A busy pedestrian street filmed at eye level by a moving camera. |
| STEP-ICCV21-01 | 30 | 1920x1080 | 450 (00:15) | People walking around a large square. |
| Total | | | 950 | |
STEP: Segmenting and Tracking Every Pixel. arXiv:2102.11859, 2021.