In recent years, the computer vision community has adopted several centralized benchmarks for numerous tasks, including object detection, pedestrian detection, 3D reconstruction, optical flow, single-object short-term tracking, and stereo estimation. Despite the potential pitfalls of such benchmarks, they have proved extremely helpful for advancing the state of the art in their respective research fields. Interestingly, there has been rather limited work on standardizing the evaluation of multiple-target tracking. One of the few exceptions is the well-known PETS dataset, targeted primarily at surveillance applications. Even for this widely used benchmark, published tracking results to date typically rely on different subsets of the available data, inconsistent model training, and varying evaluation scripts, which makes them difficult to compare.
In this workshop we continue our effort towards a unified framework for a more meaningful quantification of multi-target tracking. Building on the first edition, we are determined to provide a stable infrastructure that eliminates the current limitations. The benchmark's key strength is twofold. On the one hand, a dynamic framework that leverages the power of crowd-sourcing by accepting new datasets, annotations, and even new evaluation metrics ensures an up-to-date benchmark that constantly stays at the edge of technological advances. On the other hand, all participants use exactly the same detection sets, annotations, and evaluation procedures, guaranteeing a fair comparison.
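To make "the same evaluation procedures" concrete, a minimal sketch of one widely used multi-target tracking score, the Multiple Object Tracking Accuracy (MOTA) from the CLEAR MOT metrics, is shown below. The function name and error counts here are illustrative; real evaluation kits compute these counts per frame via ground-truth-to-hypothesis matching before aggregating.

```python
def mota(false_negatives: int, false_positives: int,
         id_switches: int, num_gt_objects: int) -> float:
    """CLEAR MOT accuracy: MOTA = 1 - (FN + FP + IDSW) / total GT objects.

    Counts are assumed to be pre-aggregated over all frames of a sequence;
    this sketch omits the per-frame matching step performed by real kits.
    """
    if num_gt_objects == 0:
        raise ValueError("no ground-truth objects in the sequence")
    errors = false_negatives + false_positives + id_switches
    return 1.0 - errors / num_gt_objects

# Hypothetical sequence: 120 misses, 80 false alarms, 5 identity
# switches over 1000 ground-truth boxes.
print(mota(120, 80, 5, 1000))  # 0.795
```

Because every error type is normalized by the same ground-truth count, the score only allows a fair comparison when all trackers are evaluated on identical annotations, which is precisely what the shared infrastructure enforces.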
During the workshop, we expect to collect participants' experience with both the existing and the new 2016 datasets and with the available infrastructure, and to discuss strategies for further improvement. We believe that our efforts to produce realistic data, together with the continuing workshop series, will push the community towards a more unified and meaningful quantification of multi-target tracking.