Qualitative Tracking Result Assessment


Our team is constantly working on MOTChallenge to provide the best data and evaluation tools for your research. Here, we kindly ask you to perform a qualitative assessment of the tracker pairs shown.

If you have a MOTChallenge user account, please log in before you continue. Logging in is optional.

An example of the task is shown below. Two trackers are shown, each of which is supposed to track all of the pedestrians in the video. Your task is to select the better of the two. In this example, both videos show the ground truth (perfect tracking) for the training set, which illustrates the type of tracking behavior that is desired.


Press Play to start both videos simultaneously. After clicking on the cursor on the timeline, you can step back and forth with the (←) and (→) keys.

Tracks are shown as colored bounding boxes with a history tail. The color of the box and the location of the tail show which boxes were associated across frames. The tail locations are the 2D pixel coordinates of the bottom of the boxes in previous frames, so they don't always follow the 3D location of a person, and they are not meant to.
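
To make the tail construction concrete, here is a minimal Python sketch of how such tail points could be derived from a track's box history. The box format (x, y, width, height with a top-left origin), the use of the bottom-center point, and the names bottom_center, history_tail and max_len are assumptions made for illustration, not the code used to render the videos.

```python
from typing import List, Tuple

# Assumed box format: (x, y, width, height) in pixels, (x, y) = top-left corner.
Box = Tuple[float, float, float, float]
Point = Tuple[float, float]


def bottom_center(box: Box) -> Point:
    """Return the 2D pixel coordinate at the middle of the box's bottom edge."""
    x, y, w, h = box
    return (x + w / 2.0, y + h)


def history_tail(track: List[Box], max_len: int = 30) -> List[Point]:
    """Tail points for up to `max_len` boxes preceding the current (last) box.

    `max_len` is a hypothetical display setting, not a value from the page.
    """
    previous = track[:-1][-max_len:]
    return [bottom_center(b) for b in previous]


# One track over three frames; the last box is the current frame.
track = [(0, 0, 10, 20), (2, 0, 10, 20), (4, 0, 10, 20)]
tail = history_tail(track)
```

Because the points live in image space, a change in box height alone (for example when a person is partly occluded) moves the tail even if the person has not moved on the ground.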

Both detection (boxes covering people) and association (correctly linking boxes over time) are important for tracking. Trackers should be penalized for producing extra boxes that shouldn't be there, as well as for missing boxes for people in the video. Both kinds of detection error are usually easy to see in the images.

Visually judging how good the association over time is can be much more difficult. We have therefore tried to make the tails of the boxes as obvious and useful as possible, and we urge users to focus on judging whether these tracking associations are correct. It is important that a track does not switch between different people (shown as a history line going from one person to another). It is also important that a tracker covers the whole of each person's trajectory with a single track: several short tracks over the same person (shown as short tails) are an error.
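
The two association errors described above can be illustrated with a small Python sketch. It assumes we already know, for one person, which track ID covers them in each frame (None when no box covers them); the function name association_errors and this input format are hypothetical, and the counts are a simplified illustration rather than the official MOTChallenge metrics.

```python
from typing import Dict, List, Optional


def association_errors(ids_per_frame: List[Optional[int]]) -> Dict[str, int]:
    """Summarize how consistently one person is covered over time.

    ids_per_frame[t] is the track ID covering the person in frame t,
    or None if the person has no box in that frame.
    """
    # Ignore frames where the person is not covered at all.
    covered = [i for i in ids_per_frame if i is not None]
    # Count every point in time where the covering track ID changes.
    changes = sum(1 for a, b in zip(covered, covered[1:]) if a != b)
    return {"distinct_tracks": len(set(covered)), "id_changes": changes}


# A person covered by track 1, missed for one frame, then track 2, then track 1 again.
result = association_errors([1, 1, None, 2, 2, 1])
```

A perfectly tracked person would give one distinct track and zero changes. The same function can be applied the other way round, with the person covered by a given track in each frame as input, to check whether that track switches between people.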

In general, it is easier for people to spot mistakes when extra things are present (e.g. extra boxes, or associations that shouldn't occur) than when things are missing (e.g. people without boxes, or associations that should have been made but weren't). Please keep this potential bias in mind when making your judgments.

There is no required number of videos to watch. The more you judge, the more you help us with our research. Before you start, please provide us with the following information. Note that we appreciate help from people both inside and outside the MOT community: