TAO Open-World

The "Tracking Any Object in Open-World CVPR 2023 Challenge" consists of two sub-challenges: (i) the long-tail challenge and (ii) the open-world challenge. Both are based on the BURST benchmark [1], an extension of the Tracking Any Object (TAO) dataset [2] that adds pixel-precise segmentation masks for all objects.

Submissions for both challenges are evaluated on three sets of object classes:
1) The "common" set: 78 classes that roughly correspond to the 80 standard COCO classes [3].
2) The "uncommon" set: 404 classes, many of which have very few samples in the dataset.
3) The "all" set: the union of the two sets above, i.e. 482 classes.
Note that these 482 classes are a subset of the much larger class set of the LVIS dataset for image-level instance segmentation [6].

For the open-world tracking benchmark, models may be trained using labels for the 78 common classes only. Training on COCO and/or ImageNet is permitted, as is any kind of self-supervised pre-training on other datasets. For this challenge, code must be submitted for internal review to ensure that models were not trained using labels beyond the permitted datasets.

Submissions are evaluated using OWTA [5], a recall-based variant of HOTA that does not penalize false positives. OWTA is computed for each of the three class sets above and is denoted OWTA_com, OWTA_unc and OWTA_all, respectively. For more details about the metrics, please refer to the HOTA paper [4] and Sec. 6 of the BURST benchmark paper [1].

This challenge is part of the CVPR 2023 workshop "Tracking and Its Many Guises: Tracking Any Object in Open-World".
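To make the difference between OWTA and HOTA concrete, here is a minimal sketch at a single localization threshold. It assumes the formulation from [4] and [5] as we understand it: HOTA combines detection accuracy TP / (TP + FN + FP) with association accuracy, while OWTA replaces detection accuracy with detection recall TP / (TP + FN). The function names and the example counts are hypothetical, and the association accuracy is passed in as a precomputed number rather than derived from tracks. This is not the official evaluation code.

```python
import math


def hota_at_threshold(tp: int, fn: int, fp: int, ass_a: float) -> float:
    """HOTA at one localization threshold: sqrt(DetA * AssA)."""
    denom = tp + fn + fp
    det_a = tp / denom if denom else 0.0  # detection accuracy counts FPs
    return math.sqrt(det_a * ass_a)


def owta_at_threshold(tp: int, fn: int, ass_a: float) -> float:
    """OWTA at one localization threshold: sqrt(DetRe * AssA)."""
    denom = tp + fn
    det_re = tp / denom if denom else 0.0  # detection recall ignores FPs
    return math.sqrt(det_re * ass_a)


# Hypothetical counts: same matches and misses, with and without
# 100 extra predictions that match no annotated object.
few_fp = hota_at_threshold(tp=80, fn=20, fp=0, ass_a=0.5)
many_fp = hota_at_threshold(tp=80, fn=20, fp=100, ass_a=0.5)
recall_based = owta_at_threshold(tp=80, fn=20, ass_a=0.5)

print(f"HOTA, no FPs:  {few_fp:.3f}")
print(f"HOTA, 100 FPs: {many_fp:.3f}")
print(f"OWTA:          {recall_based:.3f}")
```

In this sketch the extra false positives lower the HOTA value but leave OWTA unchanged, which matters in an open-world setting where unannotated objects may be detected correctly. The reported score is typically an average over several localization thresholds; see [4] and [5] for the exact protocol.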

Training Set

Sample Name  FPS  Resolution  Length                Tracks  Boxes   Density  Description             Source  Ref.
Total                         0 frm. (0 s.)

Test Set

Sample Name  FPS  Resolution  Length                Tracks  Boxes   Density  Description             Source  Ref.
BURST_test   30   1280x720    52194 (29:00)         7963    167132  3.2      The test set of BURST.  link    [1]
Total                         52194 frm. (1740 s.)  7963    167132  3.2
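The figures in the test set table are mutually consistent, which the short sketch below checks. It assumes that "Density" means the average number of boxes per frame and that the length in seconds is the frame count divided by the FPS; the page does not state either definition, and the variable names are ours.

```python
# Values copied from the BURST_test row of the table.
frames = 52194
fps = 30
boxes = 167132

# Assumed definitions: duration = frames / FPS, density = boxes / frames.
duration_s = frames / fps
minutes, seconds = divmod(round(duration_s), 60)
density = boxes / frames

print(f"duration: {round(duration_s)} s ({minutes}:{seconds:02d})")
print(f"density: {density:.1f} boxes per frame")
```

Under these assumptions the script reproduces the 1740 s (29:00) length and the density of 3.2 listed in the table.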




[1] Athar, A., Luiten, J., Voigtlaender, P., Khurana, T., Dave, A., Leibe, B. & Ramanan, D. BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video. In WACV, 2023.
[2] Dave, A., Khurana, T., Tokmakov, P., Schmid, C. & Ramanan, D. TAO: A Large-Scale Benchmark for Tracking Any Object. In European Conference on Computer Vision, 2020.
[3] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. & Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision -- ECCV 2014, 2014.
[4] Luiten, J., Osep, A., Dendorfer, P., Torr, P., Geiger, A., Leal-Taixe, L. & Leibe, B. HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking. International Journal of Computer Vision, 2020.
[5] Liu, Y., Zulfikar, I.E., Luiten, J., Dave, A., Ramanan, D., Leibe, B., Osep, A. & Leal-Taixe, L. Opening up Open-World Tracking. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[6] Gupta, A., Dollar, P. & Girshick, R. LVIS: A Dataset for Large Vocabulary Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.