TAO Long-Tail

The "Tracking Any Object in Open-World CVPR 2023 Challenge" consists of two sub-challenges: (i) the long-tail challenge and (ii) the open-world challenge. Both are based on the BURST benchmark [1], which extends the Tracking Any Object (TAO) dataset [2] with pixel-precise segmentation masks for all objects. This challenge is part of the CVPR 2023 workshop "Tracking and Its Many Guises: Tracking Any Object in Open-World".

Submissions for both challenges will be evaluated on three sets of object classes:
1) The "common" set of 78 classes, which roughly corresponds to the 80 standard COCO classes [3].
2) The "uncommon" set of 404 classes, for which there are often very few samples in the dataset.
3) The "all" set of 482 classes, i.e. the union of the two sets above.

Note that these 482 classes are a subset of the much larger class set of the LVIS dataset for image-level instance segmentation [6]. For the long-tail tracking benchmark, models can be trained using annotations for all 482 classes.

Submissions will be evaluated using the HOTA metrics [4]. These are computed separately for each of the three class sets above and are denoted HOTA_com, HOTA_unc and HOTA_all, respectively. For more details about the metrics, please refer to the HOTA paper [4] and Sec. 6 of the BURST benchmark paper [1].
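As a rough illustration of what a HOTA score measures, the sketch below follows the general definition from the HOTA paper [4]: at a single localization threshold, HOTA is the square root of the summed association scores of all true-positive matches, divided by the number of true positives, false negatives and false positives; the final score averages this over a range of thresholds. The function names and the toy counts are hypothetical and chosen only for illustration. This is not the official evaluation code, and it does not reproduce how BURST aggregates scores over the common, uncommon and all class sets (see Sec. 6 of [1] for that).

```python
import math


def assoc_score(tpa: int, fna: int, fpa: int) -> float:
    """Association score A(c) for one true-positive match c.

    tpa, fna, fpa are the true-positive, false-negative and false-positive
    association counts for the track pair that c belongs to.
    """
    return tpa / (tpa + fna + fpa)


def hota_at_threshold(tp_assoc_scores: list[float], num_fn: int, num_fp: int) -> float:
    """HOTA at a single localization threshold.

    tp_assoc_scores holds one A(c) value per true-positive match, so its
    length is the number of true positives.
    """
    denom = len(tp_assoc_scores) + num_fn + num_fp
    if denom == 0:
        return 0.0  # nothing to evaluate (convention chosen for this sketch)
    return math.sqrt(sum(tp_assoc_scores) / denom)


def hota(per_threshold_scores: list[float]) -> float:
    """Final HOTA: the mean of the per-threshold scores."""
    return sum(per_threshold_scores) / len(per_threshold_scores)


# Toy example: two matches with association score 0.5 each,
# plus one missed object and one spurious prediction.
score = hota_at_threshold([0.5, 0.5], num_fn=1, num_fp=1)
print(score)  # sqrt((0.5 + 0.5) / 4) = 0.5
```

Because detection errors enter through the denominator and association errors through the per-match scores, a tracker has to do well on both to obtain a high score.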

Training Set

Sample Name  FPS  Resolution  Length         Tracks  Boxes  Density  Description  Source  Ref.
Total                         0 frm. (0 s.)

Test Set

Sample Name  FPS  Resolution  Length                Tracks  Boxes   Density  Description             Source  Ref.
BURST_test   30   1280x720    52194 (29:00)         7963    167132  3.2      The test set of BURST.  link    [1]
Total                         52194 frm. (1740 s.)  7963    167132  3.2
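The test set figures can be cross-checked with a little arithmetic. The sketch below assumes that "Length" is the frame count divided by the frame rate and that "Density" is the average number of boxes per frame; both readings are assumptions, but they reproduce the listed values. The helper name `summarize` is hypothetical.

```python
def summarize(frames: int, fps: int, boxes: int) -> tuple[int, str, float]:
    """Return (duration in whole seconds, duration as m:ss, boxes per frame)."""
    total_seconds = round(frames / fps)
    minutes, seconds = divmod(total_seconds, 60)
    density = round(boxes / frames, 1)
    return total_seconds, f"{minutes}:{seconds:02d}", density


# Values taken from the BURST_test row.
print(summarize(frames=52194, fps=30, boxes=167132))
# (1740, '29:00', 3.2)
```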




[1] Athar, A., Luiten, J., Voigtlaender, P., Khurana, T., Dave, A., Leibe, B. & Ramanan, D. BURST: A Benchmark for Unifying Object Recognition, Segmentation and Tracking in Video. In WACV, 2023.
[2] Dave, A., Khurana, T., Tokmakov, P., Schmid, C. & Ramanan, D. TAO: A Large-Scale Benchmark for Tracking Any Object. In European Conference on Computer Vision, 2020.
[3] Lin, T.Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P. & Zitnick, C.L. Microsoft COCO: Common Objects in Context. In Computer Vision -- ECCV 2014, 2014.
[4] Luiten, J., Osep, A., Dendorfer, P., Torr, P., Geiger, A., Leal-Taixé, L. & Leibe, B. HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking. International Journal of Computer Vision, 2020.
[5] Liu, Y., Zulfikar, I.E., Luiten, J., Dave, A., Ramanan, D., Leibe, B., Osep, A. & Leal-Taixe, L. Opening up Open-World Tracking. In Conference on Computer Vision and Pattern Recognition (CVPR), 2022.
[6] Gupta, A., Dollár, P. & Girshick, R. LVIS: A Dataset for Large Vocabulary Instance Segmentation. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2019.