See below for a detailed description of each measure.
Tracker | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag | Hz | |
MPLT 1. | 54.2 | 48.8 | 0.0 | 82 (30.6) | 56 (20.9) | 2,385 | 4,930 | 70.6 | 83.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.7 | 366 (5.2) | 538 (7.6) | 0.4 | |
F. Fan Yang, S. Nakamura. Using panoramic videos for multi-person localization and tracking in a 3D panoramic coordinate. In ICASSP, 2020.
MOANA 2. | 52.7 | 62.4 | 0.0 | 76 (28.4) | 59 (22.0) | 2,226 | 5,551 | 66.9 | 83.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.5 | 167 (2.5) | 586 (8.8) | 19.4 | |
Z. Tang, J. Hwang. MOANA: An online learned adaptive appearance model for robust multiple object tracking in 3D. In IEEE Access, 2019.
DBN 3. | 51.1 | 0.0 | 0.0 | 77 (28.7) | 48 (17.9) | 2,077 | 5,746 | 65.8 | 84.2 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.3 | 380 (5.8) | 418 (6.4) | 0.1 | |
T. Klinger, F. Rottensteiner, C. Heipke. Probabilistic Multi-Person Tracking using Dynamic Bayes Networks. In ISPRS Workshop on Image Sequence Analysis (ISA), 2015.
GPDBN 4. | 49.8 | 0.0 | 0.0 | 69 (25.7) | 46 (17.2) | 1,813 | 6,300 | 62.5 | 85.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.0 | 311 (5.0) | 386 (6.2) | 0.1 | |
T. Klinger, F. Rottensteiner, C. Heipke. Probabilistic multi-person localisation and tracking in image sequences. In ISPRS Journal of Photogrammetry and Remote Sensing, 2017.
MCFPHD 5. | 39.9 | 0.0 | 0.0 | 69 (25.7) | 45 (16.8) | 3,029 | 6,700 | 60.1 | 76.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.4 | 363 (6.0) | 529 (8.8) | 17.7 | |
N. Wojke, D. Paulus. Global data association for the Probability Hypothesis Density filter using network flows. In IEEE International Conference on Robotics and Automation (ICRA), 2016.
LPSFM 6. | 35.9 | 0.0 | 0.0 | 37 (13.8) | 58 (21.6) | 2,031 | 8,206 | 51.1 | 80.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.3 | 520 (10.2) | 601 (11.8) | inf | |
L. Leal-Taixé, G. Pons-Moll, B. Rosenhahn. Everybody needs somebody: modeling social and grouping behavior on a linear programming multiple people tracker. In IEEE International Conference on Computer Vision Workshops (ICCVW). 1st Workshop on Modeling, Simulation and Visual Analysis of Large Crowds, 2011.
LP3D 7. | 35.9 | 0.0 | 0.0 | 56 (20.9) | 44 (16.4) | 3,588 | 6,593 | 60.7 | 74.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.0 | 580 (9.6) | 659 (10.9) | inf | |
MOT baseline: Linear programming on 3D image coordinates.
SVT 8. | 34.2 | 0.0 | 0.0 | 30 (11.2) | 68 (25.4) | 3,057 | 7,454 | 55.6 | 75.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.5 | 532 (9.6) | 611 (11.0) | 1.9 | |
L. Wen, Z. Lei, M.-C. Chang, H. Qi, S. Lyu. Multi-Camera Multi-Target Tracking with Space-Time-View Hyper-graph. In IJCV, 2016.
AMIR3D 9. | 25.0 | 0.0 | 0.0 | 8 (3.0) | 74 (27.6) | 2,038 | 9,084 | 45.9 | 79.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 2.3 | 1,462 (31.9) | 1,647 (35.9) | 1.2 | |
A. Sadeghian, A. Alahi, S. Savarese. Tracking The Untrackable: Learning To Track Multiple Cues with Long-Term Dependencies. In ICCV, 2017.
KalmanSFM 10. | 25.0 | 0.0 | 0.0 | 18 (6.7) | 39 (14.6) | 3,161 | 7,599 | 54.7 | 74.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 3.6 | 1,838 (33.6) | 1,686 (30.8) | 30.6 | |
S. Pellegrini, A. Ess, K. Schindler, L. Van Gool. You'll never walk alone: Modeling social behavior for multi-target tracking. In ICCV, 2009.
Sequences | Frames | Trajectories | Boxes |
2 | 886 | 268 | 16,789 |
Measure | Better | Perfect | Description |
MOTA | higher | 100% | Multi-Object Tracking Accuracy (+/- denotes standard deviation across all sequences) [1]. This measure combines three error sources: false positives, missed targets and identity switches. |
IDF1 | higher | 100% | ID F1 Score [2]. The ratio of correctly identified detections over the average number of ground-truth and computed detections. |
HOTA | higher | 100% | Higher Order Tracking Accuracy [3]. Geometric mean of detection accuracy and association accuracy. Averaged across localization thresholds. |
MT | higher | 100% | Mostly tracked targets. The ratio of ground-truth trajectories that are covered by a track hypothesis for at least 80% of their respective life span. |
ML | lower | 0% | Mostly lost targets. The ratio of ground-truth trajectories that are covered by a track hypothesis for at most 20% of their respective life span. |
FP | lower | 0 | The total number of false positives. |
FN | lower | 0 | The total number of false negatives (missed targets). |
Rcll | higher | 100% | Ratio of correct detections to total number of GT boxes. |
Prcn | higher | 100% | Ratio of correct detections to total number of computed boxes, i.e. TP / (TP + FP). |
AssA | higher | 100% | Association Accuracy [3]. Association Jaccard index averaged over all matching detections and then averaged over localization thresholds. |
DetA | higher | 100% | Detection Accuracy [3]. Detection Jaccard index averaged over localization thresholds. |
AssRe | higher | 100% | Association Recall [3]. TPA / (TPA + FNA) averaged over all matching detections and then averaged over localization thresholds. |
AssPr | higher | 100% | Association Precision [3]. TPA / (TPA + FPA) averaged over all matching detections and then averaged over localization thresholds. |
DetRe | higher | 100% | Detection Recall [3]. TP / (TP + FN) averaged over localization thresholds. |
DetPr | higher | 100% | Detection Precision [3]. TP / (TP + FP) averaged over localization thresholds. |
LocA | higher | 100% | Localization Accuracy [3]. Average localization similarity averaged over all matching detections and averaged over localization thresholds. |
FAF | lower | 0 | The average number of false alarms per frame. |
ID Sw. | lower | 0 | Number of identity switches (ID switch ratio = #ID switches / recall) [4]. Note that we follow the stricter definition of identity switches described in the reference. |
Frag | lower | 0 | The total number of times a trajectory is fragmented (i.e. interrupted during tracking). |
Hz | higher | Inf. | Processing speed (in frames per second, excluding the detector) on the benchmark. The frequency is provided by the authors and is not officially evaluated by MOTChallenge. |
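
The count-based measures above can be cross-checked against the results table. The following is a minimal sketch, not the official evaluation code: it assumes the standard CLEAR MOT form MOTA = 1 - (FN + FP + ID Sw.) / GT boxes, takes the benchmark totals (886 frames, 268 trajectories, 16,789 boxes) from the statistics table, and assumes that the values in parentheses after ID Sw. and Frag are the raw counts divided by recall in percent, which is an inference from the reported numbers. The names `Counts` and `summarize` are hypothetical.

```python
from dataclasses import dataclass

# Benchmark totals taken from the statistics table.
NUM_FRAMES = 886
NUM_TRAJECTORIES = 268
NUM_GT_BOXES = 16789


@dataclass
class Counts:
    """Raw counts reported for one tracker (hypothetical container)."""
    fp: int              # false positives
    fn: int              # false negatives (missed targets)
    id_switches: int     # identity switches
    fragments: int       # trajectory fragmentations
    mostly_tracked: int  # trajectories covered for at least 80% of their life span
    mostly_lost: int     # trajectories covered for at most 20% of their life span


def summarize(c: Counts) -> dict:
    """Derive the percentage and ratio measures from the raw counts."""
    tp = NUM_GT_BOXES - c.fn                 # correctly detected GT boxes
    recall = 100.0 * tp / NUM_GT_BOXES       # Rcll, in percent
    precision = 100.0 * tp / (tp + c.fp)     # Prcn, in percent
    # Assumed CLEAR MOT form: all three error sources over the GT box count.
    mota = 100.0 * (1.0 - (c.fn + c.fp + c.id_switches) / NUM_GT_BOXES)
    return {
        "MOTA": mota,
        "Rcll": recall,
        "Prcn": precision,
        "FAF": c.fp / NUM_FRAMES,
        "MT%": 100.0 * c.mostly_tracked / NUM_TRAJECTORIES,
        "ML%": 100.0 * c.mostly_lost / NUM_TRAJECTORIES,
        # Assumption: ratios use recall expressed in percent.
        "IDSw ratio": c.id_switches / recall,
        "Frag ratio": c.fragments / recall,
    }


# Counts from the first row of the results table (MPLT).
mplt = Counts(fp=2385, fn=4930, id_switches=366, fragments=538,
              mostly_tracked=82, mostly_lost=56)
result = {name: round(value, 1) for name, value in summarize(mplt).items()}
print(result)
```

Rounded to one decimal place, these values agree with the MPLT row: MOTA 54.2, Rcll 70.6, Prcn 83.3, FAF 2.7, MT 30.6, ML 20.9, and ratios of 5.2 and 7.6 for ID Sw. and Frag. IDF1 and the HOTA family cannot be recovered this way, since they depend on the per-detection matching rather than on the aggregate counts.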
Symbol | Description |
This is an online (causal) method, i.e. the solution is immediately available with each incoming frame and cannot be changed at any later time. | |
This method used the provided detection set as input. | |
This method used a private detection set as input. | |
This entry has been submitted or updated less than a week ago. |
[1] | Evaluating Multiple Object Tracking Performance: The CLEAR MOT Metrics. Image and Video Processing, 2008(1):1-10, 2008. |
[2] | Performance Measures and a Data Set for Multi-Target, Multi-Camera Tracking. In ECCV workshop on Benchmarking Multi-Target Tracking, 2016. |
[3] | HOTA: A Higher Order Metric for Evaluating Multi-Object Tracking. International Journal of Computer Vision, 2020. |
[4] | Learning to associate: HybridBoosted multi-target tracker for crowded scene. In Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2009. |