Short name:
MPNTrack
Detector:
Public
Description:
Graphs offer a natural way to formulate Multiple Object Tracking (MOT) within the tracking-by-detection paradigm. However, they also introduce a major challenge for learning methods, as defining a model that can operate on such a structured domain is not trivial. As a consequence, most learning-based work has been devoted to learning better features for MOT, and then using these with well-established optimization frameworks. In this work, we exploit the classical network flow formulation of MOT to define a fully differentiable framework based on Message Passing Networks (MPNs). By operating directly on the graph domain, our method can reason globally over an entire set of detections and predict final solutions. Hence, we show that learning in MOT does not need to be restricted to feature extraction, but it can also be applied to the data association step. We show a significant improvement in both MOTA and IDF1 on three publicly available benchmarks.
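As a rough illustration of the idea in the description, the sketch below runs one generic message passing step over a tiny detection graph. Everything in it is an assumption made for illustration: the scalar feature values, the `edge_update` and `node_update` functions (fixed arithmetic stand-ins for learned networks), and the plain sum aggregation. It is not the architecture from the referenced paper, only a minimal example of how information can flow between detections and the edges that link them.

```python
# Toy message passing step on a detection graph (illustrative only).
# Nodes are detections; edges connect detections from different frames.

def edge_update(h_i, h_j, h_e):
    # Hypothetical stand-in for a learned edge network: mixes both endpoint
    # embeddings with the current edge embedding.
    return 0.5 * h_e + 0.25 * (h_i + h_j)

def node_update(h_n, messages):
    # Hypothetical stand-in for a learned node network: sums the incoming
    # messages and averages the result with the current node embedding.
    return 0.5 * (h_n + sum(messages))

def message_passing_step(node_feats, edge_feats):
    # 1) Update every edge from its two endpoint nodes.
    new_edges = {
        (i, j): edge_update(node_feats[i], node_feats[j], h_e)
        for (i, j), h_e in edge_feats.items()
    }
    # 2) Update every node from the updated edges incident to it.
    new_nodes = {}
    for n, h_n in node_feats.items():
        incident = [h for (i, j), h in new_edges.items() if n in (i, j)]
        new_nodes[n] = node_update(h_n, incident)
    return new_nodes, new_edges

# Three detections: 0 and 1 in frame t, 2 in frame t+1 (made-up scalars).
nodes = {0: 1.0, 1: 3.0, 2: 2.0}
edges = {(0, 2): 0.0, (1, 2): 0.0}
nodes, edges = message_passing_step(nodes, edges)
print(nodes, edges)
```

Repeating such a step several times lets each edge embedding depend on detections that are several hops away. A classifier on the final edge embeddings could then decide which links to keep, which is the data association step the description refers to.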
Reference:
G. Brasó, L. Leal-Taixé. Learning a Neural Solver for Multiple Object Tracking. In CVPR, 2020.
Last submitted:
April 16, 2020
Published:
April 17, 2020 at 10:41:55 CET
Submissions:
1
Project page / code:
Open source:
No
Hardware:
NVIDIA Quadro P5000
Runtime:
6.5 Hz
Benchmark performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
MOT15 | 51.5 | 58.6 | 45.0 | 225 (31.2) | 187 (25.9) | 7,620 | 21,780 | 64.6 | 83.9 | 46.2 | 44.4 | 54.8 | 67.1 | 51.0 | 66.3 | 79.4 | 1.3 | 375 (5.8) | 872 (13.5) |
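The MOTA value in the row above can be sanity-checked from the error counts with the standard CLEAR MOT definition, MOTA = 1 - (FN + FP + ID Sw.) / GT. The number of ground-truth boxes (GT) is not listed on this page, so the sketch below assumes it can be estimated from the rounded recall via Rcll = (GT - FN) / GT. The variable names are illustrative, and the result is only expected to agree with the reported value up to rounding.

```python
# Counts taken from the MOT15 row above.
fp, fn, id_sw = 7620, 21780, 375
recall = 64.6 / 100.0  # Rcll column, rounded to one decimal on the page

# Assumption: the ground-truth box count is not listed on the page, so it is
# estimated from recall = (GT - FN) / GT, i.e. GT = FN / (1 - recall).
gt_estimate = fn / (1.0 - recall)

# Standard CLEAR MOT definition of MOTA, expressed as a percentage.
mota = 100.0 * (1.0 - (fp + fn + id_sw) / gt_estimate)
print(f"estimated GT boxes: {gt_estimate:.0f}, MOTA: {mota:.1f}")
```

Because recall is rounded to one decimal place, the estimate of GT is approximate, and the computed MOTA can differ from the reported 51.5 by about a tenth of a point.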
Detailed performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
ADL-Rundle-1 | 33.3 | 53.9 | 42.2 | 15 | 3 | 3,025 | 3,143 | 66.2 | 67.1 | 46.1 | 38.9 | 51.4 | 69.2 | 51.9 | 52.6 | 77.5 | 6.1 | 42 | 111 |
ADL-Rundle-3 | 55.9 | 61.8 | 50.7 | 19 | 8 | 1,001 | 3,454 | 66.0 | 87.0 | 52.3 | 49.5 | 61.1 | 69.6 | 55.7 | 73.4 | 83.7 | 1.6 | 32 | 35 |
AVG-TownCentre | 60.4 | 62.5 | 45.2 | 86 | 37 | 439 | 2,335 | 67.3 | 91.6 | 44.5 | 46.5 | 57.8 | 58.2 | 50.3 | 68.5 | 76.3 | 1.0 | 58 | 222 |
ETH-Crossing | 52.1 | 62.5 | 48.1 | 7 | 10 | 61 | 416 | 58.5 | 90.6 | 50.8 | 45.6 | 64.7 | 70.1 | 49.8 | 76.9 | 84.9 | 0.3 | 3 | 6 |
ETH-Jelmoli | 60.4 | 72.6 | 54.5 | 18 | 13 | 372 | 626 | 75.3 | 83.7 | 57.2 | 52.0 | 70.2 | 66.8 | 61.7 | 68.5 | 82.5 | 0.8 | 7 | 30 |
ETH-Linthescher | 49.1 | 58.5 | 46.8 | 44 | 96 | 692 | 3,826 | 57.2 | 88.1 | 50.4 | 43.6 | 63.2 | 65.1 | 47.7 | 73.6 | 82.2 | 0.6 | 24 | 73 |
KITTI-16 | 55.5 | 69.5 | 44.5 | 2 | 1 | 107 | 636 | 62.6 | 90.9 | 46.6 | 42.5 | 49.9 | 67.3 | 46.1 | 66.9 | 75.4 | 0.5 | 14 | 33 |
KITTI-19 | 49.1 | 62.7 | 43.1 | 14 | 14 | 599 | 2,088 | 60.9 | 84.5 | 45.9 | 40.7 | 50.2 | 68.5 | 45.8 | 63.5 | 75.8 | 0.6 | 34 | 102 |
PETS09-S2L2 | 55.2 | 43.7 | 32.4 | 6 | 2 | 583 | 3,593 | 62.7 | 91.2 | 23.4 | 45.1 | 26.8 | 60.8 | 48.8 | 71.0 | 78.3 | 1.3 | 147 | 238 |
TUD-Crossing | 80.7 | 62.7 | 49.0 | 7 | 0 | 45 | 157 | 85.8 | 95.5 | 41.5 | 58.0 | 56.8 | 48.8 | 63.5 | 70.7 | 77.5 | 0.2 | 11 | 13 |
Venice-1 | 51.7 | 67.6 | 47.4 | 7 | 3 | 696 | 1,506 | 67.0 | 81.5 | 53.6 | 42.0 | 61.6 | 72.3 | 50.3 | 61.2 | 78.5 | 1.5 | 3 | 9 |
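The count-based columns are additive across sequences, so the per-sequence rows should sum to the overall MOT15 row. The ratio-based scores such as MOTA are computed from the pooled counts and are not averages of the per-sequence scores. The short check below copies the FP, FN, and ID Sw. values from the table; the dictionary layout and variable names are illustrative choices.

```python
# Per-sequence (FP, FN, ID Sw.) counts copied from the detailed table above.
per_sequence = {
    "ADL-Rundle-1": (3025, 3143, 42),
    "ADL-Rundle-3": (1001, 3454, 32),
    "AVG-TownCentre": (439, 2335, 58),
    "ETH-Crossing": (61, 416, 3),
    "ETH-Jelmoli": (372, 626, 7),
    "ETH-Linthescher": (692, 3826, 24),
    "KITTI-16": (107, 636, 14),
    "KITTI-19": (599, 2088, 34),
    "PETS09-S2L2": (583, 3593, 147),
    "TUD-Crossing": (45, 157, 11),
    "Venice-1": (696, 1506, 3),
}

# Sum each column across all sequences.
totals = tuple(sum(column) for column in zip(*per_sequence.values()))

# Overall (FP, FN, ID Sw.) from the MOT15 row under "Benchmark performance".
overall = (7620, 21780, 375)
print(totals, totals == overall)
```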