Benchmark:
MOT15
Short name:
MotiCon
Detector:
Public
Description:
We present a novel method for multiple people tracking that leverages a generalized model for capturing interactions among individuals.
At the core of our model lies a learned dictionary of interaction feature strings which capture relationships between the motions of targets.
These feature strings, created from low-level image features, give a much richer representation of the physical interactions between targets than the hand-specified social force models that previous works have used for tracking. One disadvantage of social forces is that they can be applied only when all pedestrians are detected; our method can instead encode the effect of undetected targets, which makes the tracker more robust to partial occlusions.
The interaction feature strings are used in a Random Forest framework to track targets according to the features surrounding them.
Reference:
L. Leal-Taixé, M. Fenzi, A. Kuznetsova, B. Rosenhahn, S. Savarese. Learning an image-based motion context for multiple people tracking. In CVPR, 2014.
Last submitted:
April 07, 2015
Published:
April 07, 2015 at 21:02:03 CET
Submissions:
1
Project page / code:
Open source:
Yes
Hardware:
2.6 GHz, 1 core
Runtime:
1.4 Hz
Benchmark performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
MOT15 | 23.1 | 29.4 | 0.0 | 34 (4.7) | 375 (52.0) | 10,404 | 35,844 | 41.7 | 71.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.8 | 1,018 (24.4) | 1,061 (25.5) |
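The MOTA value in the row above can be cross-checked from the listed error counts. The following is a minimal sketch, assuming the standard CLEAR MOT definition MOTA = 1 - (FP + FN + ID Sw.) / GT. The number of ground-truth boxes (GT) is not listed on this page, so the sketch estimates it from the rounded recall value; the function names are illustrative only and are not part of any benchmark toolkit.

```python
def estimate_gt_boxes(fn: int, recall_pct: float) -> float:
    """Estimate the ground-truth box count from Rcll = 1 - FN / GT.

    This is an approximation: the recall in the table is rounded
    to one decimal place.
    """
    return fn / (1.0 - recall_pct / 100.0)


def mota(fp: int, fn: int, id_switches: int, gt_boxes: float) -> float:
    """MOTA in percent under the standard CLEAR MOT definition (assumed)."""
    return 100.0 * (1.0 - (fp + fn + id_switches) / gt_boxes)


def precision(fp: int, fn: int, gt_boxes: float) -> float:
    """Precision in percent: TP / (TP + FP), with TP = GT - FN."""
    tp = gt_boxes - fn
    return 100.0 * tp / (tp + fp)


# Counts copied from the MOT15 row of the benchmark table.
fp, fn, idsw, rcll = 10_404, 35_844, 1_018, 41.7

gt_est = estimate_gt_boxes(fn, rcll)
mota_est = mota(fp, fn, idsw, gt_est)
prcn_est = precision(fp, fn, gt_est)

print(f"estimated GT boxes: {gt_est:.0f}")
print(f"estimated MOTA:     {mota_est:.1f}")
print(f"estimated Prcn:     {prcn_est:.1f}")
```

Under these assumptions the estimates agree with the reported MOTA (23.1) and Prcn (71.1) to within rounding, which suggests the count columns and the percentage columns are mutually consistent.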
Detailed performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
ADL-Rundle-1 | 1.0 | 30.4 | 0.0 | 6 | 4 | 4,449 | 4,628 | 50.3 | 51.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8.9 | 136 | 170 |
ADL-Rundle-3 | 18.1 | 22.9 | 0.0 | 2 | 9 | 2,755 | 5,355 | 47.3 | 63.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 4.4 | 217 | 140 |
AVG-TownCentre | 11.9 | 23.0 | 0.0 | 2 | 158 | 353 | 5,872 | 17.9 | 78.3 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8 | 74 | 75 |
ETH-Crossing | 22.8 | 29.1 | 0.0 | 1 | 17 | 13 | 753 | 24.9 | 95.1 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 8 | 8 |
ETH-Jelmoli | 43.5 | 52.4 | 0.0 | 9 | 13 | 295 | 1,102 | 56.6 | 82.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7 | 37 | 51 |
ETH-Linthescher | 18.3 | 24.0 | 0.0 | 3 | 146 | 98 | 7,124 | 20.2 | 94.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.1 | 72 | 75 |
KITTI-16 | 38.8 | 39.5 | 0.0 | 0 | 2 | 142 | 863 | 49.3 | 85.5 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.7 | 36 | 48 |
KITTI-19 | 33.8 | 41.7 | 0.0 | 4 | 13 | 887 | 2,552 | 52.2 | 75.9 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.8 | 100 | 126 |
PETS09-S2L2 | 46.6 | 27.2 | 0.0 | 4 | 6 | 560 | 4,354 | 54.8 | 90.4 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.3 | 238 | 264 |
TUD-Crossing | 58.2 | 46.6 | 0.0 | 3 | 2 | 32 | 403 | 63.4 | 95.6 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.2 | 26 | 32 |
Venice-1 | 18.2 | 26.1 | 0.0 | 0 | 5 | 820 | 2,838 | 37.8 | 67.8 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 1.8 | 74 | 72 |
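As a sanity check on the detailed table, the count columns of the per-sequence rows should add up to the MOT15 benchmark row. The sketch below copies FP, FN, ID Sw. and Frag from the rows above and sums them; the variable names are illustrative. Percentage columns such as MOTA and Rcll are not additive and are deliberately left out.

```python
# (FP, FN, ID Sw., Frag) per sequence, copied from the detailed table.
per_sequence = {
    "ADL-Rundle-1":    (4449, 4628, 136, 170),
    "ADL-Rundle-3":    (2755, 5355, 217, 140),
    "AVG-TownCentre":  (353, 5872, 74, 75),
    "ETH-Crossing":    (13, 753, 8, 8),
    "ETH-Jelmoli":     (295, 1102, 37, 51),
    "ETH-Linthescher": (98, 7124, 72, 75),
    "KITTI-16":        (142, 863, 36, 48),
    "KITTI-19":        (887, 2552, 100, 126),
    "PETS09-S2L2":     (560, 4354, 238, 264),
    "TUD-Crossing":    (32, 403, 26, 32),
    "Venice-1":        (820, 2838, 74, 72),
}

# Column-wise sums over all sequences.
totals = tuple(sum(column) for column in zip(*per_sequence.values()))

# Values reported in the MOT15 row of the benchmark table.
reported = (10_404, 35_844, 1_018, 1_061)

print(totals)              # (10404, 35844, 1018, 1061)
print(totals == reported)  # True
```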
Raw data: