Short name:
mfi_tst
Detector:
Public
Description:
Recent advances in deep learning have greatly improved the performance of multi-object tracking algorithms based on deep neural networks. However, most methods split the functional modules across multiple networks and train each one independently on its own task. Used together directly, these modules are poorly matched to one another and poorly adapted to the multi-object tracking task, which degrades tracking performance. This work therefore designs a network structure that combines inter-frame object regression and appearance feature extraction in a single model, improving the coordination between the functional modules of multi-object tracking. To better support the tracking task, an end-to-end training method is also proposed that simulates the multi-object tracking process during training and expands the training data by combining each target's historical positions with motion-model predictions. A metric loss that exploits each target's historical appearance features is used to train the appearance feature extraction module, improving the temporal correlation of the extracted features. Evaluation results on the MOTChallenge benchmark datasets show that the proposed approach achieves state-of-the-art performance.
Reference:
J. Y, H. Ge, J. Yang, Y. Tong, S. Su. Online multi-object tracking using multi-function integration and tracking simulation training. In Applied Intelligence, 2021.
Last submitted:
May 12, 2021
Published:
May 12, 2021 at 09:06:44 CET
Submissions:
1
Project page / code:
n/a
Open source:
No
Hardware:
NVIDIA GeForce RTX 2080 Ti
Runtime:
0.7 Hz
Benchmark performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
MOT15 | 49.2 | 52.4 | 41.5 | 210 (29.1) | 176 (24.4) | 8,707 | 21,594 | 64.9 | 82.1 | 40.3 | 43.4 | 44.2 | 74.2 | 50.7 | 64.2 | 79.0 | 1.5 | 912 (0.0) | 1,397 (0.0) |
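The MOTA, Rcll and Prcn values in the row above can be recovered from the raw error counts. The following is a minimal sketch, assuming the standard CLEAR MOT definitions and a ground-truth total of 61,440 boxes for the MOT15 test set. That total is not stated on this page and is an assumption here; `NUM_GT` and `clear_mot` are illustrative names, not part of any official evaluation kit.

```python
# Raw counts copied from the MOT15 benchmark row above.
FP = 8707      # false positives
FN = 21594     # false negatives (missed boxes)
ID_SW = 912    # identity switches

# ASSUMPTION: total number of ground-truth boxes in the test set.
# This value does not appear on this page.
NUM_GT = 61440


def clear_mot(fp, fn, id_sw, num_gt):
    """Return MOTA, recall and precision as percentages (one decimal)."""
    tp = num_gt - fn                          # matched ground-truth boxes
    mota = 1.0 - (fn + fp + id_sw) / num_gt   # CLEAR MOT accuracy
    recall = tp / num_gt
    precision = tp / (tp + fp)
    return {
        "MOTA": round(100 * mota, 1),
        "Rcll": round(100 * recall, 1),
        "Prcn": round(100 * precision, 1),
    }


scores = clear_mot(FP, FN, ID_SW, NUM_GT)
print(scores)
```

Under the assumed ground-truth total, the sketch gives 49.2, 64.9 and 82.1, matching the MOTA, Rcll and Prcn columns of the row.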
Detailed performance:
Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag |
ADL-Rundle-1 | 31.2 | 47.5 | 39.2 | 18 | 3 | 3,476 | 2,868 | 69.2 | 64.9 | 40.2 | 38.7 | 44.4 | 69.5 | 54.0 | 50.7 | 78.0 | 7.0 | 61 | 145 |
ADL-Rundle-3 | 42.4 | 46.6 | 39.4 | 11 | 5 | 1,768 | 4,007 | 60.6 | 77.7 | 37.3 | 42.3 | 40.2 | 77.7 | 50.6 | 64.9 | 82.8 | 2.8 | 82 | 111 |
AVG-TownCentre | 56.6 | 63.0 | 44.8 | 76 | 32 | 502 | 2,267 | 68.3 | 90.7 | 44.3 | 45.9 | 49.6 | 69.9 | 50.4 | 66.9 | 76.0 | 1.1 | 331 | 372 |
ETH-Crossing | 47.6 | 55.0 | 42.5 | 5 | 9 | 51 | 463 | 53.8 | 91.4 | 42.1 | 43.0 | 45.0 | 82.9 | 46.3 | 78.5 | 85.3 | 0.2 | 12 | 16 |
ETH-Jelmoli | 57.6 | 69.5 | 51.3 | 18 | 12 | 362 | 698 | 72.5 | 83.6 | 53.3 | 49.8 | 60.5 | 75.6 | 58.6 | 67.5 | 81.4 | 0.8 | 15 | 40 |
ETH-Linthescher | 51.6 | 57.6 | 45.9 | 37 | 100 | 304 | 3,967 | 55.6 | 94.2 | 49.0 | 43.1 | 53.3 | 78.0 | 45.6 | 77.4 | 82.4 | 0.3 | 52 | 98 |
KITTI-16 | 56.3 | 72.9 | 47.0 | 4 | 1 | 184 | 539 | 68.3 | 86.3 | 49.1 | 45.2 | 52.3 | 68.1 | 50.0 | 63.2 | 73.9 | 0.9 | 20 | 54 |
KITTI-19 | 51.7 | 61.0 | 42.4 | 17 | 10 | 730 | 1,792 | 66.5 | 82.9 | 42.2 | 43.2 | 47.3 | 66.5 | 49.4 | 61.6 | 74.8 | 0.7 | 57 | 158 |
PETS09-S2L2 | 59.7 | 34.9 | 29.8 | 9 | 2 | 506 | 3,140 | 67.4 | 92.8 | 18.6 | 47.9 | 19.8 | 69.3 | 51.9 | 71.4 | 78.2 | 1.2 | 244 | 357 |
TUD-Crossing | 75.5 | 75.5 | 53.7 | 9 | 0 | 69 | 192 | 82.6 | 93.0 | 51.2 | 56.4 | 59.8 | 63.7 | 62.3 | 70.1 | 77.5 | 0.3 | 9 | 14 |
Venice-1 | 46.4 | 53.3 | 42.2 | 6 | 2 | 755 | 1,661 | 63.6 | 79.4 | 44.8 | 40.2 | 47.5 | 77.3 | 48.1 | 60.1 | 77.8 | 1.7 | 29 | 32 |
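The count columns of the detailed table add up to the MOT15 totals in the benchmark row. A short sketch of that consistency check, using only the FP, FN and ID Sw. values listed above (the dictionary name `PER_SEQUENCE` is illustrative):

```python
# (FP, FN, ID Sw.) per sequence, copied from the detailed table above.
PER_SEQUENCE = {
    "ADL-Rundle-1":    (3476, 2868, 61),
    "ADL-Rundle-3":    (1768, 4007, 82),
    "AVG-TownCentre":  (502, 2267, 331),
    "ETH-Crossing":    (51, 463, 12),
    "ETH-Jelmoli":     (362, 698, 15),
    "ETH-Linthescher": (304, 3967, 52),
    "KITTI-16":        (184, 539, 20),
    "KITTI-19":        (730, 1792, 57),
    "PETS09-S2L2":     (506, 3140, 244),
    "TUD-Crossing":    (69, 192, 9),
    "Venice-1":        (755, 1661, 29),
}

# Sum each column across all sequences.
totals = tuple(sum(column) for column in zip(*PER_SEQUENCE.values()))
print(totals)  # (8707, 21594, 912), the FP, FN and ID Sw. totals for MOT15
```

Ratio metrics such as MOTA or IDF1 cannot be combined this way: the overall value is computed from the pooled counts, not by averaging the per-sequence percentages.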