DD_TAMA16: Online Multiple Pedestrian Tracking with Deep Temporal Appearance Matching Association



Benchmark:

MOT16

Short name:

DD_TAMA16

Detector:

Public

Description:

In online multi-target tracking, it is important to model both the appearance and the geometric similarity between pedestrians that are already being tracked and those detected in a new frame. The feature vector underlying the appearance model has a much higher dimension than that of the geometric model, which in general makes appearance harder to model reliably. However, the recent success of deep learning-based methods makes it possible to handle such high-dimensional appearance information. Among deep networks, the Siamese network with triplet loss is a popular choice of appearance feature extractor. Because a Siamese network extracts features from each input independently, target-specific features can be updated and maintained over time. However, this independence makes it ill-suited to multi-target settings, where inputs must be compared against one another. In this paper, we address this issue with a novel track appearance model based on a joint-inference network. The proposed method compares two inputs jointly for adaptive appearance modeling, which disambiguates target-observation matching and consolidates identity consistency. Diverse experimental results support the effectiveness of our method, which ranked third on the MOTChallenge19 benchmark held at CVPR 2019. The code is available at https://github.com/yyc9268/Deep-TAMA.
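
To make the contrast concrete, below is a minimal sketch (not the authors' implementation; see the repository above for the real code) of a joint-inference appearance model: the track crop and the detection crop are stacked channel-wise, so every layer can compare the pair directly, unlike a Siamese extractor that embeds each crop on its own. All layer shapes, crop sizes, and names are illustrative assumptions, written here in PyTorch.

    import torch
    import torch.nn as nn

    class PairwiseMatcher(nn.Module):
        """Joint-inference appearance model (illustrative): the two pedestrian
        crops are concatenated channel-wise, so the network sees both inputs
        at once instead of embedding each one independently."""

        def __init__(self):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(6, 32, 3, padding=1),   # 6 channels = 2 stacked RGB crops
                nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Conv2d(32, 64, 3, padding=1),
                nn.ReLU(),
                nn.MaxPool2d(2),
                nn.Flatten(),
                nn.Linear(64 * 16 * 8, 128),      # assumes 64x32 input crops
                nn.ReLU(),
                nn.Linear(128, 1),
                nn.Sigmoid(),                     # match probability in [0, 1]
            )

        def forward(self, track_crop, det_crop):
            # (B, 3, 64, 32) + (B, 3, 64, 32) -> (B, 6, 64, 32)
            pair = torch.cat([track_crop, det_crop], dim=1)
            return self.net(pair)

    matcher = PairwiseMatcher()
    track = torch.randn(1, 3, 64, 32)   # appearance crop from an existing track
    det = torch.randn(1, 3, 64, 32)     # appearance crop from a new detection
    print(matcher(track, det).item())   # similarity score for the association step

A Siamese extractor would instead run a shared backbone on each crop separately and compare the two embeddings afterward, which is what makes independent, target-specific feature maintenance easy but pairwise comparison indirect.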

Reference:

Y. Yoon, D. Kim, Y. Song, K. Yoon, and M. Jeon. Online Multiple Pedestrians Tracking using Deep Temporal Appearance Matching Association. Information Sciences, 2020.

Last submitted:

February 10, 2019

Published:

April 28, 2019 at 13:42:34 CET

Submissions:

1

Open source:

No

Hardware:

3.7 GHz, 1 core, no GPU

Runtime:

6.5 Hz

Benchmark performance:

Benchmark | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag
MOT16 | 46.2 | 49.4 | 37.3 | 107 (14.1%) | 334 (44.0%) | 5,126 | 92,367 | 49.3 | 94.6 | 38.3 | 36.6 | 40.8 | 74.1 | 38.4 | 73.7 | 78.9 | 0.9 | 598 (12.1) | 1,127 (22.8)
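
The overall row is internally consistent with the standard CLEAR-MOT definitions (MOTA = 1 - (FP + FN + ID Sw.) / GT, Rcll = TP / GT, Prcn = TP / (TP + FP)). A minimal sanity-check sketch, assuming the commonly cited count of 182,326 ground-truth boxes in the MOT16 test set (that count does not appear on this page):

    # Reproduce the MOT16 row's MOTA/Rcll/Prcn from its raw error counts.
    gt = 182_326                       # assumed ground-truth boxes, MOT16 test set
    fp, fn, idsw = 5_126, 92_367, 598  # FP, FN, ID Sw. from the table above

    tp = gt - fn                       # every non-missed ground-truth box is matched
    mota = 1 - (fp + fn + idsw) / gt   # -> 0.462, i.e. MOTA 46.2
    rcll = tp / gt                     # -> 0.493, i.e. Rcll 49.3
    prcn = tp / (tp + fp)              # -> 0.946, i.e. Prcn 94.6
    print(f"MOTA={mota:.3f} Rcll={rcll:.3f} Prcn={prcn:.3f}")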

Detailed performance:

Sequence | MOTA | IDF1 | HOTA | MT | ML | FP | FN | Rcll | Prcn | AssA | DetA | AssRe | AssPr | DetRe | DetPr | LocA | FAF | ID Sw. | Frag
MOT16-01 | 39.0 | 44.2 | 33.2 | 5 | 10 | 112 | 3,768 | 41.1 | 95.9 | 37.4 | 29.5 | 40.1 | 68.9 | 30.6 | 71.3 | 76.6 | 0.2 | 20 | 48
MOT16-03 | 54.2 | 53.5 | 40.3 | 29 | 29 | 2,363 | 45,320 | 56.7 | 96.2 | 38.8 | 42.0 | 41.0 | 75.6 | 44.1 | 74.9 | 79.1 | 1.6 | 248 | 526
MOT16-06 | 43.6 | 55.8 | 41.4 | 35 | 106 | 786 | 5,641 | 51.1 | 88.2 | 48.0 | 36.1 | 54.5 | 66.5 | 39.0 | 67.4 | 76.6 | 0.7 | 80 | 162
MOT16-07 | 39.8 | 44.6 | 32.6 | 5 | 19 | 543 | 9,209 | 43.6 | 92.9 | 33.6 | 31.9 | 35.6 | 69.3 | 33.4 | 71.2 | 77.0 | 1.1 | 79 | 165
MOT16-08 | 32.9 | 35.5 | 29.1 | 9 | 25 | 526 | 10,642 | 36.4 | 92.1 | 29.5 | 28.8 | 31.4 | 75.4 | 30.1 | 76.1 | 82.2 | 0.8 | 70 | 71
MOT16-12 | 42.0 | 51.4 | 39.5 | 18 | 41 | 314 | 4,457 | 46.3 | 92.4 | 45.3 | 34.5 | 47.6 | 76.4 | 36.5 | 72.9 | 79.8 | 0.3 | 37 | 48
MOT16-14 | 24.9 | 35.8 | 25.9 | 6 | 104 | 482 | 13,330 | 27.9 | 91.4 | 32.4 | 20.7 | 35.1 | 67.2 | 21.5 | 70.7 | 78.1 | 0.6 | 64 | 107
