Online Multiple Pedestrian Tracking with Deep Temporal Appearance Matching Association


Short name:

DEEP_TAMA

Benchmark:

Description:

In online multiple pedestrian tracking, it is of great importance to construct a reliable cost matrix for assigning observations to tracks. Each element of the cost matrix is computed from a similarity measure. Many previous works have proposed their own similarity calculation methods, combining a geometric model (e.g. bounding-box coordinates) with an appearance model. The appearance model, in particular, carries much higher-dimensional information than the geometric model. Thanks to the recent success of deep-learning-based methods, handling such high-dimensional appearance information has become feasible. Among deep networks, a siamese network with triplet loss is popularly adopted as an appearance feature extractor. Since a siamese network extracts features of each input independently, it allows tracks to be modeled adaptively (e.g. by linear update). However, it is not well suited to the multi-object setting, which requires comparison against other inputs. In this paper, we propose a novel track appearance model based on a joint-inference network to address this issue. The proposed method enables the comparison of two inputs to be used for adaptive appearance modeling, which helps disambiguate target-observation matching and consolidates identity consistency. Extensive experimental results support the effectiveness of our method, which was ranked as the 3rd-highest tracker on MOTChallenge19, held at the 4th BMTT workshop.
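The assignment step described above can be sketched as follows. This is a minimal illustration, not the paper's method: the appearance similarities that DEEP_TAMA obtains from its joint-inference network are passed in here as a precomputed matrix, and the names `build_cost_matrix`, `iou`, and the blending weight `alpha` are all hypothetical.

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def iou(box_a, box_b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    x1 = max(box_a[0], box_b[0]); y1 = max(box_a[1], box_b[1])
    x2 = min(box_a[2], box_b[2]); y2 = min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def build_cost_matrix(track_boxes, det_boxes, app_sim, alpha=0.5):
    """Cost = 1 - weighted sum of geometric and appearance similarity.

    app_sim[i, j] in [0, 1] is the appearance similarity between track i
    and detection j (in DEEP_TAMA it would come from the joint-inference
    network; here it is an arbitrary input). alpha is an illustrative
    blending weight, not a value from the paper.
    """
    n_t, n_d = len(track_boxes), len(det_boxes)
    cost = np.zeros((n_t, n_d))
    for i in range(n_t):
        for j in range(n_d):
            geo_sim = iou(track_boxes[i], det_boxes[j])
            cost[i, j] = 1.0 - (alpha * geo_sim + (1 - alpha) * app_sim[i, j])
    return cost

# Toy example: two tracks, two detections with swapped positions.
tracks = [[0, 0, 10, 10], [20, 20, 30, 30]]
dets   = [[21, 21, 31, 31], [1, 1, 11, 11]]
app    = np.array([[0.1, 0.9], [0.95, 0.05]])
rows, cols = linear_sum_assignment(build_cost_matrix(tracks, dets, app))
# Optimal assignment: track 0 -> detection 1, track 1 -> detection 0
```

Solving the resulting linear assignment problem (here with the Hungarian algorithm via SciPy) yields the target-observation matching; DEEP_TAMA's contribution lies in how the appearance term is produced, not in this generic assignment machinery.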

Hardware:

3.7 GHz, 1 core (no GPU)

Detector:

Public

Processing:

Online

Last submitted:

February 07, 2019

Published:

February 07, 2019 at 02:56:08 CET

Submissions:

1

Open source:

No

Project page / code:

n/a

Reference:

Y. Yoon, D. Kim, K. Yoon, Y. Song, M. Jeon. Online Multiple Pedestrian Tracking using Deep Temporal Appearance Matching Association. In arXiv:1907.00831, 2019.

Benchmark performance:

MOTA   MOTP   FAF   MT       ML       FP       FN        ID Sw.   Frag    Specifications             Detector
50.3   76.7   1.4   19.2 %   37.5 %   25,479   252,996   2,192    3,978   3.7 GHz, 1 core (no GPU)   Public

IDF1   ID Precision   ID Recall
53.5   71.6           42.7
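As a sanity check on the summary table, MOTA can be recomputed from its error counts via MOTA = 1 - (FP + FN + IDSW) / GT. The ground-truth box count is not stated on this page; 564,228 for the MOT17 test set is an assumption used only for this illustration.

```python
# Error counts taken from the benchmark table above.
fp, fn, idsw = 25_479, 252_996, 2_192

# Assumed total ground-truth boxes in the MOT17 test set (not from this page).
gt = 564_228

# CLEAR MOT accuracy: one minus the normalized total error.
mota = 1.0 - (fp + fn + idsw) / gt
print(f"MOTA = {100 * mota:.1f}")  # -> MOTA = 50.3
```

The result matches the 50.3 reported in the table, which supports the assumed ground-truth count.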

Detailed performance:

Sequence          MOTA   IDF1   MOTP   FAF   GT    MT       ML       FP      FN       ID Sw   Frag
MOT17-01-DPM      38.7   43.9   71.8   0.2   24    16.7 %   45.8 %   112     3,823    20      48
MOT17-03-DPM      54.2   53.5   75.8   1.6   148   19.6 %   19.6 %   2,331   45,407   248     520
MOT17-06-DPM      43.2   55.1   72.6   0.6   222   15.3 %   49.1 %   758     5,859    82      164
MOT17-07-DPM      38.5   43.6   73.2   1.1   60    8.3 %    43.3 %   538     9,777    79      166
MOT17-08-DPM      26.5   30.4   80.0   0.8   76    11.8 %   51.3 %   475     14,978   70      71
MOT17-12-DPM      41.0   50.2   76.5   0.3   91    20.9 %   49.5 %   263     4,816    37      48
MOT17-14-DPM      24.9   35.8   74.6   0.6   164   3.7 %    63.4 %   482     13,330   64      107
MOT17-01-FRCNN    28.3   40.9   75.7   3.4   24    33.3 %   29.2 %   1,549   3,051    26      45
MOT17-03-FRCNN    59.6   57.8   77.3   1.5   148   27.0 %   16.9 %   2,311   39,826   186     329
MOT17-06-FRCNN    47.3   54.9   76.4   1.1   222   23.4 %   31.1 %   1,369   4,732    115     208
MOT17-07-FRCNN    31.7   39.4   73.7   4.0   60    5.0 %    25.0 %   2,011   9,373    150     290
MOT17-08-FRCNN    23.1   30.3   79.5   1.4   76    9.2 %    51.3 %   892     15,280   65      83
MOT17-12-FRCNN    36.5   50.0   77.5   0.7   91    16.5 %   49.5 %   635     4,838    28      37
MOT17-14-FRCNN    16.9   34.3   71.0   4.4   164   4.3 %    45.1 %   3,308   11,803   243     410
MOT17-01-SDP      44.3   49.5   74.6   2.7   24    37.5 %   16.7 %   1,202   2,354    35      65
MOT17-03-SDP      75.9   70.6   78.4   1.0   148   51.4 %   9.5 %    1,486   23,515   183     488
MOT17-06-SDP      52.2   58.8   75.1   1.1   222   33.3 %   31.1 %   1,314   4,206    108     191
MOT17-07-SDP      46.4   49.2   75.5   2.2   60    23.3 %   30.0 %   1,097   7,859    96      192
MOT17-08-SDP      32.6   36.6   79.5   0.8   76    14.5 %   44.7 %   511     13,626   103     136
MOT17-12-SDP      40.8   54.4   78.4   0.9   91    23.1 %   46.2 %   826     4,271    35      45
MOT17-14-SDP      32.4   44.9   72.5   2.7   164   6.1 %    39.6 %   2,009   10,272   219     335

Raw data:

n/a
