In online multi-target tracking, it is crucial to model the appearance and geometric similarity between pedestrians that are already being tracked and those appearing in a new frame. The feature vectors underlying the appearance model are of much higher dimension than those of the geometric model, which has traditionally made appearance modeling difficult. However, the recent success of deep learning-based methods makes it possible to handle such high-dimensional appearance information effectively. Among many deep architectures, the Siamese network with triplet loss is a popular choice for appearance feature extraction. Since a Siamese network extracts the features of each input independently, it can update and maintain target-specific features; however, it is not well suited to multi-target settings, which require comparison against other inputs. In this paper, to address this issue, we propose a novel track appearance model based on a joint-inference network. The proposed method jointly processes two inputs for adaptive appearance modeling, which disambiguates target-observation matching and consolidates identity consistency. Diverse experimental results support the effectiveness of our method. Our tracker ranked 3rd on MOTChallenge19, held at CVPR 2019. The code is available at https://github.com/yyc9268/Deep-TAMA.
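The distinction the abstract draws can be illustrated with a minimal NumPy sketch (not the authors' implementation): the triplet loss used to train a Siamese feature extractor scores independently extracted feature vectors, while a joint-inference head (here stood in for by a hypothetical pairwise similarity function) consumes both inputs together.

```python
import numpy as np

def triplet_loss(anchor, positive, negative, margin=0.2):
    """Standard triplet loss used to train Siamese feature extractors:
    pull the anchor toward the positive, push it away from the negative."""
    d_pos = np.sum((anchor - positive) ** 2)
    d_neg = np.sum((anchor - negative) ** 2)
    return max(0.0, d_pos - d_neg + margin)

def joint_similarity(feat_a, feat_b):
    """Hypothetical stand-in for a joint-inference head: the pair is scored
    together (here via cosine similarity) rather than each input being
    embedded in isolation and compared afterwards."""
    num = float(np.dot(feat_a, feat_b))
    den = float(np.linalg.norm(feat_a) * np.linalg.norm(feat_b)) + 1e-8
    return num / den

# Toy appearance features for one tracked target and two detections.
rng = np.random.default_rng(0)
a = rng.normal(size=128)
p = a + 0.05 * rng.normal(size=128)   # same identity, slight appearance change
n = rng.normal(size=128)              # different identity
print(triplet_loss(a, p, n))          # near 0: this triplet is already satisfied
print(joint_similarity(a, p) > joint_similarity(a, n))
```

The joint scoring is what lets the matching step adapt to the specific pair being compared, rather than relying on fixed per-target embeddings.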
Y. Yoon, D. Kim, K. Yoon, Y. Song, M. Jeon. Online Multiple Pedestrian Tracking using Deep Temporal Appearance Matching Association. arXiv:1907.00831, 2019.
February 10, 2019
April 28, 2019 at 13:42:34 CET
Project page / code: https://github.com/yyc9268/Deep-TAMA
3.7 GHz, 1 core, no GPU
| Benchmark | MOTA | IDF1 | MOTP | MT (%) | ML (%) | FP | FN | Rcll | Prcn | FAF | ID Sw. (%) | Frag. (%) |
|-----------|------|------|------|--------|--------|----|----|------|------|-----|------------|-----------|
| MOT16 | 46.2 | 49.4 | 75.4 | 107 (14.1) | 334 (44.0) | 5,126 | 92,367 | 49.3 | 94.6 | 0.9 | 598 (12.1) | 1,127 (22.8) |