END-TO-END PERSON SEARCH SEQUENTIALLY TRAINED ON AGGREGATED DATASET

In video surveillance applications, person search is a challenging
task that consists of detecting people and extracting features from
their silhouettes for re-identification (re-ID) purposes. We propose
a new end-to-end model that jointly computes the detection and
feature extraction steps through a single deep Convolutional Neural
Network architecture. Sharing feature maps between the two tasks, so
that they jointly describe people's commonalities and specificities,
allows faster runtime, which is valuable in real-world applications.
In addition to reaching state-of-the-art accuracy, this multi-task
model can be trained sequentially, task by task, which lets it accept
a broader range of input dataset types. Indeed, we show that
aggregating more pedestrian detection datasets without costly
identity annotations makes the shared feature maps more generic and
improves re-ID precision. Moreover, these boosted shared feature maps
yield re-ID features that are more robust in a cross-dataset scenario.
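
The point about dataset types can be illustrated with a small data-routing sketch. It is not the authors' pipeline: the class names, dataset names, and the two filter functions are hypothetical, and no network is trained. It only shows why task-by-task training could admit detection-only datasets: a detection stage needs bounding boxes alone, whereas a re-ID stage needs identity labels as well.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


@dataclass
class Annotation:
    box: Box
    identity: Optional[int] = None  # None when no identity label exists


@dataclass
class Sample:
    dataset: str
    image_id: str
    annotations: List[Annotation]


def detection_stage_samples(samples: List[Sample]) -> List[Sample]:
    """Samples usable for a detection stage: any image with at least one box."""
    return [s for s in samples if s.annotations]


def reid_stage_samples(samples: List[Sample]) -> List[Sample]:
    """Samples usable for a re-ID stage: at least one identity-labelled box."""
    return [
        s for s in samples
        if any(a.identity is not None for a in s.annotations)
    ]


# Hypothetical aggregated dataset: one person-search set with identities,
# plus a pedestrian detection set that has boxes only.
aggregated = [
    Sample("search_set", "a.jpg", [Annotation((0, 0, 10, 20), identity=7)]),
    Sample("detection_only_set", "b.jpg", [Annotation((5, 5, 15, 30))]),
    Sample("detection_only_set", "c.jpg",
           [Annotation((1, 2, 8, 18)), Annotation((20, 4, 30, 28))]),
]

stage1 = detection_stage_samples(aggregated)  # all three images
stage2 = reid_stage_samples(aggregated)       # only the identity-labelled one
print(len(stage1), len(stage2))  # 3 1
```

Under these assumptions, the detection-only images contribute to the first stage, and therefore to the shared feature maps, even though they can never contribute to the second.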

2019_icip_aloesch_v2.pdf
