
VISION TRANSFORMER EQUIPPED WITH NEURAL RESIZER ON FACIAL EXPRESSION RECOGNITION TASK

Citation Author(s):
Soyeon Kim, Hyeon Yeo, Wei-Jin Park, Jiho Seo, Kyungtae Ko
Submitted by:
Hyeonbin Hwang
Last updated:
7 May 2022 - 9:40am
Document Type:
Poster
Document Year:
2022
Event:
Presenters:
Hyeonbin Hwang
Paper Code:
IVMSP-34.6
 

Under in-the-wild conditions, facial expression recognition is often challenged by low-quality data and imbalanced, ambiguous labels. The field has benefited greatly from CNN-based approaches; however, CNN models have structural limitations in capturing facial regions that are far apart. As a remedy, the Transformer, with its global receptive field, has been introduced to vision tasks, but the input spatial size must be adjusted to match the pretrained model in order to exploit its strong inductive bias. We therefore ask whether deterministic interpolation is sufficient for feeding low-resolution data to a Transformer. In this work, we propose a novel training framework, Neural Resizer, which supports the Transformer by compensating for lost information and downscaling in a data-driven manner, trained with a loss function that balances noisiness and label imbalance. Experiments show that our Neural Resizer with the F-PDLS loss function improves performance across Transformer variants in general and nearly achieves state-of-the-art performance.
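To make the idea of a data-driven resizer concrete, below is a minimal sketch of a learnable front-end that maps low-resolution faces to a pretrained Transformer's expected input size. The layer sizes, the bilinear skip connection, and the module name are illustrative assumptions, not the authors' exact architecture or the F-PDLS loss.

# Hypothetical sketch (PyTorch): a learnable resizer feeding a pretrained ViT.
# Layer widths and the bilinear skip path are assumptions for illustration only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class NeuralResizer(nn.Module):
    """Data-driven resizer: upscales/downscales inputs to the ViT input size
    while learning to compensate for information lost at low resolution."""
    def __init__(self, out_size=224, channels=16):
        super().__init__()
        self.out_size = out_size
        self.body = nn.Sequential(
            nn.Conv2d(3, channels, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.LeakyReLU(0.2, inplace=True),
            nn.Conv2d(channels, 3, 3, padding=1),
        )

    def forward(self, x):
        # Deterministic bilinear resize serves as a skip path...
        base = F.interpolate(x, size=(self.out_size, self.out_size),
                             mode='bilinear', align_corners=False)
        # ...and a small conv net learns a residual correction on top of it.
        return base + self.body(base)

# Usage sketch: prepend the resizer to any pretrained Transformer backbone,
# e.g. a timm ViT with 7 expression classes (names here are illustrative).
# resizer = NeuralResizer(out_size=224)
# vit = timm.create_model('vit_base_patch16_224', pretrained=True, num_classes=7)
# logits = vit(resizer(low_res_batch))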
