MULTIGAP: MULTI-POOLED INCEPTION NETWORK WITH TEXT AUGMENTATION FOR AESTHETIC PREDICTION OF PHOTOGRAPHS

Citation Author(s):
Yong-Lian Hii, John See, Magzhan Kairanbay, Lai-Kuan Wong
Submitted by:
Magzhan Kairanbay
Last updated:
15 September 2017 - 4:38am
Document Type:
Presentation Slides
Document Year:
2017
Presenters:
Magzhan Kairanbay
Paper Code:
3306

With the advent of deep learning, convolutional neural networks have substantially advanced many image understanding tasks. However, it remains to be seen whether the image "bottleneck" can be unplugged by harnessing complementary sources of data. In this paper, we present a new approach to image aesthetic evaluation that learns visual and textual features simultaneously. Our network extracts visual features by appending global average pooling blocks to multiple inception modules (MultiGAP), while textual features from associated user comments are learned by a recurrent neural network. Experimental results show that the proposed method achieves state-of-the-art performance on the AVA / AVA-Comments datasets. We also demonstrate the capability of our approach in visualizing aesthetic activations.
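The core pooling-and-fusion idea can be sketched as follows. This is a minimal NumPy illustration, not the authors' implementation: each inception module's output is reduced by global average pooling to a per-channel vector, the vectors from all tapped modules are concatenated into the MultiGAP visual feature, and a fixed-size vector stands in for the recurrent network's comment embedding. The channel sizes and spatial resolution below are illustrative assumptions.

```python
import numpy as np

def global_average_pool(feature_map):
    # feature_map: (channels, height, width) -> (channels,)
    # Averaging over the spatial dimensions keeps one value per channel.
    return feature_map.mean(axis=(1, 2))

def multigap_features(inception_outputs):
    # Concatenate the GAP vectors from several inception modules
    # into a single visual descriptor.
    return np.concatenate([global_average_pool(f) for f in inception_outputs])

# Toy example: three simulated inception-module outputs
# (channel counts are assumptions, chosen only for illustration).
rng = np.random.default_rng(0)
maps = [rng.standard_normal((c, 8, 8)) for c in (256, 480, 832)]

visual = multigap_features(maps)        # shape: (256 + 480 + 832,) = (1568,)
text = rng.standard_normal(128)         # stand-in for an RNN comment embedding
joint = np.concatenate([visual, text])  # fused visual + textual representation
```

In a full model, `joint` would feed a classifier head that predicts the aesthetic label; here it simply demonstrates how the multi-pooled visual branch and the textual branch combine into one feature vector.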
