Spectrogram Analysis Via Self-Attention for Realizing Cross-Model Visual-Audio Generation

Citation Author(s):
Huadong Tan ; Guang Wu ; Pengcheng Zhao ; Yanxiang Chen
Submitted by:
Guang Wu
Last updated:
13 May 2020 - 9:29pm


Human cognition is supported by the combination of multi-modal information from different sources of perception. The two most important modalities are visual and audio. Cross-modal visual-audio generation enables the synthesis of data from one modality following the acquisition of data from another. This brings about the full experience that can only be achieved through the combination of the two. In this paper, the Self-Attention mechanism is applied to cross-modal visual-audio generation for the first time. This technique is implemented to assist in the analysis of the structural characteristics of the spectrogram. A series of experiments are conducted to discover the best performing configuration. The post-experimental comparison shows that the Self-Attention module greatly improves the generation and classification of audio data. Furthermore, the presented method achieves results that are superior to existing cross-modal visual-audio generative models.
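
The abstract does not describe the architecture, so the following is only a minimal sketch of generic dot-product self-attention over a flattened spectrogram feature map, not the authors' module. The function names, the shapes (4 frequency bins, 5 time frames, 3 channels), the random projection matrices, and the residual weight `gamma` are all assumptions chosen for illustration. It shows how every time-frequency position can attend to every other position, which is one way such a module could capture long-range structure in a spectrogram.

```python
import math
import random


def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def softmax(row):
    """Numerically stable softmax over one list of scores."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(x, w_q, w_k, w_v, gamma=1.0):
    """Generic self-attention with a residual connection.

    x: N positions by C channels, i.e. a (freq, time) feature map
       flattened so that each time-frequency cell is one position.
    Returns the attended features (N by C) and the attention map (N by N).
    """
    q = matmul(x, w_q)                      # queries, N by d
    k = matmul(x, w_k)                      # keys, N by d
    v = matmul(x, w_v)                      # values, N by C
    k_t = [list(col) for col in zip(*k)]    # transpose keys to d by N
    scores = matmul(q, k_t)                 # pairwise similarities, N by N
    attn = [softmax(r) for r in scores]     # each row sums to 1
    context = matmul(attn, v)               # weighted sum of values
    out = [[gamma * c + xi for c, xi in zip(crow, xrow)]
           for crow, xrow in zip(context, x)]
    return out, attn


def random_matrix(rows, cols, rng, scale=0.1):
    """Hypothetical stand-in for learned weights or extracted features."""
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
n_freq, n_time, channels, d = 4, 5, 3, 2    # assumed toy sizes
x = random_matrix(n_freq * n_time, channels, rng, scale=1.0)
w_q = random_matrix(channels, d, rng)
w_k = random_matrix(channels, d, rng)
w_v = random_matrix(channels, channels, rng)

out, attn = self_attention(x, w_q, w_k, w_v)
print(len(out), len(out[0]), len(attn), len(attn[0]))
```

With `gamma` set to 0 the block returns its input unchanged, so in a trained model a learnable `gamma` would let the network decide how much attended context to mix in; whether the paper uses such a weight is not stated in the abstract.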
