Spectrogram Analysis Via Self-Attention for Realizing Cross-Modal Visual-Audio Generation
- Citation Author(s):
- Submitted by:
- Guang Wu
- Last updated:
- 13 May 2020 - 9:29pm
- Document Type:
- Poster
- Document Year:
- 2020
- Event:
- Presenters:
- Guang Wu
- Paper Code:
- MMSP-P1.1
- Categories:
Human cognition is supported by the combination of multimodal information from different sources of perception, the two most important modalities being visual and audio. Cross-modal visual-audio generation enables the synthesis of data in one modality following the acquisition of data in the other, providing the full experience that only the combination of the two can achieve. In this paper, the Self-Attention mechanism is applied to cross-modal visual-audio generation for the first time; it assists in analyzing the structural characteristics of the spectrogram. A series of experiments is conducted to find the best-performing configuration. The post-experimental comparison shows that the Self-Attention module greatly improves both the generation and the classification of audio data, and the presented method achieves results superior to existing cross-modal visual-audio generative models.
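To make the core idea concrete, here is a minimal NumPy sketch of single-head self-attention applied across the time frames of a spectrogram. This is an illustrative assumption, not the paper's actual architecture (which is not detailed in this abstract): the weight matrices are random stand-ins for learned parameters, and attention is computed frame-to-frame so that each output frame is a weighted mix of all frames, capturing long-range structure along the time axis.

```python
import numpy as np

def self_attention(spec, d_k=32, seed=0):
    """Single-head self-attention over spectrogram time frames.

    spec: (T, F) array -- T time frames, F frequency bins.
    Returns the attended feature map (T, F) and the (T, T)
    attention matrix, whose rows are softmax-normalized.
    NOTE: random projections stand in for learned weights; this is
    a sketch of the mechanism, not the paper's trained model.
    """
    rng = np.random.default_rng(seed)
    T, F = spec.shape
    W_q = rng.standard_normal((F, d_k)) / np.sqrt(F)  # query projection
    W_k = rng.standard_normal((F, d_k)) / np.sqrt(F)  # key projection
    W_v = rng.standard_normal((F, F)) / np.sqrt(F)    # value projection

    Q, K, V = spec @ W_q, spec @ W_k, spec @ W_v
    scores = Q @ K.T / np.sqrt(d_k)               # (T, T) frame affinities
    scores -= scores.max(axis=1, keepdims=True)   # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)       # rows sum to 1
    return attn @ V, attn

# Toy magnitude spectrogram: 100 frames x 64 frequency bins.
spec = np.abs(np.random.default_rng(1).standard_normal((100, 64)))
out, attn = self_attention(spec)
print(out.shape, attn.shape)  # (100, 64) (100, 100)
```

In a generative model such as the one described, a block like this would typically sit inside the generator or classifier so that distant spectrogram regions can inform each other, rather than relying only on the local receptive fields of convolutions.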