Spectrogram Analysis Via Self-Attention for Realizing Cross-Model Visual-Audio Generation

Human cognition is supported by the combination of multi-modal information from different sources of perception. The two most important modalities are visual and audio. Cross-modal visual-audio generation enables the synthesis of data from one modality following the acquisition of data from another. This brings about the full experience that can only be achieved through the combination of the two. In this paper, the Self-Attention mechanism is applied to cross-modal visual-audio generation for the first time. This technique is implemented to assist in the analysis of the structural characteristics of the spectrogram. A series of experiments are conducted to discover the best performing configuration. The post-experimental comparison shows that the Self-Attention module greatly improves the generation and classification of audio data. Furthermore, the presented method achieves results that are superior to existing cross-modal visual-audio generative models.
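The abstract does not describe the form of the Self-Attention module. As a rough illustration only, the sketch below assumes a SAGAN-style attention layer applied to a spectrogram feature map of shape (channels, frequency bins, time frames), so that every time-frequency position can attend to every other one. The function names, the projection matrices `w_q`, `w_k`, `w_v`, and the residual scale `gamma` are hypothetical and are not taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention_2d(feat, w_q, w_k, w_v, gamma=0.0):
    """Assumed SAGAN-style self-attention over a spectrogram feature map.

    feat : (C, F, T) array, C channels, F frequency bins, T time frames
    w_q  : (Ck, C) query projection (hypothetical parameter)
    w_k  : (Ck, C) key projection (hypothetical parameter)
    w_v  : (C, C)  value projection (hypothetical parameter)
    gamma: scalar weight of the attention branch in the residual sum
    Returns the output map of shape (C, F, T) and the (N, N) attention
    matrix, where N = F * T.
    """
    c, f, t = feat.shape
    x = feat.reshape(c, f * t)        # flatten positions: (C, N)
    q = w_q @ x                       # queries: (Ck, N)
    k = w_k @ x                       # keys:    (Ck, N)
    v = w_v @ x                       # values:  (C, N)
    # Row i holds the weights that position i assigns to all N positions.
    attn = softmax(q.T @ k, axis=-1)  # (N, N)
    out = v @ attn.T                  # weighted sum of values: (C, N)
    # Residual connection; gamma = 0 leaves the input unchanged.
    return (gamma * out + x).reshape(c, f, t), attn
```

Because the attention matrix spans all pairs of time-frequency positions, a layer of this kind can relate distant regions of the spectrogram, such as harmonics at different frequencies, which a small convolution kernel would not connect directly. Its memory cost grows with the square of F * T, so in practice it would likely be applied to a downsampled feature map.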

SA-CMGAN poster.pdf

