ICASSP 2022

ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2022 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

MULTILINGUAL SECOND-PASS RESCORING FOR AUTOMATIC SPEECH RECOGNITION SYSTEMS

ICASSP_2022_MLNOS.pdf (149)

Categories: Multilingual Recognition and Identification (SPE-MULT)

35 Views

Speech enhancement with neural homomorphic synthesis

Most deep learning-based speech enhancement methods operate directly on time-frequency representations or learned features without making use of a model of speech production. This work proposes a new speech enhancement method based on neural homomorphic synthesis. The speech signal is first decomposed into excitation and vocal tract components with complex cepstrum analysis. Then, two complex-valued neural networks are applied to estimate the target complex spectrum of the decomposed components. Finally, the time-domain speech signal is synthesized from the estimated excitation and vocal tract.
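
The decomposition step can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the simpler real cepstrum rather than the complex cepstrum named in the abstract, omits the neural networks entirely, and uses a hypothetical function name (`homomorphic_split`) and hypothetical parameters (`n_fft`, `cutoff`). Low-quefrency cepstral coefficients are taken as the smooth vocal tract envelope and the remainder as the excitation.

```python
import numpy as np

def homomorphic_split(frame, n_fft=512, cutoff=30):
    """Split a frame's log-magnitude spectrum into a smooth envelope
    (vocal tract estimate) and a fine-structure residual (excitation estimate).

    Assumption: real-cepstrum liftering stands in for the complex cepstrum
    analysis described in the abstract.
    """
    windowed = frame * np.hanning(len(frame))
    log_mag = np.log(np.abs(np.fft.fft(windowed, n_fft)) + 1e-8)

    # Real cepstrum: inverse FFT of the log-magnitude spectrum.
    cepstrum = np.fft.ifft(log_mag).real

    # Symmetric low-quefrency lifter: keep bins 0..cutoff-1 and their mirror.
    lifter = np.zeros(n_fft)
    lifter[:cutoff] = 1.0
    lifter[n_fft - cutoff + 1:] = 1.0

    envelope = np.fft.fft(cepstrum * lifter).real
    excitation = np.fft.fft(cepstrum * (1.0 - lifter)).real
    return log_mag, envelope, excitation

# Usage on a synthetic frame (a 100 Hz pulse train at 16 kHz plus light noise).
rng = np.random.default_rng(0)
frame = np.zeros(400)
frame[::160] = 1.0
frame += 0.01 * rng.standard_normal(400)
log_mag, envelope, excitation = homomorphic_split(frame)
```

Because the two lifters sum to one, the two components add back to the original log-magnitude spectrum. That additivity in the log domain is what makes the decomposition homomorphic.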

Speech enhancement with neural homomorphic synthesis.pdf (326)

Categories: Speech Enhancement (SPE-ENHA)

45 Views

ENCRYPTION RESISTANT DEEP NEURAL NETWORK WATERMARKING

Deep neural network (DNN) watermarking is one of the main techniques for protecting DNNs. Although various DNN watermarking schemes have been proposed, none of them can resist DNN encryption. In this paper, we propose an encryption-resistant DNN watermarking scheme that can resist DNN encryption based on parameter shuffling. Unlike existing schemes, which use the kernels separately for watermark embedding, we propose to embed the watermark into the fused kernels to resist the parameter shuffling.
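
A toy sketch can show why a fused quantity might survive parameter shuffling. The abstract does not say how the kernels are fused, so the averaging used here is purely an assumption, as are the function name `fuse_kernels` and the tensor shape. The only point is that an order-independent fusion yields the same carrier before and after the kernels are permuted.

```python
import numpy as np

def fuse_kernels(kernels):
    """Hypothetical fusion: average the kernels over the output-channel axis.

    Assumption: the paper's actual fusion rule is not given in the abstract;
    a mean is used only because it does not depend on kernel order.
    """
    return kernels.mean(axis=0)

rng = np.random.default_rng(0)
# Toy convolution layer with shape (out_channels, in_channels, height, width).
kernels = rng.standard_normal((16, 8, 3, 3))

# Parameter shuffling modeled as a permutation of the output-channel kernels.
shuffled = kernels[rng.permutation(kernels.shape[0])]

fused_before = fuse_kernels(kernels)
fused_after = fuse_kernels(shuffled)
print(np.allclose(fused_before, fused_after))
```

A watermark tied to individual kernel positions would be scrambled by the same permutation, while one embedded in the fused tensor would not.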

ENCRYPTION RESISTANT DEEP NEURAL NETWORK WATERMARKING.pdf

The poster of the paper (193)

Categories: Watermarking and Steganography

8 Views

Quantum Federated Learning with Quantum Data

Mahdi Poster.pdf

Poster (103)

Categories: Other

20 Views

END-TO-END ASR-ENHANCED NEURAL NETWORK FOR ALZHEIMER’S DISEASE DIAGNOSIS

This paper presents an approach to Alzheimer’s disease (AD) diagnosis from spontaneous speech using an end-to-end ASR-enhanced neural network. Under the condition that only audio data are provided and accurate transcripts are unavailable, this paper proposes a system that can analyze utterances to differentiate between AD patients, healthy controls, and individuals with mild cognitive impairment. The ASR-enhanced model comprises an automatic speech recognition (ASR) model with an encoder-decoder structure, whose encoder is followed by an AD classification network.

icassp2022_upload.pptx (155)

Categories: Bioacoustics and Medical Acoustics

18 Views

Modeling Beats And Downbeats With A Time-frequency Transformer

ICASSP 2022 beat poster.pdf (140)

Categories: Audio and Acoustic Signal Processing

19 Views

TO CATCH A CHORUS, VERSE, INTRO, OR ANYTHING ELSE: ANALYZING A SONG WITH STRUCTURAL FUNCTIONS

Conventional music structure analysis algorithms aim to divide a song into segments and to group them with abstract labels (e.g., ‘A’, ‘B’, and ‘C’). However, explicitly identifying the function of each segment (e.g., ‘verse’ or ‘chorus’) is rarely attempted, even though it has many applications. We introduce a multi-task deep learning framework to model these structural semantic labels directly from audio by estimating "verseness," "chorusness," and so forth, as a function of time.

ICASSP 2022 structure poster.pdf

poster for the paper "TO CATCH A CHORUS, VERSE, INTRO, OR ANYTHING ELSE: ANALYZING A SONG WITH STRUCTURAL FUNCTIONS" (147)

Categories: Music Signal Processing

47 Views

Adaptive Group Testing with Mismatched Models

ICASSP_Poster.pdf

Poster for "Adaptive Group Testing with Mismatched Models" (250)

Categories: Bayesian learning; Bayesian signal processing (MLR-BAYL)

8 Views

The Mirrornet: Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction

Experiments to understand the sensorimotor neural interactions in the human cortical speech system support the existence of a bidirectional flow of interactions between the auditory and motor regions. Their key function is to enable the brain to ‘learn’ how to control the vocal tract for speech production. This idea is the impetus for the recently proposed "MirrorNet", a constrained autoencoder architecture.

MirrorNet_presentation.pdf

Presentation (165)

Siriwardena_poster.pdf

Poster (141)

Categories: Music Signal Processing

28 Views

Aerial Base Station Placement Leveraging Radio Tomographic Maps

Mobile base stations on board unmanned aerial vehicles (UAVs) promise to deliver connectivity to those areas where the terrestrial infrastructure is overloaded, damaged, or absent. A fundamental problem in this context involves determining a minimal set of locations in 3D space where such aerial base stations (ABSs) must be deployed to provide coverage to a set of users. While nearly all existing approaches rely on average characterizations of the propagation medium, this work

02-poster_icassp-v2-daniel.pdf

Poster (178)

Categories: Communication Systems and Applications

27 Views
