Bioacoustics and Medical Acoustics

Stethoscope-Guided Supervised Contrastive Learning for Cross-domain Adaptation on Respiratory Sound Classification

Despite the remarkable advances in deep learning technology, achieving satisfactory performance in lung sound classification remains a challenge due to the scarcity of available data. Moreover, the respiratory sound samples are collected from a variety of electronic stethoscopes, which could introduce biases into the trained models. A significant distribution shift in the test dataset or in a practical scenario can therefore substantially decrease performance.

SG-SCL_poster_final.pdf (219)

Categories: Bioacoustics and Medical Acoustics

42 Views

VOCAL FOLD DYNAMICS FOR AUTOMATIC DETECTION OF AMYOTROPHIC LATERAL SCLEROSIS FROM VOICE

Amyotrophic Lateral Sclerosis (ALS) is a neurodegenerative disease that affects motor neurons and causes speech and respiratory dysfunctions. Current diagnostic methods are complicated, thus motivating the development of an efficient and objective diagnostic aid. We hypothesize that analyses of features capturing the essential characteristics of the biomechanical process of voice production can distinguish ALS patients from non-ALS controls. In this paper, we represent voices with algorithmically estimated vocal fold dynamics from physical models of phonation.

icassp_poster.pdf

Poster for paper (380)

Categories: Bioacoustics and Medical Acoustics

22 Views

Exploring Meta Information for Audio-based Zero-Shot Bird Classification

Advances in passive acoustic monitoring and machine learning have led to the procurement of vast datasets for computational bioacoustic research. Nevertheless, data scarcity is still an issue for rare and underrepresented species.

ICASSP24 - Poster.pdf (251)

Categories: Bioacoustics and Medical Acoustics

27 Views

Articulation GAN: Unsupervised Modeling of Articulatory Learning

Generative deep neural networks are widely used for speech synthesis, but most existing models directly generate waveforms or spectral outputs. Humans, however, produce speech by controlling articulators, which results in the production of speech sounds through physical properties of sound propagation. We introduce the Articulatory Generator to the Generative Adversarial Network paradigm, a new unsupervised generative model of speech production/synthesis.

Begus Zhou Wu Anumanchipalli 5406 Articulation GAN ICASSP 2023.pdf (392)

Categories: Speech Production (SPE-SPRD)
Speech Synthesis and Generation, including TTS (SPE-SYNT)
Human Spoken Language Acquisition, Development and Learning (SLP-LADL)
Language Modeling, for Speech and SLP (SLP-LANG)
Bioacoustics and Medical Acoustics

80 Views

ANALYSIS AND RE-SYNTHESIS OF NATURAL CRICKET SOUNDS ASSESSING THE PERCEPTUAL RELEVANCE OF IDIOSYNCRATIC PARAMETERS

Cricket sounds are usually regarded as pleasant and, thus, can be used as suitable test signals in psychoacoustic experiments assessing the human listening acuity to specific temporal and spectral features. In addition, the simple structure of cricket sounds makes them prone to reverse engineering such that they can be analyzed and re-synthesized with desired alterations in their defining parameters.

poster_AJF.pdf

Poster (222)

Categories: Bioacoustics and Medical Acoustics

26 Views

Self-supervised learning for infant cry analysis

In this paper, we explore self-supervised learning (SSL) for analyzing a first-of-its-kind database of cry recordings containing clinical indications of more than a thousand newborns. Specifically, we target cry-based detection of neurological injury as well as identification of cry triggers such as pain, hunger, and discomfort.

Poster_SSL_for_cry_analysis.pdf (267)

Paper_SSL_for_cry_analysis.pdf (260)

Categories: Bioacoustics and Medical Acoustics
Pattern recognition and classification (MLR-PATT)

63 Views

END-TO-END ASR-ENHANCED NEURAL NETWORK FOR ALZHEIMER’S DISEASE DIAGNOSIS

This paper presents an approach to Alzheimer’s disease (AD) diagnosis from spontaneous speech using an end-to-end ASR-enhanced neural network. For the setting in which only audio data are provided and accurate transcripts are unavailable, this paper proposes a system that can analyze utterances to differentiate between AD patients, healthy controls, and individuals with mild cognitive impairment. The ASR-enhanced model comprises an automatic speech recognition (ASR) model with an encoder-decoder structure, whose encoder is followed by an AD classification network.

icassp2022_upload.pptx (313)

Categories: Bioacoustics and Medical Acoustics

22 Views

THE SECOND DICOVA CHALLENGE: DATASET AND PERFORMANCE ANALYSIS FOR DIAGNOSIS OF COVID-19 USING ACOUSTICS

The Second Diagnosis of COVID-19 using Acoustics (DiCOVA) Challenge aimed at accelerating research in acoustics-based detection of COVID-19, a topic at the intersection of acoustics, signal processing, machine learning, and healthcare. This paper presents the details of the challenge, which was an open call for researchers to analyze a dataset of audio recordings consisting of breathing, cough, and speech signals. The data were collected from individuals with and without COVID-19 infection, and the challenge task was a two-class classification.

poster_final_icassp22_Debarpan_Bhattacharya.pdf

poster of the paper presented in ICASSP 22 (347)

Categories: Bioacoustics and Medical Acoustics

23 Views

THE SECOND DICOVA CHALLENGE: DATASET AND PERFORMANCE ANALYSIS FOR DIAGNOSIS OF COVID-19 USING ACOUSTICS

ppt_icassp22_v2.pdf

Slides of the paper presented in ICASSP 22 (264)

Categories: Bioacoustics and Medical Acoustics

21 Views

ORCA-PARTY: An Automatic Killer Whale Sound Type Separation Toolkit Using Deep Learning

Data-driven and machine-based analysis of massive bioacoustic data collections, in particular acoustic regions containing a substantial number of vocalization events, is essential and extremely valuable for identifying recurring vocal paradigms. However, these acoustic sections are usually characterized by a strong incidence of overlapping vocalization events, a major problem severely affecting subsequent human-/machine-based analysis and interpretation.