ICASSP 2023

IEEE ICASSP 2023 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2023 conference will feature world-class presentations by internationally renowned speakers, cutting-edge session topics and provide a fantastic opportunity to network with like-minded professionals from around the world. Visit the website.

RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY

Read more about RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY
Log in to post comments

Recognizing signs in virtual reality (VR) is challenging; here, we developed an American Sign Language (ASL) recognition system in a VR environment. We collected a dataset of 2,500 ASL numerical digits (0-10) and 500 instances of the ASL sign for TEA from 10 participants using an Oculus Quest 2. Participants produced ASL signs naturally, resulting in significant variability in location, orientation, duration, and motion trajectory. Additionally, the ten signers in this initial study were diverse in age, sex, ASL proficiency, and hearing status, with most being deaf lifelong ASL users.

RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY.pdf

RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY.pdf (231)

Categories:: Other

80 Views

RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY

Read more about RECOGNIZING HIGHLY VARIABLE AMERICAN SIGN LANGUAGE IN VIRTUAL REALITY
Log in to post comments

SLTAT 2023.pdf

SLTAT 2023.pdf (243)

Categories:: Other

11 Views

TOWARDS HIGH-SECURITY UBIQUITOUS IOT NETWORKS USING AI

Read more about TOWARDS HIGH-SECURITY UBIQUITOUS IOT NETWORKS USING AI
Log in to post comments

The on-going paradigm shift knocking on the door of future wireless communication system is ubiquitous Internet of Things (IoT), and the maturity of which will be hindered by the challenges related to security. Artificial intelligence (AI) is proficient in solving intractable optimization problems in a data-based way, which provides a new idea for network security and physical-layer guarantee. In this paper, we divide the ubiquitous IoT networks into cyberspace and electromagnetic space, and identify the threat models.

poster.pdf

Poster (329)

Categories:: Communications and Networking

43 Views

ViTASD: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis

Autism spectrum disorder (ASD) is a lifelong neurodevelopmental disorder with very high prevalence around the world. Research progress in the field of ASD facial analysis in pediatric patients has been hindered due to a lack of well-established baselines. In this paper, we propose the use of the Vision Transformer (ViT) for the computational analysis of pediatric ASD. The presented model, known as ViTASD, distills knowledge from large facial expression datasets and offers model structure transferability.

370.pdf

Paper (249)

370_Poster.pdf

Poster (258)

Categories:: Medical imaging

86 Views

SIGNAL PROCESSING AND QUANTUM STATE TOMOGRAPHY ON NOISY DEVICES

Read more about SIGNAL PROCESSING AND QUANTUM STATE TOMOGRAPHY ON NOISY DEVICES
Log in to post comments

Quantum State Tomography (QST) is a fundamental tool for quantum signal processing. However, in real noisy quantum devices construction of the state's density matrix via QST can utilize a large amount of resources. Here, we discuss some signal processing techniques that are currently applied to this resource issue, and implement on current quantum chips a modification that can assist in reducing resources. An application of QST to quantum entanglement distillation is provided for further insight.

! ICASSP 2023_Poster_Wenbo SHI.pdf

5548_Poster_SIGNAL PROCESSING AND QUANTUM STATE TOMOGRAPHY ON NOISY DEVICES (261)

Categories:: Signal and System Modeling, Representation and Estimation

20 Views

Class-aware Shared Gaussian Process Dynamic Model

Read more about Class-aware Shared Gaussian Process Dynamic Model
Log in to post comments

A new method of Gaussian process dynamic model (GPDM), named class-aware shared GPDM (CSGPDM), is presented in this paper. One of the most difference between our CSGPDM and existing GPDM is considering class information which helps to build the class label-based latent space being effective for the following class-related tasks. In terms of representation learning, CSGPDM is optimized by considering not only a non-linear relationship but also time-series relation and discriminative information of each class label.

icassp2023_sawata_rev00_forSigPort.pptx

The slides used for oral presentation at ICASSP 2023. (215)

ICASSP2023_Class-aware Shared Gaussian Process Dynamic Model_CameraReady_rev01_eXpressPassed.pdf

Paper pre-print (217)

Categories:: Multimodal signal processing

28 Views

A study of audio mixing methods for piano transcription in violin-piano ensembles

Read more about A study of audio mixing methods for piano transcription in violin-piano ensembles
Log in to post comments

poster_final.pdf

poster_final.pdf (238)

Categories:: Music Signal Processing

21 Views

SPEECH MODELING WITH A HIERARCHICAL TRANSFORMER DYNAMICAL VAE

Read more about SPEECH MODELING WITH A HIERARCHICAL TRANSFORMER DYNAMICAL VAE
Log in to post comments

The dynamical variational autoencoders (DVAEs) are a family of latent-variable deep generative models that extends the VAE to model a sequence of observed data and a corresponding sequence of latent vectors. In almost all the DVAEs of the literature, the temporal dependencies within each sequence and across the two sequences are modeled with recurrent neural networks.

ICASSP2023Poster.pdf

Poster (212)

Categories:: Speech Coding (SPE-CODI)

28 Views

PoGaIN: Poisson-Gaussian Image Noise Modeling from Paired Samples

Read more about PoGaIN: Poisson-Gaussian Image Noise Modeling from Paired Samples
Log in to post comments

Image noise can often be accurately fitted to a Poisson-Gaussian distribution. However, estimating the distribution parameters from a noisy image only is a challenging task. Here, we study the case when paired noisy and noise-free samples are accessible. No method is currently available to exploit the noise-free information, which may help to achieve more accurate estimations. To fill this gap, we derive a novel, cumulant-based, approach for Poisson-Gaussian noise modeling from paired image samples.

PoGaIN_Poisson-Gaussian_Image_Noise_Modeling_From_Paired_Samples.pdf

paper.pdf (193)

supplementary_material.pdf

supplementary_material.pdf (238)

poster.pdf

poster.pdf (257)

slides.pdf

slides.pdf (277)

Categories:: Image/Video Processing

44 Views

IMAGE SEGMENTATION FOR IMPROVED LOSSLESS SCREEN CONTENT COMPRESSION

Read more about IMAGE SEGMENTATION FOR IMPROVED LOSSLESS SCREEN CONTENT COMPRESSION
Log in to post comments

In recent years, it has been found that screen content images (SCI) can be effectively compressed based on appropriate probability modelling and suitable entropy coding methods such as arithmetic coding. The key objective is determining the best probability distribution for each pixel position. This strategy works particularly well for images with synthetic (textual) content. However, usually screen content images not only consist of synthetic but also pictorial (natural) regions. These images require diverse models of probability distributions to be optimally compressed.

presentationICASSP_pdf.pdf

presentationICASSP_pdf.pdf (245)

Categories:: Image, Video, and Multidimensional Signal Processing

24 Views

Pages