ICASSP 2021 - IEEE International Conference on Acoustics, Speech and Signal Processing is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2021 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide an opportunity to network with like-minded professionals from around the world.

In this paper, we propose a multi-granularity feature interaction and relation reasoning network (MFIRRN) that can recover a detail-rich 3D face and perform more accurate dense alignment in unconstrained environments. Traditional 3DMM-based methods regress parameters directly, so the reconstructed 3D face lacks fine-grained details. To address this, we use different branches to capture discriminative features at different granularities, especially local features at medium and fine granularities.

Bridge weigh-in-motion (BWIM) is a technique for estimating vehicle loads on bridges and can be used to assess a bridge's structural fatigue and therefore its life.
BWIM can be realized by analyzing the bridge deflection in terms of its response to moving axle loads.
To obtain accurate load estimates, current BWIM systems require strain sensors, whose installation and re-installation costs have limited their application.
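
The idea of reading axle loads off a bridge response can be illustrated with a small least-squares sketch. This is not the method of the paper above: it assumes a hypothetical triangular influence line, a known constant vehicle speed, and known axle spacing, and the names `influence_line` and `estimate_axle_loads` are invented for this example.

```python
import numpy as np


def influence_line(x, span):
    """Hypothetical triangular influence line (an assumption, not from the paper).

    Returns 0 when the load is off the bridge and peaks at 1 at midspan.
    """
    x = np.asarray(x, dtype=float)
    inside = (x >= 0.0) & (x <= span)
    return np.where(inside, 1.0 - np.abs(2.0 * x / span - 1.0), 0.0)


def estimate_axle_loads(response, times, speed, axle_offsets, span):
    """Estimate axle loads from a measured response by linear least squares.

    The response is modeled as a sum over axles of
    load_k * influence_line(position of axle k at each time).
    """
    times = np.asarray(times, dtype=float)
    offsets = np.asarray(axle_offsets, dtype=float)
    # Position of each axle on the bridge at each sample time.
    positions = speed * times[:, None] - offsets[None, :]
    design = influence_line(positions, span)  # shape (num_samples, num_axles)
    loads, *_ = np.linalg.lstsq(design, np.asarray(response, dtype=float), rcond=None)
    return loads


# Synthetic usage: a two-axle vehicle crossing a 20 m span at 10 m/s.
times = np.linspace(0.0, 2.5, 200)
true_loads = np.array([50.0, 80.0])
positions = 10.0 * times[:, None] - np.array([0.0, 3.0])[None, :]
response = influence_line(positions, 20.0) @ true_loads
estimated = estimate_axle_loads(response, times, 10.0, [0.0, 3.0], 20.0)
```

On this noise-free synthetic response the estimate should match the loads used to generate it; real measurements would need a calibrated influence line and some handling of noise and dynamics.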

This paper shows the benefits of using Complex-Valued Neural Networks (CVNNs) on classification tasks for non-circular complex-valued datasets. Motivated by radar, and especially Synthetic Aperture Radar (SAR), applications, we propose a statistical analysis of the performance of fully connected feed-forward neural networks in cases where the real and imaginary parts of the data are correlated through the non-circular property.
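
One common way to quantify how non-circular a complex-valued sample is uses the circularity quotient, the ratio of the pseudo-variance E[z^2] to the variance E[|z|^2] of the zero-mean data. The sketch below is only an illustration of that standard statistic; the abstract does not say which measure the authors use, and the function name `circularity_quotient` is an assumption.

```python
import numpy as np


def circularity_quotient(z):
    """Return E[z^2] / E[|z|^2] for the mean-removed samples in z.

    A magnitude near 0 suggests circular data; a magnitude near 1
    suggests strongly non-circular data.
    """
    z = np.asarray(z, dtype=complex).ravel()
    z = z - z.mean()                      # remove the mean first
    variance = np.mean(np.abs(z) ** 2)    # E[|z|^2], real and positive
    pseudo_variance = np.mean(z ** 2)     # E[z^2], complex in general
    return pseudo_variance / variance


# Usage on synthetic data: independent parts versus correlated parts.
rng = np.random.default_rng(0)
x = rng.standard_normal(20000)
noise = rng.standard_normal(20000)
circular = x + 1j * noise
correlated = x + 1j * (0.9 * x + np.sqrt(1.0 - 0.9 ** 2) * noise)
rho_circular = circularity_quotient(circular)
rho_correlated = circularity_quotient(correlated)
```

With equal-variance, independent real and imaginary parts the quotient stays close to zero, while correlating the two parts pushes its magnitude toward one.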

Many commercial and forensic applications of speech demand the extraction of information about speaker characteristics, which falls into the broad category of speaker profiling. The characteristics needed for profiling include physical traits such as height, age, and gender, along with the speaker's native language. Many of the available datasets contain only partial information for speaker profiling.

State-of-the-art speaker verification systems take frame-level acoustic features as input and produce fixed-dimensional embeddings as utterance-level representations. How information is aggregated from frame-level features is therefore vital for achieving high performance. This paper introduces short-time spectral pooling (STSP) for better aggregation of frame-level information. STSP transforms the temporal feature maps of a speaker embedding network into the spectral domain and extracts the lowest spectral components of the averaged spectrograms for aggregation.
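
The pooling step described above can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation: the Hamming window, the window length, the hop size, the use of magnitude spectra, and the function name `short_time_spectral_pooling` are all choices made for this example.

```python
import numpy as np


def short_time_spectral_pooling(feature_map, win_len=8, hop=4, num_components=2):
    """Pool a (channels, frames) feature map into a fixed-length vector.

    Each channel is split into overlapping windows along time, transformed
    with a real FFT, averaged over windows, and truncated to the lowest
    `num_components` frequency bins.
    """
    feature_map = np.asarray(feature_map, dtype=float)
    channels, frames = feature_map.shape
    if frames < win_len:
        raise ValueError("feature map is shorter than one window")
    if num_components > win_len // 2 + 1:
        raise ValueError("num_components exceeds the number of frequency bins")

    starts = range(0, frames - win_len + 1, hop)
    # Shape: (channels, num_windows, win_len)
    segments = np.stack([feature_map[:, s:s + win_len] for s in starts], axis=1)
    window = np.hamming(win_len)  # assumed window type
    spectra = np.fft.rfft(segments * window, axis=-1)
    spectrogram = np.abs(spectra)        # magnitude per channel, window, bin
    averaged = spectrogram.mean(axis=1)  # average over windows (time)
    # Keep the lowest bins of every channel and flatten to one vector.
    return averaged[:, :num_components].reshape(-1)


# Usage: inputs of different durations give vectors of the same size.
short_utt = np.random.default_rng(0).standard_normal((4, 32))
long_utt = np.random.default_rng(1).standard_normal((4, 100))
pooled_short = short_time_spectral_pooling(short_utt)
pooled_long = short_time_spectral_pooling(long_utt)
```

Because the spectra are averaged over windows before truncation, the output length depends only on the number of channels and retained components, not on the utterance duration.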

