ICASSP is the world’s largest and most comprehensive technical conference focused on signal processing and its applications. The ICASSP 2020 conference will feature world-class presentations by internationally renowned speakers and cutting-edge session topics, and will provide a fantastic opportunity to network with like-minded professionals from around the world.

Recently, a hybrid analog-digital architecture has been proposed for multiuser MIMO transmission in the millimeter-wave spectrum using reflect-arrays. The architecture is scalable and highly energy-efficient while keeping the transmitter cost-efficient. Inspired by this architecture, we design a secure multiuser hybrid analog-digital precoding scheme. This scheme uses the method of regularized least-squares to shape the downlink beamformers, such that the signal received at malicious terminals is effectively suppressed.
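
As a rough illustration of how a regularized least-squares criterion can suppress leakage, consider the following generic formulation. It is a sketch under stated assumptions, not the scheme from the abstract: the symbols H (channel to the K legitimate users), G (channel to the malicious terminals), W (precoder for N transmit elements), and the weights \mu and \lambda are all introduced here for illustration, and the hardware constraints of the reflect-array analog stage are ignored.

```latex
% Hypothetical notation: H is K x N, G is M x N, W is N x K; \mu, \lambda > 0.
\begin{align}
  W^\star &= \arg\min_{W}\;
      \lVert H W - I_K \rVert_F^2
      + \mu \lVert G W \rVert_F^2
      + \lambda \lVert W \rVert_F^2 \\
  % Setting the gradient with respect to W to zero:
  0 &= H^{\mathsf H}(H W^\star - I_K) + \mu\, G^{\mathsf H} G\, W^\star + \lambda W^\star \\
  W^\star &= \bigl( H^{\mathsf H} H + \mu\, G^{\mathsf H} G + \lambda I_N \bigr)^{-1} H^{\mathsf H}
\end{align}
```

In this simplified form, increasing \mu trades accuracy at the legitimate users for a smaller received signal GW at the malicious terminals, while \lambda limits the transmit power.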

In this paper, we revisit the popular affinity matrix based on the anchor graph and point out that the spectral embedding obtained using the symmetric normalized Laplacian is only a side view of the bipartite structure. Based on this analysis, we propose the Fast Spectral Clustering based on the Random Walk Laplacian (FRWL) method to explicitly balance the popularity of anchors and the independence of data points, which is especially important for clustering boundary points.

This paper investigates state-of-the-art Transformer- and FastSpeech-based high-fidelity neural text-to-speech (TTS) with full-context label input for pitch accent languages. The aim is to realize faster training than conventional Tacotron-based models. Introducing phoneme durations into Tacotron-based TTS models improves both synthesis quality and stability.

The estimation of the frequencies of multiple complex sinusoids in the presence of noise is required in many applications such as sonar, speech processing, communications, and power systems. This problem can be reformulated as a separable nonlinear least squares (SNLLS) problem. In this paper, this formulation is derived, and a variable projection (VP) optimization is proposed for solving the SNLLS problem and estimating the frequency parameters. We also apply a lethargy-type theorem to quantify the difficulty of the optimization.
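
The variable projection idea can be sketched in a few lines of Python. The linear amplitudes are eliminated by projecting the data onto the span of the candidate sinusoids, so only the frequencies remain as unknowns. This is a minimal sketch, not the optimizer from the abstract: it swaps in an exhaustive grid search for the actual optimization, and the function names and the `grid_size` parameter are hypothetical.

```python
import cmath
import itertools
import math


def steering(omega, n_samples):
    """Samples of one complex sinusoid exp(j * omega * n)."""
    return [cmath.exp(1j * omega * n) for n in range(n_samples)]


def inner(u, v):
    """Hermitian inner product sum(conj(u) * v)."""
    return sum(a.conjugate() * b for a, b in zip(u, v))


def vp_residual(omegas, y):
    """Variable projection functional: squared norm of y minus its
    projection onto the span of the sinusoids at the given frequencies."""
    basis = []
    for omega in omegas:
        v = steering(omega, len(y))
        for q in basis:  # modified Gram-Schmidt orthogonalization
            c = inner(q, v)
            v = [vi - c * qi for vi, qi in zip(v, q)]
        norm = math.sqrt(inner(v, v).real)
        if norm < 1e-9:
            return float("inf")  # (near-)duplicate frequencies
        basis.append([vi / norm for vi in v])
    r = list(y)
    for q in basis:  # remove the component of y along each basis vector
        c = inner(q, r)
        r = [ri - c * qi for ri, qi in zip(r, q)]
    return inner(r, r).real


def estimate_frequencies(y, n_freqs, grid_size=64):
    """Grid search over frequency tuples (assumption: a stand-in for a
    proper nonlinear optimizer, practical only for small n_freqs)."""
    grid = [2 * math.pi * k / grid_size for k in range(grid_size)]
    return min(itertools.combinations(grid, n_freqs),
               key=lambda w: vp_residual(w, y))
```

The search space has one dimension per frequency instead of one per frequency plus one per complex amplitude, which is the main appeal of the separable formulation.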

The availability and quality of channel state information heavily influence the performance of wireless communication systems. For perfect channel knowledge, optimal signal processing and coding schemes are well studied and closed-form solutions are often known. On the other hand, the case of imperfect channel information is much less understood and closed-form solutions remain unknown in general.

In this paper, we present a Small Energy Masking (SEM) algorithm, which masks inputs having values below a certain threshold. More specifically, a time-frequency bin is masked if the filterbank energy in this bin is less than a certain energy threshold. A uniform distribution is employed to randomly generate the ratio of this energy threshold to the peak filterbank energy of each utterance in decibels. The unmasked feature elements are scaled so that the total sum of the feature values remains the same through this masking procedure.
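
The masking procedure described above can be sketched as follows. This is a minimal illustration under several assumptions not stated in the abstract: masked bins are set to zero, the rescaling is applied directly to the filterbank energies, and the bounds `low_db` and `high_db` of the uniform distribution are hypothetical placeholder values.

```python
import random


def small_energy_masking(energies, low_db=-80.0, high_db=0.0, rng=None):
    """Mask low-energy time-frequency bins of one utterance.

    energies: list of frames, each a list of non-negative filterbank energies.
    low_db, high_db: assumed bounds (in dB, at most 0) for the random
    ratio of the energy threshold to the peak energy.
    """
    rng = rng or random.Random()
    # Draw the threshold-to-peak ratio in decibels from a uniform distribution.
    ratio_db = rng.uniform(low_db, high_db)
    peak = max(max(frame) for frame in energies)
    threshold = peak * 10.0 ** (ratio_db / 10.0)

    # Zero every bin whose energy is below the threshold (assumed masking value).
    masked = [[e if e >= threshold else 0.0 for e in frame] for frame in energies]

    # Rescale the surviving bins so that the total sum is unchanged.
    total_before = sum(sum(frame) for frame in energies)
    total_after = sum(sum(frame) for frame in masked)
    scale = total_before / total_after if total_after > 0 else 1.0
    return [[e * scale for e in frame] for frame in masked]
```

Because the ratio is at most 0 dB, the peak bin always survives, so the rescaling step is well defined for any utterance with nonzero energy.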

Noisy measurements of a physical unclonable function (PUF) are used to store secret keys with reliability, security, privacy, and complexity constraints. A new set of low-complexity, multiplication-free orthogonal transforms is proposed to obtain bit-error probability results significantly better than all methods previously proposed for key binding with PUFs. The uniqueness and security performance of a transform selected from the proposed set are shown to be close to optimal.

In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations. We adopt a recently proposed unsupervised adversarial invariance architecture to train a network that maps speaker embeddings extracted using a pre-trained model onto two lower-dimensional embedding spaces. The embedding spaces are learnt to disentangle speaker-discriminative information from all other information present in the audio recordings, without supervision about the acoustic conditions.
