Poster
poster 11535
- DOI:
- 10.60864/9fw2-3x16
- Citation Author(s):
- Submitted by:
- Changheng Li
- Last updated:
- 6 June 2024 - 10:54am
- Document Type:
- Poster
Acoustic-scene-related parameters such as relative transfer functions (RTFs) and power spectral densities (PSDs) of the target source, late reverberation, and ambient noise are essential yet challenging to estimate. Existing methods typically estimate only a subset of the parameters and assume that the others are known. This can lead to mismatched scenarios and reduced estimation performance. Moreover, many methods process time frames independently, even though the frames share common information such as the same RTF. In this work, we consider a noisy scenario by modelling the noise component as a spatially homogeneous sound field with a time-invariant spatial coherence matrix and a time-varying PSD. We first modify an existing alternating least squares (ALS) method to obtain more accurate estimates using a single time frame. Then, we extend the method to use multiple time frames that share the same RTF. Furthermore, we propose more robust constraints on the PSDs to avoid large estimation errors. We compare our proposed methods to several reference methods, including the state-of-the-art simultaneous confirmatory factor analysis (SCFA) method, a recently developed joint maximum likelihood estimation (JMLE) method, and an existing ALS-based method. The experimental results in terms of estimation accuracy, noise reduction performance, predicted speech quality, and predicted speech intelligibility demonstrate that our proposed ALS-based methods perform similarly to the state-of-the-art SCFA method. Both the proposed ALS-based methods and the SCFA method outperform the existing ALS-based method in all scenarios and outperform the JMLE method, particularly in low-SNR scenarios. Moreover, in terms of computational complexity, our proposed methods are less complex than all reference methods. This is confirmed by the measured processing time, which is significantly lower than that of SCFA.
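To make the signal model in the abstract more concrete, the following is a minimal sketch, not the authors' ALS algorithm. It assumes that the microphone covariance matrix in each time frame is a sum of a rank-1 target term, a late-reverberation term, and a noise term, each scaled by its own PSD, and that the RTF vector and both coherence matrices are already known. Under those assumptions the three PSDs follow from a small linear least-squares fit, which is roughly what one PSD update with the other parameters held fixed might look like. All names (`estimate_psds`, `rtf`, `gamma`, `psi`) and all numeric values are hypothetical, and the simple clipping to non-negative values only stands in for the more robust constraints mentioned above.

```python
# Sketch only. Assumed model per frame: R = phi_s * a a^H + phi_r * Gamma + phi_v * Psi.
# Function names and numbers are illustrative, not taken from the poster.

def outer(a):
    """Rank-1 matrix a a^H built from the (assumed known) RTF vector a."""
    return [[ai * aj.conjugate() for aj in a] for ai in a]

def inner(A, B):
    """Real Frobenius inner product Re tr(A^H B)."""
    return sum((x.conjugate() * y).real
               for ra, rb in zip(A, B) for x, y in zip(ra, rb))

def combine(coeffs, mats):
    """Weighted sum of equally sized matrices."""
    n = len(mats[0])
    return [[sum(c * m[i][j] for c, m in zip(coeffs, mats))
             for j in range(n)] for i in range(n)]

def solve(G, b):
    """Gaussian elimination with partial pivoting for a small real system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(G, b)]
    for k in range(n):
        p = max(range(k, n), key=lambda r: abs(M[r][k]))
        M[k], M[p] = M[p], M[k]
        for r in range(k + 1, n):
            f = M[r][k] / M[k][k]
            for c in range(k, n + 1):
                M[r][c] -= f * M[k][c]
    x = [0.0] * n
    for k in reversed(range(n)):
        x[k] = (M[k][n] - sum(M[k][c] * x[c] for c in range(k + 1, n))) / M[k][k]
    return x

def estimate_psds(R, basis):
    """Least-squares PSDs via the normal equations, clipped at zero."""
    G = [[inner(Bi, Bj) for Bj in basis] for Bi in basis]
    b = [inner(Bi, R) for Bi in basis]
    return [max(p, 0.0) for p in solve(G, b)]

# Three microphones; the RTF is shared by all frames, the PSDs vary per frame.
rtf = [1 + 0j, 0.5 + 0.5j, -0.3j]
gamma = [[1.0, 0.6, 0.2], [0.6, 1.0, 0.6], [0.2, 0.6, 1.0]]  # late reverberation
psi = [[1.0, 0.1, 0.0], [0.1, 1.0, 0.1], [0.0, 0.1, 1.0]]    # ambient noise
basis = [outer(rtf), gamma, psi]

true_psds = [[2.0, 0.5, 0.1], [0.8, 0.3, 0.2]]  # one row per time frame
estimates = [estimate_psds(combine(p, basis), basis) for p in true_psds]
```

Because the covariance matrices here are built exactly from the assumed model, the fit recovers the per-frame PSDs up to rounding error. With covariance matrices estimated from real data the fit would only be approximate, and the RTF and coherence matrices would themselves need to be estimated.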