Sorry, you need to enable JavaScript to visit this website.

NOVEL AMPLITUDE SCALING METHOD FOR BILINEAR FREQUENCY WARPING-BASED VOICE CONVERSION

Citation Author(s):
Nirmesh J. Shah and Hemant A. Patil
Submitted by:
Nirmesh Shah
Last updated:
28 February 2017 - 4:49am
Document Type:
Poster
Document Year:
2017
Event:
Presenters:
Nirmesh Shah
Paper Code:
1380
 

In frequency warping (FW)-based Voice Conversion (VC), the source spectrum is modified to match the frequency-axis of the target spectrum followed by an Amplitude Scaling (AS) to compensate the amplitude differences between the warped spectrum and the actual target spectrum. In this paper, we propose a novel AS technique which linearly transfers the amplitude of the frequency warped spectrum using the knowledge of a Gaussian Mixture Model (GMM)-based converted spectrum without adding any spurious peaks. The novelty of the proposed approach lies in avoiding a perceptual impression of wrong formant location (due to perfect match assumption between the warped spectrum and the actual target spectrum in state-of-the-art AS method) leading to deterioration in converted voice quality. From subjective analysis, it is evident that the proposed system has been preferred 33.81 % and 12.37 % times more compared to the GMM and state-of-the-art AS method for voice quality, respectively. Similar to the quality conversion trade-offs observed by other studies in the literature, speaker identity conversion was 0.73 % times more and 9.09 % times less preferred over GMM and state-of-the-art AS-based method, respectively.

up
0 users have voted: