NOVEL AMPLITUDE SCALING METHOD FOR BILINEAR FREQUENCY WARPING-BASED VOICE CONVERSION
- Submitted by: Nirmesh Shah
- Last updated: 28 February 2017 - 4:49am
- Document Type: Poster
- Document Year: 2017
- Presenters: Nirmesh Shah
- Paper Code: 1380
In frequency warping (FW)-based voice conversion (VC), the source spectrum is warped to match the frequency axis of the target spectrum, and amplitude scaling (AS) is then applied to compensate for the amplitude differences between the warped spectrum and the actual target spectrum. In this paper, we propose a novel AS technique that linearly transfers the amplitude of the frequency-warped spectrum using knowledge of a Gaussian Mixture Model (GMM)-based converted spectrum, without adding any spurious peaks. The novelty of the proposed approach lies in avoiding the perceptual impression of a wrong formant location, which arises in the state-of-the-art AS method from its assumption of a perfect match between the warped spectrum and the actual target spectrum, and which deteriorates the converted voice quality. In subjective tests of voice quality, the proposed system was preferred 33.81 % more often than the GMM-based method and 12.37 % more often than the state-of-the-art AS method. Similar to the trade-offs between quality and conversion observed in other studies in the literature, for speaker identity conversion the proposed system was preferred 0.73 % more often than the GMM-based method and 9.09 % less often than the state-of-the-art AS-based method.
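The two-stage pipeline described above (frequency warping followed by amplitude scaling) can be illustrated with a minimal sketch. It assumes the standard first-order all-pass (bilinear) warping function with a single parameter `alpha`, applied to a log-amplitude spectrum sampled uniformly on [0, π]. The function names, the linear interpolation, and the additive log-domain correction are illustrative assumptions; in particular, `apply_amplitude_scaling` is a generic placeholder and not the AS rule proposed in the paper.

```python
import math


def bilinear_warp(omega, alpha):
    """Bilinear (first-order all-pass) warping of a frequency in [0, pi].

    Assumes |alpha| < 1; alpha > 0 moves frequencies upward.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


def warp_log_spectrum(log_spec, alpha):
    """Warp a log-amplitude spectrum sampled uniformly on [0, pi].

    For each output bin, look up the source frequency that maps onto it
    and linearly interpolate the source spectrum there (needs >= 2 bins).
    """
    n = len(log_spec)
    warped = []
    for k in range(n):
        omega_out = math.pi * k / (n - 1)
        # The inverse of the warp with alpha is the warp with -alpha.
        omega_in = bilinear_warp(omega_out, -alpha)
        pos = min(max(omega_in / math.pi * (n - 1), 0.0), float(n - 1))
        lo = min(int(pos), n - 2)
        frac = pos - lo
        warped.append((1.0 - frac) * log_spec[lo] + frac * log_spec[lo + 1])
    return warped


def apply_amplitude_scaling(warped, correction):
    """Hypothetical placeholder: add a per-bin correction in the log domain.

    This is NOT the paper's AS method, only the generic additive form.
    """
    return [w + c for w, c in zip(warped, correction)]
```

With `alpha = 0` the warp is the identity, so `warp_log_spectrum` returns the input spectrum unchanged. How the correction vector is derived from the GMM-converted spectrum is the contribution of the paper and is not reproduced here.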