Details
Title: Multiple emotional voice conversion in Vietnamese HMM-based speech synthesis using non-negative matrix factorization
Field: Computer Science
Author: Trung-Nghia Phung
Publisher / Journal:    Year: 2017
ISSN/ISBN: 2313-626X
Abstract:

Most current text-to-speech (TTS) systems can synthesize only a single voice
with a neutral emotion. If different emotional voices are required, the system
has to be retrained with the new emotional voices, and training normally
requires a huge amount of emotional speech data, which is usually impractical.
The state-of-the-art TTS using Hidden Markov Models (HMMs), called HMM-based
TTS, can synthesize speech with various emotions by using speaker adaptation
methods. However, emotional voices both synthesized and adapted by HMM-based
TTS are "over-smooth": when the voices are over-smooth, the detailed
structures closely linked to speaker emotions may be missing. Multiple voices
can also be synthesized by combining voice conversion (VC) methods with
HMM-based TTS, but current voice conversion methods still cannot synthesize
target speech that preserves the detailed information related to the speaker
emotions of the target voice while using only a limited amount of target-voice
data. In this paper, we propose combining exemplar-based emotional voice
conversion with HMM-based TTS to synthesize multiple high-quality emotional
voices from a small amount of target data. Evaluation results on a Vietnamese
emotional speech corpus confirm the merits of the proposed method.
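The core idea of exemplar-based voice conversion with non-negative matrix factorization can be illustrated with a minimal sketch: a source utterance is decomposed as a non-negative combination of frame-aligned source exemplars, and the same activations are then applied to the paired target exemplars. The sketch below is a simplified illustration with toy random data, not the paper's implementation; the dictionary sizes, iteration count, and variable names are assumptions for demonstration only.

```python
import numpy as np

def estimate_activations(X, D, n_iter=200, eps=1e-10):
    """Estimate non-negative activations H such that X ≈ D @ H,
    using Lee-Seung multiplicative updates with the dictionary D fixed."""
    rng = np.random.default_rng(0)
    H = rng.random((D.shape[1], X.shape[1])) + eps
    for _ in range(n_iter):
        # Multiplicative update minimizing the Frobenius reconstruction error.
        H *= (D.T @ X) / (D.T @ (D @ H) + eps)
    return H

# Toy paired exemplar dictionaries (columns are frame-aligned spectral frames).
rng = np.random.default_rng(1)
D_src = rng.random((20, 8))   # exemplars from the source (e.g. neutral) voice
D_tgt = rng.random((20, 8))   # paired exemplars from the target emotional voice

# A synthetic "source" utterance built from the source exemplars.
H_true = rng.random((8, 30))
X_src = D_src @ H_true

# Conversion: estimate activations on the source dictionary,
# then reuse them with the target dictionary.
H = estimate_activations(X_src, D_src)
X_converted = D_tgt @ H
```

Because the activations select which exemplars are active at each frame, reusing them with the target dictionary transfers the frame-level detail of the target exemplars, which is why this family of methods can avoid the over-smoothing of purely statistical conversion.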