Details
Title Adapting bottle neck feature to multi space distribution for Vietnamese speech recognition
Field Computer Science
Authors Nguyen Van Huy, Luong Chi Mai, Vu Tat Thang
Publisher / Journal Conference of the Oriental chapter of the International Coordinating Committee on Speech Databases and Speech I/O Systems and Assessment Year 2014
ISSN/ISBN number
Abstract

This paper presents a new approach for integrating the bottleneck feature (BNF), used here to extract tone information, into the Multi Space Distribution Hidden Markov Model (MSD-HMM) for Vietnamese automatic speech recognition (Vietnamese ASR). First, to improve the performance of the tonal feature, we present a procedure for extracting it with a bottleneck Multilayer Perceptron (MLP) network; the result is called the tonal bottleneck feature (TBNF). Second, we describe an approach for adapting TBNF to the MSD-HMM model. A new system was trained with an appropriate BNF size and MLP hidden-layer topology for tone recognition. Experiments on the new recognition system with TBNF integration compare it to 1/ a baseline system using MFCC features and a standard five-state HMM prototype, and 2/ an MSD-HMM system using widely used pitch features such as the Average Magnitude Difference Function (AMDF). Recognition accuracy on the test set is 80.69%, an improvement of 2.38% over the baseline system and of 0.32% over the best MSD-HMM system using the standard AMDF pitch feature.
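For readers unfamiliar with the AMDF pitch feature used as a comparison point, the following is a minimal sketch of the general technique, not the authors' implementation. The abstract does not give the AMDF configuration, so the function names `amdf` and `estimate_f0`, the search range, and the synthetic test frame are all assumptions made for illustration.

```python
import math

def amdf(frame, lag):
    """Average magnitude difference between a frame and itself shifted by `lag` samples."""
    n = len(frame) - lag
    return sum(abs(frame[i] - frame[i + lag]) for i in range(n)) / n

def estimate_f0(frame, sample_rate, f0_min, f0_max):
    """Return the F0 (Hz) whose lag minimises the AMDF within the search range.

    The range [f0_min, f0_max] is an assumed parameter; it must be narrow
    enough that the frame is longer than the largest lag tested.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag = min(range(lag_min, lag_max + 1), key=lambda lag: amdf(frame, lag))
    return sample_rate / best_lag

# Synthetic voiced frame (assumption): a 200 Hz sine, 8 kHz sampling, 40 ms long.
sample_rate = 8000
frame = [math.sin(2 * math.pi * 200 * t / sample_rate) for t in range(320)]

# Search 120-400 Hz, i.e. lags 20 to 66 samples, which contains only one
# full period (40 samples) of the test tone.
f0 = estimate_f0(frame, sample_rate, f0_min=120.0, f0_max=400.0)
print(f0)
```

The AMDF dips toward zero at lags equal to multiples of the pitch period, so the search range in this sketch is kept below two periods of the test tone to avoid picking a sub-octave. Unvoiced frames have no such dip, which is why pitch streams of this kind are typically paired with a model such as MSD-HMM that can represent frames without an F0 value.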