This monograph discusses the superiority of Local Features (LFs) extracted from Bangla input speech over Mel Frequency Cepstral Coefficients (MFCCs) as input to a Multilayer Neural Network (MLN). The LF-based method comprises three stages: (i) LF extraction from the input speech, (ii) extraction of phoneme probabilities from the LFs using the MLN, and (iii) a Hidden Markov Model (HMM) based classifier that produces more accurate phoneme strings. In experiments on a Bangla speech corpus prepared by us, the LF-based Automatic Speech Recognition (ASR) system provides a higher phoneme correct rate than the MFCC-based system. Moreover, the proposed system requires fewer mixture components in the HMMs. In addition, this monograph reviews some of the key advances in several areas of automatic speech recognition. We also illustrate, by examples, how these key advances can be used for continuous speech recognition of Bangla. Finally, we elaborate on the requirements for designing successful real-world applications and address the technical challenges that must be overcome in order to reach the ultimate goal of providing an easy-to-use, natural, and flexible voice interface between people and machines.