Some comments about the existing theory of sound with comparison to the experimental research of vector effects in real-life acoustic near fields

Authors

  • Paweł MRÓWKA, Neurosoft Sp. z o.o.
  • Ryszard MAKOWSKI, Wrocław University of Technology, Institute of Telecommunications, Teleinformatics and Acoustics

Abstract

This article presents a novel method for normalizing speaker-specific characteristics and compensating for linear transmission distortion, aimed at improving the recognition of short isolated utterances. To this end, banks of spectral transformations of the speaker's signal are applied and speakers are divided into classes. The article discusses the form of the spectral transformation, the optimization of its parameter values, the definition of the transformation banks, the selection of speaker classes, and the iterative improvement of recognition results. It also proposes a fast method of speaker class selection based on the fundamental frequency of the voice. The effectiveness of the proposed solution is validated by recognition results obtained with four versions of a recognition system using Hidden Markov Models (HMM) and mel-frequency cepstral coefficient (MFCC) parametrization.
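
As a rough illustration of the idea of selecting a speaker class from the fundamental frequency, the sketch below estimates F0 with a plain autocorrelation peak search and maps it to a class index. This is not the authors' algorithm: the estimator, the function names (estimate_f0, select_speaker_class, synth_tone) and the class boundaries of 150 Hz and 250 Hz are all assumptions chosen only for the example.

```python
import math


def estimate_f0(signal, sample_rate, f0_min=60.0, f0_max=400.0):
    """Estimate F0 in Hz from the lag of the largest autocorrelation value.

    Assumption: a plain autocorrelation search stands in for whatever
    pitch estimator the article actually uses.
    """
    lag_min = int(sample_rate / f0_max)
    lag_max = int(sample_rate / f0_min)
    best_lag, best_score = lag_min, float("-inf")
    for lag in range(lag_min, lag_max + 1):
        n = len(signal) - lag
        # Mean product of the signal with its copy shifted by `lag` samples.
        score = sum(signal[i] * signal[i + lag] for i in range(n)) / n
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag


def select_speaker_class(f0, boundaries=(150.0, 250.0)):
    """Map F0 to a class index; the boundaries are hypothetical values."""
    for index, upper in enumerate(boundaries):
        if f0 < upper:
            return index
    return len(boundaries)


def synth_tone(freq, sample_rate, num_samples):
    """Generate a pure sine tone used here as a stand-in for voiced speech."""
    return [math.sin(2.0 * math.pi * freq * t / sample_rate)
            for t in range(num_samples)]


tone = synth_tone(100.0, 8000, 800)
f0 = estimate_f0(tone, 8000)
speaker_class = select_speaker_class(f0)
```

In such a scheme the class index would then pick which bank of spectral transformations to try first. Real speech would need voicing detection and protection against octave errors, which this sketch omits.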

Keywords:

automatic speech recognition, speaker normalization, transmission distortion compensation