Unlocking Speech Recognition for All Languages: A Revolutionary Approach

Voice technology is now commonplace, but recognition accuracy varies widely from one language to another, which limits how inclusive it can be. A major factor is how much training data is available for each language. The effect is especially pronounced when training all-neural end-to-end automatic speech recognition (ASR) systems.

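Two techniques have proven effective at improving ASR accuracy, particularly for low-resource languages such as Ukrainian: cross-lingual knowledge transfer and iterative pseudo-labeling. Cross-lingual knowledge transfer reuses what a model has learned from better-resourced languages, while iterative pseudo-labeling repeatedly retrains a model on transcripts generated by a machine rather than by human annotators.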
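The loop behind iterative pseudo-labeling can be sketched in a few lines. The example below is a toy illustration only, not the system described in this article: it swaps the ASR model for a hypothetical one-dimensional nearest-centroid classifier so that the code stays self-contained, and the function names (`fit_centroids`, `predict`, `iterative_pseudo_labeling`) are invented for this sketch. The structure is what matters: a seed model labels the unannotated data, a new model is trained on those labels, and the cycle repeats.

```python
from statistics import mean


def fit_centroids(points, labels):
    """Train a toy nearest-centroid 'model': one mean value per label."""
    return {
        lab: mean(p for p, l in zip(points, labels) if l == lab)
        for lab in set(labels)
    }


def predict(centroids, point):
    """Return the label whose centroid lies closest to the point."""
    return min(centroids, key=lambda lab: abs(centroids[lab] - point))


def iterative_pseudo_labeling(seed_model, unlabeled, rounds=3):
    """Label the data with the current model, retrain on those labels, repeat."""
    model = seed_model
    for _ in range(rounds):
        # No human annotation is involved: the labels come from the model itself.
        pseudo_labels = [predict(model, x) for x in unlabeled]
        model = fit_centroids(unlabeled, pseudo_labels)
    return model


# Hypothetical data: a rough seed model and six unlabeled points.
seed = {"a": 0.0, "b": 10.0}
data = [1.0, 2.0, 3.0, 7.0, 8.0, 9.0]
final_model = iterative_pseudo_labeling(seed, data)
```

In a speech setting, the seed would be an existing recognizer, the points would be untranscribed audio, and the pseudo-labels would be its transcripts.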

Our goal is to train an all-neural Transducer ASR system that can replace a DNN-HMM hybrid system without using any manually annotated training data. In our tests, a Transducer trained on transcripts produced by the hybrid system reduces word error rate by 18%. Combining cross-lingual knowledge transfer with iterative pseudo-labeling brings the reduction to 35%.
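For readers unfamiliar with the metric, word error rate is the word-level edit distance between a reference transcript and the recognizer's output, divided by the number of reference words. The sketch below computes it and shows how a percentage reduction would be derived, assuming the figures above are relative reductions, which the article does not state. The function names, the sample sentences, and the 0.20 and 0.13 values are made up for illustration and are not results from the tests.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1
            insertion = curr[j - 1] + 1
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_reduction(baseline_wer, new_wer):
    """Fraction by which new_wer improves on baseline_wer."""
    return (baseline_wer - new_wer) / baseline_wer


# One dropped word out of six reference words.
wer = word_error_rate("the cat sat on the mat", "the cat sat on mat")

# Hypothetical values: going from 0.20 to 0.13 is a 35% relative reduction.
reduction = relative_reduction(0.20, 0.13)
```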
