Recent advances in AI have led to major improvements in speech recognition, but recognizing personal names remains a challenge. In response, a new personalization solution has been developed for an end-to-end system based on connectionist temporal classification (CTC). It uses a class-based language model that provides context for named entity classes, while the personal names themselves are compiled into a separate finite-state transducer (FST). In addition, a phoneme-to-wordpiece model maps rare named entities to more frequent wordpieces, and wordpiece prior normalization biases recognition toward rare wordpieces. Together, these changes yield a 48.9% relative improvement in personal named entity accuracy on top of an already personalized baseline, allowing the end-to-end system to compete with highly personalized hybrid systems on personal named entity recognition. Because personal names remain a weak point of end-to-end speech recognition, these developments have the potential to greatly improve such systems.
Advances in AI and Speech Recognition
Recent advances in AI have led to major improvements in speech recognition. However, recognizing personal names still presents a challenge.
Personalization Solution for End-to-End AI Systems
A new personalization solution has been developed for an end-to-end system based on connectionist temporal classification (CTC). It uses a class-based language model that provides context for named entity classes, while the personal names themselves are compiled into a separate finite-state transducer (FST). In addition, a phoneme-to-wordpiece model maps rare named entities to more frequent wordpieces, and wordpiece prior normalization biases recognition toward rare wordpieces. Together, these changes yield a 48.9% relative improvement in personal named entity accuracy on top of an already personalized baseline, allowing the end-to-end system to compete with highly personalized hybrid systems on personal named entity recognition.
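Wordpiece prior normalization is not spelled out in this summary, so the following is only a minimal sketch of one common form of the idea: each wordpiece's log-posterior is reduced by a scaled log-prior, which lowers the scores of frequent wordpieces relative to rare ones. The function name `normalize_by_prior`, the scaling factor `alpha`, the three-item vocabulary, and all probabilities are hypothetical values chosen for illustration, not taken from the work described above.

```python
import math


def normalize_by_prior(log_posteriors, log_priors, alpha=0.5):
    """Sketch of prior normalization (assumed form, not the authors' exact method).

    For every frame, subtract alpha * log_prior from each wordpiece's
    log-posterior, then renormalize so the frame is a distribution again.
    """
    normalized = []
    for frame in log_posteriors:
        scores = [lp - alpha * pr for lp, pr in zip(frame, log_priors)]
        # Log-sum-exp renormalization, shifted by the max for numerical stability.
        top = max(scores)
        log_total = top + math.log(sum(math.exp(s - top) for s in scores))
        normalized.append([s - log_total for s in scores])
    return normalized


def best_index(frame):
    """Index of the highest-scoring wordpiece in one frame."""
    return max(range(len(frame)), key=lambda i: frame[i])


# Hypothetical vocabulary: a blank symbol, a frequent piece, a rare piece.
VOCAB = ["<blank>", "an", "ya"]
LOG_PRIORS = [math.log(p) for p in (0.60, 0.35, 0.05)]

# One hypothetical frame of model posteriors over VOCAB.
FRAME = [math.log(p) for p in (0.20, 0.50, 0.30)]

before = best_index(FRAME)                                   # frequent piece "an" wins
after = best_index(normalize_by_prior([FRAME], LOG_PRIORS)[0])  # rare piece "ya" wins
print(VOCAB[before], "->", VOCAB[after])
```

Setting `alpha` to 0 leaves the posteriors unchanged, while larger values favor rare wordpieces more strongly. In a real system the priors would be estimated from data, and whether the blank symbol is normalized as well is a design choice this sketch does not address.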
Significance of These Developments
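These developments matter because personal names remain a weak point of end-to-end speech recognition. By making an end-to-end system competitive with highly personalized hybrid systems on this task, they have the potential to greatly improve end-to-end speech recognition systems.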