Minimizing Performance Gap: Training End-to-End ASR Models with Federated Learning

Title: Improving Automatic Speech Recognition with Federated Learning

In this blog post, the authors explore the use of Federated Learning (FL) to train End-to-End Automatic Speech Recognition (ASR) models. They examine several factors that can help minimize the performance gap between models trained with FL and their centrally trained counterparts.

The authors study the effects of the following factors:
– Adaptive optimizers
– Loss characteristics via altering Connectionist Temporal Classification (CTC) weight
– Model initialization through seed start
– Carrying over modeling setup from experiences in centralized training to FL
– FL-specific hyperparameters
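
To make the CTC weight item above concrete, here is a minimal sketch of how such a weight is commonly used in hybrid CTC/attention End-to-End ASR training: the training loss is an interpolation of a CTC loss and an attention-decoder loss. The summary does not give the authors' exact loss, so the function name `joint_loss`, its parameters, and the interpolation form are assumptions made for illustration.

```python
def joint_loss(ctc_loss, attention_loss, ctc_weight):
    """Interpolate two per-batch losses with a CTC weight in [0, 1].

    Hypothetical helper: ctc_weight = 1.0 trains on CTC alone,
    ctc_weight = 0.0 trains on the attention-decoder loss alone.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss


# Example with made-up loss values: 30% CTC, 70% attention.
loss = joint_loss(ctc_loss=2.0, attention_loss=4.0, ctc_weight=0.3)
```

Changing `ctc_weight` changes the shape of the loss each client optimizes, which is one way the loss characteristics mentioned in the list can be altered.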

The authors shed light on why some optimizers work better than others: the better ones induce smoothness in the model updates. They also summarize the applicability of algorithms and trends from prior FL work, and propose best practices for applying them to End-to-End ASR models.
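
As an illustration of what an adaptive optimizer does on the server side, the sketch below applies one Adam-style or Yogi-style step to the central model, treating the weighted average of client updates as a pseudo-gradient. This follows the commonly published FedAdam and FedYogi update rules and is not taken from the article; the function names, default hyperparameters, and the plain-list model representation are all assumptions.

```python
import math


def average_deltas(client_deltas, weights):
    """Weighted average of client updates (each update is a list of floats)."""
    total = float(sum(weights))
    dim = len(client_deltas[0])
    return [
        sum(w * d[i] for d, w in zip(client_deltas, weights)) / total
        for i in range(dim)
    ]


def sign(x):
    return (x > 0) - (x < 0)


def server_step(params, delta, m, v, kind="yogi",
                lr=0.1, beta1=0.9, beta2=0.99, tau=1e-3):
    """One server update; `delta` is the averaged client update."""
    new_params, new_m, new_v = [], [], []
    for x, d, m_i, v_i in zip(params, delta, m, v):
        m_i = beta1 * m_i + (1 - beta1) * d  # first moment (momentum)
        d2 = d * d
        if kind == "adam":
            # Exponential moving average of squared updates.
            v_i = beta2 * v_i + (1 - beta2) * d2
        elif kind == "yogi":
            # Additive change whose size is bounded by (1 - beta2) * d2.
            v_i = v_i - (1 - beta2) * d2 * sign(v_i - d2)
        else:
            raise ValueError("kind must be 'adam' or 'yogi'")
        new_params.append(x + lr * m_i / (math.sqrt(v_i) + tau))
        new_m.append(m_i)
        new_v.append(v_i)
    return new_params, new_m, new_v


# Two hypothetical clients with equal weight and a one-parameter model.
delta = average_deltas([[0.2], [0.0]], weights=[1, 1])
params, m, v = server_step([0.0], delta, m=[0.0], v=[1.0], kind="yogi")
```

The only difference between the two branches is the second-moment update. Yogi changes `v` by a bounded additive amount per round, so the effective step size varies more gradually when the size of the aggregated update shifts from round to round.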

The figure in the article shows the overlap among central model updates for the Yogi and Adam optimizers over the first 50 aggregation rounds. The wider white band along the diagonal for Yogi reflects the additional smoothing it achieves, which reduces the effect of heterogeneity among client updates.
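
A round-by-round overlap picture like the one described above can be built from a pairwise similarity matrix over the central model updates. The summary does not say which overlap measure the authors used, so the sketch below assumes cosine similarity between flattened update vectors; the function names and the toy data are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def overlap_matrix(updates):
    """Entry [i][j] compares the central update of round i with round j."""
    return [[cosine(u, v) for v in updates] for u in updates]


# Toy updates for three rounds of a two-parameter model.
rounds = [[1.0, 0.0], [1.0, 1.0], [0.0, 1.0]]
matrix = overlap_matrix(rounds)
```

Plotted as a heat map, high values close to the diagonal mean that consecutive rounds move the central model in similar directions. Under this assumed metric, a wider bright band would correspond to smoother updates.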

In conclusion, this research examines how to narrow the gap between FL-trained and centrally trained End-to-End ASR models, and it offers insights for future work in the field.
