Enhancing Speech Emotion Recognition with Task Salient Representations and Label Uncertainty

The Use of BERT and HuBERT Models in Speech Emotion Recognition

Speech emotion recognition has made great strides with the help of models like BERT and HuBERT. However, these models produce high-dimensional representations, leading to speech emotion models with large parameter counts, which can be costly and complex. This study examines how selecting representations based on their task saliency can reduce model complexity without sacrificing performance. It also explores modeling label uncertainty to improve the model's generalization capacity and robustness, and analyzes the model's robustness against acoustic degradation.
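One common way to model label uncertainty in emotion recognition is to train against the full distribution of annotator votes rather than a single majority label. The sketch below illustrates that general idea; it is an assumption for illustration, not the paper's exact formulation, and the vote counts and class names are hypothetical.

```python
import numpy as np

def soft_labels(annotator_votes, num_classes):
    """Turn per-utterance annotator votes into a soft label distribution.

    Emotion labels are often ambiguous: annotators disagree, so a
    distribution over classes carries more information than a hard label.
    """
    counts = np.bincount(annotator_votes, minlength=num_classes)
    return counts / counts.sum()

def soft_cross_entropy(logits, target_dist):
    """Cross-entropy between the model's predicted distribution
    (given as raw logits) and the soft target distribution."""
    logits = logits - logits.max()  # shift for numerical stability
    log_probs = logits - np.log(np.exp(logits).sum())
    return -float((target_dist * log_probs).sum())

# Hypothetical example: five annotators label one utterance over four
# emotion classes (0=neutral, 1=happy, 2=sad, 3=angry);
# three vote "happy", two vote "neutral".
votes = np.array([1, 1, 1, 0, 0])
target = soft_labels(votes, num_classes=4)  # -> [0.4, 0.6, 0.0, 0.0]

# Loss for one (hypothetical) set of model logits against the soft target.
loss = soft_cross_entropy(np.array([0.2, 1.5, -0.3, -0.1]), target)
```

Because the target retains the minority "neutral" votes, the model is penalized less for spreading probability mass across plausible emotions, which is one route to the improved generalization the study describes.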

