
Targeted Distillation: Maximizing Skills and Transparency of Student Models for NER Applications


**ChatGPT and the Advantages of Targeted Distillation**

ChatGPT and other large language models (LLMs) have impressive generalization abilities, but their training and inference costs are often prohibitive. Moreover, white-box access to model weights and inference probabilities is essential for transparency and confidence in critical applications such as healthcare, and commercial LLMs do not provide it. That's why targeted distillation has gained popularity as a way to create more affordable and transparent student models that mimic ChatGPT's skills.

This research focuses on targeted distillation, in which student models are trained through mission-focused instruction tuning for a broad application class. The goal is to replicate the LLM's capabilities for that application class while preserving generalizability within it. The researchers chose named entity recognition (NER) as a case study, since it is a fundamental problem in natural language processing.
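To make the idea concrete, here is a minimal sketch of what a conversation-style NER instruction-tuning example might look like. The template wording and field names are illustrative assumptions, not the paper's verbatim format:

```python
# Sketch of one conversation-style NER training example: the "user" turns
# supply a passage and ask about one entity type, and the "assistant" turn
# lists the matching spans. Template wording is an illustrative assumption.

def build_ner_example(passage: str, entity_type: str, entities: list[str]) -> list[dict]:
    """Turn a (passage, entity type, entity spans) triple into a chat-style example."""
    return [
        {"role": "user", "content": f"Text: {passage}"},
        {"role": "assistant", "content": "I've read this text."},
        {"role": "user", "content": f"What describes {entity_type} in the text?"},
        {"role": "assistant", "content": str(entities)},
    ]

example = build_ner_example(
    passage="Aspirin reduced fever in the treated patients.",
    entity_type="drug",
    entities=["Aspirin"],
)
print(example)
```

A collection of such examples, one per (passage, entity type) pair, is what the student model is then fine-tuned on.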

While LLMs still trail advanced supervised systems on entity recognition, those supervised systems depend on annotated examples that are costly and time-consuming to create. Targeted distillation offers a way around this: ChatGPT is used to generate instruction-tuning data for NER from unlabeled online text, and that data is used to train the student. The result is the UniversalNER models, which outperform other instruction-following student models such as LLaMA and Alpaca.
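The data-construction step can be sketched as follows, using the OpenAI Python client. The prompt wording, model choice, and JSON output format here are assumptions made for illustration; the paper's exact prompt may differ:

```python
# Sketch of the data-construction step: ask a ChatGPT-class model to extract
# entities (with their types) from a chunk of unlabeled web text.
# Prompt wording, model name, and output schema are illustrative assumptions.
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def annotate(passage: str) -> list[dict]:
    """Return a list of {"entity": ..., "type": ...} dicts proposed by the LLM."""
    prompt = (
        "Extract all named entities from the text below and output a JSON "
        'array of objects with keys "entity" and "type".\n\n'
        f"Text: {passage}"
    )
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        temperature=0,
    )
    return json.loads(response.choices[0].message.content)

# Example: annotate a single unlabeled sentence scraped from the web.
print(annotate("The Eiffel Tower was completed in Paris in 1889."))
```

Running this over a large corpus of unlabeled passages yields the entity-typed annotations that are reformatted into the instruction-tuning examples shown earlier.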

The UniversalNER model achieves state-of-the-art NER accuracy across tens of thousands of entity types spanning many domains, and it also surpasses state-of-the-art multi-task instruction-tuned systems. The researchers ran extensive ablations to evaluate the effect of each distillation component, and they release their distillation recipe, data, and the UniversalNER model to support further study of targeted distillation.

**Conclusion and Further Resources**

Targeted distillation offers a more affordable and transparent alternative to large language models like ChatGPT. The UniversalNER model demonstrates impressive NER accuracy across various disciplines and outperforms other student models. To learn more about this research, you can check out the paper, GitHub repository, and project page. Additionally, be sure to join the ML SubReddit, Facebook community, Discord channel, and subscribe to the email newsletter for the latest AI research news and projects.


**Sources:**

– [Paper](https://arxiv.org/abs/2308.03279)
– [GitHub](https://github.com/universal-ner/universal-ner)
– [Project Page](https://universal-ner.github.io/#)

