IGEL: A German Language Model for Instruction-Based Tasks with High Accuracy

IGEL, the Instruction-tuned German large Language Model for Text, is a proof-of-concept model created to test whether a German instruction-tuned model can be built by combining existing open-source models with a German-translated instruction dataset.

The first version of IGEL is based on BigScience BLOOM, adapted to German by Malte Ostendorff. IGEL is designed for a range of natural language understanding tasks, such as sentiment analysis, translation, and question answering. The project aims for accurate and dependable output in each of these areas.

To test how well IGEL handles instruction-based tasks in German, the team took a pre-trained, German-adapted BLOOM model (about 6B parameters) and fine-tuned it on a dataset of instructions translated into German. Automatic translation can introduce errors, so the team wanted to find out whether the model could still learn to produce useful replies to instructions.

Instruct-igel-001, the currently available version, is a LoRA-tuned BLOOM-CLP Deutsch model (6.4B parameters) whose adapter weights have been merged into the base model so that it can be used directly with Hugging Face Transformers. The dataset used to train instruct-igel-001 did not undergo extensive data cleaning, filtering, or post-processing.
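Because the weights are merged, the model can in principle be loaded like any other causal language model in Transformers. The following is a minimal sketch, not the team's official usage code. It assumes the checkpoint is published on the Hugging Face Hub under the ID `philschmid/instruct-igel-001` and that the model expects a German "Anweisung/Antwort" prompt template. Both are assumptions that should be checked against the model card. The helper names `build_prompt` and `generate` are hypothetical.

```python
def build_prompt(instruction: str) -> str:
    """Wrap an instruction in an assumed 'Anweisung/Antwort' prompt template."""
    # Assumption: the template below matches the format used during fine-tuning.
    return f"### Anweisung:\n{instruction.strip()}\n\n### Antwort:"


def generate(
    instruction: str,
    model_id: str = "philschmid/instruct-igel-001",  # assumed Hub ID
    max_new_tokens: int = 128,
) -> str:
    """Load the merged checkpoint and generate a reply to one instruction."""
    # Imported lazily so build_prompt works without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)

    prompt = build_prompt(instruction)
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # Keep only the tokens generated after the prompt.
    prompt_length = inputs["input_ids"].shape[1]
    new_tokens = output_ids[0][prompt_length:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True).strip()
```

Calling `generate("Erkläre kurz, was ein Igel ist.")` would download the weights on first use. A model with 6.4B parameters needs a correspondingly large amount of memory, so a GPU or a reduced-precision load is likely to be necessary in practice.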

Instruct-igel-001 suffers from common language model issues such as hallucination, toxicity, and stereotyping. As next steps, the team plans to develop a chat model with a conversational interface and to improve data quality.

To learn more about IGEL, see the project blog or try the model.
