Artificial Intelligence has seen a major advance in the form of Large Language Models (LLMs). These models have transformed language understanding, powering Natural Language Processing (NLP) and Natural Language Understanding (NLU) applications. LLMs excel at tasks such as text summarization, question answering, content generation, and language translation, and they can follow complex prompts that involve reasoning and logic, identifying patterns and relationships in data.
Despite their impressive performance, LLMs often struggle to use external tools through API calls. Even a model as capable as GPT-4 frequently generates inaccurate input arguments and suggests inappropriate API calls. To tackle this issue, researchers from UC Berkeley and Microsoft Research have introduced Gorilla, a fine-tuned LLaMA-based model that outperforms GPT-4 at writing API calls. Gorilla improves an LLM's ability to work with external tools and to select the most suitable API for a given task.
To develop Gorilla, the researchers built APIBench, a dataset comprising a large collection of APIs with overlapping functionality. It covers APIs from the popular model hubs TorchHub, TensorHub, and HuggingFace, paired with ten synthetic user query prompts for each API. Using this dataset together with document retrieval, the researchers fine-tuned Gorilla, a 7-billion-parameter model. Gorilla surpasses GPT-4 by producing more accurate API calls with fewer errors, and its integration with a document retriever shows how LLMs can use external tools more effectively. Because the retrieved documentation can be swapped or updated, Gorilla can adapt to documentation changes at test time, which keeps its answers accurate and current as APIs evolve.
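To make the retrieval idea concrete, the sketch below shows one way retrieved API documentation can be prepended to a user query before it reaches the model. The retriever, the documentation snippets, and the prompt template here are all hypothetical stand-ins for illustration, not Gorilla's actual code or data.

```python
# Toy document store: task phrases mapped to real API documentation snippets.
# In practice the retriever would search a hub's full documentation corpus.
API_DOCS = {
    "classify an image": (
        "torchvision.models.resnet50(weights='IMAGENET1K_V2') loads a "
        "ResNet-50 pretrained on ImageNet."
    ),
    "translate to german": (
        "transformers.pipeline('translation_en_to_de', model='t5-base') builds "
        "an English-to-German translation pipeline."
    ),
}

def retrieve_doc(query: str) -> str:
    """Toy retriever: return the doc whose key shares the most words with the query."""
    def overlap(key: str) -> int:
        return len(set(key.split()) & set(query.lower().split()))
    return API_DOCS[max(API_DOCS, key=overlap)]

def build_prompt(query: str) -> str:
    """Prepend the retrieved documentation so the model can ground its API call."""
    return (
        f"{query}\n"
        f"Use this API documentation for reference: {retrieve_doc(query)}\n"
        "Respond with a single API call."
    )

print(build_prompt("translate an English sentence to German"))
```

Swapping a snippet in `API_DOCS` for a newer version is all it takes to steer the model toward updated documentation, which is the intuition behind test-time adaptation.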
An example shared by the researchers highlights Gorilla's proficiency at recognizing a task and returning the correct API. Given the same prompt, GPT-4 generated API requests for hypothetical, nonexistent models, demonstrating a lack of understanding of the task, while Claude chose the wrong library, failing to recognize the appropriate resources. Gorilla, in contrast, correctly identified the task and produced a matching API call, showing better task comprehension than both GPT-4 and Claude.
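The contrast above can be made concrete with a toy check: a hallucinated call is simply one that names an API the target hub does not have, so it can be caught by matching the generated call against a catalogue of known APIs. The `KNOWN_APIS` set and the `magiclib` call below are invented for illustration; this is a sketch of the idea, not Gorilla's evaluation code.

```python
import ast

# A tiny stand-in for a model hub's real API catalogue.
KNOWN_APIS = {
    "transformers.pipeline",
    "torchvision.models.resnet50",
    "torch.hub.load",
}

def called_name(call_src: str) -> str:
    """Extract the dotted function name from a single API-call expression."""
    func = ast.parse(call_src, mode="eval").body.func
    parts = []
    while isinstance(func, ast.Attribute):
        parts.append(func.attr)
        func = func.value
    parts.append(func.id)
    return ".".join(reversed(parts))

def is_hallucinated(call_src: str) -> bool:
    """Flag calls that name an API absent from the known catalogue."""
    return called_name(call_src) not in KNOWN_APIS

print(is_hallucinated("transformers.pipeline('translation_en_to_de')"))  # False
print(is_hallucinated("magiclib.load_model('super-translator-9000')"))   # True
```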
In summary, Gorilla is a noteworthy addition to the landscape of language models: it addresses the challenge of generating API calls accurately, directly tackling issues of hallucination and reliability.