Introduction to Large Language Models (LLMs)
Large Language Models (LLMs) have quickly become one of the most discussed technologies of the moment. Trained on large amounts of text, these models can interpret and generate human language and hold natural-sounding conversations. LLMs like GPT-3.5 have opened up opportunities in many domains, including robotics. One example is FurChat, an embodied conversational agent that pairs an LLM with an expressive robot.
The Revolutionary FurChat System
Researchers at Heriot-Watt University and Alana AI have developed FurChat, a system that functions as a receptionist, engages in open-ended conversations, and conveys emotions through facial expressions. Deployed at the National Robotarium, FurChat holds natural conversations with visitors and provides information about facilities, news, research, and upcoming events.
Features of the Furhat Robot
Furhat is a humanoid robotic bust with a three-dimensional mask that closely resembles a human face. A micro projector animates facial expressions onto the mask. Mounted on a motorized platform, Furhat can move and nod its head, which makes its interactions more lifelike. With a microphone array and speakers, Furhat can recognize and respond to human speech.
FurChat’s system has three main components: NLU (Natural Language Understanding), DM (Dialogue Management), and a custom database. The NLU component analyzes incoming text, classifies the user’s intent, and assigns a confidence score to that classification. The DM component maintains the conversational flow, sends prompts to the LLM, and processes its responses. The custom database is built by web-scraping the National Robotarium’s website and supplies data relevant to the user’s intent. Through prompt engineering and gesture parsing, FurChat generates context-aware replies whose speech is synchronized with facial expressions.
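The flow described above can be sketched roughly in Python. This is a minimal illustration, not FurChat’s actual implementation. The keyword-based classifier, the placeholder database entries, the confidence threshold, the bracketed gesture tag format, and every function name here are assumptions made for the example. The LLM is passed in as a plain callable so the sketch runs without any external service.

```python
import re
from dataclasses import dataclass

# Hypothetical stand-in for the web-scraped database: intent -> relevant text.
DATABASE = {
    "facilities": "<scraped text about facilities>",
    "events": "<scraped text about upcoming events>",
}

# Hypothetical keyword lists; a real NLU component would use a trained classifier.
INTENT_KEYWORDS = {
    "facilities": {"facilities", "lab", "labs", "building"},
    "events": {"event", "events", "happening"},
}


@dataclass
class NluResult:
    intent: str
    confidence: float


def classify_intent(text):
    """Toy NLU: pick the intent whose keywords cover the most input tokens."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    best = NluResult("unknown", 0.0)
    for intent, keywords in INTENT_KEYWORDS.items():
        score = len(tokens & keywords) / max(len(tokens), 1)
        if score > best.confidence:
            best = NluResult(intent, score)
    return best


def build_prompt(user_text, nlu, threshold=0.1):
    """Toy DM step: add database context only when the intent is confident."""
    context = ""
    if nlu.confidence >= threshold and nlu.intent in DATABASE:
        context = "Context: " + DATABASE[nlu.intent] + "\n"
    return (
        "You are a receptionist robot. Begin your reply with a gesture "
        "in square brackets, for example [smile].\n"
        + context
        + "Visitor: " + user_text + "\nRobot:"
    )


def parse_reply(reply):
    """Split an assumed '[gesture] text' reply into (gesture, speech text)."""
    match = re.match(r"^\s*\[(\w+)\]\s*(.*)$", reply, re.DOTALL)
    if match:
        return match.group(1), match.group(2).strip()
    return "neutral", reply.strip()


def respond(user_text, llm):
    """Run NLU, build the prompt, call the LLM, and parse out the gesture."""
    nlu = classify_intent(user_text)
    prompt = build_prompt(user_text, nlu)
    return parse_reply(llm(prompt))
```

In a sketch like this, the gesture returned by `respond` would drive the robot’s facial expression while the remaining text is sent to speech synthesis, which is one way to keep the two synchronized.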
To convert text to speech, Furhat uses Amazon Polly, which is available in FurhatOS. Spoken replies make the interaction with users more immersive.
In the future, the researchers aim to expand FurChat’s capabilities. They plan to enable multiuser interactions, an active area of research for receptionist robots. They also aim to address hallucinations in language models by exploring strategies such as fine-tuning the model and experimenting with direct conversation generation, which would reduce reliance on NLU components. The SIGDIAL conference will give the team a platform for presenting FurChat’s capabilities to a wider audience of peers and experts.
For more information, check out the research paper released by the researchers behind FurChat.