Generative AI, particularly LLMs, has changed the way we interact with computers: we can now use our own language. Nevertheless, these tools by themselves may lack some critical properties.

In this blog post, we will see how we can embed GenAI components into chatbots to:

  • Simplify the chatbot development process
  • Create smarter chatbots
  • Control the content LLMs generate

The BESSER Bot Framework (GitHub and blog post) allows creating chatbots with an internal state machine structure and events that trigger transitions between states, moving the user from one state to another. A common event in chatbots is receiving a message from the user that conveys a specific intent. The following table compares this approach with LLM-based chatbots:

State machine + Intent Recognition | Large Language Model
Full control over the bot's behavior | Suffers from hallucinations
Explainable behavior | Black-box behavior
Data privacy | Data leakage: be careful about what data you provide to an LLM
Can embed LLM-based functionalities | Needs to be integrated with other software tools (APIs, DBs, …)
Small, free | Big, expensive

Each state in the state machine embeds a set of instructions, which can include running an LLM to generate content based on the conversation context. Combining both approaches is therefore the best option: we keep all the benefits of the state machine while leveraging the power of LLMs, in a controlled environment where we decide when and how they are used.

In this article, we briefly introduce LLM integration in chatbots and a specific use case: Retrieval Augmented Generation (RAG).

LLM integration in chatbots

[Figure: Example tasks that can be done with an LLM]

BBF comes with wrappers for state-of-the-art LLMs, facilitating their integration into a traditional chatbot. Here, we briefly illustrate how to use them with a simple example; a more extended explanation can be found in the BBF documentation.

The first step is to import the classes and create the LLM object, which we will call ‘gpt’.
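Below is a minimal sketch of this step. The import paths, the LLMOpenAI class and its parameters follow the style of the BBF documentation, but treat them as illustrative and check the current docs for the exact API.

```python
# Illustrative sketch: module paths and class names may differ in your BBF version.
from besser.bot.core.bot import Bot
from besser.bot.nlp.llm.llm_openai_api import LLMOpenAI

bot = Bot('nutrition_bot')            # example bot used throughout this post
bot.load_properties('config.ini')     # OpenAI API key and other properties

# Wrapper around an OpenAI chat model; we will refer to it as 'gpt'
gpt = LLMOpenAI(
    bot=bot,
    name='gpt-4o-mini',               # any supported OpenAI model
    parameters={},                    # extra generation parameters
    num_previous_messages=10          # chat history sent to the LLM
)
```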

 

The LLM we just created can be used for different purposes within a chatbot. The following example shows how to use it as a powerful intent classifier:
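As a sketch (the LLMIntentClassifierConfiguration class and its options are taken from the BBF documentation style; verify the exact import path and parameters in your version):

```python
# Illustrative: import path and parameter names may differ in your BBF version.
from besser.bot.nlp.intent_classifier.intent_classifier_configuration import LLMIntentClassifierConfiguration

# Use the LLM to classify user intents instead of a traditional ML classifier
ic_config = LLMIntentClassifierConfiguration(
    llm_name='gpt-4o-mini',           # the LLM registered in the bot
    use_intent_descriptions=True,     # classify based on intent descriptions
    use_training_sentences=False      # no training sentences needed
)
bot.set_default_ic_config(ic_config)
```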

Our example bot will recognize the following intent from the user. We define the intent with a short description and then the transition from the bot's initial state to the state in charge of storing the data once the intent is received.
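A minimal sketch (the intent and state names are invented for this example; new_intent, new_state and when_intent_matched_go_to follow the BBF documentation style, so double-check the exact signatures):

```python
# Illustrative: intent and state names are invented for this example.
initial_state = bot.new_state('initial_state', initial=True)
store_meal_state = bot.new_state('store_meal_state')

# With an LLM-based intent classifier, a short description is enough;
# no training sentences are required.
ate_something_intent = bot.new_intent(
    'ate_something_intent',
    description='The user tells the bot about some food or drink they just had'
)

# Move to the storing state once the intent is recognized
initial_state.when_intent_matched_go_to(ate_something_intent, store_meal_state)
```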

Now let’s define the body of the state where the LLM is used. Here, after the user tells the bot about some food they ate, the bot stores its nutrients (calories, proteins, etc.) in a database.

The purpose of the LLM here is to reduce the complexity of the bot and simplify the interaction with the user. We just ask for a meal and then delegate to the LLM the task of identifying its nutrients:
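A sketch of what such a state body could look like (session.reply, the LLM's predict method and set_body follow the BBF documentation style; the JSON prompt and the store_nutrients helper are purely illustrative placeholders for the real database logic):

```python
# Illustrative: method names may differ in your BBF version.
import json
from besser.bot.core.session import Session

def store_nutrients(meal: str, nutrients: dict) -> None:
    # Placeholder for the real database logic (e.g., an INSERT into a meals table)
    print(meal, nutrients)

def store_meal_body(session: Session):
    meal = session.message  # the user message that triggered the transition
    # Delegate the nutrient extraction to the LLM
    answer = gpt.predict(
        f"Return a JSON object with the estimated calories, proteins, fats and "
        f"carbohydrates of the following meal: {meal}"
    )
    nutrients = json.loads(answer)   # naive parsing, for illustration only
    store_nutrients(meal, nutrients)
    session.reply(f"Got it! I stored the nutrients of your meal: {nutrients}")

store_meal_state.set_body(store_meal_body)
store_meal_state.go_to(initial_state)  # go back to the initial state afterwards
```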

 

 

Retrieval Augmented Generation (RAG)

[Figure: Pipeline of the RAG process]

 

Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models (LLMs) with facts fetched from external sources.

The benefits of RAG include:

  • Updated information: provide LLMs with the latest information (LLMs are trained on data only up to a certain date)
  • Domain-specific knowledge: access information not seen during training, without the need to fine-tune the LLM
  • Factual grounding: reduce hallucinations by providing LLMs with access to a knowledge base

A common use case is to have a set of documents (e.g., PDF files) that we want to use as a knowledge source for an LLM. To achieve this, we store the documents in a vector database using an embeddings model. Then, given a user query, a retriever gets the most relevant document fragments, which are added to the LLM prompt as additional context to generate the answer. A detailed explanation of RAG can be found in the BBF documentation.

The BBF RAG component wraps all the complexities, providing simple methods to get answers with RAG.

It is built on top of LangChain, a powerful Python library that, among many other things, provides wrappers for the main technologies needed for RAG. To create our RAG object, we simply need a vector store and a splitter (both from LangChain).
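A minimal sketch, assuming a Chroma vector store with OpenAI embeddings and a RecursiveCharacterTextSplitter (any LangChain-compatible vector store and splitter should work); the RAG class and its parameters follow the BBF documentation style, so verify the exact names in your version:

```python
# Illustrative: module paths and parameter names may differ in your versions
# of BBF and LangChain.
from langchain_community.vectorstores import Chroma
from langchain_openai import OpenAIEmbeddings
from langchain.text_splitter import RecursiveCharacterTextSplitter
from besser.bot.nlp.rag.rag import RAG

vector_store = Chroma(
    embedding_function=OpenAIEmbeddings(),
    persist_directory='vector_store'   # where the embedded chunks are persisted
)
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)

rag = RAG(
    bot=bot,
    vector_store=vector_store,
    splitter=splitter,
    llm_name='gpt-4o-mini',   # LLM used to generate the final answer
    k=4                       # number of document chunks retrieved per query
)

# Ingest our knowledge source (a folder with PDF files) into the vector store;
# the method name is an assumption, check the BBF docs for the exact ingestion API.
rag.load_pdfs('./pdfs')
```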

 

 

Then, the RAG we just created can be used within any bot state, as in the following example:
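A sketch of such a state (session.run_rag and the answer attribute of the returned message follow the BBF documentation style; double-check the exact API):

```python
# Illustrative: method and attribute names may differ in your BBF version.
rag_state = bot.new_state('rag_state')

def rag_body(session: Session):
    # Run the full RAG pipeline on the last user message: retrieve the most
    # relevant chunks and generate an answer with the LLM.
    rag_message = session.run_rag()
    session.reply(rag_message.answer)

rag_state.set_body(rag_body)
```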

 

 

It simply runs the RAG pipeline, implicitly taking the last user message as input, and replies to the user with the generated answer.

Featured image by macrovector on Freepik
