Generative AI Series
Ongoing (Q4 '24): 5-part Generative AI Series
Create RAG systems and AI agents with Sectors Financial API, LangChain and state-of-the-art LLM models -- capable of producing fact-based financial analysis and financial-specific reasoning. **Continually updated** to keep up with the latest major versions of the tools and libraries used in the series.
Generative AI Series: Table of Contents
1. Generative AI for Finance: An overview of designing Generative AI systems for the finance industry and the motivation for retrieval-augmented generation (RAG) systems.
2. Tool-Use Retrieval Augmented Generation (RAG): A practical guide to building RAG systems that leverage information retrieval tools (known as "tool-use" or "function calling" in LLM parlance).
3. Structured Output from AIs: Whether we use Generative AI to extract information from unstructured data or to perform actions like database queries, API calls, and JSON parsing, we need schema and structure in the AI's output.
4. Tool-use ReAct Agents w/ Streaming: Updated for LangChain v0.3.2, we explore streaming, LCEL expressions and ReAct agents following the most up-to-date practices for creating conversational AI agents.
5. Conversational Memory AI Agents: Updated for LangChain v0.3.2, we dive into creating AI agents with conversational memory.
This article is part 5 of the Generative AI for Finance series and is written using LangChain 0.3.2. For best results, it is recommended to consume the series in order, starting from chapter 1. For continuity purposes, I will point out the key differences between the current version (LangChain 0.3.2, using runnables) and the older implementations featuring `LLMChain` and `ConversationChain`.

Conversational AI with Memory
Oftentimes, we design our AI agents to be conversational, allowing them to interact with users in a more human-like manner. Part 5 of the Generative AI series is on building a conversational AI agent with memory capabilities, which can "remember" past interactions in the conversation and use that information to generate more contextually relevant responses. The essential components of a memory system are:

- Memory Storage: A mechanism to store and retrieve information.
- Memory Update: A mechanism to update the memory based on new information.
- Memory Retrieval: A mechanism to retrieve information from memory.

In a conversational agent, the memory system does two things on every turn:

- Augments the user input with memory information before passing it to the model. This happens after receiving the user input but before the agent performs any processing.
- Updates the memory with the agent's response after the model has generated a response, typically before returning the response to the user. This adds information to the memory storage that future conversation turns can refer to.
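Framework-free, that per-turn loop can be sketched in a few lines of plain Python. The `fake_model` echo function is a stand-in for a real LLM call:

```python
# Minimal, framework-free sketch of the memory loop described above.
memory: list[str] = []  # memory storage: past turns as plain strings


def fake_model(prompt: str) -> str:
    # Stand-in for a real LLM call; echoes the last line of the prompt
    return "echo: " + prompt.splitlines()[-1]


def chat(user_input: str) -> str:
    context = "\n".join(memory)                      # memory retrieval
    prompt = (context + "\n" + user_input).lstrip()  # augment input with memory
    response = fake_model(prompt)
    memory.append("Human: " + user_input)            # memory update
    memory.append("AI: " + response)
    return response


chat("Hi, I'm Sam.")
chat("What did I just say?")  # this prompt now contains the first turn
```

Every framework-specific class we meet below is a more robust version of one of these three pieces.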
In older versions of LangChain, memory was typically showcased through `LLMChain` or `ConversationChain`, and the simplicity of these classes made it easy to showcase the memory system. I will first demonstrate how that is done before moving on to the newer, more flexible `RunnableWithMessageHistory` class as recommended in the latest version of LangChain (0.3.2).
Memory in LLMChain and ConversationChain

This sub-section demonstrates the memory system in LangChain's `LLMChain` and `ConversationChain` classes. As of LangChain 0.3.0 (mid-October '24), these two will yield a `LangChainDeprecationWarning`:

- The class `LLMChain` was deprecated in LangChain 0.1.17 and will be removed in 1.0. Use a `RunnableSequence` (e.g. `prompt | llm`) instead.
- The class `ConversationChain` was deprecated in LangChain 0.2.7 and will be removed in 1.0. Use `RunnableWithMessageHistory` (https://python.langchain.com/v0.2/api_reference/core/runnables/langchain_core.runnables.history.RunnableWithMessageHistory.html) instead.
Two classes are central to the legacy approach:

- The `PromptTemplate` class, which is used to define the template for the prompt. How we name the variables in the template is important, as they will be used to match the keys in the memory system.
- The `ConversationBufferMemory` class, which is a simple memory system that stores the conversation history in a buffer. It requires a `memory_key` to match the key in the prompt template.

With `{history}` in the prompt template, the memory system will store the conversation history under the key `history`, which will be used to augment the prompt before passing it to the model.
If desired, one can also manipulate the memory system by adding user or AI messages to the conversation history through the `chat_memory` attribute. Once you have your `ConversationChain` or `LLMChain` set up, you can interact with it as you would with any other chain. The memory system will automatically update the conversation history with each turn, and the model will be able to access this history in subsequent turns.
Conversational Agents through RunnableWithMessageHistory
If you're going through the Generative AI series on your own, you'd probably be reading this article closer to the end of 2024 or later. In that case, you should be using the `RunnableWithMessageHistory` class along with LCEL (LangChain Expression Language) to build your conversational AI agents. ReAct agents and LCEL are topics covered in Chapter 4: Tool-Use ReAct Agents of the series.
The key changes with LangChain 0.3.2 and above are the use of `RunnableWithMessageHistory` to construct a runnable (consistent with what we've learned in previous chapters of this series) and a more explicit way of handling message history through `InMemoryChatMessageHistory`. `RunnableWithMessageHistory` wraps around a runnable (like the ones we've seen before) with the added capability of working with chat message history, thus allowing this runnable to read and update the message history in a conversation. Unlike other runnables, `RunnableWithMessageHistory` must always be invoked with a `config` that contains the parameters for the chat message history.
Let's start with the imports and set up a runnable chain much like you've done in the previous chapters. Here, the prompt template uses the variables `history` and `question`, but your use-case may vary. The big-picture idea isn't much different from the previous examples: we create these variables to allow the memory system to augment the prompt before passing it to the model. Syntactic differences aside, the key idea is to inject, or "copy-paste", past conversational rounds into the prompt so the prompt is contextually informative.
In a production environment, you might use a persistent key-value store implementation for this message history, like `RedisChatMessageHistory` or `MongoDBChatMessageHistory`. View the full list of integration packages and providers on LangChain Providers.

With our `chain` set up, let's now:

- Create an in-memory dictionary to store the message history based on a unique session id
- Wrap our `chain` with `RunnableWithMessageHistory` to handle the message history by matching the variables in the prompt template

The `get_session_history_by_id` function retrieves the message history based on a unique session id. If the `session_id` is not found in the store, it means the user has not interacted with the agent before, and so a new `InMemoryChatMessageHistory` object is created and stored in the dictionary.
Runnable with Message History in Action
With all of that in place, let us now interact with our `with_memory` runnable to see how it performs in a conversation.
Since the session id `supertype` is not present in `store`, a new `InMemoryChatMessageHistory` object is created in our memory store under the `supertype` key. Subsequent interactions with the agent using this `session_id` will refer to this key (pointing to an object containing the conversation history).

Just as we initialized `store` as an empty dictionary, `print(store)` will show you that this dictionary now maps each `session_id` to its `InMemoryChatMessageHistory` object. Now that `store` has been updated with this new key, let's also print out the content of this new key-value pair:
Different session_id for different conversations
It does look like our AI agent handled that follow-up question well! By matching the `session_id`, it was able to identify which companies were being referred to and inject the right context from our memory store. Now that our conversation has grown a little longer, let's see if it still maintains context in the next question.
Your results may differ, as response quality is dependent on the way we set up the memory system as well as the LLM model itself. If you have been following along with your own LLM model, you might notice a difference in the quality of responses compared to the examples above. It should come as no surprise that when we try to access a different `session_id`, the agent will not be able to retrieve the conversation history from the `store` dictionary and will promptly create a new `InMemoryChatMessageHistory` object for that `session_id`, as implemented in the `get_session_history_by_id` function.
Advanced configuration for message history tracking
Recall that this is our current implementation carried over from the previous sections. To track message histories by more than a single session id, `RunnableWithMessageHistory` accepts a `history_factory_config` parameter that expects a list of `ConfigurableFieldSpec` objects. We also need to change `get_session_history` to a new factory function that I have yet to create, so let's go ahead and create it:
I've also made a small change to the `prompt` for this example, even though it's not necessary for the `history_factory_config` to work. When using `history_factory_config`, your `config` will have to match the specifications constructed with the `ConfigurableFieldSpec` objects.
The updated prompt makes use of two helper functions, `_get_stocks_of_user` and `_get_user_settings_preferences`:
Let's chat first as user Sam (id `001`), and then as user Anonymous (id `002`).
If we don't pass a `conversation_id`, it will default to `1`. This is verified by printing the `store` dictionary after the first chat:
Here, we pass a `conversation_id` of `1` explicitly, but due to the implementation of `get_session_history_by_uid_and_convoid`, it will still create a new `InMemoryChatMessageHistory` object.
Let’s verify that asking the AI for the name (user 1 introduces himself as Sam) will not work for user 2.
Even though the `conversation_id` is the same, our function is implemented in such a way that the AI agent will treat it as a separate conversation. Sam's history, meanwhile, remains intact whenever we return with his user id and a `conversation_id` of `1`:
SQLChatMessageHistory
Memory implementations vary from simple in-memory dictionaries to more complex, persistent storage systems. The exact implementation will depend on your specific use case and requirements, as well as the library you choose. To demonstrate a more persistent memory system, I will show you how to use `SQLChatMessageHistory` with SQLite.

Start by installing the `langchain-community` package, which contains the `SQLChatMessageHistory` class. As always, I recommend doing this in a virtual environment.
Import the `SQLChatMessageHistory` class and modify your `get_session_history_by_uid_and_convoid` function to use it, swapping out `InMemoryChatMessageHistory` for `SQLChatMessageHistory`.
When you call `chat(user_id, input)` for the first time, it will create a new `memory.db` file in the same directory as your script. Inside it, you will find a table named `message_store` being created for us, with columns `id`, `session_id` and `message` (the latter storing each chat message as JSON). Running `SELECT * FROM message_store` will show you the conversation history stored in the database.
New to SQL and SQLite? Once you've run the code above, a new database is created on your behalf by SQLite. This database exists on your local machine and can be accessed using a SQLite client, or directly queried using SQL commands. You can learn more about SQL in the SQL Essentials guide I wrote, but it is beyond the scope of this article.
Adding memory to prebuilt ReAct agents

We've learned about the prebuilt ReAct agents in the previous chapter. Adding in-memory capabilities to these agents is actually fairly straightforward, so let's see a bare-minimum example of how to do this.
We will use the `MemorySaver` class, which LangChain describes as an in-memory checkpoint saver. Just like the `store = {}` dictionary we used in the previous examples, this class stores its checkpoints using a `defaultdict` in memory.
I've mentioned that `create_react_agent` really requires two arguments, the `llm` model and the `tools` list, but it accepts additional keyword arguments. If you want to, you can also pass in a `state_modifier` that acts almost like a prompt (we've also seen this earlier):
A real application would implement more tools with the `@tool` decorator, but this serves as a sufficient example to demonstrate the use of a tool-using ("function calling") ReAct agent with memory capabilities.
Notice that the agent correctly called the `get_company_overview` tool. In fact, if we so desire, we can also break down each intermediary message contained in the `out['messages']` list for inspection:
- The first message is a `HumanMessage` object, which is the user's input (e.g. "Give me an overview of ADRO").
- The second message is an `AIMessage`, in which the model reads the user's input and decides on the right tools to call.
- The third message is a `ToolMessage`, which carries the output of the tool call itself (e.g. `get_company_overview`).
- The fourth message is another `AIMessage`, which is the AI agent's response to the user's input, in plain human language.
We wrap this up in a `chat()` helper, and will now proceed to ask a few follow-up questions to see the memory in action.
Challenge

Using what you've learned in this chapter, try to implement an end-to-end financial agent that is fun to use and can provide you with the latest stock information, company overviews, and even more. Here are some ideas to get you started:

- Implement 3 or more tools, each leveraging an external API to retrieve financial data
- Implement a CLI interface for your agent, or a simple web interface using any tools of your choice
- Implement a memory system that can store and retrieve conversation histories, and use it to provide contextually relevant responses