Tool-Use ReAct Conversational Agents
By: Samuel Chan · October 10, 2024
Generative AI Series
Create RAG systems and AI agents with Sectors Financial API, LangChain and state-of-the-art LLMs -- capable of producing fact-based financial analysis and finance-specific reasoning. **Continually updated** to keep up with the latest major versions of the tools and libraries used in the series.
Generative AI Series: Table of Contents
Generative AI for Finance
Tool-Use Retrieval Augmented Generation (RAG)
Structured Output from AIs
Tool-use ReAct Agents w/ Streaming
Conversational Memory AI Agents
This article is part 4 of the Generative AI for Finance series, and is written using LangChain 0.3.0 (released on 14th September 2024).
For best results, it is recommended to consume the series in order, starting from chapter 1.
For continuity purposes, I will point out the key differences between the current version (featuring a ReAct-style agent) and the older implementations featuring AgentExecutor (in chapter 1 and chapter 2).
Tool Use LangGraph Agents
Part 4 of this series will feel familiar to readers who have gone through the materials in chapter 2, with the addition of the LangGraph library and a prebuilt ReAct agent that comes with LangGraph.
To make use of the LangGraph library, you will need to install it:
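Assuming a standard Python environment, the installation is a single pip command:

```shell
pip install -U langgraph
```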
LangGraph is a library by LangChain, intended for more complex agentic systems and greater control than LangChain’s agents. The orchestration isn’t much different from what you have seen up to this point, so let us walk through the process:
1. Setting up our secrets and retriever utility
2. Creating our tools with the `@tool` decorator
3. Using `create_react_agent` to create our agent, equipped with the tools from (2)
4. Using the LCEL syntax to set up our runnables
5. Invoking the runnables
Setting up a retriever utility
The following code should feel familiar to you if you have gone through chapter 2: tool use LLMs.
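In case you are not following along from chapter 2, here is a minimal sketch of such a retriever utility. The endpoint base URL and the `Authorization` header format are assumptions for illustration; consult the Sectors Financial API documentation for the exact values.

```python
import json
import os

import requests  # third-party: pip install requests

# The API key is read from the environment; set SECTORS_API_KEY before running
SECTORS_API_KEY = os.environ.get("SECTORS_API_KEY", "")


def retrieve_from_endpoint(url: str) -> str:
    """Fetch a Sectors Financial API endpoint and return the JSON payload as a string."""
    headers = {"Authorization": SECTORS_API_KEY}
    response = requests.get(url, headers=headers)
    response.raise_for_status()  # surface HTTP errors early
    return json.dumps(response.json())
```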
With the retriever utility in place, we can proceed to create the various tools that our agent will use. Our tools are thin wrappers around the retriever utility, designed to fetch data from Sectors Financial API based on the user’s query.
Creating information retrieval tools
With the `@tool` decorator, we turn our functions into LangChain’s structured tools, of type `<class 'langchain_core.tools.structured.StructuredTool'>`, and these tools can be used, either directly or indirectly, by our agent.
Just to see how the tools work, let us invoke them directly:
Notice that at this point we are not using the ReAct agent yet. In fact, we don’t even have a language model (neither Llama 3.1 nor GPT-4) to work with.
Here is the result of the above code:
and:
The output gives us some assurance that the tool functions are working as expected. We are calling each of the tools with the appropriate parameters, allowing each tool to fetch data from the Sectors Financial API through our utility function.
At the time of writing, Indo Tambangraya Megah Tbk (ITMG.JK) has the highest dividend yield among the companies listed on IDX (a yield of 11.45%), followed by 3 companies in the finance sector, each with a yield of around 8.5% to 9.8%. Rounding off the list is PT Prima Andalan Mandiri — owner of the Mandiri Coal brand — with a dividend yield of 7.71%.
Bringing in our LLM
Now that we have our tools in place, let us also create a prompt object and bind our tools to an LLM of our choice. Instead of invoking each tool directly as we did earlier, we will create a runnable that chains the prompt with the tool-use LLM.
Our expectation is to be able to prompt the runnable with something like “overview of BBRI” and have the agent invoke the correct tool (in this case, get_company_overview) along with the correct parameters for us.
What we observe is that the runnable has correctly identified the tool to call, and the parameters to pass to the tool. If the query requires more than one tool, we will see this reflected in the list of .tool_calls.
At this point, you might be tempted to chain the output of tool_calls into further runnables, actually calling the API for information retrieval and then structuring the output into the desired format. However, LangChain provides some utility functions to make this process easier. Recall from chapter 2 the AgentExecutor class that helps us orchestrate the tools and the LLM:
For the following sections, I will be using LangGraph’s ReAct agent instead, as it is now the recommended way to create agents in LangChain. The AgentExecutor class is still available, but its official documentation now points users to ReAct agents.
LangGraph pre-built ReAct agent
Source: ReAct: Synergizing Reasoning and Acting in Language Models
An in-depth discussion of the ReAct paper is beyond the scope of this article; I have linked to the papers in the references section (1, 2) for interested readers. The authors posit that research on large language models (LLMs) has focused either on reasoning (e.g. chain-of-thought prompting) or acting (e.g. action plan generation), but not both together.
Its own abstract reads:
We explore the use of LLMs to generate both reasoning traces and task-specific actions in an interleaved manner, allowing for greater synergy between the two: reasoning traces help the model induce, track, and update action plans as well as handle exceptions, while actions allow it to interface with external sources, such as knowledge bases or environments, to gather additional information. ReAct outperforms imitation and reinforcement learning methods by an absolute success rate of 34% and 10% respectively, while being prompted with only one or two in-context examples.
Introduction to pre-built ReAct agents
We’ll start off by importing the create_react_agent function from langgraph.prebuilt. This function takes two required arguments, plus an optional third:

- model: the LLM to use
- tools: a list of tools to bind to the agent, essentially replacing the llm.bind_tools(tools) set-up we did earlier
- state_modifier (optional): a function that modifies the state of the agent. This is useful for adding system messages or other state changes to the agent.
The object returned by create_react_agent is of type `<class 'langgraph.graph.state.CompiledStateGraph'>`, which conveniently also implements the invoke method. The most basic usage of the agent is as follows:
The .invoke() method takes a dictionary with a messages key whose value is a string. The output is a dictionary that contains, among other things, a messages key with a list of HumanMessage and AIMessage objects.
If desired, you can also visualize this agent:
A Financial Data Agent with ReAct
With the newly equipped knowledge, we can now create a financial data agent that can retrieve companies ranked by a certain metric, as well as providing overviews of companies based on their stock symbols.
Our code is slightly modified from that of the previous section, with the addition of a state_modifier and another utility function to simplify the invocation of the agent.
With the utility function in place, let’s query the agent with a few questions and see how it responds:
When invoking the tool directly (without the agent, and without any language model), we would have to manually pass the stock symbol to the get_company_overview tool. Even then, the response would be in the original format returned by the API, i.e. a JSON object, making it less readable for the end user.
With the ReAct agent, we’re now querying in natural language, and getting a far more human-friendly response, generated by the agent.
We can also query the agent for the top 5 companies ranked by a certain metric:
We observe that the agent is able to understand the query, and the instructions in state_modifier were also followed, as the agent responded with a markdown-formatted answer (ordered lists, along with bolded text).
Streaming the agent’s response
The last section of this chapter will demonstrate the idea of streaming. Streaming is a technique for making our AI feel a bit more human-like and responsive, by having it stream its response in chunks, with our program yielding each chunk as soon as it is available. This is especially valuable when the agent takes a long time to process the query, or when the underlying large language model is slow to respond.
For the purpose of this demonstration, we will be using the same query as above. Notice that we are printing the response as it comes in, rather than waiting for the entire response to be generated before displaying it.
On a given runnable:

- Calling .invoke() gets us a final response;
- Calling .stream() lets us stream messages as they occur (useful if the agent takes a while, i.e. multiple steps).
The output of the streaming code will be similar to the one below:
flush
the buffer!
Disk write and read operations are slow in comparison to RAM-based operations. One way programs can speed up their operations is by batching characters in a RAM buffer before writing them at once to disk, dramatically reducing the number of disk write operations.
The flush=True argument in print forces the buffer to be written out immediately, thus flushing the buffer. When streaming responses from our AI agent, this ensures our users see the output as soon as it is generated, rather than waiting for the buffer to fill up.
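The effect can be seen with a few lines of plain Python; the chunks below are illustrative stand-ins for tokens streamed from an agent.

```python
import time

# Illustrative chunks, standing in for tokens streamed from an agent
chunks = ["The ", "top ", "company ", "by ", "dividend ", "yield ", "is ", "ITMG.JK."]

for chunk in chunks:
    # flush=True pushes each chunk to the terminal immediately,
    # instead of letting it sit in the output buffer
    print(chunk, end="", flush=True)
    time.sleep(0.05)  # mimic the delay between streamed chunks
print()
```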
Streaming JSON and Structured Output
Streaming sounds like a really great idea, but it is also one that would fail if the output is JSON (or any structured syntax, like XML). If we were to stream a JSON object and call json.loads (or json.dumps) on the partial JSON, the parsing would fail due to the incomplete syntax.
The solution is to apply the parser to the input stream, so that it can attempt to auto-complete the partial JSON into a valid and complete JSON object.
Here is a simple example of one such implementation:
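To make the principle concrete, here is a hand-rolled, pure-Python sketch of the auto-completion idea (in a LangChain pipeline you would reach for a streaming-aware JSON output parser instead): it closes any unterminated strings, arrays, and objects before attempting to parse.

```python
import json


def complete_partial_json(fragment: str):
    """Coerce a partial JSON object into a parseable one, or return None."""
    stack = []          # closers we still owe, innermost last
    in_string = False   # are we inside an unterminated string literal?
    escaped = False
    for ch in fragment:
        if in_string:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
        elif ch in "{[":
            stack.append("}" if ch == "{" else "]")
        elif ch in "}]" and stack:
            stack.pop()
    candidate = fragment + ('"' if in_string else "") + "".join(reversed(stack))
    try:
        return json.loads(candidate)
    except json.JSONDecodeError:
        return None  # e.g. the fragment ends right after a ':'


# Simulate chunks arriving from a stream (illustrative data)
stream = ['{"symbol": "BB', 'RI", "pe": 11.2', "}"]
buffer = ""
for chunk in stream:
    buffer += chunk
    print(complete_partial_json(buffer))
```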
And when we observe the output, we will see that the JSON object is being streamed in chunks, and the parser is able to correctly infer a valid JSON object from the partial JSON syntax:
Challenge
Earn a Certificate
There is an associated challenge with this chapter. Successful completion of this challenge will earn you a certificate of completion and possibly extra rewards if you’re among the top performers.
Implement a ReAct-style agent using the code you’ve written so far. Use this agent to answer the following questions:
- True to the spirit of value investing, find the top 7 companies based on their P/E values (lower is better).
- Issue a second query to get an overview of the fourth company in the list.
You should submit the code that implements the agent, and outputs of both queries in your notebook.
For reference, at the time of this writing, the top 7 companies by P/E values returned by the query are as follows. Depending on the time you attempt this challenge, the companies and their P/E values may differ:
Learn: What is P/E value and how is it used?
If you are unfamiliar with the concept of P/E values and value-investing in general, I have an article that explains the concept in more detail. Giving it a read will help you understand the context of the challenge better.