Tool Use and Function Calling for Finance LLMs
By: Samuel Chan · August 1, 2024
Generative AI Series
Create RAG systems and AI agents with Sectors Financial API, LangChain and state-of-the-art LLMs -- capable of producing fact-based financial analysis and finance-specific reasoning. **Continually updated** to keep up with the latest major versions of the tools and libraries used in the series.
Generative AI Series: Table of Contents
Generative AI for Finance
Tool-Use Retrieval Augmented Generation (RAG)
Structured Output from AIs
Tool-use ReAct Agents w/ Streaming
Conversational Memory AI Agents
API-based RAG systems architecture
We learned in the previous chapter that we can use APIs to retrieve data from the web. In this chapter, we'll put that approach to work, retrieving financial data and using it to build a simple financial model. We've also seen how such a system falls into a broader category of systems referred to as Retrieval-Augmented Generation (RAG) models.
Rather than generating text solely from scratch, RAG models such as the one we’ll build in this chapter use information retrieved from an external data source to form the basis of their text generation. Let’s take one more look at the architecture of our API-based RAG system:
Broadly speaking, the system consists of three main components:
- Orchestrator: We will implement this in Python using the LangChain framework. This script will be responsible for taking in the user’s input and passing it to the retriever and LLM components along with some pre-defined constraints.
- Retriever: We will create tools that retrieve data from Sectors, a financial API platform. These retrievers are Python functions that take in parameters and fetch the corresponding data from our data source (the Sectors API).
- LLM: The language model we will be using is `llama3-groq-70b-8192-tool-use-preview`, which is state-of-the-art and finetuned specifically for tool use.
Llama-3-Groq-70B-Tool-Use was, at the time of its release, the highest performing model on the Berkeley Function Calling Leaderboard (BFCL), outperforming all other open-source and proprietary models. Groq reports the following results for its tool-use finetunes:
- Llama-3-Groq-70B-Tool-Use: 90.76% overall accuracy (#1 on BFCL at the time of publishing)
- Llama-3-Groq-8B-Tool-Use: 89.06% overall accuracy (#3 on BFCL at the time of publishing)
Practice: Building a RAG System
Available as a Colab notebook
This tutorial is also available as a Colab notebook. Click here to access it.
For the remainder of this section, you will need to set up your environment along with the necessary dependencies. You can do this by running the following commands:
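The exact dependency list may vary with your setup, but at a minimum you will want LangChain, its Groq integration, python-dotenv and requests:

```bash
pip install -U langchain langchain-groq python-dotenv requests
```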
In a new Python file, you should be able to initiate a request with your Sectors API key:
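As a quick smoke test, something along these lines should work. The endpoint path and header shown here are illustrative; check the Sectors API documentation for the exact values:

```python
import requests

SECTORS_API_KEY = "your-sectors-api-key"

# Illustrative endpoint; consult the Sectors API docs for the exact path
url = "https://api.sectors.app/v1/subsectors/"
headers = {"Authorization": SECTORS_API_KEY}

response = requests.get(url, headers=headers)
print(response.status_code)
print(response.json())
```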
Since we’ll be using a model hosted on Groq, we’ll also need an API key obtained from Groq. You can sign up for a free account here. With both keys in hand, you can start setting up the script and loading the keys into your environment.
I’m using `dotenv` to load my keys from a `.env` file. You can install it by running `pip install python-dotenv` and create a `.env` file in the same directory as your Python script with the following content:
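Assuming you name the variables `SECTORS_API_KEY` and `GROQ_API_KEY` (any names will do, as long as your script reads the same ones), the `.env` file looks like:

```
SECTORS_API_KEY=your-sectors-api-key
GROQ_API_KEY=your-groq-api-key
```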
And our Python script should look like this:
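A minimal sketch of that script, assuming the variable names above and an Authorization header for the Sectors API:

```python
import os

import requests
from dotenv import load_dotenv

# Load SECTORS_API_KEY and GROQ_API_KEY from the .env file
load_dotenv()

SECTORS_API_KEY = os.getenv("SECTORS_API_KEY")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")


def retrieve_from_endpoint(url: str) -> dict:
    """Convenience wrapper around the Sectors API: attach the key,
    raise on HTTP errors, and return the parsed JSON payload."""
    headers = {"Authorization": SECTORS_API_KEY}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.json()
```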
Notice that `retrieve_from_endpoint` is just a convenience function that we can use to retrieve data from the Sectors API. We are creating an abstraction layer so we can re-use this function throughout our script.
Building tools for our tool-use model
Now that we have our data retrieval function, we can start building a set of tools, each specializing in handling a specific type of query. For example, we can create a tool that retrieves financial data for a given stock, another tool that retrieves news articles for a given stock, and yet another that retrieves the financial reports for a given stock.
`langchain` provides a `Tool` class as well as a `tool` decorator that we can use to wrap any Python function and turn it into a tool. Here’s how we use it to create a grand total of three tools:
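A sketch of three such tools is shown below. The endpoint paths are illustrative (adapt them to the Sectors API documentation), and each function simply delegates the actual HTTP call to `retrieve_from_endpoint`:

```python
from langchain_core.tools import tool


@tool
def get_company_overview(stock: str) -> dict:
    """Return an overview (profile, contact details, listing information) of a company given its stock symbol."""
    url = f"https://api.sectors.app/v1/company/report/{stock}/?sections=overview"
    return retrieve_from_endpoint(url)


@tool
def get_top_companies_by_tx_volume(start_date: str, end_date: str, top_n: int = 5) -> dict:
    """Return the top companies by transaction volume between start_date and end_date (YYYY-MM-DD)."""
    url = f"https://api.sectors.app/v1/most-traded/?start={start_date}&end={end_date}&n_stock={top_n}"
    return retrieve_from_endpoint(url)


@tool
def get_daily_tx(stock: str, start_date: str, end_date: str) -> dict:
    """Return daily transaction data (closing price, volume, market cap) for a stock between start_date and end_date."""
    url = f"https://api.sectors.app/v1/daily/{stock}/?start={start_date}&end={end_date}"
    return retrieve_from_endpoint(url)


# The list of tools the orchestrator agent can choose from
tools = [get_company_overview, get_top_companies_by_tx_volume, get_daily_tx]
```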
Orchestrating our RAG system
We’ll be using the `llama3-groq-70b-8192-tool-use-preview` model from Groq. Not only is this model state-of-the-art and specialized for tool use, it is also open source and more easily accessible than proprietary models.
In the following code, we instantiate a `ChatGroq` object with the model name and the Groq API key. This gives us the `llm` object that we’ll pass, along with the `tools` created earlier, to our orchestrator agent. What’s still missing is a prompt template that we’ll use to specify some system constraints and provide overall guidance to the LLM.
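A sketch of the full wiring, prompt included, might look like this; the system message is only a placeholder that you will want to refine:

```python
from langchain_core.prompts import ChatPromptTemplate
from langchain_groq import ChatGroq
from langchain.agents import AgentExecutor, create_tool_calling_agent

# 1. Prompt template: a system message with overall guidance, the user's
#    input, and a scratchpad placeholder the agent uses for tool calls
prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "Answer the following queries factually. Whenever you return a "
            "list of names, also return the corresponding values for each "
            "name. Always answer in markdown format.",
        ),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

# 2. The LLM: a Groq-hosted Llama 3 model finetuned for tool use
llm = ChatGroq(
    temperature=0,
    model_name="llama3-groq-70b-8192-tool-use-preview",
    groq_api_key=GROQ_API_KEY,
)

# 3. Combine llm, tools and prompt into an agent
agent = create_tool_calling_agent(llm, tools, prompt)

# 4. The executor runs the agent loop; verbose=True prints intermediate steps
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```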
From top to bottom, we’re performing a few key steps:
- We create a `ChatPromptTemplate` object that contains a message for the system and a message for the user. The system message provides some guidance to the LLM on how to respond to the user’s input.
- We instantiate our LLM. This LLM is responsible for interpreting the user’s input and choosing the right tool among the candidates to best respond to the user’s query.
- We combine `llm`, `tools`, and `prompt` into an `agent` object. This agent is responsible for orchestrating the interaction between the user, the LLM, and the tools.
- Finally, we create an `AgentExecutor` object that will execute the agent and handle the interaction between the user and the system. Setting `verbose` to `True` will print out the system’s responses to the user’s queries and additional messages that the system might generate.
Financial Queries with a custom RAG
Now is where the magic happens. We can start querying our system with financial queries and see how it responds. Each question is designed to require some form of tool use and actual data retrieval, without which the system will not be able to generate a response.
The first 3 queries should be relatively straightforward given the tools we’ve created in earlier steps.
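Here they are, together with the invocation loop; the query strings are the ones discussed in the next section:

```python
query_1 = "What are the top 3 companies by transaction volume over the last 7 days?"
query_2 = (
    "Based on the closing prices of BBCA between 1st and 30th of June 2024, "
    "are we seeing an uptrend or downtrend? Try to explain why."
)
query_3 = (
    "What is the company with the largest market cap between BBCA and BREN? "
    "For said company, retrieve the email, phone number, listing date and "
    "website for further research."
)

for query in [query_1, query_2, query_3]:
    print("Question:", query)
    result = agent_executor.invoke({"input": query})
    print("Answer:", result["output"], "\n")
```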
For the remaining `query_4` and `query_5`, you will need to implement additional tools that can retrieve the historical performance of a stock since its IPO listing. The Company’s Performance since IPO section of the API documentation should provide you with the necessary information to extend your RAG system with this capability.
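A sketch of such a tool might look like the following; the endpoint path is an assumption on my part, so verify it against that section of the documentation:

```python
@tool
def get_performance_since_ipo(stock: str) -> dict:
    """Return the percentage price change of a stock since its IPO listing
    (e.g. after 7, 30, 90 and 365 days)."""
    # Assumed endpoint; see the "Company's Performance since IPO" section
    # of the Sectors API documentation for the exact path
    url = f"https://api.sectors.app/v1/listing-performance/{stock}/"
    return retrieve_from_endpoint(url)


# Register the new tool and rebuild the agent/executor with the extended list
tools = tools + [get_performance_since_ipo]
```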
Financial Agent Responses
If implemented successfully, you should see the system generating an appropriate response for each query.
On `query_1`, we asked “What are the top 3 companies by transaction volume over the last 7 days?” and the system responded with:
To arrive at this answer, the system has to interpret “last 7 days” and infer the start and end dates for the query, given that the tool `get_top_companies_by_tx_volume` requires a start and end date to retrieve the data. It then retrieves the data and has to perform a reduction operation to find the top 3 companies by transaction volume.
As a bonus, you may also want to provide a calculator tool that your RAG can use to perform the necessary summation and sorting operations to find the top 3 companies by transaction volume. In cases where it fails to interpret the intent correctly, your RAG system might instead return the top 3 entries for the last 7 days, which is probably a good approximation but not fully correct.
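One way to do this is to wrap a numerical expression evaluator as a tool; the sketch below uses the numexpr package (`pip install numexpr`), which is one option among many:

```python
import numexpr


@tool
def calculator(expression: str) -> str:
    """Evaluate a plain arithmetic expression, e.g. '2 * (17560 + 9950)'."""
    return str(numexpr.evaluate(expression).item())
```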
On `query_2`, we asked “Based on the closing prices of BBCA between 1st and 30th of June 2024, are we seeing an uptrend or downtrend? Try to explain why.” Our RAG model returned the following response:
`query_3` is considerably trickier, as it requires the system to retrieve the market cap for both BBCA and BREN, compare them, and then retrieve additional information for the company with the larger market cap. Our original query is “What is the company with the largest market cap between BBCA and BREN? For said company, retrieve the email, phone number, listing date and website for further research.” and our RAG system responded with:
As for `query_4` and `query_5`, you should expect a response similar to the one below:
Going Further: Improving the RAG System
Available as a Colab notebook
This tutorial is also available as a Colab notebook. Click here to access it.
There are numerous opportunities to improve our RAG system. Here are a few ideas to get you started:
- Adding more tools: You can add more tools to your system to handle a wider range of queries. For example, you could add tools to retrieve news articles, analyst reports, or social media sentiment for a given stock.
- Better prompts: You can experiment with different prompts to guide the LLM in generating more accurate and informative responses.
- More descriptive tools: You can make your tools more descriptive by providing additional information about the data they retrieve. This helps your orchestrator agent make better decisions about which tool to use for a given query.
- Different LLMs: You can try using different LLMs to see how they perform on your financial queries. You can experiment with different models from Groq or other providers. With how fast the field is evolving, it’s always a good idea to keep an eye on the latest models and see how they perform on your use case.
- Error handling: You can add error handling to your system to handle cases where the LLM is unable to generate a response. For example, you could have the system prompt the user for more information or provide a default response.
In the spirit of motivating you to explore further: in the model I implemented for the Generative AI for the Finance Industry workshop, my prompt template is more detailed and includes the present date, which helps the LLM infer the start and end dates when a query contains time-sensitive phrasing such as “last 7 days” or “since the start of the month”.
Here is my implementation:
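The full prompt is more detailed, but the core idea is to build the system message at runtime so that it always carries the current date. A minimal sketch of that idea:

```python
from datetime import date

from langchain_core.prompts import ChatPromptTemplate
from langchain.agents import AgentExecutor, create_tool_calling_agent

today = date.today().strftime("%Y-%m-%d")

prompt = ChatPromptTemplate.from_messages(
    [
        (
            "system",
            "You are a financial analyst answering queries using data "
            f"retrieved from the Sectors API. Today's date is {today}. "
            "When a query uses relative periods such as 'the last 7 days' or "
            "'since the start of the month', resolve them into explicit start "
            "and end dates (YYYY-MM-DD) before calling any tool. Always base "
            "your answer on retrieved data rather than guesses.",
        ),
        ("human", "{input}"),
        ("placeholder", "{agent_scratchpad}"),
    ]
)

agent = create_tool_calling_agent(llm, tools, prompt)
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
```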
Challenge
Earn a Certificate
There is an associated challenge with this chapter. Successful completion of this challenge will earn you a certificate of completion and possibly extra rewards if you’re among the top performers.
If you’re up for a challenge, I have created two exercises that you can take part in by making a copy of the Colab notebook and working through them.
When you’re done, submit your work following this guide and I will grade it.