Last week I introduced the concept of functions to allow GPT to use real-time data in formulating an answer.
This week I’m going to dive into detail to explain how that actually works with a complete, functional[1] example.
There are two parts to this article. The first explains what’s happening at a higher level than the actual code, so that you have a firm understanding of how it works in general, which is all you really need to evaluate possible solutions to problems. The second is a discussion of the code itself, so you have a working solution to start from.
Let’s get started!
The Premise
I’ve written a custom chat client that uses GPT (so, not ChatGPT per se) and can answer questions about stock prices. That’s all it does; it’s kind of stupid otherwise.
In this dialog you can see that it will look up ticker symbols for me and retrieve the latest market quote for a stock. You can also see that it’s pretty good at figuring out what I’m asking, even when I make incomplete requests like “Volume?”.
Behind the scenes, I have two functions which I’ve told GPT about[2]. The first is “lookup_ticker” which, given the name of a company, does its best to find the ticker symbol for it. It’s not super smart, as you can see at the start, because I used the wrong name for Deere & Co, though one everyone would recognize nonetheless. The second function, “get_quote”, takes a ticker symbol and returns the most recent data from the stock market. Thus, I could ask questions about Salesforce’s high, low, close, and volume.
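To make this concrete, here is roughly what those two function descriptions might look like in OpenAI’s function-calling schema. The names lookup_ticker and get_quote come from above, but the parameter names and descriptions are my own guesses, not necessarily what the real code uses:

```python
# A sketch of the function descriptions handed to GPT. The structure
# follows OpenAI's function-calling schema; parameter names here are
# hypothetical.
FUNCTIONS = [
    {
        "name": "lookup_ticker",
        "description": "Find the stock ticker symbol for a company name",
        "parameters": {
            "type": "object",
            "properties": {
                "company_name": {
                    "type": "string",
                    "description": "The company's name, e.g. 'Salesforce'",
                }
            },
            "required": ["company_name"],
        },
    },
    {
        "name": "get_quote",
        "description": "Get the latest market quote for a ticker symbol",
        "parameters": {
            "type": "object",
            "properties": {
                "symbol": {
                    "type": "string",
                    "description": "The stock ticker symbol, e.g. 'CRM'",
                }
            },
            "required": ["symbol"],
        },
    },
]
```

Note that these are only descriptions: GPT never sees the implementations, just the names, what they do, and what arguments they expect.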
If you have an eagle eye, you’ll notice that I said the get_quote function takes a ticker symbol, yet I asked by company name. GPT is smart enough to know that it has to first translate “Salesforce” into the ticker symbol “CRM” and then look up the quote:
Here I’ve logged the conversation to and from GPT, and you can see that when I ask for the close and volume of Salesforce it:
Asks for the ticker symbol for Salesforce and gets CRM
Asks for the quote for CRM and gets a bunch of data
Generates an answer to the specific question asked.
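The three steps above can be sketched as a loop: keep calling GPT, run whatever function it asks for, hand the result back, and stop when it produces a plain text answer. This is a simplified sketch, not the article’s actual code; call_gpt stands in for the real API call, and the message shapes mirror OpenAI’s function-calling responses:

```python
import json

def run_turn(messages, call_gpt, local_functions):
    """Keep calling GPT until it returns a plain text answer.

    call_gpt(messages) is assumed to return a message dict; when GPT
    wants a function run, that dict carries a "function_call" entry
    with the function's name and JSON-encoded arguments.
    """
    while True:
        message = call_gpt(messages)
        messages.append(message)
        call = message.get("function_call")
        if call is None:
            return message["content"]  # plain answer: we're done
        # GPT asked us to run a function on its behalf; do it and
        # feed the result back as a "function" role message.
        result = local_functions[call["name"]](**json.loads(call["arguments"]))
        messages.append({"role": "function", "name": call["name"],
                         "content": json.dumps(result)})
```

Because the loop only exits on a plain answer, the two-step chain (lookup_ticker, then get_quote) falls out naturally: each function result goes back into the conversation, and GPT decides what to ask for next.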
What happens if I then ask for the high?
It doesn’t have to call get_quote again; it already has the data from the previous question3.
The entire program is under 300 lines of code. It doesn’t do much, of course, which keeps the line count small.
What to Make of This?
The most amazing thing about this is that GPT can figure out that it needs to chain two functions together (Company → Ticker, then Ticker → Quote) to return the quote when I give it a company name. I have seen this in other contexts (with other functions), and I have to say this does impress me.
To be clear, the ability to chain logic like this has been around for a long time. The programming language Prolog was invented in 1972 with this kind of chained reasoning as its raison d'être[4]. Still, it’s like a well-executed double lift: I am always impressed when I see it[5].
This ability, however, means that your code has to be prepared to make multiple calls to GPT as it works through a solution before you can present an answer to the user. This can chew up some clock time, so it’s good to give the user some indication that things are working properly, because the delay isn’t what they expect.
High Level Design of Code
In subsequent posts we’ll go through the code, but I want to give you a sense of what it looks like first. There are three .py files. The first drives the UI, which just alternates between accepting input from the user and printing the response from GPT. It’s less than 10 lines of code. The second makes calls to an API to look up ticker and quote information and handles the formatting. That’s just under 100 lines of code.
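A sub-10-line UI loop like that might look something like this. This is a sketch under my own assumptions, not the article’s actual file; get_response stands in for whatever the GPT-handling module exposes:

```python
def chat_loop(read, write, get_response):
    """Alternate between reading a question and printing GPT's reply.

    read/write are injected (normally input/print) so the loop is easy
    to test; an empty line ends the session.
    """
    while True:
        question = read("You: ")
        if not question:
            break
        write("GPT: " + get_response(question))
```

Wired to the real client, this would be called as chat_loop(input, print, chat.get_response), with chat being the GPT-handling module.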
The last file handles the interaction with GPT, and this is where the magic lives. It does several things:
Informs GPT of what functions are available and what they can do.
Provides the overall “system prompt” that defines GPT’s role.
Maintains the history of the conversation so GPT can reference the context of previous interactions.
It’s a bit under 200 lines of code.
If you’re interested, you can see the code now at:
https://github.com/cmcguinness/ChatWithFuncs
There may be a few last-minute changes, but it does work now.
[1] No groaning in the back of the class!
[2] I use the somewhat strange wording of “I’ve told GPT about” because GPT does not call the functions on its own. It tells my code what function it wants me to call for it (including the parameters to the function), and then I tell it the results. It’s “as if” GPT called the function, but it is not really so. I’ll discuss this more when I delve into the code.
OpenAI calls functions “plug-ins” in ChatGPT, and in that case it will call them directly; however, there’s quite a vetting process for that, and it ONLY works in ChatGPT. When I realized that, I gave up on plug-ins as being too consumer-oriented for me.
[3] To be clear, it has the data because of how my chat client works: I keep the last bunch of questions and answers around and feed them to GPT on every request. When we get to the code, we’ll discuss that.
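One simple way to keep that rolling window of context might look like this. This is a sketch under my own assumptions (a single system prompt at the head of the list, a fixed cap on retained messages); the real client’s bookkeeping may differ:

```python
def trim_history(messages, max_kept=10):
    """Keep the system prompt plus only the most recent messages.

    messages[0] is assumed to be the system prompt; everything after
    it is the running user/assistant/function history.
    """
    system, rest = messages[0], messages[1:]
    return [system] + rest[-max_kept:]
```

On each request, the client would send trim_history(messages) to GPT, so earlier answers (like the quote data) stay available without the context growing without bound.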
[4] Prolog is short for programmation en logique, which is French for “programming in logic”. Like many things French, it was far ahead of its time and still looks modern today.
[5] YouTube is full of double lift videos, but this one seems decent enough. A good double lift and a smooth glimpse and you’re on your way to Vegas, baby.