Some Quick OpenAI Updates
OpenAI announced new versions of GPT-4 and GPT-3.5 in the middle of last week.
For GPT-4, they described the release as having bug fixes and said it performs “tasks like code generation more thoroughly than the previous preview model and is intended to reduce cases of ‘laziness’ where the model doesn’t complete a task…”
And to think, I just called GPT-4 lazy. Well, hush my mouth!
The GPT-3.5 announcement, apart from bug fixes, is notable for lowering the price (again) and doing better at generating the precise format (e.g., JSON) requested.
As of this writing, the new GPT-3.5 model is not available, so I’ve yet to test it. As for the new GPT-4 model, I’ll try it out this week in some applications I’m building to see if there’s any sort of improvement.
My observations are:
GPT-3.5 is being kept very price competitive with other vendors.
Most of the work we’re seeing now is OpenAI making (presumably) run-time optimizations to lower costs and tweaking the way the existing models handle our instructions.
I still think that we’re a long, long way off from seeing a GPT-5 that is an order of magnitude better than GPT-4. (And remember, we don’t know how much bigger GPT-4’s model is than GPT-3.5’s, as OpenAI is not forthcoming.)
Model tweaks are going to keep coming, which means any production applications will need solid tests to make sure there are no regressions introduced by a new model.
A Bit More On Pricing
OpenAI has been quicker to drop the cost of inputs than outputs to GPT-3.5. That’s been generally a good thing, in that the maximum input context window (the most you can shove into it on every call) has quadrupled in size (4K to 16K) over the year. Coincidentally,1 the cost of inputs has now dropped by the same proportion from the beginning of June to now.
For people who are either directly or indirectly passing through the costs of using OpenAI, I wonder if they are passing along the reduced costs to their customers or using this as an opportunity to grow their margins. 🤔
It’s interesting to note that the cost of outputs has dropped only a bit, and only just now. I’m not privy to OpenAI’s pricing rationale, but I do know that every token generated represents, metaphorically, a crank of the NN wheel. The number of output tokens you generate is linearly related to the cost of providing the service: twice as much output means they had to run the model twice as many times. So I guess I’m not surprised to see the price of outputs stay fairly consistent.
As for GPT-4, pricing remains at $0.01/1K input tokens and $0.03/1K output tokens, having dropped from the initial $0.03/1K input and $0.06/1K output pricing with the introduction of GPT-4-turbo last November (the input context window expanded from 8K to 128K tokens at the same time).
Even with the price drop, GPT-4 is still vastly more expensive to use than GPT-3.5 (and always has been):
The cost of inputs in GPT-4 is currently 10× that of GPT-3.5, and outputs are 15× as much! I’m not sure GPT-4 is 15× better than GPT-3.5; it’s hard to put a number on it. Regardless, there is a real incentive to use GPT-3.5 whenever possible. As you build applications, it is always good to check the relative performance of the two models and to invest extra time in improving your GPT-3.5 results.
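To make those multiples concrete, here’s a small sketch of per-call cost arithmetic. The GPT-4 prices are the ones quoted above; the GPT-3.5 prices are back-calculated from the 10× and 15× multiples rather than quoted directly, and the dictionary keys are just labels, not exact API model identifiers:

```python
# Per-call cost sketch. GPT-4-turbo prices are as quoted in the post;
# GPT-3.5 prices are implied by the 10x (input) / 15x (output) multiples.
PRICES = {  # USD per 1,000 tokens: (input, output)
    "gpt-4-turbo": (0.01, 0.03),
    "gpt-3.5": (0.001, 0.002),
}

def call_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """USD cost of a single API call for the given token counts."""
    in_price, out_price = PRICES[model]
    return (input_tokens / 1000) * in_price + (output_tokens / 1000) * out_price

# Example: a 2,000-token prompt producing a 500-token completion.
gpt4_cost = call_cost("gpt-4-turbo", 2000, 500)  # 0.02 + 0.015 = $0.035
gpt35_cost = call_cost("gpt-3.5", 2000, 500)     # 0.002 + 0.001 = $0.003
```

For this (fairly typical) prompt-heavy call, GPT-4 works out to roughly 12× the cost of GPT-3.5; output-heavy calls push the ratio closer to 15×.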
Another Bit of Good News
There’s also an announcement which is near and dear to my heart: they will track usage on each of your authorization tokens separately (if you enable it). This means you can have one OpenAI account with distinct tokens for each of your projects and monitor the usage for them independently. Prior to this, the only reliable way was to create a proxy in front of OpenAI that logged all the traffic.
All of these changes strike me as OpenAI being responsive to the developer community (which I appreciate).
Or maybe not!