GPT-3 Made Better: OpenAI Adds Model Fine-Tuning Capabilities to its GPT-3 API
OpenAI’s GPT-3 natural language model just got faster and less expensive to use, thanks to a new API feature that integrates fine-tuning, letting users customize the model to produce better results for their workloads.
By building automated fine-tuning into the API, developers can now create versions of GPT-3 tailored to their enterprise applications. Developers can start the process with just one command in the OpenAI command line tool, according to the independent AI research and deployment company. The custom version begins training and is then immediately available through the API.
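As a sketch, per OpenAI’s CLI documentation at the time (the training file name and base model here are placeholders):

```
# Start a fine-tuning job from a local JSONL training file;
# "curie" is one of the GPT-3 base models available for fine-tuning.
openai api fine_tunes.create -t train_data.jsonl -m curie
```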
“You can use an existing dataset of virtually any shape and size, or incrementally add data based on user feedback,” OpenAI said in a blog post about the announcement. “With fine-tuning, one API customer was able to increase correct outputs from 83 percent to 95 percent. By adding new data from their product each week, another reduced error rates by 50 percent.”
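That “dataset of virtually any shape and size” was supplied as a JSONL file of prompt/completion pairs, one per line; the translation examples below are invented for illustration:

```
{"prompt": "English: Good morning\nGerman:", "completion": " Guten Morgen"}
{"prompt": "English: Thank you\nGerman:", "completion": " Danke"}
```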
The custom GPT-3 API feature had been in alpha testing with customers on workloads including content generation, classification and text summarization; according to OpenAI, those testers cited lower computing costs, faster training and differentiated outcomes.
“People have been clamoring for fine-tuning for a while because it basically unlocks new capabilities that were not possible with the base GPT-3 models,” Rachel Lim, the lead engineer on the API fine-tuning project and an engineer on the applied team at OpenAI, told EnterpriseAI. “If you look at models before GPT-3, a lot of models had to be fine-tuned before they were useful at all. So now with GPT-3, the first of many large language models, it was a first of its kind to be useful without fine-tuning. But imagine if you add a layer of fine-tuning on top of it, it just makes it so much more capable and able to solve more use cases.”
Fine-tuning leads to better, more reliable results, said Lim. “It makes a huge difference between a cool demo that you can show on Twitter versus a production-quality application where you have paying users who expect a certain quality and robustness of results. And so, we find that fine-tuning often pushes a product over the finish line where it is able to serve things much more reliably for customer-specific use cases.”
Fine-tuning uses training data to “steer” a model, tweaking the “weights” it infers from that information, said Lim. The model then uses those weights to make predictions.
“This allows you to steer the model even before you give it [directions such as] ‘English, German, English, German, English, German,’” said Lim. “Now, it already knows that it is a model that is supposed to do English, German, English, German, English, German. Because of that, you do not have to give it few shot (a small number of) examples anymore. You can just say ‘English’ and it knows the next thing is supposed to be ‘German,’ for example.”
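A minimal sketch of that difference, using the openai Python library of the era; the API key and fine-tuned model name are hypothetical placeholders:

```python
import openai

openai.api_key = "sk-..."  # placeholder API key

# Base model: the prompt itself must teach the English -> German pattern
# through a handful of in-context ("few shot") examples.
few_shot = openai.Completion.create(
    engine="davinci",
    prompt=(
        "English: Good morning\nGerman: Guten Morgen\n"
        "English: Thank you\nGerman: Danke\n"
        "English: See you tomorrow\nGerman:"
    ),
    max_tokens=20,
    stop="\n",
)

# Fine-tuned model: the pattern lives in the weights, so one cue suffices.
fine_tuned = openai.Completion.create(
    model="curie:ft-yourorg-2021-12-14-10-00-00",  # hypothetical model name
    prompt="English: See you tomorrow\nGerman:",
    max_tokens=20,
    stop="\n",
)

print(fine_tuned.choices[0].text.strip())
```

The base model must be re-shown the pattern on every request; the fine-tuned model carries it in its weights.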
Fine-tuning capabilities are needed because many users require more accurate results than those that are possible using the standard GPT-3 model, said Lim.
“A lot of people have built production-quality applications with [the original GPT-3], but for some users, an 85 percent accuracy rate versus a 95 percent accuracy rate makes all the difference as to whether you can get people to pay for your product,” she said. “This increase in reliability is truly the main value that people get out of fine-tuning.”
Another important benefit is lower usage costs, said Lim: because the training data was provided to the customized model ahead of time, the model already recognizes the customized patterns, so each request needs to include far less data.
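A back-of-the-envelope illustration of that saving, with invented token counts:

```python
# Hypothetical numbers: once fine-tuning bakes the task into the weights,
# each request no longer has to carry in-prompt examples.
few_shot_prompt_tokens = 500   # query plus few-shot examples (assumed)
fine_tuned_prompt_tokens = 20  # just the query (assumed)

saving = 1 - fine_tuned_prompt_tokens / few_shot_prompt_tokens
print(f"Prompt tokens per request drop by {saving:.0%}")  # prints 96%
```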
In November, OpenAI announced that it had dropped the waiting list to run workloads on GPT-3, making its AI modeling capabilities immediately available to developers and enterprises working on their most challenging language problems.
OpenAI first debuted its powerful GPT-3 natural language model in June 2020 as a limited beta, with a waiting list where developers could sign up to use its infrastructure and capabilities.
The general release added conditions to prevent GPT-3 from being used to harm people, as well as geographic restrictions that limit its use to supported countries. Developers in some nations, including Cuba, Iran and Russia, cannot currently access it.
GPT-3, short for Generative Pre-trained Transformer 3, is a massive autoregressive natural language model that runs exclusively on Microsoft Azure. It has 175 billion parameters, which OpenAI claims is ten times more than any previous non-sparse language model.
The first version, GPT-1, was released in 2018, while the second version, GPT-2, debuted in 2019. With the release of GPT-3 in 2020, natural language processing (NLP) gained more power and use cases in the enterprise than ever before.
The latest OpenAI API, which includes GPT-3 and is now readily available, contains a host of safety improvements, including Instruct Series models that adhere better to human instructions, specialized endpoints for more truthful question-answering, and a free content filter to help developers mitigate accidental abuse.
So far, GPT-3 primarily models English-language text.
GPT-3 has been steadily gaining interest in the world of enterprise IT since its debut.
Microsoft exclusively licensed GPT-3 from OpenAI in September 2020, extending an existing relationship between the two companies. The licensing agreement covers the use of GPT-3 for all Microsoft products and services. In May, Microsoft announced the first such integration of GPT-3 into one of its products, its Microsoft Power Apps software, which aims to make it easier for enterprise workers to build no-code and low-code applications.
The Microsoft deal does not lock out other developers, however: while the license to the underlying model is exclusive, anyone can still use GPT-3 through OpenAI’s API, according to OpenAI.
In May, OpenAI launched a $100 million AI startup fund to invest in startups driving promising AI technologies. The fund is slated to help a few early-stage startups in fields where artificial intelligence can have a transformative effect, including healthcare, climate change and education, as well as areas where AI tools can empower people to be more productive, such as personal assistants and semantic search, the company said.
A wide range of enterprises are developing with GPT-3 today, including Disney, IBM, Twitter, Salesforce, Cisco and Intel, according to OpenAI.