In this article, we discuss what language models are and how they relate to artificial intelligence, and consider the algorithm’s prospects for further evolution and adoption in other fields.
Online competition is intensifying, and with it the pace at which developments based on artificial intelligence appear and enter our lives. Among the most sophisticated are deep-learning algorithms that understand natural language: interpreting it, reproducing it, and even predicting what comes next.
Machines Can Understand People Even Better
The OpenAI team has been working for years on creating and improving the GPT (Generative Pre-trained Transformer) language model. To date, there are three generations of the algorithm. The project aims to teach a machine to understand natural human language well enough to write text, answer questions, unscramble words, solve problems, and translate.
OpenAI, of course, does not disclose every detail of the creation and improvement process. Still, it is known that the third version of the algorithm has about 175 billion trainable parameters.
Despite its capabilities, GPT is not a particularly new architecture: it closely resembles a decoder-only Transformer.
Google has also developed its own neural network, BERT, which can be used to build AI programs for natural language processing: answering open-ended questions, powering chatbots, translating automatically, analyzing text, and so on.
Why We Need Language Models
To understand how a language model works, we need to understand the task it is trying to accomplish: given the previous context, predict the following word (or word fragment). In simple terms, a language model is a machine learning model that suggests a word based on the words that came before it. The most familiar example is the smartphone keyboard, which suggests a continuation as you type.
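The idea of next-word prediction can be sketched with the simplest possible language model, a bigram model that picks the word most often seen after the previous one. The toy corpus and all counts here are illustrative; GPT models learn far richer statistics with neural networks, but the task is the same.

```python
# A minimal sketch of next-word prediction: a bigram language model.
# The corpus is a made-up example, not real training data.
from collections import Counter, defaultdict

corpus = "the cat sat on the mat the cat ate the fish".split()

# Count how often each word follows each preceding word.
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word):
    """Return the most frequent next word seen after `word`, or None."""
    counts = following[word]
    return counts.most_common(1)[0][0] if counts else None

print(predict_next("the"))  # "cat" follows "the" most often here
```

A keyboard app works on the same principle, just with vastly more data and context than a single preceding word.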
To complete a text well, the model must capture its meaning and hold some implicit knowledge of the real world. That internal knowledge can then be drawn out by modifying the prompt, which makes it possible to solve many problems: answering questions, summarizing text, and even building dialog systems.
In this sense, GPT-2 is an algorithm for predicting the next word, like the one in a keyboard app, but heavier and smarter than the one in your phone. GPT-2 was trained on WebText, a 40 GB dataset that OpenAI collected from the Internet as part of its research. GPT-3, in turn, received training data dozens of times larger. With “prompt engineering,” GPT-3 can tackle a virtually unlimited range of problems, which is why many consider it a semblance of strong artificial intelligence.
But what makes GPT-3 special isn’t just superb text generation; that alone doesn’t set it apart from other language models. Its capabilities seem almost limitless, and the examples are striking: given a query as input, GPT-3 can produce code, design a layout, compose queries, do accounting and research, and take part in other software development tasks. Give it a few examples to learn from, and it can write small artifacts such as a page layout or an ML model description.
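“Give it a few examples to learn from” is the essence of few-shot prompting: the examples are packed into a single text prompt that the model is asked to continue. A minimal sketch of how such a prompt is assembled is below; the translation task is arbitrary, and `send_to_model` is a hypothetical placeholder for whatever API call delivers the prompt to the model.

```python
# A hedged sketch of few-shot prompt construction. The examples and the
# send_to_model() call are illustrative assumptions, not a real API.
examples = [
    ("Translate English to French: cheese", "fromage"),
    ("Translate English to French: house", "maison"),
]
query = "Translate English to French: cat"

# Interleave each example task with its answer, then append the new task.
prompt = "\n".join(f"{q}\n{a}" for q, a in examples) + f"\n{query}\n"
print(prompt)

# The model would then be asked to continue this text:
# completion = send_to_model(prompt)  # hypothetical API call
```

Because the model has seen the pattern “task, answer, task, answer…”, its most likely continuation is the answer to the final task; no retraining is involved.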
What Language Models Can Do Today
- A language model, and GPT-3 in particular, can produce correct JSON from just a few examples. Although the format is strict, GPT-3 makes errors no more than about 10% of the time.
- It can imitate the style of famous writers and even help marketers create clickbait headlines, not to mention memes.
- The versatile =GPT-3() function has become a multi-tool for Google Sheets. For example, it can look up state populations, people’s Twitter handles and employers, and perform various calculations.
- The model generates Python scripts for processing bank transactions, and those scripts fill in and update a Google Spreadsheet. It is a big step for accounting.
- An algorithm based on OpenAI’s GPT-2 powered the free-to-play text game AI Dungeon, where the user typed commands into a text box, and the AI understood the context, adapted, and responded. When the developers moved the game to GPT-3, players could enter any command, and the algorithm would react sensibly and change the game world.
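Since generated JSON is still wrong some fraction of the time, any pipeline that consumes model output should validate it before use. A minimal check with Python’s standard library is shown below; the two sample outputs are made up for illustration.

```python
# Validate model-produced JSON before using it downstream.
# The sample strings imitate model output; they are not real GPT-3 results.
import json

model_outputs = [
    '{"name": "Alice", "age": 30}',  # well-formed
    '{"name": "Bob", "age": }',      # malformed: missing value
]

def parse_if_valid(text):
    """Return the parsed object, or None if the text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return None

results = [parse_if_valid(o) for o in model_outputs]
print(results)  # [{'name': 'Alice', 'age': 30}, None]
```

In practice, a failed parse can trigger a retry with the same prompt, since the model’s output varies from call to call.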
Language models are not limited to generating text. Engineers in various fields are building the model into their tools, and we can expect a new wave of startups based on it. Unfortunately, OpenAI grants access to the algorithm only by invitation and in demo form. The developers say they fear the model could be used to harm society, for example by creating fake news. There may also be purely commercial reasons: shortly after the model’s release, OpenAI signed a contract with Microsoft for exclusive access to GPT-3.
OpenAI GPT Language Model Criticism
Forbes columnist Rob Toews argues that GPT-3 is an impressive technological achievement, but its limitations mean it cannot be called true artificial intelligence.
This language model can indeed write texts and code and produce memes. But at its core it is not an AI algorithm, just a next-step predictor: the user supplies a prompt, and the model guesses what the following fragment should be.
That doesn’t make the tool bad, but it remains unreliable and prone to errors a human would not make. It is not “artificial intelligence,” yet it is an impressive technical achievement, capable of producing plausible text for almost any request.
However, it is worth remembering that OpenAI is gradually granting more developers access to GPT-3. They demonstrate the model’s capabilities, study it, and create ever more fascinating projects.
About the author:
Robyn McBride is a journalist, tech critic, author of articles about software, AI and design. She is interested in modern image processing, tech trends and digital technologies. Robyn also works as a proofreader at Computools.