Why GPT-3 is like pink slime

By Laurent Sorber
December 8th, 2020
5 minutes
AI, GPT-3, OpenAI, Artificial Intelligence

For all the buzz surrounding the introduction of GPT-3, a sober look at the new AI model shows that it’s not a quantum leap in artificial intelligence. For some specific applications, it can be very useful. But for broader use in the real world, GPT-3 falls short, as it still delivers the equivalent of pink slime rather than nutritious meat.

GPT-3 has recently garnered a lot of attention, and with good reason. OpenAI's model is one of the first that can generate uncannily realistic, human-like text, among other feats. It learned to do so by reading gigabytes of articles, books, and webpages during training.

Some observers praise GPT-3 for the endless possibilities it could offer: discuss the meaning of life with Nietzsche, ask the model to summarize your meeting notes, or get it to post undercover comments on Reddit for a week.

This is all within the realm of possibility for OpenAI's superpowered language model. However, GPT-3 looks to be closer to an evolution than a revolution: an extraordinary but incremental achievement.

The model still lingers in the uncanny valley of natural language. You could say its output resembles pink slime: text generated by GPT-3 is hyper-processed and feels consistent, but isn't very nutritious. Two reasons underlie GPT-3's weaknesses: its output is modeled after a stream of consciousness, and it lacks a world model.

AI without purpose

Like all AI and machine learning models, GPT-3 automates tasks that require some level of intelligence. GPT-3 is the third iteration of OpenAI’s language model family.

What separates it from its older brothers is its sheer magnitude: 175 billion parameters, requiring over 650 GB of storage. The entirety of the English Wikipedia (over 6 million articles) represents only 0.6% of GPT-3's training data. The parameter count is a measure of the model's capacity to absorb information and learn about natural language (and, in turn, about the world). This makes GPT-3 the largest and most knowledgeable language model created to date.
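As a quick sanity check on that storage figure, and assuming each parameter is stored as a 32-bit float (a plausible but unconfirmed assumption), the arithmetic works out:

```python
# Back-of-the-envelope check: 175 billion parameters, each stored
# as a 32-bit (4-byte) float. (The storage format is an assumption.)
params = 175e9
size_bytes = params * 4                    # fp32 weights
print(f"{size_bytes / 1024**3:.0f} GiB")   # ~652 GiB, i.e. "over 650 GB"
```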

So why is GPT-3’s output still firmly within the uncanny valley?

Firstly, because its output is a stream of consciousness by design: the model generates text one token at a time, each prediction feeding off the ones before it, with no overarching goal for what it wants to communicate. Have a look at this article written by GPT-3. Upon reading it, you can probably tell that it wasn't written by a human; something doesn't feel entirely natural. That is because GPT-3 is somewhat myopic: the text is convincing within the space of a paragraph, but the coherence doesn't extend across the entire article.
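You can see this next-token mechanic for yourself with GPT-2, GPT-3's openly released predecessor, which shares the same autoregressive design. A minimal sketch using the Hugging Face transformers library (the prompt and sampling settings are illustrative):

```python
# Each token is sampled conditioned only on the tokens before it;
# there is no global plan for where the text is headed.
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer.encode("The meaning of life is", return_tensors="pt")
output = model.generate(
    input_ids,
    max_length=40,    # stop after 40 tokens in total
    do_sample=True,   # sample instead of greedy decoding
    top_p=0.9,        # nucleus sampling
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```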

AI without understanding

There is a second reason for GPT-3's eeriness: the model lacks a world model. It doesn't share the mental model of the world that you and I have, so it can't always work out what a human would understand from context. This can lead the model to misread implicit contextual cues, because it isn't grounded in a consistent world (whether real or fictional). Think of a situation where you tell GPT-3 that you're “listening to some rock”, and the model infers that you're plugging your headset into an actual rock.


To avoid these situations, GPT-3 would need, among other things, to learn how to count, one of the first life skills toddlers pick up. But how do you learn to count by only reading examples of counting? That is extremely difficult for a model. (Google recently tried and partially succeeded: its model can count small numbers, with occasional mistakes interspersed, roughly on par with a three-year-old.)

Could we reasonably expect GPT-3 to build a world model from reading alone? No, because GPT-3 uses exactly the same architecture as GPT-2, just with more parameters: an (enormous) upscaling of an existing design. That is a fantastic accomplishment, but it doesn't directly address the architecture's flaws. There is no reason to assume that adding more parameters and more training text will solve these inherent problems.

AI without trust

Currently, GPT-3 appears to be very good at delivering short outputs grounded in the information you provide. For example, it is quite capable of summarizing a person's notes.
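At the time of writing, the easiest way to get such a summary is the well-known “tl;dr” prompt trick. A minimal sketch against OpenAI's completions API (the engine name, notes, and sampling settings are illustrative assumptions):

```python
import openai

openai.api_key = "YOUR_API_KEY"  # placeholder

notes = (
    "Roadmap meeting: ship the search revamp in Q3, "
    "hire two NLP engineers, revisit the pricing page copy."
)
response = openai.Completion.create(
    engine="davinci",
    prompt=notes + "\n\ntl;dr:",  # the model continues with a summary
    max_tokens=60,                # keep the summary short
    temperature=0.3,              # low temperature for focused output
)
print(response.choices[0].text.strip())
```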

However, GPT-3's flaws become most obvious in tasks where large amounts of output are generated. After a few paragraphs, you start to zone out and realize that you've been reading hyper-processed pink slime. Something is off.

This means that you can't entirely trust the model. You can't lean on it the way you would on a human. GPT-3 will need to improve in quality to earn that trust.

To be truly revolutionary, it's not enough to improve; you have to reach a critical level of trust, grounded in accuracy. Would you trust your voice assistant if it understood you 80% of the time? Your self-driving car if it worked 95% of the time? Or your AI assistant if it recommended you kill yourself “only” 0.1% of the time?


We're on the brink of something extraordinary, but we have not yet reached the critical threshold that unlocks the next set of applications. We could be there in a few years, but for that to happen we need to fundamentally improve the model.

A world model to leave the uncanny valley

To revolutionize NLP, we will need to build a world model and teach it to GPT-3. Some research already points in this direction, such as ConceptNet's “retrofitting” approach, which uses a knowledge graph such as Wikidata to inject world knowledge into word vectors (based on Faruqui, M., Dodge, J., Jauhar, S. K., Dyer, C., Hovy, E., and Smith, N. A., 2015, “Retrofitting Word Vectors to Semantic Lexicons”, Proceedings of NAACL).
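To make that concrete, here is a minimal sketch of the retrofitting update from Faruqui et al. in Python, simplified to uniform edge weights (the paper weights neighbors by node degree); the toy vectors and graph are illustrative:

```python
import numpy as np

def retrofit(vectors, graph, alpha=1.0, beta=1.0, iterations=10):
    """Nudge each word vector toward the mean of its knowledge-graph
    neighbors while keeping it close to its original, corpus-trained
    value (Faruqui et al., 2015, with uniform edge weights)."""
    new = {word: vec.copy() for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, neighbors in graph.items():
            nbrs = [n for n in neighbors if n in new]
            if word not in new or not nbrs:
                continue
            # Closed-form coordinate update: a weighted average of the
            # original vector and the current neighbor vectors.
            total = alpha * vectors[word] + beta * sum(new[n] for n in nbrs)
            new[word] = total / (alpha + beta * len(nbrs))
    return new

# Toy example: a graph edge asserting that "rock" relates to "music"
# pulls the "rock" vector toward "music".
emb = {"rock": np.array([1.0, 0.0]), "music": np.array([0.0, 1.0])}
print(retrofit(emb, {"rock": ["music"]})["rock"])  # [0.5 0.5]
```

The appeal of this approach is that vectors learned from raw text get anchored to explicit, curated knowledge, which is exactly the kind of grounding GPT-3 lacks.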

Fixing GPT-3’s stream of consciousness may be an equally difficult challenge. If we manage to achieve this and have GPT-3 communicate with purpose, we'll open up a new world of capabilities and applications, particularly in the fields of assisted writing and question answering.

GPT-3 would be able to create convincing CliffsNotes of entire books, write poems that flawlessly match the style of famous authors, or generate school essays that are almost indistinguishable from the work of human students. It could even directly answer natural-language questions over Wikipedia or corporate knowledge bases.

But to leave the uncanny valley, GPT-3 needs to put some flesh on its bones. GPT-3 needs a world model to communicate with purpose.
