AI devs created a lean, mean, GPT-3-beating machine that uses 99.9% fewer parameters

AI researchers from the Ludwig Maximilian University (LMU) of Munich have developed a bite-sized text generator capable of besting OpenAI’s state-of-the-art GPT-3 using only a tiny fraction of its parameters.

GPT-3 is a monster of an AI system capable of responding to almost any text prompt with unique, original responses that are often surprisingly cogent. It’s an example of what incredibly talented developers can do with cutting-edge algorithms and software when given unfettered access to supercomputers.

But it’s not very efficient, at least not when compared to a new system developed by LMU researchers Timo Schick and Hinrich Schütze.


According to a recent pre-print paper on arXiv, the duo’s system outperforms GPT-3 on the “SuperGLUE” benchmark test with only 223 million parameters:

In this work, we show that performance similar to GPT-3 can be obtained with language models whose parameter count is several orders of magnitude smaller. This is achieved by converting textual inputs into cloze questions that contain some form of task description, combined with gradient-based optimization; additionally exploiting unlabeled data gives further improvements.
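To make the idea of a cloze question concrete, here is a minimal sketch of how a pair of input sentences might be rewritten as a fill-in-the-blank prompt. The pattern text, the `[MASK]` token, the label words, and the function names are all assumptions chosen for illustration; they are not taken from the paper, and the scores are stand-ins for what a masked language model would produce.

```python
# Illustrative sketch only: the pattern text, mask token, and label words
# below are assumptions, not the patterns used in the paper.
MASK = "[MASK]"


def to_cloze(premise: str, hypothesis: str) -> str:
    """Rewrite an entailment-style input pair as a fill-in-the-blank sentence."""
    return f'"{hypothesis}"? {MASK}, "{premise}"'


# Hypothetical verbalizer: maps each task label to a single word the
# language model could predict in place of the mask.
VERBALIZER = {"entailment": "Yes", "contradiction": "No"}


def predict_label(mask_word_scores: dict) -> str:
    """Pick the label whose verbalized word scores highest at the mask."""
    return max(
        VERBALIZER,
        key=lambda label: mask_word_scores.get(VERBALIZER[label], float("-inf")),
    )


cloze = to_cloze("The cat sat on the mat.", "An animal is on the mat.")
print(cloze)

# Stand-in scores; a real system would read them from a masked language model.
print(predict_label({"Yes": 0.8, "No": 0.2}))
```

The point of the rewrite is that the task now looks like the fill-in-the-blank objective a masked language model was pre-trained on, so a handful of labeled examples can go further than they would with a freshly initialized classifier.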

Parameters are the variables an AI model adjusts during training. They’re learned from data. In essence, the more parameters an AI model is trained with, the more robust we expect it to be.

When a system using 99.9% fewer parameters is able to best the best at a benchmark task, it’s a pretty big deal. This isn’t to say that the LMU system is better than GPT-3, nor that it’s capable of beating it in tests other than the SuperGLUE benchmark, which isn’t indicative of GPT-3’s overall capabilities.
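The 99.9% figure can be checked with quick arithmetic. The sketch below takes the 223 million parameter count from the article and assumes GPT-3’s widely reported size of 175 billion parameters, a number the article itself doesn’t state.

```python
# 223 million comes from the article; 175 billion is GPT-3's widely reported
# size and is an assumption here, since the article does not state it.
PET_PARAMS = 223_000_000
GPT3_PARAMS = 175_000_000_000

ratio = PET_PARAMS / GPT3_PARAMS
reduction_pct = (1 - ratio) * 100

print(f"share of GPT-3's parameters: {ratio:.2%}")  # 0.13%
print(f"reduction: {reduction_pct:.1f}%")           # 99.9%
```

Under that assumption the smaller model carries a little over a tenth of one percent of GPT-3’s parameters, which is where the rounded “99.9% fewer” comes from.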

The LMU system’s results come courtesy of a training method called pattern-exploiting training (PET). According to OpenAI policy director Jack Clark, writing in the weekly ImportAI newsletter:

Their approach fuses a training technique called PET (pattern-exploiting training) with a small pre-trained Albert model, letting them create a system that “outperform GPT-3 on SuperGLUE with 32 training examples, while requiring only 0.1% of its parameters.”

Clark goes on to point out that, while it won’t outperform GPT-3 in every task, it does open new avenues for researchers looking to push the boundaries of AI with more modest hardware.

For more information, check out the duo’s pre-print paper on arXiv.

H/t: Jack Clark and ImportAI


Published September 21, 2020 — 17:52 UTC


By dls
