AI Language Models Growing at Staggering Rate

Article By : Sally Ward-Foxton

NLP is a popular and rapidly evolving field of AI. But the evolution of more sophisticated language models comes with an overarching mega-trend: These models are growing at a staggering rate.

Natural-language processing (NLP) is a popular and rapidly evolving field of AI. But the evolution of more sophisticated language models comes with an overarching mega-trend: These models are growing at a staggering rate. Google’s latest language model has 1.6 trillion parameters, a figure even its inventors called “outrageous” in their paper.

The model’s parameters are the figures that the training process adjusts, and in general, more parameters mean a more sophisticated AI. Models like this one that purport to understand language are usually given a text prompt and then create more text in the same vein. The AI does not really understand language; it merely mimics it to a convincing degree.

Google’s model topped the previous world-record holder, OpenAI’s GPT-3, by a factor of 9. GPT-3’s 175 billion parameters were widely heralded as enormous, and examples of the texts it created were marvelled over by many news outlets. In many cases, GPT-3’s text output was indistinguishable from articles written by humans; the model can mimic language concepts such as analogies and even write basic software code. GPT-3 itself was more than 100 times bigger than its predecessor, GPT-2, which had 1.5 billion parameters.
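To make the prompt-and-continue behaviour described above concrete, the sketch below shows roughly what querying one of these models looks like in practice. It uses the smallest openly released GPT-2 checkpoint via the Hugging Face transformers library; the prompt text and sampling settings are illustrative choices only, not anything drawn from Google’s or OpenAI’s work.

```python
# Minimal sketch: prompt a small, openly available language model and let it
# continue the text. Requires the "transformers" package and its dependencies.
from transformers import pipeline, set_seed

# Load the smallest public GPT-2 checkpoint (~124 million parameters).
generator = pipeline("text-generation", model="gpt2")
set_seed(42)  # make the sampled continuation reproducible

# Hypothetical prompt, chosen purely for illustration.
prompt = "The biggest challenge facing AI language models today is"
outputs = generator(prompt, max_length=60, num_return_sequences=1)

print(outputs[0]["generated_text"])
```

The model simply continues the prompt with statistically plausible text; however fluent the result reads, there is no understanding behind the words it produces.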
Language models like these have grown to the size at which the financial and environmental costs associated with computing them invite scrutiny. A 2019 study by researchers from the University of Massachusetts Amherst found that training GPT-2 took state-of-the-art specialized hardware (32 TPUv3 chips) 168 hours, at an estimated cloud compute cost of between US$12,000 and US$43,000. That’s training one model, one time. Models are typically trained, tweaked, and retrained many times over during the development process.

Now consider that Google’s latest model is 1,000× bigger than GPT-2. If we assume the compute required for training scales roughly with the model’s size, we start to get an idea of the scale of resources required to develop and use these models: scaling the UMass Amherst estimate by a factor of 1,000 puts a single training run somewhere in the region of US$12 million to US$43 million. And the carbon footprint associated with compute on this scale is significant, too.

Are today’s language models too large, and aside from the financial and environmental cost, what are the implications? In practice, these models are now so large that Google is one of only a few companies that can do this kind of research. The company has access to huge amounts of in-house computing power; comparable capacity can be rented via the cloud, but not economically at this scale by smaller companies or academic researchers. This allows tech giants like Google to effectively dictate the direction of research in the field, and naturally, research will follow whatever path is in their commercial interest. Whether that commercial interest aligns with ethical best practices, or will lead to language models that can actually understand language, is unclear.

Where does one get the vast amount of training data required for a model this large? The short answer is the internet. Text is scraped from sources such as Wikipedia and Reddit. There is a significant risk that trained models’ output will reflect the nature of the training data, i.e., that it will mimic things people write in the internet’s darkest corners. If the model sees humanity at its worst, we should expect its output to reflect that.

There are also questions about the potential for misuse of such powerful models. If a machine can write on any topic as well as a human can, it suddenly becomes very easy to fool people, especially over the internet, where people communicate via text. Imagine being able to readily produce convincing misinformation at scale, and the havoc that rogue chatbots could wreak given the ability to pass for a human.

Only a few weeks before Google published its latest work on the trillion-parameter model, a prominent ethics researcher at the company said she had been forced out after authoring a paper on the ethics of large language models. In light of the criticism that move drew, Google will find it difficult to avoid continuing questions about the ethics of research into huge and still-growing AI language models.

This article was originally published on EE Times Europe.

Sally Ward-Foxton covers AI technology and related issues for EETimes.com and all aspects of the European industry for EETimes Europe magazine. Sally has spent more than 15 years writing about the electronics industry from London, UK. She has written for Electronic Design, ECN, Electronic Specifier: Design, Components in Electronics, and many more. She holds a master’s degree in Electrical and Electronic Engineering from the University of Cambridge.
