AI Code Wags the Hardware

By Rick Merritt

Deep-learning models, frameworks, and techniques like reinforcement learning are moving faster than you can carve out paths in silicon.

SAN JOSE, Calif. — In AI, hardware is the tail and software is the dog — and this is a very active dog. One need only browse popular preprint sites such as arXiv to find a dozen or two new research papers posted daily.

Wei Li, who leads a software group at Intel devoted to machine learning, rattles off a list of a dozen popular convolutional, recurrent, and other neural-network models. Adding another layer, most big cloud and chip vendors have created their own frameworks to build and run the models optimally on their platforms.

“There’s a variety of topologies and frameworks to test,” he said.

Don’t let the complexity overwhelm you, said Chris Rowen, chief executive of BabbleLabs, a startup creating DNN engines for audio tasks. “The structure of a neural net can be important to efficiency, but any of them can get the job done,” he said. “In many cases, it’s almost a question of style.”

Automated learning is perhaps the most powerful megatrend that will drive change in the software. It could take decades to evolve into what is still considered a kind of science fiction — machines that can learn independently of humans. Meanwhile, researchers are helping today’s neural nets take baby steps in that direction.

“In my opinion, the future of AI is self-supervised learning,” said Yann LeCun, who is considered the father of convolutional neural nets, now used widely in computer vision and other systems. “The trend is to rely increasingly on unsupervised, self-supervised, weakly supervised, or multi-task learning, for which larger networks perform even better,” he wrote in a recent paper.

Generative adversarial networks (GANs) are showing promise as one technique to let systems make their own predictions. In a recent talk, LeCun showed examples of GANs used for designing fashionable clothes and guiding self-driving cars. He also pointed to work such as BERT, a pre-training technique using unlabeled data that Google recently made open-source.
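The core idea behind BERT-style pre-training is simple to sketch: hide a fraction of the tokens in unlabeled text and train the model to predict the originals from context. The sketch below shows only the data-masking step, not Google's implementation; the function name `mask_tokens` and the example sentence are assumptions, though the 15% mask rate matches the published BERT recipe.

```python
import random

def mask_tokens(tokens, mask_rate=0.15, mask_token="[MASK]", seed=42):
    """BERT-style masking: hide a fraction of tokens at random. The
    pre-training objective is to recover the originals from context,
    so no human labels are needed."""
    rng = random.Random(seed)
    inputs, targets = [], []
    for tok in tokens:
        if rng.random() < mask_rate:
            inputs.append(mask_token)
            targets.append(tok)    # the model must predict this token
        else:
            inputs.append(tok)
            targets.append(None)   # no loss computed at this position
    return inputs, targets

sentence = "in ai hardware is the tail and software is the dog".split()
masked, labels = mask_tokens(sentence)
```

Every masked position carries a target label, every unmasked position carries none, so the original sentence is always recoverable from the pair — exactly the property that lets raw, unlabeled text serve as training data.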

Such code requires big iron and lots of memory, and future algorithms will demand even larger models. Tomorrow’s neural nets will also be more dynamic and sparse, using new kinds of basic primitives such as dynamic, irregular graphs, LeCun said.

Long term, “one hope is that training a system to predict videos will allow it to discover much of the hidden regularities, geometry, and physics of the world … [The resulting predictive models could] be the centerpiece of intelligent systems … for applications such as robotic grasping and autonomous driving,” he added.

The near-term challenge is especially acute for engineers such as Jinook Song, who designs AI blocks for Samsung’s smartphones. He recently described a 5.5-mm² block in the latest 8-nm Exynos chip that hits performance of 6.937 TOPS when a neural net allows pruning of up to three-quarters of its weights.
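Pruning of the kind Song exploits typically means zeroing the smallest-magnitude weights so that hardware can skip those multiplications entirely. The sketch below shows plain magnitude pruning as a hypothetical illustration, not Samsung's method; the weight values are made up.

```python
def prune_weights(weights, sparsity=0.75):
    """Magnitude pruning: zero out the smallest-magnitude fraction of
    weights. An accelerator can then skip the zeroed multiply-adds,
    raising effective throughput."""
    k = int(len(weights) * sparsity)                 # how many to drop
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    drop = set(order[:k])                            # indices of smallest
    return [0.0 if i in drop else w for i, w in enumerate(weights)]

w = [0.9, -0.05, 0.4, 0.01, -0.7, 0.03, 0.2, -0.08]
pruned = prune_weights(w)   # 75% sparsity: only the two largest survive
```

With 75% sparsity, six of the eight example weights become zero, leaving only 0.9 and -0.7 — the ratio that matches the “up to three-quarters” figure quoted above.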

He’s not tapping the brakes. Asked what he most wants for a future generation, he said some kind of learning capability in the power budget of a handset.

Researchers are showing progress teaching neural nets a form of learning by having them fill in blanks in images. (Source: Yann LeCun, ISSCC)

Don’t reward your drone for staying still

Today, reinforcement learning has limited use but lots of buzz, thanks in part to the results that Google got using it to beat human experts at Go and other games. The technique will have a key role in future self-driving cars, said Wei Li of Intel.

“Reinforcement learning is like an agent that tries things and sees how they work in the real world, typically a simulation on a general-purpose CPU,” so it may require acceleration both for the agent on a custom chip and the CPU running the simulation, explained Dave Patterson, a veteran computer researcher now spending some time at Google.

Researchers at Georgia Tech recently ran reinforcement learning on low-end systems thanks to the use of time-domain coding. Articulating the right rewards for the system is one of the big challenges — potentially a whole new field of computer science, said Arijit Raychowdhury, who worked on the project.

For example, in one project, students carefully defined rewards to encourage a drone to maximize battery life. It did — by not moving. “There are so many parameters, it’s easy to get them wrong,” said Raychowdhury.
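The drone anecdote is a classic case of reward misspecification, and it can be reproduced in a toy simulation. The sketch below is entirely hypothetical (the one-dimensional "drone," the battery costs, and the reward functions are all invented for illustration); it is not the Georgia Tech students' setup.

```python
def episode(policy, reward_fn, steps=10):
    """Tiny 1-D 'drone' simulation: each step the agent either hovers
    (costs 1 battery unit) or flies forward (costs 3, gains 1 distance).
    Returns the total reward collected and the distance covered."""
    battery, distance, total = 100, 0, 0.0
    for _ in range(steps):
        move = policy(battery, distance)
        battery -= 3 if move else 1
        distance += 1 if move else 0
        total += reward_fn(battery, distance)
    return total, distance

stay = lambda battery, distance: False   # policy: never move
fly = lambda battery, distance: True     # policy: always move

# Misspecified reward: only the battery level is rewarded...
naive_reward = lambda battery, distance: battery
naive_stay, _ = episode(stay, naive_reward)
naive_fly, _ = episode(fly, naive_reward)
# ...so the highest-scoring policy is to sit still, as in the anecdote.

# One fix: reward distance covered, keeping battery as a small bonus.
better_reward = lambda battery, distance: distance + 0.1 * battery
better_stay, _ = episode(stay, better_reward)
better_fly, dist_flown = episode(fly, better_reward)
```

Under the naive reward, hovering outscores flying; under the corrected reward, the ranking flips. With real systems the parameter space is vastly larger, which is why, as Raychowdhury notes, getting the rewards right is so easy to get wrong.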

“Deep reinforcement learning was one of the hottest classes at Berkeley last year,” said Hoi-Jun Yoo, a professor at the Korea Advanced Institute of Science and Technology. “The technique looks promising but has many variations, and the algorithms are still in development, so it’s not clear how they may influence hardware.”


Researchers ran reinforcement learning on small embedded systems thanks to the use of time-domain processing. (Source: Georgia Tech, ISSCC)

Meanwhile, middleware is also moving. Today’s AI coders are “not like traditional programmers who pick a language and stick with it for 20 years,” said Patterson. “Apparently, every couple of years, researchers switch horses … the field is very exciting, with continuous algorithm improvements.”

The MPEG community is about to weigh in. This month, it is evaluating initial proposals responding to a call that it made last fall for ways to use MPEG to compress trained neural networks. Nokia was one of the active responders.
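The proposal details are not public, but one standard building block for compressing trained networks is uniform quantization: storing weights as small integers plus a scale factor rather than full-precision floats. The sketch below is a generic illustration of that idea, not any MPEG submission; the function names and example weights are assumptions.

```python
def quantize(weights, bits=8):
    """Uniform quantization: map float weights to signed integers plus
    one scale factor, shrinking a trained model for storage or
    transmission at the cost of small rounding errors."""
    max_abs = max(abs(x) for x in weights)
    scale = max_abs / (2 ** (bits - 1) - 1)      # e.g. 127 levels for 8 bits
    q = [round(x / scale) for x in weights]
    return q, scale

def dequantize(q, scale):
    """Recover approximate float weights from integers and the scale."""
    return [v * scale for v in q]

w = [0.5, -1.27, 0.003, 0.9]
q, s = quantize(w)           # four small ints plus one float
recon = dequantize(q, s)     # each weight recovered to within scale/2
```

Each reconstructed weight lands within half a quantization step of the original, while the integer representation needs a quarter of the bits of 32-bit floats — the kind of trade-off a compression standard would formalize.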

It’s early days, especially in embedded environments, which remain far behind even the early tools and methods that data center operators are starting to master. “Developing deep learning for embedded systems is not for the faint of heart,” said Rowen. “There are limits, bugs, mixes of tools needed to make everything work … and implementations are still incomplete and immature.”

The same is true for an emerging class of enterprise software offerings for building and running neural-net models, said the chief executive of one startup in the field. “This industry needs abstraction layers everywhere, but they don’t exist yet,” said Evan Sparks of Determined AI. “There’s a lack of standard file formats to export models between frameworks and protocols to build tools that work together — it’s a Wild West in tooling.”

It’s also a Gold Rush of opportunity. “My best proxy and lower bound for the market are Nvidia’s data center revenues at about $4 billion last year, up from about $200 million three years ago — almost all of it for deep learning,” said Sparks, who has customers in everything from semiconductors and genomics to waste management.

Today, the best neural-networking software “lives within the four walls of Facebook, Google, and Microsoft,” he said. “They are building more and better models than anyone and have high-grade software that’s only available internally, but my goal is to make that quality of software available to others.”
