The second AI Hardware Summit, held this week here at the Computer History Museum, was significantly larger than last year’s inaugural event, a clear indication that the semiconductor industry is keenly interested in AI startups. But the show was not only about startups: established players including Intel and Nvidia were there as well.

While the show was well attended and several companies made interesting disclosures, the big news never quite materialized. In fact, several companies expected to attend did not show up. One company, Groq, dropped out at the very last minute. It surprised everyone at the show, since the startup was well-funded and its first disclosure was highly anticipated.

This year’s event included design support companies such as Synopsys and eSilicon. There was even a panel on system supporting technologies from Crossbar, Rambus, and Pure Storage. Microsoft and Facebook presented their approaches to hyper-scale data center AI.

The big no-show

A number of companies that had originally signed on to present or sponsor at the AI Hardware Summit pulled out of the show.

Of those, the most awkward was Groq, which canceled its presentation so late that its sponsorship was still prominently displayed on the stage banner. The reason the company gave to press and analysts who were scheduled for a pre-briefing on a major announcement was a “customer issue.” This last-minute change certainly drew speculation that the company had hit a snag. The press and analysts at the show had a number of theories, including problems with the first test chip, a customer unwilling to let the company talk about its solution, or possibly an acquisition offer. Groq was formed by a team of engineers that built the first Google TPU chip, and it is well-funded. Unfortunately, Groq was expected to be the big reveal of the show.

Groq logo and empty chair at the AI Hardware Summit (Photo: Tirias Research)

Opening Keynote: The future is accelerators

John Hennessy, Chairman of Alphabet, Inc. and former Stanford University President, delivered a keynote speech. He presented his view that the future of computing requires domain-specific architectures, a topic that fit well in a conference that focused on specialized AI processing. He made the point that with Dennard scaling ending and Moore’s Law slowing down, transistor power and costs are no longer heading in the right direction. There’s no free ride for future performance just from process developments.

For microarchitectures, performance-improving techniques such as speculative execution, while great for maximum performance, waste power, because wasted work is wasted energy. Other techniques such as caches are hitting diminishing returns, multicore processing is limited by Amdahl’s Law, and clock speeds seem to be reaching a dead end as well. Further, modern scripting languages such as Python, while great for programmer productivity, pay a penalty in execution efficiency.
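To illustrate the Amdahl’s Law limit on multicore scaling, here is a minimal Python sketch. The function name and the 95% parallel fraction are assumptions chosen for the example; they are not figures from the keynote.

```python
def amdahl_speedup(parallel_fraction: float, cores: int) -> float:
    """Upper bound on speedup when only part of a workload scales with cores.

    parallel_fraction: share of the run time that can be parallelized (0..1).
    cores: number of cores applied to the parallel part.
    """
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    if cores < 1:
        raise ValueError("cores must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    # The serial part is unchanged; only the parallel part shrinks with cores.
    return 1.0 / (serial_fraction + parallel_fraction / cores)


if __name__ == "__main__":
    # Hypothetical workload that is 95% parallelizable.
    for cores in (2, 8, 64, 1024):
        print(f"{cores:>5} cores -> {amdahl_speedup(0.95, cores):.2f}x")
```

With a 5% serial portion, the speedup can never reach 20x no matter how many cores are added, which is why adding cores alone stops paying off.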

Hennessy’s solution to the problem is to focus on domain-specific architectures, such as GPUs and TPUs, that will do some functions extremely well and that can be targeted efficiently by software. With domain-specific architectures, the amount of control logic can be reduced. The goal is to abstract the hardware while improving execution efficiency. Energy efficiency is the ultimate goal of any architecture at this point because that is the primary limitation to scaling.

Of the AI market, Hennessy believes that we don’t know what the final or perfect answer will be in terms of the architecture. He suggested that the experimentation will and should continue in new and different architectures. He hopes that hardware design becomes more like software — with fast prototyping, reuse, and abstraction, enabled by the development of simplified design tools.

John Hennessy delivers a keynote at the AI Hardware Summit (Photo: Tirias Research)

After that inspirational talk, Karl Freund of Moor Insights and Strategy brought the startups back to reality. Today, many inference workloads are still run on CPUs in the data center or in devices, and in the data center, Nvidia GPUs have the dominant share of training acceleration. And while specialized silicon may be good for specialized functions, CPUs, GPUs, and FPGAs are more fungible assets that can be used for other workloads.

Silicon or PowerPoint?

The program mostly broke down into the haves and the have-nots. The haves are shipping silicon; the have-nots are not, and some are still just shipping PowerPoint.

The companies in the former category include Habana, Intel, and Nvidia. Graphcore has been sampling for some time and showed its boards last year, but gave no further updates on performance. Cerebras, the hot topic at Hot Chips, announced design wins with two US Department of Energy labs: Argonne National Laboratory and Lawrence Livermore National Laboratory both announced multi-year partnerships with Cerebras.

Habana Gaudi rack server (Photo: Tirias Research)

Qualcomm said its Cloud AI 100 chip will sample this year, with production in 2020. There are not a lot of details available on the Qualcomm chip, but we know it will be one of the very few inference chips manufactured in 7nm. The design uses an array of AI processor tiles and memory tiles that the company says scales from mobile to data center. The chip will support the latest LPDDR DRAM for low power and is rated for 350 TOPS (8-bit integer values). Qualcomm is targeting automotive, 5G infrastructure, 5G edge, and data center inference. The company is leveraging its extensive experience in low-power inference from its Snapdragon smartphone processors.

Two companies focused on more generalized programmable logic, Flex Logix and SambaNova, also presented at the event. Flex Logix will be sampling its machine learning chip this year. For the second straight year, Kunle Olukotun of SambaNova presented, but there’s still no sign of the company releasing silicon any time soon. The company has a more generalized mission to create software-defined hardware and is developing a new spatial language and a reconfigurable data flow architecture.

Building a new brain

Several of the companies that presented at the AI Hardware Summit are taking unique approaches to AI. Those approaches include neuromorphic computing, computing in memory, analog computing, and even optical computing.

Mythic is performing analog computing in NAND memory. They have developed a unique solution that can store an eight-bit value in one NAND memory cell. The company says it also has a unique A/D and D/A converter technique that uses very low power. One of the criticisms of analog computing has been that the translation of the digital input to analog and the result back to digital consumes a tremendous amount of power. The company believes it has solved that problem. The challenges for most analog computing solutions include: scalability, reliability, calibration, and drift. In addition, the Mythic solution works exclusively on 8-bit integer values. The company’s advantage is being able to perform inference in only a few watts of power.
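To make the 8-bit integer restriction concrete, the sketch below shows one common way to approximate a floating-point dot product with 8-bit integers. It is a generic symmetric quantization scheme chosen for illustration, not a description of Mythic’s method; the function names and the per-vector scaling are assumptions.

```python
def quantize_int8(values, scale):
    """Round each float to the nearest multiple of `scale`, clamped to [-127, 127]."""
    return [max(-127, min(127, round(v / scale))) for v in values]


def int8_dot(weights, activations):
    """Approximate a float dot product using 8-bit integer operands."""
    # One scale per vector, chosen so the largest magnitude maps to 127.
    # An all-zero vector falls back to a scale of 1.0 to avoid dividing by zero.
    w_scale = max(abs(w) for w in weights) / 127 or 1.0
    a_scale = max(abs(a) for a in activations) / 127 or 1.0
    qw = quantize_int8(weights, w_scale)
    qa = quantize_int8(activations, a_scale)
    # Integer multiply-accumulate, then a single rescale back to float.
    acc = sum(w * a for w, a in zip(qw, qa))
    return acc * w_scale * a_scale


if __name__ == "__main__":
    weights = [0.5, -0.25, 0.1]
    activations = [1.0, 2.0, -3.0]
    exact = sum(w * a for w, a in zip(weights, activations))
    print(f"float: {exact:.4f}  int8 approximation: {int8_dot(weights, activations):.4f}")
```

The result is close to the floating-point answer but not identical, which is the accuracy trade-off a model accepts when it is restricted to 8-bit integer values.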

Rain Neuromorphics is building a chip that tries to model the brain more closely with a memristive material connected by nanowires. This is still a technology in the early research phase, but the goal is to build a device that is 100x better on speed and power. Tape-out is planned for 2020 and the first product will be a PCIe co-processor card.

Another brain-inspired design comes from Applied Brain Research (ABR), which uses efficient spiking neural networks. Its Nengo “brain” chip design is also targeting 100x lower power for inference, with samples planned for 2021 using GlobalFoundries FD-SOI with body bias compensation. GrAI Matter Labs (GrAI is pronounced "gray") has a fully digital neuromorphic processor.

While Naveen Rao, Intel’s corporate vice president and general manager of the Artificial Intelligence Products Group, spent most of his presentation talking about the ultimate goal of AI R&D, building generalized intelligence, he also took the time to show the latest chips that the company is sampling for machine learning training and inference. The big issues in developing generalized intelligence will take years to solve, but advanced models are already progressing at a rapid pace.

With the development of AI, we’re moving from processing vast amounts of data to developing usable information, and eventually to building knowledge. The really big impact will come from the move from information to knowledge, which is coming soon as model complexity is doubling every 3.5 months. AI designs will need to pull on all the levers (semiconductor technology, advanced packaging, improved software, more efficient architecture, and faster interconnects) to reach the ultimate goal.
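To put the 3.5-month doubling rate in perspective, this small Python sketch compounds it over longer periods. The function name is hypothetical, and the calculation assumes the doubling rate stays constant, which the talk did not claim.

```python
def growth_factor(months: float, doubling_period_months: float = 3.5) -> float:
    """Multiplicative growth after `months` at a constant doubling period."""
    if doubling_period_months <= 0:
        raise ValueError("doubling_period_months must be positive")
    return 2.0 ** (months / doubling_period_months)


if __name__ == "__main__":
    for months in (3.5, 12, 24):
        print(f"after {months:>4} months: {growth_factor(months):,.1f}x")
```

At that pace, model complexity grows by more than 10x in a single year, far faster than process technology alone can deliver, which is why every lever has to be pulled at once.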

Naveen Rao holding Spring Crest (Photo: Tirias Research)

The AI Hardware Summit offered a number of panel discussions, including a venture investment panel. The panel made several recommendations for startups, including that they need to do something that someone else can’t do or isn’t doing. The startups need to differentiate themselves from the established players, who could easily outspend them. Other recommendations included keeping the total cost of ownership low and making their devices easier to program for faster time-to-market.

The panel was asked if there are too many companies building AI chips. Given how many companies were at the event, and the multitude of those that didn’t make it, you might think so. But for an industry that could be this important, is there such a thing as "too much innovation"?

Another question was how many AI chip vendors there would be in five years. While the panelists predicted anywhere from 5 to 13 companies, we at Tirias Research believe the right answer is that all chip vendors will be AI vendors, in that AI processing will be integrated into most processors, from microcontrollers up to servers.

Lies, damn lies, and benchmarks?

Throughout the show, numerous vendor presentations and audience questions revolved around benchmarking. While the whole industry is still nascent, benchmarking is even more so. But, as Peter Drucker famously said, “If you can’t measure it, you can’t improve it.”

A lot of companies are still using older workloads that are less relevant to modern problems. To help solve this issue, a group called MLPerf was formed to develop reasonable benchmarks. David Kanter described this work in one session talk. Those involved include a mix of academia and industry players, from startups to established companies. It seems clear that the industry overall is aligning behind this benchmark. The training and inference benchmarks are still a work in progress; they will require more industry input and will evolve over time.

The target was on Nvidia’s back

Throughout the show, the number one target of vendor comparisons was Nvidia. Many considered it the big bad wolf of AI.

And for the second year in a row, Nvidia got the last word at the conference with the closing keynote address. Nvidia focused mostly on the challenges of bringing AI and machine learning knowledge to industries beyond the data sciences, as it works on developing vertical-market solutions. But there were no big reveals this year.

While the conference may have been light on news, it was heavy on people-to-people networking. The show expanded the time for breaks, which allowed more time for people to talk and to visit vendor booths in the Expo area. The show was bigger and longer than last year, and although not every presentation was cutting-edge, there were more than enough indications that next year’s show will probably be even bigger and that, hopefully, some of the PowerPoint will turn into actual silicon.