A Google executive wants optimised processors to reduce latency in context switching and other operations key to the tech giant's actual workloads.
Moore’s law is not keeping pace with the growing needs of the still-young market for cloud services, which is why a Google manager is calling for a suite of innovations in processors, memories, interconnects and packaging.
“The slowdown in Moore’s law and the growth of cloud services has brought us to an inflection point,” Prasad Sabada, a senior director of operations who oversees sourcing for Google’s data centre hardware, told executives at the annual Industry Strategy Symposium in California. “The game is changing again and we need the industry to respond in a meaningful way.”
Specifically, he called for processors optimised to reduce latency in context switching and other operations key to Google’s actual workloads. “We’ve seen many processors optimised for Spec [a synthetic benchmark], but at Google, our workloads differ significantly from Spec.”
Google also wants memory chips with lower latency. “We can get as much bang for the buck improving memory latency as processor performance,” said Sabada, pointing to promising work on new memory architectures.
Nearly a year ago, rival Facebook came out in support of Intel’s 3D XPoint memories, which promise improvements over today’s NAND flash. Intel started limited sampling of the chips late last year.
In interconnects, today’s typical “processor bus has a lot of overhead accessing I/O and accelerator devices” and is not suited to emerging memory architectures, Sabada said. In addition, optical interfaces such as silicon photonics are needed to link servers in the data centre.
Sabada called out IBM’s OpenCAPI interface as one effort Google supports. He did not mention two separate efforts launched last year, CCIX and Gen-Z, for open interfaces for accelerators and storage-class memories, respectively.
Google seeks lower-cost 2.5D chip stacks
In packaging, the move towards 2.5D chip stacks that put logic, memory, digital and analog die on a shared substrate looks “exciting” as “a cool way to have heterogeneous silicon.” However, it “doesn’t have right yields or costs for volume deployments,” Sabada said.
Figure 1: Processor performance is flattening out, said Google, citing Stanford figures. (Source: Google)
The chief architect of AMD’s graphics division recently expressed similar frustration in trying to bring the chip stacks to mainstream markets.
Sabada asked chip executives to speed up innovation in all of these areas. But he recognised the increasing complexity and cost of designing and making advanced chips.
“We have hit a power wall; frequency advances are not what we are used to and, essentially, we’ve seen a cap on single-core performance,” he said. The challenges have driven a move to multiprocessors that “can be a challenge in a cloud environment.”
Sabada pointed to Google’s Tensor Processing Unit (TPU), announced last year, as the wave of the future. “We have entered the era of accelerators … and the TPU is just one example of the kind of thing you can see going forward,” he said.
“Machine learning will be a key driver for cloud computing; it’s a powerful use case for the way we go about acquiring intelligence … and a capability [used] across many [Google cloud] products,” he added.
This article first appeared on EE Times U.S.