Hyperscalers want wholesale architectural changes without paying for the capital equipment needed.
There’s never been more pressure on memory to meet the demands of new applications: everything from edge computing and the Internet of Things (IoT) to ever-smarter phones and smart cars. There’s also artificial intelligence (AI) and machine learning, both of which are becoming a big part of next-generation platforms being developed by the major hyperscale players, the Googles, Facebooks and Amazons of the world.
All of them are expecting a great deal of innovation from the broad electronics industry and the memory makers, whether it’s further improvements to incumbent memories such as DRAM and NAND flash or making emerging memories that incorporate novel materials commercially viable as part of memory devices for new computing architectures. But despite their deep pockets, it’s unlikely any of the companies will ever invest in manufacturing equipment to make their own memory devices, and they’re not interested in paying a premium price. If DRAM still does the job, they’re not going to pay five dollars more per device for an emerging memory because at this scale, it adds up quickly.
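A back-of-the-envelope sketch shows how quickly a small per-device premium compounds at hyperscale. Only the five-dollar premium comes from the text above; the fleet size and the number of memory devices per server are hypothetical figures chosen purely for illustration.

```python
def fleet_premium(premium_per_device: float, devices_per_server: int, servers: int) -> float:
    """Total extra spend when every memory device in the fleet carries a price premium."""
    return premium_per_device * devices_per_server * servers


# Only the $5 premium is taken from the article; the other inputs are assumptions.
PREMIUM_USD = 5.0
DEVICES_PER_SERVER = 16   # hypothetical number of memory devices per server
SERVERS = 1_000_000       # hypothetical fleet size

extra = fleet_premium(PREMIUM_USD, DEVICES_PER_SERVER, SERVERS)
print(f"Extra spend across the fleet: ${extra:,.0f}")
```

Under these assumed inputs the premium comes to $80 million, which illustrates why buyers at this scale resist even modest per-device increases.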
Jim Handy, principal analyst with Objective Analysis, said the clout hyperscalers have today is unprecedented; the closest historical analogy he could think of was the “enormous” buying power Apple had 15 to 20 years ago. However, the company was only looking for a minor change to a conventional computing architecture — one pin changed, for example — and expected that change for no extra charge. “They were more into taking existing computer architectures and then delivering them in a prettier way or more friendly way to their customers.”
What hyperscalers are looking for are wholesale architectural changes, said Handy. “Their motivation is actually very different because they look at what it costs them to buy something, and then they also look at how much power it’s going to take to run it.” The hyperscalers expect the industry to solve that problem — they’re not going to go out and cover the costs of new capital equipment.
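The two-part evaluation Handy describes, purchase price plus the power needed to run a part, can be sketched as a simple lifetime-cost comparison. Every number below (prices, wattages, service life, electricity rate) is a hypothetical placeholder, not a figure from Handy or from any vendor, and the model ignores cooling and other overheads.

```python
HOURS_PER_YEAR = 24 * 365


def total_cost(purchase_usd: float, power_watts: float, years: float, usd_per_kwh: float) -> float:
    """Purchase price plus the cost of the energy consumed over the service life."""
    energy_kwh = power_watts / 1000 * HOURS_PER_YEAR * years
    return purchase_usd + energy_kwh * usd_per_kwh


# All inputs are illustrative assumptions.
incumbent = total_cost(purchase_usd=100.0, power_watts=5.0, years=5, usd_per_kwh=0.10)
emerging = total_cost(purchase_usd=105.0, power_watts=3.0, years=5, usd_per_kwh=0.10)

print(f"Incumbent memory: ${incumbent:.2f}")
print(f"Emerging memory:  ${emerging:.2f}")
print("Emerging memory is cheaper over its life" if emerging < incumbent
      else "Incumbent memory is cheaper over its life")
```

With these made-up inputs the pricier part wins on lifetime cost because it draws less power. Change the assumptions and the result can flip, which is why both terms enter the buying decision.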
HBM, which got a specification update by JEDEC, is seen as meeting the demands of AI applications but the manufacturing costs are high. (Source: JEDEC)
Neither are they going to start building their own memory devices, particularly DRAM, said Stephen Pawlowski, Micron Technology’s vice president of advanced computing solutions. Aside from its volatility, there’s nothing available that has the reliability, speed, and endurance of DRAM, he said. NAND and some of the newer storage class memories, meanwhile, are complicated both from a materials perspective and in terms of understanding how they behave under temperature, multiple cycles, and different workloads, so a memory maker such as Micron is in no danger of becoming irrelevant. “It takes a lot of creativity and ingenuity to use those devices,” he said. “When it comes down to what do we need to do for the memory and storage subsystem in terms of improving capacity and performance efficiency, the collaboration seems to be pretty good.”
Pawlowski sees the hyperscalers as having taken over as the canary in the coal mine from the OEMs who played the role in the mid-2000s. OEMs drove innovation around moving storage closer to the CPU, while the hyperscalers are trying to push network bandwidth in a way that nobody has before, and that means a lot of power being consumed to move data around, he said. “When we look at how we’re going to improve the efficiency of our data centers, we really need to make sure we can get the latency of the transferred information between the compute system and the memory storage sub-system down as much as possible.”
Martin Mason, GlobalFoundries’ senior director of embedded memory, said there’s interest in both MRAM and ReRAM being deployed in the data center’s compute-intensive applications, including mainstream AI processing being done in a server farm, where a key challenge is power and memory bandwidth. “You are starting to see the emergence of novel memory technologies being deployed in that space. I don’t think any of them has really been truly commercially exploited at this point, but both MRAM and RRAM are being looked at as various high-density memory technologies to replace SRAM in those applications.”
This trend reflects the evolution of hyperscalers over the past five years, said Mason. “They’ve migrated from being predominantly software-based companies to increasingly becoming more vertically integrated in both the solutions that they provide in terms of the enterprise infrastructure, and now, the silicon to go into those solutions.” They see silicon components helping them in two different ways, he said. The first is fundamental differentiation from all the commoditized enterprise hardware, and the second is economic. By vertically integrating and taking their designs directly to the foundry, they end up with a solution that scales faster and more cost-effectively.
A good example of these hyperscalers going beyond just software is Google’s Tensor Processing Unit (TPU) for its own AI workloads, a technology normally expected from a company like Intel, said Mahendra Pakala, managing director, Memory Group, Advanced Process Technology Development at Applied Materials. Right now, these companies are just using what’s available to realize their AI accelerators, “but once you start designing and building your accelerators, you do see shortcomings.” Because of their requirements, he believes accelerators will drive the adoption of emerging memories as well as the overall memory roadmap.
Applied Materials has evolved its Endura platform from a single process system to an integrated process system as part of its materials engineering foundation for emerging memories. (Source: Applied Materials)
One established memory getting a lot of attention for AI applications is High Bandwidth Memory 2.0 (HBM2), which has traditionally been used for high-end graphics and high-performance computing. Pakala noted that while it has matured from a manufacturing perspective, it’s still relatively expensive. When it comes to emerging memories, PCRAM, in the form of 3D XPoint, has seen some commercial adoption, as has MRAM, he said. “We do see both of them maturing and the name of the game is reducing the cost per bit. We do see a pathway to reduce the price.”
Ultimately, Applied sees materials engineering as the foundation for moving forward so that PCRAM, ReRAM, and MRAM can be cost-effectively manufactured to meet emerging use cases, including AI. Its latest Endura platforms, for example, focus on enabling the novel materials that are key to these new memories to be deposited with atomic-level precision. In the case of MRAM, which is seen as an excellent candidate for storing AI algorithms, Applied just announced a 300-millimeter MRAM system for high-volume manufacturing, made up of nine unique wafer processing chambers all integrated in high-vacuum conditions and capable of individually depositing up to five different materials per chamber.
As much as the foundries are exploring how they can work with hyperscalers on devices based on emerging memories, Handy doesn’t see hyperscale applications significantly driving demand for them, because hyperscalers would rather pay less for a complicated setup made of DRAM and flash. GlobalFoundries’ Mason also sees a pragmatic camp, which is about making what’s available work today and getting the best possible incremental solution. But there’s also a disruptive and “knock-it-out-of-the-park type mentality,” where if the price is right there would be a willingness to invest in the development of something that is truly disruptive in the industry, he said. “That’s how I think some of them see major breakthroughs happening.”
Handy said these companies using memories would like them to continue to follow Moore’s Law, and that sets the boundaries going forward. “There needs to be continued capital spending by the memory companies, but if the capital spending becomes too great, if they try moving too fast, then it pushes up the costs rather than pushing them down.”