In our tech-dominated world, the term “AI” appears in discussions of just about every industry. Whether it’s automotive, cloud, social media, health care, or insurance, AI is having a major impact, and companies both big and small are making investments.
What’s talked about less, however, are the technologies making our current use of AI feasible and paving the way for growth in the future. After all, AI isn’t easy, and it’s taking increasingly large neural network models and datasets to solve the latest problems like natural-language processing.
Between 2012 and 2019, AI training capabilities grew by a factor of 300,000 as more complex problems were taken on. That’s a doubling of training capability every 3.4 months, an incredible growth rate that has demanded rapid innovation across many technologies. The sheer amount of digital data in the world is also rapidly increasing (doubling every two to three years, by some estimates), and in many cases, AI is the only way to make sense of it all in a timely fashion.
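To see how the two growth figures relate, the arithmetic can be sketched in a few lines of Python. This is only an illustration of the numbers quoted above; the function names are hypothetical and not taken from any source. A 300,000-fold increase corresponds to about 18 doublings, and 18 doublings at 3.4 months apiece take roughly 62 months, a span that fits within the 2012 to 2019 window.

```python
import math

def doublings(growth_factor: float) -> float:
    """Number of doublings needed to reach a given growth factor."""
    return math.log2(growth_factor)

def months_for_growth(growth_factor: float, doubling_months: float) -> float:
    """Months of sustained doubling needed to reach a growth factor."""
    return doublings(growth_factor) * doubling_months

# Figures quoted in the article: 300,000x growth, doubling every 3.4 months.
n = doublings(300_000)
months = months_for_growth(300_000, 3.4)
print(f"about {n:.1f} doublings over about {months:.0f} months")
```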
As the world continues to become more data-rich, and as infrastructure and services become more data-driven, storing and moving data is rapidly growing in importance. Behind the scenes, advancements in memory technologies like DDR and HBM, and new interconnect technologies like Compute Express Link (CXL), are paving the way for broader uses of AI in future computing systems by making data easier to store and move.
These advances will ultimately enable new opportunities, though each technology comes with its own set of challenges. With Moore’s Law slowing, they are becoming even more important, especially if the industry hopes to maintain the pace of advancement that we have become accustomed to.
Though the JEDEC DDR5 specification was initially released in July 2020, the technology is just now beginning to ramp up in the market. To address the needs of hyperscale data centers, DDR5 improves on its predecessor, DDR4, by doubling the data-transfer rate, increasing memory capacity by 4×, and lowering power consumption. DDR5 main memory will enable a new generation of server platforms essential to the advancement of AI and general-purpose computing in data centers.
To enable higher bandwidths and more capacity while maintaining operation within the desired power and thermal envelope, DDR5 DIMMs must be “smarter” and more capable memory modules. With the transition to DDR5, server RDIMMs carry an expanded chipset that incorporates an SPD hub and temperature sensors.
High-bandwidth memory (HBM), once a specialty memory technology, is becoming mainstream due to the demands of AI workloads and other compute-intensive applications. HBM can supply the tremendous memory bandwidth required to quickly and efficiently move the increasingly large amounts of data needed for AI, though it comes with added design and implementation complexities due to its 2.5D/3D architecture.
In January of this year, JEDEC published its HBM3 update to the HBM standard, ushering in a new level of performance. HBM3 can deliver 3.2 terabytes per second of memory bandwidth when using four DRAM stacks, and it provides better power and area efficiency than both previous generations of HBM and alternatives such as DDR memory.
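The four-stack figure can be reproduced with a short calculation. The sketch below assumes a per-pin data rate of 6.4 Gb/s and a 1,024-bit interface per stack; neither parameter is stated in this article, so treat them as assumptions, and the function name is hypothetical.

```python
# Assumed HBM3 parameters (not stated in the article).
HBM3_PIN_RATE_GBPS = 6.4      # data rate per pin, in Gb/s
HBM3_STACK_WIDTH_BITS = 1024  # interface width per DRAM stack, in bits

def stack_bandwidth_gbytes(pin_rate_gbps: float, width_bits: int) -> float:
    """Peak bandwidth of one stack in GB/s (8 bits per byte)."""
    return pin_rate_gbps * width_bits / 8

per_stack = stack_bandwidth_gbytes(HBM3_PIN_RATE_GBPS, HBM3_STACK_WIDTH_BITS)
four_stacks = 4 * per_stack
print(f"{per_stack:.1f} GB/s per stack, {four_stacks / 1000:.2f} TB/s for four stacks")
```

Under these assumptions, each stack peaks at about 819 GB/s, and four stacks land just above the 3.2 TB/s quoted above. These are peak rates; sustained bandwidth depends on the access pattern.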
GDDR memory has been a mainstay of the graphics industry for two decades, supplying the ever-increasing levels of bandwidth needed by GPUs and game consoles for more photorealistic rendering. While its performance and power efficiency are not as high as those of HBM, GDDR is built on DRAM and packaging technologies similar to those of DDR and follows a more familiar design and manufacturing flow, which reduces design complexity and makes it attractive for many types of AI applications.
The current version of the GDDR family, GDDR6, can deliver 64 gigabytes per second of memory bandwidth in a single DRAM. Each device’s narrow data interface, organized as two 16-bit channels, allows multiple GDDR6 DRAMs to be connected to a processor. Eight or more DRAMs are commonly connected to one processor, delivering 512 GB/s or more of memory bandwidth.
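As a rough sizing sketch, the per-device and aggregate GDDR6 figures can be checked, and the device count for a target bandwidth estimated, as follows. The 16 Gb/s per-pin rate and the 32-bit device interface (two 16-bit channels) are assumptions not spelled out in this article, and the helper names are hypothetical.

```python
import math

# Assumed GDDR6 parameters (not stated in the article).
GDDR6_PIN_RATE_GBPS = 16      # data rate per pin, in Gb/s
GDDR6_DEVICE_WIDTH_BITS = 32  # two 16-bit channels per device

def device_bandwidth_gbytes(pin_rate_gbps: float, width_bits: int) -> float:
    """Peak bandwidth of one DRAM device in GB/s."""
    return pin_rate_gbps * width_bits / 8

def devices_needed(target_gbytes: float, per_device_gbytes: float) -> int:
    """Smallest number of devices whose combined peak meets the target."""
    return math.ceil(target_gbytes / per_device_gbytes)

per_device = device_bandwidth_gbytes(GDDR6_PIN_RATE_GBPS, GDDR6_DEVICE_WIDTH_BITS)
print(per_device)                       # 64.0 GB/s per device
print(8 * per_device)                   # 512.0 GB/s for eight devices
print(devices_needed(512, per_device))  # 8
```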
COMPUTE EXPRESS LINK
CXL is a revolutionary step forward in interconnect technology that enables a host of new use cases for data centers, from memory expansion to memory pooling and, ultimately, fully disaggregated and composable computing architectures. With memory accounting for a large portion of the server bill of materials (BOM), disaggregation and composability with CXL interconnects can enable better utilization of memory resources and a lower total cost of ownership (TCO).
In addition, processor core counts continue to increase faster than memory systems can keep up, leading to a situation where the bandwidth and capacity available per core are in danger of falling over time. CXL memory expansion can provide more bandwidth and capacity to keep processor cores fed with data.
The most recent CXL specification, CXL 3.0, was released in August of this year. The specification introduces a number of improvements over the 2.0 spec, including fabric capabilities and management, improved memory sharing and pooling, enhanced coherency, and peer-to-peer communication. It also doubles the data rate to 64 gigatransfers per second, leveraging the PCI Express 6.0 physical layer without any additional latency.
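To put the doubled data rate in perspective, here is a minimal sketch of raw link bandwidth per direction. It assumes a 16-lane link and one bit per lane per transfer, and it ignores encoding and protocol overhead, so the results are upper bounds rather than delivered throughput. The lane count and the function name are illustrative assumptions, not figures from this article.

```python
def raw_link_gbytes(transfer_rate_gt: float, lanes: int) -> float:
    """Raw bandwidth per direction in GB/s, assuming 1 bit per lane per transfer."""
    return transfer_rate_gt * lanes / 8

LANES = 16  # assumed link width

previous = raw_link_gbytes(32, LANES)  # half the CXL 3.0 rate, per the article
cxl3 = raw_link_gbytes(64, LANES)      # CXL 3.0 at 64 GT/s
print(previous, cxl3, cxl3 / previous)  # 64.0 128.0 2.0
```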
While this list is by no means exhaustive, each of these technologies promises to enable new advancements and use cases for AI by significantly improving computing performance and efficiency, and each will be critical to the advancement of data centers in the coming years.
This article was originally published on EE Times.
Steven Woo is a fellow and distinguished inventor at Rambus.