An interesting application area that produces large quantities of data is the Electronic Design Automation (EDA) industry. At the present rate, the data generated by EDA tools doubles every year, but not all EDA data is equally organised.
The process of designing an electronic chip is based on creating an accurate model of the chip's architecture, behaviour, and functionality. Broadly speaking, the process consists of two stages or phases: front-end and back-end.
During the front-end design phase, engineers create a chip design by compiling source files into a model. The chip design model is verified by scheduling and running simulation jobs in a large compute grid.
The front-end phase generates an input/output (I/O)-intensive workload: EDA applications read and compile millions of small source files to build and simulate a chip design. Because so many of these jobs run in parallel, the storage system must sustain high levels of concurrency, and the combined requests arrive as a random I/O pattern.
During the back-end design and verification phase, the data access pattern becomes more sequential. The back-end workload tends to consist of fewer jobs, each with a sequential I/O pattern, that run for longer periods of time. Together, the jobs involved in a chip's design phases can produce terabytes of output. Even though this output is often considered working space, the data still requires the highest tier of storage for performance.
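The contrast between the two access patterns can be illustrated with a small simulation. The following is a minimal sketch, not a model of any real EDA tool: the function names, the job counts, and the 4 KiB block size are all assumptions chosen for illustration. It builds one request trace for a single streaming job and one for many interleaved jobs, then measures how often a request starts exactly where the previous one ended.

```python
import random


def sequential_fraction(offsets, block_size):
    """Fraction of requests that start exactly where the previous one ended."""
    if len(offsets) < 2:
        return 1.0
    hits = sum(
        1 for prev, cur in zip(offsets, offsets[1:]) if cur == prev + block_size
    )
    return hits / (len(offsets) - 1)


def backend_trace(num_blocks, block_size):
    """Hypothetical back-end job: one long stream through a large file."""
    return [i * block_size for i in range(num_blocks)]


def frontend_trace(num_jobs, blocks_per_job, block_size, seed=0):
    """Hypothetical front-end load: many jobs, each reading its own region.

    Every job is sequential on its own, but the storage system sees the
    requests of all jobs interleaved in an arbitrary order.
    """
    rng = random.Random(seed)
    cursors = [job * blocks_per_job * block_size for job in range(num_jobs)]
    remaining = [blocks_per_job] * num_jobs
    active = list(range(num_jobs))
    trace = []
    while active:
        job = rng.choice(active)  # whichever job issues the next request
        trace.append(cursors[job])
        cursors[job] += block_size
        remaining[job] -= 1
        if remaining[job] == 0:
            active.remove(job)
    return trace


if __name__ == "__main__":
    block = 4096  # assumed block size in bytes
    back = backend_trace(1024, block)
    front = frontend_trace(num_jobs=64, blocks_per_job=16, block_size=block)
    print("back-end sequential fraction: ", sequential_fraction(back, block))
    print("front-end sequential fraction:", sequential_fraction(front, block))
```

Both traces touch exactly the same blocks; only the order differs. That is the sense in which concurrency alone, under these assumptions, turns individually sequential jobs into a random pattern at the storage layer.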
Within the storage system, EDA workflows tend to store a large number of files in a single directory (typically one per design phase) within a deep directory structure. Performance-sensitive project directories, both scratch and non-scratch, dominate the file system.
Directories contain source code trees, front-end register transfer level (RTL) files that define logic in a Hardware Description Language (HDL), binary compiled files after synthesis against foundry libraries, and the output of functional verifications and other simulations. This poses interesting challenges to the vendors of the data storage devices that EDA vendors rely upon, as we will discuss in a future column.
Figure 2: Design data generated at the front-end has different structures than does data generated by back-end EDA tools (Source: Dell EMC)
Figure 3: Estimated storage capacity requirements by EDA tools for the entire RTL-to-GDSII flow per chip design versus technology process node (Source: Dell EMC)
At the time of the 19th General Conference on Weights and Measures in 1991, a metric prefix for 10 to the power of 24 was considered large enough to cover virtually all known physical measures for many years to come.
Approximately 20 years later, in 2010, digital data storage hit the "Zetta" prefix, with only one prefix, the "Yotta," left available. Maybe the time is approaching for another conference to further expand the available prefixes.
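To put the remaining headroom in perspective, the gap between the two prefixes can be expressed as a number of doublings. The short calculation below is only a sketch: the helper name is hypothetical, and applying an annual doubling rate to all stored data is an assumption borrowed from the EDA figure quoted earlier, not a measured trend.

```python
import math

ZETTA = 10**21  # SI prefix zetta
YOTTA = 10**24  # SI prefix yotta


def doublings_between(start, end):
    """Number of doublings needed to grow from start to end."""
    return math.log2(end / start)


if __name__ == "__main__":
    n = doublings_between(ZETTA, YOTTA)
    # A factor of 1,000 is just under ten doublings.
    print(f"doublings from zetta to yotta: {n:.2f}")
    print(f"whole doublings needed: {math.ceil(n)}")
```

Under that assumed rate of one doubling per year, moving from the zetta scale to the yotta scale would take roughly a decade.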
This article first appeared on EE Times U.S.