The future of multi-threading
"Threads are dead," asserted Gary Smith, founder and chief analyst for Gary Smith EDA. "It is a short-term solution to a long-term problem."
At the 45nm node, more and more designs reach and exceed the 100 million-gate mark. These designs break current IC CAD tools, forcing EDA vendors to develop products capable of parallel processing.
Until now, parallel processing has relied on threading. Threading, however, tends to show its limits at four processors, and EDA vendors may have to come up with new ways of attacking the problem.
"Threads will only give you two or three years," Smith said. "Library- or model-based concurrency is the best midterm approach."
Looking into the future
EDA vendors interviewed at the 2008 Design Automation Conference (DAC) painted a more nuanced picture of the future of multi-threading.
"We have not seen the limits to multi-threading in the timing-analysis area," said Graham Bell, marketing counsel for EDA start-up Extreme DA Corp. "We see good scaling for three or four process threads. We get to see difficulties beyond that, but they are not dramatic."
With Extreme DA's GoldTime, a multi-threaded static and statistical timing analyser, the company has applied a fine-grained multi-threading technique based on ThreadWave, a netlist-partitioning algorithm. "Because of our unique architecture, we have a small memory footprint," Bell said. "We have not seen the end of taking advantage of multi-threading."
For applications with fine-grained parallelism, multi-threading is one of the most generic ways to exploit multiple cores, said Luc Burgun, CEO of Emulation and Verification Engineering SA. "On the other hand, multi-thread-based programs can also be quite difficult to debug." That's because they "break the sequential nature of the software execution, and you may easily end up having nondeterministic behaviour and a lot of headaches."
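Burgun's warning about nondeterminism can be seen in a minimal Python sketch (illustrative only, not any vendor's code): several threads increment a shared counter, and without a lock the read-modify-write on the counter can interleave, so updates may be silently lost on some runs. With the lock, the result is always exact.

```python
import threading

def count(n_threads=4, iters=100_000, use_lock=True):
    """Increment a shared counter from several threads.

    Without the lock, `total += 1` is a read-modify-write that can
    interleave across threads, losing updates nondeterministically.
    """
    total = 0
    lock = threading.Lock()

    def worker():
        nonlocal total
        for _ in range(iters):
            if use_lock:
                with lock:
                    total += 1
            else:
                total += 1  # unsynchronized: a data race

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

# With the lock, the answer is always exact:
print(count(use_lock=True))  # 400000
```

The unlocked variant may happen to return the right number on a given interpreter and a given run, which is precisely why such bugs are, as Burgun puts it, a lot of headaches.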
According to Burgun, multi-process remains the "easiest and safest way to exploit multi-core." He said he expects some interesting initiatives to arise from parallel-computing experts to facilitate multi-core programming. "From that standpoint, CUDA [the Nvidia-developed Compute Unified Device Architecture] looks very promising," Burgun said.
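The multi-process style Burgun describes can be sketched with Python's standard `multiprocessing` module (the worker function is a hypothetical stand-in for an independent unit of work): each task runs in its own process with its own address space, so there is no shared mutable state to race on.

```python
from multiprocessing import Pool

def simulate(seed):
    # Stand-in for an independent unit of work (e.g. one corner of a
    # simulation sweep); each call runs in a separate process, so no
    # locking is needed.
    return seed * seed

if __name__ == "__main__":
    with Pool(processes=4) as pool:
        # map() farms the inputs out to worker processes and
        # gathers the results back in order.
        results = pool.map(simulate, range(8))
    print(results)  # [0, 1, 4, 9, 16, 25, 36, 49]
```

The trade-off is communication cost: processes exchange data by serialising it, which is why this model suits coarse-grained, independent jobs better than tightly coupled ones.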
Simon Davidmann, president and CEO of Imperas Ltd, delivered a similar message. "Multithreading is not the best way to exploit multi-core resources," he said. "For some areas, it might be OK, but in terms of simulation, it is not."
Multithreading is not the only trick up Synopsys Inc.'s sleeve, said Steve Smith, senior director of product platform marketing. "Within each tool, there are different algorithms. When looking at each tool, we profile the product to see the largest benefits to multi-threading," he said. "Multithreading is not always applicable. If not, we do partitioning."
As chipmakers move to eight and 16 cores, a hybrid approach will be needed, asserted Smith, suggesting a combination of multi-threading and partitioning.
To illustrate the point, Smith cited a host of Synopsys' multi-core solutions in the area of multi-threading. "HSpice has been broadly used by our customers. This is typically the tool you do not want to start from scratch," he said.
HSpice multi-threading has come in stages, noted Smith. "Last year, we multi-threaded the model-evaluation piece, and it gave a good speed-up. Then, in March, we introduced the HSpice multi-threaded matrix solver. We want to make sure our customers are not impacted, and we do it [multi-threading] piece by piece," he said.
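Model evaluation is a natural first candidate: in a SPICE-class simulator, each device's model is evaluated from its own terminal voltages, so the loop over devices is embarrassingly parallel, while the matrix solve has cross-row dependencies and is harder to thread. A minimal sketch of the pattern (illustrative only, not HSpice's code):

```python
from concurrent.futures import ThreadPoolExecutor

def evaluate_device(device):
    # Stand-in for a device-model evaluation: each device's branch
    # current depends only on its own parameters and terminal
    # voltages, so evaluations are independent of one another.
    g, v = device
    return g * v  # e.g. a linear conductance's branch current

def evaluate_all(devices, n_workers=4):
    # Farm the device list out to worker threads; results come back
    # in order and can then be stamped into the matrix serially.
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        return list(pool.map(evaluate_device, devices))

print(evaluate_all([(1.0, 2.0), (0.5, 4.0)]))  # [2.0, 2.0]
```

In CPython the global interpreter lock limits the speed-up for pure-Python work; a production simulator does this in native code, but the structure of the parallel loop is the same.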
Another trend Synopsys is investigating, Smith continued, is pipelining. This technique—an enterprise-level activity, since it demands the involvement of IT—collapses multiple tasks, such as optical proximity correction and mask-data preparation, into a single pipeline.
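The pipelining idea can be sketched with two stages connected by a bounded queue (the stage functions below are hypothetical stand-ins for steps such as optical proximity correction and mask-data preparation): while item i is still in the second stage, item i+1 is already being processed by the first.

```python
import queue
import threading

SENTINEL = object()  # marks the end of the stream

def pipeline(items, stage1, stage2):
    """Run stage1 and stage2 concurrently, linked by a bounded queue."""
    q = queue.Queue(maxsize=4)
    out = []

    def producer():
        for item in items:
            q.put(stage1(item))   # first stage
        q.put(SENTINEL)

    def consumer():
        while True:
            item = q.get()
            if item is SENTINEL:
                break
            out.append(stage2(item))  # second stage, overlapped

    t1 = threading.Thread(target=producer)
    t2 = threading.Thread(target=consumer)
    t1.start(); t2.start()
    t1.join(); t2.join()
    return out

# Hypothetical stand-ins for the two flow steps:
print(pipeline(range(5), lambda x: x + 1, lambda x: x * 10))
# [10, 20, 30, 40, 50]
```

The bounded queue also provides back-pressure: a fast first stage cannot run arbitrarily far ahead of a slow second one, which matters when the stages are whole enterprise flow steps rather than function calls.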
Last year, Magma Design Automation Inc. unveiled an alternative to multi-threading, using a streaming-data-flow-based architecture for its Quartz-DRC design rule checker. Multi-threading provides a coarser-grained parallel-processing capability than Magma's data-flow architecture, said Thomas Kutzschebauch, senior director of product engineering at Magma.
Magma's multi-core strategy is focused on massive parallelism, Anirudh Devgan, VP and general manager of the custom design business unit, said at a DAC panel session on reinventing EDA with "manycore" processors.
"Four CPU boxes are just the beginning of a trend, and EDA software has to work on large CPUs with more than 32 cores," he said. "Parallelism offers an opportunity to redefine EDA productivity and value. But just parallelism is not enough, since parallelizing an inefficient algorithm is a waste of hardware."
Devgan's conclusion was that tools have to be productive, integrated and massively parallel.
Seeing beyond C
As he unveiled "Trends and What's Hot at DAC," Gary Smith expressed doubts about C as the ultimate language for multi-core programming. He cited the identification of a new embedded-software language as one of the top 10 issues facing the industry this year, and asserted that "a concurrent language will have to be in place by 2015."
EDA executives did not debate the point. "We will change language over time," stated Joachim Kunkel, VP and general manager of the solutions group at Synopsys. "We are likely to see a new language appear, but it takes time. It is more an educational thing."
On the software side, meanwhile, reworking the legacy code is a big issue, and writing new code for multi-core platforms is just as difficult. Nonetheless, Davidmann held that "the biggest challenge is not writing, reworking or porting code, but verifying that the code works correctly, and when it doesn't, figuring out how to fix it. Parallel processing exponentially increases the opportunities for failure."
Traditionally, Davidmann said, software developers think sequentially. Now, that has to change. Chip design teams have been writing parallel HDL for 20 years, so it's doable—though it will take much effort and new tool generations to assist software teams in this task.
With single-processor platforms and serial code, functional verification meant running real data and tests directed to specific pieces of functionality, Davidmann said. "Debug worked as a single window within a GNU project debugger."
But with parallel processing, "running data and directed tests to reduce bugs does not provide sufficient coverage of the code," he said. "New tools for debug, verification and analysis are needed to enable effective production of software code."
Davidmann said Imperas is announcing products for verification, debug and analysis of embedded software for heterogeneous multi-core platforms. "These tools have been designed to help software development teams deliver better-quality code in a shorter period of time," he said.
To simplify the software development process and help with the legacy code, Burgun said customers could validate their software running on the RTL design emulated in EVE's ZeBu. It behaves as a fast, cycle-accurate model of the hardware design.
For instance, he continued, some EVE customers can run their firmware and software six months prior to tape-out. They can check the porting of the legacy code on the new hardware very early and trace integration bugs all the way to the source, whether in software or in hardware. When the engineering samples come back from the fab, 99 per cent of the software is already validated and up and running.
Thus, "ZeBu minimises the number of re-spins for the chip and drastically reduces the bring-up time for the software," Burgun said.
- Anne-Francoise Pele