Embedded Benchmark Needs Support

By Rick Merritt

A handful of mainly academic researchers aim to create EmBench, a free benchmark for embedded processors based on real-world applications.

A benchmark in the works for embedded processors aims to provide a free, open-source alternative to the well-established suite of EEMBC benchmarks created by paying members. A small group, mainly academics, is calling for support for EmBench, which it hopes to release as a 0.5 version before the end of the year.

EmBench aims to deliver a single performance score based on a suite of about 20 real-world applications, mainly sourced from an earlier effort, the Bristol/Embecosm Embedded Benchmark Suite (BEEBS). It also plans to report metrics for code size and latency but not floating-point performance or power consumption.

Figures will generally be reported as a geometric mean and geometric standard deviation relative to a reference platform. The group has tentatively picked PULP RI5CY, an open-source 32-bit RISC-V core from the Integrated Systems Lab at ETH Zürich in Switzerland, as its reference.
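
As an illustration only (the script and the numbers below are not from the EmBench documentation), the following short Python sketch shows how a single score might be derived from per-benchmark results relative to a reference platform using a geometric mean and geometric standard deviation:

    import math

    # Hypothetical per-benchmark speedups relative to the reference platform
    # (reference time / measured time); values are made up for illustration.
    relative_scores = [1.20, 0.95, 1.10, 1.35, 0.88]

    # Geometric mean: exponential of the mean of the logarithms.
    logs = [math.log(s) for s in relative_scores]
    mean_log = sum(logs) / len(logs)
    geo_mean = math.exp(mean_log)

    # Geometric standard deviation: exponential of the standard deviation of the logs.
    var_log = sum((x - mean_log) ** 2 for x in logs) / len(logs)
    geo_sd = math.exp(math.sqrt(var_log))

    print(f"score (geometric mean): {geo_mean:.3f}")
    print(f"geometric standard deviation: {geo_sd:.3f}")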

EmBench started as an idea from David A. Patterson, a Berkeley professor who helped launch the RISC-V initiative and the MLPerf benchmark for AI accelerators.

Since January, Patterson has worked on EmBench in monthly meetings with Palmer Dabbelt of SiFive, Cesare Garlati of Hex Five Security, G. S. Madhusudan of the Indian Institute of Technology Madras, Trevor Mudge of the University of Michigan, and Jeremy Bennett, head of an embedded software company that created BEEBS.

The idea emerged at a RISC-V workshop, when SiFive co-founder Yunsup Lee gave a presentation that used synthetic benchmarks to an audience that included Patterson. Patterson’s textbook on microprocessor design, co-written with John Hennessy, preaches against the use of synthetic benchmarks such as Dhrystone.

“Lee was embarrassed to do this in front of his former professor, and I was pissed off that people were still using synthetic benchmarks,” Patterson said in an interview. “My experience with MLPerf showed me [that] a small group could move fast, so I thought we could finally kill off Dhrystone and have a good benchmark for embedded computing. I talked to a few people at the workshop and they liked the idea. People are making changes to hardware to make Dhrystone run better. Everyone realized it was stupid, but no one took the initiative to fix it.”

EmBench also aims to be an alternative to the widely cited CoreMark, one of more than a dozen embedded benchmarks developed by the EEMBC trade group. EEMBC currently has about 20 active members, mainly embedded chip vendors such as Arm, Intel, and Renesas.

“EEMBC is fine for what they do with a paid membership model, but CoreMark is a synthetic benchmark — EmBench is an alternative model,” Patterson said.

The EEMBC website lists 17 benchmarks that it has developed since it was founded in 2000 or currently has in the works. Its most recent offerings include benchmarks for IoT, security, and advanced driver assistance.

For his part, EEMBC’s director left a door open for collaboration.

“My first reaction would be to see if we can work together [being that] EEMBC has support of the industry already,” said Peter Torelli, head of EEMBC. “I’m not opposed to permissively licensed benchmarks, so why not leverage what EEMBC has been running for 20 years instead of reinventing the wheel?”

An analysis of widely used processor benchmarks. (Source: David A. Patterson)

EmBench details some financial, technical plans

Patterson will detail EmBench for the first time at a RISC-V workshop this week in Zurich. He will use the talk to start recruiting more volunteers to finish a version 0.5 and drive the effort forward.

Financially, “it’s all done with volunteers to avoid creating an organization with a director and staff — I think we can be a committee of an existing organization” such as the Free and Open Source Silicon Foundation, Patterson said.

On the technical side, the group expects to run a small program written in the assembly language of the target platform to measure its context-switching latency. The metric still needs volunteers to design and document how to set up and run the interrupt measurements.

The group may define a separate floating-point benchmark in the future. So far, it believes that it does not need a metric for power consumption.

In a document detailing EmBench, Patterson noted that SPEC includes “power in some of its benchmarks, but it comes with a 33-page manual on how to fairly set up and measure power, including restrictions on altitude and room temperature. It will take a while to decide what of that applies to IoT.”

In addition, variations in SoCs make it difficult to directly compare their power use. Also, “energy efficiency is often highly correlated with performance, so even if we did all the work to benchmark IoT power, the results might not be enlightening,” he wrote.

Engineers interested in participating in EmBench can email the group at info@embench.org.

“I’d be surprised if this doesn’t become a standard way to benchmark embedded computers in a couple of years,” he said. “My experience is that if there’s a need, and the software is easy to port and freely available, it’s hard to stop. And there’s a thirst for something beyond CoreMark and Dhrystone, so I think it will take off.”
