GreenWaves tweaks architecture, uses FD-SOI to cut power by 5x
LONDON – The next generation of GreenWaves’ ultra-low power AI accelerator, GAP9, will consume one-fifth the power of its predecessor, GAP8, while handling algorithms 10 times larger. The new device will offer up to 50 GOPS at an overall power consumption of 50mW. The gains come from a combination of architectural improvements and a move to a state-of-the-art FD-SOI (fully depleted silicon on insulator) process technology.
Like the previous generation device, GAP9 is aimed at AI inferencing in systems at the very edge of the network, such as small, battery-powered IoT sensor nodes. As an example, GreenWaves’ figures have GAP9 running MobileNet V1 on 160 x 160 images with a channel scaling of 0.25 in just 12ms with a power consumption of 806 μW/frame/second.
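The quoted figures reduce to simple arithmetic. The sketch below is illustrative only and is not GreenWaves' own calculation: it assumes the energy per frame stays constant across frame rates and that idle power is negligible, and the function names and the 5 frames/s example rate are hypothetical.

```python
# Figures quoted in the article.
PEAK_OPS_PER_S = 50e9   # up to 50 GOPS
PEAK_POWER_W = 50e-3    # 50 mW overall power consumption
UW_PER_FPS = 806.0      # 806 microwatts per frame/second (MobileNet V1 example)
LATENCY_S = 12e-3       # 12 ms per inference


def ops_per_watt(ops_per_s, power_w):
    """Peak efficiency in operations per second per watt."""
    return ops_per_s / power_w


def energy_per_frame_uj(uw_per_fps):
    """uW per (frame/s) is dimensionally uJ per frame, so the number carries over."""
    return uw_per_fps


def average_power_uw(uw_per_fps, fps):
    """Average power at a given frame rate.

    ASSUMPTION: energy per frame is constant and idle power is ignored.
    """
    return uw_per_fps * fps


def max_fps(latency_s):
    """Upper bound on frame rate if inferences run back to back."""
    return 1.0 / latency_s


if __name__ == "__main__":
    print("Peak efficiency (TOPS/W):", ops_per_watt(PEAK_OPS_PER_S, PEAK_POWER_W) / 1e12)
    print("Energy per frame (uJ):", energy_per_frame_uj(UW_PER_FPS))
    print("Average power at 5 frames/s (uW):", average_power_uw(UW_PER_FPS, 5))
    print("Back-to-back frame rate limit (frames/s):", max_fps(LATENCY_S))
```

On these assumptions, 50 GOPS at 50mW works out to roughly 1 TOPS/W, and 806 μW/frame/second is the same as about 806 μJ of energy per inference.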
GreenWaves, based in Grenoble, France, has chosen GlobalFoundries’ 22nm FDX FD-SOI process to minimise the power consumption of what was already an ultra-low power architecture.
“For GAP9, we’ve tuned the GAP8 architecture using customer feedback on GAP8, but at the same time we’ve moved to a market-leading semiconductor process,” said Martin Croome, vice president of marketing at GreenWaves. “We are using the body biasing ability in FD-SOI to allow us to achieve even lower power consumption.”
GreenWaves has made several architectural advancements for GAP9.
One more RISC-V core has been added, bringing the total to 10. One core is used as a fabric controller as well as for low-intensity compute in certain modes. The other nine make up a computation cluster with a shared L1 data area. One core in this cluster (the new one) acts as a task group master, calculating memory movements and managing tasks on the other eight cores.
Internal RAM has been tripled to 1.6MB and memory bandwidth has been increased to 41.6 GB/s for L1 and 7.2 GB/s for L2.
“This [memory bandwidth] is now very significant for an MCU-class device,” Croome said.
Changes to the GAP9 architecture also include a much higher top frequency: GAP8 clocked in at 175MHz, while GAP9 will run at or close to 400MHz. New power states have also been added, including a “dozy” state in which data can be acquired while power consumption stays under 1 mW. In this state, the processor can run on a low dropout regulator (LDO), which can start up quickly. This brings GAP9’s time to first instruction down to just a few microseconds (GAP8 took around 700 µs while it waited for the DC-DC converter to stabilise, Croome said). This quick start-up capability is useful when capturing time-based signals such as speech.
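To put the start-up times in context for audio, the rough sketch below counts how many samples arrive before the first instruction executes. The 16 kHz sample rate and the 5 µs value standing in for "a few microseconds" are assumptions made for illustration, not figures from GreenWaves, and the function names are hypothetical.

```python
GAP8_WAKE_S = 700e-6      # around 700 microseconds, as quoted in the article
GAP9_WAKE_S = 5e-6        # ASSUMPTION: "a few microseconds" taken as 5 microseconds
SAMPLE_RATE_HZ = 16_000   # ASSUMPTION: a common speech sampling rate


def samples_elapsed(wake_time_s, sample_rate_hz):
    """Number of sample periods that pass before the first instruction runs."""
    return wake_time_s * sample_rate_hz


def clock_ratio(new_mhz, old_mhz):
    """Ratio between two top clock frequencies."""
    return new_mhz / old_mhz


if __name__ == "__main__":
    print("GAP8 samples before first instruction:",
          samples_elapsed(GAP8_WAKE_S, SAMPLE_RATE_HZ))
    print("GAP9 samples before first instruction:",
          samples_elapsed(GAP9_WAKE_S, SAMPLE_RATE_HZ))
    print("Clock ratio, 400MHz vs 175MHz:", clock_ratio(400, 175))
```

Under these assumptions, a 700 µs start-up spans about 11 sample periods at 16 kHz, while a start-up of a few microseconds fits well inside a single sample period.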
All ten cores are now capable of handling ‘transprecision’ floating point numbers: IEEE 16-bit and 32-bit floating point formats plus additional 8-bit and 16-bit formats, with support for vectorization. This capability can be used to lower the energy requirements of algorithms that require floating point. GAP9 also supports vectorized 4-bit and 2-bit operations for applications exploiting deep levels of quantization.
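The appeal of vectorized low-bit operations is that more values fit in each machine word. The sketch below illustrates the idea in plain Python, assuming a 32-bit word and signed two's complement lanes; it is not GAP9's instruction set or GreenWaves' SDK, and `pack`, `unpack` and `dot_packed` are hypothetical helper names.

```python
WORD_BITS = 32  # ASSUMPTION: 32-bit registers


def lanes(bits, word_bits=WORD_BITS):
    """How many values of the given width fit in one word."""
    return word_bits // bits


def pack(values, bits, word_bits=WORD_BITS):
    """Pack signed integers into one word, lowest lane first."""
    if len(values) != lanes(bits, word_bits):
        raise ValueError("wrong number of lanes")
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    mask = (1 << bits) - 1
    word = 0
    for i, v in enumerate(values):
        if not lo <= v <= hi:
            raise ValueError(f"{v} does not fit in {bits} signed bits")
        word |= (v & mask) << (i * bits)  # two's complement encoding per lane
    return word


def unpack(word, bits, word_bits=WORD_BITS):
    """Recover the signed lane values from a packed word."""
    mask = (1 << bits) - 1
    sign_bit = 1 << (bits - 1)
    out = []
    for i in range(lanes(bits, word_bits)):
        raw = (word >> (i * bits)) & mask
        out.append(raw - (1 << bits) if raw & sign_bit else raw)
    return out


def dot_packed(a_word, b_word, bits):
    """Dot product of two packed words: the kind of work done per
    instruction when hardware operates on every lane at once."""
    return sum(x * y for x, y in zip(unpack(a_word, bits), unpack(b_word, bits)))


if __name__ == "__main__":
    weights = [1, -2, 3, -4, 5, -6, 7, -8]   # eight 4-bit values in one word
    inputs = [1] * 8
    print("4-bit lanes per word:", lanes(4))
    print("2-bit lanes per word:", lanes(2))
    print("Packed dot product:", dot_packed(pack(weights, 4), pack(inputs, 4), 4))
```

With these assumptions, one 32-bit word holds eight 4-bit or sixteen 2-bit values, compared with four 8-bit values, which is where the throughput and energy benefit of deeper quantization comes from.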
Other new features include bi-directional multi-channel audio interfaces.
GAP9 is expected to reach mass production in 2021, with samples coming in the first half of 2020. Croome said pricing is expected to be at a 50% premium compared to GAP8. Given the differences in timing, power figures and price point, the company expects both products to find markets.