The Japanese firm brings its extreme quantization technology to hardware IP...
LeapMind (Tokyo, Japan) announced its entry into the processor IP business with Efficiera, an ultra-low-power AI inference accelerator IP product.
Efficiera is optimized for models that have been heavily quantized using LeapMind’s ‘extremely low-bit quantization’ software techniques. It is designed for convolutional neural networks (CNNs), the type of network typically used for image processing and analysis tasks today.
“This is the company’s first hardware IP product, but it builds on our core technology, extreme quantization, which operates at both the software and hardware-IP levels, combining networks optimized for practical applications with a dedicated compiler,” a LeapMind spokesperson told EE Times.
An Efficiera test chip, built with Taiwanese manufacturer Alchip on TSMC’s 12nm 6T SVt process, achieved 6.55 TOPS at 800 MHz with a power efficiency of 14.8 TOPS/W. The Efficiera block occupied 0.442 mm² of silicon area.
The company’s secret sauce is its ‘extremely low-bit quantization’ technology, which quantizes models to 1- or 2-bit precision.
Quantization, the process of reducing the number of bits used for parameters in a neural network model, can vastly improve performance as it reduces memory bandwidth and increases computational efficiency. However, reducing all numbers to 1, 2 or 4 bits comes at the price of overall prediction accuracy. Doing extreme quantization in a practical way therefore requires retraining the model from scratch in a way designed to optimize prediction accuracy, usually by increasing the size of the network to compensate. Done right, this software operation can save significant compute and power in the application.
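LeapMind has not published the details of its quantization scheme, but the general idea of 1- and 2-bit weight quantization can be illustrated with a minimal NumPy sketch. The sign-and-scale approach below follows published binary-network methods and is purely illustrative; the function name and level choices are assumptions, not LeapMind's method.

```python
import numpy as np

def quantize_weights(w, bits=1):
    """Illustrative 1- or 2-bit weight quantization (not LeapMind's scheme).

    1-bit: each weight becomes its sign, scaled by the tensor's mean
    absolute value (similar to published binary-network techniques).
    2-bit: each weight snaps to the nearest of four uniform levels.
    """
    scale = np.mean(np.abs(w))
    if bits == 1:
        q = np.sign(w)
        q[q == 0] = 1.0  # map exact zeros to +1 so only two levels remain
        return q * scale
    if bits == 2:
        levels = np.array([-1.5, -0.5, 0.5, 1.5]) * scale
        # broadcast each weight against all four levels, pick the nearest
        idx = np.argmin(np.abs(w[..., None] - levels), axis=-1)
        return levels[idx]
    raise ValueError("only 1- or 2-bit quantization supported in this sketch")

w = np.random.randn(4, 4).astype(np.float32)
wq = quantize_weights(w, bits=1)
print(np.unique(wq))  # at most two values: +/- mean(|w|)
```

Storing a weight as one bit instead of a 32-bit float cuts its memory footprint 32-fold, which is why, as described above, the savings in bandwidth and compute can be substantial if the retrained network preserves accuracy.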
LeapMind says the Efficiera test chip it produced with Alchip did not use cutting-edge semiconductor manufacturing processes or specialized cell libraries to optimize the power efficiency and silicon area associated with convolution operations, the mathematical basis for CNNs. All performance advantages measured with the test chip were down to its proprietary quantization methods. The company points out that there is therefore room to further improve the power efficiency if required.
Efficiera targets SoCs in edge devices that require low power and low cost, such as household appliances, industrial machinery, security cameras and robots. It is suited to accelerating video-processing applications such as hazard detection, image noise reduction and artificially increasing the resolution of footage (super-resolution).
LeapMind, founded in 2012, previously worked on quantization software, neural network algorithms and AI applications. The company has no plans to develop or sell ASICs or SoCs, a company spokesperson told EE Times.
Efficiera will begin shipping in autumn 2020.