Following in the footsteps of its merger partner Qualcomm, NXP is supporting inference jobs like image recognition in software on its i.MX8 processor.

NXP aims to extend the approach to natural-language processing and claims that dedicated hardware is not required in resource-constrained systems. Qualcomm, however, expects to eventually augment its code with dedicated hardware. ARM, an IP partner to both NXP and Qualcomm, is developing neural networking libraries for its cores.

NXP’s i.MX8 packs two GPU cores from Vivante, now part of Verisilicon. The cores use about 20 opcodes, originally geared toward computer vision, that support multiply-accumulates and bit extraction and replacement.

“Adding more and more hardware is not the way forward on the power budget of a 5W SoC,” said Geoff Lees, NXP’s executive vice president for i.MX. “I would like to double the Flops, but we got the image processing acceleration we wanted for facial and gesture recognition and better voice accuracy.”

The software is now in use with NXP’s lead customers for image-recognition jobs. Meanwhile, Verisilicon and NXP are working on additional extensions to the GPU shader pipeline targeting natural-language processing. They hope to have the code available by the end of the year.

“Our VX extensions were not originally viewed as a neural network accelerator, but we found [that] they work extraordinarily well … the math isn’t much different,” said Thomas “Rick” Tewell, vice president of system solutions at Verisilicon.
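Tewell's point that the math is similar can be illustrated with a minimal Python sketch. It is not Verisilicon or NXP code, and the function name `mac_dot` and all sample values are made up for illustration. It only shows that one output pixel of an image filter and one neuron of a neural net both reduce to the same multiply-accumulate loop.

```python
def mac_dot(values, weights):
    """Dot product written as an explicit multiply-accumulate (MAC) loop."""
    acc = 0.0
    for v, w in zip(values, weights):
        acc += v * w  # one multiply-accumulate step
    return acc

# Computer vision: one output pixel of a 3x3 box blur over an image patch
# (hypothetical pixel values, flattened row by row).
patch = [10, 20, 30, 40, 50, 60, 70, 80, 90]
blur_kernel = [1 / 9] * 9
blurred_pixel = mac_dot(patch, blur_kernel)

# Neural net: one neuron, computed as the same MAC loop plus a bias,
# followed by a ReLU activation (hypothetical inputs and weights).
inputs = [0.5, -1.0, 2.0]
neuron_weights = [0.25, 0.5, 1.0]
bias = 0.5
activation = max(0.0, mac_dot(inputs, neuron_weights) + bias)

print(blurred_pixel, activation)
```

In both cases the inner loop is identical, which is why hardware instructions built for one workload can plausibly serve the other.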

The GPU cores come with OpenCL drivers. “No one has to touch the instruction extensions … people don’t want to get locked into an architecture or tool set; they want to train a set of engineers who are interchangeable,” Tewell added.

Figure 1: One i.MX8 dev kit supports up to eight cameras. (Source: NXP)

ARM is taking a similar approach with its ARM Compute Library, released in March to run neural net tasks on its Cortex-A and Mali cores.

“It doesn’t have a lot of features yet and only supports single-precision math—we’d prefer 8-bit—but I know ARM is working on it,” said a Baidu researcher working on its neural net benchmark. “It also lacks support for recurrent neural nets, but most libraries still lack this.”
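The researcher's preference for 8-bit over single-precision math can be sketched as follows. This is a generic illustration of symmetric linear quantization, not the ARM Compute Library's API or Baidu's benchmark code. The helper names `quantize` and `int8_dot`, the scale factors, and the sample values are all assumptions chosen for the example.

```python
def quantize(values, scale):
    """Map floats to signed 8-bit integers with a symmetric linear scale."""
    return [max(-127, min(127, round(v / scale))) for v in values]

def int8_dot(q_a, q_b, scale_a, scale_b):
    """Dot product on 8-bit integers, rescaled to a float at the end."""
    acc = 0  # integer accumulator; real hardware would use a wider register
    for a, b in zip(q_a, q_b):
        acc += a * b
    return acc * scale_a * scale_b

# Hypothetical activations and weights for a single neuron.
activations = [0.5, -1.0, 2.0]
weights = [0.25, 0.5, 1.0]

# Assumed scales: the largest magnitude in each list maps to 127.
scale_act = 2.0 / 127
scale_w = 1.0 / 127

q_act = quantize(activations, scale_act)
q_w = quantize(weights, scale_w)

approx = int8_dot(q_act, q_w, scale_act, scale_w)
exact = sum(a * w for a, w in zip(activations, weights))
print(exact, approx)
```

The 8-bit result differs slightly from the floating-point one. The trade is a small loss of accuracy for narrower data and cheaper arithmetic, which matters on a power-constrained SoC.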

For its part, Qualcomm released its Snapdragon 820 Neural Processing Engine SDK earlier this year. It supports jobs run on the SoC’s CPU, GPU and DSP and includes Hexagon DSP vector extensions to run 8-bit math for neural nets.

“Long-term, there could be a need for dedicated hardware,” said Gary Brotman, director of product management for commercial machine-learning products at Qualcomm. “We have work in the lab today but have not discussed a time-to-market.”

The code supports a variety of neural nets, including LSTMs often used for audio processing. Both NXP and Qualcomm execs said that it’s still early days for availability of good data sets to train models for natural-language processing. “Audio is the next frontier,” said Brotman.

First published by EE Times U.S.