Latest Synopsys ARC embedded vision processors integrate multicore vision engine and DNN accelerator to deliver 35 TOPS.
Synopsys has launched its latest generation of embedded vision processors with a deep neural network (DNN) accelerator delivering what it claims is an industry-leading 35 TOPS (tera operations per second) performance for artificial intelligence (AI) intensive edge applications. Also introduced is a functional safety version of the processor for automotive advanced driver assist systems (ADAS), radar/lidar, and automotive sensor system-on-chip (SoC) development.
Based on the ARCv2 RISC instruction set architecture, the new DesignWare ARC EV7x vision processors feature a 1, 2 or 4-core heterogeneous architecture which integrates vector DSP, vector FPU, and neural network accelerator to enable a variety of intelligent automotive and consumer applications with integrated AES encryption. The optional DNN accelerator scales from 880 to 14,080 MACs to enable a system that delivers up to 35 TOPS performance in 16-nanometer (nm) FinFET process technologies under typical conditions, four times the performance of the previous generation ARC EV6x processors.
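As a rough sanity check on these figures, the short Python sketch below relates MAC count to peak TOPS. It assumes each MAC contributes two operations per cycle (one multiply and one add) and that every MAC is busy on every cycle; neither assumption comes from Synopsys, and the article does not state a clock frequency, so the clock computed here is only what those assumptions imply. The function names are illustrative.

```python
def peak_tops(macs: int, clock_hz: float, ops_per_mac: int = 2) -> float:
    """Theoretical peak in tera-operations per second.

    Assumes every MAC completes ops_per_mac operations on every cycle,
    which real workloads rarely achieve.
    """
    return macs * ops_per_mac * clock_hz / 1e12


def implied_clock_hz(tops: float, macs: int, ops_per_mac: int = 2) -> float:
    """Clock frequency needed to reach `tops` under the same assumptions."""
    return tops * 1e12 / (macs * ops_per_mac)


# Figures quoted in the article: 14,080 MACs delivering up to 35 TOPS.
clock = implied_clock_hz(35, 14_080)
print(f"Implied clock for 35 TOPS at 14,080 MACs: {clock / 1e9:.3f} GHz")

# Peak for the smallest quoted configuration at that same assumed clock.
print(f"Peak for 880 MACs at that clock: {peak_tops(880, clock):.4f} TOPS")
```

Because 14,080 is exactly 16 times 880, the smallest configuration works out to one sixteenth of the peak under these assumptions.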
In a telephone briefing with EE Times, Gordon Cooper, product marketing manager for Synopsys EV processors, said the EV7x processors are an optimization of the EV6x processors, but it was not just about adding four times the MACs. He said adding MACs was the easy part of scaling CNN (convolutional neural network) graph performance, but what’s really critical is improving how the memory bandwidth required for external memory accesses is handled, in order to minimize power consumption. This bandwidth handling in the new processors means it is possible to scale to 100 TOPS while allowing the use of lower cost DRAM interfaces.
In addition, advanced graph mapping tools are needed to partition a CNN graph across an increasing number of MACs. As a result, he said the EV7x processors accelerate frames-per-second throughput by up to 65% compared to EV6x.
The next piece added to the EV7x family is the DNN accelerator. Cooper said that a CNN takes information from a 2D image, but adding RNNs (recurrent neural networks) and LSTMs (long short-term memories) provides the temporal data. Hence the EV6x’s CNN engine has now become a DNN accelerator in the EV7x. The other two main features added to the EV7x series are secure AES encryption and a real-time trace hardware module.
The EV7x architecture and software development
The new EV7x vision processors’ multicore architecture includes up to four high-performance vector processing units (VPUs), each of which includes a 32-bit scalar unit and a 512-bit wide vector DSP, configurable for 8-, 16-, or 32-bit operations to perform simultaneous multiply-accumulates on different streams of data. Synopsys said the DNN accelerator employs a specialized architecture for faster memory access, higher performance, and better power efficiency than alternative neural network IP. In addition to supporting CNNs, the DNN accelerator supports batched LSTMs for applications that require time-based results, such as predicting the location of a pedestrian based on their observed path and speed.
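To make the vector width concrete, the sketch below works out how many elements fit in one 512-bit vector at each supported element size. It is plain arithmetic on the figures above; the function name is illustrative, and the count of parallel lanes says nothing about how many operations per cycle the hardware actually issues, which the article does not specify.

```python
VECTOR_WIDTH_BITS = 512  # vector DSP width quoted in the article
MAX_VPUS = 4             # maximum number of vector processing units quoted


def lanes(element_bits: int, vector_bits: int = VECTOR_WIDTH_BITS) -> int:
    """Number of elements of the given width that fit in one vector."""
    if element_bits <= 0 or vector_bits % element_bits != 0:
        raise ValueError("element width must evenly divide the vector width")
    return vector_bits // element_bits


for bits in (8, 16, 32):
    per_vpu = lanes(bits)
    print(f"{bits:>2}-bit elements: {per_vpu} per VPU, "
          f"{per_vpu * MAX_VPUS} across {MAX_VPUS} VPUs")
```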
The vision engine and the DNN accelerator work on tasks in parallel, making the EV7x particularly efficient for autonomous vehicles and ADAS applications where multiple cameras and vision algorithms operate concurrently.
Security and safety are also key aspects of the new EV7x processors. Optional AES-XTS encryption engines protect data passing from on-chip memory to the vision engine and DNN accelerator. The engines prevent high-value data such as training datasets and personal biometric data, including facial recognition and retina scans, from being exploited.
Software development for the ARC EV7x vision processor family is enabled by the MetaWare EV development toolkit, a high-productivity software development environment based on common embedded vision standards, including OpenVX and OpenCL C. The tool suite enables the development of efficient computer vision applications on the EV7x processor’s vision engine as well as automatic mapping and optimization of neural network graphs on the dedicated DNN accelerator. The mapping tools support the Caffe and TensorFlow frameworks, as well as the ONNX neural network interchange format.
The DesignWare ARC real-time trace (RTT) unit helps trace executed instructions or program flow and data, generating Nexus 5001 class 3-compliant trace messages. The RTT system supports many different configurations, which must be specified at build time by including the trace generator in the core along with the RTT module, and it can support on- and off-chip memory setups to suit application tracing needs.
Automotive SoC functional safety
ASIL B and ASIL D compliant versions of the new processors, the ARC EV7xFS portfolio, also announced at launch, accelerate ISO 26262 certification of automotive SoCs. The functional safety-enhanced processors offer hardware safety features, safety monitors, and lockstep capabilities that enable designers to achieve stringent levels of functional safety and fault coverage without significant impact on power or performance. In addition, a new “hybrid” option enables system architects to select required safety levels up to ASIL D in software, post-silicon.
The Synopsys ARC "FS" cores integrate hardware safety features, such as redundant processors, error-correcting code (ECC), parity protection, safety monitors, and user-programmable windowed watchdog timers, to detect system errors. Comprehensive documentation related to safety, including enhanced-safety manuals, FMEDA, and DFMEA reports, accelerates SoC-level functional safety assessments. In addition, the ARC MetaWare development toolkits help simplify the development of ISO 26262-compliant software.
The ARC EM22FS processor provides ultra-low power, dual-core lockstep functionality for ASIL D safety requirements in applications such as automotive sensors, braking and steering systems, and keyless entry. For use cases with ASIL B requirements (i.e., non-lockstep), the processor can be configured with the two cores operating independently. The ARC HS4xFS processors support single-, dual-, and quad-core implementations to enable high-performance safety applications, such as vehicle-to-vehicle (V2V), vehicle-to-infrastructure (V2I) networking, and electric vehicle battery charging. In addition, the ARC EM22FS and HS4xFS processors can function as ASIL D compliant SoC-level safety managers with tight integration to Synopsys test solutions to provide a comprehensive functional safety test solution.
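For readers unfamiliar with lockstep, the idea can be illustrated with a loose software analogy: the same computation runs twice on redundant units, and any disagreement between the two results is treated as a fault. The sketch below is only that analogy. Real lockstep cores compare their outputs in hardware, and nothing here reflects the actual ARC implementation; all names, including `run_lockstep` and the simulated bit flip, are hypothetical.

```python
from typing import Callable, Sequence


class LockstepMismatch(Exception):
    """Raised when the two redundant results disagree."""


def checksum(values: Sequence[int]) -> int:
    """Example workload: a simple 32-bit rolling checksum."""
    total = 0
    for v in values:
        total = (total * 31 + v) & 0xFFFFFFFF
    return total


def faulty_checksum(values: Sequence[int]) -> int:
    """Same workload with one result bit flipped to simulate a fault."""
    return checksum(values) ^ 0x4


def run_lockstep(main_core: Callable[[Sequence[int]], int],
                 shadow_core: Callable[[Sequence[int]], int],
                 data: Sequence[int]) -> int:
    """Run both 'cores' on the same input and compare their results."""
    main_result = main_core(data)
    shadow_result = shadow_core(data)
    if main_result != shadow_result:
        raise LockstepMismatch(
            f"main={main_result:#x} shadow={shadow_result:#x}")
    return main_result


data = [1, 2, 3]
print("healthy pair agrees:", run_lockstep(checksum, checksum, data))

try:
    run_lockstep(checksum, faulty_checksum, data)
except LockstepMismatch as err:
    print("fault detected:", err)
```

In the independent, non-lockstep configuration the article describes for ASIL B use cases, the two cores would instead run different workloads, giving up this comparison in exchange for throughput.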
The ARC EV7xFS embedded vision processors, with their multicore vision CPU and DNN engine, integrate safety-critical hardware features to help meet ASIL B and D requirements for vision, radar, and lidar for ADAS applications and level 3+ autonomous vehicles.
One programming environment for a variety of vision applications
The combination of a high-performance vision engine, DNN accelerator, and programming tools makes the ARC EV7x embedded vision processors suited to a broad range of vision applications. Kudan, a developer of simultaneous localization and mapping (SLAM) and artificial perception (AP) algorithms for embedded systems, and a Synopsys partner, said the new EV7x vision processors optimize the execution of linear algebra and matrix math operations to accelerate processing on SLAM and its related solutions, such as real-time tracking for AR/VR and localization for autonomous driving, while increasing the accuracy of environmental maps. Tomo Ohno, CEO and co-founder of Kudan, commented, “Through our collaboration with Synopsys, designers have access to a highly efficient SLAM solution that delivers high performance while consuming significantly less power and memory resources than alternate implementations.”
Taiwan-based ULSee said its facial tracking and computer vision algorithms running on ARC EV7x vision processors provide the two companies’ mutual customers with high-performance, power-efficient solutions for edge applications such as automotive ADAS and mobile. Dr. Yi-Ta Wu, vice president of engineering and auto team leader at ULSee, said, “The extensive EV7x vision processor configuration options enable designers to address a wide range of devices within the same programming environment, such as low-power SoCs for drowsiness detection and high-end SoCs for environment monitoring. This versatility saves design teams effort and time-to-market for a tremendous competitive advantage.”
The ARC EV7x embedded vision processors, the DNN accelerator option with up to 14,080 MACs, and the MetaWare EV software are expected to be available to lead customers in Q1 2020. The DNN accelerator option with up to 3,520 MACs is available now. Among the automotive functional safety processors, the ARC EM22FS is scheduled to be available during Q4 2019, while the ARC HS4xFS will be available in Q1 2020.