CEVA Introduces the NeuPro-S AI Processor for Deep Neural Network Workloads
CEVA, Inc. has developed NeuPro-S, its second-generation AI processor architecture for deep neural network inferencing at the edge.
In conjunction with NeuPro-S, CEVA also introduced the CDNN-Invite API, a deep neural network compiler technology that supports heterogeneous co-processing of NeuPro-S cores together with custom neural network engines within a unified, optimizing neural network run-time firmware. NeuPro-S, along with the CDNN-Invite API, can be used in any vision-based device that needs edge AI processing, including autonomous cars, smartphones, surveillance cameras, consumer cameras, and emerging use cases in AR/VR headsets, robots and industrial applications.
Designed to optimally process neural networks for segmentation, detection and classification of objects within videos and images on edge devices, NeuPro-S includes several system-aware enhancements: support for multi-level memory systems to reduce costly transfers to and from external SDRAM, multiple weight compression options, and heterogeneous scalability that enables various combinations of CEVA-XM6 vision DSPs, NeuPro-S cores and custom AI engines in a single, unified architecture. More specifically, weight compression is performed offline by retraining and compressing the network via CDNN, with decompression handled in real time by the NeuPro-S engine. Support for L2 memory keeps more data in internal memory, and a robust DMA and local memory system optimizes parallel processing and memory fetching to minimize overheads, so NeuPro-S does not have to lean on the host processor for these tasks. According to CEVA, NeuPro-S achieves, on average, 50% higher performance, 40% lower memory bandwidth and 30% lower power consumption compared to the company's first-generation AI processor.
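The offline-compress, real-time-decompress split can be illustrated with a simple quantization sketch. This is not CEVA's actual compression scheme (which involves retraining and is proprietary); the functions below are hypothetical and only show the general idea of shrinking weights ahead of time and expanding them on the engine at inference time.

```python
import numpy as np

def compress_weights(weights: np.ndarray, bits: int = 8):
    """Offline step (CDNN-style, hypothetical): quantize float weights to
    low-bit integers plus a per-tensor scale, shrinking the memory footprint."""
    qmax = 2 ** (bits - 1) - 1
    scale = np.max(np.abs(weights)) / qmax
    quantized = np.clip(np.round(weights / scale), -qmax - 1, qmax).astype(np.int8)
    return quantized, scale

def decompress_weights(quantized: np.ndarray, scale: float) -> np.ndarray:
    """Runtime step (engine-side, hypothetical): expand the compressed weights
    back into the values fed to the MAC array."""
    return quantized.astype(np.float32) * scale

# Example: a toy 3x3 convolution kernel
w = np.random.randn(3, 3).astype(np.float32)
w_q, s = compress_weights(w)
w_hat = decompress_weights(w_q, s)
print("max reconstruction error:", np.max(np.abs(w - w_hat)))
```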
The NeuPro-S family includes the NPS1000, NPS2000 and NPS4000, pre-configured processors with 1,000, 2,000 and 4,000 8-bit MACs per cycle, respectively. The NPS4000 offers the highest CNN performance per core, delivering up to 12.5 Tera Operations Per Second (TOPS) at 1.5GHz, and is fully scalable to reach up to 100 TOPS.
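The headline figures follow almost directly from the MAC counts and clock rate, counting each multiply-accumulate as two operations. The sketch below reproduces the arithmetic; the nominal 4,000-MAC figure yields 12.0 TOPS, slightly below the quoted 12.5 TOPS, so the exact figure presumably reflects a marginally larger MAC array or higher clock in the shipping configuration (an assumption on our part).

```python
def peak_tops(macs_per_cycle: int, clock_ghz: float) -> float:
    """Peak throughput assuming 2 ops (multiply + accumulate) per MAC per cycle."""
    return macs_per_cycle * 2 * clock_ghz / 1000  # Giga-ops -> Tera-ops

for name, macs in [("NPS1000", 1000), ("NPS2000", 2000), ("NPS4000", 4000)]:
    print(f"{name}: {peak_tops(macs, 1.5):.1f} TOPS @ 1.5 GHz")
# NPS4000 -> 12.0 TOPS, in line with the quoted "up to 12.5 TOPS"
```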
The CDNN-Invite API allows custom neural network engines to be incorporated into CEVA's Deep Neural Network (CDNN) compiler framework. CDNN then holistically optimizes and enhances networks and layers to take advantage of the performance of each target: the CEVA-XM6 vision DSP, NeuPro-S cores and custom neural network processors.
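CDNN is a proprietary toolchain, so the snippet below is not its real API. It is a minimal, hypothetical sketch of what this kind of heterogeneous partitioning looks like in principle: each layer of a network is assigned to an engine (vision DSP, NeuPro-S core or custom accelerator) that supports it, producing a single execution plan for a unified runtime. The engine names and supported layer types are illustrative only.

```python
from typing import Dict, List, Tuple

# Hypothetical engine registry: names and layer-type support are illustrative only.
ENGINES: Dict[str, set] = {
    "ceva_xm6": {"preprocess", "custom_vision"},
    "neupro_s": {"conv", "depthwise_conv", "fully_connected", "pooling"},
    "custom_npu": {"transformer_block"},
}

def partition_network(layers: List[str]) -> List[Tuple[str, str]]:
    """Assign each layer to the first engine that supports it (NeuPro-S preferred),
    mimicking how a unified runtime might dispatch a heterogeneous graph."""
    priority = ["neupro_s", "custom_npu", "ceva_xm6"]
    plan = []
    for layer in layers:
        target = next((e for e in priority if layer in ENGINES[e]), None)
        if target is None:
            raise ValueError(f"No engine supports layer type: {layer}")
        plan.append((layer, target))
    return plan

print(partition_network(
    ["preprocess", "conv", "pooling", "transformer_block", "fully_connected"]))
```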
The fully programmable CEVA-XM6 vision DSP incorporated in the NeuPro-S architecture facilitates simultaneous processing of imaging, computer vision and general DSP workloads in addition to AI runtime processing. This also allows customers and algorithm developers to take advantage of CEVA's extensive imaging and vision software and libraries, including the CEVA-SLAM software development kit for 3D mapping, CEVA-CV and CEVA-VX software libraries for computer vision development, and its recently acquired wide-angle imaging software suite including dewarp, video stitching and Data-in-Picture sensor fusion technology.
NeuPro-S is available today, and CEVA has already licensed it to customers for automotive and consumer camera applications. The CDNN-Invite API is available today, with general licensing expected by the end of 2019.