At the Supercomputing 2016 show this week, Intel is showing two new versions of its Xeon processor and a new FPGA card for deep learning.
AI is all around us, from the commonplace (talk-to-text, photo tagging, fraud detection) to the cutting edge (precision medicine, injury prediction, autonomous cars). The growth of data, better algorithms, and faster compute capabilities are driving this revolution in artificial intelligence.
Machine learning and its subset, deep learning, are key methods for the expanding field of AI. Deep learning is a set of machine learning algorithms that use deep neural networks to power advanced applications such as image recognition and computer vision, with wide-ranging use cases across a variety of industries.
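For readers unfamiliar with the mechanics, a deep neural network is essentially stacked layers of weighted sums passed through nonlinearities. A minimal pure-Python sketch, with arbitrary illustrative weights rather than a trained model:

```python
# Minimal sketch of a deep neural network forward pass.
# Weights and biases are arbitrary illustrative values, not a trained model.

def relu(x):
    """Rectified linear unit, a common nonlinearity."""
    return max(0.0, x)

def dense(inputs, weights, biases):
    """One fully connected layer: a weighted sum plus bias per neuron."""
    return [relu(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

# A tiny two-layer "deep" network: 3 inputs -> 2 hidden units -> 1 output.
hidden = dense([1.0, 0.5, -0.5],
               [[0.2, 0.4, 0.1], [-0.3, 0.8, 0.5]],
               [0.1, 0.0])
output = dense(hidden, [[0.6, -0.2]], [0.05])
print(output)
```

Real networks stack many such layers with millions of learned weights, which is why training and inference are so compute-hungry.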
Intel continues to position Xeon Phi, a massively multicore x86 processor, as its key weapon against graphics processors from Nvidia and AMD. At IDF in August it said the Knights Mill version of Phi, the first to act as both host and accelerator, will ship in 2017.
Intel dominates the high margin market for server processors, but machine learning demands more performance on highly parallel tasks than those chips offer.
Google is already using its own ASIC to accelerate the kind of machine learning tasks Intel is targeting with a new PCI Express card based on an Altera Arria FPGA. Facebook designed its own GPU server using Nvidia chips for the computationally intensive job of training neural networks.
Meanwhile, Nvidia launched its own GPU server earlier this year, and IBM and Nvidia collaborated on another one using Power processors. For its part, AMD rolled out an open software initiative for its GPUs earlier this year.
To deliver optimal solutions for each customer’s machine learning requirements, Intel offers a flexible, performance-optimized portfolio of AI solutions powered by Intel Xeon processors, Intel Xeon Phi processors, or systems using FPGAs.
One of the biggest challenges of implementing FPGAs is the work needed to lay out the specific circuitry for each workload and algorithm, and to develop custom software interfaces for each application. To make this easier, the Intel Deep Learning Inference Accelerator (Intel DLIA) was designed to deliver the latest deep learning capabilities via FPGA technology as a turnkey solution. Intel DLIA is a complete solution, combining hardware, software, and IP into an end-to-end package that provides superior power efficiency for inference on deep learning workloads.
The Intel DLIA brings together an Intel Xeon processor and an Intel Arria 10 FPGA with Intel’s software ecosystem for AI and machine learning, including frameworks such as Intel-optimized Caffe and Intel’s Math Kernel Library for Deep Neural Networks (Intel MKL-DNN).
When it ships next year, Intel’s Arria 10 FPGA PCIe card will quadruple performance per watt on so-called scoring, or inference, jobs.
The Intel Deep Learning Inference Accelerator will come with intellectual property (IP) for convolutional neural networks (CNNs), supporting targeted CNN-based topologies and variations, all reconfigurable through software.
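The core operation such CNN IP accelerates is the convolution: sliding a small weight kernel across an input grid and computing a weighted sum at each position. A minimal pure-Python sketch (the kernel values here are arbitrary, for illustration; in a real network they are learned):

```python
# Minimal sketch of the 2-D convolution at the heart of a CNN.

def conv2d(image, kernel):
    """Valid-mode 2-D convolution (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [[sum(image[i + di][j + dj] * kernel[di][dj]
                 for di in range(kh) for dj in range(kw))
             for j in range(out_w)]
            for i in range(out_h)]

# A 3x3 vertical-edge kernel applied to a 4x4 image with an edge in it.
image = [[0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1],
         [0, 0, 1, 1]]
kernel = [[-1, 0, 1],
          [-1, 0, 1],
          [-1, 0, 1]]
print(conv2d(image, kernel))
```

Because this inner loop is the same multiply-accumulate pattern repeated millions of times, it maps naturally onto the parallel arithmetic fabric of an FPGA or GPU.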
The Intel Deep Learning Inference Accelerator will be available in early 2017.
Intel is working on support for seven machine learning frameworks, including the Neon software it acquired with Nervana.
The Nervana offering will include the Neon framework as part of an end-to-end solution focused on the enterprise, with solution blueprints and reference platforms.
The new Knights Mill version of Xeon Phi coming next year will be optimized for the toughest AI jobs, such as training neural nets. It will support mixed-precision modes, likely including the 16-bit precision becoming widely adopted to speed up results when combing through large data sets.
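The appeal of 16-bit precision is that halving the word size roughly doubles arithmetic throughput and memory bandwidth, at the cost of accuracy that neural networks can usually tolerate. Python’s standard struct module can round-trip a value through IEEE 754 half precision to show the rounding involved (a small illustration of the format, not Intel’s implementation):

```python
import struct

def to_fp16(x):
    """Round a Python float (64-bit) to IEEE 754 half precision and back."""
    return struct.unpack('e', struct.pack('e', x))[0]

# Half precision keeps only a 10-bit significand (~3 decimal digits),
# so most values pick up a small rounding error.
print(to_fp16(0.1))      # close to, but not exactly, 0.1
print(to_fp16(3.14159))  # close to, but not exactly, pi
```

Training algorithms tolerate this rounding in the bulk arithmetic while typically accumulating results at higher precision, which is the essence of a mixed-precision mode.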
Intel’s latest Xeon chip on display at the supercomputing show is the new 14nm Broadwell-class Xeon E5 2699A, with a 55-Mbyte L3 cache and 22 cores running at 2.4 GHz. It sports a mere 4.8% gain over the prior chip on the Linpack benchmark popular in high-performance computing.
Intel currently has no plans to support the open interconnects recently launched by companies including Dell and Hewlett Packard Enterprise, such as CCIX for accelerators and Gen-Z for memory.
So far, Intel has 50 large deployments for the discrete version of its Xeon Phi accelerator. This week it starts shipping a version with Omni-Path, an Intel link that is an alternative to InfiniBand and Ethernet.
Intel is also integrating Omni-Path on Skylake, its next 14nm Xeon processor. The company is demoing the chip for the first time at the supercomputing event. The processors will ship next year and will also be available in versions without Omni-Path. The new processor will offer HPC optimizations such as Intel Advanced Vector Extensions-512 (AVX-512), which boost floating-point calculations and encryption algorithms.
Omni-Path is currently used in more than half of all servers supporting 100-Gbit/second links.