AMD Launches Boltzmann Initiative to Reduce Barriers to GPU Computing on AMD FirePro Graphics

Building on its investments in heterogeneous system architecture (HSA), AMD announced a suite of tools designed to make heterogeneous computing available to many more software developers, increasing the pool of programmers. At SC15, the largest event for supercomputing systems and software, AMD unveiled the "Boltzmann Initiative", which leverages HSA's ability to harness both central processing units (CPU) and AMD FirePro graphics processing units (GPU) for maximum compute efficiency through software. The initiative will deliver new software tools to take advantage of the processing power of AMD's products, including the upcoming AMD Opteron A1100 64-bit ARM processor and the new "Zen" x86 CPU core coming next year.

Heterogeneous computing takes advantage of CPUs, GPUs, and other accelerators such as DSPs and other programmable and fixed-function devices to help increase performance and efficiency with the goal of reduced energy use. The GPU in particular is a critical component since general purpose computing on a GPU (GPGPU) makes large performance gains achievable for certain applications through parallel execution.

The first results of the initiative are featured at SC15 and include the Heterogeneous Compute Compiler (HCC); a headless Linux driver and HSA runtime infrastructure for cluster-class, High Performance Computing (HPC); and the Heterogeneous-compute Interface for Portability (HIP) tool for porting CUDA-based applications to a common C++ programming model. The tools are designed to drive application performance across markets ranging from machine learning to molecular dynamics, and from oil and gas to visual effects and computer-generated imaging.

Over the last several years, it has been possible to program for GPU compute through the use of OpenCL, an open industry standard language, or the proprietary CUDA language. Each provides a general-purpose model for data parallelism as well as low-level access to hardware. And while they are significant improvements in ease and functionality compared to previous methods, they still require specialized programming skills.
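To make the term "data parallelism" concrete, here is a minimal sketch in plain standard C++. It runs on the CPU and uses none of the tools named in this announcement. The function name `saxpy` and its signature are illustrative choices only. The point is the shape of the computation: every output element depends only on the inputs at the same index, so the iterations are independent, and this is the kind of work a GPU can spread across many threads.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Data-parallel SAXPY: out[i] = a * x[i] + y[i].
// No iteration reads or writes data belonging to another iteration,
// so the elements could in principle be computed in any order or all at once.
// (Illustrative CPU-only sketch; not code from any AMD tool.)
std::vector<float> saxpy(float a,
                         const std::vector<float>& x,
                         const std::vector<float>& y) {
    assert(x.size() == y.size());  // both inputs must have the same length
    std::vector<float> out(x.size());
    std::transform(x.begin(), x.end(), y.begin(), out.begin(),
                   [a](float xi, float yi) { return a * xi + yi; });
    return out;
}
```

Calling `saxpy(2.0f, x, y)` returns a new vector in which each entry is twice the matching entry of `x` plus the matching entry of `y`.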

AMD wants heterogeneous computing to become a mainstream reality by making these technologies accessible to a majority of the programmers in the world through more familiar languages such as C++. By creating a logical model where heterogeneous processors fully share system resources such as memory, HSA promises a standard programming model that allows developers to write code that can run seamlessly on whatever processor block is best able to execute it. The idea of matching the right workload to the right processor is being embraced by many hardware and software companies. The new AMD C++ compiler makes that idea a whole lot easier to execute.

While the Windows operating system supports billions of consumer client devices and commercial servers, Linux is highly popular in technical and scientific communities where collaboration on application development is the traditional model to maximize performance. By making an all-new Linux driver available, AMD is helping expand the developer base for heterogeneous computing even further. Benefits of this new, headless Linux driver for the programmer include low-latency compute dispatch, peer-to-peer GPU support, Remote Direct Memory Access (RDMA) from InfiniBand interconnects directly to GPU memory, and Large Single Memory Allocation support.

Finally, applications already developed in CUDA can now be ported into C++. This is achieved using the new Heterogeneous-compute Interface for Portability (HIP) tool, which ports CUDA runtime APIs into C++ code. AMD testing shows that in many cases 90 percent or more of CUDA code can be automatically converted into C++ by HIP. The remainder will require manual programming, but this should take a matter of days, not months as before. Once ported, the application could run on a variety of underlying hardware, and enhancements could be made directly through C++.
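The announcement does not describe how the conversion works internally, so the following is only a toy sketch of the general idea of source-to-source porting, not the actual tool. It renames three CUDA runtime calls to their HIP counterparts (`hipMalloc`, `hipMemcpy`, `hipFree`) by plain text substitution, then flags anything still mentioning `cuda` as needing manual attention. The helper names `replace_all`, `port_cuda_names` and `needs_manual_port` are hypothetical, and the three-entry table is deliberately incomplete.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Replace every occurrence of `from` in `text` with `to`.
std::string replace_all(std::string text,
                        const std::string& from,
                        const std::string& to) {
    assert(!from.empty());  // an empty pattern would never advance
    std::size_t pos = 0;
    while ((pos = text.find(from, pos)) != std::string::npos) {
        text.replace(pos, from.size(), to);
        pos += to.size();  // continue after the text just inserted
    }
    return text;
}

// Toy translation step: rename a few CUDA runtime calls to HIP names.
// Hypothetical illustration only; the real tool's method is not shown here.
std::string port_cuda_names(std::string source) {
    const std::vector<std::pair<std::string, std::string>> renames = {
        {"cudaMalloc", "hipMalloc"},
        {"cudaMemcpy", "hipMemcpy"},  // also covers cudaMemcpyHostToDevice
        {"cudaFree", "hipFree"},
    };
    for (const auto& r : renames) {
        source = replace_all(std::move(source), r.first, r.second);
    }
    return source;
}

// Anything still mentioning "cuda" was not covered by the table above
// and stands in for the portion that would need manual porting.
bool needs_manual_port(const std::string& source) {
    return source.find("cuda") != std::string::npos;
}
```

In this sketch a call such as `cudaDeviceSynchronize()` passes through unchanged because the table does not list it, so `needs_manual_port` reports it. That mirrors, in miniature, the split between automatically converted code and the remainder that is ported by hand.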

An early access program for the "Boltzmann Initiative" tools is planned for Q1 2016.

On the hardware side, at SC15 in booth 727, AMD showcases systems from Dell, HP and Supermicro that include AMD FirePro GPUs running demonstrations such as AMD FireRender, Abaqus and TUM Navier-Stokes.

In addition, AMD's Opteron A1100 series ARM processor is installed in SoftIron's new Enterprise Class Overdrive 3000 system for developers, and the AMD FirePro S9170 server GPU is in action with Dell and Supermicro servers. The AMD FirePro S9170 card features 32GB of high-speed, high-bandwidth onboard memory.