Intel's Multicore Architecture Briefing
Intel today discussed upcoming microprocessors and technologies, including its 45nm high-k metal gate manufacturing technology, and outlined future products with four, six, eight and many computing cores.
Pat Gelsinger, Intel Senior Vice President and General Manager, Digital Enterprise Group, disclosed details about Intel's multi-processor (MP) servers based on Intel's 6-core processor codenamed "Dunnington" and Intel's new Itanium processor codenamed "Tukwila." Gelsinger discussed current enterprise topics, including virtualization and the new SPEC power benchmark for measuring server energy efficiency, in which Intel-based systems hold all of the top 20 spots. He also disclosed a number of technical features of two important products for Intel: Nehalem, Intel's next-generation processor family, and Larrabee, a future Intel product with many cores.
Dunnington for Expandable (Multi-Processor) servers
Intel's current 7300 chipset-based platform combined with the Quad-Core Xeon 7300 processor is the industry's virtualization platform of choice for MP servers.
Dunnington is socket-compatible with the Caneland platform and will be available in the second half of 2008. It is the first IA (Intel Architecture) processor with six cores, is based on the 45nm high-k process technology, and has large shared caches. It also supports FlexMigration technology, which allows a single compatible virtualization pool that supports live VM (Virtual Machine) migration across both 65nm and 45nm high-k Intel Core microarchitecture-based servers.
Tukwila for performance
Tukwila is Intel's next-generation Itanium processor, with four cores, 30MB of total cache, QuickPath Interconnect, dual integrated memory controllers and mainframe-class RAS features. It is the world's first 2-billion-transistor microprocessor and is projected to deliver more than double the performance of the current-generation Itanium processor.
Nehalem is Intel's dynamically scalable new processor microarchitecture
Nehalem will provide performance and energy-efficiency improvements over Intel's current microprocessors. Nehalem is scalable, with future versions having anywhere from 2 to 8 cores; with Simultaneous Multi-threading, that results in 4 to 16 threads. Nehalem will deliver 4 times the memory bandwidth of today's highest-performance Intel Xeon processor-based systems. With up to 8MB of level-3 cache, 731 million transistors, QuickPath interconnects (up to 25.6GB per second), an integrated memory controller and optional integrated graphics, Nehalem will eventually scale from notebooks to high-performance servers.
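The core, thread and interconnect figures above can be checked with a little arithmetic. The sketch below is illustrative only: the two-threads-per-core factor is inferred from the 2-to-4 and 8-to-16 figures, and the QuickPath breakdown (6.4 GT/s, 2 bytes of data per transfer, two directions) is an assumption not stated in the briefing. All function and constant names are hypothetical.

```python
# Illustrative arithmetic only; names and the QuickPath breakdown are assumptions.

SMT_THREADS_PER_CORE = 2  # inferred: 2 cores -> 4 threads, 8 cores -> 16 threads


def hardware_threads(cores, threads_per_core=SMT_THREADS_PER_CORE):
    """Number of hardware threads exposed by a processor with SMT."""
    return cores * threads_per_core


def link_bandwidth_gb_per_s(giga_transfers_per_s, bytes_per_transfer, directions):
    """Aggregate peak link bandwidth in GB/s."""
    return giga_transfers_per_s * bytes_per_transfer * directions


if __name__ == "__main__":
    for cores in (2, 4, 8):
        print(f"{cores} cores -> {hardware_threads(cores)} threads")

    # Assumed link parameters: 6.4 GT/s, 2 bytes per transfer, 2 directions.
    print(f"QuickPath peak: {link_bandwidth_gb_per_s(6.4, 2, 2):.1f} GB/s")
```

Under these assumptions the aggregate link figure matches the "up to 25.6GB per second" quoted above; it is a peak number, not a measured throughput.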
Other features discussed include support for DDR3-800, DDR3-1066 and DDR3-1333 memory, SSE4.2 instructions, a 32KB instruction cache, a 32KB data cache, a 256KB low-latency L2 cache per core for data and instructions, and a new 2-level TLB (Translation Lookaside Buffer) hierarchy.
These features will improve performance and provide flexibility for a wide range of eventual products based on the Nehalem architecture. Gelsinger also discussed the new Tylersburg platform, which can be configured for both one-socket High End Desktop (HEDT) and two-socket (HPC and dual-processing server) operation.
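For a sense of scale, the peak bandwidth of one memory channel at each supported DDR3 speed can be estimated as transfers per second times bytes per transfer. This is a rough sketch, assuming a standard 64-bit (8-byte) DDR3 channel and treating the number in the speed grade as mega-transfers per second; the briefing does not state how many channels Nehalem uses, so only a single channel is modeled. The function name is hypothetical.

```python
# Rough per-channel estimate; assumes a 64-bit DDR3 channel (8 bytes per transfer).

BYTES_PER_TRANSFER = 8  # assumption: standard 64-bit DDR3 data bus


def peak_channel_bandwidth_gb_per_s(mega_transfers_per_s):
    """Theoretical peak bandwidth of one memory channel in GB/s."""
    return mega_transfers_per_s * 1e6 * BYTES_PER_TRANSFER / 1e9


if __name__ == "__main__":
    for speed in (800, 1066, 1333):
        bandwidth = peak_channel_bandwidth_gb_per_s(speed)
        print(f"DDR3-{speed}: about {bandwidth:.1f} GB/s per channel")
```

These are theoretical peaks per channel; sustained bandwidth in a real system is lower and depends on the number of populated channels.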
Visual Computing: Graphics Re-defined
Visual Computing is redefining the high definition experience for computer users. Next-generation techniques for delivering naturally realistic gaming, graphics and high definition video and audio are driving increasing performance and architecture demands on the PC. For example, global illumination techniques such as ray tracing, used to provide accurate shadow and lighting effects, place greater performance demands on computers than traditional graphics. Behavioral realism in applications, such as real-life physics in game titles or life-like display of human motion in medical imaging, drives the need for more general purpose computing. Finally, entirely new levels of interactivity will emerge. For example, new forms of game controllers that can understand human motion will enable users to become characters in their favorite games. In medical imaging, patient sensors will feed real-time information to enable doctors to perform interactive computer-guided procedures. In order to deliver on the promise of Visual Computing, a complete platform is required. This includes the multi-core CPU, chipset and graphics plus software and associated developer tools.
Larrabee Architecture for Visual Computing
With plans for the first demonstrations later this year, the Larrabee architecture will be Intel's next step in evolving the visual computing platform. The Larrabee architecture includes a high-performance, wide SIMD vector processing unit (VPU) along with a new set of vector instructions including integer and floating point arithmetic, vector memory operations and conditional instructions. In addition, Larrabee includes a major new hardware-coherent cache design enabling the many-core architecture. The architecture and instructions have been designed to deliver performance, energy efficiency and general purpose programmability to meet the demands of visual computing and other workloads that are inherently parallel in nature. Tools are critical to success, and key Intel Software Products will be enhanced to support the Larrabee architecture and enable developer freedom. Industry APIs such as DirectX and OpenGL will be supported on Larrabee-based products.
Intel AVX: The next step in the Intel instruction set
Gelsinger also discussed Intel AVX (Advanced Vector Extensions) which, when used by software programmers, will increase performance in floating point, media, and processor-intensive software. AVX can also increase energy efficiency, and is backwards compatible with existing Intel processors. Key features include wider vectors, increasing from 128 bits to 256 bits and resulting in up to 2x peak FLOPs output; enhanced data rearrangement, allowing data to be pulled more efficiently; and a three-operand, non-destructive syntax for a range of benefits. Intel will make the detailed specification public in early April at the Intel Developer Forum in Shanghai. The instructions will be implemented in the microarchitecture codenamed "Sandy Bridge" in the 2010 timeframe.
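Two of the AVX features above lend themselves to a small illustration: why doubling the vector width doubles the peak number of floating point operations per instruction, and what "three-operand, non-destructive" means. The sketch below is a plain Python model, not AVX code; the function names are hypothetical, and the 32-bit single-precision element size is an assumption chosen for the example.

```python
# Plain Python model of two AVX ideas; this is not AVX code and the names are hypothetical.


def lanes(vector_bits, element_bits=32):
    """Number of elements that fit in one vector register (32-bit floats assumed)."""
    return vector_bits // element_bits


def add_two_operand(dst, src):
    """Two-operand (destructive) form: dst = dst + src, overwriting dst."""
    for i in range(len(dst)):
        dst[i] += src[i]


def add_three_operand(a, b):
    """Three-operand (non-destructive) form: returns a + b, leaving a and b intact."""
    return [x + y for x, y in zip(a, b)]


if __name__ == "__main__":
    # Twice as many lanes per instruction gives up to 2x peak FLOPs.
    print(lanes(128), "lanes at 128 bits;", lanes(256), "lanes at 256 bits")

    a = [1.0, 2.0, 3.0, 4.0]
    b = [10.0, 20.0, 30.0, 40.0]

    c = add_three_operand(a, b)  # a is still available afterwards
    add_two_operand(a, b)        # a has been overwritten with the sum
    print(c == a)
```

In the destructive form a program must first copy an input it wants to keep; the non-destructive form avoids that extra copy, which is one of the benefits the three-operand syntax is meant to provide.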