Nvidia Makes Pixel Processing Pipeline Faster
Nvidia has been awarded a new patent for an invention that makes the pixel processing pipeline faster and more efficient.
With U.S. Patent no. 7609272, the company hit a major milestone: 1,000 patents. The new patent helps the shader process textures in a way that makes full use of any extra circuits, speeding up output, according to Nvidia's Hector Marinez.
"Previously, when a large texture needed to be read, one instruction would be issued, and one shader circuit would need to make several passes while other circuits sat idle. But patent authors Emmett Kilgariff and Rui Bastos ?both longtime NVIDIANS ? figured out a way to allow for a partial texture load. By breaking the texture load into smaller pieces - able to be completed in one pass each - all circuits can keep firing, Marinez explained in a blog post.
As Bastos recalls, the idea came from asking themselves, "What can we do to reduce the number of cycles required to run a program and get applications/games running faster? The key for the idea was an application of the divide-and-conquer principle."
Textures can be 32-bit, 64-bit, or 128-bit, but anything larger than 32-bit requires more than one pass. Before Bastos and Kilgariff's invention, texture lookups were monolithic instructions that took multiple cycles to execute, leaving other shader functional units idle. "The idle units in the pipe presented the opportunity to try to fit other, non-texture instructions in those slots - i.e., run more than one instruction per cycle," says Bastos. But to do that, the monolithic texture-load instructions had to be split into chunks. Break a 128-bit texture into four pieces - each of which can be completed in one pass - and one cycle-hungry instruction becomes four single-pass instructions. Doing this means that other circuits keep processing instructions, with no more waiting.
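The arithmetic behind that split can be sketched in a few lines of Python. This is only a toy model, not the patented design: the names `PartialLoad`, `split_texture_load`, `cycles_monolithic` and `cycles_split` are hypothetical, and the model assumes that exactly one independent non-texture instruction can be paired with each partial load per cycle.

```python
from dataclasses import dataclass

# Per the article: anything wider than 32 bits needs more than one pass.
PASS_WIDTH_BITS = 32


@dataclass(frozen=True)
class PartialLoad:
    """One single-pass piece of a larger texture load (hypothetical type)."""
    texture: str
    part: int   # index of this piece within the original load
    bits: int   # width of this piece


def split_texture_load(texture: str, width_bits: int) -> list[PartialLoad]:
    """Break one wide texture load into pieces that each fit in one pass."""
    if width_bits <= 0 or width_bits % PASS_WIDTH_BITS != 0:
        raise ValueError("width must be a positive multiple of 32 bits")
    pieces = width_bits // PASS_WIDTH_BITS
    return [PartialLoad(texture, i, PASS_WIDTH_BITS) for i in range(pieces)]


def cycles_monolithic(width_bits: int, other_instructions: int) -> int:
    """Toy count: the load occupies every pass on its own while other units
    sit idle, then the non-texture instructions run one per cycle."""
    return width_bits // PASS_WIDTH_BITS + other_instructions


def cycles_split(width_bits: int, other_instructions: int) -> int:
    """Toy count: each cycle issues one partial load and, if any remain,
    one non-texture instruction alongside it (an assumption of this model)."""
    return max(width_bits // PASS_WIDTH_BITS, other_instructions)
```

Under these assumptions, a 128-bit load splits into four pieces, and a program with four other instructions drops from eight cycles to four. Real hardware has more constraints than this model captures, so the numbers illustrate the idea rather than measure it.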
In addition, Kilgariff and Bastos discovered they could reorder instructions for greater efficiency. For instance, if a texture for instruction 1 is not immediately available, the shader circuit could get to work on instruction 2. Instructions don't back up in a queue.
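The reordering idea can be illustrated with a similarly small sketch. It is a simplification under stated assumptions, not the scheduler described in the patent: the function names and the `ready_at` table are hypothetical, one instruction issues per cycle, and the instructions are treated as independent of one another.

```python
def issue_in_order(instructions, ready_at):
    """Issue strictly in program order, stalling until each input is ready.

    instructions: instruction names in program order.
    ready_at: maps a name to the first cycle its texture is available
              (names missing from the map are ready at cycle 0).
    Returns a list of (cycle, name) pairs.
    """
    cycle = 0
    issued = []
    for name in instructions:
        cycle = max(cycle, ready_at.get(name, 0))  # stall if not ready yet
        issued.append((cycle, name))
        cycle += 1
    return issued


def issue_reordered(instructions, ready_at):
    """Each cycle, issue the oldest instruction whose input is ready.

    Assumes the instructions do not depend on each other's results.
    """
    pending = list(instructions)
    issued = []
    cycle = 0
    while pending:
        ready = next((n for n in pending if ready_at.get(n, 0) <= cycle), None)
        if ready is not None:
            pending.remove(ready)
            issued.append((cycle, ready))
        cycle += 1  # an empty cycle passes if nothing was ready
    return issued
```

With three instructions where only the first is waiting on a texture until cycle 2, the in-order version finishes issuing at cycle 4, while the reordered version issues the other two first and finishes at cycle 2.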
By providing for partial texture loads and reordering instruction sequences for greater efficiency, Kilgariff and Bastos found they could reduce the number of required passes. Ultimately, textures render faster and gameplay is smoother.
The invention made its way to the market in 2004 in the GeForce 6 family of products. It also featured in the RSX - or Reality Synthesizer - GPU that NVIDIA co-developed for the Sony PlayStation 3.
"Previously, when a large texture needed to be read, one instruction would be issued, and one shader circuit would need to make several passes while other circuits sat idle. But patent authors Emmett Kilgariff and Rui Bastos ?both longtime NVIDIANS ? figured out a way to allow for a partial texture load. By breaking the texture load into smaller pieces - able to be completed in one pass each - all circuits can keep firing, Marinez explained in a blog post.
As Bastos recalls, the idea came from asking themselves, "What can we do to reduce the number of cycles required to run a program and get applications/games running faster? The key for the idea was an application of the divide-and-conquer principle."
Textures can be 32-bit, 64-bit, or 128-bit. But anything larger than 32-bit requires more than one pass. Before Bastos and Kilgariff?s invention, texture lookups were monolithic instructions that took multiple cycles to be executed, leaving other shader functional units to sit idle. "The idle units in the pipe presented the opportunity to try to fit other, non-texture instructions in those slots ? i.e., run more than one instruction per cycle," says Bastos. But to do that, the monolithic texture-load instructions had to be split into chunks. Break a 128-bit texture into four pieces ? each of which can be completed in one pass ? and that lets one cycle-hungry instruction be broken into four instructions. Doing this means that other circuits keep processing instructions ? no more waiting.
In addition, Kilgariff and Bastos discovered they could reorder instructions for greater efficiency. For instance, if a texture for instruction 1 is not immediately available, the shader circuit could get to work on instruction 2. Instructions don?t back up in a queue.
By providing for partial texture loads and reordering instruction sequences for greater efficiency, Kilgariff and Bastos found they could reduce the number of required passes. Ultimately, textures render faster and game play is more seamless.
The invention made its way to the market in 2004 in the GeForce 6 family of products. It also featured in the RSX ? or Reality Synthesizer ? GPU that NVIDIA co-developed for the Sony PlayStation 3.