Today’s high computational throughput probably would not be attainable without the application of the SIMD paradigm in modern processors in increasingly clever ways. It’s no coincidence that GPUs also owe most of their performance, die area, and efficiency benefits to this instruction issue scheme. In this article we will explore a couple of examples of how GPUs may take advantage of SIMD and what that implies for the programming model.
Multisampling is a well-understood technique used in computer graphics that enables applications to efficiently reduce geometry aliasing, yet not everybody is familiar with the entire toolset offered by modern GPU hardware to control multisampling behavior. In this article we present the behavior of basic multisampling and explore a set of controls that enable us to tune performance/quality trade-offs and open doors for more advanced rendering techniques.
The behavior of the graphics pipeline is practically standard across platforms and APIs, yet GPU vendors come up with unique solutions to accelerate it, the two major architecture types being tile-based and immediate-mode rendering GPUs. In this article we explore how they work, present their strengths/weaknesses, and discuss some of the implications the underlying GPU architecture may have on the efficiency of certain rendering algorithms.
The Khronos Group recently released a set of provisional extensions adding video encoding and decoding capabilities to the Vulkan API, collectively referred to as Vulkan Video. This seemed like the perfect opportunity to provide an introduction to video compression from the perspective of a graphics programmer, and to discuss why having integrated support for video encoding and decoding as part of the Vulkan API is an important step forward for the industry.
Most systems programmers are not new to creating and applying custom memory allocators in performance- or memory-constrained projects. However, the benefits of purpose-built memory allocators are often underestimated or overlooked by many programmers. This article aims to provide an overview of the motivation for and advantages of deploying custom memory allocation schemes, and presents a few common allocation strategies.
Previously we explored the different types of memory available for access by the GPU, but only barely touched on the topic of caches. In this article we will make up for that by taking a look at the different caches available on modern GPUs to appreciate their role in the system. A thorough understanding of GPU cache behavior enables developers to make better use of these caches and thus improve the performance of their graphics or compute applications.
With the recent announcement of AMD Smart Access Memory it seemed to be the right time to write about the different types of memory available to applications targeting dedicated GPUs. This article aims to provide an introduction to the different memory pools within such a system, their access characteristics, and why enabling access to the entire VRAM through the PCI Express bus could be a game changer.
Article about the new API improvements and hardware features brought by the latest version of OpenGL.