HSA Represents The Evolution of Computing – Heterogeneous System Architecture Foundation

July 16, 2012 by Guest Blogger
The announcement of the Heterogeneous System Architecture (HSA) Foundation in June sparked an enormous amount of interest in HSA and what it means to the tech industry. HSA defines interfaces for parallel computation utilizing CPU, GPU and other programmable and fixed function devices, and support for a diverse set of high-level programming languages, thereby creating the next foundation in general purpose computing. More simply, HSA represents the latest step in an evolution that began roughly five years ago when the computing power of the graphics processing unit (GPU) began to exceed that of the central processing unit (CPU). Pure computing power, though, is not the only variable that matters in an overall computing experience, so HSA represents a lot more.
Let me explain: It turns out that a CPU excels at serial computation (solving a problem one piece at a time), while a GPU excels at parallel computation (dividing a problem into smaller ones and solving them simultaneously). Other “hardwired” processing engines might do a single thing very well, such as encoding and decoding video or managing system security.
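To make the serial versus parallel distinction concrete, here is a minimal Python sketch. It is only an illustration and not HSA code: the function names are hypothetical, and a thread pool stands in for parallel hardware. In standard CPython, threads will not actually speed up this arithmetic, so the point is the shape of the computation (split, process the pieces concurrently, combine), not performance.

```python
from concurrent.futures import ThreadPoolExecutor


def sum_squares_serial(values):
    """Serial style: walk the data one element at a time."""
    total = 0
    for v in values:
        total += v * v
    return total


def sum_squares_parallel(values, workers=4):
    """Parallel style: split the data into chunks, hand the chunks to
    concurrent workers, then combine the partial results."""
    chunk = max(1, (len(values) + workers - 1) // workers)
    chunks = [values[i:i + chunk] for i in range(0, len(values), chunk)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Each chunk is an independent sub-problem.
        partials = pool.map(sum_squares_serial, chunks)
        return sum(partials)


data = list(range(1000))
serial_result = sum_squares_serial(data)
parallel_result = sum_squares_parallel(data)
```

Both functions return the same answer. The parallel version works only because each chunk can be processed without knowing anything about the others, which is the property that makes a workload a good fit for a GPU.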
With this understanding that different types of computation are best performed with different compute resources, let’s see what the future of computing might look like with HSA.
AMD solved the first half of the equation for leveraging the GPU and CPU by putting them together on a single chip called an accelerated processing unit, or APU. Being on the same chip allows computationally intensive tasks to be divided between the compute resources more efficiently, providing a better application experience to the end user. To ensure users see a net benefit, software developers want the hardware to appear as a single processing unit. HSA is conceived to remove the need to transfer data between the processors, eliminating that delay while ensuring each task is handled by the appropriate resource. This is where the relatively new programming models using DirectCompute, C++ AMP and OpenCL™ can help.
Let’s take a quick look at each:
DirectCompute: Part of Microsoft’s DirectX® collection of APIs, DirectCompute lets GPUs (whether on an APU or discrete) perform general-purpose parallel computing tasks beyond traditional graphics rendering and video processing.
OpenCL™: Stands for Open Computing Language. It is a programming framework that offers a computing language based on C as well as an API. With OpenCL™ you can leverage CPUs, GPUs, APUs (or even other types of processors) to accelerate parallel computations, which can provide dramatic speedups for computationally intensive applications that work across devices and architectures. My colleague, Mark Ireton, wrote a recent blog post going into more depth, and there are also video overviews available on the AMD website: http://tinyurl.com/yemb266
C++ AMP: Short for C++ Accelerated Massive Parallelism, this is a programming model that uses the C++ programming language to exploit data parallelism.
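All three models share a data-parallel idea: you write a small function (often called a kernel) that computes one element of the result, and the runtime applies it across the whole data set. The Python sketch below mimics that shape only. The names `vector_add_kernel` and `launch` are hypothetical and are not part of DirectCompute, OpenCL™ or C++ AMP, and the work-items here run one after another rather than concurrently.

```python
def vector_add_kernel(gid, a, b, out):
    """Hypothetical kernel: computes exactly one output element,
    identified by its global index gid."""
    out[gid] = a[gid] + b[gid]


def launch(kernel, global_size, *args):
    """Stand-in for a runtime's dispatch call. A real device would run
    the work-items concurrently; this loop runs them sequentially."""
    for gid in range(global_size):
        kernel(gid, *args)


a = [1.0, 2.0, 3.0, 4.0]
b = [10.0, 20.0, 30.0, 40.0]
out = [0.0] * len(a)
launch(vector_add_kernel, len(a), a, b, out)
```

Because the kernel touches only its own index, the order in which work-items run does not matter, and that is what lets a runtime spread them across many GPU cores.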
Now let’s say you are a developer trying to create a software application. Three criteria that might be important to you are:
- You want to create code that runs on as many platforms as possible
- You don’t want to learn new programming languages or models
- You want your code to be optimized to run as fast and efficiently as possible
DirectCompute, OpenCL™, and C++ AMP will each meet the above goals with varying degrees of success. For example, DirectCompute is based on the High Level Shader Language (HLSL), which somewhat restricts the number of developers who can use it, and it is also limited in which hardware supports it. OpenCL™, on the other hand, is quite low level and could be more challenging for a C programmer than C++ AMP would be.
Beyond the software programming model, HSA is also very closely tied to changes in hardware. For AMD, making the GPU a true peer processor to the CPU with direct access by software is the ultimate goal. Today, select AMD products feature C++ support for GPU Compute, IOMMUv2 (so the GPU can share system memory efficiently), and Bi-Directional Power Management between the CPU and the GPU. In the near future, memory efficiency will be improved by a unified address space for CPU and GPU, in which the GPU uses pageable system memory via CPU pointers (i.e. the GPU can have more memory by using virtual memory), and by fully coherent memory between CPU and GPU. Shortly after, we get to a GPU that looks more like the CPU through features such as GPU compute context switch, GPU graphics pre-emption (to allow critical applications to get access to the GPU with the lowest latency possible), and quality of service (proper prioritization of tasks).
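The benefit of a unified address space can be pictured with a loose analogy in plain Python. This is not HSA code and the function names are hypothetical. The first function copies data into a separate "device" buffer and back again, the way a discrete GPU with its own memory traditionally works. The second operates on the caller's buffer through a `memoryview`, so no bytes are copied, which is roughly the effect of letting the GPU follow CPU pointers into shared memory.

```python
def scale_with_copy(host_data, factor):
    """Copy model: copy to a separate buffer, process, copy back.
    Returns the number of bytes copied."""
    device_buffer = bytearray(host_data)          # host -> device copy
    for i in range(len(device_buffer)):
        device_buffer[i] = (device_buffer[i] * factor) % 256
    host_data[:] = device_buffer                  # device -> host copy
    return 2 * len(host_data)


def scale_shared(host_data, factor):
    """Shared model: work directly on the caller's memory.
    Returns the number of bytes copied, which is zero."""
    with memoryview(host_data) as view:           # same memory, no copy
        for i in range(len(view)):
            view[i] = (view[i] * factor) % 256
    return 0


buf_a = bytearray([1, 2, 3, 4])
buf_b = bytearray([1, 2, 3, 4])
copied_a = scale_with_copy(buf_a, 3)
copied_b = scale_shared(buf_b, 3)
```

Both buffers end up with the same contents, but the copy model moved every byte twice to get there. On real hardware those transfers cost time and power, which is the overhead a shared, coherent address space is meant to avoid.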
In conclusion, HSA takes a comprehensive view of computing architecture by defining the key elements (hardware, software and programming languages) so that the whole system is more efficient, more powerful and more easily accessible to mainstream programmers. AMD is committed to building HSA into its products, but we are also committed to sharing the basic architecture specifications and interfaces openly. More information on how this will be accomplished is available from the HSA Foundation: www.hsafoundation.com

Terry Makedon is a Product Marketing Manager at AMD. His postings are his own opinions and may not represent AMD’s positions, strategies or opinions. Links to third party sites, and references to third party trademarks, are provided for convenience and illustrative purposes only. Unless explicitly stated, AMD is not responsible for the contents of such links, and no third party endorsement of AMD or any of its products is implied.