BEAVERTON, OR – March 14, 2017 – The Heterogeneous System Architecture (HSA) Foundation continues to expand its Academic Partnership Program with the addition of Finland-based Tampere University of Technology (TUT) as an HSA Academic Center of Excellence. TUT is in Tampere, about 170 km (105 miles) north of Helsinki.
TUT is now the third European university accorded this distinction — in December, the Foundation announced that Technische Universitaet (TU) Darmstadt, and Friedrich-Alexander-University Erlangen-Nurnberg (FAU), both in Germany, were named Academic Centers of Excellence; Northeastern University was the first in North America.
HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources — including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators — in today’s complex systems-on-chip (SoCs).
“We’re excited to have TUT on board as an Academic Center of Excellence and look forward to collaborating with the university on several projects,” said HSA Foundation President Dr. John Glossner. “TUT is at the forefront of research in areas that intersect closely with heterogeneous computing such as intelligent machines and networked systems.”
“The HSA ecosystem is growing rapidly not only in Finland, but throughout Europe,” said HSA Foundation Chairman and Managing Director Greg Stoner. “TUT has a long-established reputation encompassing an array of innovative technologies. Our members — and the global tech community — will benefit greatly from this burgeoning partnership.”
Jarmo Takala, a professor in the TUT Faculty of Computing and Electrical Engineering, added that the research group is currently working on an open source implementation of the OpenCL standard, called the Portable Computing Language project.
“We’re also going to add support for HSA specs and create a complete tool flow for HSA runtime customized accelerators based on transport-triggered architecture, and open source design tools for these processors, known as the ‘TTA-based Co-Design Environment,’” added Takala.
Dr. Pekka Jääskeläinen, who is currently working at TUT as a postdoctoral researcher funded by the Academy of Finland and is also involved in various HSA-related activities, said adopting HSA standards “is enabling us to build well-documented IP interfaces to SoC components. HSA is also providing a framework for more studies related to programmer-productivity challenges still hindering heterogeneous platform adoption.”
About TUT
Established in 1965 as a subsidiary of Helsinki University of Technology, TUT became an independent university in 1972. Today, more than 8,300 undergraduate and postgraduate students attend TUT. Of these, about 1,500 students from more than 60 countries are currently pursuing studies. TUT is a sought-after partner for collaborative research and development projects with business and industry worldwide.
About the HSA Foundation
The HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive. HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption. HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Follow the HSA Foundation on Twitter, Facebook, LinkedIn and Instagram.
Contact Information
Contact:
Neal Leavitt
Leavitt Communications
(760) 639-2900
neal@leavcom.com
The HSA Foundation expands its Academic Partnership Program

Entrepreneur Podcast Network: http://epodcastnetwork.com/the-hsa-foundation-expands-its-academic-partnership-program/
Dr. John Glossner, President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit whose goal is making programming for parallel computing easy and pervasive, again joins Enterprise Radio to discuss the Foundation, its overall benefits and the new partnership.
Listen to host Eric Dye & guest Dr. John Glossner discuss the following:
- Dr. Glossner, we last talked in early November. For the benefit of our listeners, can you please provide a brief synopsis again on what the HSA Foundation is.
- In November, we also talked about what the Foundation calls Academic Centers of Excellence. Please elaborate again on what these are, and how does a higher educational institution become one?
- You mentioned then that Northeastern University in Boston was the first of these; in early December, two leading German universities also became Academic Centers of Excellence. Tell us about each and elaborate on some of the innovative HSA projects they’re working on.
- AMD, a founding member of the Foundation, recently provided a tutorial at an international conference on code generation and optimization. The title was ‘Updates in Heterogeneous Compute.’ Please share what you see as recent heterogeneous compute updates and developments.
- It appears that heterogeneous compute will be applicable for an array of apps. This can be everything from vision based IoT systems to mobile devices; desktops, high-performance computing (HPC) systems, AR/VR environments, and servers. So how will heterogeneous compute improve performance and power efficiency?
- How does HSA make life easier for IP and system designers?
John Glossner, Ph.D., is the President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive.
HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption.
HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Glossner currently serves as CEO of General Processor Technologies.
Website: www.hsafoundation.com
Social Media Links:
Facebook: facebook.com/thehsafoundation
Twitter: @hsafoundation
HSA Foundation, AMD Spearheading Heterogeneous Compute Tutorial at CGO
BEAVERTON, OR–(Marketwired – January 26, 2017) – The HSA (Heterogeneous System Architecture) Foundation together with Foundation member AMD will be providing a half-day tutorial entitled, ‘Updates in Heterogeneous Compute’ at the International Symposium on Code Generation and Optimization (CGO). The conference will be held from Feb. 4-8 in Austin, TX.
CGO provides a venue to bring together researchers and practitioners working at the interface of hardware and software on a wide range of optimization and code generation techniques and related issues. The conference spans the spectrum from purely static to fully dynamic approaches, and from pure software-based methods to specific architectural features and support for code generation and optimization.
The half-day tutorial will be presented by AMD Fellow Paul Blinzer on Sunday, Feb. 5, at 1:15 PM in Room 616B. Blinzer’s talk will provide insight into the latest developments in hardware and software for heterogeneous compute, a solution required for a growing number of applications including vision based IoT systems, mobile devices, desktops, high-performance computing (HPC) systems, AR/VR environments, and servers.
The talk will include updates on HSA, a standardized platform design supported by more than 40 technology companies and 23 universities that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices.
The tutorial and other CGO sessions will be held at the Hilton hotel, 500 East 4th St., Austin. For more information, including a full list of speakers, supporting organizations and sponsors, as well as registration information, please visit: http://cgo.org/cgo2017.
About Paul Blinzer
Paul Blinzer works on a wide variety of Platform System Software architecture projects and specifically on the Heterogeneous System Architecture (HSA) System Software at Advanced Micro Devices, Inc. (AMD) as a Fellow in the System Software group. Living in the Seattle, WA area, during his career he has worked in various roles on system level driver development, system software development, graphics architecture, graphics & compute acceleration since the early ’90s. Paul is the chairperson of the “System Architecture Workgroup” of the HSA Foundation. He has a degree in Electrical Engineering (Dipl.-Ing) from TU Braunschweig, Germany.
https://www.linkedin.com/in/paul-blinzer-4523602
You’ll likely find the HSA software and toolchains quite useful and timeless
by Paul Blinzer, Embedded Computing Design: http://embedded-computing.com/guest-blogs/youll-likely-find-the-hsa-software-and-toolchains-quite-useful-and-timeless/#
Many people talk about hardware architecture as if it’s the most important part of a new platform. It’s true that hardware architecture is important for performance, which was discussed at length in a previous blog post. As a refresher, the pillars of the Heterogeneous System Architecture (HSA) are unified and shared virtual memory, user-mode dispatch, platform atomics, architected signals, a strict memory model, quality of service, and cache coherency.
However, these features are not included in the platform architecture for their own sake; they allow software to be written easily and to run efficiently. Even more importantly, they enable existing software to be ported easily, and ideally automatically, onto the new architecture.
While hardware typically has a limited lifespan of a few years at most, software may live almost forever. Sure, almost no one uses actual VT100 text terminals to communicate with the computer and the programs running back then, yet a lot of the software used today uses libraries and application frameworks that have their origin as far back as the 1970s. That software set the foundation of high-performance computing, the Internet, and security protocols used today, usually behind a shiny user interface. Even the good old VT100 terminal still lives on in the command lines of many popular operating systems (OSs) where the control sequences still behave as they did 40 years ago.
This is one reason why some platform architectures have endured over decades. While the hardware design and implementation may have changed substantially internally, the software-visible instruction set architecture (ISA) has endured and been extended incrementally without breaking backward compatibility with old programs. Other, more modern architectures were popular for a time but ultimately withered away as their performance advantage diminished: software-compatible platforms came close enough to their performance levels to make binary software compatibility the overwhelming factor. Good examples of enduring ISAs are the x86 ISA, the ARM instruction architecture, or IBM’s System/360 ISA, the latter celebrating its 53rd anniversary and still in use.
How do you ensure the long-term viability of a platform architecture? You ensure that software written for the traditional architectures runs well, and faster, on it. You also keep the software development toolchain (compilers, linkers, and the development process) familiar, so that the programmer doesn’t have to deal with two or more different toolchains to get performant software running on the platform.

Today’s extensive use of open-source software is an important factor, especially the GNU and LLVM-based compiler toolchains, readily available in open source repositories, and OSs like Linux, which are used as a foundation in embedded systems in various forms, sometimes “hidden away” (like in the case of Android). However, applications need to start and run without much delay, so it’s important that compilation and time-consuming code optimization for the accelerator don’t happen at the application’s load time (as often happens with many current accelerator APIs).
Most code optimization should happen once, when producing the application binary, which can then be readily loaded and mapped to the accelerator. This needs a portable, accelerator-neutral ISA with fast transcription to the target accelerator ISA, instead of full compilation. Hence, it’s important to define a vendor-neutral ISA, which in the case of HSA is called HSA Intermediate Language (IL) or HSAIL. This IL represents a common ISA for compilers to target and is designed to be close to a data-parallel accelerator like a GPU, DSP or other hardware.
The source code written in a common high-level language like C++ or Python, be it an application framework or a popular application, will then produce code that’s defined in the IL. The compiler can apply all the extensive optimization steps to generate the intermediate code, which can then be linked with other libraries, and even with modules written in different languages, such as C++, for some functions.
By integrating the IL as a binary section in the application binary (which is defined in an object format called BRIG), the program loader can then load both the host ISA and the accelerator code blocks in parallel and allow each to execute the program as written by the programmer without the end user seeing a difference from regular program load. Using the HSA run-time functionality, the software engineer can either target the HSA run-time directly or use an application interface or framework sitting on top of it, such as OpenCL.
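The build-once, translate-at-load flow described above can be sketched with a toy model. This is only an illustration under stated assumptions: the three-address IL, the constant-folding pass and the two target mnemonic tables ("gpu_x", "dsp_y") are all invented for this example and are not HSAIL, BRIG or any real accelerator ISA. The point is that the expensive optimization runs once at build time, while the load-time step is a cheap one-to-one translation.

```python
# Toy model only: the IL, the pass and the "target ISAs" below are invented for
# illustration and are not HSAIL, BRIG or any real accelerator instruction set.

# Each instruction is (op, dest, src_a, src_b); sources are register names or ints.
KERNEL_IL = [
    ("add", "t0", 2, 3),
    ("mul", "t1", "x", "t0"),
    ("mul", "t2", 4, 1),
    ("add", "out", "t1", "t2"),
]

OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}


def optimize(il):
    """Build-time step, run once: fold instructions whose operands are constants.

    Assumes every register is written exactly once (SSA-like form) and that the
    final result register does not itself fold to a constant.
    """
    known = {}  # register name -> constant value
    kept = []
    for op, dest, a, b in il:
        a, b = known.get(a, a), known.get(b, b)
        if isinstance(a, int) and isinstance(b, int):
            known[dest] = OPS[op](a, b)
        else:
            kept.append((op, dest, a, b))
    return kept


# Hypothetical per-target mnemonics; a real finalizer does far more than rename.
TARGET_MNEMONICS = {
    "gpu_x": {"add": "gx.add", "mul": "gx.mul"},
    "dsp_y": {"add": "DY_ADD", "mul": "DY_MPY"},
}


def finalize(il, target):
    """Load-time step: cheap one-to-one translation of IL to a target ISA."""
    table = TARGET_MNEMONICS[target]
    return [f"{table[op]} {dest}, {a}, {b}" for op, dest, a, b in il]


def run(il, inputs):
    """Reference interpreter, used to check that optimization kept the semantics."""
    regs = dict(inputs)
    for op, dest, a, b in il:
        a = regs[a] if isinstance(a, str) else a
        b = regs[b] if isinstance(b, str) else b
        regs[dest] = OPS[op](a, b)
    return regs


optimized_il = optimize(KERNEL_IL)            # done once, when building the binary
gpu_code = finalize(optimized_il, "gpu_x")    # done at load time, per target
dsp_code = finalize(optimized_il, "dsp_y")
print(gpu_code)
print(dsp_code)
```

In this sketch the four-instruction kernel shrinks to two instructions at build time, and the same optimized IL is then translated for either target without repeating the optimization.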

But that’s not all. AMD has developed an open-source HSA run-time called Radeon Open Compute (ROCm) and added a portability layer called Heterogeneous Interface for Portability (HIP) that allows source code using proprietary CUDA APIs to compile and run on top of the ROCm run-time, while keeping source code compatibility. Alongside CodeXL, an open-source tool for profiling and debugging data parallel applications, this is a powerful toolset to automatically port and run large application frameworks. While not using all ROCm features, it’s an easy way to take advantage of AMD’s HSA implementation without refactoring legacy code.
More information can be found in a half-day HSA-focused tutorial at the HPCA/CGO conference in a couple of weeks.
New HSA Academic Centers of Excellence Undertake Research and Development of HSA-Compliant Technologies
By Dr. John Glossner, Computing Now: https://www.computer.org/web/hsa-connections/content?g=54930593&type=article&urlTitle=new-hsa-academic-centers-of-excellence-undertake-research-and-development-of-hsa-compliant-technologies
The Heterogeneous System Architecture (HSA) Foundation recently added two new HSA Academic Centers of Excellence – Technische Universitaet (TU) Darmstadt, and Friedrich-Alexander-University Erlangen-Nurnberg (FAU), both in Germany.
As mentioned in previous HSA Connections posts, HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).
The research these universities are undertaking around HSA could potentially have significant impact for large-scale commercial use, particularly in the data center.
The Embedded Systems and Applications Group (ESA) of TU Darmstadt, for instance, is working with the HSA Foundation to explore how reconfigurable computing (specifically via Field-Programmable Gate Arrays or FPGAs), can be employed in processing units within the HSA framework.
Two key technologies developed by ESA are driving development: The Threadpool Composer system automatically assembles high-performance multi-threaded compute accelerators on FPGAs from existing hardware blocks, providing both pre-built hardware interfaces (e.g., to external memories or the host CPU), as well as software services (e.g., for dispatching compute jobs to the FPGA).
The joint research performed with the HSA Foundation also encompasses major development work on ffLink, a flexible high-performance PCI Express Gen3 x8 interface developed by ESA, which is capable of reaching a transfer rate of more than 7 GB/s between the FPGA accelerator and the host. Both Threadpool Composer and ffLink have been released as open source at https://git.esa.informatik.tu-darmstadt.de.
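To put the "more than 7 GB/s" figure in context, the short calculation below compares it with the raw link rate of a PCI Express Gen3 x8 connection. It uses the standard Gen3 parameters (8 GT/s per lane, 128b/130b encoding) and counts 1 GB as 10^9 bytes; it ignores packet and protocol overhead, so it is a rough upper bound rather than a statement about how ffLink was measured.

```python
# Rough upper bound for a PCIe Gen3 x8 link, ignoring packet/protocol overhead.
GEN3_GT_PER_S = 8.0        # Gen3 signalling rate per lane (gigatransfers/s)
ENCODING = 128 / 130       # Gen3 uses 128b/130b line encoding
LANES = 8                  # x8 link, as stated for ffLink


def raw_bandwidth_gb_s(lanes=LANES):
    """Raw payload-capable bit rate per direction, converted to GB/s (10^9 bytes)."""
    gbit_per_s = GEN3_GT_PER_S * ENCODING * lanes  # one bit per transfer per lane
    return gbit_per_s / 8                          # bits -> bytes


peak = raw_bandwidth_gb_s()
reported = 7.0             # "more than 7 GB/s" from the text above
print(f"raw link rate: {peak:.2f} GB/s, reported rate is at least {reported / peak:.0%} of it")
```

Under these assumptions the raw rate is about 7.88 GB/s per direction, so a sustained transfer rate above 7 GB/s corresponds to roughly 89 percent or more of what the link can carry.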
In the press release, Professor Andreas Koch, who heads the Embedded Systems and Applications Group (ESA) of TU Darmstadt, noted that the collaboration with the HSA Foundation is enabling them to make great progress on research that would have been extremely difficult to tackle without the additional insight provided by the industry partners.
The Department of Computer Science at FAU is currently focusing on integrating image processing accelerators in an FPGA and developing an HSA compliant interface. They are developing a self-designed processor core (packet processor) which is able to process requests and send them back to a host CPU using an HSA interface. The packet processor is then connected to the FAU’s own accelerator core and a PCI Express link to the system’s main memory.
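The request flow described above (a host submits work, a packet processor picks it up, hands it to an accelerator core and reports back) can be sketched as follows. This is a simplified model and not FAU's design or the HSA runtime API: the `PacketQueue`, `Signal`, `packet_processor` and `blur3` names, the dictionary packet layout and the use of Python threads are all assumptions made for illustration.

```python
import threading


class Signal:
    """Completion signal: starts at 1; the processor decrements it to 0 when done."""

    def __init__(self, value=1):
        self.value = value
        self._cond = threading.Condition()

    def decrement(self):
        with self._cond:
            self.value -= 1
            self._cond.notify_all()

    def wait_zero(self, timeout=5.0):
        with self._cond:
            return self._cond.wait_for(lambda: self.value == 0, timeout)


class PacketQueue:
    """Fixed-size ring buffer with monotonically increasing read/write indices."""

    def __init__(self, size=4):
        self.slots = [None] * size
        self.size = size
        self.write_index = 0
        self.read_index = 0
        self.doorbell = threading.Condition()

    def submit(self, packet):
        with self.doorbell:
            # Block while the ring is full.
            self.doorbell.wait_for(lambda: self.write_index - self.read_index < self.size)
            self.slots[self.write_index % self.size] = packet
            self.write_index += 1
            self.doorbell.notify_all()  # "ring the doorbell"

    def take(self):
        with self.doorbell:
            # Block while the ring is empty.
            self.doorbell.wait_for(lambda: self.read_index < self.write_index)
            packet = self.slots[self.read_index % self.size]
            self.read_index += 1
            self.doorbell.notify_all()
            return packet


def blur3(row):
    """Stand-in 'image accelerator': 3-tap box filter over one row, edges clamped."""
    n = len(row)
    return [(row[max(i - 1, 0)] + row[i] + row[min(i + 1, n - 1)]) // 3 for i in range(n)]


def packet_processor(queue, accelerator):
    """Consume packets, run them on the accelerator, then signal completion."""
    while True:
        packet = queue.take()
        if packet is None:  # shutdown sentinel used only by this sketch
            return
        packet["result"] = accelerator(packet["args"])
        packet["completion"].decrement()


queue = PacketQueue()
worker = threading.Thread(target=packet_processor, args=(queue, blur3))
worker.start()

packet = {"args": [0, 30, 60, 90], "result": None, "completion": Signal()}
queue.submit(packet)                      # host side: enqueue the request
done = packet["completion"].wait_zero()   # host side: wait for the completion signal
queue.submit(None)
worker.join()
print(done, packet["result"])
```

The host never calls the accelerator directly; it only writes a packet into the queue and waits on the completion signal, which mirrors the division of labor between host CPU and packet processor described in the text.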
To enable a fast interconnect to the host CPU, FAU is collaborating with TU Darmstadt’s Embedded Systems and Applications Group, which is providing a PCI Express core. FAU is also working on a technical prototype with the HSA Foundation; to support this work AMD, an HSA Foundation member, is providing the host system and technical help for using AMD HSA-enabled APUs, CPUs and GPUs.
The key takeaway from all of this?
FAU Professor Dietmar Fey summed it up nicely: “We’re now able to reach out to many industrial partners and work with them in establishing a standardization for heterogeneous computing platforms. It’s now becoming possible to combine fundamental research from a university with real world industry architectures and applications.”
The two universities are joining Northeastern University in Boston as HSA Academic Centers of Excellence. We’re looking forward to engaging on more projects with other universities worldwide to help make true heterogeneous processing a reality.
Heterogeneous Systems Architecture (HSA) Foundation Adds Top German Universities as HSA Academic Centers of Excellence
TU Darmstadt and FAU Researching, Developing HSA-Compliant Technologies
BEAVERTON, OR – Dec. 6, 2016 – The Heterogeneous System Architecture (HSA) Foundation has expanded its Academic Partnership Program with the addition of two new HSA Academic Centers of Excellence – Technische Universitaet (TU) Darmstadt, and Friedrich-Alexander-University Erlangen-Nurnberg (FAU), both in Germany. The universities will undertake critical research that will help to further proliferate HSA platforms.
HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).
“Both TU Darmstadt and FAU are working on a number of innovative HSA projects that will have significant impact not just in the academic community, but also potentially for large-scale commercial use, particularly in data center settings,” said HSA Foundation President Dr. John Glossner.
The Embedded Systems and Applications Group (ESA) of TU Darmstadt (Germany), headed by Professor Andreas Koch, is working with the HSA Foundation to explore how reconfigurable computing (specifically via Field-Programmable Gate Arrays or FPGAs), can be employed in processing units within the HSA framework.
“Our collaboration with the HSA Foundation is enabling us to make excellent progress on research topics that would have been extremely difficult to tackle without the additional insight provided by the industry partners. We look forward to advancing the architecture of heterogeneous computers for both academic as well as industrial use-cases,” said Professor Koch.
According to FAU Professor Dietmar Fey, the Department of Computer Science is currently focusing on integrating image processing accelerators in an FPGA and developing an HSA compliant interface. FAU is also collaborating on HSA technology with TU Darmstadt’s Embedded Systems and Applications Group; and is working on a technical prototype with HSA Foundation members.
“Our Academic Center of Excellence relationship is now having a significant impact – we’re now able to reach out to many industrial partners and work with them in establishing a standardization for heterogeneous computing platforms. It’s now becoming possible to combine fundamental research from a university with real world industry architectures and applications,” said Professor Fey.
“Professor Koch and Professor Fey have extensive experience in heterogeneous platform architecture design,” noted Paul Blinzer, a Fellow in the Systems Software group at Advanced Micro Devices, Inc. (AMD). “Their participation in Heterogeneous System Architecture will advance state-of-the art research in heterogeneous systems.”
Last month the HSA Foundation announced the first Academic Center of Excellence at Northeastern University in Boston. To learn more about engaging with the HSA Foundation on academic programs, contact academic@hsafoundation.com.
About TU Darmstadt
TU Darmstadt was founded in 1877. In 1882, the university established the first chair in Electrical Engineering worldwide. The university is a member of TU9, an alliance of leading institutes of technology in Germany. TU Darmstadt offers a comprehensive spectrum of undergraduate, graduate and doctoral programs, with a focus on the natural and engineering sciences. The university, which also hosts two Fraunhofer Institutes, performs world-class research and actively collaborates with academic and industrial partners nationally and internationally.
About FAU
Founded in 1743, FAU is a research university with an international perspective and one of the largest universities in Germany, with 40,174 students, 256 degree programs, 4,000 academic staff (including over 647 professors), and 500 partnerships with universities all over the world. FAU’s outstanding research and teaching is reflected in top positions in both national and international rankings, as well as the high amount of DFG funding which its researchers are able to secure.
Follow the HSA Foundation on Twitter, Facebook and LinkedIn.
The First HSA Academic Center of Excellence
Posted on November 9, 2016, Entrepreneur Podcast Network: http://epodcastnetwork.com/the-first-hsa-academic-center-of-excellence/ (please visit website for podcast download)
Dr. John Glossner, President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit whose goal is making programming for parallel computing easy and pervasive, joins Enterprise Radio to discuss the Foundation, its overall benefits and the new partnership.
Listen to host Eric Dye & guest Dr. John Glossner discuss the following:
- Describe the Foundation and how it got started.
- What are the overall benefits of joining the Foundation and adopting the membership?
- You recently announced new developments in heterogeneous architecture from Northeastern University. Can you tell us more about those developments?
- Any particular reason why you selected Northeastern for this partnership?
- Will the Foundation also be developing relationships with other key research universities worldwide? Have you identified those partnerships yet?
- Which industry constituents might be interested in the HSA Foundation’s work?
- How does the HSA Foundation development process work?
- How does one get involved and what are the criteria for being accepted into the Foundation?
John Glossner, Ph.D., is the President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive.
HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption.
HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Glossner currently serves as CEO of General Processor Technologies.
SC16 to Feature Milestones in Heterogeneous System Architecture (HSA) Programming Languages, Open Standards, and Open Source Tools
Beaverton, OR, Nov. 9, 2016 – SC16, the international conference for high-performance computing (HPC), networking, storage, and analysis, will feature sessions that highlight recent heterogeneous system architecture (HSA) momentum. SC16 brings together the international supercomputing community to discuss the technologies that will shape the future of large-scale technical computing and data-driven science.
WHO: The HSA Foundation, a non-profit consortium whose goal is making programming for parallel computing easy and pervasive. Participants in the SC16 HSA sessions include:
- HSA Foundation Chairman and Senior Director of Radeon Open Compute for AMD Gregory Stoner
- AMD Senior Fellow Design Engineer and GPU CTO Ben Sander
- AMD Senior Member of Technical Staff Mayank Daga
WHAT: HSA is a standardized platform design supported by more than 40 technology companies and 17 universities that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. HSA sessions at SC16 will highlight progress toward the Foundation’s goal of bringing true heterogeneous computing to platforms including vision based IoT systems, mobile devices, desktops, HPC systems, AR/VR environments, and servers. HSA-related sessions at SC16 include:
- Emerging Technologies: Programming High-Performance Heterogeneous Computing Systems with the Radeon Open Compute Platform (ROCm)
- Panel Discussion: Bringing About HPC Open-Standards World Peace
- Exhibitor Forum: Revolutionizing Large-Scale Heterogeneous HPC Systems with AMD’s ROCm Platform
WHERE: Salt Palace Convention Center in Salt Lake City
WHEN: Nov. 13-18, 2016; visit the SC16 website for specific session times
“Many HSA Foundation members such as AMD are now delivering a wide range of heterogeneous systems, including those based on HSA,” noted HSA Foundation President Dr. John Glossner. “It’s very exciting as one of the Foundation’s goals is to bring true heterogeneous computing to an array of platforms, some of which include Deep Neural Networks (DNNs), vision based IoT systems, mobile devices, desktops, high-performance computing (HPC) systems, AR/VR environments, and servers.”
“The HSA Foundation is a strong proponent of open source development tools directly and through its member companies,” said HSA Foundation Chairman Greg Stoner. “AMD’s Radeon Open Compute Platform (“ROCm”) initiative, for example, brings a rich heterogeneous programming foundation for developers, and offers an array of development tools now freely available supporting HSA.”
Stoner added that ROCm via an HSA standardized object loader supports two compiler foundations:
- LLVM (Low Level Virtual Machine) compiler supports:
– HCC compiler for heterogeneous C++ with PSTL development
– HIP compiler for simply porting CUDA codes
– Continuum’s Anaconda with Numba for supporting Python development
– Khronos Group’s OpenCL C-based programming language
- SUSE GCC via enablement of HSA runtimes and HSA object format in conjunction with General Processor Technologies and Parmance
HSA Foundation Announces New Developments in Heterogeneous Systems Architecture (HSA) from Northeastern University
BOSTON, MA – Oct. 18, 2016 – The HSA Foundation has expanded its Academic Partnership Program with the addition of Northeastern University as the first HSA Academic Center of Excellence. The HSA Foundation will be expanding the program by driving innovation in heterogeneous processing to help usher in the next evolution in computing.
HSA is gaining increasing traction, noted HSA Foundation President Dr. John Glossner, with recently announced HSA compliant products, the launch earlier this year of the HSA 1.1 specification, and other key developments.
“The HSA Foundation is now developing relationships with key research universities worldwide that are looking to work on the next evolution in computing both in hardware and software,” said Glossner. “HSA Academic Centers of Excellence will be exploring a wide range of HSA related areas across computer graphics, computer vision, computational photography, programming language and model research, and more.”
Glossner added that research universities are key to driving forward the industry’s understanding of the challenges and possibilities in heterogeneous computing.
The Northeastern University Computer Architecture Research (NUCAR) Laboratory, led by Prof. David Kaeli, has recently released HeteroMark, the first set of benchmark applications developed to evaluate HSA systems. In addition to this contribution to the open source community, the NUCAR team has also introduced Multi2sim-HSA, the first architectural simulator that supports HSA execution. This new simulator has been integrated in the Multi2sim 5.0 framework (www.multi2sim.org), an open source heterogeneous simulation infrastructure used by hundreds of international researchers.
“It is a pleasure for us to work collaboratively with the HSA Foundation members. We are already seeing that our tools and workloads are being leveraged by both industry and academia, enabling them to explore the many benefits of this new computing model,” said Kaeli.
“The work on HeteroMark by Northeastern University is creating an excellent architecture- and API-neutral test suite for common data parallel workloads using modern heterogeneous architecture features,” said AMD Fellow Paul Blinzer. “It allows analysis of GPU and CPU contributions to traditional and collaborative compute patterns with either GPU or CPU as a producer and consumer of data, and provides a good point of comparison with traditional systems designs, clearly demonstrating the benefits of modern heterogeneous systems features defined by the HSA specifications and, for example, implemented via AMD’s ROCm infrastructure.”
Blinzer added that “Multi2sim is a popular system simulation tool in academic research. Integration of HSA system features allows researchers to better understand and analyze modern platform features available on heterogeneous platforms based on HSA technologies.”
To learn more about engaging with the HSA Foundation on academic programs, contact academic@hsafoundation.com.
About Northeastern University
Founded in 1898, Northeastern is a global, experiential, research university built on a tradition of engagement with the world, creating a distinctive approach to education and research. The university offers a comprehensive range of undergraduate and graduate programs leading to degrees through the doctorate in nine colleges and schools, and select advanced degrees at graduate campuses in Charlotte, North Carolina, Seattle, Silicon Valley, and Toronto.
Platform and Hardware Requirements for HSA Technologies
by Paul Blinzer, September 8th, Embedded Computing Design: http://embedded-computing.com/guest-blogs/platform-and-hardware-requirements-for-hsa-technologies/#
Heterogeneous system architecture (HSA) is now a standardized platform design, supported by more than 40 technology companies and 17 universities, that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. Spearheading HSA is the HSA Foundation, a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs, and ISVs, whose goal is making programming for parallel computing easy and pervasive.

Briefly, HSA allows developers to easily and efficiently apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics, and fixed function accelerators—in today’s complex systems-on-chip (SoCs).
In this first of two posts, I’ll focus on platform and hardware requirements for HSA technologies; the second part will center on software and toolchains. Both will be discussed in depth at a tutorial at the upcoming 25th International Conference on Parallel Architectures and Compilation Techniques (PACT). The conference will be held from Sept. 11-15 in Haifa, Israel.
The architecture pillars of HSA
One of the key benefits of HSA is a set of platform architecture features and a programming model that software can depend on for parallel computing. Software using accelerators through an API like OpenCL, CUDA, or similar typically cannot assume that certain hardware features are available on every platform. It must therefore either target the lowest common denominator or implement the same functionality in several different ways, so that it can take advantage of an optional API feature while still supporting a lesser-equipped platform.

[Figure 1 | Pillars of HSA requirements]
HSA, in contrast, requires a select few modern hardware features that make using the accelerator far more efficient and simplify programming enormously. This is similar to what a CPU ISA like x86-64 or ARM has accomplished for compilers and application software development: a reasonable expectation that a program will run efficiently on a platform with a particular processor and OS.
Let’s start with a brief list of requirements, each with a short explanation of why it’s important.
Security and “quality of service”
An HSA accelerator is used as a peer processor to the CPU by the application, with all benefits and obligations. One of the key obligations is not allowing bad application code to do bad things to the system. Therefore, an HSA accelerator has a “user mode” ingrained for execution, where the operating system (OS) runtime sets strict policies at the hardware level for the accelerator. The accelerator can only access data and execute code that is part of the application process; if it tries to access anything outside of the expected data, the OS runtime is notified and can shut down the application’s accelerator access without affecting the rest of the system. A memory management unit (MMU) and other hardware to support it are therefore required by the HSA standard.
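As a rough illustration of that policy, the following Python sketch models a per-process page check. It is a toy model under stated assumptions, not HSA code: the class, the exception and the 4 KiB page size are all hypothetical, and a real MMU enforces this in hardware rather than in software.

```python
# Toy model of "user mode" enforcement for an accelerator.
# All names and the page size are assumptions made for illustration.
PAGE_SIZE = 4096


class AccessFault(Exception):
    """Raised when the modelled accelerator touches memory it may not use."""


class ProcessAddressSpace:
    """Pages mapped for one application process, as an MMU would see them."""

    def __init__(self, mapped_pages):
        self.mapped_pages = set(mapped_pages)
        self.accelerator_enabled = True

    def accelerator_read(self, address):
        if not self.accelerator_enabled:
            raise AccessFault("accelerator access revoked for this process")
        if address // PAGE_SIZE not in self.mapped_pages:
            # The runtime is notified and shuts down access for this process only.
            self.accelerator_enabled = False
            raise AccessFault(f"unmapped address {address:#x}")
        return f"data@{address:#x}"


process_a = ProcessAddressSpace(mapped_pages={0, 1})
process_b = ProcessAddressSpace(mapped_pages={8})

ok = process_a.accelerator_read(0x1004)        # page 1 is mapped for process A
try:
    process_a.accelerator_read(0x8000)         # page 8 belongs to process B
    faulted = False
except AccessFault:
    faulted = True
still_running = process_b.accelerator_read(0x8000)  # process B is unaffected
```

The point of the sketch is the last line: the fault disables accelerator use for the offending process while other processes carry on.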
Shared virtual memory (SVM)
SVM allows the accelerator to access the application’s data directly and process it. That requires an MMU in hardware. HSA accelerators require it for system security reasons already, so no problem here.
Accelerators without SVM using OpenCL 1.x or the common CUDA API require the CPU to do a lot of work: parse application data, select and copy all data to and from the accelerator (which may need a dedicated buffer), and validate results. There is no concept of passing a pointer and allowing the accelerator to operate on shared memory. Often multiple data sets must be copied to the accelerator even though only one of them ends up being used. If we don’t know in advance which data sets are required, a lot of time is wasted on copying, and this copy overhead can seriously degrade the performance benefit of the accelerator.
An HSA accelerator avoids this performance degradation. SVM allows the accelerator to parse the application data directly and access only what it needs. As an added benefit, the accelerator and the CPU can access the same data in memory, avoiding unwanted duplicates.
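The contrast between the two styles can be sketched in a few lines of Python. This is only a model: a bytearray stands in for application memory, a memoryview stands in for passing a pointer, and every function name is hypothetical rather than part of any accelerator API.

```python
# Toy model of copy-based offload versus shared virtual memory (SVM).
# Nothing here calls a real accelerator; all names are assumptions.

def scale_in_place(buf, factor):
    """Stand-in for an accelerator kernel: scale each byte, modulo 256."""
    for i in range(len(buf)):
        buf[i] = (buf[i] * factor) % 256


def offload_with_copies(app_data, factor):
    """Pre-SVM style: copy everything in, compute, copy everything back."""
    staging = bytearray(app_data)      # copy: application -> device buffer
    scale_in_place(staging, factor)    # the kernel works on the private copy
    app_data[:] = staging              # copy: device buffer -> application
    return 2 * len(app_data)           # bytes moved by the two copies


def offload_with_svm(app_data, start, stop, factor):
    """SVM style: hand over a view and touch only the range that is needed."""
    view = memoryview(app_data)[start:stop]  # no bytes are copied here
    scale_in_place(view, factor)             # writes land in app_data itself
    return 0                                 # no staging copies were made


data_a = bytearray(range(8))
data_b = bytearray(range(8))
copied_bytes = offload_with_copies(data_a, 2)
shared_bytes = offload_with_svm(data_b, 0, 4, 2)
```

In the copy-based path all eight bytes travel twice, whereas the shared path updates only the first four bytes in place and moves nothing.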
Platform atomic operations
Anyone who has written multithreaded code on a CPU knows how useful atomic operations are for ensuring that two threads can operate on the same data safely. Atomic operations are used in many different ways: for implementing semaphores, mutexes, histograms, and many other tasks that require a particular order of execution or of access to work correctly. HSA-compatible accelerators must support 32-bit or 64-bit atomics, so that threads running on the CPU and on the accelerator can operate on shared data and synchronize with each other very efficiently. Older accelerators without them always require arbitration using software APIs on the CPU. Software arbitration is very inefficient, and these systems end up using far more CPU cycles, which would be better spent on other tasks.
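The histogram case mentioned above makes a compact example. Python has no hardware atomic types, so in this sketch a lock stands in for the atomic instruction; the `AtomicCounter` class and `build_histogram` function are hypothetical names, and the two worker threads merely play the roles of a CPU thread and an accelerator.

```python
import threading


class AtomicCounter:
    """Models an atomic integer; the lock stands in for a hardware atomic."""

    def __init__(self, value=0):
        self._value = value
        self._lock = threading.Lock()

    def fetch_add(self, delta):
        """Add delta as one indivisible step and return the previous value."""
        with self._lock:
            old = self._value
            self._value = old + delta
            return old

    def load(self):
        with self._lock:
            return self._value


def build_histogram(samples, bins, workers=2):
    """Several agents update the same shared bins without losing counts."""
    counters = [AtomicCounter() for _ in range(bins)]

    def worker(chunk):
        for sample in chunk:
            counters[sample % bins].fetch_add(1)

    threads = [
        threading.Thread(target=worker, args=(samples[i::workers],))
        for i in range(workers)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return [counter.load() for counter in counters]


histogram = build_histogram(list(range(1000)), bins=4)
```

Because every update is a single fetch-and-add, no separate arbitration step is needed between the two agents.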
HSA signals and doorbells
This is something special to HSA and a very significant feature for power efficiency. You can consider these “atomics with benefits.” HSA signals are data types created by the runtime that behave similarly to atomically updated variables in memory. However, they also allow the hardware to monitor and report state changes, e.g. when a value is changed by an application thread or by an accelerator. One or more accelerators can update an HSA signal, listen and wait for a signal state change and, if they have nothing else to do, go to sleep while waiting and be woken up immediately when something has changed. By using HSA signals, one accelerator can notify other accelerators directly that data is ready, and these can start their processing immediately. If implemented fully, the CPU doesn’t need to perform any coordination and can stay asleep. This is a significant power-saving feature because only the hardware needed for a task is active, and only when it is needed. HSA signals can be used easily everywhere in the software architecture, and even HSA-based OpenCL implementations benefit.
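The wait-and-wake behaviour can be modelled with a condition variable. The sketch below is an analogy only: the `Signal` class and its methods are hypothetical and do not reproduce the HSA runtime API, and Python threads stand in for a producing and a consuming accelerator.

```python
import threading


class Signal:
    """Toy model of a signal: a value that waiters can sleep on."""

    def __init__(self, initial):
        self._value = initial
        self._cond = threading.Condition()

    def subtract(self, delta):
        """Update the value and wake every waiter so it can re-check."""
        with self._cond:
            self._value -= delta
            self._cond.notify_all()

    def wait_until(self, predicate, timeout=None):
        """Sleep (no busy polling) until predicate(value) holds."""
        with self._cond:
            self._cond.wait_for(lambda: predicate(self._value), timeout)
            return self._value


log = []
data_ready = Signal(1)   # 1 = producer still busy, 0 = data is ready


def producer():
    log.append("producer: data written")
    data_ready.subtract(1)            # notify the consumer directly


def consumer():
    data_ready.wait_until(lambda value: value == 0, timeout=5.0)
    log.append("consumer: processing")


consumer_thread = threading.Thread(target=consumer)
producer_thread = threading.Thread(target=producer)
consumer_thread.start()
producer_thread.start()
consumer_thread.join()
producer_thread.join()
```

No third party coordinates the handover: the consumer sleeps until the producer changes the signal, which mirrors how the CPU can stay out of the loop.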

User queues and dispatch
If you have programmed any accelerator using OpenCL or CUDA and followed in the debugger how the work reaches the hardware, you will have noticed the layers upon layers of software that the accelerator code and data must pass through before they finally reach the hardware to be processed. HSA removes this inefficiency and cuts out the middle layers by defining a hardware-based user queue dispatch mechanism that can be accessed directly from the application runtime. The architected queuing language (AQL) used by an accelerator’s packet processor allows any accelerator to dispatch work to itself, to the CPU of the system (to call OS runtime functions), or to other accelerators.
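A user queue can be pictured as a ring buffer that the application fills directly, plus a doorbell that wakes the packet processor. The Python sketch below models that idea under stated assumptions: `UserQueue`, `STOP` and the `(kernel, argument)` packet layout are hypothetical and far simpler than real AQL packets, and a condition variable stands in for the doorbell.

```python
import threading

STOP = object()   # hypothetical marker telling the packet processor to exit


class UserQueue:
    """Toy user-mode queue: ring buffer, read/write indices and a doorbell."""

    def __init__(self, size):
        if size <= 0 or size & (size - 1):
            raise ValueError("size must be a power of two")
        self.size = size
        self.slots = [None] * size
        self.write_index = 0
        self.read_index = 0
        self._doorbell = threading.Condition()

    def submit(self, packet):
        """Application side: write the packet, then ring the doorbell."""
        with self._doorbell:
            # Wait for a free slot if the ring is currently full.
            self._doorbell.wait_for(
                lambda: self.write_index - self.read_index < self.size)
            self.slots[self.write_index % self.size] = packet
            self.write_index += 1
            self._doorbell.notify_all()

    def process(self, results):
        """Packet processor side: sleep on the doorbell, then run packets."""
        while True:
            with self._doorbell:
                self._doorbell.wait_for(
                    lambda: self.read_index < self.write_index)
                packet = self.slots[self.read_index % self.size]
                self.read_index += 1
                self._doorbell.notify_all()   # a slot has been freed
            if packet is STOP:
                return
            kernel, argument = packet
            results.append(kernel(argument))


queue = UserQueue(size=4)
results = []
agent = threading.Thread(target=queue.process, args=(results,))
agent.start()
for n in range(6):
    queue.submit((lambda x: x * x, n))   # six packets through a four-slot ring
queue.submit(STOP)
agent.join()
```

The application writes packets straight into the ring and never calls through a driver stack, which is the property the user queue mechanism is designed to provide.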
Cache coherency
Most HSA accelerators have caches that keep frequently used data close to the execution units. But since other processors in the system may also access the cached system memory, the application and the hardware must make sure that the cache contents don’t get stale, especially in multithreaded execution. HSA accelerators therefore must provide mechanisms to keep the cached data current and either flush out pending data or invalidate cached data if other processors access the cached system memory. Cache coherency can be enforced automatically by hardware bus protocols or, alternatively, through explicit instruction controls, which in the case of HSA are part of the definition.
With that, I hope I’ve made you interested enough to eagerly wait for the next blog entry, where I will touch on the HSA memory model, HSAIL, and how HSA integrates into today’s embedded systems.
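The two software-controlled operations, flush and invalidate, can be shown with a toy write-back cache. This Python sketch is a simplified model with hypothetical names; a dict stands in for system memory, and real caches work on lines and hardware protocols rather than per-address dictionary entries.

```python
class WriteBackCache:
    """Toy private cache in front of shared memory (a plain dict)."""

    def __init__(self, memory):
        self.memory = memory
        self.lines = {}      # address -> cached value
        self.dirty = set()   # addresses written but not yet flushed

    def read(self, address):
        if address not in self.lines:
            self.lines[address] = self.memory[address]   # fetch on a miss
        return self.lines[address]

    def write(self, address, value):
        self.lines[address] = value
        self.dirty.add(address)

    def flush(self):
        """Write pending data back so other processors can see it."""
        for address in self.dirty:
            self.memory[address] = self.lines[address]
        self.dirty.clear()

    def invalidate(self):
        """Drop clean copies so the next read refetches from memory."""
        self.lines = {a: v for a, v in self.lines.items() if a in self.dirty}


memory = {0x10: 1, 0x20: 5}
accelerator = WriteBackCache(memory)

# Case 1: the accelerator writes and the CPU reads memory directly.
accelerator.write(0x10, 42)
cpu_before_flush = memory[0x10]   # still the old value
accelerator.flush()
cpu_after_flush = memory[0x10]    # now the accelerator's value

# Case 2: the CPU writes and the accelerator reads through its cache.
accelerator.read(0x20)            # brings the old value into the cache
memory[0x20] = 7                  # the CPU updates system memory
accel_before_invalidate = accelerator.read(0x20)   # stale cached copy
accelerator.invalidate()
accel_after_invalidate = accelerator.read(0x20)    # refetched from memory
```

Without the flush the CPU sees stale memory, and without the invalidate the accelerator sees a stale cache; hardware coherency protocols perform the equivalent steps automatically.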
