HSA Q&A with Dr. John Glossner

Computing Now: https://www.computer.org/web/computingnow/insights/content?g=53319&type=article&urlTitle=hsa-connections
HSA computing standards have progressed significantly since the HSA Foundation (HSAF) was established in 2012. Today, for instance, there are not only royalty free open specifications available but also fully operational production systems.

Pictured: Representatives from newly joined HSA Foundation members in China
In this Q&A, Dr. John Glossner, HSA Foundation president, provides additional insights on HSA-specific trends and issues:
What are the connections/differences between heterogeneous computing, general purpose computing and specialized computing? If heterogeneous computing is the future, what will happen to general purpose computing and specialized computing?
General purpose computing is what you find in a CPU. A CPU is meant to be able to process any function, but streaming-data workloads, such as artificial intelligence (AI), might not always be processed efficiently on it.
Specialized computing would be a design made for one particular application, such as AI, that is not intended to run general purpose code (sometimes called control code). The specialized accelerator typically has the advantage of consuming much less power when executing its special purpose application (e.g., AI).
Heterogeneous computing combines the best of both. It specifies how a CPU can talk to an accelerator, and the two are often integrated onto the same silicon die. So heterogeneous processors, meaning processors of different types such as CPUs, GPUs, DSPs, specialized accelerators and others, are all integrated together and cooperate to achieve an ideal balance of performance and power consumption for a given application.
What is the ultimate goal for the HSAF? What needs to be done to achieve it, and how?
The goal of the HSA Foundation is to make heterogeneous programming easier. That means creating standards that allow different types of processors to be programmed in the same language, using one single source file, and then automatically distributing parts of the application to the best processor to do the computing.
If research institutions and companies participate in establishing and promoting the standards of heterogeneous computing, will it affect their current development and solutions?
With open specifications and open source implementations of standards and tools, the Foundation’s hope is that it accelerates the pace of development and adoption of the technology. Corporations participating in HSAF enjoy royalty free access to all technologies developed.
The Foundation announced the formation of the China Regional Committee (CRC) in May. What were the motivations and goals in establishing the CRC and what is the connection/differences between CRC standards and HSA standards?
While the HSA Foundation has made a lot of progress there are always regional considerations and research opportunities to improve current systems. Recently China has become a leader in AI and other semiconductor technologies. With the emergence of low latency applications such as AI and virtual reality (VR) the Foundation anticipates improvements to current specifications. As this is an area of research and development being led by China, it is natural to invite key scientists and companies from China to adopt and adapt technologies and specifications.
How many local organizations have joined the CRC? What are members’ perspectives?
More than 30 members have joined the CRC to date. They comprise semiconductor companies, research universities and institutes (e.g., Chinese Academy of Sciences), tools and algorithms designers, test verification, and China standardization groups.
What effects will in-depth research and development of heterogeneous computing standards and technologies have on promoting China’s semiconductor industry advances?
China has become a global leader in semiconductor development and algorithms such as AI that execute on semiconductor chips. Heterogeneous systems that are now emerging are expected to accelerate R&D throughout the global industry. The formation of the CRC and future global adoption of the work done by the CRC should advance China’s semiconductor industry as well as contribute to worldwide growth.
What are the implications of developing and promoting heterogeneous computing standards for the creation of China’s heterogeneous computing industry chain and ecosystem?
While the algorithms that the CRC is evaluating are of immediate concern within China, it is expected that the entire global community and ecosystem will benefit from the standardization work being performed by the CRC.
Do heterogeneous computing chips have a wide range of AI applications? What are the specific advantages?
Heterogeneous chips have the potential to dramatically reduce the electric power to perform AI applications. When programs are optimized for specialized heterogeneous systems, each processor in the system can execute code that is most power efficient for its own function. This provides higher performance at lower power than non-heterogeneous systems.
What should China do to rapidly cultivate the heterogeneous computing industry?
By participating in the HSAF CRC, China can adapt and adopt technologies related to heterogeneous systems for China-specific issues. However, it is anticipated that these enhancements will be integrated into global HSAF specifications because the problems are common to many semiconductor companies.

Heterogeneous Computing Standards & International AI Conference Paving the Way Towards Global HSA Specifications

Xiamen, Fujian, China, July 9, 2017 – The recently concluded Heterogeneous Computing Standards & International AI Conference, held in Xiamen, is helping to lay the groundwork for heterogeneous computing standards not only in China, but worldwide. The two-day event was co-hosted by the China Electronics Standardization Institute (CESI), the HSA Foundation and the Chinese Association of Artificial Intelligence, with an organizing committee including Huaxia General Processor Technologies, the HSA Foundation’s newly formed China Regional Committee (CRC), and the Xiamen Integrated Circuit Industry Association.
Heterogeneous System Architecture (HSA) is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It provides an ideal mainstream platform for next-generation SoCs in a range of applications including artificial intelligence.
The Heterogeneous Computing Standards & International AI Conference brought together a number of industry leaders to discuss processors, software, applications, machine learning, and fintech for heterogeneous systems in artificial intelligence applications. HSA Foundation members including AMD, Arteris, Cadence, CESI, Huaxia General Processor Technologies, Imagination Technologies, Shanghai Advanced Research Institute – Chinese Academy of Sciences, and Xiamen University shared their latest results with hundreds of participants at the event.
Other presenting companies included Creekspring AI, DeepGlint, DeepPhi Tech, Gold Medal Global Investment, ICETech, KACHIP, Sanechips Technology, State Grid, and others. Dozens of renowned scholars and officials from universities, institutes and related industry companies also participated in the event.
The recently formed HSA Foundation CRC is laying the groundwork for progress on heterogeneous computing standards in China. It is focused on supporting the needs of HSA Foundation members in China and helping to fulfill the mission of the Foundation, which is to make heterogeneous programming universally easier. The formation of the CRC and potential global adoption of the work done by the CRC will advance China’s semiconductor industry as well as contribute to worldwide growth.
“As China is emerging as a powerhouse in programming heterogeneous systems, AI and semiconductor technology, it is natural to invite key scientists and companies from China to adopt and adapt technologies and specifications. We fully anticipate that these changes will not remain local to CRC working groups but will be incorporated into the global specifications and adopted worldwide,” said Dr. John Glossner, HSA Foundation president.
The CRC has instituted the following working groups and elected their Chairs to evaluate, enhance and develop HSA technologies:
• Application & System Evaluation Working Group – Dr. Kunlun Gao, Global Energy Interconnection Research Institute
• Virtual ISA Working Group – Dr. Jun Han, Fudan University
• System Architecture Working Group – Wanting Tian, Sanechips Technology
• Compilation & Runtime LIB Working Group – Dr. Lei Wang, Huaxia GPT
• OS & Multivendor Working Group – Dr. Min Gong, Beijing Linx Technology
• Interconnect Working Group – Dr. Zhiyi Yu, Sun Yat-sen University
• Security & Protection Working Group – Dr. Songhai Liang, Nationz Technologies
• Conformance Test Working Group – Dawei Chen, CESI
Supporting Quotes:
“Nearly all SoCs are heterogeneous systems. The HSA Foundation’s technology makes programming these systems much simpler by providing single-source toolchains, common APIs, and a choice of programming languages. When executed on an HSA runtime, both high performance and low power can be achieved. GPT has licensable cores supporting HSA technologies and is actively contributing to the development of the specifications. By internally adopting HSA, GPT has accelerated development of heterogeneous systems in multiple application domains including machine learning and artificial intelligence.”

Kerry Li, CEO, Huaxia General Processor Technologies

“China is firmly placed at the heart of heterogeneous systems, AI and semiconductor technology, with the HSA Foundation playing a key role in increasing awareness within the industry of the challenges and driving the availability of solutions. The recent China Regional Council event was a real triumph and Imagination was very pleased to participate in such a successful event. The event highlighted just how much potential heterogeneous computing has in terms of AI. As a founding member of the HSA Foundation, we look forward to continuing our work with other members to create specifications that make it easier to develop and program heterogeneous SoCs, as well as developing IP cores that enable the realization of such SoCs.”

James Liu, VP and GM China, Imagination Technologies

Application & System Evaluation Working Group
“Our goal is to verify the advanced nature of the HSA technology and the applicability of the HSA standards through a typical application demonstration. HSA has become a trend in advanced computing technology; its huge technical potential cannot just stay on paper as for standards, it also plays a role in multiple applications, reflecting the technical value through verification of actual cases. State Grid, the world’s largest public service corporation, has an urgent need for high-speed computing and artificial intelligence computing for ultra-large-scale power grids. We anticipate that HSA technology will be used in the future to meet these computing needs and ensure the smooth implementation of the national strategy on Global Energy Interconnection.”

Dr. Kunlun Gao, Director of Computing and Application Lab, Global Energy Interconnection Research Institute

Virtual ISA Working Group
“Our working group will focus on virtual explicitly parallel ISA that brings parallel acceleration to high level language. The virtual ISA, called HSAIL, can be finalized to native ISAs of different architectures such as CPU, GPU, DSP, custom accelerator, etc. Enabling data parallel programming is a key feature of HSAIL, so flexible vector processing, such as variable vector lengths and mixed-precision vector operations, will be involved in the technical discussion of our group. Moreover, some special instructions related to AI applications might also be considered for inclusion in HSAIL. This is an important open problem so far.”

Dr. Jun Han, Professor, Fudan University

System Architecture Working Group
“The establishment of the CRC will drive the HSA standardization process, and the CRC will become an important force in building HSA standards. The CRC System Architecture Working Group will study the necessity and performance advantages of heterogeneous architecture from an overall perspective, and topics brought by heterogeneous architecture on processor design, interconnected bus design, memory system design, low-power design, and testability design, etc. in order to form the heterogeneous architecture design methodology.”

Wanting Tian, Vice President, Sanechips Technology

Compilation & Runtime LIB Working Group
“The compiler and runtime are interrelated components to connect HSA and its working groups, supporting the virtual ISA and operating system interface specification. The compiler and runtime are the main method of user evaluation system and directly determine the developer/user experience with the HSA system.”

Dr. Lei Wang, Technical Director, Huaxia General Processor Technologies

OS & Multivendor Working Group
“We are dedicated to providing operating system support for the CRC and HSA Foundation. The main focus of the OS & Multivendor Working Group will include kernel work on system security and multi process resource sharing as well as coordinating multiple vendors on hardware-OS and application development.”

Dr. Min Gong, Chief Scientist, Beijing Linx Technology

Interconnect Working Group
“The interconnect network is becoming increasingly important due to the larger number of heterogeneous cores and more advanced fabrication technology. The CRC’s Interconnect Working Group will organize experts with strong backgrounds from academia and industry. Our goal is to evaluate interconnect network protocols and standards for many-core heterogeneous systems that are efficient, scalable, and reusable in various systems.”

Dr. Zhiyi Yu, Professor, Sun Yat-sen University

Security & Protection Working Group
“There is no doubt that the security and protection issues have become the foundation of the key technologies of heterogeneous computing for heterogeneous system architectures. The main task of the Security & Protection Working Group is to systematically solve the problem of safe operation and system protection of the HSA, and to develop a corresponding interface strategy and specifications from various aspects of instruction, thread, process, storage, IO, on-chip interconnection, operating system, application, etc. This is to promote and ensure the sustainable development and healthy growth of the new generation of heterogeneous computing chip products and its ecosystem.”

Dr. Songhai Liang, Chief Scientist of SoC Design, Nationz Technologies

Conformance Test Working Group
“CESI plays a role in the CRC to deal with the work of standardization and conformance test. HSA technology has a significant influence on the design of the next generation of SoCs. With the aim of promoting positive developments for the HSA Foundation, it is necessary for relevant parties to make efforts to research and develop relevant technical specifications of HSA and to lead relevant companies to adopt and commercialize the specifications. CESI can provide relevant products with tests and verifications, which are compliant to the standards of the HSA.”

Dawei Chen, Professor & Research Center Director, CESI

About the HSA Foundation
The HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive. HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption. HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.

Parallel pleasure: deep-geek chip consortium opens test tool

By Adrian Bridgwater, ComputerWeekly UK: http://www.computerweekly.com/blog/Open-Source-Insider/Parallel-pleasure-deep-geek-chip-consortium-opens-test-tool

The HSA Foundation has made available to developers the HSA PRM (Programmer’s Reference Manual) conformance test suite as open source software.

HSA who?

Yes, sorry… the HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive.

The test suite is used to validate Heterogeneous System Architecture (HSA) implementations for both the HSA PRM Specification and HSA PSA (Platform System Architecture) specification.

But what is HSA?

HSA is a standardised platform design intended to unlock the performance and power efficiency of the parallel computing engines found in most modern electronic devices.

It allows developers to apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).

“The HSA Foundation has always been a strong proponent of open source development tools directly and through its member companies,” said HSA Foundation chairman Greg Stoner. “Open sourcing worldwide the PRM conformance test suite is yet another example of an expanding array of development tools freely available supporting HSA.”

The HSA Foundation through its member companies and universities has also released many additional projects which are all available on the Foundation’s GitHub site.

Also published by Adrian Bridgwater, TechTarget USA: http://itknowledgeexchange.techtarget.com/open-source-insider/parallel-pleasure-deep-geek-chip-consortium-opens-test-tool/

HSA Foundation Establishes China Regional Committee to Enhance Global Awareness of Heterogeneous Computing

Committee Members Include Leading China Institutes, Universities, and Standards Authorities
Xiamen, Fujian, China, May 11, 2017 – The HSA Foundation has announced the formation of the China Regional Committee (CRC), whose founding members comprise 20 renowned institutes, universities and standards authorities throughout China. With a focus on growing the HSA ecosystem, the CRC’s mandate is to enhance the awareness of heterogeneous computing and promote the adoption of standards such as Heterogeneous System Architecture (HSA) in China. Dr. Xiaodong Zhang, from Huaxia General Processor Technologies, will serve as the CRC’s chairman.
“The CRC will help define regional heterogeneous computing needs, obtain advice from local experts, help China market segments become more integrated with continuously expanding HSA technologies, and serve as a gateway for the HSA Foundation to be more proactive and effective in addressing heterogeneous computing opportunities and issues affecting the region,” noted Zhang.
“China’s fast growing role in semiconductor innovation, combined with its skilled talent base, makes it a strategically advantageous location for the HSA Foundation to establish its first regional committee. Our hope is to accelerate China’s heterogeneous computing development in line with the standardization work, as well as to benefit the local industry community with high performance heterogeneous systems with reduced complexity. The establishment of the CRC will help significantly in these efforts,” said HSA Foundation President Dr. John Glossner.
“The HSA ecosystem continues to grow rapidly in China and we look forward to further collaborative ventures with our new CRC colleagues,” said HSA Foundation Chairman and Managing Director Greg Stoner.
Glossner said that the HSA Foundation is gaining increasing traction, with recently announced HSA compliant products worldwide, the introduction of the HSA 1.1 specification, and other key developments.
The CRC’s initial members include CESI, a professional institute for standardization in the field of electronics and IT industry in China under the Ministry of Industry and Information Technology (MIIT), and organizations that play an influential role in the HSA ecosystem in China, especially in the fields of artificial intelligence (AI), machine learning, AR/VR and many others which require support from heterogeneous processing. Founding members of the CRC include:
• China Electronics Standardization Institute (CESI)
• Fudan University
– State Key Laboratory of ASIC and System
• Hunan Institute of Science and Technology
• Institute of Computing Technology (ICT), Chinese Academy of Sciences
• Jiangsu Research Center of Software Digital Radio
• Nanjing University
– State Key Laboratory for Novel Software Technology
• Nanjing University of Aeronautics and Astronautics
• Nanjing University of Posts and Telecommunications
• Nanjing University of Science and Technology
• Nantong University
• Peking University
• Shanghai Advanced Research Institute, Chinese Academy of Sciences
• Shanghai Institute of Microsystem and Information Technology (SIMIT), Chinese Academy of Sciences
• Shanghai Jiao Tong University
• Shanghai Research Center for Wireless Communications
• Shanghai University
• Shenyang Institute of Automation, Chinese Academy of Sciences
– State Key Laboratory of Robotics
• Southeast University
– State Key Laboratory of Mobile Communications
• Sun Yat-sen University
• University of Science and Technology Beijing
2017 Heterogeneous Architecture Standards and Artificial Intelligence Conference
The first CRC Symposium is part of the 2017 Heterogeneous System Architecture Standards and Artificial Intelligence Conference, which will be held in Xiamen on May 25 – 26. The two-day event is co-hosted by CESI, the HSA Foundation and Chinese Association of Artificial Intelligence, with an organizing committee including Huaxia General Processor Technologies, the HSA Foundation CRC, and Xiamen Integrated Circuit Industry Association.
Renowned scholars and officials from related industry organizations will be invited to exchange and discuss standards and technologies for heterogeneous computing and artificial intelligence. A list of outstanding industry leaders will speak at the AI conference, joined by numerous other attending companies from related fields. For more conference information, a list of speakers and online registration, please visit www.hsa-china.com.
HSA is rapidly becoming a mainstream platform to support the promotion and application of the artificial intelligence industry and to develop standards for the next generation of SoCs and heterogeneous processors. The Symposium will bring together dozens of universities, institutes and companies to discuss the HSA Foundation and its development in China. Topics will include standards, key technologies, collaborative development, and software ecosystem construction, among others.
The CRC will also take an active role in developing the second annual Heterogeneous System Architecture 2017 Global Summit (visit www.hsafoundation.com; details to be posted soon). The two-day 2016 event was co-sponsored by the HSA Foundation and the China Semiconductor Industry Association (CSIA), and was also supported by the Beijing Economic and Technological Development Zone (E-Town), the Ministry of Industry and Information Technology of the People’s Republic of China (MIIT), and Cyberspace Administration of China.
Supporting Quotes

China Electronics Standardization Institute
“Heterogeneous computing is the key technology in next-generation processor design. China Electronics Standardization Institute (CESI), as the primary non-profit and comprehensive research institution for China’s standardization of electronic information technologies, is very pleased to be a member of the CRC, and together with other CRC members, will drive heterogeneous computing standardization work in China. As a member of the HSA Foundation, we look forward to joining global colleagues to improve HSA technical standardization and better promote the development of next generation processors worldwide, including in China.”

– Baoyou Wang, Director of Basic Product Research Center, China Electronics Standardization Institute

Nanjing University
“The School of Microelectronics at Nanjing University focuses on a variety of core disciplines, some of which include multi-core processing chip architectures and implementations, reconfigurable computing, three-dimensional network-on-chip (NoC) design, SoC design and high-performance VLSI implementations in digital signal processing algorithms. Heterogeneous computing is one of today’s hottest technologies and encompasses important applications such as mobile devices, the Internet of Things (IoT), cloud computing, and artificial intelligence. We look forward to working with the HSA Foundation in effectively using CPU, GPU, DSP, FPGA and other hardware and software resources to support research and development of heterogeneous system architectures. We thank the HSA Foundation for facilitating a dedicated research platform for institutions and universities.”
– Hongbing Pan, Professor, Nanjing University

Shenyang Institute of Automation, Chinese Academy of Sciences

“The institute’s main research directions include wireless sensor and communication technology, and industrial digital control systems. Our research group is engaged in R&D of industrial bus technology related to communications chips, and system-on-chip with communication functions. We look forward to working with HSA Foundation’s CRC where we will focus on the research of heterogeneous multi-core technology for industrial control SoCs. With the development of China’s “Industry 4.0”, the traditional centralized control is transitioning to a decentralized model. Industrial control systems are composed of heterogeneous cores including microcontrollers and DSPs connected by a common bus. HSAF technologies address these types of systems, providing flexibility, high performance, integration, and miniaturization. We look forward to adopting HSAF technology and evaluating the effectiveness of HSA for industrial control systems.”
– Chuang Xie, Senior Engineer and Director of SoC Designs, Shenyang Institute of Automation, Chinese Academy of Sciences

Southeast University
“The establishment of HSA Foundation’s CRC will further promote the rapid development of heterogeneous computing technology in the region. Southeast University has made several innovations in deep learning and cloud computing. Its Laboratory of Image Science and Technology, one of the earliest units in China to be involved in image processing, looks forward to contributing innovative technology solutions. This will enable researchers to focus on algorithm research and evaluate their effectiveness in HSA systems.”
– Aodong Shen, Assistant Professor, Southeast University

Sun Yat-sen University
“Processors are facing great challenges. Moore’s Law is slowing down, while new applications such as big data and artificial intelligence require higher computation and storage capability. Heterogeneous computing is proposed as “CPU+” architecture. It can significantly improve the system performance and energy efficiency for a wide range of application domains, and is evolving to become the main platform for the next generation computation industry. The HSA Foundation aims to standardize the heterogeneous computing architecture. It’s my honor to participate in HSA Foundation’s CRC. We look forward to providing input to the HSA Foundation with regional requirements and application results that will help develop the next generation standard for HSA, and push forward the research, development, and industrialization of heterogeneous computing in China.”
– Zhiyi Yu, Professor, Sun Yat-sen University

AMD
“We are glad to see the HSA Foundation is expanding, and we will continue to take an active role in heterogeneous computing activities and open source efforts via the ROCm platform, which brings HSA-enabled drivers, runtimes, compilers and tools to the global developer community. We hope, together with the new members, to promote more academic research in the China region.”
– Paul Blinzer, AMD Fellow

Huaxia General Processor Technologies
“As an HSA Foundation member, it is exciting to see that universities, institutes and companies in China are joining the CRC and making it a growing platform for heterogeneous computing in the region. Huaxia GPT focuses on designing and licensing embedded HSA-compatible processors and optimizing them to enable quicker, easier programming of high-performance parallel computing devices in heterogeneous ecosystems. We look forward to future collaboration with these newly joined forces on cutting-edge applications in the fields of machine vision, Internet of Things (IoT), Machine-to-Machine (M2M), edge computing and deep learning.”
– Kerry Li, CEO, Huaxia General Processor Technologies

Imagination Technologies
“As a founding member of the HSA Foundation, Imagination works closely with other members to create specifications that make it easier to develop and program heterogeneous SoCs, and we are also developing IP cores enabling the realization of such SoCs. The role of China in designing next-generation semiconductors cannot be underestimated, and the HSA Foundation’s CRC can play a key role increasing awareness within the industry of the challenges and solutions around heterogeneous computing.”
– James Liu, VP and GM China, Imagination Technologies


Heterogeneous: Performance and Power Consumption Benefits

Why multi-threaded, heterogeneous, and coherent CPU clusters are earning their place in the systems powering ADAS and autonomous vehicles, networking, drones, industrial automation, security, video analytics, and machine learning.
High-performance processors typically employ techniques such as deep, multi-issue pipelines, branch prediction, and out-of-order processing to maximize performance, but these do come at a cost; specifically, they impact power efficiency.
If some of these tasks can be parallelized, this impact could be mitigated by partitioning them across a number of efficient CPUs to deliver a high-performance, power-efficient solution. To accomplish this, CPU vendors have provided multicore and multi-cluster solutions, and operating system and application developers have designed their software to exploit these capabilities.
Similarly, application performance requirements can vary over time, so transferring the task to a more efficient CPU when possible improves power efficiency. For specialist computation tasks, dedicated accelerators offer excellent energy efficiency but can only be used for part of the time.
So, what should you be looking for when it comes to heterogeneous processors that deliver significant benefits in terms of performance and low power consumption? Let’s look at a few important considerations.
Multi-threading
Even with out-of-order execution, CPUs running typical workloads aren’t fully utilized every cycle; they spend most of their time waiting for access to the memory system. However, when one portion of the program (known as a thread) is blocked, the hardware resources could be used for another thread of execution. Multi-threading makes it possible to switch to a second thread when the first thread is blocked, increasing overall system throughput. Filling CPU cycles that would otherwise go unused with useful work leads to a performance boost; depending on the application, adding a second thread to a CPU typically adds 40 percent to the overall performance, for an additional silicon area cost of around 10 percent. Among CPU IP offerings, hardware multi-threading is a feature unique to Imagination’s MIPS CPUs.
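The area trade-off quoted above can be checked with a little arithmetic. The following is a minimal sketch that takes the 40 percent and 10 percent figures from the text as given (actual gains depend on the application); the function name `throughput_per_area` is hypothetical and used only for illustration.

```python
def throughput_per_area(perf_gain: float, area_cost: float) -> float:
    """Relative throughput per unit of silicon area after adding a hardware thread.

    Both arguments are fractions relative to the single-threaded baseline,
    so 0.40 means 40 percent more throughput and 0.10 means 10 percent more area.
    """
    return (1.0 + perf_gain) / (1.0 + area_cost)


# Figures quoted in the text: roughly 40 percent more performance
# for roughly 10 percent more silicon area.
ratio = throughput_per_area(perf_gain=0.40, area_cost=0.10)
print(f"throughput per unit area: {ratio:.2f}x the single-threaded baseline")
```

On these figures the second thread improves throughput per unit area by a little over a quarter, which is why the extra silicon is usually considered worthwhile.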
A Common View
To move a task from one processor to another requires each processor to share the same instruction set and the same view of system memory. This is accomplished through shared virtual memory (SVM). Any pointer in the program must continue to point to the same code or data and any dirty cache line in the initial processor’s cache must be visible to the subsequent processor.

Figure 1: Memory moves when transferring between clusters.

Figure 2: Smaller, faster memory movement when transferring within a cluster.

Cache Coherency
Cache coherency can be managed through software. This requires that the initial processor (CPU A) flush its cache to main memory before transferring to the subsequent processor (CPU B). CPU B then has to fetch the data and instructions back from main memory. This process can generate many memory accesses and is therefore time consuming and power hungry; this impact is magnified as the energy to access main memory is typically significantly higher than fetching from cache. To combat this, hardware cache coherency is vital, minimizing these power and performance costs. Hardware cache coherency tracks the location of these cache lines and ensures that the correct data is accessed by snooping the caches where necessary.
In many heterogeneous systems, the high-performance processors reside in one cluster, while the smaller, high-efficiency processors reside in another. Transferring a task between these different types of processors means that both the level 1 and level 2 caches of the new processor are cold. Warming them takes time and requires the previous cache hierarchy to remain active during the transition phase.
However, there is an alternative – the MIPS I6500 CPU. The I6500 supports a heterogeneous mix of external accelerators through an I/O Coherence Unit (IOCU) as well as different processor types within a cluster, allowing for a mix of high-performance, multi-threaded and power-optimized processors in the same cluster. Transferring a task from one type of processor to another is now much more efficient, as only the level 1 cache is cold, and the cost of snooping into the previous level 1 cache is much lower, so the transition time is much shorter.
Combining CPUs with Dedicated Accelerators
CPUs are general purpose machines. Their flexibility enables them to tackle almost any task but at the price of efficiency. Thanks to its optimizations, the PowerVR GPU can process larger, highly parallel computational tasks with very high performance and good power efficiency, in exchange for some reduction in flexibility compared to CPUs. It is bolstered by a well-supported software development ecosystem with APIs such as OpenCL and OpenVX.
The specialization provided by dedicated hardware accelerators offers a combination of performance with power efficiency that is significantly better than a CPU, but with far less flexibility.
However, accelerators are ideal for operations that occur frequently, because frequent use maximizes the potential performance and power efficiency gains. Specialized computational elements such as those for audio and video processing, as well as neural network processors used in machine learning, use similar mathematical operations.
Hardware acceleration can be coupled to the CPU by adding Single Instruction Multiple Data (SIMD) capabilities with floating point Arithmetic Logic Units (ALUs). However, while processing data through the SIMD unit, the CPU behaves as a Direct Memory Access (DMA) controller to move the data, and CPUs make very inefficient DMA controllers.
Conversely, a heterogeneous system essentially provides the best of both worlds. It contains some dedicated hardware accelerators that, coupled with a number of CPUs, offer the benefits of greater energy efficiency from dedicated hardware, while retaining much of the flexibility provided by CPUs.
The energy savings and performance boost depend on the proportion of time that the accelerator is doing useful work. Work packages appropriate for the accelerator come in a wide range of sizes: you might expect a small number of large tasks but many smaller tasks.
There is a cost in transferring the processing between a CPU and the accelerator, and this limits the size of the task that will save power or boost performance. For smaller tasks, the energy consumed and time taken to transfer the task exceeds the energy or time saved by using the accelerator.
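The break-even point described above can be sketched with a simple linear cost model. This is an illustration under stated assumptions, not a measurement: the function names, the throughput figures (100 and 1,000 operations per microsecond) and the fixed 20-microsecond transfer cost are all hypothetical. The same shape of calculation applies to energy if the rates and the transfer cost are expressed in joules rather than microseconds.

```python
def offload_saves_time(task_ops: float, cpu_ops_per_us: float,
                       accel_ops_per_us: float, transfer_us: float) -> bool:
    """Return True if running the task on the accelerator is faster overall."""
    cpu_time = task_ops / cpu_ops_per_us
    accel_time = transfer_us + task_ops / accel_ops_per_us
    return accel_time < cpu_time


def break_even_ops(cpu_ops_per_us: float, accel_ops_per_us: float,
                   transfer_us: float) -> float:
    """Task size (in operations) at which offloading starts to pay off.

    Solves n / cpu = transfer + n / accel for n. If the accelerator is not
    faster than the CPU, offloading never pays off and infinity is returned.
    """
    if accel_ops_per_us <= cpu_ops_per_us:
        return float("inf")
    return transfer_us / (1.0 / cpu_ops_per_us - 1.0 / accel_ops_per_us)


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
CPU_RATE, ACCEL_RATE, TRANSFER_US = 100.0, 1000.0, 20.0

threshold = break_even_ops(CPU_RATE, ACCEL_RATE, TRANSFER_US)
print(f"break-even task size: about {threshold:.0f} operations")

for ops in (500, 10_000):
    better = offload_saves_time(ops, CPU_RATE, ACCEL_RATE, TRANSFER_US)
    print(f"{ops} operations: offload {'helps' if better else 'does not help'}")
```

Lowering the fixed transfer cost lowers the threshold in direct proportion, so more of the many small tasks become worth offloading.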
Data Transfer Cost
To reduce time and energy costs, a Shared Virtual Memory with hardware cache coherency—as found in the I6500 CPU—is ideal as it addresses much of the cost of transferring the task. This is because it eliminates the copying of data and the flushing of caches. There are other available techniques to achieve even greater reductions.
The HSA Foundation has developed an environment to support the integration of heterogeneous processing elements in a system that extends beyond CPUs and GPUs. The HSA system’s intermediate language, HSAIL, provides a common compilation path to heterogeneous Instruction Set Architectures (ISAs), which greatly simplifies system software development. HSA also defines User Mode Queues.
These queues enable tasks to be scheduled and signals to trigger tasks on other processing elements, allowing sequences of tasks to execute with very little overhead between them.
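The idea of queued tasks chained by signals can be illustrated with ordinary threads. The sketch below is a conceptual model only and does not use the HSA runtime API: the `Signal` class, the `agent` function and the packet layout are hypothetical stand-ins, with Python threads playing the role of processing elements. It assumes a signal is a counter that a finished task decrements and that a later task can wait on.

```python
import queue
import threading


class Signal:
    """A counter that waiters can block on until it reaches zero."""

    def __init__(self, value: int = 1) -> None:
        self._value = value
        self._cond = threading.Condition()

    def decrement(self) -> None:
        with self._cond:
            self._value -= 1
            self._cond.notify_all()

    def wait_zero(self) -> None:
        with self._cond:
            self._cond.wait_for(lambda: self._value <= 0)


def agent(work_queue: queue.Queue) -> None:
    """Stand-in for a processing element draining its own queue."""
    while True:
        packet = work_queue.get()
        if packet is None:  # shutdown marker
            break
        task, wait_on, completion = packet
        if wait_on is not None:
            wait_on.wait_zero()  # start only once the earlier task has finished
        task()
        completion.decrement()  # release anything waiting on this task


results = []
accel_queue, cpu_queue = queue.Queue(), queue.Queue()
workers = [threading.Thread(target=agent, args=(q,)) for q in (accel_queue, cpu_queue)]
for worker in workers:
    worker.start()

filtered, classified = Signal(), Signal()
# Both packets are enqueued up front; the second is triggered by the first's signal,
# so the application does not have to step in between the two tasks.
accel_queue.put((lambda: results.append("filter"), None, filtered))
cpu_queue.put((lambda: results.append("classify"), filtered, classified))

classified.wait_zero()
for q in (accel_queue, cpu_queue):
    q.put(None)
for worker in workers:
    worker.join()
print(results)
```

The point of the design is that the dependency is expressed in the packets themselves, so the second task starts as soon as the first completes, without a round trip through the submitting application.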

Beyond Limitations
Heterogeneous systems offer the opportunity to significantly increase system performance and reduce system power consumption, enabling systems to continue to scale beyond the limitations imposed by ever shrinking process geometries.
Multi-threaded, heterogeneous and coherent CPU clusters such as the MIPS I6500 have the ideal characteristics to sit at the heart of these systems. As such they are well placed to efficiently power the next generation of devices.


Tim Mace is Senior Manager, Business Development, MIPS Processors, Imagination Technologies.

New Open Source Test Suite Adds to Broad Toolset for Heterogeneous System Architecture Development

Beaverton, OR, May 2, 2017 – The HSA Foundation has made available to developers the HSA PRM (Programmer’s Reference Manual) conformance test suite as open source software. The test suite is used to validate Heterogeneous System Architecture (HSA) implementations for both the HSA PRM Specification and HSA PSA (Platform System Architecture) specification.
With this addition to the already available HSA Runtime Conformance tests, HSA developers now have a fully open source conformance test suite for validating all aspects of HSA systems.
HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).
“The HSA Foundation has always been a strong proponent of open source development tools directly and through its member companies,” said HSA Foundation Chairman Greg Stoner. “Open sourcing worldwide the PRM conformance test suite is yet another example of an expanding array of development tools freely available supporting HSA.”
According to HSA Foundation President Dr. John Glossner, “The decision to open source the conformance test suite is strongly supported by the HSA Foundation and we believe this is an important step for allowing the developer community including non-member China Regional Committee (CRC) participants to test HSA systems. With the ability to develop conformance tests, the community can now contribute to the new test and thus drive the continual improvement of the test quality and consistency.”
“Good quality open source components are crucial in making heterogeneous computing more accessible to programmers and standards adopters. It is great to see that HSA Foundation continues its open source strategy by releasing the important PRM conformance test suite to the public,” said Dr. Pekka Jääskeläinen, CEO of Parmance.
The HSA Foundation through its member companies and universities has also released many additional projects which are all available on the Foundation’s GitHub site including:

  • HSAIL Developer Tools: finalizer, debugger, assembler, and simulator
  • GCC HSAIL frontend developed by Parmance and General Processor Technologies (GPT) allowing gcc finalization for any gcc machine target; the frontend is included in the upcoming GCC 7 release
  • Heterogeneous compute compiler (hcc) for single-source compilation of heterogeneous systems
  • Runtime implementations including AMD’s ROCm and phsa-runtime by Parmance and GPT; phsa-runtime can be used together with GCC HSAIL frontend to support the entire HSA programming stack using open source components
  • Portable Computing Language (pocl), an open source implementation of the OpenCL standard with a backend for HSA developed by the Customized Parallel Computing group of Tampere University of Technology (TUT), an HSA Foundation Academic Center of Excellence

See the complete roster at: https://github.com/HSAFoundation.

Mixed Reality: Computer Vision Killer App Will Change How We Communicate, Collaborate

By Jeff Bier, Founder, Embedded Vision Alliance. Computing Now: https://www.computer.org/web/hsa-connections/content?g=54930593&type=article&urlTitle=mixed-reality-computer-vision-killer-app-will-change-how-we-communicate-collaborate
At this year’s Consumer Electronics Show, I walked many miles and saw countless demos. Several of these demos were memorable, but one in particular really got my mental gears turning: Microsoft’s HoloLens.
HoloLens will spur many “aha” moments, leading to accelerated innovation in wearable computer vision devices, low-power 3D computer vision, and mixed reality.
HoloLens, of course, is Microsoft’s “mixed reality” glasses product, which has been shipping in pre-production form for about a year. Previously, I would have used the term “augmented reality” to refer to HoloLens, which overlays computer-generated graphics on the user’s view of the physical world. But here I’m adopting Microsoft’s preferred term, “mixed reality,” which many people now use to describe systems in which “people, places, and objects from your physical and virtual worlds merge together.”
Over the past five years, I’ve seen many demos of virtual reality, augmented reality and mixed reality. Most of these showed promise—but the promise usually felt distant, because the demos weren’t sufficiently polished to feel “real,” and weren’t easy to use.
That was then, this is now: HoloLens has nailed both the “feels real” and ease-of-use aspects. Wearing HoloLens, I played a shoot-em-up video game against an army of robots, illustrated in this video. The experience was stunning, thanks to three key capabilities. First, HoloLens is a wearable, battery-powered device so I was able to move about the room to dodge hostile robots. Second, HoloLens accurately mapped the room I was in, enabling the robotic invaders to create what looked like real cracks in the actual walls of the room. And third, as I turned my head and shifted my position within the room, HoloLens adapted to these movements seamlessly so that the illusion of merged physical and virtual worlds was maintained.
Now that I’ve experienced robust mixed reality, I foresee many compelling applications for this technology beyond gaming: enabling physicians to see inside a body for safer, more accurate treatment; giving utility workers a clear view of underground pipes and cables; providing consumers with a realistic preview of how a room will look after redecorating; and allowing museum visitors to see a skeleton transform into a fully formed, animated dinosaur. The fact that HoloLens sells for $3,000 suggests that, for a while at least, this technology is more likely to be adopted by hospitals, utility companies and museums than by individual consumers.
Of course, a convincing mixed reality (“MR”) experience—one in which the virtual and physical worlds interact in a realistic way—requires the MR device to maintain an accurate understanding of the surrounding physical world—and the user’s position within it—in three dimensions with very low latency. That is, it requires fast, highly accurate 3D computer vision.
Mixed reality doesn’t necessarily require a wearable device. Vehicle applications, for example, can use the windshield as a projection screen. And 8tree’s clever handheld device for quantifying surface damage projects information onto the surface being inspected. But in many cases, glasses are the most compelling way to deliver mixed reality. This is because they leave your hands free, because they know where you are looking, and because they have the ability to project information into your field of view wherever you’re looking. Packing all of the technology required for a convincing MR experience into a wearable device is a daunting challenge, however. With HoloLens, Microsoft has given us a hint of what’s possible. The HoloLens team has clearly put enormous effort into everything from custom chips to industrial design to create a device that’s reasonably comfortable to wear (though still bulky).
One of the key challenges for developers of products like HoloLens is harnessing the capabilities of heterogeneous compute resources—CPUs, GPUs, DSPs, FPGAs, and fixed-function accelerators—to deliver high performance with low cost and low energy consumption. HSA provides an approach that enables developers to easily and efficiently apply compute resources to demanding applications in today’s complex SoCs.
Learn more about heterogeneous computing for efficient computer vision at the upcoming Embedded Vision Summit. Marc Pollefeys, Director of Science for HoloLens and a pioneer in 3D computer vision, will be one of the keynote speakers.