Parallel pleasure: deep-geek chip consortium opens test tool

By Adrian Bridgwater, ComputerWeekly UK: http://www.computerweekly.com/blog/Open-Source-Insider/Parallel-pleasure-deep-geek-chip-consortium-opens-test-tool

The HSA Foundation has made available to developers the HSA PRM (Programmer’s Reference Manual) conformance test suite as open source software.

HSA who?

Yes, sorry… the HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive.

The test suite is used to validate Heterogeneous System Architecture (HSA) implementations for both the HSA PRM Specification and HSA PSA (Platform System Architecture) specification.

But what is HSA?

HSA is a standardised platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices.

It allows developers to apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).

“The HSA Foundation has always been a strong proponent of open source development tools directly and through its member companies,” said HSA Foundation chairman Greg Stoner. “Open sourcing worldwide the PRM conformance test suite is yet another example of an expanding array of development tools freely available supporting HSA.”

The HSA Foundation through its member companies and universities has also released many additional projects which are all available on the Foundation’s GitHub site.

HSA Foundation Establishes China Regional Committee to Enhance Global Awareness of Heterogeneous Computing

Committee Members Include Leading China Institutes, Universities, and Standards Authorities
Xiamen, Fujian, China, May 11, 2017 – The HSA Foundation has announced the formation of the China Regional Committee (CRC), with founding members comprising 20 renowned institutes, universities and standards authorities throughout China. With a focus on growing the HSA ecosystem, the CRC’s mandate is to enhance the awareness of heterogeneous computing and promote the adoption of standards such as Heterogeneous System Architecture (HSA) in China. Dr. Xiaodong Zhang, from Huaxia General Processor Technologies, will serve as the CRC’s chairman.
“The CRC will help define regional heterogeneous computing needs, obtain advice from local experts, help China market segments become more integrated with continuously expanding HSA technologies, and serve as a gateway for the HSA Foundation to be more proactive and effective in addressing heterogeneous computing opportunities and issues affecting the region,” noted Zhang.
“China’s fast growing role in semiconductor innovation, combined with its skilled talent base, makes it a strategically advantageous location for the HSA Foundation to establish its first regional committee. Our hope is to accelerate China’s heterogeneous computing development in line with the standardization work, as well as to benefit the local industry community with high performance heterogeneous systems with reduced complexity. The establishment of the CRC will help significantly in these efforts,” said HSA Foundation President Dr. John Glossner.
“The HSA ecosystem continues to grow rapidly in China and we look forward to further collaborative ventures with our new CRC colleagues,” said HSA Foundation Chairman and Managing Director Greg Stoner.
Glossner said that the HSA Foundation is gaining increasing traction, with recently announced HSA compliant products worldwide, the introduction of the HSA 1.1 specification, and other key developments.
The CRC’s initial members include CESI, a professional institute for standardization in the field of electronics and IT industry in China under the Ministry of Industry and Information Technology (MIIT), and organizations that play an influential role in the HSA ecosystem in China, especially in the fields of artificial intelligence (AI), machine learning, AR/VR and many others which require support from heterogeneous processing. Founding members of the CRC include:
• China Electronics Standardization Institute (CESI)
• Fudan University
– State Key Laboratory of ASIC and System
• Hunan Institute of Science and Technology
• Institute of Computing Technology (ICT), Chinese Academy of Sciences
• Jiangsu Research Center of Software Digital Radio
• Nanjing University
– State Key Laboratory for Novel Software Technology
• Nanjing University of Aeronautics and Astronautics
• Nanjing University of Posts and Telecommunications
• Nanjing University of Science and Technology
• Nantong University
• Peking University
• Shanghai Advanced Research Institute, Chinese Academy of Sciences
• Shanghai Institute of Microsystem and Information Technology (SIMIT), Chinese Academy of Sciences
• Shanghai Jiao Tong University
• Shanghai Research Center for Wireless Communications
• Shanghai University
• Shenyang Institute of Automation, Chinese Academy of Sciences
– State Key Laboratory of Robotics
• Southeast University
– State Key Laboratory of Mobile Communications
• Sun Yat-sen University
• University of Science and Technology Beijing
2017 Heterogeneous System Architecture Standards and Artificial Intelligence Conference
The first CRC Symposium is part of the 2017 Heterogeneous System Architecture Standards and Artificial Intelligence Conference, which will be held in Xiamen on May 25 – 26. The two-day event is co-hosted by CESI, the HSA Foundation and Chinese Association of Artificial Intelligence, with an organizing committee including Huaxia General Processor Technologies, the HSA Foundation CRC, and Xiamen Integrated Circuit Industry Association.
Renowned scholars and officials from related industry organizations will be invited to exchange and discuss standards and technologies for heterogeneous computing and artificial intelligence. A list of outstanding industry leaders will speak at the AI conference, joined by numerous other attending companies from related fields. For more conference information, a list of speakers and online registration, please visit www.hsa-china.com.
HSA is rapidly becoming a mainstream platform to support the promotion and application of the artificial intelligence industry and to develop standards for the next generation of SoCs and heterogeneous processors. The Symposium will bring together dozens of universities, institutes and companies to discuss the HSA Foundation and its development in China. Topics will include standards, key technologies, collaborative development, and software ecosystem construction, among others.
The CRC will also take an active role in developing the second annual Heterogeneous System Architecture 2017 Global Summit (visit www.hsafoundation.com; details to be posted soon). The two-day 2016 event was co-sponsored by the HSA Foundation and the China Semiconductor Industry Association (CSIA), and was also supported by the Beijing Economic and Technological Development Zone (E-Town), the Ministry of Industry and Information Technology of the People’s Republic of China (MIIT), and Cyberspace Administration of China.
Supporting Quotes

China Electronics Standardization Institute
“Heterogeneous computing is the key technology in the next-generation processor design. China Electronics Standardization Institute (CESI), as the primary non-profit and comprehensive research institution for China’s standardization of electronic information technologies, is very pleased to be a member of the CRC, and together with other CRC members, will drive heterogeneous computing standardization work in China. As a member of the HSA Foundation, we look forward to joining global colleagues to improve the HSA technical standardization and better promote the development of next generation processors worldwide including China.”

  – Baoyou Wang, Director of Basic Product Research Center, China Electronics Standardization Institute

Nanjing University
“The School of Microelectronics at Nanjing University focuses on a variety of core disciplines, some of which include multi-core processing chip architectures and implementations, reconfigurable computing, three-dimensional network-on-chip (NoC) design, SoC design and high-performance VLSI implementations in digital signal processing algorithms. Heterogeneous computing is one of today’s hottest technologies and encompasses important applications such as mobile devices, the Internet of Things (IoT), cloud computing, and artificial intelligence. We look forward to working with the HSA Foundation in effectively using CPU, GPU, DSP, FPGA and other hardware and software resources to support research and development of heterogeneous system architectures. We thank the HSA Foundation for facilitating a dedicated research platform for institutions and universities.”
– Hongbing Pan, Professor, Nanjing University

Shenyang Institute of Automation, Chinese Academy of Sciences

“The institute’s main research directions include wireless sensor and communication technology, and industrial digital control systems. Our research group is engaged in R&D of industrial bus technology related to communications chips, and system-on-chip with communication functions. We look forward to working with HSA Foundation’s CRC where we will focus on the research of heterogeneous multi-core technology for industrial control SoCs. With the development of China’s “Industry 4.0”, the traditional centralized control is transitioning to a decentralized model. Industrial control systems are composed of heterogeneous cores including microcontrollers and DSPs connected by a common bus. HSAF technologies address these types of systems providing flexibility, high performance, integration, and miniaturization. We look forward to adopting HSAF technology and evaluating the effectiveness of HSA for industrial control systems.”
– Chuang Xie, Senior Engineer and Director of SoC Designs, Shenyang Institute of Automation, Chinese Academy of Sciences

Southeast University
“The establishment of HSA Foundation’s CRC will further promote the rapid development of heterogeneous computing technology in the region. Southeast University has made several innovations in deep learning and cloud computing. Its Laboratory of Image Science and Technology, one of the earliest units in China to be involved in image processing, looks forward to contributing innovative technology solutions. This will enable researchers to focus on algorithm research and evaluate their effectiveness in HSA systems.”
– Aodong Shen, Assistant Professor, Southeast University

Sun Yat-sen University
“Processors are facing great challenges. Moore’s Law is slowing down, while new applications such as big data and artificial intelligence require higher computation and storage capability. Heterogeneous computing is proposed as a “CPU+” architecture. It can significantly improve the system performance and energy efficiency for a wide range of application domains, and is evolving to become the main platform for the next generation computation industry. The HSA Foundation aims to standardize the heterogeneous computing architecture. It’s my honor to participate in HSA Foundation’s CRC. We look forward to providing input to the HSA Foundation with regional requirements and application results that will help develop the next generation standard for HSA, and push forward the research, development, and industrialization of heterogeneous computing in China.”
– Zhiyi Yu, Professor, Sun Yat-sen University

AMD
“We are glad to see the HSA Foundation is expanding, and we will continue to take an active role to participate in heterogeneous computing activities and its open source efforts via the ROCm platform that brings HSA-enabled drivers, runtimes, compiler and tools to the global developer community. We hope together with the new members to promote more academic research in the China region.”
– Paul Blinzer, AMD Fellow

Huaxia General Processor Technologies
“As a HSA Foundation member, it is exciting to see that universities, institutes and companies in China are joining the CRC and making it a growing platform for heterogeneous computing in the region. Huaxia GPT focuses on designing and licensing embedded HSA-compatible processors and optimizing them to enable quicker, easier programming of high-performance parallel computing devices in heterogeneous ecosystems. We look forward to the future collaboration with these newly joined forces on the cutting-edge applications in the field of machine vision, Internet of Things (IoT), Machine-to-Machine (M2M), edge computing and deep learning.”
– Kerry Li, CEO, Huaxia General Processor Technologies

Imagination Technologies
“As a founding member of the HSA Foundation, Imagination works closely with other members to create specifications that make it easier to develop and program heterogeneous SoCs, and we are also developing IP cores enabling the realization of such SoCs. The role of China in designing next-generation semiconductors cannot be underestimated, and the HSA Foundation’s CRC can play a key role increasing awareness within the industry of the challenges and solutions around heterogeneous computing.”
– James Liu, VP and GM China, Imagination Technologies

About the HSA Foundation

The HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive. HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption. HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Follow the HSA Foundation on Twitter, Facebook, LinkedIn and Instagram.

Heterogeneous: Performance and Power Consumption Benefits

 

Why multi-threaded, heterogeneous, and coherent CPU clusters are earning their place in the systems powering ADAS and autonomous vehicles, networking, drones, industrial automation, security, video analytics, and machine learning.
High-performance processors typically employ techniques such as deep, multi-issue pipelines, branch prediction, and out-of-order processing to maximize performance, but these do come at a cost; specifically, they impact power efficiency.
If some of these tasks can be parallelized, this impact could be mitigated by partitioning them across a number of efficient CPUs to deliver a high-performance, power-efficient solution. To accomplish this, CPU vendors have provided multicore and multi-cluster solutions, and operating system and application developers have designed their software to exploit these capabilities.
Similarly, application performance requirements can vary over time, so transferring the task to a more efficient CPU when possible improves power efficiency. For specialist computation tasks, dedicated accelerators offer excellent energy efficiency but can only be used for part of the time.
So, what should you be looking for when it comes to heterogeneous processors that deliver significant benefits in terms of performance and low power consumption? Let’s look at a few important considerations.
Multi-threading
Even with out-of-order execution, CPUs running typical workloads aren’t fully utilized every cycle; they spend most of their time waiting for access to the memory system. However, when one portion of the program (known as a thread) is blocked, the hardware resources could potentially be used for another thread of execution. Multi-threading offers the benefit of being able to switch to a second thread when the first thread is blocked, leading to an increase in overall system throughput. Filling CPU cycles that would otherwise go unused with useful work leads to a performance boost; depending on the application, the addition of a second thread to a CPU typically adds 40 percent to the overall performance, for an additional silicon area cost of around 10 percent. Among CPU IP offerings, hardware multi-threading is unique to Imagination’s MIPS CPUs.
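As a rough worked example of the figures quoted above, the following sketch computes performance per unit of silicon area for a core with a second hardware thread, relative to a single-threaded core. The function name and the normalized baseline of 1.0 are illustrative assumptions; the 40 percent and 10 percent inputs are the typical values cited in the article, not measurements.

```python
def perf_per_area(perf_gain: float, area_cost: float) -> float:
    """Performance per unit area relative to a single-threaded baseline of 1.0.

    perf_gain: fractional throughput increase from the extra thread (0.40 = 40%).
    area_cost: fractional silicon area increase for the extra thread (0.10 = 10%).
    """
    return (1.0 + perf_gain) / (1.0 + area_cost)


# Typical figures cited in the article: +40% performance for about +10% area.
ratio = perf_per_area(0.40, 0.10)
print(f"{ratio:.2f}x performance per unit area")  # prints "1.27x performance per unit area"
```

Under these assumed inputs, the second thread improves area efficiency by roughly 27 percent; the real gain depends on how often the workload stalls on memory.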
A Common View
To move a task from one processor to another requires each processor to share the same instruction set and the same view of system memory. This is accomplished through shared virtual memory (SVM). Any pointer in the program must continue to point to the same code or data and any dirty cache line in the initial processor’s cache must be visible to the subsequent processor.

Figure 1: Memory moves when transferring between clusters.

Figure 2: Smaller, faster memory movement when transferring within a cluster.

Cache Coherency
Cache coherency can be managed through software. This requires that the initial processor (CPU A) flush its cache to main memory before transferring to the subsequent processor (CPU B). CPU B then has to fetch the data and instructions back from main memory. This process can generate many memory accesses and is therefore time consuming and power hungry; this impact is magnified as the energy to access main memory is typically significantly higher than fetching from cache. To combat this, hardware cache coherency is vital, minimizing these power and performance costs. Hardware cache coherency tracks the location of these cache lines and ensures that the correct data is accessed by snooping the caches where necessary.
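The difference between the two approaches can be sketched with a simple counting model. This is a minimal illustration rather than a description of any real interconnect: the class and function names, the line counts, and the per-access energy figures are all hypothetical, and the model assumes every line CPU B needs is still resident in CPU A's cache.

```python
from dataclasses import dataclass


@dataclass
class MigrationCost:
    main_memory_accesses: int  # cache-line transfers to or from main memory
    cache_snoops: int          # cache-line transfers served from the other CPU's cache

    def energy(self, e_mem: float, e_snoop: float) -> float:
        """Total energy in arbitrary units for the given per-access costs."""
        return self.main_memory_accesses * e_mem + self.cache_snoops * e_snoop


def software_coherency(dirty_lines: int, working_set_lines: int) -> MigrationCost:
    # CPU A writes back every dirty line, then CPU B refetches
    # its working set from main memory.
    return MigrationCost(dirty_lines + working_set_lines, 0)


def hardware_coherency(working_set_lines: int) -> MigrationCost:
    # No flush: CPU B's misses are served by snooping CPU A's cache.
    return MigrationCost(0, working_set_lines)


# Hypothetical per-access energies (arbitrary units): main memory is assumed
# to cost ten times as much as a cache-to-cache transfer.
E_MEM, E_SNOOP = 100.0, 10.0

sw = software_coherency(dirty_lines=200, working_set_lines=1000)
hw = hardware_coherency(working_set_lines=1000)
print("software-managed:", sw.energy(E_MEM, E_SNOOP))
print("hardware-managed:", hw.energy(E_MEM, E_SNOOP))
```

The absolute numbers mean nothing; the point is that the software path pays main-memory energy twice (flush, then refetch), while the hardware path pays only the cheaper snoop cost.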
In many heterogeneous systems, the high-performance processors reside in one cluster, while the smaller, high-efficiency processors reside in another. Transferring a task between these different types of processors means that both the level 1 and level 2 caches of the new processor are cold. Warming them takes time and requires the previous cache hierarchy to remain active during the transition phase.
However, there is an alternative – the MIPS I6500 CPU. The I6500 supports a heterogeneous mix of external accelerators through an I/O Coherence Unit (IOCU) as well as different processor types within a cluster, allowing for a mix of high-performance, multi-threaded and power-optimized processors in the same cluster. Transferring a task from one type of processor to another is now much more efficient, as only the level 1 cache is cold, and the cost of snooping into the previous level 1 cache is much lower, so the transition time is much shorter.
Combining CPUs with Dedicated Accelerators
CPUs are general purpose machines. Their flexibility enables them to tackle almost any task but at the price of efficiency. Thanks to its optimizations, the PowerVR GPU can process larger, highly parallel computational tasks with very high performance and good power efficiency, in exchange for some reduction in flexibility compared to CPUs, and bolstered by a well-supported software development ecosystem with APIs such as OpenCL or OpenVX.
The specialization provided by dedicated hardware accelerators offers a combination of performance with power efficiency that is significantly better than a CPU, but with far less flexibility.
However, accelerators are best applied to operations that occur frequently, where they maximize the potential performance and power efficiency gains. Specialized computational elements such as those for audio and video processing, as well as neural network processors used in machine learning, use similar mathematical operations.
Hardware acceleration can be coupled to the CPU by adding Single Instruction Multiple Data (SIMD) capabilities with floating point Arithmetic Logic Units (ALUs). However, while processing data through the SIMD unit, the CPU behaves as a Direct Memory Access (DMA) controller to move the data, and CPUs make very inefficient DMA controllers.
Conversely, a heterogeneous system essentially provides the best of both worlds. It contains some dedicated hardware accelerators that, coupled with a number of CPUs, offer the benefits of greater energy efficiency from dedicated hardware, while retaining much of the flexibility provided by CPUs.
The energy savings and performance boost depend on the proportion of time that the accelerator is doing useful work. Work packages appropriate for the accelerator come in a wide range of sizes; you might expect a small number of large tasks, but many smaller tasks.
There is a cost in transferring the processing between a CPU and the accelerator, and this limits the size of the task that will save power or boost performance. For smaller tasks, the energy consumed and time taken to transfer the task exceeds the energy or time saved by using the accelerator.
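The break-even point described above can be expressed as a small model. This is a sketch under stated assumptions: energy is taken to scale linearly with the number of work items, the transfer cost is a fixed overhead per offload, and all function names and numeric values are hypothetical.

```python
def cpu_energy(items: int, e_cpu: float) -> float:
    """Energy to process `items` work items on the CPU."""
    return items * e_cpu


def accelerator_energy(items: int, e_acc: float, e_transfer: float) -> float:
    """Energy to offload: a fixed transfer cost plus per-item accelerator cost."""
    return e_transfer + items * e_acc


def break_even_items(e_cpu: float, e_acc: float, e_transfer: float) -> float:
    """Task size above which offloading saves energy.

    Solves items * e_cpu = e_transfer + items * e_acc for items.
    """
    if e_acc >= e_cpu:
        raise ValueError("accelerator must be cheaper per item than the CPU")
    return e_transfer / (e_cpu - e_acc)


# Hypothetical costs in arbitrary energy units.
E_CPU, E_ACC, E_TRANSFER = 10.0, 2.0, 400.0

threshold = break_even_items(E_CPU, E_ACC, E_TRANSFER)
print("break-even task size:", threshold)  # 50.0 items

for items in (20, 100):
    offload = accelerator_energy(items, E_ACC, E_TRANSFER) < cpu_energy(items, E_CPU)
    print(items, "items ->", "offload" if offload else "stay on CPU")
```

Lowering the transfer cost moves the threshold down, which is why reducing that overhead lets many more of the small tasks benefit from the accelerator. The same reasoning applies to time if the energy terms are replaced with latencies.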
Data Transfer Cost
To reduce time and energy costs, a Shared Virtual Memory with hardware cache coherency—as found in the I6500 CPU—is ideal as it addresses much of the cost of transferring the task. This is because it eliminates the copying of data and the flushing of caches. There are other available techniques to achieve even greater reductions.
The HSA Foundation has developed an environment to support the integration of heterogeneous processing elements in a system that extends beyond CPUs and GPUs. The HSA system’s intermediate language, HSAIL, provides a common compilation path to heterogeneous Instruction Set Architectures (ISAs) that greatly simplifies system software development. HSA also defines User Mode Queues.
These queues enable tasks to be scheduled and signals to trigger tasks on other processing elements, allowing sequences of tasks to execute with very little overhead between them.
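The idea of queues and signals chaining work across processing elements can be illustrated with a toy, single-threaded simulation. This is not the HSA runtime API: the `Signal`, `Packet` and `Agent` classes, the `run` helper, and the "gpu" and "dsp" agents are all hypothetical names chosen for the sketch, and real hardware would process the queues concurrently.

```python
from collections import deque
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Signal:
    value: int = 1  # 1 = pending, 0 = complete


@dataclass
class Packet:
    task: Callable[[], None]
    completion: Signal                # set to 0 when the task finishes
    wait_on: Optional[Signal] = None  # task may not start until this reaches 0


class Agent:
    """A processing element that consumes packets from its own queue."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.queue: deque = deque()

    def enqueue(self, task: Callable[[], None],
                wait_on: Optional[Signal] = None) -> Signal:
        signal = Signal()
        self.queue.append(Packet(task, signal, wait_on))
        return signal

    def step(self) -> bool:
        """Run the head packet if its dependency is met; report progress."""
        if not self.queue:
            return False
        head = self.queue[0]
        if head.wait_on is not None and head.wait_on.value != 0:
            return False  # stalled until another agent completes its task
        self.queue.popleft()
        head.task()
        head.completion.value = 0
        return True


def run(agents: list) -> None:
    """Step every agent in turn until all queues are empty."""
    while any(agent.queue for agent in agents):
        if not any([agent.step() for agent in agents]):
            raise RuntimeError("deadlock: no agent can make progress")


log: list = []
gpu, dsp = Agent("gpu"), Agent("dsp")
filtered = gpu.enqueue(lambda: log.append("gpu: filter"))
dsp.enqueue(lambda: log.append("dsp: encode"), wait_on=filtered)
run([dsp, gpu])
print(log)  # ['gpu: filter', 'dsp: encode']
```

Even though the DSP is polled first, its task runs only after the GPU task's completion signal reaches zero, with no host-side scheduling step between the two.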

Beyond Limitations
Heterogeneous systems offer the opportunity to significantly increase system performance and reduce system power consumption, enabling systems to continue to scale beyond the limitations imposed by ever shrinking process geometries.
Multi-threaded, heterogeneous and coherent CPU clusters such as the MIPS I6500 have the ideal characteristics to sit at the heart of these systems. As such they are well placed to efficiently power the next generation of devices.


Tim Mace is Senior Manager, Business Development, MIPS Processors, Imagination Technologies.

New Open Source Test Suite Adds to Broad Toolset for Heterogeneous System Architecture Development

Beaverton, OR, May 2, 2017 – The HSA Foundation has made available to developers the HSA PRM (Programmer’s Reference Manual) conformance test suite as open source software. The test suite is used to validate Heterogeneous System Architecture (HSA) implementations for both the HSA PRM Specification and HSA PSA (Platform System Architecture) specification.
With this addition to the already available HSA Runtime Conformance tests, HSA developers now have a fully open source conformance test suite for validating all aspects of HSA systems.
HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources—including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators—in today’s complex systems-on-chip (SoCs).
“The HSA Foundation has always been a strong proponent of open source development tools directly and through its member companies,” said HSA Foundation Chairman Greg Stoner. “Open sourcing worldwide the PRM conformance test suite is yet another example of an expanding array of development tools freely available supporting HSA.”
According to HSA Foundation President Dr. John Glossner, “The decision to open source the conformance test suite is strongly supported by the HSA Foundation and we believe this is an important step for allowing the developer community including non-member China Regional Committee (CRC) participants to test HSA systems. With the ability to develop conformance tests, the community can now contribute to the new test and thus drive the continual improvement of the test quality and consistency.”
“Good quality open source components are crucial in making heterogeneous computing more accessible to programmers and standards adopters. It is great to see that HSA Foundation continues its open source strategy by releasing the important PRM conformance test suite to the public,” said Dr. Pekka Jääskeläinen, CEO of Parmance.
The HSA Foundation through its member companies and universities has also released many additional projects which are all available on the Foundation’s GitHub site including:

  • HSAIL Developer Tools: finalizer, debugger, assembler, and simulator
  • GCC HSAIL frontend developed by Parmance and General Processor Technologies (GPT) allowing gcc finalization for any gcc machine target; the frontend is included in the upcoming GCC 7 release
  • Heterogeneous compute compiler (hcc) for single-source compilation of heterogeneous systems
  • Runtime implementations including AMD’s ROCm and phsa-runtime by Parmance and GPT; phsa-runtime can be used together with GCC HSAIL frontend to support the entire HSA programming stack using open source components
  • Portable Computing Language (pocl), an open source implementation of the OpenCL standard with a backend for HSA developed by the Customized Parallel Computing group of Tampere University of Technology (TUT), an HSA Foundation Academic Center of Excellence

See the complete roster at: https://github.com/HSAFoundation.

Mixed Reality: Computer Vision Killer App Will Change How We Communicate, Collaborate

By Jeff Bier, Founder, Embedded Vision Alliance. Computing Now: https://www.computer.org/web/hsa-connections/content?g=54930593&type=article&urlTitle=mixed-reality-computer-vision-killer-app-will-change-how-we-communicate-collaborate
At this year’s Consumer Electronics Show, I walked many miles and saw countless demos. Several of these demos were memorable, but one in particular really got my mental gears turning: Microsoft’s HoloLens.
HoloLens will spur many “aha” moments, leading to accelerated innovation in wearable computer vision devices, low-power 3D computer vision, and mixed reality.
HoloLens, of course, is Microsoft’s “mixed reality” glasses product, which has been shipping in pre-production form for about a year. Previously, I would have used the term “augmented reality” to refer to HoloLens, which overlays computer-generated graphics on the user’s view of the physical world. But here I’m adopting Microsoft’s preferred term, “mixed reality,” which many people now use to describe systems in which “people, places, and objects from your physical and virtual worlds merge together.”
Over the past five years, I’ve seen many demos of virtual reality, augmented reality and mixed reality. Most of these showed promise—but the promise usually felt distant, because the demos weren’t sufficiently polished to feel “real,” and weren’t easy to use.
That was then, this is now: HoloLens has nailed both the “feels real” and ease-of-use aspects. Wearing HoloLens, I played a shoot-em-up video game against an army of robots, illustrated in this video. The experience was stunning, thanks to three key capabilities. First, HoloLens is a wearable, battery-powered device so I was able to move about the room to dodge hostile robots. Second, HoloLens accurately mapped the room I was in, enabling the robotic invaders to create what looked like real cracks in the actual walls of the room. And third, as I turned my head and shifted my position within the room, HoloLens adapted to these movements seamlessly so that the illusion of merged physical and virtual worlds was maintained.
Now that I’ve experienced robust mixed reality, I foresee many compelling applications for this technology beyond gaming: Enabling physicians to see inside a body to enable safer, more accurate treatment. Giving utility workers a clear view of underground pipes and cables. Providing consumers with a realistic preview of how a room will look after redecorating it. Allowing museum visitors to see a skeleton transform into a fully formed, animated dinosaur. (The fact that HoloLens sells for $3,000 suggests that, for a while at least, this technology is more likely to be adopted by hospitals, utility companies and museums than by individual consumers.)
Of course, a convincing mixed reality (“MR”) experience—one in which the virtual and physical worlds interact in a realistic way—requires the MR device to maintain an accurate understanding of the surrounding physical world—and the user’s position within it—in three dimensions with very low latency. That is, it requires fast, highly accurate 3D computer vision.
Mixed reality doesn’t necessarily require a wearable device. Vehicle applications, for example, can use the windshield as a projection screen. And 8tree’s clever handheld device for quantifying surface damage projects information onto the surface being inspected. But in many cases, glasses are the most compelling way to deliver mixed reality. This is because they leave your hands free, because they know where you are looking, and because they have the ability to project information into your field of view wherever you’re looking. Packing all of the technology required for a convincing MR experience into a wearable device is a daunting challenge, however. With HoloLens, Microsoft has given us a hint of what’s possible. The HoloLens team has clearly put enormous effort into everything from custom chips to industrial design to create a device that’s reasonably comfortable to wear (though still bulky).
One of the key challenges for developers of products like HoloLens is harnessing the capabilities of heterogeneous compute resources—CPUs, GPUs, DSPs, FPGAs, and fixed-function accelerators—to deliver high performance with low cost and low energy consumption. HSA provides an approach that enables developers to easily and efficiently apply compute resources to demanding applications in today’s complex SoCs.
Learn more about heterogeneous computing for efficient computer vision at the upcoming Embedded Vision Summit. Marc Pollefeys, Director of Science for HoloLens and a pioneer in 3D computer vision, will be one of the keynote speakers.

Heterogeneous System Architecture (HSA) Foundation Names Tampere University of Technology an HSA Academic Center of Excellence

BEAVERTON, OR – March 14, 2017 – The Heterogeneous System Architecture (HSA) Foundation continues to expand its Academic Partnership Program with the addition of Finland-based Tampere University of Technology (TUT) as an HSA Academic Center of Excellence. TUT is in Tampere, about 170 km (105 miles) north of Helsinki.
TUT is now the third European university accorded this distinction — in December, the Foundation announced that Technische Universitaet (TU) Darmstadt, and Friedrich-Alexander-University Erlangen-Nurnberg (FAU), both in Germany, were named Academic Centers of Excellence; Northeastern University was the first in North America.
HSA is a standardized platform design that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices. It allows developers to easily and efficiently apply the hardware resources — including CPUs, GPUs, DSPs, FPGAs, fabrics and fixed function accelerators — in today’s complex systems-on-chip (SoCs).
“We’re excited to have TUT on board as an Academic Center of Excellence and look forward to collaborating with the university on several projects,” said HSA Foundation President Dr. John Glossner. “TUT is at the forefront of research in areas that intersect closely with heterogeneous computing such as intelligent machines and networked systems.”
“The HSA ecosystem is growing rapidly not only in Finland, but throughout Europe,” said HSA Foundation Chairman and Managing Director Greg Stoner. “TUT has a long-established reputation encompassing an array of innovative technologies. Our members — and the global tech community — will benefit greatly from this burgeoning partnership.”
Jarmo Takala, a professor in the TUT Faculty of Computing and Electrical Engineering, added that the research group is currently working on an open source implementation of the OpenCL standard, called the Portable Computing Language project.
“We’re also going to add support for HSA specs and create a complete tool flow for HSA runtime customized accelerators based on transport-triggered architecture, and open source design tools for these processors, known as the ‘TTA-based Co-Design Environment’,” added Takala.
Dr. Pekka Jääskeläinen, who is currently working at TUT as a postdoctoral researcher funded by the Academy of Finland and is also involved in various HSA-related activities, said adopting HSA standards “is enabling us to build well-documented IP interfaces to SoC components. HSA is also providing a framework for more studies related to programmer-productivity challenges still hindering heterogeneous platform adoption.”
About TUT
Established in 1965 as a subsidiary of Helsinki University of Technology, TUT became an independent university in 1972. Today, more than 8,300 undergraduate and postgraduate students attend TUT. Of these, about 1,500 students from more than 60 countries are currently pursuing studies. TUT is a sought-after partner for collaborative research and development projects with business and industry worldwide.
About the HSA Foundation
The HSA (Heterogeneous System Architecture) Foundation is a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive. HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption. HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Follow the HSA Foundation on Twitter, Facebook, LinkedIn and Instagram.
Contact Information
Contact:
Neal Leavitt
Leavitt Communications
(760) 639-2900
neal@leavcom.com

The HSA Foundation expands its Academic Partnership Program

Entrepreneur Podcast Network: http://epodcastnetwork.com/the-hsa-foundation-expands-its-academic-partnership-program/
Dr. John Glossner, President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit whose goal is making programming for parallel computing easy and pervasive, again joins Enterprise Radio to discuss the Foundation, its overall benefits and its new partnership.
Listen to host Eric Dye & guest Dr. John Glossner discuss the following:

  • Dr. Glossner, we last talked in early November. For the benefit of our listeners, can you please provide a brief synopsis again on what the HSA Foundation is.
  • In November, we also talked about what the Foundation calls Academic Centers of Excellence. Please elaborate again on what these are, and how does a higher educational institution become one?
  • You mentioned then that Northeastern University in Boston was the first of these; in early December, two leading German universities also became Academic Centers of Excellence. Tell us about each and elaborate on some of the innovative HSA projects they’re working on.
  • AMD, a founding member of the Foundation, recently provided a tutorial at an international conference on code generation and optimization. The title was ‘Updates in Heterogeneous Compute.’ Please share what you see as recent heterogeneous compute updates and developments.
  • It appears that heterogeneous compute will be applicable for an array of apps. This can be everything from vision based IoT systems to mobile devices; desktops, high-performance computing (HPC) systems, AR/VR environments, and servers. So how will heterogeneous compute improve performance and power efficiency?
  • How does HSA make life easier for IP and system designers?

John Glossner, Ph.D., is the President of the Heterogeneous System Architecture (HSA) Foundation, a non-profit consortium of SoC IP vendors, OEMs, Academia, SoC vendors, OSVs and ISVs, whose goal is making programming for parallel computing easy and pervasive.
HSA members are building a heterogeneous computing ecosystem, rooted in industry standards, which combines scalar processing on the CPU with parallel processing on the GPU, while enabling high bandwidth access to memory and high application performance with low power consumption.
HSA defines interfaces for parallel computation using CPU, GPU and other programmable and fixed function devices, while supporting a diverse set of high-level programming languages, and creating the foundation for next-generation, general-purpose computing.
Glossner currently serves as CEO of General Processor Technologies.
Website: www.hsafoundation.com
Social Media Links:
Facebook: facebook.com/thehsafoundation
Twitter: @hsafoundation

HSA Foundation, AMD Spearheading Heterogeneous Compute Tutorial at CGO

BEAVERTON, OR–(Marketwired – January 26, 2017) – The HSA (Heterogeneous System Architecture) Foundation together with Foundation member AMD will be providing a half-day tutorial entitled, ‘Updates in Heterogeneous Compute’ at the International Symposium on Code Generation and Optimization (CGO). The conference will be held from Feb. 4-8 in Austin, TX.
CGO provides a venue to bring together researchers and practitioners working at the interface of hardware and software on a wide range of optimization and code generation techniques and related issues. The conference spans the spectrum from purely static to fully dynamic approaches, and from pure software-based methods to specific architectural features and support for code generation and optimization.
The half-day tutorial will be presented by AMD Fellow Paul Blinzer on Sunday, Feb. 5, at 1:15 PM in Room 616B. Blinzer’s talk will provide insight into the latest developments in hardware and software for heterogeneous compute, a solution required for a growing number of applications including vision based IoT systems, mobile devices, desktops, high-performance computing (HPC) systems, AR/VR environments, and servers.
The talk will include updates on HSA, a standardized platform design supported by more than 40 technology companies and 23 universities that unlocks the performance and power efficiency of the parallel computing engines found in most modern electronic devices.
The tutorial and other CGO sessions will be held at the Hilton hotel, 500 East 4th St., Austin. For more information, including a full list of speakers, supporting organizations and sponsors, as well as registration information, please visit: http://cgo.org/cgo2017.
About Paul Blinzer
Paul Blinzer works on a wide variety of Platform System Software architecture projects and specifically on the Heterogeneous System Architecture (HSA) System Software at Advanced Micro Devices, Inc. (AMD) as a Fellow in the System Software group. Living in the Seattle, WA area, during his career he has worked in various roles on system level driver development, system software development, graphics architecture, graphics & compute acceleration since the early ’90s. Paul is the chairperson of the “System Architecture Workgroup” of the HSA Foundation. He has a degree in Electrical Engineering (Dipl.-Ing) from TU Braunschweig, Germany.
https://www.linkedin.com/in/paul-blinzer-4523602

You’ll likely find the HSA software and toolchains quite useful and timeless

by Paul Blinzer, Embedded Computing Design: http://embedded-computing.com/guest-blogs/youll-likely-find-the-hsa-software-and-toolchains-quite-useful-and-timeless/#
Many people talk about hardware architecture as if it’s the most important part of a new platform. It’s true that hardware architecture is important for performance, which was discussed at length in a previous blog post. As a refresher, the pillars of the Heterogeneous System Architecture (HSA) are unified and shared virtual memory, user-mode dispatch, platform atomics, architected signals, a strict memory model, quality of service, and cache coherency.
However, these features are not included in the platform architecture for their own sake; they allow software to be written easily and to run efficiently. Even more so, they enable existing software to be ported easily, and ideally automatically, onto the new architecture.
While hardware typically has a limited lifespan of a few years at most, software may live almost forever. Sure, almost no one uses actual VT100 text terminals to communicate with the computer and the programs running back then, yet a lot of the software used today uses libraries and application frameworks that have their origin as far back as the 1970s. That software set the foundation of high-performance computing, the Internet, and security protocols used today, usually behind a shiny user interface. Even the good old VT100 terminal still lives on in the command lines of many popular operating systems (OSs) where the control sequences still behave as they did 40 years ago.
This is one reason why some platform architectures have endured over decades. While the implementation may have changed substantially internally, the software-visible instruction set architecture (ISA) has endured and been incrementally extended without breaking backward compatibility with old programs. Other, more modern architectures were popular for a time but ultimately withered away as their performance advantage diminished: software-compatible platforms came close enough to their performance levels to make binary software compatibility the overwhelming factor. Good examples are the x86 ISA, the ARM instruction architecture, or IBM’s System/360 ISA, the latter celebrating its 53rd anniversary and still in use.
How do you ensure the long-term viability of a platform architecture? You ensure that software written for the traditional architectures can run well, and faster, on it, but you also keep the software development toolchain (compilers, linkers, and the development process) familiar, so that the programmer doesn’t have to deal with two or more different software toolchains to get performant software running on the platform.

Today’s extensive use of open-source software is an important factor, especially the GNU and LLVM-based compiler toolchains, readily available in open-source repositories, and open-source OSs, which are used as a foundation in embedded systems in various forms, sometimes “hidden away.” However, applications need to start and run without much delay, so it’s important that compilation and time-expensive code optimization for the accelerator don’t happen at the application’s load time (as often happens with many current accelerator APIs).
Most code optimization should happen once, when producing the application binary, which is then readily loaded and mapped to the accelerator. This needs a portable, accelerator-neutral ISA with fast transcription to the target accelerator ISA, instead of full compilation. Hence, it’s important to define a vendor-neutral ISA, which in the case of HSA is called HSA Intermediate Language (IL) or HSAIL. This IL represents a common ISA for compilers to target and is designed to be close to a data-parallel accelerator like a GPU, or other hardware.
Source code written in a common high-level language like C++ or Python, be it an application framework or a popular application, is then compiled to code defined in the IL. The compiler can apply all its extensive optimization steps when generating the intermediate code, which can then be linked with other libraries, and even with modules written in different languages, such as C++, for some functions.
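The split between one-time optimization and cheap load-time translation can be pictured with a toy model. The sketch below is not HSAIL or any real finalizer: the IL opcodes, the two target tables, and the function names are all hypothetical, chosen only to show that translating from a portable IL can be a per-instruction table lookup rather than a full compilation.

```python
# Toy model: optimize once at build time, translate cheaply at load time.
# The IL opcodes and target mnemonics are invented for illustration;
# they are not HSAIL or any vendor ISA.

# Hypothetical per-target lookup tables: IL opcode -> native mnemonic.
TARGET_TABLES = {
    "gpu_a": {"ld": "LOAD", "add": "V_ADD", "mul": "V_MUL", "st": "STORE"},
    "dsp_b": {"ld": "ldw", "add": "addv", "mul": "mulv", "st": "stw"},
}


def build_time_optimize(il_program):
    """Expensive step, done once: here it only drops a multiply by 1."""
    return [ins for ins in il_program if ins != ("mul", "r0", "r0", "1")]


def load_time_translate(il_program, target):
    """Cheap step, done per device: one table lookup per instruction."""
    table = TARGET_TABLES[target]
    return [(table[op],) + tuple(args) for op, *args in il_program]


il = [
    ("ld", "r0", "[a]"),
    ("mul", "r0", "r0", "1"),   # redundant, removed at build time
    ("add", "r0", "r0", "r1"),
    ("st", "[c]", "r0"),
]

shipped_il = build_time_optimize(il)   # what the application binary would carry
gpu_code = load_time_translate(shipped_il, "gpu_a")
dsp_code = load_time_translate(shipped_il, "dsp_b")
```

The same shipped IL feeds both targets, so the costly optimization pass runs once no matter how many accelerator types the application later meets.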
By integrating the IL as a binary section in the application binary (which is defined in an object format called BRIG), the program loader can then load both the host ISA and the accelerator code blocks in parallel and allow each to execute the program as written by the programmer without the end user seeing a difference from regular program load. Using the HSA run-time functionality, the software engineer can either target the HSA run-time directly or use an application interface or framework sitting on top of it, such as OpenCL.
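As a rough picture of one binary carrying both host code and an accelerator section, the sketch below packs two named sections into a single blob and reads them back at load time. The container layout, the section names, and the pack/load helpers are assumptions made for illustration; the actual BRIG format and the loader behavior are defined by the HSA specifications, not by this code.

```python
# Toy "fat binary": one blob carrying host code plus an accelerator IL section.
# Layout and section names are invented; this is not the BRIG object format.
import json
import struct


def pack(sections):
    """Serialize {name: bytes} as a length-prefixed JSON index plus payloads."""
    index, payload, offset = {}, b"", 0
    for name, data in sections.items():
        index[name] = [offset, len(data)]
        payload += data
        offset += len(data)
    header = json.dumps(index).encode("utf-8")
    return struct.pack("<I", len(header)) + header + payload


def load(blob):
    """Return every section so host and accelerator code are loaded together."""
    (header_len,) = struct.unpack_from("<I", blob, 0)
    index = json.loads(blob[4:4 + header_len].decode("utf-8"))
    body = blob[4 + header_len:]
    return {name: body[off:off + size] for name, (off, size) in index.items()}


binary = pack({
    "host_text": b"\x90\x90\xc3",   # stand-in for host ISA bytes
    "accel_il": b"ld add st",       # stand-in for the IL section
})
sections = load(binary)
```

From the end user's point of view there is still a single file to launch; the loader decides which section goes to which processor.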

But that’s not all. AMD has developed an open-source HSA run-time called Radeon Open Compute (ROCm) and added a portability layer called Heterogeneous Interface for Portability (HIP) that allows source code using proprietary CUDA APIs to compile and run on top of the ROCm run-time, while keeping source code compatibility. Alongside CodeXL, an open-source tool for profiling and debugging data parallel applications, this is a powerful toolset to automatically port and run large application frameworks. While not using all ROCm features, it’s an easy way to take advantage of AMD’s HSA implementation without refactoring legacy code.
More information can be found in a half-day HSA-focused tutorial at the HPCA/CGO conference in a couple of weeks.
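To give a feel for why such a port can be largely mechanical, the sketch below renames a handful of CUDA runtime calls to their HIP counterparts (hipMalloc, hipMemcpy, hipFree and so on mirror the CUDA names). It is a toy written for this illustration, not AMD's conversion tooling: the mapping table is deliberately tiny, and the port_to_hip helper is hypothetical.

```python
# Toy source-to-source renaming in the spirit of CUDA-to-HIP porting.
# Only a few runtime calls are covered; real ports involve far more than this.
import re

CUDA_TO_HIP = {
    "cudaMalloc": "hipMalloc",
    "cudaMemcpy": "hipMemcpy",
    "cudaMemcpyHostToDevice": "hipMemcpyHostToDevice",
    "cudaMemcpyDeviceToHost": "hipMemcpyDeviceToHost",
    "cudaDeviceSynchronize": "hipDeviceSynchronize",
    "cudaFree": "hipFree",
}

# Word boundaries keep identifiers such as my_cudaMalloc_wrapper untouched.
PATTERN = re.compile(r"\b(" + "|".join(map(re.escape, CUDA_TO_HIP)) + r")\b")


def port_to_hip(source):
    """Replace each known CUDA runtime name with its HIP equivalent."""
    return PATTERN.sub(lambda m: CUDA_TO_HIP[m.group(1)], source)


cuda_src = """
cudaMalloc(&d_a, n * sizeof(float));
cudaMemcpy(d_a, h_a, n * sizeof(float), cudaMemcpyHostToDevice);
cudaDeviceSynchronize();
cudaFree(d_a);
"""
hip_src = port_to_hip(cuda_src)
```

Because the call shapes stay the same and only the prefixes change, the surrounding application logic does not need to be refactored.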