HSAIL: Write-Once-Run-Everywhere for Heterogeneous Systems – IEEE article

Ben Sander of AMD and Chien-Ping Lu of MediaTek, HSA Foundation working group leader for the HSA Programmer's Reference Manual, have written a nice article on HSAIL and HSA technology.
 
“Power efficiency has emerged as a primary design goal for modern silicon chips.  Accelerators such as GPUs have well-known advantages in compute density per-watt and per-mm^2 – note for example that the systems at the top of the latest Green500 (http://www.green500.org/) and Top500 (http://www.top500.org/) lists are now based on heterogeneous designs.
However, these systems have traditionally been difficult to program, due to two challenges.  First, many accelerators support only dedicated address spaces that require cumbersome copy operations and prevent the use of pointer-based data structures on both the accelerator and the host processor.   Second, accelerator programming has traditionally required a specialized language such as OpenCL™ or CUDA™.  Some of these specialized languages are only supported by a single hardware vendor, which further constrains their adoption.
An intermediate language called HSAIL is helping to address some of the challenges. One of the benefits of HSAIL is its portability across multiple vendor products. Compilers that generate HSAIL can be assured that the resulting code will be able to run on a wide variety of target platforms. HSAIL also provides existing programming languages with an efficient parallel intermediate language that runs on a wide variety of hardware. This provides the underlying infrastructure and brings the benefits of heterogeneous computing to existing, popular programming models such as Java™, OpenMP™, C++, and more”. … Read more at the link below:
http://www.computer.org/portal/web/computingnow/software%20engineering/content?g=53319&type=article&urlTitle=hsail%3A-write-once-run-everywhere-for-heterogenous-systems

Foundation Blasts Off

 

HSA Foundation is now in flight and already locked on target.

One year ago Phil Rogers stood up at AFDS and said he wanted to change 30 years of PC architecture legacy and truly address the core challenges in making heterogeneous parallel computing approachable for all programmers. On top of this, he told the audience that AMD wanted to make the specification open.
Roll forward one year: AMD has worked with ARM, Imagination Technologies, MediaTek, and Texas Instruments to establish the HSA Foundation. We are all interested in solving similar issues in moving the computing platform forward across mobile devices, tablets, laptops, workstations, and servers. On the anniversary of the 2011 AFDS, the team has achieved its goal of bringing you truly open specifications for HSA technology via the HSA Foundation, which has a very solid team of passionate people and founding companies who will drive these capabilities forward into the market.

We look forward to working with all of you around this exciting new initiative.

HSA Brings GPGPU Computing and Extends It Further

GPUs are great at attacking problems that come in the following forms:
•Data Parallelism – SPMD
•Embarrassingly Parallel Applications
–Rendering – Raster based
–Simulation of large-scale particle systems
–Large Graphs/networks
–Numerical Integration
–Monte Carlo Integration
–Bioinformatics
–Genetic Algorithms and evolutionary computation metaheuristics
–Ensemble Calculations
–Visual Mining – e.g. large-scale face recognition
–Brute force searches in cryptography
–Distributed set processing of relational data
–Sieve Analysis for GNFS and QS for integer factorization
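To make the "embarrassingly parallel" pattern in the list above concrete, here is a minimal sketch of Monte Carlo integration written in SPMD style: every work-item runs the same function on its own random samples, and the only communication is a final sum. The function names, the seeding scheme, and the choice of estimating pi are assumptions made for illustration; they do not come from HSA or from any GPU API. Python threads are used only to show the structure, not to demonstrate a speedup.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def count_hits(work_item_id, samples):
    """One work-item: count random points that land inside the unit quarter circle."""
    rng = random.Random(work_item_id)  # separate, reproducible stream per work-item
    hits = 0
    for _ in range(samples):
        x, y = rng.random(), rng.random()
        if x * x + y * y <= 1.0:
            hits += 1
    return hits


def estimate_pi(work_items=8, samples_per_item=20_000):
    """Run the same kernel on every work-item, then reduce the partial counts."""
    with ThreadPoolExecutor() as pool:
        partial = list(
            pool.map(count_hits, range(work_items), [samples_per_item] * work_items)
        )
    # The reduction is the only step where work-items' results meet.
    return 4.0 * sum(partial) / (work_items * samples_per_item)


pi_estimate = estimate_pi()
```

No work-item reads another's data while it runs, which is why this class of problem maps so easily onto today's GPUs.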
HSA also addresses a wider set of problems:
•What GPUs do well today
–Data Parallelism – Embarrassingly Parallel Applications
–SPMD
•Plus a much richer set of Parallel Solutions
–Task-Parallelism
–Nested-Parallelism
–Braided-Parallelism
–Irregular parallelism
–MPMD
•Parallel Problems that are Communication Intensive
–Need High Bandwidth Low Latency Interconnects
–Examples
•Solutions that are parallelizable by domain decomposition
•Partial differential equations on a regular grid using discrete time stepping
–More efficient implementations of MapReduce, hash tables, sparse matrix-vector and conjugate gradient solvers, the FETI-DP method, list ranking, and spatial search
•Algorithms that need inter-task communication
•Applications that need branching support
•Applications that need exception processing
•Dynamic load balancing of tasks between processing elements
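The communication-intensive case in the list above can be sketched with a small example: a partial differential equation on a regular grid, advanced with discrete time steps and split by domain decomposition. The 1D heat equation, the function names, and the one-cell "halo" copied from each neighbouring subdomain are assumptions chosen for illustration, not something specified by HSA. The point is that every time step requires each subdomain to read its neighbours' edge values, which is where a high-bandwidth, low-latency interconnect or shared memory matters.

```python
def step(u, alpha):
    """One explicit time step of the 1D heat equation; the two end values are held fixed."""
    interior = [
        u[i] + alpha * (u[i - 1] - 2 * u[i] + u[i + 1]) for i in range(1, len(u) - 1)
    ]
    return [u[0]] + interior + [u[-1]]


def step_decomposed(u, alpha, parts):
    """The same step, computed subdomain by subdomain with one halo cell per neighbour."""
    n = len(u)
    new = []
    for p in range(parts):
        lo, hi = p * n // parts, (p + 1) * n // parts
        # Halo exchange: extend the local slice by one cell on each interior side.
        start, stop = max(lo - 1, 0), min(hi + 1, n)
        updated = step(u[start:stop], alpha)
        # Keep only the cells this subdomain owns; halo cells are discarded.
        new.extend(updated[lo - start : lo - start + (hi - lo)])
    return new


grid = [0.0] * 16
grid[8] = 1.0  # a single hot spot in the middle of the grid
whole, split = grid, grid
for _ in range(10):
    whole = step(whole, 0.25)
    split = step_decomposed(split, 0.25, parts=4)
```

Both versions produce the same grid; the decomposed one simply makes the per-step neighbour communication explicit. In a real heterogeneous system each subdomain would be handled by a different processing element.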