Everything You Need to Know About Why AMD Open Sourced the OpenCL Driver Stack for ROCm

Introduction:
AMD is a co-founder and member of the HSA Foundation. This article is excerpted and edited from a blog post by Vincent Hindriksen, founder of Stream HPC, a Netherlands-based software development company.

Last May, AMD open sourced the OpenCL driver stack for ROCm. With this, they kept their promise to open source (almost) everything. The hcc compiler, the kernel driver and several other parts had already been open sourced earlier.

Why is this a big thing?
Several open source OpenCL implementations already exist, but with one big difference: they’re secondary to the vendor’s official compiler and driver. So implementations like PortableCL and Intel Beignet play catch-up. AMD’s open source implementation is the primary one.

They contain:

  • OpenCL 1.2 compatible language runtime and compiler
  • OpenCL 2.0 compatible kernel language support with OpenCL 1.2 compatible runtime
  • Support for offline compilation right now – in-process/in-memory JIT compilation is to be added.

Performance of ROCm was mostly on par with AMD’s closed source drivers, with a few outliers. A few months ago ROCm 1.6 was released, which again noticeably improved performance. Further performance improvements are expected in the next release.

Why was it open sourced?
There were several reasons. AMD listened carefully to their customers in HPC, while taking note of where the industry was going.

Get a deeper understanding of how functions are implemented
It’s useful to understand how functions are implemented. For instance, seeing the difference between sin() and native_sin() in the source tells you a lot more about which one is best to use. The source doesn’t show how the functions are implemented on the GPU itself, but it does show which GPU functions are called.

Learning a new platform has never been so easy. Deep understanding is needed if you want to go beyond “it works”.

Debug software deeper
Any software engineer has experience with libraries that don’t perform as promised or work as documented. Integration issues with “black box” libraries are therefore a typical reason for big project delays. If the library were open source, the debugger could step in and give all the information needed to solve the problem quickly.

Working with drivers is much the same. GPU drivers and compilers are extremely complex, and inevitably your project hits that one bug nobody has encountered before. With fully open source drivers, you can step into the driver with the same debugger. Moreover, the driver can be recompiled with fixed code, instead of you having to write a less secure work-around.

Get bugs solved quicker
A trace now includes the driver stack and the line numbers, and a bug report can even come with a suggested fix. This shortens every step on the way to getting the fix. When a fix is suggested, AMD only needs to test for regressions to accept it. This makes the work for tools like CLsmith a lot easier.

A bonus of open source projects is that over time the code quality becomes better than in projects whose code is never seen by outsiders, which also helps bugs get solved more quickly.

Get low-priority improvements in the driver
Popular software like Blender and the LuxMark benchmark can expect to get attention from driver developers. The rest of us have to hope that our special code constructions are comparable to one that is targeted. This results in many forum comments and bug reports being written, for which the compiler team doesn’t have enough time. This is frustrating for both sides.

Now everyone can help build a driver for everyone.

Get support for completely new things
Working with proprietary code requires official access and legal documents with all kinds of restrictions. Open source code requires neither.

More often there is opportunity in what is not there yet, and research needs to be done to break the chicken-and-egg conundrum. Optimized 128-bit computing? Easy complex numbers in OpenCL? Native support for Halide as an alternative to OpenCL? All the up-to-date driver code is available to make these possible.

Nurture other projects
Code can be “borrowed” from AMD’s projects and be used in (un)expected places. This ranges from GPU-simulators to experimental compilers.

Currently, forks of the ROCm driver are mostly used to fix bugs, or are thousands of commits behind. Who knows what the future will bring?

Get better support in more Linux distributions
It’s easier to include open source drivers in Linux distributions. These OpenCL drivers do need binary firmware (which has been disassembled and seems to do as advertised). There is a discussion about whether firmware can be seen as hardware and can be marked as “libre”, but the fact is that AMD’s contributions to the Linux 4.x kernel do get accepted.

Improve and increase university collaborations
While the software was closed, working on AMD’s compiler infrastructure was only possible under strict contracts. In the end it was easier for universities to focus on the open source backends of LLVM than to go through the legal path.

Universities are very important for finding unexpected opportunities, integrating the latest research, bringing in potential new employees and setting up research collaborations. Timour Paltashev (senior manager, Radeon Technology Group, GPU architecture and global academic connections) can be reached via timour dot paltashev at amd dot com for more info.

Final words
It probably makes total sense to open source the drivers. The most notable advantages are reduced costs and increased control, thanks to easier debugging and bug-solving.

AMD is now a modern hardware company that understands software is a crucial part of its products. They believe that open source software gives an edge over the competition and made this bold move to let everybody peek into their kitchen.