The Customized Parallel Computing (CPC) research group leads the development of an open source OpenCL implementation called Portable Computing Language (pocl). After some time in the making, we are proud to announce the 1.0 release, which passes all of the OpenCL 1.2 conformance tests! Pocl 1.0 also includes other nice features, such as NVIDIA GPU support via a CUDA backend contributed by James Price (University of Bristol), and the usual upgrade of the supported Clang/LLVM version.
To lay out a bit of pocl’s history: it was originally a research collaboration between Carlos Sánchez de La Lama (then at Universidad Rey Juan Carlos, Madrid) and myself (Pekka Jääskeläinen) that started around 2008. Initially, Carlos focused on mechanisms for fine-grained parallelization of OpenGL shaders for static instruction-level parallel machines (such as VLIWs and, of course, our favorite processor paradigm, TTA), with the aim of generating efficient code for a new, heavily software-programmable GPU design we called TTAGPU (which is still on the VGA group’s agenda, by the way). Soon after OpenCL 1.0 was published, the shader compiler work was extended to OpenCL C kernels to better support general-purpose computing.
During Carlos’ collaboration visits to Finland and my return visits to Spain in 2008-2010, the research evolved quickly with a rough split of work: I focused on using pocl as the backbone heterogeneous programming API for TCE, our Application-Specific Instruction-Set Processor (ASIP) design and programming toolbox, while Carlos concentrated on the compiler techniques. At some point we noticed that the kernel compiler can in fact improve the performance portability of GPU-optimized, massively parallel kernels when they run on “CPUs” (which in the case of pocl means, in practice, most “non-GPUs”), thanks to its generic work-item parallelization. This insight led to the initial release of pocl, whose purpose was to provide the community with an OpenCL implementation framework that is not only portable but also performance portable, thanks to the kernel compilation techniques involved.
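To give a rough feel for what generic work-item parallelization means, here is a minimal sketch in plain C. It is not pocl’s actual generated code: the function names, the vector-add kernel, and the assumption that the global size is a multiple of the local size are all made up for illustration. The idea shown is that the kernel body written for one work-item gets wrapped in a loop over the work-items of a work-group, which gives an ordinary CPU compiler a regular loop it can vectorize or unroll.

```c
#include <stddef.h>

/* Hypothetical stand-in for the body of an OpenCL C kernel such as
 * c[i] = a[i] + b[i], written from the point of view of ONE work-item. */
static void vec_add_work_item(size_t global_id,
                              const float *a, const float *b, float *c)
{
    c[global_id] = a[global_id] + b[global_id];
}

/* Hypothetical "work-group function": runs every work-item of one
 * work-group in a plain loop. On a CPU-like device this loop is what
 * exposes the kernel's data parallelism to the compiler. */
static void vec_add_work_group(size_t group_id, size_t local_size,
                               const float *a, const float *b, float *c)
{
    for (size_t local_id = 0; local_id < local_size; ++local_id) {
        size_t global_id = group_id * local_size + local_id;
        vec_add_work_item(global_id, a, b, c);
    }
}

/* Executes the whole index range one work-group at a time.
 * Assumption: global_size is an exact multiple of local_size. */
void vec_add_ndrange(size_t global_size, size_t local_size,
                     const float *a, const float *b, float *c)
{
    size_t num_groups = global_size / local_size;
    for (size_t group_id = 0; group_id < num_groups; ++group_id) {
        vec_add_work_group(group_id, local_size, a, b, c);
    }
}
```

In a real implementation the work-groups themselves could additionally be spread across CPU cores, and kernels with barriers need more care than this simple loop suggests.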
Now, in late 2017, OpenCL and pocl are still heavily used in CPC. In our ASIP design flow we like OpenCL because it has relatively wide vendor support, which means cross-vendor portability, and because it provides an extensive API with a lot of programmer control. However, that control sometimes translates into a lot of work to get implementations done, so we keep an eye out for ways to add more programmer-productive layers to the software stack of our ASIP flow. From this perspective, the C++ standardization efforts related to explicit parallelization, and the shared virtual memory heterogeneous platform specification work led by the HSA Foundation, look very interesting. As the field is evolving quickly, it will be interesting to see what the most popular productive programming models for diverse heterogeneous platforms (not limited to the common CPU+GPU setup!) will be in a few years!