OpenACC for free! -- NVIDIA OpenACC Toolkit

NVIDIA and PGI are offering “PGI Accelerator with OpenACC” free to academia (or as a 90-day trial for commercial users) under the banner “NVIDIA OpenACC Toolkit”. It’s about time! I’ve been bugging the NVIDIA and PGI guys every chance I’ve had for over a year to do something to make OpenACC more accessible for developers to try. Free is good! No one wants to pay over a thousand dollars for a compiler suite just to try directive-based GPU coding when the CUDA development tools are available for free. If you are a parallel programmer you have most likely used OpenMP. Now you can try the same programming model to accelerate your code on GPUs.

A blog post by Paresh Kharya announced the “NVIDIA OpenACC Toolkit” on July 13th:

“It features the industry-leading PGI Accelerator Fortran/C Workstation Compiler Suite for Linux, which supports the OpenACC 2.0 standard. With the OpenACC Toolkit, we’re making the compiler free to academic developers and researchers for the first time (commercial users can sign up for a free 90-day trial).”

At the Developer Zone NVIDIA OpenACC Toolkit page you can find the link to the download registration page. There you can apply for a free academic license or the 90-day commercial trial. [If you are not in academia I still recommend you grab the 90-day trial and give it a try.]

Jeff Larkin has a fresh post up on the PARALLEL FORALL blog titled “Getting Started with OpenACC”. Give it a read.

Directive-based parallel programming is important!

OpenACC directives are code add-ins, like the following C pragmas, that tell the compiler to do its magic for you. (Fortran uses !$acc comment lines instead.)

#pragma acc kernels
...
#pragma acc parallel reduction(+:result)
...
#pragma acc parallel loop
...
#pragma acc data copyin(A[0:N], B[0:N])
...


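Compiling with the PGI C compiler, with OpenACC enabled (-acc) and accelerator feedback turned on (-Minfo=accel), looks like this:

$ pgcc -acc -Minfo=accel yourgreatparallelcode.c -o greatprogram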
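To make those pragma fragments a bit more concrete, here is a minimal sketch of a dot product that combines a data region with a parallel loop reduction. The function name dot and its parameters are my own illustration, not something taken from the toolkit. A compiler without OpenACC support simply ignores the pragmas, so the same source still builds and runs serially.

```c
/* Dot product of two length-n vectors.
 * The names dot, a, b and n are hypothetical, chosen for this sketch. */
double dot(const double *a, const double *b, int n)
{
    double result = 0.0;

    /* Copy both input arrays to the accelerator for the duration of the block. */
    #pragma acc data copyin(a[0:n], b[0:n])
    {
        /* Offload the loop; partial sums are combined into result. */
        #pragma acc parallel loop reduction(+:result)
        for (int i = 0; i < n; ++i) {
            result += a[i] * b[i];
        }
    }

    return result;
}
```

Call it from your own main() and build with the pgcc command above. The -Minfo=accel feedback should report how the compiler handled the loop and the data movement.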

I periodically rant about how important directive-based programming is. Dealing with many-core processors and accelerators/co-processors is a serious challenge, and going forward it is becoming essential that developers have a sane way to deal with this. OpenMP and OpenACC are closely related and I believe they will merge at some point in the future. These tools will become hardware agnostic, meaning that compute hardware capability will be detected at run time and the appropriate code will execute to take advantage of it.

This paradigm is future-proof. Hardware will continue to change, so developing code with directives to exploit these changes makes sense. Programming language directive syntax may change too, but these changes will be trivial on the code side. (There will no doubt be huge changes on the implementation side as hardware evolves and directives become more sophisticated at exploiting the hardware.) This is very different from, say, CUDA. CUDA is rather low level and focused on a single hardware platform. CUDA is wonderful, but CUDA code by its nature will have a limited lifetime and will be more difficult to maintain than code written with directives like OpenMP and OpenACC.
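As a rough illustration of how close the two models already are, the sketch below annotates one loop with either an OpenACC or an OpenMP directive, depending on which standard macro (_OPENACC or _OPENMP) the compiler defines. Note that this selection happens at compile time. It is not the run-time hardware detection I expect in the future. The function names directive_model and sum_of_squares are hypothetical.

```c
/* Report which directive model this file was compiled with. */
const char *directive_model(void)
{
#if defined(_OPENACC)
    return "OpenACC";
#elif defined(_OPENMP)
    return "OpenMP";
#else
    return "serial";
#endif
}

/* Sum of squares of x[0..n-1]; the loop body is identical in every build. */
double sum_of_squares(const double *x, int n)
{
    double sum = 0.0;

#if defined(_OPENACC)
    /* Accelerator build: copy x to the device and reduce there. */
    #pragma acc parallel loop reduction(+:sum) copyin(x[0:n])
#elif defined(_OPENMP)
    /* Multi-core CPU build: split the loop across threads. */
    #pragma omp parallel for reduction(+:sum)
#endif
    for (int i = 0; i < n; ++i) {
        sum += x[i] * x[i];
    }

    return sum;
}
```

Only the one-line directive differs between the builds. The loop itself never changes, which is the whole appeal.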

I feel strongly that parallel programmers should be using directives in their code. You can add, for example, CUDA regions in places where you can’t quite get the speedup that you want. Then in the future, when OpenMP/OpenACC improves to the point that it takes care of that tricky spot for you (and your CUDA code is no longer usable anyway), it’s a simple change in your code. As new hardware is developed and directive implementations improve, your code maintenance remains simple.
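One way to keep that future change simple is to hide the hand-tuned path behind a compile-time switch, with the directive version as the default. The sketch below is only an assumption about how you might organize it: the macro USE_HAND_TUNED and the function scale_hand_tuned are hypothetical, and the "hand-tuned" routine is a plain C stand-in for what would really be CUDA code in its own source file.

```c
/* Hypothetical stand-in for a hand-tuned kernel. In a real project this
 * would be a CUDA implementation living in a separate .cu file. */
void scale_hand_tuned(float *x, int n, float a)
{
    for (int i = 0; i < n; ++i) {
        x[i] *= a;
    }
}

/* Multiply every element of x by a. */
void scale(float *x, int n, float a)
{
#ifdef USE_HAND_TUNED
    /* Tricky spot: use the hand-written version for now. */
    scale_hand_tuned(x, n, a);
#else
    /* Default: let the directive compiler generate the accelerator code. */
    #pragma acc parallel loop copy(x[0:n])
    for (int i = 0; i < n; ++i) {
        x[i] *= a;
    }
#endif
}
```

When the directive version catches up, retiring the hand-tuned path means dropping one macro definition from the build and deleting one function.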

Go get the NVIDIA OpenACC Toolkit and write some code!

Happy computing! –dbk
