Dr Donald Kinghorn (HPC and Scientific Computing)

Docker and NVIDIA-docker on your workstation: Motivation

Written on February 16, 2017 by Dr Donald Kinghorn

The basic problem with software and environment configuration

I've been working with workstations and clusters running scientific/research software, multiple language programming environments, huge commercial scientific packages, parallel execution environments, etc, for more than two decades. I was always the guy who could "make it work".  Guess what? It hasn't really changed in all this time! 

I've seen real job postings with descriptions that had phrasing built from the following sets of words:

"Must be able to (handle, fix)" + "(library, configuration, version, environment)" + "(nightmare, hell)!"

Linux package managers like apt and yum, and installation packaging like deb and rpm, are great for software installs and updates. This is wonderful for system installation and for adding useful applications and utilities. However, for your "serious work" the packages that are maintained (or not maintained) in the repositories may not be up-to-date and are probably not optimally built or configured for best performance. If you are doing work that involves research-level code there is a good chance that you are not going to be happy with what is available in the repos. There is also a good possibility that what you want to work with hasn't been packaged at all. You may be looking at compiling the application you want from source and setting up an environment to run it in by hand. In the worst cases it can take days or even weeks of a skilled system administrator's time to "make it work". Even if what you want to use is relatively easy to install, it may not play nice with other "stuff" on your system.

For a workstation configuration there is also the problem of just having a nice desktop environment to interact with. Software that is packaged for easy install may only be available for a distribution or distribution version that you really don't want to use for your desktop environment. Also, code that you want to build yourself may require a build environment that is completely incompatible with your workstation environment. Want to use Ubuntu 16.10 for your desktop and build CUDA apps? Good luck ... the system gcc version is too new. (You can do it, maybe, but, really, good luck!) Docker might just offer a way out of these messes.


What is Docker?

Let me start by saying that Docker is complicated. It can take a bit of effort to understand the details. However, the basic idea is not too bad.

The essential idea is to wrap up an application and all of its dependencies (binaries, libraries, config files, etc.) in a package (an image) that includes its own file system overlay in a way that it can run on any system that supports Docker. From that image you can start an instance of the application (a container). It is a "slice" of the host OS in its own namespace. You might think that sounds like a virtual machine. It's really quite different. A virtual machine has its own OS and all of its own services. A Docker container uses the underlying host kernel and only includes what is needed to run your application.
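To make that concrete, here is a minimal sketch of pulling an image and starting containers from it, assuming Docker is installed and your user is allowed to run the docker client (the ubuntu:16.04 tag is just an example image):

```shell
# Pull an image: the packaged application plus its file system overlay
docker pull ubuntu:16.04

# Start a container (a running instance of the image), run one command,
# and remove the container when it exits (--rm)
docker run --rm ubuntu:16.04 cat /etc/os-release

# The container shares the *host* kernel -- this prints the host's
# kernel version even though the userland inside is Ubuntu 16.04
docker run --rm ubuntu:16.04 uname -r
```

That last command is a quick way to see the difference from a virtual machine: there is no guest kernel to report.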

The best analogy I have seen is that a virtual machine is like a house and a Docker container is like an apartment. ( Containers are not VMs ) A house has its own foundation, plumbing, electrical system, a door to keep strangers out, etc. A Docker container system is more like an apartment building. There are spaces with their own doors, but there is a shared foundation, plumbing, electrical service, etc. A city block can have several houses, but that same area could have many more apartments of various sizes, from one-room efficiencies to elaborate penthouses. Docker containers make much better use of hardware resources than virtual machines. Containers are lightweight, portable and very fast to start up and shut down.

[ note: Docker uses "Linux containers" utilizing kernel namespaces, control groups and other low level kernel features. Microsoft is developing native containers for Windows Server 2016 but it is very new on that platform. You can run Docker on Windows and MacOS using tools that build it on top of a compact Linux virtual machine. See "Get Docker" for installation.  ]

Docker is an entire ecosystem

Docker was released as open source only a few years ago, in 2013. Adoption has been massive! It is dominating in modern IT and DevOps. Google runs nearly everything in containers, launching billions of them every week. Containers can be started, used, and shut down in milliseconds. The advantages are so compelling that a very robust and expanding ecosystem has developed around Docker.

Docker Hub is a registry host with over 100,000 image repositories. There are public and private registries; they are well utilized by the Docker community and constantly expanding.

Docker is still under rapid development and it is pretty obvious that usage in the IT and DevOps community is going to continue its transformative adoption. It is driving web applications using clusters of containers providing "microservices" and is replacing many of the traditional uses of virtualization.

OK, it's great on the server infrastructure side. What about the desktop? 

Docker on the Desktop

In the first couple of paragraphs of this post I described the "basic problem" -- getting "non-standard", difficult-to-install-and-configure programs running in your user environment. What if your "user environment" is your personal workstation? Can Docker be used on a personal workstation? Yes! Is it easy to use? Well, it's not "easy", but it's not impossibly hard either. I believe it is time to really consider this.

There are some obvious use cases:

  • Development environments  
  • Scientific packages 
  • Software testing environments
  • Machine learning frameworks
  • Running web applications
  • Trying software that you are not sure about 

You can probably think of applications that you would have liked to run on your workstation but they were only packaged for a distribution you don't use. Or they have dependencies you just don't want to clutter your system up with.


What could possibly go wrong?

  • Complexity -- Docker is not trivial to understand and it has a lot of commands and options. I just counted 62 options and sub-commands just for the docker command-line application. There are multiple tools and methods for creating images. It is a client-server application which requires a bit of understanding to set up.
  • Docker is mostly used on the server infrastructure side of things so using it as a workstation tool is more of a special use case.
  • Security -- Because Docker runs as a low-level service it requires elevated privileges to use. You will have to add yourself to the docker group to run as a regular user. The main application process inside a container is normally running with UID 0, a.k.a. root. ( That is UID 0 in the container's namespace, not necessarily your host system namespace. )
  • Isn't Docker for server or command-line applications only running on a CPU? I want to use my GPU! I'll get to that in a bit...

 ... it's not so bad

The caveats above are real but they are not as bad as they may sound.

  • Complexity -- You don't have to know every detail of Docker to use it. I am going to document my efforts and report on it. Docker is so important and is being used by so many "professionals"  that I believe it is inevitable that it will move to the desktop. I've been using it and I'm optimistic about this usage scenario.
  • Security -- Security was my biggest concern with Docker. Running it on your desktop for user-space applications just seemed risky. However, since Docker version 1.10 (it is version 1.13 as of this writing) it has included an implementation of kernel User Namespaces. This allows Docker containers to have their UIDs and GIDs mapped to a non-root user namespace. It is limited right now to a single remapped user, but that should be OK for setting up a more secure single-user workstation!
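As a sketch of how that remapping is enabled (assuming Docker 1.10 or later; the "default" value tells the daemon to create and use a "dockremap" user, and the daemon must be restarted afterwards):

```shell
# Configure the Docker daemon for user-namespace remapping
sudo tee /etc/docker/daemon.json <<'EOF'
{
  "userns-remap": "default"
}
EOF
sudo systemctl restart docker

# Root (UID 0) inside a container now maps to an unprivileged
# subordinate UID range on the host (see /etc/subuid and /etc/subgid)
docker run --rm ubuntu:16.04 id -u
```

Note that the container still sees itself as root; it is the host-side mapping that changes.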

Can I run GUI applications in Docker containers? Yes.

This can be a point of contention! In order to run a GUI application in a Docker container you have to bind the X socket to the container. The X-Window system is insecure. It always has been, and it will be until it is replaced ( hopefully soon by Wayland ). X-Window has been in use for over 20 years and it has never been secure. The issue that bothered me the most with using X in a Docker container was giving a root process access to the X socket. Starting the Docker service daemon with a remapped namespace via User Namespaces will mitigate this.
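A rough sketch of that X-socket binding looks like the following (the image name and the xclock program are placeholders, and your xhost access-control policy may differ):

```shell
# Allow local connections to the X server (coarse-grained -- see xhost(1))
xhost +local:

# Bind-mount the X socket into the container and pass DISPLAY through
# so the containerized app can draw on the host's X server
docker run --rm \
    -e DISPLAY="$DISPLAY" \
    -v /tmp/.X11-unix:/tmp/.X11-unix \
    some/gui-image xclock   # hypothetical image containing xclock
```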


NVIDIA-docker

NVIDIA-docker is what really got me interested in looking at Docker on the desktop. I had the pleasure of talking with a really nice fellow from NVIDIA at Supercomputing 2016. ( I'm sorry I didn't remember his name, I think it may have been Felix Abecassis. ) He was presenting some of the official NVIDIA Docker images utilizing their nvidia-docker and nvidia-docker-plugin. This is great!

Being able to use compute acceleration on NVIDIA GPUs from a Docker container is what motivated me to seriously consider using Docker on a workstation.

The NVIDIA driver is a kernel-level process, so using it with Docker is tricky (remember, containers share the host kernel via namespaces; they don't have their own kernels). These tools from NVIDIA have been working well for me.
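For example, with nvidia-docker and nvidia-docker-plugin installed, launching a GPU container looks like this (nvidia/cuda is NVIDIA's official image on Docker Hub; treat the exact tag and availability as an assumption):

```shell
# nvidia-docker is a thin wrapper around `docker run` that mounts the
# host's NVIDIA driver files and GPU device nodes into the container
nvidia-docker run --rm nvidia/cuda nvidia-smi
```

If everything is wired up correctly, nvidia-smi inside the container reports the same GPUs you see on the host.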

I will be writing posts on how to set all this up and examples of usage over the next several weeks.

 _________________________
< "Happy computing --dbk" >
 -------------------------
                    ##        .
              ## ## ##       ==
           ## ## ## ##      ===
       /""""""""""""""""___/ ===
  ~~~ {~~ ~~~~ ~~~ ~~~~ ~~ ~ /  ===- ~~~
       \______ o          __/
        \    \        __/
         \____\______/

Tags: Docker, NVIDIA-docker, GPU