
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1131
Dr Donald Kinghorn (Scientific Computing Advisor)

GTC 2018 Impressions

Written on April 2, 2018 by Dr Donald Kinghorn


NVIDIA's GPU Technology Conference (GTC) is probably my all-time favorite conference. It's an interesting blend of "Scientific Research meeting" and Trade-Show. It's put on by a hardware vendor but still feels like a scientific meeting. It's not just a "Kool-Aid" fest! Innovative research in software and hardware is presented at this conference. You see everything from current autonomous vehicles to VR gaming, from the largest businesses in the world to the smallest startups. You see current and in-development hardware and software products along with excellent research paper presentations. Really a nice mix!

NVIDIA listens to feedback, so the conference seems to get better and better. GTC is a joy for attendees and exhibitors. They have good food and drink, they don't overlap exhibits and talks, and you never have to walk more than 20 meters to find good (free) coffee and water to keep you going. It's a comfortable and energetic meeting.

GTC is not just an NVIDIA "Kool-Aid" fest! It's a serious scientific and industry research meeting and a showcase for some of the most innovative software and hardware being created today. It's an intellectual feast of the latest developments and applications of Machine Learning/AI and stunning visualization technology.

I'll share some of my feelings and impressions about this year's conference.


The Big Picture (My Perspective)

First, it was a great meeting! I got to talk with many wonderful and interesting people. I got questions answered about projects I'm working on, including hardware issues and performance testing. I made new friends and got to connect with several followers of my blog posts. I got to talk about the machines we build at Puget Systems and I got to spend an enjoyable week with my coworkers and colleagues. Nice!

This year's GTC had an impressive turnout and the level of "understanding" seemed higher. By that I mean there was a maturing of Machine Learning/AI knowledge. There were still many who were new to the field, but more people were doing real work and a larger portion of people were at the stage of deploying applications and looking at the problems associated with that.

I mentioned only Machine Learning/AI above because it was definitely the predominant interest. There was a bit of a disconnect with NVIDIA in this regard. The keynote had a lot of emphasis on visualization and it just really didn't fit with expectations. I enjoyed the keynote but I was wishing for more focus on Machine Learning. I was also disappointed that there wasn't a second "technical" keynote. In the past there have been Deep Learning luminaries like Andrew Ng and Jeff Dean on stage, with incredibly inspiring messages. That didn't happen this year. However, as I mentioned above, attendees were already "inspired" and doing real work.

I had three main personal interests going into the meeting: Docker, TensorFlow, and hardware/software performance.

I had several enjoyable talks with the NVIDIA Docker team. They are doing great work and have ambitions for much more. I attended as many talks related to Docker as I could and it is clear that it is one of the most important aspects of practical work. It was discussed and demonstrated heavily at the keynote, along with the announcements about GPU container management with Kubernetes. I got to meet the fellow who wrote the code for user namespaces in the Linux kernel and a great guy from Canonical (Ubuntu) who works on LXD. My interactions with all the Docker/container people were enlightening and motivating. I'd say that was my favorite part of the meeting!

I also tried to go to talks and have conversations about TensorFlow. That was pretty easy since it is arguably the most important machine learning framework. I'll be doing more testing for sure, inspired by information I got about performance. Also, the TensorFlow group released version 1.7 last week and there are lots of improvements I'm anxious to explore.

Another takeaway from the meeting is that everybody seems to be adversely affected by the Spectre/Meltdown mess. Talking with other hardware vendors and component suppliers evoked a lot of empathy. It really is a mess. One of the talks that I wanted to see but missed was by a team from Red Hat discussing performance problems caused by this mess. I'll watch that from the recording to see what they had to say. Did I mention that this is a mess?

There was an incredible amount of seriously interesting "content" at the meeting. I barely scratched the surface of my interests. I could have used a few personal "clones" to take more of it in. Thankfully most of the talks are recorded. I have a link for that in the next section.


Some stats and info

There was a very large turnout for GTC18 and lots of stuff going on:

  • Attendance was estimated at 8500. The San Jose McEnery Convention Center is not really all that big so the meeting felt a little crowded. I don't have past attendance at hand but this definitely seemed to be the largest turnout of the 4 or 5 times I've attended.
  • 990 Sessions -- That includes:
    • 67 "connect with the experts" -- Those are informal meeting areas that provide an opportunity to talk with some of the great people at NVIDIA to get advice, discuss problems, etc. I took advantage of that to get some questions answered.
    • 86 Workshops, Tutorials, and Labs -- A wide variety of hands on learning opportunities. (I would have liked to have done a couple of them!)
    • 160 Posters -- There were a lot of nice posters presented. These were largely academic research papers. Quality of work was generally very high. I always enjoy seeing what people are working on. The Tuesday night poster reception was packed! Beer and Science go well together!
    • Talks -- There were 641 talks presented in the 4 days of the meeting! There were probably 100 talks that I wanted to attend ... I did manage to make it to around 12 or so. There were a few that I was not able to get into because they filled up quickly. For talks that overflowed they had a video stream on a monitor in the hallway, nice!

One of the great things about GTC is that a large number of the talks get recorded. "Important" talks are video recorded including the speaker and stage. Other talks are usually recorded through the AV presentation system, providing the speaker's voice and slides. In a few days you should be able to find talks at GTC-On-Demand.

  • 158 Exhibitors -- That included Puget Systems of course! The exhibit floor at GTC is always very interesting. I'll come back to that...

Keynote

The keynote was good and I enjoyed all two and a half hours of it. Here are my feelings about some of the key points:

  • First, Jensen (Jensen Huang, founder and CEO of NVIDIA) ... he was great! His humor was wonderful at the keynote. Maybe it's because I've been going for several years but I "get" and appreciate his humor. He was spontaneous and playful and poked fun at himself a couple of times. I particularly enjoyed when he was leading up to the presentation of the new DGX2 and walked over to the edge of the stage and asked someone to hand him a card (I think it was the GV100). He then did his classic stance and voice, holding up the card ... "...here it is, the world's most gigantic GPU" ... followed by "you didn't fall for that did you?" Jensen obviously loves his company and his employees. When he says "I love you guys" I believe he genuinely means it. He directs that at the developers and vendors in the audience too.

  • Of course the hall was packed. It seats 4000 and there were more than twice that many people there. It was streamed live and recorded. You can watch the recorded stream.

  • The keynote started with visualization, in particular real-time ray tracing. This was a bit unusual because the focus in the past has been almost entirely machine learning and AI. It was interesting and impressive and led up to the announcement of the Quadro GV100, the Volta replacement of the GP100. That is an amazing card which can be paired with a second card using NVLINK-2 to act as a single memory space, combining the 32GB of memory on each card for a total of 64GB. It is designed to take advantage of what they are calling NVIDIA RTX technology for real-time ray tracing. That was the best use case for the GP100 and it looks like they have made big improvements for that use. Visualization is something I've always appreciated and been interested in, and I believe it is an important part of the "big picture", but I talked to several people who wished he had not spent so much time on it.

  • Next it was on to the relentless and amazing growth of compute capability provided by GPU acceleration. It really is incredible how much compute capability has increased over the last few years, driven by NVIDIA GPUs and the developer ecosystem. This is one of the critical driving forces for the advances in Machine Learning and AI that we have been seeing. It is the focus and interest of the majority of GTC participants.

  • Next was discussion of the importance of docker, cloud services, GPU servers and the announcement of GPU support in Kubernetes. Of course you know I'm a big fan of docker on your desktop too.

  • Then visualization came up again in the form of medical imaging. This is important for the front-end of all the machine learning work that is being done. You could feel a little restlessness in the audience, as if to say "OK, great, but I came here for ML/AI."

  • Next up was the DGX2. Jensen was referring to it as a "truly gigantic GPU". In a way that's a pretty good description. That is one of the most impressive pieces of hardware I remember ever seeing. It has 16 Volta SXM3 units, each with 32GB, and some very high speed and bandwidth networking interconnects. It's a 10K Watt "node" with a theoretical compute capability of 2 petaFLOPS. It has a $400K price. That seems like a lot of money but it is 10 times the performance of the DGX1 that sold for $120K. I really didn't feel like it was overpriced! However, it's certainly not optimal price/performance for most workloads, but on the highest end of problem requirements it might be a bargain. He pointed out that 5 years ago it took 6 days to train AlexNet on the ImageNet competition data-set with 2 GTX 580's; the DGX2 can do that in 18 minutes!

  • Next was the announcement of the new version of TensorRT. At this GTC there was more activity around the end result of Machine Learning, i.e. Inference. This is using the models you have trained for some task. In general, you want to reduce the size of the model so that it can be implemented at the "edge" or provide an on-line service efficiently. Jensen talked about TensorRT, which is an NVIDIA project for assisting with this stage of deploying models. TensorRT version 4 will be more tightly integrated with TensorFlow. He was almost chuckling about the acronym he came up with, PLASTER, for describing what needs to be addressed: Programmability, Latency, Accuracy, Size, Throughput, Energy efficiency, Rate of learning. I like it! This was illustrated with an impressive demonstration of scaling multi-GPU with Kubernetes for an inference problem identifying flowers (a demo that they have used in the past). It was stunning to see how much performance gain there was going from CPU to multi-node GPU accelerated containers launched with Kubernetes.

  • Then it was on to autonomous vehicles. He discussed the advances but what caught my attention was a remark about how difficult this problem is. You could hear in his voice the realization of how massive and difficult this problem really is. He showed their new vehicle computer control system. I don't follow this hardware myself but it looked very impressive and had appropriate redundancy and such. There is no doubt NVIDIA is a big player in autonomous vehicles. This led into a really brilliant idea/project (in my opinion) for simulating driving conditions in a way that duplicates what automotive sensors would actually encounter. This is in order to generate the billions of driving miles needed to train models in all types of conditions without risking safety, and as a way to accelerate research.

  • The keynote was running over 2 hours when they demoed the "Holodeck VR" project by having someone in the Holodeck drive a real car remotely. It was amazing.
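
As a rough sanity check on the DGX2 numbers quoted in the keynote notes above, here is a minimal back-of-the-envelope sketch in Python. It only uses the figures as I noted them (prices, the "10 times" performance claim, the AlexNet training times, 16 units with 32GB each, 10K Watts, 2 petaFLOPS theoretical). The variable names are just illustrative, and none of this is a measured benchmark.

```python
# Back-of-the-envelope arithmetic using only the figures quoted from the keynote.
# All inputs are "as quoted", not measured; variable names are illustrative.

dgx1_price_usd = 120_000
dgx2_price_usd = 400_000
dgx2_relative_performance = 10.0  # DGX2 performance relative to DGX1, as quoted

# How much more the DGX2 costs than the DGX1.
price_ratio = dgx2_price_usd / dgx1_price_usd

# Performance per dollar of the DGX2 relative to the DGX1.
perf_per_dollar_gain = dgx2_relative_performance * dgx1_price_usd / dgx2_price_usd

# AlexNet training time: 6 days on 2 GTX 580's versus 18 minutes on a DGX2.
alexnet_then_minutes = 6 * 24 * 60
alexnet_dgx2_minutes = 18
training_speedup = alexnet_then_minutes / alexnet_dgx2_minutes

# Total GPU memory: 16 Volta units with 32GB each.
dgx2_total_gpu_memory_gb = 16 * 32

# Theoretical compute per watt: 2 petaFLOPS at 10K Watts.
dgx2_peak_flops = 2e15
dgx2_power_watts = 10_000
gflops_per_watt = dgx2_peak_flops / dgx2_power_watts / 1e9

print(f"Price ratio (DGX2/DGX1):       {price_ratio:.2f}x")
print(f"Performance per dollar gain:   {perf_per_dollar_gain:.2f}x")
print(f"AlexNet training speedup:      {training_speedup:.0f}x")
print(f"Total GPU memory:              {dgx2_total_gpu_memory_gb} GB")
print(f"Theoretical GFLOPS per watt:   {gflops_per_watt:.0f}")
```

With those quoted numbers the DGX2 costs a bit over 3 times as much as the DGX1 for 10 times the performance, which works out to roughly 3 times the performance per dollar, and the AlexNet comparison is a 480-fold reduction in training time. That is why the price didn't strike me as unreasonable for the very high end.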

Then it was on to ...

The Exhibition

The first round of the exhibition starts after the keynote with lunch served in the hall. With the lack of overlap with talks and other events the exhibit hall is generally packed while it's open.

Hardware

The usual suspects were there. Big and small computer hardware vendors were well represented. This is no surprise since GPU acceleration is the largest advance in numerical computing throughput since the start of the commodity cluster era. The Summit supercomputer going in at Oak Ridge National Laboratory was highlighted. From the supercomputing "heavy iron" there was a range all the way down to embedded systems for IoT devices. Machine Learning Inference at the edge was a common theme.

The important computer component suppliers and system vendors were there and it was a pleasure to talk with them. A common theme in this regard was the huge mess that the Intel Spectre and Meltdown issues have caused. Everyone is suffering from this! I talked with system vendors like us (Puget Systems) and everyone is having trouble from the current mess. System qualification and development delays are affecting the whole industry.

More Hardware

There were a lot of interesting sensor devices on display. These were presented in relation to autonomous vehicles and just "in general". There was a very interesting LIDAR device the size of a "cup-cake" that was demoed attached to a drone. It had embedded ML capability of course.

The automotive industry is well represented at GTC. There is amazing work being done for autonomous driving. The vehicles on display ranged from transformed "normal" cars bristling with sensors and cameras, including a full size "semi", to small electric cars. There were vehicles with no place for a driver ... that still creeps me out a bit! ... but it's coming. The autonomous vehicle work is maturing, and the realization that there is a huge amount of hard work that still needs to be done was fully apparent.

Visualization

There was a good representation of gaming technology and visualization. There were some amazingly beautiful examples of real-time ray tracing. Lots of candy for the eyes! The VR pavilion seemed less active than last year but it was still well attended and there were constant lines waiting to try out new games.

Cloud

The big cloud vendors were out in force: Amazon (AWS), Google (GCP), Microsoft (Azure), and IBM cloud and Watson had large and busy booths. Machine Learning/AI is critical for these companies internally and as a service. Everyone now has GPU accelerated instances available and there are many specialized services geared toward machine learning work-flows. However, they were not the busiest booths! They are important, but usability and value are still lacking for the "average" researcher and I heard many complaints in that regard. The thing is, a GPU accelerated workstation is not a huge expense and provides impressive compute capability. As a result many users are still very much interested in on-premises hardware. I, of course, really like to have my own hardware but I find "the cloud" interesting too.

Software, Machine Learning/AI

With all the variety of the exhibits I would say the emphasis was still very heavy on machine learning and AI. Again this ranged from the largest players to brand new startups. The vast majority of activity at GTC is around machine learning and AI. I had several people tell me they were annoyed at the visualization emphasis of the keynote. Nearly all the posters and most of the talks were focused on machine learning and AI. That is where the most interest was in the exhibit hall too.

There was an interesting shift this year in the machine learning/AI work represented in the exhibits. There were significantly more "product" or production-level projects going on. The advances are starting to move to the "real world", so more "edge" and "inference" focus was present.


If you missed GTC18 then I hope what I wrote above gave you some feel for what went on. I really do enjoy this conference and hope that I can continue to be a regular attendee. I always learn new things and leave inspired for my own work.

Hope to see you at GTC19!

Happy computing! --dbk

Tags: GTC18, NVIDIA, Docker, TensorFlow