Introduction
In the days following the release of the NVIDIA GeForce RTX™ 50-series cards, those who got their hands on the new GPUs may have found that some of their favorite applications would not work without software updates to enable compatibility. This was certainly the case in the generative AI space, where applications relying on PyTorch would not function without updating to a version of PyTorch with Blackwell / CUDA 12.8 support. This wasn’t much of an issue for people familiar with building PyTorch from source or locating prerelease PyTorch wheels and updating their environments’ dependencies, but the growing number of users expecting a “one-click” experience with their generative AI applications would generally find themselves facing esoteric error messages.
Since the initial 50-series launch, PyTorch 2.7 has been released, complete with support for the NVIDIA Blackwell GPU architecture and pre-built wheels for CUDA 12.8. Now that Blackwell support is in the main branch, we are curious to know whether common software packages are already taking advantage of this, and whether folks with Blackwell GPUs will now be able to easily install and run popular generative AI applications without manually installing a compatible version of PyTorch.

In this post, we want to go over a few of the most popular open-source Gen AI solutions, and cover their current public support for Blackwell GPUs.
ComfyUI
The ComfyUI team was on their A-game with Blackwell support, with this GitHub discussion page pointing early adopters in the right direction for getting ComfyUI working with Blackwell GPUs. However, any Blackwell owners who didn’t discover that page and used the traditional installation methods would have been snagged by the PyTorch compatibility issue.
Here, we’re going to take a look at the two most commonly used methods of installing ComfyUI: the desktop application and the portable package.
Desktop application
Downloaded from ComfyUI.org, this executable is the simplest way to install ComfyUI and has some additional benefits, like automatic integration of the ComfyUI-Manager extension. Surprisingly, though, at the time of this post, the most current version available (0.4.41) installs PyTorch 2.6.0+cu126 when choosing the NVIDIA installation method. It would be wonderful to find that the “manual configuration” installation method provides a solution, but it has failed to properly install ComfyUI any time we’ve tried it. To be fair, there is a warning under the manual configuration option which states, “This is entirely unsupported, and may simply not work.”
It likely won’t take long before we see an updated installation package that supports PCs with Blackwell GPUs, but for now, anyone wishing to install ComfyUI using this method can follow these steps to update PyTorch and enable Blackwell support:
- Install ComfyUI using the NVIDIA option available in the installation wizard
- Start ComfyUI
- Open the terminal (see image below)
- Enter the following command: pip3 install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
- Restart ComfyUI
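Once the upgrade completes and ComfyUI has been restarted, it's worth confirming that the new build actually landed. Here is a quick sketch you could paste into the same terminal (`torch_report` is our own helper; the versions printed will depend on your install):

```python
import importlib.util

def torch_report() -> str:
    """Summarize the installed PyTorch build, if any."""
    if importlib.util.find_spec("torch") is None:
        return "PyTorch is not installed in this environment"
    import torch
    lines = [f"torch {torch.__version__}"]  # expect 2.7.0+cu128 after the upgrade
    lines.append(f"CUDA available: {torch.cuda.is_available()}")
    if torch.cuda.is_available():
        # 'sm_120' in this list means the build can target consumer Blackwell GPUs
        lines.append(f"arch list: {torch.cuda.get_arch_list()}")
    return "\n".join(lines)

print(torch_report())
```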

Portable Package
Shortly after the release of PyTorch 2.7.0, version 0.3.30 of the ComfyUI portable package was released, which helpfully includes the updated PyTorch package in its ready-made environment. That means that owners of Blackwell GPUs who download this version of ComfyUI will have compatibility right “out of the box”.
If you’re using an older version of the ComfyUI portable package and would like to update your existing installation to support a new Blackwell GPU, you can take a cue from the “update_comfyui_and_python_dependencies.bat” script found in the ComfyUI_windows_portable\update directory and run a command to update the included Python environment. Be aware, however, that existing custom nodes or extensions may not work with this new version, so some caution is recommended.
- Make sure ComfyUI is not running, and back up your ComfyUI_windows_portable folder
- Open a terminal window in the “\ComfyUI_windows_portable” directory (Where the run_nvidia_gpu.bat script used to start ComfyUI is located)
- Enter the following command: .\python_embeded\python.exe -s -m pip install --upgrade torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128 -r .\ComfyUI\requirements.txt pygit2
- Wait for the download & installation process to finish
Stable Diffusion WebUI (AKA AUTOMATIC1111)
Stable Diffusion WebUI has an installation package that includes the Blackwell-compatible prerelease PyTorch wheels (v2.6.0 + CUDA 12.8), which can be found on this GitHub discussion page. This package will work right “out of the box,” but if you are interested in updating to the current PyTorch version, you can use the same update method as for an older ComfyUI portable package:
- Make sure SD WebUI is not running, and back up your sd.webui folder.
- Open a terminal window in the “\sd.webui” directory (where the run.bat script used to start SD WebUI is located)
- Enter the following command: .\system\python\python.exe -s -m pip install --upgrade torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
- Wait for the download & installation process to finish.
Stable Diffusion WebUI Forge
Unlike the original SD WebUI, the popular Forge fork does not have a prebuilt package that is Blackwell-compatible; the latest version of PyTorch offered is 2.4.0 + CUDA 12.4. However, Blackwell owners can follow the same steps used for the original SD WebUI to update PyTorch. Just be aware that Forge has many integrated extensions, and they may not all be compatible with the current PyTorch version.
- Make sure Forge is not running, and back up your webui_forge folder.
- Open a terminal window in the “\webui_forge” directory (where the run.bat script used to start Forge is located)
- Enter the following command: .\system\python\python.exe -s -m pip install --upgrade torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu128
- Wait for the download & installation process to finish.
Text Generation WebUI
Portable package
Unlike the full installation of Text Generation WebUI, the portable version only contains the llama.cpp backend rather than all of the loader options, such as ExLlama or the classic Transformers.
In our original examination of Blackwell GPU performance, we found that the prompt processing scores for the new GPUs were abnormally low with the build of llama.cpp we used during that testing (4493). However, that issue has since been resolved in newer releases. At the time of writing, the latest version of the portable release for Text Generation WebUI is v3.1 and includes binaries for llama.cpp build 5203, which no longer exhibits the poor prompt processing performance issue we saw in earlier releases.
Full installation
Currently, the complete installation of Text Generation WebUI (downloading the repository and running the one-click installer) sets up the Python environment with PyTorch v2.6.0 + CUDA 12.4, meaning it will not support Blackwell GPUs immediately after installation. Although the llama.cpp loader will still function, neither ExLlama nor Transformers will work with a Blackwell GPU without updating PyTorch. Note that ExLlama needs to be built specifically for a given version of PyTorch & CUDA, so updating PyTorch alone is not enough to get it working.
Follow these steps to update PyTorch:
- Run the cmd_windows.bat file to open a terminal window with the environment active
- Enter the following command: pip3 install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu128
- Wait for the download & installation process to finish
Additionally, you may run the following command to update ExLlamaV2:
pip3 install https://github.com/turboderp-org/exllamav2/releases/download/v0.2.9/exllamav2-0.2.9+cu128.torch2.7.0-cp311-cp311-win_amd64.whl
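That wheel’s filename encodes everything that must line up with your environment: the CUDA toolkit (cu128), the PyTorch build (2.7.0), and the Python interpreter (cp311 = CPython 3.11). A small sketch that pulls those tags apart (`parse_exl2_wheel` is our own illustration, tied to this project’s release naming scheme):

```python
def parse_exl2_wheel(filename: str) -> dict:
    """Split an ExLlamaV2 release wheel name into the tags that must match your setup.
    The 'version+cuXXX.torchX.Y.Z' local-tag scheme is specific to this project."""
    stem = filename.removesuffix(".whl")
    name, version_and_local, py_tag, abi_tag, platform = stem.split("-")
    version, _, local = version_and_local.partition("+")
    cuda_tag, _, torch_tag = local.partition(".torch")
    return {
        "package": name,
        "version": version,
        "cuda": cuda_tag,    # must match the index you installed from (cu128)
        "torch": torch_tag,  # must match the installed PyTorch (2.7.0)
        "python": py_tag,    # must match your interpreter (cp311 = CPython 3.11)
        "abi": abi_tag,
        "platform": platform,
    }

info = parse_exl2_wheel(
    "exllamav2-0.2.9+cu128.torch2.7.0-cp311-cp311-win_amd64.whl"
)
print(info["cuda"], info["torch"], info["python"])  # cu128 2.7.0 cp311
```

If any one of those tags disagrees with your environment, pip will either refuse to install the wheel or the extension will fail to import, which is why ExLlama must be rebuilt (or re-downloaded) after every PyTorch upgrade.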
After updating PyTorch, the Transformers loader should now load and run models without issue. ExLlamaV2 should also work, but with some caveats. By default, it’s configured to use Flash Attention, which, as of this writing, does not appear to have prebuilt wheels available for the current release of PyTorch. This means that without an updated version of Flash Attention, you will need to enable the “no_flash_attn” and “no_sdpa” options when loading exl2 models.

llama.cpp
As llama.cpp does not rely on PyTorch, it and the many projects built upon it, such as Ollama or LM Studio, have supported Blackwell GPUs since the cards launched. As mentioned above in the discussion of the portable package version of Text Generation WebUI, we did encounter some performance issues with the llama.cpp builds available at the time of Blackwell’s release, but those issues have since been resolved.
Conclusion
Although prerelease versions of PyTorch with Blackwell GPU support have been available since the 50-series launch, less technically inclined owners of these new GPUs may have struggled to get their favorite generative AI applications working with their new hardware. However, with the full release of PyTorch 2.7.0 for CUDA 12.8, the state of 50-series software support has seen significant progress. Many applications’ installation packages still need to be updated to install this latest PyTorch release automatically, but it won’t be long before owners of Blackwell GPUs are provided with the seamless experience that we’ve come to expect from the tools we’ve grown to rely on.
If you need a powerful workstation to tackle the applications we’ve tested, the Puget Systems workstations on our solutions page are tailored to excel in various software packages. If you prefer to take a more hands-on approach, our custom configuration page helps you to configure a workstation that matches your exact needs. Otherwise, if you would like more guidance in configuring a workstation that aligns with your unique workflow, our knowledgeable technology consultants are here to lend their expertise.