ML/AI Archives

From Unboxing to Inference: Introducing the Puget Systems Docker App Packs

Posted on March 17, 2026 by Dustin Moore

We look at AI the way our customers do, which is why we built the Puget Systems Docker App Packs: to help you get up and running with AI inference fast!

Standing Up AI Development Quickly for Supercomputing 2025

Posted on December 12, 2025 by Dustin Moore

How I used “Vibe Coding” and 25 years of experience to tame a liquid-cooled supercomputer in two weeks.

Exploring Hybrid CPU/GPU LLM Inference

Posted on March 20, 2025 by Jon Allman

A brief look into using a hybrid GPU/VRAM + CPU/RAM approach to LLM inference with the KTransformers inference library.

What’s the deal with NPUs?

Posted on October 25, 2024 by Jon Allman

An introduction to NPU hardware and its growing presence outside of mobile computing devices.

Local alternatives to Cloud AI services

Posted on April 11, 2024 by Jon Allman

Presenting local AI-powered software options for tasks such as image & text generation, automatic speech recognition, and frame interpolation.

LLM Server Setup Part 2 — Container Tools

Posted on November 20, 2023 by Dr. Donald Kinghorn

This post is Part 2 in a series on how to configure a system for LLM deployments and development usage. Part 2 is about installing and configuring the container tools Docker and NVIDIA Enroot.

LLM Server Setup Part 1 – Base OS

Posted on November 15, 2023 by Dr. Donald Kinghorn

This post is Part 1 in a series on how to configure a system for LLM deployments and development usage. The configuration will be suitable for multi-user deployments and also useful for smaller development systems. Part 1 is about the base Linux server setup.

Can You Run A State-Of-The-Art LLM On-Prem For A Reasonable Cost?

Posted on July 17, 2023 by Dr. Donald Kinghorn

In this post I address the question that’s been on everyone’s mind: Can you run a state-of-the-art Large Language Model on-prem? With *your* data and *your* hardware? At a reasonable cost?

UPDATE v0.2 NVIDIA GPU Powerlimit Setup

Posted on July 6, 2022 by Dr. Donald Kinghorn

This is just a short post to announce a more usable version of the NVIDIA GPU powerlimit setup script that I released a few months ago. This update to version 0.2 uses an interactive mode to set GPU power limits and optionally set up a systemd unit file to apply these limits on subsequent reboots.

NVIDIA GPU Power Limit vs Performance

Posted on February 22, 2022 by Dr. Donald Kinghorn

This post presents testing data showing that power-limit reduction on NVIDIA GPUs can give significant benefits for both high-wattage and lower-wattage GPUs. Power-limit vs performance data is presented for 1-4 A5000 and 1-4 RTX3090 GPUs.