
High-Performance Servers for Scaling AI
These rackmount AI servers offer high GPU memory capacity for training cutting-edge large language models (LLMs) and other AI-powered tools, and for scaling them out to large deployments.
These servers are designed for training and scaling AI, but we also offer AI development workstations and systems for piloting and deploying AI.
| | 4 GPU LLM Server | 8 GPU LLM Server |
|---|---|---|
| Puget’s Take | Compact 2U server for large language model inference | Maximum GPU power in a 4U server for LLM inference and training |
| CPU | AMD EPYC 9375F | 2x AMD EPYC 9354 |
| GPU(s) | 4x NVIDIA RTX PRO Blackwell Max-Q 96GB | 8x NVIDIA RTX PRO Blackwell Server 96GB |
| RAM | 12x 64GB DDR5 RDIMM (768GB total) | 24x 64GB DDR5 RDIMM (1.5TB total) |
| Features | NVIDIA RTX PRO and L40S GPUs; configurable up to 384 GB of VRAM; 70B model inference in fp16 with room for large context / KV cache | NVIDIA RTX PRO and H200 NVL GPUs; configurable up to 1128 GB of VRAM; 150B model inference in fp16 with room for large context / KV cache |
| Price as Configured | $90,338.47 | $184,358.20 |
| Starting At | $20,209.79 | $43,686.17 |
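The "room for large context / KV cache" claims in the Features row can be sanity-checked with simple arithmetic. The sketch below is a rough estimate, not a sizing tool: it assumes 2 bytes per parameter for fp16 weights and 1 GB = 10^9 bytes, and it ignores activations, framework overhead, and any uneven split of memory across GPUs. The function names are illustrative only.

```python
def fp16_weight_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Approximate weight memory in GB for a model with the given
    parameter count (in billions). Assumes 1 GB = 1e9 bytes."""
    return params_billion * bytes_per_param


def headroom_gb(total_vram_gb: float, params_billion: float) -> float:
    """VRAM left over for KV cache, activations, and overhead
    after the fp16 weights are loaded."""
    return total_vram_gb - fp16_weight_gb(params_billion)


# Maximum configurable VRAM and model sizes quoted in the table above.
configs = {
    "4 GPU LLM Server": (384, 70),
    "8 GPU LLM Server": (1128, 150),
}

for name, (vram_gb, params_b) in configs.items():
    weights = fp16_weight_gb(params_b)
    spare = headroom_gb(vram_gb, params_b)
    print(f"{name}: {weights:.0f} GB weights, {spare:.0f} GB headroom")
```

Under these assumptions, a 70B model needs about 140 GB for weights, leaving roughly 244 GB of the 384 GB maximum, and a 150B model needs about 300 GB, leaving roughly 828 GB of the 1128 GB maximum. The as-configured systems have less total VRAM than the maximums, so the headroom there is correspondingly smaller.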
Standardized Baseline & Validation Blueprints: Puget App-Packs
At the Scale tier, 24/7 mission-critical reliability and predictable, high-throughput performance are non-negotiable. Every Puget AI workstation and server includes access to our pre-validated Docker App-Packs, specifically optimized for the hardware we build.

High-Throughput Validation
Our team_llm flavor, running vLLM, serves as a standardized benchmarking blueprint. It lets your enterprise teams test, benchmark, and verify inference throughput and reliability under load on our multi-GPU servers before scaling out to custom production orchestration layers.
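As an illustration of what "throughput and reliability under load" might look like in numbers, here is a minimal sketch that summarizes the results of a load test. It is not part of the App-Packs: `RequestResult` and `summarize` are hypothetical names, and the sketch assumes you have already recorded, for each request sent to the inference server, the number of generated tokens, the latency, and whether the request succeeded.

```python
from dataclasses import dataclass
from statistics import median


@dataclass
class RequestResult:
    completion_tokens: int  # tokens generated for this request
    latency_s: float        # end-to-end latency in seconds
    ok: bool                # True if the request succeeded


def summarize(results: list[RequestResult], wall_clock_s: float) -> dict:
    """Aggregate throughput and reliability figures for one load-test run."""
    if not results or wall_clock_s <= 0:
        raise ValueError("need at least one result and a positive duration")
    succeeded = [r for r in results if r.ok]
    latencies = [r.latency_s for r in succeeded]
    return {
        # Share of requests that completed successfully.
        "success_rate": len(succeeded) / len(results),
        # Generated tokens per second across all concurrent requests.
        "tokens_per_s": sum(r.completion_tokens for r in succeeded) / wall_clock_s,
        # Latency figures cover successful requests only.
        "median_latency_s": median(latencies) if latencies else None,
        "max_latency_s": max(latencies) if latencies else None,
    }


# Made-up sample data: four requests over a 4-second window.
sample = [
    RequestResult(100, 1.0, True),
    RequestResult(200, 2.0, True),
    RequestResult(0, 5.0, False),
    RequestResult(300, 3.0, True),
]
print(summarize(sample, wall_clock_s=4.0))
```

Comparing these figures across runs at increasing concurrency is one simple way to see where throughput stops scaling or where failures start to appear.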

DevOps-Ready Docker Base
This is a standardized, containerized baseline environment that integrates with modern enterprise CI/CD and deployment pipelines. Because GPU driver configuration is already handled, your systems engineering team starts with a fully validated, repeatable, and stable foundation.
Equipped to Serve Customers of Any Size
Puget Systems has specialists on staff who cater to the needs of businesses and educational institutions. We are listed on numerous purchasing portals and offer optional onsite support.
Talk to an Expert
We specialize in building workstation PCs, servers, and storage systems tailored for each of our customers. The best way we’ve found to accomplish that is to speak with you directly. There is no cost or obligation, and our no-pressure, non-commissioned consultants are experts at configuring a computer that will meet your specific needs. They are happy to discuss a quote you have already saved or guide you through each step of the process by asking a few questions about how you’ll be using your computer.



