
Large Language Model Servers

These rackmount AI servers offer the high GPU memory capacity needed for inference and training with cutting-edge large language models (LLMs).


4 GPU LLM Server

Puget's Take: Compact 2U server for large language model inference

CPU: AMD EPYC 9354P
GPU(s): 4 x NVIDIA L40S 48GB
RAM: 384GB DDR5-4800 REG ECC (12x32GB)

NVIDIA RTX Ada and L40S available!
Provides up to 192 GB of VRAM
70B model inference in fp16 with room for large context / KV cache

8 GPU LLM Server

Puget's Take: Maximum GPU power in a 4U server for LLM inference and training

CPU: 2 x AMD EPYC 9354
GPU(s): 8 x NVIDIA L40S 48GB
RAM: 768GB DDR5-4800 REG ECC (24x32GB)

NVIDIA RTX Ada, L40S, and H100 NVL available!
Provides up to 752 GB of VRAM
150B model inference in fp16 with room for large context / KV cache
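The VRAM claims above follow a common rule of thumb: fp16 weights take 2 bytes per parameter, plus extra headroom for the KV cache that grows with context length. A minimal sketch of that arithmetic is below; the layer/head counts are illustrative example values (roughly a 70B-class dense model without grouped-query attention), not a Puget Systems sizing tool.

```python
# Rough VRAM estimate for fp16 LLM inference.
# Rule of thumb: weights = 2 bytes/param; KV cache = 2 tensors (K and V)
# x 2 bytes x layers x hidden size x context length x batch size.

def fp16_weights_gb(params_billion: float) -> float:
    """fp16 stores 2 bytes per parameter, i.e. ~2 GB per billion params."""
    return params_billion * 1e9 * 2 / 1e9

def kv_cache_gb(layers: int, heads: int, head_dim: int,
                context_len: int, batch: int = 1) -> float:
    """KV cache size in GB for fp16, full multi-head attention."""
    bytes_total = 2 * 2 * layers * heads * head_dim * context_len * batch
    return bytes_total / 1e9

# Illustrative 70B-class shape (hypothetical values, not a specific model)
weights = fp16_weights_gb(70)   # ~140 GB of weights
cache = kv_cache_gb(layers=80, heads=64, head_dim=128, context_len=4096)
print(f"weights ~ {weights:.0f} GB, KV cache ~ {cache:.1f} GB")
```

Under these assumptions a 70B model needs roughly 140 GB for weights plus ~10 GB of KV cache at a 4K context, which is why it fits in the 4-GPU system's 192 GB with room to spare; models using grouped-query attention need considerably less cache.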



Equipped to Serve Customers of Any Size

Puget Systems has specialists on staff who cater to the needs of businesses and educational institutions. We are listed on numerous purchasing portals and offer optional onsite support. Click through to read more about how we can help your organization!

We specialize in building workstation PCs tailored to each of our customers, and the best way we've found to do that is to speak with you directly. There is no cost or obligation, and our no-pressure, non-commissioned consultants are experts at configuring a computer to meet your specific needs. They are happy to discuss a quote you have already saved, or to guide you through each step of the process by asking a few questions about how you'll be using your computer. There are several ways to start a conversation with us, so please pick what works best for you:


If you’d rather not wait, you can reach out to us via phone during our business hours.

Monday – Friday | 7am – 5pm (Pacific)