Quad GPU Large Language Model Server

Compact 2U rackmount server supporting up to four NVIDIA GPUs for fine-tuning and inference with AI large language models.

Overview

Quad GPU 2U server supporting NVIDIA RTX Ada and L40S graphics cards

  • Up to 192GB of VRAM across four GPUs
  • Well suited to fp16 inference on 70B-parameter models and to fine-tuning smaller models
  • Requires two power connections on separate circuits
  • 240V power required for PSU redundancy
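
As a rough check on the sizing figures above, model weights need about (parameter count × bytes per parameter) of memory. The sketch below assumes 2 bytes per fp16 parameter and the 192GB total from the overview; the function name is illustrative, and KV cache, activations, and runtime overhead are not modeled, so the real requirement will be higher.

```python
def weights_gb(params_billions: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for model weights in GB (1 GB = 1e9 bytes).

    Billions of parameters times bytes per parameter gives GB directly.
    """
    return params_billions * bytes_per_param


TOTAL_VRAM_GB = 192  # from the overview: combined VRAM across four GPUs

needed = weights_gb(70)            # 70B parameters at fp16 (2 bytes each)
headroom = TOTAL_VRAM_GB - needed  # what remains for KV cache and overhead

print(f"weights: {needed:.0f} GB, headroom: {headroom:.0f} GB")
```

Fine-tuning also needs memory for gradients and optimizer state on top of the weights, which is consistent with the overview recommending this system for fine-tuning smaller models.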

Not sure what you need?

Contact us and one of our experts will reply within 1 business day to help configure the right computer for your workflow. If you don’t see what you are looking for here, check out our other systems for more options.

System Core




NVIDIA Mellanox Dual 100GbE QSFP56 PCI-E Card (Limited Supply) [add $1090.06]
NVIDIA Mellanox Dual 100GbE QSFP28 PCI-E Card (Limited Supply) [add $1150.94]
Up to one OCP 3.0 card and one PCI-E card may be selected to provide high-speed networking capability.

Chassis & Cooling


For redundancy to be functional, the total power consumption of the system must be lower than the maximum output of an individual PSU module. That output depends on the input voltage.
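
The redundancy rule above can be sketched as a simple comparison. The per-module wattages and the system draw below are hypothetical placeholders chosen for illustration, not the specifications of this chassis; substitute the figures from the PSU datasheet and your own configuration.

```python
# Hypothetical per-module PSU output (watts) by input voltage.
# Placeholder numbers only, NOT the ratings of this system's power supplies.
PSU_OUTPUT_W = {120: 1200, 240: 2200}


def redundancy_ok(system_draw_w: float, input_voltage: int) -> bool:
    """Return True if a single PSU module could carry the whole load alone."""
    return system_draw_w < PSU_OUTPUT_W[input_voltage]


draw = 1800  # assumed total system draw in watts, for illustration only

for volts in (120, 240):
    status = "redundant" if redundancy_ok(draw, volts) else "not redundant"
    print(f"{volts} V: {status}")
```

With these placeholder values, the same load is redundant only at the higher input voltage, which illustrates why the overview lists 240V power as a requirement for PSU redundancy.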

Software


Also available with Windows Server 2022. Please contact us for licensing costs.

Additional Information

Help us help you! We review each configuration to ensure you’re getting the right hardware. Any info you can provide about your workflow and software will help us provide you with a better experience.


System Cost

Typically ships in 1-2 weeks

Contact us for lead times

Contact us for quotes for more than 100 units
