NVIDIA GTX 295: Bad choice for liquid cooling?

I've always had a rocky relationship with dual GPU video cards. Our first bad experience was with the NVIDIA 7950GX2, which we found over time to suffer from higher shipping damage rates than single GPU cards. The NVIDIA 9800GX2 was even worse. Now the NVIDIA GTX 295 is the major NVIDIA dual GPU card on the market. Are we set up for a repeat experience?

I should start by saying that my problem with dual GPU video cards goes deeper than simply my bad experiences. In general, I don't like the concept. You'll find me saying the same things about SLI, Crossfire, and especially RAID: while they are perfectly valid ways of forcing more performance into your system, they are no substitute for the true innovation of a newer, faster technology. The NVIDIA GTX 295 is especially on my bad side, because its name attempts to position it as a cutting edge product. In reality, it is essentially two NVIDIA GTX 260 class GPUs running in SLI. The NVIDIA GTX 285 video card is a superior technology, but you'd never know it by the name.

ATI 4870X2, dual GPUs on a single PCB.
NVIDIA GTX 295, using two PCBs with one GPU per PCB.

Now that I've revealed my bias, let's move on to the details. There are two ways that ATI and NVIDIA are implementing dual GPU video cards. The first is to simply put two GPUs on one PCB. This is the way that the ATI 4870X2 is implemented. The second is to take two PCBs, with one GPU each, and connect them together. This is the way the NVIDIA GX2 cards were done, and it is how the NVIDIA GTX 295 is done. Both single and dual PCB implementations share the problems of high power draw and high heat output.

The additional problem with having two PCBs is that you have double the points of physical failure. While single PCB boards still suffer from this problem to some extent, the dual PCB boards have a greater risk of physical failures, because there are twice the connectors, screws, mounts, and brackets. They also tend to be heavier, and these two things add up to a pretty dramatic increase in the risk of shipping damage. This is a problem we're seeing right now. Even with custom shipping boxes and Instapak foam, we are seeing a 25% DOA rate when shipping these cards in liquid cooled systems. You heard me right: twenty-five percent!
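To put a rough shape on the "double the points of failure" argument, here is a back-of-the-envelope sketch in Python. It assumes every connector, screw, mount, and bracket fails independently in shipping with the same small probability, which is a simplification. The function name, the per-point probability, and the point counts are all hypothetical values chosen for illustration; they are not our measured data and they do not explain the 25% figure above.

```python
def prob_any_failure(p_single: float, n_points: int) -> float:
    """Chance that at least one of n_points independent failure points fails.

    Assumes each point fails independently with probability p_single,
    so the chance that all of them survive is (1 - p_single) ** n_points.
    """
    if not 0.0 <= p_single <= 1.0:
        raise ValueError("p_single must be between 0 and 1")
    if n_points < 0:
        raise ValueError("n_points must be non-negative")
    return 1.0 - (1.0 - p_single) ** n_points


# Hypothetical inputs, for illustration only (not measured data).
P_SINGLE = 0.01        # assumed chance that any one point fails in shipping
SINGLE_PCB_POINTS = 10  # assumed number of failure points on a single PCB card
DUAL_PCB_POINTS = 2 * SINGLE_PCB_POINTS  # "twice the connectors, screws, ..."

single_pcb_risk = prob_any_failure(P_SINGLE, SINGLE_PCB_POINTS)
dual_pcb_risk = prob_any_failure(P_SINGLE, DUAL_PCB_POINTS)

print(f"single PCB risk: {single_pcb_risk:.1%}")
print(f"dual PCB risk:   {dual_pcb_risk:.1%}")
```

Under these assumptions, doubling the number of failure points nearly doubles the chance that something comes loose when the per-point risk is small. In practice the extra weight and the exposed bridge connectors likely make things worse than this simple model suggests.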

One particularly vulnerable point of failure on the NVIDIA GTX 295 is the set of bridge connectors that link the two PCBs together. This is actually the reason for this blog post! In an air cooled card, these connectors are rarely a problem. They're well protected by the plastic cover on the NVIDIA stock cooler. However, when you liquid cool the video card, the stock cooler comes off, so these connectors are exposed and quite susceptible to being jostled. If the connectors were stronger, it wouldn't be an issue. But look at that picture and see how shallow the connector sockets are. It is extremely easy to unseat them, and that is a major problem.

For this reason, we've just stopped selling the NVIDIA GTX 295 as a liquid cooled card. Sorry NVIDIA, I love your products, but we just can't liquid cool this one in good conscience. It's disappointing that a $2 connector is such an obvious Achilles' heel on this cutting edge video card. To NVIDIA's credit, rumor has it that the company is working on a single PCB version of the card, due out in May. I look forward to this card, and I expect we'll be offering the NVIDIA GTX 295 for sale again in liquid cooled configurations once it arrives. Again, the current cards are still perfectly fine when air cooled.

If you're a current customer and you have one from us, don't fret! The vast majority of the risk is in shipping, so if you already have your machine, you're safe. Also, now that we know what to look for, diagnosis and repair of these cards is fairly straightforward. To be honest, I worried about posting this information. I worried about how it would make customers feel who have already bought liquid cooled NVIDIA GTX 295 cards. But then I remembered one of the core values at Puget Systems: to cut through the hype of the industry and to provide real, honest information. However, this doesn't mean I need to limit my blog posts to the negatives. The fact that we actively weed out problem parts means that we have an absolutely stellar product line, so look for some blog posts this week that spotlight some of our best and most reliable products (with hard data to back it up).

Whether you're considering Puget Systems for your next build, or someone else, I hope this information helps you and saves you some of the headaches and frustrations that could have been. I'm including some extra pictures we took below, as a resource for anyone who wants them.