
https://www.pugetsystems.com

Read this article at https://www.pugetsystems.com/guides/1158

[FIXED] Watchdog DPC Violation Error with Multiple NVIDIA GPUs

Written on May 11, 2018 by Ken Colorossi

Introduction

Recently we have been seeing an issue in our support department with a specific hardware combination: the Asus X99E-10G WS motherboard and triple or quad NVIDIA GPU configurations. When running a workload that is heavy on both the CPU and GPUs, systems with this hardware can crash with a DPC_WATCHDOG_VIOLATION error, also known as a "blue screen of death" (BSOD).

We have created an instructional video to guide users through resolving this problem by reverting to an older NVIDIA graphics driver and then preventing Windows from automatically updating to a newer version. We are also currently coordinating with NVIDIA on a permanent fix. Updates will be published in this article as they become available.
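To confirm which driver a system is actually running after a rollback, or after Windows has updated it in the background, the installed version can be read from `nvidia-smi`. The following is a minimal sketch, not part of the video instructions. It assumes `nvidia-smi` is available on the PATH (it ships with the NVIDIA driver but may not be on the PATH by default), and the helper names are hypothetical.

```python
import subprocess
from typing import List, Tuple

# Known-good driver version recommended in this article.
KNOWN_GOOD = "382.53"


def parse_driver_versions(smi_output: str) -> List[str]:
    """Return one driver version string per non-empty line of nvidia-smi CSV output."""
    return [line.strip() for line in smi_output.splitlines() if line.strip()]


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn a version such as '382.53' into (382, 53) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_newer_than_known_good(version: str, known_good: str = KNOWN_GOOD) -> bool:
    """True when the given driver version is newer than the known-good one."""
    return version_tuple(version) > version_tuple(known_good)


def installed_driver_versions() -> List[str]:
    """Query nvidia-smi for the driver version reported by each GPU.

    Assumption: nvidia-smi is on the PATH of the account running this script.
    """
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=driver_version", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_driver_versions(result.stdout)


if __name__ == "__main__":
    try:
        versions = installed_driver_versions()
    except (FileNotFoundError, subprocess.CalledProcessError) as exc:
        print(f"Could not query nvidia-smi: {exc}")
    else:
        for gpu_index, version in enumerate(versions):
            status = "newer than" if is_newer_than_known_good(version) else "not newer than"
            print(f"GPU {gpu_index}: driver {version} is {status} {KNOWN_GOOD}")
```

Running the script after each Windows update gives a quick indication of whether the driver has been replaced with a newer version.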

Updates

Final Update 9/10/2018 [SOLUTION]

We have identified a solution that mitigates the DPC errors. While it does not completely solve the problem, systems were significantly more stable, with fewer crashes, in all of our testing. If you need to update to the newest NVIDIA driver in order to run a particular application, this adjustment will help you. If you do NOT need to update in order to run an application, we advise against deploying the adjustment. The adjustment involves editing the registry, so at this time we are not releasing a detailed guide; however, we are releasing an executable that will perform the registry edits for you.

Please navigate HERE and download the files to your affected computer. Once downloaded, run the application to perform the adjustment.

Alternatively, if you have any questions or would prefer that our Puget Systems Support staff perform the adjustment, please don't hesitate to reach out; we are here to help!


Update 6/13/2018

NVIDIA has stated that the watchdog BSOD with multiple GPUs on the X99-E 10G WS is now a "blocking issue", which means they cannot release a new driver unless it contains a fix.


Update 5/24/2018

With the release of new drivers, NVIDIA has removed the 382.53 driver from their list. You can access the download from the NVIDIA database here.


Update 5/15/2018

NVIDIA has been able to replicate this issue in their lab and is currently working on a solution.

How to install NVIDIA Graphics Driver v382.53

The video is embedded below but if you would prefer to read the instructions rather than watch it you can .

Conclusion

We hope you found the video above helpful!

If you do not own a Puget System, but are experiencing this issue and need assistance, please go here to submit a help request with NVIDIA directly. You may also want to contact the manufacturer of your system for additional support.

Need help with your Puget Systems PC?

If something is wrong with your Puget Systems PC, we are readily accessible, and our support team comes from a wide range of technological backgrounds to better assist you!

Contact Puget Systems Support

Looking for more support guides?

If you are looking for a solution to a problem you are having with your PC, we also have a number of other support guides that may be able to assist you with other issues.

Puget Systems Online Help Guides

Tags: Windows, BSOD, Asus, X99E-10G WS, NVIDIA, Multi, GPU, DPC, Watchdog, Violation, Fix, Solution, Video
David Houston

Yeah unfortunately this is not a good fix when you use software that requires the newest Nvidia drivers.

Posted on 2018-05-18 00:36:50
Dark Matter

I'm affected by this issue too. Unable to use Vray RT and Octane Render. Both need newer driver than 382.53. Very sad this problem is going on for 6 months now. Nvidia please wake up. Thanks Puget to make Nvidia acknowledge the problem.

Posted on 2018-06-11 22:01:54
Dark Matter

Thank Puget Systems for taking care of this problem and giving us this update! I really hope the problem will be fixed soon !

Posted on 2018-06-15 17:57:02
Alex

Any update on this? New nVidia driver does not contain the fix and still lists this issue as Unresolved.

Posted on 2018-06-26 16:37:21
Dark Matter

New Nvidia Driver has been released and the bug is still in the Known Issue. So apparently Nvidia didn't think it was a "blocking issue". That is really frustrating.

Posted on 2018-06-26 18:01:37
Ken Colorossi

Hello. Nvidia did publish a new driver that did not resolve the issue. However, we have been conducting internal testing and have found that the new driver + a specific registry edit DOES resolve the issue. We are working on putting together detailed instructions for implementing this fix. Please stay tuned to the article for an update in the next couple of days.

Posted on 2018-06-26 21:59:53
Sam

Hi!, if you are talking about TDR Delay it didn't fix the issue for me

Posted on 2018-06-27 13:59:22
Dark Matter

Thank you Ken. Looking forward to test it.

Posted on 2018-06-27 18:09:31
We'd

Thanks Kens. Really hope this works!

Posted on 2018-06-28 07:07:14
Dark Matter

Ken, do you have any update? We're hanging on your words

Posted on 2018-06-29 18:00:11
Ken Colorossi

Thank you for your inquiries, everyone. We are still working on our internal testing and hope to have something for you very soon. Our current solution is not yet stable enough for public release.

Posted on 2018-06-29 18:54:44
StedmanChet

I need a favor from someone using an x99-e 10g ws motherboard. I have been dealing with an sli issue on this particular motherboard for awhile and I do believe it is related to the PLX chip found on the board. I have rma'd the videocards, the motherboard and the memory. At this point I am ruling the memory and the videocards out. As for my request, can someone please SLI two (no more, no less) nvidia cards on this motherboard on slots one and five (cannot be any other slots) using the MOST RECENT drivers and tell me if they get any crashes, errors etc. I only ask for this to be done as I currently have two GTX 1070s in slots one and three (liquidcooled and the linking block only spans that far) and I am wondering if I sli the two cards across slots one and five (hence each card will be operating on a different PLX chip) will the issue cease. Currently I am limited to using only nvidia driver 382.53 (its the only one that works, no crashes, errors, BSODs). Without the latest drivers installed a lot of current games will not recognize my second videocard, some games wont even start (GTA V). Also I am using Windows 10 64 bit. Thanks guys.

Posted on 2018-07-09 15:39:31
Padi

The registry script above did not set my GPUs correctly in MSI mode. I found the MSI utility v2 linked here doing a better job.

https://forums.guru3d.com/t...

It did not fix my problem nonetheless.

Posted on 2018-07-16 00:29:03
James

Sorry for posting in an old thread, but thank you for this, I spent so much money on my system only to have it unstable for so long and this fix worked! I only have one question if anyone knows; Once the fix is applied is it permanent or do I have to re-apply it after every nvidia driver update? Thanks again!

Posted on 2019-12-16 05:47:24
Ken Colorossi

I'm glad this was able to help you. Unfortunately, this does need to be reapplied after every Nvidia driver update. We have found that GeForce Experience will alert you when an update is available. That way you can install the update and apply the fix right away, instead of Windows doing it in the background without your knowledge, which allows you to maintain a stable system. Side note: if you're running Quadro GPUs, I do not know if GeForce Experience will work.

Posted on 2019-12-16 15:20:24
burnforce

I would like to take a minute and thank your entire team for finding a solution for this problem, even if it's a temporary one. This debilitating issue has plagued my 4 GPU setup for two years now, ever since I upgraded to Windows 10. I've reinstalled the system software multiple times, got a brand new set of GPUS, a new harddrive, shipped my machine out to the company that built it TWICE and the only solution anyone ever had was to roll it back to Windows 7. Sure, this worked for a year but now that Windows 7 is no longer supported, I knew i'd have to find a better solution. I was literally on the verge of giving up until I stumbled upon your post...all because I happen to add "with multiple GPUs" to my google search parameters when searching for additional clues about this dreaded "DPC_WATCHDOG_VIOLATION" bsod. I don't know a whole lot about PCs (have always been a Mac guy) and this issue almost had me throwing in the towel, but today you folks have restored a teenie tiny bit of confidence into my PC workflow. Thank you.

Posted on 2020-02-16 21:15:35
Ken Colorossi

It is unfortunate that Nvidia is unable to implement this fix in their driver stack, but they do say this is a proper fix for the problem on systems equipped with PLX chips. Thankfully, Intel has released CPUs that have enough lanes to make this not a problem on future systems. I'm glad that my article was able to help you make your system more stable.

Posted on 2020-02-17 15:23:37
grue3d

hello, wow so i get the crashes all the time too since Windows 10 update 2 months ago. 4 nvidia gtx 1080ti cards and i believe it's the mobo but i'm not sure about the 10G part. anyway if i could ask a couple of questions: 1. is there a way to undo this once i run the executable? 2. what generally does it do? 3. since it has been a while, has nvidia addressed this in their driver - i have driver version 441.66 on my cards. i typically do not install geforce experience, i just try to get the driver only. thank you for this - for the first time i have HOPE!

Posted on 2020-03-11 13:44:42
Ken Colorossi

So this fix works on any motherboard that uses PLX chips. The Asus X99E-10G WS is just one, the Asus X99E WS would be another. To try and answer your questions.
1. To undo the changes made by this executable, you would simply need to reinstall your GPU drivers.
2. This tool forces the drivers to run in MSI (Message Signaled Interrupts) mode.
3. Unfortunately, Nvidia has not addressed this in their driver and can't. When we were working with them on this, they said it couldn't be added into the driver.

On a side note, with the release of the new TRX40 platform and the new Asus Pro WS C621-64L SAGE/10G, this fix won't need to be used on newer systems, because they have enough CPU lanes to not need the PLX chips.

Posted on 2020-03-11 15:52:19
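
The article does not publish the exact registry edits the tool performs, so the following is only a sketch of what forcing MSI mode generally looks like on Windows. It assumes the standard `MSISupported` value under a device's `Interrupt Management\MessageSignaledInterruptProperties` key; whether the Puget Systems executable does exactly this is an assumption. The function names and the example device instance path are hypothetical, and the script only builds the text of a `.reg` file without touching the registry.

```python
from typing import Iterable

# Root of per-device settings in the Windows registry.
ENUM_ROOT = r"HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Enum"

# Standard subkey Windows reads to decide whether a device may use MSI.
MSI_SUBKEY = (
    r"Device Parameters\Interrupt Management\MessageSignaledInterruptProperties"
)


def msi_key_path(device_instance_path: str) -> str:
    """Build the full registry key path holding the MSISupported value for one device."""
    return "\\".join([ENUM_ROOT, device_instance_path.strip("\\"), MSI_SUBKEY])


def msi_reg_file(device_instance_paths: Iterable[str], enabled: bool = True) -> str:
    """Return .reg file text that sets MSISupported to 1 (or 0) for each device."""
    value = 1 if enabled else 0
    lines = ["Windows Registry Editor Version 5.00", ""]
    for instance_path in device_instance_paths:
        lines.append(f"[{msi_key_path(instance_path)}]")
        lines.append(f'"MSISupported"=dword:{value:08x}')
        lines.append("")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical placeholder: the real instance path of each GPU is shown in
    # Device Manager under Details > Device instance path (NVIDIA is VEN_10DE).
    example_gpus = [r"PCI\VEN_10DE&DEV_0000\EXAMPLE_INSTANCE"]
    print(msi_reg_file(example_gpus))
```

Because the value lives under each device's own key, a sketch like this would need one entry per GPU, which may be why a driver reinstall resets it.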
grue3d

well, i think we've fixed it due to this thread/article! i and my IT guy didn't want to run the exe, but we researched it and found what needed to change. so far so good. i have two monitors and a sony 75" tv plugged into one card, the other three are not SLI'd or plugged into anything. i use for GPU rendering. so far no crashes at all since the mods were made. Ken thanks a ton!

Posted on 2020-03-12 22:11:44