Specs:
Mobo: ASRock Z77 Extreme4 Socket 1155
CPU: Intel Core i5 3570K 3.4GHz
GPU: Palit GTX 970 x2 SLI
RAM: Corsair Vengeance Performance 16GB (2x8GB) DDR3 1600MHz
PSU: Corsair 1050W HX Series Modular (3 years old)
OS: Windows 7 Home Premium 64bit
Case: Corsair Carbide 300R Case
Monitor: Samsung TV
Recently I had my second 970 replaced (it had a fan fault), and after I installed it, Nvidia Control Panel wasn't picking it up. I didn't look too much into it, as one of the first things I did was update my Nvidia drivers to 358.50, and that did the trick. Not long after, I started getting BSODs, although they don't seem to be related to the GPU but rather to my CPU - disabling my 0.8GHz overclock stopped the BSODs (it had been overclocked to a constant 4.2GHz for probably over a year now, so probably wear and tear).
Unfortunately I didn't realise this before removing my second GPU and running the Plan9 visualiser (which turned out to be a reliable way to trigger the BSODs) - basically what I had done was run the test with both GPUs (crash), then remove the second GPU that had been replaced and test again (crash). THEN I decided to disable the overclock and hey presto, no more BSOD!
But now I have another problem - when I put the second GPU back in, it had the same problem as before, where Nvidia Control Panel won't recognise the SLI setup (the option to enable SLI isn't present - there's just a "Configure Surround, PhysX" tab instead). Easy, right? Just update the drivers again, it worked last time. Well, I was already on the most recent version, so I uninstalled the drivers using Display Driver Uninstaller and then re-installed 358.50. Nope. Ooookay, so I rolled back to the previous version, 355.98. Still nope. Updated to 358.50 from there, still no dice. I've yet to roll back further, but I'm not sure it's worth the time spent on trial and error (please let me know if it is, although it's crazy to think that could solve the problem).
So I entered the boot menu in the BIOS - the system browser recognises both 970s just fine. Had a look in Device Manager and aha! A little yellow icon denoting troubles afoot on the second 970. "Windows has stopped this device because it has reported problems. (Code 43)" Googled this and from what I can tell the best answer is "something is wrong."
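For anyone who wants to check the same status without clicking through Device Manager, here is a rough sketch of how it could be read from a script. It assumes the wmic tool that ships with Windows 7 and the ConfigManagerErrorCode property of Win32_VideoController; the function names are just made up for illustration, and the output encoding may need adjusting on some systems.

```python
import subprocess

# Assumption: wmic is on the PATH (it ships with Windows 7) and
# Win32_VideoController reports ConfigManagerErrorCode (0 = working).
WMIC_CMD = ["wmic", "path", "Win32_VideoController", "get",
            "Name,ConfigManagerErrorCode", "/format:list"]

def parse_wmic_list(text):
    """Turn 'Key=Value' lines into one dict per device.

    A new device starts whenever a key repeats, so stray blank
    lines in the wmic output do not matter.
    """
    devices, current = [], {}
    for raw in text.splitlines():
        key, sep, value = raw.strip().partition("=")
        if not sep:
            continue  # blank line or anything that is not Key=Value
        if key in current:
            devices.append(current)
            current = {}
        current[key] = value
    if current:
        devices.append(current)
    return devices

def faulty_devices(devices):
    """Keep only devices whose error code is set and non-zero."""
    return [d for d in devices
            if d.get("ConfigManagerErrorCode", "") not in ("", "0")]

def query_video_controllers():
    """Run wmic (Windows only) and return the parsed device list."""
    out = subprocess.run(WMIC_CMD, capture_output=True, text=True).stdout
    return parse_wmic_list(out)
```

On the affected machine, calling faulty_devices(query_video_controllers()) should list whichever 970 is showing Code 43, if the assumptions above hold.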
So I've swapped the 970s around, and it's whichever 970 is in PCI bus 2 that has error code 43. To double-check this I also tried just one 970 at a time in bus 1, and both worked fine (I've not done the same with just bus 2, as I'm not sure if that would work).
I've yet to try a different SLI bridge, but I don't think that's the problem (I will get another one if people recommend I do so, though this seems to be more about what is causing the error code 43). I'm a bit worried that a short circuit that happened about a year ago may have turned my mobo (or at least that region) into a ticking time bomb. I had a network card that didn't quite fit the case, so it wasn't screwed in super securely. It came loose without me realising, which caused a short - the contacts on the network card showed burns. Admittedly I forget which port it went into, but I think it may have been the PCIe 2 port, which my second 970 is in. I might be worrying about nothing though.
Any suggestions would be hugely appreciated - I'm not the most tech savvy and I'm more or less out of ideas. It's been one problem after the next with this PC and it's somewhat exhausting!
Also, as an aside, if anybody would be willing to peruse my BSOD reports, please let me know. As I said, it seems to be related to the CPU overclock, but perhaps there's a more specific problem that I'm unaware of (and perhaps even one related to this GPU problem). Again, much appreciation for anyone who is up for taking a look at that.
