
BSODs. Can use some help analyzing crash dumps.


3 replies to this topic

#1 migit5324

  • Members
  • 3 posts
  • Gender:Male

Posted 19 January 2017 - 01:34 AM

Hi - 

 

My PC has been BSODing recently. It started after I restarted my PC a few days ago. I restart it maybe once a month, so the cause could have been anything. The only change I remember making was switching the page file from a 2 GB-8 GB range to system managed, which set it to 32 GB. The page file was filling up even though there was tons of RAM free, so I figured I'd give it some more because I didn't know why it was filling up. I changed it back and the errors kept coming. I then turned the page file off completely, since I have 32 GB of RAM, and it still happened. This PC is running Windows 10 Enterprise, and it's worth noting that it has an Areca RAID card in it with a couple of arrays (RAID 10 and RAID 6). It also has an AMD 7970 graphics card. The PC is used for gaming and as a Plex server. The first recent BSOD happened while playing WoW. After that first BSOD, it was chaining them immediately after logging in and starting some programs (everything.exe comes up in some of the dumps) until I went into safe mode for a while (I was backing up my PC).

 

 

I took a shot at analyzing the crash dumps. Half of them point to memory corruption, so I ran memtest overnight, and it only came up with a few warnings (no solid errors). The warnings were for this case: http://www.memtest86.com/troubleshooting.htm#hammer  My RAM is not overclocked (nothing is), but just to make sure, I underclocked it from 1600 MHz to 1333 MHz. It happened again after that. I ran a check disk and it was clean (the SSD is a few months old). After taking it out of safe mode, underclocking my RAM, and waiting 18 hours or so since the last BSOD, it ran WoW fine for a few hours until it crashed again.

 

 

I ran some graphics card memory tests. One of them (MemTestCL) came up with a ton of errors for one specific test (random), but people online doubt the program's abilities. I ran most of the tools listed here: https://www.raymond.cc/blog/having-problems-with-video-card-stress-test-its-memory/  The first program there only tests 1 GB of RAM, and forcing it to 3 GB didn't work. It came up with no errors for 1 GB. The fourth (EVGA OC Scanner X) came up with 0 artifacts. One interesting thing here is that my page file filled up completely during one of these tests.

 

 

 

Three of the dumps in WinDbg point to: FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE_32

The others point to DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT   --- Probably caused by : ntkrnlmp.exe ( nt! ?? ::FNODOBFM::`string'+12cb8 )
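
For anyone repeating this kind of triage, the bucket lines can be tallied across several dumps with a short script. This is only a sketch: it assumes the analysis text for each dump has already been saved as a string, and the function name `tally_buckets` and the sample strings are made up for illustration, not taken from the attached dumps.

```python
import re
from collections import Counter

# Matches lines such as "FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE_32".
BUCKET_LINE = re.compile(
    r"^[ \t]*(FAILURE_BUCKET_ID|DEFAULT_BUCKET_ID):\s*(\S+)", re.MULTILINE
)

def tally_buckets(reports):
    """Count (field, value) pairs across analysis text, one string per dump."""
    counts = Counter()
    for text in reports:
        for field, value in BUCKET_LINE.findall(text):
            counts[(field, value)] += 1
    return counts

# Hypothetical sample text, shaped like the lines quoted above.
sample_reports = [
    "FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE_32\n",
    "DEFAULT_BUCKET_ID:  WIN8_DRIVER_FAULT\n",
    "FAILURE_BUCKET_ID:  MEMORY_CORRUPTION_LARGE_32\n",
]

for (field, value), n in tally_buckets(sample_reports).most_common():
    print(f"{n}x {field}: {value}")
```

Grouping by bucket makes it easier to see whether the crashes cluster around one cause or split into separate problems.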

 

I'm still looking into this, but I'm curious what you guys find as a possible root cause here. Since memtest came back fine (for the most part), I'm not sure why I'm getting memory corruption issues.

 

I'm currently updating my drivers and Windows to see if that helps anything. I'm also going to replace the fans on my GPU since one of them doesn't spin as fast as the other. I'm tempted to just reformat and see if the issue goes away...

 

Thoughts?

 

 

Moved from Windows 10 to BSOD

NickAu

Attached Files


Edited by NickAu, 19 January 2017 - 03:42 AM.
Mod edit



 


#2 rarson

  • Members
  • 45 posts
  • Gender:Male

Posted 19 January 2017 - 11:40 AM

The bug check code you're getting seems to indicate either a hardware issue or a driver issue. I would recommend updating the drivers first. If it's a hardware issue, it might involve overheating or it could be faulty hardware, so I recommend stress testing the CPU and GPU while monitoring temperatures to see if anything is abnormal. I usually use HWMonitor to monitor temperatures, Prime95 to stress the CPU, and FurMark to stress the GPU. If the issue is related to overheating, it should be relatively obvious. How many passes did you run on Memtest86+? I usually run at least 8 passes, but I've seen failures show up after almost 30 successful passes on several occasions. Memory is notoriously difficult to test thoroughly. Obviously, with as much RAM as you have, even 8 passes is going to take a very long time.



#3 migit5324

  • Topic Starter

Posted 19 January 2017 - 02:38 PM

The GPU stress test managed to crash my PC. It was running at about 100 degrees C for a few minutes, then I walked away and it crashed with the same bugcheck error. I'm going to replace the fans on the GPU and run old drivers to see if that makes it more stable. I cloned my old SSD to the new one 3 months ago, so it still has everything on there. The memtest was almost done with its 5th pass when I stopped it. That took 8 hours to do and I needed the PC back. How would a bad GPU cause memory corruption BSODs, though? The dumps for the memory corruption don't point to any graphics-related programs.
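
As a rough guide to how long a longer memtest run would take on this machine, the figures above can be extrapolated. The sketch below assumes the run time scales linearly with the number of passes and rounds "almost 5 passes in 8 hours" up to 5, so it slightly underestimates; the function name `estimate_hours` is made up for illustration.

```python
def estimate_hours(target_passes, observed_passes=5.0, observed_hours=8.0):
    """Scale an observed run time linearly to a target number of passes."""
    hours_per_pass = observed_hours / observed_passes
    return target_passes * hours_per_pass

# 8 passes is the suggested minimum; 30 is where late failures have been seen.
for passes in (8, 30):
    print(f"{passes} passes: about {estimate_hours(passes):.1f} hours")
```

On those assumptions, 8 passes comes to about 12.8 hours and 30 passes to about 48 hours, so a thorough run needs to be planned around a stretch when the PC is not in use.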

#4 migit5324

  • Topic Starter

Posted 19 January 2017 - 06:03 PM

After installing the new fan kit, the GPU temps dropped to a more normal level (85 degrees). I bought the wrong fan kit, so I only managed to get 1 of the fans installed... I'm overnighting another fan kit to make sure this thing has proper cooling.

 

The bad fan literally fell off when I touched it. The fan blades, that is... The part that was screwed in stayed.

 

I'm thinking the memory corruption BSODs were related to the page file somehow. I believe I just had two separate issues occur at the same time, which makes troubleshooting a lot harder.

 

Plus, while I was thinking "it's the GPU" the whole time, I couldn't help but think I was thinking that so I'd have an excuse to upgrade. Guess I should listen to my gut more often.





