My computer has been freezing at irregular times for 3 weeks now.
Please note that this question changes with each thing that I try. (For additional details)
What happens
My computer freezes, the video stops. (No graphic glitches, it just stops)
Sound keeps playing up to some time (Usually 10-30 seconds) then stops playing.
Sometimes, randomly, the screen on my G-15 keyboard flickers and I see characters in the wrong places. This usually lasts about 1-2 seconds and happens shortly before my computer freezes.
I have to keep the power button pressed for 4 seconds to shut my computer down.
I still hear my hard drives and fans working.
Sometimes it works with no problems for a full day, some other times it just keeps freezing each time I restart my computer and I have to leave it for the rest of the day.
Sometimes my mouse freezes for a fraction of a second (like 0.01 to 0.2 seconds) quite randomly, usually before the computer freezes.
No errors spotted by the "Action center" unlike when I had problems with my last video card on this system (Driver errors).
My G-15 LCD screen also freezes.
Sometimes my G-15 LCD screen flickers and characters get temporarily displaced under heavy load.
Now, most of the time, the BIOS hard disk boot order gets reversed for some reason and I have to set it back to the right one and save each time I boot. (Might be unrelated, not sure, but it first started yesterday)
Sometimes the BIOS doesn't detect my 750GB hard drive plugged in SATA1.
What I did so far
I had similar problems in the past and had to change my hard drive (it was faulty), so I tested my software RAID-0 array; it was faulty, so I replaced it. (I reinstalled Windows 7 at that point.) I also tested with my secondary hard drive unplugged.
My CPU was running at about 100 degrees Celsius. I removed the dust between the fans and the heatsink and it's now between 45 and 55.
I ran a CPU stress-test and it didn't freeze during the tests (using Prime95 on all cores)
Ran a memory test (using memtest86+) for a single pass and there were no errors.
Ran a GPU stress test with ati-tools and furmark and it didn't freeze during the tests. (No artefacts either)
I had trouble with my graphics card when I got it, but I think it got fixed with a driver update.
I checked the voltages in my BIOS setup and they all seemed ok (±0.2 I think).
I have run the computer without problems with Fedora 15 on an external hard drive (apart from the fact that it couldn't load Gnome 3 and was reverting to Gnome 2; I didn't want to install drivers since I use it on multiple computers). I used it to back up my files from the RAID array to my 1TB hard drive for the reinstallation of Windows. (So the crashes only happened on Windows) [The external hard drive is plugged directly into a SATA port]
I contacted EVGA (my graphics card vendor) and pointed them to this question; I'm waiting for an answer.
Ran sensors on Fedora 15 and got this output: http://pastebin.com/0BHJnAvu
Ran 6 different short CPU stress tests on Fedora 15 (I haven't found any complete stress testers for Linux) and it didn't crash.
Changed the thermal paste to Arctic Silver 5 for my CPU and stress tested the CPU: the temperature was 50 at idle, peaked at 64 and slowly went down to 62 during the test.
Ran some stress testing with a temporary graphics card and it went OK.
Ran a Furmark stress test with my original graphics card and it froze again. The GPU temperature was 74C, the CPU 58C and the motherboard 40C or 45C (I don't know which reading is which in SpeedFan).
Ran a furmark stress test and a CPU stress test at the same time, results: http://pastebin.com/2t6PLpdJ
I have been using my computer without stressing it for about 2 hours now and no crashes yet. I have also disabled the AMD Cool'n'Quiet function in the BIOS for more regular power to the CPU. When I ran Furmark without Cool'n'Quiet alongside a CPU stress test, my computer didn't freeze, but I got a "Driver Kernel Error" that recovered (and Furmark crashed). The computer eventually froze while I was away from it, but this time my screen just went to sleep and I couldn't wake it.
Using the stability tester in nTune, my computer froze again (in the same manner as before). I noticed that SpeedFan reports -16.97V for the -12V rail and -8.78V for the -5V rail. I wonder whether these numbers are reliable and whether they are good or bad.
I swapped my G-15 for a basic USB keyboard (HP) and ran Furmark for about 10 minutes, with a CPU stability test running for 30 seconds every 60 seconds, and my computer hasn't crashed yet.
Ran some more extended tests without my G-15 and it froze like it usually does.
Removed the nForce hard disk controller.
Disabled command queuing in the NVIDIA nForce SATA Controller for both port 0 and port 1 (Errors from the logs)
Used CPUID HwMonitor, here are the voltages: http://pastebin.com/dfM7p4jV
Changed some configurations in the motherboard BIOS: Disabled PEG Link Mode, Changed AI Tuning to Standard, Disabled the 1394 Controller, Disabled HD Audio, Disabled JMicron RAID controller and Disabled SATA Raid.
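As a plausibility check on the SpeedFan readings mentioned above (-16.97V for the -12V rail, -8.78V for the -5V rail), here is a minimal Python sketch that compares a reading with the nominal ATX rail voltage. The tolerances (±5% on the positive rails, ±10% on the negative ones) are what I understand the ATX power supply design guides to specify, and the table and check_rail helper are illustrative names of my own, not part of any monitoring tool. Software sensors often misreport the negative rails, so an out-of-range result here does not by itself prove the power supply is faulty.

```python
# Nominal rail voltage and allowed deviation (fraction of nominal).
# Assumption: tolerances as commonly quoted from the ATX design guides.
ATX_RAILS = {
    "+3.3V": (3.3, 0.05),
    "+5V": (5.0, 0.05),
    "+12V": (12.0, 0.05),
    "-12V": (-12.0, 0.10),
    "-5V": (-5.0, 0.10),  # only present on older power supplies
}


def check_rail(name, reading):
    """Return (within_tolerance, relative_deviation) for one rail reading."""
    nominal, tolerance = ATX_RAILS[name]
    deviation = (reading - nominal) / nominal
    return abs(deviation) <= tolerance, deviation


# The two readings reported by SpeedFan in this question.
for rail, reading in [("-12V", -16.97), ("-5V", -8.78)]:
    ok, deviation = check_rail(rail, reading)
    status = "within tolerance" if ok else "OUT OF TOLERANCE"
    print(f"{rail}: {reading:.2f}V, {deviation:+.1%} from nominal, {status}")
```

Both readings come out far outside the assumed tolerance (roughly 41% and 76% away from nominal), so they would need to be confirmed with a multimeter or another monitoring tool before drawing any conclusion.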
When it happens
When I play video games (Mostly)
When I play Flash games (second most)
When I'm looking at my desktop background (It rarely happens when I have a window open, but it does, sometimes)
When my graphics card and my CPU are stressed.
Sometimes when my graphics card is stressed.
Never happened while stressing only the CPU. Sometimes when my CPU is stressed.
Specs
Windows 7 Home Premium x64
Motherboard: M2N-SLI Deluxe
CPU: AMD Phenom 9950 x2 @ 2.6GHz
Memory: Kingston 4x2GB Dual Channel (Pretty basic memory sticks)
Hard drives: Was 2x250GB (Western Digital Caviar) in RAID-0 + 1TB (WD Caviar Black); I replaced the RAID array with a 750GB (WD Caviar Black) [Yes, I removed the array from the RAID configuration]
750W Power supply
No overclocking. Ever.
There were some power outages about 4-5 weeks ago, but the problem didn't start immediately after. (I wasn't home, so my computer got shut down)
Event logs (Warnings, errors and critical errors) for the last 24 hours: http://pastebin.com/Bvvk31T7
My current to-try list
Reinstall the drivers and software 1 by 1 and do extensive stress testing between each.
Update the BIOS firmware to the most recent stable one.
Change my motherboard.
Status updates
Keeping only the last 3
(28/06 04pm) More stress testing and it still passes the tests.
(28/06 03pm) Been stress testing for 10 minutes straight now, 5 of them with both CPU and GPU being stressed at the same time.
(28/06 03pm) Stress-testing right now, so far no problems.
A little hope
Tests with Furmark and Prime95.
Testing Windows bare-bone: 30 Minutes stress, no freeze.
Installing an Anti-virus and some software, restarting computer.
Testing with Anti-virus and some software (No drivers installed): 30 Minutes stress, no freeze.
Installing audio drivers, restarting computer.
Testing with the audio drivers: 30 Minutes stress, no freeze.
Installing the latest graphic drivers from EVGA's website (without 3d vision since I don't use it), restarting computer.
Testing with the graphic drivers: 30 Minutes stress, no freeze.
Configuring Windows to my liking and installing more software.
In this situation, how can I pinpoint the current hardware problem (if it is a hardware problem)? I don't really have the budget to just give up and replace everything, and I don't have spare hardware to swap in for testing.