Hi,
I'm having frequent crashes (1 per week) on a Windows 2012 Hyper-V server.
Server: Lenovo TS430 0441GU -- Brand new server (around 2 months old right now)
Storage: RAID 10
RAID Controller: LSI MR9240-8i SCSI
No cluster; workgroup member. Running without the GUI. It's a full Windows 2012 Standard install with Hyper-V as the only role and 2 Windows 2012 Std VMs on top of it.
Here's one memory dump file:
https://skydrive.live.com/redir?resid=E4577B4064C0496B!1518&authkey=!AIqSAQT_GR3qd-0&ithint=file%2c.DMP
Here's an analysis of that dump file:
*******************************************************************************
*                                                                             *
*                        Bugcheck Analysis                                    *
*                                                                             *
*******************************************************************************
HYPERVISOR_ERROR (20001)
The hypervisor has encountered a fatal error.
Arguments:
Arg1: 0000000000000011
Arg2: 000000000020f165
Arg3: 0000000000001005
Arg4: ffffe80004203cb0
Debugging Details:
------------------
BUGCHECK_STR: 0x20001_11_20f165
DEFAULT_BUCKET_ID: WIN8_DRIVER_FAULT
PROCESS_NAME: System
CURRENT_IRQL: f
ANALYSIS_VERSION: 6.3.9600.16384 (debuggers(dbg).130821-1623) amd64fre
LAST_CONTROL_TRANSFER: from fffff8020e58094c to fffff8020e46f440
STACK_TEXT:
fffff880`02c85bd8 fffff802`0e58094c : 00000000`00020001 00000000`00000011 00000000`0020f165 00000000`00001005 : nt!KeBugCheckEx
fffff880`02c85be0 fffff802`0e5ef0a8 : 00000000`00000002 fffff880`02c7a180 fffffa80`19185638 fffff802`0e4230c9 : nt!HvlNmiCallbackRoutine+0x54
fffff880`02c85c20 fffff802`0e46c102 : fffff880`02c7f5c8 fffff880`02c85e30 00000000`00000002 00000000`00000001 : nt! ?? ::FNODOBFM::`string'+0x14702
fffff880`02c85c70 fffff802`0e46bf73 : 00000000`00000000 00000000`00000000 00000000`00000000 00000000`00000000 : nt!KxNmiInterrupt+0x82
fffff880`02c85db0 fffff802`0e5b0664 : 00000000`00000001 fffffa80`19185638 00000000`00000000 00000000`00000001 : nt!KiNmiInterrupt+0x173
fffff880`02ca3890 fffff802`0e4c22ec : fffff880`00000001 fffffa80`19185828 fffffa80`19185550 fffff880`02c85f40 : nt!PpmIdleGuestExecute+0x1c
fffff880`02ca38c0 fffff802`0e4c1be0 : 0000006b`46778727 00000002`e1ad6536 00000023`7766156f fffffa80`19185550 : nt!PpmIdleExecuteTransition+0x47b
fffff880`02ca3ae0 fffff802`0e49898c : fffff880`02c7a180 fffff880`02c7a180 00000000`00000000 fffff880`02c85f40 : nt!PoIdle+0x460
fffff880`02ca3c60 00000000`00000000 : fffff880`02ca4000 fffff880`02c9e000 00000000`00000000 00000000`00000000 : nt!KiIdleLoop+0x2c
STACK_COMMAND: kb
FOLLOWUP_IP:
nt!HvlNmiCallbackRoutine+54
fffff802`0e58094c cc int 3
SYMBOL_STACK_INDEX: 1
SYMBOL_NAME: nt!HvlNmiCallbackRoutine+54
FOLLOWUP_NAME: MachineOwner
MODULE_NAME: nt
IMAGE_NAME: ntkrnlmp.exe
DEBUG_FLR_IMAGE_TIMESTAMP: 51a966cd
IMAGE_VERSION: 6.2.9200.16628
BUCKET_ID_FUNC_OFFSET: 54
FAILURE_BUCKET_ID: 0x20001_11_20f165_nt!HvlNmiCallbackRoutine
BUCKET_ID: 0x20001_11_20f165_nt!HvlNmiCallbackRoutine
ANALYSIS_SOURCE: KM
FAILURE_ID_HASH_STRING: km:0x20001_11_20f165_nt!hvlnmicallbackroutine
FAILURE_ID_HASH: {aeb24d2a-d325-b417-9e98-44210e977143}
Followup: MachineOwner
---------
2: kd> lmvm nt
start end module name
fffff802`0e415000 fffff802`0eb61000 nt (pdb symbols) C:\ProgramData\dbg\sym\ntkrnlmp.pdb\E2A28FBB5A694B22910DBF6F2F0CA7522\ntkrnlmp.pdb
Loaded symbol image file: ntkrnlmp.exe
Image path: ntkrnlmp.exe
Image name: ntkrnlmp.exe
Timestamp: Fri May 31 23:13:17 2013 (51A966CD)
CheckSum: 006B3AE4
ImageSize: 0074C000
File version: 6.2.9200.16628
Product version: 6.2.9200.16628
File flags: 0 (Mask 3F)
File OS: 40004 NT Win32
File type: 1.0 App
File date: 00000000.00000000
Translations: 0409.04b0
CompanyName: Microsoft Corporation
ProductName: Microsoft® Windows® Operating System
InternalName: ntkrnlmp.exe
OriginalFilename: ntkrnlmp.exe
ProductVersion: 6.2.9200.16628
FileVersion: 6.2.9200.16628 (win8_gdr.130531-1504)
FileDescription: NT Kernel & System
LegalCopyright: © Microsoft Corporation. All rights reserved.
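In case it helps anyone reading the dump: as far as I can tell, the BUGCHECK_STR / BUCKET_ID value just packs the bugcheck code and the first two arguments together (0x20001, Arg1 = 0x11, Arg2 = 0x20f165 above). Here is a minimal Python sketch of how I'm reading it. The function name is my own, and the field layout is an assumption based only on the values in this dump, not on any documented format.

```python
def split_bucket_string(bucket: str) -> dict:
    """Split a string like '0x20001_11_20f165' into integer fields.

    Assumed layout (inferred from this dump only):
    <bugcheck code>_<Arg1>_<Arg2>[_<symbol>], all values in hex.
    """
    # Only the first three underscore-separated fields are numeric;
    # anything after them (e.g. 'nt!HvlNmiCallbackRoutine') is ignored.
    code, arg1, arg2 = bucket.split("_")[:3]
    return {
        "bugcheck_code": int(code, 16),  # int() accepts the '0x' prefix in base 16
        "arg1": int(arg1, 16),
        "arg2": int(arg2, 16),
    }


fields = split_bucket_string("0x20001_11_20f165")
print({name: hex(value) for name, value in fields.items()})
```

The same call works on the longer FAILURE_BUCKET_ID string, since the trailing symbol name is simply dropped.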
I'm wondering if my storage system has any problems because:
1) I'm getting these messages in the System event log:
The description for Event ID 56 from source Application Popup cannot be found. Either the component that raises this event is not installed on your local computer or the installation is corrupted. You can install or repair the component on the local computer.
If the event originated on another computer, the display information had to be saved with the event.
The following information was included with the event:
SCSI
000000
the message resource is present but the message is not found in the string/message table
2) Also...
The IO operation at logical block address 800 for Disk 3 was retried.
(I have 2 USB external drives plugged in, and I don't know how to identify the disks using the \Device\Harddisk#\... notation, so the message could refer to those external drives. According to diskpart, however, my disks are 0, 1 and 2, where 0 is my system RAID volume and 1 and 2 are the USB drives.)
3) I tried updating the LSI firmware yesterday. It got to 70% and then the server crashed. I tried twice, with the same result both times.
The MegaCli64 command doesn't report any issues with either the controller or the disks, which is why I'm not sure the storage is at fault.
Any help would be appreciated.
Thanks.
S.