Here's the scenario
Custom Server:
Dual 16-core AMD Opteron processors
Asus Server MB
132GB RAM
8TB Hard drive in RAID 50 (DATA)
512GB Hard drive in RAID 10 (OS)
2 onboard Intel server NICs
1 added Broadcom server NIC
1 added dual-port Intel NIC
1 onboard remote management NIC
The host OS is Hyper-V Server 2008 R2, running 4 VMs: SBS 2011, two Server 2008 R2, and one Server 2008 Web Edition. Each VM has a dedicated NIC and IP address. All NICs connect to a 24-port Netgear layer 3 switch. That switch uplinks to a 3Com 48-port layer 2 switch, which then connects to a Netgear 48-port layer 3 switch. Most of the cabling between the switches and the workstations is CAT3, but the cable from the server to the switch is CAT5e.
Here is the problem: the network crashes. When it does, the system logs on the host OS fill up with virtual network errors, and the switches flip out and cannot pass traffic. In past outages, the virtual network was broken and had to be rebuilt. Now the network breaks without warning and takes longer to fix each time. When the network goes down, the server stays up, but the VMs cannot ping each other. Shutting the server off and pulling its power cord does not resolve the issue. All of the switches have to be powered off for about 5 minutes before the network comes back up.
I believe the issue is the network and the cabling, but we didn't have these problems before we put this server into production, and I am taking a lot of heat over this. I have a ticket open with Microsoft regarding the system logs, but they are stumped and have escalated it. The server performs as expected and has no issues other than the network. About 50 users hit this server. I cannot reproduce the issue on demand and cannot predict when it will occur next. It has crashed at 2 AM and at 2 PM, so workload does not seem to be the cause.
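Since I can't trigger the failure on demand, one thing I could do is log which devices drop off first so there are timestamps to line up against the host's event logs. Below is a minimal sketch of that idea, not something I have running yet. The function names (`tcp_reachable`, `outage_windows`, `monitor`) and the addresses in the usage note are made up for illustration, and it assumes each target accepts TCP connections on some known port (for example 445 on the Windows VMs, or the web management port on a switch).

```python
import socket
import time


def tcp_reachable(address, port, timeout_s=1.0):
    """Return True if a TCP connection to (address, port) opens within timeout_s."""
    try:
        with socket.create_connection((address, port), timeout=timeout_s):
            return True
    except OSError:
        # Covers timeouts, refused connections, and unreachable networks.
        return False


def outage_windows(samples):
    """Collapse ordered (timestamp, reachable) samples into (start, end) windows.

    An outage still in progress at the last sample is reported with end=None.
    """
    windows = []
    start = None
    for timestamp, reachable in samples:
        if not reachable and start is None:
            start = timestamp  # first failed probe opens a window
        elif reachable and start is not None:
            windows.append((start, timestamp))  # first good probe closes it
            start = None
    if start is not None:
        windows.append((start, None))
    return windows


def monitor(targets, rounds, interval_s=5.0, probe=tcp_reachable):
    """Probe every target `rounds` times and return outage windows per target.

    `targets` maps a label to an (address, port) pair.
    """
    history = {name: [] for name in targets}
    for _ in range(rounds):
        now = time.time()
        for name, (address, port) in targets.items():
            history[name].append((now, probe(address, port)))
        time.sleep(interval_s)
    return {name: outage_windows(samples) for name, samples in history.items()}
```

Run from a workstation with something like `monitor({"sbs-vm": ("192.0.2.11", 445), "core-switch": ("192.0.2.2", 80)}, rounds=720)` (placeholder addresses), this would show whether the VMs or the switches become unreachable first, which might help answer whether the server is taking down the network or the other way around.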
I'm pretty desperate for help, advice, or guidance here. Could this be electrical noise? Bad cabling? A bad NIC? A server that crashes its own virtual network and takes the rest down with it? Does any of this make sense? Is this a network issue that's affecting the server, or a server issue that's affecting the network?
Thank you very much!