vSphere network troubleshooting
Tuesday, April 20th, 2010, by Erik Scholten
During the last month I have been very busy building a new infrastructure at a client site. I’m responsible for the overall technical solution and its foundation, a VMware vSphere infrastructure built on five Dell PowerEdge R805s, Dell EqualLogic PS5000 and 6000 storage and Cisco switches for LAN, DMZ and IP storage networking.
Just before the customer initiated their functional test period we discovered that the overall Windows network performance was slow. We ran several tests, such as copying an 8 GB file from local vmdk to local vmdk and from VM to VM, and found that storage performance was not the issue but network performance was very slow.
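To put a number on "slow", it helps to know what the 8 GB copy test should take in the best case. The sketch below is only a back-of-the-envelope calculation, not something we ran on site: it assumes decimal gigabytes, ignores protocol overhead, and the function names and the 640-second timing are hypothetical values chosen for illustration.

```python
def ideal_transfer_seconds(size_gb, link_mbps):
    """Best-case time to move size_gb (decimal gigabytes) over a link of link_mbps."""
    size_megabits = size_gb * 1000 * 8  # GB -> MB -> megabits
    return size_megabits / link_mbps


def observed_throughput_mbps(size_gb, elapsed_seconds):
    """Effective throughput in Mbps for a copy that took elapsed_seconds."""
    return size_gb * 1000 * 8 / elapsed_seconds


if __name__ == "__main__":
    # An 8 GB file over a gigabit link, with no overhead at all.
    print(ideal_transfer_seconds(8, 1000))
    # Hypothetical measurement: the same copy taking 640 seconds.
    print(observed_throughput_mbps(8, 640))
```

A real copy will never reach the ideal figure, but a result that is many times slower than it points at the network rather than at normal overhead.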
In the last few years that I have been working with virtualization I have always been a fan of a static network configuration. Meaning, when I configure ESX networking I like my network interfaces and physical switch ports to be configured at 1000 Mbps full duplex if the switch/network interface combination allows it. The idea is that if you purchase gigabit network interfaces and switches, you know the maximum speed. So you configure it to run at its maximum capacity, eliminating negotiation overhead and using as much bandwidth as possible purely for data transfer.
So when we experienced slow network performance I had a colleague check the Cisco LAN switches for errors, drops, packet loss or any other flaw which might indicate a speed or duplex mismatch. None were found, so I assumed that the network configuration was not the issue. But as we know by now, ‘Assumption is the mother of all fuck-ups!’.