Hi All
I have recently been doing a full HP Flex-10 installation with VMware and NetApp storage. I have done a few of these in the past, and I have always struggled to calculate how the bandwidth should be split between the vNICs presented to the blades.
There are a few posts out there and they all point to gathering customer requirements. Most customers, when asked this, will just say "What do you recommend?", and that is what I would expect from most of my customers. I am a consultant and I should advise them correctly on this, as they're completely new to the concept and the technology.
During a full project life cycle this is something I am more than comfortable with. We would perform a complete capacity planning exercise, and from that we could see the bandwidth actually needed by the existing storage and network connectivity, as long as the system was not bottlenecking. However, most of the HP Flex-10 projects I seem to get handed are half completed: the sale and the design have been done and we are asked to complete the implementation. That is not something I am a fan of, but unfortunately it is common in the channel consultancy sector.
So I am sitting in front of the customer and deciding how I should carve up the bandwidth. Being 100% sure is impossible, so I decided to use the reference architecture produced by HP for vSphere 4.0, located here.
This details the following breakdown.
This makes a lot of sense in what it suggests:
ESXi management is given a 500Mb share - VMware best practice is to dedicate two physical NICs for failover, but VMware do not define a best practice for the bandwidth of management traffic. Customers using physical separation of networks will often put it on a 100Mb switch, which will more than cope with vCenter agent traffic and heartbeat traffic.
vMotion and Fault Tolerance are given a 2.5Gb share - Most of my designs now have these two sitting on the same vSwitch with separate portgroups. To guarantee that 1Gb of bandwidth is provided uninterrupted to both portgroups, each portgroup has its own active NIC, which is presented to the other portgroup as a standby adapter to provide redundancy.
iSCSI is given a 4Gb share - This largely depends on the backend storage; what I am trying to say is that providing 4Gb of bandwidth to the blade is pointless if the backend storage only has 2x 1Gb connections. The sizing in the reference architecture just so happened to tie in with what was configured on the backend of the storage I was using: each storage processor had 2x 10Gb connections, and the storage array had two storage processors.
Virtual machine traffic (multiple networks) is given a 3Gb share - This was the remaining bandwidth, and after reviewing some low-level perfmon and switch statistics from the existing physical infrastructure, 3Gb was more than enough.
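If you want to sanity check your own carve-up, here is a minimal sketch of the arithmetic. The traffic names and shares are simply the ones listed above, and the 10Gb ceiling is the bandwidth each Flex-10 module presents to a blade; nothing in it is HP- or VMware-specific.

```python
# Sanity check: do the vNIC shares fit within the 10Gb each Flex-10
# module presents to a blade? Figures are the ones from the breakdown above.
FLEX10_BANDWIDTH_GB = 10.0

vnic_shares_gb = {
    "ESXi management": 0.5,                  # 500Mb
    "vMotion / Fault Tolerance": 2.5,
    "iSCSI": 4.0,
    "VM traffic (multiple networks)": 3.0,
}

total = sum(vnic_shares_gb.values())
assert total <= FLEX10_BANDWIDTH_GB, "carve-up exceeds the 10Gb a Flex-10 module provides"

for name, share in vnic_shares_gb.items():
    print(f"{name:<32}{share:>5.1f} Gb")
print(f"{'Total':<32}{total:>5.1f} Gb of {FLEX10_BANDWIDTH_GB:.0f} Gb available")
```

As it happens, 500Mb + 2.5Gb + 4Gb + 3Gb fills the 10Gb exactly, which is why the next point matters.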
Each module has its own defined Ethernet network for redundancy: N1 is the first Flex-10 interconnect and N2 is the second Flex-10 interconnect.
Another important thing is to make sure you utilise the full bandwidth from each Flex-10 module. Configuring the above and only actively using 1x 10Gb will cause a bottleneck on the uplinks, as the quick sketch below illustrates.
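To illustrate that point, here is a small sketch comparing two active/standby layouts. The per-traffic shares are the ones from the breakdown above; which module is active for which traffic type is just an assumption I have made up for the example, not something taken from the HP reference architecture.

```python
# Illustration only: standby adapters carry no traffic in steady state,
# so wherever the active adapters sit is where the load lands.
shares_gb = {
    "ESXi management": 0.5,
    "vMotion / Fault Tolerance": 2.5,
    "iSCSI": 4.0,
    "VM traffic (multiple networks)": 3.0,
}

# Two ways of assigning the *active* adapter for each traffic type
# (hypothetical split, purely for the example).
all_on_n1 = {"N1": list(shares_gb), "N2": []}
alternating = {"N1": ["ESXi management", "iSCSI"],
               "N2": ["vMotion / Fault Tolerance", "VM traffic (multiple networks)"]}

for label, layout in (("all actives on N1", all_on_n1),
                      ("alternating actives", alternating)):
    load = {module: sum(shares_gb[t] for t in traffic)
            for module, traffic in layout.items()}
    print(f"{label}: {load}")
# all actives on N1: {'N1': 10.0, 'N2': 0}    <- one module's uplinks carry everything
# alternating actives: {'N1': 4.5, 'N2': 5.5} <- steady-state load spread across both
```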
I will go into the configuration with VMware in another post. I would also be interested to know if anyone else has any different ways of calculating this?