iSCSI Redux
by Jesse on Apr.14, 2008, under iSCSI
Well – for the last three weeks I’ve designed more NAS/iSCSI systems than I really want to admit to.
I understand what is so alluring about network storage: it’s cheap, plentiful, and it utilizes a technology that every data-center already has in place.
I have to admit that the customer I’m currently working on has done what none other has dared do until now: iSCSI on Symmetrix.
My first question of course is… why? Does it make sense to spend millions on a storage array only to refuse to spend $25k on a couple of switches? The only time I could see this as being even remotely useful would be a situation where you have something along the lines of a giant Linux distributed computing cluster, though for simplicity’s sake I would probably opt for NFS over iSCSI.
The customer of course asked me what I thought – and I bit my tongue so hard I was afraid he’d caught that I wasn’t sold on the idea. I gave him the politic answer: “It’s a great, inexpensive way to connect a large number of hosts to a single pool of storage,” which of course neither answered his question nor committed me to an opinion.
Truthfully – I’ve seen it work a few times now, and with a few exceptions about PowerPath and different subnets, it seems fairly robust. (PowerPath for Windows works out of the box; PowerPath for Linux requires that each interface be in a different subnet, which complicates things.)
The only thing I can suggest if you’re planning an iSCSI SAN is that at the very minimum you want to create a separate VLAN for the iSCSI traffic or risk having issues with broadcast traffic interfering with production traffic. Ideally you’ll create a separate physical storage network and ne’er the twain shall meet.
April 16th, 2008 on 4:18 pm
I’ve been feeling the same way about iSCSI and have seen it work particularly well in small ESX implementations. Just because it says iSCSI on the box, though, doesn’t mean it will work that well. We had good experiences running an ESX dev lab on FalconStor’s foo, but managed to bring a lightweight 4-disk Thecus unit to its knees. Not that I’m cheeky enough to suggest that SOHO storage appliances rate a mention next to the Symm. But we’re cowboys in Brisbane, so it’s all just spindles, right?
April 17th, 2008 on 3:05 pm
I’ve installed a few Celerra/iSCSI systems in past months, and I think it all comes down to this:
If you’re going to run iSCSI, don’t try to do it over a cheap switch. Get one that supports LACP and bind the Celerra (or whatever IP storage system you’re using) interfaces together. Then connect multiple physical addresses on the VMware server to the virtual switch and set it to share them. You get both (some) redundancy and a little load balancing, though the funniest thing I’ve found so far is that binding the interfaces together in VMware seems to have a very limited effect.
My ‘vmnic0’ still seems to get 90% of the traffic, even though I have 4 interfaces in the box. Can someone tell me what obvious configuration setting I’m missing?
April 17th, 2008 on 9:10 pm
What’s your load balancing set to on the vswitch?
April 18th, 2008 on 8:29 am
I tried enabling LACP and setting the four ports I’ve got coming from the VMware box to a single port-group, but that broke it.
My switch (cheap Dell PowerConnect) doesn’t support EtherChannel that I can see, so I might be out of luck.
April 18th, 2008 on 11:16 am
I’m not completely sure about LACP, but with Cisco’s EtherChannel (which is basically LACP, but not), between two hosts you will never get more than one link’s worth of bandwidth, and the way it balances traffic is by using an XOR hash of the MAC (or IP, I forget). If you are always talking to the same host, your XOR hash will always be the same, so until a failure that one connection will take all the traffic. If the host talks to someone else, it should move around to the other ports. Since I believe NFS and iSCSI use a single vmkernel IP address, you are in a one-to-one situation where no matter how many network interfaces you bring up, you’ll still be limited to using only one of them for the datastore (the network traffic initiated by guests will be XOR’d and go to different interfaces). Unfortunately it’s an Ethernet thing in general, so not much can be done about it (other than creating a bunch of IPs for the same host, which I don’t think you can do right now… but I might be wrong).
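To make the XOR-hash argument concrete, here is a minimal Python sketch. It assumes a simplified hash (XOR of the two IPv4 addresses, modulo the number of links); the real hash used by any given switch or by ESX may differ, and the function name and addresses are made up for illustration.

```python
import ipaddress
from collections import Counter

def pick_link(src_ip: str, dst_ip: str, n_links: int) -> int:
    """Hypothetical link selection: XOR both addresses, then take the modulo.

    This is a simplified model, not the exact algorithm of any vendor.
    """
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_links

# One vmkernel address talking to one storage address (assumed IPs):
# the hash never changes, so every frame lands on the same link.
storage_flows = [pick_link("10.0.0.10", "10.0.0.50", 4) for _ in range(1000)]
print("storage:", Counter(storage_flows))

# The same host talking to many different peers: the hash varies,
# so the traffic spreads over the available links.
guest_flows = [pick_link("10.0.0.10", f"10.0.0.{peer}", 4) for peer in range(100, 116)]
print("guests: ", Counter(guest_flows))
```

Under this model, adding more links to the bundle does nothing for a single source/destination pair, which would match the ‘vmnic0’ observation earlier in the thread.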
With a single switch that doesn’t support some port channeling feature, it’s probably best to use the basic networking config where VMware will do a poor man’s load balancing and failover. At power-up it assigns each guest to a network interface, and upon link loss it moves the guest to another interface (if available). You can also enable a feature that basically “pings” through the network to detect a situation where a link is still up but not functioning.
April 18th, 2008 on 2:48 pm
The switch does support LACP, but it doesn’t seem to support whatever it is that VMware wants.
On the NFS server I’ve enabled round-robin, which is working wonderfully, and judging by the activity lights I’m moving traffic across all the links. The switch and the bonding driver in RHEL5 both support LACP, so I’m thinking about switching to that when I get the new NICs put in; that way at least each host will be coming in through a combined connection.
The fun part of course is that to take the NFS server down I have to shut down every VM in the environment, which takes some coordination.
May 3rd, 2008 on 3:12 am
On your RHEL box bind the two NICs with LACP and configure LACP on the associated switch ports. Add a secondary IP address to the RHEL NIC team. Add another NFS export (vmfs2 or whatever) and move roughly half your VMs into that directory.
On your ESX host create 2 vSwitches, configure the first vSwitch with a vm portgroup and attach 2 physical NICs. On the second vSwitch create a service console portgroup and a vmkernel portgroup and attach 2 physical NICs. On the properties of the vmkernel portgroup set the teaming to route based on IP hash (instead of the vswitch port id – default). Add the new NFS export using the secondary IP address of the RHEL server to the ESX server and add the VMs back into inventory.
Now ESX *should* route traffic to the first NFS export through one physical NIC and traffic to the other export through the other physical NIC (it automatically round robins when network traffic is first started from src-dst, and since you’re doing it based on IP it will ignore the fact that it’s all on the same NFS box).
I’m assuming you’re not using VLANs (you would need to modify a few things if you were, e.g. use trunk ports etc).
Let me know if it works for you (I don’t think I missed anything… it’s late).
May 9th, 2008 on 10:57 pm
Right now I’m doing it like this:
NFS Server – (4xgigabit-LACP) –> Switch
Now couldn’t I add a second IP address to the bond0 interface and then break the ESX servers up, so that each VMkernel connects to the same filesystem?
May 9th, 2008 on 11:44 pm
Yeah, you’re right, you could just follow what I said but leave out the second NFS export. But you will still have to re-import half the VMs. Half of the VMs would be on \\192.168.1.100\virtualmachines and the other half on \\192.168.1.101\virtualmachines (or whatever the path is). Even though they exist on the same array, because of the separate IPs (and assuming you changed the vSwitch to route based on IP hash) it should think it’s talking to two different storage arrays and balance traffic between the 2 vmkernel ports.
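As a rough sanity check of the two-address idea, the sketch below assumes the IP-hash policy behaves like "XOR the source and destination IPv4 addresses, modulo the number of uplinks". That is a simplification, and the vmkernel address 192.168.1.10 is invented for the example; only the two NFS addresses come from the comment above.

```python
import ipaddress

def uplink_for(src_ip: str, dst_ip: str, n_uplinks: int) -> int:
    """Simplified IP-hash model (an assumption, not the documented ESX algorithm)."""
    src = int(ipaddress.IPv4Address(src_ip))
    dst = int(ipaddress.IPv4Address(dst_ip))
    return (src ^ dst) % n_uplinks

VMKERNEL_IP = "192.168.1.10"  # hypothetical vmkernel address
NFS_IPS = ("192.168.1.100", "192.168.1.101")  # primary and secondary address on the bond

for nfs_ip in NFS_IPS:
    # With two uplinks, each NFS address should map to a different uplink index.
    print(nfs_ip, "-> uplink", uplink_for(VMKERNEL_IP, nfs_ip, 2))
```

In this model the choice of the second address matters: with two uplinks, 192.168.1.100 and 192.168.1.102 would hash to the same uplink, because they differ only in a bit that the modulo discards. It is worth checking the actual per-NIC traffic after the change rather than trusting the arithmetic.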