50Micron.com

Clariion – Mirrorview – Cisco – FCIP

by on Jun.17, 2008, under Clariion, FCIP, Fibrechannel, Replication

Got into a scary situation this week. Got called in to help a customer with a Mirrorview implementation.

Situation was: the customer had Mirrorview/S set up within the existing switch environment, and replication worked perfectly.

Then they reconfigured the switches to run FCIP so they could start replication to a remote site.  This is where things went badly.

First off – Cisco sets the Gig/E ports on the 9216i for jumbo-frames.  (MTU defaults to 2300)

This is a great idea for Fibre Channel replication, because a full-size Fibre Channel frame is about 2,148 bytes (up to 2,112 bytes of payload plus headers and delimiters), and the larger MTU lets an entire FC frame travel inside a single Ethernet packet.
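To make the sizing argument concrete, here is a rough back-of-the-envelope sketch in Python. The constants are assumptions for illustration (a 2,148-byte maximum FC frame, a 28-byte FC encapsulation header, and 20-byte TCP and IPv4 headers without options), and the function names are made up; none of this is read from a switch.

```python
# Rough sizing sketch: does one encapsulated FC frame fit in one IP packet?
# All constants are illustrative assumptions, not values read from a switch.
FC_MAX_FRAME = 2148      # SOF 4 + header 24 + payload 2112 + CRC 4 + EOF 4
FCIP_ENCAP_HEADER = 28   # FC frame encapsulation header (assumed)
TCP_HEADER = 20          # no TCP options (assumed)
IP_HEADER = 20           # IPv4, no options (assumed)


def fcip_packet_size(fc_frame_bytes=FC_MAX_FRAME):
    """Return the IP packet size needed to carry one FC frame whole."""
    return fc_frame_bytes + FCIP_ENCAP_HEADER + TCP_HEADER + IP_HEADER


def fits_in_mtu(mtu, fc_frame_bytes=FC_MAX_FRAME):
    """True if a full FC frame plus overhead fits in a single packet."""
    return fcip_packet_size(fc_frame_bytes) <= mtu


for mtu in (2300, 1500):
    print(mtu, fcip_packet_size(), fits_in_mtu(mtu))
```

Under these assumptions a full frame needs a packet of a bit over 2,200 bytes, which fits under a 2300 MTU with room to spare and is far too large for 1500.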

Problem is that the default MTU in most network environments is 1500. Now the *REAL* problem is that when you first connect the Gig/E ports on the 9216i to a 6509 or other switch, it will at first appear to work perfectly... until you try to pass data.

When you try to pass data across this link, the DF (Don’t Fragment) bit is set, so the larger packets cannot be fragmented to fit the 1500-byte MTU and get dropped instead. This causes the ISL between the switches to flap, which causes no end of issues. The fabrics will segment and re-join repeatedly until the first time you do anything that causes a reconfiguration, like updating the zoneset. If you do that during a cycle where the ISL is going up and down, the VSANs will segment and stay segmented, because the fabrics can no longer re-merge.

So I come into this situation and the switches are so badly configured that it takes me a day just to get the ISLs up and stable. I set the MTU to 1500 on the switches, took the Gig/E links down, and went to each switch and (carefully) deleted each VSAN that didn’t belong on that switch. (In addition to this being set up incorrectly, all three VSANs merged when the switches were first connected, because the ISLs were configured incorrectly.)

Now the Clariion issue is still open. A normal Mirrorview configuration is as follows:
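The "works until you pass data" behaviour can be illustrated with a toy model of what a single network hop does with a packet. This is a deliberately simplified sketch, not a network simulator; the function name and the packet sizes (200 bytes for a small control packet, 2,216 bytes for a full FC frame plus encapsulation overhead) are hypothetical.

```python
# Toy model of how one hop treats an IP packet. Simplified for illustration.

def hop_decision(packet_bytes, link_mtu, dont_fragment):
    """Return what a hop does with a packet of the given size."""
    if packet_bytes <= link_mtu:
        return "forwarded"
    if dont_fragment:
        return "dropped"      # too big, and fragmenting is not allowed
    return "fragmented"


# Hypothetical traffic: a small control packet and a full-size data packet.
traffic = (("small control packet", 200), ("full data packet", 2216))
for label, size in traffic:
    print(label, "->", hop_decision(size, link_mtu=1500, dont_fragment=True))
```

Small packets fit under the 1500-byte limit, so the link looks healthy; only the full-size data packets hit the limit and vanish.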

Source_SPAx –> Target_SPAx  (Where ‘x’ is the highest SP port #)

Source_SPBx –> Target_SPBx   (same here)

Now when the customer’s Clariions are zoned this way (in this case SPB3 to SPB1), nothing shows up in the Connectivity Status window. But when I reverse the zoning, running SPA3 to SPB1, it shows up fine. (Unfortunately, Mirrorview doesn’t work in that configuration.)

That’s where we stand. A “simple 15 minute FCIP fix” is coming to the end of its third day.
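The pairing rule described above can be sketched in a few lines of Python. The function and its parameters are hypothetical; the port numbers in the example (highest port 3 on the source array, highest port 1 on the target) are inferred from the SPB3 to SPB1 zoning mentioned in this post.

```python
# Sketch of the pairing rule: each SP's highest-numbered port on the source
# array is zoned to the same SP's highest-numbered port on the target array.
# Nothing here talks to a real array or switch.

def mirrorview_pairs(source_top_port, target_top_port):
    """Return (source_port, target_port) names for SP A and SP B."""
    return [
        (f"Source_SP{sp}{source_top_port}", f"Target_SP{sp}{target_top_port}")
        for sp in ("A", "B")
    ]


# Highest port is 3 on the source array and 1 on the target (inferred).
for src, tgt in mirrorview_pairs(3, 1):
    print(src, "->", tgt)
```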


7 Comments for this entry

  • jm

    Are you using IVR with a transit VSAN (and only that transit VSAN) spanning your FCIP link?

  • SanGod

    Nope -

    What purpose would IVR serve? The whole point of this is to keep traffic from SW1 and SW2 separate. Wouldn’t adding Inter-VSAN Routing cause the fabrics on both switches to merge? That’s exactly the situation we’re supposed to be avoiding.

    Though in retrospect that might explain the extra VSANs I found on the switch when I first started. I’ve done a dozen or so of these installs and never touched IVR.

  • jm

    Hmm… I think IVR would serve the exact opposite purpose: it would prevent the VSANs on both switches from merging. I’ve never implemented an FCIP link on Cisco MDS gear, but if I were going to, I always thought I’d want to avoid having a slow or unreliable WAN link cause my production VSAN(s) to split and merge. Something like this, let’s see if my formatting gets mangled here…

    Primary site switch    DR site switch
    Prod VSAN 20           DR VSAN 30
    Transit VSAN 5         Transit VSAN 5

    All my production zones are normal FC zones in VSAN 20. Anything at my DR site is also done with standard zones in VSAN 30. I have an IVR zone for each Mirrorview connection which would look like:

    ivr zone name z_prod_clar_spa3_dr_clar_spa3
    member pwwn vsan 20
    member pwwn vsan 30

    So you get connectivity between your Clariion SPs, but if there are any issues with your FCIP link there is no impact to your production (or DR) vsans. No fspf recalc, no principal switch elections, no RCFs. If your fabrics are large in either site, you wouldn’t want a single WAN issue to potentially cause hiccups, especially since it’s common for both redundant fabrics to be traversing the same IP link. I think this is one of the main use cases for IVR.

  • jm

    Okay, my formatting did get mangled. That’ll teach me to use greater than and less than symbols.

    Let me just try to type out what I was going for. You have production VSAN 20 and transit VSAN 5 on your primary site switch. You have DR VSAN 30 and transit VSAN 5 on your DR site switch. The ISL that traverses the FCIP link should have VSAN 5 (and only VSAN 5) in its VSAN allowed list. VSAN 20 and 30 remain completely separate.

    Also, my example IVR zone should have looked like this:

    ivr zone name z_prod_clar_spa3_dr_clar_spa3
    member pwwn (prod_clar SPA3 wwn) vsan 20
    member pwwn (DR_clar SPA3) vsan 30
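As a rough illustration of the design in this comment, the sketch below renders an IVR zone stanza in the same shape as the example and checks that only the transit VSAN is allowed across the FCIP ISL. The helper names are invented and the WWNs are made-up placeholders, not real device addresses; nothing here is sent to a switch.

```python
# Sketch: build an IVR zone stanza and sanity-check the ISL allowed list.
# Helper names and WWNs are hypothetical placeholders for illustration.

def ivr_zone(name, members):
    """Render a zone stanza; members is a list of (pwwn, vsan) tuples."""
    lines = [f"ivr zone name {name}"]
    lines += [f"member pwwn {pwwn} vsan {vsan}" for pwwn, vsan in members]
    return "\n".join(lines)


def link_is_isolated(allowed_vsans, transit_vsan, edge_vsans):
    """True if the transit VSAN, and nothing else, crosses the FCIP ISL."""
    only_transit = set(allowed_vsans) == {transit_vsan}
    return only_transit and transit_vsan not in edge_vsans


print(ivr_zone(
    "z_prod_clar_spa3_dr_clar_spa3",
    [("50:06:01:60:00:00:00:01", 20),   # placeholder for prod_clar SPA3
     ("50:06:01:60:00:00:00:02", 30)],  # placeholder for DR_clar SPA3
))
print(link_is_isolated([5], transit_vsan=5, edge_vsans=[20, 30]))
print(link_is_isolated([5, 20], transit_vsan=5, edge_vsans=[20, 30]))
```

The second check fails because VSAN 20 would then span the WAN link, which is exactly the exposure the transit VSAN is meant to remove.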

  • SanGod

    Ok, this is the part where I admit I didn’t know that.

    I didn’t know that.

    It does, however, make perfect sense. As I’ve seen, any break in the link between the sites causes the fabric to segment and of course if anything changes in the meantime they will STAY segmented.

    I’m going to be at this particular site on Tuesday to do a port reassignment (the tech who built the SAN in the first place split the ports odd/even instead of using the usual A0/B3, A1/B2, A2/B1, A3/B0 pairings, so right now all Mirrorview ports are across one switch anyway).

    When I do the reconfiguration, I’ll probably break the SANs apart and set it up the way you specified above. Does anything special need to be done when you create the IVR VSANs?

  • jm

    Nothing all that special:

    ivr enable
    ivr distribute
    ivr vsan-topology auto

    That’s about it. Oh, you’ll need the (rather expensive) Enterprise License for IVR I think. You might get IVR with a 9216i switch, not sure.

  • SanGod

    Got it set up like this:

    Source (VSAN10) –> Target (VSAN11)
    Source (VSAN20) –> Target (VSAN21)

    IVRZones are as follows:

    SPA3(10) –> SPA1(11) (via VSAN100)
    SPB3(20) –> SPB1(21) (via VSAN200)

    Still no login on the Connectivity status *BUT*

    When I zone it:

    SPA3(10) –> SPB1(21) (via VSAN100)
    SPB3(20) –> SPA1(11) (via VSAN200)

    I get a login in connectivity status. Since obviously you can’t use Mirrorview with A–>B and B–>A this is useless.
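The difference between the two zonings can be expressed as a simple check that each pair stays on the same SP. This is only a sketch of the rule as stated in this thread; the function name is made up, and it assumes port names of the form SPA3 or SPB1.

```python
# Sketch: flag zone pairs that cross SPs (A to B or B to A).
# Assumes port names like "SPA3", where the third character is the SP letter.

def same_sp(source_port, target_port):
    """True if both ports belong to the same storage processor."""
    return source_port[2] == target_port[2]


zonings = {
    "intended": [("SPA3", "SPA1"), ("SPB3", "SPB1")],
    "reversed": [("SPA3", "SPB1"), ("SPB3", "SPA1")],
}
for label, pairs in zonings.items():
    print(label, all(same_sp(src, tgt) for src, tgt in pairs))
```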

    So I’ve got a SAC case open – the thing that’s driving me nuts is that I’ve got the switch guys telling me it’s a Clariion problem and the Clariion guys say it’s a switch problem.

    My personal feeling is that this smells an awful lot like a problem with the HBA registration on the Clariion.

    Any ideas?
