Bug
SRDF woes….
by Jesse on Apr.12, 2007, under Bug, General, Replication, Support-Calls
Ok, for about 4 months I’ve had an SRDF problem that has been kicking my ass.
SRDF over Ethernet presents a host of new problems, unique to Ethernet.
First, and most importantly: the RE adapters want to see *ALL* of the remote adapters in the RDF group. That means if you have four RDF/Ethernet (RE) boards in the source box and four RE boards in the target, the Symm expects EACH source RE to see *ALL* target RE’s. That allows for a nice load balancing scenario, but doesn’t do much to allow for either dual paths or direct-attach connections.
The second problem I found was an undocumented feature of Solutions Enabler 6.3.1 which caused problems querying SRDF devices, and more importantly, problems running any configuration change script against an RDF device.
For instance, the following config change script ran for about 45 minutes before it timed out and died without reporting an error.
convert dev 0db to 2-way-mir ;
Now some will recognize this as a simple “de-configuration” of an SRDF volume. But because the two symms have to be correctly communicating with each other for this to work, the script simply failed.
I went in to work this morning determined to kick this thing once and for all.
First off I picked up a couple of 29xx series Cisco switches, configured them with ports 1-4 in a trunk leaving the 4 optical ports on each switch for the Symmetrix connections. This allowed me to put all four connections on each Symm into one subnet.
Then I realized that that’s NOT in fact how the symm was configured – the CE left the symm configured (granted – per my suggestion) so that each RE was in a subnet with only the corresponding RE on the remote side. My understanding was that with no gateway, an RE would not try to contact the other (target) RE’s. (Bad assumption.)
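To make the mismatch concrete, here is a minimal Python sketch of the "every source RE must see every target RE" check. The adapter names and IP addresses are made up for illustration (they are not the real config), and it assumes that with no gateway an RE can only reach addresses inside its own subnet.

```python
from ipaddress import ip_interface
from itertools import product


def unreachable_pairs(source_res, target_res):
    """Return (source, target) name pairs that do not share a subnet.

    Assumption: no gateway is configured, so two adapters can only talk
    if each one's address falls inside the other's subnet.
    """
    missing = []
    for (s_name, s_addr), (t_name, t_addr) in product(
        source_res.items(), target_res.items()
    ):
        s_if = ip_interface(s_addr)
        t_if = ip_interface(t_addr)
        if t_if.ip not in s_if.network or s_if.ip not in t_if.network:
            missing.append((s_name, t_name))
    return missing


# Hypothetical "one subnet per RE pair" layout (what the CE left in place).
paired_source = {f"src-RE{i}": f"10.0.{i}.1/24" for i in range(1, 5)}
paired_target = {f"tgt-RE{i}": f"10.0.{i}.2/24" for i in range(1, 5)}

# Hypothetical flat layout: all eight adapters in one subnet.
flat_source = {f"src-RE{i}": f"10.0.0.{i}/24" for i in range(1, 5)}
flat_target = {f"tgt-RE{i}": f"10.0.0.{i + 10}/24" for i in range(1, 5)}

print(len(unreachable_pairs(paired_source, paired_target)))  # 12 of 16 pairs cannot talk
print(len(unreachable_pairs(flat_source, flat_target)))      # 0, full mesh
```

With the paired layout only the four matching pairs can reach each other, which is exactly the situation the Symm does not like.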
So I did the one thing neither Customer-me nor Consultant-me should ever do. I did a binfile change on the Symm without contacting EMC and without putting it through any form of change control. It’s not a difficult thing to do a binfile change, provided you know the passwords to get into Symmwin and have a basic understanding of how a Symm functions.
It’s just frowned upon by EMC because they don’t like the idea that they are not the gods of the universe that they purport themselves to be. (for binfile changes not related to initial setup they also expect to be engaged professionally, for approximately $5k per change – if I were smart, and had the time to spare, I’d start offering my services at half that price)
After the binfile change all lights as far as SRDF went were green, but my config change was still hanging. A brief email exchange with one of the SAC’s better engineers (I’ve worked with him before) pointed me very quickly to the Solutions Enabler upgrade.
The 6.3.2 upgrade did the trick, and is now in the process of being pushed out to all servers.
Has anyone found anything good about Microsoft servers yet?
by Jesse on Feb.10, 2007, under Backup, Bug, General, Microsuck, TimeFinder
I’m not really bashing their workstations, I’m actually quite fond of Vista on my laptop.
However, when it comes to servers, I view being in an environment where Microsoft is the PRIMARY operating system by a factor of 20:1 as a form of torture akin to having my fingernails pulled out or being tied to a chair and forced to listen to “Barney” all day.
What I hate most about Microsoft – (and if I keep this up, I’m going to have to rename this site to Microsoft-Hates-Me.Com) – is that it can’t handle the simplest tasks.
For a split-mirror backup, whether it be TimeFinder/Mirror, TF/Clone, or TF/SNAP, the process is the same:
1. Freeze the database / filesystem
2. Snap the volumes.
3. Thaw the database / filesystem
4. Mount the volumes on your media server host.
5. Back the filesystems up.
6. Unmount the volumes from the media server
7. Terminate the Snap session
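The ordering above can be sketched in Python. This is only an outline of the sequence, not a real TimeFinder integration: every step is a hypothetical callable you would supply yourself (wrapping whatever freeze, snap, and mount commands your environment uses). The one design point it shows is that the thaw, the unmount, and the session terminate should still happen when an earlier step blows up.

```python
def split_mirror_backup(freeze, snap, thaw, mount, backup, unmount, terminate):
    """Run the seven split-mirror steps in order, cleaning up on failure.

    All arguments are caller-supplied callables (hypothetical hooks).
    """
    freeze()
    try:
        snap()
    finally:
        thaw()  # never leave the database frozen, even if the snap failed

    try:
        mount()
        try:
            backup()
        finally:
            unmount()  # only reached if the mount itself succeeded
    finally:
        terminate()  # always end the snap session once it has been created


def make_step(log, name, fail=False):
    """Build a dummy step that records its name and optionally raises."""
    def step():
        log.append(name)
        if fail:
            raise RuntimeError(f"{name} failed")
    return step


names = ["freeze", "snap", "thaw", "mount", "backup", "unmount", "terminate"]

# Happy path: all seven steps run in the listed order.
ok_log = []
split_mirror_backup(*(make_step(ok_log, n) for n in names))
print(ok_log)

# Failed backup: the volumes are still unmounted and the session terminated.
bad_log = []
try:
    split_mirror_backup(*(make_step(bad_log, n, fail=(n == "backup")) for n in names))
except RuntimeError as err:
    print("backup run aborted:", err)
print(bad_log)
```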
Seems pretty basic. Microsoft seems to have trouble with #4 and #6. Seems this “Super OS” they’ve got can’t handle the idea that SCSI devices might go on and off the bus at different times.
EMC gives a tool, TFIM (TimeFinder Integration Modules), that at least allows you to perform the commands that Microsoft doesn’t even make available: mount, unmount, flush, etc. But god forbid you reboot a host while the SNAPS are inactive or the BCV’s are established (and thereby not ready to the host). You’re screwed.
Can *SOMEONE* please write a decent SCSI driver for Windows? Please?
Comments bug -
by Jesse on Jan.13, 2007, under Bug, Wordpress
A few people have come to me about the bug in the comments section where it tells people to “Slow down cowboy” because only 1 post is allowed every 15 seconds.
I’ve identified the issue. Unfortunately it’s an issue with WordPress and VMware together: because my time-sync isn’t quite right, every time I reboot the server the clock ends up going back in time about a day as NTP resyncs with my NTP server.
This is causing WordPress to think that too many posts are coming in at the same time. (I’m going to disable the check for now until I get the time sync fixed.)
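For the curious, here is a simplified Python model of why a backwards clock jump trips a "one post every 15 seconds" check. It is not WordPress’s actual code, just a sketch of the general pattern: the function names and timestamps are made up, and the only number taken from the real thing is the 15-second window.

```python
FLOOD_WINDOW_SECONDS = 15


def is_flood(last_comment_ts, now_ts, window=FLOOD_WINDOW_SECONDS):
    """Naive throttle: reject if fewer than `window` seconds have elapsed."""
    return (now_ts - last_comment_ts) < window


def is_flood_clock_safe(last_comment_ts, now_ts, window=FLOOD_WINDOW_SECONDS):
    """Same check, but treat a negative elapsed time as a clock problem."""
    elapsed = now_ts - last_comment_ts
    if elapsed < 0:
        return False  # stored timestamp is "in the future": the clock moved back
    return elapsed < window


# Hypothetical timestamps (seconds): a comment was stored, then the
# server clock jumped back one day, and someone posts an hour later.
last_comment = 1_000_000
now_after_jump = last_comment - 86_400 + 3_600

print(is_flood(last_comment, now_after_jump))             # True: "Slow down cowboy"
print(is_flood_clock_safe(last_comment, now_after_jump))  # False: comment allowed
```

The naive version keeps rejecting comments until the clock catches back up to the last stored timestamp, which matches the behaviour people have been seeing.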
Just FYI for anyone who has had the issue with that. Email me at jg@sangod.com if you have any questions.
Veritas NetBackup – Maint. Pack 4
by Jesse on Dec.31, 2006, under Bug, Veritas NetBackup
Just in case you don’t already do it, back up your C:\Program Files\Veritas\Netbackup\bin directory before you apply MP4.
Unlike previous Maintenance Packs – this one will clobber any custom scripting you have done.
Not nice.
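If you want to script the copy, a minimal Python sketch might look like the following. The function name, the destination folder, and the naming scheme are my own inventions for illustration; it just takes a timestamped copy of whatever directory you point it at.

```python
import shutil
from datetime import datetime
from pathlib import Path


def backup_directory(src, dest_root):
    """Copy `src` into a new timestamped folder under `dest_root`.

    Returns the path of the copy. Raises if `src` does not exist.
    """
    src = Path(src)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    dest = Path(dest_root) / f"{src.name}-pre-MP4-{stamp}"
    shutil.copytree(src, dest)  # creates dest (and missing parents), copies everything
    return dest
```

On the master server you would call it with something like `backup_directory(r"C:\Program Files\Veritas\Netbackup\bin", r"D:\Backups")`, where `D:\Backups` is a placeholder for wherever you keep such things.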