Hours lost….
by Jesse on Aug.22, 2008, under DR/COOP
We have twelve 500G SATA disks in a DMX-4. We carved them into 147 29G hypervolumes, and the VTOC took over 7 hours.
The fun part happened when the symconfigure script died. Had to have the lab dial into the Symm and step the process through to the end.
The cool part is that we were in the middle of a fail-back. Other than having to shuffle the steps a bit (we did the swap last instead of during the failover), the customers absolutely didn’t notice any change.
In fact, we had a 2-hour window for the fail-back, and it was completed in 65 minutes.
However, it wasn’t until after 4:30am that the VTOC completed and we were able to issue the R1/R2 swap command.
Ouch – I’m tired.
August 25th, 2008 on 10:56 am
Man, you need to put up a cheat sheet for managing the DMXs for those of us who only log into the thing once every other month and wonder what the heck the command was to view the available space and carve out some hypers. I’ve never felt comfortable with the DMX. It seems like big iron, but maybe that’s because my background is NetApp, and even Clariion a fair bit. Do it, do it…please!
August 25th, 2008 on 11:27 am
It would be a waste of effort to put together a cheat sheet since, as of the DMX4 and 72 code, customers no longer have log-in access to the DMX. I guess too many customers were complaining about having an ‘open’ Windows system on their network, so it needed to be secured.
Drives me nuts though – I used to do a lot of my design work by pulling binfiles and working with the resultant Excel spreadsheets. Now I have to nag the CS guy every time I need a new copy of a bin for review.
August 25th, 2008 on 11:36 am
That’s horrible, and that’s the difference between NetApp and EMC. NetApp is an engineering-minded company; they don’t mind if you poke around. All the information is at your fingertips, and SysAdmins like that. EMC stuff is that big scary black box that makes even the atheist SysAdmins pray to God that it never has any issues, or they’ll look stupid for not knowing where to even begin the troubleshooting.
August 25th, 2008 on 3:05 pm
@stroageguy201
I’m an EMC customer with no “official” access to the SP, but I can still look at pretty much anything in my DMX4 running 72 code with no problem using Solutions Enabler. The number of things I can poke at to see what is happening inside the array, compared to our NetApp filers running 7.2 code, is pretty staggering (but not really surprising: the filers are set up to be easy access/install appliances that don’t need any training, etc.). Our filers frustrate me to no end when I’m trying to find “back-end” information at times (but I’m in the minority in wanting to see that data).
What I believe Jesse’s complaining about is that one used to be able to walk up to a Symm and download the DMX config onto a laptop, which he could then throw into some EMC tools that would spit out nice information, etc. (or, if you were a bit more nefarious, completely screw up the entire array, or bypass some of the security features and give access to a finance data LUN to some other host). After EMC bought RSA, they had an idea to add key-fob authentication to the SP.
Pro:
Locked-down environment is truly locked down now; there is an audit of everything going in/out of the DMX, both from a customer perspective and from a vendor perspective (no more “your service tech did something” finger pointing). You can run symaudit and see all the dial-ins and outs.
Don’t have to worry about ex-employees having access to the symm anymore via an old username/passwd on a modem line
Not much else, it’s really a security thing…
Cons:
Lock-down prevents someone who buys a DMX without support and without any management software from doing anything with it; what you see is what you get.
I have to talk to my CE to get a bin file pulled (I normally have no reason to do this), and other EMC service personnel have to talk to the CE to get a bin file pulled (much more often); this adds a delay to the process.
I can’t make a quick change on the fly. With Solutions Enabler I can do everything (create drives, change raid, convert, etc.) with one exception: I can’t change the IP address of a gig-e director card. Even though one isn’t supposed to, one could change an address that your network group decided to change a couple of days before going live if needed.
Without access to the SP to pull a bin file, all the data is still there, but making reasonable sense of it can definitely be a bear. I actually find it can overwhelm you with data, and trying to get something quick out of it can be a chore, but given enough time there is nothing I can’t get out of it. Perl is definitely beneficial for parsing the reams and reams of data because of its text handling capabilities. I’ve never had a problem with getting a bin file (no one from EMC has said “no you can’t see that”), but it does introduce an annoying delay if you want to use certain tools that need the bin file.
August 25th, 2008 on 3:24 pm
That’s good to know because my knowledge of the Symms is very limited. All I have is the SMC GUI and symm cli – no solutions enabler or CC. Like I said, the box seems like a black box to me. It was populated with the hypers initially, and I don’t really know where and how they’re spread across the physical disks without consulting a spreadsheet that was handed to me at the start, with over a dozen cryptic columns. If I squint hard enough it starts to make sense, but a month later I’m doing the whole thing over again. Compared to a NetApp or even a Clariion, it’s hard work just to keep track of my devices, hypers and physical disks.
For the NetApps all you need is sysstat, stats show, wafl_susp and perfstat; that’ll give you just about everything you need to see what’s going on inside. Most of the time you just need to find the bottleneck, and it usually takes me a couple of commands to see if it’s the CPU, memory or disks. The NetApps are actually fairly complex, as they support not only NFS and CIFS but also FC and iSCSI SANs, but they feel easier because they’re put together that way, and thus they have had huge success.
August 25th, 2008 on 4:33 pm
If you have symm cli then I think you should have everything you need (/usr/symcli/bin on unix platforms).
I’ve found that the Symmetrix/DMX are really meant for you to not worry so much about physical placement, and that trying to design around that is counter to its design. This is a really different concept from a NetApp, Clariion, HDS, etc. system. I’ve always likened the configuration to a very close representation of Oracle’s S.A.M.E. methodology (stripe and mirror everything), where LUNs go wide across the array rather than deep into the disks. It automatically reduces the probability of hotspots compared to isolating spindles, as now all the spindles would have to be pegged rather than just the 20 or so you set up for this one app. 6+ years ago we tried to keep things separate for just a few apps (this Oracle DB went only on these disks, etc.) and it was a pure nightmare to try to manage and keep them separate.
The way I get them binned up (which is pretty common and most likely the way yours is configured), each consecutive device # comes from a different RAID group, e.g. in a mirror configuration:
Dev 100 = disks 1 & 2
Dev 101 = disks 3 & 4
Dev 102 = disks 5 & 6
Dev 103 = disks 7 & 8
I can grab devices 100-103 put them into a striped meta and know they are not on the same physical disks (at least until optimizer starts “doing things”), only after I get to the end do I really have to worry about “wrapping” a stripe back onto the same disks.
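The wrap-around idea can be sketched in a few lines of Python. This is only an illustration of the layout described above: the function names, the 8-disk count and the decimal device numbers are assumptions made up for the example, not anything the array or symcli exposes.

```python
# Hypothetical layout: consecutive device numbers map to consecutive
# mirror pairs, wrapping around once every pair has been used.
def mirror_pair(dev, first_dev=100, num_disks=8):
    """Return the (disk, disk) mirror pair backing a device (1-based disks)."""
    pairs = num_disks // 2
    idx = (dev - first_dev) % pairs
    return (2 * idx + 1, 2 * idx + 2)


def shares_spindles(devs, **layout):
    """True if any two devices in the list land on the same physical disk."""
    seen = set()
    for dev in devs:
        for disk in mirror_pair(dev, **layout):
            if disk in seen:
                return True
            seen.add(disk)
    return False


print([mirror_pair(d) for d in range(100, 104)])
print(shares_spindles([100, 101, 102, 103]))       # four members, four pairs
print(shares_spindles([100, 101, 102, 103, 104]))  # fifth member wraps
```

With only four mirror pairs in this made-up layout, a fifth consecutive device lands back on the first pair, which is the "wrapping" case to watch for.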
I’ve found that trying to force a method of physical separation with the DMX will make your head pop off. I went rounds with our DBAs and app folks until I got them to “not-care”. I think that if you change your methodology with the DMX your life will be much, much easier (at least mine has been).
If you don’t have optimizer (a product that basically monitors performance trends over time and, automatically or by prompt, will move hypers to different physical disks) you can use symstat to monitor which disks & LUNs are getting the heaviest hit.
If you want to see which disks have which hypers you can do a “symdisk -v -hypers list” or “symdev -DA -disk “. If you are an XML jockey you can set an environment variable and it will dump the output in a program-parsable format.
–
On the filers though, to me much is still hidden. sysstat shows 100% CPU on one of our filers; that doesn’t mean we are completely out of CPU power, but that of the 4 CPUs we have, one is pegged (normally from an NDMP job). Same thing with disks: the value doesn’t really show me much. I want to be able to see how much bandwidth a particular client is using; I want to see the nitty-gritty, cache, etc. Stats show is getting there, but it’s nowhere near the level of symstat (or even the ease of use, I find).
—
I think a lot of it is probably that whichever platform you’ve had the most “stick time” with is the one you are going to feel the most comfortable with.
August 25th, 2008 on 5:14 pm
Yeah we use RG5 (3+1) – I think the PSE told me that consecutive devices are not on the same physical spindles, but he said to always refer to the spreadsheet…all is well and good until I need a 1TB LUN and my hypers are 59G…ugh! Trust me, I want to “not-care”. With the NetApps, I just say give me a flex volume and I know it’ll be spread across all the physical disks available to the aggr as nicely as possible, but with the devices I can’t be certain.
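For a sense of scale, here is a rough Python sketch of how many 59G hypers a 1TB LUN would take. It assumes 1TB means 1024G and ignores any array limits on meta members; the function name is made up for the example.

```python
import math


def hypers_needed(lun_gb, hyper_gb):
    """Smallest number of equal-size hypers whose combined size covers the LUN."""
    return math.ceil(lun_gb / hyper_gb)


count = hypers_needed(1024, 59)  # assumption: 1TB = 1024G
print(count, "hypers,", count * 59, "G total")
```

Every one of those members then has to be checked against the spreadsheet, which is where the bookkeeping pain comes from.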
We don’t have optimizer, I’ll try a few of those symcli commands.
On the NetApp, CPU pegged at 100% due to NDMP doesn’t sound good, especially if this is a FAS3000 series (you said 4 CPUs), unless you have multiple NDMP streams running. Try statit -b to start and statit -e to end the monitoring session. The latter will dump all the statistics (lots of them), including the CPU usage breakdown. CPU usage is divided into 4 to 8 groups and you can tell which group is keeping your CPU pegged. I won’t go into the details here, and some of this is beyond me, so it’s best to engage NetApp for interpreting this info, but here’s part of what you’ll see:
Start time: Mon Aug 25 15:09:06 PDT 2008
CPU Statistics
18.280549 time (seconds) 100 %
0.564904 system time 3 %
0.080891 rupt time 0 % (77915 rupts x 1 usec/rupt)
0.484013 non-rupt system time 3 %
72.557288 idle time 397 %
0.556108 time in CP 3 % 100 %
0.004595 rupt time in CP 1 % (2815 rupts x 2 usec/rupt)
Multiprocessor Statistics (per second)
                        cpu0        cpu1        cpu2        cpu3       total
sk switches           507.10     1653.34      334.40     2208.91     4703.74
hard switches           0.00      770.44        0.00     1292.90     2063.34
domain switches         0.00       10.94        0.00       17.72       28.66
CP rupts               62.58       30.47       30.47       30.47      153.99
nonCP rupts          1197.50      970.43      970.49      969.77     4108.19
IPI rupts               0.77        0.71        0.77        0.05        2.30
CP rupt usec          168.27       51.26       19.91       11.82      251.36
nonCP rupt usec      1881.02     1475.07      530.84      286.59     4173.62
idle               997950.61   992851.20   999449.09   978846.81  3969097.97
kahuna                  0.00     2882.63        0.00    18064.17    20946.96
network                 0.00      986.18        0.00      470.55     1456.79
storage                 0.00      838.27        0.00      842.32     1680.58
exempt                  0.00      229.26        0.00      543.97      773.28
raid                    0.00      647.03        0.00      815.84     1462.92
target                  0.00        2.35        0.00       13.62       16.03
netcache                0.00        0.00        0.00        0.00        0.00
netcache2               0.00        0.00        0.00        0.00        0.00
cifs                    0.00       36.43        0.00      103.88      140.31
wafl_exempt             0.00        0.00        0.00        0.00        0.00
0.467619 seconds with one or more CPUs active ( 3%)
0.017007 seconds with 2 or more CPUs active ( 0%)
0.000000 seconds with 3 or more CPUs active ( 0%)
0.450611 seconds with one CPU active ( 2%)
0.017007 seconds with 2 CPUs active ( 0%)
0.000000 seconds with 3 CPUs active ( 0%)
0.000000 seconds with all CPUs active ( 0%)
Domain Utilization By Exempt (per second)
0.00 idle 0.00 kahuna
0.00 network 0.00 storage
0.00 exempt 0.00 raid
0.00 target 0.00 netcache
0.00 netcache2 0.00 cifs
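One way to read the per-CPU domain rows is sketched below in Python. It assumes the values are microseconds of CPU time per second of wall clock (an inference from idle sitting close to 1,000,000 per CPU, not something the output states), and it uses the cpu3 column copied from the dump above. The function name is made up for the example.

```python
# cpu3 column from the statit dump above.
# Assumption: values are microseconds of CPU time per wall-clock second.
cpu3_usec_per_sec = {
    "idle": 978846.81,
    "kahuna": 18064.17,
    "network": 470.55,
    "storage": 842.32,
    "exempt": 543.97,
    "raid": 815.84,
    "target": 13.62,
    "cifs": 103.88,
}


def busiest_domain(domains):
    """Return (name, percent of one CPU) for the busiest non-idle domain."""
    busy = {name: usec for name, usec in domains.items() if name != "idle"}
    name = max(busy, key=busy.get)
    return name, busy[name] / 1_000_000 * 100


name, pct = busiest_domain(cpu3_usec_per_sec)
print(f"{name}: {pct:.2f}% of cpu3")
```

On a filer that really is pegged, the same calculation would show which domain is eating the CPU; on this mostly idle sample the largest domain is still under 2%.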
August 25th, 2008 on 5:42 pm
Yes – Sequential hypervolumes will *ALWAYS* (unless someone jumped the shark in a big way) be on different physical spindles.
The problem I’m running across these days is seeing customers who migrated from a Symm4 –> Symm5 –> DMX1 –> DMX4 and the like.
In the old days, when I started working with EMC, 18G drives were *JUST* released. So people created 4:1 splits, or 4.7GB hypervolumes. With the advent of SDMS, or the Symmetrix Data Migration Service (SRDF+PS) people use SRDF to migrate from one box to the next. The problem is that no-one is making any real effort to get off the tiny little hypervolumes they created when they first started.
I’m working right now on a system where, with 300G drives, they actually have 3.9G hypers. (And some of the hosts are *STILL* mounting them directly instead of using metavolumes.)
Not to do the math for people, but that’s in the neighborhood of a 200:1 ratio of Hypers:Physical.
That’s a very very bad place to be.
First reason is that it actually limits the number of physical drives you can put in the Symm, because there can be only so many hypervolumes. (Though truthfully, these days that number is *VERY* high)
The second reason this is bad – if you’ve got 200 splits on a disk and these splits are heavily utilized by a number of hosts, the heads spend all their time thrashing back and forth.
So how do you fix this? Easy. Get a good LVM system for your host and migrate to it.
AIX – Built-in LVM, if you’re not using it you’re crazy. It’s one of the best out there. Add new (bigger) physicals to the volume group and then move the logical volumes to the new physical. 100% online.
Sun – Open the wallet and buy Veritas; Solstice Disk Suite is handicapped as far as functionality and reliability go.
HPUX – Built-in LVM. Though this isn’t as robust as AIX’s, it’s still worthwhile. See earlier posts on doing switch migrations. HP’s got issues with the cXtXdX number of the device changing when you plug the cable into a different switchport (or switch, for that matter).
Windows – There is something resembling a volume manager on Windows, though not really. The easiest way is to present a larger disk and copy the data. If you feel like taking the risk you can enable dynamic disks and mirror/split the partition, but there are so many things that don’t work well with dynamic disks that I wouldn’t recommend it.
August 25th, 2008 on 5:43 pm
Oh – and don’t forget – Symm Optimizer will take care of hotspots that develop over time. It will literally move the mirror to an under-utilized disk online, in the background, and, if you choose, without so much as a ‘by your leave’.
August 25th, 2008 on 6:01 pm
Actually unless you have some business units that want isolation I think you should be able to “not care”. You should be able to just grab them in order and not really care (if you’ve been adding drives after the initial config it could get a bit more hairy). Remember that the DMX uses a ~960k stripe depth, so
Here’s a script to find a list of devices that are on different disks (it’s off the top of my head and works on a sun box, but make sure it works first by doing some sanity checks yourself).
/usr/symcli/bin/symdev -sid -CAP -protection -noport -nobcv list | sort -k 5 -u | sort -k 1 -n
You can get the drive cap by doing a symdev list; it’s the last column
The “-protection” is the raid level; for you it’d be “-protection 3+1”
The “-noport” lists drives not allocated to a FA
The “sort -k 5 -u” will sort on key #5 and print out the first one it finds in column 5 (the disk #)
The “sort -k 1 -n” sorts on the device number to align them back in sequential order (if available)
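The two sort passes can be mimicked in Python on made-up rows to see what they do. The column layout below is hypothetical and is not real symdev output; only the logic (keep one row per disk, then put the survivors back in device order) mirrors the pipeline.

```python
# Hypothetical rows: (device number, capacity, protection, status, disk id).
rows = [
    (103, 29, "3+1", "RW", "07"),
    (100, 29, "3+1", "RW", "01"),
    (104, 29, "3+1", "RW", "01"),  # same disk as device 100
    (101, 29, "3+1", "RW", "03"),
]


def one_device_per_disk(rows, disk_col=4, dev_col=0):
    """Keep one row per disk (like sort -k 5 -u), then order by device number
    (like sort -k 1 -n)."""
    by_disk = {}
    for row in sorted(rows, key=lambda r: r[disk_col]):
        by_disk.setdefault(row[disk_col], row)  # first row seen per disk wins
    return sorted(by_disk.values(), key=lambda r: r[dev_col])


for row in one_device_per_disk(rows):
    print(row)
```

As with the shell version, which of two devices on the same disk survives depends on the sort order, so the result still deserves a sanity check before building a meta from it.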
—
Our filers hold a crap load of files relative to their amount of storage; we are up over 1.5 files across our filers, so NDMP walking the filesystem can look painful at times via sysstat, but there is no real performance problem (other than doing a bazillion inode lookups). I know the NDMP process only got time slices on 1 CPU; I don’t know if 7.2 has changed it to be allowed to run on multiple CPUs or if it’s a 7.3 (or later) feature (a NetApp guy mentioned it the other day). Some filesystems don’t actually start moving data for > 24 hours while it’s walking (none > 2TB), so for us it’s a given we are going to have multiple simultaneous sessions running.
August 25th, 2008 on 7:25 pm
When I was working at a ‘certain government entity’ (Think opposite of “Progress”) they actually had put the request in to have Democratic and Republican data separated physically, to the point that I believe they brought in a second Symm.
Your tax dollars at work.
InsaneGeek: You also want to add -NOBCV if you’ve got bcv’s in the box that are for point-in-time only and are not mapped to a front-end port.
Personally, I don’t leave unallocated BCV’s in a symm. I convert them using ‘symconfigure’ if I need them and convert them back when I don’t need them. That way you can be sure that any BCV in the system is actually a used one.
Also – use Device Groups. Even if you use device files for all of your scripts, device groups offer a handy way to ensure that you know what is assigned where. (When you type ‘symdev list’ there is a Grp’d or N/Grp’d field accordingly.)
August 25th, 2008 on 10:02 pm
Whoops got distracted by my puppy and left out parts of the post
…~960k stripe depth, so by the time you get back around to the first drive again you’ve read/written out multiple megabytes (and the first drive has probably serviced multiple other requests). Even if you’ve done something bad and made a stripe with 2x members on the same spindle, this can mask some of the pain (provided the stripe is wide enough).
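A small Python sketch of that arithmetic, assuming a simple round-robin stripe with a 960 KB depth per member (the ~960k figure quoted above, treated as exact here); the function names and the 4-member width are made up for the example.

```python
STRIPE_KB = 960  # approximate per-member stripe depth quoted above


def member_for_offset(offset_kb, members, stripe_kb=STRIPE_KB):
    """Index of the stripe member that holds a given offset (round-robin)."""
    return (offset_kb // stripe_kb) % members


def kb_before_wrap(members, stripe_kb=STRIPE_KB):
    """Data written across the stripe before it returns to the first member."""
    return members * stripe_kb


print(kb_before_wrap(4), "KB before a 4-member stripe wraps")
print([member_for_offset(kb, 4) for kb in (0, 960, 1920, 2880, 3840)])
```

The wider the stripe, the more data goes by before the first member is touched again, which is why a wide enough stripe can hide a doubled-up spindle.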
There should have been some stuff in greater/less than, I’m guessing they got ate as possible html tags (I do remember typing that in)
/usr/symcli/bin/symdev -sid [sym] -CAP [capacity] -protection [raid 3+1, etc] -noport -nobcv list | sort -k 5 -u | sort -k 1 -n
Also that should be 1.5 billion files across our filers (1.5 files would be a little too small)
–
Yup I’ve got a whole bunch of bcv’s that are clones of R2’s that aren’t on a front-end port (why oh why can’t I do symsnap!!!), so I use the -nobcv quite often.
I do have a question around dynamic disks under Windows. I’m an old unix guy, and I’ve got about 10TB of Windows storage out there for Exchange, MS SQL, SharePoint, Windows file servers, etc. Depending upon the day, it looks like the Windows guys have done dynamic or simple disks. How are people dealing with storage growth? Are people growing striped metas, growing concatenated metas, or something else (i.e. dynamic disks)? What’s so bad about dynamic disks? Unix is the majority of my environment and I use LVM all the time so it’s a non-issue there, but when I get pulled into the Windows world it’s a bit unknown, and it seems like using dynamic disks would be an appropriate fix.