50Micron.com

When the links go bouncing…

by Jesse on Jul.07, 2010, under General

In a true DR environment where Synchronous replication is used, it’s best to have two routes from source to target, or at the very least a switched route that can dynamically re-route in semi-real-time.

Everyone knows the story.  The link is up, everything is good, source acks a write to the host when the target acks it.  The link is down, replication is halted, source acks to the host when the write is committed to cache on the source.

(Or, in this case, you have two optical routes but somehow managed to put it all through the same DWDM tray, which then failed, taking out both routes)

But I’ve seen it happen more often than not.  The “Bouncing” link.  Up, down, up, down, up, down, etc.

Very few storage systems handle that well.  Mostly because when the link is half-way there the system gets torn between the requirement (in synchronous replication) to wait for the target to acknowledge the write, and acknowledging it from local cache as if the link were down.
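
To make the two steady states (and the flapping between them) concrete, here is a minimal sketch.  It is a toy model of the behavior described above, not any vendor's actual logic; the names `LinkState`, `ack_point` and `simulate` are made up for illustration.

```python
from enum import Enum


class LinkState(Enum):
    UP = "up"
    DOWN = "down"


def ack_point(link: LinkState) -> str:
    """Return where a write must land before the source array acks the host."""
    if link is LinkState.UP:
        # Synchronous mode: wait for the target array to ack first.
        return "target-ack"
    # Link down: replication is halted, so ack once the write is in source cache.
    return "source-cache"


def simulate(link_history):
    """Map a sequence of link states to the ack point used for each write."""
    return [ack_point(state) for state in link_history]


# A bouncing link: every write lands under a different rule than the last one.
bouncing = [LinkState.UP, LinkState.DOWN, LinkState.UP, LinkState.DOWN]
print(simulate(bouncing))
```

A real array also has to decide what to do with writes that are in flight at the moment the link drops, which is where things get ugly and which this sketch deliberately leaves out.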

The good news is most host operating systems handle it wonderfully.  Sun records such events as “Retryable disk errors”, Windows and AIX I don’t think even report it.

Enter RedHat Linux, or in this case, RHEV.  RHEV uses a standard lvm2 volume group with virtual disks as logical volumes within the volume group.  Simple enough right?

Well what if you have disks from different disk subsystems?  What if you have some mirrored and some not?  (The usual reason for that would be test/dev and production in the same environment, though putting dev/test and production in the same cluster is kinda nutty.)

The situation I just saw was this.  4x 500G volumes, only ONE of them mirrored.   RHEV apparently put them all in the same volume group.

You *NEVER* put mirrored and non-mirrored volumes in the same volume group.  If for no other reason than the disk on the target array is USELESS without its partner disks.
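
That rule is easy to check mechanically.  The sketch below assumes you already have an inventory mapping each physical volume to its volume group and whether the underlying LUN is replicated; the function name, the LUN names and `vg_rhev` are all hypothetical, and gathering that inventory from LVM and the array is left out.

```python
def mixed_volume_groups(pv_table):
    """Return the names of volume groups that mix mirrored and non-mirrored PVs.

    pv_table maps a physical-volume name to (volume_group, is_mirrored).
    """
    seen = {}
    for vg, mirrored in pv_table.values():
        seen.setdefault(vg, set()).add(mirrored)
    # A group is a problem when it contains both True and False.
    return sorted(vg for vg, states in seen.items() if len(states) > 1)


# Hypothetical layout matching the case above: four 500G LUNs, only one mirrored.
pvs = {
    "lun0": ("vg_rhev", True),
    "lun1": ("vg_rhev", False),
    "lun2": ("vg_rhev", False),
    "lun3": ("vg_rhev", False),
}
print(mixed_volume_groups(pvs))
```

Any volume group that shows up in the result has a target-side copy that can't be used on its own.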

In this case we had one disk out of 4 that was dropping on- and off-line.  Some admin gets the idea to reboot the host, which of course attempts to close the volume group.  When it can’t flush those writes to disk the behavior gets a little unpredictable.  Most likely the shutdown will hang, causing some overzealous admin to go hit the power-switch…

Data loss ensues because there are cached-writes that haven’t been committed.

And they call me for help with it.  Meanwhile, the freeware VMware ESXi environment, that is also replicated, and that *I* have been pushing hard for enterprise-wide adoption of, blows right through the 36 hours of random problems with not even a sigh.

The problem with calling me for help with it, is I can just SMELL someone trying to blame the data-loss on EMC, and I want NOTHING to do with it.  So I tell them to open up a support ticket with RedHat.

Oops, they didn’t buy support.  Apparently when you throw in support, the cost-benefit analysis vs. VMware makes it too expensive.

FML

I worked for 18 straight hours on Friday.

On Change…

by Jesse on Jun.18, 2010, under Changing Technology

I’m a bit of a stick in the mud when it comes to change.

I dislike it.

A *LOT*

So what did I do…gave up my years-long love-affair with Blackberry and bought a new Android-based phone.

I’ve sort of had it spelled out to me recently that I resist change even when it makes sense.  Fearing the unknown and all that.

So this is my gesture. 

I managed to hit Sprint.Com during one of the 18 whole hours they had the HTC EVO in stock.  My son wanted to spend his end-of-year-report-card-money on one and in playing with it in the store, I had to admit I was pretty impressed.

First off the screen.  The EVO sports a 4.3″ diag. screen which puts the rest of the droid-based phones to shame.  (Even the new HTC Droid “Incredible” isn’t so incredible at a paltry 3.7″)

Setup was easy, but not as easy as it could have been.  After several unsuccessful attempts at activating online I was forced to call in.  The unsuccessful activations were due to the fact that the phones were tied to the number they were purchased under, and I was trying to cross-activate them.

So far so good.  The only notable exception is the inability to sync my Windows “Notes” using Exchange ActiveSync.  I depend on those pretty heavily so will have to find a work-around for it.

I’ll update when I actually get a chance to use this in a “work” situation.  :)

On Manners…

by Jesse on Jun.15, 2010, under General

So if consulting firm A talks to consultant B about an engagement, and then decides to go with consultant C because he’s cheaper it’s totally understandable.

If consultant C then calls consultant B for help with that engagement, consulting firm A should expect to get a bill for the time.

I mean first off, it’s rude.

Secondly, none of us work for free.

Now I’m glad to help when I can, but when it goes from “I’ve got a few questions” to “give me the step-by-step on how to do this” I have to draw the line.

</rant>

Multivendor or Single Source? Is there a right answer?

by Jesse on May.26, 2010, under Best Practices, Replication, SingleVendor

Every time I turn around I seem to be running into the same question.

Is it better to be multi-vendor or single source?

Well the easy answer to that is, it depends.  Different vendors do things differently, work better/worse with some hardware, etc.

The arguments in favor of a single-vendor solution are easy.  Cost, Simplicity, Management, Interoperability.

Even if you’re buying a more expensive solution, there can STILL be major cost savings.

First, in staffing.  When you maintain multiple vendors, you have to maintain support-staff knowledgeable for each vendor.

If you’ve got a storage team that consists of 5 people, and two of them work almost exclusively on Veritas NetBackup, you *MIGHT* be lucky if you get one subject matter expert capable of doing Tier1 (i.e. Symmetrix), one for Tier2 (Clariion) and one for NAS (Celerra).

But throw in HDS, IBM DSxxxx, XIV, IBM GPFS, IBM HPSS, NetApp, SONAS, Sun StorEdge, etc. etc. etc.  And what do you have?

You either have an overworked staff (and as I’ve discussed, union protected salaried federal employees aren’t known for 70 hour weeks) or stuff just plain doesn’t get done.

If you don’t spend the money on staffing, you *WILL* spend the money in support and professional services.  Now support is one thing.  If my XIV or Symm or whatever loses a hard drive, I expect the vendor to own that problem and fix it.

They will *NOT* however send people out to help with day-to-day provisioning without a pretty hefty P.O. associated with it.

And the last reason for single-vendor options is simple.  I want stuff that is going to work together.  Now yes, functionality costs, but one of the things I like about EMC is that when it comes down to it, it *ALL* works together.  I can move data from Symm to Clariion or vice-versa using SAN Copy, I can migrate fileservers to Celerra and within storage tiers as needed.

There is nothing worse than needing to expand one storage system by 20TB and having the storage somewhere else, but unusable.  It means you’re wasting money buying storage you already have.  (Especially when your purchase cycle is 4-6 months on average.)

Not a happy thing to explain to the boss.

“Yes we have 80TB of Clariion available, but the IBM DS4800 is running short so I need to spend an extra $100k on disks.”

“Yes, I know this isn’t budgeted, but the data grew faster than we’d expected.”

(Of course, you can span filesystems across arrays, as long as it’s not replicated data, because you can’t get a consistent split when half of your extents are on one array and half on another)

Vendor Abuse…

by Jesse on May.19, 2010, under Backup, Vendor Abuse, Worst Practices

Just a quick question as it’s *WAY* past my bedtime. 

Has anyone else ever heard of a customer that a sales-person or tech would quit rather than set foot in?

I’m remembering my days back at Disney in California.  (over 7 years ago, so I’m assuming the statute of limitations has expired on that one)

I got called up for a 6-week scripting engagement, and 18 months later I had to actually quit the consulting firm I was working for and move across the country to get away from them.  (Happiest place on earth?–Not if you’re in their IT dept.)

These words actually came from one of their “cast members”:

“Once you work for the mouse, you ALWAYS work for the mouse.”

When I told them I was moving to Virginia/DC to do government work, the response was even more ominous…

“The sun never sets on the ears…you can’t move far enough away.”

In the end, the experience I got at Disney (Scripting TimeFinder to back up AIX/DB2/SAP) was invaluable and one of the guys there basically taught me more about kornshell scripting than I think I’ve learned from any single source since then.

But the environment was toxic.  I’m curious as to whether anyone else has stories like these or if I’m the only one who runs into them?

By request…

by Jesse on May.18, 2010, under General

Ok, believe it or not there are a number of people who would like me to stay online.

I can’t commit the kind of time I’d like to, though, so I’d like to request guest-contributors if possible.

Just email – jg(at)50micron(dot)com

IBM XIV….

by Jesse on Mar.24, 2010, under IBM, Marketing/Engineering

Oh Moshe – you disappoint me so….

When I was at EMC I used to look up at Moshe Yanai like he was a god.  The father of the Symmetrix, used to fly in and out of work in his own helicopter on occasion, the uber-engineer that we all aspired to be.

Oh how the mighty have fallen.

There was an engineering adage that my uncle taught me when I was very young.

He said: “You can have it cheaper, you can have it faster, you can have it smaller – now pick any two of those.”

It’s always held true.  Cheap/Fast is usually huge, Small/Fast is usually expensive as hell, and Cheap/Small is usually slow as molasses in January.

I got a chance to get a real close look at the XIV for the first time last week, and I have to say it’s got to be the biggest pile of garbage I’ve ever seen in my life.  In the above adage, it definitely falls into the “Cheap/Small/Slow” category.

From a “Tiering” standpoint I put it somewhere on the low end of being between the Clariion and Centera, maybe like Atmos without the universal namespace and NAS connectivity.

The idea that someone I work with THINKS they can run a transactional database on it is absolute nuttiness, and would be fun to watch if it wasn’t also just *SO* painful.

Here’s the stats I’ve found on it.

22,000-25,000 IOPS peak.

That depends on the cache-friendliness of the application.  However it must be said that IBM’s “testing” shows a much higher rate, but when you’re writing zeros to a system that assumes a zero-state at the start and just drops the write when it matches, it’s not a fair test now is it?

Power/Heat:

Read these numbers:

Operating environment

Temperature:  10 to 35 degrees C
Relative humidity:  25 to 80  percent
Max wet bulb:  23 C
Thermal dissipation:  26K BTU/hour (Holy surfaceofthesun!)
Maximum power consumption:  8.4 kW
Sound Power, LwAu = 8.4 bels

8,400 watts around the clock @ $0.15/kWh = over $11,000 per year just in power requirements for a single frame. That’s not including cooling, which, given that I almost got heat-stroke spending 5 minutes standing behind the thing, has got to add up to a pretty penny as well.  Barry Burke over at The Storage Anarchist put the total operating cost at between $20,000 and $22,000 per year to run.
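
For anyone who wants to check the math, here is the power figure worked out.  It assumes the frame draws its rated maximum 24x7, which is a worst case, and uses the same $0.15/kWh rate as above.

```python
HOURS_PER_YEAR = 24 * 365          # 8,760 hours, ignoring leap years
max_power_kw = 8.4                 # maximum power consumption from the spec list
rate_usd_per_kwh = 0.15            # electricity rate used in the text

annual_kwh = max_power_kw * HOURS_PER_YEAR
annual_cost_usd = annual_kwh * rate_usd_per_kwh
print(f"{annual_kwh:,.0f} kWh/year -> ${annual_cost_usd:,.2f}/year")
```

That works out to roughly 73,600 kWh and a little over $11,000 a year, before a single BTU of cooling is paid for.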

You can have any Protection level you want, as long as it’s Raid-1 (and any remote mirroring as long as it’s synchronous)

With a tip of the hat to Henry Ford.  Moshe *NEVER* liked anything but RAID1.  He grudgingly added Raid-S to the Symm (and did it so half-heartedly that it *NEVER* worked right) because whiny customers demanded it.  For some reason he doesn’t like Raid-5 despite the fact that it has a place, especially in sequential read-intensive applications.  So customers start out being forced to buy twice the storage they actually need.

It works on a distributed-node system

Like the Atmos or Centera.  (More like the Centera in its methods, actually.)  The XIV stores data in “BLOBS” (Binary-Large-Objects), 1Meg in size is what I’m led to understand, spread across the ENTIRE array.  So in theory, if you write a 200Meg file, a piece of that file is on every disk, and mirrored to at least one more disk.

The nodes presumably run a customized Linux OS (As much as I could get out of the CE before he realized I was an EMC-o-phile and quit talking to me)   So the downside of course is that if a node fails, you lose 12 disks in the system.

A dual-disk failure on a full system would almost certainly bring disaster

Yes, I said it.  This is the first system I’ve seen that, when full, would be incapable of losing two drives without data loss.  (The only way it would work would actually be if both drives were in the same node, since presumably the algorithm that governs the writes is smart enough not to mirror blobs within the same node.)

If a disk fails, it immediately starts a rebuild of the data to other disks (presuming the free space exists, but it must reserve enough to know it can re-balance a failure).  Now IBM says the time to rebuild is about 30 minutes.  (maybe true, haven’t seen it so I won’t say for certain)  Now if a second disk fails during that rebuild time, because of the distributed nature of the writes, it’s almost a 100% certainty that if the disk is in another node, it will have elements of the failed disk on it.  When this happens, even IBM’s redpaper says that restore from backup is necessary.  (Don’t know if like the Symm it’s capable of reading from the remote mirror to rebuild, if so that may be how they get around it, but then in order to have REAL protection you have to buy FOUR TIMES the disk space you actually need)
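
Here is a back-of-the-envelope sketch of why that second failure is "almost a 100% certainty".  It uses the numbers from this post (15 nodes, 12 disks per node, 1Meg blobs) and ASSUMES each blob's two copies land on a uniformly random pair of disks in different nodes.  That placement model is my simplification for illustration, not IBM's documented algorithm.

```python
import math

NODES = 15            # node count mentioned in this post
DISKS_PER_NODE = 12   # a failed node takes 12 disks with it
CHUNK_MB = 1          # 1 MB "blobs", as I understand it


def p_pair_shares_chunk(data_mb: float) -> float:
    """Probability that two given disks in different nodes hold both copies
    of at least one chunk, assuming uniform random cross-node placement."""
    disks = NODES * DISKS_PER_NODE
    # Unordered disk pairs whose members sit in different nodes.
    cross_node_pairs = disks * (disks - DISKS_PER_NODE) // 2
    chunks = data_mb / CHUNK_MB
    # P(no chunk uses this exact pair) = (1 - 1/pairs) ** chunks
    p_none = math.exp(chunks * math.log1p(-1.0 / cross_node_pairs))
    return 1.0 - p_none


for data_gb in (1, 10, 100, 1024):
    p = p_pair_shares_chunk(data_gb * 1024)
    print(f"{data_gb:>5} GB stored -> P(overlap) = {p:.6f}")
```

Under that assumption the overlap probability is small with a gigabyte or so on the box and climbs toward 1 well before you get to a terabyte, never mind a full frame.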

And last, but not least:

The IBM XIV back-end consists of…..

(drumroll please)

…(gigabit ethernet)

Yeah…I said it again.  Gig-E.  And not DCE or anything fancy (and lossless) but plain old standard copper gigabit ethernet.

First off, if you’re going to use Gig-E fine, but use optical.  In a box that is rife with magnetic fields, even the most shielded Cat-6 cable is easily penetrated.

The faults in this are too many to fathom.  If they had used DCE with Class3 service (guaranteed delivery) they might have had a chance of making this anything but Tier3 storage.

But the way it works is this.  The fibrechannel connections go to four of the 15 nodes.  These nodes are then in turn connected to the rest of the array via a dual-ethernet setup.  (Probably Round-Robin.  I don’t think the switches they used are layer-3 capable, and as such they probably don’t support EtherChannel or LACP; please correct me if I’m wrong.)

So now *ALL* of your IO is being processed by 4 systems, which then have to write the data to the other 11.

That means, if you have a dual-connected 4 Gig host connect, it actually has only 4 GIG TOTAL instead of the 8Gig front end connection.  Since the back end is completely distributed, every host you add to the XIV takes a percentage of the bandwidth.

So let’s see, if you connect 30 hosts you have 1/30 of 8Gig of bandwidth (four dual-attached FC nodes), or 273 megabits/sec each if they all happen to hit at the same time.  (Now we all know that’s not likely, and that given normal operation *MOST* IO will not get queued up behind other IO.)
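
The per-host number is just an even split of the front end.  A quick sketch, taking the 8Gig front-end figure from above as given; the function name is made up, and the 1024 megabits-per-gigabit convention is an assumption that happens to reproduce the 273 figure.

```python
def per_host_mbit(front_end_gbit: float, hosts: int, mbit_per_gbit: int = 1024) -> float:
    """Even share of front-end bandwidth per host when every host is busy at once."""
    return front_end_gbit * mbit_per_gbit / hosts


share = per_host_mbit(front_end_gbit=8, hosts=30)
print(f"{share:.0f} megabits/sec per host")
```

Use 1000 megabits per gigabit instead and it comes out closer to 267.  Either way it's a worst-case number that only applies when all 30 hosts are hammering the array at once.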

Then it depends on the switch, if they used a switch with a 12G 100% non-blocking backplane, they might pull it off, seeing as the most they’ll have running at any given time is 4Gig.

But when you pair that up against a Clariion CX4-960 (which this customer also has) and look at its 16x 4G dedicated Fibrechannel busses, you wonder what the hell they were thinking.

What does IBM say to do when an XIV system starts getting slow?

“Stop adding hosts/storage to it”

Really?  Are you really saying that if I’m at 70% capacity and I start seeing performance degradation and wait-for-disk in #topas, I should just write-off the remaining 24 terabytes of usable space?

Wow – that’s quite a marketing gimmick.  I bet you’d like me to come and buy another XIV when that happens too.

——————-

So yes, they bought one, and yes, they’re trying to put transactional databases on it, and yes, it’s going to fall flat on its face.

Not sure I want to be around when THAT happens, because I’m *SURE* they’ll try and blame me somehow.

Stay tuned – next week we’ll be evaluating the difference between VMware and RedHat Enterprise(?) Virtualization…. (someone shoot me now…please)

What I run… (And why boot-from-san is a good thing.)

by Jesse on Mar.03, 2010, under General

It amazes me how much power you can get in a desktop today.

From the Quad-core “Extreme” desktop processors, to 64bit Operating Systems that are almost ready for prime-time, the options are limitless.

I recently ‘came into’ some hardware and decided to build a server workstation around it.

The ‘found’ hardware was 16G of Registered, ECC memory in a 4×4 configuration.  (From a client for whom I upgraded to 4×8 who told me to…and I quote…”Keep it, I’ve got no use for it.”  – Why thank you, don’t mind if I do.)

First step was to put a motherboard under it.  Most people know that Registered, Fully buffered, ECC memory won’t work in just any motherboard.  It requires server hardware.  So I go to my favorite computer shop, Affinity Computer Technologies (Sterling, Virginia area) and I ask Bill what I should buy.

My requirements:

  • Must take the memory I have on hand.
  • 2x PCIe x16 slots (for the dual/dual-port video cards I run)
  • at least 1x PCI-X slot for the Emulex LP9802-DC.

What he comes up with, after about 15 minutes of careful research (he’s that good) is a Supermicro X7DAL-E+ motherboard.

This board rocks.  Dual Xeon, supports up to 24G of RAM, and meets the rest of my requirements.

Next buy was the processors, because I certainly don’t have Xeon processors lying around.  (Well, point of fact, I do, but not THOSE Xeon processors)  I opted for a pair of Quad-Core 2.0GHz processors.  They weren’t the best processors I could buy, but they were in my price range.

Thanks to the boys at Affinity, the whole thing was had for under $1,250.

And I bought a new case, the Antec 300, because the case I was running wasn’t fully ATX compliant, and required that I choose between having a CD-Rom and the second processor.  (Not gonna happen)  Antec makes some pretty decent looking / sized cases for under $100.

No hard-drives please.  I’m booting from SAN.

First thing I learned, back when I first installed Win7 on my old workstation, is that PowerPath does NOT work correctly at all on Windows 7.

Second thing I learn is that Windows2008-R2 64Bit almost perfectly emulates Windows7 when you put it in desktop mode and enable all the bells/whistles.

Third thing I learn is that 8 CPU cores, 16G of RAM, and two 1Gbyte video cards make World of Warcraft SCREAM. ;)   Even when there are four VM’s running in the background. :)   Yes, I’m *THAT* nerd. :)

But the best part of it was the migration.  Now as I’ve said in the past I’ve been running the boot-from-san for some time (most recently with the Win2k8R2/PowerPath up and running), so of course the Emulex drivers were already a part of the operating system. So this is how it went:

1. Build new system.

2. Shut down old system.

3. Move Video/Emulex cards to new system

4. Connect and Power On.

5. Reboot twice as motherboard/CPU specific drivers are loaded.

6. Done.

Total migration time – about 45 minutes, including hardware swap.

Now *THIS* is the reason I strongly support and encourage boot-from-san in a datacenter.  Not only does it make it amazingly easy to protect your data (SnapView, MirrorView, etc.), but you have the option of upgrading hardware and keeping your disks/OS intact.

So when the G3 HP’s go out of fashion,  you shut it down, make a simple zoning/masking change, and power the new box on.

If it’s Linux, you don’t even need a reboot most of the time… (however your ifconfig settings will need to be updated, because they’ll get hashed when the MAC of the network card changes).

This is what I do for fun. ;-)

On Symantec/Norton Technical Support….

by Jesse on Feb.24, 2010, under General

Ok – I just had an interaction with a “technical” support rep from Symantec that is quite simply driving me insane.

I want to thank “Shajeewin” for renewing my objection to outsourcing jobs overseas.  Yes people, it’s cheaper.  But then again you do get what you pay for.

Background:  I am trying to set up a customer who wants to push a DR image of ONE system across the internet to my storage.  Initial push about 30G, daily updates in the megabytes range.  The hard part is this customer isn’t the type to spend a lot of money on Bandwidth, so went with Verizon DSL, with its whopping 128K upstream speed.

My solution for this was to sneaker-net the initial recovery point, and then push the incremental updates over the wire.  Simple, right?
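
To show why sneaker-net wins for the first copy, here's the rough math.  It assumes “128K” means 128 kilobits/sec, treats 30G as 30 x 10^9 bytes, ignores protocol overhead, and uses a made-up 10 MB daily delta as a stand-in for “megabytes range”.

```python
def transfer_seconds(size_bytes: float, link_bits_per_sec: float) -> float:
    """Best-case transfer time, ignoring protocol overhead and retries."""
    return size_bytes * 8 / link_bits_per_sec


UPSTREAM_BPS = 128_000            # assumes "128K" means 128 kilobits/sec
initial_image = 30 * 10**9        # ~30G initial recovery point
daily_delta = 10 * 10**6          # hypothetical 10 MB daily update

print(f"initial push: {transfer_seconds(initial_image, UPSTREAM_BPS) / 86_400:.1f} days")
print(f"daily update: {transfer_seconds(daily_delta, UPSTREAM_BPS) / 60:.1f} minutes")
```

Under those assumptions the initial image would tie up the line for about three weeks, while a daily delta of that size goes over in minutes.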

So I look at Norton/Symantec Ghost.  First option, I’ve always liked Norton.

I’m changing my mind about that QUICKLY.

Here is the chat that ensued (with my comments thrown in)

(continue reading…)

Hiatus….

by Jesse on Feb.21, 2010, under General

Well – thanks to a contract glitch I’m spending a few weeks off work. :) No biggie, and I’ll be back before you know it.  It’s been a great opportunity to get some stuff done around the house, build my new workstation, get backups under control, etc. :)

My new workstation is a riot, started out building an updated workstation, ended up something completely…well….other.

The particulars:

  • SuperMicro Serverboard
  • Dual Xeon Quad-Core processors
  • 16GB of Buffered/Registered/ECC memory
  • Dual NVidia dual-head video cards.  (4 heads total)
  • Emulex LP9002-DC Fibre Card
  • Generic BDRAM drive.

It’s nuts.   First thing you’ll note, no hard-drives.  I’m booting from the Clariion.  Set up a dedicated Raid-Group and built a 128G Raid-1/0 lun for the boot volume.

Lastly, the Operating System.  Well I couldn’t use Windows7 for the OS, as much as I wanted to, because PowerPath doesn’t support Windows7 (yet?) so I went with Windows 2008 R2, x64.  Nicely you can put it in desktop mode (by adding the “Desktop Experience” feature) and it does basically the same thing.  Add a copy of VMware Server and I run anything that requires XP in a VM.  (Like my work VPN client, which *HATES* 2008, or 64bit, or both. :)

The greatest part is that, booting from the SAN, I have nightly snapshots taken of the OS volume, which makes life easier in case I blow something up.

It even runs World of Warcraft, which of course was also a requirement. :)
