Archive for the 'EMC Failures' Category

I remember the good old days….

Tuesday, March 17th, 2009

When the triage guys @ EMC actually listened to the person calling in the problem and directed the call appropriately. (Just had one where I specifically asked for PSE and they routed me to software for some unknown reason. Now, three hours later, it’s been re-routed to PSE.)

When the support specialist working a call would stay through the end of the problem, and didn’t give you the “Sorry my shift is over, please explain your problem to a new guy for 45 minutes before you do anything else productive.”

Follow-up. It’s now 0100 the next morning. I’ve been working on this problem for 8 hours straight. The software guy who was originally assigned went home without turning the call over to someone else. I haven’t gotten a single call since before 9pm.

This is not support, people. This is the opposite of support. I’ve since fixed the problem myself, without the help of the SAC/PSE folks. The sad part is if I had done this 8 hours ago I could have been home eating my corned beef and cabbage and drinking my Guinness.

Totally missed St. Patrick’s Day, the only religious holiday I actually observe.

Not happy.

Enterprise vs….not

Sunday, June 22nd, 2008

I have a cousin. Very well-to-do man, owns a company that does something with storing and providing stock data to other users. I don’t pretend to know the details of the business, but what I do know is that it’s storage and bandwidth intensive.

He’s building his infrastructure on a home-grown storage solution: Tyan motherboards, Areca SATA controllers, InfiniBand back-end, etc. Probably screaming fast, but I don’t have any hard numbers on what kind of performance he’s getting.

Now I understand people like me not wanting to invest a quarter-mil on “enterprise-class” storage, but why would someone whose complete and total livelihood depends on their storage infrastructure rely on an open-source, unsupported architecture?

One of the things you get with the Symmetrix is the 24×7 monitored support. One of the stories I tell people is about my first experience with EMC. When I worked at Intuit I was on the graveyard operations shift. (The grunt shift that most of us have been subjected to at least once in our lives.) About 4am one morning I got a call from EMC saying that a hard disk in our old Symmetrix-3 array had failed, and that the tech would be onsite in about 20 minutes (I guess they gave him the head start) to replace it. I asked them if there was anything I needed to do and they told me that it was transparent and that the hosts wouldn’t notice the difference.

I was in love.

People ask what the “Enterprise” money gets you, and that’s it. You get the security of knowing that it doesn’t matter when, where, or how a failure happens, they are on top of it and have it dealt with before you even know the problem exists most of the time.

My second great EMC story: I was working at the Library of Congress on a tech-refresh. They had four Symm4 and two Symm5 arrays that were being upgraded to a pair of DMXs. About two weeks before we were to have decommissioned one of the Symm4s, it started experiencing problems. It seemed that two of the three power supplies had failed. The Symm4 was at least 7 years old at the time, and was designed for n+1 redundancy.

Even with two-thirds of its power gone, the thing kept running for almost 7 hours, tapping the internal batteries as needed. (Unfortunately it took only slightly longer to locate a replacement power supply for such an antiquated piece of hardware, but at least it gave us the chance to gracefully power down the last remaining hosts and power off the Symm.)

I’ve heard other stories, one in particular of a Symm in California that, after an earthquake, ran lying on its side until the hardware could be replaced and the data migrated off it. (But having no first-hand knowledge of this, I will consider this an urban legend until someone who witnessed it tells me it really happened.)

*THAT* is what you get for enterprise money.

Of course another relative from the same branch of the family is the one who told me “I have RAID, why do I need backups?”