Archive for the 'Gripe' Category

On Storage and Security.

Tuesday, April 24th, 2012

Hi there.  Remember me?  I’m your friendly (not) neighborhood (probably not) storage blogger (ok, well, sometimes).

So the situation keeps coming up, and it’s worthy of a post, so here I am.  You’ve all (all six of you) probably gathered that I’m not here to promote anything.  I don’t play favorites, and I certainly don’t get any money from the stupid ads I placed on the sidebar… (why are those still there?)  Hell, I’m well aware that I’m sorely lacking even as a writer.

Tonight’s topic:


Storage Security

Why are companies locking the storage admins out of the hosts?

Why, for the love of Pete?  I have a customer where the storage admin’s job stops at the connection to the server, for ‘security reasons.’

It’s a useless endeavor.  It doesn’t gain you ANYTHING as far as security goes, and in fact it *WILL* end up costing you more than you ever dreamed of saving.

It also makes your storage admins feel untrusted and unappreciated… (but employers don’t care about that so much these days.)

So, in a nutshell, I will list the three common reasons for locking the storage people out of the server environments, and why doing so is a complete waste:


Scenarios

  • My computers have sensitive information on them that the storage people shouldn’t have access to.

If you trust someone to manage your storage environment you trust them with your data.  I can name two different ways off the top of my head that a storage admin could gain access to data without ever going NEAR the server, either physically or over the network.  And one of those would be COMPLETELY UNTRACEABLE.

Long story short, the storage admin has access to the data.  Just get used to that fact and stop making up ways to make their lives more difficult.

If you have doubts about the people you’re hiring, look at your hiring practices.

  • The storage admin could inadvertently crash the host.

Well, gee.  Anyone with access to the power cord could do that.  Again, I can think of at LEAST two different ways a storage admin could do that without even trying, both of them things that happen on a daily basis.  (Removing device masking, removing zones.)  Again – you’re fixing the chicken-coop with the fox inside.

Try trusting the people you hire to do what you pay them to do.

  • The storage admin doesn’t require access.

Well, this is kind of a generalization.  Many companies practice an “if your job doesn’t directly relate to the server, you aren’t granted access to it” policy.  If troubleshooting only extended to the point where the server connects to the SAN, the above would be a true statement.  But as with most systems, there are inter-relationships that are crucial.  Multi-path software, HBA management software, drivers/firmware: *ALL* are a part of the storage environment.

And the bottom line is this:  Storage touches EVERYTHING.

If, like most sane companies, you include backup in the storage job, that’s 100% of everything.  Otherwise there are SOME occasions where non-SAN-attached hosts don’t require storage-admin access.

Troubleshooting in an environment where the storage admin’s access ends at the HBA connection can take HOURS longer than it would normally take, and requires at least twice the manpower.

Storage doesn’t stop at the physical layer.  Storage management software counts!


My scenario – Here’s why giving your storage administrator access to the servers *WILL* save you money.

It’s 4:15 on a Friday afternoon. The dual-port PCI-e HBA you put into the server (to save money and slots, which are tight in 1U servers) has failed. Not the port (which, granted, is infinitely more likely) but the chip itself. The SAN storage for the host is down.

As the storage admin, I got a page when the switch ports went dark. Assuming the storage environment is managed properly, I instantly know which host is experiencing the problem. (It’s also safe to assume that the host owner knows, because his disks are MISSING.)

Now, as the storage admin, I’ve tested the connections and the switch ports, and I’ve narrowed it down to an HBA issue. The host needs to be shut down (assuming it’s not Windows and didn’t blue-screen at the first sign of trouble).

Now if I have to coordinate the shutdown, the installation of the new HBAs, the firmware flash, pulling the WWPNs, the rezoning, the remasking and the final reboot of the host, we’re talking about time. Maybe not much, maybe the host admin is on the ball, and maybe if you’re clever you can zone/mask before the initial boot, but you still need to flash firmware to stay within supportability and not risk further problems.

I’ve done this. If I’m doing it myself, the system is back up by now, and the only thing I need the application owner to do is validate that the app is functioning correctly.

If you don’t have access but are sitting in the same room with the person it’s still fairly simple but takes a little longer, though not much.

So let’s hope the failure happens during business hours. If it’s after hours, you’ve got two people driving in instead of one. That’s hours of downtime in total, and that’s if you’re lucky enough to be able to get ahold of the host admin.

Now this came about because I had an outage happen. A VMware LUN disappeared and the owners of the “secure” VMware environment were nowhere to be found. (On what planet is it ok for an IT person to not respond to a page?)

The owners of the “unsecure” VMware environment and I sat around for a while twiddling our thumbs before it became clear that the host owner wasn’t going to get back to us, and management made the decision to leave it for the night.

That’s a whole night this host will be down because the people who were there didn’t have the information needed to finish fixing the problem.

I’ve said it before, I’ll say it again.  If you don’t trust the people you hire, maybe who has access to what isn’t your primary problem.

And suddenly…(redux)

Friday, May 6th, 2011

**ALERT** I’ve had to…modify this post so it won’t offend someone who doesn’t realize that the storage community is very small and that word will get out regardless…

I’m unemployed.

Unexpectedly too.  Unexpected because right up until the day they told me to go home because I wasn’t getting paid, everyone assured me that the contract renewal was in the bag.

I’m such a sucker.  Believing people like that.  Never again.  I’ll also never believe anyone who tells me “don’t worry about it, I’ve got you covered if there’s a gap.”

It’s ok, next gig is on the horizon already…  And it looks like it will be something that, while geographically unpleasant, will be a great job I can learn a LOT from and truly excel at.  For me that is key, because I’ve spent the past two years trying to shoe-horn new ideas into the heads of people who think a new idea is like anthrax, to be avoided at all costs.

(And with that I’d like to say hello to the nice folks at the NSA.  Please forgive me, it was an analogy, if a badly placed one.)

Consulting sucks sometimes.  The worst part of course is not knowing where you’ll be working from year to year, or the fact that you have to keep your eyes open, in permanent recruiter mode.

Of course the money is great, and if you tend to go stagnant on doing the same thing over, and over, and over again…It’s nice to be able to change.

It’s a pity that with being yanked out of an environment with no notice comes no turnover on the projects, and that there are a few implementations that I was in the middle of that might blow up if not tended to properly and in the right time-frame, which sadly isn’t far off.

(Ok, first anthrax and then the phrase ‘blow-up’ – the boys in black are DEFINITELY knocking on my door tonight.)

So the real question is…who is going to get saddled with picking up where I left off, *AND* are they going to ask me to help…

Can’t wait until that happens to give me the opportunity to lecture someone on the value of giving notice. :)

Engineering…

Tuesday, September 14th, 2010

I have seen the future of data storage and I weep for it.

A few random rants at 2am after a data migration didn’t go because I’m not willing to kill a backup process that has a hold of my mount-point.

----

Engineering seems to be a thing of the past.  We’ve graduated to the world of “that’s good enough for launch, we’ll fix the bugs in future releases” in spades.

I’m so tired of hearing people explain to me that a solution is “good enough,” or watching them compromise technically due to cost, when we all know that a badly engineered product will suck the life and profitability out of any company.

Most vendors, and EMC is included in this generalization, have taken to solving hardware engineering problems with software.  Enter the “appliance”.  Consumer-grade hardware thrown in with crappy software designed in some third-world country without any thought to the long-term failure rates of such a combination.

Poor engineers make the mistake of assuming that an overly complex solution must be better than a simple one.  IBM GPFS, Sun Shared QFS and EMC Celerra MPFS are all examples of psychotically overcomplicated solutions to a very simple problem that can be solved with NFSv4 and a decent network back-end.

The more moving parts you throw at a problem, the more chances something stupid, or someone stupid, is going to foul up the mix.

And the more external vendors you buy your parts from, the greater your chances of having to deal with a problem that is of SOMEONE ELSE’s making, since you’ve given up control of your product and Quality is truly a thing of the past.  (You’ve also made your product *WAY* more expensive than it needs to be, because you’re now supporting the margin of everyone else between you and the supplier.)

And if you buy the cheapest product to do a job, remember that it’s cheap for a reason.  IBM has been known to give up to 90% discounts on the XIV platform in order to compete with EMC.

90% – wow.  Reminds me of the usual assumption about guys and big trucks.

If you’ve got to give that much of a discount to convince people to buy your product, then it’s not much of a product now, is it?

----

Consulting vs. Contracting…. A primer….

Wednesday, November 4th, 2009

Ok, I can say it in a sentence.

A contractor is someone you hire to do a job, a consultant is someone you hire to fix a problem.

I’ve done both, but in the last 8 years I’ve been primarily a “Consultant.”  My job is to fix whatever the perceived problem is.

Some companies might have a backup problem.  You streamline their process and reduce redundancy, and poof, backup problem solved.

Some companies might have a replication problem.  You analyze their environment then recommend and implement changes.

Some companies have a data management problem.  You simplify storage, identify tiers, and move storage to where it best suits the organization.  (i.e., static image data doesn’t belong on Tier-1 Symmetrix.)

Some companies have a culture problem.

Here I got nothing.

But when your culture problem interferes with the consulting that you are asking me to do, I bristle.

When your culture problem causes me to wait 8 months of a 1 year contract before I’m given the tools to do my job, I boil.

When your culture problem is making me feel like I should take up golf, I start looking at dice.com for something better.  (I hate golf.)

Maybe it’s just that I *LOVE* what I do.

I do.  I love what I do.  I get paid to do what I love.  Which is why I can’t stand seeing people who are either A> there to collect a paycheck and, maybe if they’re lucky, a pension, or B> trying to create their little empire so they can brag to colleagues about how much money they have to spend this year on nothing.

It boggles the mind.

Backup Vs. Archive

Tuesday, September 15th, 2009

The fundamental difference between BACKUP and ARCHIVE.

A backup is there to help you deal with a crisis such as “My datacenter is a smoking hole in the ground, now what do I do?” or something not quite as dramatic like “A virus ate my data.”  You recover from the backup to the last known good and all is happy, right?  Well, except for the two or three days that might have gone by since your last good backup…  (I was in one law firm that lost a drive only to find out their backups hadn’t been running for two months.  I came back two weeks later to find that a COMPLETE change in personnel had gone on while I was gone.  Lawyers are not very forgiving when they lose two months’ worth of email.)

An archive is data that, while not “Active,” still might be required on a day-to-day basis.  Film / video / image archives are good examples of that.

So on a disk-based archive you have some platform, ostensibly EMC/Legato DiskExtender or Rainfinity or something along those lines, that will move the data from “Active” storage to “Archive” storage.  In some applications you can even set up a true HSM, moving data that hasn’t been accessed to Tier-2 (Enterprise SATA) and even Tier-3 (yes, tape) as it ages, only to be recalled to Tier-1 when it’s accessed.
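For the curious, the HSM idea above boils down to a small policy: pick a tier from how long it has been since the data was last touched, and pull it back to Tier-1 on access.  What follows is only a rough sketch in Python, not how DiskExtender or any other product does it.  The thresholds (90 and 365 days) and the names `FileRecord`, `target_tier`, `age_out` and `recall` are all made up for illustration.

```python
from dataclasses import dataclass

# Hypothetical thresholds, in days since last access. Real policies vary by shop.
TIER2_AFTER_DAYS = 90    # assumption: demote to Tier-2 (enterprise SATA)
TIER3_AFTER_DAYS = 365   # assumption: demote to Tier-3 (tape)


@dataclass
class FileRecord:
    path: str
    days_since_access: int
    tier: int = 1  # everything starts on Tier-1


def target_tier(days_since_access: int) -> int:
    """Return the tier a file belongs on, given how long it has sat idle."""
    if days_since_access >= TIER3_AFTER_DAYS:
        return 3
    if days_since_access >= TIER2_AFTER_DAYS:
        return 2
    return 1


def age_out(record: FileRecord) -> FileRecord:
    """Move an idle file to the tier its age calls for."""
    record.tier = target_tier(record.days_since_access)
    return record


def recall(record: FileRecord) -> FileRecord:
    """An access resets the clock and brings the file back to Tier-1."""
    record.days_since_access = 0
    record.tier = 1
    return record
```

A file idle for 400 days lands on Tier-3 after `age_out`, and a single `recall` puts it back on Tier-1 with its idle clock reset.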

More often than not I’m brought face to face with people who don’t understand that very subtle difference.  One of my recent customers is actually doing it appropriately, using DX and a smallish Centera to archive data that, while retention is required, is almost never actually accessed.

Then there are the people who use backup technology for archival purposes.

I’m pretty “old school” when it comes down to it.

Tape is for backup.  Tape is *NOT* supposed to be used as nearline storage when there are equally inexpensive (and more reliable) disk methods out there.

My main complaint about tape as archive: You don’t know if it’s bad until you try to read it.  And every time you read it, the simple act of moving the tape into a tape drive that was manufactured under less than ideal conditions means you are putting your data at risk.

Spending millions of dollars on a new room-sized tape library doesn’t make sense when Centera storage is fairly inexpensive *AND* provides redundancy of the data automatically.

Spending more millions of dollars on three of them is lunacy when one EMC Atmos setup could provide redundancy and a single namespace for recall.  (And if you go whole hog, geographically relevant retrieval is an option too, so you automatically get it from the closest copy.)

It pains me to see it done wrong.  Especially when it involves trying to shoe-horn two more STK monsters into an already cramped datacenter when the work of it could be done in a couple of floor-tiles of spinning disks.

Would you like a sign-on bonus with that sir?

Tuesday, May 12th, 2009

I recently put a very terse email together to a recruiter.  These people have been sending me 3-5 emails PER WEEK on a position that is almost completely foreign to “Data Storage.”  (Well, it does involve data…but that’s about as close as it gets.)

Recruiting is another place where supposedly technical work is going to the lowest bidder, and judging by the accents I’m hearing, usually overseas.  This drives me nuts, and not just because American companies continue to ship jobs overseas despite the fact that our own unemployment numbers are ready to surpass 9%.  (Source: Bureau of Labor Statistics)

The other reason I hate it is that it’s obvious (see below) that the comprehension level of these people is around the 3rd grade.

Recruiting used to be an art form.  Now it’s usually just some idiot saying “you want fries with that?”

And to people who outsource their recruiting let me tell you something.  If you can’t afford a real recruiter, you probably can’t afford a real staff either.  And if you would rather screw your country than pay someone what they’re worth, I really don’t want to work for you.

————————–My Response:

To whom it may concern:

I usually don’t like to remove myself from job lists, but you’ve been (incorrectly) identifying me for this position for months upon months now.

Please make the madness stop. I am not a “data modeler” by any stretch of the imagination, and continuing to attempt to recruit me for this position proves that you have no idea what a “data modeler” actually is.

Data modeling is a way to structure and organize data so it can be used easily by databases. If you’ll check the information that you’ve got somewhere in your data banks on me, you’ll find that “Structure” and “Organize” have nothing to do with my field. (“Databases” only relates to my field on the periphery, in that it all has to be stored somewhere.)

Thank you for your PROMPT consideration.

/JG

———————-Original Email

Dear Jesse,

If you have the experience required for the following job order, please forward your latest resume to kaden@catstaffing-us.com Or Call me at 201 255 0319 x 177., along with responding to the following questions:

What is your hourly rate?
Where do you currently reside (city, state)?
Would you be willing to relocate?
Are you a registered http://www.logtalent.com User? ( It is FREE, It will help us to track your availability and resume for future job openings )
What is your availability to start a new project?
Are you authorized to work in the United States?
If you are not a US citizen, do you have the legal right to remain permanently in the US ? If not, what is your visa status?
Do you have a personal website URL? Your own blog or personal website. ?

PLEASE CAREFULLY READ THE JOB DESCRIPTION DETAILS AS PROVIDED BELOW. THIS IS AN IMMEDIATE OPENING:

Job Title: Data Architect
Job Location:
Job Type: Contract
JobDetail Description:

Job Title: Data Architect Location: Worcester, MA Length: 6+ months Skills: Data Architect , XML Description: Position Available for Data Architect in Worcester area, MA Summary The Data Architect will responsible for the design and architecture for the enterprise data solutions implemented in projects.
The data architect must be involved in the early stages of projects and produce data-related design deliverables that will enable project teams to build/modify systems in keeping with the overall data architecture strategy.
The Data Architect will need a demonstrated ability to produce conceptual, logical, and physical data models.
Responsibilities Assist project teams regarding data mapping and data modeling as well as facilitation of information gathering sessions, provide data analysis services and document them as required; Develop, implement and maintain processes for logical to physical data model creation.
Processes will provide data integrity, quality, reliability, availability, and reuse; Ensure adherence to company data architectural guidelines, principles and standards in all project milestones and deliverables; Define and implement data strategies as part of the project life cycle; Research, evaluate and recommend new technologies and techniques to more effectively monitor and manage data asset; Design messaging models and schemas for SOA such as XML; Perform dimensional modeling for data warehousing; Demonstrate responsibility and accountability for the deliverables; Other responsibilities as required.
What are the must haves of this position? Solid leadership and influencing skills which balance creative yet practical solutions for the businesses without compromising corporate data strategies and standards; demonstrate respect and a positive attitude, while projecting a sense of calm and control under pressure situations; Strong verbal and written communication skills which can clearly articulate complex concepts and ideas to all levels of the organization in both technical and non-technical terms; Extensive knowledge of data mapping and data modeling processes; Understanding of data architecture as it relates to operational and analytical information, i.e. platforms, reference data, DBMS, data integration; Analytical skills which effectively deal with conceptual and tangible ideas along with attention to detail; Working knowledge of the IT development process and life cycle, i.e. roles/responsibilities, tasks, milestones, deliverables; Team oriented with ability to work effectively with many different people across many diverse organizations; Ability to work on multiple, concurrent projects in a dynamic environment assignment-based organization and manage time appropriately; Management skills, with the ability to plan, organize and orchestrate activities such as JAD and review sessions; ability to identify and manage risks and issues, including appropriate escalation when needed; Need to be a self-starter, passionate for data and able to work independently with minimal supervision; Requirements: Bachelors Degree and 3 – 5 Years of Experience in data related fields; Experience with enterprise-level DBMS systems and tools e.g. Oracle, DB2, SQL Server Experience with Erwin, XML Spy is a plus; Experience with ACORD is a plus.

I remember the good old days….

Tuesday, March 17th, 2009

When the triage guys @ EMC actually listened to the person calling in the problem and directed the call appropriately. (Just had one where I specifically asked for PSE and they routed me to software for some unknown reason. Now, three hours later, it’s been re-routed to PSE.)

When the support specialist working a call would stay through the end of the problem, and didn’t give you the “Sorry my shift is over, please explain your problem to a new guy for 45 minutes before you do anything else productive”

Follow-up. It’s now 0100 the next morning. I’ve been working on this problem for 8 hours straight. The software guy who was originally assigned went home without turning the call over to someone else. I’ve gotten not a single call since before 9pm.

This is not support, people. This is the opposite of support. I’ve since fixed the problem myself, without the help of the SAC/PSE folks. The sad part is if I had done this 8 hours ago I could have been home eating my corned beef and cabbage and drinking my Guinness.

Totally missed St. Patrick’s Day, the only religious holiday I actually observe.

Not happy.

Ok, ok. Not my brightest moment…

Tuesday, October 7th, 2008

Not storage, but business related nonetheless:

Yes, I’m sure most of you have seen it, and yes, it’s me.  I will say that the reports of my demise are greatly exaggerated.  (Spell check says that’s right, but it doesn’t look right to me)

Where you shop may hit your credit – MSNBC.COM

For those of you who don’t care about the inner-workings of my life and business, please feel free to ignore the rest of this post. :)


Comcast -

Monday, April 30th, 2007

They’re getting their last chance.  Morons took me down with their last config fix by re-enabling the blocking of port 80 when they uploaded the most recent “cure-all” config to the modem.

Mental note:  If anyone ever tries to hand you an SMC 8013 or SMC8014 cable modem, run, don’t walk, in the opposite direction.  They are complete crap and can’t handle a simple, single-threaded ISO image download from Microsoft.  (god forbid I do anything in a peer-to-peer network, where downloads are coming in from multiple sources….)

Tomorrow they’re bringing a new NetGear modem out.  If this doesn’t work, I’m going to buy my own (after some careful consulting with one who knows much more about these things than I do) and yell and scream until they provision it for me.

If that doesn’t work, they can go spit, I’ll go back to Verizon.  It was slow, but at least it bloody-well worked.

The Fake Synchronous SAN

Tuesday, February 6th, 2007

In response to this article on “EnterpriseStorageForum”:

 Synchronous SAN Sets Fibre Channel Distance Record

My Response:

True synchronous transmission works over any distance – if you can live with the latency.   The problem is that most host operating systems can’t.  So different buffering schemes are cooked up to fool the host into thinking the write is complete on both sides when in fact it’s not.

Any time you get over about 30km the latency, that is the time it takes for the IO to be transmitted, acknowledged, and released, becomes about that of a normal unbuffered physical drive, about 20-30 ms.

Any further and you will start seeing slower and slower response times and eventually IO timeouts on the source hosts.

In order for a storage system to be truly “synchronous”, the array cannot acknowledge the I/O to the host until it has received the write ACK from both the source *AND* the target array.  If there is buffering going on between point-A and point-B, such as a Cisco MDS with the buffer credits cranked up or a Nishan IPS 3300, it is not truly synchronous replication, because the failure of the switch on the source (or target) side will cause the target array to have missed I/Os that have already been acknowledged to the host as complete.
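To make the point concrete, here is a toy model of that failure, written in Python.  It is a sketch under loud assumptions: `Array`, `Link`, `host_write` and `acked_but_missing` are names I made up, and nothing here resembles how any real array or switch is implemented.  The only thing it models is who gets to say “ACK” and when.

```python
class Array:
    """A storage array that commits whatever it is handed."""

    def __init__(self):
        self.committed = []

    def write(self, block):
        self.committed.append(block)
        return True  # write ACK


class Link:
    """Hypothetical inter-site link. If buffered, it ACKs before delivery."""

    def __init__(self, target, buffered):
        self.target = target
        self.buffered = buffered
        self.queue = []
        self.up = True

    def send(self, block):
        if not self.up:
            return False
        if self.buffered:
            self.queue.append(block)  # parked in the switch, not on the target
            return True               # early ACK: this is the lie
        return self.target.write(block)  # true sync: ACK comes from the target

    def drain(self):
        """Deliver anything still sitting in the buffer."""
        while self.up and self.queue:
            self.target.write(self.queue.pop(0))

    def fail(self):
        """Link or switch dies; whatever was buffered is gone."""
        self.up = False
        self.queue.clear()


def host_write(source, link, block):
    """The host sees success only if the source and the remote side both ACK."""
    return source.write(block) and link.send(block)


def acked_but_missing(buffered):
    """Return blocks the host was told were complete but the target never got."""
    source, target = Array(), Array()
    link = Link(target, buffered)
    acked = []
    for block in ("a", "b"):
        if host_write(source, link, block):
            acked.append(block)
    link.drain()                       # the buffer catches up once
    if host_write(source, link, "c"):  # one more write lands in the buffer
        acked.append("c")
    link.fail()                        # and then the switch dies
    return [b for b in acked if b not in target.committed]
```

With `buffered=False` the function returns an empty list: nothing was acknowledged that the target doesn’t have.  With `buffered=True` it returns `["c"]`, a write the host believes is safe on both sides that exists only at the source.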

Sorry – but this test, as it appears to have been run, was obviously designed by the various vendors to accentuate their hardware without showing the failures and flaws in the logic.  I’m sure I could walk in there and in about 5 minutes simulate a link failure that would leave the remote site in an inconsistent state at worst, or having to roll back journaled I/Os at best.