You know, every time I start thinking about “cloning,” I’m afraid the far-right is going to burn me in effigy, just on principle.
But in this case, I’m talking about cloning disks within an array for a data migration.
The decision was made to move our Microsoft Exchange (corporate email) environment from Tier-1 (RAID-1 + SRDF) to Tier-2 (3+1 RAID-5) storage. I guess the logic is that losing a day of email won’t hurt us horribly. (OK, maybe it will, but I’ll get into the solution to that problem in a different post.)
For this migration, I’ve got the following drives:
- 7x 84 GB metavolumes
- 4x 33 GB metavolumes
- 2x single hypervolumes
The new Exchange administrator, with whom I’m actually quite impressed so far, would like to add 3x 200+ GB metavolumes to the mix. The main reason for the move is that we’re rapidly running out of Tier-1 storage and need to save it for expansion, production growth, etc. (Or buy new disks, but that’s yet another story.)
So I am going to use this opportunity to demonstrate the power of TimeFinder/Clone.
I’ve created the new volumes on the RAID-5 tier, mapped them to the front-end ports, and prepared the masking scripts to move the device masking from the old devices to the new ones.
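Since the masking scripts are where half the real work lives, here’s roughly what they boil down to on a Symmetrix: symmask edits against the masking database. This is a sketch, not my actual script. The Symmetrix ID, HBA WWN, device numbers, and director/port are all placeholders, and exact flags can vary between Solutions Enabler versions, so check the symmask documentation for your environment. (The mapping itself was done separately with symconfigure.)

```
# For each Exchange cluster node HBA: revoke the old Tier-1 devices...
symmask -sid 1234 -wwn 10000000c9123456 remove devs 00A1,00A2 -dir 7A -p 0

# ...and grant the new RAID-5 devices in their place.
symmask -sid 1234 -wwn 10000000c9123456 add devs 01B1,01B2 -dir 7A -p 0

# Push the updated masking database out to the directors.
symmask -sid 1234 refresh
```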
Now, in the old days, the way you’d do this is: shut the hosts down, do a bit-by-bit copy of the disks (if you’re lucky you can do them in parallel; otherwise it’s single-threaded), change the pointers on the host, and bring everything back up, hoping it’s all exactly how you left it. Net downtime could be in the neighborhood of several hours, if you’re lucky.
This is a new beast. Enter TimeFinder/Clone. Now I have the blank devices, and I do things in a slightly different order (the command sketch after this list shows the SYMCLI side of the key steps):
- Create the clone session; this establishes the pairing of source and target devices.
- Shut down the Exchange services.
- (each node) Unmount the existing disks using admhost or symntctl.
- (each node) Change the device masking so Exchange sees its new disks.
- (each node) Rescan the bus to remove the entries for the old disks and create entries for the new LUNs. (At this point, it will see all-new, blank disks.)
- Shut the cluster hosts down.
- “Activate” the clone session.
- Bring the first “active” node of Exchange up.
- Bring the passive node up.
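For the curious, the SYMCLI side of the interesting steps looks roughly like the sketch below. Again, a sketch rather than my production script: exch_dg is a made-up device group name, the drive letter is a placeholder, exact symntctl and symclone options vary by Solutions Enabler version, and depending on how your device group is built you may need to spell out the source/target pairs explicitly.

```
# Pair each source device in the (hypothetical) exch_dg group with its
# RAID-5 target; -copy means a full background copy will run once the
# session is activated.
symclone -g exch_dg create -copy

# On each cluster node: get Windows cleanly off the old disks...
symntctl umount -drive E

# ...and, after the masking change, rescan so the old LUNs drop out of
# the OS and the new (still blank) LUNs appear.
symntctl rescan

# The point-in-time moment: the targets become usable immediately while
# the background copy churns away underneath.
symclone -g exch_dg activate

# Watch the session run down to the "Copied" state while Exchange runs.
symclone -g exch_dg query
```

The -copy flag matters for a migration: if memory serves, the default session mode only copies tracks as they’re accessed, while -copy pushes the background copy through to completion, which is what eventually lets you tear the session down and reclaim the Tier-1 devices.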
Net downtime is about an hour. The reason: once the clone session is activated, a background copy starts, but from the target side, any reads of “invalid” tracks on the target disks actually get serviced from the source disks. As far as the host is concerned, it’s all the original data.
As the copy progresses, more and more of the reads are serviced from the new disks.
When the array receives a write to a track that hasn’t been copied to the target yet, that track is first copied, THEN the write is processed on the target disks only. This preserves your production disks in their original state.
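If that’s hard to picture, here’s a toy model of the track bookkeeping in Python. To be clear, this is emphatically not EMC’s implementation, just my mental model of it: a per-track “copied” bitmap, target reads redirected to the source while tracks are invalid, and a copy-before-write when an uncopied track gets written.

```python
# Toy model of TimeFinder/Clone track handling: one list slot = one track.

class CloneSession:
    def __init__(self, source):
        self.source = source                 # production (Tier-1) device
        self.target = [None] * len(source)   # new device, all tracks "invalid"
        self.copied = [False] * len(source)  # per-track valid-on-target bitmap

    def read(self, track):
        """Host read against the TARGET device."""
        if not self.copied[track]:
            # Invalid track: service it from the source, and make the
            # target's copy valid while we're at it.
            self.target[track] = self.source[track]
            self.copied[track] = True
        return self.target[track]

    def write(self, track, data):
        """Host write against the TARGET device."""
        if not self.copied[track]:
            # Copy the original track first. (Redundant in this toy model,
            # which overwrites whole tracks, but a real host write may only
            # touch part of a track, so the old contents must land first.)
            self.target[track] = self.source[track]
            self.copied[track] = True
        # The write hits the target only; the source is never modified,
        # so production stays in its original state.
        self.target[track] = data

    def background_copy_step(self):
        """One tick of the background copier kicked off at activation."""
        for track, done in enumerate(self.copied):
            if not done:
                self.target[track] = self.source[track]
                self.copied[track] = True
                return True   # copied one more track
        return False          # every track valid: the session is "Copied"
```

Loop background_copy_step() and you can watch the bitmap fill in, which is more or less what the array is reporting as the session’s copy percentage climbs.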
With the advent of TF/Clone, I’m surprised anyone still uses BCVs. They’re so “old-tech.” The main hang-up, of course, was the fact that while you could (in theory) protect a BCV using RAID-1, the performance hit you took during establish and split operations was so bad that it wasn’t worth it. With TF/Clone you can go from RAID-1, to 3+1 RAID-5, to 7+1 RAID-5, etc., with minimal performance impact.
The only downside, of course, comes when you’re cloning production volumes while they’re in use. Since reads are being serviced by the production disks and not the clone disks (technically, the tracks you’re reading are simply being copied to the target while the host read is serviced from the cached track), you’re impacting the production spindles during the copy process.
It’s a cool bit of magic – and it’s really fun to play with the minds of people who don’t understand the technology.