50Micron.com

Recoverpoint vs. Conventional Replication

Posted on Mar. 17, 2009, under Replication

Ok – I see the surface benefits of a third-party replication appliance, such as Recoverpoint. I even sat through a long discussion on it. I’m still totally on the fence, but there are a few questions I need satisfactorily answered before I can recommend this to my customer.

I want to ask my readers (all five of you): what’s your take on Recoverpoint, or any appliance-based replication standard, vs. array-based SRDF, Mirrorview/S, and the like?

There were a few points in the presentation where I had to cry foul.

“There is no appreciable latency in the appliance.”

Sorry, that doesn’t wash. Any time you put an intermediary device into a fibre path, you’re going to introduce latency. The real question (and answer) is “Is the latency introduced by said appliance mitigated by the compression gain?”
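To put rough numbers on that question, here is a minimal back-of-the-envelope sketch. It assumes a fixed per-write latency added by the appliance and a simple model where compression shortens time on the wire in proportion to the compression ratio. The function name, the 0.5 ms appliance latency, the 2:1 ratio, and the link speeds are all made-up illustrative values, not measurements of Recoverpoint or any other product, and propagation delay is ignored.

```python
def net_gain_ms(write_bytes, link_mbps, compression_ratio, appliance_latency_ms):
    """Wire time saved by compression minus latency added by the appliance.

    Positive result: compression more than pays for the appliance hop.
    Negative result: the appliance costs more time than compression saves.
    """
    bytes_per_ms = link_mbps * 1_000_000 / 8 / 1000  # link speed in bytes per millisecond
    raw_ms = write_bytes / bytes_per_ms              # time on the wire, uncompressed
    compressed_ms = raw_ms / compression_ratio       # time on the wire, compressed
    return (raw_ms - compressed_ms) - appliance_latency_ms


if __name__ == "__main__":
    # Hypothetical inputs: 64 KiB write, 2:1 compression, 0.5 ms appliance latency.
    for name, mbps in [("100 Mbit WAN", 100), ("8 Gbit FC", 8000)]:
        gain = net_gain_ms(64 * 1024, mbps, 2.0, 0.5)
        print(f"{name}: net gain {gain:+.3f} ms per 64 KiB write")
```

Under these assumed numbers the slow link comes out ahead and the fast link comes out behind, which is the point: whether the added latency is "mitigated" depends on the link, the write size, and the real compression ratio, so it has to be measured rather than asserted.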

“Fully Synchronous Replication”

Synchronous replication exists when the IO is not acknowledged back to the host until the write is completed to disk on BOTH SIDES. Because of obvious latency issues this is not possible without (a) sufficient bandwidth and (b) low latency. Since the Recoverpoint product seems to respond to the source array as soon as the write is journaled, there is a potential for data loss if the source building becomes a smoking hole in the ground. (Or even if something as simple as a regional power failure happens, provided such a failure affects the circuit as well as the datacenter.)
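The difference in acknowledgement semantics can be shown with a toy model. This is only a sketch of the two behaviours described above, and it assumes, as the concern above does, that the journal sits on the source side. The `Site` class and the `sync_write`, `journaled_write`, and `drain` functions are hypothetical names for illustration, not any vendor's API.

```python
from dataclasses import dataclass, field


@dataclass
class Site:
    """A storage site that simply records the writes committed to it."""
    name: str
    blocks: list = field(default_factory=list)

    def commit(self, data):
        self.blocks.append(data)


def sync_write(data, source, target):
    """Synchronous: acknowledge only after BOTH sides have committed."""
    source.commit(data)
    target.commit(data)  # the host waits for the remote commit
    return "ack"


def journaled_write(data, source, journal):
    """Journaled: acknowledge once the write is in the (assumed source-side) journal."""
    source.commit(data)
    journal.append(data)  # not yet at the remote site
    return "ack"


def drain(journal, target):
    """Ship journaled writes to the target in order."""
    while journal:
        target.commit(journal.pop(0))


if __name__ == "__main__":
    source, target, journal = Site("prod"), Site("dr"), []
    sync_write("w1", source, target)
    journaled_write("w2", source, journal)
    print("acknowledged but lost if the source site dies now:", journal)
    drain(journal, target)
    print("target after drain:", target.blocks)
```

In this model, anything still sitting in `journal` when the source site is lost has been acknowledged to the host but never reached the target. If the journal instead lives on the target side, that window closes, so where the journal sits is the detail worth pinning down.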

Anyway – I really need to know – are my concerns off base or am I just being protectionist of the technologies I already know?


12 Comments for this entry

  • Storagezilla

    “Any time you put an intermediary device into a fibre path, you’re going to introduce latency. The real question (and answer) is ‘Is the latency introduced by said appliance mitigated by the compression gain.’”

    Are we talking splitting via a switch or on a host? If it’s a switch then the write will be split at ASIC speed in parallel; if it’s on a host the writes will be serialised. It’ll commit to the appliance and then to the primary storage when replication is active for the volumes being written to. The reason it does that is so you can regulate applications and never end up in a runaway situation where you fall too far behind.

    “Synchronous replication exists when the IO is not acknowledged back to the host until the write is completed to disk on BOTH SIDES. Because of obvious latency issues this is not possible without A> Sufficient Bandwidth B> low-latency. Since the Recoverpoint product seems to respond to the source array as soon as the write is journaled, that means there is a potential for data-loss if the source building becomes a smoking hole in the ground. (Or even if something as simple as a regional power-failure happens, provided such a failure affects the circuit as well as the datacenter.)”

    Also correct. Sync Replication is supported with CDP, Stretched CDP and Cascaded RecoverPoint. What they all have in common is that the sync part is over FC not IP and the primary journal and associated replica volumes are located on the *target* side.

    With CRR the journal would be located on the source side. That’s why it’s async.

    And you’re not being a protectionist. My rules of thumb.

    Symm to Symm = SRDF
    Clariion to Clariion over short distances = MirrorView/S
    Clariion to Clariion over long distances = I think RP can do a lot more here than MirrorView/A but anyone squeezing a Dollar might stick with MV/A
    Symm to Other = RecoverPoint
    Clariion to Other = RecoverPoint
    Other to Other = RecoverPoint
    CDP of any stripe = RecoverPoint

  • Jesse

    Thanks – you just did a better job of explaining RP to me than any of the experts I’ve talked to to date.

    And I’m sure that any idea EMC might have of replacing SRDF or Mirrorview in the distant future is pure fantasy... I’m hoping...

  • Storagezilla

    No problem. I’ve asked all these questions myself hence I have more answers.

    SRDF and MV are not being replaced by RecoverPoint. It fits in places they don’t. They fit in places it doesn’t.

  • Superstar

    Greetings,
    We took a look at this a few years back (2 years now?) EMC had quite a time getting it running with our Cisco MDS 9509 switch using their recommended firmware levels throughout the stack. While it eventually worked — the Firmware on the switch broke other things. Split writes were pretty sweet — just be sure to get it in house to test before you find it breaks other things!

    • Jesse

      It’s hard to argue RecoverPoint against something like SRDF or MV when a lot of the stories I’m getting are like that one…

      Bottom line is:

      SRDF works. It just works, there is never a question about it, it’s the backbone product and what makes a Symm uniquely a Symm.

      Mirrorview works. Maybe with a few more gotchas and caveats than SRDF, but it also, once configured, is fire and forget.

      When it comes to mirroring technology for DR purposes, I want something that just…works.

      Of course I say this as I sit on a customer site waiting for an SRDF script that bombed out at 5 this evening to finish. Guess that will teach me to run a config change without rebooting the SP’s first.

  • StorageGuy201

    Well, this is more of a question, but still on-topic. We have a DMX and a Clariion (tier 1 and tier 2). We’re considering replication to another site (which doesn’t exist yet), so we want a single product that will replicate both the DMX and the Clariion to the secondary storage (perhaps another Clariion). We also have a Celerra, but we could potentially use Celerra Replicator for that. Unfortunately our budget is small, so both SRDF and RecoverPoint are out. We were thinking about building our own rsync-based solution, but for DR we want something bulletproof with end-to-end resiliency. I miss my old NetApp days: SnapVault or SnapMirror, turn it on and forget about it, RPO/RTO well within specs. I’m sure SRDF is similar, except in my situation I have a Clariion to account for too. It’s a shame that while they’re both EMC products they don’t natively talk to each other.

    • Jesse

      Sorry for the late response, this got lost in the bit-bucket and I didn’t catch it.

      SANCopy on the remote Clariion will work, but you’ll need Fibre Channel connectivity end-to-end, which might be more expensive than you want.

      Do you own TimeFinder on the DMX and SNAP on the Clariion? If so, you can buy one copy of SANCopy for the remote, and do pull replications from the sources (snaps).

      You can’t do a pull from a hot source (i.e. from the actual production volumes) because SANCopy isn’t capable of halting the IO remotely to provide a consistency point.

      It’s a good alternative, though in your situation I would actually recommend RecoverPoint (and those that know me know I don’t do that lightly) as a single-point replication engine. Otherwise it’s OpenReplicator for DMX–>Clariion and SANCopy or Mirrorview for the Clariion–>Clariion.

  • Dimitris Krekoukias

    The other main advantage of recoverpoint is that you can arbitrarily roll back your data, not so much that it efficiently replicates…

    • Jesse

      So it becomes a “You can have it fast, *OR* you can have it reliable” type situation.

      That’s fine and dandy, but ethically shouldn’t it be marketed that way? When I started asking for numbers on latency I thought the Recoverpoint dude was going to burst a seam.

      “No measurable latency” was his basic response…

      Sorry, but I can’t buy that for any measurable time increment. I think it’s more likely that they’ve either never tested it or just never released the results. Handling data takes time, no matter how you handle it. Even simply passing an IO through an ASIC takes time; even if it is a fraction of a millisecond, it’s still time, and that time adds up.

  • Bob

    After reading the EMC docs, this is my current working theory: when not using a host splitter, a switch (e.g. Cisco MDS 9000 with SSM line card) is part of the RP (RecoverPoint) solution data path, so RP is in the primary data path, along with switch+appliance latency and cost. Turn off either the switch or the RP appliance and storage writes stop. That means that both the RP splitter switch and the RP appliance are in-band to the data path.

    This is the host write data path:

    (1) Host writes to the SSM line card.
    (2) The SSM card adds latency by sending a Pending Write Log [PWL] command to the RP appliance and holds up the storage write waiting for a reply.
    (3) The RP appliance acknowledges the PWL to the SSM card.
    (4) Then, and not until then, the SSM card sends duplicate writes to the storage and the RP appliance.

    The RP appliance not only adds additional latency, but IS IN THE PRIMARY WRITE PATH. The SSM line card design includes an ARL – “for the detection of host writes that never reached the RP appliance”, and a PWL – “for the detection of host writes that reached the RP appliance but never reached the storage”.

  • assaf

    The RecoverPoint journal is on the replica site and not on production; thus, your claim about the building being destroyed is not correct.
