DavidRing.ie

EMC RecoverPoint Architecture and Basic Concepts

This is my first blog on RecoverPoint; in this initial post I will detail some of the basic concepts and terminology around RecoverPoint and the GEN 5 hardware appliance specification.

• Overview
• Gen5 Hardware
• Terminology

Overview

RecoverPoint provides continuous data protection for storage arrays. It runs on a dedicated appliance (RPA) and protects data both locally and remotely. RecoverPoint provides bi-directional replication and enables recovery of data to any point in time while replicating over any distance: within the same site (CDP), to a distant site (CRR), or both concurrently (CLR). Data transfer within the same site uses Fibre Channel (FC) connectivity; for transfer between sites, both FC and IP (WAN) are supported.

Synchronous replication is supported when the remote sites are connected through FC and provides a zero RPO. In a synchronous configuration the lag between production and the remote copy is always zero, since RecoverPoint does not acknowledge the write before it reaches the remote site. Asynchronous replication provides crash-consistent protection and recovery to specific points in time.

[Image]

An example of a local Continuous Data Protection (CDP) solution:

From the above image you can see that the splitter sends a copy of each write to the production LUN and to the RPA. The write is acknowledged by both the LUN and the RPA. The RPA writes the data to the journal volume along with a timestamp and bookmark metadata. The data is then distributed to the local replica in a write-order-consistent manner. This means that if your consistency group contains many LUNs, all the data being written across them is write-order consistent.
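To make the CDP write path a little more concrete, here is a minimal Python sketch of the sequence described above. It is only a toy model, not RecoverPoint code: the names `ToyRPA`, `JournalEntry` and `split_write` are made up for illustration, and the production LUN and replica are modelled as plain dictionaries.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class JournalEntry:
    sequence: int       # write order across the whole consistency group
    timestamp: float    # time stamp recorded with the write
    lun: str
    offset: int
    data: bytes
    bookmark: str = ""  # optional label for this point in time


@dataclass
class ToyRPA:
    """Hypothetical stand-in for an RPA: journal first, distribute later."""
    journal: List[JournalEntry] = field(default_factory=list)
    replica: Dict[Tuple[str, int], bytes] = field(default_factory=dict)
    distributed: int = 0  # journal entries already applied to the replica

    def receive(self, lun, offset, data, timestamp, bookmark=""):
        # Record the write in the journal with its time stamp and bookmark.
        entry = JournalEntry(len(self.journal), timestamp, lun, offset, data, bookmark)
        self.journal.append(entry)
        return True  # acknowledgement back to the splitter

    def distribute(self):
        # Apply pending writes to the replica strictly in journal order,
        # regardless of which LUN in the group they belong to.
        for entry in self.journal[self.distributed:]:
            self.replica[(entry.lun, entry.offset)] = entry.data
        self.distributed = len(self.journal)


def split_write(production, rpa, lun, offset, data, timestamp):
    """Toy splitter: one copy to the production LUN, one copy to the RPA."""
    production[(lun, offset)] = data
    lun_ack = True
    rpa_ack = rpa.receive(lun, offset, data, timestamp)
    return lun_ack and rpa_ack
```

Because every write in the group gets a single sequence number, replaying the journal in order leaves the replica LUNs consistent with each other, which is the point of write-order consistency.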

[Image]

An example of a Continuous Remote Replication (CRR) solution:

If we examine the IO sequence of the CRR solution we can see again that the IO is split, sending one copy to the production LUN and the other to the RPA. As mentioned, the replication can be either:

1. Asynchronous – In asynchronous replication the write IO from the host is sent to the RPA, which acknowledges it as soon as the data arrives in its memory.
2. Synchronous – In synchronous mode the RPA does not acknowledge a write until it reaches either the memory of the DR site’s RPA or the DR site’s persistent storage, depending on whether the “measure lag to remote RPA” flag is enabled in the configuration. Synchronous replication can run over FC or IP, provided the full round-trip latency does not exceed 4ms over FC or 10ms over IP.
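The latency limits for synchronous replication lend themselves to a quick check. The sketch below is a hypothetical helper (the name `sync_supported` and the lookup table are mine, not part of any RecoverPoint tooling) that compares a measured round-trip time against the limits quoted above.

```python
# Round-trip latency limits for synchronous replication, in milliseconds,
# taken from the figures quoted in this post.
SYNC_RTT_LIMIT_MS = {"FC": 4.0, "IP": 10.0}


def sync_supported(transport: str, round_trip_ms: float) -> bool:
    """Return True if the measured round trip is within the sync limit."""
    limit = SYNC_RTT_LIMIT_MS[transport.upper()]
    return round_trip_ms <= limit
```

For example, `sync_supported("IP", 12.5)` returns `False`, so a link like that would be a candidate for asynchronous replication instead.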

For a concurrent local and remote (CLR) solution, CDP and CRR run simultaneously.

The RecoverPoint family consists of three license offerings:
• RecoverPoint/CL (Classic) for replicating across EMC arrays, and across non-EMC storage platforms with the use of VPLEX. Supports all EMC array splitters. Note: capacity is ordered per RPA cluster, not per RP system.
• RecoverPoint/EX for VMAXe™, VPLEX™, VNX™ series, VNXe3200, CLARiiON® CX3 and CX4 series, XtremIO, ScaleIO and Celerra® unified storage environments.
• RecoverPoint/SE for VNX series, VNXe3200, CLARiiON CX3 and CX4 series, and Celerra unified storage environments.

Gen5 Hardware

The RecoverPoint appliance (RPA) is a 1U hardware-based server (Intel R1000). The specification of the RPA is as follows:

• 2 x Quad Core Sandy Bridge Processors
• 2 x 300GB 10K RPM 2.5” SAS Drives in RAID1 configuration
• 6 x 1GbE ports (RJ-45): WAN, LAN and Remote management, plus 3 unused ports
• 16GB DDR3 Memory
• PCIe slot 1: Quad Port 8Gb FC QLogic 2564 Card (PCIe slot 2 is empty)

From the image below you can see the port usage for WAN and LAN, and the HBA port sequence (left to right): 3-2-1-0. For each RPA, we use two Ethernet cables: one connects the Management (LAN) interface to eth1 and the other connects the WAN interface to eth0.

GEN5 RPA:

[Image: GEN5 RPA rear view]

Note: A RecoverPoint cluster must have a minimum of 2 and a maximum of 8 RPAs. Cluster sizes must be the same at each site of an installation. A RecoverPoint environment can have up to 5 clusters, either local or remote, although RP/SE is limited to two clusters. GEN4 and GEN5 RPAs can co-exist in the same RP cluster.
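The cluster rules in the note above can be written down as a small sanity check. This is only a sketch under my own naming (`validate_environment` and the constants are hypothetical); it takes the number of RPAs in each cluster and returns any rule it finds broken.

```python
MIN_RPAS_PER_CLUSTER = 2
MAX_RPAS_PER_CLUSTER = 8
# Maximum clusters per environment; RP/SE is limited to two.
MAX_CLUSTERS = {"default": 5, "SE": 2}


def validate_environment(rpas_per_cluster, license_type="default"):
    """Return a list of rule violations (empty if the layout looks valid).

    rpas_per_cluster: list of RPA counts, one entry per cluster.
    """
    problems = []
    limit = MAX_CLUSTERS.get(license_type, MAX_CLUSTERS["default"])
    if len(rpas_per_cluster) > limit:
        problems.append(f"{len(rpas_per_cluster)} clusters exceeds the limit of {limit}")
    for index, count in enumerate(rpas_per_cluster):
        if not MIN_RPAS_PER_CLUSTER <= count <= MAX_RPAS_PER_CLUSTER:
            problems.append(f"cluster {index} has {count} RPAs (allowed: 2 to 8)")
    if len(set(rpas_per_cluster)) > 1:
        problems.append("cluster sizes differ between sites")
    return problems
```

A two-site layout with 2 RPAs per cluster, `validate_environment([2, 2])`, returns an empty list, while `validate_environment([2, 4])` flags the size mismatch.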

Terminology

Splitter – The function of the array-based splitter is to ensure that the RPA receives a copy of each write to the protected LUN. At the production site the splitter splits the IOs so that both the RPA and the storage receive a copy of the write while maintaining write-order fidelity. At the DR site, the splitter is responsible for blocking unexpected writes from hosts and for supporting the various types of image access.

RecoverPoint Repository Volumes – are dedicated volumes on the SAN-attached storage at each site. One repository volume is required for each RPA cluster. The repository holds the configuration information about the RPAs and consistency groups. Repository volumes are only exposed to the RPAs. The minimum size for the repository is 2.86GB.

RecoverPoint Journal Volumes – are SAN-attached storage volume(s) for each copy used in a consistency group (the production copy, local replica copy, and remote replica copy). As with the repository, journal volumes are exposed only to the RPAs, not to the hosts. There are two types of journal volumes:
1. Replica journals – hold snapshots that are either waiting to be distributed or have already been distributed to the replica storage. They also hold the metadata for each image and bookmarks. A replica journal holds as many snapshots as its capacity allows.
2. Production journals – are used when there is a link failure between sites. In this situation marking information is written to the production journal and synced to the replica when the link comes back online. This process is known as delta marking (Marking Mode). The production journal does not contain snapshots used for PIT recovery. Note: The minimum size of a journal volume is 10GB for a standard consistency group and 40GB for a distributed consistency group.
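As a rough illustration of delta marking, the sketch below tracks which blocks changed while the link was down and copies only those blocks once it returns. It is a simplified model and makes an assumption the text above does not spell out: that the marking information is just the set of changed block addresses. The class name `ToyProductionJournal` is hypothetical, and the production and replica volumes are plain dictionaries.

```python
class ToyProductionJournal:
    """Hypothetical model of marking mode on a production journal."""

    def __init__(self):
        self.link_up = True
        self.marked_blocks = set()  # assumed form of the marking information

    def on_write(self, production, replica, block, data):
        production[block] = data
        if self.link_up:
            replica[block] = data          # normal replication
        else:
            self.marked_blocks.add(block)  # only remember what changed

    def resync(self, production, replica):
        """Link is back: copy only the marked blocks, then clear the marks."""
        self.link_up = True
        for block in sorted(self.marked_blocks):
            replica[block] = production[block]
        synced = len(self.marked_blocks)
        self.marked_blocks.clear()
        return synced
```

Note that a block written several times during the outage is marked once and copied once, which is why nothing recorded this way can serve as a point-in-time snapshot.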

Replication Set – a protected SAN-attached storage volume from the production site and its replica (local or remote) are known as a replication set.

Consistency Group – consists of replication sets grouped together to ensure write-order consistency across all the replication sets’ primary volumes. A configuration change on a consistency group, such as changing compression or bandwidth limits, applies to all of its replication sets. A RecoverPoint system supports a maximum of 128 CGs per RP system and 64 CGs per RPA. If an RPA in the cluster fails, the CGs running on that RPA fail over to another RPA in the cluster.

Distributed Consistency Group – to obtain higher throughput it is possible to configure a CG as a DCG, which can use up to 4 RPAs (a standard CG uses 1 RPA). You can configure a maximum of 8 DCGs, and the limit of 128 groups per RP system covers CGs and DCGs combined.
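The CG and DCG limits, together with the journal minimums mentioned earlier, can be checked against a planned layout. The following is a sketch only: `GroupPlan` and `check_plan` are hypothetical names, a group is treated as distributed when it lists more than one RPA, and I am assuming that a DCG counts towards the 64-group limit of every RPA it runs on.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List

MAX_GROUPS_PER_SYSTEM = 128          # CGs and DCGs combined
MAX_GROUPS_PER_RPA = 64
MAX_DISTRIBUTED_GROUPS = 8
MAX_RPAS_PER_DISTRIBUTED_GROUP = 4
MIN_JOURNAL_GB = {"standard": 10, "distributed": 40}


@dataclass
class GroupPlan:
    name: str
    rpas: List[str]     # RPAs the group runs on
    journal_gb: float   # planned journal size

    @property
    def kind(self) -> str:
        return "distributed" if len(self.rpas) > 1 else "standard"


def check_plan(groups: List[GroupPlan]) -> List[str]:
    """Return a list of limit violations (empty if the plan fits)."""
    problems = []
    if len(groups) > MAX_GROUPS_PER_SYSTEM:
        problems.append(f"{len(groups)} groups exceeds the system limit")
    distributed = [g for g in groups if g.kind == "distributed"]
    if len(distributed) > MAX_DISTRIBUTED_GROUPS:
        problems.append(f"{len(distributed)} DCGs exceeds the limit of 8")
    load = Counter()
    for g in groups:
        if not 1 <= len(g.rpas) <= MAX_RPAS_PER_DISTRIBUTED_GROUP:
            problems.append(f"{g.name}: uses {len(g.rpas)} RPAs (allowed: 1 to 4)")
        if g.journal_gb < MIN_JOURNAL_GB[g.kind]:
            problems.append(f"{g.name}: journal below {MIN_JOURNAL_GB[g.kind]}GB")
        load.update(g.rpas)  # assumption: a DCG counts on each of its RPAs
    for rpa, count in sorted(load.items()):
        if count > MAX_GROUPS_PER_RPA:
            problems.append(f"{rpa}: {count} groups exceeds the per-RPA limit")
    return problems
```

A plan with one standard CG on a 10GB journal and one two-RPA DCG on a 40GB journal passes cleanly, while shrinking the DCG journal to 10GB is reported as a violation.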

Image Access – refers to providing host access to the replication volumes, while still keeping track of source changes. Image access can be physical (also known as logged), which provides access to the actual physical volumes, or virtual, with rapid access to a virtual image of the same volumes.

In the next RecoverPoint blog I will detail sizing and performance characteristics for the Journal and Replica volumes.