I’m on the Edge….of storage… (and I’m hangin’ on a modem with you)
If you haven’t guessed from the blatant pop culture reference in the title of my blog, I spent the first week of June at the IBM Edge storage conference (and I promise if you keep reading that I’ll refrain from making any puns on the Edge theme – despite the temptation to bring up a favorite Irish rock band hero). Anyway, it would hardly be appropriate to mention another band when Foreigner did such an awesome job rockin’ the conference. Who knew when I was growing up that the ’80s would produce the greatest rock ballads of all time?
Anyway, it’s been a great week at IBM Edge, hearing about all the latest advances in storage technology; in case you missed the talk on SVC Stretch Clusters as an example of the ODIN reference architecture, let me say a few words about it here. This will get a bit technical, but don’t worry…we’re not going to have a quiz at the end.
The problem we’re trying to solve is VM mobility over extended distance, and multi-site workload deployment across data centers. VM mobility not only improves availability of your applications, it’s also a more efficient way to use limited storage resources. The most common reason for using this approach is some form of business continuity or disaster avoidance/recovery solution, including planned events such as migrating one data center to another or eliminating downtime due to scheduled maintenance. But given an increasingly global work force, there are other good reasons to explore VM mobility. Many clients are realizing that this approach provides load balancing and enhanced user performance across multiple time zones (the so-called “follow the sun” approach). Others are realizing that by moving workloads over distance, it’s possible to optimize the cost of power to run the data center; since the lowest cost electricity is typically available at night, this strategy is known as “follow the moon”.
IBM has announced a software bundle featuring SAN Volume Controller (SVC), which includes Stretch Clustering over long distance. This provides read/write access to storage volumes located far apart from each other, enabling data replication across multiple data centers. SVC works in concert with Tivoli Storage Productivity Center (TPC) to manage your storage, and integration with VMware products like vMotion and vCenter enables transparent migration of virtual machines and their corresponding data or applications.
Let’s consider two data centers separated by up to 300 km (supported in SVC 6.3), and interconnected by a traditional IP network such as the Internet or by dark optical fiber. We require many of the features for an Open Datacenter with an Interoperable Network (ODIN) for this solution, including lossless Ethernet fabrics, automated port profile migration, Layer 2 VLANs in each location, and an intersite Layer 2 VLAN supporting MPLS/VPLS (preferably with a 10G or 100G Ethernet line speed between sites, since the SAN infrastructure is likely running either 8G or 16G Fibre Channel). An SVC split cluster uses industry standard Fibre Channel links for both node-to-node communication and for host access to SVC nodes, so your production sites must be connected by Fibre Channel links or FC-IP.
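To get a feel for what 300 km means for latency, here is a back-of-the-envelope sketch in Python. It assumes light moves through optical fiber at roughly 200,000 km/s (about 5 microseconds per km), which is a common rule of thumb rather than anything from the SVC documentation. The function name and constant are just illustrative, and real links add switch, WDM, and protocol overhead on top of this.

```python
# Rough fiber propagation delay for a stretched cluster link.
# Assumption: ~5 microseconds of one-way delay per km of fiber.
# This ignores switch, WDM, and protocol latency, so treat it as a lower bound.

FIBER_US_PER_KM = 5.0  # assumed one-way delay per km, in microseconds


def round_trip_ms(distance_km: float) -> float:
    """Return the fiber-only round-trip time in milliseconds."""
    one_way_us = distance_km * FIBER_US_PER_KM
    return 2 * one_way_us / 1000.0


if __name__ == "__main__":
    for km in (10, 100, 300):
        print(f"{km:>4} km -> {round_trip_ms(km):.2f} ms round trip")
```

Under that assumption, the fiber alone contributes about 3 ms of round-trip time at 300 km, and every synchronous write between sites pays it at least once.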
Generally a business continuity solution will define one physical location as a failure domain, though this can vary depending on what you’re trying to protect against; a failure domain could also consist of a group of floors in a single building, or just different power domains in the same data center. In order for SVC to decide which storage nodes survive if we lose a failure domain, the solution uses a quorum disk (a management disk that contains a reserved area used exclusively for system management). At a minimum, you should have one active quorum disk on a separate power grid in one of your failure domains; up to three quorum disks can be configured with SVC, though only one is active at any given time. Metro mirroring is recommended for this type of solution; a maximum round trip delay of 80 ms is supported (note that routing is required, since the fabrics at each location are not merged).
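If the quorum idea feels abstract, the toy model below may help. It is only a sketch of the tie-break concept and is not SVC's actual algorithm: the `Site` class, its fields, and the `surviving_sites` function are all hypothetical names invented for illustration. The idea is that while both sites can talk to each other, nothing needs deciding; once they cannot, only a site that can still reach the active quorum disk keeps serving I/O.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Site:
    """Hypothetical stand-in for one failure domain."""
    name: str
    nodes_online: bool  # are this site's storage nodes still running?
    sees_quorum: bool   # can this site still reach the active quorum disk?


def surviving_sites(site_a: Site, site_b: Site, intersite_link_up: bool) -> List[str]:
    """Return the names of the sites that keep serving I/O (simplified model)."""
    alive = [s for s in (site_a, site_b) if s.nodes_online]

    # Healthy cluster: both sites are up and can see each other, no tie-break.
    if len(alive) == 2 and intersite_link_up:
        return [s.name for s in alive]

    # Otherwise a tie-break is needed. In a real system the sites would race
    # for the quorum disk; this model simply checks them in argument order.
    for site in alive:
        if site.sees_quorum:
            return [site.name]

    # Nobody can reach the quorum disk, so nobody continues.
    return []


if __name__ == "__main__":
    a = Site("A", nodes_online=True, sees_quorum=True)
    b = Site("B", nodes_online=True, sees_quorum=False)
    print(surviving_sites(a, b, intersite_link_up=False))
```

The point of the model is why quorum disk placement matters: a site that loses both the intersite link and its path to the quorum disk stops, even if its own nodes are perfectly healthy.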
Connectivity between sites may take several forms. First, if the regular Internet provides sufficient quality of service (QoS) and meets your business objectives for recovery time, recovery point, etc., the IBM SVC uses industry standard protocols (FC-IP) in conjunction with a Brocade switch infrastructure to transport storage over distance. This is typically a low cost option, though you might require multiple circuits with load balancing (a so-called virtual trunk). Second, it’s possible to run a Brocade inter-switch link (ISL) between SVC nodes (with SVC 6.3.0 or higher). Brocade switches provide ISL options including consolidation of up to four ISLs at 4 Gbit/s each (creating a 16G trunk), or up to eight ISLs at 16 Gbit/s each (creating a 128G trunk). Buffer credit support for up to 250 km (nearly the SVC limit) is available. SVC supports SAN routing (including FC-IP links) for intercluster storage connections. Finally, note that you can connect multiple locations with optical fiber and use a variety of protocol-agnostic wavelength division multiplexing (WDM) products in this solution. This may provide better QoS or dedicated bandwidth for large applications. A 10G passive WDM option is available on some Brocade switches (with options such as in-flight compression and encryption), or a stand-alone WDM product can be employed (IBM has qualified many such solutions, including those from ODIN participants Adva, Ciena, and Huawei). Your local service provider may also offer a variety of managed service backup options using a combination of these features. Attachment of each SVC node to both local and remote SAN switches (without ISLs) is typically done in this case. Both the ISL and non-ISL approaches are known as split I/O groups.
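The trunk numbers above are simple multiplication, and the sketch below spells that out. It also includes a rough buffer credit estimate, which rests on a rule of thumb not taken from this post: about one buffer credit per km at 2 Gbit/s with full-size frames. That rule, and both function names, are assumptions for illustration; check your switch vendor's documentation before sizing a real link.

```python
import math


def trunk_gbps(isl_count: int, isl_speed_gbps: float) -> float:
    """Aggregate nominal bandwidth of a trunk built from identical ISLs."""
    return isl_count * isl_speed_gbps


def approx_bb_credits(distance_km: float, speed_gbps: float) -> int:
    """Rough buffer-to-buffer credit estimate for a long-distance link.

    Assumption: ~1 credit per km at 2 Gbit/s with full-size frames,
    scaling linearly with link speed. Smaller frames need more credits.
    """
    return math.ceil(distance_km * speed_gbps / 2)


if __name__ == "__main__":
    print(trunk_gbps(4, 4))    # four 4G ISLs
    print(trunk_gbps(8, 16))   # eight 16G ISLs
    print(approx_bb_credits(250, 8))
```

The first two calls reproduce the 16G and 128G trunk figures. The credit estimate shows why long links need attention: the faster and longer the link, the more frames are in flight before the first acknowledgment comes back.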
IBM SVC storage manager works in concert with vCenter through API plug-ins. This includes VADP (which provides data protection for snapshot backups at the VMware level rather than the LUN level, allowing you to concentrate on the value of the VM rather than the physical location of the associated data). Performance improvements can be achieved by offloading some functions to the storage hypervisor, as well. The storage hypervisor includes a virtualization platform, controller, and management (TPC supports application aware snapshots of your data through FlashCopy Manager). At the management level, IBM also allows the storage hypervisor components to be managed as plug-ins for vCenter. VM location can be managed through vCenter with Global Server Load Balancer (GSLB), which works in concert with a Brocade API plug-in. Further, vCenter is integrated with Brocade Application Resource Broker (ARB), which can report VM status back to a Brocade ADX switch. vCenter and GSLB manage both VM and IP profiles, performing intelligent load balancing to redirect traffic to the VM’s new location.
With this combination of ODIN best practices, IBM SVC, and Brocade SAN/FC-IP connectivity, your data can rest easy, wherever it happens to be (and so can you).