I am coming up on my 27th anniversary in the storage industry having held positions as both a consumer and a peddler and in roles spanning from administrator, to thought leader, to strategic planner, to senior manager. Most recently I have served as the Business Line Manager for IBM Storage Software. This blog – The Line – is for the open exchange of ideas relating to the storage software and systems that come together to solve the enduring challenges we all face in taking care of the world’s data.
Back at the Storage Networking World Fall 2012 conference, I participated in a round table on storage hypervisors hosted by ESG Senior Analyst Mark Peters. I was joined by Claus Mikkelsen – Chief Scientist at Hitachi Data Systems, Mark Davis – CEO of Virsto (now a VMware company), and George Teixeira – CEO of DataCore. Following the conference, Mark Peters posted a very nice series of three video blogs with perspective from the round table participants. They are worth a listen.
The discussion is continuing at SNW Spring 2013 at Rosen Shingle Creek in Orlando, Florida. The panel discussion "Analyst Perspective: The Storage Hypervisor: Myth or Reality?" will happen on Tuesday, April 2 at 5:00 pm EDT.
With the mechanics of storage virtualization being offered by IBM and Hitachi for 10 and 9 years respectively, EMC joining the list almost 3 years ago, VMware’s acquisition of Virsto earlier this year, and talk of software-defined everything, the conversation around storage hypervisors is heating up and that’s been keeping us very, very busy. As we prepare for the round table next week, I thought it worthwhile to offer a point of view on storage hypervisors.
The fuel behind the storage hypervisor conversation is use cases – beyond being a cool technology, how does it contribute to helping me solve those enduring challenges we all face in taking care of the world’s data?
Perhaps the most obvious expectation is improved efficiency and data mobility. The basic idea behind hypervisors (server or storage) is that they allow you to gather up physical resources into a pool, and then consume virtual slices of that pool until it’s all gone (this is how you get the really high utilization). The kicker comes from being able to non-disruptively move those slices around. In the case of a storage hypervisor, you can move a slice (or virtual volume) from tier to tier, from vendor to vendor, and now, from site to site, all while the applications are online and accessing the data. This opens up all kinds of use cases that have been described as “cloud”.
One of the coolest is active-active datacenters. Each year, almost all of the tropical storm activity in the Atlantic Ocean happens between June and November. If you operate a datacenter near the Atlantic coast, and if you have implemented both a server hypervisor (let’s say VMware vSphere for your Intel servers and IBM PowerVM for your Power systems) and a storage hypervisor (let’s say IBM SmartCloud Virtual Storage Center), then here’s how you might react to a tropical storm in the forecast: “Hey, the hurricane is coming, let’s move operations to our active-active datacenter further inland…” IBM SmartCloud Virtual Storage Center in a stretched-cluster configuration allows you to access the same data at both locations, giving you the ability to do an inter-site VMware vMotion and PowerVM Live Partition Mobility (LPM) move – non-disruptively. IBM and its Business Partners have been helping hundreds of clients implement this sort of stretched-cluster configuration all over the world for the last 5 years.
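For readers who think better in code, here is a minimal sketch of the pooling-and-mobility idea described above. It is a toy model, not how any particular product works: the `Backend` and `Pool` classes, the array names, and the sizes are all made up for illustration. The point it tries to show is the indirection: hosts address a virtual volume by name, so the hypervisor can change which physical array sits behind that name without the host noticing.

```python
from dataclasses import dataclass, field


@dataclass
class Backend:
    """A physical array contributing capacity to the pool (hypothetical model)."""
    name: str
    capacity_gb: int
    used_gb: int = 0

    def free_gb(self) -> int:
        return self.capacity_gb - self.used_gb


@dataclass
class Pool:
    """Toy storage hypervisor: maps virtual volume names to physical backends."""
    backends: dict = field(default_factory=dict)
    # Virtual volume name -> (backend name, size in GB). Hosts only see the name.
    volumes: dict = field(default_factory=dict)

    def add_backend(self, backend: Backend) -> None:
        self.backends[backend.name] = backend

    def create_volume(self, name: str, size_gb: int, backend: str) -> None:
        target = self.backends[backend]
        if target.free_gb() < size_gb:
            raise ValueError(f"{backend} has only {target.free_gb()} GB free")
        target.used_gb += size_gb
        self.volumes[name] = (backend, size_gb)

    def migrate(self, name: str, new_backend: str) -> None:
        """Move a volume to another backend; the name the host uses never changes."""
        old_backend, size_gb = self.volumes[name]
        target = self.backends[new_backend]
        if target.free_gb() < size_gb:
            raise ValueError(f"{new_backend} has only {target.free_gb()} GB free")
        target.used_gb += size_gb                      # reserve space (a real system copies data here)
        self.volumes[name] = (new_backend, size_gb)    # flip the mapping
        self.backends[old_backend].used_gb -= size_gb  # release the old space


# Illustrative names and sizes only.
pool = Pool()
pool.add_backend(Backend("coastal_tier1_array", 1000))
pool.add_backend(Backend("inland_site_array", 2000))
pool.create_volume("db_vol", 400, "coastal_tier1_array")
pool.migrate("db_vol", "inland_site_array")
print(pool.volumes["db_vol"])
```

A real storage hypervisor has to keep both copies in sync while I/O continues during the move; the sketch skips that entirely and only shows why a mapping layer makes the move invisible to the application.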
But storage hypervisors are more, much more than just virtual slices and data mobility. We’re driving cost out of the equation. Sure, we’re getting high utilization from allocating virtual slices, but are we being as smart as we could be about allocating those slices? A good storage hypervisor helps you be smart.
- Thin provisioning: You have a client that asks for 500GB of new capacity. You’re going to give it to him as thin provisioned virtual capacity which is a fancy way of saying you’re not going to actually back it with real physical storage until he writes real data on it. That helps you keep cost down.
- Compression: Same guy also asks to keep several snapshot copies of his data for recovery purposes. You’re going to start by giving him thin provisioned capacity for those snapshots, but you’re also going to compress whatever data those snapshots produce – again adding to your efficiency. For that matter, you’re going to compress his source data too.
- Agnostic about vendors: Because you’re getting your storage services from a storage hypervisor (software-defined storage), you have the freedom to shift the physical storage you operate from all tier-1 to a more efficient mix of lower tiers, and while you’re doing it you can create a little competition among as many disk array vendors as you like to get the best price / support.
- Smart about tiers: If you shut your eyes real tight and think about the concept of a “virtual” disk that is mobile across arrays and tiers, you’ll quickly start asking questions about having the storage hypervisor monitor the utilization and response of your physical hardware infrastructure, watch for I/O patterns on blocks within those virtual disks, and apply some analytic intelligence toward moving the right data to the right tier to both meet requested SLAs and optimize utilization of your hardware infrastructure. This is especially important with flash showing up in multiple places in the infrastructure (in arrays, in the network, in the server). You simply won’t be able to manage all that with a tier-management system that is tied to an array. You need…dare I say it…a software-defined storage layer (a storage hypervisor) that includes both the raw mechanics of virtualization and the analytics to determine when and how to best use the mechanics.
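Two of the ideas in the list above, thin provisioning and tier analytics, can be sketched in a few lines. This is a simplified illustration under my own assumptions, not a description of any vendor's implementation: the `ThinVolume` class, the 256 MB extent size, and the "count I/Os per extent" heuristic are all hypothetical. It shows physical capacity being consumed only when data is written, and a naive way to pick which extents deserve a faster tier.

```python
from collections import Counter

EXTENT_MB = 256  # assumed allocation unit, chosen only for illustration


class ThinVolume:
    """Toy thin-provisioned volume that also tracks I/O per extent."""

    def __init__(self, virtual_size_gb: int) -> None:
        self.virtual_extents = virtual_size_gb * 1024 // EXTENT_MB
        self.allocated = set()      # extents that are backed by physical storage
        self.io_count = Counter()   # I/Os observed per extent

    def write(self, extent: int) -> None:
        if not 0 <= extent < self.virtual_extents:
            raise IndexError("extent outside the virtual volume")
        self.allocated.add(extent)  # physical backing appears on first write
        self.io_count[extent] += 1

    def read(self, extent: int) -> None:
        if extent in self.allocated:
            self.io_count[extent] += 1

    def physical_gb(self) -> float:
        """Capacity actually consumed, as opposed to capacity promised."""
        return len(self.allocated) * EXTENT_MB / 1024

    def hottest_extents(self, flash_slots: int) -> list:
        """Naive tiering hint: the busiest extents are candidates for flash."""
        return [extent for extent, _ in self.io_count.most_common(flash_slots)]


vol = ThinVolume(virtual_size_gb=500)  # the client asked for 500GB
for extent in range(8):                # but has only written 8 extents so far
    vol.write(extent)
for _ in range(100):
    vol.read(3)
for _ in range(50):
    vol.read(5)

print(vol.physical_gb())       # 2.0 GB consumed against 500 GB promised
print(vol.hottest_extents(2))  # [3, 5]
```

A production tiering engine would weigh response times, decay old activity, and account for the cost of moving data; a raw I/O counter is just the simplest stand-in for "watch the I/O patterns and act on them."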
To truly enable a hypervisor – in servers or storage – it’s important that the hypervisor not be dependent on the underlying physical hardware for anything except capacity (compute capacity in the case of a server hypervisor like VMware, storage capacity in the case of a storage hypervisor). Think about it… Wouldn’t it be odd to have a pair of VMware ESX hosts in a cluster, one running on IBM hardware and one on HP hardware, and be told that you couldn’t vMotion a virtual machine between the two because some feature of your virtual machine would just stop working? If you tie a virtual machine to a specific piece of hardware in order to take advantage of the function in that hardware, it sort of defeats the whole point of mobility. The same thing applies to storage hypervisors. Virtual volumes that are dependent on a particular physical disk array for some function, say mirroring or snapshotting for example, aren’t really mobile from tier to tier or vendor to vendor any more.
But it’s more than just a philosophical issue; there’s real money at stake. The reason so many datacenters have an overabundance of tier-1 disk arrays on the floor is that, historically, if you wanted to take advantage of things like thin provisioning, application-integrated snapshot, robust mirroring for disaster recovery, high performance for database workloads, access to flash storage, etc., you had to buy tier-1 ‘array capacity’ to get access to these tier-1 ‘storage services’ (did you catch the subtle difference?). Now, I don’t have anything against tier-1 disk arrays (my company sells a really good one). In fact, they have a great reputation for availability (a lot of the bulk in these units is sophisticated, redundant electronics that keep the thing available all the time). But with a good storage hypervisor, tier-1 ‘storage services’ are no longer tied to tier-1 ‘array capacity’ because the service levels are provided by the hypervisor. Capacity…is capacity…and you can choose any kind you want. Many clients we work with are discovering the huge cost savings that can be realized by continuing to deliver tier-1 service (from the hypervisor), only doing it on lower-tier disk arrays. We routinely see clients shift their mix of ‘array capacity’ from 70% or 80% tier-1 to 70% or 80% lower-tier arrays while continuing to deliver tier-1 ‘storage services’ to their data.
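To make the mix-shift argument concrete, here is a back-of-the-envelope calculation. The cost figures are invented purely for illustration (I am assuming tier-1 capacity costs 10 units per TB and lower-tier capacity costs 4); they are not quotes, benchmarks, or client results. Only the 80%/20% mix comes from the paragraph above. Plug in your own numbers.

```python
def blended_cost_per_tb(tier1_share: float, tier1_cost: float, lower_cost: float) -> float:
    """Average cost per TB for a given share of tier-1 capacity in the mix."""
    return tier1_share * tier1_cost + (1 - tier1_share) * lower_cost


# Hypothetical relative costs per TB, for illustration only.
TIER1_COST = 10.0
LOWER_TIER_COST = 4.0

before = blended_cost_per_tb(0.80, TIER1_COST, LOWER_TIER_COST)  # 80% tier-1
after = blended_cost_per_tb(0.20, TIER1_COST, LOWER_TIER_COST)   # 20% tier-1
savings = 1 - after / before

print(f"before: {before:.2f}  after: {after:.2f}  savings: {savings:.0%}")
```

With these made-up inputs the blended cost per TB drops by roughly 41%. The actual figure depends entirely on the price gap between your tiers, and it ignores the cost of the hypervisor layer itself.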
Join the conversation! Share your point of view here. And if you are going to be at SNW next week, come by and listen to the round table. I would love to meet you. Follow me on Twitter @RonRiffe