To borrow a phrase from a fellow blogger… Interest from customers on cloud storage is very, very hot, and that’s been keeping us very, very busy. The interest underscores the fact that public storage cloud providers have sent a “cost shockwave” through the industry and customers are taking notice.
While CIO’s may still be too concerned about security and service levels to put much real corporate information in the public cloud, they have taken notice that these service providers are offering storage capacity at prices that are often lower than what they are paying for their own private storage. Sure, a service provider theoretically has more economy of scale and so could demand a better price from their hardware vendors, but they also have some profit margin to build into their “service”. There has to be more to it. The customers I talk to are wondering what these service providers are doing to operate at those costs – and if any of their techniques can be applied in a private storage environment.
The situation begs the question “what is it that differentiates these public storage clouds from the traditional private storage environments that most clients operate?” From our experience with customers, there are four significant differences.
- Storage resources are virtualized from multiple arrays, vendors, and datacenters – pooled together and accessed anywhere.
(as opposed to physical array-boundary limitations)
- Storage services are standardized – selected from a storage service catalog.
(as opposed to customized configuration)
- Storage provisioning is self-service – administrators use automation to allocate capacity from the catalog.
(as opposed to manual component-level provisioning)
- Storage usage is paid per use – end users are aware of the impact of their consumption and service level choices.
(as opposed to paid from a central IT budget)
In this post, I’m going to try to explain these four concepts in sufficient detail that somebody responsible for a private storage environment could walk away with some practical recommendations that could save their company a pile of money. Most of this isn’t really original (the concepts have been around for a while), but so few enterprises operate this way that the person who introduces their company to these ideas often looks like a genius (and who doesn’t like that!!). It’s a long topic, so I’ve broken it into 3 posts.
Storage resources are virtualized. Do you remember back when applications ran on machines that really were physical servers (all that “physical” stuff that kept everything in one place and slowed all your processes down)? Most folks are rapidly putting those days behind them. Server hypervisors and the virtual machines they manage have improved efficiency (no more wasted compute resources), freed up mobility, and ushered in a whole new “cloud” language.
Well, the same ideas apply to storage. As administrators catch on to these ideas, it won’t be long before we’ll be asking the question “Do you remember back when virtual machines used disks that really were physical arrays (all that “physical” stuff that kept everything in one place and slowed all your processes down)?”
But storage hypervisors are more, much more than just virtual slices and data mobility. Remember, we’re trying to think like a service provider who is driving cost out of the equation. Sure, we’re getting high utilization from allocating virtual slices, but are we being as smart as we could be about allocating those slices? A good storage hypervisor helps you be smart.
- Thin provisioning: You have a client that asks for 500GB of new capacity. You’re going to give it to him as thin provisioned virtual capacity which is a fancy way of saying you’re not going to actually back it with real physical storage until he writes real data on it. That helps you keep cost down.
- Compression: Same guy also asks to keep several snapshot copies of his data for recovery purposes. You’re going to start by giving him thin provisioned capacity for those snapshots, but you’re also going to compress whatever data those snapshots produce – again adding to your efficiency.
- Agnostic about vendors: Because you’re providing virtual storage resources from a storage service catalog (we’ll talk more about that in Part II of this post), you have the freedom to shift the physical storage you operate from all tier-1 to a more efficient mix of lower tiers, and while you’re doing it you can create a little competition among as many disk array vendors as you like to get the best price / support.
- Smart about tiers: If you shut your eyes real tight and think about the concept of a “virtual” disk that is mobile across arrays and tiers, you’ll quickly start asking questions about having the storage hypervisor watch for I/O patterns on blocks within that virtual disk that would benefit from higher tier capacity, like solid-state (SSD) or flash disk for example. A good storage hypervisor will automate the detection of such patterns and move hot data blocks to these highest tiers of storage if you have them.
Are you getting the picture of why so many enterprises are beginning to agree with Gartner that a storage hypervisor can be a great first step in transitioning traditional IT into a private cloud storage environment? Application owners come to you for storage capacity because you’re responsible for the storage at your company. In the old days, if they requested 500GB of capacity, you allocated 500GB off of some tier-1 physical array – and there it sat. But then you discovered storage hypervisors! Now you tell that application owner he has 500GB of capacity… What he really has is a 500GB virtual volume that is thin provisioned, compressed, and backed by lower-tier disks. When he has a few data blocks that get really hot, the storage hypervisor dynamically moves just those blocks to higher tier storage like SSD’s. His virtual disk can be accessed anywhere across vendors, tiers and even datacenters. And in the background you have changed the vendor storage he is actually sitting on twice because you found a better supplier. But he doesn’t know any of this because he only sees the 500GB virtual volume you gave him. It’s “in the cloud”.