This is part 3 of a 3 part post on how somebody responsible for a private storage environment could save their company a pile of money by implementing cloud storage techniques. Part I introduced the concept of a storage hypervisor as a first step in transitioning traditional IT into a private cloud storage environment. Part II explained how a storage service catalog, self-service provisioning, and usage-based chargeback can be used to drive down cost. In this 3rd post, I’m going to share some thoughts that should help you be smarter about choosing a storage hypervisor.
The first step is to remind ourselves what we’re trying to accomplish with a storage hypervisor. From our experience deploying over 7000 storage hypervisors, the starting point for most folks is improved efficiency and data mobility. Remember, the basic idea behind hypervisors (server or storage) is that they allow you to gather up physical resources into a pool, and then consume virtual slices of that pool until it’s all gone (this is how you get the really high utilization). The kicker comes from being able to non-disruptively move those slices around. In the case of a storage hypervisor, people are looking for the freedom to move a slice (or virtual volume) from tier to tier, from vendor to vendor, and more recently, from site to site all while the applications are online and accessing the data.
To pull off this level of mobility – in servers or storage – it’s important that the hypervisor not be dependant on the underlying physical hardware for anything except capacity (compute capacity in the case of a server hypervisor like VMware, storage capacity in the case of a storage hypervisor). Think about it… Wouldn’t it be odd to have a pair of VMware ESX hosts in a cluster, one running on IBM hardware and one on HP hardware, and be told that you couldn’t vMotion a virtual machine between the two because some feature of your virtual machine would just stop working? If you tie a virtual machine to a specific piece of hardware in order to take advantage of the function in that hardware, it sort of defeats the whole point of mobility. The same thing applies to storage hypervisors. Virtual volumes that are dependant on a particular physical disk array for some function, say mirroring or snapshotting for example, aren’t really mobile from tier to tier or vendor to vendor any more.
But it’s more than just a philosophical issue, there’s real money at stake (you may want to read what comes next a couple of times). In Part II of this post I discussed using a storage service catalog as a means of defining specific service level needs for your different categories of data. These service levels covered the gamut from capacity efficiency and I/O performance (for you techies that’s RAID levels, thin provisioning, use of solid state disks, etc), to data access resilience and disaster protection (multi-pathing, snapshotting, mirroring…). The reason so many datacenters have an over abundance of tier-1 disk arrays on the floor is because, historically, if you wanted to take advantage of things like thin provisioning, application-integrated snapshot, robust mirroring for disaster recovery, high performance for database workloads, access to solid-state disk, etc… you had to buy tier-1 ‘array capacity’ to get access to these tier-1 ‘storage services’ (did you catch the subtle difference?) Now, I don’t have anything against tier-1 disk arrays (my company sells a really good one). In fact, they have a great reputation for availability (a lot of the bulk in these units are sophisticated, redundant electronics that keep the thing available all the time). But with a good storage hypervisor, tier-1 ‘storage services’ are no longer tied to tier-1 ‘array capacity’ because the service levels are provided by the hypervisor. Capacity…is capacity…and you can choose any kind you want. Many clients we work with are discovering the huge cost savings that can be realized by continuing to deliver tier-1 service (from the hypervisor), only doing it on lower-tier disk arrays. As I noted in Part II of this post, we’ve seen clients shift their mix of ‘array capacity’ from 70% tier-1 to 70% lower-tier arrays while continuing to deliver tier-1 ‘storage services’ to their data. This YouTube video describes an example of that at Sprint.
Smart idea #1: Be careful to understand what, if any, dependency a storage hypervisor has on the capability of an underlying disk array to deliver function to your virtual volumes (like thin provisioning, compression, snapshotting, mirroring, etc.)
Next thought. There are three rather interrelated solution categories in the area of dealing with outages and protecting data.
- Disaster avoidance (“I know the hurricane is coming, let’s move the datacenter further inland”)
- Disaster recovery (“oh oh, the hurricane hit, and my datacenter is dead”)
- Data protection (“oops, I goofed up my data and I need to recover”)
Smart idea #2: Remembering smart idea #1, be sure that your storage hypervisor has its own ability to provide for disaster avoidance (inter-site mobility), disaster recovery (mirroring that’s integrated with recovery automation tools) and data protection (snapshotting that’s integrated with applications and backup tools).
One final thought. A storage hypervisor isn’t an island unto itself. Like a server hypervisor, it exists in a broader datacenter. Administrators need to be able to see it in the context of the disk arrays it manages, the servers (or virtual machines) that use its virtual volumes, the applications that need backups or clones, the disaster recovery automation that’s coordinating recovery of servers, storage, networks… You get the picture. When the challenges of day-to-day operations happen (and they do happen most every day)…
- …the storage network planner needs to look at the logical data path from a virtual machine to its physical server, through the SAN switch, to the storage hypervisor and finally to the physical disk array. He’ll need that storage hypervisor to be integrated with a SAN topology tool.
- …an application owner calls up with a performance issue that he’s blaming on ‘the storage’. You’ll need to be able to isolate performance across the whole data path (including the part of the path that goes through the storage hypervisor).
- …an application owner wants a consistent snapshot of his application to use as a backup copy (or a test clone). You’ll need a connector that talks to both the application and the storage hypervisor to identify the virtual volumes that need to be snapshotted, prepare the application for the snapshot, and then provide the application owner with an inventory of snapshots he can use to recover from.
- …you make the move toward cloud techniques in your private datacenter – implementing a storage service catalog, self-service provisioning, and usage-based chargeback. You’ll need a storage hypervisor that can be auto provisioned and that can provide the metrics on who is using how much storage.
Smart idea #3: Make a list of all the day-to-day operational things you do today with physical storage, and the things you hope to automate in the future, and be careful to understand if your storage hypervisor is sufficiently instrumented and integrated – or if it’s creating a new island to be separately managed.
I had an opportunity recently to talk on an open microphone podcast with ESG senior analyst Mark Peters about this whole area of storage hypervisors. It was an enlightening conversation full of very actionable thoughts. I recommend listening to the podcast.
And now a word from our sponsors :-) IBM offers the worlds most widely deployed storage hypervisor. With over 7000 deployments, hundreds in the newer inter-site disaster avoidance configuration, we’ve had a lot of opportunity to build some depth. As you evaluate using cloud storage techniques in your private enterprise, you’ll find things I talked about in this blog series available in IBM products today. They can help you save your company a pile of money (and make you look like a genius while you’re doing it).
Storage hypervisor platform: IBM System Storage SAN Volume Controller (SVC)
Storage hypervisor management, storage service catalog, and self-service provisioning: Tivoli Storage Productivity Center Standard Edition (TPC SE)
Usage-based chargeback: Tivoli Usage and Accounting Manager (TUAM)
Data protection integration: Tivoli Storage FlashCopy Manager and Tivoli Storage Manager Unified Recovery
Thanks for staying with me through this blog series – hope you find it valuable!
The conversation continues!
Earlier this week, fellow IBM blogger Amalore Jude joined the conversation with a perspective on Eliminating Management inefficiencies associated with your cloud foray. And Tony Pearson added thoughts on how a Storage hypervisor improves resource utilization and management. Join the conversation! The virtual dialogue on this topic will continue in another live group chat on October 7, 2011 from 12 noon to 1pm Eastern Time .