Thank you for taking a moment to peruse my first post as an independent
blogger. A short introduction is in order.
I am coming up on my 27th anniversary in the storage
industry, having held positions as both a consumer and a peddler, in roles
spanning from administrator to thought leader to strategic planner to senior
manager. Most recently I have served as the Business Line Manager for IBM
Storage Software. This blog – The Line – is for the open exchange of ideas
relating to the storage software and systems that come together to solve the
enduring challenges we all face in taking care of the world’s data.
Back at the Storage Networking
World Fall 2012 conference, I participated in a round table on storage
hypervisors hosted by ESG Senior Analyst Mark
Peters. I was joined by Claus Mikkelsen - Chief Scientist at Hitachi Data
Systems, Mark Davis – CEO of Virsto (now a VMware company), and George Teixeira
– CEO of DataCore. Following the conference, Mark Peters posted a very nice
series of three video blogs with perspectives from the round table participants.
They are worth a listen.
The discussion is
continuing at SNW Spring 2013 at Rosen Shingle Creek in Orlando, Florida. The
panel discussion "Analyst Perspective: The Storage Hypervisor: Myth or
Reality?" will take place on Tuesday, April 2, at 5:00 pm EDT.
With IBM and Hitachi having offered the mechanics of storage virtualization
for 10 and 9 years respectively, EMC joining the list almost
3 years ago, VMware’s acquisition
of Virsto earlier this year, and talk of software-defined everything, the
conversation around storage hypervisors is heating up and that’s been keeping
us very, very busy. As we prepare for the round table next week, I thought it
worthwhile to offer a point of view on storage hypervisors.
The fuel behind the storage hypervisor conversation is use cases – beyond being a cool
technology, how does it contribute to helping me solve those enduring
challenges we all face in taking care of the world’s data?
Perhaps the most obvious expectation is improved efficiency and data mobility. The basic idea behind
hypervisors (server or storage) is that they allow you to gather up physical
resources into a pool, and then consume virtual slices of that pool until it’s
all gone (this is how you get the really high utilization). The kicker comes
from being able to non-disruptively move those slices around. In the case of a
storage hypervisor, you can move a slice (or virtual volume) from tier to tier,
from vendor to vendor, and now, from site to site all while the applications
are online and accessing the data. This opens up all kinds of use cases that
have been described as “cloud”. One of the coolest is active-active datacenters. Each year almost all of the tropical
storm activity in the Atlantic Ocean happens between June and November. If you
operate a datacenter near the Atlantic coast, and if you have implemented both
a server hypervisor (let’s say VMware vSphere for your Intel servers and IBM
PowerVM for your Power systems), and a storage hypervisor (let’s say IBM SmartCloud Virtual
Storage Center), then here’s how you might react to a tropical storm in the
forecast: “Hey, the hurricane is coming, let’s move operations to our active-active
datacenter further inland…” IBM SmartCloud Virtual Storage Center in a stretched-cluster
configuration allows you to access the same data at both locations, giving you
the ability to do an inter-site VMware vMotion
and PowerVM Live Partition
Mobility (LPM) move – non-disruptively. IBM and its Business Partners have
been helping hundreds of clients implement this sort of stretched-cluster
configuration all over the world for the last 5 years.
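The pooling-and-slices idea above can be sketched in a few lines of code. This is a minimal illustration, not any vendor's implementation; the class and backend names are invented for the example. The key point it demonstrates is that the volume ID an application addresses never changes when the underlying slice moves between tiers, vendors, or sites:

```python
# Illustrative sketch (not any product's implementation): a storage
# hypervisor pools capacity from heterogeneous backends and hands out
# virtual volumes whose identity is independent of where the bits live.

class Backend:
    def __init__(self, name, tier, capacity_gb):
        self.name, self.tier, self.capacity_gb = name, tier, capacity_gb
        self.used_gb = 0

    def free_gb(self):
        return self.capacity_gb - self.used_gb

class VirtualVolume:
    def __init__(self, vol_id, size_gb, backend):
        self.vol_id, self.size_gb, self.backend = vol_id, size_gb, backend

class StoragePool:
    def __init__(self, backends):
        self.backends = backends
        self.volumes = {}

    def provision(self, vol_id, size_gb):
        # Carve a slice from any backend with room; the host sees only vol_id.
        backend = next(b for b in self.backends if b.free_gb() >= size_gb)
        backend.used_gb += size_gb
        self.volumes[vol_id] = VirtualVolume(vol_id, size_gb, backend)
        return self.volumes[vol_id]

    def migrate(self, vol_id, target_name):
        # Move the slice tier-to-tier or vendor-to-vendor; the volume ID
        # (what applications address) never changes, so I/O continues.
        vol = self.volumes[vol_id]
        target = next(b for b in self.backends if b.name == target_name)
        vol.backend.used_gb -= vol.size_gb
        target.used_gb += vol.size_gb
        vol.backend = target

pool = StoragePool([
    Backend("vendorA-tier1", tier=1, capacity_gb=1000),
    Backend("vendorB-tier2", tier=2, capacity_gb=4000),
])
vol = pool.provision("db-vol-01", 500)
pool.migrate("db-vol-01", "vendorB-tier2")  # app keeps addressing db-vol-01
```

A real hypervisor does this while I/O is in flight, of course; the sketch only captures the indirection that makes that possible.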
But storage hypervisors are more, much more than just
virtual slices and data mobility. We’re driving
cost out of the equation. Sure, we’re getting high utilization from
allocating virtual slices, but are we being as smart as we could be about
allocating those slices? A good storage hypervisor helps you be smart.
- Be smart about provisioning: You have a client
that asks for 500GB of new capacity. You're going to give it to him as
thin-provisioned virtual capacity, which is a fancy way of saying you're
not going to actually back it with real physical storage until he writes
real data on it. That helps you keep cost down.
The same guy also asks to keep several snapshot copies of his data for
recovery purposes. You’re going to start by giving him thin provisioned
capacity for those snapshots, but you’re also going to compress whatever
data those snapshots produce – again adding to your efficiency. For that
matter, you’re going to compress his source data too.
- Be smart about vendors: Because you're getting your storage services from a
storage hypervisor (software-defined storage), you have the freedom to
shift the physical storage you operate from all tier-1 to a more efficient
mix of lower tiers, and while you’re doing it you can create a little
competition among as many disk array vendors as you like to get the best
price / support.
- Be smart about tiers: If you shut your eyes real tight and think about the
concept of a “virtual” disk that is mobile across arrays and tiers, you’ll
quickly start asking questions about having the storage hypervisor monitor
the utilization and response of your physical hardware infrastructure, watch
for I/O patterns on blocks within those virtual disks, and apply some
analytic intelligence toward moving the right data to the right tier to
both meet requested SLAs and optimize utilization of your hardware
infrastructure. This is especially
important with flash showing up in multiple places in the infrastructure
(in arrays, in the network, in the server). You simply won’t be able to
manage all that with a tier-management system that is tied to an array.
You need…dare I say it…a software-defined storage layer (a storage hypervisor)
that includes both the raw mechanics of virtualization and
the analytics to determine when and how to best use the mechanics.
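Two of the ideas above, thin provisioning and I/O-heat-driven tiering, can be sketched together. This is a hedged toy model under my own assumptions (fixed 1GB chunks, a simple access counter standing in for the analytics), not how any shipping product tracks heat:

```python
# Toy model of two ideas from the list above: thin provisioning (back a
# virtual chunk with physical space only on first write) and a trivial
# I/O-heat heuristic for picking which chunks belong on flash.

class ThinVolume:
    CHUNK_GB = 1  # assumed allocation granularity for the sketch

    def __init__(self, virtual_size_gb):
        self.virtual_size_gb = virtual_size_gb  # what the client was promised
        self.backed = {}   # chunk index -> data; physical space on write only
        self.heat = {}     # chunk index -> access counter

    def write(self, chunk, data):
        self.backed[chunk] = data  # physical capacity is consumed here
        self.heat[chunk] = self.heat.get(chunk, 0) + 1

    def read(self, chunk):
        self.heat[chunk] = self.heat.get(chunk, 0) + 1
        return self.backed.get(chunk, b"\x00")  # unwritten chunks read as zeros

    def physical_gb(self):
        return len(self.backed) * self.CHUNK_GB

    def hot_chunks(self, threshold=3):
        # Chunks the tiering analytics would promote to flash.
        return [c for c, n in self.heat.items() if n >= threshold]

vol = ThinVolume(virtual_size_gb=500)  # client sees 500GB immediately
vol.write(0, b"index")
for _ in range(5):
    vol.read(0)
print(vol.physical_gb())  # 1 -- only one chunk is actually backed
print(vol.hot_chunks())   # [0]
```

The gap between `virtual_size_gb` and `physical_gb()` is the efficiency win from thin provisioning; the heat map is the raw input a real analytics engine would feed into tier placement.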
To truly enable a hypervisor – in servers or storage – it’s
important that the hypervisor not be dependent on the underlying physical
hardware for anything except capacity (compute capacity in the case of a server
hypervisor like VMware, storage capacity in the case of a storage hypervisor).
Think about it… Wouldn’t it be odd to have a pair of VMware ESX hosts in a
cluster, one running on IBM hardware and one on HP hardware, and be told that
you couldn’t vMotion a virtual machine between the two because some feature of
your virtual machine would just stop working?
If you tie a virtual machine to a specific piece of hardware in order to
take advantage of the function in that hardware, it sort of defeats the whole
point of mobility. The same thing applies to storage hypervisors. Virtual
volumes that are dependent on a particular physical disk array for some
function, say mirroring or snapshotting, aren't really mobile from
tier to tier or vendor to vendor any more.
But it’s more than
just a philosophical issue, there’s real money at stake. The reason so many
datacenters have an overabundance of tier-1 disk arrays on the floor is
because, historically, if you wanted to take advantage of things like thin
provisioning, application-integrated snapshots, robust mirroring for disaster
recovery, high performance for database workloads, access to flash storage,
etc… you had to buy tier-1 ‘array capacity’ to get access to these tier-1
‘storage services’ (did you catch the subtle difference?). Now, I don’t have
anything against tier-1 disk arrays (my company sells a really good one). In
fact, they have a great reputation for availability (a lot of the bulk in these
units is sophisticated, redundant electronics that keeps the thing available
all the time). But with a good storage hypervisor, tier-1 ‘storage services’
are no longer tied to tier-1 ‘array capacity’ because the service levels are
provided by the hypervisor. Capacity…is capacity…and you can choose any kind
you want. Many clients we work with are discovering the huge cost savings that
can be realized by continuing to deliver tier-1 service (from the hypervisor),
only doing it on lower-tier disk arrays. We routinely see clients shift their
mix of ‘array capacity’ from 70% or 80% tier-1 to 70% or 80% lower-tier arrays while continuing to deliver
tier-1 ‘storage services’ to their data.
Join the conversation! Share your point of view here. And if you are going to be at SNW next
week, come by and listen to the round table. I would love to meet you. Follow me on Twitter: @RonRiffe.