IBM NAS Storage
There are few times that I look at what a company markets as the 'Next big thing' in the storage world and get the same reaction I got when I started learning about the SONAS product. There are already some technical details in the announcement and in Tony's blog from a few days ago, so I won't go into that today, but I will go over how this product really makes a paradigm shift in the NAS storage world.
Traditionally, NAS storage is looked at as the little brother to the bigger SAN systems. SAN systems tend to be the athletes of the storage high school with their matching letter jackets and oversized girth. All the while, NAS was the band geeks, some frail and thin and some oversized, but always in large numbers and not very organized. NAS technology was born from the need to share data across the company, and as the amount of information grew, so did the servers, network bandwidth and backups. SAN storage is still the big guy on campus, but the people who track trends for our industry say NAS has become just as important as the large databases, ERP systems and the like.
If you look at how we have stored NAS data, it has been on single file systems that had local disk drives shared out over a single 10/100 Mb network. As storage systems became more advanced, we saw people using clustering, snapshots, thin provisioning, de-duplication and replication to help keep our companies communicating. When we needed more throughput or more storage, we added a server or added disks, which created islands of unshared power.
If you look at 2009, one of the hottest buzz words in the storage market was cloud computing. The idea is to have a large source of power in one area to pull resources from without having to provision new equipment. We also saw more and more clients looking at NAS protocols as Ethernet could support faster speeds than traditional fibre channel. A large number of you have been looking at and moving your virtual environments to NFS to help cut down on administration overhead and to take advantage of CNA technology.
With a higher demand for NAS technology comes the burden of being able to scale at the same rate that storage, network and throughput demands increase. Older NAS systems allowed clients to increase the amount of storage, but once you reached the critical mass the system allowed, you had to purchase another clustered system. This creates multiple islands of storage pools that have to be managed, provisioned and backed up. Not a great solution for companies that are growing and have fewer administrators to do the work.
Now, IBM has a product that allows our NAS clients to grow and scale as their companies grow. SONAS is a highly scalable NAS that works like a cloud. The underlying technology, GPFS, is the same found in some of the fastest computers in the world. SONAS uses a method of scaling in both storage and throughput by adding storage pods (60 SATA or SAS disks) or interface modules (x3650 servers) like Lego blocks. All of this is managed by a central command module that allows a client to have full control over the entire system no matter how much storage or how many servers are in the system.
So the "Next big thing" in my opinion is here today and IBM is using the best of the best of IBM research for its clients. The SONAS solution is designed from the ground up as a true blue NAS storage solution. Look for future SONAS blogs on GPFS, creating an ILM strategy and more.
Today, I helped our local Client Engineers install a couple of new nodes and some more storage into a local SONAS system. This was exciting for me as I love working with the hardware and software and it keeps up my keyboard skills. This client is bringing more demand online and needs both horsepower (interface nodes) and storage to accommodate a new business line. I was amazed at how easy the system is to upgrade, and his little starter rack is now almost full.
We added two interface nodes (IBM xSeries 3650 M2) and two 60-disk shelves to the unit. Once the disks are online and presented up to the interface modules, they can start creating shares for the new operation. As they need more storage or more interface nodes, another rack will be put in and the same process of pooling these resources together will be repeated.
The idea of having multiple interface nodes and storage pools is to not have single points of failure. In traditional storage, if a controller goes down, its partner has to pick up the entire work load for the down hardware. Not so in SONAS. If a node goes down, the work is spread evenly across all of the other nodes in the system. This is why we do not have a problem of losing CIFS connections when systems go down.
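To put rough numbers on the failover difference, here is a minimal Python sketch. It assumes the work is spread perfectly evenly and uses a made-up total of 100 units of load; the function name and the six-node cluster size are hypothetical and are not taken from any SONAS tooling.

```python
def load_per_survivor(total_load: float, nodes: int, failed: int = 1) -> float:
    """Load carried by each surviving node after `failed` nodes go down,
    assuming the work is redistributed evenly (an idealized assumption)."""
    survivors = nodes - failed
    if survivors <= 0:
        raise ValueError("no surviving nodes left to carry the load")
    return total_load / survivors


TOTAL_LOAD = 100.0  # arbitrary units, for illustration only

# Traditional two-controller pair: each carries 50, the survivor ends up with all 100.
pair_after = load_per_survivor(TOTAL_LOAD, nodes=2)

# Hypothetical six-node cluster: each carries about 16.7, survivors end up with 20 each.
cluster_after = load_per_survivor(TOTAL_LOAD, nodes=6)

print(f"two-node pair, one failure:    {pair_after:.1f} units on the survivor")
print(f"six-node cluster, one failure: {cluster_after:.1f} units per survivor")
```

In the pair, the surviving controller doubles its load; in the six-node case each survivor picks up only a fifth of the failed node's work.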
The addition of new storage is also interesting as we are tripling the amount of storage the base system had originally with two 4U shelves. These shelves are highly dense, top-loading containers using either SAS or SATA disks. In today's instance, we were installing 120 2TB SATA drives. A total of 240TB in 8U of space. Not too shabby.
At the end of the day, I was pleased to see that IBM is moving forward with smarter storage systems. If you look at the entire portfolio, you can see that systems like the XiV grid, the auto tiering on the DS8700 and SVC virtualization are all helping our goal of a Smarter Planet. Look for some more pictures and maybe a video on Monday.
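The density arithmetic can be checked with a few lines of Python. This is raw capacity only; usable capacity after RAID and spares would be lower, and those overheads are not given here, so they are left out.

```python
SHELVES = 2            # shelves added in this upgrade
DRIVES_PER_SHELF = 60  # top-loading, 60 disks per shelf
DRIVE_TB = 2           # 2 TB SATA drives
SHELF_HEIGHT_U = 4     # each shelf is 4U tall

drive_count = SHELVES * DRIVES_PER_SHELF  # total drives installed
raw_tb = drive_count * DRIVE_TB           # raw capacity, before any RAID overhead
rack_units = SHELVES * SHELF_HEIGHT_U     # rack space consumed
tb_per_u = raw_tb / rack_units            # raw density

print(f"{drive_count} drives, {raw_tb} TB raw in {rack_units}U ({tb_per_u:.0f} TB per U)")
```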
Richard Swain 060000VQ8G firstname.lastname@example.org Tags:  compression rtc nas storage ibm pkzip race 805 Visits
Labor Day has come and gone and so have all of the holidays between now and Thanksgiving. This is only augmented with the hope that your favorite football team (both American football and what we call soccer) has a great weekend match and you get to celebrate with the beverage of your choice.
During your work-week, which can and sometimes does include weekends, all you hear is that there is no more money to do the things you have to do to keep the business running. If you have kept up with squeezing more out of your systems with virtualization, that’s great, but your network is now overtaxed. The staff that used to take care of certain aspects of the day-to-day running of your data center have been let go and their jobs have been ‘given’ to you with no thought of compensating you for the extra tasks.
The Earth is warming, the weather is out of control and the price of gas is so high that you decide to bike to work to help save the planet. You spend more time on the road commuting and look like you need a shower when you get to work after dodging traffic all morning. Your coffee is priced higher now because the coffee house wants to use Fair Trade coffee from farmers in a country you have never been to. And your dog is on antidepressant meds because you are not home as much and he can’t go out in the yard because of the killer bees migrating north from Mexico.
Our lives seem to be getting more complicated and it’s nice when we find things that not only help us but are easy to use. When you come across these items they make such an impression that you like to tell others about your great fortunes. I came by a solution that was very easy to use and the value was so great that at first I didn’t believe the whole story.
About a year ago, I was asked to help out on the Storwize/Real Time Compression (RTC) team as it transitioned into the IBM portfolio. I met with the engineers and sales people and all had wonderful things to say about the technology. I listened but was hesitant to drink all of the Kool-Aid they were pouring.
A year later I am very much a believer in the RTC technology and think it really could be a game changer in the market. If you keep up with IDC, Gartner and the other analysts, they all point to compression of the data as being one of the larger items for handling future growth. There are a lot of vendors that claim they can compress data but it’s not all done the same way.
One of the things that stood out from day one is the idea of using LZ compression in real time to compress data instead of deduplication. Coming from an N series (Netapp) background, I understood how deduplication works and where it is useful. But this was compression, which is a different ball game. Now we are able to shrink the footprint of data that is not an exact copy of something already stored. Given that Netapp has issues with block size and offsets, this is exactly what is needed in the market.
The next question I always get, and one I had myself, was “That’s great, you can compress data with the best, but what’s the overhead?”. I waited a long time to see what the performance numbers were going to be and found an astonishing outcome. The RTC appliance made a performance improvement on the overall solution. It does help by adding cache and processing to the serving of data, but it also improves the performance of the system by having it process less data.
For example, if a system has to save 100GB of data with no compression, then all of the data has to be laid out on the disk. The disks that spin, the cache, the CPUs and the I/O ports all have to work harder to save 100GB of data. But if we get 2:1 or 3:1 compression ratios, then all of the components have to work less. No longer are they working to save 100GB of data but 50GB or about 33GB of data. This allows the system to process more data and have cycles to respond quicker to I/O requests (i.e., lower latency).
So the final thing is always the question of how hard this is to install. Is there a period of time that you have to wait, or do you need 5 IBM technicians to install it? All I have to say is it’s easy. So easy that there is a good YouTube video that goes through the entire process from unpacking to racking to compressing data. I think the video speaks for itself:
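A quick Python sketch of how the compression ratio translates into data the back end actually has to handle. The function name is made up for illustration, and real ratios depend heavily on the data being written.

```python
def stored_gb(logical_gb: float, ratio: float) -> float:
    """Physical GB written for `logical_gb` of data at a ratio of `ratio`:1."""
    if ratio <= 0:
        raise ValueError("compression ratio must be positive")
    return logical_gb / ratio


LOGICAL_GB = 100.0

for ratio in (1.0, 2.0, 3.0, 4.0):
    physical = stored_gb(LOGICAL_GB, ratio)
    saved_pct = 100.0 * (1.0 - physical / LOGICAL_GB)
    print(f"{ratio:.0f}:1 -> {physical:5.1f} GB written ({saved_pct:.0f}% less work for disk, cache and ports)")
```

At 2:1 the system writes 50GB, at 3:1 roughly 33GB, and at 4:1 it writes 25GB.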
One of my favorite TV programs is the BBC show Top Gear. They go through and test cars not only for handling, looks, and cup holders but mainly for power. At the end they run all of the cars through the same test track and get a time. That time then gets recorded on their list of all the cars tested and is celebrated for achievement or scorned for doing poorly. No matter what time a car turns in, they are all treated equally.
IBM put together a SONAS system consisting of 10 interface nodes and 8 storage pods with all SAS disk. A total of about 900TB of usable disk, and about 1/3 of the maximum SONAS configuration. There was no solid state disk and no extra tweaks, just a SONAS system that you could order today. That said, the IBM SONAS set a new world record for performance for a single file system at 403,000 IOPS.
Yes, you read that right, 403k IOPS in a single file system. If you look at the other vendors, they have used multiple file systems to aggregate the performance together in order to achieve a benchmark. Then they tend to use a virtual name space with software that is layered over all of the file systems, but here SONAS is one file system over 900TB with a true global name space. One issue with multiple file systems is that they cannot stripe data across the file systems, so load balancing becomes a problem. If you look at the comparison of performance per file system, you can see that IBM is WAY beyond the competitors.
So you may be asking, "Yeah, that's pretty cool, but what was the response time?". According to the test, the average response time was 3.23 ms from 0 to 403k IOPS. This is extremely good, and when you think that it was coming from one file system of 900TB, you realize how good that number is compared to other results.
There will be tons of vendors trying to debunk how IBM outperformed them and how they have better software or better market share, but it really boils down to these key points:
· An all-spinning SAS disk SONAS configuration, typical of SONAS configurations being installed today
· Single file system featuring ease of use, minimum complexity, global load balancing, sharing of resources, proof of scale
· 903 TB usable capacity is indicative of current real life customer scale out NAS requirements
· An environment in which all applications would benefit from the single file system and from the high IOPS and excellent response time
· One can clearly correlate the SONAS SPECsfs benchmark response time with what a real-world application would see on today’s SONAS
I have included the slide deck for the announcement below. Feel free to check out the information on the SPECsfs website.
I was driving into the IBM Almaden Research Center and just enjoying the beautiful scenery of the San Jose area. The campus is on top of a hill and surrounded by farm lands. I would really like to have a corner office here, but I don't think I would get much done. So here is my Vlog for this morning and I am hoping to get some interviews on here from some of the presenters and attendees.
Here is a great demo from my friend Ian Wright, who is an excellent engineer. Keep on the lookout for more videos from him!
I was working with a company today that had 5 storage vendors supplying gear for their data center. They were interested in SONAS not only for the scalability but for the idea that we could consolidate their storage footprint from 5 vendors to 1 without having to sacrifice performance. Now most vendors would say, sure, we can rip out all of the storage there and replace it with our own, but can they do it to the same scale (both horizontal and vertical), with a high performance engine like GPFS, one global name space and one single management tool? The customer was really impressed by our solution and is very interested in how we can help him go from 40+ racks of storage down to 13 and increase his efficiency.
The idea is to use multiple tiers to put metadata on the fastest disk, data that is being accessed currently on medium-speed disk, and then 'near to archive' data on slow, fat disk. Currently they have no way of moving data between their 5 different systems. This causes them to keep running into issues where they are buying expensive disk to run applications/databases while data that is stagnant or not as important sits on the existing fast disks. SONAS will allow them to create pools of storage (tiers) and policies to move the data between the pools. Then with the HSM integration, the data can be archived off to tape until needed down the road.
From a cost perspective, the cost associated with the size and speed of the disk matches its purpose without them having to manually move the data themselves. With IT departments that have a huge growth in storage and are not hiring more admins, this makes sense. They can set the policies up front and not have to worry about some old file sitting on an expensive SAS disk. The other expense savings is in support contracts. The reduction from five support contracts to one will save customers money, and allows them to have one place to get all of their support.
I cannot wait to start work on this account as it looks like we will be putting in a great system and helping a client save money.
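The placement rule described above can be sketched in a few lines of Python. This is only an illustration of the idea, not SONAS or GPFS policy syntax: the pool names, the 30-day and 365-day thresholds, and the `FileInfo` and `choose_pool` names are all assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class FileInfo:
    path: str
    last_access: datetime
    is_metadata: bool = False


def choose_pool(f: FileInfo, now: datetime,
                warm_days: int = 30, archive_days: int = 365) -> str:
    """Pick a pool for a file using hypothetical age thresholds."""
    if f.is_metadata:
        return "fast"        # metadata always lives on the fastest disk
    age = now - f.last_access
    if age <= timedelta(days=warm_days):
        return "medium"      # currently accessed data
    if age <= timedelta(days=archive_days):
        return "slow"        # near-archive data on slow, fat disk
    return "tape"            # handed to HSM and archived off to tape


now = datetime(2010, 6, 1)   # fixed date so the example is repeatable
files = [
    FileInfo("/proj/index.db", datetime(2010, 5, 31), is_metadata=True),
    FileInfo("/proj/report.doc", datetime(2010, 5, 20)),
    FileInfo("/proj/q3-2009.xls", datetime(2009, 10, 1)),
    FileInfo("/proj/2007-audit.pdf", datetime(2007, 3, 15)),
]

placement = {f.path: choose_pool(f, now) for f in files}
for path, pool in placement.items():
    print(f"{path:24s} -> {pool}")
```

The point of setting the rule up front is that nobody has to remember to move the 2007 audit file off expensive disk; the policy does it.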
I was on my way down to Miami today and was talking to the gentleman sitting next to me about storage technology, and the conversation turned to how everyone is scrambling to be in the cloud business. He had heard multiple vendors come in and start talking about cloud technology and how it was going to save him money, time and effort. This gentleman worked for a retail chain that has multiple district offices throughout the eastern US and headquarters in Atlanta. He has multiple technologies all helping him keep the business running, but nothing was planned, and as the company grew, they simply cookie-cut the previous installation and planted it into the new office. Each office would also replicate back to HQ and that would be the main repository for backups/restores. I would guess there are thousands of companies out there with similar setups.
So instead of going into how he could leverage cloud storage technology, I asked him what were his problems and listened. They basically came down to this:
1. Multiple independent islands of storage that are aging, causing his support contracts to go up.
2. Backups take way too long and systems are slowing down as they get closer to 'capacity'.
3. Future growth was expensive as every time they added new capacity, they had to add entire systems.
Now they were not cutting-edge technology leaders, nor were they wanting to be, but he needed a way to solve some of these traditional storage problems. He didn't want to go out and buy a new large system that would take forever to get in and, while it might solve his problems, would bring in even more issues. What he needed was less overhead and more throughput.
We sat there for a while thinking, we didn't say much until I offered this tidbit, "So what does cloud mean to you?" After a nice laugh, he stated that he really didn't know and the more he read, the more 'cloudy' became the answer.
There are many interpretations of what cloud really is and it differs between storage vendors. If there is a true definition of what cloud storage really could be, I think it could be defined using NAS technology. NAS tends to be a kinder and gentler protocol set and the need is growing by leaps and bounds. Our traditional way of adding more systems and creating more independent silos works for smaller environments, but it does not scale when clients want large pools of storage under one umbrella. There are ways of making volumes span into large pools, but the underlying storage is still made up of smaller components that are typically active/active or active/passive nodes, and even the best load balancing will not help if you are overloading that system.
There are ways to find a balance between the same old way and going out and dropping tons of cash on huge storage gear. Find a system that will grow and scale as your storage needs do. Think of ways to keep everything under one umbrella (name space for example) and also try to solve issues that you are having today with real technology and not workarounds.
With NAS technology, we will always be at the mercy of the backup target, whether it's disk or tape. No matter if we are taking snapshots or NDMP backups, we have to write that out to some target to have a restore point. This is your basic strategy for doing a backup/restore, so why not consider using different types of disks to create tiers and offload data to slower pools as it gets 'older'? A few vendors have said there is no need for tiering, mainly because their systems can't take advantage of it and therefore they shun those who do. ILM tiering can help you achieve not only higher utilization rates with the storage, but it also puts the data that is accessed more frequently on faster disk and moves the rest away to make more room. Why pay for fast disk if the data on it is not being accessed frequently?
Future expansion has always been tough for administrators, who tend to overbuy on controller size and skimp on the disk. Systems like SONAS from IBM allow you to grow both storage capacity and server throughput independently. If a customer needs more storage but doesn't need the additional throughput, why force him to add more controllers? SONAS systems can scale up to 30 storage servers and 14.4 PB of spinning disk all under one name space. No more having multiple nodes with their own names, where this storage is called Accounting1, Accounting2 .... etc. They are called Storage and everyone gets the benefit of having all of the nodes, not just one system.
By the time we had gone through all of this, our flight was landing. It was a great talk and both of us gained a different perspective on how cloud is perceived. If any of you want to find out more information about the IBM Cloud strategy or SONAS, go to the following links:
SONAS by IBM
This weekend I was working on moving some of my winter clothes and spring/summer clothes in and out of my closet and into containers. Last fall I purchased a few plastic containers that sealed so I could put my short sleeve golf shirts away and some of my shorts. Here in North Carolina, we can get a mild day and it is nice to have a short sleeve shirt to wear. On these days I would go back to the containers and dig through the nicely folded items until I found the shirt I wanted. Sometimes I had to go through multiple containers because I had forgotten which one I had put it in a few months ago. This weekend when I pulled out the containers they were in a mess, nothing was folded and it took me more time trying to figure out what was what as they all were mixed up. I then wondered: what if I bought a bigger container and, instead of using multiple ones, used one large container to store all of my winter clothes? What would the issues be? Would I have enough space to store the container? Would there be some way of indexing the clothing inside to quickly find what I was looking for? Was there a way to put some clothes that I would need in case of a cool day in a separate container just in case I needed them?
There seemed to be more issues with just using one larger container than I thought. It would be easy to dump all the clothes into the larger bin and claim victory but that did not help me down the road. I needed a system, something to help me consolidate efficiently while still giving me access to those things I used on occasion. I also had to keep in mind the space I was going to use in my storage area. I didn't want to buy one large container and not be able to get it in the space I already had allocated. I needed a flexible system, maybe a few boxes that had labels and I could get to quickly if I needed something inside.
Take a look at some of the noise some of the storage vendors are making about data storage consolidation. Most of them are telling you we can come in and take your smaller boxes and dump the data into one big box. While that helps on saving you space and might keep you from administrating multiple storage devices, you need to look at the downside of just having one big pool of disks. A large storage system that is replacing multiple smaller systems will need more cache and processor power to handle the same load as before. If you want to move data around to different tiers of disks or tape, can you achieve that with the new system?
I started down the road of buying the biggest container I could find but decided against it as it would be too much trouble to find things. Your data storage systems need to be flexible enough to have multiple storage pools so that data is able to move off faster disk and make room for data that is accessed more frequently. This not only allows your clients to have better response times on files they frequently use, but it tells you how much 'real' data people are using in your data center. The other issue I had was that I needed some type of labeling system or an index to tell me where the shirts were, where my ski jacket was, etc. Your data is much the same: you need to keep up with where data lives in the storage system. As our storage systems get larger, we need to save the file metadata easily and keep it in a table so we can run queries against it.
There is also the last part of moving my clothes around that I hated the most, the purge. I went through and found the shirts that had been worn too many times or may not fit the same as when I bought them originally. I packed these in a cheap cardboard box and took them to a donation place. This is the same as getting rid of old data in your system. Old data that is not being accessed is costing you money. You not only have to pay the environmental cost of keeping those bits spinning, but it's taking up room where new data could reside, therefore costing you money to expand. True archiving and purging of data will be needed for any system large or small. Make sure you find a system that is easy to work with and automates this process based on policy.
In the end, if you are looking at consolidation of your data storage, there are multiple things you will need to find out about a system. Just because a bigger container can replace multiple smaller containers does not mean it will give you the flexibility needed to meet your changing needs. For more information on a better way to consolidate your storage platform and move your data, check out the information on SONAS and TSM.
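As a rough illustration of keeping file metadata in a table you can query, here is a small Python sketch using the standard library's sqlite3 module with an in-memory database. The table layout, paths, pool names and sizes are made up for the example and do not reflect how any particular storage system stores its metadata.

```python
import sqlite3

# In-memory database standing in for a metadata index (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE files ("
    " path TEXT PRIMARY KEY,"
    " pool TEXT NOT NULL,"
    " size_bytes INTEGER NOT NULL,"
    " last_access TEXT NOT NULL)"  # ISO dates sort correctly as plain text
)
conn.executemany(
    "INSERT INTO files VALUES (?, ?, ?, ?)",
    [
        ("/home/a/report.doc", "fast", 1000, "2010-05-20"),
        ("/home/a/old.xls", "slow", 5000, "2008-01-15"),
        ("/home/b/photo.jpg", "slow", 3000, "2009-11-02"),
    ],
)


def files_not_accessed_since(db, cutoff_iso):
    """Paths whose last access is older than the cutoff date (YYYY-MM-DD)."""
    cur = db.execute(
        "SELECT path FROM files WHERE last_access < ? ORDER BY path",
        (cutoff_iso,),
    )
    return [row[0] for row in cur.fetchall()]


def bytes_per_pool(db):
    """Total bytes stored in each pool."""
    cur = db.execute("SELECT pool, SUM(size_bytes) FROM files GROUP BY pool")
    return dict(cur.fetchall())


stale = files_not_accessed_since(conn, "2010-01-01")
usage = bytes_per_pool(conn)
print("stale files:", stale)
print("bytes per pool:", usage)
```

The same index answers both "where is my ski jacket" and "what has not been touched in months", which is the kind of query an archive or purge policy needs.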
Sorry Bill, there is a new question burning in our minds today. There seems to be a lot of buzz lately about tiering your data storage and who can and who cannot, why and how, but not a lot of people are talking about when to tier your storage. Netapp has indicated they are not as concerned with a tiering approach, and this is also true for the IBM XiV product. Others like 3par and IBM's SONAS have it built in for clients to move data from one pool to the next. But how does one gauge this old standard of giving the best to the most demanding and the least to the dregs of our storage footprint?
In answer to your requests for IBM N series demos, Andrew Grimes will be delivering a demo on Thursday, March 11th. This Introduction to IBM N series will be followed by a brief and informative demonstration of how N series delivers storage efficiency with disaster recovery solutions. This is your opportunity to demonstrate N series features and ease-of-use to your customers and prospects, plus get some assistance in closing business this quarter. All attendees who fill out the post-event survey will be entered into a drawing for a free Apple iPod.
WHEN: Thursday, March 11th, 10-11:30am CST.
PRESENTED BY: Andrew Grimes
Click here to REGISTER TODAY!
The topics that will be discussed during this N series presentation are:
1. Simplifying Data Management
2. Storage Efficiency
3. Protecting mission critical business applications (Oracle, Exchange, SQL, VMware & SAP) better than our competitors
4. Most importantly, see how we recover these applications in a matter of minutes!
The old adage of faster, smaller, cheaper has been revived in the N series product line. This week (officially) IBM released the information around the highly anticipated OEM re-brand of Netapp's FAS 2040; the N3400. This system has a small 2U form factor but delivers higher performance than its beefier brother the N3600. If you want to see a full comparison of the three boxes, click here for more information.
IBM has three systems that round out the entry level or departmental storage platform: the N3600, the N3300 and now the N3400. All three are based on internal drives with some expansion to a few shelves as needed. The N3600 comes with 20 internal drives, while the smaller N3300 and N3400 come with only 12 internal disks and can expand to a maximum capacity of 136TB. There are two controllers that allow administrators to have a high availability solution for low cost. This makes the system more attractive as it also supports FCP, iSCSI, CIFS and NFS all from one platform.
The N3400 does have a few things I want to point out:
All of these help set this box up for an important role within your datacenter. If you compare this system with other storage systems in the market, you find the new N3400 is well stacked and can compete even with larger mid-tier systems. This box is ideal for our SMB clients who really need the all in one system with the horsepower to keep up with a growing company. The system is a long way from the first entry level system IBM decided to roll out, the N3700. If the two were to be compared the N3700 would be a 'Happy Meal' and the N3400 would be a super sized 2lb Angus burger with fries and shake, maybe even an apple pie.
This new system is considered ideal for both Windows consolidation and virtual environments alike. With the additional ports the system should have a longer life span, as the new EXN 3000 SAS shelves are becoming more of the standard for the N series product line. On the other hand, the system does not support 10 Gbps cards or FCoE as the N3600 does. But as all N series systems support the same Data Ontap code, the system uses the same commands and interface and is built on the same technology as the other N60x0 and N7X000 lines.
Overall, this is an enhanced refresh of the existing N3300 with more ability to scale with current technologies. The performance will be more than the N3600, which raises the question of whether the N3300/N3600 systems are still needed. I suspect as Data Ontap 8 becomes generally available from Netapp, there will be more entry level storage devices released.
For more information on the N3400 and all other N series related information, follow this link or contact your local IBM Storage Rep.
IBM Storwize® V7000 Unified stores up to five times more unstructured data in the same space with integrated Real-time Compression
Today IBM announced that compression now covers not only block data on the V7000 but also file data on the V7000 Unified. The V7000 was first set up with compression back in the summer with a big announcement surrounding “Smarter Storage”. This optimization uses the same code and engine that was purchased from a company named Storwize a few years ago.
IBM initially kept the compression appliance that Storwize was first known for in the market. It uses LZ compression with RACE (Random Access Compression Engine) to provide optimized real-time compression without performance degradation, thus slowing down data growth and reducing the amount of storage to be managed, powered and cooled.
The engine does not require the compression or decompression of entire files to access a data block. It compresses and decompresses only the relevant data blocks “on the fly”. As data is written, the RACE engine compresses the data into a smaller chunk, and it is 100% transparent for systems, storage and applications.
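To show why compressing in independent chunks allows random access, here is a minimal Python sketch using the standard zlib module (DEFLATE, which is LZ77-based). This is not IBM's RACE implementation; the fixed 4096-byte block size and the function names are assumptions chosen only to illustrate reading one block without touching the rest.

```python
import zlib
from typing import List

BLOCK_SIZE = 4096  # hypothetical block size, chosen for illustration


def compress_blocks(data: bytes, block_size: int = BLOCK_SIZE) -> List[bytes]:
    """Compress each fixed-size block on its own so it can be read back alone."""
    return [
        zlib.compress(data[i:i + block_size])
        for i in range(0, len(data), block_size)
    ]


def read_block(blocks: List[bytes], index: int) -> bytes:
    """Decompress only the requested block; the others are never touched."""
    return zlib.decompress(blocks[index])


# 2000 text records of 14 bytes each, 28000 bytes in total.
data = b"".join(f"record {i:06d}\n".encode() for i in range(2000))
blocks = compress_blocks(data)

compressed_size = sum(len(b) for b in blocks)
print(f"{len(data)} bytes in {len(blocks)} blocks, {compressed_size} bytes compressed")

# Random access: fetch the fourth block directly.
fourth = read_block(blocks, 3)
```

The trade-off in this simple scheme is that smaller blocks give cheaper random reads but usually a worse compression ratio, since each block is compressed without knowledge of its neighbors.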
The V7000 Unified can now deliver a larger compressed platform than any other mid-range platform. With compression percentages around 75%, a system that was maxed out at 2.8 PB (960 drives x 3TB each) can now handle up to 5 PB of storage.
Each V7000 Unified with code base 6.4 has the option of turning on a 45 day trial of the compression software. After setting the license to “45”, you can add new compressed volumes on the system. You can also compress data on virtualized storage arrays.
Compression has been part of NAS for a very long time. We have seen compression of files from jpeg to office documents. But the best part is the end user will never have to worry about which files need to be zipped or compressed. Everything that comes through the V7000 Unified can be compressed in line before it writes the data to disks.
A couple of other improvements that IBM announced include the addition of an integrated LDAP server to V7000 Unified. This now allows customers to use both local authentication and external authentication servers to allow access to data. Another feature is the ability to upgrade a V7000 to a V7000 Unified in the field. If you currently own a V7000 but need to add file access to the system, IBM will sell you the two file modules and corresponding software to upgrade your system. Now mind you, there is a list of requirements that will need to be met, so check with your local storage engineer for more information. And finally we now have support for a 4 way cluster on V7000 Unified. This allows for more disks to be provisioned and can compete with some of the other mid-range storage platforms in the market.
This all together makes a nice round of improvements that will make life easier for IBM customers. As the V7000 platform matures it looks like IBM is putting their money where their mouth is and making storage smarter and more efficient. More to come on this platform as I suspect we will see bigger things down the road.
Do you expect more out of your storage? IBM thinks you should and is putting their money where their mouth is. IBM has revamped its premier storage conference, which in the past has gone under different names like STG University and Storage Symposium. The big announcement came today with much fanfare that included a new website, some videos and a bunch of hype on Twitter. It is a three part conference for executives, gear heads and business partners, so there is something for everyone. But what will be different than in years past? I think IBM looked around at how other vendors use conferences to help pump up their customer base (VMworld, EMCwhatever) and decided to put some hype in the conference.
Think of this as a great place to go and network, learn and have a good time. The conference will be in Orlando and there will be tons of time to sit in classrooms and learn about the latest technologies. There will also be sessions where IBM will be pulling in our top execs and analysts to tell you where IBM is going in the storage world.
The Executive Edge will feature speakers including Jeff Jonas, Aviad Offer and IT Finance expert Calvin Braunstein. This track will take executives through new announcements, deep dives on technical platforms, one on one sessions with IBM Execs and some great entertainment. This is a new feature of the conference, as in the past it was geared more towards the technical teams.
Of course the Executive Edge will be limited so talk to your local storage sales person to get a chance to be a part of this special event. There will be time to bring in your team and have special sessions and round tables with the IBM engineers who can help you find your way down this path of crazy storage growth. And there is a golf course on site which I have heard is very nice. Bring your clubs or rent them, I am sure there will be plenty of us out there so find a partner and have a good time.
More importantly, IBM is making the effort to step up the event and have it on par with the other IBM conferences like Pulse. The technical portion will have over 250 sessions on storage related topics. You will also get road-map information from the product teams as well as a chance to become a certified technician. One area that has been expanding is our hands-on labs, and this year we will have the biggest one yet. You will be able to come in to the labs and actually see our storage systems and have a chance to 'test drive' them.
Early bird registration is open now and you can sign up today. The conference will be in sunny Orlando, Florida, at the Waldorf Astoria and Hilton Orlando at Bonnet Creek. The event starts on June 4th and runs to the 8th. You can follow the conference on Twitter @IBMEdge and use the hashtag #ibmedge. See the conference website for more details.
I look forward to seeing you in June.
My father is a retired teacher but loves to work with his hands. I can remember him teaching me, very early on in my upbringing, that it is good to measure twice and cut once. Whether it was building a deck or just a bird house, the point was that it takes more time to cut something wrong and then have to re-cut the board shorter, or even waste the old board and cut a whole new one.
While preparing this article, I remembered learning that lesson the hard way and how much effort really goes into that second cut. The problem in the storage industry is partitions that become misaligned in the move from 512 byte sectors to the newer 4096 byte sectors. This has to be one of the bigger performance issues with virtualized systems and new storage.
Disk drives in the past were limited to a sector size of 512 bytes. This was OK when you had a 315 MB drive because the number of 512 byte blocks was not nearly as large as on one of today's 3 TB drives. Newer versions of Windows and Linux will transfer 4096 byte data blocks that match the native hard disk drive sector size. But during migrations even new systems can have an issue.
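To put the sector counts in perspective, here is a quick Python sketch of the arithmetic. It assumes decimal drive sizes (1 MB = 10^6 bytes, 1 TB = 10^12 bytes), which is how drive vendors usually label capacity. The helper name is just for illustration.

```python
def sector_count(capacity_bytes: int, sector_bytes: int) -> int:
    """Number of whole sectors that fit in the given capacity."""
    return capacity_bytes // sector_bytes


MB = 10**6   # decimal megabyte (assumption: vendor-style units)
TB = 10**12  # decimal terabyte

old_drive = sector_count(315 * MB, 512)
new_drive_512 = sector_count(3 * TB, 512)
new_drive_4k = sector_count(3 * TB, 4096)

print(f"315 MB drive, 512 byte sectors: {old_drive:,}")
print(f"3 TB drive, 512 byte sectors:   {new_drive_512:,}")
print(f"3 TB drive, 4096 byte sectors:  {new_drive_4k:,}")
```

Going to 4096 byte sectors cuts the number of sectors the drive has to track by a factor of eight.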
There is also something called 512 byte sector emulation, which is where a 4k sector on the hard disk is mapped to eight 512 byte logical sectors. Each read and write is then done in groups of eight 512 byte sectors.
When an older OS is installed or migrated, it may or may not align the first block in the eight block group with the beginning of the 4k sector. This can cause a misalignment of one block. As the reads and writes are laid down on the disks, the misalignment of the logical sectors from the physical sectors means the eight 512 byte blocks now occupy two 4k sectors.
This now forces the disk to perform an additional read and/or write to two physical 4k sectors. It has been documented that sector misalignment can cause a reduction in write performance of at least 30% for a 7200 RPM hard drive.
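The mechanics described above can be sketched in a few lines of Python. This is only an illustration, assuming 512 byte logical sectors on top of 4096 byte physical sectors. The function names are hypothetical, and the two starting offsets in the example (sector 63, the classic legacy partition start, and sector 2048, the 1 MiB start used by newer partitioning tools) are there only as familiar examples.

```python
LOGICAL_SECTOR = 512    # bytes per emulated (logical) sector
PHYSICAL_SECTOR = 4096  # bytes per physical sector on the disk


def is_aligned(start_lba: int) -> bool:
    """True if a partition starting at this logical sector begins on a 4k boundary."""
    return (start_lba * LOGICAL_SECTOR) % PHYSICAL_SECTOR == 0


def physical_sectors_touched(start_lba: int, length_bytes: int = 4096) -> int:
    """How many physical sectors an I/O of length_bytes starting at start_lba spans."""
    start_byte = start_lba * LOGICAL_SECTOR
    first = start_byte // PHYSICAL_SECTOR
    last = (start_byte + length_bytes - 1) // PHYSICAL_SECTOR
    return last - first + 1


for lba in (63, 2048):
    print(f"start sector {lba}: aligned={is_aligned(lba)}, "
          f"4k write touches {physical_sectors_touched(lba)} physical sector(s)")
```

With a start at sector 63, every 4k write straddles two physical sectors, which is exactly the extra work described above. With a start at sector 2048, each 4k write lands on a single physical sector.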
This issue is only magnified when adding other file systems on top of this misalignment. When using a hypervisor like VMware or Hyper-V, the virtual image can be misaligned and cause even further performance degradation.
There are hundreds of articles and blogs written on how to check your disk alignment. Do a simple Google search for the words “disk sector alignment” and you will find this has been a very popular topic. Different applications will have different ways of checking and possibly realigning the sectors.
One application that can help you identify and fix these is the Paragon Alignment Tool. This tool is easy to use and will automatically determine if a drive’s partitions are misaligned. If there is misalignment, the utility then properly realigns the existing partitions, including boot partitions, to the 4k sector boundaries.
I came across this tool when looking for something to help N series customers who have misalignment issues in virtual systems. One of the biggest things I saw as an advantage was this tool can align partitions while the OS is running and does not require the snapshots to be removed. It also can align multiple VMDKs within a single virtual machine.
For more information on this tool and alignment check out the Paragon Software Group website.
In the end, your alignment will affect how much disk space you have, how much you can dedupe and the overall performance of your storage system. It pays to check this before you start having issues, and if you are already seeing problems I hope this can help.