One thing I've been working on in the past few months is Pulse, and specifically defining the content of the Hot Topics in Service Management track. Doing this is by turns both fun and frustrating. Fun because working with all the new and breaking trends over a wide range of topics is interesting. Frustrating because there are only twelve one-hour sessions to fit everything into.
I'm pleased with how this has shaped up, despite the time constraints. We've got a good selection of topics that covers most of the critical, challenging, issues that business and IT leaders are facing today. We've added more client speakers to share their real-world experiences (this is true across the whole of this year's conference), while still putting up the key IBM executives and technical leaders people want to hear from. We've added a couple of panel discussions this year to add some variety to the session format - and to build upon last year's great audience interaction.
So what are we going to cover?
The track kickoff this year will be led by Laura Sanders, Tivoli's VP of Development, and Wing To, Tivoli's VP of Strategy and Product Management. Laura and Wing will provide the big-picture overview, the "State of the Nation" if you will, of the industry and service management: where we are, where we're going and how - most importantly - to move forward in a business environment that is demanding results but still keeping a tight rein on investment. Joining Laura and Wing will be a customer who will be sharing their experience in implementing a service quality program across their organization.
The IBM CIO study will also feature in Hot Topics again this year. A popular session at Pulse 2009, this presentation is a readout from the results of IBM's annual CIO study, and provides insights into what a large population of CIOs have on their minds this year. The last study was being undertaken just as the worst of the financial crisis hit in late 2008: it will be interesting to see how CIO outlooks have changed over the past year.
Of course Cloud Computing remains very much a hot topic as we enter the new decade. Two years ago, at the first Pulse conference in Orlando, the cloud discussion was very much focused on possibility: the idea of cloud computing was just emerging and the talk was all about the potential benefits that could be had if and when it came to fruition. Last year the cloud discussion was around different approaches - accompanied by a very healthy and robust side debate on exactly what was the true definition of a cloud! This year, with more experience under our belts as an industry, the focus will be on what works and how to get results, however you want to define cloud computing.
Cloud and related topics will take up fully a third of the week's agenda, as might be expected. After that we're going to focus on an array of other hot topic areas: integrating IT management with the management of "real-world" devices, alternative delivery models for service management solutions, storage and information management, building applications that are "management-ready", security of mobile networks and applications, and a look at how managing power networks is evolving and what we can learn from that.
As I said there's a lot to fit in and some topics, unfortunately, had to give. That said, if last year's Hot Topics is anything to go by, it will be a lively three days, with discussion and networking amongst peers facing the same challenges rounding out a very full agenda.
See you in Vegas!
Wow, it just dawned on me that tomorrow is my last working day (in theory at least) of the decade. Time for some predictions! I've seen a lot of predictions for 2010 itself, but I'm going to shoot for the whole 10 years. That way I'll be closer to retirement before I'm held up to ridicule...
Here we go. By the end of 2019:
Happy holidays, and enjoy your decade.
Pete Marshall 110000HXS0 email@example.com Tags:  management ibm websphere rational infrastructure service-management pulse information dynamic 2010 tivoli 569 Visits
In response to: The BIG Questions at Pulse
The BIG questions for the HOT TOPICS track at Pulse:
- what's on the mind of my fellow CIOs and IT strategists?
- cloud computing is a great idea, so how can I effectively move my existing business systems?
- can someone share their successful cloud implementation story with me?
- cloud is great in theory, so what are the specific opportunities in my industry?
- how do I lower costs and improve service through virtualization?
- how do I leverage IT in the real world and start making my business part of a "Smarter Planet"?
- how can I leverage the benefits of software as a service and appliances to lower costs and improve quality?
- how much process integration is enough, and what's the right balance between big projects and point solution implementations?
- how do I manage the data juggernaut?
- how do I get my development and operations people to work together to improve my end-user's lives?
- what are the best practices in managing mobile devices and maintaining a secure environment?
- as we roll out smart solutions, can we avoid making all the same mistakes we made when moving to the Internet?
I'll post more soon...
Pete Marshall 110000HXS0 firstname.lastname@example.org Tags:  service-management analytics cloud-computing cloud 520 Visits
IBM announced today that it has set up a huge internal business analytics cloud and is throwing it open to more than 200,000 employees to enable them - I should say us - to make better business decisions. Details are already out on IBM's intranet and I'm looking forward to taking it for a test drive.
This is going to be a learning experience in more ways than one. At the business level it's going to be interesting to see what the impact is on planning and decision making. IBMers like to have facts to back up their proposals and a lot of time and effort goes into finding and analyzing data. This cloud - along with its petabyte-plus of data - should have a huge positive impact on productivity and the time it takes to get questions answered. As well as that, being able to analyze datasets in new ways quickly should help folks get a deeper understanding of customer needs, business trends etc.
The other interesting experience that is going to come out of this, I believe, is real-world experience in running a cloud in that most hostile of environments: servicing the empowered knowledge worker! If cloud computing is all about agility and the ability to respond to rapidly changing workloads, offering a cloud service to a huge community of smart folks in search of just the right piece of data to make a business case is absolutely the right way to find out if you are really up to the task!
Knowledge workers can be hugely disruptive to IT. There's no telling what they might want or try to do. At one end of the scale they go off and try and do things themselves. Twenty years ago that led to the breaking of the IT department's hegemony and, arguably, to the highly fragmented data environments most organizations face today. In turn this has led to that most modern of problems: people downloading huge amounts of data to their laptops to do their own analysis, which leads to a huge exposure to data loss or theft.
Going to an internal, behind the firewall, business analytics cloud makes a huge amount of sense from both a risk management and cost perspective. Information can be better controlled and analytic tools (beyond spreadsheets) can be deployed without having to manage across thousands of laptops. The downside of all this is workload management: an analytics cloud sets itself up to make the pre-Christmas holiday (or any other natural business cycle) workload spike look like gently rolling hills.
I've been constantly amazed by what smart non-IT people can get up to. One incident stands out in my mind, and it's a good illustration of the power of the end user to wreak havoc. I was working with a customer on a database performance issue. Everything was fine except for the occasional moments when response times would become really bad and the system would eat up huge amounts of systems resources: real "lights going dim" moments from a workload perspective. Eventually we tracked the culprit down to an end-user who had been given access at the SQL level and who was busily writing and submitting queries. And what queries! The "structured" in structured query language was apparently named after the ability to nest queries (i.e. SELECT ... WHERE (SELECT ... WHERE), that kind of thing) which this user had completely grokked and was using to the max. Now no program was ever going to do this, and no knowledgeable IT person would submit such a request except, perhaps, in the dead of night.
What this user was doing through these complex queries was generating requests that could not be optimized given the current database structure. Every time the user hit send the database essentially gave itself over to looking through every record in multiple tables, driving resource usage through the roof and transaction response times out of the window. Interestingly this customer took an enlightened approach to their user: rather than banning them for life, they built indexes to support the queries and provided some tutoring on how to write less complex queries.
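The effect of an index on this kind of nested query can be illustrated with a minimal sketch using Python's built-in sqlite3 module. Everything here is invented for illustration: the `orders` table, its columns, the query and the index name `idx_orders_customer` are assumptions, and the customer's real database was certainly different. The sketch only shows how a database's query plan can change once an index exists to support the inner lookup.

```python
import sqlite3

# Hypothetical nested query: orders larger than that customer's own average.
NESTED_QUERY = """
SELECT o.id
FROM orders AS o
WHERE o.amount > (SELECT AVG(o2.amount)
                  FROM orders AS o2
                  WHERE o2.customer_id = o.customer_id)
ORDER BY o.id
"""


def build_demo_db():
    """Create an in-memory database with a small, made-up orders table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        "id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL)"
    )
    rows = [(i, i % 20, float(i % 97)) for i in range(1, 501)]
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", rows)
    return conn


def query_plan(conn, sql):
    """Return the human-readable steps SQLite plans to take for a query."""
    return [row[3] for row in conn.execute("EXPLAIN QUERY PLAN " + sql)]


def add_supporting_index(conn):
    """Build an index so the inner query can look up one customer's rows."""
    conn.execute("CREATE INDEX idx_orders_customer ON orders (customer_id)")


if __name__ == "__main__":
    conn = build_demo_db()
    print("Without index:", query_plan(conn, NESTED_QUERY))
    add_supporting_index(conn)
    print("With index:   ", query_plan(conn, NESTED_QUERY))
```

Without the index, the inner query has to read the whole table for every row the outer query examines. With it, the plan should show the inner query searching by `customer_id` instead. The exact wording of the plan varies between SQLite versions.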
Now this customer was a fairly small shop that had only a handful of end users accessing their databases directly and, fortunately, only one of these users who knew enough to get themselves - and everyone else - into trouble. Now scale that up to a medium to large scale enterprise with a huge amount of data and potentially thousands of users. Then say to every one of these users something along the lines of: "go ahead, knock yourself out!" That is truly setting yourself up for a challenge.
It will be interesting to see how this works in practice. What is certain is that, as well as learning much about how liberating business data across an organization plays out from a business perspective, there are going to be some great lessons in making clouds work in extremely unpredictable workload environments.
One of the givens in our business is "the gap" between IT and the business. Business people are frustrated by it, IT people are trying to narrow it, and there are any number of ideas about how to bridge it. Some people take a more positive approach and spurn the use of the word "gap", preferring instead to talk about the need to align IT to the business (it always seems to be "the" business for some reason), but things that are out of alignment have gaps between them, so it's the same thought.
Now there's no doubt that this gap has narrowed over the years: IT departments are much more closely aligned with the needs of the organization than they were in the past and continue to get better, a process of continual improvement that has been going on for a good number of years now.
But the gap still persists, and the need to close it seems to be as urgent as ever. What's going on? Is the gap actually closing? Or is it actually just as bad as it used to be? Or is it, in fact, just something that is no longer real but the idea of it has got stuck in our collective minds?
I don't have a simple answer to this question. I doubt one exists. There's a lot of variation from organization to organization. But I do think a lot of the time this idea of "the gap" goes unexamined and as a result people can get overly anxious about it or, depending on your organizational zeitgeist, resigned to it. I do think that examining what's going on in a little more depth can help get beyond these reactions and provide a way to move forward. Specifically I think there are three or four "givens" about the gap that need re-examining:
Given #1: The Gap can be closed, and until we do close it we've failed.
I doubt the gap can ever be closed. At the end of the day the gap isn't a measurable thing, it's really a statement about how one set of people - "the business", whoever they are - feel about the performance of another set of people - the IT department. So given that this is primarily a perception issue, and given that we're human (and tribal, even inside businesses) it's unlikely that the gap will ever close completely. We should stop worrying about it and concentrate on maximizing, to the extent that we can, customer satisfaction with IT.
Managers of a quantitative nature might object to what I am saying here. Can't we create metrics that indicate the size of the problem and then bring them down to zero? Certainly you can try - and many organizations would benefit from doing more of this: agreeing on metrics can really help with the perception problem - but even if you get to zero it is only human nature for people to want more and the gap will still be there. Bottom line: closing the gap is a journey, not a destination.
Given #2: It's all IT's fault
It takes two sides to make a gap, yet the problem is usually expressed as being solely an IT issue. This is simply not true: "the business" (again, whoever they are!) needs to step up as well.
The biggest issue here is expectation, especially expectations of what can be achieved in a particular time-frame for a given budget. Now IT does have to shoulder some of the blame here; we have had a history of over-promising and indulging in what Brooks described (back in 1975!) as "gutless estimation". That said, every business manager still working today grew up in the age of computers, and those under, say, 50 years of age really have no excuse. It's not that every manager needs to be some kind of IT expert (some people will disagree with that statement) but the willful "I'm just the business guy, don't tell me what you can't do" attitude that still persists does not help.
The good news here is that, over time, this problem will take care of itself as workplace demographics change. In the meanwhile, encouraging a more balanced view of the realities of IT (especially time vs quality vs budget) is called for - and it really is up to IT to call for it.
Given #3: It's only IT
When people talk about "the gap" there's often an unspoken assumption that this is an IT-only thing, that, in fact, the rest of "the business" gets along just fine and IT is the only problem child. This, clearly, isn't true: many organizations have different problem children and the same kind of gaps exist all over the place. Again points 1 and 2 above play a role here: IT probably gets singled out simply because it is the child, and, again, over time this problem will decline as more and more people outside of IT have a grasp of at least some of the issues.
Given #4: It's all a bad thing, this gap.
Actually it's not all bad. There should be some kind of gap between IT and the business. If we did somehow eliminate the gap it would probably cause more harm than good. Consider the gap between, say, accounting or legal services in an organization and the (rest of the) business. These gaps exist because these organizations are held accountable not only to "the business" but to shareholders and the law. Businesses that have exactly "aligned" finance and/or legal services have often ended up with people doing significant jail time!
At the end of the day, IT people, while being supportive of the businesses they work for, should hold themselves to outside standards. For most people in IT this isn't - hopefully - going to come down to an issue of breaking the law, but I do believe we should as IT practitioners hold ourselves accountable to doing the right thing when it comes to IT, even if the business wants to do something wrong. Yes, that can be difficult at times. Yes, that does mean there will be a gap between IT and the business for all time. But, in the final analysis, doing things this way is, by definition, doing things in a professional manner.
Pete Marshall 110000HXS0 email@example.com Tags:  service-management lifecycle-management tivoli rational 1 Comment 864 Visits
The idea of rapid, iterative, development has been with us for some time now. The basic idea is you replace the whole specify, code, unit-test, system-test waterfall with one where you add or modify something relatively small, test it out, and if all is OK then do another cycle with something else relatively small. Rinse and repeat. The advantage of this approach is that you can engage the business customer with the key question: "did we interpret this requirement correctly?" much earlier on and avoid going too far down the wrong road.
Like most things in IT, the merits and problems of this approach are still up for debate, and exactly which is the best approach to this kind of development is even more hotly debated. But the key is the tooling is there to do it if you want to.
This iterative approach is also getting into production, especially in the Web 2.0 world. Not sure if customers will like feature X? No problem: code it up, put it into production, see if people like it and react accordingly. There's a lot of value in this approach.
Of course this is much easier to do in the Web 2.0 world than it is in the typical mature enterprise. Governance issues aside, the complexity of the enterprise environment makes implementing rapid deployment a lot more challenging. The word legacy is used pejoratively in our business but there's no substitute, value-wise, for working systems and established processes.
So, let's assume you have an established development process on one hand, and an established service management process on the other. Getting into rapid, iterative, deployment of applications is valuable, but perhaps not at the cost (potentially huge both in terms of change and risk management) of tossing out one or the other - or both - sets of processes and tools.
So what we really need to do is to link our processes and tools while maintaining the value we already have in place. And this is exactly what Tivoli and Rational are working on and announced today. There has been integration in the past but these announcements are really starting to flesh out the capabilities:
The new integrated software offerings include:
I am sure we will see more capabilities and sharing over time - this is, after all, only the latest product of the "Rativo" integration work that has been going on for at least five years - and I'll watch eagerly as clients start to create integrated processes and move to a develop-test-deploy cycle for all types of applications.
I've often heard people use an analogy that selling the benefits of service management is a little like selling insurance. The problem with service management is that, like insurance, it's all about avoiding the consequences of bad things (for example outages or car crashes), so in general people a) don't want to think about it, and b) don't think it will happen to them. I first heard this analogy in my first week in the software business (let's just say during the Reagan Administration) and again a couple of weeks ago.
I get the point and I don't doubt that people would rather hear about, and spend money on, new applications rather than service management, in the same way they would rather spend money on new cars rather than auto insurance. That said, the analogy isn't really sound.
First off, it's wrong to think about service management as some kind of tax that comes with the purchase of infrastructure. A better way of thinking about service management is to think of it as the rational design trade-off that lowers the cost of ownership and optimizes ROI of any system implementation, a subject I went into more depth on here.
Secondly, the way service management works is entirely different from the way insurance works. Service management is like magic insurance! With ordinary insurance I pay my premiums and, should anything go wrong, I'm protected from the financial consequences. The magic of service management is that you pay your premiums and things actually get better! Implement security management and you get fewer break-ins. Implement application management and you get fewer crashes. This is a great deal - I wish my auto insurance worked the same way!
The problem is, of course, that this magic goes unappreciated, at least by the people paying for it. Things get better, this state of improved service becomes the new normal, and over time people forget that this is an improved state, forget that it is possible to go backwards, and then start wondering why they are carrying these costs. In that sense, service management *is* like the insurance business, I'll admit it.
Joel Spolsky started a bit of a flurry when he posted an entry on his blog about a month ago titled: The Duct Tape Programmer. Spolsky's argument was that he would rather hire programmers that were focused on shipping product rather than writing code. Of course this generated a lot of reactions (fueled by the salty language!), both in agreement and disagreement. Of course none of those disagreeing argued that shipping product shouldn't be a priority, but many did take issue with the implied and explicit criticism of certain development approaches and methodologies.
I have my own thoughts on this but won't rehash a lot of stuff that's been discussed since that article was posted. Rather I would like to ask a simple question: is there an equivalent of the Duct Tape Programmer in the operations space, the Duct Tape Operator, perhaps*? Second question: if there is, is that a good thing?
Let's start at the end of the spectrum that Spolsky dislikes. Do we have our equivalent of the multi-threaded COM-loving, design-pattern-conference-going programmer in the operations world? Well not at the programming level thankfully - although I've seen some pretty heroic looking REXX execs in my time! - but we do have a certain breed of process-focused types who might provide a match. These are the folks that lose sight of the original goal - improve service, lower costs, improve quality - and become fixated on implementing some process model rather than on achieving results. For these folks the means and the ends get confused.
At the other end of the scale is the operations hero, ready to put the quick fix in, bask in the glory of the saved moment, and then quickly move on. I've talked about these folks before - people (especially customers) think these folks are great, but over time they can do more harm than good. If your organization has a good change management process in place, these people are easy to spot - they'll be at the top of the list of the number of emergency changes by user...
The sweet spot is somewhere in the middle. The people who get results are the ones that help define and build the appropriate amount of process (and automation around it) and then stick to the disciplines. Yes, make an emergency change where necessary, but only when it's called for by the situation, not just as a means of bucking the system. These are the valuable folks, the operations equivalent of The Duct Tape Programmer, the people who improve the lives of end-users, help lower costs and improve service quality and know both the value and limitations of process models.
Final thought: I'm not sure Duct Tape is really the right word (in either development or operations). Perhaps Andy Hunt and Dave Thomas' word pragmatic is better. That said I did once see, after the Northridge Earthquake in '94, a mail server that some pragmatic operator had put back together with duct tape...
* Operator is probably the wrong word here, but we don't have a simple analog to the generic "programmer" in the operations world. Our "operator" here could be a systems programmer, sysadmin, ops analyst, etc. etc... choose whichever one works best for you.
Pete Marshall 110000HXS0 firstname.lastname@example.org Tags:  service-management rational tivoli dynamic-infrastucture 511 Visits
Checking out today's Dynamic Infrastructure announcements, one sentence caught my attention:
"New service management capabilities provide greater visibility, control, and automation, leading to faster time-to-market of services." (my emphasis)
Time-to-market. That's not a phrase one usually thinks about when it comes to service management. Time-to-market is usually perceived as a development issue. Service management is usually thought of as an after-launch activity: it's what we need to do once an application has been moved from development into production.
Traditionally we've thought of development and operations as two distinct (if not mutually hostile!) organizations with a clear divide between the two. Often this divide has been described as a "wall", and the process of moving an application into production as "throwing something over the wall".
In recent years this analogy has started to break down. As organizations have adopted more agile, iterative, approaches to application development, moving code into production has become a more agile, iterative, process. Rather than completing an application and throwing it over the wall, many organizations are now incrementally improving applications and bringing new feature/functions on-line in an ongoing rapid cycle of develop/test/deploy. As a result of this we see developers actively involved in the management of production applications, and operations people involved in the troubleshooting and support of applications. Tools such as Tivoli Composite Application Manager represent something that didn't really exist ten years ago: a solution designed for both developers and operations personnel.
So the walls are becoming more permeable but we still have a way to go. One thing that will help is to start having development and operations tooling share their data, processes and other artifacts. Consider the following:
Will the wall between operations and development ever fully come down? Probably not - they are different disciplines after all and do require different skill sets and processes. But in the future these walls will be less of a barrier and more of a demarcation line, and wherever we choose to draw that line should be dictated by our business needs - and not our tooling's inability to inter-operate.
Pete Marshall 110000HXS0 email@example.com Tags:  pulse pulse2010 service-management hot-topics ibmpulse 947 Visits
One of my more fun tasks at the moment is being the track chair for the Hot Topics in Service Management track at Pulse 2010. My job, along with a team of colleagues representing a wide range of subject areas, is to come up with a set of twelve sessions that provide a comprehensive survey of the challenges facing business leaders in service management today. It's quite a challenge.
Recently Kathleen Holm was asking me about what constituted a Hot Topic and what kind of sessions and topics we are looking forward to in this track. You can read her notes on our conversation here. As you can see, we're really interested in covering a wide range of topics that businesses are wrestling with right now, including cloud computing, sustainability, security, storage and information management and much more. There is a lot to cover and if anything, our problem is going to be getting through everything we would like to cover in just twelve sessions.
Part of the challenge in putting together this track is deciding what's hot, what must we absolutely cover and, in some cases, what's not quite hot enough to make the cut. Of course as I talk to different people I get different opinions about this. Everyone has their favorites: to the cloud people, cloud is the hottest; to the security people, it's security, and so on. Everyone seems to agree roughly on what's hot or not; the contention is over what's hotter than what!
This of course is the stuff of technical debate and something our industry is famous for: endless arguments about the relative merits of X versus Y. Unfortunately, like most of these debates, the result is usually more heat than light. So aside from some tough decisions we may have to make about excluding a certain topic in favor of another one, I'm trying to stay away from these debates as much as possible.
That said, the mere existence of these debates got me thinking: the hottest topic in service management is how to get beyond these debates, how to break through the analysis and start working on getting things done. That should be our focus in this track: real-life examples of making a decision and building something in an environment where a lot of people are still debating what the problem is. That's the hottest topic!
Doug McClure posed an interesting question on Twitter last night: What are the characteristics of a "truly aligned" IT metric with a business metric?
Good question, and one that I am sure I don't have a complete answer for. But I have some thoughts on the subject based on working with customers over the years so here goes... In my experience the best IT metrics are:
Aligned to Business Outcomes
Well, duh! But it's amazing how many IT metrics look like they align with business interests at first glance but turn out not to be so good. A good example of this is everybody's favorite: up-time, especially when expressed as a percentage. Now clearly 99.99% availability is better than 99.9% availability, or even 99.98%, and better numbers are worth striving for. But as an indicator of business value, these types of numbers are of limited use.
The problem is, in this example, that the business impact of outages is non-linear. A short outage doesn't really hurt, but a long one does. As an example, let's take 99.9% availability - a number that many people would regard as two nines short of a decent service level. You can achieve 99.9% availability even if you have a 90 second outage every day. In most organizations, 90 seconds a day would probably count as irritating but not represent a huge impact. On the other hand a 45 minute outage every month, or a close on 9 hour outage once a year would certainly be noticed, even though they both represent the same overall availability.
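The arithmetic behind those figures is easy to check. The following is a minimal sketch; the function name is made up, and the 30-day month and 365-day year are simplifying assumptions.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def downtime_budget_seconds(availability_pct, period_days):
    """Total outage time allowed in a period at a given availability percentage."""
    unavailable_fraction = (100.0 - availability_pct) / 100.0
    return unavailable_fraction * period_days * SECONDS_PER_DAY


if __name__ == "__main__":
    # Assumed period lengths: a 30-day month and a 365-day year.
    for label, days in [("day", 1), ("30-day month", 30), ("365-day year", 365)]:
        seconds = downtime_budget_seconds(99.9, days)
        print(f"99.9% over one {label}: {seconds / 60:.1f} minutes of outage")
```

At 99.9% this works out to about 86 seconds a day, about 43 minutes a month, or a little under 9 hours a year. The same availability figure covers three very different experiences for the business.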
Many IT metrics - number of transactions that fail to complete, time to problem resolution, number of incidents per user, number of bugs per line of code etc. - suffer the same problem. The impact is non-linear. Businesses only notice, and are impacted by, the long tail in these statistical distributions and these are the metrics that count.
Good metrics are the ones that respect this non-linearity. So to get good alignment we need to throw away the data that represents small failures and report only those that really have impact. So, in our up-time example, move away from 99.xyz%, and start expressing the impact in terms of something like number of outages over 3 minutes, 10 minutes, 30 minutes etc. Now I share the concern that this kind of metric is somewhat negative in its presentation, but I would argue that that makes it (and others like it) a better-aligned measurement.
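As a rough illustration of reporting by threshold, here is a small sketch. The function name and the sample outage durations are invented for the example; the 3, 10 and 30 minute thresholds are the ones suggested above.

```python
def count_outages_over(outage_minutes, thresholds_minutes=(3, 10, 30)):
    """For each threshold, count the outages that lasted longer than it."""
    return {
        threshold: sum(1 for duration in outage_minutes if duration > threshold)
        for threshold in thresholds_minutes
    }


# Made-up month of outage durations in minutes: several blips and a long tail.
SAMPLE_OUTAGES = [0.5, 1, 1.5, 2, 2, 4, 12, 45]

if __name__ == "__main__":
    for threshold, count in count_outages_over(SAMPLE_OUTAGES).items():
        print(f"Outages over {threshold} minutes: {count}")
```

The five short blips disappear from the report entirely, and what is left is the handful of incidents the business is likely to remember.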
Presented in a Way that Business Users Think
It is a long-stated goal of the IT profession that, one day, IT services will be just like electrical power or telephone dial-tone: it will simply be there and users will take it for granted. Well I have news: users already do take IT for granted and assume it will be "just there" when it's needed. In fact, better than that, as I discussed above, there's even a tolerance for a certain amount of failure. Now the level of tolerance has its limits obviously and at some point business does start to suffer, and the exact point where failure becomes a real issue differs from application to application, business to business, and from user to user. But all users have one thing in common: they think of things in terms of failures, not successes. IT, by default, works most of the time: what you need to measure is how far away from most of the time you are getting.
There's nothing strange going on here; we all think like this. If I ask you if you have been happy with your cellphone service this month, you will immediately try and recall if you had any problems. You may say you are unhappy based on how much failure occurred and the impact it had on you at the time, or you may say everything was basically okay despite a few dropped calls. Again, circumstances and expectations differ, but the point is it's all based on thinking about problems: when asked this question, what you will not do is think about the 274 calls that went through without a problem.
So to get better aligned, throw out the "success" data and report problems. Nobody cares about success, they take it for granted. So provide only metrics that focus on your failures: not only will these types of metrics garner IT more respect for being honest about its failures, they will provide numbers that much more realistically reflect how well, or otherwise, IT is supporting the business.
Metrics that Inspire
The third characteristic that a good metric has is that it should inspire people to take action to improve it. Some metrics are better than others at this.
I have long thought that it is only a certain type of quality wonk who gets excited about going from 99.9% to 99.99% to 99.999%. Numbers like this don't typically inspire. They don't inspire IT folks to get better, and they don't signal to the business that things are getting better. The same thing goes for getting from 5-sigma to 6-sigma: the improvement gets lost in the statistics, and nobody gets what the difference between a 5.3 last month and a 5.6 this month means. It could be huge in terms of business benefits but it looks tiny. We're human: bigger, simpler numbers are a lot easier to think about.
So, again, a good approach is to drop the statistics and go for the raw numbers, something I talked about in my post on the transaction factory model. So instead of 99.99% successful completion of transactions this week, report it as 1,352 failures. Getting that number to zero is much more inspiring than moving some statistical indicator incrementally upwards. It's also much easier for the business side of things to see improvement (or otherwise) with simple numbers.
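To make the contrast concrete, here is a small Python sketch that reports the same two weeks both ways. The weekly volume of 13,520,000 transactions is a hypothetical figure, chosen only because it makes 1,352 failures correspond to 99.99% success; the second week's failure count is likewise invented for illustration.

```python
def success_pct(failures: int, total: int) -> float:
    """Percentage of transactions that completed successfully."""
    return 100.0 * (1 - failures / total)


# Hypothetical weekly volume; not a figure from the original post.
TOTAL_TRANSACTIONS = 13_520_000

# Hypothetical failure counts for two consecutive weeks.
weekly_failures = {"last week": 1_352, "this week": 676}

for week, failures in weekly_failures.items():
    pct = success_pct(failures, TOTAL_TRANSACTIONS)
    print(f"{week}: {pct:.3f}% success, or {failures:,} failures")
```

Under these assumptions, halving the failures moves the percentage only in the third decimal place, while the raw count visibly drops by half.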
Of course these numbers perhaps look worse on the outside than overall statistics. There's something more comforting about reporting 99.99% success than saying you had 1,352 failures last week. But a good metric should prompt action, and action often only follows if you get out of the comfort zone. And, who knows, if you explain to your users exactly how bad things actually are, they may just start understanding why you keep asking for additional funding!
Here are some interesting numbers...
I was listening to James Blacken of IBM's Global Technology Services present at Pulse Comes to You in Long Beach, California, last week. James' session was titled Service Management in an Uncertain Economy and was largely based on the results of a worldwide survey of over 400 CIOs, IT Directors, and CFOs carried out at the end of 2008 looking into 2009. This survey covers a wide range of topics around spending trends and priorities in the service management space, and throws up some interesting insights into how businesses are reacting to the current economic times.
As you would expect, spending on service management is not increasing in most organizations but, on the other hand, few respondents reported a significant decrease. Spending for most was largely flat coming into 2009 but even with that, most IT organizations realize that they need to re-prioritize projects to focus on demonstrable returns and a quick return on investment.
The survey went on to look at business drivers, project priorities, changes in timescales and so forth. All very interesting stuff, and it would be just as interesting to see it repeated for 2009-2010.
The standout chart to me in all of this was one on measuring the value of service management projects. Specifically the question asked was: Overall, how long has it taken to realize the intended business value/ROI for service management programs or projects completed in the last 24 months? Of the 46% who reported getting measurable value, there was an interesting spread of results: for any given project - event management, for example - some respondents reported getting payback in less than six months, while others took up to two years.
But what was most interesting about these numbers was the other 54% of respondents who did not produce measurable value. Here's how that 54% broke out:
5% reported that their project(s) did not achieve value. What's interesting here (aside from the honesty!) is the fact that they knew they had failed and clearly had a measurement of that failure.
Now these last two, which represent 27% of all organizations surveyed, don't have a project measure in place. (I assume the 22% "too early" do, otherwise they wouldn't say this.) The question is: is 27% a good number or a bad number for our industry as a whole?
In a trivial sense, of course, anything other than 0% is a bad number. We really should be able to measure what we set out to do, even if we fail to do it. That said, I'm inclined to think that this is a pretty good number, especially given that the question asked specifically about measurable business value. Cast your mind back a few years: were we anywhere near as good as this?
As I said, it would be interesting if this were repeated. This number is a good indicator of how our industry is maturing.
It is true to say that, when it comes to operating in a mature, business-oriented fashion, the systems/service management community has made great strides in the past ten years or so. The notion that one needs to manage the services that end-users actually consume, and not just the infrastructure components, has, in this time, gone from being a radical idea to a universal truth. In the same time period, implementing a process model such as ITIL has gone from being a novelty to pretty much a basic foundation of what we do.
What we would like to think is that service management has gone from the wild west of seat-of-the-pants, it'll-be-all-right-on-the-night management to a calm, ordered, by-the-book model where applying changes doesn't necessarily foreshadow disaster and problems, if they should occur, are dealt with logically and methodically.
So why do we still have IT heroes?
That IT heroes are detrimental to the overall progress of service management (and IT in general) is well understood. This article from itsmbuzz.com puts it very well:
"...ultimately the hero mentality is a big reason why so many IT departments are stuck in a reactive mode, constantly fighting fires then slapping each other on the back for a job well done."
So why does this persist?
One explanation given is that, despite all the fine talk, IT (as a whole industry or as in your local IT department, take your pick) still really likes it that way and hasn't really changed. Given, however, that many IT managers and practitioners suffer the same frustration, this is unlikely, so in most cases we can rule out malfeasance.
What about pragmatism? Are people still running around because they have no choice? Are we so up to our asses in alligators that we've lost sight of our goal of draining the swamp? Perhaps, but I don't think this is the whole story.
If it were, the prescription would be easy: discipline. Project management discipline to keep resources off fire-fighting - at whatever cost - and on implementing process controls and automation. Operational discipline to stick to those controls come what may and to avoid the temptation to put in the quick fix. Discipline to do closed loop analysis and iteratively improve processes to the point where only black swan events have any impact.
So is this simply a case of IT, heal thyself? Certainly it is to a point, but there's more to it than this. We're ignoring the end-users' and business managers' role in the problem...
One of my favorite stories about IT is The Parable of the Two Programmers. This story is about development rather than operations but the point is well made: if non-technical people don't understand what is going on they will look at signs of activity - heads-down doing things or amount of output - and ascribe value to that, rather than to the value they cannot see. In the story, the value is in the elegance of Charles' solution, but what is visible is Alan's activity and amount of work product.
The same happens, I believe, in IT operations. Seamless processes are great, but seamlessness implies invisibility, and invisibility, if you don't understand the real value, equates to worthlessness. Now given that end-users and business managers don't even think about service management until there is a problem, the folks that naturally get the attention are the ones who run around conspicuously fixing things. Even in the best-run shop, the IT folks that get all the kudos are the ones that rush in and save the day, not the ones who quietly labor to stop things going wrong in the first place. IT may be trying to get out of react mode, but businesses still want their IT heroes!
I'm not saying we as IT people shouldn't continue to work towards a hero-free culture, but let us keep in mind this is not necessarily a problem entirely of our own making.
In response to: Lots of good things are small
So I can't help but think that small scale ITIL implementations are just as applicable to the large enterprise as they are in the SMB space. The end game may be different (say in terms of total number of component processes implemented) but the idea of starting with a small number of fairly generic processes and then refining them to the level needed for a large enterprise seems to me to be a viable and appealing approach.
In talking to customers about ITIL implementations I get the impression that three things get in the way of successful projects:
First, the breadth and depth of ITIL (esp. V3) is daunting: when you start to put a project plan together it's easy to get into analysis paralysis. Projects get delayed by ongoing scope creep and never get out of the blocks.
Second, for those folks that do make a start, such a large amount gets implemented in phase 1 that it becomes too late, and too costly, to make changes based on the real-life day-to-day execution post the big launch.
Third, big projects, especially around "standards", become a haven for those souls who lose track of the difference between the means and the end. Implementing ITIL (or anything else) shouldn't be an end in itself - we're after the business benefits - but it's easy to stop seeing the wood for the trees.
Seems to me that this kind of simple approach not only makes ITIL usable in the smaller organization but could provide a great on-ramp and starting point for the first cycle of an iterative large-scale implementation.
One of the questions commonly asked in systems and service management is: "what's the overhead?" Whether it is CPU overhead, people overhead, process overhead, or whatever, the costs of managing systems and services are, quite rightly, a concern. Costs should always be optimized. Management software should be efficient from a resource-consumption perspective. More importantly it should be integrated and automated to maximize the effectiveness of management personnel.
As an industry we've done very well over the years in this respect. The past twenty-odd years I've been in the management business represent about 12 Moore's Law doublings, or roughly a 4,000 times increase in capability and capacity per dollar spent on IT. At the same time these past two decades have seen a huge increase in IT spend by most organizations. Put together, today we have a hugely greater number of bytes, cycles, transactions, files, storage devices, network connections etc. etc. to manage than we had in the late 'eighties.
Where we've done well is that the number of people managing all of this capability and capacity hasn't increased anything like as fast - thankfully! As an industry - vendors and users alike - we've done a great job of increasing our management capability while keeping our costs under control.
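For anyone who wants to check the arithmetic behind the "roughly 4,000 times" figure, it is simply repeated doubling. A minimal Python sketch (the function name is made up for this example):

```python
def capability_multiplier(doublings: int) -> int:
    """Growth factor after a given number of doublings."""
    return 2 ** doublings


# Show how quickly repeated doubling compounds.
for n in (1, 6, 12):
    print(f"{n:2d} doublings -> {capability_multiplier(n):,}x")
```

Twelve doublings gives a factor of 4,096, which is where the round number of 4,000 comes from.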
Yet in many cases we continue to refer to these relatively small costs as "overhead". In one sense that's good: overhead is bad and using the word keeps us all on our toes. On the other hand it's bad: service management is worth investing in, but labeling it as overhead can lead to poor, or at least delayed, investment decisions.
It's a common pattern. An organization builds a new application on a new technology. As investing in managing that technology is an "overhead", the decision to buy and implement a suitable tool is put off until the last minute and, things being what they are, there's a period of days, if not weeks or months, before the service is properly brought under control.
We need to re-frame our thinking and get away from this common assumption that management is some kind of necessary evil that fills in the gaps (as an after-thought) in a solution which, in an ideal world, would not need managing at all. We need to re-frame our thinking and look at the overall economics of building an application and delivering that application as a service. We need to understand that this build-then-manage approach is, in fact, the optimal approach economically.
Adding external, after-the-fact, management to new technologies is a common strategy in human affairs. Take the automobile as an example. The technology exists today to have vehicles communicate intelligently with each other and to work collectively to avoid traffic collisions and congestion. But (today, at least) it is simpler and cheaper to use traffic lights to get the management we need. We may not like traffic lights from time to time, but in general we don't regard them as an overhead.
The same applies to software. It is possible to write systems and applications that don't need managing: the whole real-time systems business is about doing just that. It's a hugely expensive business and only something you want to do if you can't avoid it. So if you need software to go places where it can't be managed (into space, for example), or into things that simply cannot fail (airliners, medical equipment...) then you have no choice. But for most other software applications, building something and then separately adding management to it is by far the most economical approach.
That's worth bearing in mind the next time someone says "overhead". It's not overhead, it's the logical, rational, thing to do. Remind whoever says it of this fact and perhaps you can get the right management systems and tools in place without going through the all-too-traditional investment-gap phase.