Lower business risk through mainframe monitoring
Caroline Exum | Tags: itcam zos ims cics z tivoli zenterprise omegamon management systems system_z_software
By Lorin Ullmann, IBM® Master Inventor, Lead Architect, Integrated Service Management for System z®
Tivoli System z Architecture and Technical Strategy, IBM Development Lab, Austin, Texas
If you are responsible for your organization’s most critical applications, then you are very likely working with mainframes. You also have likely been called back into work at night, or over the weekend, in order to diagnose and fix a computing problem. Read more of this blog to learn why IT organizations are implementing comprehensive Service Management technology strategies.
The mainframe runs the most critical business applications
The mainframe runs applications that allow people to use their credit cards to purchase goods, withdraw cash from an ATM, check in at a hotel or airport, process a home insurance claim after a fire, receive their weekly paycheck in their bank account, and receive government services like Social Security. It also enables critical operations by our military. The figure below illustrates a common pattern in the mainframe application environment. The application software architectures that businesses use today are both critical and complex. Software components running on the mainframe store critical customer data, execute IT transactions, host application business service logic, and provide access for internal users and external customers.
Figure 1: Example System z Application Architecture Pattern
Performance monitoring lowers business risk
These mainframe applications, often referred to as “workload,” are usually the company’s most critical assets, generating business revenue with a proven track record of being operational 24 hours a day, 7 days a week, 365 days a year. The software components have been highly tuned and modified by programmers over time in order to guarantee response times for the end user. The collection of software components must work together seamlessly to run billions of business transactions a year.
In order to lower business risk, mainframe organizations use monitoring software: specialized applications or tools that collect information and provide indicators that everything is working as expected. A healthy application means the company continues to make money, because customers can access their data and perform transactions by using the company’s services. Performance monitoring tools also highlight software components that are not performing well. Performance indicators, sometimes referred to as “metrics” or “key performance indicators (KPIs),” are collected to provide early identification of potential bottlenecks that could eventually lead to a poor user experience. For example, the ATM or web banking application might perform slower than expected, but only when a large number of users log in at the same time.
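The banking example above can be made concrete with a minimal sketch. It assumes KPI samples that pair a concurrent-user count with a measured response time, and it reports the lowest load at which a response-time threshold is breached. The names Sample and find_bottleneck_onset, the threshold, and the sample values are all hypothetical and chosen only for illustration; they do not represent the data model or API of any particular monitoring product.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class Sample:
    """One hypothetical KPI observation."""
    concurrent_users: int
    response_ms: float


def find_bottleneck_onset(samples: Iterable[Sample],
                          threshold_ms: float) -> Optional[int]:
    """Return the lowest concurrent-user count whose response time
    exceeds threshold_ms, or None if no sample breaches it."""
    breaching = [s.concurrent_users for s in samples
                 if s.response_ms > threshold_ms]
    return min(breaching) if breaching else None


# Illustrative values only: response time degrades as load grows.
samples = [
    Sample(100, 180.0),
    Sample(500, 210.0),
    Sample(2000, 950.0),
    Sample(3000, 1400.0),
]

onset = find_bottleneck_onset(samples, threshold_ms=500.0)
print(onset)
```

An indicator like this is useful as an early warning: it shows the load level at which the application starts to slow down, before most users notice.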
IT staff depend on problem determination tooling
Monitoring tools alert operators to potential issues and then help the subject matter experts (SMEs) determine the root cause of an issue. Using the example of an unusually slow response time for an end user, the SME investigates each area of the computing software and hardware environment in turn. This common problem determination approach lets the SME eliminate technology areas one by one as the potential root cause.
Problem determination of an application slow-down is difficult and time consuming because there are many potential areas of concern. In addition to the application software environment shown in Figure 1, the SME must also consider the storage hardware, the server and network environment, and any non-mainframe servers that are part of the application data and logic flow. These non-mainframe servers, often referred to as “distributed servers,” sit in the company’s data center, often connect to the mainframe, and drive the use of computing resources on the mainframe in order to support customers’ business transactions.
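The elimination approach described above can be sketched in a few lines. This is only an illustration: the area names come from the text, but the health checks are hypothetical stand-ins (simple functions returning True when an area looks healthy), not calls into any real monitoring tool.

```python
from typing import Callable, Dict, List


def eliminate_healthy_areas(checks: Dict[str, Callable[[], bool]]) -> List[str]:
    """Run each area's health check and return the areas that remain
    suspect, in the order they were supplied."""
    return [area for area, is_healthy in checks.items() if not is_healthy()]


# Hypothetical checks: each returns True when the area is healthy.
checks = {
    "application software": lambda: True,
    "storage": lambda: True,
    "network": lambda: False,   # assumed unhealthy for this example
    "distributed servers": lambda: True,
}

suspects = eliminate_healthy_areas(checks)
print(suspects)
```

Each area that passes its check is ruled out, so the SME can concentrate on the areas that remain.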
Comprehensive IBM System z Monitoring Solution
The IBM System z Monitoring Solution Architecture design addresses the need to streamline problem determination for complex application architecture run-times by collecting performance metrics in a high performing, 24x7 highly available environment. The figure below illustrates the general solution scope based on OMEGAMON® and ITCAM monitoring technology.
The OMEGAMON and ITCAM technology families provide a single point of data collection and on-screen visibility across the entire System z environment and its subsystems. They support different personas, with 3270 displays that give system programmers rapid response times and graphical user interfaces for SME operations staff. The solution provides data to event monitoring, reporting, and automation tools for both real-time and historical use cases. Best practices are provided out of the box for IT staff and can be customized by an organization.
Figure 2: IBM System z Monitoring Solution Architecture
In closing, a few key points are reiterated below:
Lorin Ullmann, the lead development architect for Tivoli's Integrated Service Management for System z and an IBM Master Inventor, focuses primarily on System z technical strategies, solution architectures, and software designs across the Tivoli portfolio that help clients address today’s IT challenges. During his 25-year IBM career, Lorin's architectural contributions have spanned a broad range of technologies, including systems management; Java™, OS/2™, thin client, and pervasive device operating systems; printer device drivers and graphic and video engines; power management firmware; PS/2 personal computer video circuit board hardware; and I/O optical storage peripherals.