By: James Barnes
Graduation and Mother's Day are over, Memorial Day has passed, and Black Friday is a ways off; no large sales are currently on the horizon. That means you can rest, right? Run your development iterations, make a bunch of changes, and test them? Yes and no. Now is the perfect time to do a health check, while you still have plenty of time to resolve any problems.
What exactly do we mean by a health check?
First off, you want to get a picture of how your current system is running. Things to look at are as follows:
- CPU Utilization
- Page views/minute or hour
- Heap utilization
- Cache misses
- Fix levels
- Connection pool review
- Database Review
If you are not keeping an eye on these at least some of the time, you may be in for some holiday experiences that are less than joyous. Now let us take a deeper look at each one.
CPU Utilization:
You want to understand what your average utilization is as well as your peak, so that when the holiday rush comes you will have the resources available to meet the demand. This knowledge can also help you investigate your environment, detect any bottlenecks, and deal with them before they become a problem. In one such case, some test code that had slipped into production was blocking threads. We discovered this before the holiday season, when it was only causing slight hangs during the busiest part of the day.
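As a rough illustration of tracking average versus peak, here is a minimal sketch. It assumes you already collect CPU utilization percentages from whatever monitoring tool you use; the function name and the sample readings are hypothetical.

```python
from statistics import mean


def summarize_cpu(samples):
    """Return (average, peak) for a list of CPU utilization percentages."""
    if not samples:
        raise ValueError("need at least one sample")
    return mean(samples), max(samples)


# Hypothetical readings, for illustration only
samples = [35.0, 42.5, 61.0, 88.0, 57.5]
average, peak = summarize_cpu(samples)
print(f"average={average:.1f}% peak={peak:.1f}%")
```

Keeping both numbers matters: a comfortable average can hide a peak that leaves little room for holiday traffic.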
Page views/minute or hour:
This volume indicator lets us correlate other factors with time of day, and it gives us a baseline from which to grow. By this I mean, if you are getting 300 page views/hour, a reasonable estimate is that this may grow by 30% when Black Friday comes. You would then be serving 390 page views per hour. Do you think you have the resources available to handle this increase? You can also use it as a sanity check when looking at your traffic to see if you are the target of an attack.
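The growth arithmetic above can be sketched as follows. The 300 page views/hour and the 30% growth come from the example in the text; the capacity figure is a hypothetical number that you would replace with what your own load tests show.

```python
def projected_page_views(current_per_hour, growth_pct):
    """Project hourly page views after growing by growth_pct percent."""
    return current_per_hour * (100 + growth_pct) / 100


def has_headroom(projected_per_hour, tested_capacity_per_hour):
    """True if the projected volume fits within the tested capacity."""
    return projected_per_hour <= tested_capacity_per_hour


projected = projected_page_views(300, 30)

# Hypothetical capacity taken from a load test; substitute your own figure
tested_capacity = 450
print(projected, has_headroom(projected, tested_capacity))
```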
Heap utilization:
This is most commonly reviewed by turning on verbose garbage collection output. With this log you can use one of the various tools (like PMAT) to see whether your system is properly tuned, or whether you have a problem such as not enough heap or large objects.
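One number worth pulling out of such a log is the share of elapsed time spent paused in garbage collection. The sketch below is not a replacement for a tool like PMAT. It assumes the pause durations have already been extracted from the verbose GC log by some other means, and the function name and sample values are hypothetical.

```python
def gc_overhead_pct(pause_ms, elapsed_ms):
    """Percentage of elapsed wall-clock time spent in GC pauses."""
    if elapsed_ms <= 0:
        raise ValueError("elapsed_ms must be positive")
    return 100 * sum(pause_ms) / elapsed_ms


# Hypothetical pause times (ms) observed over a one-minute window
pauses = [120, 80, 200, 100]
overhead = gc_overhead_pct(pauses, 60_000)
print(f"GC overhead: {overhead:.2f}%")
```

Tracking this figure over time, alongside the peak traffic periods, can show whether heap pressure grows with load.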
Cache misses:
Reviewing this will require some dynacache tracing, or other instrumentation, to see when an item is pulled from the cache versus having to be calculated. Usually it is best to tune your cache by running load against the system and seeing where the optimum setting is. More details are here: Dynacache tuning
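When comparing load test runs, the hit ratio is a convenient single number. This is a minimal sketch that assumes your tracing or instrumentation gives you hit and miss counts per run; the cache sizes and counts below are made up for illustration and do not come from any real system.

```python
def hit_ratio(hits, misses):
    """Fraction of lookups served from the cache (0.0 if there were none)."""
    total = hits + misses
    if total == 0:
        return 0.0
    return hits / total


# Hypothetical load test results: cache size -> (hits, misses)
runs = {
    1000: (600, 400),
    2000: (850, 150),
    4000: (900, 100),
}

for size, (hits, misses) in sorted(runs.items()):
    print(f"cache size {size}: hit ratio {hit_ratio(hits, misses):.2f}")
```

In a run like this, you would look for the point where a larger cache stops improving the ratio by a meaningful amount.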
Fix levels:
You want to review your system and see if you are at the recommended fix levels, as well as ensure that any must-install fixes are applied. It is better to do this during a low-stress period than when you are losing business. For WebSphere Commerce you can see that list here: http://www-01.ibm.com/support/docview.wss?uid=swg21261296
Connection Pool Review:
Is your application waiting for a connection? Do you have instrumentation in place to monitor for this? A good sizing exercise is shown here http://www-01.ibm.com/support/docview.wss?uid=swg21358336
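For a quick first estimate before working through the linked exercise, a common back-of-the-envelope approach is Little's law: connections in use are roughly the request rate multiplied by how long each request holds a connection. This is not necessarily the method the linked document uses, and the function name and numbers are hypothetical.

```python
import math


def estimate_connections(requests_per_sec, hold_time_ms):
    """Estimate concurrent connections in use via Little's law."""
    return math.ceil(requests_per_sec * hold_time_ms / 1000)


# Hypothetical: 50 requests/sec, each holding a connection for 200 ms
needed = estimate_connections(50, 200)
print(f"roughly {needed} connections in use at steady state")
```

This is a steady-state average only. Bursts need headroom on top of it, and the total across all application servers must stay within what the database can support.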
Database Review:
Review the CPU utilization of your database. How many connections can it support? If it is DB2, have you been keeping up with runstats and reorg? For both DB2 and Oracle, have you been using the dbclean utility to help clean up old and orphaned data? Keeping your database tier healthy will have a positive impact on the site's performance.
When you start to notice a problem, attack it then, instead of waiting 8 months until it is holiday time. Doing it this way should allow you and support to focus on the cause, instead of grasping for clues after many of the logs have rolled over. Bottom line: do a health check now, and keep the data around as a point of comparison.