Misc: Performance monitoring

Title:Misc: Performance monitoring
Author:Douglas O’Leary <dkoleary@olearycomputers.com>
Description:Misc: Performance monitoring
Date created:06/2009
Date updated:10/02/2015
Disclaimer:Standard: Use the information that follows at your own risk. If you screw up a system, don’t blame it on me...

Overview

sar, the system activity reporter, is a performance monitoring tool that comes standard with most flavors of UNIX. It does a good job of collecting statistics and helping to identify three of the four possible UNIX-based bottlenecks. The first script below, sadc.ksh, sets up sar. Two other scripts, which will be forthcoming, generate reports on the CPUs and disks. These are the two reports that I’ve used most often and also the two that need the most manipulation in order to provide any meaningful results.

A couple of notes:

  • The scripts that follow were generated on HP platforms. Generally, sar is set up the same between the various vendors; however, verify the binary paths in the scripts if you’re setting this up on systems other than HP.
  • Most major vendors supply a -r option to their sar commands to report on memory. HP, for some reason, doesn’t. What they do have that other vendors don’t is the -Mu option, which breaks out the CPU stats by CPU. That comes in handy to verify that the applications are hitting all the CPUs simultaneously.
  • The scripts require a minor bit of tweaking in order to work on Linux systems. Like most things that have migrated to Linux, sar has gotten much better. I will post the Linux versions of the scripts sometime soon.

And now, on to the scripts:

sadc.ksh:    # Set up and run the data collection
graph.sar-u: # Report on CPU stats averaged out over an hour.
graph.sar-d: # Report on disk stats that break defined levels.

The graph.sar-u script seems to be missing. I’ll try to re-find that too.

sadc.ksh

As mentioned above, sadc.ksh sets up sar. It should be kicked off via cron. The script takes two command line arguments and will complain if it doesn’t get them. The first identifies the scan rate - how many seconds between system polls. The second argument identifies the number of polls. The script will initiate the sadc data collector with the scan rate and number of polls you specify and store the data in /var/adm/sa/sar.YYMMDD. This means that you’ll have to use the -f command line argument to sar to get information from the file. For instance, to get the CPU stats from today’s file, the command line is:

sar -u -f /var/adm/sa/sar.000204
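
The original sadc.ksh is not reproduced here. As a rough sketch of the behaviour described above, and not the author’s script, a wrapper might look something like the following. The sadc path (/usr/lbin/sa/sadc, the usual HP location), the SADC and SA_DIR variables, and the function names sar_file and run_sadc are all assumptions; verify the paths on your system.

```shell
#!/bin/sh
# Sketch only: this is not the original sadc.ksh.
# SADC and SA_DIR are assumed defaults; override them for your platform.
SADC=${SADC:-/usr/lbin/sa/sadc}
SA_DIR=${SA_DIR:-/var/adm/sa}

# Print today's data file name in the sar.YYMMDD form.
sar_file() {
    echo "$SA_DIR/sar.$(date +%y%m%d)"
}

# Check for exactly two whole-number arguments, then start the collector
# with the given interval (seconds) and count, writing to today's file.
run_sadc() {
    if [ $# -ne 2 ]; then
        echo "usage: run_sadc interval count" >&2
        return 1
    fi
    for arg in "$1" "$2"; do
        case $arg in
            ''|*[!0-9]*)
                echo "interval and count must be whole numbers" >&2
                return 1 ;;
        esac
    done
    "$SADC" "$1" "$2" "$(sar_file)"
}

# Example: run_sadc 120 720
# Read today's CPU stats back with: sar -u -f "$(sar_file)"
```

Building the file name in one function means the collector and any later sar -f call agree on where today’s data lives.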

Depending on the scan rate you specify, some of these files can get pretty big. Make sure you have enough room in the /var filesystem. Better yet, create a new filesystem for the sar stats and either mount it at /var/adm/sa or soft link it there. I usually like to have at least a gig for a good performance evaluation - preferably two gigs.

Speaking of my normal (performance evaluation) setup: I will run the collector every two minutes for the full day. This catches just about any twitch the system does. It provides the detail necessary and, with the reporting scripts, can be averaged out to provide a much higher level perspective. When I’m not actively running a performance evaluation, I will back the scan rate off to about 15 minutes between polls. At the two minute polling rate, the sar files will get up to 10-12 megs each. As you can tell, it won’t take long to fill up a filesystem with files like that. The cron table entries look like:

0 0 * * * /usr/local/sbin/sadc.ksh 120 720 # Full bore
0 0 * * * /usr/local/sbin/sadc.ksh 900 96  # Reduced rate
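
Both entries cover a full day: 120 x 720 and 900 x 96 each come to 86,400 seconds. If you want a different scan rate, the matching count can be worked out the same way. A minimal sketch, where the function name polls_per_day is hypothetical and not part of the original scripts:

```shell
#!/bin/sh
# polls_per_day is a hypothetical helper, not part of the original scripts.
# Given a polling interval in seconds, print the number of polls that
# fit into 24 hours (86400 seconds), rounded down.
polls_per_day() {
    interval=$1
    case $interval in
        ''|*[!0-9]*)
            echo "usage: polls_per_day seconds" >&2
            return 1 ;;
    esac
    if [ "$interval" -eq 0 ]; then
        echo "interval must be greater than zero" >&2
        return 1
    fi
    echo $((86400 / interval))
}

polls_per_day 120   # prints 720
polls_per_day 900   # prints 96
```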

Make sure you either clean up the sar files periodically or stop the data collection when the performance evaluation is complete.
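
One way to handle the cleanup is a small find job run from cron. This is a sketch only; the function name purge_sar_files and the 30 day retention in the example are assumptions, so pick a retention that fits your filesystem.

```shell
#!/bin/sh
# Sketch only: purge_sar_files is a hypothetical helper.
# Remove sar.* data files under directory $1 that have not been
# modified for more than $2 days. Other files are left alone.
purge_sar_files() {
    dir=$1
    days=$2
    case $days in
        ''|*[!0-9]*)
            echo "usage: purge_sar_files dir days" >&2
            return 1 ;;
    esac
    if [ ! -d "$dir" ]; then
        echo "$dir: not a directory" >&2
        return 1
    fi
    find "$dir" -type f -name 'sar.*' -mtime +"$days" -exec rm -f {} \;
}

# Example: keep roughly a month of data
# purge_sar_files /var/adm/sa 30
```

Matching on the sar.* name keeps the job from touching anything else that happens to live in the same directory.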

graph.sar-u

Notes on the graph.sar-u script will go here.

graph.sar-d

Notes on the graph.sar-d script will go here.