Keeping track of File System Utilization in Ops Center 12c

Posted by S Stelting on Oracle Blogs
Published on Wed, 7 Nov 2012 17:32:50 +0000

Enterprise Manager Ops Center 12c provides significant monitoring capabilities, combined with very flexible incident management. These capabilities extend to monitoring the file systems associated with Solaris or Linux assets. Depending on your needs, you can monitor and manage incidents, or you can fine-tune alert monitoring rules for specific file systems.

This article will show you how to use Ops Center 12c to:

Track file system utilization
Adjust file system monitoring rules
Disable file system rules
Create custom monitoring rules

If you're interested in this topic, please join us for a WebEx presentation!

Date: Thursday, November 8, 2012
Time: 11:00 am, Eastern Standard Time (New York, GMT-05:00)
Meeting Number: 598 796 842
Meeting Password: oracle123

To join the online meeting
-------------------------------------------------------
1. Go to https://oracleconferencing.webex.com/oracleconferencing/j.php?ED=209833597&UID=1512095432&PW=NOWQ3YjJlMmYy&RT=MiMxMQ%3D%3D
2. If requested, enter your name and email address.
3. If a password is required, enter the meeting password: oracle123
4. Click "Join".

To view in other time zones or languages, please click the link:
https://oracleconferencing.webex.com/oracleconferencing/j.php?ED=209833597&UID=1512095432&PW=NOWQ3YjJlMmYy&ORT=MiMxMQ%3D%3D

Monitoring File Systems for OS Assets

The Libraries tab provides basic, device-level information about the storage associated with an OS instance. This tab shows you the local file system associated with the instance and any shared storage libraries mounted by Ops Center.

More detailed information about file system storage is available in the Charts sub-tab of the Analytics tab. Here, you can select and display the individual mount points of an OS, and export the utilization data if desired.

In this example, the OS instance has a basic root file partition and several NFS directories. Each file system mount point can be independently chosen for display in the Ops Center chart.

File Systems and Incident Reporting

Every asset managed by Ops Center has a "monitoring policy", which determines what represents a reportable issue with the asset. The policy is made up of a set of monitoring rules, where each rule describes:

An attribute to monitor
The conditions which represent an issue
The level or levels of severity for the issue

When the conditions are met, Ops Center sends a notification and creates an incident.
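As a rough mental model, the relationship between a policy and its rules can be sketched in Python. This is only an illustration of the structure described above, not an Ops Center API: the class names, field names, attribute names, and threshold values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class MonitoringRule:
    """Hypothetical model of one monitoring rule (not an Ops Center API)."""
    attribute: str                 # the attribute to monitor
    thresholds: Dict[str, float]   # severity level -> value that represents an issue
    enabled: bool = True


@dataclass
class MonitoringPolicy:
    """Hypothetical model of a policy: a named set of rules for one asset."""
    name: str
    rules: List[MonitoringRule] = field(default_factory=list)

    def active_rules(self) -> List[MonitoringRule]:
        # Only enabled rules can lead to incidents.
        return [rule for rule in self.rules if rule.enabled]


# Made-up attribute names and threshold values, for illustration only.
policy = MonitoringPolicy(
    name="example-os-policy",
    rules=[
        MonitoringRule("exampleAttributeA", {"Warning": 1.0, "Critical": 2.0}),
        MonitoringRule("exampleAttributeB", {"Warning": 5.0}, enabled=False),
    ],
)
print([rule.attribute for rule in policy.active_rules()])
```

Modeling severity levels as a mapping reflects the point that a single rule can carry more than one level of severity.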

By default, OS instances have three monitoring rules associated with file systems:

File System Reachability: Triggers an incident if a file system is not reachable
NAS Library Status: Triggers an incident for a value of "WARNING" or "DEGRADED" for a NAS-based file system
File System Used Space Percentage: Triggers an incident when file system utilization grows beyond defined thresholds

You can view these rules in the Monitoring tab for an OS:

One limitation of the default monitoring rules is that they apply to every file system associated with an OS instance. As a result, any issue with NAS accessibility or disk utilization will trigger an incident. This can cause incidents for file systems to be reported multiple times if the same shared storage is used by many assets, as shown in this screenshot:

Depending on the level of control you'd like, there are a number of ways to fine-tune incident reporting.

Note that any changes to an asset's monitoring policy will detach it from the default, creating a new monitoring policy for the asset. If you'd like, you can extract a monitoring policy from an asset, which allows you to save it and apply the customized monitoring policy to other OS assets.

Solution #1: Modify the Reporting Thresholds

In some cases, you may want to modify the basic conditions for incident reporting on your file systems. The changes you make to a default monitoring rule will apply to all of the file systems associated with your operating system. Selecting the File System Used Space Percentage entry and clicking the "Edit Alert Monitoring Rule Parameters" button opens a pop-up dialog which allows you to modify the rule.

The first screen lets you decide when you will check for file system usage, and how long you will wait before opening an incident in Ops Center. By default, Ops Center monitors continuously and reports disk utilization issues which exist for more than 15 minutes.

The second screen lets you define actual threshold values. By default, Ops Center opens a Warning level incident if utilization rises above 80%, and a Critical level incident for utilization above 95%.
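The interplay between the two screens (a threshold plus a waiting period) can be illustrated with a small Python sketch. It uses the default values quoted above, but the evaluation logic is an assumption made for illustration: the article does not describe how Ops Center samples utilization or picks a severity when the level changes during the waiting period. The function names are hypothetical.

```python
from typing import List, Optional, Tuple

WARNING_PCT = 80.0    # default Warning threshold quoted in the article
CRITICAL_PCT = 95.0   # default Critical threshold quoted in the article
DELAY_MINUTES = 15    # default wait before an incident is opened


def severity(used_pct: float) -> Optional[str]:
    """Map a utilization percentage to a severity level, or None if healthy."""
    if used_pct > CRITICAL_PCT:
        return "Critical"
    if used_pct > WARNING_PCT:
        return "Warning"
    return None


def incident_for(samples: List[Tuple[int, float]],
                 delay_minutes: int = DELAY_MINUTES) -> Optional[str]:
    """Given (minute, used_pct) samples in time order, return the severity of
    the incident to open, or None.

    Assumption: an incident opens only if utilization has stayed above a
    threshold for longer than delay_minutes, and it takes the severity of
    the most recent sample.
    """
    run_start = None
    for minute, used_pct in samples:
        if severity(used_pct) is None:
            run_start = None      # condition cleared, reset the timer
        elif run_start is None:
            run_start = minute    # condition first seen at this minute
    if run_start is None:
        return None
    last_minute, last_pct = samples[-1]
    if last_minute - run_start > delay_minutes:
        return severity(last_pct)
    return None


print(incident_for([(0, 85.0), (10, 86.0), (20, 87.0)]))   # sustained for 20 minutes
print(incident_for([(0, 85.0), (10, 70.0), (20, 96.0)]))   # dipped below, timer reset
```

The waiting period is what keeps a short spike, such as a temporary file that is cleaned up within a few minutes, from opening an incident.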

Solution #2: Disable Incident Reporting for File Systems

If you'd rather not report file system incidents, you can disable the monitoring rules altogether. In this case, you can select the monitoring rules and click the "Disable Alert Monitoring Rule(s)" button to open the pop-up confirmation dialog.

Like the first solution, this option affects all file system monitoring. It allows you to completely disable incident reporting for NAS library status or file system space consumption.

Solution #3: Create New Monitoring Rules for Specific File Systems

If you'd like to have the greatest flexibility when monitoring file systems, you can create entirely new rules. Clicking the "Add Alert Monitoring Rule" button (the icon with the green plus sign) opens a wizard which allows you to define a new rule.

This rule will be based on a threshold, and will be used to monitor operating system assets. We'd like to add a rule to track disk utilization for a specific file system: the /nfs-guest directory. To do this, we specify the following attribute:

FileSystemUsages.name=/nfs-guest.usedSpacePercentage

The value of name in the attribute allows us to define a specific NFS shared directory or file system. In the case of this OS, we could have chosen any of the values shown in the File Systems Utilization chart at the beginning of this article.
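If you need to prepare attribute strings for several mount points, the pattern above is easy to generate and take apart. The following Python sketch only manipulates strings in the format shown in this article. The helper names are hypothetical, and nothing here talks to Ops Center.

```python
PREFIX = "FileSystemUsages.name="


def build_attribute(mount_point: str, metric: str = "usedSpacePercentage") -> str:
    """Compose an attribute string in the format shown in the article."""
    return f"{PREFIX}{mount_point}.{metric}"


def split_attribute(attribute: str):
    """Split an attribute string back into (mount_point, metric)."""
    if not attribute.startswith(PREFIX):
        raise ValueError(f"not a FileSystemUsages attribute: {attribute!r}")
    # The metric is everything after the last dot, so mount points that
    # themselves contain dots are still handled.
    mount_point, _, metric = attribute[len(PREFIX):].rpartition(".")
    return mount_point, metric


print(build_attribute("/nfs-guest"))
print(split_attribute("FileSystemUsages.name=/nfs-guest.usedSpacePercentage"))
```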

usedSpacePercentage lets us define a threshold based on the percentage of total disk space used. There are a number of other values that we could use for threshold-based monitoring of FileSystemUsages, including

freeSpace
freeSpacePercentage
totalSpace
usedSpace
usedSpacePercentage
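To see how these five values relate to one another, here is a Python sketch that derives them for a local path using the standard library's shutil.disk_usage. It illustrates the arithmetic only. The units Ops Center reports and the exact way it computes each value are not stated in this article, so treat the formulas as assumptions.

```python
import shutil


def metrics_from(total: int, used: int, free: int) -> dict:
    """Derive the five values from raw byte counts (assumed definitions)."""
    return {
        "totalSpace": total,
        "usedSpace": used,
        "freeSpace": free,
        "usedSpacePercentage": 100.0 * used / total if total else 0.0,
        "freeSpacePercentage": 100.0 * free / total if total else 0.0,
    }


def file_system_metrics(path: str = ".") -> dict:
    """Read usage for the file system that contains path."""
    usage = shutil.disk_usage(path)   # named tuple: total, used, free (bytes)
    return metrics_from(usage.total, usage.used, usage.free)


print(metrics_from(total=200, used=150, free=50))
print(file_system_metrics("."))
```

On some file systems, used plus free is less than total because of reserved blocks, so the two percentages do not always add up to 100. That is one reason to pick the value that matches the question you are asking: "how full is it" versus "how much room is left".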

The final sections of the screen allow us to determine when to monitor for disk usage, and how long to wait after utilization reaches a threshold before creating an incident. The next screen lets us define the threshold values and severity levels for the monitoring rule:

If historical data is available, Ops Center will display it on the screen. Clicking the Apply button will create the new monitoring rule and activate it in your monitoring policy.

If you combine this with one of the previous solutions, you can precisely define which file systems will generate incidents and notifications. For example, this monitoring policy has the default "File System Used Space Percentage" rule disabled, but the new rule reports ONLY on utilization for the /nfs-guest directory.
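The effect of that combination can be sketched in Python. This is a simplified model, not Ops Center behavior: the rule dictionaries, the sample readings, and the thresholds chosen for the custom rule are all assumptions made for illustration.

```python
# Hypothetical policy: the default rule (applies to every file system) is
# disabled, and a custom rule targets only /nfs-guest.
rules = [
    {"name": "File System Used Space Percentage", "mount_point": None,
     "enabled": False, "warning": 80.0, "critical": 95.0},
    {"name": "nfs-guest used space", "mount_point": "/nfs-guest",
     "enabled": True, "warning": 80.0, "critical": 95.0},  # assumed thresholds
]


def incidents(readings: dict, rules: list) -> list:
    """Return (mount_point, severity) pairs for readings that break a rule."""
    found = []
    for mount_point, used_pct in readings.items():
        for rule in rules:
            if not rule["enabled"]:
                continue
            # A rule with mount_point None applies to every file system.
            if rule["mount_point"] not in (None, mount_point):
                continue
            if used_pct > rule["critical"]:
                found.append((mount_point, "Critical"))
            elif used_pct > rule["warning"]:
                found.append((mount_point, "Warning"))
    return found


# Made-up utilization readings (percent used) for three mount points.
readings = {"/": 97.0, "/nfs-guest": 88.0, "/nfs-shared": 99.0}
print(incidents(readings, rules))
```

Even though the root and /nfs-shared readings are above the default thresholds, only /nfs-guest produces an incident, because the catch-all rule is disabled.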
