FMW Diagnostic Framework: Automatic Capture of Diagnostic Data Upon First Failure!
Posted by Daniel Mortimer on Oracle Blogs
Published on Fri, 2 Nov 2012 12:58:38 +0000
Introduction
There is nothing more frustrating than a problem that "cannot be reproduced". Logs and configuration files have been analysed, but there just isn't enough information to establish the root cause. The issue may be closed, but you are left with the feeling that the problem will rear its ugly head again in the future. Trouble is, to resolve such issues you need to capture diagnostic data at the exact time the incident occurs. Step forward Fusion Middleware Diagnostic Framework!
Diagnostic Framework monitors WebLogic Managed Servers and delivers "Automatic capture of diagnostic data upon first failure". To quote from
Oracle Fusion Middleware Administrator's Guide 11g Release 1 (11.1.1)
Chapter 13 Diagnosing Problems
"When a critical error occurs ... the Diagnostic Framework automatically collects diagnostics, such as thread dumps, DMS metric dumps, and WebLogic Diagnostics Framework (WLDF) server image dumps ... The data is stored in a file-based repository and is accessible with command-line utilities."
In other words, the data collected upon first failure, especially the thread and image dumps, provides a snapshot of the system as, or immediately after, the problem occurs. The table below shows the type of WebLogic Server issues which fall into the scope of Diagnostic Framework.
How to Configure Diagnostic Framework?
Depending on your Fusion Middleware product choice you may not need to do anything! Diagnostic Framework is automatically installed, configured and initiated for any WebLogic Domain which has the Oracle Java Required Files (JRF) template applied. This template is applied by default whenever you configure WebLogic Managed Servers for products such as
- Portal / Forms / Reports / Discoverer
- Identity Management (OID, OAM, OIM, etc.)
- WebCenter
- SOA
Check your WebLogic Domain directory structure. If you have an "adr" sub directory under
DOMAIN_HOME/servers/<servername>/
then the JRF template has been applied and Diagnostic Framework will be in play.
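The directory check described above can be scripted when a domain has many servers. The following is a minimal sketch, assuming only the DOMAIN_HOME/servers/<servername>/adr layout mentioned in this post; the function name servers_with_adr is made up for illustration and is not part of any Oracle tool.

```python
import os


def servers_with_adr(domain_home):
    """Return {server_name: True/False} for each directory under
    DOMAIN_HOME/servers, where True means an "adr" subdirectory exists.

    Assumption: the only signal checked is the presence of the "adr"
    directory, as described in the post. Nothing else is validated.
    """
    servers_dir = os.path.join(domain_home, "servers")
    result = {}
    for name in sorted(os.listdir(servers_dir)):
        server_dir = os.path.join(servers_dir, name)
        # Skip plain files; only server directories are of interest.
        if os.path.isdir(server_dir):
            result[name] = os.path.isdir(os.path.join(server_dir, "adr"))
    return result
```

Calling servers_with_adr with your own DOMAIN_HOME path returns a dictionary; any server mapped to False is a candidate for the My Oracle Support article mentioned next.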
Should the "adr" sub directory not exist, review the advice given in My Oracle Support article
How to Apply FMW (EM) Control and JRF to a WebLogic Domain and Managed Servers [ID 947043.1]
If you are working with a standalone WebLogic Server solution and applying Oracle JRF is not acceptable, consider using WLDF - WebLogic Diagnostic Framework. (Fusion Middleware Diagnostic Framework makes use of WLDF under the covers.) A couple of useful links about WLDF are listed below
- Configuring and Using the Diagnostics Framework for Oracle WebLogic Server 11g
- WebLogic Diagnostics Framework-A Very Useful Tool [A nice blog which describes a WLDF use case]
How to Get Started With Diagnostic Framework
To be frank, the Fusion Middleware Administrator's Guide is the best place to start your learning
Oracle Fusion Middleware Administrator's Guide 11g Release 1 (11.1.1)
Chapter 13 Diagnosing Problems
There is a lot of reading here, but if you are in a hurry and just want to get the right information to Oracle Support to help resolve your issue, check out the next section below.
How to Upload Diagnostic Framework Incident Data to Oracle Support
Some Background Information
There are three interfaces to the Repository:
- Enterprise Manager Cloud Control (Support Workbench)
- WLST (Command Line)
- ADRCI (Command Line)
Enterprise Manager Cloud Control provides a nice GUI to search, view and package Diagnostic Framework incidents. However, this software is not to be confused with Fusion Middleware (EM) Control. Cloud Control (formerly known as Grid Control) is part of the Enterprise Manager media package and has its own install and configuration story. Therefore, for the benefit of those yet to install and play with Cloud Control, I am going to describe how to use the command line tools.
Ideally, you would only need one command line interface, but currently I suggest using both, mainly because ADRCI SHOW INCIDENTS does not reveal the description behind the Diagnostic Framework error code.
Instructions:
Note:
WLST and ADRCI are case sensitive when it comes to handling parameter values. If you make a mistake, expect an unfriendly syntax error message.
1) Find the incident
Note:
The managed server which you are troubleshooting must be up and running. If the managed server is down, ensure the domain's Admin Server is accessible. If you cannot connect to the Admin Server or the Managed Server the example WLST commands will not work.
a) Launch WLST
Note: Use the WLST which resides in the "oracle_common" directory (not WL_HOME/common/bin), otherwise you will get an error like the one below
Traceback (innermost last):
File "<console>", line 1, in ?
NameError: listIncidents
MW_HOME/oracle_common/common/bin/wlst.sh
b) Connect to the managed server or the admin server e.g.
wls:/offline> connect('weblogic','welcome1','t3://localhost:7020')
c) Run the command
wls:/MyDomain/serverConfig> listIncidents()
This will list the incidents for the server to which you have connected. If you have connected to the Admin Server and want to list the incidents for a managed server within the domain, use the command
wls:/MyDomain/serverConfig> listIncidents(adrHome='diag\ofm\MyDomain\MyManagedServer' ,server='MyManagedServer')
Example output
Incident Id Problem Key Incident Time
1 DFW-99998 [java.lang.NullPointerException] [oracle.error.simulator.ErrorSimulator.createNullPointerException][errorWebApp_1-0-0-0] Fri Nov 02 10:38:46 GMT 2012
The text following the error code DFW-99998 is the description you do not see when using the ADRCI 'SHOW INCIDENTS' command.
Make a note of the incident id. You are ready to move to step 2
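If you capture the listIncidents() output to a file, the incident id and description can be pulled out of each row with a small parser. This is a rough sketch only: the row layout is an assumption inferred from the single example row above, and the names ROW and parse_incident_row are made up for illustration. Real output may differ between releases.

```python
import re

# Assumed row layout, based only on the one example in this post:
#   <incident id> <error code> <description> <Day Mon DD HH:MM:SS TZ YYYY>
ROW = re.compile(
    r"^\s*(?P<incident_id>\d+)\s+"
    r"(?P<error_code>[A-Z]+-\d+)\s+"
    r"(?P<description>.*?)\s+"
    r"(?P<time>(?:Mon|Tue|Wed|Thu|Fri|Sat|Sun)\s\w{3}\s\d{2}\s"
    r"\d{2}:\d{2}:\d{2}\s\w+\s\d{4})\s*$"
)


def parse_incident_row(line):
    """Split one output row into its fields.

    Returns a dict of strings, or None when the line does not look like
    an incident row (for example the column header line).
    """
    match = ROW.match(line)
    return match.groupdict() if match else None
```

The header line does not start with a number, so it is skipped (None is returned). The incident_id field is the value to carry into step 2.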
2) Package the incident
a) Set up the environment - example commands below are for Unix
cd <DOMAIN_HOME>/bin
. ./setDomainEnv.sh
If you want ADRCI to run a Remote Diagnostic Agent (RDA) collection (recommended) at generate package time, point ORACLE_HOME at oracle_common
ORACLE_HOME=$MW_HOME/oracle_common; export ORACLE_HOME
To prevent ADRCI from running RDA at generate package time, point ORACLE_HOME at the WL_HOME/server/adr directory.
ORACLE_HOME=$WL_HOME/server/adr; export ORACLE_HOME
b) Launch adrci
$WL_HOME/server/adr/adrci
c) Set BASE and HOMEPATH
adrci> SET BASE /oracle/middleware/user_projects/domains/mydomain/servers/mymanagedserver/adr
adrci> SET HOMEPATH diag/ofm/mydomain/mymanagedserver
d) Optionally run SHOW INCIDENTS e.g.
adrci> SHOW INCIDENTS -MODE DETAIL
ADR Home = /oracle/middleware/user_projects/domains/mydomain/servers/mymanagedserver/adr/diag/ofm/mydomain/mymanagedserver:
*************************************************************************
**********************************************************
INCIDENT INFO RECORD 1
**********************************************************
INCIDENT_ID 1
STATUS ready
CREATE_TIME 2012-11-02 10:38:46.468000 +00:00
PROBLEM_ID 1
CLOSE_TIME <NULL>
FLOOD_CONTROLLED none
ERROR_FACILITY DFW
ERROR_NUMBER 99998
ERROR_ARG1 <NULL>
ERROR_ARG2 <NULL>
ERROR_ARG3 <NULL>
ERROR_ARG4 <NULL>
ERROR_ARG5 <NULL>
ERROR_ARG6 <NULL>
ERROR_ARG7 <NULL>
ERROR_ARG8 <NULL>
ERROR_ARG9 <NULL>
ERROR_ARG10 <NULL>
ERROR_ARG11 <NULL>
ERROR_ARG12 <NULL>
SIGNALLING_COMPONENT <NULL>
SIGNALLING_SUBCOMPONENT <NULL>
SUSPECT_COMPONENT <NULL>
SUSPECT_SUBCOMPONENT <NULL>
ECID 5162744c6a2eea5e:155ff445:13ac0aae7cb:-8000-0000000000000325
IMPACTS 0
1 rows fetched
e) Create a logical package
IPS CREATE PACKAGE INCIDENT incident_number
e.g.
adrci> IPS CREATE PACKAGE INCIDENT 1
Created package 1 based on incident id 1, correlation level typical
f) Generate the package
IPS GENERATE PACKAGE package_number IN path
e.g.
adrci> IPS GENERATE PACKAGE 1 IN /tmp
Generated package 1 in file /tmp/DFW99998j_20121102113633_COM_1.zip, mode complete
Note:
If the generate package command hangs, ADRCI may be experiencing an issue when running RDA. To avoid such trouble, exit ADRCI, point the ORACLE_HOME environment variable at WL_HOME/server/adr, and then relaunch ADRCI and repeat the command.
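Because ADRCI is case sensitive and the paths are long, it can help to generate the step 2 commands rather than type them. The sketch below only builds the command strings shown in steps a) to f) above; it does not run ADRCI. All function names are made up for illustration. The BASE and HOMEPATH patterns are assumptions copied from the example in this post, so check them against your own directory names.

```python
def adr_locations(domain_home, domain_name, server_name):
    """Build the ADR base and home path using the pattern from the post:
    BASE     = <domain_home>/servers/<server>/adr
    HOMEPATH = diag/ofm/<domain>/<server>
    Names are used exactly as given (ADRCI is case sensitive).
    """
    base = "/".join([domain_home.rstrip("/"), "servers", server_name, "adr"])
    homepath = "/".join(["diag", "ofm", domain_name, server_name])
    return base, homepath


def oracle_home_for(run_rda, mw_home="$MW_HOME", wl_home="$WL_HOME"):
    """Pick the ORACLE_HOME value: oracle_common lets ADRCI run RDA,
    WL_HOME/server/adr prevents it."""
    if run_rda:
        return mw_home + "/oracle_common"
    return wl_home + "/server/adr"


def adrci_commands(domain_home, domain_name, server_name,
                   incident_id, package_id, out_dir):
    """Return the ADRCI commands for steps c) to f), in order.

    package_id is assigned by ADRCI when the package is created, so in
    practice read it from the "Created package N" message before running
    the final command.
    """
    base, homepath = adr_locations(domain_home, domain_name, server_name)
    return [
        "SET BASE " + base,
        "SET HOMEPATH " + homepath,
        "SHOW INCIDENTS -MODE DETAIL",
        "IPS CREATE PACKAGE INCIDENT %d" % incident_id,
        "IPS GENERATE PACKAGE %d IN %s" % (package_id, out_dir),
    ]
```

Printing the returned list one command per line gives text that can be pasted at the adrci> prompt. If the generate step hangs as described in the note above, oracle_home_for(False) gives the ORACLE_HOME value to fall back to.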
3) Upload the package zip to Oracle Support via your Service Request
a) Log into My Oracle Support and locate your Service Request
b) Click on "Add Attachments"
c) And upload the zip file