A lot of customers ask how to verify their SOA clusters and make them production ready. Here is a list that I recommend using for 10G SOA Clusters.
Test cases for each component - Oracle Application Server 10G
General Application Server test cases
This section covers general test cases that verify the Application Server
cluster has been set up correctly and that you can start and stop all the
components in the server via opmnctl and the AS Console.
Test Case 1
Check if you can see AS instances in the console
Implementation
1. Log on to the AS Console --> check to see if you can see all the nodes
in your AS cluster. You should be able to see all the Oracle AS instances that
are part of the cluster. This means that the OPMN clustering worked and the AS
instances successfully joined the AS cluster.
Result
All the instances in the AS cluster should be listed in the EM console. If
any instance is not listed, check these files to see whether OPMN joined the
cluster properly:
$ORACLE_HOME\opmn\logs\opmn.log
$ORACLE_HOME\opmn\logs\opmn.dbg
If OPMN did not join the cluster properly, check the opmn.xml file to make
sure the discovery multicast address and port are correct (see the OPMN
documentation). Restart the whole instance using opmnctl stopall followed by
opmnctl startall. Then log on to the AS console to see if the instance is
listed as part of the cluster.
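To compare the discovery setting across nodes, a small script can pull it out of each opmn.xml. This is only a sketch: it assumes the discovery address and port live in the list attribute of a discover element, and the sample fragment (address, port, namespace) is a placeholder, so check it against your own file.

```python
import xml.etree.ElementTree as ET


def discovery_entries(opmn_xml_text):
    """Return the 'list' attribute of every <discover> element found."""
    root = ET.fromstring(opmn_xml_text)
    entries = []
    for elem in root.iter():
        # Drop an XML namespace prefix such as '{http://...}discover'.
        local_name = elem.tag.rsplit('}', 1)[-1]
        if local_name == 'discover' and elem.get('list'):
            entries.append(elem.get('list'))
    return entries


# Hypothetical fragment: the address and port are placeholders, not real values.
SAMPLE_OPMN_XML = """<opmn xmlns="http://www.oracle.com/ias-instance">
  <notification-server>
    <topology>
      <discover list="*225.0.0.20:8001"/>
    </topology>
  </notification-server>
</opmn>"""

# Usage: run discovery_entries() on the opmn.xml text from every node
# and confirm that all nodes return the same value.
```

If the values differ between nodes, that is a likely reason an instance did not join the cluster.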
Test Case 2
Check to see if you can start/stop each component
Implementation
1. Check each OC4J component on each AS instance.
2. Start each and every component through the AS console to see if they will
start and stop.
3. Do that for each and every instance.
Result
Each component should start and stop through the AS console. You can also
verify that a component started by logging on to each box associated with the
cluster and checking opmnctl status.
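If you capture the opmnctl status output from each box, a short script can flag anything that is not alive. This is a sketch under assumptions: the sample output below is illustrative, and the parser assumes the first four pipe-separated columns are component, process type, PID and status.

```python
def parse_opmnctl_status(output):
    """Return (component, process_type, pid, status) tuples from status text."""
    rows = []
    for line in output.splitlines():
        parts = [p.strip() for p in line.split('|')]
        # Data rows have at least four '|' separated fields; skip the header row.
        if len(parts) < 4 or parts[0] == 'ias-component':
            continue
        rows.append(tuple(parts[:4]))
    return rows


def not_alive(rows):
    """Return the rows whose status column is anything other than 'Alive'."""
    return [row for row in rows if row[3] != 'Alive']


# Illustrative output only; instance, component names and PIDs are made up.
SAMPLE_STATUS = """Processes in Instance: soa1.example.com
-------------------+--------------------+---------+---------
ias-component      | process-type       |     pid | status
-------------------+--------------------+---------+---------
OC4JGroup:default_group | OC4J:oc4j_soa  |   12345 | Alive
HTTP_Server        | HTTP_Server        |   12346 | Down
"""
```

Calling not_alive(parse_opmnctl_status(text)) on the captured text gives the components that still need attention.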
Test Case 3
Add/modify a datasource entry through AS console on a remote AS instance (not
on the instance where EM is physically running)
Implementation
1. Pick an OC4J instance.
2. Create a new data-source through the AS console.
3. Modify an existing data-source or connection pool (optional).
Result
Open $ORACLE_HOME\j2ee\<oc4j_name>\config\data-sources.xml to see if
the new (and/or the modified) connection details and data-source exist. If they
do, then the AS console has successfully updated a remote file and MBeans are
communicating correctly.
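Instead of reading data-sources.xml by eye, you could list the names it defines. The sketch below assumes data sources and pools appear as managed-data-source, native-data-source and connection-pool elements with a name attribute. The sample file, names and credentials are hypothetical.

```python
import xml.etree.ElementTree as ET

WANTED = ('managed-data-source', 'native-data-source', 'connection-pool')


def named_entries(data_sources_xml_text):
    """Map element type to the list of 'name' attributes found for it."""
    root = ET.fromstring(data_sources_xml_text)
    found = {}
    for elem in root.iter():
        # Drop an XML namespace prefix if the file declares one.
        local_name = elem.tag.rsplit('}', 1)[-1]
        name = elem.get('name')
        if name and local_name in WANTED:
            found.setdefault(local_name, []).append(name)
    return found


# Hypothetical file content; every name and value here is a placeholder.
SAMPLE_DATA_SOURCES = """<data-sources>
  <connection-pool name="TestPool">
    <connection-factory factory-class="oracle.jdbc.pool.OracleDataSource"
        user="example_user" password="example_password"
        url="jdbc:oracle:thin:@//dbhost:1521/orcl"/>
  </connection-pool>
  <managed-data-source name="TestDS" connection-pool-name="TestPool"
      jndi-name="jdbc/TestDS"/>
</data-sources>"""
```

If the data-source you created in the console shows up in the result for the remote instance, the remote update worked.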
Test Case 4
Start and stop AS instances using the opmnctl @cluster command
Implementation
1. Go to $ORACLE_HOME\opmn\bin and use the opmnctl @cluster
to start and stop the AS instances
Result
Use opmnctl @cluster status
to check for start and stop statuses.
HTTP server test cases
This section deals with test cases for HTTP server failover scenarios. In
these examples the HTTP server talks to the BPEL console (or any other web
application that the client wants), so the URL will be
http://hostname:port/BPELConsole
Test Case 1
Shut down one of the HTTP servers while accessing the BPEL console and verify
that the request is routed to the second HTTP server in the cluster
Implementation
1. Access the BPELConsole.
2. Check $ORACLE_HOME\Apache\Apache\logs\access_log --> check for the
timestamp and the URL that was accessed by the user. Timestamp and URL would
look like this:
1xx.2x.2xx.xxx [24/Mar/2009:16:04:38 -0500] "GET /BPELConsole=System HTTP/1.1" 200 15
3. After you have figured out which HTTP server this is running on, shut down
this HTTP server by using opmnctl stopproc --> this is a graceful shutdown.
4. Access the BPELConsole again (please note that you should have a
LoadBalancer in front of the HTTP server and configured the Apache Virtual
Host, see EDG for steps).
5. Check $ORACLE_HOME\Apache\Apache\logs\access_log --> check for the
timestamp and the URL that was accessed by the user. Timestamp and URL would
look like above.
Result
Even though you are shutting down the HTTP server the request is routed to
the surviving HTTP server, which is then able to route the request to the BPEL
Console and you are able to access the console. By checking the access log file
you can confirm that the request is being picked up by the surviving node.
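To see which node served a request, you can pull the timestamp and URL out of each server's access_log. The sketch below is written against the sample line shown in the implementation steps. The regular expression and helper names are my own, and the optional ident/user fields are an assumption about other log formats you may have configured.

```python
import re
from datetime import datetime

# Client, optional ident/user pair, [timestamp], "METHOD URL PROTOCOL", status, size.
LINE_RE = re.compile(
    r'^(?P<client>\S+)\s+(?:\S+\s+\S+\s+)?'
    r'\[(?P<ts>[^\]]+)\]\s+'
    r'"(?P<method>\S+)\s+(?P<url>\S+)\s+(?P<protocol>[^"]+)"\s+'
    r'(?P<status>\d{3})\s+(?P<size>\S+)'
)


def parse_access_line(line):
    """Return a dict of fields for one access_log line, or None if no match."""
    match = LINE_RE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    # Month names are parsed with the current locale; English is assumed here.
    entry['time'] = datetime.strptime(entry['ts'], '%d/%b/%Y:%H:%M:%S %z')
    entry['status'] = int(entry['status'])
    return entry


def hits_for(lines, url_prefix):
    """Return (time, url, status) for every request whose URL starts with url_prefix."""
    hits = []
    for line in lines:
        entry = parse_access_line(line)
        if entry and entry['url'].startswith(url_prefix):
            hits.append((entry['time'], entry['url'], entry['status']))
    return hits


SAMPLE_LINE = ('1xx.2x.2xx.xxx [24/Mar/2009:16:04:38 -0500] '
               '"GET /BPELConsole=System HTTP/1.1" 200 15')
```

Running hits_for() over the access_log of each HTTP server, before and after the shutdown, shows which node picked up the /BPELConsole requests and when.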
Test Case 2
Repeat the same test as above but instead of calling opmnctl stopproc, pull
the network cord of one of the HTTP servers, so that the LBR routes the request
to the surviving HTTP node --> this is simulating a network failure.
Test Case 3
In test case 1 we simulated a graceful shutdown; in this case we will
simulate an Apache crash
Implementation
1. Use opmnctl status -l to get the PID of the HTTP server that you would
like to forcefully bring down.
2. On Linux use kill -9 <PID> to kill the HTTP server.
3. Access the BPEL console.
Result
After you kill the HTTP server, OPMN will restart it. The restart may be so
quick that the LBR may still route the request to the same server. One way to
check if the HTTP server restarted is to check the new PID and the timestamp in
the access log for the BPEL console.
BPEL test cases
This section is going to cover scenarios dealing with BPEL clustering using
jGroups, BPEL deployment and testing related to BPEL failover.
Test Case 1
Verify that jGroups has initialized correctly. There is no real testing in
this use case just a visual verification by looking at log files that jGroups has
initialized correctly.
Check the opmn log for the BPEL container for all nodes at
$ORACLE_HOME/opmn/logs/<group name><container name><group
name>~1.log. This logfile will contain jGroups related information
during startup and steady-state operation. Soon after startup you should
find log entries for UDP or TCP.
Example jGroups Log Entries for UDP
Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
INFO: sockets will use interface 144.25.142.172
Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.UDP createSockets
INFO: socket information:
local_addr=144.25.142.172:1127, mcast_addr=228.8.15.75:45788, bind_addr=/144.25.142.172, ttl=32
sock: bound to 144.25.142.172:1127, receive buffer size=64000, send buffer size=32000
mcast_recv_sock: bound to 144.25.142.172:45788, send buffer size=32000, receive buffer size=64000
mcast_send_sock: bound to 144.25.142.172:1128, send buffer size=32000, receive buffer size=64000
Apr 3, 2008 6:30:37 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
-------------------------------------------------------
GMS: address is 144.25.142.172:1127
-------------------------------------------------------
Example jGroups Log Entries for TCP
Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
INFO: server socket created on 144.25.142.172:7900
Apr 3, 2008 6:23:39 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
-------------------------------------------------------
GMS: address is 144.25.142.172:7900
-------------------------------------------------------
In the log below, "socket created on" indicates that the TCP socket is
established on the node itself at that IP address and port, and "created
socket to" shows that the second node has connected to the first node,
matching the IP address and port in the logfile above.
Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable start
INFO: server socket created on 144.25.142.173:7901
Apr 3, 2008 6:25:40 PM org.collaxa.thirdparty.jgroups.protocols.TP$DiagnosticsHandler bindToInterfaces
------------------------------------------------------
GMS: address is 144.25.142.173:7901
-------------------------------------------------------
Apr 3, 2008 6:25:41 PM org.collaxa.thirdparty.jgroups.blocks.ConnectionTable getConnection
INFO: created socket to 144.25.142.172:7900
Result
By reviewing the log files, you can confirm if BPEL clustering at the
jGroups level is working and that the jGroups channel is communicating.
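If you have many nodes, the visual check can be partly automated by searching each log for the three phrases shown in the examples. This is a rough sketch. The patterns are taken only from the sample entries in this post, so other jGroups versions or configurations may log differently.

```python
import re

# Phrases copied from the sample log entries in this post.
GMS_RE = re.compile(r'GMS: address is (\S+:\d+)')
LISTEN_RE = re.compile(r'server socket created on (\S+:\d+)')
CONNECT_RE = re.compile(r'created socket to (\S+:\d+)')


def summarize_jgroups_log(text):
    """Collect the host:port values reported by the jGroups startup messages."""
    return {
        'gms_addresses': GMS_RE.findall(text),
        'listening_on': LISTEN_RE.findall(text),
        'connected_to': CONNECT_RE.findall(text),
    }


# Lines from the second node's TCP example.
SAMPLE_LOG = """INFO: server socket created on 144.25.142.173:7901
GMS: address is 144.25.142.173:7901
INFO: created socket to 144.25.142.172:7900
"""
```

In a two node TCP setup, the address under connected_to on one node should match an address under listening_on on the other node.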
Test Case 2
Test connectivity between BPEL Nodes
Implementation
1. Test connections between different cluster nodes using ping, telnet, and
traceroute. The presence of firewalls and number of hops between cluster
nodes can affect performance as they have a tendency to take down
connections after some time or simply block them.
2. Also reference Metalink Note 413783.1: "How to Test Whether Multicast is
Enabled on the Network."
Result
Using the above tools you can confirm if Multicast is working and
whether BPEL nodes are communicating.
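Where telnet is not installed, the same TCP reachability check can be scripted. The sketch below only tests that a TCP connection can be opened, so it covers the TCP jGroups setup and not multicast. The helper names are my own and the peer list in the comment is hypothetical.

```python
import socket


def can_connect(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unresolved host names.
        return False


def check_peers(peers, timeout=3.0):
    """Map each (host, port) pair to whether it is reachable."""
    return {(host, port): can_connect(host, port, timeout) for host, port in peers}


# Hypothetical usage, run from each BPEL node against the other nodes:
# check_peers([('144.25.142.172', 7900), ('144.25.142.173', 7901)])
```

A peer that shows False from one node but True from another usually points to a firewall or routing rule between those two boxes.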
Test Case 3
Test deployment of BPEL suitcase to one BPEL node.
Implementation
1. Deploy a HelloWorld BPEL suitcase (or any other client specific BPEL
suitcase) to only one BPEL instance using ant, JDeveloper or the BPEL
console.
2. Log on to the second BPEL console to check if the BPEL suitcase has been
deployed.
Result
If jGroups is configured and communicating correctly, BPEL clustering
will allow you to deploy a suitcase to a single node, and jGroups will notify
the second instance of the deployment. The second BPEL instance will go to the
DB and pick up the new deployment after receiving notification. The result is
that the new deployment will be "deployed" to each node, by only
deploying to a single BPEL instance in the BPEL cluster.
Test Case 4
Test to see if the BPEL server fails over and if all asynch processes
are picked up by the secondary BPEL instance
Implementation
1. Deploy 2 Asynch processes: a ParentAsynch Process which calls a
ChildAsynchProcess with a variable telling it how many times to loop or how
many seconds to sleep, and a ChildAsynchProcess that loops or sleeps or has
an onAlarm.
2. Make sure that the processes are deployed to both servers.
3. Shut down one BPEL server.
4. On the active BPEL server call ParentAsynch a few times (use the load
generation page).
5. When you have enough ParentAsynch instances shut down this BPEL instance
and start the other one. Please wait till this BPEL instance shuts down fully
before starting up the second one.
6. Log on to the BPEL console and see that the instances were picked up by
the second BPEL node and completed.
Result
The BPEL instances will fail over to the secondary node and complete the flow.
ESB test cases
This section covers the use
cases involved with testing an ESB cluster. For this section please
follow
Metalink Note 470267.1 which covers the basic tests to verify your ESB cluster.