How to Achieve OC4J RMI Load Balancing
- by fip
This is an old Oracle SOA and OC4J 10G topic. In fact, it is not
even a SOA topic per se. Questions about RMI load balancing arise when you
develop custom web applications that access human tasks running on a
remote SOA 10G cluster. Having just returned from a customer who faced
challenges with OC4J RMI load balancing, I felt there is still some
confusion in the field about how OC4J RMI load balancing works. Hence I
decided to dust off an old tech note that I wrote a few years back and
share it with the general public.
Here is the tech note:
Overview
A typical use case in Oracle SOA is that you are building a custom,
web-based human task UI that will interact with the task services housed
in a remote BPEL 10G cluster. Or, more generically, you are just
building a web-based application in Java that needs to interact with the
EJBs in a remote OC4J cluster. In either case, you are talking to an
OC4J cluster as an RMI client. You must then immediately ask yourself the
following questions:
1. How do I make sure that the web application, as an RMI client, evenly
distributes its load across all the nodes in the remote OC4J cluster?
2. How do I make sure that the web application, as an RMI client, is
resilient to node failures in the remote OC4J cluster, so that in
the unlikely case that one of the remote OC4J nodes fails, my web
application will continue to function?
That is the topic of how to achieve load balancing with an OC4J RMI client.
Solutions
You need to configure and code RMI load balancing in two places:
1. Specify the provider URL as a comma-separated list of
URLs, so that the initial lookup can land on any one of the available
URLs.
2. Choose a proper value for the oracle.j2ee.rmi.loadBalance
property, which, alongside the PROVIDER_URL property, is one
of the JNDI properties passed to the JNDI
lookup (http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI).
More details below.
About the PROVIDER_URL
The job of the JNDI property java.naming.provider.url is to provide a
list of RMI contexts when the client looks up a new context for the very
first time in the client session.
The value of the JNDI property java.naming.provider.url is either a single URL or a comma-separated list of URLs.
A single URL. For example: opmn:ormi://host1:6003:oc4j_instance1/appName1
A comma-separated list of multiple URLs. For example: opmn:ormi://host1:6003:oc4j_instance1/appName,
opmn:ormi://host2:6003:oc4j_instance1/appName,
opmn:ormi://host3:6003:oc4j_instance1/appName
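As a minimal sketch of the comma-separated form, the snippet below assembles the provider URL into a JNDI environment using only the standard javax.naming.Context constant. The class and method names are hypothetical, the code assumes Java 8 or later, and it deliberately stops short of creating an InitialContext, which would require the OC4J client libraries, an initial context factory, and credentials that are not covered here.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

// Hypothetical helper, not part of OC4J: builds the JNDI environment only.
final class ProviderUrlExample {

    /** Joins individual OPMN lookup URLs into one comma-separated provider URL. */
    static String joinProviderUrls(List<String> urls) {
        return String.join(",", urls);
    }

    /** Builds a JNDI environment in which only the provider URL is set. */
    static Hashtable<String, String> buildEnvironment(List<String> urls) {
        Hashtable<String, String> env = new Hashtable<>();
        // Context.PROVIDER_URL is the constant for "java.naming.provider.url".
        env.put(Context.PROVIDER_URL, joinProviderUrls(urls));
        return env;
    }

    public static void main(String[] args) {
        List<String> urls = Arrays.asList(
                "opmn:ormi://host1:6003:oc4j_instance1/appName",
                "opmn:ormi://host2:6003:oc4j_instance1/appName",
                "opmn:ormi://host3:6003:oc4j_instance1/appName");
        // With the OC4J client libraries on the classpath, this environment
        // (plus factory and credentials) would be passed to new InitialContext(env).
        System.out.println(buildEnvironment(urls).get(Context.PROVIDER_URL));
    }
}
```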
When the client looks up a new Context for the very first time in the
client session, it sends a query to the OPMN referenced by the provider URL. The OPMN host and port
specify the destination of the query, and the OC4J instance name and
appName are effectively the “where clause” of the query.
When the PROVIDER URL references a single OPMN server
Let's consider the case where the provider URL references only a single
OPMN server of the destination cluster. In this case, that single OPMN
server receives the query and returns a list of the qualified
Contexts from all OC4Js within the cluster, even though there is only a
single OPMN server in the provider URL. A context represents a particular starting point at a particular server for subsequent object lookups.
For example, if the URL is opmn:ormi://host1:6003:oc4j_instance1/appName, then OPMN will return the following contexts:
appName on oc4j_instance1 on host1,
appName on oc4j_instance1 on host2,
appName on oc4j_instance1 on host3
(provided that host1, host2, and host3 are all in the same cluster)
Please note that one OPMN is sufficient to find the list of all
contexts in the entire cluster that satisfy the JNDI lookup query. You
can do an experiment by shutting down appName on host1 and observing
that the OPMN on host1 is still able to return appName on host2 and
appName on host3.
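To make the “where clause” idea concrete, here is a toy model of the query, not OPMN's actual implementation. It splits a provider URL of the shape shown earlier into host, port, instance, and application name, then filters an in-memory cluster description by instance and application. The Deployment class, the running flag, and all method names are assumptions made for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Toy model only: every type and method here is hypothetical.
final class OpmnQueryModel {

    /** One deployment of an application on an OC4J instance on a host. */
    static final class Deployment {
        final String host;
        final String instance;
        final String app;
        final boolean running;

        Deployment(String host, String instance, String app, boolean running) {
            this.host = host;
            this.instance = instance;
            this.app = app;
            this.running = running;
        }
    }

    /** Splits opmn:ormi://host:port:instance/app into {host, port, instance, app}. */
    static String[] parseProviderUrl(String url) {
        String prefix = "opmn:ormi://";
        int slash = url.indexOf('/', prefix.length());
        if (!url.startsWith(prefix) || slash < 0) {
            throw new IllegalArgumentException("Unexpected provider URL: " + url);
        }
        String[] parts = url.substring(prefix.length(), slash).split(":");
        if (parts.length != 3) {
            throw new IllegalArgumentException("Expected host:port:instance in " + url);
        }
        return new String[] {parts[0], parts[1], parts[2], url.substring(slash + 1)};
    }

    /** Returns every running deployment in the cluster matching the URL's instance and app. */
    static List<String> query(String url, List<Deployment> cluster) {
        String[] p = parseProviderUrl(url);
        List<String> contexts = new ArrayList<>();
        for (Deployment d : cluster) {
            // Host and port only say which OPMN to ask; they do not filter the result.
            if (d.running && d.instance.equals(p[2]) && d.app.equals(p[3])) {
                contexts.add(d.app + " on " + d.instance + " on " + d.host);
            }
        }
        return contexts;
    }

    public static void main(String[] args) {
        List<Deployment> cluster = Arrays.asList(
                new Deployment("host1", "oc4j_instance1", "appName", false), // shut down
                new Deployment("host2", "oc4j_instance1", "appName", true),
                new Deployment("host3", "oc4j_instance1", "appName", true));
        System.out.println(query("opmn:ormi://host1:6003:oc4j_instance1/appName", cluster));
    }
}
```

In this model, asking via host1 while appName on host1 is stopped still yields the entries for host2 and host3, which mirrors the experiment described above.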
When the PROVIDER URL references a comma-separated list of multiple OPMN servers
When the JNDI property java.naming.provider.url references a comma-separated
list of multiple URLs, the lookup returns exactly the same
thing as with a single OPMN server: a list of qualified Contexts from
the cluster.
The purpose of having multiple OPMN servers is to provide high
availability for the initial context creation: if the OPMN at host1
is unavailable, the client will try the lookup via the OPMN on host2, and so
on. After the initial lookup returns and the client caches the list of contexts, the
JNDI URL(s) are no longer used in the same client session. That explains
why removing the 3rd URL from the list of JNDI URLs will not stop the
client from getting the EJB on the 3rd server.
About the oracle.j2ee.rmi.loadBalance Property
After the client acquires the list of contexts, it caches it at the
client side as the “list of available RMI contexts”. This list includes all
the servers in the destination cluster and stays in the
cache until the client session (JVM) ends. RMI load balancing
against the destination cluster happens at the client side, as the
client switches between the members of the list.
Whether and how often the client refreshes its Context from the list of
Contexts depends on the value of oracle.j2ee.rmi.loadBalance. The
documentation at http://docs.oracle.com/cd/B31017_01/web.1013/b28958/rmi.htm#BABDGFBI lists all the available values for oracle.j2ee.rmi.loadBalance.
| Value | Description |
|---|---|
| client | If specified, the client interacts with the OC4J process that was initially chosen at the first lookup for the entire conversation. |
| context | Used for a Web client (servlet or JSP) that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be returned each time InitialContext() is invoked. |
| lookup | Used for a standalone client that will access EJBs in a clustered OC4J environment. If specified, a new Context object for a randomly-selected OC4J instance will be created each time the client calls Context.lookup(). |
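The following sketch shows one way the property might be added to the JNDI environment next to the provider URL. The property name and its three values come from the table above; the class name, the method name, and the guard that rejects any other value are my own assumptions, and the code assumes Java 7 or later. As before, no InitialContext is created, since that needs the OC4J client libraries.

```java
import java.util.Arrays;
import java.util.Hashtable;
import java.util.List;
import javax.naming.Context;

// Hypothetical helper, not part of OC4J.
final class LoadBalanceEnvExample {

    static final String LOAD_BALANCE_PROPERTY = "oracle.j2ee.rmi.loadBalance";

    // The three values listed in the table above.
    static final List<String> DOCUMENTED_VALUES = Arrays.asList("client", "context", "lookup");

    /** Builds a JNDI environment carrying both the provider URL and the load balancing mode. */
    static Hashtable<String, String> buildEnvironment(String providerUrl, String loadBalance) {
        if (!DOCUMENTED_VALUES.contains(loadBalance)) {
            // Guard added for this sketch; it is not OC4J behavior.
            throw new IllegalArgumentException("Not a documented value: " + loadBalance);
        }
        Hashtable<String, String> env = new Hashtable<>();
        env.put(Context.PROVIDER_URL, providerUrl);
        env.put(LOAD_BALANCE_PROPERTY, loadBalance);
        return env;
    }

    public static void main(String[] args) {
        // A web client (servlet or JSP) would typically pick "context" per the table.
        Hashtable<String, String> env = buildEnvironment(
                "opmn:ormi://host1:6003:oc4j_instance1/appName,"
                        + "opmn:ormi://host2:6003:oc4j_instance1/appName",
                "context");
        System.out.println(env.get(LOAD_BALANCE_PROPERTY));
    }
}
```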
Please note that regardless of the setting of the oracle.j2ee.rmi.loadBalance
property, the “refresh” only occurs at the client. The client can only
choose from the “list of available RMI contexts” that was returned and cached
at the very first lookup. That is, the client merely gets a new
Context object from the cached list
at the client side. The client will NOT go to the OPMN server again to
get the list. That also implies that if you add a node to the
server cluster AFTER the client’s initial lookup, the client will not
know about it, because neither the server nor the client will initiate a
refresh of the list to reflect the new node.
About High Availability (i.e. Resilience Against Node Failure of Remote OC4J Cluster)
What we have discussed above is load balancing. Let's also discuss high availability.
This is how high availability works in RMI: when the client uses a
Context but gets an exception such as a closed socket, it knows that the
server referenced by that Context is problematic and tries to get
another unused Context from the “list of available RMI contexts”. Again,
this is the list that was returned and cached at the very first
lookup in the entire client session.
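The failover behavior described above can be pictured with a small model. This is a sketch of the idea only, not the OC4J client's internal code: the cached contexts are plain strings, the remote call is a caller-supplied function, and trying the entries in list order is an assumption made for simplicity. All names are hypothetical, and the code assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Toy model only: every type and method here is hypothetical.
final class FailoverModel {

    /**
     * Tries the call against each cached context in turn and returns the
     * first successful result. The list is never refreshed, matching the
     * point that only the initially cached list is consulted.
     */
    static <R> R callWithFailover(List<String> cachedContexts, Function<String, R> call) {
        RuntimeException last = null;
        for (String context : cachedContexts) {
            try {
                return call.apply(context);
            } catch (RuntimeException e) {
                // Treat the failure as "this server is problematic" and move on.
                last = e;
            }
        }
        throw new IllegalStateException("No cached context could serve the call", last);
    }

    public static void main(String[] args) {
        List<String> cached = Arrays.asList("host1", "host2", "host3");
        String result = callWithFailover(cached, context -> {
            if (context.equals("host1")) {
                throw new IllegalStateException("socket is closed"); // simulated node failure
            }
            return "served by " + context;
        });
        System.out.println(result);
    }
}
```

If every entry in the cached list fails, the model gives up with an exception, because there is no step that goes back to OPMN for a fresh list.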