Recent improvements in Console Performance
- by loren.konkus
Recently, the WebLogic Server development and support organizations have worked with a number of customers to quantify and improve the performance of the Administration Console in large, distributed configurations where there is significant latency in the communications between the administration server and managed servers.
These improvements fall into two categories:
Constraining the amount of time that the Console stalls waiting for communication
Reducing and streamlining the amount of data required for an update
A few releases ago, we added support for a configurable domain-wide mbean "Invocation Timeout" value on the Console's configuration: general, advanced section for a domain. The default value for this setting is 0, which means wait indefinitely and was chosen for compatibility with the behavior of previous releases. This configuration setting applies to all mbean communications between the admin server and managed servers, and is the first line of defense against being blocked by a stalled or completely overloaded managed server. Each site should choose an appropriate timeout value for their environment and network latency.
In the next release of WebLogic Server, we've added an additional console preference, "Management Operation Timeout", to the Console's shared preference page. This setting further constrains how long certain console pages will wait for slowly responding servers before returning partial results. While not all Console pages support this yet, key pages such as the Servers Configuration and Control table pages and the Deployments Control pages have been updated to support this. For example, if a user requests a Servers Table page and a Management Operation Timeout occurs, the table is displayed with both local configuration and remote runtime information from the responding managed servers and only local configuration information for servers that did not yet respond. This means that a troublesome managed server does not impede your ability to manage your domain using the Console.
To support these changes, these Console pages have been re-written to use the Work Management feature of WebLogic Server to interact with each server or deployment concurrently, which further improves the responsiveness of these pages. The basic algorithm for these pages is:
For each configuration mbean (ie, Servers)
populate rows with configuration attributes from the fast, local mbean server
Find a WorkManager
For each server,
Create a Work instance to obtain runtime mbean attributes for the server
Schedule Work instance in the WorkManager
Call WorkManager.waitForAll to wait WorkItems to finish, constrained by Management Operation Timeout
For each WorkItem,
if the runtime information obtained was not complete, add a message indicating which server has incomplete data
Display collected data in table
In addition to these changes to constrain how long the console waits for communication, a number of other changes have been made to reduce the amount and scope of managed server interactions for key pages. For example, in previous releases the Deployments Control table looked at the status of a deployment on every managed server, even those servers that the deployment was not currently targeted on. (This was done to handle an edge case where a deployment's target configuration was changed while it remained running on previously targeted servers.) We decided supporting that edge case did not warrant the performance impact for all, and instead only look at the status of a deployment on the servers it is targeted to. Comprehensive status continues to be available if a user clicks on the 'status' field for a deployment.
Finally, changes have been made to the System Status portlet to reduce its impact on Console page display times. Obtaining health information for this display requires several mbean interactions with managed servers. In previous releases, this mbean interaction occurred with every display, and any delay or impediment in these interactions was reflected in the display time for every page. To reduce this impact, we've made several changes in this portlet:
Using Work Management to obtain health concurrently
Applying the operation timeout configuration to constrain how long we will wait
Caching health information to reduce the cost during rapid navigation from page to page and only obtaining new health information if the previous information is over 30 seconds old.
Eliminating heath collection if this portlet is minimized.
Together, these Console changes have resulted in significant performance improvements for the customers with large configurations and high latency that we have worked with during their development, and some lesser performance improvements for those with small configurations and very fast networks. These changes will be included in the 11g Rel 1 patch set 2 (10.3.3.0) release of WebLogic Server.