Testing fault tolerant code

Posted by Robert on Stack Overflow
Published on 2010-05-03T09:09:27Z

Filed under: fault-tolerance, testing

I’m currently working on a server application where we have agreed to try to maintain a certain level of service. The level of service we want to guarantee is this: if a request is accepted by the server and the server sends an acknowledgement to the client, the request will be carried out, even if the server crashes. As requests can be long running and the acknowledgement time needs to be short, we implement this by persisting the request, then sending an acknowledgement to the client, then carrying out the various actions to fulfill the request. As actions are carried out they too are persisted, so the server knows the state of each request on start-up, and there are also various reconciliation mechanisms with external systems to check the accuracy of our logs.
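To make the persist, acknowledge, act sequence concrete, here is a minimal sketch in Python. It is not our actual implementation: the JSON-lines journal file and the names RequestJournal, accept, run_pending and perform are all assumptions made for illustration.

```python
import json
import os


class RequestJournal:
    """Append-only journal (hypothetical format: one JSON record per line)."""

    def __init__(self, path):
        self.path = path

    def append(self, record):
        # Flush and fsync so the record is on disk before the caller continues.
        with open(self.path, "a", encoding="utf-8") as f:
            f.write(json.dumps(record) + "\n")
            f.flush()
            os.fsync(f.fileno())

    def replay(self):
        records = []
        if not os.path.exists(self.path):
            return records
        with open(self.path, encoding="utf-8") as f:
            for line in f:
                try:
                    records.append(json.loads(line))
                except json.JSONDecodeError:
                    break  # torn final write left by a crash; ignore it
        return records


def accept(journal, request_id, actions):
    """Persist the request first; only then hand back the acknowledgement."""
    journal.append({"type": "accepted", "id": request_id, "actions": actions})
    return {"ack": request_id}


def run_pending(journal, perform):
    """Replay the journal and carry out every action not yet recorded as done.

    Called after accepting a request and again on every start-up.
    """
    accepted, done = {}, set()
    for rec in journal.replay():
        if rec["type"] == "accepted":
            accepted[rec["id"]] = rec["actions"]
        elif rec["type"] == "done":
            done.add((rec["id"], rec["action"]))
    for request_id, actions in accepted.items():
        for action in actions:
            if (request_id, action) not in done:
                perform(request_id, action)
                journal.append(
                    {"type": "done", "id": request_id, "action": action}
                )
```

In this sketch a crash between perform and the following journal.append means the action is repeated on the next start-up, so each action has to be safe to repeat or be caught by reconciliation. The windows between these steps are exactly what a test needs to hit.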

This all seems to work fairly well, but we have difficulty saying so with any conviction, as we find it very hard to test our fault-tolerant code. So far we’ve come up with two strategies, but neither is entirely satisfactory:

1. Have an external process watch the server and try to kill it at what the external process thinks is an appropriate point in the test.
2. Add code to the application that will cause it to crash at certain known critical points.
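The first strategy might look roughly like the sketch below. It assumes the server reports its progress on stdout. The CHILD script is a hypothetical stand-in for the real server, and kill_after is an illustrative name.

```python
import subprocess
import sys

# Hypothetical stand-in for the server: it announces each step on stdout
# and then idles until it is killed.
CHILD = r"""
import time
for step in ("persisted", "acknowledged", "acting"):
    print(step, flush=True)
time.sleep(60)
"""


def kill_after(marker, wait_timeout=10.0):
    """Start the child, read its output until `marker` appears, then kill it.

    Returns the lines the watcher saw and the child's exit code.
    """
    proc = subprocess.Popen(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, text=True
    )
    seen = []
    try:
        for line in proc.stdout:
            seen.append(line.strip())
            if seen[-1] == marker:
                break
    finally:
        proc.kill()  # SIGKILL on POSIX, TerminateProcess on Windows
        proc.wait(timeout=wait_timeout)
        proc.stdout.close()
    return seen, proc.returncode
```

Calling kill_after("acknowledged") stops reading at that line, but the stand-in server may already have moved on to "acting" by the time the kill arrives. The watcher only knows what the server has reported, not where it actually is.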

My problem with the first strategy is that the external process cannot know the exact state of the application, so we cannot be sure we’re hitting the most problematic points in the code. My problem with the second strategy, although it gives more control over where the fault takes place, is that I do not like having fault-injection code within my application, even with optional compilation etc. I fear it would be too easy to overlook a fault injection point and have it slip into a production environment.
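For comparison, the second strategy could be sketched as follows, assuming named crash points that do nothing unless a test harness arms one through an environment variable. The variable name CRASH_AT, the crash_point helper and the stand-in CHILD script are all hypothetical.

```python
import os
import subprocess
import sys

CRASH_ENV = "CRASH_AT"  # hypothetical variable, set only by the test harness

# Hypothetical stand-in for the server, with three named crash points.
CHILD = r"""
import os

def crash_point(name):
    # No-op unless the harness armed exactly this point.
    if os.environ.get("CRASH_AT") == name:
        os._exit(99)  # exit at once, skipping cleanup handlers, like a crash

crash_point("before_persist")
print("persisted", flush=True)
crash_point("after_persist")
print("acknowledged", flush=True)
crash_point("after_ack")
print("completed", flush=True)
"""


def run_with_crash(point=None):
    """Run the child with at most one crash point armed.

    Returns the child's exit code and the steps it reported before exiting.
    """
    env = dict(os.environ)
    env.pop(CRASH_ENV, None)  # never inherit an armed point by accident
    if point is not None:
        env[CRASH_ENV] = point
    result = subprocess.run(
        [sys.executable, "-c", CHILD], env=env, capture_output=True, text=True
    )
    return result.returncode, result.stdout.split()
```

Here run_with_crash("after_persist") stops the child right after the request is persisted and before the acknowledgement, which an external watcher cannot do reliably. The cost is the one described above: the crash_point calls live in the application code, and nothing in this sketch stops the variable from being set in production.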
