I need advice on how to debug a cluster
Posted by alcor on Programmers
Published on 2012-08-23T14:30:15Z
I'm the only developer of a complex, critical software system written in Visual C++ 2005. It's deployed on a classic Microsoft active/passive cluster running Windows Server 2003 R2.
If server A goes down, the other one (B) starts up and takes ownership of its duties.
Some background:
- both servers have the same Microsoft patches/fixes, same hardware, same everything.
- both servers use the same shared storage (a RAID-6 array over Fibre Channel).
- the software has a main module that launches the peripheral modules. If a peripheral module crashes, the main module restarts it.
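To make the last point concrete, here is a rough sketch of that restart behaviour. It is not the real code: `RunRecord`, `supervise`, and the `launch` callback are hypothetical names, and `launch` simply stands in for "start a peripheral module and wait for it to exit". The sketch assumes an exit code of 0 means a clean exit, and it is kept to plain C++ without newer language features.

```cpp
#include <cassert>
#include <cstddef>
#include <iostream>
#include <vector>

// Hypothetical record of one run of a peripheral module.
struct RunRecord {
    int attempt;    // 1-based attempt number
    int exit_code;  // value returned by the launcher
};

// Calls `launch` until it returns 0 (assumed to mean a clean exit) or until
// `max_restarts` restarts have been used up. Every run is recorded so the
// exit codes can be inspected afterwards.
std::vector<RunRecord> supervise(int (*launch)(), int max_restarts) {
    assert(launch != NULL);
    assert(max_restarts >= 0);

    std::vector<RunRecord> history;
    for (int attempt = 1; attempt <= max_restarts + 1; ++attempt) {
        RunRecord record;
        record.attempt = attempt;
        record.exit_code = launch();
        history.push_back(record);

        if (record.exit_code == 0) {
            break;  // clean exit: nothing to restart
        }
        std::cerr << "peripheral module exited with code " << record.exit_code
                  << " on attempt " << attempt << "\n";
    }
    return history;
}
```

Calling `supervise(startModule, 5)` with some `int startModule()` function would run the module at most six times and return the exit code of each run, which is the kind of per-restart record that would show whether the crashes on server B always end with the same code.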
When I switch the application over to one of the two servers (say, server B), two of the main application's peripheral modules crash, apparently for no reason, about 2 seconds after they start.
What could I do to analyze/inspect/resolve this weird situation?