Search Results

Search found 21340 results on 854 pages for 'inter process communicat'.


  • Oracle Coherence, Split-Brain and Recovery Protocols In Detail

    - by Ricardo Ferreira
    This article provides a high-level conceptual overview of Split-Brain scenarios in distributed systems. It focuses on a specific example of cluster communication failure and recovery in Oracle Coherence, including the witness protocol (used to remove failed cluster members) and the panic protocol (used to resolve Split-Brain scenarios). Note that the removal of cluster members does not necessarily indicate a Split-Brain condition. Oracle Coherence does not (and cannot) detect a Split-Brain as it occurs; the condition is only detected when cluster members that previously lost contact with each other regain contact.
    Cluster Topology and Configuration
    To make the discussion concrete, let's assume a cluster topology and configuration. In this example we have a six-member cluster, consisting of one JVM on each physical machine. The member IDs are as follows:
    Member ID  IP Address
    1          10.149.155.76
    2          10.149.155.77
    3          10.149.155.236
    4          10.149.155.75
    5          10.149.155.79
    6          10.149.155.78
    Members 1, 2, and 3 are connected to a switch, and Members 4, 5, and 6 are connected to a second switch. There is a link between the two switches, which provides network connectivity between all of the machines. Member 1 is the first member to join this cluster, thus making it the senior member. Member 6 is the last member to join this cluster. Here is a log snippet from Member 6 showing the complete member set:
    2010-02-26 15:27:57.390/3.062 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=6): Started DefaultCacheServer... 
SafeCluster: Name=cluster:0xDDEB Group{Address=224.3.5.3, Port=35465, TTL=4} MasterMemberSet ( ThisMember=Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) OldestMember=Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ActualMemberSet=MemberSet(Size=6, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) RecycleMillis=120000 RecycleSet=MemberSet(Size=0, BitSetCount=0 ) ) At approximately 15:30, the connection between the two switches is severed: Thirty seconds later (the default packet timeout in development mode) the logs indicate communication failures across the cluster. In this example, the communication failure was caused by a network failure. 
In a production setting, this type of communication failure can have many root causes, including (but not limited to) network failures, excessive GC, high CPU utilization, swapping/virtual memory, and exceeding maximum network bandwidth. In addition, this type of failure is not necessarily indicative of a split brain. Any communication failure will be logged in this fashion. Member 2 logs a communication failure with Member 5: 2010-02-26 15:30:32.638/196.928 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=5, Timestamp=2010-02-26 15:27:49.095, Address=10.149.155.79:8088, MachineId=1103, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:3229, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) ) The Coherence clustering protocol (TCMP) is a reliable transport mechanism built on UDP. In order for the protocol to be reliable, it requires an acknowledgement (ACK) for each packet delivered. If a packet fails to be acknowledged within the configured timeout period, the Coherence cluster member will log a packet timeout (as seen in the log message above). When this occurs, the cluster member will consult with other members to determine who is at fault for the communication failure. If the witness members agree that the suspect member is at fault, the suspect is removed from the cluster. If the witnesses unanimously disagree, the accuser is removed. This process is known as the witness protocol. 
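The witness rule described above can be sketched as a small decision function. This is only an illustration of the rule as the article states it, not Coherence's implementation: the names `Verdict` and `witness_decision` are hypothetical, and treating mixed or missing answers as "undecided" is an assumption, since the article does not say what happens in that case.

```python
from enum import Enum
from typing import Sequence


class Verdict(Enum):
    REMOVE_SUSPECT = "remove suspect"
    REMOVE_ACCUSER = "remove accuser"
    UNDECIDED = "undecided"  # assumption: the article does not cover this case


def witness_decision(confirmations: Sequence[bool]) -> Verdict:
    """Decide who leaves the cluster after a packet timeout.

    Each entry is True if that witness agrees the suspect is unreachable,
    and False if the witness can still reach the suspect.
    """
    if not confirmations:
        return Verdict.UNDECIDED
    if all(confirmations):
        # Witnesses agree with the accuser: the suspect is at fault.
        return Verdict.REMOVE_SUSPECT
    if not any(confirmations):
        # Witnesses unanimously disagree: the accuser is at fault.
        return Verdict.REMOVE_ACCUSER
    return Verdict.UNDECIDED
```

In the walkthrough, Member 3 accuses Member 6 and both of its witnesses (Members 1 and 2) confirm, which corresponds to `witness_decision([True, True])`.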
Since Member 2 cannot communicate with Member 5, it selects two witnesses (Members 1 and 4) to determine if the communication issue is with Member 5 or with itself (Member 2). However, Member 4 is on the switch that is no longer accessible by Members 1, 2, and 3; thus a packet timeout for Member 4 is recorded as well: 2010-02-26 15:30:35.648/199.938 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=2): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) Member 1 is able to confirm the departure of Member 4; Member 6 cannot, as it is also inaccessible. 
At the same time, Member 3 sends a request to remove Member 6, which is followed by a report from Member 3 indicating that Member 6 has departed the cluster: 2010-02-26 15:30:35.706/199.996 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft request for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) 2010-02-26 15:30:35.709/199.999 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=2): MemberLeft notification for Member 6 received from Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) The log for Member 3 determines how Member 6 departed the cluster: 2010-02-26 15:30:35.161/191.694 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=3): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ) 2010-02-26 15:30:35.165/191.698 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=3): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, 
Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) Member(Id=2, Timestamp=2010-02-26 15:27:17.847, Address=10.149.155.77:8088, MachineId=1101, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:296, Role=CoherenceServer) ); removing Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) In this case, Member 3 happened to select two witnesses that it still had connectivity with (Members 1 and 2) thus resulting in a simple decision to remove Member 6. Given the departure of Member 6, Member 2 is left with a single witness to confirm the departure of Member 4: 2010-02-26 15:30:35.713/200.003 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=2): Member departure confirmed by MemberSet(Size=1, BitSetCount=2 Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) ); removing Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) In the meantime, Member 4 logs a missing heartbeat from the senior member. This message is also logged on Members 5 and 6. 2010-02-26 15:30:07.906/150.453 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=PacketListenerN, member=4): Scheduled senior member heartbeat is overdue; rejoining multicast group. 
Next, Member 4 logs a TcpRing failure with Member 2, thus resulting in the termination of Member 2: 2010-02-26 15:30:21.421/163.968 Oracle Coherence GE 3.5.3/465p2 <D4> (thread=Cluster, member=4): TcpRing: Number of socket exceptions exceeded maximum; last was "java.net.SocketTimeoutException: connect timed out"; removing the member: 2 For quick process termination detection, Oracle Coherence utilizes a feature called TcpRing which is a sparse collection of TCP/IP-based connections between different members in the cluster. Each member in the cluster is connected to at least one other member, which (if at all possible) is running on a different physical box. This connection is not used for any data transfer, only heartbeat communications are sent once a second per each link. If a certain number of exceptions are thrown while trying to re-establish a connection, the member throwing the exceptions is removed from the cluster. Member 5 logs a packet timeout with Member 3 and cites witnesses Members 4 and 6: 2010-02-26 15:30:29.791/165.037 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=PacketPublisher, member=5): Timeout while delivering a packet; requesting the departure confirmation for Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ) 2010-02-26 15:30:29.798/165.044 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=5): Member departure confirmed by MemberSet(Size=2, BitSetCount=2 Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, 
Role=CoherenceServer) Member(Id=6, Timestamp=2010-02-26 15:27:58.635, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) ); removing Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer) Eventually we are left with two distinct clusters consisting of Members 1, 2, 3 and Members 4, 5, 6, respectively. In the latter cluster, Member 4 is promoted to senior member. The connection between the two switches is restored at 15:33. Upon the restoration of the connection, the cluster members immediately receive cluster heartbeats from the two senior members. In the case of Members 1, 2, and 3, the following is logged: 2010-02-26 15:33:14.970/369.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): The member formerly known as Member(Id=4, Timestamp=2010-02-26 15:30:35.341, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. Likewise for Members 4, 5, and 6: 2010-02-26 15:33:14.343/336.890 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=4): The member formerly known as Member(Id=1, Timestamp=2010-02-26 15:30:31.64, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) has been forcefully evicted from the cluster, but continues to emit a cluster heartbeat; henceforth, the member will be shunned and its messages will be ignored. This message indicates that a senior heartbeat is being received from members that were previously removed from the cluster, in other words, something that should not be possible. 
For this reason, the recipients of these messages will initially ignore them. After several iterations of these messages, the existence of multiple clusters is acknowledged, thus triggering the panic protocol to reconcile this situation. When the presence of more than one cluster (i.e. Split-Brain) is detected by a Coherence member, the panic protocol is invoked in order to resolve the conflicting clusters and consolidate into a single cluster. The protocol consists of the removal of smaller clusters until there is one cluster remaining. In the case of equal size clusters, the one with the older Senior Member will survive. Member 1, being the oldest member, initiates the protocol: 2010-02-26 15:33:45.970/400.066 Oracle Coherence GE 3.5.3/465p2 <Warning> (thread=Cluster, member=1): An existence of a cluster island with senior Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) containing 3 nodes have been detected. Since this Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) is the senior of an older cluster island, the panic protocol is being activated to stop the other island's senior and all junior nodes that belong to it. 
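The selection rule of the panic protocol (smaller clusters are removed; with equal sizes, the cluster with the older senior member survives) can be sketched as follows. This is a minimal illustration under stated assumptions, not Coherence code: `Island` and `surviving_island` are hypothetical names, and "older" is taken to mean the earlier member timestamp shown in the logs.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable


@dataclass(frozen=True)
class Island:
    """One side of a split cluster (hypothetical model for illustration)."""
    senior_id: int
    senior_joined: datetime  # the senior member's timestamp from the log
    size: int                # number of members in this island


def surviving_island(islands: Iterable[Island]) -> Island:
    """Pick the island that survives: largest first, oldest senior on a tie."""
    return min(islands, key=lambda i: (-i.size, i.senior_joined))


# Values taken from the log excerpts: two islands of three members each.
island_a = Island(1, datetime(2010, 2, 26, 15, 27, 6, 931000), 3)
island_b = Island(4, datetime(2010, 2, 26, 15, 27, 39, 574000), 3)
survivor = surviving_island([island_a, island_b])
```

With equal sizes the tie-break falls to the timestamps, so the island led by Member 1 survives, which matches the log line above in which Member 1 activates the panic protocol.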
Member 3 receives the panic: 2010-02-26 15:33:45.803/382.336 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=3): Received panic from senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer) caused by Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer) Member 4, the senior member of the younger cluster, receives the kill message from Member 3: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. In turn, Member 4 requests the departure of its junior members 5 and 6: 2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:43.343/349.015 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=6): Received a Kill message from a valid Member(Id=4, Timestamp=2010-02-26 15:27:39.574, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer); stopping cluster service. Once Members 4, 5, and 6 restart, they rejoin the original cluster with senior member 1. The log below is from Member 4. Note that it receives a different member id when it rejoins the cluster. 
2010-02-26 15:33:44.921/367.468 Oracle Coherence GE 3.5.3/465p2 <Error> (thread=Cluster, member=4): Received a Kill message from a valid Member(Id=3, Timestamp=2010-02-26 15:27:24.892, Address=10.149.155.236:8088, MachineId=1260, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:32459, Role=CoherenceServer); stopping cluster service. 2010-02-26 15:33:46.921/369.468 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Service Cluster left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:InvocationService, member=4): Service InvocationService left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=OptimisticCache, member=4): Service OptimisticCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=ReplicatedCache, member=4): Service ReplicatedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=DistributedCache, member=4): Service DistributedCache left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Invocation:Management, member=4): Service Management left the cluster 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service Management with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service DistributedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service ReplicatedCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service OptimisticCache with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member 6 left service 
InvocationService with senior member 5 2010-02-26 15:33:47.046/369.593 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=4): Member(Id=6, Timestamp=2010-02-26 15:33:47.046, Address=10.149.155.78:8088, MachineId=1102, Location=process:228, Role=CoherenceServer) left Cluster with senior member 4 2010-02-26 15:33:49.218/371.765 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=main, member=n/a): Restarting cluster 2010-02-26 15:33:49.421/371.968 Oracle Coherence GE 3.5.3/465p2 <D5> (thread=Cluster, member=n/a): Service Cluster joined the cluster with senior service member n/a 2010-02-26 15:33:49.625/372.172 Oracle Coherence GE 3.5.3/465p2 <Info> (thread=Cluster, member=n/a): This Member(Id=5, Timestamp=2010-02-26 15:33:50.499, Address=10.149.155.75:8088, MachineId=1099, Location=process:800, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=1) joined cluster "cluster:0xDDEB" with senior Member(Id=1, Timestamp=2010-02-26 15:27:06.931, Address=10.149.155.76:8088, MachineId=1100, Location=site:usdhcp.oraclecorp.com,machine:dhcp-burlington6-4fl-east-10-149,process:511, Role=CoherenceServer, Edition=Grid Edition, Mode=Development, CpuCount=2, SocketCount=2) Cool isn't it?

    Read the article

  • AIA Release 3.1 available

    - by Hans Viehmann
    Now that Foundation Pack 11g has been on the market for a while, the Process Integration Packs (PIPs) built on top of it have also been released. Alongside the 16 existing PIPs, three new integrations were introduced: Oracle Design-to-Release Integration Pack for Agile Product Lifecycle Management for Process and Oracle Process Manufacturing; Oracle Clinical Trial Payments Integration Pack for Siebel Clinical; and Oracle Serialization and Tracking Integration Pack for Oracle Pedigree and Serialization Manager and Oracle E-Business Suite. The latter are aimed specifically at the healthcare/life sciences market. The release is accompanied not only by a press release (here) but also by a public launch webcast on February 23 titled "Tackling the Challenges of Application Integration". Unfortunately it is aimed more at an American audience and takes place at 10:00 PDT. Anyone willing to trade in their evening television, however, will find the necessary details and the registration link here.

    Read the article

  • Networking Client Server Packet logic (How they communicate)

    - by Trixmix
    I want to understand the logic behind client-server communication through packets in a real-time game. For example, the server sends x packets, then the client receives x packets and processes them. Basically, what is the process that keeps the client and server in sync and able to send and receive packets? A more in-depth example of what I want to know, for the client: step 1, wait for a packet; step 2, read x packets; step 3, process x packets; step 4, send x packets; and so on. I need to know the very basic outline of the communication. The big questions are: 1) Do I send and read packets all at one time, i.e. loop through the incoming packets array list and read them all, or read one every server loop, or something else? 2) In what order should I do things, i.e. first receive, then read, then process, then send, etc.? 3) As asked above, a step-by-step outline of what the server and the client should each do. Thanks!
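One common ordering for the steps asked about above is "drain everything that has arrived, process it, then send once per tick". The sketch below illustrates that ordering only. It is not taken from the post: the names `drain`, `server_tick` and `client_tick` are hypothetical, and in-memory `deque` objects stand in for the network so the example runs on its own.

```python
from collections import deque


def drain(inbox):
    """Return every packet currently queued, without waiting for more."""
    packets = []
    while inbox:
        packets.append(inbox.popleft())
    return packets


def server_tick(state, inbox, outboxes):
    """One server loop iteration: receive all, process all, send once."""
    # 1. Receive: take everything that arrived since the last tick.
    for client_id, move in drain(inbox):
        # 2. Process: apply each input to the authoritative state.
        state[client_id] = state.get(client_id, 0) + move
    # 3. Send: broadcast a single snapshot of the state to every client.
    snapshot = dict(state)
    for outbox in outboxes.values():
        outbox.append(snapshot)


def client_tick(client_id, local_state, inbox, server_inbox, move):
    """One client frame: apply the newest snapshot, then send this frame's input."""
    snapshots = drain(inbox)
    if snapshots:
        # Older snapshots are already stale, so only the last one is applied.
        local_state.clear()
        local_state.update(snapshots[-1])
    server_inbox.append((client_id, move))
```

With real sockets the `deque` objects would be replaced by non-blocking reads, but the order inside each tick would stay the same.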

    Read the article

  • PASS: The Budget Process

    - by Bill Graziano
    Every fiscal year PASS creates a detailed budget. This helps us set priorities and communicate to our members what we’re going to do in the upcoming year. You can review the current budget on the PASS Governance page. That page currently requires you to log in, but I’m talking with HQ to see if there are any legal issues with opening that up.
    The Accounting Team
    The PASS accounting team is two people: the Executive Vice-President of Finance (“EVP”) and the PASS Accounting Manager. Sandy Cherry is the accounting manager and works at PASS HQ. Sandy has been with PASS since we switched management companies in 2007. Throughout this document, when I talk about any actual work related to the budget, that’s all Sandy :) She’s the glue that gets us through this process. Last year we went through 32 iterations of the budget before the Board approved it, so it’s a pretty busy time for us – well, mostly her.
    Fiscal Year
    The PASS fiscal year runs from July 1st through June 30th of the following year. Right now we’re in fiscal year 2011. Our 2010 Summit actually occurred in FY2011. We switched to this schedule from a calendar year in 2006. Our goal was to have the Summit occur early in our fiscal year. That gives us the rest of the year to handle any significant financial impact from the Summit. If registrations are down we can reduce spending. If registrations are up we can decide how much to increase our reserves and how much to spend. Keep in mind that the Summit is budgeted to generate 82% of our revenue this year. How it performs has a significant impact on our financials. The other benefit of this fiscal year is that it matches the Microsoft fiscal year. We sign an annual sponsorship agreement with Microsoft and it’s very helpful that our fiscal years match.
    This year our budget process will probably start in earnest in March or April. I’d like to be done in early June so we can publish before July 1st.
    I was late publishing it this year and I’m trying not to repeat that.
    Our Budget
    Our actual budget is an Excel spreadsheet with 36 sheets. We remove some of those when we publish it since they include salary information. The budget is broken up into various portfolios or departments. We have 20 portfolios. They include chapters, marketing, virtual chapters, etc. Ideally each portfolio is assigned to a Board member. Each portfolio also typically has a staff person assigned to it. Portfolios that aren’t assigned to a Board member are monitored by HQ and the ExecVP-Finance (me). These are typically smaller portfolios such as deferred membership or Summit futures. (More on those in a later post.) All portfolios are reviewed by all Board members during the budget approval process, when interim financials are released internally, and at year-end.
    The Process
    Our first step is to budget revenues. The Board determines a target attendee number. We have formulas based on historical performance that convert that to an overall attendee revenue number. Other revenue projections (such as vendor sponsorships) come from different parts of the organization. I hope to have another post with more details on how we project revenues.
    The next step is to budget expenses. Board members fill out a sample spreadsheet with their budget for the year. They can add line items and notes describing what the amounts are for. Each Board portfolio typically has from 10 to 30 line items. Any new initiatives they want to pursue need to be budgeted. The Summit operations budget is managed by HQ. It includes the cost for food, electrical, internet, etc. Most of these come from our estimate of attendees and our contract with the convention center. During this process the Board can ask for more or less to be spent on various line items.
    For example, if we weren’t happy with the Internet at the last Summit we can ask them to look into different options and/or increase the budget. HQ will also make adjustments to these numbers based on what they see at the events and the feedback we receive on the surveys.
    After we have all the initial estimates we start reviewing the entire budget. It is sent out to the Board and we can see what each portfolio requested and what the overall profit and loss number is. We usually start with too much in expenses and need to cut. In years past the Board started haggling over these numbers as a group. This past year they decided I should take a first cut and present them with a reasonable budget and a list of what I changed. That worked well and I think we’ll continue to do that in the future.
    We go through a number of iterations on the budget. If I remember correctly, we went through 32 iterations before we passed the budget. At each iteration various revenue and expense numbers can change. Keep in mind that the PASS budget has 200+ line items spread over 20 portfolios. Many of these depend on other numbers. For example, if we decide to increase the projected attendees, that cascades through our budget. At each iteration we list what changed and the impact. Ideally these discussions will take place at a face-to-face Board meeting. Many of them also take place over the phone. Board members explain any increase they are asking for while performing due diligence on other budget requests. Eventually a budget emerges and is passed.
    Publishing
    After the budget is passed we create a version without the formulas and salaries for posting on the web site. Sandy also creates some charts to help our members understand the budget. The EVP writes a nice little letter describing some of the changes from last year’s budget. You can see my letter and our budget on the PASS Governance page. And then, eight months later, we start all over again.

    Read the article

  • What happens to other users if the .NET worker process crashes?

    - by Jason Slocomb
    My knowledge of how processes are handled by the ASP.Net worker process is woefully inadequate. I'm hoping some of the experts out there can fill me in. If I crash the worker process with a System.OutOfMemoryException, what would the user experience be for other users who were being served by the same process? Would they get a blank screen? 503 error? I'm going to attempt to test this scenario with some other folks in our lab, but I thought I would float this out there. I will update with our results.

    Read the article

  • Case Management Patterns with Oracle Unified Business Process Management Suite

    - by Ajay Khanna
    Contributed by Heidi Buelow, Oracle Product Management
    Case Management was a hot topic all week at Oracle OpenWorld, so I was excited to share our current features and upcoming plans at the session Thursday morning on Case Management Patterns with Oracle Unified Business Process Management Suite. My colleague, Ravi Rangaswamy, the Case Management Development Manager, and I, Heidi Buelow, the Case Management Product Manager, discussed case management use case patterns with an interested audience. We also talked about the current BPM Suite offering for Case Management and showed a demo of our upcoming release, where Case Management becomes a first-class component in a BPM composite application.
    Case Management use case patterns cover a wide range of horizontal applications such as Accounts Payable, Dispute Resolution, Call Center, and Employee Onboarding, and many vertical applications in domains and industries such as Public Sector services, Insurance claims, and Healthcare. Really, it is any use case where the resolution of a request may require a knowledge worker making decisions using experienced judgement in the current situation. This allows for expedited care and customer satisfaction, both being highly valued for consumer loyalty, regulatory compliance, and efficient resolution.
    Today, BPM Suite provides the tools for creating Case Management applications using BPMN 2.0, Business Rules, and rich BAM and Case Analytics. The Process Composer gives business users the agility to change rules and processes. The case manager and case workers have the flexibility they need. With integrated content management and the concept of a BPM Process Spaces instance (case) space, the current release enables case management use case applications.
    In the next release, Case Management becomes a first-class component. By this we mean that Case is a separate component in the composite.
    We are adding case attributes such as milestones, case events, case stakeholders, and more, providing a rich toolset for the use cases that require a flexible Case Management approach. Activities become available according to the conditions that you specify, and information can be protected by the permissions indicated. In BPM Studio, you design a Case and associate all of the attributes and activities that are needed, yet at runtime you have the flexibility to add and change these as needed. We enjoyed sharing Case Management and it was well received by the audience. The presentation is available online and we have viewlets of the demo that will be available at release time.

    Read the article

  • Oracle Retail Consulting and the Implementation Process, with Maria Porretta

    - by user801960
    Maria Porretta, Engagement Director, discusses Oracle Retail Consulting and its involvement in the implementation process and how it supports customers to maximize their ROI in Oracle Retail solutions. Maria explains the wide range of factors customers need to consider when preparing to implement Oracle Retail, from the solutions being utilized to the current IT infrastructures and available resources of the end user. Oracle Retail Consulting ensures a smooth and efficient process by working with customers from design right through to final implementation, and continues to work with customers to ensure they get value from their software investments and further extend investment in Oracle Retail solutions. Further information is available on our website regarding Oracle Retail Consulting.

    Read the article

  • Application development: method to manage background process

    - by Simon Dubois
    I am developing an application with different behavior depending on the arguments: - "-config" starts a Gtk window to change options, and to start and close the daemon. - "-daemon" starts a background process that does something every X minutes. I already know how to use fork/system/exec etc., but I would like to know the main logic of such an application in order to: - restart or refresh the daemon when the configuration changes. - keep only one instance of the daemon. I have read that killing the daemon to restart it is not a clean way to do it. How do other applications do this? (ubuntuone, weather forecast, RSS feed working with the notification area) Thanks for your help. PS: I don't want to create a system-wide daemon, just a user application with a background process.
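
    One common way to meet the two requirements in the question (a single daemon instance, and a refresh without killing the process) is an exclusive lock on a per-user file plus a SIGHUP handler that asks the daemon to re-read its configuration. The following is a minimal Python sketch of that idea, not the approach of any application named above. The lock path and the names acquire_single_instance_lock, request_reload, install_reload_handler, read_daemon_pid and run_daemon are hypothetical, and the sketch assumes a POSIX system.

```python
import fcntl
import os
import signal
import tempfile
import time

# Hypothetical per-user lock location; any user-writable path would do.
LOCK_PATH = os.path.join(tempfile.gettempdir(), "myapp-daemon.lock")

reload_requested = False


def acquire_single_instance_lock(path=LOCK_PATH):
    """Return an open file holding an exclusive lock, or None if the lock
    is already held (for example by a daemon that is already running)."""
    lock_file = open(path, "a+")
    try:
        fcntl.flock(lock_file, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        lock_file.close()
        return None
    # Record our PID so the "-config" front end knows where to send SIGHUP.
    lock_file.seek(0)
    lock_file.truncate()
    lock_file.write(str(os.getpid()))
    lock_file.flush()
    return lock_file  # keep this object alive for as long as the daemon runs


def read_daemon_pid(path=LOCK_PATH):
    """Read the PID recorded by the running daemon."""
    with open(path) as f:
        return int(f.read().strip())


def request_reload(signum, frame):
    """Signal handler: only set a flag; the main loop does the real work."""
    global reload_requested
    reload_requested = True


def install_reload_handler():
    """Call once from the daemon's main thread before entering the loop."""
    signal.signal(signal.SIGHUP, request_reload)


def run_daemon(load_config, do_work, interval_seconds, max_iterations=None):
    """Run do_work(config) every interval_seconds, re-reading the
    configuration at the start of a cycle if a reload was requested."""
    global reload_requested
    config = load_config()
    done = 0
    while max_iterations is None or done < max_iterations:
        if reload_requested:
            reload_requested = False
            config = load_config()
        do_work(config)
        done += 1
        time.sleep(interval_seconds)
```

    Under these assumptions, the "-daemon" mode would exit immediately when acquire_single_instance_lock() returns None, and the "-config" window would save the new options and then call os.kill(read_daemon_pid(), signal.SIGHUP) instead of killing and restarting the daemon.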

    Read the article

  • watchdog/0 process using all my CPU suddenly

    - by jeffery_the_wind
    I have a fresh installation of Ubuntu 12.04 that I have been running for about a week. Suddenly today I noticed my computer freezes every 5 seconds. I restarted the computer and I still get this. I believe it is a process called watchdog/0 that is using all the resources. See the attached pictures. How can I stop this? I can barely use my computer like this. UPDATE: Well, I just did a cold reboot (shut down, unplug, plug back in, and turn on) and it seems to have fixed it. After looking at the man page for watchdog, it seems that this process may stay on during a restart, so a restart is more like a soft restart? Why that happens I don't know.

    Read the article

  • Using paypal to process credit cards in Sweden through an API [on hold]

    - by Mastikator
    I'm looking for a PayPal API that lets me process credit card payments without redirecting to a PayPal site and without forcing consumers to use a PayPal account. It also needs to work in Sweden. None of the ones I've looked at (dodirectpayment, expresscheckout, paypalpro gateway) let me process credit cards in Sweden via an API that doesn't force the user to visit the PayPal login site. I have a form on my webpage where the user types their credit card number, CVV2, expiration, name, address, etc. I need an API that works in Sweden and simply processes the request, without the step of being redirected to a PayPal website. The ones that I have found only work in a select few countries; is there an international solution? I've already spent over 12 work hours just looking for an API that meets my requirements.

    Read the article

  • How to find out which process actually locks your dll when SharePoint Solution deployment failed

    - by ybbest
    When your SharePoint solution package includes third-party or external DLLs, you will often see the solution fail to deploy because the DLLs are locked. Today I will show you how to find which process locks your DLLs using Process Explorer. 1. Here is an example where the solution fails to deploy because a DLL is locked. 2. Start Process Explorer by double-clicking procexp.exe. 3. From the Find tab, click Find Handle or DLL. 4. Type your DLL name and click Search. 5. I can see all the processes that are using my DLLs at the moment; it looks like IIS, Visual Studio, and the SharePoint Timer service might be the trouble. From my experience, it could be Visual Studio. 6. Close Visual Studio and redeploy the solution; it works like a charm. Search for the DLL again and you can see that Visual Studio is no longer in the results.

    Read the article

  • How to hide process arguments from other users?

    - by poolie
    A while ago, I used to use the grsecurity kernel patches, which had an option to hide process arguments from other non-root users. Basically this just made /proc/*/cmdline be mode 0600, and ps handles that properly by showing that the process exists but not its arguments. This is kind of nice if someone on a multiuser machine is running say vi christmas-presents.txt, to use the canonical example. Is there any supported way to do this in Ubuntu, other than by installing a new kernel? (I'm familiar with the technique that lets individual programs alter their argv, but most programs don't do that and anyhow it is racy. This stackoverflow user seems to be asking the same question, but actually just seems very confused.)
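
    To make concrete what the question is about: on Linux, a process's arguments are exposed as NUL-separated bytes in /proc/<pid>/cmdline, which is where ps gets them. The short Python sketch below only illustrates that exposure; the helper names parse_cmdline and read_cmdline are hypothetical, and it does not hide anything.

```python
def parse_cmdline(raw):
    """Split the NUL-separated bytes of /proc/<pid>/cmdline into arguments.

    Trailing NUL bytes are stripped first, so trailing empty arguments
    (rare in practice) are not preserved by this sketch.
    """
    if not raw:
        return []  # e.g. kernel threads expose an empty cmdline
    parts = raw.rstrip(b"\0").split(b"\0")
    return [part.decode(errors="replace") for part in parts]


def read_cmdline(pid):
    """Return the argument list of a process, or None if the file cannot
    be read (the process is gone, or access is denied)."""
    try:
        with open(f"/proc/{pid}/cmdline", "rb") as f:
            return parse_cmdline(f.read())
    except OSError:
        return None
```

    With the mode 0600 behaviour described above, read_cmdline() for another user's process would be expected to return None for a non-root caller, while the process itself would still be listed.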

    Read the article

  • New release of Oracle Process Accelerators (11.1.1.6.2)

    - by JuergenKress
    Press release delivers industry-focused solutions including Incident Reporting (Public Sector) and Loan Origination (Financial Services). Also updated: Travel Request Management (TRM), Document Routing and Approval (DRA), and Internal Service Requests (ISR). All BPM material is available at our SOA Community Workspace (SOA Community membership required): Presentation | Data Sheet | White paper. Download Process Accelerators and Collateral - OTN. SOA & BPM Partner Community: for regular information on Oracle SOA Suite, become a member of the SOA & BPM Partner Community. For registration, please visit www.oracle.com/goto/emea/soa (OPN account required). If you need support with your account, please contact the Oracle Partner Business Center. Technorati Tags: bpm,process accelerators,SOA Community,Oracle SOA,Oracle BPM,Community,OPN,Jürgen Kress

    Read the article

  • apt-get upgrade E: Sub-process /usr/bin/dpkg returned an error code (1)

    - by user292425
    When I typed apt-get install upgrade, I got this error: Reading package lists... Done Building dependency tree Reading state information... Done 0 upgraded, 0 newly installed, 0 to remove and 0 not upgraded. 1 not fully installed or removed. After this operation, 0 B of additional disk space will be used. Do you want to continue [Y/n]? y Setting up linux-netizen (1.0.1-1) ... chrome: no process found dpkg: error processing linux-netizen (--configure): subprocess installed post-installation script returned error exit status 1 Errors were encountered while processing: linux-netizen E: Sub-process /usr/bin/dpkg returned an error code (1) So I tried some methods to fix it: sudo apt-get install -f and sudo apt-get install --configure -a. But none of them worked. Please help me.

    Read the article

  • Overview of the agile process that I can apply to a startup

    - by Pete2k
    I need to provide a quote to an external client for some software. I'm looking to use agile just for initial requirements building (which I'm experienced in from a developer perspective), but I need to do everything myself, as this is a one-man job. The client is having a hard time working out what their requirements are, and the value I can add will be to sit down with them and work out what they want using user stories etc.; I basically need to be a BA for a little bit. I am looking for a good overview of the procedures to go through in the agile process for building requirements, and a bit about the continuing process further down the line, for example the initial inception through to elaboration of epics and building user stories (or not). I just need to read a bit about it before the meeting so I know the best way to proceed if I spend a day with them. Having additional resources to provide to the client so that we are all on the same page would be useful too.

    Read the article

  • My site will not process Credit Cards [closed]

    - by user654389
    Authorized.net was processing my credit card purchases until the end of February. As of 3/1/2011 they will no longer process electronic cigarette transactions. The processing network told us we would have a seamless transition over to a processor called EPN. Now we cannot process any credit card orders at all. I have been told it's an SSL concern (EPN says no). I have been told it's an issue between Authorized.net and EPN; again I am told no. My site worked and functioned fine until the "seamless" transition took place. Please help me out here. Thanks, Dave

    Read the article

  • Heartbeat won't successfully start up resources from a cold boot when a failed node is present

    - by Matthew
    I currently have two Ubuntu servers running Heartbeat and DRBD. The servers are directly connected with a 1000Mbps crossover cable on eth1 and have access to an IP camera LAN on eth0. Now, let's say that one node is down and the remaining functional node is booting after having been shut down. The node that is still functioning won't start up heartbeat and provide access to the drbd resource from a cold boot. I have to manually restart heartbeat by sudo service heartbeat restart to get everything up and running. How can I get it to start fine from a cold start, when only one server is present? Here is the ha.cf: debug /var/log/ha-debug logfile /var/log/ha-log logfacility none keepalive 2 deadtime 10 warntime 7 initdead 60 ucast eth1 192.168.2.2 ucast eth0 10.1.10.201 node EMserver1 node EMserver2 respawn hacluster /usr/lib/heartbeat/ipfail ping 10.1.10.22 10.1.10.21 10.1.10.11 auto_failback off Some material from the syslog: harc[4604]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/status status mach_down[4632]: 2012/11/27_13:54:49 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[4632]: 2012/11/27_13:54:49 info: mach_down takeover complete for node emserver2. Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: Initial resource acquisition complete (T_RESOURCES(us)) Nov 27 13:54:49 EMserver1 heartbeat: [4586]: info: mach_down takeover complete. IPaddr[4679]: 2012/11/27_13:54:49 INFO: Resource is stopped Nov 27 13:54:49 EMserver1 heartbeat: [4605]: info: Local Resource acquisition completed. 
harc[4713]: 2012/11/27_13:54:49 info: Running /etc/ha.d//rc.d/ip-request-resp ip-request-resp ip-request-resp[4713]: 2012/11/27_13:54:49 received ip-request-resp IPaddr::10.1.10.254 OK yes ResourceManager[4732]: 2012/11/27_13:54:50 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[4759]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 start IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated nic for 10.1.10.254: eth0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: Using calculated netmask for 10.1.10.254: 255.255.255.0 IPaddr[4816]: 2012/11/27_13:54:50 INFO: eval ifconfig eth0:0 10.1.10.254 netmask 255.255.255.0 broadcast 10.1.10.255 IPaddr[4804]: 2012/11/27_13:54:50 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/drbddisk r0 start Filesystem[4965]: 2012/11/27_13:54:50 INFO: Resource is stopped ResourceManager[4732]: 2012/11/27_13:54:50 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 start Filesystem[5039]: 2012/11/27_13:54:50 INFO: Running start for /dev/drbd1 on /shr Filesystem[5033]: 2012/11/27_13:54:51 INFO: Success ResourceManager[4732]: 2012/11/27_13:54:51 info: Running /etc/init.d/nfs-kernel-server start Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: Local Resource acquisition completed. (none) Nov 27 13:55:00 EMserver1 heartbeat: [4586]: info: local resource transition completed. Nov 27 13:57:46 EMserver1 heartbeat: [4586]: info: Heartbeat shutdown in progress. (4586) Nov 27 13:57:46 EMserver1 heartbeat: [5286]: info: Giving up all HA resources. 
ResourceManager[5301]: 2012/11/27_13:57:46 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[5301]: 2012/11/27_13:57:46 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[5372]: 2012/11/27_13:57:46 INFO: Running stop for /dev/drbd1 on /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: Trying to unmount /shr Filesystem[5372]: 2012/11/27_13:57:47 INFO: unmounted /shr successfully Filesystem[5366]: 2012/11/27_13:57:47 INFO: Success ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[5301]: 2012/11/27_13:57:47 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[5509]: 2012/11/27_13:57:47 INFO: ifconfig eth0:0 down IPaddr[5497]: 2012/11/27_13:57:47 INFO: Success Nov 27 13:57:47 EMserver1 heartbeat: [5286]: info: All HA resources relinquished. 
Nov 27 13:57:48 EMserver1 heartbeat: [4586]: info: killing /usr/lib/heartbeat/ipfail process group 4603 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBFIFO process 4589 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4590 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4591 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4592 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4593 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4594 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4595 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4596 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4597 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBWRITE process 4598 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: killing HBREAD process 4599 with signal 15 Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4589 exited. 11 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4596 exited. 10 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4598 exited. 9 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4590 exited. 8 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4595 exited. 7 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4591 exited. 6 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4592 exited. 5 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4593 exited. 4 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4597 exited. 
3 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4594 exited. 2 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: Core process 4599 exited. 1 remaining Nov 27 13:57:49 EMserver1 heartbeat: [4586]: info: emserver1 Heartbeat shutdown complete. Here is some more from the log ResourceManager[2576]: 2012/11/28_16:32:42 info: Acquiring resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server IPaddr[2602]: 2012/11/28_16:32:42 INFO: Running OK Filesystem[2653]: 2012/11/28_16:32:43 INFO: Running OK Nov 28 16:32:52 EMserver1 heartbeat: [1695]: WARN: node emserver2: is dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Dead node emserver2 gave up resources. Nov 28 16:32:52 EMserver1 ipfail: [1807]: info: Status update: Node emserver2 now has status dead Nov 28 16:32:52 EMserver1 heartbeat: [1695]: info: Link emserver2:eth1 dead. Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: NS: We are still alive! Nov 28 16:32:53 EMserver1 ipfail: [1807]: info: Link Status update: Link emserver2/eth1 now has status dead Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Asking other side for ping node count. Nov 28 16:32:55 EMserver1 ipfail: [1807]: info: Checking remote count of ping nodes. Nov 28 16:32:57 EMserver1 heartbeat: [1695]: info: Heartbeat shutdown in progress. (1695) Nov 28 16:32:57 EMserver1 heartbeat: [2734]: info: Giving up all HA resources. 
ResourceManager[2751]: 2012/11/28_16:32:57 info: Releasing resource group: emserver1 IPaddr::10.1.10.254 drbddisk::r0 Filesystem::/dev/drbd1::/shr::ext4 nfs-kernel-server ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/init.d/nfs-kernel-server stop ResourceManager[2751]: 2012/11/28_16:32:57 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd1 /shr ext4 stop Filesystem[2829]: 2012/11/28_16:32:57 INFO: Running stop for /dev/drbd1 on /shr Filesystem[2829]: 2012/11/28_16:32:57 INFO: Trying to unmount /shr Filesystem[2829]: 2012/11/28_16:32:58 INFO: unmounted /shr successfully Filesystem[2823]: 2012/11/28_16:32:58 INFO: Success ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/drbddisk r0 stop ResourceManager[2751]: 2012/11/28_16:32:58 info: Running /etc/ha.d/resource.d/IPaddr 10.1.10.254 stop IPaddr[2971]: 2012/11/28_16:32:58 INFO: ifconfig eth0:0 down IPaddr[2958]: 2012/11/28_16:32:58 INFO: Success Nov 28 16:32:58 EMserver1 heartbeat: [2734]: info: All HA resources relinquished. 
Nov 28 16:32:59 EMserver1 heartbeat: [1695]: info: killing /usr/lib/heartbeat/ipfail process group 1807 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBFIFO process 1777 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1778 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1779 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1780 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1781 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1782 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1783 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1784 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1785 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBWRITE process 1786 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: killing HBREAD process 1787 with signal 15 Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1778 exited. 11 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1779 exited. 10 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1780 exited. 9 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1781 exited. 8 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1782 exited. 7 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1783 exited. 6 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1784 exited. 5 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1785 exited. 4 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1786 exited. 
3 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1787 exited. 2 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: Core process 1777 exited. 1 remaining Nov 28 16:33:01 EMserver1 heartbeat: [1695]: info: emserver1 Heartbeat shutdown complete. If I restart heartbeat at this point, the resources heartbeat controls start up fine. Please help!
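
    The ha.cf quoted in the question is hard to read as a single run of text. The Python sketch below lays the same directives out one per line (the line breaks are an assumption based on where each directive keyword starts) and pulls out the four timing values so they can be compared at a glance. The names parse_ha_cf and timing_summary are hypothetical, and the sketch assumes the timing values are plain integers in seconds, as they are here. It does not diagnose the shutdown shown in the logs.

```python
# The directives from the question, one per line (line breaks assumed).
HA_CF = """\
debug /var/log/ha-debug
logfile /var/log/ha-log
logfacility none
keepalive 2
deadtime 10
warntime 7
initdead 60
ucast eth1 192.168.2.2
ucast eth0 10.1.10.201
node EMserver1
node EMserver2
respawn hacluster /usr/lib/heartbeat/ipfail
ping 10.1.10.22 10.1.10.21 10.1.10.11
auto_failback off
"""


def parse_ha_cf(text):
    """Map each directive name to the list of values it was given.

    Directives such as ucast and node may appear more than once, so every
    name maps to a list, in file order.
    """
    directives = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        name, _, value = line.partition(" ")
        directives.setdefault(name, []).append(value.strip())
    return directives


def timing_summary(directives):
    """Return the timing directives as integers (assumed to be seconds)."""
    names = ("keepalive", "warntime", "deadtime", "initdead")
    return {n: int(directives[n][0]) for n in names if n in directives}
```

    For the configuration above, timing_summary(parse_ha_cf(HA_CF)) gives keepalive 2, warntime 7, deadtime 10 and initdead 60, which are the values to keep in mind when reading the timestamps in the log excerpts.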

    Read the article

  • Apache error: could not make child process 25105 exit, attempting to continue anyway

    - by Temnovit
    Hello! I have a web server based on Ubuntu Server 9.10 with this software: Apache 2, PHP 5.3, MySQL 5, Python 2.5. A few of my websites are PHP-based; a few use Python/Django through mod_wsgi. For a month or so, my Apache server has stopped responding every day until I manually restart it. Error logs show: [Fri Mar 05 17:06:47 2010] [error] could not make child process 25059 exit, attempting to continue anyway [Fri Mar 05 17:06:47 2010] [error] could not make child process 25061 exit, attempting to continue anyway [Fri Mar 05 17:06:47 2010] [error] could not make child process 24930 exit, attempting to continue anyway [Fri Mar 05 17:06:47 2010] [error] could not make child process 25084 exit, attempting to continue anyway [Fri Mar 05 17:06:47 2010] [error] could not make child process 25105 exit, attempting to continue anyway and so on. I tried to google this problem, but I can't find a solution there. How can I determine the cause of this error and how do I fix it? Thank you for your help. 
UPDATE: Updating mod-wsgi to version 3.1 didn't solve the problem. Updating PHP to 5.3 also didn't solve it. Here is a list of all installed modules: core mod_log_config mod_logio prefork http_core mod_so mod_alias mod_auth_basic mod_authn_file mod_authz_default mod_authz_groupfile mod_authz_host mod_authz_user mod_autoindex mod_cgi mod_deflate mod_dir mod_env mod_mime mod_negotiation mod_php5 mod_rewrite mod_setenvif mod_status mod_wsgi Here's how my virtual host with wsgi looks: <VirtualHost *:80> ServerName example.net DocumentRoot /var/www/example.net #wcgi script that serves all the thing WSGIScriptAlias / /var/www/example.net/index.wsgi WSGIDaemonProcess example user=wsgideamonuser group=root processes=1 threads=10 WSGIProcessGroup example Alias /static /var/www/example.net/static #serving admin files Alias /media/ /usr/local/lib/python2.6/dist-packages/django/contrib/admin/media/ <Location "/static"> SetHandler None </Location> <Location "/media"> SetHandler None </Location> ErrorLog /var/www/example.net/error.log </VirtualHost> The error log now contains two types of errors, one following the other: [error] child process 9486 still did not exit, sending a SIGKILL [error] could not make child process 9106 exit, attempting to continue anyway
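
    As a first step toward determining the cause, it can help to see which child PIDs the two error messages keep naming. The Python sketch below tallies them from error-log lines; the function name stuck_children is hypothetical, and the pattern is written only against the two message forms quoted in the question. It summarises the log and does not fix the underlying problem.

```python
import re
from collections import Counter

# Matches the two message forms quoted above and captures the child PID.
CHILD_ERROR = re.compile(
    r"\[error\] (?:could not make child process|child process) (\d+) "
)


def stuck_children(log_lines):
    """Count how often each child PID appears in the exit-failure errors."""
    counts = Counter()
    for line in log_lines:
        match = CHILD_ERROR.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

    The resulting PIDs could then be matched against the process list taken while the server is hung, for example to see whether they belong to the prefork Apache children or to the mod_wsgi daemon process.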

    Read the article
