Search Results

Search found 5597 results on 224 pages for 'worker processes'.

Page 25/224 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >

When a python process is killed on OSX, why doesn't it kill the child processes?

- by Hugh

I found myself getting very confused a while back by some changes that I found when moving Python scripts from Linux over to OSX... On Linux, if a python script has called os.system(), and the calling process is killed, the called process will be killed at the same time. On OSX, however, if the main process is killed, anything that it launched is left behind. Is there something somewhere in OSX/Python where I can change this behaviour? This is causing problems on our render farm, where the processes can be killed from the management GUI, but the top level process is really just a wrapper, so, while the render farm management might think that the process has gone and the machine is freed up for another task, the actual processor-intensive task is still running, which can lead to huge blockages. I know that I could write more logic to catch the kill signal and pass it on to the child processes, but I was hoping that it might be something that could be enabled at a lower level.

Read the article
A button to set all processes to on-hold for Linux?

- by fuenfundachtzig

When Linux starts swapping you're basically doomed. Very soon the system won't react to any input any more, but happily swap on until the end of days... Can you think of a command that holds all processes whatsoever, thus (and while) allowing you to open a clean shell where you can examine the source of the problem and kill the process which ate up all the memory? (I guess this won't be easy, because as the memory is probably completely filled up you'd need to swap out some more memory to gather space for opening a shell, on the other hand all other swapping processes must be stopped.) If you tied such a command to a hot key then maybe you can use this as an emergency button saving you a lot a time. Any ideas if this is possible at all? Has somebody tried something like this before? If one could realize this it would be a cool feature :)

Read the article
How to configure Nginx to serve a variety of back-ends via multiple FCGI processes?

- by Ben Horton

I've seen a lot of tutorials showing one how to set up PHP/Python/Perl/RoR on nginx via various FCGI processes. None of the tutorials that I found show one how to serve multiple FCGI services off one server. How would one configure the stable nginx (nginx-0.7.64) to serve multiple FCGI processes (one for each of the above languages)? Example addresses for each FCGI process are as follows: 127.0.0.1:8080 - PHP 127.0.0.1:8081 - Python 127.0.0.1:8082 - Perl 127.0.0.1:8083 - Ruby on Rails An example configuration file that shows one how to implement multiple FCGI's off one server is really what I need. Perhaps others will benefit as well.

Read the article
Is there a difference in page fault rates between CPU bound and I/O bound processes?

- by user198864

I was thinking, should there be any difference in expectation of the page fault rate on CPU-bound vs I/O bound processes? At first I thought maybe we could, since CPU-bound processes would likely be using more memory accesses per time quantum, so I expect it would move from locality to locality faster. At the same time, the CPU-bound process is probably given a larger working set... but this doesn't affect the fault overhead as it hits a new locality IF this wasn't pre-paged in. Is there actually any real difference in the page fault rates or am I just musing about something nonexistent? And if there is, how would it impact a real-world OS like linux?

Read the article
What sysadmin must do to run OS with damaged /lib/libc.so file ? / rsyslogd daemon logrotation / deny checking list of running processes

- by Virtual_Lotos

What sysadmin must do to run OS with damaged /lib/libc.so file ? In other words, how command interpreter should be configured to be able to run system with corrupted /lib/libc.so file ? Do I have to move it to /var catalog ? Does the command interpreter must be statically compiled or have setuid attribute or perhaps must be a symbolic link to /bin/sh or must be no larger than 2MB ? How to prevent a user from checking list of processes started by another user ? How do I forbid a user to see which processes are running by another user ? What do I have to keep in mind when I want to make rsyslogd daemon logrotation ?

Read the article
How do I see which processes have open TCP/IP ports in Mac OS X?

- by Zubair

How do I see which processes have open TCP/IP ports in Mac OS X?

Read the article
How to keep memory consumption below 500 MB or less than 25 processes at background on netbook?

- by overmann

I bought a netbook yesterday, (I'm loving it) but I will never understand why they need to be a lot of processes running on background. I worry about other users who have no idea about it and continue using their computers with occasional choppiness due to 70 processes on background occupying most of the memory I'd like to keep my memory consumption below 500MB (I have 1 GB) is this possible? What are your ideas for this to work? I always run Microsoft Security Essentials at startup and real time protection, how many features can I disable to reach my goal memory usage?

Read the article
Application running as a service is not able to create the same number of processes as when it runs

- by Pini Reznik

I have a Windows application which creates up to 35 processes and it's working OK when it's running from cmd. But when it is executed as a service on the same machine it is able to create only 20 processes and all other are killed because of some kind of resource exhaustion problem. The problem is persistent on one Windows 2003 server but not reproducible on other servers. Can it be because the system has run out of desktop heap? http://support.microsoft.com/kb/184802 How can I check it?

Read the article
What does it mean for an OS to "execute within user processes"? Do any modern OS's use that approach

- by Chris Cooper

I have recently become interested in operating system, and a friend of mine lent me a book called Operating Systems: Internals and Design Principles (I have the third edition), published in 1998. It's been a very interesting book so far, but I have come to the part dealing with process control, and it's using UNIX System V as one of its examples of an operating system that executes within user processes. This concept has struck me as a little strange. First of all, does this mean that OS instructions and data are stored in each user of the processes? Probably not, because that would be an absurdly redundant scheme. But if not, then what does it mean to "execute within" a user process? Do any modern operating systems use this approach? It seems much more logical to have the operating system execute as its own process, or even independently of all processes, if you're short on memory. All the inter-accessiblilty of process data required for this layout seems to greatly complicate things. (But maybe that's just because I don't quite get the concept ;D) Here is what the book says: "Execution within User Processes: An alternative that is common with operation systems on smaller machines is to execute virtually all operating system software in the context of a user process. ... "

Read the article
Does the number of busy worker threads in the CLR ThreadPool affect performance of I/O threads?

- by andrej351

We have a Windows Service which hosts a number of WCF services and, in an unrelated part of the app, makes extensive use of the TPL Task class to asynchronously do relatively short bits of work. It is my understanding that WCF uses managed I/O threads from the ThreadPool to execute requests. I noticed that after deploying a feature which significantly raised the applications use of Tasks, and as such the use of ThreadPool worker threads as well, performance of a couple of web services has become very slow. We're talking minutes instead of less than a second. The number of Tasks actually trying to run at any one time can range between 20 and 1000, which makes me think that any new (last in) work needing some CPU time could be forced to wait for quite some time. Does the (in my case extremely large) number of busy ThreadPool worker threads affect the ThreadPool's managed I/O threads? Or could these two be connected in any way? Thanks!

Read the article
Oracle Support Master Note for Troubleshooting Advanced Queuing and Oracle Streams Propagation Issues (Doc ID 233099.1)

- by faye.todd(at)oracle.com

Master Note for Troubleshooting Advanced Queuing and Oracle Streams Propagation Issues (Doc ID 233099.1) Copyright (c) 2010, Oracle Corporation. All Rights Reserved. In this Document Purpose Last Review Date Instructions for the Reader Troubleshooting Details 1. Scope and Application 2. Definitions and Classifications 3. How to Use This Guide 4. Basic AQ Propagation Troubleshooting 5. Additional Troubleshooting Steps for AQ Propagation of User-Enqueued and Dequeued Messages 6. Additional Troubleshooting Steps for Propagation in an Oracle Streams Environment 7. Performance Issues References Applies to: Oracle Server - Enterprise Edition - Version: 8.1.7.0 to 11.2.0.2 - Release: 8.1.7 to 11.2Information in this document applies to any platform. Purpose This document presents a step-by-step methodology for troubleshooting and resolving problems with Advanced Queuing Propagation in both Streams and basic Advanced Queuing environments. It also serves as a master reference for other more specific notes on Oracle Streams Propagation and Advanced Queuing Propagation issues. Last Review Date December 20, 2010 Instructions for the Reader A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting. Troubleshooting Details 1. Scope and Application This note is intended for Database Administrators of Oracle databases where issues are being encountered with propagating messages between advanced queues, whether the queues are used for user-created messaging systems or for Oracle Streams. It contains troubleshooting steps and links to notes for further problem resolution.It can also be used a template to document a problem when it is necessary to engage Oracle Support Services. Knowing what is NOT happening can frequently speed up the resolution process by focusing solely on the pertinent problem area. This guide is divided into five parts: Section 2: Definitions and Classifications (discusses the different types and features of propagations possible - helpful for understanding the rest of the guide) Section 3: How to Use this Guide (to be used as a start part for determining the scope of the problem and what sections to consult) Section 4. Basic AQ propagation troubleshooting (applies to both AQ propagation of user enqueued and dequeued messages as well as Oracle Streams propagations) Section 5. Additional troubleshooting steps for AQ propagation of user enqueued and dequeued messages Section 6. Additional troubleshooting steps for Oracle Streams propagation Section 7. Performance issues 2. Definitions and Classifications Given the potential scope of issues that can be encountered with AQ propagation, the first recommended step is to do some basic diagnosis to determine the type of problem that is being encountered. 2.1. What Type of Propagation is Being Used? 2.1.1. Buffered Messaging For an advanced queue, messages can be maintained on disk (persistent messaging) or in memory (buffered messaging). To determine if a queue is buffered or not, reference the GV_$BUFFERED_QUEUES view. If the queue does not appear in this view, it is persistent. 2.1.2. Propagation mode - queue-to-dblink vs queue-to-queue As of 10.2, an AQ propagation can also be defined as queue-to-dblink, or queue-to-queue: queue-to-dblink: The propagation delivers messages or events from the source queue to all subscribing queues at the destination database identified by the dblink. A single propagation schedule is used to propagate messages to all subscribing queues. Hence any changes made to this schedule will affect message delivery to all the subscribing queues. This mode does not support multiple propagations from the same source queue to the same target database. queue-to-queue: Added in 10.2, this propagation mode delivers messages or events from the source queue to a specific destination queue identified on the database link. This allows the user to have fine-grained control on the propagation schedule for message delivery. This new propagation mode also supports transparent failover when propagating to a destination Oracle RAC system. With queue-to-queue propagation, you are no longer required to re-point a database link if the owner instance of the queue fails on Oracle RAC. This mode supports multiple propagations to the same target database if the target queues are different. The default is queue-to-dblink. To verify if queue-to-queue propagation is being used, in non-Streams environments query DBA_QUEUE_SCHEDULES.DESTINATION - if a remote queue is listed along with the remote database link, then queue-to-queue propagation is being used. For Streams environments, the DBA_PROPAGATION.QUEUE_TO_QUEUE column can be checked.See the following note for a method to switch between the two modes:Document 827473.1 How to alter propagation from queue-to-queue to queue-to-dblink 2.1.3. Combined Capture and Apply (CCA) for Streams In 11g Oracle Streams environments, an optimization called Combined Capture and Apply (CCA) is implemented by default when possible. Although a propagation is configured in this case, Streams does not use it; instead it passes information directly from capture to an apply receiver. To see if CCA is in use: COLUMN CAPTURE_NAME HEADING 'Capture Name' FORMAT A30COLUMN OPTIMIZATION HEADING 'CCA Mode?' FORMAT A10SELECT CAPTURE_NAME, DECODE(OPTIMIZATION,0, 'No','Yes') OPTIMIZATIONFROM V$STREAMS_CAPTURE; Also, see the following note:Document 463820.1 Streams Combined Capture and Apply in 11g 2.2. Queue Table Compatibility There are three types of queue table compatibility. In more recent databases, queue tables may be present in all three modes of compatibility: 8.0 - earliest version, deprecated in 10.2 onwards 8.1 - support added for RAC, asynchronous notification, secure queues, queue level access control, rule-based subscribers, separate storage of history information 10.0 - if the database is in 10.1-compatible mode, then the default value for queue table compatibility is 10.0 2.3. Single vs Multiple Consumer Queue Tables If more than one recipient can dequeue a message from a queue, then its queue table is multiple consumer. You can propagate messages from a multiple-consumer queue to a single-consumer queue. Propagation from a single-consumer queue to a multiple-consumer queue is not possible. 3. How to Use This Guide 3.1. Are Messages Being Propagated at All, or is the Propagation Just Slow? Run the following query on the source database for the propagation (assuming that it is running): select TOTAL_NUMBER from DBA_QUEUE_SCHEDULES where QNAME='<source_queue_name>'; If TOTAL_NUMBER is increasing, then propagation is most likely functioning, although it may be slow. For performance issues, see Section 7. 3.2. Propagation Between Persistent User-Created Queues See Sections 4 and 5 (and optionally Section 6 if performance is an issue). 3.3. Propagation Between Buffered User-Created Queues See Sections 4, 5, and 6 (and optionally Section 7 if performance is an issue). 3.4. Propagation between Oracle Streams Queues (without Combined Capture and Apply (CCA) Optimization) See Sections 4 and 6 (and optionally Section 7 if performance is an issue). 3.5. Propagation between Oracle Streams Queues (with Combined Capture and Apply (CCA) Optimization) Although an AQ propagation is not used directly in this case, some characteristics of the message transfer are inferred from the propagation parameters used. Some parts of Sections 4 and 6 still apply. 3.6. Messaging Gateway Propagations This note does not apply to Messaging Gateway propagations. 4. Basic AQ Propagation Troubleshooting 4.1. Double-check Your Code Make sure that you are consistent in your usage of the database link(s) names, queue names, etc. It may be useful to plot a diagram of which queues are connected via which database links to make sure that the logical structure is correct. 4.2. Verify that Job Queue Processes are Running 4.2.1. Versions 10.2 and Lower - DBA_JOBS Package For versions 10.2 and lower, a scheduled propagation is managed by DBMS_JOB package. The propagation is performed by job queue process background processes. Therefore we need to verify that there are sufficient processes available for the propagation process. We should have at least 4 job queue processes running and preferably more depending on the number of other jobs running in the database. It should be noted that for AQ specific work, AQ will only ever use half of the job queue processes available.An issue caused by an inadequate job queue processes parameter setting is described in the following note:Document 298015.1 Kwqjswproc:Excep After Loop: Assigning To Self 4.2.1.1. Job Queue Processes in Initalization Parameter File The parameter JOB_QUEUE_PROCESSES in the init.ora/spfile should be > 0. The value can be changed dynamically via connect / as sysdbaalter system set JOB_QUEUE_PROCESSES=10; 4.2.1.2. Job Queue Processes in Memory The following command will show how many job queue processes are currentlyin use by this instance (this may be different than what is in the init.ora/spfile): connect / as sysdbashow parameter job; 4.2.1.3. OS PIDs Corresponding to Job Queue Processes Identify the operating system process ids (spids) of job queue processes involved in propagation via select p.SPID, p.PROGRAM from V$PROCESS p, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j where s.SID=jr.SID and s.PADDR=p.ADDR and jr.JOB=j.JOBand j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%'; and these SPIDs can be used to check at the operating system level that they exist.In 8i a job queue process will have a name similar to: ora_snp1_<instance_name>.In 9i onwards you will see a coordinator process: ora_cjq0_ and multiple slave processes: ora_jnnn_<instance_name>, where nnn is an integer between 1 and 999. 4.2.2. Version 11.1 and Above - Oracle Scheduler In version 11.1 and above, Oracle Scheduler is used to perform AQ and Streams propagations. Oracle Scheduler automatically tunes the number of slave processes for these jobs based on the load on the computer system, and the JOB_QUEUE_PROCESSES initialization parameter is only used to specify the maximum number of slave processes. Therefore, the JOB_QUEUE_PROCESSES initialization parameter does not need to be set (it defaults to a very high number), unless you want to limit the number of slaves that can be created. If JOB_QUEUE_PROCESSES = 0, no propagation jobs will run.See the following note for a discussion of Oracle Streams 11g and Oracle Scheduler:Document 1083608.1 11g Streams and Oracle Scheduler 4.2.2.1. Job Queue Processes in Initalization Parameter File The parameter JOB_QUEUE_PROCESSES in the init.ora/spfile should be > 0, and preferably be left at its default value. The value can be changed dynamically via connect / as sysdbaalter system set JOB_QUEUE_PROCESSES=10; To set the JOB_QUEUE_PROCESSES parameter to its default value, run: connect / as sysdbaalter system reset JOB_QUEUE_PROCESSES; and then bounce the instance. 4.2.2.2. Job Queue Processes in Memory The following command will show how many job queue processes are currently in use by this instance (this may be different than what is in the init.ora/spfile): connect / as sysdbashow parameter job; 4.2.2.3. OS PIDs Corresponding to Job Queue Processes Identify the operating system process ids (SPIDs) of job queue processes involved in propagation via col PROGRAM for a30select p.SPID, p.PROGRAM, j.JOB_namefrom v$PROCESS p, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j where s.SID=jr.SESSION_ID and s.PADDR=p.ADDRand jr.JOB_name=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%'; and these SPIDs can be used to check at the operating system level that they exist.You will see a coordinator process: ora_cjq0_ and multiple slave processes: ora_jnnn_<instance_name>, where nnn is an integer between 1 and 999. 4.3. Check the Alert Log and Any Associated Trace Files The first place to check for propagation failures is the alert logs at all sites (local and if relevant all remote sites). When a job queue process attempts to execute a schedule and fails it will always write an error stack to the alert log. This error stack will also be written in a job queue process trace file, which will be written to the BACKGROUND_DUMP_DEST location for 10.2 and below, and in the DIAGNOSTIC_DEST location for 11g. The fact that errors are written to the alert log demonstrates that the schedule is executing. This means that the problem could be with the set up of the schedule. In this example the ORA-02068 demonstrates that the failure was at the remote site. Further investigation revealed that the remote database was not open, hence the ORA-03114 error. Starting the database resolved the problem. Thu Feb 14 10:40:05 2002 Propagation Schedule for (AQADM.MULTIPLEQ, SHANE816.WORLD) encountered following error:ORA-04052: error occurred when looking up Remote object [email protected]: error occurred at recursive SQL level 4ORA-02068: following severe error from SHANE816ORA-03114: not connected to ORACLEORA-06512: at "SYS.DBMS_AQADM_SYS", line 4770ORA-06512: at "SYS.DBMS_AQADM", line 548ORA-06512: at line 1 Other potential errors that may be written to the alert log can be found in the following notes:Document 827184.1 AQ Propagation with CLOB data types Fails with ORA-22990 (11.1)Document 846297.1 AQ Propagation Fails : ORA-00600[kope2upic2954] or Ora-00600[Kghsstream_copyn] (10.2, 11.1)Document 731292.1 ORA-25215 Reported on Local Propagation When Using Transformation with ANYDATA queue tables (10.2, 11.1, 11.2)Document 365093.1 ORA-07445 [kwqppay2aqe()+7360] Reported on Propagation of a Transformed Message (10.1, 10.2)Document 219416.1 Advanced Queuing Propagation Fails with ORA-22922 (9.0)Document 1203544.1 AQ Propagation Aborted with ORA-600 [ociksin: invalid status] on SYS.DBMS_AQADM_SYS.AQ$_PROPAGATION_PROCEDURE After Upgrade (11.1, 11.2)Document 1087324.1 ORA-01405 ORA-01422 reported by Advanced Queuing Propagation schedules after RAC reconfiguration (10.2)Document 1079577.1 Advanced Queuing Propagation Fails With "ORA-22370 incorrect usage of method" (9.2, 10.2, 11.1, 11.2)Document 332792.1 ORA-04061 error relating to SYS.DBMS_PRVTAQIP reported when setting up Statspack (8.1, 9.0, 9.2, 10.1)Document 353325.1 ORA-24056: Internal inconsistency for QUEUE <queue_name> and destination <dblink> (8.1, 9.0, 9.2, 10.1, 10.2, 11.1, 11.2)Document 787367.1 ORA-22275 reported on Propagating Messages with LOB component when propagating between 10.1 and 10.2 (10.1, 10.2)Document 566622.1 ORA-22275 when propagating >4K AQ$_JMS_TEXT_MESSAGEs from 9.2.0.8 to 10.2.0.1 (9.2, 10.1)Document 731539.1 ORA-29268: HTTP client error 401 Unauthorized Error when the AQ Servlet attempts to Propagate a message via HTTP (9.0, 9.2, 10.1, 10.2, 11.1)Document 253131.1 Concurrent Writes May Corrupt LOB Segment When Using Auto Segment Space Management (ORA-1555) (9.2)Document 118884.1 How to unschedule a propagation schedule stuck in pending stateDocument 222992.1 DBMS_AQADM.DISABLE_PROPAGATION_SCHEDULE Returns ORA-24082Document 282987.1 Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueDocument 1204080.1 AQ Propagation Failing With ORA-25329 After Upgraded From 8i or 9i to 10g or 11g.Document 1233675.1 AQ Propagation stops after upgrade to 11.2.0.1 ORA-30757 4.3.1. Errors Related to Incorrect Network Configuration The most common propagation errors result from an incorrect network configuration. The list below contains common errors caused by tnsnames.ora file or database links being configured incorrectly: - ORA-12154: TNS:could not resolve service name- ORA-12505: TNS:listener does not currently know of SID given in connect descriptor- ORA-12514: TNS:listener could not resolve SERVICE_NAME - ORA-12541: TNS-12541 TNS:no listener 4.4. Check the Database Links Exist and are Functioning Correctly For schedules to remote databases confirm the database link exists via. SQL> col DBLINK for a45SQL> select QNAME, NVL(REGEXP_SUBSTR(DESTINATION, '[^@]+', 1, 2), DESTINATION) dblink2 from DBA_QUEUE_SCHEDULES3 where MESSAGE_DELIVERY_MODE = 'PERSISTENT';QNAME DBLINK------------------------------ ---------------------------------------------MY_QUEUE ORCL102B.WORLD Connect as the owner of the link and select across it to verify it works and connects to the database we expect. i.e. select * from ALL_QUEUES@ ORCL102B.WORLD; You need to ensure that the userid that scheduled the propagation (using DBMS_AQADM.SCHEDULE_PROPAGATION or DBMS_PROPAGATION_ADM.CREATE_PROPAGATION if using Streams) has access to the database link for the destination. 4.5. Has Propagation Been Correctly Scheduled? Check that the propagation schedule has been created and that a job queue process has been assigned. Look for the entry in DBA_QUEUE_SCHEDULES and SYS.AQ$_SCHEDULES for your schedule. For 10g and below, check that it has a JOBNO entry in SYS.AQ$_SCHEDULES, and that there is an entry in DBA_JOBS with that JOBNO. For 11g and above, check that the schedule has a JOB_NAME entry in SYS.AQ$_SCHEDULES, and that there is an entry in DBA_SCHEDULER_JOBS with that JOB_NAME. Check the destination is as intended and spelled correctly. SQL> select SCHEMA, QNAME, DESTINATION, SCHEDULE_DISABLED, PROCESS_NAME from DBA_QUEUE_SCHEDULES;SCHEMA QNAME DESTINATION S PROCESS------- ---------- ------------------ - -----------AQADM MULTIPLEQ AQ$_LOCAL N J000 AQ$_LOCAL in the destination column shows that the queue to which we are propagating to is in the same database as the source queue. If the propagation was to a remote (different) database, a database link will be in the DESTINATION column. The entry in the SCHEDULE_DISABLED column, N, means that the schedule is NOT disabled. If Y (yes) appears in this column, propagation is disabled and the schedule will not be executed. If not using Oracle Streams, propagation should resume once you have enabled the schedule by invoking DBMS_AQADM.ENABLE_PROPAGATION_SCHEDULE (for 10.2 Oracle Streams and above, the DBMS_PROPAGATION_ADM.START_PROPAGATION procedure should be used). The PROCESS_NAME is the name of the job queue process currently allocated to execute the schedule. This process is allocated dynamically at execution time. If the PROCESS_NAME column is null (empty) the schedule is not currently executing. You may need to execute this statement a number of times to verify if a process is being allocated. If a process is at some time allocated to the schedule, it is attempting to execute. SQL> select SCHEMA, QNAME, LAST_RUN_DATE, NEXT_RUN_DATE from DBA_QUEUE_SCHEDULES;SCHEMA QNAME LAST_RUN_DATE NEXT_RUN_DATE------ ----- ----------------------- ----------------------- AQADM MULTIPLEQ 13-FEB-2002 13:18:57 13-FEB-2002 13:20:30 In 11g, these dates are expressed in TIMESTAMP WITH TIME ZONE datatypes. If the NEXT_RUN_DATE and NEXT_RUN_TIME columns are null when this statement is executed, the scheduled propagation is currently in progress. If they never change it would suggest that the schedule itself is never executing. If the next scheduled execution is too far away, change the NEXT_TIME parameter of the schedule so that schedules are executed more frequently (assuming that the window is not set to be infinite). Parameters of a schedule can be changed using the DBMS_AQADM.ALTER_PROPAGATION_SCHEDULE call. In 10g and below, scheduling propagation posts a job in the DBA_JOBS view. The columns are more or less the same as DBA_QUEUE_SCHEDULES so you just need to recognize the job and verify that it exists. SQL> select JOB, WHAT from DBA_JOBS where WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%';JOB WHAT---- ----------------- 720 next_date := sys.dbms_aqadm.aq$_propaq(job); For 11g, scheduling propagation posts a job in DBA_SCHEDULER_JOBS instead: SQL> select JOB_NAME from DBA_SCHEDULER_JOBS where JOB_NAME like 'AQ_JOB$_%';JOB_NAME------------------------------AQ_JOB$_41 If no job exists, check DBA_QUEUE_SCHEDULES to make sure that the schedule has not been disabled. For 10g and below, the job number is dynamic for AQ propagation schedules. The procedure that is executed to expedite a propagation schedule runs, removes itself from DBA_JOBS, and then reposts a new job for the next scheduled propagation. The job number should therefore always increment unless the schedule has been set up to run indefinitely. 4.6. Is the Schedule Executing but Failing to Complete? Run the following query: SQL> select FAILURES, LAST_ERROR_MSG from DBA_QUEUE_SCHEDULES;FAILURES LAST_ERROR_MSG------------ -----------------------1 ORA-25207: enqueue failed, queue AQADM.INQ is disabled from enqueueingORA-02063: preceding line from SHANE816 The failures column shows how many times we have attempted to execute the schedule and failed. Oracle will attempt to execute the schedule 16 times after which it will be removed from the DBA_JOBS or DBA_SCHEDULER_JOBS view and the schedule will become disabled. The column DBA_QUEUE_SCHEDULES.SCHEDULE_DISABLED will show 'Y'. For 11g and above, the DBA_SCHEDULER_JOBS.STATE column will show 'BROKEN' for the job corresponding to DBA_QUEUE_SCHEDULES.JOB_NAME. Prior to 10g the back off algorithm for failures was exponential, whereas from 10g onwards it is linear. The propagation will become disabled on the 17th attempt. Only the last execution failure will be reflected in the LAST_ERROR_MSG column. That is, if the schedule fails 5 times for 5 different reasons, only the last set of errors will be recorded in DBA_QUEUE_SCHEDULES. Any errors need to be resolved to allow propagation to continue. If propagation has also become disabled due to 17 failures, first resolve the reason for the error and then re-enable the schedule using the DBMS_AQADM.ENABLE_PROPAGATION_SCHEDULE procedure, or DBMS_PROPAGATION_ADM.START_PROPAGATION if using 10.2 or above Oracle Streams. As soon as the schedule executes successfully the error message entries will be deleted. Oracle does not keep a history of past failures. However, when using Oracle Streams, the errors will be retained in the DBA_PROPAGATION view even after the schedule resumes successfully. See the following note for instructions on how to clear out the errors from the DBA_PROPAGATION view:Document 808136.1 How to clear the old errors from DBA_PROPAGATION view?If a schedule is active and no errors are being reported then the source queue may not have any messages to be propagated. 4.7. Do the Propagation Notification Queue Table and Queue Exist? Check to see that the propagation notification queue table and queue exist and are enabled for enqueue and dequeue. Propagation makes use of the propagation notification queue for handling propagation run-time events, and the messages in this queue are stored in a SYS-owned queue table. This queue should never be stopped or dropped and the corresponding queue table never be dropped. 10g and belowThe propagation notification queue table is of the format SYS.AQ$_PROP_TABLE_n, where 'n' is the RAC instance number, i.e. '1' for a non-RAC environment. This queue and queue table are created implicitly when propagation is first scheduled. If propagation has been scheduled and these objects do not exist, try unscheduling and rescheduling propagation. If they still do not exist contact Oracle Support. SQL> select QUEUE_TABLE from DBA_QUEUE_TABLES2 where QUEUE_TABLE like '%PROP_TABLE%' and OWNER = 'SYS';QUEUE_TABLE------------------------------AQ$_PROP_TABLE_1SQL> select NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED2 from DBA_QUEUES where owner='SYS'3 and QUEUE_TABLE like '%PROP_TABLE%';NAME ENQUEUE DEQUEUE------------------------------ ------- -------AQ$_PROP_NOTIFY_1 YES YESAQ$_AQ$_PROP_TABLE_1_E NO NO If the AQ$_PROP_NOTIFY_1 queue is not enabled for enqueue or dequeue, it should be so enabled using DBMS_AQADM.START_QUEUE. However, the exception queue AQ$_AQ$_PROP_TABLE_1_E should not be enabled for enqueue or dequeue.11g and aboveThe propagation notification queue table is of the format SYS.AQ_PROP_TABLE, and is created when the database is created. If they do not exist, contact Oracle Support. SQL> select QUEUE_TABLE from DBA_QUEUE_TABLES2 where QUEUE_TABLE like '%PROP_TABLE%' and OWNER = 'SYS';QUEUE_TABLE------------------------------AQ_PROP_TABLESQL> select NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED2 from DBA_QUEUES where owner='SYS'3 and QUEUE_TABLE like '%PROP_TABLE%';NAME ENQUEUE DEQUEUE------------------------------ ------- -------AQ_PROP_NOTIFY YES YESAQ$_AQ_PROP_TABLE_E NO NO If the AQ_PROP_NOTIFY queue is not enabled for enqueue or dequeue, it should be so enabled using DBMS_AQADM.START_QUEUE. However, the exception queue AQ$_AQ$_PROP_TABLE_E should not be enabled for enqueue or dequeue. 4.8. Does the Remote Queue Exist and is it Enabled for Enqueueing? Check that the remote queue the propagation is transferring messages to exists and is enabled for enqueue: SQL> select DESTINATION from USER_QUEUE_SCHEDULES where QNAME = 'OUTQ';DESTINATION-----------------------------------------------------------------------------"AQADM"."INQ"@M2V102.ESSQL> select OWNER, NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED from [email protected];OWNER NAME ENQUEUE DEQUEUE-------- ------ ----------- -----------AQADM INQ YES YES 4.9. Do the Target and Source Database Charactersets Differ? If a message fails to propagate, check the database charactersets of the source and target databases. Investigate whether the same message can propagate between the databases with the same characterset or it is only a particular combination of charactersets which causes a problem. 4.10. Check the Queue Table Type Agreement Propagation is not possible between queue tables which have types that differ in some respect. One way to determine if this is the case is to run the DBMS_AQADM.VERIFY_QUEUE_TYPES procedure for the two queues that the propagation operates on. If the types do not agree, DBMS_AQADM.VERIFY_QUEUE_TYPES will return '0'.For AQ propagation between databases which have different NLS_LENGTH_SEMANTICS settings, propagation will not work, unless the queues are Oracle Streams ANYDATA queues.See the following notes for issues caused by lack of type agreement:Document 1079577.1 Advanced Queuing Propagation Fails With "ORA-22370: incorrect usage of method"Document 282987.1 Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueDocument 353754.1 Streams Messaging Propagation Fails between Single and Multi-byte Charactersets when using Chararacter Length Semantics in the ADT 4.11. Enable Propagation Tracing 4.11.1. System Level This is set it in the init.ora/spfile as follows: event="24040 trace name context forever, level 10" and restart the instanceThis event cannot be set dynamically with an alter system command until version 10.2: SQL> alter system set events '24040 trace name context forever, level 10'; To unset the event: SQL> alter system set events '24040 trace name context off'; Debugging information will be logged to job queue trace file(s) (jnnn) as propagation takes place. You can check the trace file for errors, and for statements indicating that messages have been sent. For the most part the trace information is understandable. This trace should also be uploaded to Oracle Support if a service request is created. 4.11.2. Attaching to a Specific Process We can also attach to an existing job queue processes that is running a propagation schedule and trace it individually using the oradebug utility, as follows:10.2 and below connect / as sysdbaselect p.SPID, p.PROGRAM from v$PROCESS p, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j where s.SID=jr.SID and s.PADDR=p.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%';-- For the process id (SPID) attach to it via oradebug and generate the following traceoradebug setospid <SPID>oradebug unlimitoradebug Event 10046 trace name context forever, level 12oradebug Event 24040 trace name context forever, level 10-- Trace the process for 5 minutesoradebug Event 10046 trace name context offoradebug Event 24040 trace name context off-- The following command returns the pathname/filename to the file being written tooradebug tracefile_name 11g connect / as sysdbacol PROGRAM for a30select p.SPID, p.PROGRAM, j.JOB_NAMEfrom v$PROCESS p, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j where s.SID=jr.SESSION_ID and s.PADDR=p.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%';-- For the process id (SPID) attach to it via oradebug and generate the following traceoradebug setospid <SPID>oradebug unlimitoradebug Event 10046 trace name context forever, level 12oradebug Event 24040 trace name context forever, level 10-- Trace the process for 5 minutesoradebug Event 10046 trace name context offoradebug Event 24040 trace name context off-- The following command returns the pathname/filename to the file being written tooradebug tracefile_name 4.11.3. Further Tracing The previous tracing steps only trace the job queue process executing the propagation on the source. At times it is useful to trace the propagation receiver process (the session which is enqueueing the messages into the target queue) on the target database which is associated with the job queue process on the source database.These following queries provide ways of identifying the processes involved in propagation so that you can attach to them via oradebug to generate trace information.In order to identify the propagation receiver process you need to execute the query as a user with privileges to access the v$ views in both the local and remote databases so the database link must connect as a user with those privileges in the remote database. The <DBLINK> in the queries should be replaced by the appropriate database link.The queries have two forms due to the differences between operating systems. The value returned by 'Rem Process' is the operating system identifier of the propagation receiver on the remote database. Once identified, this process can be attached to and traced on the remote database using the commands given in Section 4.11.2.10.2 and below - Windows select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from v$PROCESS pl, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SID and s.PADDR=pl.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%' and pl.SPID=substr(sr.PROCESS, instr(sr.PROCESS,':')+1); 10.2 and below - Unix select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SID and s.PADDR=pl.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%' and pl.SPID=sr.PROCESS; 11g - Windows select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SESSION_ID and s.PADDR=pl.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%%' and pl.SPID=substr(sr.PROCESS, instr(sr.PROCESS,':')+1); 11g - Unix select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SESSION_ID and s.PADDR=pl.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%%' and pl.SPID=sr.PROCESS; 5. Additional Troubleshooting Steps for AQ Propagation of User-Enqueued and Dequeued Messages 5.1. Check the Privileges of All Users Involved Ensure that the owner of the database link has the necessary privileges on the aq packages. SQL> select TABLE_NAME, PRIVILEGE from USER_TAB_PRIVS;TABLE_NAME PRIVILEGE------------------------------ ----------------------------------------DBMS_LOCK EXECUTEDBMS_AQ EXECUTEDBMS_AQADM EXECUTEDBMS_AQ_BQVIEW EXECUTEQT52814_BUFFER SELECT Note that when queue table is created, a view called QT<nnn>_BUFFER is created in the SYS schema, and the queue table owner is given SELECT privileges on it. The <nnn> corresponds to the object_id of the associated queue table. SQL> select * from USER_ROLE_PRIVS;USERNAME GRANTED_ROLE ADM DEF OS_------------------------------ ------------------------------ ---- ---- ---AQ_USER1 AQ_ADMINISTRATOR_ROLE NO YES NOAQ_USER1 CONNECT NO YES NOAQ_USER1 RESOURCE NO YES NO It is good practice to configure central AQ administrative user. All admin and processing jobs are created, executed and administered as this user. This configuration is not mandatory however, and the database link can be owned by any existing queue user. If this latter configuration is used, ensure that the connecting user has the necessary privileges on the AQ packages and objects involved. Privileges for an AQ Administrative user Execute on DBMS_AQADM Execute on DBMS_AQ Granted the AQ_ADMINISTRATOR_ROLE Privileges for an AQ user Execute on DBMS_AQ Execute on the message payload Enqueue privileges on the remote queue Dequeue privileges on the originating queue Privileges need to be confirmed on both sites when propagation is scheduled to remote destinations. Verify that the user ID used to login to the destination through the database link has been granted privileges to use AQ. 5.2. Verify Queue Payload Types AQ will not propagate messages from one queue to another if the payload types of the two queues are not verified to be equivalent. An AQ administrator can verify if the source and destination's payload types match by executing the DBMS_AQADM.VERIFY_QUEUE_TYPES procedure. The results of the type checking will be stored in the SYS.AQ$_MESSAGE_TYPES table. This table can be accessed using the object identifier OID of the source queue and the address database link of the destination queue, i.e. [schema.]queue_name[@destination]. Prior to Oracle 9i the payload (message type) had to be the same for all the queue tables involved in propagation. From Oracle9i onwards a transformation can be used so that payloads can be converted from one type to another. The following procedural call made on the source database can verify whether we can propagate between the source and the destination queue tables. connect aq_user1/[email protected] serverout onDECLARErc_value number;BEGINDBMS_AQADM.VERIFY_QUEUE_TYPES(src_queue_name => 'AQ_USER1.Q_1', dest_queue_name => 'AQ_USER2.Q_2',destination => 'dbl_aq_user2.es',rc => rc_value);dbms_output.put_line('rc_value code is '||rc_value);END;/ If propagation is possible then the return code value will be 1. If it is 0 then propagation is not possible and further investigation of the types and transformations used by and in conjunction with the queue tables is required. With regard to comparison of the types the following sql can be used to extract the DDL for a specific type with' %' changed appropriately on the source and target. This can then be compared for the source and target. SET LONG 20000 set pagesize 50 EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM, 'STORAGE',false); SELECT DBMS_METADATA.GET_DDL('TYPE',t.type_name) from user_types t WHERE t.type_name like '%'; EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM, 'DEFAULT'); 5.3. Check Message State and Destination The first step in this process is to identify the queue table associated with the problem source queue. Although you schedule propagation for a specific queue, most of the meta-data associated with that queue is stored in the underlying queue table. The following statement finds the queue table for a given queue (note that this is a multiple-consumer queue table). SQL> select QUEUE_TABLE from DBA_QUEUES where NAME = 'MULTIPLEQ';QUEUE_TABLE --------------------MULTIPLEQTABLE For a small amount of messages in a multiple-consumer queue table, the following query can be run: SQL> select MSG_STATE, CONSUMER_NAME, ADDRESS from AQ$MULTIPLEQTABLE where QUEUE = 'MULTIPLEQ';MSG_STATE CONSUMER_NAME ADDRESS-------------- ----------------------- -------------READY AQUSER2 [email protected] AQUSER1READY AQUSER3 AQADM.INQ In this example we see 2 messages ready to be propagated to remote queues and 1 that is not. If the address column is blank, the message is not scheduled for propagation and can only be dequeued from the queue upon which it was enqueued. The MSG_STATE column values are discussed in Document 102330.1 Advanced Queueing MSG_STATE Values and their Interpretation. If the address column has a value, the message has been enqueued for propagation to another queue. The first row in the example includes a database link (@M2V102.ES). This demonstrates that the message should be propagated to a queue at a remote database. The third row does not include a database link so will be propagated to a queue that resides on the same database as the source queue. The consumer name is the intended recipient at the target queue. Note that we are not querying the base queue table directly; rather, we are querying a view that is available on top of every queue table, AQ$<queue_table_name>.A more realistic query in an environment where the queue table contains thousands of messages is8.0.3-compatible multiple-consumer queue table and all compatibility single-consumer queue tables select count(*), MSG_STATE, QUEUE from AQ$<queue_table_name> group by MSG_STATE, QUEUE; 8.1.3 and 10.0-compatible queue tables select count(*), MSG_STATE, QUEUE, CONSUMER_NAME from AQ$<queue_table_name>group by MSG_STATE, QUEUE, CONSUMER_NAME; For multiple-consumer queue tables, if you did not see the expected CONSUMER_NAME , check the syntax of the enqueue code and verify the recipients are declared correctly. If a recipients list is not used on enqueue, check the subscriber list in the AQ$_<queue_table_name>_S view (note that a single-consumer queue table does not have a subscriber view. This view records all members of the default subscription list which were added using the DBMS_AQADM.ADD_SUBSCRIBER procedure and also those enqueued using a recipient list. SQL> select QUEUE, NAME, ADDRESS from AQ$MULTIPLEQTABLE_S;QUEUE NAME ADDRESS---------- ----------- -------------MULTIPLEQ AQUSER2 [email protected] AQUSER1 In this example we have 2 subscribers registered with the queue. We have a local subscriber AQUSER1, and a remote subscriber AQUSER2, on the queue INQ, owned by AQADM, at M2V102.ES. Unless overridden with a recipient list during enqueue every message enqueued to this queue will be propagated to INQ at M2V102.ES.For 8.1 style and above multiple consumer queue tables, you can also check the following information at the target: select CONSUMER_NAME, DEQ_TXN_ID, DEQ_TIME, DEQ_USER_ID, PROPAGATED_MSGID from AQ$<queue_table_name> where QUEUE = '<QUEUE_NAME>'; For 8.0 style queues, if the queue table supports multiple consumers you can obtain the same information from the history column of the queue table: select h.CONSUMER, h.TRANSACTION_ID, h.DEQ_TIME, h.DEQ_USER, h.PROPAGATED_MSGIDfrom AQ$<queue_table_name> t, table(t.history) h where t.Q_NAME = '<QUEUE_NAME>'; A non-NULL TRANSACTION_ID indicates that the message was successfully propagated. Further, the DEQ_TIME indicates the time of propagation, the DEQ_USER indicates the userid used for propagation, and the PROPAGATED_MSGID indicates the message ID of the message that was enqueued at the destination. 6. Additional Troubleshooting Steps for Propagation in an Oracle Streams Environment 6.1. Is the Propagation Enabled? For a propagation job to propagate messages, the propagation must be enabled. For Streams, a special view called DBA_PROPAGATION exists to convey information about Streams propagations. If messages are not being propagated by a propagation as expected, then the propagation might not be enabled. To query for this: SELECT p.PROPAGATION_NAME, DECODE(s.SCHEDULE_DISABLED, 'Y', 'Disabled','N', 'Enabled') SCHEDULE_DISABLED, s.PROCESS_NAME, s.FAILURES, s.LAST_ERROR_MSGFROM DBA_QUEUE_SCHEDULES s, DBA_PROPAGATION pWHERE p.DESTINATION_DBLINK = NVL(REGEXP_SUBSTR(s.DESTINATION, '[^@]+', 1, 2), s.DESTINATION) AND s.SCHEMA = p.SOURCE_QUEUE_OWNER AND s.QNAME = p.SOURCE_QUEUE_NAME AND MESSAGE_DELIVERY_MODE = 'PERSISTENT' order by PROPAGATION_NAME; At times, the propagation job may become "broken" or fail to start after an error has been encountered or after a database restart. If an error is indicated by the above query, an attempt to disable the propagation and then re-enable it can be made. In the examples below, for the propagation named STRMADMIN_PROPAGATE where the queue name is STREAMS_QUEUE owned by STRMADMIN and the destination database link is ORCL2.WORLD, the commands would be:10.2 and above exec dbms_propagation_adm.stop_propagation('STRMADMIN_PROPAGATE'); exec dbms_propagation_adm.start_propagation('STRMADMIN_PROPAGATE'); If the above does not fix the problem, stop the propagation specifying the force parameter (2nd parameter on stop_propagation) as TRUE: exec dbms_propagation_adm.stop_propagation('STRMADMIN_PROPAGATE',true); exec dbms_propagation_adm.start_propagation('STRMADMIN_PROPAGATE'); The statistics for the propagation as well as any old error messages are cleared when the force parameter is set to TRUE. Therefore if the propagation schedule is stopped with FORCE set to TRUE, and upon restart there is still an error message in DBA_PROPAGATION, then the error message is current.9.2 or 10.1 exec dbms_aqadm.disable_propagation_schedule('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); exec dbms.aqadm.enable_propagation_schedule('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); If the above does not fix the problem, perform an unschedule of propagation and then schedule_propagation: exec dbms_aqadm.unschedule_propagation('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); exec dbms_aqadm.schedule_propagation('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); Typically if the error from the first query in Section 6.1 recurs after restarting the propagation as shown above, further troubleshooting of the error is needed. 6.2. Check Propagation Rule Sets and Transformations Inspect the configuration of the rules in the rule set that is associated with the propagation process to make sure that they evaluate to TRUE as expected. If not, then the object or schema will not be propagated. Remember that when a negative rule evaluates to TRUE, the specified object or schema will not be propagated. Finally inspect any rule-based transformations that are implemented with propagation to make sure they are changing the data in the intended way.The following query shows what rule sets are assigned to a propagation: select PROPAGATION_NAME, RULE_SET_OWNER||'.'||RULE_SET_NAME "Positive Rule Set",NEGATIVE_RULE_SET_OWNER||'.'||NEGATIVE_RULE_SET_NAME "Negative Rule Set"from DBA_PROPAGATION; The next two queries list the propagation rules and their conditions. The first is for the positive rule set, the second is for the negative rule set: set long 4000select rsr.RULE_SET_OWNER||'.'||rsr.RULE_SET_NAME RULE_SET ,rsr.RULE_OWNER||'.'||rsr.RULE_NAME RULE_NAME,r.RULE_CONDITION CONDITION fromDBA_RULE_SET_RULES rsr, DBA_RULES rwhere rsr.RULE_NAME = r.RULE_NAME and rsr.RULE_OWNER = r.RULE_OWNER and RULE_SET_NAME in(select RULE_SET_NAME from DBA_PROPAGATION) order by rsr.RULE_SET_OWNER, rsr.RULE_SET_NAME; set long 4000select c.PROPAGATION_NAME, rsr.RULE_SET_OWNER||'.'||rsr.RULE_SET_NAME RULE_SET ,rsr.RULE_OWNER||'.'||rsr.RULE_NAME RULE_NAME,r.RULE_CONDITION CONDITION fromDBA_RULE_SET_RULES rsr, DBA_RULES r ,DBA_PROPAGATION cwhere rsr.RULE_NAME = r.RULE_NAME and rsr.RULE_OWNER = r.RULE_OWNER andrsr.RULE_SET_OWNER=c.NEGATIVE_RULE_SET_OWNER and rsr.RULE_SET_NAME=c.NEGATIVE_RULE_SET_NAMEand rsr.RULE_SET_NAME in(select NEGATIVE_RULE_SET_NAME from DBA_PROPAGATION) order by rsr.RULE_SET_OWNER, rsr.RULE_SET_NAME; 6.3. Determining the Total Number of Messages and Bytes Propagated As in Section 3.1, determining if messages are flowing can be instructive to see whether the propagation is entirely hung or just slow. If the propagation is not in flow control (see Section 6.5.2), but the statistics are incrementing slowly, there may be a performance issue. For Streams implementations two views are available that can assist with this that can show the number of messages sent by a propagation, as well as the number of acknowledgements being returned from the target site: the V$PROPAGATION_SENDER view at the Source site and the V$PROPAGATION_RECEIVER view at the destination site. It is helpful to query both to determine if messages are being delivered to the target. Look for the statistics to increase.Source: select QUEUE_SCHEMA, QUEUE_NAME, DBLINK,HIGH_WATER_MARK, ACKNOWLEDGEMENT, TOTAL_MSGS, TOTAL_BYTESfrom V$PROPAGATION_SENDER; Target: select SRC_QUEUE_SCHEMA, SRC_QUEUE_NAME, SRC_DBNAME, DST_QUEUE_SCHEMA, DST_QUEUE_NAME, HIGH_WATER_MARK, ACKNOWLEDGEMENT, TOTAL_MSGS from V$PROPAGATION_RECEIVER; 6.4. Check Buffered Subscribers The V$BUFFERED_SUBSCRIBERS view displays information about subscribers for all buffered queues in the instance. This view can be queried to make sure that the site that the propagation is propagating to is listed as a subscriber address for the site being propagated from: select QUEUE_SCHEMA, QUEUE_NAME, SUBSCRIBER_ADDRESS from V$BUFFERED_SUBSCRIBERS; The SUBSCRIBER_ADDRESS column will not be populated when the propagation is local (between queues on the same database). 6.5. Common Streams Propagation Errors 6.5.1. ORA-02082: A loopback database link must have a connection qualifier. This error can occur if you use the Streams Setup Wizard in Oracle Enterprise Manager without first configuring the GLOBAL_NAME for your database. 6.5.2. ORA-25307: Enqueue rate too high. Enable flow control DBA_QUEUE_SCHEDULES will display this informational message for propagation when the automatic flow control (10g feature of Streams) has been invoked.Similar to Streams capture processes, a Streams propagation process can also go into a state of 'flow control. This is an informative message that indicates flow control has been automatically enabled to reduce the rate at which messages are being enqueued into at target queue.This typically occurs when the target site is unable to keep up with the rate of messages flowing from the source site. Other than checking that the apply process is running normally on the target site, usually no action is required by the DBA. Propagation and the capture process will be resumed automatically when the target site is able to accept more messages.The following document contains more information:Document 302109.1 Streams Propagation Error: ORA-25307 Enqueue rate too high. Enable flow controlSee the following document for one potential cause of this situation:Document 1097115.1 Oracle Streams Apply Reader is in 'Paused' State 6.5.3. ORA-25315 unsupported configuration for propagation of buffered messages This error typically occurs when the target database is RAC and usually indicates that an attempt was made to propagate buffered messages with the database link pointing to an instance in the destination database which is not the owner instance of the destination queue. To resolve the problem, use queue-to-queue propagation for buffered messages. 6.5.4. ORA-600 [KWQBMCRCPTS101] after dropping / recreating propagation For cause/fixes refer to:Document 421237.1 ORA-600 [KWQBMCRCPTS101] reported by a Qmon slave process after dropping a Streams Propagation 6.5.5. Stopping or Dropping a Streams Propagation Hangs See the following note:Document 1159787.1 Troubleshooting Streams Propagation When It is Not Functioning and Attempts to Stop It Hang 6.6. Streams Propagation-Related Notes for Common Issues Document 437838.1 Streams Specific PatchesDocument 749181.1 How to Recover Streams After Dropping PropagationDocument 368912.1 Queue to Queue Propagation Schedule encountered ORA-12514 in a RAC environmentDocument 564649.1 ORA-02068/ORA-03114/ORA-03113 Errors From Streams Propagation Process - Remote Database is Available and Unschedule/Reschedule Does Not ResolveDocument 553017.1 Stream Propagation Process Errors Ora-4052 Ora-6554 From 11g To 10201Document 944846.1 Streams Propagation Fails Ora-7445 [kohrsmc]Document 745601.1 ORA-23603 'STREAMS enqueue aborted due to low SGA' Error from Streams Propagation, and V$STREAMS_CAPTURE.STATE Hanging on 'Enqueuing Message'Document 333068.1 ORA-23603: Streams Enqueue Aborted Eue To Low SGADocument 363496.1 Ora-25315 Propagating on RAC StreamsDocument 368237.1 Unable to Unschedule Propagation. Streams Queue is InvalidDocument 436332.1 dbms_propagation_adm.stop_propagation hangsDocument 727389.1 Propagation Fails With ORA-12528Document 730911.1 ORA-4063 Is Reported After Dropping Negative Prop.RulesetDocument 460471.1 Propagation Blocked by Qmon Process - Streams_queue_table / 'library cache lock' waitsDocument 1165583.1 ORA-600 [kwqpuspse0-ack] In Streams EnvironmentDocument 1059029.1 Combined Capture and Apply (CCA) : Capture aborts : ORA-1422 after schedule_propagationDocument 556309.1 Changing Propagation/ queue_to_queue : false -> true does does not work; no LCRs propagatedDocument 839568.1 Propagation failing with error: ORA-01536: space quota exceeded for tablespace ''Document 311021.1 Streams Propagation Process : Ora 12154 After Reboot with Transparent Application Failover TAF configuredDocument 359971.1 STREAMS propagation to Primary of physical Standby configuation errors with Ora-01033, Ora-02068Document 1101616.1 DBMS_PROPAGATION_ADM.DROP_PROPAGATION FAILS WITH ORA-1747 7. Performance Issues A propagation may seem to be slow if the queries from Sections 3.1 and 6.3 show that the message statistics are not changing quickly. In Oracle Streams, this more usually is due to a slow apply process at the target rather than a slow propagation. Propagation could be inferred to be slow if the message statistics are changing, and the state of a capture process according to V$STREAMS_CAPTURE.STATE is PAUSED FOR FLOW CONTROL, but an ORA-25307 'Enqueue rate too high. Enable flow control' warning is NOT observed in DBA_QUEUE_SCHEDULES per Section 6.5.2. If this is the case, see the following notes / white papers for suggestions to increase performance:Document 335516.1 Master Note for Streams Performance RecommendationsDocument 730036.1 Overview for Troubleshooting Streams Performance IssuesDocument 780733.1 Streams Propagation Tuning with Network ParametersWhite Paper: http://www.oracle.com/technetwork/database/features/availability/maa-wp-10gr2-streams-performance-130059.pdfWhite Paper: Oracle Streams Configuration Best Practices: Oracle Database 10g Release 10.2, http://www.oracle.com/technetwork/database/features/availability/maa-10gr2-streams-configuration-132039.pdf, See APPENDIX A: USING STREAMS CONFIGURATIONS OVER A NETWORKFor basic AQ propagation, the network tuning in the aforementioned Appendix A of the white paper 'Oracle Streams Configuration Best Practices: Oracle Database 10g Release 10.2' is applicable. References NOTE:102330.1 - Advanced Queueing MSG_STATE Values and their InterpretationNOTE:102771.1 - Advanced Queueing Propagation using PL/SQLNOTE:1059029.1 - Combined Capture and Apply (CCA) : Capture aborts : ORA-1422 after schedule_propagationNOTE:1079577.1 - Advanced Queuing Propagation Fails With "ORA-22370: incorrect usage of method"NOTE:1083608.1 - 11g Streams and Oracle SchedulerNOTE:1087324.1 - ORA-01405 ORA-01422 reported by Adavanced Queueing Propagation schedules after RAC reconfigurationNOTE:1097115.1 - Oracle Streams Apply Reader is in 'Paused' StateNOTE:1101616.1 - DBMS_PROPAGATION_ADM.DROP_PROPAGATION FAILS WITH ORA-1747NOTE:1159787.1 - Troubleshooting Streams Propagation When It is Not Functioning and Attempts to Stop It HangNOTE:1165583.1 - ORA-600 [kwqpuspse0-ack] In Streams EnvironmentNOTE:118884.1 - How to unschedule a propagation schedule stuck in pending stateNOTE:1203544.1 - AQ PROPAGATION ABORTED WITH ORA-600[OCIKSIN: INVALID STATUS] ON SYS.DBMS_AQADM_SYS.AQ$_PROPAGATION_PROCEDURE AFTER UPGRADENOTE:1204080.1 - AQ Propagation Failing With ORA-25329 After Upgraded From 8i or 9i to 10g or 11g.NOTE:219416.1 - Advanced Queuing Propagation fails with ORA-22922NOTE:222992.1 - DBMS_AQADM.DISABLE_PROPAGATION_SCHEDULE Returns ORA-24082NOTE:253131.1 - Concurrent Writes May Corrupt LOB Segment When Using Auto Segment Space Management (ORA-1555)NOTE:282987.1 - Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueNOTE:298015.1 - Kwqjswproc:Excep After Loop: Assigning To SelfNOTE:302109.1 - Streams Propagation Error: ORA-25307 Enqueue rate too high. Enable flow controlNOTE:311021.1 - Streams Propagation Process : Ora 12154 After Reboot with Transparent Application Failover TAF configuredNOTE:332792.1 - ORA-04061 error relating to SYS.DBMS_PRVTAQIP reported when setting up StatspackNOTE:333068.1 - ORA-23603: Streams Enqueue Aborted Eue To Low SGANOTE:335516.1 - Master Note for Streams Performance RecommendationsNOTE:353325.1 - ORA-24056: Internal inconsistency for QUEUE and destination NOTE:353754.1 - Streams Messaging Propagation Fails between Single and Multi-byte Charactersets when using Chararacter Length Semantics in the ADT.NOTE:359971.1 - STREAMS propagation to Primary of physical Standby configuation errors with Ora-01033, Ora-02068NOTE:363496.1 - Ora-25315 Propagating on RAC StreamsNOTE:365093.1 - ORA-07445 [kwqppay2aqe()+7360] reported on Propagation of a Transformed MessageNOTE:368237.1 - Unable to Unschedule Propagation. Streams Queue is InvalidNOTE:368912.1 - Queue to Queue Propagation Schedule encountered ORA-12514 in a RAC environmentNOTE:421237.1 - ORA-600 [KWQBMCRCPTS101] reported by a Qmon slave process after dropping a Streams PropagationNOTE:436332.1 - dbms_propagation_adm.stop_propagation hangsNOTE:437838.1 - Streams Specific PatchesNOTE:460471.1 - Propagation Blocked by Qmon Process - Streams_queue_table / 'library cache lock' waitsNOTE:463820.1 - Streams Combined Capture and Apply in 11gNOTE:553017.1 - Stream Propagation Process Errors Ora-4052 Ora-6554 From 11g To 10201NOTE:556309.1 - Changing Propagation/ queue_to_queue : false -> true does does not work; no LCRs propagatedNOTE:564649.1 - ORA-02068/ORA-03114/ORA-03113 Errors From Streams Propagation Process - Remote Database is Available and Unschedule/Reschedule Does Not ResolveNOTE:566622.1 - ORA-22275 when propagating >4K AQ$_JMS_TEXT_MESSAGEs from 9.2.0.8 to 10.2.0.1NOTE:727389.1 - Propagation Fails With ORA-12528NOTE:730036.1 - Overview for Troubleshooting Streams Performance IssuesNOTE:730911.1 - ORA-4063 Is Reported After Dropping Negative Prop.RulesetNOTE:731292.1 - ORA-25215 Reported On Local Propagation When Using Transformation with ANYDATA queue tablesNOTE:731539.1 - ORA-29268: HTTP client error 401 Unauthorized Error when the AQ Servlet attempts to Propagate a message via HTTPNOTE:745601.1 - ORA-23603 'STREAMS enqueue aborted due to low SGA' Error from Streams Propagation, and V$STREAMS_CAPTURE.STATE Hanging on 'Enqueuing Message'NOTE:749181.1 - How to Recover Streams After Dropping PropagationNOTE:780733.1 - Streams Propagation Tuning with Network ParametersNOTE:787367.1 - ORA-22275 reported on Propagating Messages with LOB component when propagating between 10.1 and 10.2NOTE:808136.1 - How to clear the old errors from DBA_PROPAGATION view ?NOTE:827184.1 - AQ Propagation with CLOB data types Fails with ORA-22990NOTE:827473.1 - How to alter propagation from queue_to_queue to queue_to_dblinkNOTE:839568.1 - Propagation failing with error: ORA-01536: space quota exceeded for tablespace ''NOTE:846297.1 - AQ Propagation Fails : ORA-00600[kope2upic2954] or Ora-00600[Kghsstream_copyn]NOTE:944846.1 - Streams Propagation Fails Ora-7445 [kohrsmc]

Read the article
Is there any danger in disabling windows firewall on a azure worker role?

- by NullReference

I'm trying to troubleshoot a bug on our Azure worker role where we occasionally get the error "Unable to read data from the transport connection: An established connection was aborted by the software in your host machine". This error occurs when we are connecting to outside resources like google auth servers. A few people have recommended disabling the firewall\antivirus on the server. I'm just wondering what kind of security risk we would take by doing this. The server doesn't have iis installed but would it be vulnerable to hacking without the firewall? Thanks

Read the article
Is it possible to install and use a printer in Azure Worker Roles?

- by lurkerbelow

I'd like to know if it's possible to create and use a printer in Azure Worker Roles. I do know that I can install printers with basic batch commands. So I could define a batch script that would run as a startup task. something like: rundll32 printui.dll,PrintUIEntry /if /b "printer" /f %windir%\inf\ntprint.inf /r "file:" /m "printername") Question: Can I use a printer to print to file, maybe to local storage? I do need the function to print to file or atleast to have a printer installed, because I have to get the PCL output from different installed printers. Sadly I can not test it by myself. I do not have a CC to join the 90 days trial.

Read the article
What's the right way to kill child processes in perl before exiting?

- by rarbox

I'm running an IRC Bot (Bot::BasicBot) which has two child processes running File::Tail but when exiting, they don't terminate. So I'm killling them using Proc::ProcessTable like this before exiting: my $parent=$$; my $proc_table=Proc::ProcessTable->new(); for my $proc (@{$proc_table->table()}) { kill(15, $proc->pid) if ($proc->ppid == $parent); } It works but I get this warning: 14045: !!! Child process PID:14047 reaped: 14045: !!! Child process PID:14048 reaped: 14045: !!! Your program may not be using sig_child() to reap processes. 14045: !!! In extreme cases, your program can force a system reboot 14045: !!! if this resource leakage is not corrected. What else can I do to kill child processes? The forked process is created using the forkit method in Bot::BasicBot.

Read the article
Log out of an SSH session into Erlang VM without stopping the VM or leaving stale processes

- by Garret Smith

I have an Erlang application running as a daemon, configured as an SSH server. I can connect to it with an SSH client and I get the standard Erlang REPL. If I 'q().' I shut down the Erlang VM, not the connection. If I close the connection ('~.' for OpenSSH, close the window in PuTTY) some processes remain under the sshd_sup/ssh_system_xx_sup tree. These appear to be stale shell processes. I do not see any exported function in the shell module that would let me shut down the shell (and therefore the SSH connection) without affecting the entire VM. How should I be logging out of the SSH session to not leave stale processes in the VM?

Read the article
How to hold a queue of messages and have a group of working threads without polling?

- by Mark

I have a workflow that I want to looks something like this: / Worker 1 \ =Request Channel= - [Holding Queue|||] - Worker 2 - =Response Channel= \ Worker 3 / That is: Requests come in and they enter a FIFO queue Identical workers then pick up tasks from the queue At any given time any worker may work only one task When a worker is free and the holding queue is non-empty the worker should immediately pick up another task When tasks are complete, a worker places the result on the Response Channel I know there are QueueChannels in Spring Integration, but these channels require polling (which seems suboptimal). In particular, if a worker can be busy, I'd like the worker to be busy. Also, I've considered avoiding the queue altogether and simply letting tasks round-robin to all workers, but it's preferable to have a single waiting line as some tasks may be accomplished faster than others. Furthermore, I'd like insight into how many jobs are remaining (which I can get from the queue) and the ability to cancel all or particular jobs. How can I implement this message queuing/work distribution pattern while avoiding a polling? Edit: It appears I'm looking for the Message Dispatcher pattern -- how can I implement this using Spring/Spring Integration?

Read the article
Unexpected start of already-primary server processes when heartbeat on secondary is stopped.

- by vorik

Hi, I've got an active-passive Heartbeat cluster with Apache, MySQL, ActiveMQ and DRBD. Today, I wanted to perform hardware-maintenance on the secondary node (node04), so I stopped the heartbeat service before shutting it down. Then, the primary node (node03) received a shutdown notice from the secondary node (node04). This logging comes from the primary node: node03 heartbeat[4458]: 2010/03/08_08:52:56 info: Received shutdown notice from 'node04.companydomain.nl'. heartbeat[4458]: 2010/03/08_08:52:56 info: Resources being acquired from node04.companydomain.nl. harc[27522]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/status status heartbeat[27523]: 2010/03/08_08:52:56 info: Local Resource acquisition completed. mach_down[27567]: 2010/03/08_08:52:56 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[27567]: 2010/03/08_08:52:56 info: mach_down takeover complete for node node04.companydomain.nl. heartbeat[4458]: 2010/03/08_08:52:56 info: mach_down takeover complete. harc[27620]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp ip-request-resp[27620]: 2010/03/08_08:52:56 received ip-request-resp drbddisk OK yes ResourceManager[27645]: 2010/03/08_08:52:56 info: Acquiring resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:56 info: Running /etc/ha.d/resource.d/drbddisk start Filesystem[27700]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/mysql start mysql[27783]: 2010/03/08_08:52:57 Starting MySQL[ OK ] apache[27853]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/monitor start monitor[28160]: 2010/03/08_08:52:58 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq start activemq[28210]: 2010/03/08_08:52:58 Starting ActiveMQ Broker... ActiveMQ Broker is already running. ResourceManager[27645]: 2010/03/08_08:52:58 ERROR: Return code 1 from /etc/ha.d/resource.d/activemq ResourceManager[27645]: 2010/03/08_08:52:58 CRIT: Giving up resources due to failure of activemq ResourceManager[27645]: 2010/03/08_08:52:58 info: Releasing resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/IPaddr 1.2.3.212 stop IPaddr[28329]: 2010/03/08_08:52:58 INFO: ifconfig eth0:0 down IPaddr[28312]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28378]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28433]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/tivoli-cluster stop ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq stop activemq[28503]: 2010/03/08_08:53:01 Stopping ActiveMQ Broker... Stopped ActiveMQ Broker. ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/monitor stop monitor[28681]: 2010/03/08_08:53:01 ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master stop LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster down LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncbackup up LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster released ResourceManager[27645]: 2010/03/08_08:53:02 info: Running /etc/ha.d/resource.d/apache /etc/httpd/conf/httpd.conf stop apache[28782]: 2010/03/08_08:53:03 INFO: Killing apache PID 18390 apache[28782]: 2010/03/08_08:53:03 INFO: apache stopped. apache[28771]: 2010/03/08_08:53:03 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:03 info: Running /etc/ha.d/resource.d/mysql stop mysql[28851]: 2010/03/08_08:53:24 Shutting down MySQL.....................[ OK ] ResourceManager[27645]: 2010/03/08_08:53:24 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop Filesystem[29010]: 2010/03/08_08:53:25 INFO: Running stop for /dev/drbd0 on /data Filesystem[29010]: 2010/03/08_08:53:25 INFO: Trying to unmount /data Filesystem[29010]: 2010/03/08_08:53:25 ERROR: Couldn't unmount /data; trying cleanup with SIGTERM Filesystem[29010]: 2010/03/08_08:53:25 INFO: Some processes on /data were signalled Filesystem[29010]: 2010/03/08_08:53:27 INFO: unmounted /data successfully Filesystem[28999]: 2010/03/08_08:53:27 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:27 info: Running /etc/ha.d/resource.d/drbddisk stop heartbeat[4458]: 2010/03/08_08:53:29 WARN: node node04.companydomain.nl: is dead heartbeat[4458]: 2010/03/08_08:53:29 info: Dead node node04.companydomain.nl gave up resources. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth0 dead. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth1 dead. hb_standby[29193]: 2010/03/08_08:53:57 Going standby [foreign]. heartbeat[4458]: 2010/03/08_08:53:57 info: node03.companydomain.nl wants to go standby [foreign] Soo... What just happened here??? Heartbeat on node04 stopped and told node03, which was the active node at the time. Somehow, node03 decided to start the cluster processes that were already running. (For the processes that are not critical, I always return a 0 from the startupscript so it does not stops the entire cluster when a non-essential part fails.) When starting ActiveMQ, it returns status 1 because it is already running. This fails the node and shuts everything down. As heartbeat is not running on the secondary node, it cannot failover to there. When I tried to run ha_takeover to restart the resources, absolutely nothing happened. Only after I restarted heartbeat on the primary node the resources could be started (after a delay of 2 minutes). These are my questions: Why does heartbeat on the primary node try to start the cluster processes again? Why did ha_takeover not work? What can I do to prevent this from happening? Server configuration: DRBD: version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-20 09:14:48 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r---- ns:0 nr:6459432 dw:6459432 dr:0 al:0 bm:301 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 uname -a Linux node04 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux haresources node03.companydomain.nl \ drbddisk \ Filesystem::/dev/drbd0::/data::ext3 \ mysql \ apache::/etc/httpd/conf/httpd.conf \ LVSSyncDaemonSwap::master \ monitor \ activemq \ tivoli-cluster \ MailTo::[email protected]::DRBDFailureDrisAcc \ MailTo::[email protected]::DRBDFailureDrisAcc \ 1.2.3.212 ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log keepalive 500ms deadtime 30 warntime 10 initdead 120 udpport 694 mcast eth0 225.0.0.3 694 1 0 mcast eth1 225.0.0.4 694 1 0 auto_failback off node node03.companydomain.nl node node04.companydomain.nl respawn hacluster /usr/lib64/heartbeat/dopd apiauth dopd gid=haclient uid=hacluster Thank you very much in advance, Ger Apeldoorn

Read the article
Using a mounted NTFS share with nginx

- by Hoff

I have set up a local testing VM with Ubuntu Server 12.04 LTS and the LEMP stack. It's kind of an unconventional setup because instead of having all my PHP scripts on the local machine, I've mounted an NTFS share as the document root because I do my development on Windows. I had everything working perfectly up until this morning, now I keep getting a dreaded 'File not found.' error. I am almost certain this must be somehow permission related, because if I copy my site over to /var/www, nginx and php-fpm have no problems serving my PHP scripts. What I can't figure out is why all of a sudden (after a reboot of the server), no PHP files will be served but instead just the 'File not found.' error. Static files work fine, so I think it's PHP that is causing the headache. Both nginx and php-fpm are configured to run as the user www-data: root@ubuntu-server:~# ps aux | grep 'nginx\|php-fpm' root 1095 0.0 0.0 5816 792 ? Ss 11:11 0:00 nginx: master process /opt/nginx/sbin/nginx -c /etc/nginx/nginx.conf www-data 1096 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process www-data 1098 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process root 1130 0.0 0.4 175560 4212 ? Ss 11:11 0:00 php-fpm: master process (/etc/php5/php-fpm.conf) www-data 1131 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1132 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1133 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www root 1686 0.0 0.0 4368 816 pts/1 S+ 11:11 0:00 grep --color=auto nginx\|php-fpm I have mounted the NTFS share at /mnt/webfiles by editing /etc/fstab and adding the following line: //192.168.0.199/c$/Websites/ /mnt/webfiles cifs username=Jordan,password=mypasswordhere,gid=33,uid=33 0 0 Where gid 33 is the www-data group and uid 33 is the user www-data. If I list the contents of one of my sites you can in fact see that they belong to the user www-data: root@ubuntu-server:~# ls -l /mnt/webfiles/nTv5-2.0 total 8 drwxr-xr-x 0 www-data www-data 0 Jun 6 19:12 app drwxr-xr-x 0 www-data www-data 0 Aug 22 19:00 assets -rwxr-xr-x 0 www-data www-data 1150 Jan 4 2012 favicon.ico -rwxr-xr-x 0 www-data www-data 1412 Dec 28 2011 index.php drwxr-xr-x 0 www-data www-data 0 Jun 3 16:44 lib drwxr-xr-x 0 www-data www-data 0 Jan 3 2012 plugins drwxr-xr-x 0 www-data www-data 0 Jun 3 16:45 vendors If I switch to the www-data user, I have no problem creating a new file on the share: root@ubuntu-server:~# su www-data $ > /mnt/webfiles/test.txt $ ls -l /mnt/webfiles | grep test\.txt -rwxr-xr-x 0 www-data www-data 0 Sep 8 11:19 test.txt There should be no problem reading or writing to the share with php-fpm running as the user www-data. When I examine the error log of nginx, it's filled with a bunch of lines that look like the following: 2012/09/08 11:22:36 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" 2012/09/08 11:22:39 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET /apc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" It's bizarre that this was working previously and now all of sudden PHP is complaining that it can't "find" the scripts on the share. Does anybody know why this is happening? EDIT I tried editing php-fpm.conf and changing chdir to the following: chdir = /mnt/webfiles When I try and restart the php-fpm service, I get the error: Starting php-fpm [08-Sep-2012 14:20:55] ERROR: [pool www] the chdir path '/mnt/webfiles' does not exist or is not a directory This is a total load of bullshit because this directory DOES exist and is mounted! Any ls commands to list that directory work perfectly. Why the hell can't PHP-FPM see this directory?! Here are my configuration files for reference: nginx.conf user www-data; worker_processes 2; error_log /var/log/nginx/nginx.log info; pid /var/run/nginx.pid; events { worker_connections 1024; multi_accept on; } http { include fastcgi.conf; include mime.types; default_type application/octet-stream; set_real_ip_from 127.0.0.1; real_ip_header X-Forwarded-For; ## Proxy proxy_redirect off; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; client_max_body_size 32m; client_body_buffer_size 128k; proxy_connect_timeout 90; proxy_send_timeout 90; proxy_read_timeout 90; proxy_buffers 32 4k; ## Compression gzip on; gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript; gzip_disable "MSIE [1-6]\.(?!.*SV1)"; ### TCP options tcp_nodelay on; tcp_nopush on; keepalive_timeout 65; sendfile on; include /etc/nginx/sites-enabled/*; } my site config server { listen 80; access_log /var/log/nginx/$host.access.log; error_log /var/log/nginx/error.log; root /mnt/webfiles/nTv5-2.0/app/webroot; index index.php; ## Block bad bots if ($http_user_agent ~* (HTTrack|HTMLParser|libcurl|discobot|Exabot|Casper|kmccrew|plaNETWORK|RPT-HTTPClient)) { return 444; } ## Block certain Referers (case insensitive) if ($http_referer ~* (sex|vigra|viagra) ) { return 444; } ## Deny dot files: location ~ /\. { deny all; } ## Favicon Not Found location = /favicon.ico { access_log off; log_not_found off; } ## Robots.txt Not Found location = /robots.txt { access_log off; log_not_found off; } if (-f $document_root/maintenance.html) { rewrite ^(.*)$ /maintenance.html last; } location ~* \.(?:ico|css|js|gif|jpe?g|png)$ { # Some basic cache-control for static files to be sent to the browser expires max; add_header Pragma public; add_header Cache-Control "max-age=2678400, public, must-revalidate"; } location / { try_files $uri $uri/ index.php; if (-f $request_filename) { break; } rewrite ^(.+)$ /index.php?url=$1 last; } location ~ \.php$ { include /etc/nginx/fastcgi.conf; fastcgi_pass unix:/var/run/php5-fpm.sock; } } php-fpm.conf ;;;;;;;;;;;;;;;;;;;;; ; FPM Configuration ; ;;;;;;;;;;;;;;;;;;;;; ; All relative paths in this configuration file are relative to PHP's install ; prefix (/opt/php5). This prefix can be dynamicaly changed by using the ; '-p' argument from the command line. ; Include one or more files. If glob(3) exists, it is used to include a bunch of ; files from a glob(3) pattern. This directive can be used everywhere in the ; file. ; Relative path can also be used. They will be prefixed by: ; - the global prefix if it's been set (-p arguement) ; - /opt/php5 otherwise ;include=etc/fpm.d/*.conf ;;;;;;;;;;;;;;;;;; ; Global Options ; ;;;;;;;;;;;;;;;;;; [global] ; Pid file ; Note: the default prefix is /opt/php5/var ; Default Value: none pid = /var/run/php-fpm.pid ; Error log file ; Note: the default prefix is /opt/php5/var ; Default Value: log/php-fpm.log error_log = /var/log/php5-fpm/php-fpm.log ; Log level ; Possible Values: alert, error, warning, notice, debug ; Default Value: notice ;log_level = notice ; If this number of child processes exit with SIGSEGV or SIGBUS within the time ; interval set by emergency_restart_interval then FPM will restart. A value ; of '0' means 'Off'. ; Default Value: 0 ;emergency_restart_threshold = 0 ; Interval of time used by emergency_restart_interval to determine when ; a graceful restart will be initiated. This can be useful to work around ; accidental corruptions in an accelerator's shared memory. ; Available Units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;emergency_restart_interval = 0 ; Time limit for child processes to wait for a reaction on signals from master. ; Available units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;process_control_timeout = 0 ; Send FPM to background. Set to 'no' to keep FPM in foreground for debugging. ; Default Value: yes ;daemonize = yes ;;;;;;;;;;;;;;;;;;;; ; Pool Definitions ; ;;;;;;;;;;;;;;;;;;;; ; Multiple pools of child processes may be started with different listening ; ports and different management options. The name of the pool will be ; used in logs and stats. There is no limitation on the number of pools which ; FPM can handle. Your system will tell you anyway :) ; Start a new pool named 'www'. ; the variable $pool can we used in any directive and will be replaced by the ; pool name ('www' here) [www] ; Per pool prefix ; It only applies on the following directives: ; - 'slowlog' ; - 'listen' (unixsocket) ; - 'chroot' ; - 'chdir' ; - 'php_values' ; - 'php_admin_values' ; When not set, the global prefix (or /opt/php5) applies instead. ; Note: This directive can also be relative to the global prefix. ; Default Value: none ;prefix = /path/to/pools/$pool ; The address on which to accept FastCGI requests. ; Valid syntaxes are: ; 'ip.add.re.ss:port' - to listen on a TCP socket to a specific address on ; a specific port; ; 'port' - to listen on a TCP socket to all addresses on a ; specific port; ; '/path/to/unix/socket' - to listen on a unix socket. ; Note: This value is mandatory. ;listen = 127.0.0.1:9000 listen = /var/run/php5-fpm.sock ; Set listen(2) backlog. A value of '-1' means unlimited. ; Default Value: 128 (-1 on FreeBSD and OpenBSD) ;listen.backlog = -1 ; List of ipv4 addresses of FastCGI clients which are allowed to connect. ; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original ; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address ; must be separated by a comma. If this value is left blank, connections will be ; accepted from any ip address. ; Default Value: any ;listen.allowed_clients = 127.0.0.1 ; Set permissions for unix socket, if one is used. In Linux, read/write ; permissions must be set in order to allow connections from a web server. Many ; BSD-derived systems allow connections regardless of permissions. ; Default Values: user and group are set as the running user ; mode is set to 0666 ;listen.owner = www-data ;listen.group = www-data ;listen.mode = 0666 ; Unix user/group of processes ; Note: The user is mandatory. If the group is not set, the default user's group ; will be used. user = www-data group = www-data ; Choose how the process manager will control the number of child processes. ; Possible Values: ; static - a fixed number (pm.max_children) of child processes; ; dynamic - the number of child processes are set dynamically based on the ; following directives: ; pm.max_children - the maximum number of children that can ; be alive at the same time. ; pm.start_servers - the number of children created on startup. ; pm.min_spare_servers - the minimum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is less than this ; number then some children will be created. ; pm.max_spare_servers - the maximum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is greater than this ; number then some children will be killed. ; Note: This value is mandatory. pm = dynamic ; The number of child processes to be created when pm is set to 'static' and the ; maximum number of child processes to be created when pm is set to 'dynamic'. ; This value sets the limit on the number of simultaneous requests that will be ; served. Equivalent to the ApacheMaxClients directive with mpm_prefork. ; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP ; CGI. ; Note: Used when pm is set to either 'static' or 'dynamic' ; Note: This value is mandatory. pm.max_children = 50 ; The number of child processes created on startup. ; Note: Used only when pm is set to 'dynamic' ; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2 pm.start_servers = 20 ; The desired minimum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.min_spare_servers = 5 ; The desired maximum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.max_spare_servers = 35 ; The number of requests each child process should execute before respawning. ; This can be useful to work around memory leaks in 3rd party libraries. For ; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS. ; Default Value: 0 pm.max_requests = 500 ; The URI to view the FPM status page. If this value is not set, no URI will be ; recognized as a status page. By default, the status page shows the following ; information: ; accepted conn - the number of request accepted by the pool; ; pool - the name of the pool; ; process manager - static or dynamic; ; idle processes - the number of idle processes; ; active processes - the number of active processes; ; total processes - the number of idle + active processes. ; max children reached - number of times, the process limit has been reached, ; when pm tries to start more children (works only for ; pm 'dynamic') ; The values of 'idle processes', 'active processes' and 'total processes' are ; updated each second. The value of 'accepted conn' is updated in real time. ; Example output: ; accepted conn: 12073 ; pool: www ; process manager: static ; idle processes: 35 ; active processes: 65 ; total processes: 100 ; max children reached: 1 ; By default the status page output is formatted as text/plain. Passing either ; 'html' or 'json' as a query string will return the corresponding output ; syntax. Example: ; http://www.foo.bar/status ; http://www.foo.bar/status?json ; http://www.foo.bar/status?html ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set pm.status_path = /status ; The ping URI to call the monitoring page of FPM. If this value is not set, no ; URI will be recognized as a ping page. This could be used to test from outside ; that FPM is alive and responding, or to ; - create a graph of FPM availability (rrd or such); ; - remove a server from a group if it is not responding (load balancing); ; - trigger alerts for the operating team (24/7). ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set ping.path = /ping ; This directive may be used to customize the response of a ping request. The ; response is formatted as text/plain with a 200 response code. ; Default Value: pong ping.response = pong ; The timeout for serving a single request after which the worker process will ; be killed. This option should be used when the 'max_execution_time' ini option ; does not stop script execution for some reason. A value of '0' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_terminate_timeout = 0 ; The timeout for serving a single request after which a PHP backtrace will be ; dumped to the 'slowlog' file. A value of '0s' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_slowlog_timeout = 0 ; The log file for slow requests ; Default Value: not set ; Note: slowlog is mandatory if request_slowlog_timeout is set ;slowlog = log/$pool.log.slow ; Set open file descriptor rlimit. ; Default Value: system defined value ;rlimit_files = 1024 ; Set max core size rlimit. ; Possible Values: 'unlimited' or an integer greater or equal to 0 ; Default Value: system defined value ;rlimit_core = 0 ; Chroot to this directory at the start. This value must be defined as an ; absolute path. When this value is not set, chroot is not used. ; Note: you can prefix with '$prefix' to chroot to the pool prefix or one ; of its subdirectories. If the pool prefix is not set, the global prefix ; will be used instead. ; Note: chrooting is a great security feature and should be used whenever ; possible. However, all PHP paths will be relative to the chroot ; (error_log, sessions.save_path, ...). ; Default Value: not set ;chroot = ; Chdir to this directory at the start. ; Note: relative path can be used. ; Default Value: current directory or / when chroot ;chdir = /var/www ; Redirect worker stdout and stderr into main error log. If not set, stdout and ; stderr will be redirected to /dev/null according to FastCGI specs. ; Note: on highloaded environement, this can cause some delay in the page ; process time (several ms). ; Default Value: no ;catch_workers_output = yes ; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from ; the current environment. ; Default Value: clean env ;env[HOSTNAME] = $HOSTNAME ;env[PATH] = /usr/local/bin:/usr/bin:/bin ;env[TMP] = /tmp ;env[TMPDIR] = /tmp ;env[TEMP] = /tmp ; Additional php.ini defines, specific to this pool of workers. These settings ; overwrite the values previously defined in the php.ini. The directives are the ; same as the PHP SAPI: ; php_value/php_flag - you can set classic ini defines which can ; be overwritten from PHP call 'ini_set'. ; php_admin_value/php_admin_flag - these directives won't be overwritten by ; PHP call 'ini_set' ; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no. ; Defining 'extension' will load the corresponding shared extension from ; extension_dir. Defining 'disable_functions' or 'disable_classes' will not ; overwrite previously defined php.ini values, but will append the new value ; instead. ; Note: path INI options can be relative and will be expanded with the prefix ; (pool, global or /opt/php5) ; Default Value: nothing is defined by default except the values in php.ini and ; specified at startup with the -d argument ;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected] ;php_flag[display_errors] = off ;php_admin_value[error_log] = /var/log/fpm-php.www.log ;php_admin_flag[log_errors] = on ;php_admin_value[memory_limit] = 32M php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i

Read the article
How should I clean up hung grandchild processes when an alarm trips in Perl?

- by brian d foy

I have a parallelized automation script which needs to call many other scripts, some of which hang because they (incorrectly) wait for standard input. That's not a big deal because I catch those with alarm. The trick is to shut down those hung grandchild processes when the child shuts down. I thought various incantations of SIGCHLD, waiting, and process groups could do the trick, but they all block and the grandchildren aren't reaped. My solution, which works, just doesn't seem like it is the right solution. I'm not especially interested in the Windows solution just yet, but I'll eventually need that too. Mine only works for Unix, which is fine for now. I wrote a small script that takes the number of simultaneous parallel children to run and the total number of forks: $ fork_bomb <parallel jobs> <number of forks> $ fork_bomb 8 500 This will probably hit the per-user process limit within a couple of minutes. Many solutions I've found just tell you to increase the per-user process limit, but I need this to run about 300,000 times, so that isn't going to work. Similarly, suggestions to re-exec and so on to clear the process table aren't what I need. I'd like to actually fix the problem instead of slapping duct tape over it. I crawl the process table looking for the child processes and shut down the hung processes individually in the SIGALRM handler, which needs to die because the rest of real code has no hope of success after that. The kludgey crawl through the process table doesn't bother me from a performance perspective, but I wouldn't mind not doing it: use Parallel::ForkManager; use Proc::ProcessTable; my $pm = Parallel::ForkManager->new( $ARGV[0] ); my $alarm_sub = sub { kill 9, map { $_->{pid} } grep { $_->{ppid} == $$ } @{ Proc::ProcessTable->new->table }; die "Alarm rang for $$!\n"; }; foreach ( 0 .. $ARGV[1] ) { print "."; print "\n" unless $count++ % 50; my $pid = $pm->start and next; local $SIG{ALRM} = $alarm_sub; eval { alarm( 2 ); system "$^X -le '<STDIN>'"; # this will hang alarm( 0 ); }; $pm->finish; } If you want to run out of processes, take out the kill. I thought that setting a process group would work so I could kill everything together, but that blocks: my $alarm_sub = sub { kill 9, -$$; # blocks here die "Alarm rang for $$!\n"; }; foreach ( 0 .. $ARGV[1] ) { print "."; print "\n" unless $count++ % 50; my $pid = $pm->start and next; setpgrp(0, 0); local $SIG{ALRM} = $alarm_sub; eval { alarm( 2 ); system "$^X -le '<STDIN>'"; # this will hang alarm( 0 ); }; $pm->finish; } The same thing with POSIX's setsid didn't work either, and I think that actually broke things in a different way since I'm not really daemonizing this. Curiously, Parallel::ForkManager's run_on_finish happens too late for the same clean-up code: the grandchildren are apparently already disassociated from the child processes at that point.

Read the article
Update: GTAS and EBS

- by jeffrey.waterman

Provided below are updated target date timeframes for provided patches for upcoming legislative enhancements. Dates have been pushed out from previous dates provided due to changes in Treasury mandatory dates. Mandatory dates for GTAS and IPAC have changes since previous target dates for patches were provided. These are target dates, not commitments to deliver functionality. Deliverable Target Timeframes for Customer Patches Comments R12 GTAS Configuration Apr 2012 Patch is available GTAS Key Processes Oct/Nov 2012 Includes GTAS processes necessary to create the GTAS interface file, migration of FACTS balances to GTAS, GTAS Trial Balance, and GTAS Transaction Register. GTAS Reports Nov/Dec 2012 GTAS Trial Balance GTAS Transaction Register Capture of Trading Partner TAS/BETC Apr/May 2013 Includes modification necessary to capture BETC, Trading Partner TAS/BETC on relevant transactions. GTAS Other Processes May/Jun 2013 Includes GTAS Customer and Vendor update processes. IPAC Aug/Sep Includes modification required to IPAC to accommodate Componentized TAS and BETC. 11i GTAS Configuration May 2012 Patch is available GTAS Key Processes Nov/Dec 2012 Includes GTAS processes necessary to create the GTAS interface file, migration of FACTS balances to GTAS, GTAS Trial Balance, and GTAS Transaction Register. GTAS Reports Dec/Jan 2012 GTAS Trial Balance GTAS Transaction Register Capture of Trading Partner TAS/BETC May/Jun 2013 Includes modification necessary to capture BETC, Trading Partner TAS/BETC on relevant transactions. GTAS Other Processes Jun/Jul 2013 Includes GTAS Customer and Vendor update processes. IPAC Sep/Oct 2013 Includes modification required to IPAC to accommodate Componentized TAS and BETC.

Read the article
How to figure out who owns a worker thread that is still running when my app exits?

- by Dave

Not long after upgrading to VS2010, my application won't shut down cleanly. If I close the app and then hit pause in the IDE, I see this: The problem is, there's no context. The call stack just says [External code], which isn't too helpful. Here's what I've done so far to try to narrow down the problem: deleted all extraneous plugins to minimize the number of worker threads launched set breakpoints in my code anywhere I create worker threads (and delegates + BeginInvoke, since I think they are labeled "Worker Thread" in the debugger anyway). None were hit. set IsBackground = true for all threads While I could do the next brute force step, which is to roll my code back to a point where this didn't happen and then look over all of the change logs, this isn't terribly efficient. Can anyone recommend a better way to figure this out, given the notable lack of information presented by the debugger? The only other things I can think of include: read up on WinDbg and try to use it to stop anytime a thread is started. At least, I thought that was possible... :) comment out huge blocks of code until the app closes properly, then start uncommenting until it doesn't. UPDATE Perhaps this information will be of use. I decided to use WinDbg and attach to my application. I then closed it, and switched to thread 0 and dumped the stack contents. Here's what I have: ThreadCount: 6 UnstartedThread: 0 BackgroundThread: 1 PendingThread: 0 DeadThread: 4 Hosted Runtime: no PreEmptive GC Alloc Lock ID OSID ThreadOBJ State GC Context Domain Count APT Exception 0 1 1c70 005a65c8 6020 Enabled 02dac6e0:02dad7f8 005a03c0 0 STA 2 2 1b20 005b1980 b220 Enabled 00000000:00000000 005a03c0 0 MTA (Finalizer) XXXX 3 08504048 19820 Enabled 00000000:00000000 005a03c0 0 Ukn XXXX 4 08504540 19820 Enabled 00000000:00000000 005a03c0 0 Ukn XXXX 5 08516a90 19820 Enabled 00000000:00000000 005a03c0 0 Ukn XXXX 6 08517260 19820 Enabled 00000000:00000000 005a03c0 0 Ukn 0:008> ~0s eax=c0674960 ebx=00000000 ecx=00000000 edx=00000000 esi=0040f320 edi=005a65c8 eip=76c37e47 esp=0040f23c ebp=0040f258 iopl=0 nv up ei pl nz na po nc cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202 USER32!NtUserGetMessage+0x15: 76c37e47 83c404 add esp,4 0:000> !clrstack OS Thread Id: 0x1c70 (0) Child SP IP Call Site 0040f274 76c37e47 [InlinedCallFrame: 0040f274] 0040f270 6baa8976 DomainBoundILStubClass.IL_STUB_PInvoke(System.Windows.Interop.MSG ByRef, System.Runtime.InteropServices.HandleRef, Int32, Int32)*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_32\WindowsBase\d17606e813f01376bd0def23726ecc62\WindowsBase.ni.dll 0040f274 6ba924c5 [InlinedCallFrame: 0040f274] MS.Win32.UnsafeNativeMethods.IntGetMessageW(System.Windows.Interop.MSG ByRef, System.Runtime.InteropServices.HandleRef, Int32, Int32) 0040f2c4 6ba924c5 MS.Win32.UnsafeNativeMethods.GetMessageW(System.Windows.Interop.MSG ByRef, System.Runtime.InteropServices.HandleRef, Int32, Int32) 0040f2dc 6ba8e5f8 System.Windows.Threading.Dispatcher.GetMessage(System.Windows.Interop.MSG ByRef, IntPtr, Int32, Int32) 0040f318 6ba8d579 System.Windows.Threading.Dispatcher.PushFrameImpl(System.Windows.Threading.DispatcherFrame) 0040f368 6ba8d2a1 System.Windows.Threading.Dispatcher.PushFrame(System.Windows.Threading.DispatcherFrame) 0040f374 6ba7fba0 System.Windows.Threading.Dispatcher.Run() 0040f380 62e6ccbb System.Windows.Application.RunDispatcher(System.Object)*** WARNING: Unable to verify checksum for C:\Windows\assembly\NativeImages_v4.0.30319_32\PresentationFramewo#\7f91eecda3ff7ce478146b6458580c98\PresentationFramework.ni.dll 0040f38c 62e6c8ff System.Windows.Application.RunInternal(System.Windows.Window) 0040f3b0 62e6c682 System.Windows.Application.Run(System.Windows.Window) 0040f3c0 62e6c30b System.Windows.Application.Run() 0040f3cc 001f00bc MyApplication.App.Main() [C:\code\trunk\MyApplication\obj\Debug\GeneratedInternalTypeHelper.g.cs @ 24] 0040f608 66c421db [GCFrame: 0040f608] EDIT -- not sure if this helps, but the main thread's call stack looks like this: [Managed to Native Transition] > WindowsBase.dll!MS.Win32.UnsafeNativeMethods.GetMessageW(ref System.Windows.Interop.MSG msg, System.Runtime.InteropServices.HandleRef hWnd, int uMsgFilterMin, int uMsgFilterMax) + 0x15 bytes WindowsBase.dll!System.Windows.Threading.Dispatcher.GetMessage(ref System.Windows.Interop.MSG msg, System.IntPtr hwnd, int minMessage, int maxMessage) + 0x48 bytes WindowsBase.dll!System.Windows.Threading.Dispatcher.PushFrameImpl(System.Windows.Threading.DispatcherFrame frame = {System.Windows.Threading.DispatcherFrame}) + 0x85 bytes WindowsBase.dll!System.Windows.Threading.Dispatcher.PushFrame(System.Windows.Threading.DispatcherFrame frame) + 0x49 bytes WindowsBase.dll!System.Windows.Threading.Dispatcher.Run() + 0x4c bytes PresentationFramework.dll!System.Windows.Application.RunDispatcher(object ignore) + 0x17 bytes PresentationFramework.dll!System.Windows.Application.RunInternal(System.Windows.Window window) + 0x6f bytes PresentationFramework.dll!System.Windows.Application.Run(System.Windows.Window window) + 0x26 bytes PresentationFramework.dll!System.Windows.Application.Run() + 0x1b bytes I did a search on it and found some posts related to WPF GUIs hanging, and maybe that'll give me some more clues.

Read the article
Events raised by BackgroundWorker not executed on expected thread

- by Topdown

A winforms dialog is using BackgroundWorker to perform some asynchronous operations with significant success. On occasion, the async process being run by the background worker will need to raise events to the winforms app for user response (a message that asks the user if they wish to cancel), the response of which captured in an CancelEventArgs type of the event. Being an implementation of threading, I would have expected the RaiseEvent of the worker to fire, and then the worker would continue, hence requiring me to pause the worker until the response is received. Instead however, the worker is held to wait for the code executed by the raise event to complete. It seems like method I am calling via the event call is actually on the worker thread used by the background worker, and I am surprised, since I expected to see it on the Main Thread which is where the mainform is running. Also surprisingly, there are no cross thread exceptions thrown. Can somebody please explain why this is not as I expect?

Read the article
Windows Azure Evolution – Welcome to VS2012

- by Shaun

When the Microsoft released the first preview version of Windows 8 and Visual Studio, many people in the community were asking if the windows azure tool is available to it. The answer was “NO”. Microsoft promised that the windows azure tool will only support the Visual Studio 2010 but when the 2012 was final released, windows azure tool should be work. But now alone with the new windows azure platform was published we got the latest Windows Azure SDK 1.7, which is compatible to the Visual Studio 2012 RC. You can retrieve the latest version of the Windows Azure SDK through Web Platform Installer, which I think it’s the easiest and simplest way to download and install, since besides the SDK itself it also needs some other components. To download the latest windows azure SDK from Web Platform Installer, just go to the windows azure website and clicked the Develop, .NET and click the blue “install” button. Then you need to select which version of Visual Studio you want to use, Visual Studio 2010 or Visual Studio 2012 RC. After selected the current version you will download an EXE file. This file will lead you to install the Web Platform Installer 4.0 (if you haven’t installed) and the latest windows azure SDK. You can see the version name is June 2012, 1.7. Finally the WebPI will detect the dependent components you need to download and begin to install. But if you want to challenge yourself you can download the components and install them manually. The standalone installations are listed in this page with the instruction on how to install them with necessary pre-requirements. Once you finished the installation you can open the Visual Studio 2012 RC and as usual, it need to be run as administrator. If you clicked the New Project link from the start page, navigated to Cloud category you will find that there no project template available. Is there anything wrong? So, if you changed the target framework from the default .NET 4.5 to .NET 4 you will see the azure project template. This is because, currently the windows azure instance does not support .NET 4.5. After clicked OK you will see the role creation window, which is similar as what you have seen before. But there are some new role templates in this SDK. Firstly you will have ASP.NET MVC 4 web role available, which means you can create ASP.NET MVC 4 applications for internet, intranet, mobile and WebAPI on the cloud. Then there are two new worker role templates, “Cache Worker Role” and “Worker Role with Service Bus Queue”. “Worker Role with Service Bus Queue” is a worker role which had added necessary references to access the Windows Azure Service Bus Queue. It also have some basic sample code in the worker role class which could read messages from the queue when started. The “Cache Worker Role” is a worker role which has the in-memory distributed cache feature enabled by default. This feature is different than the Windows Azure Caching. It allows the role instance to use its memory as a in-memory distributed cache clusters. By using this feature you can have one or more worker roles as some dedicate cache clusters. Alternatively, you can make part of your web role and worker role’s memory as the cache clusters as well. Let’s just create an ASP.NET MVC 4 Web Role, and click F5 to run it under the local emulator. If you have been working with azure for a while you should know that I need to setup the local storage emulator before running locally if it’s a fresh azure SDK installation. But in this version when we started our azure project the Visual Studio will check if the storage emulator had been initialized. If not, it will run the initializer automatically. And as you can see, in this version the storage emulator relies on the SQL Server 2012 Local DB feature. It will create the emulator database and tables in the default local database. You can set the storage emulator to use a standard SQL Server default instance by using the command “dsinit /instance:.”. The “dsinit” tool now is located at %PROGRAM FILES%\Microsoft SDKs\Windows Azure\Emulator\devstore After the Visual Studio complied and deployed the package our website should be shown in the browser. This is the MVC 4 Web Role home page on my Windows 8 machine in IE10. Another thing you might notice is that, in this version the compute emulator utilizes IIS Express to host the web roles instead of the full IIS. You can add breakpoint in the code and debug, and you can use the local storage emulator to test your code for accessing the storage service. All of them are same as what your are doing now on SDK 1.6. You can switch to use IIS to run your web role in local emulator. Just open the windows azure porject property windows, in the Web page select “Use IIS Web Server”. For more information about this please have a look on Nuno’s blog post. In the role property page in Visual Studio there’s no massive changes. You can configure your role settings such as the endpoints, certificates and local storage, etc.. One thing was added is the Caching tab. Here you can specify enable the caching feature or not, and how much memory you want to use as the cache cluster. I will introduce more details about it in the future posts. The publish and package feature are also no change. You can publish your project to azure directly through Visual Studio 2012, while you can create the package and upload manually. Below is the SDK version of my deployment which is 1.7.30602.1703 in the developer portal. Summary In this post I introduced about the new Windows Azure SDK 1.7 especially on how it works on the latest Visual Studio 2012 RC. There’s no significant changes in the visual studio tool in this version but some small enhancement such as ASP.NET MVC 4, Cache Worker Role, using SQL 2012 Local DB and IIS Express, etc.. Hope this helps, Shaun All documents and related graphics, codes are provided "AS IS" without warranty of any kind. Copyright © Shaun Ziyan Xu. This work is licensed under the Creative Commons License.

Read the article
Windows Server task manager displays much higher memory use than sum of all processes' working set s

- by Sleepless

I have a 16 GB Windows Server 2008 x64 machine mostly running SQL Server 2008. The free memory as seen in Task Manager is very low (128 MB at the moment), i.e. about 15.7 GB are used. So far, so good. Now when I try to narrow down the process(es) using the most memory I get confused: None of the processes have more than 200MB Working Set Size as displayed in the 'Processes' tab of Task Manager. Well, maybe the Working Set Size isn't the relevant counter? To figure that out I used a PowerShell command [1] to sum up each individual property of the process object in sort of a brute force approach - surely one of them must add up to the 15.7 GB, right? Turns out none of them does, with the closest being VirtualMemorySize (around 12.7 GB) and PeakVirtualMemorySize (around 14.7 GB). WTF? To put it another way: Which of the numerous memory related process information is the "correct" one, i.e. counts towards the server's physical memory as displayed in the Task Manager's 'Performance' tab? Thank you all! [1] $erroractionpreference="silentlycontinue"; get-process | gm | where-object {$.membertype -eq "Property"} | foreach-object {$.name; (get-process | measure-object -sum $_.name ).sum / 1MB}

Read the article
As an indie game dev, what processes are the best for soliciting feedback on my design/spec/idea? [closed]

- by Jess Telford

Background I have worked in a professional environment where the process usually goes like the following: Brain storm idea Solidify the game mechanics / design Iterate on design/idea to create a more solid experience Spec out the details of the design/idea Build it Step 3. is generally done with the stakeholders of the game (developers, designers, investors, publishers, etc) to reach an 'agreement' which meets the goals of all involved. Due to this process involving a series of often opposing and unique view points, creative solutions can surface through discussion / iteration. This is backed up by a process for collating the changes / new ideas, as well as structured time for discussion. As a (now) indie developer, I have to play the role of all the stakeholders (developers, designers, investors, publishers, etc), and often find myself too close to the idea / design to do more than minor changes, which I feel to be local maxima when it comes to the best result (I'm looking for the global maxima, of course). I have read that ideas / game designs / unique mechanics are merely multipliers of execution, and that keeping them secret is just silly. In sharing the idea with others outside the realm of my own thinking, I hope to replicate the influence other stakeholders have. I am struggling with the collation of changes / new ideas, and any kind of structured method of receiving feedback. My question: As an indie game developer, how and where can I share my ideas/designs to receive meaningful / constructive feedback? How can I successfully collate the feedback into a new iteration of the design? Are there any specialized websites, etc?

Read the article

Search Results

Search found 5597 results on 224 pages for 'worker processes'.

Page 25/224 | < Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >

- by Hugh

- by fuenfundachtzig

- by Ben Horton

- by user198864

- by Virtual_Lotos

- by Zubair

- by overmann

- by Pini Reznik

- by Chris Cooper

- by andrej351

- by faye.todd(at)oracle.com

- by NullReference

- by lurkerbelow

- by rarbox

- by Garret Smith

- by Mark

- by vorik

- by Hoff

- by brian d foy

- by jeffrey.waterman

- by Dave

- by Topdown

- by Shaun

- by Sleepless

- by Jess Telford

< Previous Page | 21 22 23 24 25 26 27 28 29 30 31 32 | Next Page >