Search Results

Search found 4272 results on 171 pages for 'processes'.

Page 18/171 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >

PostgreSQL 8.4 won't start after blackout

- by RiZe

I have problem with starting PostgreSQL 8.4 on Ubuntu 9.10 Server after blackout. When I try to connect to the database it says: psql: server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. When I try to start it by using command sudo -u postgres /etc/init.d/postgresql-8.4 start * Starting PostgreSQL 8.4 database server [ OK ] Netstat output netstat -tulp (No info could be read for "-p": geteuid()=1000 but you should be root.) Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 localhost:postgresql *:* LISTEN - tcp 0 0 192.168.1.35:svn *:* LISTEN - tcp 0 0 192.168.1.35:http-alt *:* LISTEN - tcp 0 0 *:ssh *:* LISTEN - tcp6 0 0 localhost:postgresql [::]:* LISTEN - tcp6 0 0 [::]:ssh [::]:* LISTEN - udp 0 0 *:bootpc *:* - But still don't work so lets restart it sudo -u postgres /etc/init.d/postgresql-8.4 restart * Restarting PostgreSQL 8.4 database server * The PostgreSQL server failed to start. Please check the log output: 2009-11-30 13:39:37 CET LOG: database system was shut down at 2009-11-30 13:39:33 CET 2009-11-30 13:39:37 CET LOG: autovacuum launcher started 2009-11-30 13:39:37 CET LOG: database system is ready to accept connections 2009-11-30 13:39:37 CET LOG: incomplete startup packet 2009-11-30 13:39:38 CET LOG: server process (PID 2240) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:38 CET LOG: terminating any other active server processes 2009-11-30 13:39:38 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:38 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:37 CET 2009-11-30 13:39:38 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:38 CET LOG: record with zero length at 0/11D464C 2009-11-30 13:39:38 CET LOG: redo is not required 2009-11-30 13:39:38 CET LOG: autovacuum launcher started 2009-11-30 13:39:38 CET LOG: database system is ready to accept connections 2009-11-30 13:39:38 CET LOG: server process (PID 2248) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:38 CET LOG: terminating any other active server processes 2009-11-30 13:39:38 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:38 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:38 CET 2009-11-30 13:39:38 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:38 CET LOG: record with zero length at 0/11D4690 2009-11-30 13:39:38 CET LOG: redo is not required 2009-11-30 13:39:39 CET LOG: autovacuum launcher started 2009-11-30 13:39:39 CET LOG: database system is ready to accept connections 2009-11-30 13:39:39 CET LOG: server process (PID 2256) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:39 CET LOG: terminating any other active server processes 2009-11-30 13:39:39 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:39 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:38 CET 2009-11-30 13:39:39 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:39 CET LOG: record with zero length at 0/11D46D4 2009-11-30 13:39:39 CET LOG: redo is not required 2009-11-30 13:39:39 CET LOG: autovacuum launcher started 2009-11-30 13:39:39 CET LOG: database system is ready to accept connections 2009-11-30 13:39:39 CET LOG: server process (PID 2264) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:39 CET LOG: terminating any other active server processes 2009-11-30 13:39:39 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:39 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:39 CET 2009-11-30 13:39:39 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:40 CET LOG: record with zero length at 0/11D4718 2009-11-30 13:39:40 CET LOG: redo is not required 2009-11-30 13:39:40 CET LOG: autovacuum launcher started 2009-11-30 13:39:40 CET LOG: database system is ready to accept connections 2009-11-30 13:39:40 CET LOG: server process (PID 2272) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:40 CET LOG: terminating any other active server processes 2009-11-30 13:39:40 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:40 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:40 CET 2009-11-30 13:39:40 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:40 CET LOG: record with zero length at 0/11D475C 2009-11-30 13:39:40 CET LOG: redo is not required 2009-11-30 13:39:40 CET LOG: autovacuum launcher started 2009-11-30 13:39:40 CET LOG: database system is ready to accept connections 2009-11-30 13:39:41 CET LOG: server process (PID 2280) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:41 CET LOG: terminating any other active server processes 2009-11-30 13:39:41 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:41 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:40 CET 2009-11-30 13:39:41 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:41 CET LOG: record with zero length at 0/11D47A0 2009-11-30 13:39:41 CET LOG: redo is not required 2009-11-30 13:39:41 CET LOG: autovacuum launcher started 2009-11-30 13:39:41 CET LOG: database system is ready to accept connections 2009-11-30 13:39:41 CET LOG: server process (PID 2288) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:41 CET LOG: terminating any other active server processes 2009-11-30 13:39:41 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:41 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:41 CET 2009-11-30 13:39:41 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:41 CET LOG: record with zero length at 0/11D47E4 2009-11-30 13:39:41 CET LOG: redo is not required 2009-11-30 13:39:41 CET LOG: autovacuum launcher started 2009-11-30 13:39:41 CET LOG: database system is ready to accept connections 2009-11-30 13:39:42 CET LOG: server process (PID 2296) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:42 CET LOG: terminating any other active server processes 2009-11-30 13:39:42 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:42 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:41 CET 2009-11-30 13:39:42 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:42 CET LOG: record with zero length at 0/11D4828 2009-11-30 13:39:42 CET LOG: redo is not required 2009-11-30 13:39:42 CET LOG: autovacuum launcher started 2009-11-30 13:39:42 CET LOG: database system is ready to accept connections 2009-11-30 13:39:42 CET LOG: server process (PID 2304) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:42 CET LOG: terminating any other active server processes 2009-11-30 13:39:42 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:42 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:42 CET 2009-11-30 13:39:42 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:42 CET LOG: record with zero length at 0/11D486C 2009-11-30 13:39:42 CET LOG: redo is not required 2009-11-30 13:39:43 CET LOG: autovacuum launcher started 2009-11-30 13:39:43 CET LOG: database system is ready to accept connections 2009-11-30 13:39:43 CET LOG: server process (PID 2312) was terminated by signal 11: Segmentation fault 2009-11-30 13:39:43 CET LOG: terminating any other active server processes 2009-11-30 13:39:43 CET LOG: all server processes terminated; reinitializing 2009-11-30 13:39:43 CET LOG: database system was interrupted; last known up at 2009-11-30 13:39:42 CET 2009-11-30 13:39:43 CET LOG: database system was not properly shut down; automatic recovery in progress 2009-11-30 13:39:43 CET LOG: record with zero length at 0/11D48B0 2009-11-30 13:39:43 CET LOG: redo is not required 2009-11-30 13:39:43 CET LOG: autovacuum launcher started 2009-11-30 13:39:43 CET LOG: database system is ready to accept connections [fail] So what happened and what can I do to solve this? Thanks for replies

Read the article
How to disable an "always there" program if it isn't in the processes list?

- by rumtscho

I have Crashplan and it is constantly running in the background and making backups every 15 minutes. It caused some problems with the backup target folders, so I want it to be inactive while I am making changes to these folders. I started the application itself, but could not find some kind of "Pause" button. So I decided to just stop its process. I first tried the lazy way - the system monitor in the Gnome panel has a "Processes" tab - but didn't find it listed there. Then I did a sudo ps -A and read through the whole list. I don't recognize everything on the list (many process names are self-explaining, like evolution-alarm, but I don't recognize others like phy0) but there was nothing which sounded even remotely like crashplan. But I know that there must have been a process belonging to Crashplan running at this time, because the main Crashplan window was open when I ran the command. Do you have any advice how to stop this thing from running? The best solution would involve temporary preventing it from loading on boot too, since I may need to reboot while doing the maintenance there.

Read the article
PHP OCI8 and Oracle 11g DRCP Connection Pooling in Pictures

- by christopher.jones

Here is a screen shot from a PHP OCI8 connection pooling demo that I like to run. It graphically shows how little database host memory is needed when using DRCP connection pooling with Oracle Database 11g. Migrating to DRCP can be as simple as starting the pool and changing the connection string in your PHP application. The script that generated the data for this graph was a simple "Parts" query application being run under various simulated user loads. I was running the database on a small Oracle Linux server with just 2G of memory. I used PHP OCI8 1.4. Apache is in pre-fork mode, as needed for PHP. Each graph has time on the horizontal access in arbitrary 'tick' time units. Click the image to see it full sized. Pooled connections Beginning with the top left graph, At tick time 65 I used Apache's 'ab' tool to start 100 concurrent 'users' running the application. These users connected to the database using DRCP: $c = oci_pconnect('phpdemo', 'welcome', 'myhost/orcl:pooled'); A second hundred DRCP users were added to the system at tick 80 and a final hundred users added at tick 100. At about tick 110 I stopped the test and restarted Apache. This closed all the connections. The bottom left graph shows the number of statements being executed by the database per second, with some spikes for background database activity and some variability for this small test. Each extra batch of users adds another 'step' of load to the system. Looking at the top right Server Process graph shows the database server processes doing the query work for each web user. As user load is added, the DRCP server pool increases (in green). The pool is initially at its default size 4 and quickly ramps up to about (I'm guessing) 35. At tick time 100 the pool increases to my configured maximum of 40 processes. Those 40 processes are doing the query work for all 300 web users. When I stopped the test at tick 110, the pooled processes remained open waiting for more users to connect. If I had left the test quiet for the DRCP 'inactivity_timeout' period (300 seconds by default), the pool would have shrunk back to 4 processes. Looking at the bottom right, you can see the amount of memory being consumed by the database. During the initial quiet period about 500M of memory was in use. The absolute number is just an indication of my particular DB configuration. As the number of pooled processes increases, each process needs more memory. You can see the shape of the memory graph echoes the Server Process graph above it. Each of the 300 web users will also need a few kilobytes but this is almost too small to see on the graph. Non-pooled connections Compare the DRCP case with using 'dedicated server' processes. At tick 140 I started 100 web users who did not use pooled connections: $c = oci_pconnect('phpdemo', 'welcome', 'myhost/orcl'); This connection string change is the only difference between the two tests. At ticks 155 and 165 I started two more batches of 100 simulated users each. At about tick 195 I stopped the user load but left Apache running. Apache then gradually returned to its quiescent state, killing idle httpd processes and producing the downward slope at the right of the graphs as the persistent database connection in each Apache process was closed. The Executions per Second graph on the bottom left shows the same step increases as for the earlier DRCP case. The database is handling this load. But look at the number of Server processes on the top right graph. There is now a one-to-one correspondence between Apache/PHP processes and DB server processes. Each PHP processes has one DB server processes dedicated to it. Hence the term 'dedicated server'. The memory required on the database is proportional to all those database server processes started. Almost all my system's memory was consumed. I doubt it would have coped with any more user load. Summary Oracle Database 11g DRCP connection pooling significantly reduces database host memory requirements allow more system memory to be allocated for the SGA and allowing the system to scale to handled thousands of concurrent PHP users. Even for small systems, using DRCP allows more web users to be active. More information about PHP and DRCP can be found in the PHP Scalability and High Availability chapter of The Underground PHP and Oracle Manual.

Read the article
How can I find out which processes on my computer are accessing the microphone?

- by Vinayak

After reading this interesting Lifehacker post and reading the comments on the page, one person was wondering if it would be possible to use the Physical Device Object Names of other hardware such as the microphone to find out the names of processes using that device. I tried the same approach, but so far it only seems to work for the webcam. Is there any other way I could get this to work in Process Explorer? UPDATE: The Lifehacker post was about finding out which Windows process is currently using your webcam. This is how they went about doing it: Start Device Manager (WIN+R → "devmgmt.msc" → OK) Find your webcam among the list of devices (check under Imaging Devices) Open the properties window of the device and switch to the Details tab (Right click → Properties → Details) In the dropdown menu, select Physical Device Object Name and copy the string(Right click → Copy) Download Process Explorer Make sure you have opened Process Explorer in Administrator Mode(File → Show Details for All Processes) Hit CTRL+F and enter the string you copied earlier(it should be something like \Device\000000XX) Hit the Search button and you should see a list of processes using the webcam(if there are any)

Read the article
Does IIS Sometimes Allocate More Worker Processes Than Configured?

- by Paul Williams

We have an IIS 7.5 web service on Windows Server 2008 that handles WCF requests from C# clients. This service is configured to have Maximum Worker Processes = 1, so it is not a web garden. IIS is setup to recycle itself at the same time every day (3 AM). I am trying to debug gnarly connection issues, so I wanted to be sure the application pool was not recycling itself. I configured the pool to log an event when it recycles itself. To my surprise, I see the following entries in the System event log: Level: Information Date/Time: 3/23/2012 3:00:00 AM - Source: WAS - Event ID: 5076 A worker process with process id of '6636' serving application pool 'MyAppPool' has requested a recycle because it reached its scheduled recycle time. Level: Information Date/Time: 3/23/2012 2:59:39 AM - Source: WAS - Event ID: 5076 A worker process with process id of '9364' serving application pool 'MyAppPool' has requested a recycle because it reached its scheduled recycle time. IIS is correctly recycling the application pool at 3 AM. However, I do not understand why I would be getting two recycle events in the log within a few seconds of each other. The maximum number of processes is 1. Does IIS sometimes allocate multiple processes for an application pool that is specified as having 1 process? -- edit -- I connected at about 4 PM today and only saw 1 w3wp.exe process. There are no other event log entries that would indicate a crash.

Read the article
How can see what processes makes my server slow?

- by Steven

All my websites on my server are extremely slow or not loading at all. Even server admin (Plesk) will not load some times. There's been no changes to the sites for the last coupple of months. How can I see what processes is making my server slow? My environment looks like this: Server: VPS running Linux 2.8.x OS: Centos 5 Manage interface: Plesk 9.x Memmory: 1024MB CPU: 2.2GHz My websites run on PHP and MySQL. I finally managed to telnet (Putty + SSH) in to my server. Running top did not show any processes using more than max 2% CPU and none were using exesive memmory. I also got a friend to install a program that checks the core files, and all seemed fine. So I'm leaning towards network issues or some other server malfunction. But I'm not able to find out what can be wrong. Here are some answers to Sean Kimball: I don't run mail services on my server yet There are noe specific bandwidth peaks. Prefork looks like this <IfModule prefork.c> StartServers 8 MinSpareServers 5 MaxSpareServers 20 ServerLimit 256 MaxClients 256 MaxRequestsPerChild 4000 </IfModule> Not sure what you mean with DNS question. But I think it's up and running. There are no processes running wild Where can I find avarage load? Telnet is disabled and I have to log in using SSH :)

Read the article
After forced quit, “killall Finder” says “No matching processes…” but PID still exists?

- by Old McStopher

Here's one for ya. Upon a forced quit of the Finder with unsuccessful relaunch, "killall Finder" in terminal returns: "No matching processes belonging to you were found" Oddly enough, the PID for finder does actually show up after a "ps -A" to reveal all processes. But the time is perpetually listed as 0:00:00, upon repeated PID listings. I tried the following to manually launch it: open /System/Library/CoreServices/Finder.app But it puked: LSOpenFromURLSpec() failed with error -600 for the file /System/Library/CoreServices/Finder.app. Any other ideas on a Finder relaunch that don't involve rebooting? (I usually have 6 spaces open at once, each with a handful of apps and it's a pain reloading them all.)

Read the article
One of my apache processes is huge - how can I find out why?

- by Malcolm Box

I'm running Apache 2.2.12 with mod_wsgi, hosting a Django site. Most of the apache child processes weigh in at about 125MB RSS, but occasionally I see one child balloon to 1GB RSS. At this point there's usually 1 huge process (1GB), a couple of large ones (500MB) and the rest are still ~125MB. These are the mod_wsgi daemon processes. I've tried using memory leak tracing in Python to see if it's the Django code, and I see no leaks. Looking in the logs doesn't show any particularly strange requests. I'm stumped on how to figure out what's causing this - any ideas? Also, any workaround ways to kill the large apache process when it gets too big, without bringing apache down? Some more details: Not using mod_php Using pre-fork

Read the article
Linux server apache httpd processes take i/o wait to close to 100% and lock down server

- by user3682065

For about 5 days now, and seemingly out of the blue, my linux server has started locking up from time to time. The pattern is always the same as far as I can tell from top and iotop commands around the time it starts happening: One or more httpd processes (usually one) hang and start using up 100% of CPU power, the %wa goes close to 100% and in the iotop I see several httpd processes with 99.99% in the IO column. I'm also running an SVN server on this machine through apache and the one way that I've been consistently able to reproduce this is to do an SVN commit of new files or an SVN update from the repository on this server (I am the only one using this SVN repository). This will always reproduce this scenario successfully, but until very recently I had no problems at all checking in/out of SVN. But sometimes it just happens for no detectable reason at all it seems. So it seems like there is some issue with my Apache that leads it to have processes use up a lot of read/write upon certain triggers. I was wondering if anyone could help me uncover that issue. EDIT: OK now it's happening again: This is top: [root@server ~]# top top - 10:56:54 up 2:59, 5 users, load average: 171.46, 70.35, 27.01 Tasks: 328 total, 2 running, 326 sleeping, 0 stopped, 0 zombie Cpu(s): 1.9%us, 2.0%sy, 0.0%ni, 0.0%id, 96.1%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 2021144k total, 1968192k used, 52952k free, 2500k buffers Swap: 4194288k total, 2938584k used, 1255704k free, 39008k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 10390 apache 20 0 2774m 936m 6200 D 2.0 47.4 1:52.27 httpd 2149 root 20 0 927m 13m 1040 S 0.7 0.7 1:50.46 namecoind 11 root 20 0 0 0 0 R 0.3 0.0 0:30.10 events/0 23 root 20 0 0 0 0 S 0.3 0.0 0:17.88 kblockd/1 2049 root 20 0 382m 4932 2880 D 0.3 0.2 0:03.67 httpd 2144 root 20 0 1702m 69m 1164 S 0.3 3.5 5:19.68 bitcoind 6325 root 20 0 15164 1100 656 R 0.3 0.1 0:11.09 top 10311 apache 20 0 387m 9496 7320 D 0.3 0.5 0:01.89 httpd 10313 apache 20 0 391m 10m 7364 D 0.3 0.5 0:02.40 httpd 10466 apache 20 0 399m 12m 7392 D 0.3 0.7 0:02.41 httpd 10599 apache 20 0 391m 9324 7340 D 0.3 0.5 0:00.15 httpd 10628 apache 20 0 384m 7620 4052 D 0.3 0.4 0:00.01 httpd 10633 apache 20 0 384m 7048 3504 D 0.3 0.3 0:00.01 httpd 10634 apache 20 0 384m 8012 4048 D 0.3 0.4 0:00.02 httpd 10638 apache 20 0 400m 22m 9.8m D 0.3 1.1 0:01.93 httpd 10640 apache 20 0 385m 8288 4028 D 0.3 0.4 0:00.03 httpd 10641 apache 20 0 401m 21m 6376 D 0.3 1.1 0:01.45 httpd 10759 apache 20 0 385m 8816 3480 D 0.3 0.4 0:01.45 httpd 10773 apache 20 0 384m 8044 3464 D 0.3 0.4 0:00.02 httpd This is an iotop snapshot: Total DISK READ: 5.93 M/s | Total DISK WRITE: 0.00 B/s TID PRIO USER DISK READ DISK WRITE SWAPIN IO> COMMAND 10732 be/4 apache 3.76 K/s 0.00 B/s 0.00 % 58.48 % httpd 876 be/3 root 0.00 B/s 52.68 K/s 0.00 % 52.98 % [jbd2/dm-1-8] 10906 be/4 root 124.17 K/s 0.00 B/s 0.00 % 23.03 % sh -c [ -x /usr/local/psa/admin/sbin/backupmng ] && /usr/local/psa/admin/sbin/backupmng >/dev/null 2>&1 2156 be/4 root 206.94 K/s 0.00 B/s 0.00 % 21.15 % bitcoind 10904 be/4 mysql 0.00 B/s 0.00 B/s 0.00 % 18.94 % mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock 10773 be/4 apache 7.53 K/s 0.00 B/s 0.00 % 14.77 % httpd 10641 be/4 apache 15.05 K/s 0.00 B/s 0.00 % 11.57 % httpd 10399 be/4 apache 1057.29 K/s 0.00 B/s 43.16 % 10.56 % httpd 10682 be/4 sw-cp-se 158.03 K/s 0.00 B/s 0.00 % 7.45 % sw-engine-cgi -c /usr/local/psa/admin/conf/php.ini -d auto_prepend_file=auth.php3 -u psaadm 10774 be/4 apache 3.76 K/s 0.00 B/s 0.00 % 6.53 % httpd 10624 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 5.53 % httpd 10356 be/4 apache 899.26 K/s 0.00 B/s 35.52 % 4.01 % httpd 10795 be/4 apache 0.00 B/s 0.00 B/s 0.00 % 3.93 % httpd 10804 be/4 apache 7.53 K/s 0.00 B/s 0.00 % 3.08 % httpd 4379 be/4 root 2.89 M/s 0.00 B/s 99.99 % 0.00 % namecoind 10619 be/4 apache 462.80 K/s 0.00 B/s 7.80 % 0.00 % httpd 10636 be/4 apache 3.76 K/s 0.00 B/s 0.00 % 0.00 % httpd 10716 be/4 mysql 105.35 K/s 0.00 B/s 5.92 % 0.00 % mysqld --basedir=/usr --datadir=/var/lib/mysql --user=mysql --log-error=/var/log/mysqld.log --pid-file=/var/run/mysqld/mysqld.pid --socket=/var/lib/mysql/mysql.sock 1988 be/4 root 18.81 K/s 0.00 B/s 0.00 % 0.00 % spamd_full.sock I also ran lsof -p for pid 10390 which was way up top under the top command and this is the bottom line where I can sort of see what request this was and it says CLOSE_WAIT: httpd 10390 apache 34u IPv6 315879 0t0 TCP default-domain.com:https->crawl-66-249-65-91.googlebot.com:42907 (CLOSE_WAIT) I'm still not sure what exactly is causing this all to happen though? I killed that service but %wa and load average remain high, I also stopped mysqld and other services. It really only goes down once I stop httpd altogether, and even then I can't start it without finding remaining hanging httpd processes via "netstat -tulpn", killing those or doing "killall -9 httpd" and after waiting a while for it to cycle through all those then doing /etc/init.d/httpd start

Read the article
When a python process is killed on OSX, why doesn't it kill the child processes?

- by Hugh

I found myself getting very confused a while back by some changes that I found when moving Python scripts from Linux over to OSX... On Linux, if a python script has called os.system(), and the calling process is killed, the called process will be killed at the same time. On OSX, however, if the main process is killed, anything that it launched is left behind. Is there something somewhere in OSX/Python where I can change this behaviour? This is causing problems on our render farm, where the processes can be killed from the management GUI, but the top level process is really just a wrapper, so, while the render farm management might think that the process has gone and the machine is freed up for another task, the actual processor-intensive task is still running, which can lead to huge blockages. I know that I could write more logic to catch the kill signal and pass it on to the child processes, but I was hoping that it might be something that could be enabled at a lower level.

Read the article
A button to set all processes to on-hold for Linux?

- by fuenfundachtzig

When Linux starts swapping you're basically doomed. Very soon the system won't react to any input any more, but happily swap on until the end of days... Can you think of a command that holds all processes whatsoever, thus (and while) allowing you to open a clean shell where you can examine the source of the problem and kill the process which ate up all the memory? (I guess this won't be easy, because as the memory is probably completely filled up you'd need to swap out some more memory to gather space for opening a shell, on the other hand all other swapping processes must be stopped.) If you tied such a command to a hot key then maybe you can use this as an emergency button saving you a lot a time. Any ideas if this is possible at all? Has somebody tried something like this before? If one could realize this it would be a cool feature :)

Read the article
Worker processes not starting in IIS 7.5. What should I check?

- by locster

I have a Windows 7 machine (Windows version 6.1.7601 SP1 Build 7601) with IIS installed. At some point the installation appears to have become 'corrupted' in some way, as any requests are now met with the message: Service Unavailable HTTP Error 503. The service is unavailable. In IIS manager IIS is started and the app pool I am using reports itself as 'Started', yet there is no w3wp.exe process listed in the process list in task manager (I am a local admin and have clicked the 'Show processes from all users' button. I have enabled logging for the web site (at default location of %SystemDrive%\inetpub\logs\LogFiles), but this folder is empty. I am assuming that this log output is written by w3wp.exe as it handles requests (no w3wp.exe, no log file?). Presumably there is another layer of request handling that is responsible for starting the worker processes, does thsi layer have log files I can check, and/or can I uninstall/re-install that layer? Thanks.

Read the article
How to configure Nginx to serve a variety of back-ends via multiple FCGI processes?

- by Ben Horton

I've seen a lot of tutorials showing one how to set up PHP/Python/Perl/RoR on nginx via various FCGI processes. None of the tutorials that I found show one how to serve multiple FCGI services off one server. How would one configure the stable nginx (nginx-0.7.64) to serve multiple FCGI processes (one for each of the above languages)? Example addresses for each FCGI process are as follows: 127.0.0.1:8080 - PHP 127.0.0.1:8081 - Python 127.0.0.1:8082 - Perl 127.0.0.1:8083 - Ruby on Rails An example configuration file that shows one how to implement multiple FCGI's off one server is really what I need. Perhaps others will benefit as well.

Read the article
Is there a difference in page fault rates between CPU bound and I/O bound processes?

- by user198864

I was thinking, should there be any difference in expectation of the page fault rate on CPU-bound vs I/O bound processes? At first I thought maybe we could, since CPU-bound processes would likely be using more memory accesses per time quantum, so I expect it would move from locality to locality faster. At the same time, the CPU-bound process is probably given a larger working set... but this doesn't affect the fault overhead as it hits a new locality IF this wasn't pre-paged in. Is there actually any real difference in the page fault rates or am I just musing about something nonexistent? And if there is, how would it impact a real-world OS like linux?

Read the article
What sysadmin must do to run OS with damaged /lib/libc.so file ? / rsyslogd daemon logrotation / deny checking list of running processes

- by Virtual_Lotos

What sysadmin must do to run OS with damaged /lib/libc.so file ? In other words, how command interpreter should be configured to be able to run system with corrupted /lib/libc.so file ? Do I have to move it to /var catalog ? Does the command interpreter must be statically compiled or have setuid attribute or perhaps must be a symbolic link to /bin/sh or must be no larger than 2MB ? How to prevent a user from checking list of processes started by another user ? How do I forbid a user to see which processes are running by another user ? What do I have to keep in mind when I want to make rsyslogd daemon logrotation ?

Read the article
How do I see which processes have open TCP/IP ports in Mac OS X?

- by Zubair

How do I see which processes have open TCP/IP ports in Mac OS X?

Read the article
How to keep memory consumption below 500 MB or less than 25 processes at background on netbook?

- by overmann

I bought a netbook yesterday, (I'm loving it) but I will never understand why they need to be a lot of processes running on background. I worry about other users who have no idea about it and continue using their computers with occasional choppiness due to 70 processes on background occupying most of the memory I'd like to keep my memory consumption below 500MB (I have 1 GB) is this possible? What are your ideas for this to work? I always run Microsoft Security Essentials at startup and real time protection, how many features can I disable to reach my goal memory usage?

Read the article
Application running as a service is not able to create the same number of processes as when it runs

- by Pini Reznik

I have a Windows application which creates up to 35 processes and it's working OK when it's running from cmd. But when it is executed as a service on the same machine it is able to create only 20 processes and all other are killed because of some kind of resource exhaustion problem. The problem is persistent on one Windows 2003 server but not reproducible on other servers. Can it be because the system has run out of desktop heap? http://support.microsoft.com/kb/184802 How can I check it?

Read the article
What does it mean for an OS to "execute within user processes"? Do any modern OS's use that approach

- by Chris Cooper

I have recently become interested in operating system, and a friend of mine lent me a book called Operating Systems: Internals and Design Principles (I have the third edition), published in 1998. It's been a very interesting book so far, but I have come to the part dealing with process control, and it's using UNIX System V as one of its examples of an operating system that executes within user processes. This concept has struck me as a little strange. First of all, does this mean that OS instructions and data are stored in each user of the processes? Probably not, because that would be an absurdly redundant scheme. But if not, then what does it mean to "execute within" a user process? Do any modern operating systems use this approach? It seems much more logical to have the operating system execute as its own process, or even independently of all processes, if you're short on memory. All the inter-accessiblilty of process data required for this layout seems to greatly complicate things. (But maybe that's just because I don't quite get the concept ;D) Here is what the book says: "Execution within User Processes: An alternative that is common with operation systems on smaller machines is to execute virtually all operating system software in the context of a user process. ... "

Read the article
Oracle Support Master Note for Troubleshooting Advanced Queuing and Oracle Streams Propagation Issues (Doc ID 233099.1)

- by faye.todd(at)oracle.com

Master Note for Troubleshooting Advanced Queuing and Oracle Streams Propagation Issues (Doc ID 233099.1) Copyright (c) 2010, Oracle Corporation. All Rights Reserved. In this Document Purpose Last Review Date Instructions for the Reader Troubleshooting Details 1. Scope and Application 2. Definitions and Classifications 3. How to Use This Guide 4. Basic AQ Propagation Troubleshooting 5. Additional Troubleshooting Steps for AQ Propagation of User-Enqueued and Dequeued Messages 6. Additional Troubleshooting Steps for Propagation in an Oracle Streams Environment 7. Performance Issues References Applies to: Oracle Server - Enterprise Edition - Version: 8.1.7.0 to 11.2.0.2 - Release: 8.1.7 to 11.2Information in this document applies to any platform. Purpose This document presents a step-by-step methodology for troubleshooting and resolving problems with Advanced Queuing Propagation in both Streams and basic Advanced Queuing environments. It also serves as a master reference for other more specific notes on Oracle Streams Propagation and Advanced Queuing Propagation issues. Last Review Date December 20, 2010 Instructions for the Reader A Troubleshooting Guide is provided to assist in debugging a specific issue. When possible, diagnostic tools are included in the document to assist in troubleshooting. Troubleshooting Details 1. Scope and Application This note is intended for Database Administrators of Oracle databases where issues are being encountered with propagating messages between advanced queues, whether the queues are used for user-created messaging systems or for Oracle Streams. It contains troubleshooting steps and links to notes for further problem resolution.It can also be used a template to document a problem when it is necessary to engage Oracle Support Services. Knowing what is NOT happening can frequently speed up the resolution process by focusing solely on the pertinent problem area. This guide is divided into five parts: Section 2: Definitions and Classifications (discusses the different types and features of propagations possible - helpful for understanding the rest of the guide) Section 3: How to Use this Guide (to be used as a start part for determining the scope of the problem and what sections to consult) Section 4. Basic AQ propagation troubleshooting (applies to both AQ propagation of user enqueued and dequeued messages as well as Oracle Streams propagations) Section 5. Additional troubleshooting steps for AQ propagation of user enqueued and dequeued messages Section 6. Additional troubleshooting steps for Oracle Streams propagation Section 7. Performance issues 2. Definitions and Classifications Given the potential scope of issues that can be encountered with AQ propagation, the first recommended step is to do some basic diagnosis to determine the type of problem that is being encountered. 2.1. What Type of Propagation is Being Used? 2.1.1. Buffered Messaging For an advanced queue, messages can be maintained on disk (persistent messaging) or in memory (buffered messaging). To determine if a queue is buffered or not, reference the GV_$BUFFERED_QUEUES view. If the queue does not appear in this view, it is persistent. 2.1.2. Propagation mode - queue-to-dblink vs queue-to-queue As of 10.2, an AQ propagation can also be defined as queue-to-dblink, or queue-to-queue: queue-to-dblink: The propagation delivers messages or events from the source queue to all subscribing queues at the destination database identified by the dblink. A single propagation schedule is used to propagate messages to all subscribing queues. Hence any changes made to this schedule will affect message delivery to all the subscribing queues. This mode does not support multiple propagations from the same source queue to the same target database. queue-to-queue: Added in 10.2, this propagation mode delivers messages or events from the source queue to a specific destination queue identified on the database link. This allows the user to have fine-grained control on the propagation schedule for message delivery. This new propagation mode also supports transparent failover when propagating to a destination Oracle RAC system. With queue-to-queue propagation, you are no longer required to re-point a database link if the owner instance of the queue fails on Oracle RAC. This mode supports multiple propagations to the same target database if the target queues are different. The default is queue-to-dblink. To verify if queue-to-queue propagation is being used, in non-Streams environments query DBA_QUEUE_SCHEDULES.DESTINATION - if a remote queue is listed along with the remote database link, then queue-to-queue propagation is being used. For Streams environments, the DBA_PROPAGATION.QUEUE_TO_QUEUE column can be checked.See the following note for a method to switch between the two modes:Document 827473.1 How to alter propagation from queue-to-queue to queue-to-dblink 2.1.3. Combined Capture and Apply (CCA) for Streams In 11g Oracle Streams environments, an optimization called Combined Capture and Apply (CCA) is implemented by default when possible. Although a propagation is configured in this case, Streams does not use it; instead it passes information directly from capture to an apply receiver. To see if CCA is in use: COLUMN CAPTURE_NAME HEADING 'Capture Name' FORMAT A30COLUMN OPTIMIZATION HEADING 'CCA Mode?' FORMAT A10SELECT CAPTURE_NAME, DECODE(OPTIMIZATION,0, 'No','Yes') OPTIMIZATIONFROM V$STREAMS_CAPTURE; Also, see the following note:Document 463820.1 Streams Combined Capture and Apply in 11g 2.2. Queue Table Compatibility There are three types of queue table compatibility. In more recent databases, queue tables may be present in all three modes of compatibility: 8.0 - earliest version, deprecated in 10.2 onwards 8.1 - support added for RAC, asynchronous notification, secure queues, queue level access control, rule-based subscribers, separate storage of history information 10.0 - if the database is in 10.1-compatible mode, then the default value for queue table compatibility is 10.0 2.3. Single vs Multiple Consumer Queue Tables If more than one recipient can dequeue a message from a queue, then its queue table is multiple consumer. You can propagate messages from a multiple-consumer queue to a single-consumer queue. Propagation from a single-consumer queue to a multiple-consumer queue is not possible. 3. How to Use This Guide 3.1. Are Messages Being Propagated at All, or is the Propagation Just Slow? Run the following query on the source database for the propagation (assuming that it is running): select TOTAL_NUMBER from DBA_QUEUE_SCHEDULES where QNAME='<source_queue_name>'; If TOTAL_NUMBER is increasing, then propagation is most likely functioning, although it may be slow. For performance issues, see Section 7. 3.2. Propagation Between Persistent User-Created Queues See Sections 4 and 5 (and optionally Section 6 if performance is an issue). 3.3. Propagation Between Buffered User-Created Queues See Sections 4, 5, and 6 (and optionally Section 7 if performance is an issue). 3.4. Propagation between Oracle Streams Queues (without Combined Capture and Apply (CCA) Optimization) See Sections 4 and 6 (and optionally Section 7 if performance is an issue). 3.5. Propagation between Oracle Streams Queues (with Combined Capture and Apply (CCA) Optimization) Although an AQ propagation is not used directly in this case, some characteristics of the message transfer are inferred from the propagation parameters used. Some parts of Sections 4 and 6 still apply. 3.6. Messaging Gateway Propagations This note does not apply to Messaging Gateway propagations. 4. Basic AQ Propagation Troubleshooting 4.1. Double-check Your Code Make sure that you are consistent in your usage of the database link(s) names, queue names, etc. It may be useful to plot a diagram of which queues are connected via which database links to make sure that the logical structure is correct. 4.2. Verify that Job Queue Processes are Running 4.2.1. Versions 10.2 and Lower - DBA_JOBS Package For versions 10.2 and lower, a scheduled propagation is managed by DBMS_JOB package. The propagation is performed by job queue process background processes. Therefore we need to verify that there are sufficient processes available for the propagation process. We should have at least 4 job queue processes running and preferably more depending on the number of other jobs running in the database. It should be noted that for AQ specific work, AQ will only ever use half of the job queue processes available.An issue caused by an inadequate job queue processes parameter setting is described in the following note:Document 298015.1 Kwqjswproc:Excep After Loop: Assigning To Self 4.2.1.1. Job Queue Processes in Initalization Parameter File The parameter JOB_QUEUE_PROCESSES in the init.ora/spfile should be > 0. The value can be changed dynamically via connect / as sysdbaalter system set JOB_QUEUE_PROCESSES=10; 4.2.1.2. Job Queue Processes in Memory The following command will show how many job queue processes are currentlyin use by this instance (this may be different than what is in the init.ora/spfile): connect / as sysdbashow parameter job; 4.2.1.3. OS PIDs Corresponding to Job Queue Processes Identify the operating system process ids (spids) of job queue processes involved in propagation via select p.SPID, p.PROGRAM from V$PROCESS p, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j where s.SID=jr.SID and s.PADDR=p.ADDR and jr.JOB=j.JOBand j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%'; and these SPIDs can be used to check at the operating system level that they exist.In 8i a job queue process will have a name similar to: ora_snp1_<instance_name>.In 9i onwards you will see a coordinator process: ora_cjq0_ and multiple slave processes: ora_jnnn_<instance_name>, where nnn is an integer between 1 and 999. 4.2.2. Version 11.1 and Above - Oracle Scheduler In version 11.1 and above, Oracle Scheduler is used to perform AQ and Streams propagations. Oracle Scheduler automatically tunes the number of slave processes for these jobs based on the load on the computer system, and the JOB_QUEUE_PROCESSES initialization parameter is only used to specify the maximum number of slave processes. Therefore, the JOB_QUEUE_PROCESSES initialization parameter does not need to be set (it defaults to a very high number), unless you want to limit the number of slaves that can be created. If JOB_QUEUE_PROCESSES = 0, no propagation jobs will run.See the following note for a discussion of Oracle Streams 11g and Oracle Scheduler:Document 1083608.1 11g Streams and Oracle Scheduler 4.2.2.1. Job Queue Processes in Initalization Parameter File The parameter JOB_QUEUE_PROCESSES in the init.ora/spfile should be > 0, and preferably be left at its default value. The value can be changed dynamically via connect / as sysdbaalter system set JOB_QUEUE_PROCESSES=10; To set the JOB_QUEUE_PROCESSES parameter to its default value, run: connect / as sysdbaalter system reset JOB_QUEUE_PROCESSES; and then bounce the instance. 4.2.2.2. Job Queue Processes in Memory The following command will show how many job queue processes are currently in use by this instance (this may be different than what is in the init.ora/spfile): connect / as sysdbashow parameter job; 4.2.2.3. OS PIDs Corresponding to Job Queue Processes Identify the operating system process ids (SPIDs) of job queue processes involved in propagation via col PROGRAM for a30select p.SPID, p.PROGRAM, j.JOB_namefrom v$PROCESS p, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j where s.SID=jr.SESSION_ID and s.PADDR=p.ADDRand jr.JOB_name=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%'; and these SPIDs can be used to check at the operating system level that they exist.You will see a coordinator process: ora_cjq0_ and multiple slave processes: ora_jnnn_<instance_name>, where nnn is an integer between 1 and 999. 4.3. Check the Alert Log and Any Associated Trace Files The first place to check for propagation failures is the alert logs at all sites (local and if relevant all remote sites). When a job queue process attempts to execute a schedule and fails it will always write an error stack to the alert log. This error stack will also be written in a job queue process trace file, which will be written to the BACKGROUND_DUMP_DEST location for 10.2 and below, and in the DIAGNOSTIC_DEST location for 11g. The fact that errors are written to the alert log demonstrates that the schedule is executing. This means that the problem could be with the set up of the schedule. In this example the ORA-02068 demonstrates that the failure was at the remote site. Further investigation revealed that the remote database was not open, hence the ORA-03114 error. Starting the database resolved the problem. Thu Feb 14 10:40:05 2002 Propagation Schedule for (AQADM.MULTIPLEQ, SHANE816.WORLD) encountered following error:ORA-04052: error occurred when looking up Remote object [email protected]: error occurred at recursive SQL level 4ORA-02068: following severe error from SHANE816ORA-03114: not connected to ORACLEORA-06512: at "SYS.DBMS_AQADM_SYS", line 4770ORA-06512: at "SYS.DBMS_AQADM", line 548ORA-06512: at line 1 Other potential errors that may be written to the alert log can be found in the following notes:Document 827184.1 AQ Propagation with CLOB data types Fails with ORA-22990 (11.1)Document 846297.1 AQ Propagation Fails : ORA-00600[kope2upic2954] or Ora-00600[Kghsstream_copyn] (10.2, 11.1)Document 731292.1 ORA-25215 Reported on Local Propagation When Using Transformation with ANYDATA queue tables (10.2, 11.1, 11.2)Document 365093.1 ORA-07445 [kwqppay2aqe()+7360] Reported on Propagation of a Transformed Message (10.1, 10.2)Document 219416.1 Advanced Queuing Propagation Fails with ORA-22922 (9.0)Document 1203544.1 AQ Propagation Aborted with ORA-600 [ociksin: invalid status] on SYS.DBMS_AQADM_SYS.AQ$_PROPAGATION_PROCEDURE After Upgrade (11.1, 11.2)Document 1087324.1 ORA-01405 ORA-01422 reported by Advanced Queuing Propagation schedules after RAC reconfiguration (10.2)Document 1079577.1 Advanced Queuing Propagation Fails With "ORA-22370 incorrect usage of method" (9.2, 10.2, 11.1, 11.2)Document 332792.1 ORA-04061 error relating to SYS.DBMS_PRVTAQIP reported when setting up Statspack (8.1, 9.0, 9.2, 10.1)Document 353325.1 ORA-24056: Internal inconsistency for QUEUE <queue_name> and destination <dblink> (8.1, 9.0, 9.2, 10.1, 10.2, 11.1, 11.2)Document 787367.1 ORA-22275 reported on Propagating Messages with LOB component when propagating between 10.1 and 10.2 (10.1, 10.2)Document 566622.1 ORA-22275 when propagating >4K AQ$_JMS_TEXT_MESSAGEs from 9.2.0.8 to 10.2.0.1 (9.2, 10.1)Document 731539.1 ORA-29268: HTTP client error 401 Unauthorized Error when the AQ Servlet attempts to Propagate a message via HTTP (9.0, 9.2, 10.1, 10.2, 11.1)Document 253131.1 Concurrent Writes May Corrupt LOB Segment When Using Auto Segment Space Management (ORA-1555) (9.2)Document 118884.1 How to unschedule a propagation schedule stuck in pending stateDocument 222992.1 DBMS_AQADM.DISABLE_PROPAGATION_SCHEDULE Returns ORA-24082Document 282987.1 Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueDocument 1204080.1 AQ Propagation Failing With ORA-25329 After Upgraded From 8i or 9i to 10g or 11g.Document 1233675.1 AQ Propagation stops after upgrade to 11.2.0.1 ORA-30757 4.3.1. Errors Related to Incorrect Network Configuration The most common propagation errors result from an incorrect network configuration. The list below contains common errors caused by tnsnames.ora file or database links being configured incorrectly: - ORA-12154: TNS:could not resolve service name- ORA-12505: TNS:listener does not currently know of SID given in connect descriptor- ORA-12514: TNS:listener could not resolve SERVICE_NAME - ORA-12541: TNS-12541 TNS:no listener 4.4. Check the Database Links Exist and are Functioning Correctly For schedules to remote databases confirm the database link exists via. SQL> col DBLINK for a45SQL> select QNAME, NVL(REGEXP_SUBSTR(DESTINATION, '[^@]+', 1, 2), DESTINATION) dblink2 from DBA_QUEUE_SCHEDULES3 where MESSAGE_DELIVERY_MODE = 'PERSISTENT';QNAME DBLINK------------------------------ ---------------------------------------------MY_QUEUE ORCL102B.WORLD Connect as the owner of the link and select across it to verify it works and connects to the database we expect. i.e. select * from ALL_QUEUES@ ORCL102B.WORLD; You need to ensure that the userid that scheduled the propagation (using DBMS_AQADM.SCHEDULE_PROPAGATION or DBMS_PROPAGATION_ADM.CREATE_PROPAGATION if using Streams) has access to the database link for the destination. 4.5. Has Propagation Been Correctly Scheduled? Check that the propagation schedule has been created and that a job queue process has been assigned. Look for the entry in DBA_QUEUE_SCHEDULES and SYS.AQ$_SCHEDULES for your schedule. For 10g and below, check that it has a JOBNO entry in SYS.AQ$_SCHEDULES, and that there is an entry in DBA_JOBS with that JOBNO. For 11g and above, check that the schedule has a JOB_NAME entry in SYS.AQ$_SCHEDULES, and that there is an entry in DBA_SCHEDULER_JOBS with that JOB_NAME. Check the destination is as intended and spelled correctly. SQL> select SCHEMA, QNAME, DESTINATION, SCHEDULE_DISABLED, PROCESS_NAME from DBA_QUEUE_SCHEDULES;SCHEMA QNAME DESTINATION S PROCESS------- ---------- ------------------ - -----------AQADM MULTIPLEQ AQ$_LOCAL N J000 AQ$_LOCAL in the destination column shows that the queue to which we are propagating to is in the same database as the source queue. If the propagation was to a remote (different) database, a database link will be in the DESTINATION column. The entry in the SCHEDULE_DISABLED column, N, means that the schedule is NOT disabled. If Y (yes) appears in this column, propagation is disabled and the schedule will not be executed. If not using Oracle Streams, propagation should resume once you have enabled the schedule by invoking DBMS_AQADM.ENABLE_PROPAGATION_SCHEDULE (for 10.2 Oracle Streams and above, the DBMS_PROPAGATION_ADM.START_PROPAGATION procedure should be used). The PROCESS_NAME is the name of the job queue process currently allocated to execute the schedule. This process is allocated dynamically at execution time. If the PROCESS_NAME column is null (empty) the schedule is not currently executing. You may need to execute this statement a number of times to verify if a process is being allocated. If a process is at some time allocated to the schedule, it is attempting to execute. SQL> select SCHEMA, QNAME, LAST_RUN_DATE, NEXT_RUN_DATE from DBA_QUEUE_SCHEDULES;SCHEMA QNAME LAST_RUN_DATE NEXT_RUN_DATE------ ----- ----------------------- ----------------------- AQADM MULTIPLEQ 13-FEB-2002 13:18:57 13-FEB-2002 13:20:30 In 11g, these dates are expressed in TIMESTAMP WITH TIME ZONE datatypes. If the NEXT_RUN_DATE and NEXT_RUN_TIME columns are null when this statement is executed, the scheduled propagation is currently in progress. If they never change it would suggest that the schedule itself is never executing. If the next scheduled execution is too far away, change the NEXT_TIME parameter of the schedule so that schedules are executed more frequently (assuming that the window is not set to be infinite). Parameters of a schedule can be changed using the DBMS_AQADM.ALTER_PROPAGATION_SCHEDULE call. In 10g and below, scheduling propagation posts a job in the DBA_JOBS view. The columns are more or less the same as DBA_QUEUE_SCHEDULES so you just need to recognize the job and verify that it exists. SQL> select JOB, WHAT from DBA_JOBS where WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%';JOB WHAT---- ----------------- 720 next_date := sys.dbms_aqadm.aq$_propaq(job); For 11g, scheduling propagation posts a job in DBA_SCHEDULER_JOBS instead: SQL> select JOB_NAME from DBA_SCHEDULER_JOBS where JOB_NAME like 'AQ_JOB$_%';JOB_NAME------------------------------AQ_JOB$_41 If no job exists, check DBA_QUEUE_SCHEDULES to make sure that the schedule has not been disabled. For 10g and below, the job number is dynamic for AQ propagation schedules. The procedure that is executed to expedite a propagation schedule runs, removes itself from DBA_JOBS, and then reposts a new job for the next scheduled propagation. The job number should therefore always increment unless the schedule has been set up to run indefinitely. 4.6. Is the Schedule Executing but Failing to Complete? Run the following query: SQL> select FAILURES, LAST_ERROR_MSG from DBA_QUEUE_SCHEDULES;FAILURES LAST_ERROR_MSG------------ -----------------------1 ORA-25207: enqueue failed, queue AQADM.INQ is disabled from enqueueingORA-02063: preceding line from SHANE816 The failures column shows how many times we have attempted to execute the schedule and failed. Oracle will attempt to execute the schedule 16 times after which it will be removed from the DBA_JOBS or DBA_SCHEDULER_JOBS view and the schedule will become disabled. The column DBA_QUEUE_SCHEDULES.SCHEDULE_DISABLED will show 'Y'. For 11g and above, the DBA_SCHEDULER_JOBS.STATE column will show 'BROKEN' for the job corresponding to DBA_QUEUE_SCHEDULES.JOB_NAME. Prior to 10g the back off algorithm for failures was exponential, whereas from 10g onwards it is linear. The propagation will become disabled on the 17th attempt. Only the last execution failure will be reflected in the LAST_ERROR_MSG column. That is, if the schedule fails 5 times for 5 different reasons, only the last set of errors will be recorded in DBA_QUEUE_SCHEDULES. Any errors need to be resolved to allow propagation to continue. If propagation has also become disabled due to 17 failures, first resolve the reason for the error and then re-enable the schedule using the DBMS_AQADM.ENABLE_PROPAGATION_SCHEDULE procedure, or DBMS_PROPAGATION_ADM.START_PROPAGATION if using 10.2 or above Oracle Streams. As soon as the schedule executes successfully the error message entries will be deleted. Oracle does not keep a history of past failures. However, when using Oracle Streams, the errors will be retained in the DBA_PROPAGATION view even after the schedule resumes successfully. See the following note for instructions on how to clear out the errors from the DBA_PROPAGATION view:Document 808136.1 How to clear the old errors from DBA_PROPAGATION view?If a schedule is active and no errors are being reported then the source queue may not have any messages to be propagated. 4.7. Do the Propagation Notification Queue Table and Queue Exist? Check to see that the propagation notification queue table and queue exist and are enabled for enqueue and dequeue. Propagation makes use of the propagation notification queue for handling propagation run-time events, and the messages in this queue are stored in a SYS-owned queue table. This queue should never be stopped or dropped and the corresponding queue table never be dropped. 10g and belowThe propagation notification queue table is of the format SYS.AQ$_PROP_TABLE_n, where 'n' is the RAC instance number, i.e. '1' for a non-RAC environment. This queue and queue table are created implicitly when propagation is first scheduled. If propagation has been scheduled and these objects do not exist, try unscheduling and rescheduling propagation. If they still do not exist contact Oracle Support. SQL> select QUEUE_TABLE from DBA_QUEUE_TABLES2 where QUEUE_TABLE like '%PROP_TABLE%' and OWNER = 'SYS';QUEUE_TABLE------------------------------AQ$_PROP_TABLE_1SQL> select NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED2 from DBA_QUEUES where owner='SYS'3 and QUEUE_TABLE like '%PROP_TABLE%';NAME ENQUEUE DEQUEUE------------------------------ ------- -------AQ$_PROP_NOTIFY_1 YES YESAQ$_AQ$_PROP_TABLE_1_E NO NO If the AQ$_PROP_NOTIFY_1 queue is not enabled for enqueue or dequeue, it should be so enabled using DBMS_AQADM.START_QUEUE. However, the exception queue AQ$_AQ$_PROP_TABLE_1_E should not be enabled for enqueue or dequeue.11g and aboveThe propagation notification queue table is of the format SYS.AQ_PROP_TABLE, and is created when the database is created. If they do not exist, contact Oracle Support. SQL> select QUEUE_TABLE from DBA_QUEUE_TABLES2 where QUEUE_TABLE like '%PROP_TABLE%' and OWNER = 'SYS';QUEUE_TABLE------------------------------AQ_PROP_TABLESQL> select NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED2 from DBA_QUEUES where owner='SYS'3 and QUEUE_TABLE like '%PROP_TABLE%';NAME ENQUEUE DEQUEUE------------------------------ ------- -------AQ_PROP_NOTIFY YES YESAQ$_AQ_PROP_TABLE_E NO NO If the AQ_PROP_NOTIFY queue is not enabled for enqueue or dequeue, it should be so enabled using DBMS_AQADM.START_QUEUE. However, the exception queue AQ$_AQ$_PROP_TABLE_E should not be enabled for enqueue or dequeue. 4.8. Does the Remote Queue Exist and is it Enabled for Enqueueing? Check that the remote queue the propagation is transferring messages to exists and is enabled for enqueue: SQL> select DESTINATION from USER_QUEUE_SCHEDULES where QNAME = 'OUTQ';DESTINATION-----------------------------------------------------------------------------"AQADM"."INQ"@M2V102.ESSQL> select OWNER, NAME, ENQUEUE_ENABLED, DEQUEUE_ENABLED from [email protected];OWNER NAME ENQUEUE DEQUEUE-------- ------ ----------- -----------AQADM INQ YES YES 4.9. Do the Target and Source Database Charactersets Differ? If a message fails to propagate, check the database charactersets of the source and target databases. Investigate whether the same message can propagate between the databases with the same characterset or it is only a particular combination of charactersets which causes a problem. 4.10. Check the Queue Table Type Agreement Propagation is not possible between queue tables which have types that differ in some respect. One way to determine if this is the case is to run the DBMS_AQADM.VERIFY_QUEUE_TYPES procedure for the two queues that the propagation operates on. If the types do not agree, DBMS_AQADM.VERIFY_QUEUE_TYPES will return '0'.For AQ propagation between databases which have different NLS_LENGTH_SEMANTICS settings, propagation will not work, unless the queues are Oracle Streams ANYDATA queues.See the following notes for issues caused by lack of type agreement:Document 1079577.1 Advanced Queuing Propagation Fails With "ORA-22370: incorrect usage of method"Document 282987.1 Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueDocument 353754.1 Streams Messaging Propagation Fails between Single and Multi-byte Charactersets when using Chararacter Length Semantics in the ADT 4.11. Enable Propagation Tracing 4.11.1. System Level This is set it in the init.ora/spfile as follows: event="24040 trace name context forever, level 10" and restart the instanceThis event cannot be set dynamically with an alter system command until version 10.2: SQL> alter system set events '24040 trace name context forever, level 10'; To unset the event: SQL> alter system set events '24040 trace name context off'; Debugging information will be logged to job queue trace file(s) (jnnn) as propagation takes place. You can check the trace file for errors, and for statements indicating that messages have been sent. For the most part the trace information is understandable. This trace should also be uploaded to Oracle Support if a service request is created. 4.11.2. Attaching to a Specific Process We can also attach to an existing job queue processes that is running a propagation schedule and trace it individually using the oradebug utility, as follows:10.2 and below connect / as sysdbaselect p.SPID, p.PROGRAM from v$PROCESS p, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j where s.SID=jr.SID and s.PADDR=p.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%';-- For the process id (SPID) attach to it via oradebug and generate the following traceoradebug setospid <SPID>oradebug unlimitoradebug Event 10046 trace name context forever, level 12oradebug Event 24040 trace name context forever, level 10-- Trace the process for 5 minutesoradebug Event 10046 trace name context offoradebug Event 24040 trace name context off-- The following command returns the pathname/filename to the file being written tooradebug tracefile_name 11g connect / as sysdbacol PROGRAM for a30select p.SPID, p.PROGRAM, j.JOB_NAMEfrom v$PROCESS p, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j where s.SID=jr.SESSION_ID and s.PADDR=p.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%';-- For the process id (SPID) attach to it via oradebug and generate the following traceoradebug setospid <SPID>oradebug unlimitoradebug Event 10046 trace name context forever, level 12oradebug Event 24040 trace name context forever, level 10-- Trace the process for 5 minutesoradebug Event 10046 trace name context offoradebug Event 24040 trace name context off-- The following command returns the pathname/filename to the file being written tooradebug tracefile_name 4.11.3. Further Tracing The previous tracing steps only trace the job queue process executing the propagation on the source. At times it is useful to trace the propagation receiver process (the session which is enqueueing the messages into the target queue) on the target database which is associated with the job queue process on the source database.These following queries provide ways of identifying the processes involved in propagation so that you can attach to them via oradebug to generate trace information.In order to identify the propagation receiver process you need to execute the query as a user with privileges to access the v$ views in both the local and remote databases so the database link must connect as a user with those privileges in the remote database. The <DBLINK> in the queries should be replaced by the appropriate database link.The queries have two forms due to the differences between operating systems. The value returned by 'Rem Process' is the operating system identifier of the propagation receiver on the remote database. Once identified, this process can be attached to and traced on the remote database using the commands given in Section 4.11.2.10.2 and below - Windows select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from v$PROCESS pl, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SID and s.PADDR=pl.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%' and pl.SPID=substr(sr.PROCESS, instr(sr.PROCESS,':')+1); 10.2 and below - Unix select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_JOBS_RUNNING jr, V$SESSION s, DBA_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SID and s.PADDR=pl.ADDR and jr.JOB=j.JOB and j.WHAT like '%sys.dbms_aqadm.aq$_propaq(job)%' and pl.SPID=sr.PROCESS; 11g - Windows select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SESSION_ID and s.PADDR=pl.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%%' and pl.SPID=substr(sr.PROCESS, instr(sr.PROCESS,':')+1); 11g - Unix select pl.SPID "JobQ Process", pl.PROGRAM, sr.PROCESS "Rem Process" from V$PROCESS pl, DBA_SCHEDULER_RUNNING_JOBS jr, V$SESSION s, DBA_SCHEDULER_JOBS j, V$SESSION@<DBLINK> sr where s.SID=jr.SESSION_ID and s.PADDR=pl.ADDR and jr.JOB_NAME=j.JOB_NAME and j.JOB_NAME like '%AQ_JOB$_%%' and pl.SPID=sr.PROCESS; 5. Additional Troubleshooting Steps for AQ Propagation of User-Enqueued and Dequeued Messages 5.1. Check the Privileges of All Users Involved Ensure that the owner of the database link has the necessary privileges on the aq packages. SQL> select TABLE_NAME, PRIVILEGE from USER_TAB_PRIVS;TABLE_NAME PRIVILEGE------------------------------ ----------------------------------------DBMS_LOCK EXECUTEDBMS_AQ EXECUTEDBMS_AQADM EXECUTEDBMS_AQ_BQVIEW EXECUTEQT52814_BUFFER SELECT Note that when queue table is created, a view called QT<nnn>_BUFFER is created in the SYS schema, and the queue table owner is given SELECT privileges on it. The <nnn> corresponds to the object_id of the associated queue table. SQL> select * from USER_ROLE_PRIVS;USERNAME GRANTED_ROLE ADM DEF OS_------------------------------ ------------------------------ ---- ---- ---AQ_USER1 AQ_ADMINISTRATOR_ROLE NO YES NOAQ_USER1 CONNECT NO YES NOAQ_USER1 RESOURCE NO YES NO It is good practice to configure central AQ administrative user. All admin and processing jobs are created, executed and administered as this user. This configuration is not mandatory however, and the database link can be owned by any existing queue user. If this latter configuration is used, ensure that the connecting user has the necessary privileges on the AQ packages and objects involved. Privileges for an AQ Administrative user Execute on DBMS_AQADM Execute on DBMS_AQ Granted the AQ_ADMINISTRATOR_ROLE Privileges for an AQ user Execute on DBMS_AQ Execute on the message payload Enqueue privileges on the remote queue Dequeue privileges on the originating queue Privileges need to be confirmed on both sites when propagation is scheduled to remote destinations. Verify that the user ID used to login to the destination through the database link has been granted privileges to use AQ. 5.2. Verify Queue Payload Types AQ will not propagate messages from one queue to another if the payload types of the two queues are not verified to be equivalent. An AQ administrator can verify if the source and destination's payload types match by executing the DBMS_AQADM.VERIFY_QUEUE_TYPES procedure. The results of the type checking will be stored in the SYS.AQ$_MESSAGE_TYPES table. This table can be accessed using the object identifier OID of the source queue and the address database link of the destination queue, i.e. [schema.]queue_name[@destination]. Prior to Oracle 9i the payload (message type) had to be the same for all the queue tables involved in propagation. From Oracle9i onwards a transformation can be used so that payloads can be converted from one type to another. The following procedural call made on the source database can verify whether we can propagate between the source and the destination queue tables. connect aq_user1/[email protected] serverout onDECLARErc_value number;BEGINDBMS_AQADM.VERIFY_QUEUE_TYPES(src_queue_name => 'AQ_USER1.Q_1', dest_queue_name => 'AQ_USER2.Q_2',destination => 'dbl_aq_user2.es',rc => rc_value);dbms_output.put_line('rc_value code is '||rc_value);END;/ If propagation is possible then the return code value will be 1. If it is 0 then propagation is not possible and further investigation of the types and transformations used by and in conjunction with the queue tables is required. With regard to comparison of the types the following sql can be used to extract the DDL for a specific type with' %' changed appropriately on the source and target. This can then be compared for the source and target. SET LONG 20000 set pagesize 50 EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM, 'STORAGE',false); SELECT DBMS_METADATA.GET_DDL('TYPE',t.type_name) from user_types t WHERE t.type_name like '%'; EXECUTE DBMS_METADATA.SET_TRANSFORM_PARAM(DBMS_METADATA.SESSION_TRANSFORM, 'DEFAULT'); 5.3. Check Message State and Destination The first step in this process is to identify the queue table associated with the problem source queue. Although you schedule propagation for a specific queue, most of the meta-data associated with that queue is stored in the underlying queue table. The following statement finds the queue table for a given queue (note that this is a multiple-consumer queue table). SQL> select QUEUE_TABLE from DBA_QUEUES where NAME = 'MULTIPLEQ';QUEUE_TABLE --------------------MULTIPLEQTABLE For a small amount of messages in a multiple-consumer queue table, the following query can be run: SQL> select MSG_STATE, CONSUMER_NAME, ADDRESS from AQ$MULTIPLEQTABLE where QUEUE = 'MULTIPLEQ';MSG_STATE CONSUMER_NAME ADDRESS-------------- ----------------------- -------------READY AQUSER2 [email protected] AQUSER1READY AQUSER3 AQADM.INQ In this example we see 2 messages ready to be propagated to remote queues and 1 that is not. If the address column is blank, the message is not scheduled for propagation and can only be dequeued from the queue upon which it was enqueued. The MSG_STATE column values are discussed in Document 102330.1 Advanced Queueing MSG_STATE Values and their Interpretation. If the address column has a value, the message has been enqueued for propagation to another queue. The first row in the example includes a database link (@M2V102.ES). This demonstrates that the message should be propagated to a queue at a remote database. The third row does not include a database link so will be propagated to a queue that resides on the same database as the source queue. The consumer name is the intended recipient at the target queue. Note that we are not querying the base queue table directly; rather, we are querying a view that is available on top of every queue table, AQ$<queue_table_name>.A more realistic query in an environment where the queue table contains thousands of messages is8.0.3-compatible multiple-consumer queue table and all compatibility single-consumer queue tables select count(*), MSG_STATE, QUEUE from AQ$<queue_table_name> group by MSG_STATE, QUEUE; 8.1.3 and 10.0-compatible queue tables select count(*), MSG_STATE, QUEUE, CONSUMER_NAME from AQ$<queue_table_name>group by MSG_STATE, QUEUE, CONSUMER_NAME; For multiple-consumer queue tables, if you did not see the expected CONSUMER_NAME , check the syntax of the enqueue code and verify the recipients are declared correctly. If a recipients list is not used on enqueue, check the subscriber list in the AQ$_<queue_table_name>_S view (note that a single-consumer queue table does not have a subscriber view. This view records all members of the default subscription list which were added using the DBMS_AQADM.ADD_SUBSCRIBER procedure and also those enqueued using a recipient list. SQL> select QUEUE, NAME, ADDRESS from AQ$MULTIPLEQTABLE_S;QUEUE NAME ADDRESS---------- ----------- -------------MULTIPLEQ AQUSER2 [email protected] AQUSER1 In this example we have 2 subscribers registered with the queue. We have a local subscriber AQUSER1, and a remote subscriber AQUSER2, on the queue INQ, owned by AQADM, at M2V102.ES. Unless overridden with a recipient list during enqueue every message enqueued to this queue will be propagated to INQ at M2V102.ES.For 8.1 style and above multiple consumer queue tables, you can also check the following information at the target: select CONSUMER_NAME, DEQ_TXN_ID, DEQ_TIME, DEQ_USER_ID, PROPAGATED_MSGID from AQ$<queue_table_name> where QUEUE = '<QUEUE_NAME>'; For 8.0 style queues, if the queue table supports multiple consumers you can obtain the same information from the history column of the queue table: select h.CONSUMER, h.TRANSACTION_ID, h.DEQ_TIME, h.DEQ_USER, h.PROPAGATED_MSGIDfrom AQ$<queue_table_name> t, table(t.history) h where t.Q_NAME = '<QUEUE_NAME>'; A non-NULL TRANSACTION_ID indicates that the message was successfully propagated. Further, the DEQ_TIME indicates the time of propagation, the DEQ_USER indicates the userid used for propagation, and the PROPAGATED_MSGID indicates the message ID of the message that was enqueued at the destination. 6. Additional Troubleshooting Steps for Propagation in an Oracle Streams Environment 6.1. Is the Propagation Enabled? For a propagation job to propagate messages, the propagation must be enabled. For Streams, a special view called DBA_PROPAGATION exists to convey information about Streams propagations. If messages are not being propagated by a propagation as expected, then the propagation might not be enabled. To query for this: SELECT p.PROPAGATION_NAME, DECODE(s.SCHEDULE_DISABLED, 'Y', 'Disabled','N', 'Enabled') SCHEDULE_DISABLED, s.PROCESS_NAME, s.FAILURES, s.LAST_ERROR_MSGFROM DBA_QUEUE_SCHEDULES s, DBA_PROPAGATION pWHERE p.DESTINATION_DBLINK = NVL(REGEXP_SUBSTR(s.DESTINATION, '[^@]+', 1, 2), s.DESTINATION) AND s.SCHEMA = p.SOURCE_QUEUE_OWNER AND s.QNAME = p.SOURCE_QUEUE_NAME AND MESSAGE_DELIVERY_MODE = 'PERSISTENT' order by PROPAGATION_NAME; At times, the propagation job may become "broken" or fail to start after an error has been encountered or after a database restart. If an error is indicated by the above query, an attempt to disable the propagation and then re-enable it can be made. In the examples below, for the propagation named STRMADMIN_PROPAGATE where the queue name is STREAMS_QUEUE owned by STRMADMIN and the destination database link is ORCL2.WORLD, the commands would be:10.2 and above exec dbms_propagation_adm.stop_propagation('STRMADMIN_PROPAGATE'); exec dbms_propagation_adm.start_propagation('STRMADMIN_PROPAGATE'); If the above does not fix the problem, stop the propagation specifying the force parameter (2nd parameter on stop_propagation) as TRUE: exec dbms_propagation_adm.stop_propagation('STRMADMIN_PROPAGATE',true); exec dbms_propagation_adm.start_propagation('STRMADMIN_PROPAGATE'); The statistics for the propagation as well as any old error messages are cleared when the force parameter is set to TRUE. Therefore if the propagation schedule is stopped with FORCE set to TRUE, and upon restart there is still an error message in DBA_PROPAGATION, then the error message is current.9.2 or 10.1 exec dbms_aqadm.disable_propagation_schedule('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); exec dbms.aqadm.enable_propagation_schedule('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); If the above does not fix the problem, perform an unschedule of propagation and then schedule_propagation: exec dbms_aqadm.unschedule_propagation('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); exec dbms_aqadm.schedule_propagation('STRMADMIN.STREAMS_QUEUE','ORCL2.WORLD'); Typically if the error from the first query in Section 6.1 recurs after restarting the propagation as shown above, further troubleshooting of the error is needed. 6.2. Check Propagation Rule Sets and Transformations Inspect the configuration of the rules in the rule set that is associated with the propagation process to make sure that they evaluate to TRUE as expected. If not, then the object or schema will not be propagated. Remember that when a negative rule evaluates to TRUE, the specified object or schema will not be propagated. Finally inspect any rule-based transformations that are implemented with propagation to make sure they are changing the data in the intended way.The following query shows what rule sets are assigned to a propagation: select PROPAGATION_NAME, RULE_SET_OWNER||'.'||RULE_SET_NAME "Positive Rule Set",NEGATIVE_RULE_SET_OWNER||'.'||NEGATIVE_RULE_SET_NAME "Negative Rule Set"from DBA_PROPAGATION; The next two queries list the propagation rules and their conditions. The first is for the positive rule set, the second is for the negative rule set: set long 4000select rsr.RULE_SET_OWNER||'.'||rsr.RULE_SET_NAME RULE_SET ,rsr.RULE_OWNER||'.'||rsr.RULE_NAME RULE_NAME,r.RULE_CONDITION CONDITION fromDBA_RULE_SET_RULES rsr, DBA_RULES rwhere rsr.RULE_NAME = r.RULE_NAME and rsr.RULE_OWNER = r.RULE_OWNER and RULE_SET_NAME in(select RULE_SET_NAME from DBA_PROPAGATION) order by rsr.RULE_SET_OWNER, rsr.RULE_SET_NAME; set long 4000select c.PROPAGATION_NAME, rsr.RULE_SET_OWNER||'.'||rsr.RULE_SET_NAME RULE_SET ,rsr.RULE_OWNER||'.'||rsr.RULE_NAME RULE_NAME,r.RULE_CONDITION CONDITION fromDBA_RULE_SET_RULES rsr, DBA_RULES r ,DBA_PROPAGATION cwhere rsr.RULE_NAME = r.RULE_NAME and rsr.RULE_OWNER = r.RULE_OWNER andrsr.RULE_SET_OWNER=c.NEGATIVE_RULE_SET_OWNER and rsr.RULE_SET_NAME=c.NEGATIVE_RULE_SET_NAMEand rsr.RULE_SET_NAME in(select NEGATIVE_RULE_SET_NAME from DBA_PROPAGATION) order by rsr.RULE_SET_OWNER, rsr.RULE_SET_NAME; 6.3. Determining the Total Number of Messages and Bytes Propagated As in Section 3.1, determining if messages are flowing can be instructive to see whether the propagation is entirely hung or just slow. If the propagation is not in flow control (see Section 6.5.2), but the statistics are incrementing slowly, there may be a performance issue. For Streams implementations two views are available that can assist with this that can show the number of messages sent by a propagation, as well as the number of acknowledgements being returned from the target site: the V$PROPAGATION_SENDER view at the Source site and the V$PROPAGATION_RECEIVER view at the destination site. It is helpful to query both to determine if messages are being delivered to the target. Look for the statistics to increase.Source: select QUEUE_SCHEMA, QUEUE_NAME, DBLINK,HIGH_WATER_MARK, ACKNOWLEDGEMENT, TOTAL_MSGS, TOTAL_BYTESfrom V$PROPAGATION_SENDER; Target: select SRC_QUEUE_SCHEMA, SRC_QUEUE_NAME, SRC_DBNAME, DST_QUEUE_SCHEMA, DST_QUEUE_NAME, HIGH_WATER_MARK, ACKNOWLEDGEMENT, TOTAL_MSGS from V$PROPAGATION_RECEIVER; 6.4. Check Buffered Subscribers The V$BUFFERED_SUBSCRIBERS view displays information about subscribers for all buffered queues in the instance. This view can be queried to make sure that the site that the propagation is propagating to is listed as a subscriber address for the site being propagated from: select QUEUE_SCHEMA, QUEUE_NAME, SUBSCRIBER_ADDRESS from V$BUFFERED_SUBSCRIBERS; The SUBSCRIBER_ADDRESS column will not be populated when the propagation is local (between queues on the same database). 6.5. Common Streams Propagation Errors 6.5.1. ORA-02082: A loopback database link must have a connection qualifier. This error can occur if you use the Streams Setup Wizard in Oracle Enterprise Manager without first configuring the GLOBAL_NAME for your database. 6.5.2. ORA-25307: Enqueue rate too high. Enable flow control DBA_QUEUE_SCHEDULES will display this informational message for propagation when the automatic flow control (10g feature of Streams) has been invoked.Similar to Streams capture processes, a Streams propagation process can also go into a state of 'flow control. This is an informative message that indicates flow control has been automatically enabled to reduce the rate at which messages are being enqueued into at target queue.This typically occurs when the target site is unable to keep up with the rate of messages flowing from the source site. Other than checking that the apply process is running normally on the target site, usually no action is required by the DBA. Propagation and the capture process will be resumed automatically when the target site is able to accept more messages.The following document contains more information:Document 302109.1 Streams Propagation Error: ORA-25307 Enqueue rate too high. Enable flow controlSee the following document for one potential cause of this situation:Document 1097115.1 Oracle Streams Apply Reader is in 'Paused' State 6.5.3. ORA-25315 unsupported configuration for propagation of buffered messages This error typically occurs when the target database is RAC and usually indicates that an attempt was made to propagate buffered messages with the database link pointing to an instance in the destination database which is not the owner instance of the destination queue. To resolve the problem, use queue-to-queue propagation for buffered messages. 6.5.4. ORA-600 [KWQBMCRCPTS101] after dropping / recreating propagation For cause/fixes refer to:Document 421237.1 ORA-600 [KWQBMCRCPTS101] reported by a Qmon slave process after dropping a Streams Propagation 6.5.5. Stopping or Dropping a Streams Propagation Hangs See the following note:Document 1159787.1 Troubleshooting Streams Propagation When It is Not Functioning and Attempts to Stop It Hang 6.6. Streams Propagation-Related Notes for Common Issues Document 437838.1 Streams Specific PatchesDocument 749181.1 How to Recover Streams After Dropping PropagationDocument 368912.1 Queue to Queue Propagation Schedule encountered ORA-12514 in a RAC environmentDocument 564649.1 ORA-02068/ORA-03114/ORA-03113 Errors From Streams Propagation Process - Remote Database is Available and Unschedule/Reschedule Does Not ResolveDocument 553017.1 Stream Propagation Process Errors Ora-4052 Ora-6554 From 11g To 10201Document 944846.1 Streams Propagation Fails Ora-7445 [kohrsmc]Document 745601.1 ORA-23603 'STREAMS enqueue aborted due to low SGA' Error from Streams Propagation, and V$STREAMS_CAPTURE.STATE Hanging on 'Enqueuing Message'Document 333068.1 ORA-23603: Streams Enqueue Aborted Eue To Low SGADocument 363496.1 Ora-25315 Propagating on RAC StreamsDocument 368237.1 Unable to Unschedule Propagation. Streams Queue is InvalidDocument 436332.1 dbms_propagation_adm.stop_propagation hangsDocument 727389.1 Propagation Fails With ORA-12528Document 730911.1 ORA-4063 Is Reported After Dropping Negative Prop.RulesetDocument 460471.1 Propagation Blocked by Qmon Process - Streams_queue_table / 'library cache lock' waitsDocument 1165583.1 ORA-600 [kwqpuspse0-ack] In Streams EnvironmentDocument 1059029.1 Combined Capture and Apply (CCA) : Capture aborts : ORA-1422 after schedule_propagationDocument 556309.1 Changing Propagation/ queue_to_queue : false -> true does does not work; no LCRs propagatedDocument 839568.1 Propagation failing with error: ORA-01536: space quota exceeded for tablespace ''Document 311021.1 Streams Propagation Process : Ora 12154 After Reboot with Transparent Application Failover TAF configuredDocument 359971.1 STREAMS propagation to Primary of physical Standby configuation errors with Ora-01033, Ora-02068Document 1101616.1 DBMS_PROPAGATION_ADM.DROP_PROPAGATION FAILS WITH ORA-1747 7. Performance Issues A propagation may seem to be slow if the queries from Sections 3.1 and 6.3 show that the message statistics are not changing quickly. In Oracle Streams, this more usually is due to a slow apply process at the target rather than a slow propagation. Propagation could be inferred to be slow if the message statistics are changing, and the state of a capture process according to V$STREAMS_CAPTURE.STATE is PAUSED FOR FLOW CONTROL, but an ORA-25307 'Enqueue rate too high. Enable flow control' warning is NOT observed in DBA_QUEUE_SCHEDULES per Section 6.5.2. If this is the case, see the following notes / white papers for suggestions to increase performance:Document 335516.1 Master Note for Streams Performance RecommendationsDocument 730036.1 Overview for Troubleshooting Streams Performance IssuesDocument 780733.1 Streams Propagation Tuning with Network ParametersWhite Paper: http://www.oracle.com/technetwork/database/features/availability/maa-wp-10gr2-streams-performance-130059.pdfWhite Paper: Oracle Streams Configuration Best Practices: Oracle Database 10g Release 10.2, http://www.oracle.com/technetwork/database/features/availability/maa-10gr2-streams-configuration-132039.pdf, See APPENDIX A: USING STREAMS CONFIGURATIONS OVER A NETWORKFor basic AQ propagation, the network tuning in the aforementioned Appendix A of the white paper 'Oracle Streams Configuration Best Practices: Oracle Database 10g Release 10.2' is applicable. References NOTE:102330.1 - Advanced Queueing MSG_STATE Values and their InterpretationNOTE:102771.1 - Advanced Queueing Propagation using PL/SQLNOTE:1059029.1 - Combined Capture and Apply (CCA) : Capture aborts : ORA-1422 after schedule_propagationNOTE:1079577.1 - Advanced Queuing Propagation Fails With "ORA-22370: incorrect usage of method"NOTE:1083608.1 - 11g Streams and Oracle SchedulerNOTE:1087324.1 - ORA-01405 ORA-01422 reported by Adavanced Queueing Propagation schedules after RAC reconfigurationNOTE:1097115.1 - Oracle Streams Apply Reader is in 'Paused' StateNOTE:1101616.1 - DBMS_PROPAGATION_ADM.DROP_PROPAGATION FAILS WITH ORA-1747NOTE:1159787.1 - Troubleshooting Streams Propagation When It is Not Functioning and Attempts to Stop It HangNOTE:1165583.1 - ORA-600 [kwqpuspse0-ack] In Streams EnvironmentNOTE:118884.1 - How to unschedule a propagation schedule stuck in pending stateNOTE:1203544.1 - AQ PROPAGATION ABORTED WITH ORA-600[OCIKSIN: INVALID STATUS] ON SYS.DBMS_AQADM_SYS.AQ$_PROPAGATION_PROCEDURE AFTER UPGRADENOTE:1204080.1 - AQ Propagation Failing With ORA-25329 After Upgraded From 8i or 9i to 10g or 11g.NOTE:219416.1 - Advanced Queuing Propagation fails with ORA-22922NOTE:222992.1 - DBMS_AQADM.DISABLE_PROPAGATION_SCHEDULE Returns ORA-24082NOTE:253131.1 - Concurrent Writes May Corrupt LOB Segment When Using Auto Segment Space Management (ORA-1555)NOTE:282987.1 - Propagated Messages marked UNDELIVERABLE after Drop and Recreate Of Remote QueueNOTE:298015.1 - Kwqjswproc:Excep After Loop: Assigning To SelfNOTE:302109.1 - Streams Propagation Error: ORA-25307 Enqueue rate too high. Enable flow controlNOTE:311021.1 - Streams Propagation Process : Ora 12154 After Reboot with Transparent Application Failover TAF configuredNOTE:332792.1 - ORA-04061 error relating to SYS.DBMS_PRVTAQIP reported when setting up StatspackNOTE:333068.1 - ORA-23603: Streams Enqueue Aborted Eue To Low SGANOTE:335516.1 - Master Note for Streams Performance RecommendationsNOTE:353325.1 - ORA-24056: Internal inconsistency for QUEUE and destination NOTE:353754.1 - Streams Messaging Propagation Fails between Single and Multi-byte Charactersets when using Chararacter Length Semantics in the ADT.NOTE:359971.1 - STREAMS propagation to Primary of physical Standby configuation errors with Ora-01033, Ora-02068NOTE:363496.1 - Ora-25315 Propagating on RAC StreamsNOTE:365093.1 - ORA-07445 [kwqppay2aqe()+7360] reported on Propagation of a Transformed MessageNOTE:368237.1 - Unable to Unschedule Propagation. Streams Queue is InvalidNOTE:368912.1 - Queue to Queue Propagation Schedule encountered ORA-12514 in a RAC environmentNOTE:421237.1 - ORA-600 [KWQBMCRCPTS101] reported by a Qmon slave process after dropping a Streams PropagationNOTE:436332.1 - dbms_propagation_adm.stop_propagation hangsNOTE:437838.1 - Streams Specific PatchesNOTE:460471.1 - Propagation Blocked by Qmon Process - Streams_queue_table / 'library cache lock' waitsNOTE:463820.1 - Streams Combined Capture and Apply in 11gNOTE:553017.1 - Stream Propagation Process Errors Ora-4052 Ora-6554 From 11g To 10201NOTE:556309.1 - Changing Propagation/ queue_to_queue : false -> true does does not work; no LCRs propagatedNOTE:564649.1 - ORA-02068/ORA-03114/ORA-03113 Errors From Streams Propagation Process - Remote Database is Available and Unschedule/Reschedule Does Not ResolveNOTE:566622.1 - ORA-22275 when propagating >4K AQ$_JMS_TEXT_MESSAGEs from 9.2.0.8 to 10.2.0.1NOTE:727389.1 - Propagation Fails With ORA-12528NOTE:730036.1 - Overview for Troubleshooting Streams Performance IssuesNOTE:730911.1 - ORA-4063 Is Reported After Dropping Negative Prop.RulesetNOTE:731292.1 - ORA-25215 Reported On Local Propagation When Using Transformation with ANYDATA queue tablesNOTE:731539.1 - ORA-29268: HTTP client error 401 Unauthorized Error when the AQ Servlet attempts to Propagate a message via HTTPNOTE:745601.1 - ORA-23603 'STREAMS enqueue aborted due to low SGA' Error from Streams Propagation, and V$STREAMS_CAPTURE.STATE Hanging on 'Enqueuing Message'NOTE:749181.1 - How to Recover Streams After Dropping PropagationNOTE:780733.1 - Streams Propagation Tuning with Network ParametersNOTE:787367.1 - ORA-22275 reported on Propagating Messages with LOB component when propagating between 10.1 and 10.2NOTE:808136.1 - How to clear the old errors from DBA_PROPAGATION view ?NOTE:827184.1 - AQ Propagation with CLOB data types Fails with ORA-22990NOTE:827473.1 - How to alter propagation from queue_to_queue to queue_to_dblinkNOTE:839568.1 - Propagation failing with error: ORA-01536: space quota exceeded for tablespace ''NOTE:846297.1 - AQ Propagation Fails : ORA-00600[kope2upic2954] or Ora-00600[Kghsstream_copyn]NOTE:944846.1 - Streams Propagation Fails Ora-7445 [kohrsmc]

Read the article
What's the right way to kill child processes in perl before exiting?

- by rarbox

I'm running an IRC Bot (Bot::BasicBot) which has two child processes running File::Tail but when exiting, they don't terminate. So I'm killling them using Proc::ProcessTable like this before exiting: my $parent=$$; my $proc_table=Proc::ProcessTable->new(); for my $proc (@{$proc_table->table()}) { kill(15, $proc->pid) if ($proc->ppid == $parent); } It works but I get this warning: 14045: !!! Child process PID:14047 reaped: 14045: !!! Child process PID:14048 reaped: 14045: !!! Your program may not be using sig_child() to reap processes. 14045: !!! In extreme cases, your program can force a system reboot 14045: !!! if this resource leakage is not corrected. What else can I do to kill child processes? The forked process is created using the forkit method in Bot::BasicBot.

Read the article
Log out of an SSH session into Erlang VM without stopping the VM or leaving stale processes

- by Garret Smith

I have an Erlang application running as a daemon, configured as an SSH server. I can connect to it with an SSH client and I get the standard Erlang REPL. If I 'q().' I shut down the Erlang VM, not the connection. If I close the connection ('~.' for OpenSSH, close the window in PuTTY) some processes remain under the sshd_sup/ssh_system_xx_sup tree. These appear to be stale shell processes. I do not see any exported function in the shell module that would let me shut down the shell (and therefore the SSH connection) without affecting the entire VM. How should I be logging out of the SSH session to not leave stale processes in the VM?

Read the article
Unexpected start of already-primary server processes when heartbeat on secondary is stopped.

- by vorik

Hi, I've got an active-passive Heartbeat cluster with Apache, MySQL, ActiveMQ and DRBD. Today, I wanted to perform hardware-maintenance on the secondary node (node04), so I stopped the heartbeat service before shutting it down. Then, the primary node (node03) received a shutdown notice from the secondary node (node04). This logging comes from the primary node: node03 heartbeat[4458]: 2010/03/08_08:52:56 info: Received shutdown notice from 'node04.companydomain.nl'. heartbeat[4458]: 2010/03/08_08:52:56 info: Resources being acquired from node04.companydomain.nl. harc[27522]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/status status heartbeat[27523]: 2010/03/08_08:52:56 info: Local Resource acquisition completed. mach_down[27567]: 2010/03/08_08:52:56 info: /usr/share/heartbeat/mach_down: nice_failback: foreign resources acquired mach_down[27567]: 2010/03/08_08:52:56 info: mach_down takeover complete for node node04.companydomain.nl. heartbeat[4458]: 2010/03/08_08:52:56 info: mach_down takeover complete. harc[27620]: 2010/03/08_08:52:56 info: Running /etc/ha.d/rc.d/ip-request-resp ip-request-resp ip-request-resp[27620]: 2010/03/08_08:52:56 received ip-request-resp drbddisk OK yes ResourceManager[27645]: 2010/03/08_08:52:56 info: Acquiring resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:56 info: Running /etc/ha.d/resource.d/drbddisk start Filesystem[27700]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/mysql start mysql[27783]: 2010/03/08_08:52:57 Starting MySQL[ OK ] apache[27853]: 2010/03/08_08:52:57 INFO: Running OK ResourceManager[27645]: 2010/03/08_08:52:57 info: Running /etc/ha.d/resource.d/monitor start monitor[28160]: 2010/03/08_08:52:58 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq start activemq[28210]: 2010/03/08_08:52:58 Starting ActiveMQ Broker... ActiveMQ Broker is already running. ResourceManager[27645]: 2010/03/08_08:52:58 ERROR: Return code 1 from /etc/ha.d/resource.d/activemq ResourceManager[27645]: 2010/03/08_08:52:58 CRIT: Giving up resources due to failure of activemq ResourceManager[27645]: 2010/03/08_08:52:58 info: Releasing resource group: node03.companydomain.nl drbddisk Filesystem::/dev/drbd0::/data::ext3 mysql apache::/etc/httpd/conf/httpd.conf LVSSyncDaemonSwap::master monitor activemq tivoli-cluster MailTo::[email protected]::DRBDFailureDrisAcc MailTo::[email protected]::DRBDFailureDrisAcc 1.2.3.212 ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/IPaddr 1.2.3.212 stop IPaddr[28329]: 2010/03/08_08:52:58 INFO: ifconfig eth0:0 down IPaddr[28312]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28378]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/MailTo [email protected] DRBDFailureDrisAcc stop MailTo[28433]: 2010/03/08_08:52:58 INFO: Success ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/tivoli-cluster stop ResourceManager[27645]: 2010/03/08_08:52:58 info: Running /etc/ha.d/resource.d/activemq stop activemq[28503]: 2010/03/08_08:53:01 Stopping ActiveMQ Broker... Stopped ActiveMQ Broker. ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/monitor stop monitor[28681]: 2010/03/08_08:53:01 ResourceManager[27645]: 2010/03/08_08:53:01 info: Running /etc/ha.d/resource.d/LVSSyncDaemonSwap master stop LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster down LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncbackup up LVSSyncDaemonSwap[28714]: 2010/03/08_08:53:02 info: ipvs_syncmaster released ResourceManager[27645]: 2010/03/08_08:53:02 info: Running /etc/ha.d/resource.d/apache /etc/httpd/conf/httpd.conf stop apache[28782]: 2010/03/08_08:53:03 INFO: Killing apache PID 18390 apache[28782]: 2010/03/08_08:53:03 INFO: apache stopped. apache[28771]: 2010/03/08_08:53:03 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:03 info: Running /etc/ha.d/resource.d/mysql stop mysql[28851]: 2010/03/08_08:53:24 Shutting down MySQL.....................[ OK ] ResourceManager[27645]: 2010/03/08_08:53:24 info: Running /etc/ha.d/resource.d/Filesystem /dev/drbd0 /data ext3 stop Filesystem[29010]: 2010/03/08_08:53:25 INFO: Running stop for /dev/drbd0 on /data Filesystem[29010]: 2010/03/08_08:53:25 INFO: Trying to unmount /data Filesystem[29010]: 2010/03/08_08:53:25 ERROR: Couldn't unmount /data; trying cleanup with SIGTERM Filesystem[29010]: 2010/03/08_08:53:25 INFO: Some processes on /data were signalled Filesystem[29010]: 2010/03/08_08:53:27 INFO: unmounted /data successfully Filesystem[28999]: 2010/03/08_08:53:27 INFO: Success ResourceManager[27645]: 2010/03/08_08:53:27 info: Running /etc/ha.d/resource.d/drbddisk stop heartbeat[4458]: 2010/03/08_08:53:29 WARN: node node04.companydomain.nl: is dead heartbeat[4458]: 2010/03/08_08:53:29 info: Dead node node04.companydomain.nl gave up resources. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth0 dead. heartbeat[4458]: 2010/03/08_08:53:29 info: Link node04.companydomain.nl:eth1 dead. hb_standby[29193]: 2010/03/08_08:53:57 Going standby [foreign]. heartbeat[4458]: 2010/03/08_08:53:57 info: node03.companydomain.nl wants to go standby [foreign] Soo... What just happened here??? Heartbeat on node04 stopped and told node03, which was the active node at the time. Somehow, node03 decided to start the cluster processes that were already running. (For the processes that are not critical, I always return a 0 from the startupscript so it does not stops the entire cluster when a non-essential part fails.) When starting ActiveMQ, it returns status 1 because it is already running. This fails the node and shuts everything down. As heartbeat is not running on the secondary node, it cannot failover to there. When I tried to run ha_takeover to restart the resources, absolutely nothing happened. Only after I restarted heartbeat on the primary node the resources could be started (after a delay of 2 minutes). These are my questions: Why does heartbeat on the primary node try to start the cluster processes again? Why did ha_takeover not work? What can I do to prevent this from happening? Server configuration: DRBD: version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by [email protected], 2010-01-20 09:14:48 0: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate B r---- ns:0 nr:6459432 dw:6459432 dr:0 al:0 bm:301 lo:0 pe:0 ua:0 ap:0 ep:1 wo:d oos:0 uname -a Linux node04 2.6.18-164.11.1.el5 #1 SMP Wed Jan 6 13:26:04 EST 2010 x86_64 x86_64 x86_64 GNU/Linux haresources node03.companydomain.nl \ drbddisk \ Filesystem::/dev/drbd0::/data::ext3 \ mysql \ apache::/etc/httpd/conf/httpd.conf \ LVSSyncDaemonSwap::master \ monitor \ activemq \ tivoli-cluster \ MailTo::[email protected]::DRBDFailureDrisAcc \ MailTo::[email protected]::DRBDFailureDrisAcc \ 1.2.3.212 ha.cf debugfile /var/log/ha-debug logfile /var/log/ha-log keepalive 500ms deadtime 30 warntime 10 initdead 120 udpport 694 mcast eth0 225.0.0.3 694 1 0 mcast eth1 225.0.0.4 694 1 0 auto_failback off node node03.companydomain.nl node node04.companydomain.nl respawn hacluster /usr/lib64/heartbeat/dopd apiauth dopd gid=haclient uid=hacluster Thank you very much in advance, Ger Apeldoorn

Read the article
Using a mounted NTFS share with nginx

- by Hoff

I have set up a local testing VM with Ubuntu Server 12.04 LTS and the LEMP stack. It's kind of an unconventional setup because instead of having all my PHP scripts on the local machine, I've mounted an NTFS share as the document root because I do my development on Windows. I had everything working perfectly up until this morning, now I keep getting a dreaded 'File not found.' error. I am almost certain this must be somehow permission related, because if I copy my site over to /var/www, nginx and php-fpm have no problems serving my PHP scripts. What I can't figure out is why all of a sudden (after a reboot of the server), no PHP files will be served but instead just the 'File not found.' error. Static files work fine, so I think it's PHP that is causing the headache. Both nginx and php-fpm are configured to run as the user www-data: root@ubuntu-server:~# ps aux | grep 'nginx\|php-fpm' root 1095 0.0 0.0 5816 792 ? Ss 11:11 0:00 nginx: master process /opt/nginx/sbin/nginx -c /etc/nginx/nginx.conf www-data 1096 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process www-data 1098 0.0 0.1 6016 1172 ? S 11:11 0:00 nginx: worker process root 1130 0.0 0.4 175560 4212 ? Ss 11:11 0:00 php-fpm: master process (/etc/php5/php-fpm.conf) www-data 1131 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1132 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www www-data 1133 0.0 0.3 175560 3216 ? S 11:11 0:00 php-fpm: pool www root 1686 0.0 0.0 4368 816 pts/1 S+ 11:11 0:00 grep --color=auto nginx\|php-fpm I have mounted the NTFS share at /mnt/webfiles by editing /etc/fstab and adding the following line: //192.168.0.199/c$/Websites/ /mnt/webfiles cifs username=Jordan,password=mypasswordhere,gid=33,uid=33 0 0 Where gid 33 is the www-data group and uid 33 is the user www-data. If I list the contents of one of my sites you can in fact see that they belong to the user www-data: root@ubuntu-server:~# ls -l /mnt/webfiles/nTv5-2.0 total 8 drwxr-xr-x 0 www-data www-data 0 Jun 6 19:12 app drwxr-xr-x 0 www-data www-data 0 Aug 22 19:00 assets -rwxr-xr-x 0 www-data www-data 1150 Jan 4 2012 favicon.ico -rwxr-xr-x 0 www-data www-data 1412 Dec 28 2011 index.php drwxr-xr-x 0 www-data www-data 0 Jun 3 16:44 lib drwxr-xr-x 0 www-data www-data 0 Jan 3 2012 plugins drwxr-xr-x 0 www-data www-data 0 Jun 3 16:45 vendors If I switch to the www-data user, I have no problem creating a new file on the share: root@ubuntu-server:~# su www-data $ > /mnt/webfiles/test.txt $ ls -l /mnt/webfiles | grep test\.txt -rwxr-xr-x 0 www-data www-data 0 Sep 8 11:19 test.txt There should be no problem reading or writing to the share with php-fpm running as the user www-data. When I examine the error log of nginx, it's filled with a bunch of lines that look like the following: 2012/09/08 11:22:36 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET / HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" 2012/09/08 11:22:39 [error] 1096#0: *1 FastCGI sent in stderr: "Primary script unknown" while reading response header from upstream, client: 192.168.0.199, server: , request: "GET /apc.php HTTP/1.1", upstream: "fastcgi://unix:/var/run/php5-fpm.sock:", host: "192.168.0.123" It's bizarre that this was working previously and now all of sudden PHP is complaining that it can't "find" the scripts on the share. Does anybody know why this is happening? EDIT I tried editing php-fpm.conf and changing chdir to the following: chdir = /mnt/webfiles When I try and restart the php-fpm service, I get the error: Starting php-fpm [08-Sep-2012 14:20:55] ERROR: [pool www] the chdir path '/mnt/webfiles' does not exist or is not a directory This is a total load of bullshit because this directory DOES exist and is mounted! Any ls commands to list that directory work perfectly. Why the hell can't PHP-FPM see this directory?! Here are my configuration files for reference: nginx.conf user www-data; worker_processes 2; error_log /var/log/nginx/nginx.log info; pid /var/run/nginx.pid; events { worker_connections 1024; multi_accept on; } http { include fastcgi.conf; include mime.types; default_type application/octet-stream; set_real_ip_from 127.0.0.1; real_ip_header X-Forwarded-For; ## Proxy proxy_redirect off; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; client_max_body_size 32m; client_body_buffer_size 128k; proxy_connect_timeout 90; proxy_send_timeout 90; proxy_read_timeout 90; proxy_buffers 32 4k; ## Compression gzip on; gzip_types text/plain text/css application/x-javascript text/xml application/xml application/xml+rss text/javascript; gzip_disable "MSIE [1-6]\.(?!.*SV1)"; ### TCP options tcp_nodelay on; tcp_nopush on; keepalive_timeout 65; sendfile on; include /etc/nginx/sites-enabled/*; } my site config server { listen 80; access_log /var/log/nginx/$host.access.log; error_log /var/log/nginx/error.log; root /mnt/webfiles/nTv5-2.0/app/webroot; index index.php; ## Block bad bots if ($http_user_agent ~* (HTTrack|HTMLParser|libcurl|discobot|Exabot|Casper|kmccrew|plaNETWORK|RPT-HTTPClient)) { return 444; } ## Block certain Referers (case insensitive) if ($http_referer ~* (sex|vigra|viagra) ) { return 444; } ## Deny dot files: location ~ /\. { deny all; } ## Favicon Not Found location = /favicon.ico { access_log off; log_not_found off; } ## Robots.txt Not Found location = /robots.txt { access_log off; log_not_found off; } if (-f $document_root/maintenance.html) { rewrite ^(.*)$ /maintenance.html last; } location ~* \.(?:ico|css|js|gif|jpe?g|png)$ { # Some basic cache-control for static files to be sent to the browser expires max; add_header Pragma public; add_header Cache-Control "max-age=2678400, public, must-revalidate"; } location / { try_files $uri $uri/ index.php; if (-f $request_filename) { break; } rewrite ^(.+)$ /index.php?url=$1 last; } location ~ \.php$ { include /etc/nginx/fastcgi.conf; fastcgi_pass unix:/var/run/php5-fpm.sock; } } php-fpm.conf ;;;;;;;;;;;;;;;;;;;;; ; FPM Configuration ; ;;;;;;;;;;;;;;;;;;;;; ; All relative paths in this configuration file are relative to PHP's install ; prefix (/opt/php5). This prefix can be dynamicaly changed by using the ; '-p' argument from the command line. ; Include one or more files. If glob(3) exists, it is used to include a bunch of ; files from a glob(3) pattern. This directive can be used everywhere in the ; file. ; Relative path can also be used. They will be prefixed by: ; - the global prefix if it's been set (-p arguement) ; - /opt/php5 otherwise ;include=etc/fpm.d/*.conf ;;;;;;;;;;;;;;;;;; ; Global Options ; ;;;;;;;;;;;;;;;;;; [global] ; Pid file ; Note: the default prefix is /opt/php5/var ; Default Value: none pid = /var/run/php-fpm.pid ; Error log file ; Note: the default prefix is /opt/php5/var ; Default Value: log/php-fpm.log error_log = /var/log/php5-fpm/php-fpm.log ; Log level ; Possible Values: alert, error, warning, notice, debug ; Default Value: notice ;log_level = notice ; If this number of child processes exit with SIGSEGV or SIGBUS within the time ; interval set by emergency_restart_interval then FPM will restart. A value ; of '0' means 'Off'. ; Default Value: 0 ;emergency_restart_threshold = 0 ; Interval of time used by emergency_restart_interval to determine when ; a graceful restart will be initiated. This can be useful to work around ; accidental corruptions in an accelerator's shared memory. ; Available Units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;emergency_restart_interval = 0 ; Time limit for child processes to wait for a reaction on signals from master. ; Available units: s(econds), m(inutes), h(ours), or d(ays) ; Default Unit: seconds ; Default Value: 0 ;process_control_timeout = 0 ; Send FPM to background. Set to 'no' to keep FPM in foreground for debugging. ; Default Value: yes ;daemonize = yes ;;;;;;;;;;;;;;;;;;;; ; Pool Definitions ; ;;;;;;;;;;;;;;;;;;;; ; Multiple pools of child processes may be started with different listening ; ports and different management options. The name of the pool will be ; used in logs and stats. There is no limitation on the number of pools which ; FPM can handle. Your system will tell you anyway :) ; Start a new pool named 'www'. ; the variable $pool can we used in any directive and will be replaced by the ; pool name ('www' here) [www] ; Per pool prefix ; It only applies on the following directives: ; - 'slowlog' ; - 'listen' (unixsocket) ; - 'chroot' ; - 'chdir' ; - 'php_values' ; - 'php_admin_values' ; When not set, the global prefix (or /opt/php5) applies instead. ; Note: This directive can also be relative to the global prefix. ; Default Value: none ;prefix = /path/to/pools/$pool ; The address on which to accept FastCGI requests. ; Valid syntaxes are: ; 'ip.add.re.ss:port' - to listen on a TCP socket to a specific address on ; a specific port; ; 'port' - to listen on a TCP socket to all addresses on a ; specific port; ; '/path/to/unix/socket' - to listen on a unix socket. ; Note: This value is mandatory. ;listen = 127.0.0.1:9000 listen = /var/run/php5-fpm.sock ; Set listen(2) backlog. A value of '-1' means unlimited. ; Default Value: 128 (-1 on FreeBSD and OpenBSD) ;listen.backlog = -1 ; List of ipv4 addresses of FastCGI clients which are allowed to connect. ; Equivalent to the FCGI_WEB_SERVER_ADDRS environment variable in the original ; PHP FCGI (5.2.2+). Makes sense only with a tcp listening socket. Each address ; must be separated by a comma. If this value is left blank, connections will be ; accepted from any ip address. ; Default Value: any ;listen.allowed_clients = 127.0.0.1 ; Set permissions for unix socket, if one is used. In Linux, read/write ; permissions must be set in order to allow connections from a web server. Many ; BSD-derived systems allow connections regardless of permissions. ; Default Values: user and group are set as the running user ; mode is set to 0666 ;listen.owner = www-data ;listen.group = www-data ;listen.mode = 0666 ; Unix user/group of processes ; Note: The user is mandatory. If the group is not set, the default user's group ; will be used. user = www-data group = www-data ; Choose how the process manager will control the number of child processes. ; Possible Values: ; static - a fixed number (pm.max_children) of child processes; ; dynamic - the number of child processes are set dynamically based on the ; following directives: ; pm.max_children - the maximum number of children that can ; be alive at the same time. ; pm.start_servers - the number of children created on startup. ; pm.min_spare_servers - the minimum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is less than this ; number then some children will be created. ; pm.max_spare_servers - the maximum number of children in 'idle' ; state (waiting to process). If the number ; of 'idle' processes is greater than this ; number then some children will be killed. ; Note: This value is mandatory. pm = dynamic ; The number of child processes to be created when pm is set to 'static' and the ; maximum number of child processes to be created when pm is set to 'dynamic'. ; This value sets the limit on the number of simultaneous requests that will be ; served. Equivalent to the ApacheMaxClients directive with mpm_prefork. ; Equivalent to the PHP_FCGI_CHILDREN environment variable in the original PHP ; CGI. ; Note: Used when pm is set to either 'static' or 'dynamic' ; Note: This value is mandatory. pm.max_children = 50 ; The number of child processes created on startup. ; Note: Used only when pm is set to 'dynamic' ; Default Value: min_spare_servers + (max_spare_servers - min_spare_servers) / 2 pm.start_servers = 20 ; The desired minimum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.min_spare_servers = 5 ; The desired maximum number of idle server processes. ; Note: Used only when pm is set to 'dynamic' ; Note: Mandatory when pm is set to 'dynamic' pm.max_spare_servers = 35 ; The number of requests each child process should execute before respawning. ; This can be useful to work around memory leaks in 3rd party libraries. For ; endless request processing specify '0'. Equivalent to PHP_FCGI_MAX_REQUESTS. ; Default Value: 0 pm.max_requests = 500 ; The URI to view the FPM status page. If this value is not set, no URI will be ; recognized as a status page. By default, the status page shows the following ; information: ; accepted conn - the number of request accepted by the pool; ; pool - the name of the pool; ; process manager - static or dynamic; ; idle processes - the number of idle processes; ; active processes - the number of active processes; ; total processes - the number of idle + active processes. ; max children reached - number of times, the process limit has been reached, ; when pm tries to start more children (works only for ; pm 'dynamic') ; The values of 'idle processes', 'active processes' and 'total processes' are ; updated each second. The value of 'accepted conn' is updated in real time. ; Example output: ; accepted conn: 12073 ; pool: www ; process manager: static ; idle processes: 35 ; active processes: 65 ; total processes: 100 ; max children reached: 1 ; By default the status page output is formatted as text/plain. Passing either ; 'html' or 'json' as a query string will return the corresponding output ; syntax. Example: ; http://www.foo.bar/status ; http://www.foo.bar/status?json ; http://www.foo.bar/status?html ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set pm.status_path = /status ; The ping URI to call the monitoring page of FPM. If this value is not set, no ; URI will be recognized as a ping page. This could be used to test from outside ; that FPM is alive and responding, or to ; - create a graph of FPM availability (rrd or such); ; - remove a server from a group if it is not responding (load balancing); ; - trigger alerts for the operating team (24/7). ; Note: The value must start with a leading slash (/). The value can be ; anything, but it may not be a good idea to use the .php extension or it ; may conflict with a real PHP file. ; Default Value: not set ping.path = /ping ; This directive may be used to customize the response of a ping request. The ; response is formatted as text/plain with a 200 response code. ; Default Value: pong ping.response = pong ; The timeout for serving a single request after which the worker process will ; be killed. This option should be used when the 'max_execution_time' ini option ; does not stop script execution for some reason. A value of '0' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_terminate_timeout = 0 ; The timeout for serving a single request after which a PHP backtrace will be ; dumped to the 'slowlog' file. A value of '0s' means 'off'. ; Available units: s(econds)(default), m(inutes), h(ours), or d(ays) ; Default Value: 0 ;request_slowlog_timeout = 0 ; The log file for slow requests ; Default Value: not set ; Note: slowlog is mandatory if request_slowlog_timeout is set ;slowlog = log/$pool.log.slow ; Set open file descriptor rlimit. ; Default Value: system defined value ;rlimit_files = 1024 ; Set max core size rlimit. ; Possible Values: 'unlimited' or an integer greater or equal to 0 ; Default Value: system defined value ;rlimit_core = 0 ; Chroot to this directory at the start. This value must be defined as an ; absolute path. When this value is not set, chroot is not used. ; Note: you can prefix with '$prefix' to chroot to the pool prefix or one ; of its subdirectories. If the pool prefix is not set, the global prefix ; will be used instead. ; Note: chrooting is a great security feature and should be used whenever ; possible. However, all PHP paths will be relative to the chroot ; (error_log, sessions.save_path, ...). ; Default Value: not set ;chroot = ; Chdir to this directory at the start. ; Note: relative path can be used. ; Default Value: current directory or / when chroot ;chdir = /var/www ; Redirect worker stdout and stderr into main error log. If not set, stdout and ; stderr will be redirected to /dev/null according to FastCGI specs. ; Note: on highloaded environement, this can cause some delay in the page ; process time (several ms). ; Default Value: no ;catch_workers_output = yes ; Pass environment variables like LD_LIBRARY_PATH. All $VARIABLEs are taken from ; the current environment. ; Default Value: clean env ;env[HOSTNAME] = $HOSTNAME ;env[PATH] = /usr/local/bin:/usr/bin:/bin ;env[TMP] = /tmp ;env[TMPDIR] = /tmp ;env[TEMP] = /tmp ; Additional php.ini defines, specific to this pool of workers. These settings ; overwrite the values previously defined in the php.ini. The directives are the ; same as the PHP SAPI: ; php_value/php_flag - you can set classic ini defines which can ; be overwritten from PHP call 'ini_set'. ; php_admin_value/php_admin_flag - these directives won't be overwritten by ; PHP call 'ini_set' ; For php_*flag, valid values are on, off, 1, 0, true, false, yes or no. ; Defining 'extension' will load the corresponding shared extension from ; extension_dir. Defining 'disable_functions' or 'disable_classes' will not ; overwrite previously defined php.ini values, but will append the new value ; instead. ; Note: path INI options can be relative and will be expanded with the prefix ; (pool, global or /opt/php5) ; Default Value: nothing is defined by default except the values in php.ini and ; specified at startup with the -d argument ;php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i -f [email protected] ;php_flag[display_errors] = off ;php_admin_value[error_log] = /var/log/fpm-php.www.log ;php_admin_flag[log_errors] = on ;php_admin_value[memory_limit] = 32M php_admin_value[sendmail_path] = /usr/sbin/sendmail -t -i

Read the article
How should I clean up hung grandchild processes when an alarm trips in Perl?

- by brian d foy

I have a parallelized automation script which needs to call many other scripts, some of which hang because they (incorrectly) wait for standard input. That's not a big deal because I catch those with alarm. The trick is to shut down those hung grandchild processes when the child shuts down. I thought various incantations of SIGCHLD, waiting, and process groups could do the trick, but they all block and the grandchildren aren't reaped. My solution, which works, just doesn't seem like it is the right solution. I'm not especially interested in the Windows solution just yet, but I'll eventually need that too. Mine only works for Unix, which is fine for now. I wrote a small script that takes the number of simultaneous parallel children to run and the total number of forks: $ fork_bomb <parallel jobs> <number of forks> $ fork_bomb 8 500 This will probably hit the per-user process limit within a couple of minutes. Many solutions I've found just tell you to increase the per-user process limit, but I need this to run about 300,000 times, so that isn't going to work. Similarly, suggestions to re-exec and so on to clear the process table aren't what I need. I'd like to actually fix the problem instead of slapping duct tape over it. I crawl the process table looking for the child processes and shut down the hung processes individually in the SIGALRM handler, which needs to die because the rest of real code has no hope of success after that. The kludgey crawl through the process table doesn't bother me from a performance perspective, but I wouldn't mind not doing it: use Parallel::ForkManager; use Proc::ProcessTable; my $pm = Parallel::ForkManager->new( $ARGV[0] ); my $alarm_sub = sub { kill 9, map { $_->{pid} } grep { $_->{ppid} == $$ } @{ Proc::ProcessTable->new->table }; die "Alarm rang for $$!\n"; }; foreach ( 0 .. $ARGV[1] ) { print "."; print "\n" unless $count++ % 50; my $pid = $pm->start and next; local $SIG{ALRM} = $alarm_sub; eval { alarm( 2 ); system "$^X -le '<STDIN>'"; # this will hang alarm( 0 ); }; $pm->finish; } If you want to run out of processes, take out the kill. I thought that setting a process group would work so I could kill everything together, but that blocks: my $alarm_sub = sub { kill 9, -$$; # blocks here die "Alarm rang for $$!\n"; }; foreach ( 0 .. $ARGV[1] ) { print "."; print "\n" unless $count++ % 50; my $pid = $pm->start and next; setpgrp(0, 0); local $SIG{ALRM} = $alarm_sub; eval { alarm( 2 ); system "$^X -le '<STDIN>'"; # this will hang alarm( 0 ); }; $pm->finish; } The same thing with POSIX's setsid didn't work either, and I think that actually broke things in a different way since I'm not really daemonizing this. Curiously, Parallel::ForkManager's run_on_finish happens too late for the same clean-up code: the grandchildren are apparently already disassociated from the child processes at that point.

Read the article

Search Results

Search found 4272 results on 171 pages for 'processes'.

Page 18/171 | < Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >

- by RiZe

- by rumtscho

- by christopher.jones

- by Vinayak

- by Paul Williams

- by Steven

- by Old McStopher

- by Malcolm Box

- by user3682065

- by Hugh

- by fuenfundachtzig

- by locster

- by Ben Horton

- by user198864

- by Virtual_Lotos

- by Zubair

- by overmann

- by Pini Reznik

- by Chris Cooper

- by faye.todd(at)oracle.com

- by rarbox

- by Garret Smith

- by vorik

- by Hoff

- by brian d foy

< Previous Page | 14 15 16 17 18 19 20 21 22 23 24 25 | Next Page >