Search Results

Search found 5679 results on 228 pages for 'kill processes'.

Page 207/228 | < Previous Page | 203 204 205 206 207 208 209 210 211 212 213 214 | Next Page >

How to move a ruby on rails application to a new server

- by ManiacZX

I have a rails app on an old Ubuntu server I need to move onto a new machine. I haven't worked with ruby on rails so I don't really know anything about the structure of the app. I want to load this onto an Ubuntu 8.04 AMI on Amazon EC2 and am looking for any information regarding the migration process such as: Do I copy over the entire folder defined as the application root in the mongrel config (for ex: /u/apps/myapp/current) or just certain folders? Am I looking for trouble if I go with the latest versions of ruby and the various gems? Any general gotchas to look out for in the process. Current server information: root@webnode001:/# cat /proc/version Linux version 2.6.15-27-server (buildd@terranova) (gcc version 4.0.3 (Ubuntu 4.0.3-1ubuntu5)) #1 SMP Fri Dec 8 18:43:54 UTC 2006 root@webnode001:/# rails -v Rails 1.2.3 root@webnode001:/# mongrel_rails cluster::configure --version Version 1.0.1 root@webnode001:/# gem -v 0.9.0 root@webnode001:/# gem list -l *** LOCAL GEMS *** actionmailer (1.3.3, 1.2.5) Service layer for easy email delivery and testing. actionpack (1.13.3, 1.12.5) Web-flow and rendering framework putting the VC in MVC. actionwebservice (1.2.3, 1.1.6) Web service support for Action Pack. activerecord (1.15.3, 1.15.2, 1.14.4) Implements the ActiveRecord pattern for ORM. activesupport (1.4.2, 1.4.1, 1.3.1) Support and utility classes used by the Rails framework. cgi_multipart_eof_fix (2.1) Fix an exploitable bug in CGI multipart parsing which affects Ruby <= 1.8.5 when multipart boundary attribute contains a non-halting regular expression string. daemons (1.0.7, 1.0.5, 1.0.4, 1.0.2) A toolkit to create and control daemons in different ways eventmachine (0.7.2, 0.7.0) Ruby/EventMachine socket engine library fastercsv (1.2.0, 1.1.0) FasterCSV is CSV, but faster, smaller, and cleaner. fastthread (1.0) Optimized replacement for thread.rb primitives ferret (0.11.4) Ruby indexing library. gem_plugin (0.2.2, 0.2.1) A plugin system based only on rubygems that uses dependencies only mongrel (1.0.1, 0.3.13.4) A small fast HTTP library and server that runs Rails, Camping, Nitro and Iowa apps. mongrel_cluster (0.2.1) Mongrel plugin that provides commands and Capistrano tasks for managing multiple Mongrel processes. mysql (2.7) MySQL/Ruby provides the same functions for Ruby programs that the MySQL C API provides for C programs. piston (1.3.3) Piston is a utility that enables merge tracking of remote repositories. rails (1.2.3, 1.1.6) Web-application framework with template engine, control-flow layer, and ORM. rake (0.7.3, 0.7.1) Ruby based make-like utility. sources (0.0.1) This package provides download sources for remote gem installation swiftiply (0.5.1) A fast clustering proxy for web applications.

Read the article
Cloudify: bootstrap-localcloud: operation failed?

- by quanta

OS: Gentoo, CentOS Version: 2.1.0 Follow the quick start guide, I got the below error when running bootstrap-localcloud: cloudify@default> bootstrap-localcloud STARTING CLOUDIFY MANAGEMENT 2012-05-30 14:55:50,396 WARNING [org.cloudifysource.shell.commands.AbstractGSCommand] - ; \ Caused by: org.cloudifysource.shell.commands.CLIException: \ Error while starting agent. \ Please make sure that another agent is not already running. Operation failed. What port Cloudify is using to check that agent is running? PS: it's working fine when running on Windows. UPDATE: Wed May 30 22:37:30 ICT 2012 Reply to @tamirkorem and @Itai Frenkel: I'm pretty sure because this is the first time I run that command on 2 servers. More clearly, here're the output: cloudify@default> teardown-localcloud Teardown will uninstall all of the deployed services. Do you want to continue [y/n]? 2012-05-30 22:43:33,145 WARNING [org.cloudifysource.shell.commands.AbstractGSCommand] - Teardown failed. Failed to fetch the currently deployed applications list. For force teardown use the -force flag. Operation failed. cloudify@default> teardown-localcloud -force Teardown will uninstall all of the deployed services. Do you want to continue [y/n]? Failed to fetch the currently deployed applications list. Continuing teardown-localcloud. .2012-05-30 22:46:39,040 WARNING [org.cloudifysource.shell.commands.AbstractGSCommand] - Teardown aborted, an agent was not found on the local machine. Operation failed. and this one is the detailed result: cloudify@default> bootstrap-localcloud --verbose NIC Address=127.0.0.1 Lookup Locators=127.0.0.1:4172 Lookup Groups=localcloud Starting agent and management processes: gs-agent.sh gsa.global.lus 0 gsa.lus 0 gsa.gsc 0 gsa.global.gsm 0 gsa.gsm_lus 1 gsa.global.esm 0 gsa.esm 1 >/dev/null 2>&1 STARTING CLOUDIFY MANAGEMENT 2012-05-30 22:36:12,870 WARNING [org.cloudifysource.shell.commands.AbstractGSCommand] - ; Caused by: org.cloudifysource.shell.commands.CLIException: Error while starting agent. Please make sure that another agent is not already running. Command executed: /usr/local/src/gigaspaces-cloudify-2.1.0-ga/bin/gs-agent.sh gsa.global.lus 0 gsa.lus 0 gsa.gsc 0 gsa.global.gsm 0 gsa.gsm_lus 1 gsa.global.esm 0 gsa.esm 1 >/dev/null 2>&1 Reply to @Eliran Malka: there is no such process listening on port 4172: # netstat --protocol=inet -nlp Active Internet connections (only servers) Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name tcp 0 0 127.0.0.1:9050 0.0.0.0:* LISTEN 2363/tor tcp 0 0 127.0.0.1:3306 0.0.0.0:* LISTEN 2331/mysqld tcp 0 0 127.0.0.1:631 0.0.0.0:* LISTEN 2293/cupsd

Read the article
umount bind of stale NFS

- by Paul Eisner

i've got a problem removing mounts created with mount -o bind from a locally mounted NFS folder. Assume the following mount structure: NFS mounted directory: $ mount -o rw,soft,tcp,intr,timeo=10,retrans=2,retry=1 \ 10.20.0.1:/srv/source /srv/nfs-source Bound directory: $ mount -o bind /srv/nfs-source/sub1 /srv/bind-target/sub1 Which results in this mount map $ mount /dev/sda1 on / type ext3 (rw,errors=remount-ro) # ... 10.20.0.1:/srv/source on /srv/nfs-source type nfs (rw,soft,tcp,intr,timeo=10,retrans=2,retry=1,addr=10.20.0.100) /srv/nfs-source/sub1 on /srv/bind-target/sub1 type none (rw,bind) If the server (10.20.0.1) goes down (eg ifdown eth0), the handles become stale, which is expected. I can now un-mount the NFS mount with force $ umount -f /srv/nfs-source This takes some seconds, but works without any problems. However, i cannot un-mount the bound directory in /srv/bind-target/sub1. The forced umount results in: $ umount -f /srv/bind-target/sub1 umount2: Stale NFS file handle umount: /srv/bind-target/sub1: Stale NFS file handle umount2: Stale NFS file handle Here is a trace http://pastebin.com/ipvvrVmB I've tried umounting the sub-directories beforehand, find any processes accessing anything within the NFS or bind mounts (there are none). lsof also complains: $ lsof -n lsof: WARNING: can't stat() nfs file system /srv/nfs-source Output information may be incomplete. lsof: WARNING: can't stat() nfs file system /srv/bind-target/sub1 (deleted) Output information may be incomplete. lsof: WARNING: can't stat() nfs file system /srv/bind-target/ Output information may be incomplete. I've tried with recent stable Linux kernels 3.2.17, 3.2.19 and 3.3.8 (cannot use 3.4.x, cause need the grsecurity patch, which is not, yet, supported - grsecurity is not patched in in the tests above!). My nfs-utils are version 1.2.2 (debian stable). Does anybody have an idea how i can either: force the un-mount some other way? (any dirty trick is welcome, data loss or damage neglible at this point) use something else instead of mount -o bind? (cannot use soft links, cause mounted directories will be used in chroot; bindfs via FUSE is far to slow to be an option) Thanks, Paul Update 1 With 2.6.32.59 the umount of the (stale) sub-mounts work just fine. It seems to be a kernel regression bug. The above tests where with NFSv3. Additional tests with NFSv4 showed no change. Update 2 We have tested now multiple 2.6 and 3.x kernels and are now sure, that this was introduced in 3.0.x. We will fille a bug report, hopefully they figure it out.

Read the article
Trouble setting up PATH for Java on Debian

- by milkmansrevenge

I am trying to get Oracle Java 7 update 3 working correctly on Debian 6. I have downloaded and set up the files in /usr/java/jre1.7.0_03. I have also set the following two lines at the end of /etc/bash.bashrc: export JAVA_HOME=/usr/java/jre1.7.0_03 export PATH=$PATH:$JAVA_HOME/bin Logging in as other users and root is fine, Java can be found: chris@mc:~$ java -version java version "1.7.0_03" Java(TM) SE Runtime Environment (build 1.7.0_03-b04) Java HotSpot(TM) 64-Bit Server VM (build 22.1-b02, mixed mode) However there are two cases where Java cannot be found as detailed below. Note that both of these worked fine when I have previously installed OpenJDK Java 6 via aptitude, but I need Oracle Java 7 for various reasons. Most importantly, I cannot run commands as another user via su, despite the PATH showing that Java should be present. The user was created with adduser chris root@mc:~# su chris -c "echo $PATH" /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/java/jre1.7.0_03/bin:/bin root@mc:~# su chris -c "java -version" bash: java: command not found root@mc:~# su chris -c "/usr/java/jre1.7.0_03/bin/java -version" java version "1.7.0_03" ... How can it be in the PATH but not be found? Update 05/04/2012: explained by Daniel, to do with it being a non-interactive shell so files such as /etc/profile and /etc/bash.bashrc are not executed. Doing a full swap to that user and running Java works: root@mc:~# su chris chris@mc:/root$ java -version java version "1.7.0_03" ... I run a script on start up which exhibits similar but slightly different problems. The script is located in /etc/init.d/start-mystuff.sh and calls a jar: #!/bin/bash # /etc/init.d/start-mystuff.sh java -jar /opt/Mars.jar I can confirm that the script runs on start up and the exit code is 127, which indicates command not found. Inserting a line to print/save the PATH shows that it is: /sbin:/usr/sbin:/bin:/usr/bin This second problem isn't as important because I can just point directly to the Java executable in the script, but I am still curious! I have tried setting the full PATH and JAVA_HOME explicitly in /etc/environment which didn't help. I have also tried setting them in /etc/profile which doesn't seem to help either. I have tried logging in and out again after setting PATH in the various locations (duh!). Anyway, long post for what will probably have a simple one line solution :( Any help with this would be greatly appreciated, I have spent far too long trying to fix it by myself. Motivation The first problem may seem obscure but in my system I have users that are not allowed SSH access yet I still want to run processes as them. I have a ton of scripts operating in this way and don't want to have to change them all.

Read the article
Unusually high memory usage on a CentOS VPS with 512 guaranteed RAM

- by Andrei Bârsan

I'm working on a medium-sized web application written in PHP that's running on a VPS with 512mb ram. The webapp hasn't been officially launched yet, so there isn't too much traffic going on, just me and a few other people working on it. There is another slightly smaller webapp also hosted on this machine, among 4-5 other small static sites. We are running Centos 5 32-bit & cPanel/WHM. This is the result of running ps aux and, as you can see, it's not using 100% of the RAM. However, on the hypanel overview, it's always shown as using aroun 500MB ram, just for running apache, mysql, and the lowest-memory-footprint versions of the mail server, ftp server etc. -bash-3.2# ps aux USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND root 1 0.0 0.0 2156 664 ? Ss 12:08 0:00 init [3] root 1123 0.0 0.0 2260 548 ? S<s 12:08 0:00 /sbin/udevd -d root 1462 0.0 0.0 1812 568 ? Ss 12:08 0:00 syslogd -m 0 named 1496 0.0 0.0 3808 820 ? Ss 12:08 0:00 nsd named 1497 0.0 0.0 10672 756 ? S 12:08 0:00 nsd named 1499 0.0 0.0 3880 584 ? S 12:08 0:00 nsd root 1514 0.0 0.1 7240 1064 ? Ss 12:08 0:00 /usr/sbin/sshd root 1522 0.0 0.0 2832 832 ? Ss 12:08 0:00 xinetd -stayalive -pidfile /var/run/xinetd.pid root 1534 0.0 0.1 3712 1328 ? S 12:08 0:00 /bin/sh /usr/bin/mysqld_safe --datadir=/var/lib/mysql - mysql 1667 0.0 2.9 225680 30884 ? Sl 12:08 0:00 /usr/sbin/mysqld --basedir=/ --datadir=/var/lib/mysql - mailnull 1766 0.0 0.1 9352 1100 ? Ss 12:08 0:00 /usr/sbin/exim -bd -q60m root 1797 0.0 0.0 2156 708 ? Ss 12:08 0:00 /usr/sbin/dovecot root 1798 0.0 0.0 2632 1012 ? S 12:08 0:00 dovecot-auth root 1816 0.0 3.0 38580 32456 ? Ss 12:08 0:01 /usr/local/bin/spamd -d --allowed-ips=127.0.0.1 --pidfi root 1839 0.0 1.6 63200 17496 ? Ss 12:08 0:00 /usr/local/apache/bin/httpd -k start -DSSL root 1846 0.0 0.1 5416 1468 ? Ss 12:08 0:00 pure-ftpd (SERVER) root 1848 0.0 0.1 6212 1244 ? S 12:08 0:00 /usr/sbin/pure-authd -s /var/run/ftpd.sock -r /usr/sbin root 1856 0.0 0.1 4492 1112 ? Ss 12:08 0:00 crond root 1864 0.0 0.0 2356 428 ? Ss 12:08 0:00 /usr/sbin/atd dovecot 1927 0.0 0.1 5196 1952 ? S 12:08 0:00 pop3-login dovecot 1928 0.0 0.1 5196 1948 ? S 12:08 0:00 pop3-login dovecot 1929 0.0 0.1 5316 2012 ? S 12:08 0:00 imap-login dovecot 1930 0.0 0.2 5416 2228 ? S 12:08 0:00 imap-login root 1939 0.0 0.1 3936 1964 ? S 12:08 0:00 cPhulkd - processor root 1963 0.0 0.8 15876 8564 ? S 12:08 0:00 cpsrvd (SSL) - waiting for connections root 1966 0.0 0.7 15172 7748 ? S 12:08 0:00 cpdavd - accepting connections on 2077 and 2078 root 1990 0.0 0.2 5008 3136 ? S 12:08 0:00 queueprocd - wait to process a task root 2017 0.0 2.9 38580 31020 ? S 12:08 0:00 spamd child root 2018 0.0 0.5 8904 5636 ? S 12:08 0:00 /usr/bin/perl /usr/local/cpanel/bin/leechprotect nobody 2021 0.0 3.2 66512 33724 ? S 12:08 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 2022 0.0 3.1 67812 33024 ? S 12:08 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 2024 0.0 1.9 64364 20680 ? S 12:08 0:00 /usr/local/apache/bin/httpd -k start -DSSL root 2027 0.0 0.4 9000 4540 ? S 12:08 0:00 tailwatchd root 2032 0.0 0.1 4176 1836 ? SN 12:08 0:00 cpanellogd - sleeping for logs nobody 3096 0.0 1.9 64572 20264 ? S 12:09 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 3097 0.0 2.8 66008 30136 ? S 12:09 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 3098 0.0 2.8 65704 29752 ? S 12:09 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 3099 0.0 3.1 67260 32816 ? S 12:09 0:00 /usr/local/apache/bin/httpd -k start -DSSL andrei 3448 0.0 0.1 3204 1632 ? S 12:50 0:00 imap nobody 3537 0.0 1.9 64308 20108 ? S 13:01 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 3614 0.0 1.9 64576 20628 ? S 13:10 0:00 /usr/local/apache/bin/httpd -k start -DSSL nobody 3615 0.0 1.3 63200 14672 ? S 13:10 0:00 /usr/local/apache/bin/httpd -k start -DSSL root 3626 0.0 0.2 10232 2964 ? Rs 13:14 0:00 sshd: root@pts/0 root 3648 0.0 0.1 3844 1600 pts/0 Ss 13:14 0:00 -bash root 3826 0.0 0.0 2532 908 pts/0 R+ 13:21 0:00 ps aux Lately, without any significant changes to the configuration, the memory usage started peaking and going over 512, causing the virtual server to kill apache, basically murdering our site in the process. Do you have any idea if this is normal and more resources should be acquired? I don't think... since there isn't too much data or traffic online yet.

Read the article
Debugging IO limitation

- by Martin F

I have a Fedora box with some severe IO limitations which I have no idea how to debug. The server has a Areca Technology Corp. ARC-1130 12-Port PCI-X to SATA RAID Controller with 12 7200 RPM 1.5 TB disks and a Marvell Technology Group Ltd. 88E8050 PCI-E ASF Gigabit Ethernet Controller. uname -a output: 2.6.32.11-99.fc12.x86_64 #1 SMP Mon Apr 5 19:59:38 UTC 2010 x86_64 x86_64 x86_64 GNU/Linux The server is a file server running Nginx with the stub status module enabled, so I can see the current amount of connections. The problem present itself when I have a high number of simultaneous connections in a writing state. Usually around 350, at this very moment it's at 590 and the server is almost unusable and stuck at 230mbit/s. If I run stop and hit 1 to see CPU core usages I have all 4 cores with around 99% io wait, if I run iotop the nginx workers are the only processes producing any read load, currently at around 25MB/s. I have each of the workers bound to their own core. Initially I figured it was just the disks being bugged. But I've run fscheck and smartmontools checks and found no errors. I also ran an iozone test which you can see the result of here: http://www.pastie.org/951667.txt?key=fimcvljulnuqy2dcdxa Additionally, when the amount of connections are low I have no problem getting a good speed. If I wget over the local network it easily hits 60MB/sec. Right now I just tried putting a file in /dev/shm, then I symlinked a file from the public dir to it and used wget over the local network and only got 50KB/s. Also, if I try to cp /dev/shm/test /root/test it quickly copies around 740MB and then slows down HEAVILY. Again with iotop reporting 99% iowait. I'm not really sure how to go about figuring out what the problems are. It could be a natural disk limitation but then the file from /dev/shm ought to transfer so it seems there's a network limit, but that's fine when there's not many connections. Perhaps it's a TCP stack problem but I really have no idea how to check that. Any suggestions on how to proceed with debugging would be very welcome. If additional information is required then let me know and I'll try to get it. Thanks.

Read the article
Ubuntu unattended-upgrades stops apache

- by Robbie

This morning i was alerted to the fact that both apache instances serving my app were not responding to requests from my load balancer. I attempted apachectl restart and it said apache was not running. So, i started apache on both instances and got the service up again. I then followed the logs and worked out that both had performed upgrades via the unattended-upgrades package moments before they stopped responding. /var/log/unattended-upgrades/unattended-upgrades.log 2013-07-02 06:30:51,875 INFO Starting unattended upgrades script 2013-07-02 06:30:51,875 INFO Allowed origins are: ['o=Ubuntu,a=precise-security'] 2013-07-02 06:33:57,771 INFO Packages that are upgraded: accountsservice apache2 apache2-mpm-prefork apache2-utils apache2.2-bin apache2.2-common apparmor apport apt apt-transport-https apt-utils bind9-host binutils dbus dnsutils gnupg gpgv isc-dhcp-client isc-dhcp-common krb5-locales libaccountsservice0 libapt-inst1.4 libapt-pkg4.12 libbind9-80 libc-bin libc-dev-bin libc6 libc6-dev libcurl3-gnutls libdbus-1-3 libdbus-glib-1-2 libdns81 libdrm-intel1 libdrm-nouveau1a libdrm-radeon1 libdrm2 libexpat1 libfreetype6 libgc1c2 libgnutls-dev libgnutls-openssl27 libgnutls26 libgnutlsxx27 libisc83 libisccc80 libisccfg82 liblwres80 libruby1.8 libx11-6 libx11-data libxcb1 libxext6 libxml2 linux-firmware linux-image-virtual linux-libc-dev linux-virtual multiarch-support openssl perl perl-base perl-modules python-apport python-crypto python-keyring python-problem-report python-software-properties ri1.8 ruby1.8 ruby1.8-dev sudo tzdata update-manager-core 2013-07-02 06:33:57,772 INFO Writing dpkg log to '/var/log/unattended-upgrades/unattended-upgrades-dpkg_2013-07-02_06:33:57.772399.log' 2013-07-02 06:36:10,584 INFO All upgrades installed I'm running Ubuntu 12.04 on Amazon EC2 servers. I have unattended-upgrades installed and configured as follows: /etc/apt/apt.conf.d/50unattended-upgrades // Automatically upgrade packages from these (origin:archive) pairs Unattended-Upgrade::Allowed-Origins { "${distro_id}:${distro_codename}-security"; // "${distro_id}:${distro_codename}-updates"; // "${distro_id}:${distro_codename}-proposed"; // "${distro_id}:${distro_codename}-backports"; }; // List of packages to not update Unattended-Upgrade::Package-Blacklist { }; /etc/apt/apt.conf.d/20auto-upgrades APT::Periodic::Update-Package-Lists "1"; APT::Periodic::Unattended-Upgrade "1"; I've struggled to find documentation about what happens to running processes during an upgrade. - Is this expected behaviour? Or should unattended-upgrades restart apache after upgrading it? - What can I do to ensure apache is restarted correctly? Should I just blacklist the apache package?

Read the article
T60 Screen/LCD gets black after some minutes with a highpitched sound rising and fading

- by Edward De Leau

Just now my T60 screen got "black" (so no display). On my second monitor: no problems so the vga output works. symptom: Screen blanks / no display but works on second monitor steps to reproduce: - boot - wait (it does not matter what you do you do not have to login or anything) - (now the monitor of the laptop slowly begins to make a ssssssssHHHHHHHHHHHHHHHHHWOEOEssssssss noice of about 10 seconds) - right after the sounds ends the monitor gets black times seem to be the same each time. software: installed no new software before/after, running zone alarm and antivirus. other: it does not feel hot in any place, there dont seem to be running processes with strange behaviour. warranty: out of warranty what was i doing: typing text on a website and doing some php coding in a text editor. Anyone any idea what I can do here other than buy a new laptop? / does it sound familiar to known cases? update: * exactly the same problem: http://forums.lenovo.com/t5/T61-and-prior-T-series-ThinkPad/T60-Screen-Blackout/m-p/288772 and the second poster (garyj) here: http://forums.lenovo.com/t5/T61-and-prior-T-series-ThinkPad/Black-Screen-on-T60/m-p/235053#M48627 and here: "i have that same problem. i replaced the CCRL on mine and it works fine when the screen is not screwed in. once the frame of the LCD screen (metal portion) touches the metal on the laptop which holds the screen the screen goes black. If the metal is touching the screen when you boot up it boots up with it being very dimmly lit. " from http://forums.lenovo.com/t5/T61-and-prior-T-series-ThinkPad/T60-screen-problems/m-p/205047#M44995 (it seems replacing the lcd is no use, he tried it 3 times). same problem: http://forums.lenovo.com/t5/T61-and-prior-T-series-ThinkPad/T60-black-screen/m-p/80604#M25914 Hmmm... not handy 3 or 4 months ago I ordered and installed a new fan. Now the LCD. Which does not seem the core isse but some electric issue so it seems replacing the LCD is not the thing todo here. I have the following question: If it is not the LCD that needs to be replaced (see other threads) which parts can I order to fix this? Has anyone got any information which could lead me to identify the issue? I have read replacing the "inverter" AND the "backlightning" would that make sense?

Read the article
nginx connection time issue on some IPs

- by sheldon

I have recently shifted my server to nginx and php-fpm getting rid of apache. This has helped improves speeds of my website. Everything seems to work fine until i came across this issue, i noticed that nginx keeps throwing connection time out errors for only certain IPs. One of the IPs is my office IP, we have a backend that is accessed from our office through out the day. I use supervisord to launch 3 php-fpm processes with workers this is my typical php-fpm config pm.max_children = 50 pm.start_servers = 20 pm.min_spare_servers = 5 pm.max_spare_servers = 35 pm.max_requests = 300 Since i have a server with 4 cores and 2 GB ram this is my nginx setup worker_processes 4; worker_rlimit_nofile 8192; events { worker_connections 1024; use epoll; multi_accept off; } sendfile on; tcp_nopush on; tcp_nodelay on; keepalive_timeout 55; recursive_error_pages on; server_name_in_redirect off; server_tokens off; client_header_timeout 3m; client_body_timeout 3m; send_timeout 3m; connection_pool_size 256; client_header_buffer_size 8k; large_client_header_buffers 4 32k; request_pool_size 4k; output_buffers 4 32k; postpone_output 1460; proxy_buffer_size 32k; proxy_buffers 4 32k; proxy_busy_buffers_size 64k; proxy_temp_file_write_size 64k; fastcgi_connect_timeout 120; fastcgi_send_timeout 120; fastcgi_read_timeout 180; fastcgi_buffer_size 128k; fastcgi_buffers 4 256k; fastcgi_busy_buffers_size 256k; fastcgi_temp_file_write_size 256k; fastcgi_intercept_errors on; fastcgi_ignore_client_abort off; Where am i going wrong with the config, I have tried various settings but the issue still persists. These are the errors i keep getting 2011/11/13 18:20:33 [error] 21583#0: *311683 upstream timed out (110: Connection timed out) while reading response header from upstream, client: IP, server: tastykhana.in, request: "GET url HTTP/1.1", upstream: "fastcgi://unix:/var/run/php-fpm.socket:", host: "tastykhana.in", referrer: "url"

Read the article
Core i7 c1e and speedstepping - BSOD on shutdown

- by DeaconDesperado

I'm having an interesting problem with my recent Core i7 Digital Audio workstation build that I am curious to see if others have encountered. First, here are the specs on the machine. ASUS P6TD Deluxe Intel X58 Socket LGA1366 MB Intel Core i7-950 3.06Ghz 8M LGA1366 CPU CORSAIR DOMINATOR 6GB (3 x 2GB) 240-Pin DDR3 SDRAM DDR3 1600 Western Digital Caviar Black WD5001AALS 500GB Plus a couple ASUS optical drives and a 750W Corsair PSU. Running Windows 7 x64. All this is connected to the nefarious Digi 002 firewire audio interface for use with Pro Tools. I following mostly the specs posted by many other I7 users in the digidesign community who pooled their collective knowledge in this thread. Now after completing my build, I fell victim to the "UD5 squeal" described at that forum thread. So taking the advice posted, I disabled c1e advanced halt state and Intel speed stepping (I would likely have done this anyway to maintain a stable clock, power consumption isn't really a relevant concern on this machine.) I enabled XMP to set the ram timings properly as well. What I am experiencing is a BSOD upon shutdown, but only immediately after windows fully exits and ends all processes. The error is a MACHINE_CHECK_EXCEPTION 0x000000. The funny thing is that it is extremely intermitent and only occurs if the shutdown immediately followed a period of relative idleness. It does not a generate a minidump, I suspect because windows monitoring has terminated by the time this error occurs. No damage is evident and one can simply turn off manually and the system will act as though a proper shutdown had occurred. If anything it is a annoyance, I just want to be certain it is not affecting my long term stability. I have read that the i7 950 does not like DRAM voltages past 1.65, but that they are acceptable if they are within .5 of the BLCK setting. I have tried disabling XMP and setting all timings to auto and the problem still manifests in an identical way. It is suspect that the cpu idleness preceding shutdown is the determining factor, as both c1e and speedstepping are both settings intended to modify handling of this state. Any suggestions or prior experiences would be greatly appreciated. EDIT: The behavior very closely resembles what's described in this thread: http://www.tomshardware.com/forum/12003-63-shut-problem-windows The benign nature of it of is identical. I can't seem to download the hotfix cited there however.

Read the article
500 Internal Server Error when setting up Apache on Ubuntu+Django

- by ApacheQ

I tried with Apache on ubuntu 9.04 and get the same error: Internal Server Error The server encountered an internal error or misconfiguration and was unable to complete your request. Please contact the server administrator, webmaster@localhost and inform them of the time the error occurred, and anything you might have done that may have caused the error. More information about this error may be available in the server error log. And my apache/error.log is: [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] ServerName: 'sapint2' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] DocumentRoot: '/etc/apache2/htdocs' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] URI: '/' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] Location: '/' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] Directory: None [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] Filename: '/etc/apache2/htdocs' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] PathInfo: '/' [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] Traceback (most recent call last): [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/lib/python2.7/dist-packages/mod_python/importer.py", line 1537, in HandlerDispatch\n default=default_handler, arg=req, silent=hlist.silent) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/lib/python2.7/dist-packages/mod_python/importer.py", line 1229, in _process_target\n result = _execute_target(config, req, object, arg) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/lib/python2.7/dist-packages/mod_python/importer.py", line 1128, in _execute_target\n result = object(arg) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/modpython.py", line 180, in handler\n return ModPythonHandler()(req) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/modpython.py", line 142, in call\n self.load_middleware() [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/handlers/base.py", line 45, in load_middleware\n mod = import_module(mw_module) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/utils/importlib.py", line 35, in import_module\n import(name) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/contrib/sessions/middleware.py", line 4, in \n from django.utils.cache import patch_vary_headers [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/utils/cache.py", line 25, in \n from django.core.cache import get_cache [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/cache/init.py", line 187, in \n cache = get_cache(DEFAULT_CACHE_ALIAS) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/cache/init.py", line 179, in get_cache\n cache = backend_cls(location, params) [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] File "/usr/local/lib/python2.7/dist-packages/django/core/cache/backends/memcached.py", line 139, in init\n "Memcached cache backend requires either the 'memcache' or 'cmemcache' library" [Sat Oct 06 09:32:04 2012] [error] [client 10.0.64.10] InvalidCacheBackendError: Memcached cache backend requires either the 'memcache' or 'cmemcache' library [Sat Oct 06 09:51:30 2012] [notice] caught SIGTERM, shutting down [Sat Oct 06 09:51:31 2012] [notice] mod_python: Creating 8 session mutexes based on 150 max processes and 0 max threads. [Sat Oct 06 09:51:31 2012] [notice] mod_python: using mutex_directory /tmp [Sat Oct 06 09:51:31 2012] [notice] Apache/2.2.17 (Ubuntu) PHP/5.3.5-1ubuntu7.11 with Suhosin-Patch mod_python/3.3.1 Python/2.7.1+ mod_wsgi/3.3 configured -- resuming normal operations I need some help Thanks

Read the article
Sendmail Tuning For Batch Mail Jobs

- by Kyle Brandt

I have a webservers that send out emails to a sendmail relay server as a batch job. The emails need to be accepted by the relay sendmail server as fast as possible, however, they do not need to go out (be relayed) very quickly. I am seeing a couple timeouts once and a while from the webserver trying to connect to the relay server. The load currently is about 30 emails a second for a couple minutes. There are quite a few tuning options for sendmail in the sendmail tuning guide. What I am focusing on now is the Delivery Mode: Delivery Mode There are a number of delivery modes that sendmail can operate in, set by the DeliveryMode ( d) configuration option. These modes specify how quickly mail will be delivered. Legal modes are: i deliver interactively (synchronously) b deliver in background (asynchronously) q queue only (don't deliver) d defer delivery attempts (don't deliver) There are tradeoffs. Mode i gives the sender the quickest feedback, but may slow down some mailers and is hardly ever necessary. Mode b delivers promptly but can cause large numbers of processes if you have a mailer that takes a long time to deliver a message. Mode q minimizes the load on your machine, but means that delivery may be delayed for up to the queue interval. Mode d is identical to mode q except that it also prevents lookups in maps including the -D flag from working during the initial queue phase; it is intended for ``dial on demand'' sites where DNS lookups might cost real money. Some simple error messages (e.g., host unknown during the SMTP protocol) will be delayed using this mode. Mode b is the usual default. If you run in mode q (queue only), d (defer), or b (deliver in background) sendmail will not expand aliases and follow .forward files upon initial receipt of the mail. This speeds up the response to RCPT commands. Mode i should not be used by the SMTP server. I currently have the CentOS default modes: Sendmail.cf: DeliveryMode=background Submit.cf: DeliveryMode=i Is sendmail.cf/mc for outgoing email from relay (to the intertubes) and sumbit.cf/mc for incoming eamil (from my webservers). Would it make sense to change the outgoing delivery mode to queue? If I did, what would the outbound email flow behave like? If this is the right thing to do, can anyone show me example mc configurations for this change? If it isn't, what recommendations are there for these constraints?

Read the article
sshd running but no PID file

- by dunxd

I'm recently started using monit to monitor the status of sshd on my CentOS 5.4 server. This works fine, but every so often monit reports that sshd is no longer running. This isn't true - I am still able to login to the server via ssh, however I note the following: There is no longer any PID file at /var/run/sshd.pid - after a reboot this file exists. Once it is gone, restarting sshd via service sshd restart does not create the PID file. sudo service sshd status reports openssh-daemon is stopped - again, restarting sshd does not change this, but a reboot does. sudo service sshd stop reports failed, presumably because of the missing PID file. Any idea what is going on? Update sudo netstat -lptun gives the following output relating to port 22 tcp 0 0 :::22 :::* LISTEN 20735/sshd Killing the process with this PID as suggested by @Henry and then starting sshd via service results in service sshd status recognising the process by PID again. Would still like to understand this better. RPM verify suggested by a couple of answerers shows this: sudo rpm -vV openssh openssh-server openssh-clients | grep 'S\.5' S.5....T c /etc/pam.d/sshd S.5....T c /etc/ssh/sshd_config /etc/pam.d/sshd has the following contents: #%PAM-1.0 auth include system-auth account required pam_nologin.so account include system-auth password include system-auth session optional pam_keyinit.so force revoke session include system-auth #session required pam_loginuid.so Should that last line be commented out? Update Here's the output of @YannickGirouard 's script: $ sudo ./sshd_test Searching for the process listening on port 22... Found the following PID: 21330 Command line for PID 21330: /usr/sbin/sshd Listing process(es) relating to PID 21330: UID PID PPID C STIME TTY TIME CMD root 21330 1 0 14:04 ? 00:00:00 /usr/sbin/sshd Listing RPM information about openssh packages: Name : openssh Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:50:57 AM GMT Build Host: builder10.centos.org Group : Applications/Internet Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 745390 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH implementation of SSH protocol versions 1 and 2 ------------------------------------------------------ Name : openssh-clients Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:51:04 AM GMT Build Host: builder10.centos.org Group : Applications/Internet Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 871132 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH client applications ------------------------------------------------------ Name : openssh-server Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:51:04 AM GMT Build Host: builder10.centos.org Group : System Environment/Daemons Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 492478 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH server daemon ------------------------------------------------------ However, I've since got things working by killing the process and starting afresh, as suggested by @Henry below, so perhaps I am no longer seeing the same thing. Will try again if I am seeing the issue again after next reboot. Update - 14 March Monit alerted me that sshd had disappeared, and again I am able to ssh onto the server. So now I can run the script $ sudo ./sshd_test Searching for the process listening on port 22... Found the following PID: 2208 Command line for PID 2208: /usr/sbin/sshd Listing process(es) relating to PID 2208: UID PID PPID C STIME TTY TIME CMD root 2208 1 0 Mar13 ? 00:00:00 /usr/sbin/sshd root 1885 2208 0 21:50 ? 00:00:00 sshd: dunx [priv] Listing RPM information about openssh packages: Name : openssh Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:50:57 AM GMT Build Host: builder10.centos.org Group : Applications/Internet Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 745390 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH implementation of SSH protocol versions 1 and 2 ------------------------------------------------------ Name : openssh-clients Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:51:04 AM GMT Build Host: builder10.centos.org Group : Applications/Internet Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 871132 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH client applications ------------------------------------------------------ Name : openssh-server Relocations: (not relocatable) Version : 4.3p2 Vendor: CentOS Release : 72.el5_7.5 Build Date: Tue 30 Aug 2011 12:34:14 AM BST Install Date: Sun 06 Nov 2011 12:51:04 AM GMT Build Host: builder10.centos.org Group : System Environment/Daemons Source RPM: openssh-4.3p2-72.el5_7.5.src.rpm Size : 492478 License: BSD Signature : DSA/SHA1, Fri 02 Sep 2011 01:13:01 AM BST, Key ID a8a447dce8562897 URL : http://www.openssh.com/portable.html Summary : The OpenSSH server daemon ------------------------------------------------------ Again, when I look for /var/run/sshd.pid I don't find it. $ cat /var/run/sshd.pid cat: /var/run/sshd.pid: No such file or directory $ sudo netstat -anp | grep sshd tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 2208/sshd $ sudo kill 2208 $ sudo service sshd start Starting sshd: [ OK ] $ cat /var/run/sshd.pid 3794 $ sudo service sshd status openssh-daemon (pid 3794) is running... Is it possible that sshd is restarting and not creating a pidfile for some reason?

Read the article
How to get more information from the system crash

- by viraptor

I'd like to debug an issue I'm having with a linux (debian stable) server, but I'm running out of ideas of how to confirm any diagnosis. Some background: The servers are running DL160 class with hardware raid between two disks. They're running a lot of services, mostly utilising network interface and CPU. There are 8 cpus and 7 "main" most cpu-hungry processes are bound to one core each via cpu affinity. Other random background scripts are not forced anywhere. The filesystem is writing ~1.5k blocks/s the whole time (goes up above 2k/s in peak times). Normal CPU usage for those servers is ~60% on 7 cores and some minimal usage on the last (whatever's running on shells usually). What actually happens is that the "main" services start using 100% CPU at some point, mainly stuck in kernel time. After a couple of seconds, LA goes over 400 and we lose any way to connect to the box (KVM is on it's way, but not there yet). Sometimes we see a kernel reporting hung task (but not always): [118951.272884] INFO: task zsh:15911 blocked for more than 120 seconds. [118951.272955] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. [118951.273037] zsh D 0000000000000000 0 15911 1 [118951.273093] ffff8101898c3c48 0000000000000046 0000000000000000 ffffffffa0155e0a [118951.273183] ffff8101a753a080 ffff81021f1c5570 ffff8101a753a308 000000051f0fd740 [118951.273274] 0000000000000246 0000000000000000 00000000ffffffbd 0000000000000001 [118951.273335] Call Trace: [118951.273424] [<ffffffffa0155e0a>] :ext3:__ext3_journal_dirty_metadata+0x1e/0x46 [118951.273510] [<ffffffff804294f6>] schedule_timeout+0x1e/0xad [118951.273563] [<ffffffff8027577c>] __pagevec_free+0x21/0x2e [118951.273613] [<ffffffff80428b0b>] wait_for_common+0xcf/0x13a [118951.273692] [<ffffffff8022c168>] default_wake_function+0x0/0xe .... This would point at raid / disk failure, however sometimes the tasks are hung on kernel's gettsc which would indicate some general weird hardware behaviour. It's also running mysql (almost read-only, 99% cache hit), which seems to spawn a lot more threads during the system problems. During the day it does ~200kq/s (selects) and ~10q/s (writes). The host is never running out of memory or swapping, no oom reports are spotted. We've got many boxes with similar/same hardware and they all seem to behave that way, but I'm not sure which part fails, so it's probably not a good idea to just grab something more powerful and hope the problem goes away. Applications themselves don't really report anything wrong when they're running. I can run anything safely on the same hardware in an isolated environment. What can I do to narrow down the problem? Where else should I look for explanation?

Read the article
Software: Launching League of Legends spectator mode from Command Line (Mac)

- by Alex Popov

Background: tl;dr at the end League of Legends has a spectator mode, in which you can watch someone else's game (essentially a replay) with a 3 minute delay. Popular LoL website OP.GG has figured out a clever way of hosting these spectator games on their own servers, thereby making them replayable, as opposed to only being available while the game is on (as Riot does it). If you request a replay from OP.GG, it sends a batch file which looks for where the League is situated and then the magic happens: @start "" "League of Legends.exe" "8394" "LoLLauncher.exe" "" "spectator fspectate.op.gg:4081 tjJbtRLQ/HMV7HuAxWV0XsXoRB4OmFBr 1391881421 NA1" This works fine on Windows. I'm trying to get it to work on Mac (which has an official client). First I tried running the same command by hand, (split for convenience) /Applications/ ... /LeagueOfLegends.app/ ... /LeagueofLegends 8393 LoLLauncher \ /Applications/ ... /LolClient spectator fspectate.op.gg:4081 tjJbtRLQ/HMV7HuAxWV0XsXoRB4OmFBr 1391881421 NA1 Running this, however, just starts the LoLLauncher, which closes all the active League processes. The exactly same thing happens if I just call /Applications/ ... /LeagueOfLegends.app/ ... /LeagueofLegends Next I tried seeing what actually happens when Spectator mode is initiated so I ran $ ps -axf | grep -i lol which showed UID PID PPID C STIME TTY TIME CMD 503 3085 1 0 Wed02pm ?? 0:00.00 (LolClient) 503 24607 1 0 9:19am ?? 0:00.98 /Applications/League of Legends.app/Contents/LOL/RADS/system/UserKernel.app/Contents/MacOS/UserKernel updateandrun lol_launcher LoLLauncher.app 503 24610 24607 0 9:19am ?? 1:08.76 /Applications/League of Legends.app/Contents/LoL/RADS/projects/lol_launcher/releases/0.0.0.122/deploy/LoLLauncher.app/Contents/MacOS/LoLLauncher 503 24611 24610 0 9:19am ?? 1:23.02 /Applications/League of Legends.app/Contents/LoL/RADS/projects/lol_air_client/releases/0.0.0.127/deploy/bin/LolClient -runtime .\ -nodebug META-INF\AIR\application.xml .\ -- 8393 503 24927 24610 0 9:44am ?? 0:03.37 /Applications/League of Legends.app/Contents/LoL/RADS/solutions/lol_game_client_sln/releases/0.0.0.117/deploy/LeagueOfLegends.app/Contents/MacOS/LeagueofLegends 8394 LoLLauncher /Applications/League of Legends.app/Contents/LoL/RADS/projects/lol_air_client/releases/0.0.0.127/deploy/bin/LolClient spectator 216.133.234.17:8088 Yn1oMX/n3LpXNebibzUa1i3Z+s2HV0ul 1400781241 NA1 Of Interest: there is (LolClient) which I cannot kill by it's PID. UserKernel updateandrun lol_launcher LoLLauncher.app is launched first. LoLLauncher is launched by the UserKernel (as we can see from the PPID) The very long command (PID: 24927) is how Spectator mode is launched, and is also launched by UserKernel. Spectator mode is launched in exactly the same way that the OP.GG .bat wanted to, with the only difference that Spectator mode connects to Riot instead of OP.GG's spectate server. I tried attaching GDB to the LolClient, but I couldn't get anything meaningful from it since it's an Adobe AIR application (and I've never used GDB with code other than mine own). Next I ran dtruss -a -b 100m -f -p $PID on everything I could think of: the LolClient, the LolLauncher and the UserKernel and skimmed the half a million lines produced. I found stuff like the GET request used to get the information of the game to spectate, but I could not see any launch of the equivalent of League of Legends.exe with spectator options. Finally, I ran lsof | grep -i lol to see if anything else was opened in the process, but didn't find anything that seemed appropriate. Open were UserKernel, LolLauncher, LolClient, Adobe AIR, LeagueofLegends and then Bugsplat, all of which are expected. None of this seemed especially relevant to figuring out how LeagueofLegends was opened into spectator mode. It obviously can be done, since Spectator mode is accessible from within the client. It seems likely that it can be done from the CLI, since Windows can do it and the clients are supposed to equals. Unless I'm missing something in the difference between how UNIX and Windows handle CLI application launches. My question is if there are any other things I can try to figure out how to launch Spectator mode myself. tl;dr: Trying to get into spectator mode from the CLI. It's possible on Windows (see first code block) but it just restarts League on Mac. What else can I try to find what call is made, and how to reproduce it? PS: Please let me know how I can improve this question or its formatting, I'd love to use StackOverflow/SuperUser, but as the guys said on the podcast this week (Ep. 59) it's very intimidating. Sorry for posting this on StackOverflow the first time :(

Read the article
Postfix flow/hook reference, or high-level overview?

- by threecheeseopera

The Postfix MTA consists of several components/services that work together to perform the different stages of delivery and receipt of mail; these include the smtp daemon, the pickup and cleanup processes, the queue manager, the smtp service, pipe/spawn/virtual/rewrite ... and others (including the possibility of custom components). Postfix also provides several types of hooks that allow it to integrate with external software, such as policy servers, filters, bounce handlers, loggers, and authentication mechanisms; these hooks can be connected to different components/stages of the delivery process, and can communicate via (at least) IPC, network, database, several types of flat files, or a predefined protocol (e.g. milter). An old and very limited example of this is shown at this page. My question: Does anyone have access to a resource that describes these hooks, the components/delivery stages that the hook can interact with, and the supported communication methods? Or, more likely, documentation of the various Postfix components and the hooks/methods that they support? For example: Given the requirement "if the recipient primary MX server matches 'shadysmtpd', check the recipient address against a list; if there is a match, terminate the SMTP connection without notice". My software would need to 1) integrate into the proper part of the SMTP process, 2) use some method to perform the address check (TCP map server? regular expressions? mysql?), and 3) implement the required action (connection termination). Additionally, there will probably be several methods to accomplish this, and another requirement would be to find that which best fits (ex: a network server might be faster than a flat-file lookup; or, if a large volume of mail might be affected by this check, it should be performed as early in the mail process as possible). Real-world example: The apolicy policy server (performs checks on addresses according to user-defined rules) is designed as a standalone TCP server that hooks into Postfix inside the smtpd component via the directive 'check_policy_service inet:127.0.0.1:10001' in the 'smtpd_client_restrictions' configuration option. This means that, when Postfix first receives an item of mail to be delivered, it will create a TCP connection to the policy server address:port for the purpose of determining if the client is allowed to send mail from this server (in addition to whatever other restrictions / restriction lookup methods are defined in that option); the proper action will be taken based on the server's response. Notes: 1)The Postfix architecture page describes some of this information in ascii art; what I am hoping for is distilled, condensed, reference material. 2) Please correct me if I am wrong on any level; there is a mountain of material, and I am just one man ;) Thanks!

Read the article
Generating/managing config files for hosted application

- by mfinni

I asked a question about config management, and haven't seen a reply. It's possible my question was too vague, so let's get down to brass tacks. Here's the process we follow when onboarding a new customer instance into our hosted application : how would you manage this? I'm leaning towards a Perl script to populate templates to generate shell scripts, config files, XML config files, etc. Looking briefly at CFengine and Chef, it seems like they're not going to reduce the amount of work, because I'd still have to manually specify all of the changes/edits within the tool. Doesn't seem to be much of a gain over touching the config files directly. We add a stanza to the main config file for the core (3rd-party) application. This stanza has values that defines the instance (customer) name the TCP listener port for this instance (not one currently used) the DB2 database name (serial numeric identifier, already exists, they get prestaged for us by the DBAs) three sub-config files, by name - they need to be created from 3 templates and be named after the instance The sub-config files define: The filepath for the DB2 volumes The filepath for the storage of objects The filepath for just one of the DB2 volumes (yes, redundant to the first item. We run some application commands, start the instance We do some LDAP thingies (make an OU for the instance, etc.) We add a stanza to the config file for our security listener that acts as a passthrough to LDAP instance name LDAP OU TCP port for instance DB2 database name We restart the security listener (off-hours), change the main config file from item 1, stop and restart the instance. It is now authenticating via LDAP. We add the stop and start commands for this instance to the HA failover scripts. We import an XML config file into the instance that defines things for the actual application for the customer - user names, groups, permissions, and business rules. The XML is supplied by the implementation team. Now, we configure the dataloading application We add a stanza to the existing top-level config file that points to a new customer-level config file. The new customer-level config file includes: the instance (customer) name the DB2 database name arbitrary number of sub-config files, by name Each of the sub-config files defines: filepaths to the directories for ingestion, feedback, backup, and failure those filepaths have a common path to a customer-specific folder, and then one folder for each sub-config file Each of those filepaths needs to be created We need to add this customer instance to our monitoring scripts that confirm the proper processes are running and can be logged into. Of course, those monitoring config files include the instance name, the TCP port, the DB2 database name, etc. There's also a reporting application that needs to be configured for the new instance. You get the idea. There's also XML that is loaded into WAS by the middleware team. We give them the values for them to plug into the XML - they could very easily hand us the template and we could give them back completed XML.

Read the article
Is it safe to enable forced ASLR via EMET on Windows?

- by D.W.

I'd like to enable forced ASLR for all DLLs on Windows. Is this safe? Background: ASLR is an important security mechanism that helps defend against code injection attacks. DLLs can opt into ASLR, and most do, but some DLLs have not opted into ASLR. If a program loads even a single non-ASLRized DLL, then the program doesn't get the benefit/protection of ASLR. This is a problem, because there are a non-trivial number of DLLs that haven't opted into ASLR. For instance, it was recently revealed that Dropbox injects a DLL into a bunch of processes, and the Dropbox DLL doesn't have ASLR turned on, which negates any ASLR protection they otherwise would have had. Unfortunately, there are many other widely used DLLs that haven't opted into ASLR. This is bad for system security. Microsoft provides several ways to turn on ASLR for all DLLs, even ones that haven't opted into ASLR: On Windows 7 and Windows Server 2008, you can enable "Force ASLR" in the registry. On all Windows versions, you can use Microsoft's EMET tool and enable EMET's "Mandatory ASLR" option. These methods are possible because all DLLs are compiled as position-independent code and they can be relocated to a random location even if they haven't opted into ASLR. These options will ensure that ASLR is turned on, even if the developers of the DLL forgot to opt into ASLR. Thus, forcing on ASLR systemwide may help system security. In principle, turning on forced ASLR could potentially break a poorly-written DLL, so there is some risk of breakage. I'm interested in finding out just significant this risk is. I have the suspicion that this kind of breakage might be extremely rare. Here's what I've been able to find: Microsoft has done compatibility testing with several dozen widely used applications. The only one they found where Mandatory ASLR causes problems is Windows Media Player. All the other applications continue working fine. (See pp.39-41 of this document.) I've seen some anecdotal reports that enabling "Mandatory ASLR"/"Force ASLR" is fine and unlikely to cause problems. CERT reports that AMD and ATI video drivers used to crash if you enabled forced ASLR, but their latest drivers have now fixed this problem. They don't show any other drivers with this problem. A forum post from Microsoft shows no other applications with compatibility problems if ASLR is forced on, as of 2011. A user reports that borderlands.exe, a video game by Gearbox Software, crashes if you turn on mandatory ASLR. What else should I know? Is it relatively safe to turn on Force ASLR / Mandatory ASLR systemwide to harden the secuity of my system, or will I be in for a world of pain and broken applications? How significant is the risk of compatibility problems and broken applications?

Read the article
Windows startup Powershell script not closing after Start-Process

- by Matthew Phipps

I've got a Powershell V2.0 startup script for my work computer (XP Professional 64-bit), as follows: start "C:\Program Files (x86)\Microsoft Office\Office12\OUTLOOK.EXE" -ArgumentList "/recycle" sleep -S 2 start "C:\Program Files (x86)\Mozilla Firefox\firefox.exe" -ArgumentList "https://mail.google.com" sleep -S 2 start "C:\Program Files (x86)\Mozilla Firefox\firefox.exe" -ArgumentList "-new-window https://www.google.com/calendar" sleep -S 2 start "C:\Program Files (x86)\Skype\Phone\Skype.exe" The sleeps are to ensure that the windows appear on the taskbar in the correct order. I run this from a shortcut on my Quick Launch with the following Target: C:\WINDOWS\system32\WindowsPowerShell\v1.0\powershell.exe C:\scripts\initialize.ps1 (Yes, this is 2.0: powershell -Version 2.0 works, as does -Version 1.0, but not -Version 3.0) Problem is, the command window stays open until the Firefox windows are closed, which is not what I want. Looking at Process Explorer when I run the script, here's what happens: powershell.exe appears under explorer.exe and the Powershell window appears (with a black background, oddly. But it's not cmd.exe, since when I was debugging the script error messages would appear in red). outlook.exe appears under powershell.exe and the Outlook window appears. firefox.exe appears under powershell.exe and a Firefox window appears. A second firefox.exe appears under powershell.exe and another Firefox window appears. The second Firefox process then exits, as expected, since Firefox only uses one process. skype.exe appears under powershell.exe and the Skype window appears. The powershell.exe process inexplicably sticks around, as does the Powershell window. If I close both Firefox windows, the powershell.exe process exits and the Powershell window closes, and the outlook.exe and skype.exe processes appear under explorer.exe as expected. I suspect this has something to do with Firefox's standard input, output and error: I wouldn't expect Outlook or Skype to ever output anything to the console, but Firefox has command-line options that allow it to do so. I've looked over my about:config's user set values and didn't find anything suspicious. Finally, if I have a firefox.exe instance already running (started from the desktop shortcut) the problem doesn't occur (the powershell.exe process exits as it ought to). So what's going on here? I'm going to try adding -WindowStyle hidden to the shortcut next (gotta close this Firefox to test it), but I want to get to the bottom of this, if only to improve my understanding of how Windows consoles work.

Read the article
How do large blobs affect SQL delete performance, and how can I mitigate the impact?

- by Max Pollack

I'm currently experiencing a strange issue that my understanding of SQL Server doesn't quite mesh with. We use SQL as our file storage for our internal storage service, and our database has about half a million rows in it. Most of the files (86%) are 1mb or under, but even on fresh copies of our database where we simply populate the table with data for the purposes of a test, it appears that rows with large amounts of data stored in a BLOB frequently cause timeouts when our SQL Server is under load. My understanding of how SQL Server deletes rows is that it's a garbage collection process, i.e. the row is marked as a ghost and the row is later deleted by the ghost cleanup process after the changes are copied to the transaction log. This suggests to me that regardless of the size of the data in the blob, row deletion should be close to instantaneous. However when deleting these rows we are definitely experiencing large numbers of timeouts and astoundingly low performance. In our test data set, its files over 30mb that cause this issue. This is an edge case, we don't frequently encounter these, and even though we're looking into SQL filestream as a solution to some of our problems, we're trying to narrow down where these issues are originating from. We ARE performing our deletes inside of a transaction. We're also performing updates to metadata such as file size stats, but these exist in a separate table away from the file data itself. Hierarchy data is stored in the table that contains the file information. Really, in the end it's not so much what we're doing around the deletes that matters, we just can't find any references to low delete performance on rows that contain a large amount of data in a BLOB. We are trying to determine if this is even an avenue worth exploring, or if it has to be one of our processes around the delete that's causing the issue. Are there any situations in which this could occur? Is it common for a database server to come to the point of complete timeouts when many of these deletes are occurring simultaneously? Is there a way to combat this issue if it exists? (cross-posted from StackOverflow )

Read the article
setting up Ubuntu 10.10 as paravirtualized guest in Xen on RHEL5 host - what kernel?

- by kostmo

I've discovered the tool ubuntu-vm-builder, which I've installed and then invoked on an Ubuntu workstation as: sudo vmbuilder xen ubuntu --suite maverick --flavour virtual --arch amd64 --mem=512 --rootsize 8192 This workstation is not the intended target host of the virtual machine, however; I would like to host the guest on a Red Hat Enterprise Linux 5 machine that is running Xen 3.0.3. The output of this command appears to be a folder named ubuntu-xen containing three files: tmpXXXXXX, a very large file which I assume is the root partition image tmpYYYYYY, a somewhat large file which I assume is the swap partition image xen.conf, a text file I have copied the xen.conf file to the RHEL server's /etc/xen directory under the new name newvm, adjusting the paths of tempXXXXXX and tempYYYYYYin the file after also copying them from my local workstation to the RHEL server. When I launch the Virtual Machine Manager virt-manager, I can see the newvm virtual machine listed underneath the Dom0 machine. When I try to start newvm, I get the error: Error starting domain: virDomainCreate() failed POST operation failed: (xend.err 'Error creating domain: Kernel image does not exist: None') Indeed, there exists an entry kernel = 'None' in the xen.conf file. How do I find out what the path of the kernel should be? Is this path supposed to be to a kernel stored on the local filesystem of the RHEL5 host, or is it supposed to be a path inside the guest image? I see that the vmbuilder command provides for a --xen-kernel option, along with a --xen-ramdisk option, but I'm not sure what to use for either. I think I should be able to get this to work, since Ubuntu is said to be supported as a Xen guest, even though the Xen 4.0.1 docs state support for only a limited set of distributions, Ubuntu excluded. Update 1 When running vmbuilder on my local workstation, I did observe an output line saying: Calling hook: install_kernel and later, output lines saying: update-initramfs: Generating /boot/initrd.img-2.6.35-23-virtual [...] run-parts: executing /etc/kernel/postinst.d/initramfs-tools 2.6.35-23-virtual /boot/vmlinuz-2.6.35-23-virtual So in the xen.conf file, I tried setting the lines: kernel = '/boot/vmlinuz-2.6.35-23-virtual' ramdisk = '/boot/initrd.img-2.6.35-23-virtual' When trying to start the VM, I got an error similar to last time: Error starting domain: virDomainCreate() failed POST operation failed: (xend.err 'Error creating domain: Kernel image does not exist: /boot/vmlinuz-2.6.35-23-virtual') This makes me think that the RHEL5 machine is looking for local files, rather than a file within the binary guest disk image. After running sudo updatedb on my workstation, neither of those files were found. If the vmbuilder tool had tried to install them, it must have failed. Update 2 I was able to extract the kernel and initrd images from the guest disk binary by mounting it: mkdir mnt_tmp sudo mount ubuntu-xen/tmpXXXXXX mnt_tmp/ -o loop cp mnt_tmp/boot/vmlinuz-2.6.35-23-virtual virtual_kernel_ubuntu cp mnt_tmp/boot/initrd.img-2.6.35-23-virtual virtual_initrd_ubuntu These two files I copied to the RHEL5 server, and edited the xen.conf file to point to them as kernel and ramdisk. With this done, I could "run" the newvm virtual machine from within virt-manager, but was met with the message Console Not Configured For Guest when I double clicked the entry to open the Virtual Machine Console. As suggested by a forum, I then added the line vfb = [ 'type=vnc' ] to the configuration file, recreated the virtual machine (a ~10 min process), and this time got the message: Connecting to console for guest This remained indefinitely; after selecting View - Serial Console, I found a kernel panic: [5442621.272173] Kernel panic - not syncing: Attempted to kill the idle task! [5442621.272179] Pid: 0, comm: swapper Tainted: G D 2.6.35-23-virtual #41-Ubuntu [5442621.272184] Call Trace: [5442621.272191] [<ffffffff815a1b81>] panic+0x90/0x111 [5442621.272199] [<ffffffff810652ee>] do_exit+0x3be/0x3f0 [5442621.272204] [<ffffffff815a5e20>] oops_end+0xb0/0xf0 [5442621.272211] [<ffffffff8100ddeb>] die+0x5b/0x90 [5442621.272216] [<ffffffff815a56c4>] do_trap+0xc4/0x170 [5442621.272221] [<ffffffff8100ba35>] do_invalid_op+0x95/0xb0 [5442621.272227] [<ffffffff8130851c>] ? intel_idle+0xac/0x180 [5442621.272232] [<ffffffff810072bf>] ? xen_restore_fl_direct_end+0x0/0x1 [5442621.272239] [<ffffffff815a48fe>] ? _raw_spin_unlock_irqrestore+0x1e/0x30 [5442621.272247] [<ffffffff8108dfb7>] ? tick_broadcast_oneshot_control+0xc7/0x120 [5442621.272253] [<ffffffff8100ad5b>] invalid_op+0x1b/0x20 [5442621.272259] [<ffffffff8130851c>] ? intel_idle+0xac/0x180 [5442621.272264] [<ffffffff813084e0>] ? intel_idle+0x70/0x180 [5442621.272269] [<ffffffff810072bf>] ? xen_restore_fl_direct_end+0x0/0x1 [5442621.272275] [<ffffffff8148a147>] cpuidle_idle_call+0xa7/0x140 [5442621.272281] [<ffffffff81008d93>] cpu_idle+0xb3/0x110 [5442621.272286] [<ffffffff815873aa>] rest_init+0x8a/0x90 [5442621.272291] [<ffffffff81b04c9d>] start_kernel+0x387/0x390 [5442621.272297] [<ffffffff81b04341>] x86_64_start_reservations+0x12c/0x130 [5442621.272303] [<ffffffff81b08002>] xen_start_kernel+0x55d/0x561 Update 3 I tried an i386 architecture instead of amd64, but got the same kernel panic. Also, it seems the Virtual Machine Manager pays attention to the format of the filename of the kernel; for the same kernel binary, I tried simply naming it vmlinuz-virtual, which threw out an error box about an invalid kernel. When I named it vmlinuz-2.6.35-23-virtual, it did not throw the error, but it did still result in the kernel panic shortly thereafter.

Read the article
Deployment/provisioning tool for commercial applications (not developed in-house)

- by mfinni

I help manage a few hosted commercial applications, and we have a lot of manual processes involved when doing new customer-instance deployments into the shared (multitenant) environment. Allow me to describe the most relevant features, and then we can talk about the tools. We have an application on AIX, that requires dozens of changes to config files (some plain text, some XML) as well as a good number of commands to be run on multiple servers - some to start the new instance, some to restart our shared authentication and reporting engines, etc. The config changes follow templates, of course. The servers in question will also depend on the initial conditions specified by the implementer/deployer - we may choose to deploy a given customer to our servers in Europe, or one set of servers may be active-active whereas a different set of servers is active-passive - in short, there's a lot of complications. We have another application that run on IIS 6 and SQL. The DBAs don't want any automation of the SQL components and that's fine with me, but automating the IIS bit would be great. For a new customer instance, we make a filesystem copy of a template Virtual Directory target named after the new customer, make a new AppPool to match, edit a VirDir template .xml file to replace the filepaths and AppPool names with the new ones, and then make a new VirDir from the modified template XML to point to the new filesystem folder and app pool. For the first case, something like ControlTier or Chef might be good. For the second, the new(ish) Web Deploy from MS would probably do a good job. Has anyone used these tools or others to do something similar for applications? More of a nice-to-have, not a fixed requirement - Has anyone used anything that works on both platforms? I'm looking for something free, because the official word is that within a year, we will have whatever HP has renamed the OpsWare suite, which should be able to do stuff like this. Edit - based on someone's suggestion, looking at CFengine for the AIX application, it doesn't seem to address my pain. The problem isn't keeping a given config synced across dozens of servers, we have rsync for that. The problem is that onboarding a new customer instance touches dozens of files, putting pieces of the same or similar information into them - some are new stanzas in existing files, some are new files, and some are new directories. This is a several-hours-long process that is also error-prone because it's mostly done by hand. I guess I'm looking for config-file generation and management. I have built a small Perl script to do something similar for a much smaller case - it binds a CSV file into variables, and then does a copy-and-search-and-replace from a set of template config files. I could probably do the same here.

Read the article
What Are All the Variables Necessary to Create Blackbox Logs for Nginx?

- by Alan Gutierrez

There's an article out there, Profiling LAMP Applications with Apache's Blackbox Logs, that describes how to create a log that records a lot of detailed information missing in the common and combined log formats. This information is supposed to help you resolve performance issues. As the author notes "While the common log-file format (and the combined format) are great for hit tracking, they aren't suitable for getting hardcore performance data." The article describes a "blackbox" log format, like a blackbox flight recorder on an aircraft, that gathers information used to profile server performance, missing from the hit tracking log formats: Keep alive status, remote port, child processes, bytes sent, etc. LogFormat "%a/%S %X %t \"%r\" %s/%>s %{pid}P/%{tid}P %T/%D %I/%O/%B" blackbox I'm trying to recreate as much of the format for Nginx, and would like help filling in the blanks. Here's what Nginx blackbox format would look like, the unmapped Apache directives have question marks after their names. access_log blackbox '$remote_addr/$remote_port X? [$time_local] "$request"' 's?/$status $pid/0 T?/D? I?/O?/B?' Here's a table of the variables I've been able to map from the Nginx documentation. %a = $remote_addr - The IP address of the remote client. %S = $remote_port - The port of the remote client. %X = ? - Keep alive status. %t = $time_local - The start time of the request. %r = $request - The first line of request containing method verb, path and protocol. %s = ? - Status before any redirections. %>s = $status - Status after any redirections. %{pid}P = $pid - The process id. %{tid}P = N/A - The thread id, which is non-applicable to Nignx. %T = ? - The time in seconds to handle the request. %D = ? - The time in milliseconds to handle the request. %I = ? - The count of bytes received including headers. %O = ? - The count of bytes sent including headers. %B = ? - The count of bytes sent excluding headers, but with a 0 for none instead of '-'. Looking for help filling in the missing variables, or confirmation that the missing variables are in fact, unavailable in Nginx.

Read the article
How Can We Create Blackbox Logs for Nginx?

- by Alan Gutierrez

There's an article out there, Profiling LAMP Applications with Apache's Blackbox Logs, that describes how to create a log that records a lot of detailed information missing in the common and combined log formats. This information is supposed to help you resolve performance issues. As the author notes "While the common log-file format (and the combined format) are great for hit tracking, they aren't suitable for getting hardcore performance data." The article describes a "blackbox" log format, like a blackbox flight recorder on an aircraft, that gathers information used to profile server performance, missing from the hit tracking log formats: Keep alive status, remote port, child processes, bytes sent, etc. LogFormat "%a/%S %X %t \"%r\" %s/%>s %{pid}P/%{tid}P %T/%D %I/%O/%B" blackbox I'm trying to recreate as much of the format for Nginx, and would like help filling in the blanks. Here's what Nginx blackbox format would look like, the unmapped Apache directives have question marks after their names. access_log blackbox '$remote_addr/$remote_port X? [$time_local] "$request"' 's?/$status $pid/0 T?/D? I?/$bytes_sent/$body_bytes_sent' Here's a table of the variables I've been able to map from the Nginx documentation. %a = $remote_addr - The IP address of the remote client. %S = $remote_port - The port of the remote client. %X = ? - Keep alive status. %t = $time_local - The start time of the request. %r = $request - The first line of request containing method verb, path and protocol. %s = ? - Status before any redirections. %>s = $status - Status after any redirections. %{pid}P = $pid - The process id. %{tid}P = N/A - The thread id, which is non-applicable to Nignx. %T = ? - The time in seconds to handle the request. %D = $request_time - The time in milliseconds to handle the request. %I = ? - The count of bytes received including headers. %O = $bytes_sent - The count of bytes sent including headers. %B = $body_bytes_sent - The count of bytes sent excluding headers, but with a 0 for none instead of '-'. Looking for help filling in the missing variables, or confirmation that the missing variables are in fact, unavailable in Nginx.

Read the article
mongodb : Can create new thread on FreeBSD?

- by user197739

We experienced some strange thing in our mongodb gridfs platform. The platform actually is a bi Xeon E5 (bi quad core) with 128GB of memory, running on freebsd 9 with a zfs pool dedicated for mongodb. [root@mongofile1 ~]# uname -sr FreeBSD 9.1-RELEASE our /boot/loader.conf vfs.zfs.arc_min="2048M" vfs.zfs.arc_max="7680M" vm.kmem_size_max="16G" vm.kmem_size="12G" vfs.zfs.prefetch_disable="1" kern.ipc.nmbclusters="32768" /etc/sysctl.conf net.inet.tcp.msl=15000 net.inet.tcp.keepidle=300000 kern.ipc.nmbclusters=32768 kern.ipc.maxsockbuf=2097152 kern.ipc.somaxconn=8192 kern.maxfiles=65536 kern.maxfilesperproc=32768 net.inet.tcp.delayed_ack=0 net.inet.tcp.sendspace=65535 net.inet.udp.recvspace=65535 net.inet.udp.maxdgram=57344 net.local.stream.recvspace=65535 net.local.stream.sendspace=65535 we follow the recommendation for the ulimit : [root@mongofile1 ~]# su - mongodb $ ulimit -a cpu time (seconds, -t) unlimited file size (512-blocks, -f) unlimited data seg size (kbytes, -d) 33554432 stack size (kbytes, -s) 524288 core file size (512-blocks, -c) unlimited max memory size (kbytes, -m) unlimited locked memory (kbytes, -l) unlimited max user processes (-u) 5547 open files (-n) 32768 virtual mem size (kbytes, -v) unlimited swap limit (kbytes, -w) unlimited sbsize (bytes, -b) unlimited pseudo-terminals (-p) unlimited This server have a twin (same config exactly) for ReplSet in other data center and we have a virtualized arbiter. Some time, almost 3 days, the process of mongodb exit. The problem begin with: Fri Nov 8 11:27:31.741 [conn774697] end connection 192.168.10.162:47963 (23 connections now open) Fri Nov 8 11:27:31.770 [initandlisten] can't create new thread, closing connection Fri Nov 8 11:27:31.771 [rsHealthPoll] replSet member mongofile2:27017 is now in state DOWN Fri Nov 8 11:27:31.774 [initandlisten] connection accepted from 192.168.10.162:47968 #774702 (20 connections now open) Fri Nov 8 11:27:31.774 [initandlisten] connection accepted from 192.168.10.161:28522 #774703 (21 connections now open) Fri Nov 8 11:27:31.774 [initandlisten] connection accepted from 192.168.10.164:15406 #774704 (22 connections now open) Fri Nov 8 11:27:31.774 [initandlisten] connection accepted from 192.168.10.163:25750 #774705 (23 connections now open) Fri Nov 8 11:27:31.810 [initandlisten] connection accepted from 192.168.10.182:20779 #774706 (24 connections now open) Fri Nov 8 11:27:31.855 [initandlisten] connection accepted from 192.168.10.161:28524 #774707 (25 connections now open) Fri Nov 8 11:27:31.869 [initandlisten] connection accepted from 192.168.10.182:20786 #774708 (26 connections now open) and after many "can create new thread" [root@mongofile1 /usr/mongodb]# tail -n 15000 mongod.log.old |grep "create new thread"|wc 5020 55220 421680 and finish by a magnificent Fri Nov 8 11:30:22.333 [rsMgr] replSet warning caught unexpected exception in electSelf() pure virtual method called Fri Nov 8 11:30:22.333 Got signal: 6 (Abort trap: 6). Fri Nov 8 11:30:22.337 Backtrace: 0x599efc 0x8035cb516 0x599efc <_ZN5mongo10abruptQuitEi+988> at /usr/local/bin/mongod 0x8035cb516 <_pthread_sigmask+918> at /lib/libthr.so.3 Extract of mongodb from top 78126 mongodb 77 20 0 1253G 1449M sbwait 0 0:20 0.00% mongod If I restart the process when it crash, the problem is fixed for almost 3 days. Has anyone seen this before, or know of a fix?

Read the article

Search Results

Search found 5679 results on 228 pages for 'kill processes'.

Page 207/228 | < Previous Page | 203 204 205 206 207 208 209 210 211 212 213 214 | Next Page >

- by ManiacZX

- by quanta

- by Paul Eisner

- by milkmansrevenge

- by Andrei Bârsan

- by Martin F

- by Robbie

- by Edward De Leau

- by sheldon

- by DeaconDesperado

- by ApacheQ

- by Kyle Brandt

- by dunxd

- by viraptor

- by Alex Popov

- by threecheeseopera

- by mfinni

- by D.W.

- by Matthew Phipps

- by Max Pollack

- by kostmo

- by mfinni

- by Alan Gutierrez

- by Alan Gutierrez

- by user197739

< Previous Page | 203 204 205 206 207 208 209 210 211 212 213 214 | Next Page >