Search Results

Search found 99 results on 4 pages for 'ulimit'.

Page 4/4 | < Previous Page | 1 2 3 4 

  • Replicated MongoDB server slower than simple shards

    - by displayName
    I tried to compare the performance of a sharded configuration against a sharded and replicated configuration. The sharded configuration consists of 8 shards each running on three different machines thereby constituting a total of 24 shards. All 8 of these shards run in the same partition on each machine. The sharded and replicated version is 8 shards again just like plain sharding, and all 8 mongods run on the same partition in each machine. But apart from this, each of these three machine now run additional 16 threads on another partition which serve as the secondary for the 8 mongods running on other machines. This is the way I prepared a sharded and replicated configuration with data chunks having replication factor of 3. Important point to note is that once the data has been loaded, it is not modified. So after primary and secondaries have synchronized then it doesn't matter which one i read from. To run the queries, I use an entirely different machine (let's call it config) which runs mongos and this machine's only purpose is to receive queries and run them on the cluster. Contrary to my expectations, plain sharding of 8 threads on each machine (total = 3 * 8 = 24) is performing better for queries than the sharded + replicated configuration. I have a script written to perform the query. So in order to time the scripts, I use time ./testScript and see the result. I tried changing the reading preference for replicated cluster by logging to mongo of config and run db.getMongo().setReadPref('secondary') and then exit the shell and run the queries like time ./testScript. The questions are: Where am i going wrong in the replication? Why is it slower than its plain sharding version? Does the db.getMongo().ReadPref('secondary') persist when i leave the shell and try to perform the query? All the four machines are running Linux and i have already increased the ulimit -n to 2048 from initial value of 1024 to allow more connections. The collections are properly distributed and all the mongods have equal number of chunks. Goes without saying that indices in both configurations are the same.

    Read the article

  • Maximum limit of filepointer in php reached and not changeable

    - by mlaug
    I have a server with the current 5.3.x version installed. Since we are running a really simple and small server in php using sockets, that connects to a lot clients using sockets we need to raise the open file limit that has been already done on the server for the user, that runs the server #ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 29879 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 8192 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 29879 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited and we compiled php with --enable-fd-setsize=8192 still we are getting [19-Nov-2012 09:24:23 Europe/Berlin] PHP Warning: socket_select(): You MUST recompile PHP with a larger value of FD_SETSIZE. It is set to 1024, but you have descriptors numbered at least as high as 1024. --enable-fd-setsize=2048 is recommended, but you may want to set it to equal the maximum number of open files supported by your system, in order to avoid seeing this error again at a later date. once in a while in our logs. Anyone knows who to configure the unix server and php correctly to have that working? I found a bug, but that is related to 2006 and marked as "not a bug" https://bugs.php.net/bug.php?id=37025&edit=1

    Read the article

  • Server Recovery from Denial of Service

    - by JMC
    I'm looking at a server that might be misconfigured to handle Denial of Service. The database was knocked offline after the attack, and was unable to restart itself after it failed to restart when the attack subsided. Details of the Attack: The Attacker either intentionally or unintentionally sent 1000's of search queries using the applications search query url within a couple of seconds. It looks like the server was overwhelmed and it caused the database to log this message: Server Specs: 1.5GB of dedicated memory Are there any obvious mis-configurations here that I'm missing? **mysql.log** 121118 20:28:54 mysqld_safe Number of processes running now: 0 121118 20:28:54 mysqld_safe mysqld restarted 121118 20:28:55 [Warning] option 'slow_query_log': boolean value '/var/log/mysqld.slow.log' wasn't recognized. Set to OFF. 121118 20:28:55 [Note] Plugin 'FEDERATED' is disabled. 121118 20:28:55 InnoDB: The InnoDB memory heap is disabled 121118 20:28:55 InnoDB: Mutexes and rw_locks use GCC atomic builtins 121118 20:28:55 InnoDB: Compressed tables use zlib 1.2.3 121118 20:28:55 InnoDB: Using Linux native AIO 121118 20:28:55 InnoDB: Initializing buffer pool, size = 512.0M InnoDB: mmap(549453824 bytes) failed; errno 12 121118 20:28:55 InnoDB: Completed initialization of buffer pool 121118 20:28:55 InnoDB: Fatal error: cannot allocate memory for the buffer pool 121118 20:28:55 [ERROR] Plugin 'InnoDB' init function returned error. 121118 20:28:55 [ERROR] Plugin 'InnoDB' registration as a STORAGE ENGINE failed. 121118 20:28:55 [ERROR] Unknown/unsupported storage engine: InnoDB 121118 20:28:55 [ERROR] Aborting **ulimit -a** core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 13089 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1024 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited **httpd.conf** StartServers 10 MinSpareServers 8 MaxSpareServers 12 ServerLimit 256 MaxClients 256 MaxRequestsPerChild 4000 **my.cnf** innodb_buffer_pool_size=512M # Increase Innodb Thread Concurrency = 2 * [numberofCPUs] + 2 innodb_thread_concurrency=4 # Set Table Cache table_cache=512 # Set Query Cache_Size query_cache_size=64M query_cache_limit=2M # A sort buffer is used for optimizing sorting sort_buffer_size=8M # Log slow queries slow_query_log=/var/log/mysqld.slow.log long_query_time=2 #performance_tweak join_buffer_size=2M **php.ini** memory_limit = 128M post_max_size = 8M

    Read the article

  • Can Haproxy deny a request by IP if its stick-table is full?

    - by bantic
    In my haproxy configs I'm setting a stick-table of size 5 that stores every incoming IP address (for 1 minute), and it is set as nopurge so new entries won't get stored in the table. What I'd like to have happen is that they would get denied, but that isn't happening. The stick-table line is: stick-table type ip size 5 expire 1m nopurge store gpc0 And the whole configs are: global maxconn 30000 ulimit-n 65536 log 127.0.0.1 local0 log 127.0.0.1 local1 debug stats socket /var/run/haproxy.stat mode 600 level operator defaults mode http timeout connect 5000ms timeout client 50000ms timeout server 50000ms backend fragile_backend tcp-request content track-sc2 src stick-table type ip size 5 expire 1m nopurge store gpc0 server fragile_backend1 A.B.C.D:80 frontend http_proxy bind *:80 mode http option forwardfor default_backend fragile_backend I have confirmed (connecting to haproxy's stats using socat readline /var/run/haproxy.stat) that the stick-table fills up with 5 IP addresses, but then every request after that from a new IP just goes straight through -- it isn't added to the stick-table, nothing is removed from the stick-table, and the request is not denied. What I'd like to do is deny the request if the stick-table is full. Is this possible? I'm using haproxy 1.5.

    Read the article

  • Why is MySQL unable to open hosts.allow/hosts.deny?

    - by HonoredMule
    I have a storage server running Nexenta (OpenSolaris kernel, Ubuntu userspace) with MySQL on top of a ZFS storage array, using innodb_file_per_table and ulimit -n set to 8K. mysqltuner.pl confirms the file limit and claims there are 169 files. The following command: pfiles `fuser -c / 2>/dev/null indicates one mysqld process having 485 file/device descriptors (and they're almost all for files) so I don't know how reliable the tuning script is, but it is still way less than 8K and this list also finds no other process which is close to it's limit. The global total number of descriptors in use is around 1K. So what can cause mysqld to be constantly streaming the following errors? [date] [host] mysqld[pid]: warning: cannot open /etc/hosts.allow: Too many open files [date] [host] mysqld[pid]: warning: cannot open /etc/hosts.deny: Too many open files Everything appears to actually be operating fine, but the issue is constantly flooding the admin console and starts right away on a fresh boot (not only reproducible, but always from mysqld and always the hosts files, whose permissions are the default -rw-r--r-- 1 root root). I could, of course, suppress it from the admin console but I'd rather get to the bottom of it and still allow mysqld warnings/errors to reach the admin console. EDIT: not only is the actual file descriptor well within sane limits, the issue also persists (with immediate appearance) even with the file limit raised to 65535 and always only on hosts.allow/deny.

    Read the article

  • Does Mac OS X throttle the RATE of socket creation?

    - by pbhogan
    This may seem programming related, but this is an OS question. I'm writing a small high performance daemon that takes thousands of connections per second. It's working fine on Linux (specifically Ubuntu 9.10 on EC2). On Mac OS X if I throw a few thousand connections at it (roughly about 16350) in a benchmark that simply opens a connection, does it's thing and closes the connection, then the benchmark program hangs for several seconds waiting for a socket to become available before continuing (or timing out in the process). I used both Apache Bench as well as Siege (to make sure it wasn't the benchmark application). So why/how is Mac OS X limiting the RATE at which sockets can be used, and can I stop it from doing this? Or is there something else going on? I know there is a file descriptor limit, but I'm not hitting that. There is no error on accepting a socket, it's simply hangs for a while after the first (roughly) 16000, waiting -- I assume -- for the OS to release a socket. This shouldn't happen since all prior the sockets are closed at that point. They're supposed to come available at the rate they're closed, and do on Ubuntu, but there seems to be some kind of multi (5-10?) second delay on Mac OS X. I tried tweaking with ulimit every-which-way. Nada.

    Read the article

  • Does Mac OS X throttle the RATE of socket creation?

    - by pbhogan
    This may seem programming related, but this is an OS question. I'm writing a small high performance daemon that takes thousands of connections per second. It's working fine on Linux (specifically Ubuntu 9.10 on EC2). On Mac OS X if I throw a few thousand connections at it (roughly about 16350) in a benchmark that simply opens a connection, does it's thing and closes the connection, then the benchmark program hangs for several seconds waiting for a socket to become available before continuing (or timing out in the process). I used both Apache Bench as well as Siege (to make sure it wasn't the benchmark application). So why/how is Mac OS X limiting the RATE at which sockets can be used, and can I stop it from doing this? Or is there something else going on? I know there is a file descriptor limit, but I'm not hitting that. There is no error on accepting a socket, it's simply hangs for a while after the first (roughly) 16000, waiting -- I assume -- for the OS to release a socket. This shouldn't happen since all prior the sockets are closed at that point. They're supposed to come available at the rate they're closed, and do on Ubuntu, but there seems to be some kind of multi (5-10?) second delay on Mac OS X. I tried tweaking with ulimit every-which-way. Nada.

    Read the article

  • Should I upgrade to "Ubuntu 14.04 'Trusty Tahr'" from "Ubuntu 12.04 LTS" and what care do I need to take if I upgrade?

    - by PHPLover
    I'm basically a Web Developer(PHP Developer) by profession. I mainly work on PHP, jQuery, AJAX, Smarty, HTML and CSS, Bootstrap front-end web development framework. I've also installed and using IDEs/editors like Sublime Text, NetBeans. I'm also using Git repository for my website development as a versioning tool. I'm using "Ubuntu 12.04 LTS" on my machine almost since last two years. My machine configuraion is as follows: Memory : 3.7 GiB Processor : Intel® Core™ i3 CPU M 370 @ 2.40GHz × 4 Graphics : Unknown OS type : 64-bit Disk : 64-bit The important softwares present on my machine and which I'm using daily for my work are as follows: PHP : PHP 5.3.10-1ubuntu3.13 with Suhosin-Patch (cli) (built: Jul 7 2014 18:54:55) Copyright (c) 1997-2012 The PHP Group Zend Engine v2.3.0, Copyright (c) 1998-2012 Zend Technologies Apache web server : /usr/sbin/apachectl: 87: ulimit: error setting limit (Operation not permitted) Server version: Apache/2.2.22 (Ubuntu) Server built: Jul 22 2014 14:35:25 Server's Module Magic Number: 20051115:30 Server loaded: APR 1.4.6, APR-Util 1.3.12 Compiled using: APR 1.4.6, APR-Util 1.3.12 Architecture: 64-bit Server MPM: Prefork threaded: no forked: yes (variable process count) Server compiled with.... -D APACHE_MPM_DIR="server/mpm/prefork" -D APR_HAS_SENDFILE -D APR_HAS_MMAP -D APR_HAVE_IPV6 (IPv4-mapped addresses enabled) -D APR_USE_SYSVSEM_SERIALIZE -D APR_USE_PTHREAD_SERIALIZE -D SINGLE_LISTEN_UNSERIALIZED_ACCEPT -D APR_HAS_OTHER_CHILD -D AP_HAVE_RELIABLE_PIPED_LOGS -D DYNAMIC_MODULE_LIMIT=128 -D HTTPD_ROOT="/etc/apache2" -D SUEXEC_BIN="/usr/lib/apache2/suexec" -D DEFAULT_PIDLOG="/var/run/apache2.pid" -D DEFAULT_SCOREBOARD="logs/apache_runtime_status" -D DEFAULT_LOCKFILE="/var/run/apache2/accept.lock" -D DEFAULT_ERRORLOG="logs/error_log" -D AP_TYPES_CONFIG_FILE="mime.types" -D SERVER_CONFIG_FILE="apache2.conf" MySQL : 5.5.38-0ubuntu0.12.04.1 Smarty : 2.6.18 **NetBeans :** NetBeans IDE 8.0 (Build 201403101706) Sublime Text 2 : Version 2.0.2, Build 2221 Yesterday suddenly a pop-up message appeared on my screen asking me to upgrade to "Ubuntu 14.04 'Trusty Tahr'". I'd also be very happy to upgrade my system to "Ubuntu 14.04 'Trusty Tahr'". Following are the issues about which I'm little bit scared about and I need you all talented people's expert advice/help/suggestions on it: Will upgrading to "Ubuntu 14.04 'Trusty Tahr'" affect the softwares I mentioned above? I mean will I need to re-install/un-install and install these softwares too? Do I really need to and is it really a worth to upgrade to "Ubuntu 14.04 'Trusty Tahr'" from "Ubuntu 12.04 LTS" now? If I upgrade to "Ubuntu 14.04 'Trusty Tahr'" what advantage I'll get from web developer's point of view? Will the upgrade be hassle free and will I be ablr to continue my on-going work without any difficulties? Is "Ubuntu 14.04 'Trusty Tahr'" a LTS version and if yes till when it's going to provide support? These are the five crucial queries I have. If you want any further explanation from me please feel free to ask me. Thanks for spending some of your vaulable time in reading and understanding my issue. Any kind of help/comment/suggestion/answer would be highly appreciated. Though if someone gives canonical, precise and up to the mark answer, it will be of great help to me as well as other web developers using Ubuntu around the world. Once again thank you so much you great people around the globe. Waiting for your precious replies.

    Read the article

  • Segmentation fault on MPI, runs properly on OpenMP

    - by Bellman
    Hi, I am trying to run a program on a computer cluster. The structure of the program is the following: PROGRAM something ... CALL subroutine1(...) ... END PROGRAM SUBROUTINE subroutine1(...) ... DO i=1,n CALL subroutine2(...) ENDDO ... END SUBROUTINE SUBROUTINE subroutine2(...) ... CALL subroutine3(...) CALL subroutine4(...) ... END SUBROUTINE The idea is to parallelize the loop that calls subroutine2. Main program basically only makes the call to subroutine1 and only its arguments are declared. I use two alternatives. On the one hand, I write OpenMP clauses arround the loop. On the other hand, I add an IF conditional branch arround the call and I use MPI to share the results. In the OpenMP case, I add CALL KMP_SET_STACKSIZE(402653184) at the beginning of the main program and I can run it with 8 threads on an 8 core machine. When I run it (on the same 8 core machine) with MPI (either using 8 or 1 processors) it crashes just when makes the call to subroutine3 with a segmentation fault (signal 11) error. If I comment subroutine4, then it doesn't crash (notice that it crashed just when calling subroutine3 and it works when commenting subroutine4). I compile with mpif90 using MPICH2 libraries and the following flags: -O3 -fpscomp logicals -openmp -threads -m64 -xS. The machine has EM64T architecture and I use a Debian Linux distribution. I set ulimit -s hard before running the program. Any ideas on what is going on? Has it something to do with stack size? Thanks in advance

    Read the article

  • MD5 wrong number of arguments (1 for 0) Error

    - by Salil
    Hi All, I have following error on my Server which is working properly on my local on following line . event_id = MD5.new("event-init-flash-#{Asteroid::now}").to_s #line 232 ERROR: wrong number of arguments (1 for 0) /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star/server.rb:232:in initialize' /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star/server.rb:232:in new' /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star/server.rb:232:in make_flash_connection' /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star/server.rb:70:in receive_data' /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star.rb:87:in run' /ruby/gems/gems/shooting_star-3.2.7/bin/../lib/shooting_star.rb:87:in start' /ruby/gems/gems/shooting_star-3.2.7/bin/shooting_star:61 /ruby/gems/bin/shooting_star:19:in `load' /ruby/gems/bin/shooting_star:19 POST /10 HTTP/1.1 Host: 67.222.55.30:8080 Content-length: 103 I used shooting_star to create an Chat Application. Ref:- http://github.com/genki/shooting-star Following are the REQUIREMENTS of the shooting_star Linux or xBSD OS having epoll or kqueue. Increase ulimit of nofile up to over 100,000. (edit /etc/security/limits.conf file.) prototype.js 1.5.0+ Ruby 1.8.5+ Ruby on Rails 1.2.0+ My Local Configuration are O.S Linux Ruby ruby 1.8.6 (2009-08-04 patchlevel 383) [i386-linux] Rails 2.3.4 shooting_star 3.2.7 prototype.js 1.6.0.3 My Server Configuration are O.S Linux Ruby ruby 1.8.6 (2009-08-04 patchlevel 383) [x86_64-linux] Rails 2.3.4 shooting_star 3.2.7 prototype.js 1.6.0.3 I just want to know what is the problem why it's not working on server if everything is fine in local. Regards, Salil Gaikwad

    Read the article

  • NPTL Default Stack Size Problem

    - by eyazici
    Hello, I am developing a multithread modular application using C programming language and NPTL 2.6. For each plugin, a POSIX thread is created. The problem is each thread has its own stack area, since default stack size depends on user's choice, this may results in huge memory consumption in some cases. To prevent unnecessary memory usage I used something similar to this to change stack size before creating each thread: pthread_attr_t attr; pthread_attr_init (&attr); pthread_attr_getstacksize(&attr, &st1); if(pthread_attr_setstacksize (&attr, MODULE_THREAD_SIZE) != 0) perror("Stack ERR"); pthread_attr_getstacksize(&attr, &st2); printf("OLD:%d, NEW:%d - MIN: %d\n", st1, st2, PTHREAD_STACK_MIN); pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED); /* "this" is static data structure that stores plugin related data */ pthread_create(&this->runner, &attr, (void *)(void *)this->run, NULL); EDIT I: pthread_create() section added. This did not work work as I expected, the stack size reported by pthread_attr_getstacksize() is changed but total memory usage of the application (from ps/top/pmap output) did not changed: OLD:10485760, NEW:65536 - MIN: 16384 When I use ulimit -s MY_STACK_SIZE_LIMIT before starting application I achieve the expected result. My questions are: 1-) Is there any portable(between UNIX variants) way to change (default)thread stack size after starting application(before creating thread of course)? 2-) Is it possible to use same stack area for every thread? 3-) Is it possible completely disable stack for threads without much pain?

    Read the article

  • Too many open files in one of my java routine.

    - by Irfan Zulfiqar
    I have a multithreaded code that has to generated a set of objects and write them to a file. When I run it I sometime get "Too many open files" message in Exception. I have checked the code to make sure that all the file streams are being closed properly. Here is the stack trace. When I do ulimit -a, open files allowed is set to 1024. We think increasing this number is not a viable option / solution. [java] java.io.FileNotFoundException: /export/event_1_0.dtd (Too many open files) [java] at java.io.FileInputStream.open(Native Method) [java] at java.io.FileInputStream.<init>(FileInputStream.java:106) [java] at java.io.FileInputStream.<init>(FileInputStream.java:66) [java] at sun.net.www.protocol.file.FileURLConnection.connect(FileURLConnection.java:70) [java] at sun.net.www.protocol.file.FileURLConnection.getInputStream(FileURLConnection.java:161) [java] at java.net.URL.openStream(URL.java:1010) Now what we have identified so far by looking closely at the list of open files is that the VM is opening same class file multiple times. /export/BaseEvent.class 236 /export/EventType1BaseEvent.class 60 /export/EventType2BaseEvent.class 48 /export/EventType2.class 30 /export/EventType1.class 14 Where BaseEvent is partent of all the classes and EventType1 ant EventType2 inherits EventType1BaseEvent and EventType2BaseEvent respectively. Why would a class loader load the same class file 200+ times. It seems it is opening up the base class as many time it create any child instance. Is this normal? Can it be handler any other way apart from increasing the number of open files?

    Read the article

  • File descriptor limits and default stack sizes

    - by Charles
    Where I work we build and distribute a library and a couple complex programs built on that library. All code is written in C and is available on most 'standard' systems like Windows, Linux, Aix, Solaris, Darwin. I started in the QA department and while running tests recently I have been reminded several times that I need to remember to set the file descriptor limits and default stack sizes higher or bad things will happen. This is particularly the case with Solaris and now Darwin. Now this is very strange to me because I am a believer in 0 required environment fiddling to make a product work. So I am wondering if there are times where this sort of requirement is a necessary evil, or if we are doing something wrong. Edit: Great comments that describe the problem and a little background. However I do not believe I worded the question well enough. Currently, we require customers, and hence, us the testers, to set these limits before running our code. We do not do this programatically. And this is not a situation where they MIGHT run out, under normal load our programs WILL run out and seg fault. So rewording the question, is requiring the customer to change these ulimit values to run our software to be expected on some platforms, ie, Solaris, Aix, or are we as a company making it to difficult for these users to get going? Bounty: I added a bounty to hopefully get a little more information on what other companies are doing to manage these limits. Can you set these pragmatically? Should we? Should our programs even be hitting these limits or could this be a sign that things might be a bit messy under the covers? That is really what I want to know, as a perfectionist a seemingly dirty program really bugs me.

    Read the article

  • gunicorn + django + nginx unix://socket failed (11: Resource temporarily unavailable)

    - by user1068118
    Running very high volume traffic on these servers configured with django, gunicorn, supervisor and nginx. But a lot of times I tend to see 502 errors. So I checked the nginx logs to see what error and this is what is recorded: [error] 2388#0: *208027 connect() to unix:/tmp/gunicorn-ourapp.socket failed (11: Resource temporarily unavailable) while connecting to upstream Can anyone help debug what might be causing this to happen? This is our nginx configuration: sendfile on; tcp_nopush on; tcp_nodelay off; listen 80 default_server; server_name imp.ourapp.com; access_log /mnt/ebs/nginx-log/ourapp-access.log; error_log /mnt/ebs/nginx-log/ourapp-error.log; charset utf-8; keepalive_timeout 60; client_max_body_size 8m; gzip_types text/plain text/xml text/css application/javascript application/x-javascript application/json; location / { proxy_pass http://unix:/tmp/gunicorn-ourapp.socket; proxy_pass_request_headers on; proxy_read_timeout 600s; proxy_connect_timeout 600s; proxy_redirect http://localhost/ http://imp.ourapp.com/; #proxy_set_header Host $host; #proxy_set_header X-Real-IP $remote_addr; #proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; #proxy_set_header X-Forwarded-Proto $my_scheme; #proxy_set_header X-Forwarded-Ssl $my_ssl; } We have configure Django to run in Gunicorn as a generic WSGI application. Supervisord is used to launch the gunicorn workers: home/user/virtenv/bin/python2.7 /home/user/virtenv/bin/gunicorn --config /home/user/shared/etc/gunicorn.conf.py daggr.wsgi:application This is what the gunicorn.conf.py looks like: import multiprocessing bind = 'unix:/tmp/gunicorn-ourapp.socket' workers = multiprocessing.cpu_count() * 3 + 1 timeout = 600 graceful_timeout = 40 Does anyone know where I can start digging to see what might be causing the problem? This is what my ulimit -a output looks like on the server: core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 59481 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 50000 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1024 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited

    Read the article

  • Django + gunicorn + virtualenv + Supervisord issue

    - by Florian Le Goff
    Dear all, I have a strange issue with my virtualenv + gunicorn setup, only when gunicorn is launched via supervisord. I do realize that it may very well be an issue with my supervisord and I would appreciate any feedback on a better place to ask for help... In a nutshell : when I run gunicorn from my user shell, inside my virtualenv, everything is working flawlessly. I'm able to access all the views of my Django project. When gunicorn is launched by supervisord at the system startup, everything is OK. But, if I have to kill the gunicorn_django processes, or if I perform a supervisord restart, once that gunicorn_django has relaunched, every request is answered with a weird Traceback : (...) File "/home/hc/prod/venv/lib/python2.6/site-packages/Django-1.2.5-py2.6.egg/django/db/__init__.py", line 77, in connection = connections[DEFAULT_DB_ALIAS] File "/home/hc/prod/venv/lib/python2.6/site-packages/Django-1.2.5-py2.6.egg/django/db/utils.py", line 92, in __getitem__ backend = load_backend(db['ENGINE']) File "/home/hc/prod/venv/lib/python2.6/site-packages/Django-1.2.5-py2.6.egg/django/db/utils.py", line 50, in load_backend raise ImproperlyConfigured(error_msg) TemplateSyntaxError: Caught ImproperlyConfigured while rendering: 'django.db.backends.postgresql_psycopg2' isn't an available database backend. Try using django.db.backends.XXX, where XXX is one of: 'dummy', 'mysql', 'oracle', 'postgresql', 'postgresql_psycopg2', 'sqlite3' Error was: cannot import name utils Full stack available here : http://pastebin.com/BJ5tNQ2N I'm running... Ubuntu/maverick (up-to-date) Python = 2.6.6 virtualenv = 1.5.1 gunicorn = 0.12.0 Django = 1.2.5 psycopg2 = '2.4-beta2 (dt dec pq3 ext)' gunicorn configuration : backlog = 2048 bind = "127.0.0.1:8000" pidfile = "/tmp/gunicorn-hc.pid" daemon = True debug = True workers = 3 logfile = "/home/hc/prod/log/gunicorn.log" loglevel = "info" supervisord configuration : [program:gunicorn] directory=/home/hc/prod/hc command=/home/hc/prod/venv/bin/gunicorn_django -c /home/hc/prod/hc/gunicorn.conf.py user=hc umask=022 autostart=True autorestart=True redirect_stderr=True Any advice ? I've been stuck on this one for quite a while. It seems like some weird memory limit, as I'm not enforcing anything special : $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 20 file size (blocks, -f) unlimited pending signals (-i) 16382 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) unlimited virtual memory (kbytes, -v) unlimited file locks (-x) unlimited Thank you.

    Read the article

  • Nodemanager Init.d Script

    - by john.graves(at)oracle.com
    I’ve seen many of these floating around.  This is my favourite on an Ubuntu based machine. Just throw it into the /etc/init.d directory and update the following lines: export MW_HOME=/opt/app/wls10.3.4 user='weblogic' Then run: update-rc.d nodemanager default Everything else should be ok for 10.3.4. #!/bin/sh # ### BEGIN INIT INFO # Provides: nodemanager # Required-Start: # Required-Stop: # Default-Start: 2 3 4 5 # Default-Stop: 0 1 6 # Short-Description: WebLogic Nodemanager ### END INIT INFO # nodemgr Oracle Weblogic NodeManager service # # chkconfig: 345 85 15 # description: Oracle Weblogic NodeManager service # ### BEGIN INIT INFO # Provides: nodemgr # Required-Start: $network $local_fs # Required-Stop: # Should-Start: # Should-Stop: # Default-Start: 3 4 5 # Default-Stop: 0 1 2 6 # Short-Description: Oracle Weblogic NodeManager service. # Description: Starts and stops Oracle Weblogic NodeManager. ### END INIT INFO # Source function library. . /lib/lsb/init-functions # set Weblogic environment defining CLASSPATH and LD_LIBRARY_PATH # to start/stop various components. export MW_HOME=/opt/app/wls10.3.4 # # Note: # The setWLSEnv.sh not only does a good job of setting the environment, # but also advertises the fact explicitly in the console! Silence it. # . $MW_HOME/wlserver_10.3/server/bin/setWLSEnv.sh > /dev/null # set NodeManager environment export NodeManagerHome=$WL_HOME/common/nodemanager NodeManagerLockFile=$NodeManagerHome/nodemanager.log.lck # check JAVA_HOME if [ -z ${JAVA_HOME:-} ]; then export JAVA_HOME=/opt/sun/products/java/jdk1.6.0_18 fi exec=$MW_HOME/wlserver_10.3/server/bin/startNodeManager.sh prog='nodemanager' user='weblogic' is_nodemgr_running() { local nodemgr_cnt=`ps -ef | \ grep -i 'java ' | \ grep -i ' weblogic.NodeManager ' | \ grep -v grep | \ wc -l` echo $nodemgr_cnt } get_nodemgr_pid() { nodemgr_pid=0 if [ `is_nodemgr_running` -eq 1 ]; then nodemgr_pid=`ps -ef | \ grep -i 'java ' | \ grep -i ' weblogic.NodeManager ' | \ grep -v grep | \ tr -s ' ' | \ cut -d' ' -f2` fi echo $nodemgr_pid } check_nodemgr_status () { local retval=0 local nodemgr_cnt=`is_nodemgr_running` if [ $nodemgr_cnt -eq 0 ]; then if [ -f $NodeManagerLockFile ]; then retval=2 else retval=3 fi elif [ $nodemgr_cnt -gt 1 ]; then retval=4 else retval=0 fi echo $retval } start() { ulimit -n 65535 [ -x $exec ] || exit 5 echo -n $"Starting $prog: " su $user -c "$exec &" retval=$? echo return $retval } stop() { echo -n $"Stopping $prog: " kill -s 9 `get_nodemgr_pid` &> /dev/null retval=$? echo [ $retval -eq 0 ] && rm -f $NodeManagerLockFile return $retval } restart() { stop start } reload() { restart } force_reload() { restart } rh_status() { local retval=`check_nodemgr_status` if [ $retval -eq 0 ]; then echo "$prog (pid:`get_nodemgr_pid`) is running..." elif [ $retval -eq 4 ]; then echo "Multiple instances of $prog are running..." else echo "$prog is stopped" fi return $retval } rh_status_q() { rh_status >/dev/null 2>&1 } case "$1" in start) rh_status_q && exit 0 $1 ;; stop) rh_status_q || exit 0 $1 ;; restart) $1 ;; reload) rh_status_q || exit 7 $1 ;; force-reload) force_reload ;; status) rh_status ;; condrestart|try-restart) rh_status_q || exit 0 restart ;; *) echo -n "Usage: $0 {" echo -n "start|" echo -n "stop|" echo -n "status|" echo -n "restart|" echo -n "condrestart|" echo -n "try-restart|" echo -n "reload|" echo -n "force-reload" echo "}" exit 2 esac exit $? .csharpcode, .csharpcode pre { font-size: small; color: black; font-family: consolas, "Courier New", courier, monospace; background-color: #ffffff; /*white-space: pre;*/ } .csharpcode pre { margin: 0em; } .csharpcode .rem { color: #008000; } .csharpcode .kwrd { color: #0000ff; } .csharpcode .str { color: #006080; } .csharpcode .op { color: #0000c0; } .csharpcode .preproc { color: #cc6633; } .csharpcode .asp { background-color: #ffff00; } .csharpcode .html { color: #800000; } .csharpcode .attr { color: #ff0000; } .csharpcode .alt { background-color: #f4f4f4; width: 100%; margin: 0em; } .csharpcode .lnum { color: #606060; }

    Read the article

  • crash in calloc

    - by mmd
    I'm trying to debug a program I wrote. I ran it inside gdb and I managed to catch a SIGABRT from inside calloc(). I'm completely confused about how this can arise. Can it be a bug in gcc or even libc?? More details: My program uses OpenMP. I ran it through valgrind in single-threaded mode with no errors. I also use mmap() to load a 40GB file, but I doubt that is relevant. Inside gdb, I'm running with 30 threads. Several identical runs (same input&CL) finished correctly, until the problematic one that I caught. On the surface this suggests there might be a race condition of some type. However, the SIGABRT comes from calloc() which is out of my control. Here is some relevant gdb output: (gdb) info threads [...] * 11 Thread 0x7ffff0056700 (LWP 73449) 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6 [...] (gdb) thread 11 [Switching to thread 11 (Thread 0x7ffff0056700 (LWP 73449))]#0 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6 (gdb) bt #0 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6 #1 0x00007ffff6a96085 in abort () from /lib64/libc.so.6 #2 0x00007ffff6ad1fe7 in __libc_message () from /lib64/libc.so.6 #3 0x00007ffff6ad7916 in malloc_printerr () from /lib64/libc.so.6 #4 0x00007ffff6adb79f in _int_malloc () from /lib64/libc.so.6 #5 0x00007ffff6adbdd6 in calloc () from /lib64/libc.so.6 #6 0x000000000040e87f in my_calloc (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/../gmapper/../common/my-alloc.h:286 #7 read_get_hit_list_per_strand (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/mapping.c:1046 #8 0x000000000041308a in read_get_hit_list (re=<value optimized out>, options=0x632010, n_options=1) at gmapper/mapping.c:1239 #9 handle_read (re=<value optimized out>, options=0x632010, n_options=1) at gmapper/mapping.c:1806 #10 0x0000000000404f35 in launch_scan_threads (.omp_data_i=<value optimized out>) at gmapper/gmapper.c:557 #11 0x00007ffff7230502 in ?? () from /usr/lib64/libgomp.so.1 #12 0x00007ffff6dfc851 in start_thread () from /lib64/libpthread.so.0 #13 0x00007ffff6b4a11d in clone () from /lib64/libc.so.6 (gdb) f 6 #6 0x000000000040e87f in my_calloc (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/../gmapper/../common/my-alloc.h:286 286 res = calloc(size, 1); (gdb) p size $2 = 814080 (gdb) The function my_calloc() is just a wrapper, but the problem is not in there, as the real calloc() call looks legit. These are the limits set in the shell: $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 2067285 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 1024 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited The program is not out of memory, it's using 41GB on a machine with 256GB available: $ top -b -n 1 | grep gmapper 73437 user 20 0 41.5g 16g 15g T 0.0 6.6 55:17.24 gmapper-ls $ free -m total used free shared buffers cached Mem: 258437 195567 62869 0 82 189677 -/+ buffers/cache: 5807 252629 Swap: 0 0 0 I compiled using gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), with flags -g -O2 -DNDEBUG -mmmx -msse -msse2 -fopenmp -Wall -Wno-deprecated -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS.

    Read the article

  • Huawei b153 limit of devices

    - by bdecaf
    I set up my home network all through this 3G wifi router. Problem is it only allows 5 devices to connect. That's not much especially if a wifi printer and gaming consoles keep hogging these slots. On the other hand I don't see the point on blocking these devices. They are (should) not doing anything online just intern in my network. The documentation I can find is surpirisingly unhelpful and focuses how to plug the device in a power socket. So what would be my options. Notes: I have already been able to get a shell on the device using ssh. It's running some Busybox. But I fail to find the how and where this limit is enforced/created. Notes 2: Specifically my device is a 3WebCube - unfortunately not specifically marked with the Huawei Model number. Successes so far After enabling ssh in the options I can login: ssh -T [email protected] [email protected]'s password: ------------------------------- -----Welcome to ATP Cli------ ------------------------------- unfortunately because of this -T - the tab key does not work for autocomplete and all inputted commands will be echoed. Also no history with arrow keys. ATP interface this custom interface is not very useful: ATP>help help Welcome to ATP command line tool. If any question, please input "?" at the end of command. ATP>? ? cls debug help save ? exit ATP>save? save? Command failed. ATP>save ? save ? ATP>debug ? debug ? display set trace ? Shell BUT undocumented - I somehow found on a auto translated chinese website - all you need to do is input sh ATP>sh sh BusyBox vv1.9.1 (2011-03-27 11:59:11 CST) built-in shell (ash) Enter 'help' for a list of built-in commands. # builtin commands # help Built-in commands: ------------------- . : alias bg break cd chdir command continue eval exec exit export false fg getopts hash help jobs kill let local pwd read readonly return set shift source times trap true type ulimit umask unalias unset wait shows standard unix structure: # ls / var tmp proc linuxrc init etc bin usr sbin mnt lib html dev in /bin # ls /bin zebra strace ppps ln echo cat wscd startbsp pppc klog ebtables busybox wlancmd sshd ping kill dns brctl web sntp netstat iwpriv dhcps auth usbdiagd sms mount iwcontrol dhcpc atserver upnp sleep mknod iptables date atcmd upg siproxd mkdir ipcheck cp at umount sh mini_upnpd ip console ash test_at rm mic igmpproxy cms telnetd ripd ls ethcmd cmgr swapdev ps log equipcmd cli in /sbin # ls /sbin vconfig reboot insmod ifconfig arp route poweroff init halt using tftp after installing tftp on my desktop I was able to send files with tftp -s -l curcfg.xml 192.168.1.103 and to download onto the huawei with tftp -g -r curcfg.xml 192.168.1.103 I think I'll need that - because I don't see any editor installed. readout stuff (still playing around where I would get interesting info) For confirmation of hardware: # cat /var/log/modem_hardware_name ^HWVER:"WL1B153M001"# # cat /var/log/modem_software_name 1096.11.03.02.107 # cat /var/log/product_name B153

    Read the article

  • Monit won't run

    - by Yaniro
    I have two identical EC2 instances (the second is a replica of the first), running Gentoo. The first instance has monit running which monitors a single process and some system resources and functions great. In the second instance, monit runs but quits right away. The configuration is similar on both instances so are the versions of monit. monit.log shows: [GMT Oct 3 08:36:41] info : monit daemon with PID 5 awakened Final lines on strace monit show: write(2, "monit daemon with PID 5 awakened"..., 33monit daemon with PID 5 awakened ) = 33 time(NULL) = 1349252827 open("/etc/localtime", O_RDONLY) = 4 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 fstat64(4, {st_mode=S_IFREG|0644, st_size=118, ...}) = 0 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb773a000 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\1\0\0\0\1\0\0\0\0"..., 4096) = 118 _llseek(4, -6, [112], SEEK_CUR) = 0 read(4, "\nGMT0\n", 4096) = 6 close(4) = 0 munmap(0xb773a000, 4096) = 0 write(3, "[GMT Oct 3 08:27:07] info :"..., 33) = 33 write(3, "monit daemon with PID 5 awakened"..., 33) = 33 waitpid(-1, NULL, WNOHANG) = -1 ECHILD (No child processes) close(3) = 0 exit_group(0) = ? No core dumps (ulimit -c shows unlimited) monit -v shows: monit: Debug: Adding host allow 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Skipping redundant host 'localhost' monit: Debug: Adding credentials for user 'xxxx'. Runtime constants: Control file = /etc/monitrc Log file = /var/log/monit/monit.log Pid file = /var/run/monit.pid Id file = /var/run/monit.pid Debug = True Log = True Use syslog = False Is Daemon = True Use process engine = True Poll time = 30 seconds with start delay 0 seconds Expect buffer = 256 bytes Event queue = base directory /var/monit with 100 slots Mail server(s) = xx.xxx.xx.xxx with timeout 30 seconds Mail from = (not defined) Mail subject = (not defined) Mail message = (not defined) Start monit httpd = True httpd bind address = Any/All httpd portnumber = 2812 httpd signature = True Use ssl encryption = False httpd auth. style = Basic Authentication and Host/Net allow list Alert mail to = [email protected] Alert on = All events The service list contains the following entries: System Name = xxxx Monitoring mode = active CPU wait limit = if greater than 20.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU system limit = if greater than 30.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert CPU user limit = if greater than 70.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Swap usage limit = if greater than 25.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Memory usage limit = if greater than 75.0% 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (5min) = if greater than 2.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Load avg. (1min) = if greater than 4.0 1 times within 1 cycle(s) then alert else if succeeded 1 times within 1 cycle(s) then alert Process Name = xxxx Group = server Pid file = /var/run/xxxx.pid Monitoring mode = active Start program = '/etc/init.d/xxxx restart' timeout 20 second(s) Stop program = '/etc/init.d/xxxx stop' timeout 30 second(s) Existence = if does not exist 1 times within 1 cycle(s) then restart else if succeeded 1 times within 1 cycle(s) then alert Pid = if changed 1 times within 1 cycle(s) then alert Ppid = if changed 1 times within 1 cycle(s) then alert Timeout = If restarted 3 times within 5 cycle(s) then unmonitor Alert mail to = [email protected] Alert on = All events Alert mail to = [email protected] Alert on = All events ------------------------------------------------------------------------------- monit daemon with PID 5 awakened Ran emerge --sync before emerge -va monit which installed monit v5.3.2. When that didn't work i've downloaded v5.5 from their website and compiled from source which did not work either.

    Read the article

  • Improving TCP performance over a gigabit network with lots of connections and high traffic of small packets

    - by MinimeDJ
    I’m trying to improve my TCP throughput over a “gigabit network with lots of connections and high traffic of small packets”. My server OS is Ubuntu 11.10 Server 64bit. There are about 50.000 (and growing) clients connected to my server through TCP Sockets (all on the same port). 95% of of my packets have size of 1-150 bytes (TCP header and payload). The rest 5% vary from 150 up to 4096+ bytes. With the config below my server can handle traffic up to 30 Mbps (full duplex). Can you please advice best practice to tune OS for my needs? My /etc/sysctl.cong looks like this: kernel.pid_max = 1000000 net.ipv4.ip_local_port_range = 2500 65000 fs.file-max = 1000000 # net.core.netdev_max_backlog=3000 net.ipv4.tcp_sack=0 # net.core.rmem_max = 16777216 net.core.wmem_max = 16777216 net.core.somaxconn = 2048 # net.ipv4.tcp_rmem = 4096 87380 16777216 net.ipv4.tcp_wmem = 4096 65536 16777216 # net.ipv4.tcp_synack_retries = 2 net.ipv4.tcp_syncookies = 1 net.ipv4.tcp_mem = 50576 64768 98152 # net.core.wmem_default = 65536 net.core.rmem_default = 65536 net.ipv4.tcp_window_scaling=1 # net.ipv4.tcp_mem= 98304 131072 196608 # net.ipv4.tcp_timestamps = 0 net.ipv4.tcp_rfc1337 = 1 net.ipv4.ip_forward = 0 net.ipv4.tcp_congestion_control=cubic net.ipv4.tcp_tw_recycle = 0 net.ipv4.tcp_tw_reuse = 0 # net.ipv4.tcp_orphan_retries = 1 net.ipv4.tcp_fin_timeout = 25 net.ipv4.tcp_max_orphans = 8192 Here are my limits: $ ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 193045 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1000000 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 1000000 [ADDED] My NICs are the following: $ dmesg | grep Broad [ 2.473081] Broadcom NetXtreme II 5771x 10Gigabit Ethernet Driver bnx2x 1.62.12-0 (2011/03/20) [ 2.477808] bnx2x 0000:02:00.0: eth0: Broadcom NetXtreme II BCM57711E XGb (A0) PCI-E x4 5GHz (Gen2) found at mem fb000000, IRQ 28, node addr d8:d3:85:bd:23:08 [ 2.482556] bnx2x 0000:02:00.1: eth1: Broadcom NetXtreme II BCM57711E XGb (A0) PCI-E x4 5GHz (Gen2) found at mem fa000000, IRQ 40, node addr d8:d3:85:bd:23:0c [ADDED 2] ethtool -k eth0 Offload parameters for eth0: rx-checksumming: on tx-checksumming: on scatter-gather: on tcp-segmentation-offload: on udp-fragmentation-offload: off generic-segmentation-offload: on generic-receive-offload: on large-receive-offload: on rx-vlan-offload: on tx-vlan-offload: on ntuple-filters: off receive-hashing: off [ADDED 3] sudo ethtool -S eth0|grep -vw 0 NIC statistics: [1]: rx_bytes: 17521104292 [1]: rx_ucast_packets: 118326392 [1]: tx_bytes: 35351475694 [1]: tx_ucast_packets: 191723897 [2]: rx_bytes: 16569945203 [2]: rx_ucast_packets: 114055437 [2]: tx_bytes: 36748975961 [2]: tx_ucast_packets: 194800859 [3]: rx_bytes: 16222309010 [3]: rx_ucast_packets: 109397802 [3]: tx_bytes: 36034786682 [3]: tx_ucast_packets: 198238209 [4]: rx_bytes: 14884911384 [4]: rx_ucast_packets: 104081414 [4]: rx_discards: 5828 [4]: rx_csum_offload_errors: 1 [4]: tx_bytes: 35663361789 [4]: tx_ucast_packets: 194024824 [5]: rx_bytes: 16465075461 [5]: rx_ucast_packets: 110637200 [5]: tx_bytes: 43720432434 [5]: tx_ucast_packets: 202041894 [6]: rx_bytes: 16788706505 [6]: rx_ucast_packets: 113123182 [6]: tx_bytes: 38443961940 [6]: tx_ucast_packets: 202415075 [7]: rx_bytes: 16287423304 [7]: rx_ucast_packets: 110369475 [7]: rx_csum_offload_errors: 1 [7]: tx_bytes: 35104168638 [7]: tx_ucast_packets: 184905201 [8]: rx_bytes: 12689721791 [8]: rx_ucast_packets: 87616037 [8]: rx_discards: 2638 [8]: tx_bytes: 36133395431 [8]: tx_ucast_packets: 196547264 [9]: rx_bytes: 15007548011 [9]: rx_ucast_packets: 98183525 [9]: rx_csum_offload_errors: 1 [9]: tx_bytes: 34871314517 [9]: tx_ucast_packets: 188532637 [9]: tx_mcast_packets: 12 [10]: rx_bytes: 12112044826 [10]: rx_ucast_packets: 84335465 [10]: rx_discards: 2494 [10]: tx_bytes: 36562151913 [10]: tx_ucast_packets: 195658548 [11]: rx_bytes: 12873153712 [11]: rx_ucast_packets: 89305791 [11]: rx_discards: 2990 [11]: tx_bytes: 36348541675 [11]: tx_ucast_packets: 194155226 [12]: rx_bytes: 12768100958 [12]: rx_ucast_packets: 89350917 [12]: rx_discards: 2667 [12]: tx_bytes: 35730240389 [12]: tx_ucast_packets: 192254480 [13]: rx_bytes: 14533227468 [13]: rx_ucast_packets: 98139795 [13]: tx_bytes: 35954232494 [13]: tx_ucast_packets: 194573612 [13]: tx_bcast_packets: 2 [14]: rx_bytes: 13258647069 [14]: rx_ucast_packets: 92856762 [14]: rx_discards: 3509 [14]: rx_csum_offload_errors: 1 [14]: tx_bytes: 35663586641 [14]: tx_ucast_packets: 189661305 rx_bytes: 226125043936 rx_ucast_packets: 1536428109 rx_bcast_packets: 351 rx_discards: 20126 rx_filtered_packets: 8694 rx_csum_offload_errors: 11 tx_bytes: 548442367057 tx_ucast_packets: 2915571846 tx_mcast_packets: 12 tx_bcast_packets: 2 tx_64_byte_packets: 35417154 tx_65_to_127_byte_packets: 2006984660 tx_128_to_255_byte_packets: 373733514 tx_256_to_511_byte_packets: 378121090 tx_512_to_1023_byte_packets: 77643490 tx_1024_to_1522_byte_packets: 43669214 tx_pause_frames: 228 Some info about SACK: When to turn TCP SACK off?

    Read the article

  • "Can't create table" when having to many partitions

    - by Chris
    I am currently having a problem I dont understand. Wherever I look it says mySQL (5.5) / InnoDB doesnt have a table limit. I wanted to test the InnoDB compression and was about to create an empty copy of an existing table and ran into the following problem. this one works: CREATE TABLE `hsc` ( LOTS OF STUFF ) ENGINE=InnoDB CHARSET=utf8 PARTITION BY RANGE (pid) SUBPARTITION BY HASH (cons) SUBPARTITIONS 2 (PARTITION hsc_p0 VALUES LESS THAN (10000) , PARTITION hsc_p1 VALUES LESS THAN (20000) , PARTITION hsc_p2 VALUES LESS THAN (30000) , PARTITION hsc_p3 VALUES LESS THAN (40000) , PARTITION hsc_p4 VALUES LESS THAN (50000) , PARTITION hsc_p40 VALUES LESS THAN (4000000) ); this one doesn't: CREATE TABLE `hsc` ( LOTS OF STUFF ) ENGINE=InnoDB CHARSET=utf8 PARTITION BY RANGE (pid) SUBPARTITION BY HASH (cons) SUBPARTITIONS 2 (PARTITION hsc_p0 VALUES LESS THAN (10000) , PARTITION hsc_p1 VALUES LESS THAN (20000) , PARTITION hsc_p2 VALUES LESS THAN (30000) , PARTITION hsc_p3 VALUES LESS THAN (40000) , PARTITION hsc_p4 VALUES LESS THAN (50000) , PARTITION hsc_p5 VALUES LESS THAN (75000) , PARTITION hsc_p6 VALUES LESS THAN (100000) , PARTITION hsc_p7 VALUES LESS THAN (125000) , PARTITION hsc_p8 VALUES LESS THAN (150000) , PARTITION hsc_p9 VALUES LESS THAN (175000) , PARTITION hsc_p40 VALUES LESS THAN (4000000) ); ERROR 1005 (HY000): Can't create table 'hsc' (errno: 1) Its reproducable by removing the number of partitions and adding them again. it does not have to do anything with the name of the table as i tried various names. there is also enough empty space on the HDD. /dev/simfs 230G 26G 192G 12% /var/lib/mysql.mnt There should be no limit on the partitions http://dev.mysql.com/doc/refman/5.5/en/partitioning-limitations.html Maximum number of partitions. The maximum possible number of partitions for a given table (that does not use the NDB storage engine) is 1024. This number includes subpartitions. i have increased both open_files show variables where variable_name LIKE '%open_files%'; +-------------------+-------+ | Variable_name | Value | +-------------------+-------+ | innodb_open_files | 512 | | open_files_limit | 1536 | +-------------------+-------+ No change. Any clues where should I start looking? UPDATE: the whole thing is running in an openvz environment. i saw in users_beancounters that the numflock was a problem, so i increased it. but the problem still persists. maybe this helps: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 515011 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 10240 cpu time (seconds, -t) unlimited max user processes (-u) 515011 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited cat /proc/user_beancounters Version: 2.5 uid resource held maxheld barrier limit failcnt 200: kmemsize 9309653 13357056 14372700 14790164 0 lockedpages 0 1008 2048 2048 0 privvmpages 675424 686528 1048576 1572864 0 shmpages 33 673 21504 21504 0 dummy 0 0 9223372036854775807 9223372036854775807 0 numproc 49 90 240 240 0 physpages 243761 246945 0 9223372036854775807 0 vmguarpages 0 0 1048576 1048576 0 oomguarpages 81672 83305 1048576 1048576 0 numtcpsock 6 8 360 360 0 numflock 175 188 512 512 8 numpty 1 9 16 16 0 numsiginfo 0 48 256 256 0 tcpsndbuf 104640 263912 1720320 2703360 0 tcprcvbuf 98304 131072 1720320 2703360 0 othersockbuf 32368 89304 1126080 2097152 0 dgramrcvbuf 0 2312 262144 262144 0 numothersock 19 28 360 360 0 dcachesize 2285052 3624426 3409920 3624960 0 numfile 616 870 9312 9312 0 dummy 0 0 9223372036854775807 9223372036854775807 0 dummy 0 0 9223372036854775807 9223372036854775807 0 dummy 0 0 9223372036854775807 9223372036854775807 0 numiptent 24 24 128 128 0

    Read the article

  • HAProxy: Display a "BADREQ" | BADREQ's by the thousands

    - by GruffTech
    My HAProxy Configuration. #HA-Proxy version 1.3.22 2009/10/14 Copyright 2000-2009 Willy Tarreau <[email protected]> global maxconn 10000 spread-checks 50 user haproxy group haproxy daemon stats socket /tmp/haproxy log localhost local0 log localhost local1 notice defaults mode http maxconn 50000 timeout client 10000 option forwardfor except 127.0.0.1 option httpclose option httplog listen dcaustin 0.0.0.0:80 mode http timeout connect 12000 timeout server 60000 timeout queue 120000 balance roundrobin option httpchk GET /index.html log global option httplog option dontlog-normal server web1 10.10.10.101:80 maxconn 300 check fall 1 server web2 10.10.10.102:80 maxconn 300 check fall 1 server web3 10.10.10.103:80 maxconn 300 check fall 1 server web4 10.10.10.104:80 maxconn 300 check fall 1 listen stats 0.0.0.0:9000 mode http balance log global timeout client 5000 timeout connect 4000 timeout server 30000 stats uri /haproxy HAProxy is running, and the socket is working... adam@dcaustin:/etc/haproxy# echo "show info" | socat stdio /tmp/haproxy Name: HAProxy Version: 1.3.22 Release_date: 2009/10/14 Nbproc: 1 Process_num: 1 Pid: 6320 Uptime: 0d 0h14m58s Uptime_sec: 898 Memmax_MB: 0 Ulimit-n: 20017 Maxsock: 20017 Maxconn: 10000 Maxpipes: 0 CurrConns: 47 PipesUsed: 0 PipesFree: 0 Tasks: 51 Run_queue: 1 node: dcaustin desiption: Errors show nothing from socket... adam@dcaustin:/etc/haproxy# echo "show errors" | socat stdio /tmp/haproxy adam@dcaustin:/etc/haproxy# However... My Error log is exploding with "badrequests" with the Error code cR. cR (according to 1.3 documentation) is The "timeout http-request" stroke before the client sent a full HTTP request. This is sometimes caused by too large TCP MSS values on the client side for PPPoE networks which cannot transport full-sized packets, or by clients sending requests by hand and not typing fast enough, or forgetting to enter the empty line at the end of the request. The HTTP status code is likely a 408 here. Correct on the 408, but we're getting literally thousands of these requests every hour. (This log snippet is an clip for about 10 seconds of time...) Jun 30 11:08:52 localhost haproxy[6320]: 92.22.213.32:26448 [30/Jun/2011:11:08:42.384] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10002 408 212 - - cR-- 35/35/18/0/0 0/0 "<BADREQ>" Jun 30 11:08:54 localhost haproxy[6320]: 71.62.130.24:62818 [30/Jun/2011:11:08:44.457] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10001 408 212 - - cR-- 39/39/16/0/0 0/0 "<BADREQ>" Jun 30 11:08:55 localhost haproxy[6320]: 84.73.75.236:3589 [30/Jun/2011:11:08:45.021] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10008 408 212 - - cR-- 35/35/15/0/0 0/0 "<BADREQ>" Jun 30 11:08:55 localhost haproxy[6320]: 69.39.20.190:49969 [30/Jun/2011:11:08:45.709] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10000 408 212 - - cR-- 37/37/16/0/0 0/0 "<BADREQ>" Jun 30 11:08:56 localhost haproxy[6320]: 2.29.0.9:58772 [30/Jun/2011:11:08:46.846] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10001 408 212 - - cR-- 43/43/22/0/0 0/0 "<BADREQ>" Jun 30 11:08:57 localhost haproxy[6320]: 212.139.250.242:57537 [30/Jun/2011:11:08:47.568] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10000 408 212 - - cR-- 42/42/21/0/0 0/0 "<BADREQ>" Jun 30 11:08:58 localhost haproxy[6320]: 74.79.195.75:55046 [30/Jun/2011:11:08:48.559] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10000 408 212 - - cR-- 46/46/24/0/0 0/0 "<BADREQ>" Jun 30 11:08:58 localhost haproxy[6320]: 74.79.195.75:55044 [30/Jun/2011:11:08:48.554] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10004 408 212 - - cR-- 45/45/24/0/0 0/0 "<BADREQ>" Jun 30 11:08:58 localhost haproxy[6320]: 74.79.195.75:55045 [30/Jun/2011:11:08:48.554] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10005 408 212 - - cR-- 44/44/24/0/0 0/0 "<BADREQ>" Jun 30 11:09:00 localhost haproxy[6320]: 68.197.56.2:52781 [30/Jun/2011:11:08:50.975] dcaustin dcaustin/<NOSRV> -1/-1/-1/-1/10000 408 212 - - cR-- 49/49/28/0/0 0/0 "<BADREQ>" From what I read on google, if i wanted to see what the bad requests are, I can show errors to the socket and it will spit them out. We do run a pretty heavily trafficed website and the percentage of "BADREQS" to normal requests is quite low, but I'd like to be able to get ahold of what that request WAS so I can debug it. stats # pxname,svname,qcur,qmax,scur,smax,slim,stot,bin,bout,dreq,dresp,ereq,econ,eresp,wretr,wredis,status,weight,act,bck,chkfail,chkdown,lastchg,downtime,qlimit,pid,iid,sid,throttle,lbtot,tracked,type,rate,rate_lim,rate_max, dcaustin,FRONTEND,,,64,120,50000,88433,105889100,2553809875,0,0,4641,,,,,OPEN,,,,,,,,,1,1,0,,,,0,45,0,128, dcaustin,web1,0,0,10,28,300,20941,25402112,633143416,,0,,0,3,0,0,UP,1,1,0,0,0,2208,0,,1,1,1,,20941,,2,11,,30, dcaustin,web2,0,0,9,30,300,20941,25026691,641475169,,0,,0,3,0,0,UP,1,1,0,0,0,2208,0,,1,1,2,,20941,,2,11,,30, dcaustin,web3,0,0,10,27,300,20940,30116527,635015040,,0,,0,9,0,0,UP,1,1,0,0,0,2208,0,,1,1,3,,20940,,2,10,,31, dcaustin,web4,0,0,5,28,300,20940,25343770,643209546,,0,,0,8,0,0,UP,1,1,0,0,0,2208,0,,1,1,4,,20940,,2,11,,31, dcaustin,BACKEND,0,0,34,95,50000,83762,105889100,2553809875,0,0,,0,34,0,0,UP,4,4,0,,0,2208,0,,1,1,0,,83762,,1,43,,122, 88500 "Sessions" and 4500 errors. in the last 20 minutes.

    Read the article

  • php crashes with no core file and this message : apc_mmap failed

    - by greg0ire
    Description of the problem Regularly, cron php processes crash on our production server, which result in mails with the following body : PHP Fatal error: PHP Startup: apc_mmap: mmap failed: in Unknown on line 0 Segmentation fault (core dumped) I think the Segmentation fault (core dumped) should result in core files being handled by apport and then written in /var/crashes, but the files I can see there are there since yesterday, although the last crash occured today : -rw-r----- 1 root whoopsie 1138528 mai 22 04:09 _usr_bin_php5.0.crash -rw-r----- 1 frontoffice whoopsie 1166373 mai 20 18:00 _usr_bin_php5.1005.crash -rw-r----- 1 frontoffice whoopsie 81622658 mai 22 00:05 _usr_sbin_php5-fpm.1005.crash I tried to download the last one anyway, and ran gdb /usr/sbin/php5-fpm /tmp/_usr_sbin_php5-fpm.1005.crash, only to be told that the file is not a core file (its format was not recognized). Here is the server's apc configuration : cat /etc/php5/cli/conf.d/20-apc.ini extension=apc.so apc.shm_size=512M apc.ttl=3600 apc.user_ttl=3600 apc.enable_cli=1 I'm mostly worried about the apc.shm_size… isn't it too high or too low ? I understand it has to do with the size of memory segments. Question(s) What could be the problem ? How can I troubleshoot it (how can I get a valid core file ?) ? System information free total used free shared buffers cached Mem: 5081296 4354684 726612 0 374744 959968 -/+ buffers/cache: 3019972 2061324 Swap: 522236 516888 5348 cat /etc/lsb-release DISTRIB_ID=Ubuntu DISTRIB_RELEASE=12.04 DISTRIB_CODENAME=precise DISTRIB_DESCRIPTION="Ubuntu 12.04.2 LTS" php -v PHP 5.4.17-1~precise+1 (cli) (built: Jul 17 2013 18:14:06) Copyright (c) 1997-2013 The PHP Group Zend Engine v2.4.0, Copyright (c) 1998-2013 Zend Technologies php -i excerpt : Configuration apc APC Support => enabled Version => 3.1.13 APC Debugging => Disabled MMAP Support => Enabled MMAP File Mask => Locking type => pthread mutex Locks Serialization Support => php Revision => $Revision: 327136 $ Build Date => Nov 20 2012 18:41:36 Directive => Local Value => Master Value apc.cache_by_default => On => On apc.canonicalize => On => On apc.coredump_unmap => Off => Off apc.enable_cli => On => On apc.enabled => On => On apc.file_md5 => Off => Off apc.file_update_protection => 2 => 2 apc.filters => no value => no value apc.gc_ttl => 3600 => 3600 apc.include_once_override => Off => Off apc.lazy_classes => Off => Off apc.lazy_functions => Off => Off apc.max_file_size => 1M => 1M apc.mmap_file_mask => no value => no value apc.num_files_hint => 1000 => 1000 apc.preload_path => no value => no value apc.report_autofilter => Off => Off apc.rfc1867 => Off => Off apc.rfc1867_freq => 0 => 0 apc.rfc1867_name => APC_UPLOAD_PROGRESS => APC_UPLOAD_PROGRESS apc.rfc1867_prefix => upload_ => upload_ apc.rfc1867_ttl => 3600 => 3600 apc.serializer => default => default apc.shm_segments => 1 => 1 apc.shm_size => 512M => 512M apc.shm_strings_buffer => 4M => 4M apc.slam_defense => On => On apc.stat => On => On apc.stat_ctime => Off => Off apc.ttl => 3600 => 3600 apc.use_request_time => On => On apc.user_entries_hint => 4096 => 4096 apc.user_ttl => 3600 => 3600 apc.write_lock => On => On php -m [PHP Modules] apc bcmath bz2 calendar Core ctype curl date dba dom ereg exif fileinfo filter ftp gd gettext hash iconv imagick intl json ldap libxml mbstring memcache memcached mhash mysql mysqli openssl pcntl pcre PDO pdo_mysql pdo_pgsql pdo_sqlite pgsql Phar posix Reflection session shmop SimpleXML soap sockets SPL sqlite3 standard sysvmsg sysvsem sysvshm tidy tokenizer wddx xml xmlreader xmlwriter zip zlib [Zend Modules] ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 39531 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 39531 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited

    Read the article

  • webserver horrible slow, sometimes incredible fast

    - by dhanke
    i am running a small community ( 6000+ Members ) on a non-virtual 64-bit ubuntu 11.04 system. I am not a Linux-pro, not even advanced, i just tried to setup a webserver, which does nothing special actually. Delivering some dynamic PHP and RoR websites is its task. So it might be that my configuration files do look horrible bad. Also, i might use the wrong vocabulary, so in doubt, please ask. Having a current all-time record of 520 registered users (board-accounts, no system-users) online at same time, average server-load is about 2.0 - 5.0. Meantime (~250 users) average server load value is at about 0.4 - 0.8, sometimes, on some expensive searches a bit higher. everything fine. From time to time however, the load increases up to 120 (120.0, not 12.0 ;) ). In this time, its hard to even connect via SSH, but when i reach the server, and use top/htop/iotop to see whats happening, i cannot identify any process causing high CPU load. iotop tells me about a current reading/writing speed of about approx. 70kb/s, which is quite equal to power-off i think. Memory-Usage is max. at ~ 12GB of 16GB, so swap remains empty. now the odd (at least for me:) waiting some minutes ( since i always get a bit into a panic when this happens, it feels like 5 minutes, but i suppose its more like 20-30 minutes) and the server is back to normal. everything continues as normal. another odd fact: when i run hdparm -tT /dev/sda, i get answer like: /dev/sda: Timing cached reads: 7180 MB in 2.00 seconds = 3591.13 MB/sec Timing buffered disk reads: 348 MB in 3.02 seconds = 115.41 MB/sec when i run the same command while the server is "frozen", the answer is like /dev/sda: <- takes about 5 minutes until this line appears Timing cached reads: 7180 MB in 2.00 seconds = 3591.13 MB/sec <- 5 more minutes Timing buffered disk reads: 348 MB in 3.02 seconds = 115.41 MB/sec <- another 5 minutes so the values are the same, but the quoted time is completely wrong. using time command as prefix also tells me that ~ 15 minutes were used. I searched in dmesg, /var/log/[messages|syslog] - nothing found. /var/log/errors however tells me that: Jul 4 20:28:30 localhost kernel: [19080.671415] INFO: task php5-fpm:27728 blocked for more than 120 seconds. Jul 4 20:28:30 localhost kernel: [19080.671419] "echo 0 /proc/sys/kernel/hung_task_timeout_secs" disables this message. multiple times. now that message does tell me that php5-fpm task was blocked or did block ? - but not if that is the cause or just one of the results of that "freeze". Anyone? to cut the long story short, i dont know where even to start analyzing. So if you can give me any advice by looking at following specs and configs, or ask me to provide more information, i`d be glad. Specs: 6 Core AMD Phenom(tm) II X6 1055T Processor * 16 Gigabyte Ram 2x 1.5 TB Seagate ST1500DL003-9VT16L via SATA 3 via SoftwareRaid (i suppose) Services: (due to service --status-all, those with [ + ]) nginx Webserver 1.0.14 mySQL 5.1.63 Server Ruby on Rails 2.3.11 ( passenger-nginx-module ) php5-fpm 5.3.6-13ubuntu3.7 SSH ido2db Further services: default crontab + nightly backup. syslog-ng Website consists of 2 subdomains, forum. and www. where forum is a phpBB3.x PHP-Board, and www a Ruby on Rails 2.3.11 application (portal). Mini-Note: sometimes i notice that the forum is pretty slow, in contrast to the always-fast (except for this "freeze") portal. Both share the same Database, but the portal is using it read-only. The Webserver is nginx, using phusion passenger module to communicate with the ruby-application. Also, for the forum it communicates with php5-fpm via socket: relevant nginx configuration parts ( with comments/questions starting by ; ) ; in case of freeze due to too high Filesystem activity, maybe adding a limit? #worker_rlimit_nofile 50000; user www-data; ; 6 cores, so i read 6 fits. maybe already wrong? worker_processes 6; pid /var/run/nginx.pid; events { worker_connections 1024; } http { passenger_root /var/lib/gems/1.8/gems/passenger-3.0.11; passenger_ruby /usr/bin/ruby1.8; ; the forum once featured a chat, which was working w/o websockets. ; so it was a hell of pull requests (deactivated now, freeze still happening) keepalive_timeout 65; keepalive_requests 50; gzip on; server { listen 80; server_name www.domain.tld; root /var/www/domain/rails/public; passenger_enabled on; } server { listen 80; server_name forum.domain.tld; location / { root /var/www/domain/forum; index index.php; } ; satic stuff to be handled by nginx location ~* ^/style/.+.(jpg|jpeg|gif|css|png|js|ico|xml)$ { access_log off; expires 30d; root /var/www/domain/forum/; } ; now the php magic, note the "backend"-fcgi_pass location ~ .php$ { fastcgi_split_path_info ^(.+\.php)(.*)$; fastcgi_pass backend; fastcgi_index index.php; fastcgi_param SCRIPT_FILENAME /var/www/domain/forum$fastcgi_script_name; include fastcgi_params; fastcgi_param QUERY_STRING $query_string; fastcgi_param REQUEST_METHOD $request_method; fastcgi_param CONTENT_TYPE $content_type; fastcgi_param CONTENT_LENGTH $content_length; fastcgi_intercept_errors on; fastcgi_ignore_client_abort off; fastcgi_connect_timeout 60; fastcgi_send_timeout 180; fastcgi_read_timeout 180; fastcgi_buffer_size 128k; fastcgi_buffers 256 16k; fastcgi_busy_buffers_size 256k; fastcgi_temp_file_write_size 256k; fastcgi_max_temp_file_size 0; } location ~ /\.ht { deny all; } } ;the php5-fpm socket. i read that /dev/shm/ whould be the fastes place for this. bad idea in general? upstream backend { server unix:/dev/shm/phpfpm; } ... } php5-fpm settings (i changed this values due to php5-fpm error log messages higher and higher.. (freeze-problem was there before as well)* listen = /dev/shm/phpfpm user = www-data group = www-data pm = dynamic ; holy, 4000! well, shinking this value to earth-level gave me ; 100s of 502 bad gateway commands. this values were quite stable. ; since there are only max 520 users online i dont get it, why i would need ; as many children as configured here. due to keep-alive maybe? ; asking questions is easier for me since restarting server will make ; my community-members angry ;) pm.max_children = 4000 pm.start_servers = 100 pm.min_spare_servers = 50 pm.max_spare_servers = 150 pm.max_requests = 10 pm.status_path = /status ping.path = /ping ping.response = pong slowlog = log/$pool.log.slow ;should i use rlimit? ;rlimit_files = 1024 chdir = / mysql/my.cnf [client] port = 3306 socket = /var/run/mysqld/mysqld.sock [mysqld_safe] socket = /var/run/mysqld/mysqld.sock nice = 0 [mysqld] user = mysql socket = /var/run/mysqld/mysqld.sock port = 3306 basedir = /usr datadir = /var/lib/mysql tmpdir = /tmp skip-external-locking bind-address = 127.0.0.1 key_buffer = 16M max_allowed_packet = 16M thread_stack = 192K thread_cache_size = 8 myisam-recover = BACKUP ; high number, but less gives some phpBB errors. max_connections = 450 table_cache = 512 ; i read twice the cpu cores, bad? thread_concurrency = 12 join_buffer_size = 2084K concurrent_insert = 3 query_cache_limit = 64M query_cache_size = 512M query_cache_type = 1 log_error = /var/log/mysql/error.log log_slow_queries = /var/log/mysql/mysql-slow.log long_query_time = 2 expire_logs_days = 10 max_binlog_size = 100M low_priority_updates=1 [mysqldump] quick quote-names max_allowed_packet = 16M [isamchk] key_buffer = 16M !includedir /etc/mysql/conf.d/ I used smartctl already, hdds seem to be fine. /proc/mdstatus quotes: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md3 : active raid1 sda3[1] 1459264192 blocks [2/1] [_U] md1 : active raid1 sda1[0] 3911680 blocks [2/1] [U_] unused devices: ulimit -a core file size (blocks, -c) 0 data seg size (kbytes, -d) unlimited scheduling priority (-e) 0 file size (blocks, -f) unlimited pending signals (-i) 127727 max locked memory (kbytes, -l) 64 max memory size (kbytes, -m) unlimited open files (-n) 1024 pipe size (512 bytes, -p) 8 POSIX message queues (bytes, -q) 819200 real-time priority (-r) 0 stack size (kbytes, -s) 8192 cpu time (seconds, -t) unlimited max user processes (-u) 127727 virtual memory (kbytes, -v) unlimited file locks (-x) unlimited I quote some questions in my configuration files, these are not (intentional) directly problem-related, but would be nice for me to know wether they are indeed questionable or done right. One additional Fact: my MYSQL-database is at 12GB size. i dont know if that does matter, but mytop sometimes shows me 4-5 seconds long insert queries, some are 20-30 seconds long. Its just a feeling that i am unable to prove (because i dont know how), but when i disable the database, the freeze seems not to happen. Example: i created a dummy rails application to see the development log. the app made some sql-queries, reads and inserts. the log quite often was like: DbTest Load (0.3ms) SELECT * FROM `db_test` WHERE (`db_test`.`id` = 31722) LIMIT 1 SQL (0.1ms) BEGIN DbTest Update (0.3ms) UPDATE `db_test` SET `updated_at` = '2012-07-04 23:32:34' WHERE `id` = 31722 - now the log stands still for 5-60 seconds. SQL (49.1ms) COMMIT - SQL-Update time in the log does not include freeze time Rendering test/index Completed in 96ms (View: 16, DB: 59) | 200 OK [http://localhost:9000/test] Bad part is: this mini-freeze here only happens from time to time as well. note: meanwhile i cannot even upload files via scp. I currently feel like running form bad to worse and back by googling for my server-problem due to immense lack of knowledge regarding server configurations. It still makes me wonder, why those problems even appear, since 250 users a time is not such a high amount, right? So my questions: whats wrong and how to fix? ;) or: what information can i provide to make the situation more clear? can you point at some critical bad configuration-line which i should consider to catch up in the documentation? are there any tools i can run to see some possible bottlenecks? any further advice? (next to: "pay someone who knows what he does" - its a private project, server costs enough already. :)) Thanks for your time and help. Best Regards, Daniel P.S.: i renamed the configfiles to domain.tld since i dont want to have any % more load to the server until its fixed. might be a exaggeratedly thought.. P.P.S: if i asked a complete duplicate question, sorry. my search results seemed to be quite specific in their own way.

    Read the article

< Previous Page | 1 2 3 4