Search Results

Search found 10883 results on 436 pages for 'thread timeout'.

Page 129/436 | < Previous Page | 125 126 127 128 129 130 131 132 133 134 135 136  | Next Page >

  • IP failover with 2 nodes on different subnet: cannot ping virtual IP from second node?

    - by quanta
    I'm going to setup redundant failover Redmine: another instance was installed on the second server without problem MySQL (running on the same machine with Redmine) was configured as master-master replication Because they are in different subnet (192.168.3.x and 192.168.6.x), it seems that VIPArip is the only choice. /etc/ha.d/ha.cf on node1 logfacility none debug 1 debugfile /var/log/ha-debug logfile /var/log/ha-log autojoin none warntime 3 deadtime 6 initdead 60 udpport 694 ucast eth1 node2.ip keepalive 1 node node1 node node2 crm respawn /etc/ha.d/ha.cf on node2: logfacility none debug 1 debugfile /var/log/ha-debug logfile /var/log/ha-log autojoin none warntime 3 deadtime 6 initdead 60 udpport 694 ucast eth0 node1.ip keepalive 1 node node1 node node2 crm respawn crm configure show: node $id="6c27077e-d718-4c82-b307-7dccaa027a72" node1 node $id="740d0726-e91d-40ed-9dc0-2368214a1f56" node2 primitive VIPArip ocf:heartbeat:VIPArip \ params ip="192.168.6.8" nic="lo:0" \ op start interval="0" timeout="20s" \ op monitor interval="5s" timeout="20s" depth="0" \ op stop interval="0" timeout="20s" \ meta is-managed="true" property $id="cib-bootstrap-options" \ stonith-enabled="false" \ dc-version="1.0.12-unknown" \ cluster-infrastructure="Heartbeat" \ last-lrm-refresh="1338870303" crm_mon -1: ============ Last updated: Tue Jun 5 18:36:42 2012 Stack: Heartbeat Current DC: node2 (740d0726-e91d-40ed-9dc0-2368214a1f56) - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, unknown expected votes 1 Resources configured. ============ Online: [ node1 node2 ] VIPArip (ocf::heartbeat:VIPArip): Started node1 ip addr show lo: 1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436 qdisc noqueue link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00 inet 127.0.0.1/8 scope host lo inet 192.168.6.8/32 scope global lo inet6 ::1/128 scope host valid_lft forever preferred_lft forever I can ping 192.168.6.8 from node1 (192.168.3.x): # ping -c 4 192.168.6.8 PING 192.168.6.8 (192.168.6.8) 56(84) bytes of data. 64 bytes from 192.168.6.8: icmp_seq=1 ttl=64 time=0.062 ms 64 bytes from 192.168.6.8: icmp_seq=2 ttl=64 time=0.046 ms 64 bytes from 192.168.6.8: icmp_seq=3 ttl=64 time=0.059 ms 64 bytes from 192.168.6.8: icmp_seq=4 ttl=64 time=0.071 ms --- 192.168.6.8 ping statistics --- 4 packets transmitted, 4 received, 0% packet loss, time 3000ms rtt min/avg/max/mdev = 0.046/0.059/0.071/0.011 ms but cannot ping virtual IP from node2 (192.168.6.x) and outside. Did I miss something? PS: you probably want to set IP2UTIL=/sbin/ip in the /usr/lib/ocf/resource.d/heartbeat/VIPArip resource agent script if you get something like this: Jun 5 11:08:10 node1 lrmd: [19832]: info: RA output: (VIPArip:stop:stderr) 2012/06/05_11:08:10 ERROR: Invalid OCF_RESK EY_ip [192.168.6.8] http://www.clusterlabs.org/wiki/Debugging_Resource_Failures Reply to @DukeLion: Which router receives RIP updates? When I start the VIPArip resource, ripd was run with below configuration file (on node1): /var/run/resource-agents/VIPArip-ripd.conf: hostname ripd password zebra debug rip events debug rip packet debug rip zebra log file /var/log/quagga/quagga.log router rip !nic_tag no passive-interface lo:0 network lo:0 distribute-list private out lo:0 distribute-list private in lo:0 !metric_tag redistribute connected metric 3 !ip_tag access-list private permit 192.168.6.8/32 access-list private deny any

    Read the article

  • MySQL: Pacemaker cannot start the failed master as a new slave?

    - by quanta
    I'm going to setup failover for MySQL replication (1 master and 1 slave) follow this guide: https://github.com/jayjanssen/Percona-Pacemaker-Resource-Agents/blob/master/doc/PRM-setup-guide.rst Here're the output of crm configure show: node serving-6192 \ attributes p_mysql_mysql_master_IP="192.168.6.192" node svr184R-638.localdomain \ attributes p_mysql_mysql_master_IP="192.168.6.38" primitive p_mysql ocf:percona:mysql \ params config="/etc/my.cnf" pid="/var/run/mysqld/mysqld.pid" socket="/var/lib/mysql/mysql.sock" replication_user="repl" replication_passwd="x" test_user="test_user" test_passwd="x" \ op monitor interval="5s" role="Master" OCF_CHECK_LEVEL="1" \ op monitor interval="2s" role="Slave" timeout="30s" OCF_CHECK_LEVEL="1" \ op start interval="0" timeout="120s" \ op stop interval="0" timeout="120s" primitive writer_vip ocf:heartbeat:IPaddr2 \ params ip="192.168.6.8" cidr_netmask="32" \ op monitor interval="10s" \ meta is-managed="true" ms ms_MySQL p_mysql \ meta master-max="1" master-node-max="1" clone-max="2" clone-node-max="1" notify="true" globally-unique="false" target-role="Master" is-managed="true" colocation writer_vip_on_master inf: writer_vip ms_MySQL:Master order ms_MySQL_promote_before_vip inf: ms_MySQL:promote writer_vip:start property $id="cib-bootstrap-options" \ dc-version="1.0.12-unknown" \ cluster-infrastructure="openais" \ expected-quorum-votes="2" \ no-quorum-policy="ignore" \ stonith-enabled="false" \ last-lrm-refresh="1341801689" property $id="mysql_replication" \ p_mysql_REPL_INFO="192.168.6.192|mysql-bin.000006|338" crm_mon: Last updated: Mon Jul 9 10:30:01 2012 Stack: openais Current DC: serving-6192 - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, 2 expected votes 2 Resources configured. ============ Online: [ serving-6192 svr184R-638.localdomain ] Master/Slave Set: ms_MySQL Masters: [ serving-6192 ] Slaves: [ svr184R-638.localdomain ] writer_vip (ocf::heartbeat:IPaddr2): Started serving-6192 Editing /etc/my.cnf on the serving-6192 of wrong syntax to test failover and it's working fine: svr184R-638.localdomain being promoted to become the master writer_vip switch to svr184R-638.localdomain Current state: Last updated: Mon Jul 9 10:35:57 2012 Stack: openais Current DC: serving-6192 - partition with quorum Version: 1.0.12-unknown 2 Nodes configured, 2 expected votes 2 Resources configured. ============ Online: [ serving-6192 svr184R-638.localdomain ] Master/Slave Set: ms_MySQL Masters: [ svr184R-638.localdomain ] Stopped: [ p_mysql:0 ] writer_vip (ocf::heartbeat:IPaddr2): Started svr184R-638.localdomain Failed actions: p_mysql:0_monitor_5000 (node=serving-6192, call=15, rc=7, status=complete): not running p_mysql:0_demote_0 (node=serving-6192, call=22, rc=7, status=complete): not running p_mysql:0_start_0 (node=serving-6192, call=26, rc=-2, status=Timed Out): unknown exec error Remove the wrong syntax from /etc/my.cnf on serving-6192, and restart corosync, what I would like to see is serving-6192 was started as a new slave but it doesn't: Failed actions: p_mysql:0_start_0 (node=serving-6192, call=4, rc=1, status=complete): unknown error Here're snippet of the logs which I'm suspecting: Jul 09 10:46:32 serving-6192 lrmd: [7321]: info: rsc:p_mysql:0:4: start Jul 09 10:46:32 serving-6192 lrmd: [7321]: info: RA output: (p_mysql:0:start:stderr) Error performing operation: The object/attribute does not exist Jul 09 10:46:32 serving-6192 crm_attribute: [7420]: info: Invoked: /usr/sbin/crm_attribute -N serving-6192 -l reboot --name readable -v 0 The full logs: http://fpaste.org/AyOZ/ The strange thing is I can starting it manually: export OCF_ROOT=/usr/lib/ocf export OCF_RESKEY_config="/etc/my.cnf" export OCF_RESKEY_pid="/var/run/mysqld/mysqld.pid" export OCF_RESKEY_socket="/var/lib/mysql/mysql.sock" export OCF_RESKEY_replication_user="repl" export OCF_RESKEY_replication_passwd="x" export OCF_RESKEY_test_user="test_user" export OCF_RESKEY_test_passwd="x" sh -x /usr/lib/ocf/resource.d/percona/mysql start: http://fpaste.org/RVGh/ Did I make something wrong?

    Read the article

  • OS X Snow Leopard 10.6 Refuses to Load Websites the first time intermittently

    - by Brandon
    Many times when I am browsing the web, Snow Leopard will sit and load a site for 20 seconds or more, until it times out and says it cannot be displayed. If I refresh, it loads RIGHT away, every time. The issue is intermittent but happens from once every couple of days to a few times a day. So the long and short of it is this: Aluminum MacBook (Non-Pro) 2.4GHz Core2Duo, 4GB DDR3 I am using 10.6.6 but I have had this issue since 10.6.0 It happens in Firefox, Chrome, and Safari I have flushed my DNS (using the command 'blablabla flush') I am using custom DNS servers which I hoped would fix it but it had no effect* I am running Apache currently but haven't been for most of the time I've reformatted multiple times, always experiencing the issue I am on Cox cable internet, with a Motorola Surfboard & a Belkin F6D4230-4 v1 (Pre?) N wireless router. I've put the router in G only & N only & G+N to no effect It seems to be domain dependant as I can sometimes load the Google cache right away, and sometimes other sites will load but Google will refuse My Powerbook G4 with Leopard, other Windows XP laptops, & my wired Win7 desktop do not suffer from the issue. *I recently started using these to escape the awful Cox redirect page on timeouts I'm almost positive the issue has happened on other networks but I can't recall a specific instance (I have a terrible memory). The problem is intermittent and fixable enough (I just have to wait until it times out and hit refresh one time) but incredibly annoying since I'm constantly reading documentation from a large variety of sites. EDIT: To clarify, this happens with ALL sites, not only specific sites. I haven't been able to detect any pattern to the failures, but one day Google.com will refuse to load while reddit.com will, and the next day vice versa. Keep in mind that waiting for a timeout and hitting refresh loads the page right away, every time. If I don't wait for the timeout, opening more links, hitting refresh, and clicking the link a billion times have no effect. It seems to be domain neutral, affecting sites seemingly at random. It doesn't seem to have anything to do with connection inactivity either, because I will be SSHed into different servers, uploading files, browsing, downloading, etc, and it will just quit loading Jquery.com (for example) until I sit and wait for a timeout. /EDIT This is my last resort. Please, someone, tell me what is happening. Thank you.

    Read the article

  • How to enable Chess 3D on Ubuntu 9.10?

    - by Jian Lin
    The 3D cannot be easily enabled. A thread that people refer to is http://ubuntuforums.org/showthread.php?t=416660 but I tried several suggestions on that thread and it doesn't work yet. The message is: No Python OpenGL support No Python GTKGLExt support

    Read the article

  • Monitoring mongrel with monit

    - by matnagel
    I wrote a monit.d file for mongrels which works in this version: check process redmine with pidfile /home/redmine/service/redmine.pid group webservice start program = "/usr/bin/mongrel_rails start -p 41328 -e production -d --pid /home/redmine/service/redmine.pid --user redmine --group redmine -a 127.0.0.1 -c /home/redmine/app" stop program = "/usr/bin/mongrel_rails stop --pid /home/redmine/service/redmine.pid -c /home/redmine/app && rm /home/redmine/service/redmine.pid > /dev/null 2>&1 if cpu greater 50% for 2 cycles then alert if cpu greater 80% for 3 cycles then restart if totalmem greater 60.0 MB for 5 cycles then restart if loadavg (5min) greater 4 for 8 cycles then restart if 3 restarts within 5 cycles then timeout $ Checking monit control file syntax... $ Control file syntax OK I want to also monitor the http response, so I add this line at the end: if failed port 41328 protocol http with timeout 10 seconds then restart Now monit complains: $ Checking monit control file syntax... $ /etc/monit.d/redmine:16: Error: exceeded maximum number of program arguments 'http' $ ERROR: CHECK MONIT CONFIG FILE SYNTAX How do I correctly monitor the port?

    Read the article

  • Keep-Alive header not sent from Tomcat 5.5 http connector?

    - by Codek
    Hi, We're currently using a hardware load balancer, which then goes to Apache and that then goes to Tomcat 5.5 via the AJP connector. We've decided to dump apache for various reasons - In our current system it doesnt provide any advantage. However when I look at the headers sent when we do this, the "Keep-Alive: timeout=15 max=96" header doesnt get sent when you use the tomcat http connector Interestingly, i can find no documentiation on "keepalivetimeout" for tomcat5.5, but i can for tomcat6. But neither can i find evidence that tomcat5.5 doesnt support this setting. here's my connector: <Connector port="8090" maxHttpHeaderSize="8192" maxThreads="400" minSpareThreads="150" maxSpareThreads="300" enableLookups="false" connectionTimeout="2" maxKeepAliveRequests="400" disableUploadTimeout="true" /> So; Is there any way I can specify the keepalive timeout if we use the http connector with tomcat 5.5, and force this header entry to be sent? Thanks, Dan

    Read the article

  • site timing out when under heavy load

    - by naunu
    My client sends out eblasts at 8am monday/wed/friday. Between 8:15-8:45 the site becomes extremely slow and many users sessions timeout. My setup: Mediatemple VE 2gb dedicated ram (3 burst) Ubuntu 9.10 Apache2-mpm-worker PHP5.3-fcgi MySQL 5 I recently tried to remedy the problem by switching from apache2-mpm-prefork to mpm-worker, but am still having the same issues. My apache settings are: Timeout 100 KeepAlive On MaxKeepAliveRequests 100 <IfModule mpm_worker_module> StartServers 12 MinSpareThreads 25 MaxSpareThreads 96 ThreadLimit 96 ThreadsPerChild 25 MaxClients 225 MaxRequestsPerChild 0 </IfModule> The site is only getting ~10,000 page views during the 8am-9am hour, which I dont think should be stressing the server too badly. Maybe it is an error with the PHP settings, or bandwidth per unit time, or the site outgrew the server? Any suggestions would be very helpful - as you can see i've given it a good go before looking for help (installed mpm-worker). Also, can anyone suggest to me some free load testing software, or a tutorial on mod_status? Thank you

    Read the article

  • SSH login to Cisco switch using Rancid times out

    - by Lars
    I have a 3560 switch that I have configured to accept SSH logins, and this works fine. However I cannot get Rancid to complete the login process to any of my switches using SSH. I get a timeout error after a minute or so. Telnet logins work fine with the same username and password. Here is my rancid setup in .cloginrc: add user * {myuser} add password * {strongAccessPassword} {strongEnablePassword} add method * ssh telnet Then, when I run bin/clogin 10.10.1.10 I get: # bin/clogin 10.10.1.10 10.10.1.10 spawn ssh -c 3des -x -l myuser 10.10.1.10 ############################################### Please authenticate. ############################################### Password: Error: TIMEOUT reached Again, when I do this using telnet as my preferred mothod in .cloginrc, it works without issue.

    Read the article

  • In Nginx can I set Keep-Alive dynamically depending on ssl connection?

    - by ck_
    I would like to avoid having to repeat all the virtualhost server {} blocks in nginx just to have custom ssl settings that vary slightly from plain http requests. Most ssl directives can be placed right in the main block, except one hurdle I cannot find a workaround for: different keep-alive for https vs http Is there any way I can use $scheme to dynamically change the keepalive_timeout ? I've even considered that I can use more_set_input_headers -r 'Keep-Alive: timeout=60'; to conditionally replace the keep-alive timeout only if it already exists, but the problem is $scheme cannot be used in location ie. this is invalid location ^https {}

    Read the article

  • Xen DomU on DRBD device: barrier errors

    - by Halfgaar
    I'm testing setting up a Xen DomU with a DRBD storage for easy failover. Most of the time, immediatly after booting the DomU, I get an IO error: [ 3.153370] EXT3-fs (xvda2): using internal journal [ 3.277115] ip_tables: (C) 2000-2006 Netfilter Core Team [ 3.336014] nf_conntrack version 0.5.0 (3899 buckets, 15596 max) [ 3.515604] init: failsafe main process (397) killed by TERM signal [ 3.801589] blkfront: barrier: write xvda2 op failed [ 3.801597] blkfront: xvda2: barrier or flush: disabled [ 3.801611] end_request: I/O error, dev xvda2, sector 52171168 [ 3.801630] end_request: I/O error, dev xvda2, sector 52171168 [ 3.801642] Buffer I/O error on device xvda2, logical block 6521396 [ 3.801652] lost page write due to I/O error on xvda2 [ 3.801755] Aborting journal on device xvda2. [ 3.804415] EXT3-fs (xvda2): error: ext3_journal_start_sb: Detected aborted journal [ 3.804434] EXT3-fs (xvda2): error: remounting filesystem read-only [ 3.814754] journal commit I/O error [ 6.973831] init: udev-fallback-graphics main process (538) terminated with status 1 [ 6.992267] init: plymouth-splash main process (546) terminated with status 1 The manpage of drbdsetup says that LVM (which I use) doesn't support barriers (better known as tagged command queuing or native command queing), so I configured the drbd device not to use barriers. This can be seen in /proc/drbd (by "wo:f, meaning flush, the next method drbd chooses after barrier): 3: cs:Connected ro:Primary/Secondary ds:UpToDate/UpToDate C r---- ns:2160152 nr:520204 dw:2680344 dr:2678107 al:3549 bm:9183 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 And on the other host: 3: cs:Connected ro:Secondary/Primary ds:UpToDate/UpToDate C r---- ns:0 nr:2160152 dw:2160152 dr:0 al:0 bm:8052 lo:0 pe:0 ua:0 ap:0 ep:1 wo:f oos:0 I also enabled the option disable_sendpage, as per the drbd docs: cat /sys/module/drbd/parameters/disable_sendpage Y I also tried adding barriers=0 to fstab as mount option. Still it sometimes says: [ 58.603896] blkfront: barrier: write xvda2 op failed [ 58.603903] blkfront: xvda2: barrier or flush: disabled I don't even know if ext3 has a nobarrier option. And, because only one of my storage systems is battery backed, it would not be smart anyway. Why does it still compain about barriers when I disabled that? Both host are: Debian: 6.0.4 uname -a: Linux 2.6.32-5-xen-amd64 drbd: 8.3.7 Xen: 4.0.1 Guest: Ubuntu 12.04 LTS uname -a: Linux 3.2.0-24-generic pvops drbd resource: resource drbdvm { meta-disk internal; device /dev/drbd3; startup { # The timeout value when the last known state of the other side was available. 0 means infinite. wfc-timeout 0; # Timeout value when the last known state was disconnected. 0 means infinite. degr-wfc-timeout 180; } syncer { # This is recommended only for low-bandwidth lines, to only send those # blocks which really have changed. #csums-alg md5; # Set to about half your net speed rate 60M; # It seems that this option moved to the 'net' section in drbd 8.4. (later release than Debian has currently) verify-alg md5; } net { # The manpage says this is recommended only in pre-production (because of its performance), to determine # if your LAN card has a TCP checksum offloading bug. #data-integrity-alg md5; } disk { # Detach causes the device to work over-the-network-only after the # underlying disk fails. Detach is not default for historical reasons, but is # recommended by the docs. # However, the Debian defaults in drbd.conf suggest the machine will reboot in that event... on-io-error detach; # LVM doesn't support barriers, so disabling it. It will revert to flush. Check wo: in /proc/drbd. If you don't disable it, you get IO errors. no-disk-barrier; } on host1 { # universe is a VG disk /dev/universe/drbdvm-disk; address 10.0.0.1:7792; } on host2 { # universe is a VG disk /dev/universe/drbdvm-disk; address 10.0.0.2:7792; } } DomU cfg: bootloader = '/usr/lib/xen-default/bin/pygrub' vcpus = '2' memory = '512' # # Disk device(s). # root = '/dev/xvda2 ro' disk = [ 'phy:/dev/drbd3,xvda2,w', 'phy:/dev/universe/drbdvm-swap,xvda1,w', ] # # Hostname # name = 'drbdvm' # # Networking # # fake IP for posting vif = [ 'ip=1.2.3.4,mac=00:16:3E:22:A8:A7' ] # # Behaviour # on_poweroff = 'destroy' on_reboot = 'restart' on_crash = 'restart' In my test setup: the primary host's storage is 9650SE SATA-II RAID PCIe with battery. The secondary is software RAID1. Isn't DRBD+Xen widely used? With these problems, it's not going to work.

    Read the article

  • mysql connector/net ssl shutsdown the server

    - by Simon
    Hello, when I try to connect my server throw connector/net using ssl with pfx certificate I had problem with establishing the connection. I get connection timeout. And the server probably fall down (I dont know it for sure, becouse I dont manage the server). On the Windows XP works all right, but on Windows 7 dont. Please, where is problem? In Windows 7 or on the server (mysql 5.0)? Sometimes I get "Calling interface SSPI Failed" error, but not everytime. Sometimes is only connection timeout error. Thank you a lot for any help. Regards, simon

    Read the article

  • Debugging / troubleshooting .inf driver install in Windows

    - by Jesus Cuenca
    I'm trying to install a graphics driver in a laptop with Windows 7 but the setup fails with a timeout. I checked the setup log but it does not include any details about the .inf installation. It only shows the error code, which I searched only to find what I already knew (timeout error, no more details) I also checked the .inf file, but it has hundreds of lines so it's hard to guess which step is failing. So I'm wondering, is there some procedure to debug a .inf file? Like, having a detailed log of what happened after executing each line of the .inf Thanks!

    Read the article

  • VLAN trunking between Juniper EX -> Cisco Catalyst -> and Cisco Router

    - by Hugo Garcia
    I have the following scenario: EX2200 Switch whit ge-0/0/6 set as an access port on VLAN 80 ge-0/0/0 set as a trunk port connected to a catalyst switch and various vlans allowed to pass includin vlan 80 On the Catalyst Switch. port #3 set up as a trunk port that receives traffic from the EX switch. port 46 is set up also as a trunk port that connects to a cisco router. Port #48 is where the host used to be connected host - EX2200 - Catalyst - Router the problem is that this EX2200 is a new addition to the network and the host connected previosly to the catalyst switch. traffic is not getting from the host to the router, but the router can send ARP request to the host. following is the relevant configuration: Catalyst Switch: interface GigabitEthernet1/46 switchport trunk encapsulation dot1q switchport trunk allowed vlan 80,82,83,93,289 switchport mode trunk mtu 1532 media-type rj45 speed 1000 duplex full arp timeout 300 ! interface GigabitEthernet1/48 switchport access vlan 80 switchport mode access mtu 1532 media-type rj45 speed 100 duplex full arp timeout 300 no cdp enable ! EX2200 Switch:

    Read the article

  • IIS 7 AppPool logs an error after recycle due to inactivity

    - by ddysart
    We have Windows 2008 RS Server running IIS hosting an ASP.NET site. This morning there was a weird sequence. First a notice that the AppPool was being recycled due to inactivity: "A worker process with process id of '6896' serving application pool 'xxxx' was shutdown due to inactivity. Application Pool timeout configuration was set to 20 minutes. A new worker process will be started when needed." This makes sense and jibes with out timeout settings, but 30 seconds later we see: "A process serving application pool 'xxxx' terminated unexpectedly. The process id was '6896'. The process exit code was '0xc0000005'." I found an older KB article that explains a condition where this might happend on IIS6 due to permission issues, but am curious what might cause this on IIS7.5, especially since we are not seeing it regularly.

    Read the article

  • Coldfusion 8 Application Crashes Under Heavy Load

    - by KM01
    Hello, We have a CF8 app that runs for 20-25 minutes before crashing under heavy load ~ 1200 users. This load is generated by our load testing tool: 1200 users ramped up in 5 mins (approx behavior of our users), running for an hour. We have this app on Solaris 10, Apache 2, JRun 4 and Oracle 10g. Java version is 1.6. During the initial load tests, the thread dumps pointed to monitor deadlocks that pointed to sessions. "jrpp-173": waiting to lock monitor 0x019fdc60 (object 0x6b893530, a java.util.Hashtable), which is held by "scheduler-1" "scheduler-1": waiting to lock monitor 0x026c3ce0 (object 0x6abe2f20, a coldfusion.monitor.memory.SessionMemoryMonitor$TopMemoryUsedSessions), which is held by "jrpp-167" "jrpp-167": waiting to lock monitor 0x019fdc60 (object 0x6b893530, a java.util.Hashtable), which is held by "scheduler-1" We increased the number of sessions relative to the number of CPUs (48 simultaneous threads against 32 CPUs), and the deadlock went away. While varying the simultaneous threads helped a little bit in terms of response time, the CF server still tanked in 20-25 minutes during all of these tests. We ran more thread dumps, and saw a thread locking a monitor, for e.g.: "jrpp-475" prio=3 tid=0x02230800 nid=0x2c5 runnable [0x4397d000] java.lang.Thread.State: RUNNABLE at java.util.HashMap.getEntry(HashMap.java:347) at java.util.HashMap.containsKey(HashMap.java:335) at java.util.HashSet.contains(HashSet.java:184) at coldfusion.monitor.memory.MemoryTracker.onAddObject(MemoryTracker.java:124) at coldfusion.monitor.memory.MemoryTrackerProxy.onReplaceValue(MemoryTrackerProxy.java:598) at coldfusion.monitor.memory.MemoryTrackerProxy.onPut(MemoryTrackerProxy.java:510) at coldfusion.util.CaseInsensitiveMap.put(CaseInsensitiveMap.java:250) at coldfusion.util.FastHashtable.put(FastHashtable.java:43) - locked <0x6f7e1a78> (a coldfusion.runtime.Struct) at coldfusion.runtime.CfJspPage._arrayset(CfJspPage.java:1027) at coldfusion.runtime.CfJspPage._arraySetAt(CfJspPage.java:2117) at cfvalidation2ecfc1052964961$funcSETUSERAUDITDATA.runFunction(/app/docs/apply/cfcs/validation.cfc:377) As you see in the last line above there were several references CFMs and CFCs, and the lines have "cflock" tags, which were scoped to the "application." We (the dev team) then changed them to be scoped to a "name". After more load tests, there is no locking going on and there no deadlocks, but now the application tanks in 7-10 minutes. We've gotten system, network and DB reports from the respective admins, and they are not being taxed; even watched the server stats with server monitor, top, prstat, ran sar reports, etc. So we believe it is an issue with the CF server or maybe the JVM. I am running out of ideas as to what else we can try. Disclaimer: I am not a CF developer or Admin. I am just running the load test, analyzing the reports, threads etc, and sharing the results with the dev and admin teams, and trying the next change, and so on. So far no dice. Has anyone run into something similar? How did you go about diagnosing and troubleshooting? All thoughts and pointers welcome. Thank you for your time! KM

    Read the article

  • Why is my mdadm raid-1 recovery so slow?

    - by dimmer
    On a system I'm running Ubuntu 10.04. My raid-1 restore started out fast but quickly became ridiculously slow (at this rate the restore will take 150 days!): dimmer@paimon:~$ cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md0 : active raid1 sdc1[2] sdb1[1] 1953513408 blocks [2/1] [_U] [====>................] recovery = 24.4% (477497344/1953513408) finish=217368.0min speed=113K/sec unused devices: <none> Eventhough I have set the kernel variables to reasonably quick values: dimmer@paimon:~$ cat /proc/sys/dev/raid/speed_limit_min 1000000 dimmer@paimon:~$ cat /proc/sys/dev/raid/speed_limit_max 100000000 I am using 2 2.0TB Western Digital Hard Disks, WDC WD20EARS-00M and WDC WD20EARS-00J. I believe they have been partitioned such that their sectors are aligned. dimmer@paimon:/sys$ sudo parted /dev/sdb GNU Parted 2.2 Using /dev/sdb Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) p Model: ATA WDC WD20EARS-00M (scsi) Disk /dev/sdb: 2000GB Sector size (logical/physical): 512B/512B Partition Table: gpt Number Start End Size File system Name Flags 1 1049kB 2000GB 2000GB ext4 (parted) unit s (parted) p Number Start End Size File system Name Flags 1 2048s 3907028991s 3907026944s ext4 (parted) q dimmer@paimon:/sys$ sudo parted /dev/sdc GNU Parted 2.2 Using /dev/sdc Welcome to GNU Parted! Type 'help' to view a list of commands. (parted) p Model: ATA WDC WD20EARS-00J (scsi) Disk /dev/sdc: 2000GB Sector size (logical/physical): 512B/4096B Partition Table: gpt Number Start End Size File system Name Flags 1 1049kB 2000GB 2000GB ext4 I am beginning to think that I have a hardware problem, otherwise I can't imagine why the mdadm restore should be so slow. I have done a benchmark on /dev/sdc using Ubuntu's disk utility GUI app, and the results looked normal so I know that sdc has the capability to write faster than this. I also had the same problem on a similar WD drive that I RMAd because of bad sectors. I suppose it's possible they sent me a replacement with bad sectors too, although there are no SMART values showing them yet. Any ideas? Thanks. As requested, output of top sorted by cpu usage (notice there is ~0 cpu usage). iowait is also zero which seems strange: top - 11:35:13 up 2 days, 9:40, 3 users, load average: 2.87, 2.58, 2.30 Tasks: 142 total, 1 running, 141 sleeping, 0 stopped, 0 zombie Cpu(s): 0.0%us, 0.2%sy, 0.0%ni, 99.8%id, 0.0%wa, 0.0%hi, 0.0%si, 0.0%st Mem: 3096304k total, 1482164k used, 1614140k free, 617672k buffers Swap: 1526132k total, 0k used, 1526132k free, 535416k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 45 root 20 0 0 0 0 S 0 0.0 2:17.02 scsi_eh_0 1 root 20 0 2808 1752 1204 S 0 0.1 0:00.46 init 2 root 20 0 0 0 0 S 0 0.0 0:00.00 kthreadd 3 root RT 0 0 0 0 S 0 0.0 0:00.02 migration/0 4 root 20 0 0 0 0 S 0 0.0 0:00.17 ksoftirqd/0 5 root RT 0 0 0 0 S 0 0.0 0:00.00 watchdog/0 6 root RT 0 0 0 0 S 0 0.0 0:00.02 migration/1 ... dmesg errors, definitely looking like hardware: [202884.000157] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [202884.007015] ata5.00: failed command: FLUSH CACHE EXT [202884.013728] ata5.00: cmd ea/00:00:00:00:00/00:00:00:00:00/a0 tag 0 [202884.013730] res 40/00:00:ff:59:2e/00:00:35:00:00/e0 Emask 0x4 (timeout) [202884.033667] ata5.00: status: { DRDY } [202884.040329] ata5: hard resetting link [202889.400050] ata5: link is slow to respond, please be patient (ready=0) [202894.048087] ata5: COMRESET failed (errno=-16) [202894.054663] ata5: hard resetting link [202899.412049] ata5: link is slow to respond, please be patient (ready=0) [202904.060107] ata5: COMRESET failed (errno=-16) [202904.066646] ata5: hard resetting link [202905.840056] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [202905.849178] ata5.00: configured for UDMA/133 [202905.849188] ata5: EH complete [203899.000292] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [203899.007096] ata5.00: failed command: IDENTIFY DEVICE [203899.013841] ata5.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in [203899.013843] res 40/00:00:ff:f9:f6/00:00:38:00:00/e0 Emask 0x4 (timeout) [203899.041232] ata5.00: status: { DRDY } [203899.048133] ata5: hard resetting link [203899.816134] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [203899.826062] ata5.00: configured for UDMA/133 [203899.826079] ata5: EH complete [204375.000200] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [204375.007421] ata5.00: failed command: IDENTIFY DEVICE [204375.014799] ata5.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in [204375.014800] res 40/00:00:ff:0c:0f/00:00:39:00:00/e0 Emask 0x4 (timeout) [204375.044374] ata5.00: status: { DRDY } [204375.051842] ata5: hard resetting link [204380.408049] ata5: link is slow to respond, please be patient (ready=0) [204384.440076] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [204384.449938] ata5.00: configured for UDMA/133 [204384.449955] ata5: EH complete [204395.988135] ata5.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen [204395.988140] ata5.00: failed command: IDENTIFY DEVICE [204395.988147] ata5.00: cmd ec/00:01:00:00:00/00:00:00:00:00/00 tag 0 pio 512 in [204395.988149] res 40/00:00:ff:0c:0f/00:00:39:00:00/e0 Emask 0x4 (timeout) [204395.988151] ata5.00: status: { DRDY } [204395.988156] ata5: hard resetting link [204399.320075] ata5: SATA link up 3.0 Gbps (SStatus 123 SControl 300) [204399.330487] ata5.00: configured for UDMA/133 [204399.330503] ata5: EH complete

    Read the article

  • Auto Log-Off Windows users - Windows 2003 domain

    - by thehatter
    I am trying to make windows clients automatically log off after some time, I have been trying to use the winexit.scr which I have seen working else where in a similar environment. After working though these instructions (I did read the comments and notice the original ADM provided is buggy) I've had no joy what so ever! Winexit.scr refuses to read any settings in the registry, even while using a test account I can access the required reg key(s); edit, add, and remove values. Essentially winexit.scr always uses it's default values: 30 second timeout, no forced log-out. What I really want is a 30 minute timeout with a forced log-out, closing all the users apps etc. I've tried removing and re-adding the ADM template, creating the GPO from scratch several times, giving various registry permissions - including full control to "Everybody" just for fun! Oh, clients are all win XP SP3, DC is win 2003 R2 SP2. So, can anybody suggest something? Cheers!

    Read the article

  • eclipse template autoformat [closed]

    - by Matiaan
    I have a template (sleep) in eclipse that should expand to try { Thread.sleep(1000); } catch (Exception e) {} I want eclipse to add it as one line, just as I put it in the template. However, eclipse autoformats it after inserting to: try { Thread.sleep(1000); } catch (Exception e) { } Is there any way to disable formatting just for this one template (or all templates since all templates I use are one liners)?

    Read the article

  • Grub2 menu after VMware guest customization

    - by poopa
    Hi, I have ubuntu 9.10 desktop VMware VM with the default grub2 installed. There is some weird problem with this VM. When you clone this vm and have a customization script run, the cloned machine crashes at first boot (VMware does not officially suport customizaing Ubuntu newer than 8.04). After the creash the Grub boot menu is displayed but there is not time out. I checked /boot/grub/grub.cfg and it does indeed show a timeout of 10 seconds. Nothing happens till I select an option with the keyboard. The second time the Ubuntu loads, it does not crash. My question is, how do I make the grub menu timeout in that case? Thanks.

    Read the article

  • 504 Gateway Time-out after php fatal error

    - by tiagojsag
    I'm using nginx and php-fpm to develop a symfony2 based website, under ubuntu 12.10 (yes, I know I'm using a beta OS). Everything was working out fine until, due to an error on my code, I called an unexisting function, and got the following: Fatal error: Call to a member function (....) This isn't a problem (it's a bug in my code, easily fixable), but after this, no other page loads. My browser just keeps trying to load the page from the webserver, until nginx timeouts (after +- 30s, which should be some default timeout) and returns: 504 Gateway Time-out Restarting php-fpm solves the issue. Nginx logs show a timeout message, and nothing appears on php-fpm logs, even if I set them to debug level. I tried switching from fpm to fastcgi, and the same thing happens. I've looked around, but all similar error are related to big requests/file handling, which isn't the case. All the pages on my website load in a few seconds, even under development conditions (no caching, etc).

    Read the article

  • httpd service keep restarting. after 15-20 mins

    - by niraj
    I have recently purchased Dedicated Server which has 16bg ram and 1TB Harddisk. It has Cpanel and for firewall CSF Installd. I am mainly going to install it for File hosting service. Now the day i moved my httpd service keep restarting every 15-20 mins. It becomes unresponsive after that so have to manually restart it. My httpd settings are Start Servers = 5 Minimum Spare Servers = 5 Maximum Spare Servers = 10 Server Limit = 20000 Max Clients = 10000 Max Requests Per Child = 10000 Keep-Alive = On Keep-Alive Timeout = 5 Max Keep-Alive Requests = Unlimited Timeout 300 TOP is top - 14:53:41 up 1 day, 23:39, 2 users, load average: 0.10, 0.14, 0.09 Tasks: 1563 total, 1 running, 1562 sleeping, 0 stopped, 0 zombie Cpu(s): 0.7%us, 0.6%sy, 0.0%ni, 98.1%id, 0.2%wa, 0.0%hi, 0.5%si, 0.0%st Mem: 16303780k total, 16142048k used, 161732k free, 135264k buffers Swap: 8224760k total, 868k used, 8223892k free, 14136616k cached Please help me in this its keep happning.

    Read the article

  • Uninstalled Ubuntu, no GRLDR?

    - by user32965
    So I'm a big fat idiot. I installed Ubuntu 11.04 on my school's laptop, and here's come the time that I have to turn it back in. I wrote GRUB to the Master Boot Record, thinking it wasn't going to be permanent. So, fast forward to yesterday. I decided to hell with this, and popped in my Windows 7 CD, deleted the whole partition, formatted to NTFS, and installed Windows 7 on it. I'm surfing the web and my computer overheats [totally typical] I boot up, and get this: Try (hd0,0): FAT32: No GRLDR Try (hd0,1): invalid or null Try (hd0,2): invalid or null Try (hd0,3): invalid or null Try (hd1,0): NTFS5: No grldr Try (hd1,1): invalid or null Try (hd1,2): invalid or null Try (hd1,3): invalid or null Cannot find GRLDR. Press space bar to hold the screen, any other key to boot previous MBR... Timeout: 5 The timeout part just counts down to 0 from 5. I need to turn in this thing before tomorrow, please please please can someone help me out?

    Read the article

  • How to save a document and close it after the save

    - by Le Questione
    Currently I have a .bat file that opens a program, opens a document and then reloads this document so it contains new data. After that I need a timeout so it can run some other things within the document which takes about 2 minutes in total, but for safety I used 5 minutes. Then I want it to save the document and close the program. I've searched a lot for this but I can't find anything that saves the document and then closes it. Only can find how to save a bat file, but I already have that! CD /D "D:\01 - Config\02 - Developer" start qv.exe /l "D:\02 - Content\07 - Accesspoint\tracktrace.qvw" timeout /t 300 ???? exit 0 Please advise, thanks

    Read the article

  • Windows 2008 Server SP2 64bit - TCP Connections never releasing after TIME_WAIT

    - by Peco
    Hello fellow admins :) We have an issue with Windows 2008 Datacenter edition SP2 64bit. We have a process that is polling very frequently and establishing new TCP connections. The system gets in a state where we end up with over 16k connections in TIME_WAIT state. The default OS timeout is 120 seconds after which these connections should go away, but that never happens. These connections persist and never get cleaned up even after the originating process has long terminated (we are still at 16k connections two days after the process was killed). The OS is supposed to time them out but it doesn't. Has anyone else seen this behavior and if so what was done to resolve it. We are aware of how to tune the tcp stack to make the timeout shorter or allow more connections but this is not the issue here. Thanks!

    Read the article

  • Extension or settings in Chrome that syncs settings and extensions?

    - by Aequitarum Custos
    I use Chrome at home and the office, and I was wondering if there is an extension that will sync not just my Chrome settings and bookmarks, but also my extensions and their settings between computers I run Chrome on? http://www.google.com/support/forum/p/Chrome/thread?tid=469242bc0eb964d6&hl=en Is a thread asking for support on it, but nobody mentions if there is an extension created for Chrome doing this, or if it was ever implemented.

    Read the article

< Previous Page | 125 126 127 128 129 130 131 132 133 134 135 136  | Next Page >