Search Results

Search found 9816 results on 393 pages for 'blade servers'.

Page 377/393 | < Previous Page | 373 374 375 376 377 378 379 380 381 382 383 384 | Next Page >

apache webserver unresponsible with server-status showing all child processes waiting for connection

- by Jeff

My setup: i have 3 nearly identical webserver machines serving the same high loaded dynamic website with simple load balancing over dns. The service has been working for over two ears with the same apache config. apache2, php5, ubuntu 8.04 linux 2.6.24-29-server My problem: since about two weeks i'm experiencing problems with this config. Nearly every day i have one small moment about 5 minutes, in which the website is unreachable. I'm still able to login to the servers over ssh. If i run htop, i see the machine simply doing nothing. i have about 1000 apache processes running, but no cpu activity. i've used the apache mod_status to debug this situation. the process scoreboard looks like this: _C.___K_______________________R._______.__K_K____K___C_______.__ _______C__________.___________________________________.________C _.____K__________K___K_WK_____._K_____________________________._ W______K__________K________.____________________._______C_______ _C_.__K__K____.._.._____________________________________C_______ _R___________K___.______C________.C_________.______._____C______ ____________KKC____K_____K__WC_________________C_____.__.____.__ _____________________C_________K______.____C______._____________ _.___C____.___.___________________________.K______.____K________ W__.___________________C.__.____K________K_______R_._.__._______ __C__C_.__________C__C_______._____W______________C_.___C_______ ____.______C_____________C________.____C____________.________._K __.__________.K_____________K_________._____C____.K__________KW_ __K.W________R_________._______.___W___________.____.__K_____W__ W___.___..________W____K Scoreboard Key: "_" Waiting for Connection, "S" Starting up, "R" Reading Request, "W" Sending Reply, "K" Keepalive (read), "D" DNS Lookup, "C" Closing connection, "L" Logging, "G" Gracefully finishing, "I" Idle cleanup of worker, "." Open slot with no current process So the most of the processes are just waiting for connection. after about 5 minutes the situation will return to normal: i have lot least processes on every machine, the most workers have the "."-status (meaing they are open to process a request) and of course the website is reachable! so i'm trying to find something in the logs, but there is simply nothing... the apache access log is silent for about 4 minutes, the same is for the error log. i also can not figure out anything wrong in other system logs. the situation is the same on all 3 webservers (all of them have this load peak and unresposibility at the same time), so i do not thing this is hardware related. but i think, this might be related to some network (tcp) issue. any ideas? EDIT: some more information, that i have just discovered: it has just happened again. and i was able to verify that i'm also not able to connect locally when this problem occurs. i have made some connection statistics with the following command after it happend netstat -an|awk '/tcp/ {print $6}'|sort|uniq -c 109 CLOSE_WAIT 2652 ESTABLISHED 2 FIN_WAIT1 11 LAST_ACK 12 LISTEN 91 SYN_RECV 1 SYN_SENT 16 TIME_WAIT If i execute the same command some time later, i have something like this: 4 CLOSING 108 ESTABLISHED 18 FIN_WAIT1 182 FIN_WAIT2 37 LAST_ACK 12 LISTEN 50 SYN_RECV 11276 TIME_WAIT So in the normal situation i have only 100-200 open connections by clients beeing handled by apache in this moment. when i have this "crash", i have a lot more connections. what is the best way to analyse this? EDIT2: the important lines in apache2.conf are: KeepAlive On MaxKeepAliveRequests 20 KeepAliveTimeout 1 <IfModule mpm_prefork_module> ServerLimit 920 StartServers 30 MinSpareServers 80 MaxSpareServers 120 MaxClients 920 MaxRequestsPerChild 700 </IfModule> it is an apache2 prefork with php_mod. the server has 8GB ram and a 4gb swap partition.

Read the article
Error during GENERAL_REQUEST_ENTITY for POST results in ASP .NET session state never getting unlocked

- by Jesse

I have been trying to chase down the root cause of a condition where ASP .NET session state remains locked after a web request has been terminated due to an unexpected error. We use the SQL Server session state provider for session because we have several servers in a web farm. This issue first presented itself in the form of many requests getting stuck on the 'AcquireRequestState' event of their lifecycle for no apparent reason. I was able to finding corresponding entries for these requests in the session state database in SQL server that were all locked (column Locked = 1). I was also able to correlate these requests to entries in the IIS log with HTTP status codes of 500 (with a sub status of 0). These findings lead me to believe that, in some cases, a request was erroring out but was NOT releasing its lock on session state like it should. I enabled Failed Request Tracing in IIS for the website in question for status code 500 with all available providers selected each with the 'Verbose' setting for verbosity. I've since gathered several failed traces that have caused permanently locked ASP .NET sessions. They all share the same characteristics: They are all 'POST' requests where the browser is posting data to be processed/saved. They all have events indicating that the 'Session' module was invoked during the REQUEST_ACQUIRE_STATE event. At this point the request would have marked the row in the session state database as being "locked". This is normal and expected. They all have GENERAL_READ_ENTITY_START, GENERAL_READ_ENTITY_END, and GENERAL_REQUEST_ENTITY entries that appear to be reading in the data that was posted to the server as part of the request. This appears to be a buffered operation as these events get repeated over and over with each one reading in some subset of the posted data. At some point during the 'read entity' related events and error occurs. Some have the error code "Incorrect function. (0x80070001)" and others have "The I/O operation has been aborted because of either a thread exit or an application request. (0x800703e3)". Once the error has been encountered, they all jump directly to the END_REQUEST events. The issue here is that, under normal circumstances, there should be a RELEASE_REQUEST_STATE event that will allow the Session module to release the lock it has on the session. This event is being skipped in this scenario. Just to be sure, I enabled failed request tracing for the '200' status code as well and generated several traces of successful requests that do have the RELEASE_REQUEST_STATE event being handled by the Session module. My theory at this point is that some kind of network issue is causing the 'Incorrect function' and 'I/O operation has been aborted because of either a thread exit or an application request' errors, but I don't understand why this seems to be causing the request handling to skip over the RELEASE_REQUEST_STATE event. If the request went through REQUEST_ACQUIRE_STATE it seems like it should also hit RELEASE_REQUEST_STATE as well. I'm loathe to say that this is a bug in IIS or ASP .NET, but it certainly appears that way to me at this point. Are there any configuration changes I could make to help ensure that 'RELEASE_REQUEST_STATE' is fired under all error conditions?

Read the article
"Possible SYN flooding" in log despite low number of SYN_RECV connections

- by al4

Recently we had an apache server which was responding very slowly due to SYN flooding. The workaround for this was to enable tcp_syncookies (net.ipv4.tcp_syncookies=1 in /etc/sysctl.conf). I posted a question about this here if you want more background. After enabling syncookies we started seeing the following message in /var/log/messages approximately every 60 seconds: [84440.731929] possible SYN flooding on port 80. Sending cookies. Vinko Vrsalovic informed me that this means the syn backlog is getting full, so I raised tcp_max_syn_backlog to 4096. At some point I also lowered tcp_synack_retries to 3 (down from the default of 5) by issuing sysctl -w net.ipv4.tcp_synack_retries=3. After doing this, the frequency seemed to drop, with the interval of the messages varying between roughly 60 and 180 seconds. Next I issued sysctl -w net.ipv4.tcp_max_syn_backlog=65536, but am still getting the message in the log. Throughout all this I've been watching the number of connections in SYN_RECV state (by running watch --interval=5 'netstat -tuna |grep "SYN_RECV"|wc -l'), and it never goes higher than about 240, much much lower than the size of the backlog. Yet I have a Red Hat server which hovers around 512 (limit on this server is the default of 1024). Are there any other tcp settings which would limit the size of the backlog or am I barking up the wrong tree? Should the number of SYN_RECV connections in netstat -tuna correlate to the size of the backlog? Update As best I can tell I'm dealing with legitimate connections here, netstat -tuna|wc -l hovers around 5000. I've been researching this today and found this post from a last.fm employee, which has been rather useful. I've also discovered that the tcp_max_syn_backlog has no effect when syncookies are enabled (as per this link) So as a next step I set the following in sysctl.conf: net.ipv4.tcp_syn_retries = 3 # default=5 net.ipv4.tcp_synack_retries = 3 # default=5 net.ipv4.tcp_max_syn_backlog = 65536 # default=1024 net.core.wmem_max = 8388608 # default=124928 net.core.rmem_max = 8388608 # default=131071 net.core.somaxconn = 512 # default = 128 net.core.optmem_max = 81920 # default = 20480 I then setup my response time test, ran sysctl -p and disabled syncookies by sysctl -w net.ipv4.tcp_syncookies=0. After doing this the number of connections in the SYN_RECV state still remained around 220-250, but connections were starting to delay again. Once I noticed these delays I re-enabled syncookies and the delays stopped. I believe what I was seeing was still an improvement from the initial state, however some requests were still delayed which is much worse than having syncookies enabled. So it looks like I'm stuck with them enabled until we can get some more servers online to cope with the load. Even then, I'm not sure I see a valid reason to disable them again as they're only sent (apparently) when the server's buffers get full. But the syn backlog doesn't appear to be full with only ~250 connections in the SYN_RECV state! Is it possible that the SYN flooding message is a red herring and it's something other than the syn_backlog that's filling up? If anyone has any other tuning options I haven't tried yet I'd be more than happy to try them out, but I'm starting to wonder if the syn_backlog setting isn't being applied properly for some reason.

Read the article
Problem with Domain delegation...

- by Lockhead

Okey I have the subdomain news.247dist.com, if i dig any this domain i get: ; <<>> DiG 9.4.3-P3 <<>> news.247dist.com any ;; global options: printcmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 36179 ;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 2 ;; QUESTION SECTION: ;news.247dist.com. IN ANY ;; ANSWER SECTION: news.247dist.com. 259018 IN NS b.ns.broadmail.de. news.247dist.com. 259018 IN NS a.ns.broadmail.de. news.247dist.com. 2382 IN SOA a.ns.broadmail.de. hostmaster.news.247dist.com. 1274182332 16384 2048 1048576 2560 ;; ADDITIONAL SECTION: a.ns.broadmail.de. 718 IN A 193.169.180.254 b.ns.broadmail.de. 718 IN A 193.169.181.254 ;; Query time: 0 msec ;; SERVER: 80.67.16.6#53(80.67.16.6) ;; WHEN: Wed May 19 17:21:16 2010 ;; MSG SIZE rcvd: 160 The Problem is, if I dig any this subdomain and ask one of these NS Servers in the above dig i get: ; <<>> DiG 9.4.3-P3 <<>> any @a.ns.broadmail.de news.247dist.com ; (1 server found) ;; global options: printcmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 3887 ;; flags: qr aa rd; QUERY: 1, ANSWER: 6, AUTHORITY: 0, ADDITIONAL: 3 ;; WARNING: recursion requested but not available ;; QUESTION SECTION: ;news.247dist.com. IN ANY ;; ANSWER SECTION: news.247dist.com. 2560 IN SOA a.ns.broadmail.de. hostmaster.news.247dist.com. 1274182332 16384 2048 1048576 2560 news.247dist.com. 900 IN NS a.ns.broadmail.de. news.247dist.com. 900 IN NS b.ns.broadmail.de. news.247dist.com. 900 IN MX 0 mail.srv2.de. news.247dist.com. 900 IN TXT "v=spf1 ip4:213.61.69.122/32 ip4:193.169.180.0/23 -all" news.247dist.com. 900 IN A 193.169.180.252 ;; ADDITIONAL SECTION: a.ns.broadmail.de. 900 IN A 193.169.180.254 b.ns.broadmail.de. 900 IN A 193.169.181.254 mail.srv2.de. 900 IN A 193.169.180.201 ;; Query time: 23 msec ;; SERVER: 193.169.180.254#53(193.169.180.254) ;; WHEN: Wed May 19 17:26:33 2010 ;; MSG SIZE rcvd: 284 So why I don't get the second result if i simple dig any news.247dist.com?

Read the article
Bypass DNSSEC for local Stub zones

- by Starsky

I am using bind 9.9.2 as a DNSSEC validating recursive resolver in an Internet DMZ. I want to point to my internal DNS servers as stub zones (ideally) or anything except slave zones (to avoid very large zone transfers). We use a routable ip space for our Internal addressing. Sorry if I am using an IP space that you own in my example, but 167.x.x.x is the first zone I found that fits my issue. E.G dnssec-enable yes; dnssec-validation yes; dnssec-accept-expired no; zone "16.172.in-addr.arpa" { type stub; masters { 167.255.1.53; } } zone "myzone.com" in { type stub; masters { 167.255.1.53; } } When queries hit the DNS server, they attempt at being validated, and fail because 167.in-addr.arpa HAS an RRSIG record, but sub zones do not (and should not!). Google dns is used in this example, but in reality it would be my recursive resolver. @8.8.8.8 -x 167.255.1.53 +dnssec ; (1 server found) ;; global options: +cmd ;; Got answer: ;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 17488 ;; flags: qr rd ra ad; QUERY: 1, ANSWER: 0, AUTHORITY: 6, ADDITIONAL: 1 ;; OPT PSEUDOSECTION: ; EDNS: version: 0, flags: do; udp: 512 ;; QUESTION SECTION: ;53.1.255.167.in-addr.arpa. IN PTR ;; AUTHORITY SECTION: 167.in-addr.arpa. 1800 IN SOA z.arin.net. dns-ops.arin.net. 2013100713 1800 900 691200 10800 167.in-addr.arpa. 1800 IN RRSIG SOA 5 3 86400 20131017160124 20131007160124 812 167.in-addr.arpa. Lcl8sCps7LapnAj4n403KXx7A3GO7+2z/9Q2R2mwkh9FL26iDx7GlU4+ NufGd92IEJCdBu9IgcZP4I9QcKi8DI28og27WrfKd5moSl/STj02GliS qPTfNiewmTTIDw5++IlhITbp+CoJuZCRCdDbyWKmd5NSLcbskAwbCVlO vVA= 167.in-addr.arpa. 10800 IN NSEC 1.167.in-addr.arpa. NS SOA TXT RRSIG NSEC DNSKEY 167.in-addr.arpa. 10800 IN RRSIG NSEC 5 3 10800 20131017160124 20131007160124 812 167.in-addr.arpa. XALsd59i+XGvCIzjhTUFXcr11/M8prcaaPQ5yFSbvP9TzqjJ3wpizvH6 202MdrIWbsT1Dndri49lHKAXgBQ5OOsUmOh+eoRYR5okxRO4VLc5Tkze Gh0fQLcwGXPuv9A4SFNIrNyi3XU4Qvq0cViKXIuEGTa3C+zMPuvc0her oKk= 254.167.in-addr.arpa. 10800 IN NSEC 26.167.in-addr.arpa. NS RRSIG NSEC 254.167.in-addr.arpa. 10800 IN RRSIG NSEC 5 4 10800 20131017160124 20131007160124 812 167.in-addr.arpa. xnsLBTnPhdyABdvqtEHPxa6Y6NASfYAWfW1yYlNliTyV8TFeNOqewjwj nY43CWD77ftFDDQTLFEOPpV5vwmnUGYTRztK+kB5UrlflhPgiqYiBaBD RQaFQ8DIKaof8/snusZjK7aNmfe09t9gRcaX/pXn3liKz7m/ggxZi0f9 xo0= ;; Query time: 31 msec ;; SERVER: 8.8.8.8#53(8.8.8.8) ;; WHEN: Mon Oct 7 16:52:59 2013 ;; MSG SIZE rcvd: 722 Is there a way to bypass DNSSEC validation for specific zones? Any zone that I host internally, I do not want DNSSEC validation performed on. I have only see this interfere w/ certain reverse zones where the top level has DS/RRSIG records. Thanks.

Read the article
SNTP, why do you mock me?!

- by Matthew

--- SOLVED SEE EDIT 5 --- My w2k3 pdc is configured as an authoritative time server. Other servers on the domain are able to sync with it if I manually specify it in the peer list. By if I try to sync from flags 'domhier', it wont resync; I get the error message The computer did not resync because no time data was available. I can only think that it is not querying the pdc. I also tried setting the registry as shown here (http://support.microsoft.com/kb/193825). But no luck (I have not restarted the server, I am hoping I wont have to since it is the pdc) If you would like any further information on my config, please let me know. Edit 1: I have set the w32time service config AnnouceFlags to 0x05 as documented here www.krr.org/microsoft/authoritative_time_servers.php and a number of other places. The PDC syncs to an external time source (ntp). I can get the stripchart on the client from the pdc no problems. The loginserver for the host I am trying to configure is shown as the pdc. Edit 2: The packet capture has revealed something interesting. The client is contacting the correct server, and getting a valid response but I still get the same error message. Here is the NTP excerpt from the client to the server Flags: 11.. .... = Leap Indicator: alarm condition (clock not synchronized) (3) ..01 1... = Version number: NTP Version 3 (3) .... .011 = Mode: client (3) Peer Clock Stratum: unspecified or unavailable (0) Peer Polling Interval: 10 (1024 sec) Peer Clock Precision: 0.015625 sec Root Delay: 0.0000 sec Root Dispersion: 1.0156 sec Reference Clock ID: NULL Reference Clock Update Time: Sep 1, 2010 05:29:39.8170 UTC Originate Time Stamp: NULL Receive Time Stamp: NULL Transmit Time Stamp: Nov 8, 2010 01:44:44.1450 UTC Key ID: DC080000 Here is the reply NTP excerpt from the server to the client Flags: 0x1c 00.. .... = Leap Indicator: no warning (0) ..01 1... = Version number: NTP Version 3 (3) .... .100 = Mode: server (4) Peer Clock Stratum: secondary reference (3) Peer Polling Interval: 10 (1024 sec) Peer Clock Precision: 0.00001 sec Root Delay: 0.1484 sec Root Dispersion: 0.1060 sec Reference Clock ID: 192.189.54.17 Reference Clock Update Time: Nov 8,2010 01:18:04.6223 UTC Originate Time Stamp: Nov 8, 2010 01:44:44.1450 UTC Receive Time Stamp: Nov 8, 2010 01:46:44.1975 UTC Transmit Time Stamp: Nov 8, 2010 01:46:44.1975 UTC Key ID: 00000000 Edit 3: dumpreg for paramters on pdc Value Name Value Type Value Data ------------------------------------------------------------------------ ServiceMain REG_SZ SvchostEntry_W32Time ServiceDll REG_EXPAND_SZ C:\WINDOWS\system32\w32time.dll NtpServer REG_SZ bhvmmgt01.domain.com,0x1 Type REG_SZ AllSync and config Value Name Value Type Value Data -------------------------------------------------------------------------- LastClockRate REG_DWORD 156249 MinClockRate REG_DWORD 155860 MaxClockRate REG_DWORD 156640 FrequencyCorrectRate REG_DWORD 4 PollAdjustFactor REG_DWORD 5 LargePhaseOffset REG_DWORD 50000000 SpikeWatchPeriod REG_DWORD 900 HoldPeriod REG_DWORD 5 LocalClockDispersion REG_DWORD 10 EventLogFlags REG_DWORD 2 PhaseCorrectRate REG_DWORD 7 MinPollInterval REG_DWORD 6 MaxPollInterval REG_DWORD 10 UpdateInterval REG_DWORD 100 MaxNegPhaseCorrection REG_DWORD -1 MaxPosPhaseCorrection REG_DWORD -1 AnnounceFlags REG_DWORD 5 MaxAllowedPhaseOffset REG_DWORD 300 FileLogSize REG_DWORD 10000000 FileLogName REG_SZ C:\Windows\Temp\w32time.log FileLogEntries REG_SZ 0-300 Edit 4: Here are some notables from the ntp log file on the pdc. ReadConfig: failed. Use default one 'TimeJumpAuditOffset'=0x00007080 DomainHierachy: we are now the domain root. ClockDispln: we're a reliable time service with no time source: LS: 0, TN: 864000000000, WAIT: 86400000 Edit 5: F&^%ING SOLVED! Ok so I was reading about people with similar problems, some mentioned w32time server settings applied by GPO, but I tested this early on and there were no settings applied to this service by gpo. Others said that the reporting software may not be picking up some old gpo settings applied. So I searched the registry for all w32time instaces. I came across an interesting key that indicated there may be some other ntp software running on the server. Sure enough, I look through the installed software list and there the little F*&%ER is. Uninstalled and now working like a dream. FFFFFFFUUUUUUUUUUUU

Read the article
Putting indexes in separate filegroup kills our queries

- by womp

Can anyone shed some light on this? On our dev boxes, our database resides entirely in the PRIMARY filegroup, and everything works fine. On one of our production servers, recently upgraded from 2005 to 2008, we noticed it was performing slower than it should. On this machine, there are two filegroups - PRIMARY and INDEXES. Both filegroups contain 1 file per logical volume, one logical volume per CPU, (and each logical volume is a RAID 10 of 4 physical disks). We isolated a few queries that were performing fast on the dev boxes and slow (up to 40x slower) on the production machine. Turned out these queries were using the non-clustered indexes that resided in the INDEXES filegroup. Tweaking some of the queries to only use clustered indexes that were in the PRIMARY filegroup dropped their times back to normal. As a final confirmation, we redeployed the same database on the same machine to have everything in PRIMARY, and things went back to normal! Here's the statistics output of one of the queries, run identically on the machine with different filegroup configurations (table names changed to protect the innocent): FAST (everything in PRIMARY filegroup): (3 row(s) affected) Table '0'. Scan count 2, logical reads 14, ... Table '1'. Scan count 0, logical reads 0, ... Table '1'. Scan count 0, logical reads 0, ... Table '2'. Scan count 2, logical reads 7, ... Table '3'. Scan count 2, logical reads 1012, ... Table '4'. Scan count 1, logical reads 3, ... SQL Server Execution Times: CPU time = 437 ms, elapsed time = 445 ms. SLOW (indexes split into their own filegroup): (3 row(s) affected) Table '0'. Scan count 209, logical reads 428, ... Table '1'. Scan count 0, logical reads 0,... Table '2'. Scan count 1021, logical reads 9043,.... Table '3'. Scan count 209, logical reads 105754, .... Table '4'. Scan count 0, logical reads 0, .... Table '5'. Scan count 1, logical reads 695, ... **Table '#46DA8CA9'. Scan count 205, logical reads 205, ...** Table '6'. Scan count 6, logical reads 436, ... Table '7'. Scan count 1, logical reads 12,.... SQL Server Execution Times: CPU time = 17581 ms, elapsed time = 17595 ms. Notice the weird temp table and extra tables involved in the slow query. It seems clear that having a second file group is making SQL Server batty with choosing an execution plan. What the heck is going on?

Read the article
Dynamic DNS registration for VPN clients

- by Eric Falsken

I've got a VPN server set up in my Active Directory on a remote network. (VPN Server is separate box from DNS/AD) When I dial into the network (client machine is not a member of the AD) the machine does not register its IP or Hostname in the DNS. I've played with all possible combinations of DHCP and RRAS-allocated IP pools, and none of them seem to cause my client to register. Is it because my client has to be a member of the domain? Are there some security settins I can tweak so that it can register its hostname/ip? I've looked in the event logs (System and Security) for the AD, DNS, DHCP, RRAS, and the client machine, and don't see anything relating to DNS Registration. Here's the IPConfig on the client machine (once connected): PPP adapter My VPN Name: Connection-specific DNS Suffix . : mydomain.local Description . . . . . . . . . . . : My VPN Name Physical Address. . . . . . . . . : DHCP Enabled. . . . . . . . . . . : No Autoconfiguration Enabled . . . . : Yes IPv4 Address. . . . . . . . . . . : 192.168.1.22(Preferred) Subnet Mask . . . . . . . . . . . : 255.255.255.255 Default Gateway . . . . . . . . . : DNS Servers . . . . . . . . . . . : 192.168.1.52 <- DC1 192.168.1.53 <- DC2 NetBIOS over Tcpip. . . . . . . . : Enabled

Read the article
Hardware recommendations / parts list for a modern, quiet ZFS NAS box - 2011-Feb edition [closed]

- by dandv

I want to build some really reliable storage for my data, and it seems that ZFS is the only filesystem at the moment that does live checksumming. That rules out DroboPro, so I'm looking to building a quiet ZFS NAS that would start with 4 2TB or larger hard drives. I'd like this system to be very reliable and relatively future-proof for 2-3 years, so I'm willing to invest some $$$ and buy higher end components. I did see questions here and on other forums about low-cost servers, but I'm not looking for those. I'd be super happy to go for an off-the-shelf solution, but I haven't found one that's quiet. I started doing the research (summarized on my wiki), but I realized that it just gets too complicated for what I know as a software dude, and I'm entering the analysis paralysis area. At this point, I'm basically looking for a parts list for a configuration that will work (and is modern), and I know there are folks around here who are way more competent than me. I've built computers and am comfortable assembling one and messing with *nix; I can follow guides; I just want to end the decision process for the hardware and software configuration. What I've researched so far (not that these are very modern components): Case: I think I've settled on the Antec Twelve Hundred case because it cools well, is quiet, and simply has 12 bays that allow elastic mounting. The SilverStone Raven is its counter-candidate, but I find its construction quite odd. For the PSU, I'm torn between Antec CP-850 and Nexus RX-8500, but I did this research more than a year ago. The Nexus has a very uniform power profile, and I'd rather not have the Antec spin up and down based on load. On the other hand, I'm not sure how often my file server will draw more than 400W under use. For the hard drives, I've read that WD Black drives are actually WD RE3 with a software setting changed. I'd also like to buy different drive types, not just 4 WDs. Recommendations? Right now I have a 2TB Hitachi Deskstar 7K300. For the motherboard, CPU and RAM I have no idea, other than the RAM must be ECC. I already asked a question here about ECC RAM, but I was misguided and was looking for a motherboard that would support USB 3.0 as well. I've learned to go with eSATA, or worry about USB later. Then there's the (liquid) cooling, Wi-Fi card, and FreeBSD vs. OpenSolaris Express. Lastly, I'm wondering if I can make this PC into a media server by adding a Blu-ray drive and a good sound card. But support for Blu-Ray is spotty on Linux, and I don't know if Windows 7 on VirtualBox would get sufficient hardware access to output HDMI or SPDIFF signals. (Running OpenSolaris virtualized is not an option because of the reliability risk.) Then there are HDCP concerns. Suggestions on that would be appreciated as well, but I don't want us to get sidetracked. A specific shopping list on the core components would be great, so I can start ordering, and in the meantime educate myself with regards to the other issues. Finally, I think this could become a great FAQ for those technically inclined to build their own ZFS server, but confused by the dizzying array of options out there, and I promise to compile the results and share my experience building and benchmarking the server.

Read the article
An error occured synchronizing windows with time.windows.com

- by Killrawr

Okay so I've tried stopping/registering the win32tm service on this Windows Server 2008 Enterprise Computer. C:\Users\Administrator>net stop w32time The Windows Time service is stopping. The Windows Time service was stopped successfully. C:\Users\Administrator>w32tm /unregister The following error occurred: Access is denied. (0x80070005) C:\Users\Administrator>w32tm /unregister W32Time successfully unregistered. C:\Users\Administrator>w32tm /register W32Time successfully registered. C:\Users\Administrator>net start w32time The Windows Time service is starting. The Windows Time service was started successfully. (Source : http://social.technet.microsoft.com/Forums/en-US/winserverDS/thread/9bdfc2cc-4775-4435-8868-57d214e1e3ba/) And I get this error from the Date and Time, Internet Time tab (After also following the steps here). I've even tried the Atomic Time Clock Worldtimeserver and I get the error The following error occurred: The specified module could not be found. (0x8007007E). I've also disabled the Windows Firewall, that might of been blocking the synchronization. I've done a file scan with sfc /scannow that came back with no errors. C:\Users\Administrator>sfc /scannow Beginning system scan. This process will take some time. Beginning verification phase of system scan. Verification 100% complete. Windows Resource Protection did not find any integrity violations. C:\Users\Administrator> But I'm not having much luck. Is there anyway lo possibly solve this? or is the time.windows.com servers unsupported? because the software is from 2008? (I really don't know :/), My ping result to time.windows.com C:\Users\Administrator>ping time.windows.com Pinging time.microsoft.akadns.net [65.55.21.22] with 32 bytes of data: Request timed out. Request timed out. Request timed out. Request timed out. Ping statistics for 65.55.21.22: Packets: Sent = 4, Received = 0, Lost = 4 (100% loss), And tracert result C:\Users\Administratortracert time.windows.com Tracing route to time.microsoft.akadns.net [65.55.21.24] over a maximum of 30 hops: 1 1 ms <1 ms <1 ms 192.168.1.1 2 32 ms 31 ms 32 ms be2-100.bras1wtc.wlg.vf.net.nz [203.109.129.113] 3 31 ms 32 ms 31 ms be5-100.ppnzwtc01.wlg.vf.net.nz.129.109.203.in-a ddr.arpa [203.109.129.114] 4 31 ms 31 ms 31 ms gi0-2-0-3.ppnzwtc01.wlg.vf.net.nz.180.109.203.in -addr.arpa [203.109.180.210] 5 31 ms 31 ms 30 ms gi0-2-0-3.ppnzwtc02.wlg.vf.net.nz [203.109.180.2 09] 6 167 ms 166 ms 166 ms ip-141.199.31.114.VOCUS.net.au [114.31.199.141] 7 175 ms 175 ms 175 ms microsoft.com.any2ix.coresite.com [206.223.143.1 43] 8 177 ms 180 ms 176 ms xe-7-0-2-0.by2-96c-1a.ntwk.msn.net [207.46.42.17 6] 9 205 ms 205 ms 204 ms xe-10-0-2-0.co1-96c-1b.ntwk.msn.net [207.46.45.3 1] 10 * * * Request timed out. 11 * * * Request timed out. 12 * * * Request timed out. 13 * * * Request timed out. 14 * * * Request timed out. 15 * * * Request timed out. 16 ^C And nslookup C:\Users\Administrator>nslookup time.windows.com Server: UnKnown Address: 192.168.1.1 Non-authoritative answer: Name: time.microsoft.akadns.net Address: 65.55.21.22 Aliases: time.windows.com

Read the article
DRBD not syncing between my nodes when IP is reset

- by ramdaz

I am trying to setup DRBD by following the article at http://www.howtoforge.com/setting-up-network-raid1-with-drbd-on-ubuntu-11.10-p2 I am using Ubuntu 10.04 DRBD - 8.3.11 In the first run I had everything working perfectly and when shifting the systems to a production environment I decided to restart the Meta Data creation part and start from scratch. The IPs had changed entirely in the production environment. Issuing drdbadm create-md r0 in both the servers runs successfully. But when I do "drbdadm -- --overwrite-data-of-peer primary all" on the primary it fails to start the re sync. My config file is as given below resource r0 { protocol C; syncer { rate 50M; } startup { wfc-timeout 15; degr-wfc-timeout 60; } net { cram-hmac-alg sha1; shared-secret "aklsadkjlhdbskjndsf8738734jkfkjfkjf"; } on primaryds { device /dev/drbd0; disk /dev/md2; address 172.16.7.1:7788; meta-disk internal; } on secondaryds { device /dev/drbd0; disk /dev/md2; address 172.16.7.3:7788; meta-disk internal; } } Status on primary root at primaryds:~# cat /proc/drbd version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at primaryds, 2012-05-12 15:08:01 0: cs:WFBitMapS ro:Primary/Secondary ds:UpToDate/Inconsistent C r---- ns:0 nr:0 dw:0 dr:200 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5690352828 Status on secondary root at secondaryds:/etc/drbd.d# cat /proc/drbd version: 8.3.7 (api:88/proto:86-91) GIT-hash: ea9e28dbff98e331a62bcbcc63a6135808fe2917 build by root at secondaryds, 2012-05-12 15:25:25 0: cs:WFBitMapT ro:Secondary/Primary ds:Inconsistent/UpToDate C r---- ns:0 nr:0 dw:0 dr:0 al:0 bm:0 lo:0 pe:0 ua:0 ap:0 ep:1 wo:b oos:5690352828 Log of Primary May 30 13:42:23 primaryds kernel: [ 1584.057076] block drbd0: role( Secondary -> Primary ) disk( Inconsistent -> UpToDate ) May 30 13:42:23 primaryds kernel: [ 1584.086264] block drbd0: Forced to consider local data as UpToDate! May 30 13:42:23 primaryds kernel: [ 1584.086303] block drbd0: Creating new current UUID May 30 13:42:26 primaryds kernel: [ 1586.405551] block drbd0: drbd_sync_handshake: May 30 13:42:26 primaryds kernel: [ 1586.405564] block drbd0: self E8A075F378173D4B:0000000000000004:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:26 primaryds kernel: [ 1586.405574] block drbd0: peer 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:26 primaryds kernel: [ 1586.405582] block drbd0: uuid_compare()=2 by rule 30 May 30 13:42:26 primaryds kernel: [ 1586.405587] block drbd0: Becoming sync source due to disk states. May 30 13:42:26 primaryds kernel: [ 1586.405592] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake. May 30 13:42:27 primaryds kernel: [ 1588.171638] block drbd0: 5427 GB (1422588207 bits) marked out-of-sync by on disk bit-map. May 30 13:42:27 primaryds kernel: [ 1588.172769] block drbd0: conn( Connected -> WFBitMapS ) Log in Secondary May 30 13:42:24 secondaryds kernel: [ 1563.304894] block drbd0: peer( Secondary - Primary ) pdsk( Inconsistent - UpToDate ) May 30 13:42:24 secondaryds kernel: [ 1563.339674] block drbd0: drbd_sync_handshake: May 30 13:42:24 secondaryds kernel: [ 1563.339685] block drbd0: self 0000000000000004:0000000000000000:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:24 secondaryds kernel: [ 1563.339695] block drbd0: peer E8A075F378173D4B:0000000000000004:0000000000000000:0000000000000000 bits:1422588207 flags:0 May 30 13:42:24 secondaryds kernel: [ 1563.339703] block drbd0: uuid_compare()=-2 by rule 20 May 30 13:42:24 secondaryds kernel: [ 1563.339709] block drbd0: Becoming sync target due to disk states. May 30 13:42:24 secondaryds kernel: [ 1563.339714] block drbd0: Writing the whole bitmap, full sync required after drbd_sync_handshake. May 30 13:42:26 secondaryds kernel: [ 1565.652342] block drbd0: 5427 GB (1422588207 bits) marked out-of-sync by on disk bit-map. May 30 13:42:26 secondaryds kernel: [ 1565.652965] block drbd0: conn( Connected - WFBitMapT ) The serves are not responding once it reaches this stage. Tried redoing it couple of time but noting happens. Why could the resync not be taking place? I would like some advice? Directions?

Read the article
CPU Utilization LAMP stack

- by Max

We've got an ec2 m2.4xlarge running Magento (centos 5.6, httpd 2.2, php 5.2.17 with eaccelerator 0.9.5.3, mysql 5.1.52). Right now we're getting a large traffic spike, and our top looks like this: top - 09:41:29 up 31 days, 1:12, 1 user, load average: 120.01, 129.03, 113.23 Tasks: 1190 total, 18 running, 1172 sleeping, 0 stopped, 0 zombie Cpu(s): 97.3%us, 1.8%sy, 0.0%ni, 0.5%id, 0.0%wa, 0.0%hi, 0.0%si, 0.4%st Mem: 71687720k total, 36898928k used, 34788792k free, 49692k buffers Swap: 880737784k total, 0k used, 880737784k free, 1586524k cached PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND 2433 mysql 15 0 23.6g 4.5g 7112 S 564.7 6.6 33607:34 mysqld 24046 apache 16 0 411m 65m 28m S 26.4 0.1 0:09.05 httpd 24360 apache 15 0 410m 60m 25m S 26.4 0.1 0:03.65 httpd 24993 apache 16 0 410m 57m 21m S 26.1 0.1 0:01.41 httpd 24838 apache 16 0 428m 74m 20m S 24.8 0.1 0:02.37 httpd 24359 apache 16 0 411m 62m 26m R 22.3 0.1 0:08.12 httpd 23850 apache 15 0 411m 64m 27m S 16.8 0.1 0:14.54 httpd 25229 apache 16 0 404m 46m 17m R 10.2 0.1 0:00.71 httpd 14594 apache 15 0 404m 63m 34m S 8.4 0.1 1:10.26 httpd 24955 apache 16 0 404m 50m 21m R 8.4 0.1 0:01.66 httpd 24313 apache 16 0 399m 46m 22m R 8.1 0.1 0:02.30 httpd 25119 apache 16 0 411m 59m 23m S 6.8 0.1 0:01.45 httpd Questions: Would giving msyqld more memory help it cache queries and react faster? If so, how? Other than splitting mysql and php to separate servers (which we're about to do) is there anything else we could/should be doing? Thanks! UPDATE: Here's our my.cnf along with the output of mysqltuner. It looks like a cache problem. Thanks again! # cat /etc/my.cnf [client] port = **** socket = /var/lib/mysql/mysql.sock [mysqld] datadir=/mnt/persistent/mysql port=**** socket=/var/lib/mysql/mysql.sock key_buffer = 512M max_allowed_packet = 64M table_cache = 1024 sort_buffer_size = 8M read_buffer_size = 4M read_rnd_buffer_size = 2M myisam_sort_buffer_size = 64M thread_cache_size = 128M tmp_table_size = 128M join_buffer_size = 1M query_cache_limit = 2M query_cache_size= 64M query_cache_type = 1 max_connections = 1000 thread_stack = 128K thread_concurrency = 48 log-bin=mysql-bin server-id = 1 wait_timeout = 300 innodb_data_home_dir = /mnt/persistent/mysql/ innodb_data_file_path = ibdata1:10M:autoextend innodb_buffer_pool_size = 20G innodb_additional_mem_pool_size = 20M innodb_log_file_size = 64M innodb_log_buffer_size = 8M innodb_flush_log_at_trx_commit = 1 innodb_lock_wait_timeout = 50 innodb_thread_concurrency = 48 ft_min_word_len=3 [myisamchk] ft_min_word_len=3 key_buffer = 128M sort_buffer_size = 128M read_buffer = 2M write_buffer = 2M # ./mysqltuner.pl >> MySQLTuner 1.2.0 - Major Hayden <[email protected]> >> Bug reports, feature requests, and downloads at http://mysqltuner.com/ >> Run with '--help' for additional options and output filtering -------- General Statistics -------------------------------------------------- [--] Skipped version check for MySQLTuner script [OK] Currently running supported MySQL version 5.1.52-log [OK] Operating on 64-bit architecture -------- Storage Engine Statistics ------------------------------------------- [--] Status: +Archive -BDB +Federated +InnoDB -ISAM -NDBCluster [--] Data in MyISAM tables: 2G (Tables: 26) [--] Data in InnoDB tables: 749M (Tables: 250) [!!] Total fragmented tables: 262 -------- Security Recommendations ------------------------------------------- -------- Performance Metrics ------------------------------------------------- [--] Up for: 31d 2h 30m 38s (680M q [253.371 qps], 2M conn, TX: 4825B, RX: 236B) [--] Reads / Writes: 89% / 11% [--] Total buffers: 20.6G global + 15.1M per thread (1000 max threads) [OK] Maximum possible memory usage: 35.4G (51% of installed RAM) [OK] Slow queries: 0% (35K/680M) [OK] Highest usage of available connections: 53% (537/1000) [OK] Key buffer size / total MyISAM indexes: 512.0M/457.2M [OK] Key buffer hit rate: 100.0% (9B cached / 264K reads) [OK] Query cache efficiency: 42.3% (260M cached / 615M selects) [!!] Query cache prunes per day: 4384652 [OK] Sorts requiring temporary tables: 0% (1K temp sorts / 38M sorts) [!!] Joins performed without indexes: 100404 [OK] Temporary tables created on disk: 17% (7M on disk / 45M total) [OK] Thread cache hit rate: 99% (537 created / 2M connections) [!!] Table cache hit rate: 0% (1K open / 946K opened) [OK] Open file limit used: 9% (453/5K) [OK] Table locks acquired immediately: 99% (758M immediate / 758M locks) [OK] InnoDB data size / buffer pool: 749.3M/20.0G -------- Recommendations ----------------------------------------------------- General recommendations: Run OPTIMIZE TABLE to defragment tables for better performance Enable the slow query log to troubleshoot bad queries Adjust your join queries to always utilize indexes Increase table_cache gradually to avoid file descriptor limits Variables to adjust: query_cache_size (> 64M) join_buffer_size (> 1.0M, or always use indexes with joins) table_cache (> 1024)

Read the article
nginx : backend https, proxy_pass shows ip

- by Vulpo

I am using nginx as a reverse proxy listening at port 80 (http). I am using proxy_pass to forward requests to backend http and https servers. Everything works fine for my http server but when I try to reach the https server through nginx reverse proxy the ip of the https server is shown in the client's web browser. I want the uri of the nginx server to be shown instead of the https backend server's ip (once again, this works fine with the http server but not for the https server). See this post on the forum Here is my configuration file : server { listen 80; server_name domain1.com; access_log off; root /var/www; if ($request_method !~ ^(GET|HEAD|POST)$ ) { return 444; } location / { proxy_pass http://ipOfHttpServer:port/; } } server { listen 80; server_name domain2.com; access_log off; root /var/www; if ($request_method !~ ^(GET|HEAD|POST)$ ) { return 444; } location / { proxy_pass http://ipOfHttpsServer:port/; proxy_set_header X_FORWARDED_PROTO https; #proxy_set_header Host $http_host; } } When I try the "proxy_set_header Host $http_host" directive and "proxy_set_header Host $host" the web page can't be reached (page not found). But when I comment it, the ip of the https server is shown in the browser (which is bad). Does anyone have an idea ? My other configs files are : proxy_redirect off; proxy_set_header Host $host; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; #proxy_hide_header X-Powered-By; proxy_intercept_errors on; proxy_buffering on; proxy_cache_key "$scheme://$host$request_uri"; proxy_cache_path /var/cache/nginx levels=1:2 keys_zone=cache:10m inactive=7d max_size=700m; user www-data; worker_processes 2; error_log /var/log/nginx/error.log; pid /var/run/nginx.pid; events { worker_connections 1024; } http { include /etc/nginx/mime.types; default_type application/octet-stream; access_log /var/log/nginx/access.log; server_names_hash_bucket_size 64; sendfile off; tcp_nopush on; #keepalive_timeout 0; keepalive_timeout 65; tcp_nodelay on; gzip on; gzip_comp_level 5; gzip_http_version 1.0; gzip_min_length 0; gzip_types text/plain text/html text/css image/x-icon application/x-javascript; gzip_vary on; include /etc/nginx/conf.d/*.conf; include /etc/nginx/sites-enabled/*; } Thanks for your help !

Read the article
Ubuntu server loses exactly 5 minutes once in a while

- by Harold Smith

I noticed that my server, an Ubuntu server 12.04, was losing time. I figured the hardware clock was off or maybe dying due to a faulty CMOS battery. I installed NTP to ensure the drift would be corrected, but to no avail. During a day it would lose 20 minutes or so. To debug, I created a small cron job to check against a remote servers time, which I knew to be correct. The script calculates the difference in seconds between local and remote time. The result was interesting. It seems to be losing exactly 5 minutes several times during the day. Look at this log (difference from remote server noted in seconds): Tue Oct 23 03:30:02 CEST 2012: 284 Tue Oct 23 03:35:02 CEST 2012: 284 Tue Oct 23 03:40:01 CEST 2012: 285 Tue Oct 23 03:45:02 CEST 2012: 285 Tue Oct 23 03:50:02 CEST 2012: 285 Tue Oct 23 03:55:02 CEST 2012: 284 Tue Oct 23 04:00:02 CEST 2012: 284 Tue Oct 23 04:05:01 CEST 2012: 285 Tue Oct 23 04:10:01 CEST 2012: 285 Tue Oct 23 04:15:02 CEST 2012: 585 Tue Oct 23 04:20:02 CEST 2012: 584 Tue Oct 23 04:25:02 CEST 2012: 584 Tue Oct 23 04:30:02 CEST 2012: 584 Tue Oct 23 04:35:01 CEST 2012: 585 Tue Oct 23 04:40:01 CEST 2012: 585 Tue Oct 23 04:45:02 CEST 2012: 585 Tue Oct 23 04:50:02 CEST 2012: 584 Tue Oct 23 04:55:02 CEST 2012: 584 Tue Oct 23 05:00:02 CEST 2012: 584 Tue Oct 23 05:05:01 CEST 2012: 585 Tue Oct 23 05:10:01 CEST 2012: 585 Tue Oct 23 05:15:02 CEST 2012: 585 Tue Oct 23 05:20:02 CEST 2012: 584 Tue Oct 23 05:25:02 CEST 2012: 584 Tue Oct 23 05:30:02 CEST 2012: 584 Tue Oct 23 05:35:01 CEST 2012: 585 Tue Oct 23 05:40:01 CEST 2012: 585 Tue Oct 23 05:45:02 CEST 2012: 584 Tue Oct 23 05:50:02 CEST 2012: 584 Tue Oct 23 05:55:02 CEST 2012: 584 Tue Oct 23 06:00:02 CEST 2012: 584 Tue Oct 23 06:05:03 CEST 2012: 584 Tue Oct 23 06:10:02 CEST 2012: 584 Tue Oct 23 06:15:01 CEST 2012: 585 Tue Oct 23 06:20:02 CEST 2012: 584 Tue Oct 23 06:25:02 CEST 2012: 584 Tue Oct 23 06:30:02 CEST 2012: 584 Tue Oct 23 06:35:02 CEST 2012: 584 Tue Oct 23 06:40:02 CEST 2012: 584 Tue Oct 23 06:45:01 CEST 2012: 585 Tue Oct 23 06:50:02 CEST 2012: 584 Tue Oct 23 06:55:01 CEST 2012: 585 Tue Oct 23 07:00:02 CEST 2012: 584 Tue Oct 23 07:05:02 CEST 2012: 584 Tue Oct 23 07:10:02 CEST 2012: 584 Tue Oct 23 07:15:02 CEST 2012: 584 Tue Oct 23 07:20:02 CEST 2012: 584 Tue Oct 23 07:25:02 CEST 2012: 584 Tue Oct 23 07:30:01 CEST 2012: 585 Tue Oct 23 07:35:02 CEST 2012: 584 Tue Oct 23 07:40:02 CEST 2012: 584 Tue Oct 23 07:45:02 CEST 2012: 584 Tue Oct 23 07:50:02 CEST 2012: 584 Tue Oct 23 07:55:02 CEST 2012: 584 Tue Oct 23 08:00:01 CEST 2012: 585 Tue Oct 23 08:05:02 CEST 2012: 584 Tue Oct 23 08:10:02 CEST 2012: 584 Tue Oct 23 08:15:02 CEST 2012: 584 Tue Oct 23 08:20:02 CEST 2012: 584 Tue Oct 23 08:25:02 CEST 2012: 584 Tue Oct 23 08:30:01 CEST 2012: 585 Tue Oct 23 08:35:02 CEST 2012: 584 Tue Oct 23 08:40:02 CEST 2012: 584 Tue Oct 23 08:45:02 CEST 2012: 584 Tue Oct 23 08:50:02 CEST 2012: 584 Tue Oct 23 08:55:02 CEST 2012: 584 Tue Oct 23 09:00:02 CEST 2012: 584 Tue Oct 23 09:05:03 CEST 2012: 584 Tue Oct 23 09:10:02 CEST 2012: 584 Tue Oct 23 09:15:02 CEST 2012: 584 Tue Oct 23 09:20:02 CEST 2012: 584 Tue Oct 23 09:25:02 CEST 2012: 584 Tue Oct 23 09:30:01 CEST 2012: 584 Tue Oct 23 09:35:02 CEST 2012: 584 Tue Oct 23 09:40:02 CEST 2012: 584 Tue Oct 23 09:45:02 CEST 2012: 584 Tue Oct 23 09:50:02 CEST 2012: 584 Tue Oct 23 09:55:02 CEST 2012: 584 Tue Oct 23 10:00:01 CEST 2012: 584 Tue Oct 23 10:05:02 CEST 2012: 584 Tue Oct 23 10:10:07 CEST 2012: 584 Tue Oct 23 10:15:02 CEST 2012: 584 Tue Oct 23 10:20:02 CEST 2012: 884 Tue Oct 23 10:25:02 CEST 2012: 884 Tue Oct 23 10:30:02 CEST 2012: 883 Tue Oct 23 10:35:01 CEST 2012: 884 Tue Oct 23 10:40:02 CEST 2012: 884 Tue Oct 23 10:45:02 CEST 2012: 884 Tue Oct 23 10:50:02 CEST 2012: 884 Tue Oct 23 10:55:02 CEST 2012: 1184 Tue Oct 23 11:00:02 CEST 2012: 1183 Tue Oct 23 11:05:01 CEST 2012: 1184 Tue Oct 23 11:10:02 CEST 2012: 1184 Tue Oct 23 11:15:02 CEST 2012: 1184 Tue Oct 23 11:20:02 CEST 2012: 1184 This does not seem to be faulty CMOS battery in my opinion. But what do you think?

Read the article
Wear and tear on server hard drive from filesystem polling by PHP script

- by jackie

So I'm working on a discussion platform, and various clients will visit http://host/thread.php, which will render the discussion thread to date in addition to a form to submit a new post. When a new post is submitted, I would like all of the other clients with browser windows open to have it appear in near-real-time. One of the constraints of my script is that it may not use a DBMS and it must stay in the filesystem. Additionally, I can't use any PECL/PEAR extensions like inotify or anything like that for IPC. The flow will look like this: Client A requests thread.php and the thread is so far empty, but nonetheless it opens a Server-Side Event at eventPusher.php. Client B does the same. Client A fills out a post in the form and and submits (POSTs) it to subHandler.php. ??? (subHandler stores the new submission into the main thread storefile which gets read from when a fresh, new client requests thread.php, in addition to somehow signalling to the continually-running eventPusher event-source that a new comment was posted and that it should echo the event-json to the client. How, exactly, it will send this signal I'm yet unsure of, but there are a few options that I've thought of -- this is the crux of the question, so see below for more clarification) eventPusher.php happily pushes the new event to the client and it shows up soon after it was originally submitted on all clients who have the page open's screens. Now for the #4 missing-link mystery-step, I see a few problems. I mean, either way, eventPusher is gonna be doing a while loop of some sort -- it's gonna be polling something, I think that much is clear. (If that's a bad assumption please do let me know.) Now, the simplest way would be subHandler gets invoked on the form submission, writes it to the main store in addition to newComments.xml, then exits without doing anything else. Then eventPusher checks in newComments.xml every X seconds (by the way, what would be a reasonable time interval here?) and if it finds something then it emits an event to the client. Now, my fear with this is that the server's hard drive will have to constantly start spinning up. Maybe this isn't the case, perhaps it would just get cached in RAM and the linux kernel would take care of this transparently such that filesystem access doesn't actually engage the device because the kernel knows that that particular file hasn't changed since last read. * idea #2: I have no idea how to go about this, but perhaps there is a variable scope that gets stored in general RAM on the system which can be read by any process. Like if we mega-exported a bash variable so that $new_post is normally false but it gets toggled to true by subHandler, and then back to flase once it's pushed to the client. I doubt there's such a variable scope in PHP directly, but I struggle with the concept of variable scope, I just can't seem to understand it no matter what I read on it. * idea #3: eventPusher queries ps in its whileloop for another instance of itself. If there's not another eventPusher active then it's highly unlikely that new comments will be getting submitted. It's okay if this only works =90% of the time, it doesn't need to be completely foolproof. * idea #4: eventPusher queries DMESG to see if that file's been written to recently. So to sum everything up, I need to have inter-php-script-communication in near-real-time that will work on a standard mod_php shared hosting setup without any elevated privileges, PHP addon modules, or other system adjustments that can't be done from the PHP script itself at runtime. With*out* spinning up the drive more than a few times. No SQL servers either. Apologies if my english isn't the best, I'm still trying to improve on it.

Read the article
/usr/bin/sshd isn't linked against PAM on one of my systems. What is wrong and how can I fix it?

- by marc.riera

Hi, I'm using AD as my user account server with ldap. Most of the servers run with UsePam yes except this one, it has lack of pam support on sshd. root@linserv9:~# ldd /usr/sbin/sshd linux-vdso.so.1 => (0x00007fff621fe000) libutil.so.1 => /lib/libutil.so.1 (0x00007fd759d0b000) libz.so.1 => /usr/lib/libz.so.1 (0x00007fd759af4000) libnsl.so.1 => /lib/libnsl.so.1 (0x00007fd7598db000) libcrypto.so.0.9.8 => /usr/lib/libcrypto.so.0.9.8 (0x00007fd75955b000) libcrypt.so.1 => /lib/libcrypt.so.1 (0x00007fd759323000) libc.so.6 => /lib/libc.so.6 (0x00007fd758fc1000) libdl.so.2 => /lib/libdl.so.2 (0x00007fd758dbd000) /lib64/ld-linux-x86-64.so.2 (0x00007fd759f0e000) I have this packages installed root@linserv9:~# dpkg -l|grep -E 'pam|ssh' ii denyhosts 2.6-2.1 an utility to help sys admins thwart ssh hac ii libpam-modules 0.99.7.1-5ubuntu6.1 Pluggable Authentication Modules for PAM ii libpam-runtime 0.99.7.1-5ubuntu6.1 Runtime support for the PAM library ii libpam-ssh 1.91.0-9.2 enable SSO behavior for ssh and pam ii libpam0g 0.99.7.1-5ubuntu6.1 Pluggable Authentication Modules library ii libpam0g-dev 0.99.7.1-5ubuntu6.1 Development files for PAM ii openssh-blacklist 0.1-1ubuntu0.8.04.1 list of blacklisted OpenSSH RSA and DSA keys ii openssh-client 1:4.7p1-8ubuntu1.2 secure shell client, an rlogin/rsh/rcp repla ii openssh-server 1:4.7p1-8ubuntu1.2 secure shell server, an rshd replacement ii quest-openssh 5.2p1_q13-1 Secure shell root@linserv9:~# What I'm doing wrong? thanks. Edit: root@linserv9:~# cat /etc/pam.d/sshd # PAM configuration for the Secure Shell service # Read environment variables from /etc/environment and # /etc/security/pam_env.conf. auth required pam_env.so # [1] # In Debian 4.0 (etch), locale-related environment variables were moved to # /etc/default/locale, so read that as well. auth required pam_env.so envfile=/etc/default/locale # Standard Un*x authentication. @include common-auth # Disallow non-root logins when /etc/nologin exists. account required pam_nologin.so # Uncomment and edit /etc/security/access.conf if you need to set complex # access limits that are hard to express in sshd_config. # account required pam_access.so # Standard Un*x authorization. @include common-account # Standard Un*x session setup and teardown. @include common-session # Print the message of the day upon successful login. session optional pam_motd.so # [1] # Print the status of the user's mailbox upon successful login. session optional pam_mail.so standard noenv # [1] # Set up user limits from /etc/security/limits.conf. session required pam_limits.so # Set up SELinux capabilities (need modified pam) # session required pam_selinux.so multiple # Standard Un*x password updating. @include common-password Edit2: UsePAM yes fails With this configuration ssh fails to start : root@linserv9:/home/admmarc# cat /etc/ssh/sshd_config |grep -vE "^[ \t]*$|^#" Port 22 Protocol 2 ListenAddress 0.0.0.0 RSAAuthentication yes PubkeyAuthentication yes AuthorizedKeysFile .ssh/authorized_keys ChallengeResponseAuthentication yes UsePAM yes Subsystem sftp /usr/lib/sftp-server root@linserv9:/home/admmarc# The error it gives is as follows root@linserv9:/home/admmarc# /etc/init.d/ssh start * Starting OpenBSD Secure Shell server sshd /etc/ssh/sshd_config: line 75: Bad configuration option: UsePAM /etc/ssh/sshd_config: terminating, 1 bad configuration options ...fail! root@linserv9:/home/admmarc#

Read the article
Looking for advice on Hyper-v storage replication

- by Notre1

I am designing a 2-host Hyper-V R2 cluster with 6-10 guests stored on a SMB iSCSI SAN device (probably Promise VessRAID). I will be getting at least two of the SAN devices and need to eliminate the storage a single point of failure. Ideally, that would involve real-time failover for the storage, like the Windows failover clustering does for the hosts. This design will be used at around six of our sites, and I would like to allow for us to eventually setup a cluster at colocation site and replicate each site's VMs there for DR. (Ideally a live multi-site cluster, but a manual import of the VMs would be fine for this sort of DR.) The tools that come with enterprise SANs, like EMC and NetApp, seem to be the most commonly used items for a Hyper-V cluster, but I can't afford their prices with my budget. Outside of them, the two tools that seem to be most common for Hyper-V storage replication are SteelEye (now SIOS) DataKeeper Cluster Edition and Double-Take Availability. Originally, I was planning on using Clustered Shared Volume(s) (CSV), but it seems like replication support for these is either not available or brand new in both these products. It looks like CSVs are supported in Double-Take 5.22, see this discussion, but I don't think I want to run something that new in production. Right now, it seems like the best option for me is not to implement CSVs, implement some sort of storage replication, and upgrade to CSVs at a later date once replicating them is more mature. I would love to have live migration, and CSVs are not required for live migration if you are using one LUN per VM, so I guess this is what I'll do. I would prefer to stick to the using the Microsoft Windows Server and Hyper-V tools and features as much as possible. From that standpoint, SteelEye looks more appealing than Double-Take because they make the DataKeeper volume(s) available to the Failover Clustering Manager and then failover clustering is all configured and managed through the native Microsoft tools. Double-Take says that "clustered Hyper-V hosts are not supported," and Double-Take Availability itself seems to be what is used for the actual clustering and failover. Does anyone know if any of these replication tools work with more than two hosts in the cluster? All the information I can find on the web only uses two hosts in their examples. Are there any better tools than SteelEye and Double-Take for doing what I am trying to do, which is eliminate the storage as as single point of failure? Neverfail, AppAssure, and DataCore all seem to offer similar functionality, but they don't seems to be as popular as SteelEye and Double-Take. I have seen a number of people suggest using Starwind iSCSI SAN software for the shared storage, which includes replication (and CSV replication at that). There are a couple of reasons I have not seriously considered this route: 1) The company I work for is exclusively a Dell shop and Dell does not have any servers with that I can pack with more than six 3.5" SATA drives. 2) In the future, it could be advantegous for us to not be locked into a particular brand or type of storage and third-party replication softwares all allow replication to heterogeneous storage devices. I am pretty new to iSCSI and clustering, so please let me know if it looks like I am planning something that goes against best practices or overlooking/missing something.

Read the article
Email forwarding from my domain to gmail - FAIL

- by pitosalas

[There are numerous similar questions on ServerFault but I couldn't find one that was exactly on point] Background: I use Gmail for my email client. My email is [email protected]. However the email that people communicate to me with is [email protected]. I run the server that hosts www.example.com and other domains, at ServerBeach. Up to yesterday, I had SENDMAIL painlessly just forward emails to [email protected] to [email protected] and everything was fine, for several years in fact. Suddenly my email stopped working - that is, my gmail account stopped receiving emails via the forward from my server. Looking into it I found a bunch of emails sitting on my server with content like this: ... while talking to gmail-smtp-in.l.google.com.: RCPT To: <<< 450-4.2.1 The user you are trying to contact is receiving mail at a rate that <<< 450-4.2.1 prevents additional messages from being delivered. Please resend your <<< 450-4.2.1 message at a later time. If the user is able to receive mail at that <<< 450-4.2.1 time, your message will be delivered. For more information, please <<< 450 4.2.1 visit xxxxxx://mail.google.com/support/bin/answer.py?answer=6592 u15si37138086qco.76 [email protected]... Deferred: 450-4.2.1 The user you are trying to contact is receiving mail at a rate that DATA <<< 550-5.7.1 [64.34.168.137 1] Our system has detected an unusual rate of <<< 550-5.7.1 unsolicited mail originating from your IP address. To protect our <<< 550-5.7.1 users from spam, mail sent from your IP address has been blocked. <<< 550-5.7.1 Please visit xxxxx://www.google.com/mail/help/bulk_mail.html to review <<< 550 5.7.1 our Bulk Email Senders Guidelines. u15si37138086qco.76 554 5.0.0 Service unavailable ... while talking to alt1.gmail-smtp-in.l.google.com.: From what I've been researching, I think somehow someone has/is hijacking my domain name or something and this somehow has caused gmail's servers to notice and cut me off. But I don't know really what's going on nor do I see whatever emails might be involved. I've read stuff on zoneedit.com that sounds like they might have a solution in their service for what I am trying to do. I also read a lot about admining DNS and SENDMAIL and tried various things, but nothing works. Can you tell from my description what is going on that caused GMail's server to stop accepting email from my server and is there a way to stop it? What is the 'correct' way to configure things so that emails to [email protected] behave as if they were sent to [email protected]? Thanks so much!

Read the article
mysqld service crashes on restart, after importing mysqldump #innodb

- by ubunut

I have 2 mysql servers. Let's call them server01 & server02. Both have the same configuration: mysqladmin Ver 8.42 Distrib 5.1.61, for redhat-linux-gnu on x86_64 [client] default-character-set=utf8 [mysqld] datadir=/var/lib/mysql socket=/var/lib/mysql/mysql.sock user=mysql # Disabling symbolic-links is recommended to prevent assorted security risks symbolic-links=0 max_allowed_packet = 16M default-character-set=utf8 default-collation=utf8_unicode_ci character-set-server=utf8 collation-server=utf8_unicode_ci default-storage-engine = InnoDB innodb_data_home_dir = /var/lib/mysql innodb_log_group_home_dir = /var/lib/mysql innodb_data_file_path = ibdata1:10M:autoextend innodb_additional_mem_pool_size = 2M innodb_log_file_size = 5M innodb_log_buffer_size = 8M innodb_lock_wait_timeout = 50 innodb_flush_log_at_trx_commit = 1 innodb_buffer_pool_size = 700M table_cache = 300 thread_cache_size = 4 query_cache_size = 200m query_cache_limit = 10m [mysqld_safe] log-error=/var/log/mysqld.log pid-file=/var/run/mysqld/mysqld.pid I make a mysqldump on server01: mysqldump -uuser -ppassword --all-databases testservers.sql (most tables in these databases are innodb, some of the mysql.* tables are Innodb too) Then I import the testservers.sql on server02: mysql -uuser < testservers.sql (mysqld has been started with --skip-network). So far so good, I can login into mysql & everything seems to be ok. BUT when I exit to the shell and execute service mysqld restart, The service fails to start. stack-trace in /var/log/mysqld.log: 121022 14:53:19 mysqld_safe Starting mysqld daemon with databases from /var/lib/mysql 121022 14:53:19 [Warning] '--default-character-set' is deprecated and will be removed in a future release. Please use '--character-set-server' instead. 121022 14:53:19 [Warning] '--default-collation' is deprecated and will be removed in a future release. Please use '--collation-server' instead. 12:53:19 UTC - mysqld got signal 11 ; This could be because you hit a bug. It is also possible that this binary or one of the libraries it was linked against is corrupt, improperly built, or misconfigured. This error can also be caused by malfunctioning hardware. We will try our best to scrape up some info that will hopefully help diagnose the problem, but since we have already crashed, something is definitely wrong and this may fail. key_buffer_size=8384512 read_buffer_size=131072 max_used_connections=0 max_threads=151 thread_count=0 connection_count=0 It is possible that mysqld could use up to key_buffer_size + (read_buffer_size + sort_buffer_size)*max_threads = 338324 K bytes of memory Hope that's ok; if not, decrease some variables in the equation. Thread pointer: 0x267e630 Attempting backtrace. You can use the following information to find out where mysqld died. If you see no messages after this, something went terribly wrong... stack_bottom = 7fff3efe0be0 thread_stack 0x40000 /usr/libexec/mysqld(my_print_stacktrace+0x29) [0x84bd89] /usr/libexec/mysqld(handle_fatal_signal+0x483) [0x6a0be3] /lib64/libpthread.so.0() [0x338d60f500] /usr/libexec/mysqld(ha_resolve_by_name(THD*, st_mysql_lex_string const*)+0x81) [0x6956e1] /usr/libexec/mysqld(open_table_def(THD*, st_table_share*, unsigned int)+0xe0a) [0x60e5ba] /usr/libexec/mysqld(get_table_share(THD*, TABLE_LIST*, char*, unsigned int, unsigned int, int*)+0x20b) [0x602b0b] /usr/libexec/mysqld() [0x603597] /usr/libexec/mysqld(open_table(THD*, TABLE_LIST*, st_mem_root*, bool*, unsigned int)+0x7a1) [0x6079a1] /usr/libexec/mysqld(open_tables(THD*, TABLE_LIST**, unsigned int*, unsigned int)+0x5d0) [0x608570] /usr/libexec/mysqld(open_and_lock_tables_derived(THD*, TABLE_LIST*, bool)+0x6a) [0x60877a] /usr/libexec/mysqld(plugin_init(int*, char**, int)+0x622) [0x715af2] /usr/libexec/mysqld() [0x5bd3b2] /usr/libexec/mysqld(main+0x1b3) [0x5bfc93] /lib64/libc.so.6(__libc_start_main+0xfd) [0x338d21ecdd] /usr/libexec/mysqld() [0x5087b9] Trying to get some variables. Some pointers may be invalid and cause the dump to abort. Query (0): is an invalid pointer Connection ID (thread ID): 0 Status: NOT_KILLED The manual page at http://dev.mysql.com/doc/mysql/en/crashing.html contains information that should help you find out what is causing the crash. 121022 14:53:19 mysqld_safe mysqld from pid file /var/run/mysqld/mysqld.pid ended A typical mysqdump entry looks like this: DROP TABLE IF EXISTS `adodb_logsql`; /*!40101 SET @saved_cs_client = @@character_set_client */; /*!40101 SET character_set_client = utf8 */; CREATE TABLE `adodb_logsql` ( `id` bigint(10) unsigned NOT NULL AUTO_INCREMENT, `created` datetime NOT NULL, `sql0` varchar(250) NOT NULL DEFAULT '', `sql1` text, `params` text, `tracer` text, `timer` decimal(16,6) NOT NULL DEFAULT '0.000000', PRIMARY KEY (`id`) ) ENGINE=InnoDB DEFAULT CHARSET=utf8 COMMENT='to save some logs from ADOdb'; /*!40101 SET character_set_client = @saved_cs_client */; IF I change all occurrences of "ENGINE=InnoDB" to "ENGINE=MyISAM" before import, then the service has no problem restarting. I'm quite puzzled as to what's happening, maybe I'm just an idiot, then by all means tell me so. Any help would be greatly appreciated!

Read the article
Symantec Protection Suite and System Recovery 2011 Desktop Edition

- by rihatum

I am re-posting this as my previous question was being treated as if I am "Shopping or seeking Product Recommendations" even though I was NOT - BTW they have deleted my comments too which were not offensive in nature. anyway - I have re-phrased some parts of my question and I hope SF Admins "Do Not Modify / Edit" this one - will be most grateful for that. I have a lot of respect for the People who visit this SITE and help others ! Just To clarify : Just to go by SF rules - I am not seeking someone to Design this solution, I am simply seeking real world examples, experiences, technical expert opinions / suggestions, any tips or tricks they may have or any problems they may have faced while doing something similar above with these products. I am also not asking for Capacity Planning for Storage, We have done some research and I am seeking Expert Assurance / Suggestions. We (our company) are planning to deploy Symantec Endpoint Protection and Symantec Desktop Recovery 2011 Desktop Edition to our 3000 - 4000 workstations (Windows7 32 and 64) with a few 100s with Windows XP 32/64 Bit. I have read the implementation guide for SEP and have read tech-notes for Desktop Recovery 2011. Our team have planned to deploy this as follows : 1 x dedicated SQL 2008R2 for Symantec Endpoint Protection (Instead of using the Embedded Database) 1 x Dedicated SQL 2008R2 for Symantec Desktop Recovery 2011 (Instead of using the Embedded Database) 1 x Dedicated W2K8 R2 Box for the SEPM (Symantec Endpoint Protection Manager - Mgmt. APP) 1 x Dedicated W2K8 R2 Box for the Symantec Desktop Recovery 2011 Management Application Agent Deployment : As per Symantec Documentation for both of the above, an agent can be pushed via the Mgmt. Application (provided no firewalls are blocking ports required etc. - we have Windows firewall disabled already). Server Hardware : Per SQL Server : 16GB RAM + SAS DISKS + Dual XEON, RAID-10 for the SQL DB or I can always mount a LUN from our existing Hitachi or EMC SAN. SEPM Server : 16GB RAM + SAS DISKS + DUAL XEON System Recovery MGMT SERVER : 16GB RAM + SAS DISKS + DUAL XEON Above is the initial plan we have for 3000 - 4000 client workstation (Windows) Now my Questions :-) a) If we had these users distributed amongst two sites with AD DC / GC in each site, How would I restrict SEPM and Desktop Mgmt. solution to only check for users in their respective site ? b) At present all users are under one building but we are going to move some dept. to a new location (with dedicated connectivity), How would we control which SEPM / MGMT Server is responsible for which site ? c) We have netbackup in our environment backing up other servers, I am planning to protect these 4 (2 x SQL, 1 x SEPM, 1 x System Recovery Mgmt. Server) via netbackup or I can use System recovery 2011 server edition on all 4 of these boxes as well. (License is not an issue as we have the complete symantec portfolio included in our license). d) Now - Saving Desktop backups - What strategies have you implemented ? Any best practice recommendation for a large user base ? I was thinking to either mount a LUN from our Hitachi SAN on the Symantec Recovery Server itself or backup to the users hard drive locally and then copy it over to a network location ? Suggestions welcome :-) If you have anything to add / correct - that will be really helpful before diving into the actual implementation phase. Will be most grateful with your suggestions, recommendations and corrections with above - Many Thanks !

Read the article
Can not access network computers anymore

- by Johny Skovdal

Last Thursday (03/05/12) I got a new computer to be able to work from home. I plugged it, by cable, into the company network and installed most of the software needed for me to do so, by accessing a share on my stationary computer at work. I had no issues here what so ever, and everything just worked. Yesterday evening I tried accessing the company network trough Windows VPN, and while I was able to connect to the network, I was unable to connect to any computers on the network. I did, however, get an error when connecting, but I can't seem to get the error again, to get the details of the error message. Today I am sitting on the company network again, and now I can not access anything on the network like I could last Thursday, though I can ping all the computers I am attempting to access. Here is a list of details that might help in troubleshooting this issue (updated): List of observations / actions My computer is identical to another computer that has no issues. It is not on the domain but rather on the default workgroup, but this was not an issue last Thursday, so I am assuming it still is not. I am able to access my e-mail on the exchange server. I can connect to our TFS server from Visual Studio but not from Explorer. I can also connect to Database Servers and Remote Desktop. I can see several computers when browsing network computers, but I am unable to connect to any of them. When trying to connect to a computer I am consistently met with the error code "0x80070035" (network path not found). I also get the 0x80070035 error when double clicking the target computer from the Network UI. I am not met with a login dialog when trying to access a computer, as I should, since I am not on the domain. (I did login to both Exchange, Remote Desktop and TFS though) Between Thursday where it worked and Sunday evening where it did not, I have installed quite a few security updates, plus various tools etc. that I need for programming. I have tried accessing by computer name and ip and neither of them work. I can ping by computer name. I have deleted all (1 entry) stored network credentials. I am able to access my computer from the target computer. Client and Server can see each other on the network = Network Discovery is enabled. I am using the network profile "Work". When accessing the network through VPN, I am unable to get anything to work using computernames, but all of the above applies when using IP adresses instead of computername. I run Windows 7 Home Premium on my computer. Using powershell attempting to access a share I get the following error (ComputerName and ShareName being correct values of course): PS C:\Users\MyUser> cd \\ComputerName\ShareName Set-Location : Cannot find path '\\ComputerName\ShareName' because it does not exist. At line:1 char:3 + cd <<<< \\ComputerName\ShareName + CategoryInfo : ObjectNotFound: (\\ComputerName\ShareName:String) [Set-Location], ItemNotFoundException + FullyQualifiedErrorId : PathNotFound,Microsoft.PowerShell.Commands.SetLocationCommand However, ping'ing the same machine (ping ComputerName) from powershell I get response immediately. (As mentioned in the list of observations/actions, I tried the above with the IP address again on VPN, to get the same result) Conclusion So to sum up, pretty much the only thing I can not do, is access the other computers through browsing (explorer.exe, powershell, map networkdrive, etc.), which means that I am pretty much down to, that it is unable to resolve the path somehow, when trying to connect to other computers trough browsing, though the path gets resolved perfectly using all kinds of other services. Any recommendations as to what I can try next to resolve the issue? :)

Read the article
Mongodb Slave replication lag

- by Leonid Bugaev

We using standard mongo setup: 2 replicas + 1 arbiter. Both replica servers use same AWS m1.medium with RAID10 EBS. We experiencing constantly growing replication lag on secondary replica. I tried to do full-resync, you can see it on graph, but it helped only for some hours. Our mongo usage is really low now, and frankly i can't understan why it can be. iostat 1 for secondary: avg-cpu: %user %nice %system %iowait %steal %idle 80.39 0.00 2.94 0.00 16.67 0.00 Device: tps kB_read/s kB_wrtn/s kB_read kB_wrtn xvdap1 0.00 0.00 0.00 0 0 xvdb 0.00 0.00 0.00 0 0 xvdfp4 12.75 0.00 189.22 0 193 xvdfp3 12.75 0.00 189.22 0 193 xvdfp2 7.84 0.00 40.20 0 41 xvdfp1 7.84 0.00 40.20 0 41 md127 19.61 0.00 219.61 0 224 mongostat for secondary (why 100% locks? i guess its the problem): insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time *10 *0 *16 *0 0 2|4 0 30.9g 62.4g 1.65g 0 107 0 0|0 0|0 198b 1k 16 replset-01 SEC 06:55:37 *4 *0 *8 *0 0 12|0 0 30.9g 62.4g 1.65g 0 91.7 0 0|0 0|0 837b 5k 16 replset-01 SEC 06:55:38 *4 *0 *7 *0 0 3|0 0 30.9g 62.4g 1.64g 0 110 0 0|0 0|0 342b 1k 16 replset-01 SEC 06:55:39 *4 *0 *8 *0 0 1|0 0 30.9g 62.4g 1.64g 0 82.9 0 0|0 0|0 62b 1k 16 replset-01 SEC 06:55:40 *3 *0 *7 *0 0 5|0 0 30.9g 62.4g 1.6g 0 75.2 0 0|0 0|0 466b 2k 16 replset-01 SEC 06:55:41 *4 *0 *7 *0 0 1|0 0 30.9g 62.4g 1.64g 0 138 0 0|0 0|1 62b 1k 16 replset-01 SEC 06:55:42 *7 *0 *15 *0 0 3|0 0 30.9g 62.4g 1.64g 0 95.4 0 0|0 0|0 342b 1k 16 replset-01 SEC 06:55:43 *7 *0 *14 *0 0 1|0 0 30.9g 62.4g 1.64g 0 98 0 0|0 0|0 62b 1k 16 replset-01 SEC 06:55:44 *8 *0 *17 *0 0 3|0 0 30.9g 62.4g 1.64g 0 96.3 0 0|0 0|0 342b 1k 16 replset-01 SEC 06:55:45 *7 *0 *14 *0 0 3|0 0 30.9g 62.4g 1.64g 0 96.1 0 0|0 0|0 186b 2k 16 replset-01 SEC 06:55:46 mongostat for primary insert query update delete getmore command flushes mapped vsize res faults locked % idx miss % qr|qw ar|aw netIn netOut conn set repl time 12 30 20 0 0 3 0 30.9g 62.6g 641m 0 0.9 0 0|0 0|0 212k 619k 48 replset-01 M 06:56:41 5 17 10 0 0 2 0 30.9g 62.6g 641m 0 0.5 0 0|0 0|0 159k 429k 48 replset-01 M 06:56:42 9 22 16 0 0 3 0 30.9g 62.6g 642m 0 0.7 0 0|0 0|0 158k 276k 48 replset-01 M 06:56:43 6 18 12 0 0 2 0 30.9g 62.6g 640m 0 0.7 0 0|0 0|0 93k 231k 48 replset-01 M 06:56:44 6 12 8 0 0 3 0 30.9g 62.6g 640m 0 0.3 0 0|0 0|0 80k 125k 48 replset-01 M 06:56:45 8 21 14 0 0 9 0 30.9g 62.6g 641m 0 0.6 0 0|0 0|0 118k 419k 48 replset-01 M 06:56:46 10 34 20 0 0 6 0 30.9g 62.6g 640m 0 1.3 0 0|0 0|0 164k 527k 48 replset-01 M 06:56:47 6 21 13 0 0 2 0 30.9g 62.6g 641m 0 0.7 0 0|0 0|0 111k 477k 48 replset-01 M 06:56:48 8 21 15 0 0 2 0 30.9g 62.6g 641m 0 0.7 0 0|0 0|0 204k 336k 48 replset-01 M 06:56:49 4 12 8 0 0 8 0 30.9g 62.6g 641m 0 0.5 0 0|0 0|0 156k 530k 48 replset-01 M 06:56:50 Mongo version: 2.0.6

Read the article
TCP stops sending weirdly.

- by Utoah

In case to find out the cause of TCP retransmits on my Linux (RHEL, kernel 2.6.18) servers connecting to the same switch. I had a client-server pair send "Hello" to each other every 200us and captured the packets with tcpdump on the client machine. The command I used to mimic client and server are: while [ 0 ]; do echo "Hello"; usleep 200; done | nc server 18510 while [ 0 ]; do echo "Hello"; usleep 200; done | nc -l 18510 When the server machine was busy serving some other requests, the client suffered from abrupt retransmits occasionally. But the output of tcpdump seemed irrational. 16:04:58.898970 IP server.18510 > client.34533: P 4531:4537(6) ack 3204 win 123 <nop,nop,timestamp 1923778643 3452833828> 16:04:58.901797 IP client.34533 > server.18510: P 3204:3210(6) ack 4537 win 33 <nop,nop,timestamp 3452833831 1923778643> 16:04:58.901855 IP server.18510 > client.34533: P 4537:4549(12) ack 3210 win 123 <nop,nop,timestamp 1923778646 3452833831> 16:04:58.903871 IP client.34533 > server.18510: P 3210:3216(6) ack 4549 win 33 <nop,nop,timestamp 3452833833 1923778646> 16:04:58.903950 IP server.18510 > client.34533: P 4549:4555(6) ack 3216 win 123 <nop,nop,timestamp 1923778648 3452833833> 16:04:58.905796 IP client.34533 > server.18510: P 3216:3222(6) ack 4555 win 33 <nop,nop,timestamp 3452833835 1923778648> 16:04:58.905860 IP server.18510 > client.34533: P 4555:4561(6) ack 3222 win 123 <nop,nop,timestamp 1923778650 3452833835> 16:04:58.908903 IP client.34533 > server.18510: P 3222:3228(6) ack 4561 win 33 <nop,nop,timestamp 3452833838 1923778650> 16:04:58.908966 IP server.18510 > client.34533: P 4561:4567(6) ack 3228 win 123 <nop,nop,timestamp 1923778653 3452833838> 16:04:58.911855 IP client.34533 > server.18510: P 3228:3234(6) ack 4567 win 33 <nop,nop,timestamp 3452833841 1923778653> 16:04:59.112573 IP client.34533 > server.18510: P 3228:3234(6) ack 4567 win 33 <nop,nop,timestamp 3452834042 1923778653> 16:04:59.112648 IP server.18510 > client.34533: P 4567:5161(594) ack 3234 win 123 <nop,nop,timestamp 1923778857 3452834042> 16:04:59.112659 IP client.34533 > server.18510: P 3234:3672(438) ack 5161 win 35 <nop,nop,timestamp 3452834042 1923778857> 16:04:59.114427 IP server.18510 > client.34533: P 5161:5167(6) ack 3672 win 126 <nop,nop,timestamp 1923778858 3452834042> 16:04:59.114439 IP client.34533 > server.18510: P 3672:3678(6) ack 5167 win 35 <nop,nop,timestamp 3452834044 1923778858> 16:04:59.116435 IP server.18510 > client.34533: P 5167:5173(6) ack 3678 win 126 <nop,nop,timestamp 1923778860 3452834044> 16:04:59.116444 IP client.34533 > server.18510: P 3678:3684(6) ack 5173 win 35 <nop,nop,timestamp 3452834046 1923778860> Packet 3228:3234(6) from client was retransmitted due to ack timeout. What I could not understand was that the client machine did not send out any packets after the first 3228:3234(6) packets was sent. The server machine had advertised a window (scaled) large enough. The data transfer up to the retransmit was fine which meant no slow start should be in action. What can cause the client machine to stop sending until the packet timed out? BTW, I am unable to run tcpdump on the server machine.

Read the article
Outlook Web Access, reverse proxy and browser

- by M'vy

Hi SF'ers! We recently moved an exchange server behind a reverse proxy due to the loss of a public IP. I've managed to configure the reverse proxy (httpd proxy_http). But there is a problem for the SSL configuration. When accessing the OWA interface with Firefox, all is ok and working. When accessing with MSIE or Chrome, they do not retrieve the good SSL Certificate. I think this is due to the multiples virtual host for httpd. Is there a workaround to make sure MSIE/Chrome request the certificate for the good domain name like FF does? Already tested with the SSL virtual host : SetEnvIf User-Agent ".*MSIE.*" value BrowserMSIE Header unset WWW-Authenticate Header add WWW-Authenticate "Basic realm=exchange.domain.com" A: ProxyPreserveHost On also: BrowserMatch ".*MSIE.*" \ nokeepalive ssl-unclean-shutdown \ downgrade-1.0 force-response-1.0 Or: SetEnvIf User-Agent ".*MSIE.*" \ nokeepalive ssl-unclean-shutdown \ downgrade-1.0 force-response-1.0 And lots of ProxyPassand ProxyReversePath on /exchweb /exchange /public etc... And it still don't seem to work. Any clue? Thanks. Edit 1: Precision of versions # openssl version OpenSSL 0.9.8k-fips 25 Mar 2009 /usr/sbin/httpd -v Server version: Apache/2.2.11 (Unix) Server built: Mar 17 2009 09:15:10 Browser versions : MSIE : 8.0.6001 Opera: Version 11.01 Revision 1190 Firefox: 3.6.15 Chrome: 10.0.648.151 Operating System: Windows Vista 32bits. They are all SNI compliant, I've tested them this afternoon https://sni.velox.ch/ You're right Shane Madden, I have multiple sites on the same public IP (and same port as well). The server itself is just a reverse proxy, that rewrite addresses to internal servers. The default host is a dev site, configure with the certificate that does not match the OWA (of course... would have been to easy) <VirtualHost *:443> ServerName dev2.domain.com ServerAdmin [email protected] CustomLog "| /usr/sbin/rotatelogs /var/log/httpd/access-%y%m%d.log 86400" combined ErrorLog "| /usr/sbin/rotatelogs /var/log/httpd/error-%y%m%d.log 86400" LogLevel warn RewriteEngine on SetEnvIfNoCase X-Forwarded-For .+ proxy=yes SSLEngine on SSLProtocol -all +SSLv3 +TLSv1 SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL:+SSLv3 SSLCertificateFile /etc/httpd/ssl/domain.com.crt SSLCertificateKeyFile /etc/httpd/ssl/domain.com.key RewriteCond %{HTTP_HOST} dev2\.domain\.com RewriteRule ^/(.*)$ http://dev2.domain.com/$1 [L,P] </VirtualHost> The certificate of domain is a *.domain.com The second vHost is : <VirtualHost *:443> ServerName exchange.domain2.com ServerAdmin [email protected] CustomLog "| /usr/sbin/rotatelogs /var/log/httpd/exchange/access-%y%m%d.log 86400" combined ErrorLog "| /usr/sbin/rotatelogs /var/log/httpd/exchange/error-%y%m%d.log 86400" LogLevel warn SSLEngine on SSLProxyEngine On SSLProtocol -all +SSLv3 +TLSv1 SSLCipherSuite ALL:!ADH:!EXPORT56:RC4+RSA:+HIGH:+MEDIUM:+LOW:+SSLv2:+EXP:+eNULL:+SSLv3 SSLCertificateFile /etc/httpd/ssl/exchange.pem SSLCertificateKeyFile /etc/httpd/ssl/exchange.key RewriteEngine on SetEnvIfNoCase X-Forwarded-For .+ proxy=yes RewriteCond %{HTTP_HOST} exchange\.domain2\.com RewriteRule ^/(.*)$ https://exchange.domain2.com/$1 [L,P] </VirtualHost> and it's certificate is exchange.domain2.com only. I presume the SNI is somewhere not activated on my server. The versions of openssl and apache seams to be ok for the SNI support. The only thing I do not know is if httpd has been compile with the good options. (I assume it's a fedora packet).

Read the article
cannot send mail to postfix /w iptables linux proxy

- by Juzzam

I have two separate servers, both running Ubuntu 8.04. Server 1 has the real domain name of our site, let's refer to it as example.com. Server 2 is a mail server I have setup with postfix/courier. The hostname for this server is mail.example.com. I've setup iptables on Server 1 to forward all traffic on port 25 to Server 2. I used this script (except I changed the target ip address and the port from 80 to 25). When I send an email to [email protected] it works. However, when I try to send an email to [email protected] from gmail, I get this error: 550 550 #5.1.0 Address rejected [email protected] (state 14) /var/log/mail.log shows no new lines when this happens. What is strange is that it works with telnet from my local machine. For example: $ telnet example.com 25 220 VO13421.localdomain SMTP Postfix EHLO example.com 250-VO13421.localdomain 250-PIPELINING 250-SIZE 10240000 250-ETRN 250-STARTTLS 250-ENHANCEDSTATUSCODES 250-8BITMIME 250 DSN MAIL FROM: [email protected] 250 2.1.0 Ok RCPT TO: [email protected] 250 2.1.5 Ok data 354 Please start mail input. hello user... how have you been? . 250 Mail queued for delivery. quit 221 Closing connection. Good bye. /var/log/mail.log shows success (and the email goes to the maildr): Feb 24 09:47:36 VO13421 postfix/smtpd[2212]: connect from 81.208.68.208.static.dnsptr.net[208.68.xxx.xxx] Feb 24 09:48:01 VO13421 postfix/smtpd[2212]: warning: restriction `smtpd_data_restrictions' after `permit' is ignored Feb 24 09:48:01 VO13421 postfix/smtpd[2212]: 65C68120321: client=81.208.68.208.static.dnsptr.net[208.68.xxx.xxx] Feb 24 09:48:29 VO13421 postfix/smtpd[2212]: warning: restriction `smtpd_data_restrictions' after `permit' is ignored Feb 24 09:48:29 VO13421 postfix/smtpd[2212]: 6BDFA120321: client=81.208.68.208.static.dnsptr.net[208.68.xxx.xxx] Feb 24 09:48:29 VO13421 postfix/cleanup[2216]: 6BDFA120321: message-id= Feb 24 09:48:29 VO13421 postfix/qmgr[2042]: 6BDFA120321: from=, size=395, nrcpt=1 (queue active) Feb 24 09:48:29 VO13421 postfix/virtual[2217]: 6BDFA120321: to=, relay=virtual, delay=0.28, delays=0.25/0.02/0/0.01, dsn=2.0.0, status=sent (delivered to maildir) Feb 24 09:48:29 VO13421 postfix/qmgr[2042]: 6BDFA120321: removed Feb 24 09:48:30 VO13421 postfix/smtpd[2212]: disconnect from 81.208.68.208.static.dnsptr.net[208.68.xxx.xxx] iptables -L -n -v --line on example.com yields the following. Anyone know an iptables command to see the port forwarding? Also, it seems to accept all traffic, that's probably bad right? ;] num pkts bytes target prot opt in out source destination 1 14041 1023K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 Chain FORWARD (policy ACCEPT 0 packets, 0 bytes) num pkts bytes target prot opt in out source destination 1 338 20722 ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 Chain OUTPUT (policy ACCEPT 419K packets, 425M bytes) num pkts bytes target prot opt in out source destination 1 13711 2824K ACCEPT all -- * * 0.0.0.0/0 0.0.0.0/0 postconf -n results in: alias_database = hash:/etc/postfix/aliases alias_maps = hash:/etc/postfix/aliases append_dot_mydomain = no biff = no config_directory = /etc/postfix delay_warning_time = 4h disable_vrfy_command = yes inet_interfaces = all local_recipient_maps = mailbox_size_limit = 0 masquerade_domains = mail.example.com mail1.example.com masquerade_exceptions = root maximal_backoff_time = 8000s maximal_queue_lifetime = 7d minimal_backoff_time = 1000s mydestination = mynetworks = 127.0.0.0/8 [::ffff:127.0.0.0]/104 [::1]/128 mynetworks_style = host myorigin = example.com readme_directory = no recipient_delimiter = + relayhost = smtp_helo_timeout = 60s smtp_tls_session_cache_database = btree:${data_directory}/smtp_scache smtpd_banner = $myhostname SMTP $mail_name smtpd_client_restrictions = reject_rbl_client sbl.spamhaus.org, reject_rbl_client blackholes.easynet.nl, reject_rbl_client dnsbl.njabl.org smtpd_delay_reject = yes smtpd_hard_error_limit = 12 smtpd_helo_required = yes smtpd_helo_restrictions = permit_mynetworks, warn_if_reject reject_non_fqdn_hostname, reject_invalid_hostname, permit smtpd_recipient_limit = 16 smtpd_recipient_restrictions = reject_unauth_pipelining, permit_mynetworks, reject_non_fqdn_recipient, reject_unknown_recipient_domain, reject_unauth_destination, permit smtpd_data_restrictions = reject_unauth_pipelining smtpd_sender_restrictions = permit_mynetworks, warn_if_reject reject_non_fqdn_sender, reject_unknown_sender_domain, reject_unauth_pipelining, permit smtpd_soft_error_limit = 3 smtpd_tls_cert_file = /etc/ssl/certs/ssl-cert-snakeoil.pem smtpd_tls_key_file = /etc/ssl/private/ssl-cert-snakeoil.key smtpd_tls_session_cache_database = btree:${data_directory}/smtpd_scache smtpd_use_tls = yes unknown_local_recipient_reject_code = 450 virtual_alias_maps = mysql:/etc/postfix/mysql_alias.cf virtual_gid_maps = mysql:/etc/postfix/mysql_gid.cf virtual_mailbox_base = /var/spool/mail/virtual virtual_mailbox_domains = mysql:/etc/postfix/mysql_domains.cf virtual_mailbox_maps = mysql:/etc/postfix/mysql_mailbox.cf virtual_uid_maps = mysql:/etc/postfix/mysql_uid.cf

Read the article

Search Results

Search found 9816 results on 393 pages for 'blade servers'.

Page 377/393 | < Previous Page | 373 374 375 376 377 378 379 380 381 382 383 384 | Next Page >

- by Jeff

- by Jesse

- by al4

- by Lockhead

- by Starsky

- by Matthew

- by womp

- by Eric Falsken

- by dandv

- by Killrawr

- by ramdaz

- by Max

- by Vulpo

- by Harold Smith

- by jackie

- by marc.riera

- by Notre1

- by pitosalas

- by ubunut

- by rihatum

- by Johny Skovdal

- by Leonid Bugaev

- by Utoah

- by M'vy

- by Juzzam

< Previous Page | 373 374 375 376 377 378 379 380 381 382 383 384 | Next Page >