Search Results

Search found 2366 results on 95 pages for 'cat mitch'.

Page 36/95 | < Previous Page | 32 33 34 35 36 37 38 39 40 41 42 43  | Next Page >

  • Simple mdadm RAID 1 not activating spare

    - by Nick Liu
    I had created two 2TB HDD partitions (/dev/sdb1 and /dev/sdc1) in a RAID 1 array called /dev/md0 using mdadm on Ubuntu 12.04 LTS Precise Pangolin. The command sudo mdadm --detail /dev/md0 used to indicate both drives as active sync. Then, for testing, I failed /dev/sdb1, removed it, then added it again with the command sudo mdadm /dev/md0 --add /dev/sdb1 watch cat /proc/mdstat showed a progress bar of the array rebuilding, but I wouldn't spend hours watching it, so I assumed that the software knew what it was doing. After the progress bar was no longer showing, cat /proc/mdstat displays: md0 : active raid1 sdb1[2](S) sdc1[1] 1953511288 blocks super 1.2 [2/1] [U_] And sudo mdadm --detail /dev/md0 shows: /dev/md0: Version : 1.2 Creation Time : Sun May 27 11:26:05 2012 Raid Level : raid1 Array Size : 1953511288 (1863.01 GiB 2000.40 GB) Used Dev Size : 1953511288 (1863.01 GiB 2000.40 GB) Raid Devices : 2 Total Devices : 2 Persistence : Superblock is persistent Update Time : Mon May 28 11:16:49 2012 State : clean, degraded Active Devices : 1 Working Devices : 2 Failed Devices : 0 Spare Devices : 1 Name : Deltique:0 (local to host Deltique) UUID : 49733c26:dd5f67b5:13741fb7:c568bd04 Events : 32365 Number Major Minor RaidDevice State 1 8 33 0 active sync /dev/sdc1 1 0 0 1 removed 2 8 17 - spare /dev/sdb1 I've been told that mdadm automatically replaces removed drives with spares, but /dev/sdb1 isn't being moved into the expected position, RaidDevice 1. UPDATE (30 May 2012): A badblocks destructive read-write test of the entire /dev/sdb yielded no errors as expected; both HDDs are new. As of the latest edit, I assembled the array with this command: sudo mdadm --assemble --force --no-degraded /dev/md0 /dev/sdb1 /dev/sdc1 The output was: mdadm: /dev/md0 has been started with 1 drive (out of 2) and 1 rebuilding. Rebuilding looks like it's progressing normally: md0 : active raid1 sdc1[1] sdb1[2] 1953511288 blocks super 1.2 [2/1] [U_] [>....................] recovery = 0.6% (13261504/1953511288) finish=2299.7min speed=14060K/sec unused devices: <none> I'm now waiting on this rebuild, but I'm expecting /dev/sdb1 to become a spare just like the five or six times that I've tried rebuilding before. UPDATE (31 May 2012): Yeah, it's still a spare. Ugh! UPDATE (01 June 2012): I'm trying Adrian Kelly's suggested command: sudo mdadm --assemble --update=resync /dev/md0 /dev/sdb1 /dev/sdc1 Waiting on the rebuild now... My questions are: Why isn't the spare drive becoming active sync? How can I make the spare drive become active?

    Read the article

  • how to check open ports of bunch of website at once with nmap/linux?

    - by austin powers
    hi , I want to use nmap in such a way that I could check bunch of server's port at once for checking whether their particular port is open or not? right now I have 10 ip addresses but in future this could be more . I know the very basic command in linux like cat/nano/piping but I don't know how can I feed to nmap the list of my servers to open them one by one and return the result.

    Read the article

  • Why ethernet cables must be ended with specific arrangement

    - by adopilot
    I just accepted that ethernet cables CAT 5 and more must be ended with specific arrangement. I learned when I ending my cables to take attention that either end must be in same arrangement(568A or 568B ). Sometime I get stacked with my fellow servant that they claim that Cable should work if just arrangement at both side are same even if it is not in 568A or 568B layouts. My experience said that it is not true, but I am now looking for some technical argument to prove that.

    Read the article

  • How to free up block device that is mounted to an inaccessible place?

    - by Vi
    root@vi-notebook:~# cat /proc/mounts | grep raidy /dev/md0 /root/e/i/wpc2/boot/mnt/raidy reiserfs ro,nosuid,nodev,noexec,noatime 0 0 root@vi-notebook:~# umount -n /root/e/i/wpc2/boot/mnt/raidy umount: /root/e/i/wpc2/boot/mnt/raidy: Transport endpoint is not connected root@vi-notebook:~# mount /dev/md/raidy /mnt/raidy/ -t reiserfs -o nodev,nosuid,noexec,acl,noatime mount: /dev/md0 already mounted or /mnt/raidy/ busy The only workaround I found is: root@vi-notebook:~# losetup /dev/loop0 /dev/md/raidy root@vi-notebook:~# mount /dev/loop0 /mnt/raidy/ -t reiserfs -o nodev,nosuid,noexec,acl,noatime

    Read the article

  • How can I get a list of linux users/group?

    - by Sergei
    Hello, guys, I need to get and filter the linux users list like: username1 username1_group username2 username2_group ... usernameN usernameN_group I've tried, but only that I've found is: cat /etc/passwd | grep /home | cut -d: -f1 It gives me the list of users in /home folder. But how can I add the group name to each of them? Thanks in advance!

    Read the article

  • Nagios shell script cannot be executed

    - by MeinAccount
    I'm trying to monitor GitLab with nagios. I've created the following command definition and shell script but when checking the service I'm receiving the following e-mail. How can I solve this? The file is executable. [...] nagios : 3 incorrect password attempts ; TTY=unknown ; PWD=/ ; USER=git ; COMMAND=/bin/bash -c /var/lib/nagios/custom_plugins/check_gitlab.sh Command definition: define command { command_name custom_check_gitlab command_line /var/lib/nagios/custom_plugins/check_gitlab.sh } Shell script: #! /bin/sh # [...] RAILS_ENV="production" # Script variable names should be lower-case not to conflict with internal /bin/sh variables such as PATH, EDITOR or SHELL. app_root="/home/git/gitlab" app_user="git" unicorn_conf="$app_root/config/unicorn.rb" pid_path="$app_root/tmp/pids" socket_path="$app_root/tmp/sockets" web_server_pid_path="$pid_path/unicorn.pid" sidekiq_pid_path="$pid_path/sidekiq.pid" ### Here ends user configuration ### # Switch to the app_user if it is not he/she who is running the script. if [ "$USER" != "$app_user" ]; then sudo -u "$app_user" -H -i $0 "$@"; exit; fi # Switch to the gitlab path, if it fails exit with an error. if ! cd "$app_root" ; then echo "Failed to cd into $app_root, exiting!"; exit 1 fi ### Init Script functions check_pids(){ if ! mkdir -p "$pid_path"; then echo "Could not create the path $pid_path needed to store the pids." exit 1 fi # If there exists a file which should hold the value of the Unicorn pid: read it. if [ -f "$web_server_pid_path" ]; then wpid=$(cat "$web_server_pid_path") else wpid=0 fi if [ -f "$sidekiq_pid_path" ]; then spid=$(cat "$sidekiq_pid_path") else spid=0 fi } # Checks whether the different parts of the service are already running or not. check_status(){ check_pids # If the web server is running kill -0 $wpid returns true, or rather 0. # Checks of *_status should only check for == 0 or != 0, never anything else. if [ $wpid -ne 0 ]; then kill -0 "$wpid" 2>/dev/null web_status="$?" else web_status="-1" fi if [ $spid -ne 0 ]; then kill -0 "$spid" 2>/dev/null sidekiq_status="$?" else sidekiq_status="-1" fi } check_pids check_status if [ "$web_status" != "0" -a "$sidekiq_status" != "0" ]; then echo "GitLab is not running." exit 2 fi if [ "$web_status" != "0" ]; then printf "The GitLab Unicorn webserver is \033[31mnot running\033[0m.\n" exit 1 fi if [ "$sidekiq_status" != "0" ]; then printf "The GitLab Sidekiq job dispatcher is \033[31mnot running\033[0m.\n" exit 1 fi if [ "$web_status" = "0" -a "$sidekiq_status" = "0" ]; then printf "GitLab and all it's components are \033[32mup and running\033[0m.\n" exit 0 fi

    Read the article

  • Does gunzip work in memory or does it write to disk?

    - by Ryan Detzel
    We have our log files gzipped to save space. Normally we keep them compressed and just do gunzip -c file.gz | grep 'test' to find important information but we're wondering if it's quicker to keep the files uncompressed and then do the grep. cat file | grep 'test' There has been some discussions about how gzip works if it would make sense that if it reads it into memory and unzips then the first one would be faster but if it doesn't then the second one would be faster. Does anyone know how gzip uncompresses data?

    Read the article

  • Is it possible to prevent SCP while still allowing SSH access?

    - by Jason
    Using Solaris and Linux servers and OpenSSH, is it possible to prevent users from copying files using "scp" while still allowing shell access with "ssh"? I realize that 'ssh $server "cat file" ' type file accesses are much harder to prevent, but I need to see about stopping "scp" for starters. Failing that, is there a way to reliably log all SCP access on the server side through syslog?

    Read the article

  • Console session stuck ESXi unsuported mode

    - by Datapimp23
    Hi guys, I was editing the /etc/inetd.conf file in VI to enable SSH and when I believed to have saved it. I entered cat and pressed enter accidently. Now I seem to be stuck in a black screen where I can enter characters but can't exit from it. How would I go on my way to get back to the normal console without rebooting the ESXi. Thanks

    Read the article

  • Why ethernet calbes must be ended with specific arrangement

    - by adopilot
    I just accepted that ethernet cables CAT 5 and more must be ended with specific arrangement. I learned when I ending my cables to take attention that either end must be in same arrangement(568A or 568B ). Sometime I get stacked with my fellow servant that they claim that Cable should work if just arrangement at both side are same even if it is not in 568A or 568B layouts. My experience said that it is not true, but I am now looking for some technical argument to prove that.

    Read the article

  • nginx + Jetty - thousands of connections stuck in LAST_ACK

    - by virulence
    I have a FreeBSD machine with jails -- two in particular, one that runs nginx and another that runs a Java program that accepts requests via Jetty (embedded mode) Jetty receives upwards of 500 requests/sec constantly and there has been an issue lately where I will constantly have over 60,000 connections in the LAST_ACK state between nginx and jetty. Distribution of all connections (includes some other services, particularly php-fpm) root@host:/root # netstat -an > conns.txt root@host:/root # cat conns.txt | awk '{print $6}' | sort | uniq -c | sort -n 18 LISTEN 112 CLOSING 485 ESTABLISHED 650 FIN_WAIT_2 1425 FIN_WAIT_1 3301 TIME_WAIT 64215 LAST_ACK Distribution of nginx - jetty connections root@host:/root # cat conns.txt | grep '10.10.1.57' | awk '{print $6}' | sort | uniq -c | sort -n 1 3 CLOSE_WAIT 3 LISTEN 18 FIN_WAIT_2 125 ESTABLISHED 64193 LAST_ACK I'd prefer every request to fully close the connection. Clients requests are about 10 minutes apart from each other so connections must be closed. Some of the connections, tcp4 0 0 10.10.1.50.46809 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46805 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46797 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46794 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46790 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46789 10.10.1.57.9050 LAST_ACK tcp4 0 0 10.10.1.50.46771 10.10.1.57.9050 LAST_ACK etc.. On Jetty's end I've set maxIdleTime to 2000 -- before this all connections were in ESTABLISHED but they are now LAST_ACK On Jetty's end I've set Connection: close (i.e response.setHeader(HttpHeaders.CONNECTION, HttpHeaderValues.CLOSE);) Jetty never reports a lot of open connections -- always very few. PF/IPFW is not currently being used nginx - reset_timedout_connection is on I cannot figure out how to get nginx or jetty to forcibly close the connection, is this simply something that needs to be fixed in Jetty so that it fully closes the socket after the request finishes? Thanks a lot in advance EDIT: forgot my nginx config for the proxy setup- proxy_pass http://10.10.1.57:9050; proxy_set_header HTTP_X_GEOIP $http_x_geoip; proxy_set_header GEOIP_COUNTRY_CODE $geoip_country_code; proxy_set_header X-Real-IP $remote_addr; proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for; proxy_set_header Host $http_host; proxy_set_header Connection ""; proxy_http_version 1.1; EDIT2: Forcing Jetty to close the connection via request.getConnection().getEndPoint().close() does nothing -- it's obvious the connection IS being closed (as it's in LAST_ACK) but why isn't it getting past this? Is Nginx keeping the connection open to the backend for some reason?

    Read the article

  • Serial communication in between VM and ESX?

    - by user41685
    I am in verse to understand the serial communication between guest and host.I have just added a serial port through VI Client selecting "Use serial port on host"(host being ESX host).If I am on VM and run : ll /dev/ttyS0 I understand that it should get displayed through ESX Host. So On ESX Host, I typed: cat /dev/ttyS0 But nothing worked !!!

    Read the article

  • Run a local script on a remote server using ssh with out having to worry about quotes

    - by Michael Irey
    So I have been running local scripts fine on a remote server: ssh user@server '`cat local-script.sh`' However, today I have a script that has both single and double quotes in it. Which causes the script to fail because the output of cat local-script.sh is wrapped in quotes. With out modifying the script itself, is there a better way to handle this? I thought this may work: ssh user@server $(<local-script.sh) But is does not seem to do anything...

    Read the article

  • Ubuntu karmic doesn't have the version file?

    - by Blankman
    I was following this tutorial: http://articles.slicehost.com/2010/4/23/ubuntu-karmic-setup-part-2 On my ubuntu karmic version (on ec2, ami from elastic) I don't see this file: cat /etc/lsb-release It just isn't there. How can I see the version of the O/S? And shouldn't that file be there? Some people have told me ubuntu isn't really used as a server, is that true or is the trend making it more viable?

    Read the article

  • How to interpret the contents of /proc/bus/pci/devices ?

    - by vivekian2
    The first few fields of 'cat /proc/bus/pci/devices' are understandable. Field 1 - BusDevFunc Field 2 - Vendor Id + Device Id Field 3 - Interrupt Line Field 4 - BAR 0 and the rest of the BAR registers (0 - 5) after that. After the BAR registers are printed out, what are the other fields? Specifically, what PCI configuration space registers(offsets) are printed out?

    Read the article

  • Failed to start up after upgrading software in ubuntu 10.10

    - by Landy
    I've been running Ubuntu 10.10 in a physical x86-64 machine. Today Update Manager reminded me that there are some updates to install and I confirmed the action. I should had read the update list but I didn't. I can only remember there is an update about cups. After the upgrading, Update Manager requires a restart and I confirmed too. But after the restart, the computer can't start up. There are errors in the console. Begin: Running /scripts/init-premount ... done. Begin: Mounting root file system ... Begin: Running /scripts/local-top ... done. [xxx]usb 1-8: new high speed USB device using ehci_hcd and address 3 [xxx]usb 2-1: new full speed USB device using ohci_hcd and address 2 [xxx]hub 2-1:1.0: USB hub found [xxx]hub 2-1:1.0: 4 ports detected [xxx]usb 2-1.1: new low speed USB device using ohci_hcd and address 3 Gave up waiting for root device. Common probles: - Boot args (cat /proc/cmdline) - Check rootdelay=(did the system wait long enough) - Check root= (did the system wait for the right device?) - Missing modules (cat /proc/modules; ls /dev) FATAL: Could not load /lib/modules/2.6.35-22-generic/modules.dep: No such file or directory FATAL: Could not load /lib/modules/2.6.35-22-generic/modules.dep: No such file or directory ALERT! /dev/sda1 does not exist. Dropping to a shell! BusyBox v1.15.3 (Ubuntu 1:1.15.3-1ubuntu5) built-in shell(ash) Enter 'help' for a list of built-in commands. (initramfs)[cursor is here] At the moment, I can't input anything in the console. The keyboard doesn't work at all. What's wrong? How can I check boot args or "root=" as suggested? How can I fix this issue? Thanks. =============== PS1: the /dev/sda1 is type ext4 (rw,nosuid,nodev) PS2: the /dev/sda1 can be mounted and accessed successfully under SUSE 11 SP1 x64.

    Read the article

  • how do I start GIT daemon automatically under CentOS 4.8 ?

    - by ck2
    Apparently my server is running CentOS 4.8 with Cpanel uname -a 2.6.9-023stab048.6-enterprise #1 SMP MSK 2008 i686 i686 i386 GNU/Linux cat /etc/redhat-release CentOS release 4.8 (Final) I'd prefer to install it as a service but I cannot seem to install "yum git-daemon" there is no package available for CentOS 4.8 (when I try to include another repos for it I get too many dependency failures) So what's the easiest way to just start it? Typically this is how I do it from CLI git daemon --detach --user=git --group=git Thanks for any help!

    Read the article

  • How do I tell mdadm to start using a missing disk in my RAID5 array again?

    - by Jon Cage
    I have a 3-disk RAID array running in my Ubuntu server. This has been running flawlessly for over a year but I was recently forced to strip, move and rebuild the machine. When I had it all back together and ran up Ubuntu, I had some problems with disks not being detected. A couple of reboots later and I'd solved that issue. The problem now is that the 3-disk array is showing up as degraded every time I boot up. For some reason it seems that Ubuntu has made a new array and added the missing disk to it. I've tried stopping the new 1-disk array and adding the missing disk, but I'm struggling. On startup I get this: root@uberserver:~# cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md_d1 : inactive sdf1[2](S) 1953511936 blocks md0 : active raid5 sdg1[2] sdc1[3] sdb1[1] sdh1[0] 2930279808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU] I have two RAID arrays and the one that normally pops up as md1 isn't appearing. I read somewhere that calling mdadm --assemble --scan would re-assemble the missing array so I've tried first stopping the existing array that ubuntu started: root@uberserver:~# mdadm --stop /dev/md_d1 mdadm: stopped /dev/md_d1 ...and then tried to tell ubuntu to pick the disks up again: root@uberserver:~# mdadm --assemble --scan mdadm: /dev/md/1 has been started with 2 drives (out of 3). So that's started md1 again but it's not picking up the disk from md_d1: root@uberserver:~# cat /proc/mdstat Personalities : [linear] [multipath] [raid0] [raid1] [raid6] [raid5] [raid4] [raid10] md1 : active raid5 sde1[1] sdf1[2] 3907023872 blocks level 5, 64k chunk, algorithm 2 [3/2] [_UU] md_d1 : inactive sdd1[0](S) 1953511936 blocks md0 : active raid5 sdg1[2] sdc1[3] sdb1[1] sdh1[0] 2930279808 blocks level 5, 64k chunk, algorithm 2 [4/4] [UUUU] What's going wrong here? Why is Ubuntu trying to pick up sdd into a different array? How do I get that missing disk back home again? [Edit] - After adding the md1 to mdadm.conf it now tries to mount the array on startup but it's still missing the disk. If I tell it to try and assemble automatically I get the impression it know it needs sdd but can't use it: root@uberserver:~# mdadm --assemble --scan /dev/md1: File exists mdadm: /dev/md/1 already active, cannot restart it! mdadm: /dev/md/1 needed for /dev/sdd1... What am I missing?

    Read the article

  • Is Assert.Fail() considered bad practice?

    - by Mendelt
    I use Assert.Fail a lot when doing TDD. I'm usually working on one test at a time but when I get ideas for things I want to implement later I quickly write an empty test where the name of the test method indicates what I want to implement as sort of a todo-list. To make sure I don't forget I put an Assert.Fail() in the body. When trying out xUnit.Net I found they hadn't implemented Assert.Fail. Of course you can always Assert.IsTrue(false) but this doesn't communicate my intention as well. I got the impression Assert.Fail wasn't implemented on purpose. Is this considered bad practice? If so why? @Martin Meredith That's not exactly what I do. I do write a test first and then implement code to make it work. Usually I think of several tests at once. Or I think about a test to write when I'm working on something else. That's when I write an empty failing test to remember. By the time I get to writing the test I neatly work test-first. @Jimmeh That looks like a good idea. Ignored tests don't fail but they still show up in a separate list. Have to try that out. @Matt Howells Great Idea. NotImplementedException communicates intention better than assert.Fail() in this case @Mitch Wheat That's what I was looking for. It seems it was left out to prevent it being abused in another way I abuse it.

    Read the article

  • How to start Cygwin's NFS server in read-write mode?

    - by Vi
    Installed Cyginw NFS server. It works. But I can't make it allow writing to the filesystem. Why does it fail? Server: $ cat /etc/exports #/ 10.99.98.2(rw,no_root_squash) /cygdrive/c/foranevia *(rw,no_squash_root,anon_uid=0,anon_gid=0,no_subtree_check) Client: root@vi-notebook:/mnt# mount wpc:/cygdrive/c/foranevia nfs root@vi-notebook:/mnt# mkdir nfs/qqq mkdir: cannot create directory `nfs/qqq': Read-only file system

    Read the article

  • How do I prevent TCP connection freezes over an OpenVPN network?

    - by Jason R
    New details added at the end of this question; it's possible that I'm zeroing in on the cause. I have a UDP OpenVPN-based VPN set up in tap mode (I need tap because I need the VPN to pass multicast packets, which doesn't seem to be possible with tun networks) with a handful of clients across the Internet. I've been experiencing frequent TCP connection freezes over the VPN. That is, I will establish a TCP connection (e.g. an SSH connection, but other protocols have similar issues), and at some point during the session, it seems that traffic will cease being transmitted over that TCP session. This seems to be related to points at which large data transfers occur, such as if I execute an ls command in an SSH session, or if I cat a long log file. Some Google searches turn up a number of answers like this previous one on Server Fault, indicating that the likely culprit is an MTU issue: that during periods of high traffic, the VPN is trying to send packets that get dropped somewhere in the pipes between the VPN endpoints. The above-linked answer suggests using the following OpenVPN configuration settings to mitigate the problem: fragment 1400 mssfix This should limit the MTU used on the VPN to 1400 bytes and fix the TCP maximum segment size to prevent the generation of any packets larger than that. This seems to mitigate the problem a bit, but I still frequently see the freezes. I've tried a number of sizes as arguments to the fragment directive: 1200, 1000, 576, all with similar results. I can't think of any strange network topology between the two ends that could trigger such a problem: the VPN server is running on a pfSense machine connected directly to the Internet, and my client is also connected directly to the Internet at another location. One other strange piece of the puzzle: if I run the tracepath utility, then that seems to band-aid the problem. A sample run looks like: [~]$ tracepath -n 192.168.100.91 1: 192.168.100.90 0.039ms pmtu 1500 1: 192.168.100.91 40.823ms reached 1: 192.168.100.91 19.846ms reached Resume: pmtu 1500 hops 1 back 64 The above run is between two clients on the VPN: I initiated the trace from 192.168.100.90 to the destination of 192.168.100.91. Both clients were configured with fragment 1200; mssfix; in an attempt to limit the MTU used on the link. The above results would seem to suggest that tracepath was able to detect a path MTU of 1500 bytes between the two clients. I would assume that it would be somewhat smaller due to the fragmentation settings specified in the OpenVPN configuration. I found that result somewhat strange. Even stranger, however: if I have a TCP connection in the stalled state (e.g. an SSH session with a directory listing that froze in the middle), then executing the tracepath command shown above causes the connection to start up again! I can't figure out any reasonable explanation for why this would be the case, but I feel like this might be pointing toward a solution to ultimately eradicate the problem. Does anyone have any recommendations for other things to try? Edit: I've come back and looked at this a bit further, and have found only more confounding information: I set the OpenVPN connection to fragment at 1400 bytes, as shown above. Then, I connected to the VPN from across the Internet and used Wireshark to look at the UDP packets that were sent to the VPN server while the stall occurred. None were greater than the specified 1400 byte count, so the fragmentation seems to be functioning properly. To verify that even a 1400-byte MTU would be sufficient, I pinged the VPN server using the following (Linux) command: ping <host> -s 1450 -M do This (I believe) sends a 1450-byte packet with fragmentation disabled (I at least verified that it didn't work if I set it to an obviously-too-large value like 1600 bytes). These seem to work just fine; I get replies back from the host with no issue. So, maybe this isn't an MTU issue at all. I'm just confused as to what else it might be! Edit 2: The rabbit hole just keeps getting deeper: I've now isolated the problem a bit more. It seems to be related to the exact OS that the VPN client uses. I have successfully duplicated the problem on at least three Ubuntu machines (versions 12.04 through 13.04). I can reliably duplicate an SSH connection freeze within a minute or so by just cat-ing a large log file. However, if I do the same test using a CentOS 6 machine as a client, then I don't see the problem! I've tested using the exact same OpenVPN client version as I was using on the Ubuntu machines. I can cat log files for hours without seeing the connection freeze. This seems to provide some insight as to the ultimate cause, but I'm just not sure what that insight is. I have examined the traffic over the VPN using Wireshark. I'm not a TCP expert, so I'm not sure what to make of the gory details, but the gist is that at some point, a UDP packet gets dropped due to the limited bandwidth of the Internet link, causing TCP retransmissions inside the VPN tunnel. On the CentOS client, these retransmissions occur properly and things move on happily. At some point with the Ubuntu clients, though, the remote end starts retransmitting the same TCP segment over and over (with the transmit delay increasing between each retransmission). The client sends what looks like a valid TCP ACK to each retransmission, but the remote end still continues to transmit the same TCP segment periodically. This extends ad infinitum and the connection stalls. My question here would be: Does anyone have any recommendations for how to troubleshoot and/or determine the root cause of the TCP issue? It's as if the remote end isn't accepting the ACK messages sent by the VPN client. One common difference between the CentOS node and the various Ubuntu releases is that Ubuntu has a much more recent Linux kernel version (from 3.2 in Ubuntu 12.04 to 3.8 in 13.04). A pointer to some new kernel bug maybe? I'm assuming that if that were so, then I wouldn't be the only one experiencing the problem; I don't think this seems like a particularly exotic setup.

    Read the article

< Previous Page | 32 33 34 35 36 37 38 39 40 41 42 43  | Next Page >