Search Results

Search found 400 results on 16 pages for 'checksum'.

Page 6/16 | < Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >

File corruption after copying files in Windows 7 64 bit using two methods

- by DustByte

I have 5000 pictures and other files in a directory taking up 35 GB. I want to duplicate this directory. Method 1: I do a simple copy and paste of the directory in explorer. I have the habit of checking the checksums after copying important files. In this case I noticed that around 2000 files failed the MD5 test. At a closer inspection of a randomly chosen JPEG with different checksums it turns out that some XMP metadata had changed. In particular, the tag <MicrosoftPhoto:DateAcquired> had changed the date from 2009 to today (possibly around the time I was copying the files). I have no idea what triggered this XMP data to be changed and exactly when it was changed and why for these particular files, but at least it seems to explain the checksum discrepancy. Method 2: As I want the exact files to be duplicated, I tried the program FreeFileSync to mirror the directory, hoping no XMP metadata would mysteriously change. A checksum test in addition to a thorough file comparison test in FreeFileSync lead to two similar but yet different results: 31 files fail the checksum test, 23 files fail the file comparison test. The smaller set is not entirely contained in the bigger set, although many files occur in both. What is alarming here is that not only JPEGs are flagged as altered but also som AVIs, MPGs and a large 7-zip file. Closer inspection of a JPEG indicates that it is indeed corrupt: the bottom half of the picture is simply plain gray. Due to the size of the 7-zip file, I have not been able to pin down the discrepancy. Note, in both methods, every file has its correct file size after being copied. Question: Any thoughts on what is possibly going on here? I have never had this problem before, and I am now terrified that files get corrupted after simple actions like copy/paste and file sync. Even if I manage to successfully copy the files somehow, I would still like an explanation to this.

Read the article
Yum Error Installing Git from kernel.org Repo

- by Lance

I want to install the latest version of Git using yum and the RPM repository on kernel.org, but adding the repo to yum.repos.d causes yum to fail with checksum errors. The prevailing solution to this issue seems to be to simply use the repository at Webtatic as answered here on superuser. I know I can also install an older version of Git using the EPEL repo, or compile from the latest source tarball, but honestly I want to understand why I'm having issues using the kernel.org repo. Here’s the workflow, after a clean install of CentOS 5.5 and "yum update": [root]# wget -P /etc/yum.repos.d/ http://kernel.org/pub/software/scm/git/RPMS/git.repo [root]# yum clean all [root]# yum repolist Loaded plugins: fastestmirror Determining fastest mirrors * addons: mirrors.netdna.com * base: mirror.clarkson.edu * epel: serverbeach1.fedoraproject.org * extras: centos.mirror.nac.net * updates: mirror.cogentco.com addons | 951 B 00:00 addons/primary | 202 B 00:00 base | 2.1 kB 00:00 base/primary_db | 1.6 MB 00:01 epel | 3.7 kB 00:00 epel/primary_db | 2.8 MB 00:01 extras | 2.1 kB 00:00 extras/primary_db | 188 kB 00:00 git | 1.2 kB 00:00 git/primary | 155 kB 00:00 http://www.kernel.org/pub/software/scm/git/RPMS/i386/repodata/primary.xml.gz: [Errno -3] Error performing checksum Trying other mirror. git/primary | 155 kB 00:00 http://www.kernel.org/pub/software/scm/git/RPMS/i386/repodata/primary.xml.gz: [Errno -3] Error performing checksum Trying other mirror. Error: failure: repodata/primary.xml.gz from git: [Errno 256] No more mirrors to try. Any suggestions as to a solution, or details why the kernel.org repo has this issue? (Sorry I can't include more links to my references, but I don't have the reputation for that yet.)

Read the article
Yum Error Installing Git from kernel.org Repo

- by Lance

I want to install the latest version of Git using yum and the RPM repository on kernel.org, but adding the repo to yum.repos.d causes yum to fail with checksum errors. The prevailing solution to this issue seems to be to simply use the repository at Webtatic as answered here on superuser. I know I can also install an older version of Git using the EPEL repo, or compile from the latest source tarball, but honestly I want to understand why I'm having issues using the kernel.org repo. Here’s the workflow, after a clean install of CentOS 5.5 and "yum update": [root]# wget -P /etc/yum.repos.d/ http://kernel.org/pub/software/scm/git/RPMS/git.repo [root]# yum clean all [root]# yum repolist Loaded plugins: fastestmirror Determining fastest mirrors * addons: mirrors.netdna.com * base: mirror.clarkson.edu * epel: serverbeach1.fedoraproject.org * extras: centos.mirror.nac.net * updates: mirror.cogentco.com addons | 951 B 00:00 addons/primary | 202 B 00:00 base | 2.1 kB 00:00 base/primary_db | 1.6 MB 00:01 epel | 3.7 kB 00:00 epel/primary_db | 2.8 MB 00:01 extras | 2.1 kB 00:00 extras/primary_db | 188 kB 00:00 git | 1.2 kB 00:00 git/primary | 155 kB 00:00 http://www.kernel.org/pub/software/scm/git/RPMS/i386/repodata/primary.xml.gz: [Errno -3] Error performing checksum Trying other mirror. git/primary | 155 kB 00:00 http://www.kernel.org/pub/software/scm/git/RPMS/i386/repodata/primary.xml.gz: [Errno -3] Error performing checksum Trying other mirror. Error: failure: repodata/primary.xml.gz from git: [Errno 256] No more mirrors to try. Any suggestions as to a solution, or details why the kernel.org repo has this issue? (Sorry I can't include more links to my references, but I don't have the reputation for that yet.)

Read the article
Modulo in JavaScript - large number

- by Benedikt R.

Hi! I try to calculate with JS' modulo function, but don't get the right result (which should be 1). Here is a hardcoded piece of code. var checkSum = 210501700012345678131468; alert(checkSum % 97); Result: 66 Whats the problem here? Regards, Benedikt

Read the article
Upload large files via a webpage

- by Hultner

What way is the best way to let users upload large files from there webbrowser to a server. I'm talking 200MB+ possible up to a few gigatyes. I have been thinking of a few possible solutions to the problem (not tried them yet) and this is basically the things I came up with. Server download speed will not be a problem but the users connection possibly could. Having some sort of applet on the client side written in Java or Flash which sends the file in parts (is this possible with an applet) to a php/other script on the server and a checksum+ some other info about the file. On the server scripts all the parts and the info file is saved in a temporary directory wich has a unique name based on the checksum of the file and the ip of the user. When the last chunk is sent the applet sends a signal to the server saying it's finished and the server put the file together in the right location. If a chunk doesn't match the checksum for that part the server will send a response to the applet telling it to reupload that chunk. I don't know how important the checksum checking is since it's all tcpackages, someone with more insigth migth be able to answer on that. This is probably the worst way, changing the settings on your server to allow huge fileuploads via an inputfiel. Do it like a normal transfer. User an uploadmanager which does pretty much the same thing as applet i mentioned above. Pros of the first is probably that it would most likely be rather secure, you could show progress as well and possibly resume an upload if ip hasn't changed and do a threaded upload of the chunks. Cons of the first is that the user will need flash/java for it to work. Pros of the 2nd is that it will pretty much work for everyone but cons are big, first there's no way resuming an intruppted download and if something is wrong the whole file would have to be reuploaded is a few of cons. For the third one the pros is pretty muc the same as for the first but the cons is that the user would have to download an application to their computer and run and the application will have to be have to be compatible with their computer and OS. Another way may be a combination of two. Lets say an applet for bigger or more files and a simple input which is rather restricted to maybe max 10-20MB for smaller files and comability. There are probably other much smarter ways to tackle this and that's why I'm asking for advice here on SO.

Read the article
Read part of a file in PHP

- by Johan

I would like to read the last 1 megabyte of a MP3 file and calculate SHA1 checksum for just that part of the file. The reason that I would want this is that when I'm looking for duplicate MP3's, the header info (song title, album etc.) can differ even though it's the exakt same audio file, so I figured I would be better of to checksum a part of the file at the end instead of the whole one. Is there an efficient way of doing this?

Read the article
Starting MySQL database server: mysqld . . . . . . . . . . . . . . failed!

- by meder

I restarted my VPS box ( manually/hard restart ) and ever since, mysql fails to start for whatever reason. I did a tail /var/log/syslog and I get this: Feb 20 11:49:33 kyrgyznews mysqld[11461]: ) ;InnoDB: End of page dump 575 Feb 20 11:49:33 kyrgyznews mysqld[11461]: 110220 11:49:33 InnoDB: Page checksum 1045788239, prior-to-4.0.14-form checksum 236985105 576 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: stored checksum 1178062585, prior-to-4.0.14-form stored checksum 236985105 577 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: Page lsn 0 10651, low 4 bytes of lsn at page end 10651 578 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: Page number (if stored to page already) 3, 579 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: space id (if created with >= MySQL-4.1.1 and stored already) 0 580 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: Database page corruption on disk or a failed 581 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: file read of page 3. 582 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: You may have to recover from a backup. 583 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: It is also possible that your operating 584 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: system has corrupted its own file cache 585 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: and rebooting your computer removes the 586 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: error. 587 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: If the corrupt page is an index page 588 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: you can also try to fix the corruption 589 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: by dumping, dropping, and reimporting 590 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: the corrupt table. You can use CHECK 591 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: TABLE to scan your table for corruption. 592 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: See also InnoDB: http://dev.mysql.com/doc/refman/5.0/en/forcing-recovery.html 593 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: about forcing recovery. 594 Feb 20 11:49:33 kyrgyznews mysqld[11461]: InnoDB: Ending processing because of a corrupt database page. 595 Feb 20 11:49:33 kyrgyznews mysqld_safe[11469]: ended 596 Feb 20 11:49:47 kyrgyznews /etc/init.d/mysql[12228]: 0 processes alive and '/usr/bin/mysqladmin --defaults-file=/etc/mysql/debian.cnf ping' resulted in 597 Feb 20 11:49:47 kyrgyznews /etc/init.d/mysql[12228]: ^G/usr/bin/mysqladmin: connect to server at 'localhost' failed 598 Feb 20 11:49:47 kyrgyznews /etc/init.d/mysql[12228]: error: 'Can't connect to local MySQL server through socket '/var/run/mysqld/mysqld.sock' (2)' 599 Feb 20 11:49:47 kyrgyznews /etc/init.d/mysql[12228]: Check that mysqld is running and that the socket: '/var/run/mysqld/mysqld.sock' exists! 600 Feb 20 11:49:47 kyrgyznews /etc/init.d/mysql[12228]: 601 Feb 20 11:49:56 kyrgyznews mysqld_safe[13437]: started 602 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: The log sequence number in ibdata files does not match 603 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: the log sequence number in the ib_logfiles! 604 Feb 20 11:49:56 kyrgyznews mysqld[13440]: 110220 11:49:56 InnoDB: Database was not shut down normally! 605 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: Starting crash recovery. 606 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: Reading tablespace information from the .ibd files... 607 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: Restoring possible half-written data pages from the doublewrite 608 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: buffer... 609 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: Database page corruption on disk or a failed 610 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: file read of page 3. 611 Feb 20 11:49:56 kyrgyznews mysqld[13440]: InnoDB: You may have to recover from a backup. I have looked at the page it referenced, http://dev.mysql.com/doc/refman/5.0/en/forcing-innodb-recovery.html, but before messing with any settings I was wondering what experienced DBAs would suggest doing? Is there any harm in forcing the recovery? PS - I did not make any updates to mysql. Version is mysql Ver 14.12 Distrib 5.0.51a, for debian-linux-gnu (i486) using readline 5.2.

Read the article
Raid 1 array won't assemble after power outage. How do I fix this ext4 mirror?

- by Forkrul Assail

Two ext4 drives on Raid 1 with mdadm won't reassemble after the power went out for an extended period (UPS drained). After turning the machine back on, mdadm said that the array was degraded, after which it took about 2 days for a full resync, which completed without problems. On trying to remount the array I get: mount: you must specify the filesystem type cat /etc/fstab lines relevant to setup: /dev/md127 /media/mediapool ext4 defaults 0 0 dmesg | tail (on trying to mount) says: [ 1050.818782] EXT3-fs (md127): error: can't find ext3 filesystem on dev md127. [ 1050.849214] EXT4-fs (md127): VFS: Can't find ext4 filesystem [ 1050.944781] FAT-fs (md127): invalid media value (0x00) [ 1050.944782] FAT-fs (md127): Can't find a valid FAT filesystem [ 1058.272787] EXT2-fs (md127): error: can't find an ext2 filesystem on dev md127. cat /proc/mdstat says: Personalities : [raid1] [linear] [multipath] [raid0] [raid6] [raid5] [raid4] [raid10] md127 : active (auto-read-only) raid1 sdj[2] sdi[0] 2930135360 blocks super 1.2 [2/2] [UU] unused devices: <none> fsck /dev/md127 says: fsck from util-linux 2.20.1 e2fsck 1.42 (29-Nov-2011) fsck.ext2: Superblock invalid, trying backup blocks... fsck.ext2: Bad magic number in super-block while trying to open /dev/md127 The superblock could not be read or does not describe a correct ext2 filesystem. If the device is valid and it really contains an ext2 filesystem (and not swap or ufs or something else), then the superblock is corrupt, and you might try running e2fsck with an alternate superblock: e2fsck -b 8193 <device> mdadm -E /dev/sdi gives me: /dev/sdi: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394 Name : sojourn:33 Creation Time : Sat Nov 10 10:43:52 2012 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB) Array Size : 2930135360 (2794.39 GiB 3000.46 GB) Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB) Data Offset : 262144 sectors Super Offset : 8 sectors State : clean Device UUID : 3e6e9a4f:6c07ab3d:22d47fce:13cecfd0 Update Time : Tue Nov 13 20:34:18 2012 Checksum : f7d10db9 - correct Events : 27 Device Role : Active device 0 Array State : AA ('A' == active, '.' == missing) boot@boot ~ $ sudo mdadm -E /dev/sdj /dev/sdj: Magic : a92b4efc Version : 1.2 Feature Map : 0x0 Array UUID : 37ac1824:eb8a21f6:bd5afd6d:96da6394 Name : sojourn:33 Creation Time : Sat Nov 10 10:43:52 2012 Raid Level : raid1 Raid Devices : 2 Avail Dev Size : 5860271016 (2794.40 GiB 3000.46 GB) Array Size : 2930135360 (2794.39 GiB 3000.46 GB) Used Dev Size : 5860270720 (2794.39 GiB 3000.46 GB) Data Offset : 262144 sectors Super Offset : 8 sectors State : clean Device UUID : 7fb84af4:e9295f7b:ede61f27:bec0cb57 Update Time : Tue Nov 13 20:34:18 2012 Checksum : b9d17fef - correct Events : 27 Device Role : Active device 1 Array State : AA ('A' == active, '.' == missing) machine@user ~ dmesg | tail [ 61.785866] init: alsa-restore main process (2736) terminated with status 99 [ 68.433548] eth0: no IPv6 routers present [ 534.142511] EXT4-fs (sdi): ext4_check_descriptors: Block bitmap for group 0 not in group (block 2838187772)! [ 534.142518] EXT4-fs (sdi): group descriptors corrupted! [ 546.418780] EXT2-fs (sdi): error: couldn't mount because of unsupported optional features (240) [ 549.654127] EXT3-fs (sdi): error: couldn't mount because of unsupported optional features (240) Since this is Raid 1 it was suggested that I try and mount or fsck the drives separately. After a long fsck on one drive, it ended with this as tail: Illegal double indirect block (2298566437) in inode 39717736. CLEARED. Illegal block #4231180 (2611866932) in inode 39717736. CLEARED. Error storing directory block information (inode=39717736, block=0, num=1092368): Memory allocation failed Recreate journal? yes Creating journal (32768 blocks): Done. *** journal has been re-created - filesystem is now ext3 again *** The drive however still doesn't want to mount: dmesg | tail [ 170.674659] md: export_rdev(sdc) [ 170.675152] md: export_rdev(sdc) [ 195.275288] md: export_rdev(sdc) [ 195.275876] md: export_rdev(sdc) [ 1338.540092] CE: hpet increased min_delta_ns to 30169 nsec [26125.734105] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987) [26125.734115] EXT4-fs (sdc): group descriptors corrupted! [26182.325371] EXT3-fs (sdc): error: couldn't mount because of unsupported optional features (240) [27083.316519] EXT4-fs (sdc): ext4_check_descriptors: Checksum for group 0 failed (43502!=37987) [27083.316530] EXT4-fs (sdc): group descriptors corrupted! Please help me fix this. I never in my wildest nightmares thought a complete mirror would die this badly. Am I missing something? Suggestions on fixing this? Could someone explain why it would resync after the powerout, only to seemingly nuke the drive? Thanks for reading. Any help much appreciated. I've tried everything I can think of, including booting and filesystem checking with SystemRescue and Ubuntu liveboot discs.

Read the article
How do i change the BIOS boot splash screen?

- by YumYumYum

I have a Dell PC which has very ugly and bad luck looking Alien face on every boot. I want to change it or disable it forever, but in Bios they do not have any options. How can i change this from my linux Fedora or ArchLinux which is running now? Tried following does not work. ( http://www.pixelbeat.org/docs/bios/ ) ./flashrom -r firmware.old #save current flash ROM just in case ./flashrom -wv firmware.new #write and verify new flash ROM image Also tried: $ cat c.c #include <stdio.h> #include <inttypes.h> #include <netinet/in.h> #include <stdlib.h> #include <string.h> #include <sys/types.h> #include <sys/stat.h> #include <fcntl.h> #include <unistd.h> #define lengthof(x) (sizeof(x)/sizeof(x[0])) uint16_t checksum(const uint8_t* data, int len) { uint16_t sum = 0; int i; for (i=0; i<len; i++) sum+=*(data+i); return htons(sum); } void usage(void) { fprintf(stderr,"Usage: therm_limit [0,50,53,56,60,63,66,70]\n"); fprintf(stderr,"Report therm limit of terminal in BIOS\n"); fprintf(stderr,"If temp specifed, it is changed if required.\n"); exit(EXIT_FAILURE); } #define CHKSUM_START 51 #define CHKSUM_END 109 #define THERM_OFFSET 67 #define THERM_SHIFT 0 #define THERM_MASK (0x7 << THERM_SHIFT) #define THERM_OFF 0 uint8_t thermal_limits[]={0,50,53,56,60,63,66,70}; #define THERM_MAX (lengthof(thermal_limits)-1) #define DEV_NVRAM "/dev/nvram" #define NVRAM_MAX 114 uint8_t nvram[NVRAM_MAX]; int main(int argc, char* argv[]) { int therm_request = -1; if (argc>2) usage(); if (argc==2) { if (*argv[1]=='-') usage(); therm_request=atoi(argv[1]); int i; for (i=0; i<lengthof(thermal_limits); i++) if (thermal_limits[i]==therm_request) break; if (i==lengthof(thermal_limits)) usage(); else therm_request=i; } int fd_nvram=open(DEV_NVRAM, O_RDWR); if (fd_nvram < 0) { fprintf(stderr,"Error opening %s [%m]\n", DEV_NVRAM); exit(EXIT_FAILURE); } if (read(fd_nvram, nvram, sizeof(nvram))==-1) { fprintf(stderr,"Error reading %s [%m]\n", DEV_NVRAM); close(fd_nvram); exit(EXIT_FAILURE); } uint16_t chksum = *(uint16_t*)(nvram+CHKSUM_END); printf("%04X\n",chksum); exit(0); if (chksum == checksum(nvram+CHKSUM_START, CHKSUM_END-CHKSUM_START)) { uint8_t therm_byte = *(uint16_t*)(nvram+THERM_OFFSET); uint8_t therm_status=(therm_byte & THERM_MASK) >> THERM_SHIFT; printf("Current thermal limit: %d°C\n", thermal_limits[therm_status]); if ( (therm_status == therm_request) ) therm_request=-1; if (therm_request != -1) { if (therm_status != therm_request) printf("Setting thermal limit to %d°C\n", thermal_limits[therm_request]); uint8_t new_therm_byte = (therm_byte & ~THERM_MASK) | (therm_request << THERM_SHIFT); *(uint8_t*)(nvram+THERM_OFFSET) = new_therm_byte; *(uint16_t*)(nvram+CHKSUM_END) = checksum(nvram+CHKSUM_START, CHKSUM_END-CHKSUM_START); (void) lseek(fd_nvram,0,SEEK_SET); if (write(fd_nvram, nvram, sizeof(nvram))!=sizeof(nvram)) { fprintf(stderr,"Error writing %s [%m]\n", DEV_NVRAM); close(fd_nvram); exit(EXIT_FAILURE); } } } else { fprintf(stderr,"checksum failed. Aborting\n"); close(fd_nvram); exit(EXIT_FAILURE); } return EXIT_SUCCESS; } $ gcc c.c -o bios # ./bios 16DB

Read the article
How can I recover an ext4 filesystem corrupted after a fsck?

- by Regan

I have an ext4 filesystem on luks over software raid5. The filesystem was operating "just fine" for several years when I was beginning to run out of space. I had a 9T volume on 6x2T drives. I began upgrading to 3T drives by doing the mdadm fail, remove, add, rebuild, repeat process until I had a larger array. I then grew the luks container, and then when I unmounted and tried to resize2fs I was given the message the filesystem was dirty and needed e2fsck. Without thinking I just did e2fsck -y /dev/mapper/candybox and it began spewing all kinds of inode being removed type messages (can't remember exactly) I killed e2fsck and tried to remount the filesystem to backup data I was concerned about. When trying to mount at this point I get: # mount /dev/mapper/candybox /candybox mount: wrong fs type, bad option, bad superblock on /dev/mapper/candybox, missing codepage or helper program, or other error In some cases useful info is found in syslog - try dmesg | tail or so Looking back at my older logs I noticed the filesystem was giving this error each time the machine booted: kernel: [79137.275531] EXT4-fs (dm-2): warning: mounting fs with errors, running e2fsck is recommended So shame on me for not paying attention :( I then tried to mount using every backup superblock (one after another) and each attempt left this in my log: EXT4-fs (dm-2): ext4_check_descriptors: Checksum for group 0 failed (26534!=65440) EXT4-fs (dm-2): ext4_check_descriptors: Checksum for group 1 failed (38021!=36729) EXT4-fs (dm-2): ext4_check_descriptors: Checksum for group 2 failed (18336!=39845) ... EXT4-fs (dm-2): ext4_check_descriptors: Checksum for group 11911 failed (28743!=44098) BUG: soft lockup - CPU#0 stuck for 23s! [mount:2939] Attempts to restart e2fsck results in: # e2fsck /dev/mapper/candybox e2fsck 1.41.14 (22-Dec-2010) e2fsck: Group descriptors look bad... trying backup blocks... candy: recovering journal e2fsck: unable to set superblock flags on candy At this point, I decided it best to order some more drives and make an image using ddrescue Now two weeks later I have an image of the luks partition in a .img file. # ls -lh total 14T -rw-r--r-- 1 root root 14T Oct 25 01:57 candybox.img -rw-r--r-- 1 root root 271 Oct 20 14:32 candybox.logfile After numerous attempts using everything I could find online I could not coerce e2fsck to do anything on the image, so I used mkfs.ext4 -L candy candybox.img -m 0 -S and I was able to mount the dirty filesystem readonly without the journal and recover 960G of data. It gave all kinds of errors of various directories not existing and so forth but I was able to get some stuff. Which gave me some hope! I then ran e2fsck again and it had to recreate the root inode and gave a massive list of correcting group counts, I accepted the root inode creation and said no to everything else, leaving a completely empty filesystem. Re-ran again and said yes to all questions with the same result but now a "clean" but empty filesystem. extundelete gives me 0 recoverable inodes found. And now I'm stuck again, I can't come up with any other methods other than dropping to something like photorec which will give me an absolute mess with how large the filesystem was. I'm willing to re-copy the image from the original array and start over, if I can get any suggestions or ideas on a way to get more of my files back. I wish I could give more detailed logs of the commands that have run, but the output is long scrolled passed except for what gets logged to syslog and my memory is not as detailed due to the timeframe this has occurred over. Any help is greatly appreciated!

Read the article
Create Folders from text file and place dummy file in them using a CustomAction

- by Birkoff

I want my msi installer to generate a set of folders in a particular location and put a dummy file in each directory. Currently I have the following CustomActions: <CustomAction Id="SMC_SetPathToCmd" Property="Cmd" Value="[SystemFolder]cmd.exe"/> <CustomAction Id="SMC_GenerateMovieFolders" Property="Cmd" ExeCommand="for /f "tokens=* delims= " %a in ([MBSAMPLECOLLECTIONS]movies.txt) do (echo %a)" /> <CustomAction Id="SMC_CopyDummyMedia" Property="Cmd" ExeCommand="for /f "tokens=* delims= " %a in ([MBSAMPLECOLLECTIONS]movies.txt) do (copy [MBSAMPLECOLLECTIONS]dummy.avi "%a"\"%a".avi)" /> These are called in the InstallExecuteSequence: <Custom Action="SMC_SetPathToCmd" After="InstallFinalize"/> <Custom Action="SMC_GenerateMovieFolders" After="SMC_SetPathToCmd"/> <Custom Action="SMC_CopyDummyMedia" After="SMC_GenerateMovieFolders"/> The custom actions seem to start, but only a blank command prompt window is shown and the directories are not generated. The files needed for the customaction are copied to the correct directory: <Directory Id="WIX_DIR_COMMON_VIDEO"> <Directory Id="MBSAMPLECOLLECTIONS" Name="MB Sample Collections" /> </Directory> <DirectoryRef Id="MBSAMPLECOLLECTIONS"> <Component Id="SampleCollections" Guid="C481566D-4CA8-4b10-B08D-EF29ACDC10B5" DiskId="1"> <File Id="movies.txt" Name="movies.txt" Source="SampleCollections\movies.txt" Checksum="no" /> <File Id="series.txt" Name="series.txt" Source="SampleCollections\series.txt" Checksum="no" /> <File Id="dummy.avi" Name="dummy.avi" Source="SampleCollections\dummy.avi" Checksum="no" /> </Component> </DirectoryRef> What's wrong with these Custom Actions or is there a simpler way to do this?

Read the article
Counting elements and reading attributes with .net2.0 ?

- by Prix

I have an application that is on .net 2.0 and I am having some difficult with it as I am more use to linq. The xml file look like this: <?xml version="1.0" encoding="UTF-8" standalone="yes"?> <updates> <files> <file url="files/filename.ext" checksum="06B9EEA618EEFF53D0E9B97C33C4D3DE3492E086" folder="bin" system="0" size="40448" /> <file url="files/filename.ext" checksum="CA8078D1FDCBD589D3769D293014154B8854D6A9" folder="" system="0" size="216" /> <file url="files/filename.ext" checksum="CA8078D1FDCBD589D3769D293014154B8854D6A9" folder="" system="0" size="216" /> </files> </updates> The file is downloaded and readed on the fly: XmlDocument readXML = new XmlDocument(); readXML.LoadXml(xmlData); Initially i was thinking it would go with something like this: XmlElement root = doc.DocumentElement; XmlNodeList nodes = root.SelectNodes("//files"); foreach (XmlNode node in nodes) { ... im reading it ... } But before reading them I need to know how many they are to use on my progress bar and I am also clueless on how to grab the attribute of the file element in this case. How could I count how many "file" ELEMENTS I have (count them before entering the foreach ofc) and read their attributes ? I need the count because it will be used to update the progress bar. Overall it is not reading my xml very well.

Read the article
Why is function's length information of other shared lib in ELF?

- by minastaros

Our project (C++, Linux, gcc, PowerPC) consists of several shared libraries. When releasing a new version of the package, only those libs should change whose source code was actually affected. With "change" I mean absolute binary identity (the checksum over the file is compared. Different checksum - different version according to the policy). (I should mention that the whole project is always built at once, no matter if any code has changed or not per library). Usually this can by achieved by hiding private parts of the included Header files and not changing the public ones. However, there was a case where just a delete was added to the destructor of a class TableManager (in the TableManager.cpp file!) of library libTableManager.so, and yet the binary/checksum of library libB.so (which uses class TableManager ) has changed. TableManager.h: class TableManager { public: TableManager(); ~TableManager(); private: int* myPtr; } TableManager.cpp: TableManager::~TableManager() { doSomeCleanup(); delete myPtr; // this delete has been added } By inspecting libB.so with readelf --all libB.so, looking at the .dynsym section, it turned out that the length of all functions, even the dynamically used ones from other libraries, are stored in libB! It looks like this (length is the 668 in the 3rd column): 527: 00000000 668 FUNC GLOBAL DEFAULT UND _ZN12TableManagerD1Ev So my questions are: Why is the length of a function actually stored in the client lib? Wouldn't a start address be sufficient? Can this be suppressed somehow when compiling/linking of libB.so (kind of "stripping")? We would really like to reduce this degree of dependency...

Read the article
Creating a new variable in C from only part of an existing u_char

- by Alex Kloss

I'm writing some C code to parse IEEE 802.11 frames, but I'm stuck trying to create a new variable whose length depends on the size of the frame itself. Here's the code I currently have: int frame_body_len = pkt_hdr->len - radio_hdr->len - wifi_hdr_len - 4; u_char *frame_body = (u_char *) (packet + radio_hdr->len + wifi_hdr_len); Basically, the frame consists of a header, a body, and a checksum at the end. I can calculate the length of the frame body by taking the length of the packet and subtracting the length of the two headers that appear before it (radio_hdr->len and wifi_hdr_len respectively), plus 4 bytes at the end for the checksum. However, how can I create the frame_body variable without the trailing checksum? Right now, I'm initializing it with the contents of the packet starting at the position after the two headers, but is there some way to start at that position and end 4 bytes before the end of packet? packet is a pointer to a u_char, if it helps. I'm a new C programmer, so any and all advice about my code you can give me would be much appreciated. Thanks!

Read the article
MS Windows Server 2008R2 slow file copy, slow network connection

- by MattrixHax

i just setup a windows 2008R2 standard server, with the only installed app being Hyper-V, and only 1 windows XP VM is running. Whenever i try to copy a file from my windows 7 laptop over to the 2008R2 server machine's admin shares ( \\servername\c$ ) the files start transferring around 60mb/s and then drop to around 5mb/s. My windows 7 machine and the server 2008 machine are both in WORKGROUP (no domain here). when i try the same transfer to our server 2003 box the transfer speeds are fine. tried disabling autotuning (netsh interface tcp set global autotuninglevel=disabled) as well as turning off the checksum offload to the adapter (tx and rx) - i still see strange packet errors (bad header checksum) using wireshark and just cannot seem to track down what the issue is - over 1 hour to transfer 4gb of files from 1 server to another that are on the same GB switch is just crazy.... any ideas would be greatly appreciated!

Read the article
monit configuration for php-fpm

- by Adam Jimenez

I'm struggling to find a monit config for php-fpm that works. This is what I've tried: ### Monitoring php-fpm: the parent process. check process php-fpm with pidfile /var/run/php-fpm/php-fpm.pid group phpcgi # phpcgi group start program = "/etc/init.d/php-fpm start" stop program = "/etc/init.d/php-fpm stop" ## Test the UNIX socket. Restart if down. if failed unixsocket /var/run/php-fpm.sock then restart ## If the restarts attempts fail then alert. if 3 restarts within 5 cycles then timeout depends on php-fpm_bin depends on php-fpm_init ## Test the php-fpm binary. check file php-fpm_bin with path /usr/sbin/php-fpm group phpcgi if failed checksum then unmonitor if failed permission 755 then unmonitor if failed uid root then unmonitor if failed gid root then unmonitor ## Test the init scripts. check file php-fpm_init with path /etc/init.d/php-fpm group phpcgi if failed checksum then unmonitor if failed permission 755 then unmonitor if failed uid root then unmonitor if failed gid root then unmonitor But it fails because there is no php-fpm.sock (Centos 6)

Read the article
SMTP port open - but not open

- by Frederik Nielsen

As some of you might know, I am setting up an exchange server. Now I ran into another problem: I cannot connect to the SMTP service from outside the server! The ports are opened in the gateway device (a ZyXEL USG50), Windows firewall is off. I see the packets travekl through the ZyXEL firewall, and I can also see the packets with wireshark on the server, so I know they are getting all the way in to the server. I also know it receives them, and sends out the reply - and this is where things go bad! Analyzing with wireshark, I get these errors in the return packets: Header checksum: 0x0000 [incorrect, should be 0x0779 (may be caused by "IP checksum offload"?)] And: Acknowledgment Number: 0x8e3337d1 [should be 0x00000000 because ACK flag is not set] What the (sorry my French) hell is going on? I really cant figure it out.. Thanks in advance.

Read the article
rsync doesn't use delta transfer on first run

- by ockzon

I'm trying to synchronize a large local directory (with a batch file using rsync 3.0.7 on Cygwin, Windows 7 x64, 30k files, 200gb size) to a remote server (Debian x64 with kernel 2.6, rsyncd 3.0.7) over a slow internet connection (90kbyte/s upload). I know almost all files are identical and I verified that using md5sum locally and remotely. However when executing rsync from my local machine every file gets transferred completely for the first time. When I terminate the batch file after a few transfers and run it again then the already transferred files are skipped. But as soon as it gets to a file not yet transferred it uploads the file as a whole again instead of noticing that the checksum is the same locally and remotely. The batch file calling rsync looks like this (backslashes and line brakes added here for readability): c:\cygwin\bin\rsync.exe --verbose --human-readable --progress --stats \ --recursive --ignore-times --password-file pwd.txt \ /cygdrive/d/ftp/data/ \ rsync://[email protected]:33400/data/ | \ c:\cygwin\bin\tee.exe --append rsync.log I experimented using the following parameters in varying combinations but that didn't help either: --checksum --partial --partial-dir=/tmp/.rsync-partial --compress

Read the article
Accessing a broken mdadm raid

- by CarstenCarsten

Hi! I used a western digital mybookworld (SOHO NAS storage using Linux) as backup for my Linux box. Suddenly, the mybookworld does not boot up any more. So I opened the box, removed the hard disk and put the hard disk into an external USB HDD case, and connected it to my Linux box. [ 530.640301] usb 2-1: new high speed USB device using ehci_hcd and address 3 [ 530.797630] scsi7 : usb-storage 2-1:1.0 [ 531.794844] scsi 7:0:0:0: Direct-Access WDC WD75 00AAKS-00RBA0 PQ: 0 ANSI: 2 [ 531.796490] sd 7:0:0:0: Attached scsi generic sg3 type 0 [ 531.797966] sd 7:0:0:0: [sdc] 1465149168 512-byte logical blocks: (750 GB/698 GiB) [ 531.800317] sd 7:0:0:0: [sdc] Write Protect is off [ 531.800327] sd 7:0:0:0: [sdc] Mode Sense: 38 00 00 00 [ 531.800333] sd 7:0:0:0: [sdc] Assuming drive cache: write through [ 531.803821] sd 7:0:0:0: [sdc] Assuming drive cache: write through [ 531.803836] sdc: sdc1 sdc2 sdc3 sdc4 [ 531.815831] sd 7:0:0:0: [sdc] Assuming drive cache: write through [ 531.815842] sd 7:0:0:0: [sdc] Attached SCSI disk The dmesg output looks normal, but I was wondering why the hardisk was not mounted at all. And why there are 4 different partitions on it. fdisk showed the following: root@ubuntu:/home/ubuntu# fdisk /dev/sdc WARNING: DOS-compatible mode is deprecated. It's strongly recommended to switch off the mode (command 'c') and change display units to sectors (command 'u'). Command (m for help): p Disk /dev/sdc: 750.2 GB, 750156374016 bytes 255 heads, 63 sectors/track, 91201 cylinders Units = cylinders of 16065 * 512 = 8225280 bytes Sector size (logical/physical): 512 bytes / 512 bytes I/O size (minimum/optimal): 512 bytes / 512 bytes Disk identifier: 0x00007c00 Device Boot Start End Blocks Id System /dev/sdc1 4 369 2939895 fd Linux raid autodetect /dev/sdc2 370 382 104422+ fd Linux raid autodetect /dev/sdc3 383 505 987997+ fd Linux raid autodetect /dev/sdc4 506 91201 728515620 fd Linux raid autodetect Oh no! Everything seems to be created as a mdadm software raid. Calling mdadm --examine with the different partitions seems to affirm that. I think the only partition I am interested in, is /dev/sdc4 (because it is the largest). But nevertheless I called mdadm --examine with every partition. root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdc1 /dev/sdc1: Magic : a92b4efc Version : 00.90.00 UUID : 5626a2d8:070ad992:ef1c8d24:cd8e13e4 Creation Time : Wed Feb 20 00:57:49 2002 Raid Level : raid1 Used Dev Size : 2939776 (2.80 GiB 3.01 GB) Array Size : 2939776 (2.80 GiB 3.01 GB) Raid Devices : 2 Total Devices : 1 Preferred Minor : 1 Update Time : Sun Nov 21 11:05:27 2010 State : clean Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Checksum : 4c90bc55 - correct Events : 16682 Number Major Minor RaidDevice State this 0 8 1 0 active sync /dev/sda1 0 0 8 1 0 active sync /dev/sda1 1 1 0 0 1 faulty removed root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdc2 /dev/sdc2: Magic : a92b4efc Version : 00.90.00 UUID : 9734b3ee:2d5af206:05fe3413:585f7f26 Creation Time : Wed Feb 20 00:57:54 2002 Raid Level : raid1 Used Dev Size : 104320 (101.89 MiB 106.82 MB) Array Size : 104320 (101.89 MiB 106.82 MB) Raid Devices : 2 Total Devices : 1 Preferred Minor : 2 Update Time : Wed Oct 27 20:19:08 2010 State : clean Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Checksum : 55560b40 - correct Events : 9884 Number Major Minor RaidDevice State this 0 8 2 0 active sync /dev/sda2 0 0 8 2 0 active sync /dev/sda2 1 1 0 0 1 faulty removed root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdc3 /dev/sdc3: Magic : a92b4efc Version : 00.90.00 UUID : 08f30b4f:91cca15d:2332bfef:48e67824 Creation Time : Wed Feb 20 00:57:54 2002 Raid Level : raid1 Used Dev Size : 987904 (964.91 MiB 1011.61 MB) Array Size : 987904 (964.91 MiB 1011.61 MB) Raid Devices : 2 Total Devices : 1 Preferred Minor : 3 Update Time : Sun Nov 21 11:05:27 2010 State : clean Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Checksum : 39717874 - correct Events : 73678 Number Major Minor RaidDevice State this 0 8 3 0 active sync 0 0 8 3 0 active sync 1 1 0 0 1 faulty removed root@ubuntu:/home/ubuntu# mdadm --examine /dev/sdc4 /dev/sdc4: Magic : a92b4efc Version : 00.90.00 UUID : febb75ca:e9d1ce18:f14cc006:f759419a Creation Time : Wed Feb 20 00:57:55 2002 Raid Level : raid1 Used Dev Size : 728515520 (694.77 GiB 746.00 GB) Array Size : 728515520 (694.77 GiB 746.00 GB) Raid Devices : 2 Total Devices : 1 Preferred Minor : 4 Update Time : Sun Nov 21 11:05:27 2010 State : clean Active Devices : 1 Working Devices : 1 Failed Devices : 1 Spare Devices : 0 Checksum : 2f36a392 - correct Events : 519320 Number Major Minor RaidDevice State this 0 8 4 0 active sync 0 0 8 4 0 active sync 1 1 0 0 1 faulty removed If I read the output correctly everything was removed, because it was faulty. Is there ANY way to see the contents of the largest partition? Or seeing somehow which files are broken? I see that everything is raid1 which is only mirroring, so this should be a normal partition. I am anxious to do anything with mdadm, in fear that I destroy the data on the hard disk. I would be very thankful for any help.

Read the article
ASMLib

- by wcoekaer

Oracle ASMlib on Linux has been a topic of discussion a number of times since it was released way back when in 2004. There is a lot of confusion around it and certainly a lot of misinformation out there for no good reason. Let me try to give a bit of history around Oracle ASMLib. Oracle ASMLib was introduced at the time Oracle released Oracle Database 10g R1. 10gR1 introduced a very cool important new features called Oracle ASM (Automatic Storage Management). A very simplistic description would be that this is a very sophisticated volume manager for Oracle data. Give your devices directly to the ASM instance and we manage the storage for you, clustered, highly available, redundant, performance, etc, etc... We recommend using Oracle ASM for all database deployments, single instance or clustered (RAC). The ASM instance manages the storage and every Oracle server process opens and operates on the storage devices like it would open and operate on regular datafiles or raw devices. So by default since 10gR1 up to today, we do not interact differently with ASM managed block devices than we did before with a datafile being mapped to a raw device. All of this is without ASMLib, so ignore that one for now. Standard Oracle on any platform that we support (Linux, Windows, Solaris, AIX, ...) does it the exact same way. You start an ASM instance, it handles storage management, all the database instances use and open that storage and read/write from/to it. There are no extra pieces of software needed, including on Linux. ASM is fully functional and selfcontained without any other components. In order for the admin to provide a raw device to ASM or to the database, it has to have persistent device naming. If you booted up a server where a raw disk was named /dev/sdf and you give it to ASM (or even just creating a tablespace without asm on that device with datafile '/dev/sdf') and next time you boot up and that device is now /dev/sdg, you end up with an error. Just like you can't just change datafile names, you can't change device filenames without telling the database, or ASM. persistent device naming on Linux, especially back in those days ways to say it bluntly, a nightmare. In fact there were a number of issues (dating back to 2004) : Linux async IO wasn't pretty persistent device naming including permissions (had to be owned by oracle and the dba group) was very, very difficult to manage system resource usage in terms of open file descriptors So given the above, we tried to find a way to make this easier on the admins, in many ways, similar to why we started working on OCFS a few years earlier - how can we make life easier for the admins on Linux. A feature of Oracle ASM is the ability for third parties to write an extension using what's called ASMLib. It is possible for any third party OS or storage vendor to write a library using a specific Oracle defined interface that gets used by the ASM instance and by the database instance when available. This interface offered 2 components : Define an IO interface - allow any IO to the devices to go through ASMLib Define device discovery - implement an external way of discovering, labeling devices to provide to ASM and the Oracle database instance This is similar to a library that a number of companies have implemented over many years called libODM (Oracle Disk Manager). ODM was specified many years before we introduced ASM and allowed third party vendors to implement their own IO routines so that the database would use this library if installed and make use of the library open/read/write/close,.. routines instead of the standard OS interfaces. PolyServe back in the day used this to optimize their storage solution, Veritas used (and I believe still uses) this for their filesystem. It basically allowed, in particular, filesystem vendors to write libraries that could optimize access to their storage or filesystem.. so ASMLib was not something new, it was basically based on the same model. You have libodm for just database access, you have libasm for asm/database access. Since this library interface existed, we decided to do a reference implementation on Linux. We wrote an ASMLib for Linux that could be used on any Linux platform and other vendors could see how this worked and potentially implement their own solution. As I mentioned earlier, ASMLib and ODMLib are libraries for third party extensions. ASMLib for Linux, since it was a reference implementation implemented both interfaces, the storage discovery part and the IO part. There are 2 components : Oracle ASMLib - the userspace library with config tools (a shared object and some scripts) oracleasm.ko - a kernel module that implements the asm device for /dev/oracleasm/* The userspace library is a binary-only module since it links with and contains Oracle header files but is generic, we only have one asm library for the various Linux platforms. This library is opened by Oracle ASM and by Oracle database processes and this library interacts with the OS through the asm device (/dev/asm). It can install on Oracle Linux, on SuSE SLES, on Red Hat RHEL,.. The library itself doesn't actually care much about the OS version, the kernel module and device cares. The support tools are simple scripts that allow the admin to label devices and scan for disks and devices. This way you can say create an ASM disk label foo on, currently /dev/sdf... So if /dev/sdf disappears and next time is /dev/sdg, we just scan for the label foo and we discover it as /dev/sdg and life goes on without any worry. Also, when the database needs access to the device, we don't have to worry about file permissions or anything it will be taken care of. So it's a convenience thing. The kernel module oracleasm.ko is a Linux kernel module/device driver. It implements a device /dev/oracleasm/* and any and all IO goes through ASMLib - /dev/oracleasm. This kernel module is obviously a very specific Oracle related device driver but it was released under the GPL v2 so anyone could easily build it for their Linux distribution kernels. Advantages for using ASMLib : A good async IO interface for the database, the entire IO interface is based on an optimal ASYNC model for performance A single file descriptor per Oracle process, not one per device or datafile per process reducing # of open filehandles overhead Device scanning and labeling built-in so you do not have to worry about messing with udev or devlabel, permissions or the likes which can be very complex and error prone. Just like with OCFS and OCFS2, each kernel version (major or minor) has to get a new version of the device drivers. We started out building the oracleasm kernel module rpms for many distributions, SLES (in fact in the early days still even for this thing called United Linux) and RHEL. The driver didn't make sense to get pushed into upstream Linux because it's unique and specific to the Oracle database. As it takes a huge effort in terms of build infrastructure and QA and release management to build kernel modules for every architecture, every linux distribution and every major and minor version we worked with the vendors to get them to add this tiny kernel module to their infrastructure. (60k source code file). The folks at SuSE understood this was good for them and their customers and us and added it to SLES. So every build coming from SuSE for SLES contains the oracleasm.ko module. We weren't as successful with other vendors so for quite some time we continued to build it for RHEL and of course as we introduced Oracle Linux end of 2006 also for Oracle Linux. With Oracle Linux it became easy for us because we just added the code to our build system and as we churned out Oracle Linux kernels whether it was for a public release or for customers that needed a one off fix where they also used asmlib, we didn't have to do any extra work it was just all nicely integrated. With the introduction of Oracle Linux's Unbreakable Enterprise Kernel and our interest in being able to exploit ASMLib more, we started working on a very exciting project called Data Integrity. Oracle (Martin Petersen in particular) worked for many years with the T10 standards committee and storage vendors and implemented Linux kernel support for DIF/DIX, data protection in the Linux kernel, note to those that wonder, yes it's all in mainline Linux and under the GPL. This basically gave us all the features in the Linux kernel to checksum a data block, send it to the storage adapter, which can then validate that block and checksum in firmware before it sends it over the wire to the storage array, which can then do another checksum and to the actual DISK which does a final validation before writing the block to the physical media. So what was missing was the ability for a userspace application (read: Oracle RDBMS) to write a block which then has a checksum and validation all the way down to the disk. application to disk. Because we have ASMLib we had an entry into the Linux kernel and Martin added support in ASMLib (kernel driver + userspace) for this functionality. Now, this is all based on relatively current Linux kernels, the oracleasm kernel module depends on the main kernel to have support for it so we can make use of it. Thanks to UEK and us having the ability to ship a more modern, current version of the Linux kernel we were able to introduce this feature into ASMLib for Linux from Oracle. This combined with the fact that we build the asm kernel module when we build every single UEK kernel allowed us to continue improving ASMLib and provide it to our customers. So today, we (Oracle) provide Oracle ASMLib for Oracle Linux and in particular on the Unbreakable Enterprise Kernel. We did the build/testing/delivery of ASMLib for RHEL until RHEL5 but since RHEL6 decided that it was too much effort for us to also maintain all the build and test environments for RHEL and we did not have the ability to use the latest kernel features to introduce the Data Integrity features and we didn't want to end up with multiple versions of asmlib as maintained by us. SuSE SLES still builds and comes with the oracleasm module and they do all the work and RHAT it certainly welcome to do the same. They don't have to rebuild the userspace library, it's really about the kernel module. And finally to re-iterate a few important things : Oracle ASM does not in any way require ASMLib to function completely. ASMlib is a small set of extensions, in particular to make device management easier but there are no extra features exposed through Oracle ASM with ASMLib enabled or disabled. Often customers confuse ASMLib with ASM. again, ASM exists on every Oracle supported OS and on every supported Linux OS, SLES, RHEL, OL withoutASMLib Oracle ASMLib userspace is available for OTN and the kernel module is shipped along with OL/UEK for every build and by SuSE for SLES for every of their builds ASMLib kernel module was built by us for RHEL4 and RHEL5 but we do not build it for RHEL6, nor for the OL6 RHCK kernel. Only for UEK ASMLib for Linux is/was a reference implementation for any third party vendor to be able to offer, if they want to, their own version for their own OS or storage ASMLib as provided by Oracle for Linux continues to be enhanced and evolve and for the kernel module we use UEK as the base OS kernel hope this helps.

Read the article
unexplainable packet drops with 5 ethernet NICs and low traffic on Ubuntu

- by jon

I'm stuck on problem where my machine started to drops packets with no sign of ANY system load or high interrupt usage after an upgrade to Ubuntu 12.04. My server is a network monitoring sensor, running Ubuntu LTS 12.04, it passively collects packets from 5 interfaces doing network intrusion type stuff. Before the upgrade I managed to collect 200+GB of packets a day while writing them to disk with around 0% packet loss depending on the day with the help of CPU affinity and NIC IRQ to CPU bindings. Now I lose a great deal of packets with none of my applications running and at very low PPS rate which a modern workstation NIC would have no trouble with. Specs: x64 Xeon 4 cores 3.2 Ghz 16 GB RAM NICs: 5 Intel Pro NICs using the e1000 driver (NAPI). [1] eth0 and eth1 are integrated NICs (in the motherboard) There are 2 other PCI-X network cards, each with 2 Ethernet ports. 3 of the interfaces are running at Gigabit Ethernet, the others are not because they're attached to hubs. Specs: [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm uptime 17:36:00 up 1:43, 2 users, load average: 0.00, 0.01, 0.05 # uname -a Linux nms 3.2.0-29-generic #46-Ubuntu SMP Fri Jul 27 17:03:23 UTC 2012 x86_64 x86_64 x86_64 GNU/Linux I also have the CPU governor set to performance mode and irqbalance off. The problem still occurs with them on. # lspci -t -vv -[0000:00]-+-00.0 Intel Corporation E7520 Memory Controller Hub +-02.0-[01-03]--+-00.0-[02]----0e.0 Dell PowerEdge Expandable RAID controller 4 | \-00.2-[03]-- +-04.0-[04]-- +-05.0-[05-07]--+-00.0-[06]----07.0 Intel Corporation 82541GI Gigabit Ethernet Controller | \-00.2-[07]----08.0 Intel Corporation 82541GI Gigabit Ethernet Controller +-06.0-[08-0a]--+-00.0-[09]--+-04.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | | \-04.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | \-00.2-[0a]--+-02.0 Digium, Inc. Wildcard TE210P/TE212P dual-span T1/E1/J1 card 3.3V | +-03.0 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) | \-03.1 Intel Corporation 82546EB Gigabit Ethernet Controller (Copper) +-1d.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #1 +-1d.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #2 +-1d.2 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB UHCI Controller #3 +-1d.7 Intel Corporation 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller +-1e.0-[0b]----0d.0 Advanced Micro Devices [AMD] nee ATI RV100 QY [Radeon 7000/VE] +-1f.0 Intel Corporation 82801EB/ER (ICH5/ICH5R) LPC Interface Bridge \-1f.1 Intel Corporation 82801EB/ER (ICH5/ICH5R) IDE Controller I believe the NIC nor the NIC drivers are dropping the packets because ethtool reports 0 under rx_missed_errors and rx_no_buffer_count for each interface. On the old system, if it couldn't keep up this is where the drops would be. I drop packets on multiple interfaces just about every second, usually in small increments of 2-4. I tried all these sysctl values, I'm currently using the uncommented ones. # cat /etc/sysctl.conf # high net.core.netdev_max_backlog = 3000000 net.core.rmem_max = 16000000 net.core.rmem_default = 8000000 # defaults #net.core.netdev_max_backlog = 1000 #net.core.rmem_max = 131071 #net.core.rmem_default = 163480 # moderate #net.core.netdev_max_backlog = 10000 #net.core.rmem_max = 33554432 #net.core.rmem_default = 33554432 Here's an example of an interface stats report with ethtool. They are all the same, nothing is out of the ordinary ( I think ), so I'm only going to show one: ethtool -S eth2 NIC statistics: rx_packets: 7498 tx_packets: 0 rx_bytes: 2722585 tx_bytes: 0 rx_broadcast: 327 tx_broadcast: 0 rx_multicast: 1504 tx_multicast: 0 rx_errors: 0 tx_errors: 0 tx_dropped: 0 multicast: 1504 collisions: 0 rx_length_errors: 0 rx_over_errors: 0 rx_crc_errors: 0 rx_frame_errors: 0 rx_no_buffer_count: 0 rx_missed_errors: 0 tx_aborted_errors: 0 tx_carrier_errors: 0 tx_fifo_errors: 0 tx_heartbeat_errors: 0 tx_window_errors: 0 tx_abort_late_coll: 0 tx_deferred_ok: 0 tx_single_coll_ok: 0 tx_multi_coll_ok: 0 tx_timeout_count: 0 tx_restart_queue: 0 rx_long_length_errors: 0 rx_short_length_errors: 0 rx_align_errors: 0 tx_tcp_seg_good: 0 tx_tcp_seg_failed: 0 rx_flow_control_xon: 0 rx_flow_control_xoff: 0 tx_flow_control_xon: 0 tx_flow_control_xoff: 0 rx_long_byte_count: 2722585 rx_csum_offload_good: 0 rx_csum_offload_errors: 0 alloc_rx_buff_failed: 0 tx_smbus: 0 rx_smbus: 0 dropped_smbus: 01 # ifconfig eth0 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8c UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:373348 errors:16 dropped:95 overruns:0 frame:16 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:356830572 (356.8 MB) TX bytes:0 (0.0 B) eth1 Link encap:Ethernet HWaddr 00:11:43:e0:e2:8d UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:13616 errors:0 dropped:0 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:8690528 (8.6 MB) TX bytes:0 (0.0 B) eth2 Link encap:Ethernet HWaddr 00:04:23:e1:77:6a UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:7750 errors:0 dropped:471 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:2780935 (2.7 MB) TX bytes:0 (0.0 B) eth3 Link encap:Ethernet HWaddr 00:04:23:e1:77:6b UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:5112 errors:0 dropped:206 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:639472 (639.4 KB) TX bytes:0 (0.0 B) eth4 Link encap:Ethernet HWaddr 00:04:23:b6:35:6c UP BROADCAST RUNNING NOARP PROMISC ALLMULTI MULTICAST MTU:1500 Metric:1 RX packets:961467 errors:0 dropped:935 overruns:0 frame:0 TX packets:0 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:0 RX bytes:958561305 (958.5 MB) TX bytes:0 (0.0 B) eth5 Link encap:Ethernet HWaddr 00:04:23:b6:35:6d inet addr:192.168.1.6 Bcast:192.168.1.255 Mask:255.255.255.0 UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1 RX packets:4264 errors:0 dropped:16 overruns:0 frame:0 TX packets:699 errors:0 dropped:0 overruns:0 carrier:0 collisions:0 txqueuelen:1000 RX bytes:572228 (572.2 KB) TX bytes:124456 (124.4 KB) I tried the defaults, then started to play around with settings. I wasn't using any flow control and I increased the RxDescriptor count to 4096 before the upgrade as well without any problems. # cat /etc/modprobe.d/e1000.conf options e1000 XsumRX=0,0,0,0,0 RxDescriptors=4096,4096,4096,4096,4096 FlowControl=0,0,0,0,0 debug=16 Here's my network configuration file, I turned off checksumming and various offloading mechanisms along with setting CPU affinity with heavy use interfaces getting an entire CPU and light use interfaces sharing a CPU. I used these settings prior to the upgrade without problems. # cat /etc/network/interfaces # The loopback network interface auto lo iface lo inet loopback # The primary network interface auto eth0 iface eth0 inet manual pre-up /sbin/ethtool -G eth0 rx 4096 tx 0 pre-up /sbin/ethtool -K eth0 gro off gso off rx off pre-up /sbin/ethtool -A eth0 rx off autoneg off up ifconfig eth0 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/48/smp_affinity down ifconfig eth0 down post-down /sbin/ethtool -G eth0 rx 256 tx 256 post-down /sbin/ethtool -K eth0 gro on gso on rx on post-down /sbin/ethtool -A eth0 rx on autoneg on auto eth1 iface eth1 inet manual pre-up /sbin/ethtool -G eth1 rx 4096 tx 0 pre-up /sbin/ethtool -K eth1 gro off gso off rx off pre-up /sbin/ethtool -A eth1 rx off autoneg off up ifconfig eth1 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/49/smp_affinity down ifconfig eth1 down post-down /sbin/ethtool -G eth1 rx 256 tx 256 post-down /sbin/ethtool -K eth1 gro on gso on rx on post-down /sbin/ethtool -A eth1 rx on autoneg on auto eth2 iface eth2 inet manual pre-up /sbin/ethtool -G eth2 rx 4096 tx 0 pre-up /sbin/ethtool -K eth2 gro off gso off rx off pre-up /sbin/ethtool -A eth2 rx off autoneg off up ifconfig eth2 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "1" > /proc/irq/82/smp_affinity down ifconfig eth2 down post-down /sbin/ethtool -G eth2 rx 256 tx 256 post-down /sbin/ethtool -K eth2 gro on gso on rx on post-down /sbin/ethtool -A eth2 rx on autoneg on auto eth3 iface eth3 inet manual pre-up /sbin/ethtool -G eth3 rx 4096 tx 0 pre-up /sbin/ethtool -K eth3 gro off gso off rx off pre-up /sbin/ethtool -A eth3 rx off autoneg off up ifconfig eth3 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "2" > /proc/irq/83/smp_affinity down ifconfig eth3 down post-down /sbin/ethtool -G eth3 rx 256 tx 256 post-down /sbin/ethtool -K eth3 gro on gso on rx on post-down /sbin/ethtool -A eth3 rx on autoneg on auto eth4 iface eth4 inet manual pre-up /sbin/ethtool -G eth4 rx 4096 tx 0 pre-up /sbin/ethtool -K eth4 gro off gso off rx off pre-up /sbin/ethtool -A eth4 rx off autoneg off up ifconfig eth4 0.0.0.0 -arp promisc mtu 1500 allmulti txqueuelen 0 up post-up echo "4" > /proc/irq/77/smp_affinity down ifconfig eth4 down post-down /sbin/ethtool -G eth4 rx 256 tx 256 post-down /sbin/ethtool -K eth4 gro on gso on rx on post-down /sbin/ethtool -A eth4 rx on autoneg on auto eth5 iface eth5 inet static pre-up /etc/fw.conf address 192.168.1.1 netmask 255.255.255.0 broadcast 192.168.1.255 gateway 192.168.1.1 dns-nameservers 192.168.1.2 192.168.1.3 up ifconfig eth5 up post-up echo "8" > /proc/irq/77/smp_affinity down ifconfig eth5 down Here's a few examples of packet drops, i ran one after another, probabling totaling 3 or 4 seconds. You can see increases in the drops from the 1st and 3rd. This was a non-busy time, very little traffic. # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 225 lo: 0 eth2: 505 eth1: 0 eth5: 17 eth0: 105 eth4: 1034 # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 225 lo: 0 eth2: 507 eth1: 0 eth5: 17 eth0: 105 eth4: 1034 # awk '{ print $1,$5 }' /proc/net/dev Inter-| face drop eth3: 227 lo: 0 eth2: 512 eth1: 0 eth5: 17 eth0: 105 eth4: 1039 I tried the pci=noacpi options. With and without, it's the same. This is what my interrupt stats looked like before the upgrade, after, with ACPI on PCI it showed multiple NICs bound to an interrupt and shared with other devices such as USB drives which I didn't like so I think i'm going to keep it with ACPI off as it's easier to designate sole purpose interrupts. Is there any advantage I would have using the default i.e. ACPI w/ PCI. ? # cat /etc/default/grub | grep CMD_LINE GRUB_CMDLINE_LINUX_DEFAULT="ipv6.disable=1 noacpi pci=noacpi" GRUB_CMDLINE_LINUX="" # cat /proc/interrupts CPU0 CPU1 CPU2 CPU3 0: 45 0 0 16 IO-APIC-edge timer 1: 1 0 0 7936 IO-APIC-edge i8042 2: 0 0 0 0 XT-PIC-XT-PIC cascade 6: 0 0 0 3 IO-APIC-edge floppy 8: 0 0 0 1 IO-APIC-edge rtc0 9: 0 0 0 0 IO-APIC-edge acpi 12: 0 0 0 1809 IO-APIC-edge i8042 14: 1 0 0 4498 IO-APIC-edge ata_piix 15: 0 0 0 0 IO-APIC-edge ata_piix 16: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb2 18: 0 0 0 1350 IO-APIC-fasteoi uhci_hcd:usb4, radeon 19: 0 0 0 0 IO-APIC-fasteoi uhci_hcd:usb3 23: 0 0 0 4099 IO-APIC-fasteoi ehci_hcd:usb1 38: 0 0 0 61963 IO-APIC-fasteoi megaraid 48: 0 0 1002319 4 IO-APIC-fasteoi eth0 49: 0 0 38772 3 IO-APIC-fasteoi eth1 77: 0 0 130076 432159 IO-APIC-fasteoi eth4 78: 0 0 0 23917 IO-APIC-fasteoi eth5 82: 1329033 0 0 4 IO-APIC-fasteoi eth2 83: 0 4886525 0 6 IO-APIC-fasteoi eth3 NMI: 5 6 4 5 Non-maskable interrupts LOC: 61409 57076 64257 114764 Local timer interrupts SPU: 0 0 0 0 Spurious interrupts IWI: 0 0 0 0 IRQ work interrupts RES: 17956 25333 13436 14789 Rescheduling interrupts CAL: 22436 607 539 478 Function call interrupts TLB: 1525 1458 4600 4151 TLB shootdowns TRM: 0 0 0 0 Thermal event interrupts THR: 0 0 0 0 Threshold APIC interrupts MCE: 0 0 0 0 Machine check exceptions MCP: 16 16 16 16 Machine check polls ERR: 0 MIS: 0 Here's sample output of vmstat, showing the system. Barebones system right now. root@nms:~# vmstat -S m 1 procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu---- r b swpd free buff cache si so bi bo in cs us sy id wa 0 0 0 14992 192 1029 0 0 56 2 419 29 1 0 99 0 0 0 0 14992 192 1029 0 0 0 0 922 27 0 0 100 0 0 0 0 14991 192 1029 0 0 0 36 763 50 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 646 35 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 722 54 0 0 100 0 0 0 0 14991 192 1029 0 0 0 0 793 27 0 0 100 0 ^C Here's dmesg output. I can't figure out why my PCI-X slots are negotiated as PCI. The network cards are all PCI-X with the exception of the integrated NICs that came with the server. In the output below it looks as if eth3 and eth2 negotiated at PCI-X speeds rather than PCI:66Mhz. Wouldn't they all drop to PCI:66Mhz? If your integrated NICs are PCI, as labeled below (eth0,eth1), then wouldn't all devices on your bus speed drop down to that slower bus speed? If not, I still don't know why only one of my NICs ( each has two ethernet ports) is labeled as PCI-X in the output below. Does that mean it is running at PCI-X speeds are is it showing that it's capable? # dmesg | grep e1000 [ 3678.349337] e1000: Intel(R) PRO/1000 Network Driver - version 7.3.21-k8-NAPI [ 3678.349342] e1000: Copyright (c) 1999-2006 Intel Corporation. [ 3678.349394] e1000 0000:06:07.0: PCI->APIC IRQ transform: INT A -> IRQ 48 [ 3678.409725] e1000 0000:06:07.0: Receive Descriptors set to 4096 [ 3678.409730] e1000 0000:06:07.0: Checksum Offload Disabled [ 3678.409734] e1000 0000:06:07.0: Flow Control Disabled [ 3678.586409] e1000 0000:06:07.0: eth0: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8c [ 3678.586419] e1000 0000:06:07.0: eth0: Intel(R) PRO/1000 Network Connection [ 3678.586642] e1000 0000:07:08.0: PCI->APIC IRQ transform: INT A -> IRQ 49 [ 3678.649854] e1000 0000:07:08.0: Receive Descriptors set to 4096 [ 3678.649859] e1000 0000:07:08.0: Checksum Offload Disabled [ 3678.649863] e1000 0000:07:08.0: Flow Control Disabled [ 3678.826436] e1000 0000:07:08.0: eth1: (PCI:66MHz:32-bit) 00:11:43:e0:e2:8d [ 3678.826444] e1000 0000:07:08.0: eth1: Intel(R) PRO/1000 Network Connection [ 3678.826627] e1000 0000:09:04.0: PCI->APIC IRQ transform: INT A -> IRQ 82 [ 3679.093266] e1000 0000:09:04.0: Receive Descriptors set to 4096 [ 3679.093271] e1000 0000:09:04.0: Checksum Offload Disabled [ 3679.093275] e1000 0000:09:04.0: Flow Control Disabled [ 3679.130239] e1000 0000:09:04.0: eth2: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6a [ 3679.130246] e1000 0000:09:04.0: eth2: Intel(R) PRO/1000 Network Connection [ 3679.130449] e1000 0000:09:04.1: PCI->APIC IRQ transform: INT B -> IRQ 83 [ 3679.397312] e1000 0000:09:04.1: Receive Descriptors set to 4096 [ 3679.397318] e1000 0000:09:04.1: Checksum Offload Disabled [ 3679.397321] e1000 0000:09:04.1: Flow Control Disabled [ 3679.434350] e1000 0000:09:04.1: eth3: (PCI-X:133MHz:64-bit) 00:04:23:e1:77:6b [ 3679.434360] e1000 0000:09:04.1: eth3: Intel(R) PRO/1000 Network Connection [ 3679.434553] e1000 0000:0a:03.0: PCI->APIC IRQ transform: INT A -> IRQ 77 [ 3679.704072] e1000 0000:0a:03.0: Receive Descriptors set to 4096 [ 3679.704077] e1000 0000:0a:03.0: Checksum Offload Disabled [ 3679.704081] e1000 0000:0a:03.0: Flow Control Disabled [ 3679.738364] e1000 0000:0a:03.0: eth4: (PCI:33MHz:64-bit) 00:04:23:b6:35:6c [ 3679.738371] e1000 0000:0a:03.0: eth4: Intel(R) PRO/1000 Network Connection [ 3679.738538] e1000 0000:0a:03.1: PCI->APIC IRQ transform: INT B -> IRQ 78 [ 3680.046060] e1000 0000:0a:03.1: eth5: (PCI:33MHz:64-bit) 00:04:23:b6:35:6d [ 3680.046067] e1000 0000:0a:03.1: eth5: Intel(R) PRO/1000 Network Connection [ 3682.132415] e1000: eth0 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.224423] e1000: eth1 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.316385] e1000: eth2 NIC Link is Up 100 Mbps Half Duplex, Flow Control: None [ 3682.408391] e1000: eth3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 3682.500396] e1000: eth4 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: None [ 3682.708401] e1000: eth5 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX At first I thought it was the NIC drivers but I'm not so sure. I really have no idea where else to look at the moment. Any help is greatly appreciated as I'm struggling with this. If you need more information just ask. Thanks! [1]http://www.cs.fsu.edu/~baker/devices/lxr/http/source/linux/Documentation/networking/e1000.txt?v=2.6.11.8 [2] http://support.dell.com/support/edocs/systems/pe2850/en/ug/t1390aa.htm

Read the article
ActiveMQ - "Cannot send, channel has already failed" every 2 seconds?

- by quanta

ActiveMQ 5.7.0 In the activemq.log, I'm seeing this exception every 2 seconds: 2013-11-05 13:00:52,374 | DEBUG | Transport Connection to: tcp://127.0.0.1:37501 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://127.0.0.1:37501 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) Due to this keyword InactivityIOException, the first thing comes to my mind is InactivityMonitor, but the strange thing is MaxInactivityDuration=30000: 2013-11-05 13:11:02,672 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-2 Moreover, I also didn't see something like this: No message received since last read check for ... or: Channel was inactive for too (30000) long Do a netstat, I see these connections in TIME_WAIT state: tcp 0 0 127.0.0.1:38545 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38544 127.0.0.1:61616 TIME_WAIT - tcp 0 0 127.0.0.1:38522 127.0.0.1:61616 TIME_WAIT - Here're the output when running tcpdump: Internet Protocol Version 4, Src: 127.0.0.1 (127.0.0.1), Dst: 127.0.0.1 (127.0.0.1) Version: 4 Header length: 20 bytes Differentiated Services Field: 0x00 (DSCP 0x00: Default; ECN: 0x00: Not-ECT (Not ECN-Capable Transport)) 0000 00.. = Differentiated Services Codepoint: Default (0x00) .... ..00 = Explicit Congestion Notification: Not-ECT (Not ECN-Capable Transport) (0x00) Total Length: 296 Identification: 0x7b6a (31594) Flags: 0x02 (Don't Fragment) 0... .... = Reserved bit: Not set .1.. .... = Don't fragment: Set ..0. .... = More fragments: Not set Fragment offset: 0 Time to live: 64 Protocol: TCP (6) Header checksum: 0xc063 [correct] [Good: True] [Bad: False] Source: 127.0.0.1 (127.0.0.1) Destination: 127.0.0.1 (127.0.0.1) Transmission Control Protocol, Src Port: 61616 (61616), Dst Port: 54669 (54669), Seq: 1, Ack: 2, Len: 244 Source port: 61616 (61616) Destination port: 54669 (54669) [Stream index: 11] Sequence number: 1 (relative sequence number) [Next sequence number: 245 (relative sequence number)] Acknowledgement number: 2 (relative ack number) Header length: 32 bytes Flags: 0x018 (PSH, ACK) 000. .... .... = Reserved: Not set ...0 .... .... = Nonce: Not set .... 0... .... = Congestion Window Reduced (CWR): Not set .... .0.. .... = ECN-Echo: Not set .... ..0. .... = Urgent: Not set .... ...1 .... = Acknowledgement: Set .... .... 1... = Push: Set .... .... .0.. = Reset: Not set .... .... ..0. = Syn: Not set .... .... ...0 = Fin: Not set Window size value: 256 [Calculated window size: 32768] [Window size scaling factor: 128] Checksum: 0xff1c [validation disabled] [Good Checksum: False] [Bad Checksum: False] Options: (12 bytes) No-Operation (NOP) No-Operation (NOP) Timestamps: TSval 2304161892, TSecr 2304161891 Kind: Timestamp (8) Length: 10 Timestamp value: 2304161892 Timestamp echo reply: 2304161891 [SEQ/ACK analysis] [Bytes in flight: 244] Constrained Application Protocol, TID: 240, Length: 244 00.. .... = Version: 0 ..00 .... = Type: Confirmable (0) .... 0000 = Option Count: 0 Code: Unknown (0) Transaction ID: 240 Payload Content-Type: text/plain (default), Length: 240, offset: 4 Line-based text data: text/plain [truncated] \001ActiveMQ\000\000\000\t\001\000\000\000<DE>\000\000\000\t\000\fMaxFrameSize\006\177<FF><FF><FF><FF> <FF><FF><FF>\000\tCacheSize\005\000\000\004\000\000\fCacheEnabled\001\001\000\022SizePrefixDisabled\001\000\000 MaxInactivityDurationInitalDelay\006\ It is very likely a tcp port check. This is what I see when trying telnet from another host: 2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: java.io.EOFException | org.apache.activemq.broker.TransportConnection.Transport | ActiveMQ Transport: tcp:///10.8.20.9:46775@61616 java.io.EOFException at java.io.DataInputStream.readInt(DataInputStream.java:375) at org.apache.activemq.openwire.OpenWireFormat.unmarshal(OpenWireFormat.java:275) at org.apache.activemq.transport.tcp.TcpTransport.readCommand(TcpTransport.java:229) at org.apache.activemq.transport.tcp.TcpTransport.doRun(TcpTransport.java:221) at org.apache.activemq.transport.tcp.TcpTransport.run(TcpTransport.java:204) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Transport Connection to: tcp://10.8.20.9:46775 failed: org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection.Transport | Async Exception Handler org.apache.activemq.transport.InactivityIOException: Cannot send, channel has already failed: tcp://10.8.20.9:46775 at org.apache.activemq.transport.AbstractInactivityMonitor.doOnewaySend(AbstractInactivityMonitor.java:282) at org.apache.activemq.transport.AbstractInactivityMonitor.oneway(AbstractInactivityMonitor.java:271) at org.apache.activemq.transport.TransportFilter.oneway(TransportFilter.java:85) at org.apache.activemq.transport.WireFormatNegotiator.oneway(WireFormatNegotiator.java:104) at org.apache.activemq.transport.MutexTransport.oneway(MutexTransport.java:68) at org.apache.activemq.broker.TransportConnection.dispatch(TransportConnection.java:1312) at org.apache.activemq.broker.TransportConnection.processDispatch(TransportConnection.java:838) at org.apache.activemq.broker.TransportConnection.iterate(TransportConnection.java:873) at org.apache.activemq.thread.PooledTaskRunner.runTask(PooledTaskRunner.java:129) at org.apache.activemq.thread.PooledTaskRunner$1.run(PooledTaskRunner.java:47) at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908) at java.lang.Thread.run(Thread.java:662) 2013-11-05 16:12:41,071 | DEBUG | Unregistering MBean org.apache.activemq:BrokerName=localhost,Type=Connection,ConnectorName=ope nwire,ViewType=address,Name=tcp_//10.8.20.9_46775 | org.apache.activemq.broker.jmx.ManagementContext | ActiveMQ Transport: tcp:/ //10.8.20.9:46775@61616 2013-11-05 16:12:41,073 | DEBUG | Stopping connection: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Stopping transport tcp:///10.8.20.9:46775@61616 | org.apache.activemq.transport.tcp.TcpTranspo rt | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,073 | DEBUG | Initialized TaskRunnerFactory[ActiveMQ Task] using ExecutorService: java.util.concurrent.Threa dPoolExecutor@23cc2a28 | org.apache.activemq.thread.TaskRunnerFactory | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Closed socket Socket[addr=/10.8.20.9,port=46775,localport=61616] | org.apache.activemq.transpo rt.tcp.TcpTransport | ActiveMQ Task-1 2013-11-05 16:12:41,074 | DEBUG | Forcing shutdown of ExecutorService: java.util.concurrent.ThreadPoolExecutor@23cc2a28 | org.apache.activemq.util.ThreadPoolUtils | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Stopped transport: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,074 | DEBUG | Connection Stopped: tcp://10.8.20.9:46775 | org.apache.activemq.broker.TransportConnection | ActiveMQ BrokerService[localhost] Task-5 2013-11-05 16:12:41,902 | DEBUG | Sending: WireFormatInfo { version=9, properties={MaxFrameSize=9223372036854775807, CacheSize=1024, CacheEnabled=true, SizePrefixDisabled=false, MaxInactivityDurationInitalDelay=10000, TcpNoDelayEnabled=true, MaxInactivityDuration=30000, TightEncodingEnabled=true, StackTraceEnabled=true}, magic=[A,c,t,i,v,e,M,Q]} | org.apache.activemq.transport.WireFormatNegotiator | ActiveMQ BrokerService[localhost] Task-5 So the question is: how can I find out the process that is trying to connect to my ActiveMQ (from localhost) every 2 seconds?

Read the article
CRC8-Check in PHP

- by Lancelot

How can I generate a CRC-8 checksum in PHP?

Read the article
SIGSEGV problem

- by sickmate

I'm designing a protocol (in C) to implement the layered OSI network structure, using cnet (http://www.csse.uwa.edu.au/cnet/). I'm getting a SIGSEGV error at runtime, however cnet compiles my source code files itself (I can't compile it through gcc) so I can't easily use any debugging tools such as gdb to find the error. Here's the structures used, and the code in question: typedef struct { char *data; } DATA; typedef struct { CnetAddr src_addr; CnetAddr dest_addr; PACKET_TYPE type; DATA data; } Packet; typedef struct { int length; int checksum; Packet datagram; } Frame; static void keyboard(CnetEvent ev, CnetTimerID timer, CnetData data) { char line[80]; int length; length = sizeof(line); CHECK(CNET_read_keyboard((void *)line, (unsigned int *)&length)); // Reads input from keyboard if(length > 1) { /* not just a blank line */ printf("\tsending %d bytes - \"%s\"\n", length, line); application_downto_transport(1, line, &length); } } void application_downto_transport(int link, char *msg, int *length) { transport_downto_network(link, msg, length); } void transport_downto_network(int link, char *msg, int *length) { Packet *p; DATA *d; p = (Packet *)malloc(sizeof(Packet)); d = (DATA *)malloc(sizeof(DATA)); d->data = msg; p->data = *d; network_downto_datalink(link, (void *)p, length); } void network_downto_datalink(int link, Packet *p, int *length) { Frame *f; // Encapsulate datagram and checksum into a Frame. f = (Frame *)malloc(sizeof(Frame)); f->checksum = CNET_crc32((unsigned char *)(p->data).data, *length); // Generate 32-bit CRC for the data. f->datagram = *p; f->length = sizeof(f); //Pass Frame to the CNET physical layer to send Frame to the require link. CHECK(CNET_write_physical(link, (void *)f, (size_t *)f->length)); free(p->data); free(p); free(f); } I managed to find that the line: CHECK(CNET_write_physical(link, (void *)f, (size_t *)f-length)); is causing the segfault but I can't work out why. Any help is greatly appreciated.

Read the article
Finding the right version of the right JAR in a maven repository

- by Matt McHenry

I'm converting a build that has 71 .jar files in its global lib/ directory to use Maven. Of course, these have been pulled from the web by lots of developers over the past ten years of this project's history, and weren't always added to VCS with all the necessary version info, etc. Is there an easy, automated way to go from that set of .jar files to the corresponding <dependency/> elements for use in my pom.xml files? I'm hoping for a web page where I can submit the checksum of a jar file and get back an XML snippet. The google hits for 'maven repository search' are basically just finding name-based searches. And http://repo1.maven.org/ has no search whatsoever, as far as I can see. Update: GrepCode looks like it can find projects given an MD5 checksum. But it doesn't provide the particular details (groupId, artifactId) that Maven needs.

Read the article

< Previous Page | 2 3 4 5 6 7 8 9 10 11 12 13 | Next Page >