The system is CentOS 5 x86_64, completely up to date.
I've got a directory that can't be listed: ls just hangs, eating memory until it is killed. The directory itself is nearly 500 KB in size:
root@server [/home/user/public_html/domain.com/wp-content/uploads/2010/03]# stat .
File: `.'
Size: 458752 Blocks: 904 IO Block: 4096 directory
Device: 812h/2066d Inode: 44499071 Links: 2
Access: (0755/drwxr-xr-x) Uid: ( 3292/ user) Gid: ( 3287/ user)
Access: 2012-06-29 17:31:47.000000000 -0400
Modify: 2012-10-23 14:41:58.000000000 -0400
Change: 2012-10-23 14:41:58.000000000 -0400
I can see the file names if I use ls -1f, but it just repeats the same 48 files ad infinitum. All of them have non-ASCII characters somewhere in the file name, for example:
La-critic\363-al-servicio-la-privacidad-300x160.jpg
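The repetition can also be detected programmatically instead of by eye. The following is only a minimal sketch, assuming plain readdir is enough to reproduce the looping listing; the helper name count_until_repeat is hypothetical. It walks a directory and stops the first time a name comes back a second time:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: walk dir_path and stop at the first repeated name.
 * Returns the number of distinct entries seen (excluding "." and ".."),
 * or -1 on error. *looped is set to 1 if a name was returned twice. */
long count_until_repeat(const char *dir_path, int *looped)
{
    assert(dir_path != NULL && looped != NULL);
    *looped = 0;

    DIR *dir = opendir(dir_path);
    if (dir == NULL)
        return -1;

    char **seen = NULL;
    long count = 0, cap = 0, result = 0;
    struct dirent *ent;

    while ((ent = readdir(dir)) != NULL) {
        if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
            continue;

        /* Linear search: fine for a few dozen names, slow for huge listings. */
        long i;
        for (i = 0; i < count; i++)
            if (strcmp(seen[i], ent->d_name) == 0)
                break;
        if (i < count) {
            fprintf(stderr, "repeated entry: %s\n", ent->d_name);
            *looped = 1;
            break;
        }

        if (count == cap) {
            long new_cap = cap ? cap * 2 : 64;
            char **tmp = realloc(seen, (size_t) new_cap * sizeof *tmp);
            if (tmp == NULL) { result = -1; break; }
            seen = tmp;
            cap = new_cap;
        }
        seen[count] = strdup(ent->d_name);
        if (seen[count] == NULL) { result = -1; break; }
        count++;
    }

    if (result == 0)
        result = count;
    for (long i = 0; i < count; i++)
        free(seen[i]);
    free(seen);
    closedir(dir);
    return result;
}
```

On a healthy directory this returns the total number of entries with *looped left at 0. On a directory whose listing cycles, it should stop at the first name that repeats.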
When I try to access the files (say, to copy or remove them), I get messages like the following:
lstat("/home/user/public_html/domain.com/wp-content/uploads/2010/03/Sebast\355an-Pi\361era-el-balc\363n-150x120.jpg", 0x7fff364c52c0) = -1 ENOENT (No such file or directory)
I took the example code from this man page and modified it to call unlink for each file. The unlink call fails with the same ENOENT error:
unlink("/home/user/public_html/domain.com/wp-content/uploads/2010/03/Marca-naci\363n-Madrid-150x120.jpg") = -1 ENOENT (No such file or directory)
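For illustration, the unlink step can be written so that it reports the errno text itself rather than relying on strace. This is a minimal sketch under assumptions, not the exact code that was run: the helper name unlink_in_dir is hypothetical, and it joins the directory and file name with exactly one slash.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: join dir and name with exactly one '/', unlink the
 * result, and return 0 on success or the errno value on failure. */
int unlink_in_dir(const char *dir, const char *name)
{
    assert(dir != NULL && name != NULL);

    char path[PATH_MAX];
    size_t dir_len = strlen(dir);
    /* Only add a separator if dir does not already end with one. */
    const char *sep = (dir_len > 0 && dir[dir_len - 1] == '/') ? "" : "/";

    int n = snprintf(path, sizeof path, "%s%s%s", dir, sep, name);
    if (n < 0 || (size_t) n >= sizeof path)
        return ENAMETOOLONG;

    if (unlink(path) == -1) {
        int err = errno; /* save it before fprintf can overwrite errno */
        fprintf(stderr, "unlink(\"%s\"): %s\n", path, strerror(err));
        return err;
    }
    return 0;
}
```

Called with the directory and a d_name from the getdents loop, it returns ENOENT in the failing case described above and 0 when the file is actually removed.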
I also straced a "touch", grabbed the syscalls it makes, and replicated them, then tried to unlink the resulting file by name. That works, but the directory still contains an entry with the same name afterward, and the program runs for an arbitrarily long time (the strace output had reached 20GB after 5 minutes, at which point I stopped the process).
I'm stumped on this one. I'd really prefer not to take this production machine (hundreds of customers) offline to fsck the filesystem, but I'm leaning toward that being the only option at this point. If anyone has had success with other methods of removing files (by inode number, for example; I can get those with the getdents code), I'd love to hear them.
(Yes, I've tried find . -inum <inode> -exec rm -fv {} \; and unlink still returns ENOENT.)
For those interested, here's the diff between that man page's code and mine. I didn't bother with error checking on mallocs, etc. because I'm lazy and this is a one-off:
root@server [~]# diff -u listdir-orig.c listdir.c
--- listdir-orig.c 2012-10-23 15:10:02.000000000 -0400
+++ listdir.c 2012-10-23 14:59:47.000000000 -0400
@@ -6,6 +6,7 @@
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>
+#include <string.h>
#define handle_error(msg) \
do { perror(msg); exit(EXIT_FAILURE); } while (0)
@@ -17,7 +18,7 @@
char d_name[];
};
-#define BUF_SIZE 1024
+#define BUF_SIZE 1024*1024*5
int main(int argc, char *argv[])
{
@@ -26,11 +27,16 @@
struct linux_dirent *d;
int bpos;
char d_type;
+ int deleted;
+ int file_descriptor;
fd = open(argc > 1 ? argv[1] : ".", O_RDONLY | O_DIRECTORY);
if (fd == -1)
handle_error("open");
+ char* full_path;
+ char* fd_path;
+
for ( ; ; ) {
nread = syscall(SYS_getdents, fd, buf, BUF_SIZE);
if (nread == -1)
@@ -55,7 +61,24 @@
printf("%4d %10lld %s\n", d->d_reclen,
(long long) d->d_off, (char *) d->d_name);
bpos += d->d_reclen;
+ if ( d_type == DT_REG )
+ {
+ full_path = malloc(strlen((char *) d->d_name) + strlen(argv[1]) + 2); //One for the /, one for the \0
+ strcpy(full_path, argv[1]);
+ strcat(strcat(full_path, "/"), (char *) d->d_name); //Add the / that the malloc above reserves space for
+
+ //We're going to try to "touch" the file.
+ //file_descriptor = open(full_path, O_WRONLY|O_CREAT|O_NOCTTY|O_NONBLOCK, 0666);
+ //fd_path = malloc(32); //Lazy, only really needs 16
+ //sprintf(fd_path, "/proc/self/fd/%d", file_descriptor);
+ //utimes(fd_path, NULL);
+ //close(file_descriptor);
+ deleted = unlink(full_path);
+ if ( deleted == -1 ) printf("Error unlinking file\n");
+ break; //Break on first try
+ }
}
+ break; //Break on first try
}
exit(EXIT_SUCCESS);