I'm trying to debug a program I wrote. I ran it inside gdb and managed to catch a SIGABRT raised from inside calloc(). I'm completely confused about how this can happen. Could it be a bug in gcc or even libc?
More details: My program uses OpenMP. I ran it through valgrind in single-threaded mode with no errors. I also use mmap() to load a 40GB file, but I doubt that is relevant. Inside gdb, I'm running with 30 threads. Several identical runs (same input and command line) finished correctly before the problematic one that I caught. On the surface, this suggests there might be a race condition of some kind. However, the SIGABRT comes from calloc(), which is out of my control. Here is some relevant gdb output:
(gdb) info threads
[...]
* 11 Thread 0x7ffff0056700 (LWP 73449) 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6
[...]
(gdb) thread 11
[Switching to thread 11 (Thread 0x7ffff0056700 (LWP 73449))]#0 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6
(gdb) bt
#0 0x00007ffff6a948a5 in raise () from /lib64/libc.so.6
#1 0x00007ffff6a96085 in abort () from /lib64/libc.so.6
#2 0x00007ffff6ad1fe7 in __libc_message () from /lib64/libc.so.6
#3 0x00007ffff6ad7916 in malloc_printerr () from /lib64/libc.so.6
#4 0x00007ffff6adb79f in _int_malloc () from /lib64/libc.so.6
#5 0x00007ffff6adbdd6 in calloc () from /lib64/libc.so.6
#6 0x000000000040e87f in my_calloc (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/../gmapper/../common/my-alloc.h:286
#7 read_get_hit_list_per_strand (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/mapping.c:1046
#8 0x000000000041308a in read_get_hit_list (re=<value optimized out>, options=0x632010, n_options=1) at gmapper/mapping.c:1239
#9 handle_read (re=<value optimized out>, options=0x632010, n_options=1) at gmapper/mapping.c:1806
#10 0x0000000000404f35 in launch_scan_threads (.omp_data_i=<value optimized out>) at gmapper/gmapper.c:557
#11 0x00007ffff7230502 in ?? () from /usr/lib64/libgomp.so.1
#12 0x00007ffff6dfc851 in start_thread () from /lib64/libpthread.so.0
#13 0x00007ffff6b4a11d in clone () from /lib64/libc.so.6
(gdb) f 6
#6 0x000000000040e87f in my_calloc (re=0x7fff2867ef10, st=0, options=0x632020) at gmapper/../gmapper/../common/my-alloc.h:286
286 res = calloc(size, 1);
(gdb) p size
$2 = 814080
(gdb)
The function my_calloc() is just a thin wrapper, and I don't think the problem is in there, since the underlying calloc() call looks legitimate. These are the limits set in the shell:
$ ulimit -a
core file size (blocks, -c) 0
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 2067285
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 1024
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 10240
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
The program is not out of memory: it's using 41GB on a machine with 256GB available:
$ top -b -n 1 | grep gmapper
73437 user 20 0 41.5g 16g 15g T 0.0 6.6 55:17.24 gmapper-ls
$ free -m
total used free shared buffers cached
Mem: 258437 195567 62869 0 82 189677
-/+ buffers/cache: 5807 252629
Swap: 0 0 0
I compiled using gcc (GCC) 4.4.6 20120305 (Red Hat 4.4.6-4), with flags -g -O2 -DNDEBUG -mmmx -msse -msse2 -fopenmp -Wall -Wno-deprecated -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS.
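For reference, a wrapper of the kind described above might look roughly like the sketch below. It is not the actual code from my-alloc.h: the name my_calloc_sketch, its single size parameter, and the error message are assumptions for illustration only. What it shows is that calloc() reports an out-of-memory condition by returning NULL, which a wrapper can check for, whereas the SIGABRT in the backtrace is raised inside calloc() itself, before it returns to the wrapper.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the wrapper in my-alloc.h; the real one may differ. */
static void *my_calloc_sketch(size_t size)
{
    void *res;

    errno = 0;
    res = calloc(size, 1);  /* same call shape as the one shown in frame #6 */
    if (res == NULL && size != 0) {
        /* Out-of-memory path: calloc() signals failure by returning NULL. */
        fprintf(stderr, "calloc(%zu, 1) failed: %s\n", size, strerror(errno));
    }
    return res;
}
```

With the size from the gdb session, my_calloc_sketch(814080) would normally return a zero-filled buffer that the caller later releases with free().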