segfault during __cxa_allocate_exception in SWIG wrapped library
- by lefticus
While developing a SWIG wrapped C++ library for Ruby, we came across an unexplained crash during exception handling inside the C++ code.
I'm not sure which specific circumstances are needed to reproduce the issue, but the crash first occurred during a call to std::uncaught_exception and then, after some code changes, moved to __cxa_allocate_exception during exception construction. Neither GDB nor valgrind provided any insight into the cause of the crash.
I've found several references to similar problems, including:
http://wiki.fifengine.de/Segfault_in_cxa_allocate_exception
http://forums.fifengine.de/index.php?topic=30.0
http://code.google.com/p/osgswig/issues/detail?id=17
https://bugs.launchpad.net/ubuntu/+source/libavg/+bug/241808
The common theme seems to be a combination of circumstances:
A C application is linked to more than one C++ library
More than one version of libstdc++ was used during compilation
Generally the second version of libstdc++ comes from a binary-only implementation of libGL
The problem does not occur when linking your library with a C++ application, only with a C application
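One way to check the "more than one libstdc++" condition from inside the running process is to list the shared objects that are actually loaded. The following is only a sketch, assuming Linux with glibc (dl_iterate_phdr is not portable); the names Match, collect_callback and loaded_objects_matching are made up for illustration and are not part of SWIG or any of the libraries mentioned above.

```cpp
#include <link.h>    // dl_iterate_phdr, dl_phdr_info (glibc-specific assumption)
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical helper state: the substring to look for and the matches found.
struct Match {
    const char* needle;
    std::vector<std::string> paths;
};

// Called by dl_iterate_phdr once per loaded object (main program included).
static int collect_callback(dl_phdr_info* info, std::size_t, void* data) {
    Match* m = static_cast<Match*>(data);
    // The main program is usually reported with an empty name.
    const char* name = info->dlpi_name ? info->dlpi_name : "";
    if (std::strstr(name, m->needle) != nullptr) {
        m->paths.push_back(name);
    }
    return 0;  // returning 0 continues the iteration
}

// Returns the names of all loaded objects whose name contains `needle`.
std::vector<std::string> loaded_objects_matching(const char* needle) {
    Match m{needle, {}};
    dl_iterate_phdr(collect_callback, &m);
    return m.paths;
}
```

Calling loaded_objects_matching("libstdc++") from code compiled into the wrapped library would show which copies are present in the process; more than one entry would be consistent with the pattern described above.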
The "solution" is to explicitly link your library with libstdc++ and possibly also with libGL, forcing the order of linking.
After trying many combinations with my code, the only thing that worked was preloading both libraries at run time: LD_PRELOAD="libGL.so libstdc++.so.6" ruby scriptname. None of the compile-time linking solutions made any difference.
My understanding of the issue is that the C++ runtime is not being properly initialized. Forcing the order of linking bootstraps the initialization process, and then it works. The problem occurs only with C applications calling C++ libraries because the C application does not itself link to libstdc++ and so does not initialize the C++ runtime. SWIG (or boost::python) is a common way of calling a C++ library from a C application, which is why SWIG often comes up when researching the problem.
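For illustration, here is a minimal sketch of the kind of code path involved: a C-callable entry point into a C++ library that throws and catches internally. With GCC, a throw expression calls __cxa_allocate_exception to obtain storage for the exception object, which is where the crash described above appeared. The function names parse_positive and parse_positive_c are hypothetical and not taken from our library.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical C++ library function that reports errors by throwing.
static int parse_positive(const std::string& text) {
    int value = std::stoi(text);  // throws std::invalid_argument or std::out_of_range
    if (value <= 0) {
        // With GCC this throw allocates the exception object through
        // __cxa_allocate_exception before unwinding starts.
        throw std::range_error("value must be positive");
    }
    return value;
}

// C-callable wrapper, similar in spirit to what a binding layer generates.
// No exception may escape into the C caller, so errors map to -1.
extern "C" int parse_positive_c(const char* text) {
    try {
        return parse_positive(text ? text : "");
    } catch (const std::exception&) {
        return -1;
    }
}
```

When a host written in C (such as the ruby interpreter) loads a library like this, the throw inside parse_positive depends on the C++ runtime support in libstdc++ being the one the library was built against.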
Is anyone out there able to give more insight into this problem? Is there an actual solution or do only workarounds exist?
Thanks.