Today I experienced a deadlock using localtime_r() after forking a multithreaded program. I was quite surprised because, well, by using localtime_r() I thought I was in some way safe. But I was not ;)
In short, I use a logging function which calls localtime_r(): my_log("message");
The same thing happens even if you use syslog(), because syslog() on most systems uses localtime() [yes, the non-reentrant one too].
After the child hung, I found that it was in futex_wait state using:
# ps -elf ;
Looking for a mutex, I discovered that, as you (don't) know, localtime_r() calls tzset(). That is a libc function which holds a mutex (call it _tz_mutex) while it updates the current timezone state.
The “bad” code was basically doing the following:
int main(void) {
    /* ... spawn many threads using my_log() ... */
    if (fork() == 0) {
        my_log("I am the child");
        execv("/bin/bash", ...);
    }
}
This is bad, because the following can happen:
- one of the parent threads runs my_log(), locking the _tz_mutex (which is global and not thread local) ;
- before the _tz_mutex is released, the main thread forks;
- fork() copies the locked mutex into the child, because it lives in global memory, but does not copy the thread that locked it, so nobody in the child can ever unlock it;
- child runs my_log(), trying to lock _tz_mutex and hanging.
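The sequence above can be reproduced deterministically with a mutex of your own. What follows is only a sketch: demo_mutex is a hypothetical stand-in for libc's internal lock, the two semaphores exist just to force the unlucky timing, and the child uses pthread_mutex_trylock() so that it reports the problem instead of hanging. It assumes Linux with POSIX unnamed semaphores.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for the timezone lock inside libc. */
static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static sem_t locked;   /* posted once the worker holds demo_mutex */
static sem_t release;  /* posted to let the worker unlock and exit */

static void *worker(void *arg) {
    (void)arg;
    pthread_mutex_lock(&demo_mutex);
    sem_post(&locked);
    sem_wait(&release);              /* hold the lock across the fork() */
    pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns 1 if the child found the mutex locked, 0 if not, -1 on error. */
int child_sees_locked_mutex(void) {
    pthread_t tid;
    int status = 0;
    int result = -1;

    if (sem_init(&locked, 0, 0) != 0 || sem_init(&release, 0, 0) != 0)
        return -1;
    if (pthread_create(&tid, NULL, worker, NULL) != 0)
        return -1;
    sem_wait(&locked);               /* the worker now owns demo_mutex */

    pid_t pid = fork();
    if (pid == 0) {
        /* Only the forking thread exists here; the lock owner is gone.
           A plain pthread_mutex_lock() would block forever. */
        _exit(pthread_mutex_trylock(&demo_mutex) == EBUSY ? 1 : 0);
    }

    sem_post(&release);              /* parent side: let the worker finish */
    pthread_join(tid, NULL);
    if (pid > 0 && waitpid(pid, &status, 0) > 0 && WIFEXITED(status))
        result = WEXITSTATUS(status);
    sem_destroy(&locked);
    sem_destroy(&release);
    return result;
}
```

Calling child_sees_locked_mutex() from main() and printing the result should show 1: in the parent the lock is released normally, in the child it stays locked forever.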
This behavior is described in the rationale of pthread_atfork():
When fork() is called, only the calling thread is duplicated in the child process. Synchronization variables remain in the same state in the child as they were in the parent at the time fork() was called. Thus, for example, mutex locks may be held by threads that no longer exist in the child process, and any associated states may be inconsistent.
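For a mutex that belongs to your own code, pthread_atfork() lets you take the lock just before the fork and release it on both sides afterwards, so the child never inherits it in a locked state owned by a missing thread. A minimal sketch, assuming a hypothetical log_mutex that protects your own logging state (the helper names are made up for illustration). It does nothing for locks hidden inside libc, such as the one behind localtime_r().

```c
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical application-level lock, not the one inside libc. */
static pthread_mutex_t log_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in the forking thread just before fork(): wait for any other
   thread to leave the critical section, then hold the lock. */
static void before_fork(void)       { pthread_mutex_lock(&log_mutex); }
/* Run after fork() in parent and child: both copies get unlocked. */
static void after_fork_parent(void) { pthread_mutex_unlock(&log_mutex); }
static void after_fork_child(void)  { pthread_mutex_unlock(&log_mutex); }

/* Register the handlers once, early in the program. Returns 0 on success. */
int install_log_fork_handlers(void) {
    return pthread_atfork(before_fork, after_fork_parent, after_fork_child);
}

/* Forks and reports whether the child could take log_mutex:
   1 = yes, 0 = no, -1 = error. */
int child_can_lock_after_fork(void) {
    int status = 0;
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0)
        _exit(pthread_mutex_trylock(&log_mutex) == 0 ? 1 : 0);
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

This only helps for locks you can name, which is why the advice that follows still applies.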
Moral:
1. don’t trust functions just because they end with “_r”;
2. run execv() ASAP after fork(), as stated in man pthread_atfork():
It is suggested that programs that use fork() call an exec function very soon afterwards in the child process, thus resetting all states. In the meantime, only a short list of async-signal-safe library routines are promised to be available.
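3. between fork() and execv() use only async-signal-safe functions: write(), dup(),… Note that printf() is not one of them, since stdio takes its own locks. You can find the list of async-signal-safe functions in
# man 7 signal
(on newer Linux systems the list lives in man 7 signal-safety).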
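Putting the three points together, the child side could look roughly like this. It is a sketch, not the original program: spawn_and_wait() is a hypothetical helper, the message length is computed in the parent before fork(), and the child limits itself to write(), execv() and _exit(), all of which are on the async-signal-safe list.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Forks, lets the child emit a short note on stderr with write(), then
   execs `path`. Returns the child's exit status, or -1 on error. */
int spawn_and_wait(const char *path, char *const argv[], const char *note) {
    size_t len = strlen(note);       /* prepared before fork() */
    int status = 0;

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        /* No my_log(), no localtime_r(), no stdio in here. */
        ssize_t n = write(STDERR_FILENO, note, len);
        (void)n;
        execv(path, argv);
        _exit(127);                  /* reached only if execv() failed */
    }
    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

If the child really needs a timestamp in its message, one option is to format it in the parent before fork() and hand the finished string to the child.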