Monday, June 23, 2008

On libcurl, OpenSSL, and thread-safety

The cURL project with its libcurl is a frequent choice of developers requiring a feature-rich HTTP client. Indeed, libcurl is a good choice: it supports HTTP/1.1, several authentication mechanisms (including Kerberos / SPNEGO authentication), and HTTPS, to name a few important aspects. It can also be used in multi-threaded applications, but you have to be aware of some fundamental facts in order to avoid random segmentation faults or aborts. Linux is assumed here.

First, to avoid pitfall #1, be sure to disable the library's use of signals by adding the following line of code to the initialization phase of your application (the option is set per easy handle, so apply it to every handle you create):
curl_easy_setopt(handle, CURLOPT_NOSIGNAL, 1L);

This option deactivates code that works around the fact that DNS lookups initiated via gethostbyname cannot be interrupted: there is no timeout facility. The libcurl developers of course knew about the one generic way to interrupt the current thread of execution and execute code in that context: signals.

In order to interrupt gethostbyname, libcurl saves the current state of execution via setjmp and schedules a timeout of n seconds by calling alarm, which results in delivery of SIGALRM at that point in time. If gethostbyname does not return in time, the signal handler associated with SIGALRM is called. The handler restores the saved state using longjmp to rewind execution, and the non-zero return value of setjmp indicates the timeout.
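#include <netdb.h>
#include <setjmp.h>
#include <signal.h>
#include <unistd.h>

static jmp_buf state;

static void signalhandler(int signo)
{
    (void)signo;
    longjmp(state, 1);              /* rewind to setjmp() in lookup() */
}

void init(void)
{
    signal(SIGALRM, signalhandler);
}

int lookup(const char *host)
{
    if (setjmp(state) == 0) {
        alarm(30);                  /* SIGALRM after 30 seconds */
        gethostbyname(host);
        /* snip */
        return 0;
    }
    else {
        /* timeout */
        return -1;
    }
}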

In a single-threaded application, that works fine because alarm only affects the process and the main thread is identical to the process itself. In a multi-threaded application, however, the signal is delivered to the process and not to a particular thread, so the workaround can fail: an unrelated thread could handle SIGALRM and jump to a state that was saved by a different thread. While this strategy may work in the case of LinuxThreads, it does not with NPTL, the state-of-the-art Linux thread library that implements POSIX semantics. The relevant part of the specification is as follows:

"There were two possible choices for alarm generation in multi-threaded applications: generation for the calling thread or generation for the process. The first option would not have been particularly useful since the alarm state is maintained on a per-process basis and the alarm that is established by the last invocation of alarm() is the only one that would be active.

Furthermore, allowing generation of an asynchronous signal for a thread would have introduced an exception to the overall signal model. This requires a compelling reason in order to be justified."
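
The per-process nature of the alarm state can be observed directly. Here is a minimal sketch (the helper name replace_alarm is made up for illustration): a second call to alarm() replaces the first one and returns the seconds that were left on it, so only the last alarm is ever active.

```c
#include <unistd.h>

/*
 * Hypothetical helper: arm an alarm, then arm a second one.
 * alarm() returns the seconds remaining on the previously armed alarm,
 * so a non-zero return shows that the first alarm was replaced rather
 * than kept alongside the second. The pending alarm is cancelled before
 * returning, so no SIGALRM is delivered.
 */
unsigned replace_alarm(unsigned first, unsigned second,
                       unsigned *left_on_second)
{
    unsigned left_on_first;

    alarm(first);                    /* arm the per-process timer         */
    left_on_first = alarm(second);   /* replace it; time left on 'first'  */
    *left_on_second = alarm(0);      /* cancel; time left on 'second'     */
    return left_on_first;
}
```

Calling replace_alarm(10, 5, &left) should report roughly 10 seconds left on the first alarm and roughly 5 on the second, no matter which thread makes the calls.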


So, by setting the cURL option, you disable DNS timeouts but thereby avoid related segmentation faults in multi-threaded applications. If you don't plan to invest more time into fixing this issue, that's an acceptable solution.

Pitfall #2 is only indirectly related to libcurl, but it does cause random "crashes" if you intend to use libcurl for HTTPS requests in multi-threaded applications. The module of interest is OpenSSL, the primary backend for libcurl and HTTPS. By default, using libcurl and HTTPS in multiple threads can lead to crashes, even if you do not share libcurl-specific handles or, more generally, memory. The reason for that is OpenSSL, which is not thread-safe by default.

Rather than providing a thread-safe library out-of-the-box, the OpenSSL team decided to leave this as an exercise for the (documentation) reader and / or user. OpenSSL provides callbacks that define functions for serializing access to resources. Two sets of callbacks exist: one provides access to a static set of locks that can be locked and unlocked, the other allows for allocation, deallocation and lock / unlock of a lock object.
By default, these are not implemented. In other words, code using OpenSSL with threads can fail unless the developer has read the relevant parts of the documentation and implemented the required callbacks correctly for all supported platforms.
Hmmm. Don't get me wrong, I'm the first to vote for reading the documentation / specification before writing code and I'm a big fan of generic and flexible interfaces but, IMHO, the OpenSSL team took the easy path here. What I'd have done is implement locking for supported platforms right in the library to cover all direct and indirect (e.g. libcurl) users.
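
To give an idea of what the static set of locks asks of you, here is a minimal sketch using pthread mutexes. It does not include the OpenSSL headers, so the DEMO_ names are stand-ins of my own: the callback has the shape OpenSSL expects, and in real code the lock count would come from CRYPTO_num_locks() and the two callbacks would be registered with CRYPTO_set_locking_callback() and CRYPTO_set_id_callback().

```c
#include <pthread.h>
#include <stdlib.h>

/* Stand-ins for OpenSSL's CRYPTO_LOCK / CRYPTO_UNLOCK mode bits
 * (assumption: same values as in the OpenSSL headers of that era). */
#define DEMO_CRYPTO_LOCK   1
#define DEMO_CRYPTO_UNLOCK 2

static pthread_mutex_t *locks;   /* one mutex per static lock */
static int lock_count;

/* Shape of a static locking callback: 'mode' says lock or unlock,
 * 'n' selects the lock, 'file' and 'line' are for debugging only. */
void demo_locking_callback(int mode, int n, const char *file, int line)
{
    (void)file;
    (void)line;
    if (mode & DEMO_CRYPTO_LOCK)
        pthread_mutex_lock(&locks[n]);
    else
        pthread_mutex_unlock(&locks[n]);
}

/* Thread id callback: returns a number identifying the calling thread.
 * The cast assumes pthread_t is an integer type, as on Linux. */
unsigned long demo_id_callback(void)
{
    return (unsigned long)pthread_self();
}

/* Allocate and initialize 'count' mutexes; returns 0 on success. */
int demo_locks_init(int count)
{
    int i;

    locks = malloc((size_t)count * sizeof(*locks));
    if (locks == NULL)
        return -1;
    for (i = 0; i < count; i++)
        pthread_mutex_init(&locks[i], NULL);
    lock_count = count;
    return 0;
}

/* Destroy the mutexes; call only after the callbacks are uninstalled. */
void demo_locks_destroy(void)
{
    int i;

    for (i = 0; i < lock_count; i++)
        pthread_mutex_destroy(&locks[i]);
    free(locks);
    locks = NULL;
    lock_count = 0;
}
```

The second set of callbacks (the dynamic locks) follows the same idea, with separate functions for creating, locking and destroying a single lock object.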

So, should libcurl define these callbacks (it doesn't)? That's a tough one. The main issue is that these callbacks are process-global and thus must be implemented by every single library or module that makes use of OpenSSL. Depending on how these modules and OpenSSL are linked (dynamically or statically), there's no truly correct implementation strategy for these callbacks: modules could overwrite each other's callbacks and cause memory leaks, hangs, and segmentation faults as a result of an implementation mismatch at an arbitrary point in time.

Consider the following scenario: a process loads library A, which depends on libcurl. Consequently, library A implements the callbacks and installs them at load time.
A library used by library A, THIRDPARTY, uses libpq, the official PostgreSQL driver. The driver also uses OpenSSL and initializes the callbacks in the function PQinitSSL.
Now, the following sequence of events can cause a hang because the unlock implementation does not match the lock implementation:

1) Initialization of library A
2) Library A initializes OpenSSL locking callbacks
3) Library A receives libcurl requests from multiple threads, which result in repeated lock and unlock callback invocations
4) Thread 1 invokes the lock callback in OpenSSL
5) A new request handled by thread 2 results in the initialization of THIRDPARTY, which loads libpq and calls PQinitSSL. The initialization overrides the locking callbacks
6) Thread 1 invokes the unlock callback, which has no effect on the previously locked lock object because the implementation changed between lock and unlock. The result is an application hang or undefined behavior
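
The sequence above can be replayed in a few lines. This is only a simulation under my own assumptions: a single function pointer stands in for the process-global callback slot, plain flags replace real mutexes so that nothing blocks, and all names are made up.

```c
#include <stddef.h>

/* Hypothetical stand-in for the single, process-global callback slot. */
typedef void (*lock_cb)(int lock);
static lock_cb global_cb = NULL;

/* Each module keeps its own lock state. Flags are used instead of
 * mutexes so that the outcome can be inspected without hanging. */
static int lib_a_held = 0;
static int libpq_held = 0;

static void lib_a_cb(int lock) { lib_a_held = lock; }
static void libpq_cb(int lock) { libpq_held = lock; }

/* Replays steps 2 to 6; returns 1 if library A's lock is left held. */
int replay_scenario(void)
{
    global_cb = lib_a_cb;   /* step 2: library A installs its callback */
    global_cb(1);           /* step 4: thread 1 locks via A's callback */
    global_cb = libpq_cb;   /* step 5: the callback is overridden      */
    global_cb(0);           /* step 6: unlock reaches the wrong module */
    return lib_a_held;      /* A's lock was never released             */
}
```

With real mutexes, the next attempt to take library A's lock would block forever, and the unlock in step 6 would hit a mutex that was never locked.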

Of course there are other scenarios that can lead to problems, all of them caused by the global locking callbacks that must be implemented for thread-safe operation. My point is that whenever global resources and multiple libraries are involved, chances are that these cannot coexist.

I recommend one of the following two approaches to honoring the OpenSSL contract while also minimizing collisions with other libraries or modules:

Solution A) Beware of other libraries in your locking callback implementation

Implement and install your callbacks knowing that other libraries might have installed callbacks already. In particular, do not install callbacks if callbacks are already in place (and consistent) to avoid causing hangs or undefined behavior. Uninstall callbacks on unload to avoid crashes on subsequent callback invocations. Uninstall the callbacks only if they are the ones installed by your library; previously installed callbacks must not be affected. New callbacks installed by other modules after the initialization of your library should not be affected either (it's better to leave them in place than to have no callbacks installed).

The following pseudo-code implements these recommendations:
init()
{
    if (!all_callbacks_are_installed) {
        install_callbacks();
    }
    else {
        /* Do not interfere with existing callbacks. */
    }
}

destroy()
{
    /* Uninstall callbacks to avoid segmentation faults after unload. */
    /* Only uninstall callbacks owned by this library. */
    if (installed_callbacks == library_callbacks) {
        uninstall_callbacks();
    }
}
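
The pseudo-code can be turned into a small compilable sketch. To keep it free of dependencies, the getter and setter below are hypothetical stand-ins for the process-global slot that OpenSSL exposes through CRYPTO_get_locking_callback() and CRYPTO_set_locking_callback(); the other names are made up as well.

```c
#include <stddef.h>

typedef void (*locking_cb)(int mode, int n, const char *file, int line);

/* Hypothetical stand-ins for the process-global callback slot. */
static locking_cb installed_cb = NULL;
static locking_cb get_locking_cb(void)          { return installed_cb; }
static void       set_locking_cb(locking_cb cb) { installed_cb = cb; }

/* This library's own callback; the locking itself is omitted here. */
static void library_cb(int mode, int n, const char *file, int line)
{
    (void)mode; (void)n; (void)file; (void)line;
}

/* Callback of some other module, present only to exercise the sketch. */
void other_module_cb(int mode, int n, const char *file, int line)
{
    (void)mode; (void)n; (void)file; (void)line;
}

/* Install only if nobody else did. Returns 1 if ours was installed. */
int lib_init(void)
{
    if (get_locking_cb() == NULL) {
        set_locking_cb(library_cb);
        return 1;
    }
    return 0;   /* do not interfere with existing callbacks */
}

/* Uninstall only if the installed callback is still ours. */
int lib_destroy(void)
{
    if (get_locking_cb() == library_cb) {
        set_locking_cb(NULL);
        return 1;
    }
    return 0;   /* someone else's callback: leave it in place */
}
```

Note that the check and the install are two separate steps, so this sketch still assumes that the modules involved are not initialized concurrently.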

Solution B) Use a private OpenSSL library (less desirable)

If dependencies allow, link OpenSSL statically so that callbacks are not shared and conflicts can be avoided. The major drawback here is that OpenSSL cannot be updated independently of your implementation, which is critical when security updates must be applied. Libraries you use that depend on OpenSSL must be linked statically as well, which might not be an option in the case of proprietary libraries or libraries not available as an archive. If you go down that road, make sure not to export OpenSSL symbols (GCC: compile with -fvisibility=hidden) to prevent other libraries from accessing your private (and possibly incompatible) copy. This does have its issues, but sometimes there's no other way.
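
As a rough illustration of the symbol hiding, here is a sketch of a library source file that exports a single entry point. All names are hypothetical. Keep in mind that -fvisibility=hidden only affects objects compiled with it, so a private OpenSSL copy would have to be built the same way; GNU ld's --exclude-libs option is another route for symbols that come from static archives.

```c
/* Build as a shared object with hidden default visibility, e.g.:
 *   gcc -fvisibility=hidden -fPIC -shared mylib.c -o libmylib.so
 */
#define MYLIB_API __attribute__((visibility("default")))

/* Internal helper: not exported when built with -fvisibility=hidden. */
int mylib_internal_helper(int x)
{
    return x + 1;
}

/* Public entry point: explicitly marked as exported. */
MYLIB_API int mylib_request(int x)
{
    return mylib_internal_helper(x) * 2;
}
```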

Whatever the reasoning against implementing the locking code directly in the library was, it unnecessarily complicates the task of writing stable multi-threaded code for developers. Fortunately, if some thought goes into the callback implementation, the facility provided by OpenSSL is good enough for completing the task of enabling the thread-safe operation of OpenSSL and libcurl as well as other OpenSSL dependencies.


With these two issues addressed, libcurl should integrate just fine in your multi-threaded code.