##
## OSSP ex - Exception Handling
## Copyright (c) 2002-2003 Ralf S. Engelschall
## Copyright (c) 2002-2003 The OSSP Project
## Copyright (c) 2002-2003 Cable & Wireless Germany
##
## This file is part of OSSP ex, an exception library
## which can be found at http://www.ossp.org/pkg/lib/ex/.
##
## Permission to use, copy, modify, and distribute this software for
## any purpose with or without fee is hereby granted, provided that
## the above copyright notice and this permission notice appear in all
## copies.
##
## THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
## WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
## MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
## IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR
## CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
## SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
## LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
## USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
## ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
## OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
## OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
## SUCH DAMAGE.
##
## ex.pod: exception library manual page
##

=pod

=head1 NAME

B<OSSP ex> - Exception Handling

=head1 VERSION

B<OSSP ex>

=head1 SYNOPSIS

B<ex_t> I<variable>;

B<ex_try> I<BLOCK1> [B<ex_cleanup> I<BLOCK2>] B<ex_catch> (I<variable>) I<BLOCK3>

B<ex_throw>(I<class>, I<object>, I<value>);

B<ex_rethrow>;

B<ex_defer> I<BLOCK>

B<ex_shield> I<BLOCK>

if (B<ex_catching>) ...

if (B<ex_deferring>) ...

if (B<ex_shielding>) ...

=head1 DESCRIPTION

B<OSSP ex> is a small B<ISO-C++> style exception handling library for use
in the B<ISO-C> language. It allows you to use the paradigm of throwing
and catching exceptions in order to reduce the amount of error handling
code without making your program less robust.

This is achieved by directly transferring exceptional return codes (and
the program control flow) from the location where the exception is
raised (throw point) to the location where it is handled (catch point)
-- usually from a deeply nested sub-routine to a parent routine. All
intermediate routines no longer have to make sure that the exceptional
return codes from sub-routines are correctly passed back to the parent.

=head2 EXCEPTIONS

An B<OSSP ex> exception is a triple E<lt>I<class>,I<object>,I<value>E<gt>
where I<class> identifies the class of the exception thrower, I<object>
identifies the particular class instance of the exception thrower, and
I<value> is the exceptional return code value the thrower wants to
communicate. All three parts are of type "C<void *>" internally, but
any value that can be cast to this type without loss is usable.
Exceptions are created on-the-fly by the B<ex_throw> command.

=head2 APPLICATION PROGRAMMER INTERFACE (API)

The B<OSSP ex> API consists of the following elements:

=over 4

=item B<ex_t> I<variable>;

This is the declaration of an exception variable. It is usually never
initialized manually. Instead it is initialized by an B<ex_catch> clause
and just used read-only inside its block. Such a variable of type
B<ex_t> consists of six attributes:

=over 2

=item C<void *>I<ex_class>

This is the I<class> argument of the B<ex_throw> call which created
the exception. It can globally and uniquely identify the class to
which I<ex_value> belongs. Usually this is a pointer to a static object
(variable, structure or function) which identifies the class of the
thrower and allows the catcher to correctly handle I<ex_value>. It is
usually just additional (optional) information to I<ex_value>.

=item C<void *>I<ex_object>

This is the I<object> argument of the B<ex_throw> call which created
the exception. It can globally and uniquely identify the class instance
to which I<ex_value> belongs (in case multiple instances exist at all).
Usually this is a pointer to a dynamic object (structure) which
identifies the particular instance of the thrower. It is usually just
additional (optional) information to I<ex_value>.

=item C<void *>I<ex_value>

This is the I<value> argument of the B<ex_throw> call which created
the exception. This is the exceptional return code value which has to
uniquely identify the type of exception. Usually this is the value
which would be C<return>ed if no exceptions were thrown. In the
simple case this is just a numerical return code. In the complex case
this can be a pointer to an arbitrarily complex data structure
describing the exception.

=item C<char *>I<ex_file>

This is the file name of the B<ISO-C> source where the B<ex_throw> call
was performed. It is automatically provided as additional information
about the throw point and is intended mainly for tracing and debugging
purposes.

=item C<int> I<ex_line>

This is the line number inside the B<ISO-C> source file where the
B<ex_throw> call was performed. It is automatically provided as
additional information about the throw point and is intended mainly
for tracing and debugging purposes.

=item C<char *>I<ex_func>

This is the function name (if determinable, else "C<#NA#>") inside
the B<ISO-C> source file where the B<ex_throw> call was performed.
It is automatically provided as additional information about the
throw point and is intended mainly for tracing and debugging purposes.

=back

=item B<ex_try> I<BLOCK1> [B<ex_cleanup> I<BLOCK2>] B<ex_catch> (I<variable>) I<BLOCK3>

This is the primary syntactical construct provided by B<OSSP ex>. It is
modeled after the B<ISO-C++> C<try>-C<catch> clause which in turn is
very similar to an B<ISO-C> C<if>-C<else> clause. It consists of an
B<ex_try> block I<BLOCK1> which forms the dynamic scope for exception
handling (i.e. exceptions directly thrown there or thrown from its
sub-routines are caught), an optional B<ex_cleanup> block I<BLOCK2> for
performing cleanup operations and an B<ex_catch> block I<BLOCK3> where
the caught exceptions are handled.

The control flow in case no exception is thrown is simply I<BLOCK1>,
optionally followed by I<BLOCK2>; I<BLOCK3> is skipped. The control flow
in case an exception is thrown is: I<BLOCK1> (only up to the statement
where the exception is thrown), optionally followed by I<BLOCK2>,
followed by I<BLOCK3>.

The B<ex_try>, B<ex_cleanup> and B<ex_catch> cannot be used separately;
they work only in combination because they form a language clause as a
whole.

In contrast to B<ISO-C++> there is only one B<ex_catch> block and not
multiple ones (all B<OSSP ex> exceptions are of the same B<ISO-C> type
B<ex_t>). If an exception is caught, it is stored in I<variable> for
inspection inside the B<ex_catch> block. Although it has to be declared
outside, the I<variable> value is only valid within the B<ex_catch>
block. But the variable can be re-used in subsequent B<ex_catch>
clauses, of course.

The B<ex_try> block is a regular B<ISO-C> language statement block, but
it is not allowed to jump into it via "C<goto>" or C<longjmp>(3) or out
of it via "C<break>", "C<return>", "C<goto>" or C<longjmp>(3) because
there is some hidden setup and cleanup that needs to be done by B<OSSP
ex> regardless of whether an exception is caught. Jumping into an
B<ex_try> clause would avoid doing the setup, and jumping out of it
would avoid doing the cleanup. In both cases the result is a broken
exception handling facility. Nevertheless you are allowed to nest
B<ex_try> clauses.

The B<ex_catch> and B<ex_cleanup> blocks are regular B<ISO-C> language
statement blocks without any restrictions. You are even allowed to throw
(and in the B<ex_catch> block to re-throw) an exception.

There is just one subtle portability detail you have to remember about
B<ex_try> blocks: all accessible B<ISO-C> objects have the (expected)
values as of the time B<ex_throw> was called, except that the values of
objects of automatic storage duration that do not have the
"C<volatile>" qualifier I<and> have been changed between the B<ex_try>
invocation and B<ex_throw> are indeterminate. Logically speaking, this
is because you usually do not know which commands in the B<ex_try>
block were already successful before the exception was thrown.
Technically speaking, it is because the underlying B<ISO-C> setjmp(3)
facility applies those restrictions.

=item B<ex_throw>(I<class>, I<object>, I<value>);

This builds an exception from the supplied arguments and throws it.
If an B<ex_try>/B<ex_catch> clause forms the dynamic scope of the
B<ex_throw> call, this exception is copied into the I<variable> of its
B<ex_catch> clause and the program control flow is continued in the
(optional B<ex_cleanup> and then in the) B<ex_catch> block. If no
B<ex_try>/B<ex_catch> clause exists in the dynamic scope of the
B<ex_throw> call, the program calls C<abort>(3).

The B<ex_throw> can be performed everywhere, including inside B<ex_try>,
B<ex_cleanup> and B<ex_catch> blocks.

=item B<ex_rethrow>;

This is only valid within an B<ex_catch> block and re-throws
the current exception (in I<variable>). It is similar to the call
B<ex_throw>(I<variable>.ex_class, I<variable>.ex_object,
I<variable>.ex_value) except that the C<ex_file>, C<ex_line> and
C<ex_func> elements of the caught exception are passed through as if
it had never been caught.

=item B<ex_defer> I<BLOCK>

This directive executes I<BLOCK> while deferring the throwing of
exceptions, i.e., inside the dynamic scope of B<ex_defer> all
B<ex_throw> operations are remembered but deferred, and on leaving the
I<BLOCK> the I<first> exception that occurred is thrown. The second and
subsequent exceptions are ignored.

The B<ex_defer> block I<BLOCK> is a regular B<ISO-C> language statement
block, but it is not allowed to jump into it via "C<goto>" or
C<longjmp>(3) or out of it via "C<break>", "C<return>", "C<goto>" or
C<longjmp>(3) because this would cause the deferral level to become out
of sync. Jumping into an B<ex_defer> clause would avoid increasing the
exception deferral level, and jumping out of it would avoid decreasing
it. In both cases the result is an incorrect exception deferral level.
Nevertheless you are allowed to nest B<ex_defer> clauses.

=item B<ex_shield> I<BLOCK>

This directive executes I<BLOCK> while shielding it against the throwing
of exceptions, i.e., inside the dynamic scope of B<ex_shield> all
B<ex_throw> operations are just silently ignored.

The B<ex_shield> block is a regular B<ISO-C> language statement block,
but it is not allowed to jump into it via "C<goto>" or C<longjmp>(3) or
out of it via "C<break>", "C<return>", "C<goto>" or C<longjmp>(3)
because this would cause the shielding level to become out of sync.
Jumping into an B<ex_shield> clause would avoid increasing the exception
shielding level, and jumping out of it would avoid decreasing it. In
both cases the result is an incorrect exception shielding level.
Nevertheless you are allowed to nest B<ex_shield> clauses.

=item B<ex_catching>

This is a boolean flag which can be checked inside the dynamic scope of
an B<ex_try> clause to test whether the current scope is exception
catching (see B<ex_try>/B<ex_catch> clause).

=item B<ex_deferring>

This is a boolean flag which can be checked inside the dynamic scope of
an B<ex_defer> clause to test whether the current scope is exception
deferring (see B<ex_defer> clause).

=item B<ex_shielding>

This is a boolean flag which can be checked inside the dynamic scope of
an B<ex_shield> clause to test whether the current scope is exception
shielding (see B<ex_shield> clause).

=back

=head1 IMPLEMENTATION CONTROL

B<OSSP ex> uses a very light-weight but still flexible exception
facility implementation. The following adjustments can be made before
including the F<ex.h> header:

=head2 Machine Context

In order to move the program control flow from the exception throw point
(B<ex_throw>) to the catch point (B<ex_catch>), B<OSSP ex> uses four
macros:

=over 4

=item B<__ex_mctx_struct>

This holds the contents of the machine context structure. A pointer to
such a machine context is passed to the following macros as I<mctx>.

=item B<__ex_mctx_save>(__ex_mctx_struct *I<mctx>)

This is called by the prolog of B<ex_try> to save the current machine
context in I<mctx>. This function has to return true (not C<0>) after
saving. If the machine context is restored (by B<__ex_mctx_restore>) it
has to return false (C<0>). In other words, this function has to return
twice and indicate the particular situation with the provided return
code.

=item B<__ex_mctx_restored>(__ex_mctx_struct *I<mctx>)

This is called by the epilog of B<ex_try> to perform additional
operations at the new (restored) machine context after an exception was
caught. Usually this is a no-operation macro.

=item B<__ex_mctx_restore>(__ex_mctx_struct *I<mctx>)

This is called by B<ex_throw> at the old machine context in order to
restore the machine context of the B<ex_try>/B<ex_catch> clause which
will catch the exception.
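
As a rough illustration of the save/restore contract described for these
macros, the following self-contained sketch imitates them with the
ISO-C setjmp(3)/longjmp(3) pair. All C<sketch_*> names and the
C<deep_routine> helper are hypothetical and not part of OSSP ex. The
sketch also assumes a single catch point and an integer value, whereas
the library maintains a stack of catch points and passes a full
exception.

```c
#include <setjmp.h>

/* Hypothetical stand-ins for the machine context macros (NOT the OSSP ex API). */
typedef struct { jmp_buf jb; } sketch_mctx_t;

/* "save" is true on the initial return and false when re-entered via "restore" */
#define sketch_mctx_save(mctx)    (setjmp((mctx)->jb) == 0)
#define sketch_mctx_restore(mctx) longjmp((mctx)->jb, 1)

static sketch_mctx_t *sketch_current;      /* the one active catch point */
static int            sketch_thrown_value; /* value carried from throw to catch */

/* throw point: remember the value, then jump back to the saved context */
static void sketch_throw(int value)
{
    sketch_thrown_value = value;
    sketch_mctx_restore(sketch_current);
}

/* a nested routine that does not need to pass an error code upwards */
static void deep_routine(int fail)
{
    if (fail)
        sketch_throw(42);
}

/* returns 0 if nothing was thrown, else the thrown value */
int sketch_run(int fail)
{
    sketch_mctx_t mctx;

    sketch_current = &mctx;
    if (sketch_mctx_save(&mctx)) {
        /* first return of "save": the guarded block runs here */
        deep_routine(fail);
        return 0;
    }
    /* second return of "save": control arrived here from sketch_throw() */
    return sketch_thrown_value;
}
```

Under these assumptions C<sketch_run(0)> returns C<0>, while
C<sketch_run(1)> returns C<42>, the value handed over at the throw
point.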

=back

The default implementation (define C<__EX_MCTX_SJLJ__> or as long as
C<__EX_MCTX_CUSTOM__> is not defined) uses the B<ISO-C> jmp_buf(3)
facility:

    #define __ex_mctx_struct         jmp_buf jb;
    #define __ex_mctx_save(mctx)     (setjmp((mctx)->jb) == 0)
    #define __ex_mctx_restored(mctx) /* noop */
    #define __ex_mctx_restore(mctx)  (void)longjmp((mctx)->jb, 1)

Alternatively, you can define C<__EX_MCTX_SSJLJ__> to use B<POSIX.1>
sigjmp_buf(3) or C<__EX_MCTX_MCSC__> to use B<POSIX.1> ucontext(3).
To use a custom implementation, define C<__EX_MCTX_CUSTOM__> and
provide your own definitions for the four B<__ex_mctx_xxxx> macros.

=head2 Exception Context

In order to maintain the exception catching stack and for passing the
exception between the throw and the catch point, B<OSSP ex> uses a
global exception context, returned on-the-fly by the callback
"C<ex_ctx_t *(*>B<__ex_ctx>C<)(void)>".

By default, B<__ex_ctx> (which is B<__ex_ctx_default> as provided by
F<libex>) returns a pointer to a static C<ex_ctx_t> context. For use in
multi-threading environments, this should be overwritten with a
callback function returning a per-thread context structure (see section
B<MULTITHREADING ENVIRONMENTS> below).

To initialize an exception context structure there are two macros
defined: "B<EX_CTX_INITIALIZER>" for static initialization and
"C<void >B<EX_CTX_INITIALIZE>C<(ex_ctx_t *)>" for dynamic
initialization.

=head2 Termination Handler

In case an exception is thrown which is not caught by any
B<ex_try>/B<ex_catch> clause, B<OSSP ex> calls the callback
"C<void (*>B<__ex_terminate>C<)(ex_t *)>". It receives a pointer to the
exception object which was thrown.

By default, B<__ex_terminate> (which is B<__ex_terminate_default> as
provided by F<libex>) prints a message of the form "C<**EX: UNCAUGHT
EXCEPTION: class=0xXXXXXXXX object=0xXXXXXXXX value=0xXXXXXXX
[xxxx:NNN:xxxx]>" to F<stderr> and then calls abort(3) in order to
terminate the application. For use in multi-threading environments, this
should be overwritten with a callback function which terminates only the
current thread.
Even better, a real application should always have a top-level
B<ex_try>/B<ex_catch> clause in its "C<main()>" in order to terminate
the application more gracefully.

=head2 Namespace Mapping

The B<OSSP ex> implementation consistently uses the "C<ex_>",
"C<__ex_>" and "C<__EX_>" prefixes for namespace protection. But the
"C<ex_>" prefix on the API macros B<ex_try>, B<ex_cleanup>,
B<ex_catch>, B<ex_throw>, B<ex_rethrow> and B<ex_shield> can look
unpleasant, especially because B<OSSP ex> is modeled after the
exception facility of B<ISO-C++>, where the language directives carry
no such prefix.

For this reason, B<OSSP ex> can optionally provide additional namespace
mappings for those API elements. By default (define C<__EX_NS_CXX__> or
as long as C<__EX_NS_CUSTOM__> and C<__cplusplus> are not defined) you
can additionally use the B<ISO-C++> style names B<try>, B<cleanup>,
B<catch>, B<throw>, B<rethrow> and B<shield>. As an alternative you can
define C<__EX_NS_UCCXX__> to get the same names but with a more
namespace-safe upper case first letter.

=head1 PROGRAMMING PITFALLS

Exception handling is a very elegant and efficient way of dealing with
exceptional situations. Nevertheless it requires additional discipline
in programming and there are a few pitfalls one must be aware of. Look
at the following code, which shows some pitfalls and contains many
errors (assuming a mallocex() function which throws an exception if
malloc(3) fails):

    /* BAD EXAMPLE */
    ex_try {
        char *cp1, *cp2, *cp3;
        cp1 = mallocex(SMALLAMOUNT);
        globalcontext->first = cp1;
        cp2 = mallocex(TOOBIG);
        cp3 = mallocex(SMALLAMOUNT);
        strcpy(cp1, "foo");
        strcpy(cp2, "bar");
    }
    ex_cleanup {
        if (cp3 != NULL) free(cp3);
        if (cp2 != NULL) free(cp2);
        if (cp1 != NULL) free(cp1);
    }
    ex_catch(ex) {
        printf("cp3=%s", cp3);
        ex_rethrow;
    }

This example raises a few issues:

=over 4

=item B<01: variable scope>

Variables which are used in the B<ex_cleanup> or B<ex_catch> clauses
must be declared before the B<ex_try> clause, otherwise they only exist
inside the B<ex_try> block.
In the example above, C<cp1>, C<cp2> and C<cp3> are automatic variables
and only exist in the block of the B<ex_try> clause; the code in the
B<ex_cleanup> and B<ex_catch> clauses does not know anything about them.

=item B<02: variable initialization>

Variables which are used in the B<ex_cleanup> or B<ex_catch> clauses
must be initialized before the point of the first possible B<ex_throw>
is reached. In the example above, B<ex_cleanup> would have trouble using
C<cp3> if mallocex() throws an exception when allocating a C<TOOBIG>
buffer.

=item B<03: volatile variables>

Variables which are used in the B<ex_cleanup> or B<ex_catch> clauses
must be declared with the "C<volatile>" qualifier, otherwise they might
contain outdated information if B<ex_throw> throws an exception. If
using a "free if unset" approach like the example does in the
B<ex_cleanup> clause, the variables must be initialized (see B<02>)
I<and> remain valid upon use.

=item B<04: clean before catch>

The B<ex_cleanup> clause is not only written down before the
B<ex_catch> clause, it is also evaluated before the B<ex_catch> clause.
So, resources being cleaned up must no longer be used in the
B<ex_catch> block. The example above would have trouble referencing the
character strings in the printf(3) statement because these have been
freed before.

=item B<05: variable uninitialization>

If resources are handed over to something outside the scope of the
B<ex_try>/B<ex_cleanup>/B<ex_catch> construct, and the variables were
initialized for using a "free if unset" approach, then the variables
must be reset after the hand-over. The example above would free(3)
C<cp1> in the B<ex_cleanup> clause if mallocex() throws an exception
when allocating a C<TOOBIG> buffer. The C<globalcontext-E<gt>first>
pointer hence becomes invalid.

=back

The following is a fixed version of the code (annotated with the
pitfall items for reference):

    /* GOOD EXAMPLE */
    { /*01*/
        char * volatile /*03*/ cp1 = NULL /*02*/;
        char * volatile /*03*/ cp2 = NULL /*02*/;
        char * volatile /*03*/ cp3 = NULL /*02*/;
        try {
            cp1 = mallocex(SMALLAMOUNT);
            strcpy(cp1, "foo"); /* fill before giving away */
            globalcontext->first = cp1;
            cp1 = NULL /*05 give away*/;
            cp2 = mallocex(TOOBIG);
            cp3 = mallocex(SMALLAMOUNT);
            strcpy(cp2, "bar");
        }
        cleanup { /*04*/
            printf("cp3=%s", cp3 == NULL /*02*/ ? "" : cp3);
            if (cp3 != NULL)
                free(cp3);
            if (cp2 != NULL)
                free(cp2);
            /*05 cp1 was given away */
        }
        catch(ex) {
            /*05 global context untouched */
            rethrow;
        }
    }

Alternatively, this could also be used:

    /* ALTERNATIVE GOOD EXAMPLE */
    { /*01*/
        char * volatile /*03*/ cp1 = NULL /*02*/;
        char * volatile /*03*/ cp2 = NULL /*02*/;
        char * volatile /*03*/ cp3 = NULL /*02*/;
        try {
            cp1 = mallocex(SMALLAMOUNT);
            globalcontext->first = cp1;
            /*05 keep responsibility*/
            cp2 = mallocex(TOOBIG);
            cp3 = mallocex(SMALLAMOUNT);
            strcpy(cp1, "foo");
            strcpy(cp2, "bar");
        }
        cleanup { /*04*/
            printf("cp3=%s", cp3 == NULL /*02*/ ? "" : cp3);
            if (cp3 != NULL)
                free(cp3);
            if (cp2 != NULL)
                free(cp2);
            if (cp1 != NULL)
                free(cp1);
        }
        catch(ex) {
            globalcontext->first = NULL;
            rethrow;
        }
    }

=head1 MULTITHREADING ENVIRONMENTS

B<OSSP ex> is designed to work both in single-threading and
multi-threading environments. The default is to support single-threading
only. But it is easy to configure B<OSSP ex> to work correctly in a
multi-threading environment like B<POSIX pthreads> or B<GNU pth>.

There are only two issues: which machine context to use and where to
store the exception context, so that exception throwing happens only
within a thread and does not conflict with the regular thread
dispatching mechanism.

=head2 GNU pth

Using B<OSSP ex> together with B<GNU pth> is straight-forward, because
B<GNU pth> 2.0 (and higher) already has support for B<OSSP ex> built-in.
All that is needed is that B<GNU pth> is configured with the
B<Autoconf> option C<--with-ex>.
Then each B<GNU pth> user-space thread has its own B<OSSP ex> exception
context automatically. The default of using B<ISO-C> jmp_buf(3) does not
conflict with the thread dispatching mechanisms used by B<GNU pth>.

=head2 POSIX pthreads

Using B<OSSP ex> inside an arbitrary B<POSIX pthreads> standard
compliant environment is also straight-forward, although it requires
extra coding. What you basically have to do is to make sure that
B<__ex_ctx> becomes a per-thread context and that B<__ex_terminate>
terminates only the current thread. To get an impression, a small
utility library for this follows:

=over 2

=item F<pthread_ex.h>

    #ifndef __PTHREAD_EX_H__
    #define __PTHREAD_EX_H__

    #include <pthread.h>

    int pthread_init_ex   (void);
    int pthread_create_ex (pthread_t *, const pthread_attr_t *,
                           void *(*)(void *), void *);

    #ifndef PTHREAD_EX_INTERNAL
    #define pthread_init   pthread_init_ex
    #define pthread_create pthread_create_ex
    #endif

    #endif /* __PTHREAD_EX_H__ */

=item F<pthread_ex.c>

    #include <stdlib.h>
    #include <pthread.h>

    #define PTHREAD_EX_INTERNAL
    #include "pthread_ex.h"
    #include "ex.h"

    /* context storage key */
    static pthread_key_t pthread_ex_ctx_key;

    /* context destructor */
    static void pthread_ex_ctx_destroy(void *data)
    {
        if (data != NULL)
            free(data);
        return;
    }

    /* callback: context fetching */
    static ex_ctx_t *pthread_ex_ctx(void)
    {
        return (ex_ctx_t *)
            pthread_getspecific(pthread_ex_ctx_key);
    }

    /* callback: termination */
    static void pthread_ex_terminate(ex_t *e)
    {
        pthread_exit(e->ex_value);
    }

    /* pthread init */
    int pthread_init_ex(void)
    {
        int rc;

        /* additionally create thread data key
           and override OSSP ex callbacks */
        rc = pthread_key_create(&pthread_ex_ctx_key,
                                pthread_ex_ctx_destroy);
        if (rc != 0)
            return rc;
        __ex_ctx       = pthread_ex_ctx;
        __ex_terminate = pthread_ex_terminate;

        return rc;
    }

    /* internal thread entry wrapper information */
    typedef struct {
        void *(*entry)(void *);
        void *arg;
    } pthread_create_ex_t;

    /* internal thread entry wrapper */
    static void *pthread_create_wrapper(void *arg)
    {
        pthread_create_ex_t *wrapper;
        ex_ctx_t *ex_ctx;

        /* create per-thread exception context */
        wrapper = (pthread_create_ex_t *)arg;
        ex_ctx = (ex_ctx_t *)malloc(sizeof(ex_ctx_t));
        if (ex_ctx == NULL) {
            free(wrapper);
            return NULL;
        }
        EX_CTX_INITIALIZE(ex_ctx);
        pthread_setspecific(pthread_ex_ctx_key, ex_ctx);

        /* perform original operation; the wrapper information
           was allocated by pthread_create_ex() and is released here */
        {
            void *(*entry)(void *) = wrapper->entry;
            void *entry_arg = wrapper->arg;

            free(wrapper);
            return entry(entry_arg);
        }
    }

    /* pthread_create() wrapper */
    int pthread_create_ex(pthread_t *thread,
                          const pthread_attr_t *attr,
                          void *(*entry)(void *), void *arg)
    {
        pthread_create_ex_t *wrapper;
        int rc;

        /* the wrapper information has to outlive this function,
           so it is allocated on the heap instead of the stack */
        wrapper = (pthread_create_ex_t *)
            malloc(sizeof(pthread_create_ex_t));
        if (wrapper == NULL)
            return -1; /* any non-zero value signals failure */

        /* spawn thread but execute start
           function through wrapper */
        wrapper->entry = entry;
        wrapper->arg   = arg;
        rc = pthread_create(thread, attr,
                            pthread_create_wrapper, wrapper);
        if (rc != 0)
            free(wrapper);
        return rc;
    }

=back

Now all that is required is to include F<pthread_ex.h> after the
standard F<pthread.h> header and to call B<pthread_init> once at startup
of your program.

=head1 EXAMPLES

As a real-life example we will look at how you can add optional B<OSSP
ex> based exception handling support to a library B<foo>. The original
library looks like this:

=over 2

=item F<foo.h>

    typedef enum {
        FOO_OK,
        FOO_ERR_ARG,
        FOO_ERR_XXX,
        FOO_ERR_SYS,
        FOO_ERR_IMP,
        ...
    } foo_rc_t;

    struct foo_st;
    typedef struct foo_st foo_t;

    foo_rc_t foo_create  (foo_t **foo);
    foo_rc_t foo_perform (foo_t  *foo);
    foo_rc_t foo_destroy (foo_t  *foo);

=item F<foo.c>

    #include <stdlib.h>
    #include "foo.h"

    struct foo_st {
        ...
    };

    foo_rc_t foo_create(foo_t **foo)
    {
        /* allocate the object itself, not just a pointer to it */
        if ((*foo = (foo_t *)malloc(sizeof(foo_t))) == NULL)
            return FOO_ERR_SYS;
        (*foo)->... = ...
        return FOO_OK;
    }

    foo_rc_t foo_perform(foo_t *foo)
    {
        if (foo == NULL)
            return FOO_ERR_ARG;
        if (...)
            return FOO_ERR_XXX;
        return FOO_OK;
    }

    foo_rc_t foo_destroy(foo_t *foo)
    {
        if (foo == NULL)
            return FOO_ERR_ARG;
        free(foo);
        return FOO_OK;
    }

=back

Then the typical usage of this library is:

    #include "foo.h"
    ...
    foo_t *foo;
    foo_rc_t rc;
    ...
    if ((rc = foo_create(&foo)) != FOO_OK)
        die(rc);
    if ((rc = foo_perform(foo)) != FOO_OK)
        die(rc);
    if ((rc = foo_destroy(foo)) != FOO_OK)
        die(rc);

But what you really want is to use exception handling to get rid of the
intermixed error handling code:

    #include "foo.h"
    #include "ex.h"
    ...
    foo_t *foo;
    ex_t ex;
    ...
    ex_try {
        foo_create(&foo);
        foo_perform(foo);
        foo_destroy(foo);
    }
    ex_catch (ex) {
        die((foo_rc_t)ex.ex_value);
    }

You can achieve this very easily by changing the library as follows:

=over 2

=item F<foo.h>

    ...
    extern const char foo_id[];
    ...

=item F<foo.c>

    #include <stdlib.h>
    #include "foo.h"

    const char foo_id[] = "foo 1.0";

    #ifdef WITH_EX
    #include "ex.h"
    #define FOO_RC(rv) \
        (  (rv) != FOO_OK && (ex_catching && !ex_shielding) \
         ? (ex_throw(foo_id, NULL, (rv)), (rv)) : (rv) )
    #else
    #define FOO_RC(rv) (rv)
    #endif

    struct foo_st {
        ...
    };

    foo_rc_t foo_create(foo_t **foo)
    {
        /* allocate the object itself, not just a pointer to it */
        if ((*foo = (foo_t *)malloc(sizeof(foo_t))) == NULL)
            return FOO_RC(FOO_ERR_SYS);
        (*foo)->... = ...
        return FOO_OK;
    }

    foo_rc_t foo_perform(foo_t *foo)
    {
        if (foo == NULL)
            return FOO_RC(FOO_ERR_ARG);
        if (...)
            return FOO_RC(FOO_ERR_XXX);
        return FOO_OK;
    }

    foo_rc_t foo_destroy(foo_t *foo)
    {
        if (foo == NULL)
            return FOO_RC(FOO_ERR_ARG);
        free(foo);
        return FOO_OK;
    }

=back

This way the library by default is still exactly the same. If you now
compile it with C<-DWITH_EX> you activate exception handling support.
This means that all API functions throw exceptions where C<ex_value> is
the C<foo_rc_t> instead of returning this value.

=head1 SEE ALSO

B<C++> C<try>, C<catch>, C<throw>.

B<Java> C<try>, C<catch>, C<finally>, C<throw>.

B<ISO-C> jmp_buf(3), setjmp(3), longjmp(3).

B<POSIX.1> sigjmp_buf(3), sigsetjmp(3), siglongjmp(3).

B<POSIX.1> ucontext(3), setcontext(3), getcontext(3).

=head1 HISTORY

B<OSSP ex> was invented in January 2002 by Ralf S. Engelschall
E<lt>rse@engelschall.comE<gt> for use inside the B<OSSP> project. Its
creation was prompted by the requirement to reduce the error handling
inside B<OSSP lmtp2nntp>.

The core B<try>/B<catch> clause was inspired by B<ISO-C++> and the
implementation was partly derived from B<cexcept> 2.0.0, a similar
library written in 2000 by Adam M. Costello
E<lt>amc@cs.berkeley.eduE<gt> and Cosmin Truta
E<lt>cosmin@cs.toronto.eduE<gt>. The B<cleanup> clause was inspired by
the B<Java> C<finally> clause. The B<shield> feature was inspired by an
"C<errno>" shielding facility used in the B<GNU pth> implementation.
The B<defer> feature was invented to simplify an application's cleanup
handling if multiple independent resources are allocated and have to be
freed on error.

=head1 AUTHORS

 Ralf S. Engelschall
 rse@engelschall.com
 www.engelschall.com

=cut