##
##  XDS - OSSP Extensible Data Serialization Library
##  Copyright (c) 2001 The OSSP Project (http://www.ossp.org/)
##  Copyright (c) 2001 Cable & Wireless Deutschland (http://www.cw.com/de/)
##
##  This file is part of OSSP XDS, an extensible data serialization
##  library which can be found at http://www.ossp.org/pkg/xds/.
##
##  Permission to use, copy, modify, and distribute this software for
##  any purpose with or without fee is hereby granted, provided that
##  the above copyright notice and this permission notice appear in all
##  copies.
##
##  THIS SOFTWARE IS PROVIDED `AS IS' AND ANY EXPRESSED OR IMPLIED
##  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
##  MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
##  IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR
##  CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
##  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
##  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
##  USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
##  ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
##  OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
##  OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
##  SUCH DAMAGE.
##
##  xds.pod: Unix manual page source
##

=pod

=head1 NAME

xds - OSSP Extensible Data Serialization

=head1 SYNOPSIS

=head1 DESCRIPTION

The purpose of XDS is to encode data in a way that allows this data to
be exchanged between different computer systems. Assume you'd want to
transfer the value $1234 from host A to host B. Then you would encode
it using XDS, transfer the encoded data over the network, and decode
the value again at the other end. Every program that follows this
process will read the correct value no matter what native
representation is used internally.
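
To make the idea concrete, the following minimal sketch stores a 32-bit
value as four octets in a fixed order (most significant octet first,
which is also the order XDR uses) and reads it back. It does not use
the XDS API: put_be32() and get_be32() are hypothetical helper names
chosen for this illustration only. Because the octets are produced by
shifting rather than by copying memory, the result does not depend on
the native byte order of the host.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not part of XDS): store `value' as four octets,
   most significant octet first, independent of the host byte order. */
void put_be32(unsigned char *out, uint32_t value)
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xFF);
    out[1] = (unsigned char)((value >> 16) & 0xFF);
    out[2] = (unsigned char)((value >>  8) & 0xFF);
    out[3] = (unsigned char)( value        & 0xFF);
}

/* Hypothetical helper (not part of XDS): rebuild the value from the
   four octets written by put_be32(). */
uint32_t get_be32(const unsigned char *in)
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] <<  8) |  (uint32_t)in[3];
}
```

Host A would call put_be32() and transmit the four octets; host B would
call get_be32() on the octets it received.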

XDS consists of three components: the generic encoding and decoding
framework, a set of engines to encode and decode values in a certain
format, and a run-time context, which is used to manage buffers,
registered engines, etc.

In order to use the library, the first thing the developer has to do
is to create a valid XDS context by calling xds_init(). The routine
requires one parameter that determines whether to operate in encoding
or decoding mode. A context can be used for encoding or decoding only;
it is not possible to use the same context for both operations.

Once a valid XDS context has been obtained, the routine xds_register()
can be used to register an arbitrary number of encoding or decoding
engines within the context. Two sets of engines are included in the
library: one for XDR and one for XML. These engines handle the
elementary data types of the ISO C language, such as 32-bit integers,
64-bit integers, unsigned integers (both 32-bit and 64-bit), floating
point numbers, strings and octet streams.

Once all required encoding/decoding engines are registered, the
routines xds_encode() or xds_decode() may be used to actually perform
the encoding or decoding process. Any data type for which an engine
has been registered can be handled by the library.

This means that developers can write custom engines for any data type
they wish to use and register them in the context, as long as these
engines adhere to the xds_engine_t interface defined in xds.h.

In particular, it is possible to register meta engines. A meta engine
is an engine designed to encode or decode data types that consist of
several elementary data types. Such an engine will simply re-use the
existing engines to encode or decode the elements of the structure.

The following example program will encode an unsigned integer into the
XDR format, decode it back into the native host format, and compare
the result to make sure it is the original value again:

    #include <stdio.h>
    #include <stdlib.h>
    #include <xds.h>

    int main()
    {
        xds_t*       xds;
        xds_uint32_t uint32 = 0x12345678;
        xds_uint32_t new_uint32;
        char*        buffer;
        size_t       buffer_size;

        /* Encode the value and take ownership of the result buffer. */
        if ((xds = xds_init(XDS_ENCODE)) == NULL ||
            xds_register(xds, "uint32", &xdr_encode_uint32, NULL) != XDS_OK ||
            xds_encode(xds, "uint32", uint32) != XDS_OK ||
            xds_getbuffer(xds, XDS_GIFT, (void**)&buffer, &buffer_size) != XDS_OK) {
            printf("Encoding failed.\n");
            exit(1);
        }
        xds_destroy(xds);

        /* Decode the value again; the buffer is only loaned to XDS. */
        if ((xds = xds_init(XDS_DECODE)) == NULL ||
            xds_register(xds, "uint32", &xdr_decode_uint32, NULL) != XDS_OK ||
            xds_setbuffer(xds, XDS_LOAN, buffer, buffer_size) != XDS_OK ||
            xds_decode(xds, "uint32", &new_uint32) != XDS_OK) {
            printf("Decoding failed.\n");
            exit(1);
        }
        xds_destroy(xds);

        /* The buffer was gifted to the application, so free it here. */
        free(buffer);

        if (uint32 == new_uint32)
            printf("OK\n");
        else
            printf("Failure\n");

        return 0;
    }

=head1 THE XDS FRAMEWORK

=over 4

=item xds_t* xds_init(xds_mode_t I<mode>);

This routine creates and initializes a context for use with the XDS
library. The `mode' parameter may be either XDS_ENCODE or XDS_DECODE,
depending on whether you want to encode or to decode data. If
successful, xds_init() returns a pointer to the XDS context structure.
In case of failure, xds_init() returns NULL and sets errno to ENOMEM
(failed to allocate internal memory buffers) or EINVAL (`mode'
parameter was invalid).

A context obtained from xds_init() must be destroyed by xds_destroy()
when it is not needed any more.

=item void xds_destroy(xds_t* I<xds>);

xds_destroy() will destroy an XDS context created by xds_init(). Doing
so will return all resources associated with this context -- most
notably the memory used to buffer the results of encoding or decoding
any values. A context may not be used after it has been destroyed.

=item int xds_register(xds_t* I<xds>, const char* I<name>, xds_engine_t I<engine>, void* I<engine_context>);

This routine will register an engine in the provided XDS context. An
`engine' is potentially any function that fulfills the following
interface:

    int engine(xds_t* xds, void* engine_context,
               void* buffer, size_t buffer_size, size_t* used_buffer_size,
               va_list* args);

By calling xds_register(), the engine `engine' will be registered
under the name `name' in the XDS context `xds'. The last parameter
`engine_context' may be used as the user sees fit: It will be passed
when the engine is actually called and may be used to implement an
engine-specific context. Most engines will not need a context of their
own, in which case NULL should be specified here.

Please note that until the user calls xds_register() for an XDS
context obtained from xds_init(), no engines are registered for that
context. Even the engines included in the library distribution are not
registered automatically.

For engine names, any combination of the characters `a-z', `A-Z',
`0-9', `-', and `_' may be used; anything else is not a legal engine
name component.

xds_register() may return the following return codes: XDS_OK
(everything went fine; the engine is registered now),
XDS_ERR_INVALID_ARG (`xds', `name', or `engine' is NULL, or `name'
contains illegal characters for an engine name), or XDS_ERR_NO_MEM
(failed to allocate internally required buffers).

=item int xds_unregister(xds_t* I<xds>, const char* I<name>);

xds_unregister() will remove the engine `name' from XDS context `xds'.
The function will return XDS_OK in case everything went fine,
XDS_ERR_UNKNOWN_ENGINE in case the engine `name' is not registered in
`xds', or XDS_ERR_INVALID_ARG if `xds' or `name' is NULL or `name'
contains illegal characters for an engine name.

=item int xds_setbuffer(xds_t* I<xds>, xds_scope_t I<flag>, void* I<buffer>, size_t I<buffer_len>);

This routine allows the user to control XDS' buffer handling: Calling
it will replace the buffer currently used in `xds'.
The address and size of that buffer are passed to xds_setbuffer() via
the `buffer' and `buffer_len' parameters. The `xds' parameter
determines for which XDS context the new buffer will be set.
Furthermore, you can set `flag' to either XDS_GIFT or XDS_LOAN.

XDS_GIFT will tell XDS that the provided buffer is now owned by the
library and that it may be resized by calling realloc(3). Furthermore,
the buffer is free(3)ed when `xds' is destroyed. If `flag' is XDS_GIFT
and `buffer' is NULL, xds_setbuffer() will simply allocate a buffer of
its own to be set in `xds'. Please note that a buffer given to XDS as
gift B<must> have been allocated using malloc(3) -- it may not live on
the stack because XDS will try to free or to resize the buffer as it
sees fit.

Passing XDS_LOAN via `flag' tells xds_setbuffer() that the buffer is
owned by the application and that XDS should neither free nor resize
the buffer in any case. In this mode, passing a buffer of NULL will
result in an invalid-argument error.

=item int xds_getbuffer(xds_t* I<xds>, xds_scope_t I<flag>, void** I<buffer>, size_t* I<buffer_len>);

This routine is the counterpart to xds_setbuffer(): It will get the
buffer currently used in the XDS context `xds'. The address of that
buffer is stored in the location `buffer' points to; the length of the
buffer's content will be stored in the location `buffer_len' points
to.

The `flag' argument may be set to either XDS_GIFT or XDS_LOAN. The
first setting means that the buffer is now owned by the application
and that XDS must not use it after this xds_getbuffer() call anymore;
the library will allocate a new internal buffer instead. Of course,
this also means that the buffer will not be freed by xds_destroy();
the application has to free(3) the buffer itself when it is not needed
anymore. Setting `flag' to XDS_LOAN tells XDS that the application
just wishes to peek into the buffer and will not modify it. The buffer
is still owned (and used) by XDS.
Please note that the loaned address returned by xds_getbuffer() may
change after any other xds_xxx() function call!

The routine will return XDS_OK (everything went fine) or
XDS_ERR_INVALID_ARG (`xds', `buffer' or `buffer_len' is NULL, or
`flag' is invalid), signifying success or failure respectively.

Please note: It is perfectly legal for xds_getbuffer() to return a
buffer of NULL and a buffer length of 0! This happens when
xds_getbuffer() is called for an XDS context before a buffer has been
allocated.

=item int xds_vencode(xds_t* I<xds>, const char* I<fmt>, va_list I<args>);

This routine will encode one or several values using the appropriate
encoding engines registered in XDS context `xds'. The parameter `fmt'
contains a sprintf(3)-like description of the values to be encoded;
the actual values are provided in the variadic parameter `args'.

The format for `fmt' is simple: Just provide the names of the engines
to be used for encoding the appropriate value(s) in `args'. Any
character that is not legal in an engine name may be used as a
delimiter. In order to encode two 32-bit integers followed by a 64-bit
integer, the format string

    int32 int32 int64

could be used. In case you don't like the blank, use the colon
instead:

    int32:int32:int64

Of course the names used here have to match the names used to register
the engines in `xds' earlier.

Every time xds_vencode() is called, it will append the encoded data at
the end of the internal buffer stored in `xds'. Thus, you can call
xds_vencode() several times in order to encode several values, but
you'll still get all encoded values stored in one buffer.

Calling xds_setbuffer() or xds_getbuffer() with gift semantics at any
point during encoding will re-set the buffer to the beginning. All
values that have been encoded into that buffer already will eventually
be overwritten when xds_encode() is called again. Hence: Don't call
xds_setbuffer() or xds_getbuffer() unless you actually want to access
the data stored in the buffer.
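
The delimiter rule for `fmt' described above can be sketched as
follows. This is not the library's actual parser: is_name_char(),
next_engine_name() and count_engine_names() are hypothetical names
introduced for this illustration, and the code assumes an
ASCII-compatible character set. It only shows that every character
outside the legal engine-name set acts as a separator.

```c
#include <assert.h>
#include <stddef.h>

/* Legal engine-name characters according to the text:
   a-z, A-Z, 0-9, '-' and '_'. Assumes an ASCII-compatible charset. */
int is_name_char(int c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') ||
           (c >= '0' && c <= '9') || c == '-' || c == '_';
}

/* Hypothetical helper: starting at fmt[*pos], skip delimiters, then
   scan one engine name. Stores the start of the name in *name,
   advances *pos past it, and returns the length of the name
   (0 when no further name is left). */
size_t next_engine_name(const char *fmt, size_t *pos, const char **name)
{
    size_t i, start;

    assert(fmt != NULL && pos != NULL && name != NULL);
    i = *pos;
    while (fmt[i] != '\0' && !is_name_char((unsigned char)fmt[i]))
        i++;
    start = i;
    while (is_name_char((unsigned char)fmt[i]))
        i++;
    *name = fmt + start;
    *pos = i;
    return i - start;
}

/* Count the engine names contained in a format string. */
size_t count_engine_names(const char *fmt)
{
    const char *name;
    size_t pos = 0, count = 0;

    while (next_engine_name(fmt, &pos, &name) > 0)
        count++;
    return count;
}
```

With this rule, "int32 int32 int64" and "int32:int32:int64" both yield
the same three names.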

Also, it should be noted that the data you have to provide for `args'
depends entirely on what the deployed engines expect to find on the
stack -- there is no `standard' on what should be put on the stack
here. The XML and XDR engines included in the distribution will simply
expect the value to be encoded to be found on the stack, but other
engines may act differently.

xds_vencode() will return any of the following return codes: XDS_OK
(everything worked fine), XDS_ERR_NO_MEM (failed to allocate or to
resize the internal buffer), XDS_ERR_OVERFLOW (the internal buffer is
too small but is not owned by the library), XDS_ERR_INVALID_ARG (`xds'
or `fmt' is NULL), XDS_ERR_UNKNOWN_ENGINE (an engine name specified in
`fmt' is not registered in `xds'), XDS_ERR_INVALID_MODE (`xds' is
initialized in decode mode), or XDS_ERR_UNKNOWN (the engine returned
an unspecified error).

=item int xds_encode(xds_t* I<xds>, const char* I<fmt>, ...);

This routine is basically identical to xds_vencode(), except that it
takes the values as a variable argument list rather than as a va_list.

=item int xds_vdecode(xds_t* I<xds>, const char* I<fmt>, va_list I<args>);

This routine is almost identical to xds_vencode(): It expects an XDS
context, a format string and a set of parameters for the engines, but
instead of encoding data, xds_vdecode() decodes the data back into the
native format. The format string determines which engines are to be
called by the framework in order to decode the values contained in the
buffer. The values will then be stored at the locations found in the
corresponding `args' entry. But please note that the exact behavior of
the decoding engines is not specified! The XML and XDR engines
included in this distribution expect a pointer to a location where to
store the decoded value, but other engines may vary.

xds_vdecode() may return any of the following return codes: XDS_OK
(everything went fine), XDS_ERR_INVALID_ARG (`xds' or `fmt' is NULL),
XDS_ERR_TYPE_MISMATCH (the format string says the next value is of
type A, but that is not what was found in the buffer),
XDS_ERR_UNKNOWN_ENGINE (an engine name specified in `fmt' is not
registered in `xds'), XDS_ERR_INVALID_MODE (`xds' has been initialized
in encode mode), XDS_ERR_UNDERFLOW (an engine tried to read n bytes
from the buffer, but less data than that is left), or XDS_ERR_UNKNOWN
(an engine returned an unspecified error).

=item int xds_decode(xds_t* I<xds>, const char* I<fmt>, ...);

This routine is basically identical to xds_vdecode(), except that it
takes the locations as a variable argument list rather than as a
va_list.

=back

=head1 THE XDR ENGINES

    Function Name             Expected `args'    Input      Output
    ----------------------------------------------------------------
    xdr_encode_uint32()       xds_uint32_t       4 bytes    4 bytes
    xdr_decode_uint32()       xds_uint32_t*      4 bytes    4 bytes
    xdr_encode_int32()        xds_int32_t        4 bytes    4 bytes
    xdr_decode_int32()        xds_int32_t*       4 bytes    4 bytes
    xdr_encode_uint64()       xds_uint64_t       8 bytes    8 bytes
    xdr_decode_uint64()       xds_uint64_t*      8 bytes    8 bytes
    xdr_encode_int64()        xds_int64_t        8 bytes    8 bytes
    xdr_decode_int64()        xds_int64_t*       8 bytes    8 bytes
    xdr_encode_double()       xds_double_t       ? bytes    ? bytes
    xdr_decode_double()       xds_double_t*      ? bytes    ? bytes
    xdr_encode_octetstream()  void*, size_t      variable   variable
    xdr_decode_octetstream()  void**, size_t*    variable   variable
    xdr_encode_string()       char*              variable   variable
    xdr_decode_string()       char**             variable   variable

Please note that the routines xdr_decode_octetstream() and
xdr_decode_string() return a pointer to a buffer holding the decoded
data. This buffer has been allocated with malloc(3) and must be
free(3)ed by the application when it is not required anymore.
All other callbacks write the decoded value into the location found on
the stack, but these behave differently because the length of the
decoded data is not known in advance and the application cannot
provide a buffer that's guaranteed to suffice.

=head1 THE XML ENGINES

    Function Name             Expected `args'    Input        Output
    --------------------------------------------------------------------
    xml_encode_uint32()       xds_uint32_t       4 bytes      18-27 bytes
    xml_decode_uint32()       xds_uint32_t*      18-27 bytes  4 bytes
    xml_encode_int32()        xds_int32_t        4 bytes      16-26 bytes
    xml_decode_int32()        xds_int32_t*       16-26 bytes  4 bytes
    xml_encode_uint64()       xds_uint64_t       8 bytes      18-37 bytes
    xml_decode_uint64()       xds_uint64_t*      18-37 bytes  8 bytes
    xml_encode_int64()        xds_int64_t        8 bytes      16-36 bytes
    xml_decode_int64()        xds_int64_t*       16-36 bytes  8 bytes
    xml_encode_double()       xds_double_t       ? bytes      ? bytes
    xml_decode_double()       xds_double_t*      ? bytes      ? bytes
    xml_encode_octetstream()  void*, size_t      variable     variable
    xml_decode_octetstream()  void**, size_t*    variable     variable
    xml_encode_string()       char*              variable     variable
    xml_decode_string()       char**             variable     variable

Please note that the routines xml_decode_octetstream() and
xml_decode_string() return a pointer to a buffer holding the decoded
data. This buffer has been allocated with malloc(3) and must be
free(3)ed by the application when it is not required anymore. All
other callbacks write the decoded value into the location found on the
stack, but these behave differently because the length of the decoded
data is not known in advance and the application cannot provide a
buffer that's guaranteed to suffice.

=head1 EXTENDING THE XDS LIBRARY

This section demonstrates how to write a `meta engine' for the XDS
framework. The example engine will encode a complex data structure,
consisting of several elementary data types.
The structure is defined as follows:

    struct mystruct
    {
        xds_int32_t   small;
        xds_int64_t   big;
        xds_uint32_t  positive;
        char          text[16];
    };

Some readers might wonder why the structure is defined using these
unusual data types rather than the familiar ones like int, long, etc.
The reason is that the size of the familiar types is not fixed. A long
variable will have, say, 32 bits when compiled on the average 32-bit
Unix machine, but when the same program is compiled on a 64-bit system
like Tru64 UNIX, it will have a size of 64 bits. That is a problem
when those structures have to be exchanged between entirely different
systems, because the structures are binary incompatible -- something
even XDS cannot remedy.

In order to encode an instance of this structure, we write an encoding
engine:

    static int encode_mystruct(xds_t* xds, void* engine_context,
                               void* buffer, size_t buffer_size,
                               size_t* used_buffer_size,
                               va_list* args)
    {
        struct mystruct* ms;

        ms = va_arg(*args, struct mystruct*);
        return xds_encode(xds, "int32 int64 uint32 octetstream",
                          ms->small, ms->big, ms->positive,
                          ms->text, sizeof(ms->text));
    }

This engine takes the address of the `mystruct' structure from the
stack and then uses xds_encode() to handle all elements of `mystruct'
separately -- which is fine, because these data types are supported by
XDS already. It is worth noting, though, that we refer to the other
engines by name, meaning that these engines must be registered in
`xds' by that name!

What is very nice, though, is the fact that this encoding engine does
not even need to know which engines are used to encode the actual
values! If the user registers the XDR engines under the appropriate
names, `mystruct' will be encoded in XDR. If the user registers the
XML engines under the appropriate names, `mystruct' will be encoded in
XML. Because of that property, we call such an engine a `meta engine'.
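
The name-based indirection that makes an engine a `meta engine' can be
illustrated without the XDS library. The following is a toy sketch
under stated assumptions, not the XDS implementation: toy_engine_t,
struct toy_entry, toy_lookup() and the other toy_ names are
hypothetical and deliberately much simpler than the xds_engine_t
interface. The point is only that toy_encode_pair() refers to its
element engine by the name "uint32" and therefore produces whatever
format has been registered under that name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Toy engine type for this sketch only; this is NOT xds_engine_t.
   An engine writes `value' to `out' and returns the number of bytes
   written, or -1 if `out_size' is too small. */
typedef int (*toy_engine_t)(unsigned char *out, size_t out_size,
                            uint32_t value);

struct toy_entry {
    const char  *name;
    toy_engine_t engine;
};

/* Binary toy engine: four octets, most significant octet first. */
int toy_bin_uint32(unsigned char *out, size_t out_size, uint32_t value)
{
    if (out_size < 4)
        return -1;
    out[0] = (unsigned char)(value >> 24);
    out[1] = (unsigned char)(value >> 16);
    out[2] = (unsigned char)(value >> 8);
    out[3] = (unsigned char)value;
    return 4;
}

/* Text toy engine: the value as decimal digits. */
int toy_txt_uint32(unsigned char *out, size_t out_size, uint32_t value)
{
    int n = snprintf((char *)out, out_size, "%lu", (unsigned long)value);
    return (n < 0 || (size_t)n >= out_size) ? -1 : n;
}

/* Find the engine registered under `name', or NULL if there is none. */
toy_engine_t toy_lookup(const struct toy_entry *table, size_t count,
                        const char *name)
{
    size_t i;

    assert(name != NULL);
    for (i = 0; i < count; i++)
        if (strcmp(table[i].name, name) == 0)
            return table[i].engine;
    return NULL;
}

/* Toy meta engine: encodes two values through whatever engine is
   registered as "uint32"; it never names a concrete format.
   Returns the total number of bytes written, or -1 on failure. */
int toy_encode_pair(const struct toy_entry *table, size_t count,
                    unsigned char *out, size_t out_size,
                    uint32_t first, uint32_t second)
{
    toy_engine_t engine = toy_lookup(table, count, "uint32");
    int n1, n2;

    if (engine == NULL)
        return -1;
    if ((n1 = engine(out, out_size, first)) < 0)
        return -1;
    if ((n2 = engine(out + n1, out_size - (size_t)n1, second)) < 0)
        return -1;
    return n1 + n2;
}
```

Passing a table that maps "uint32" to toy_bin_uint32() yields eight
binary octets; passing one that maps it to toy_txt_uint32() yields
decimal text, with toy_encode_pair() itself unchanged.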

Of course you need not necessarily implement an engine as a B<meta>
engine: Rather than going through xds_encode(), it would be possible
to execute the appropriate encoding engines directly. This would have
the advantage of not depending on those engines being registered at
all, but it would make the custom engine depend on the elementary
engines, which is an unnecessary limitation.

One more word about the engine syntax and semantics: As has been
mentioned earlier, any function that adheres to the interface shown
above is potentially an engine. Its parameters have the following
meaning:

=over 4

=item xds

This is the XDS context that was originally provided to the
xds_encode() call, which in turn executed the engine. It may be used,
for example, for executing xds_encode() again like we did in our
example engines.

=item engine_context

The engine context can be used by the engine to store any type of
internal information. The value the engine will receive must have been
provided when the engine was registered by xds_register(). Engines may
ignore this parameter if they don't need a context of their own -- all
engines included in the distribution do so.

=item buffer

This parameter points to the buffer the encoded data should be written
to. In decoding mode, `buffer' points to the encoded data, which
should be decoded; the location where the results should be stored can
be found on the stack then.

=item buffer_size

The number of bytes available in `buffer'. In encoding mode, this
means `free space'; in decoding mode, `buffer_size' determines how
many bytes of encoded data are available in `buffer' for consumption.

=item used_buffer_size

This parameter points to a variable, which the callback must set
before returning in order to let the framework know how many bytes it
consumed from `buffer'.
A callback encoding, say, an int32 number into an 8-byte text
representation would set the used_buffer_size to 8:

    *used_buffer_size = 8;

In encoding mode, this variable determines how many bytes the engine
has written into `buffer'; in decoding mode the variable determines
how many bytes the engine has read from `buffer'.

=item args

This pointer points to an initialized variadic argument list. Use the
standard C macro va_arg(3) to fetch the actual data.

=back

A callback may return any of the following return codes, as defined in
xds.h:

=over 4

=item XDS_OK

No error.

=item XDS_ERR_NO_MEM

Failed to allocate required memory.

=item XDS_ERR_OVERFLOW

The buffer is too small to hold all encoded data. The callback may set
`*used_buffer_size' to the number of bytes it needs in `buffer',
thereby giving the framework a hint by how many bytes it should
enlarge the buffer before trying the engine again. Leaving
`*used_buffer_size' alone will work fine too; it may just be a bit
less efficient in some cases. Obviously this return code does not make
much sense in decoding mode.

=item XDS_ERR_INVALID_ARG

Unexpected or incorrect parameters.

=item XDS_ERR_TYPE_MISMATCH

This return code will be returned in decoding mode in case the
decoding engine realizes that the data it is decoding does not fit
what it is expecting. Not all encoding formats allow this to be
detected at all. XDR, for example, does not.

=item XDS_ERR_UNDERFLOW

In decode mode, this error is returned when an engine needs, say, 4
bytes of data in order to decode a value but `buffer'/`buffer_size'
provides less.

=item XDS_ERR_UNKNOWN

Any reason to fail other than those listed before; a catch-all.

=back

Let's take a look at the corresponding decoding engine now:

    static int decode_mystruct(xds_t* xds, void* engine_context,
                               void* buffer, size_t buffer_size,
                               size_t* used_buffer_size,
                               va_list* args)
    {
        struct mystruct* ms;
        size_t i;
        char*  tmp;
        int    rc;

        ms = va_arg(*args, struct mystruct*);
        rc = xds_decode(xds, "int32 int64 uint32 octetstream",
                        &(ms->small), &(ms->big), &(ms->positive),
                        &tmp, &i);
        if (rc == XDS_OK) {
            /* The octet stream must have exactly the size of `text'. */
            if (i == sizeof(ms->text))
                memmove(ms->text, tmp, i);
            else
                rc = XDS_ERR_TYPE_MISMATCH;
            free(tmp);
        }
        return rc;
    }

The engine simply calls xds_decode() to handle the separate data
types. The only complication is that the octet stream decoding engines
return a pointer to a malloc(3)ed buffer, which is not what we need.
Thus we have to manually copy the contents of that buffer into the
right place in the structure and free the (now unused) buffer again.

A complete example program encoding and decoding `mystruct' can be
found at docs/extended.c in the distribution.

=head1 SEE ALSO

=over 4

=item RFC 1832: `XDR: External Data Representation Standard', R. Srinivasan, August 1995

=item XML-RPC Home Page: http://www.xmlrpc.org/

=back

=cut