##
## OSSP xds - Extensible Data Serialization
## Copyright (c) 2001-2005 Ralf S. Engelschall <rse@engelschall.com>
## Copyright (c) 2001-2005 The OSSP Project <http://www.ossp.org/>
## Copyright (c) 2001-2005 Cable & Wireless <http://www.cw.com/>
##
## This file is part of OSSP xds, an extensible data serialization
## library which can be found at http://www.ossp.org/pkg/lib/xds/.
##
## Permission to use, copy, modify, and distribute this software for
## any purpose with or without fee is hereby granted, provided that
## the above copyright notice and this permission notice appear in all
## copies.
##
## THIS SOFTWARE IS PROVIDED `AS IS' AND ANY EXPRESSED OR IMPLIED
## WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
## MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
## IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR
## CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
## SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
## LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
## USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
## ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
## OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
## OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
## SUCH DAMAGE.
##
## xds.pod: Unix manual page source
##
=pod
=head1 NAME
B<OSSP xds> - eXtensible Data Serialization
=head1 SYNOPSIS
xds_init,
xds_destroy,
xds_register,
xds_unregister,
xds_setbuffer,
xds_getbuffer,
xds_encode,
xds_decode,
xds_vencode,
xds_vdecode.
=head1 DESCRIPTION
The B<OSSP xds> library is a generic and extensible encoding and decoding
framework for the serialization of arbitrary ISO C data types. B<OSSP
xds> consists of three components: the generic encoding and decoding
framework, a set of shipped engines to encode and decode values in
certain existing formats (Sun RPC/XDR and XDS/XML are currently
provided), and a run-time context, which is used to manage buffers,
registered engines, etc. The library is designed to allow fully
recursive and efficient encoding/decoding of arbitrary nested data.
=head2 INTRODUCTION
In order to use B<OSSP xds>, the first thing the developer has to
do is to create a valid context by calling xds_init(). The function
requires one parameter that determines whether to operate in encoding-
or decoding mode. A context can be used for encoding or decoding only;
it is not possible to use the same context for both operations. Once a
valid context has been obtained, the function xds_register() can be used
to register an arbitrary number of encoding (or decoding) engines within
the context.
Two sets of engines are included in the library; additional ones can be
programmed easily. The shipped engines handle the elementary data types
of the ISO C language, such as signed and unsigned integers (both 32 and
64 bits wide), floating point numbers, strings, and octet streams.
Once all required encoding/decoding engines are registered, the
functions xds_encode() or xds_decode() may be used to actually perform
the encoding or decoding process. Any data type for which an engine has
been registered before can be handled by the library.
This means that the developer can write custom engines for any data
type and register them in the context, as long as these engines adhere
to the C<xds_engine_t> interface defined in F<xds.h>.
In particular, it is possible to register meta engines. A meta engine
is designed to encode or decode data types that consist of several
elementary data types. Such an engine simply re-uses the existing
engines to encode or decode the elements of the structure.
The following example program (without error checking for simplicity)
will encode the unsigned integer 0x1234 into the B<XDR> format (known from
Sun RPC), decode it back into the native host format, and compare the
result to make sure it is the original value again:
    #include <stdio.h>
    #include <stdlib.h>
    #include "xds.h"

    int main(int argc, char *argv[])
    {
        xds_t *xds;
        xds_uint32_t uint32 = 0x1234;
        xds_uint32_t uint32_new;
        char *buffer;
        size_t buffer_size;

        /* encoding */
        xds = xds_init(XDS_ENCODE);
        xds_register(xds, "uint32", &xdr_encode_uint32, NULL);
        xds_encode(xds, "uint32", uint32);
        /* XDS_GIFT: the buffer now belongs to the application */
        xds_getbuffer(xds, XDS_GIFT, (void**)&buffer, &buffer_size);
        xds_destroy(xds);

        /* ...usually buffer is now transferred to a remote system... */

        /* decoding */
        xds = xds_init(XDS_DECODE);
        xds_register(xds, "uint32", &xdr_decode_uint32, NULL);
        /* XDS_LOAN: the library must neither free nor resize the buffer */
        xds_setbuffer(xds, XDS_LOAN, buffer, buffer_size);
        xds_decode(xds, "uint32", &uint32_new);
        xds_destroy(xds);
        free(buffer);   /* obtained as a gift above, so we free it */

        /* comparison */
        if (uint32 == uint32_new)
            printf("OK\n");
        else
            printf("Failure\n");
        return 0;
    }
=head1 THE XDS FRAMEWORK
B<OSSP xds> provides a generic framework for encoding and decoding.
The corresponding API is described here.
=over 4
=item xds_t *B<xds_init>(xds_mode_t I<mode>);
This function creates and initializes a context. The I<mode> parameter
may be either C<XDS_ENCODE> or C<XDS_DECODE>, depending on whether you
want to encode or decode data. If successful, xds_init() returns a
pointer to the context. In case of failure, xds_init() returns C<NULL>
and sets errno to C<ENOMEM> (failed to allocate internal memory buffers)
or C<EINVAL> (I<mode> parameter was invalid).
A context obtained from xds_init() must be destroyed by xds_destroy() if
it is no longer needed.
=item void B<xds_destroy>(xds_t *I<xds>);
xds_destroy() will destroy the context I<xds>, created by xds_init().
Doing so will return all resources associated with this context -- most
notably the memory used to buffer the results of encoding or decoding
any values. A context may not be used after it has been destroyed.
=item int B<xds_register>(xds_t *I<xds>, const char *I<name>, xds_engine_t I<engine>, void *I<engine_context>);
This function will register an engine in the provided context. An
I<engine> is potentially any function that fulfills the following
interface:
int B<engine>(xds_t *I<xds>, void *I<engine_context>, void *I<buffer>, size_t I<buffer_size>, size_t *I<used_buffer_size>, va_list *I<args>);
By calling xds_register(), the I<engine> will be registered under the
name I<name> in the context I<xds>. The last parameter I<engine_context>
may be used as the user sees fit: It will be passed when the engine is
actually called and may be used to implement an engine-specific context.
Most engines will not need a context of their own, in which case C<NULL>
should be specified here.
Please note that until the user calls xds_register() for a context he
obtained from xds_init(), no engines are registered for that context.
Even the engines included in the B<OSSP xds> source distribution are not
registered automatically.
For engine names, any combination of the characters a-z, A-Z, 0-9, "-",
and "_" may be used; anything else is not a legal engine name component.
xds_register() may return the following return codes: C<XDS_OK>
(everything went fine; the engine is registered now),
C<XDS_ERR_INVALID_ARG> (I<xds>, I<name>, or I<engine> is C<NULL>, or
I<name> contains illegal characters for an engine name), or
C<XDS_ERR_NO_MEM> (failed to allocate internally required buffers).
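The rule for legal engine names given above can be sketched as a small check. This is only an illustration: sketch_is_legal_engine_name() is a hypothetical helper and not part of the B<OSSP xds> API, and treating the empty string as illegal is an assumption the text does not confirm.

```c
#include <stddef.h>

/* Hypothetical helper, not part of the xds API: returns 1 if `name`
 * consists only of a-z, A-Z, 0-9, '-' and '_', and 0 otherwise.
 * Assumption: NULL and the empty string are treated as illegal. */
static int sketch_is_legal_engine_name(const char *name)
{
    const char *p;

    if (name == NULL || *name == '\0')
        return 0;
    for (p = name; *p != '\0'; p++) {
        char c = *p;
        int ok = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
              || (c >= '0' && c <= '9') || c == '-' || c == '_';
        if (!ok)
            return 0;
    }
    return 1;
}
```

With this check, a name such as "my-struct_2" is accepted, while "int32:int64" is rejected because the colon is not a legal name character.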
=item int B<xds_unregister>(xds_t *I<xds>, const char *I<name>);
xds_unregister() will remove the engine I<name> from the context I<xds>.
The function will return C<XDS_OK> in case everything went fine,
C<XDS_ERR_UNKNOWN_ENGINE> in case the engine I<name> is not registered
in I<xds>, or C<XDS_ERR_INVALID_ARG> if either I<xds> or I<name> is
C<NULL> or I<name> contains illegal characters for an engine name.
=item int B<xds_setbuffer>(xds_t *I<xds>, xds_scope_t I<flag>, void *I<buffer>, size_t I<buffer_len>);
This function allows the user to control the buffer handling: Calling
it will replace the buffer currently used in I<xds>. The address and
size of that buffer are passed to xds_setbuffer() via the I<buffer> and
I<buffer_len> parameters. The I<xds> parameter determines for which
context the new buffer will be set. Furthermore, you can set I<flag> to
either C<XDS_GIFT> or C<XDS_LOAN>.
C<XDS_GIFT> will tell B<OSSP xds> that the provided buffer is now
owned by the library and that it may be resized by calling realloc(3).
Furthermore, the buffer is free(3)'ed when I<xds> is destroyed. If
I<flag> is C<XDS_GIFT> and I<buffer> is C<NULL>, xds_setbuffer() will
simply allocate a buffer of its own to be set in I<xds>. Please note
that a buffer given to the library as a gift B<must> have been allocated
using malloc(3) -- it may not live on the stack because B<OSSP xds> will
try to free or to resize the buffer as it sees fit.
Passing C<XDS_LOAN> via I<flag> tells xds_setbuffer() that the buffer is
owned by the application and that B<OSSP xds> should not free nor resize
the buffer in any case. In this mode, passing a buffer of C<NULL> will
result in an invalid-argument error.
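The difference between gift and loan semantics can be illustrated with a toy buffer that records whether it owns its memory. The sketch below is not the library's implementation; every toy_* name is a stand-in invented for illustration, and it allocates lazily on the first append where the text says xds_setbuffer() allocates a buffer of its own.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* All names below are illustrative stand-ins, not xds identifiers. */
enum toy_scope { TOY_LOAN, TOY_GIFT };
enum toy_rc    { TOY_OK, TOY_ERR_NO_MEM, TOY_ERR_OVERFLOW, TOY_ERR_INVALID_ARG };

struct toy_buf {
    char  *data;      /* start of the buffer */
    size_t capacity;  /* bytes available in data */
    size_t used;      /* bytes already filled */
    int    owned;     /* non-zero: we may realloc() and free() data */
};

/* Install a buffer. A gift transfers ownership; a loan does not. */
static int toy_setbuffer(struct toy_buf *b, enum toy_scope flag,
                         void *buffer, size_t len)
{
    assert(b != NULL);
    if (flag == TOY_LOAN && buffer == NULL)
        return TOY_ERR_INVALID_ARG;   /* nothing to borrow */
    if (b->owned)
        free(b->data);                /* drop the previously owned buffer */
    b->data     = buffer;             /* NULL gift: allocated on first append */
    b->capacity = (buffer != NULL) ? len : 0;
    b->used     = 0;
    b->owned    = (flag == TOY_GIFT);
    return TOY_OK;
}

/* Append bytes, growing only a buffer that we own. */
static int toy_append(struct toy_buf *b, const void *src, size_t n)
{
    assert(b != NULL && src != NULL);
    if (n == 0)
        return TOY_OK;
    if (b->used + n > b->capacity) {
        char *grown;
        if (!b->owned)
            return TOY_ERR_OVERFLOW;  /* loaned buffers are never resized */
        grown = realloc(b->data, b->used + n);
        if (grown == NULL)
            return TOY_ERR_NO_MEM;
        b->data     = grown;
        b->capacity = b->used + n;
    }
    memcpy(b->data + b->used, src, n);
    b->used += n;
    return TOY_OK;
}
```

A loaned stack buffer of four bytes accepts four bytes and then reports an overflow, whereas a gifted (here: lazily allocated) buffer simply grows. This also shows why a gifted buffer must come from malloc(3): it ends up being passed to realloc(3) and free(3).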
=item int B<xds_getbuffer>(xds_t *I<xds>, xds_scope_t I<flag>, void **I<buffer>, size_t *I<buffer_len>);
This function is the counterpart to xds_setbuffer(): It will get
the buffer currently used in the context I<xds>. The address of that
buffer is stored in the location I<buffer> points to; the length of the
buffer's content will be stored in the location I<buffer_len> points to.
The I<flag> argument may be set to either C<XDS_GIFT> or C<XDS_LOAN>.
The first setting means that the buffer is now owned by the application
and that B<OSSP xds> must not use it after this xds_getbuffer() call
anymore; it will allocate a new internal buffer instead. Of course,
this also means that the buffer will not be freed by xds_destroy();
the application has to free(3) the buffer itself when it is not needed
anymore.
Setting I<flag> to C<XDS_LOAN> tells B<OSSP xds> that the application
just wishes to peek into the buffer and will not modify it. The buffer
is still owned (and used) by B<OSSP xds>. Please note that the loaned
address returned by xds_getbuffer() may change after any other xds_xxx()
function call!
The function will return C<XDS_OK> (everything went fine) or
C<XDS_ERR_INVALID_ARG> (I<xds>, I<buffer> or I<buffer_len> are C<NULL>
or I<flag> is invalid) signifying success or failure respectively.
Please note: It is perfectly legal for xds_getbuffer() to return a
buffer of C<NULL> and a buffer length of C<0>! This happens when
xds_getbuffer() is called for a fresh context before a buffer has been
allocated at all.
=item int B<xds_vencode>(xds_t *I<xds>, const char *I<fmt>, va_list I<args>);
This function will encode one or several values using the appropriate
encoding engines registered in the context I<xds>. The parameter I<fmt>
contains a sprintf(3)-like description of the values to be encoded;
the actual values are provided in the variadic parameter I<args>.
The format for I<fmt> is simple: Just provide the names of the engines to
be used for encoding the corresponding value(s) in I<args>. Any character
that is not legal in an engine name may be used as a delimiter. In order
to encode two 32-bit integers followed by a 64-bit integer, the format
string
    int32 int32 int64
could be used. If you don't like the blank, use the colon instead:
    int32:int32:int64
Of course, the names used here have to match the names under which the
engines were registered in I<xds> earlier.
Every time xds_vencode() is called, it will append the encoded data
at the end of the internal buffer stored in I<xds>. Thus, you can
call xds_vencode() several times in order to encode several values,
but you'll still get all encoded values stored in one buffer. Calling
xds_setbuffer() or xds_getbuffer() with gift semantics at any point
during encoding will re-set the buffer to the beginning. All values
that have been encoded into that buffer already will eventually be
overwritten when xds_encode() is called again. Hence: Don't call
xds_setbuffer() or xds_getbuffer() unless you actually want to access
the data stored in the buffer.
Also, it should be noted that the data you have to provide for I<args>
depends entirely on what the deployed engines expect to find on the
stack -- there is no "standard" on what should be put on the stack here.
The B<XML> and B<XDR> engines included in the distribution will simply expect
the value to be encoded to be found on the stack, but other engines may
act differently.
xds_vencode() will return any of the following return codes: C<XDS_OK>
(everything worked fine), C<XDS_ERR_NO_MEM> (failed to allocate or to
resize the internal buffer), C<XDS_ERR_OVERFLOW> (the internal buffer
is too small but is not owned by us), C<XDS_ERR_INVALID_ARG> (I<xds> or
I<fmt> are C<NULL>), C<XDS_ERR_UNKNOWN_ENGINE> (an engine name specified
in I<fmt> is not registered in I<xds>), C<XDS_ERR_INVALID_MODE> (I<xds>
is initialized in decode mode), or C<XDS_ERR_UNKNOWN> (the engine
returned an unspecified error).
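The format string convention described for xds_vencode() (engine names separated by any character that is not legal in a name) can be sketched as a small tokenizer. The functions below are hypothetical helpers written for illustration; they are not taken from the library and say nothing about how the framework parses I<fmt> internally.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of the xds API. */
static int sketch_is_name_char(char c)
{
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
        || (c >= '0' && c <= '9') || c == '-' || c == '_';
}

/* Copy the next engine name from *cursor into out and advance *cursor
 * past it. Returns 1 if a name was found, 0 at the end of the format
 * string, and -1 if the name (plus NUL) does not fit into out. */
static int sketch_next_engine_name(const char **cursor,
                                   char *out, size_t out_size)
{
    const char *p;
    size_t len = 0;

    assert(cursor != NULL && *cursor != NULL);
    assert(out != NULL && out_size > 0);

    p = *cursor;
    while (*p != '\0' && !sketch_is_name_char(*p))
        p++;                          /* skip any run of delimiters */
    if (*p == '\0') {
        *cursor = p;
        return 0;
    }
    while (sketch_is_name_char(p[len]))
        len++;                        /* stops at a delimiter or NUL */
    if (len >= out_size)
        return -1;
    memcpy(out, p, len);
    out[len] = '\0';
    *cursor = p + len;
    return 1;
}
```

Called repeatedly on "int32:int32 int64", the tokenizer yields "int32", "int32", and "int64", and then reports the end of the string.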
=item int B<xds_encode>(xds_t *I<xds>, const char *I<fmt>, ...);
This function is identical to xds_vencode(), except that it takes the
values to encode as a variable argument list instead of a C<va_list>.
=item int B<xds_vdecode>(xds_t *I<xds>, const char *I<fmt>, va_list I<args>);
This function is almost identical to xds_vencode(): It expects a
context, a format string and a set of parameters for the engines, but
xds_vdecode() does not encode any data, it decodes the data back into
the native format. The format string determines which engines are to be
called by the framework in order to decode the values contained in the
buffer. The values will then be stored at the locations found in the
corresponding I<args> entry. But please note that the exact behavior of
the decoding engines is not specified! The B<XML> and B<XDR> engines included
in this distribution expect a pointer to a location where to store the
decoded value, but other engines may vary.
xds_vdecode() may return any of the following return codes: C<XDS_OK>
(everything went fine), C<XDS_ERR_INVALID_ARG> (I<xds> or I<fmt> are
C<NULL>), C<XDS_ERR_TYPE_MISMATCH> (the format string says the next
value is of a particular type, but that's not what we found in the
buffer), C<XDS_ERR_UNKNOWN_ENGINE> (an engine name specified in I<fmt>
is not registered in I<xds>), C<XDS_ERR_INVALID_MODE> (I<xds> has
been initialized in encode mode), C<XDS_ERR_UNDERFLOW> (an engine
tried to read more bytes from the buffer than there is data left), or
C<XDS_ERR_UNKNOWN> (an engine returned an unspecified error).
=item int B<xds_decode>(xds_t *I<xds>, const char *I<fmt>, ...);
This function is identical to xds_vdecode(), except that it takes the
locations for the decoded values as a variable argument list instead of
a C<va_list>.
=back
=head1 THE XDR ENGINES
The B<OSSP xds> distribution ships with a set of engine functions which
implement the encoding and decoding for the B<XDR> encoding known from
Sun RPC.
    Function Name              Expected `args'    Input      Output
    ------------------------------------------------------------------
    xdr_encode_uint32()        xds_uint32_t       4 bytes    4 bytes
    xdr_decode_uint32()        xds_uint32_t*      4 bytes    4 bytes
    xdr_encode_int32()         xds_int32_t        4 bytes    4 bytes
    xdr_decode_int32()         xds_int32_t*       4 bytes    4 bytes
    xdr_encode_uint64()        xds_uint64_t       8 bytes    8 bytes
    xdr_decode_uint64()        xds_uint64_t*      8 bytes    8 bytes
    xdr_encode_int64()         xds_int64_t        8 bytes    8 bytes
    xdr_decode_int64()         xds_int64_t*       8 bytes    8 bytes
    xdr_encode_float()         xds_float_t        4 bytes    4 bytes
    xdr_decode_float()         xds_float_t*       4 bytes    4 bytes
    xdr_encode_double()        xds_double_t       8 bytes    8 bytes
    xdr_decode_double()        xds_double_t*      8 bytes    8 bytes
    xdr_encode_octetstream()   void*, size_t      variable   variable
    xdr_decode_octetstream()   void**, size_t*    variable   variable
    xdr_encode_string()        char*              variable   variable
    xdr_decode_string()        char**             variable   variable
Please note that the functions xdr_decode_octetstream() and
xdr_decode_string() return a pointer to a buffer holding the decoded
data. This buffer has been allocated with malloc(3) and must be
free(3)'ed by the application when it is not required anymore. All other
callbacks write the decoded value into the location found on the stack,
but these behave differently because the length of the decoded data is
not known in advance and the application cannot provide a buffer that's
guaranteed to suffice.
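To make the fixed and variable sizes of the B<XDR> format more concrete, the following sketch shows the wire layout that RFC 1832 (see SEE ALSO) prescribes for a 32-bit unsigned integer and the encoded size of variable-length opaque data. The sketch_* functions are hypothetical helpers, not the shipped xdr_*() engines; that the shipped engines produce exactly this layout is assumed from the statement that they implement B<XDR>.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helpers that only illustrate the RFC 1832 layout. */

/* A 32-bit unsigned integer occupies 4 bytes, most significant first. */
static void sketch_put_uint32_be(unsigned char out[4], uint32_t value)
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xffu);
    out[1] = (unsigned char)((value >> 16) & 0xffu);
    out[2] = (unsigned char)((value >>  8) & 0xffu);
    out[3] = (unsigned char)( value        & 0xffu);
}

static uint32_t sketch_get_uint32_be(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16)
         | ((uint32_t)in[2] <<  8) |  (uint32_t)in[3];
}

/* Variable-length opaque data: a 4-byte length followed by the data,
 * padded with zero bytes to a multiple of 4. */
static size_t sketch_xdr_opaque_size(size_t data_len)
{
    return 4 + ((data_len + 3) & ~(size_t)3);
}
```

The value 0x1234 from the introductory example is laid out as the bytes 00 00 12 34 regardless of the byte order of the host, and a 16-byte octet stream would take 20 bytes under this rule.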
=head1 THE XML ENGINES
The B<OSSP xds> distribution ships with a set of engine functions which
implement the encoding and decoding for an B<XML> based format specified by
the included B<XML> DTD.
    Function Name              Expected `args'    Input         Output
    ------------------------------------------------------------------------
    xml_encode_uint32()        xds_uint32_t       4 bytes       18-27 bytes
    xml_decode_uint32()        xds_uint32_t*      18-27 bytes   4 bytes
    xml_encode_int32()         xds_int32_t        4 bytes       16-26 bytes
    xml_decode_int32()         xds_int32_t*       16-26 bytes   4 bytes
    xml_encode_uint64()        xds_uint64_t       8 bytes       18-37 bytes
    xml_decode_uint64()        xds_uint64_t*      18-37 bytes   8 bytes
    xml_encode_int64()         xds_int64_t        8 bytes       16-36 bytes
    xml_decode_int64()         xds_int64_t*       16-36 bytes   8 bytes
    xml_encode_float()         xds_float_t        4 bytes       variable
    xml_decode_float()         xds_float_t*       variable      4 bytes
    xml_encode_double()        xds_double_t       8 bytes       variable
    xml_decode_double()        xds_double_t*      variable      8 bytes
    xml_encode_octetstream()   void*, size_t      variable      variable
    xml_decode_octetstream()   void**, size_t*    variable      variable
    xml_encode_string()        char*              variable      variable
    xml_decode_string()        char**             variable      variable
Please note that the functions xml_decode_octetstream() and
xml_decode_string() return a pointer to a buffer holding the decoded
data. This buffer has been allocated with malloc(3) and must be
free(3)'ed by the application when it is not required anymore. All other
callbacks write the decoded value into the location found on the stack,
but these behave differently because the length of the decoded data is
not known in advance and the application cannot provide a buffer that's
guaranteed to suffice.
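The size ranges in the table can be made plausible with a small sketch. It assumes, without confirmation from this manual page or the DTD, that a 32-bit unsigned value is written as decimal digits between a C<uint32> start tag and end tag; sketch_xml_uint32() is a hypothetical helper and not the shipped xml_encode_uint32() engine.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper. Assumption: the element looks like
 * <uint32>DIGITS</uint32>; the real format is defined by the DTD.
 * Returns the number of characters written (without the NUL) or -1
 * if out is too small. */
static int sketch_xml_uint32(char *out, size_t out_size, uint32_t value)
{
    int n;

    assert(out != NULL);
    n = snprintf(out, out_size, "<uint32>%" PRIu32 "</uint32>", value);
    if (n < 0 || (size_t)n >= out_size)
        return -1;
    return n;
}
```

Under that assumption the two tags take 17 characters, so one digit gives 18 bytes and the ten digits of the largest 32-bit value give 27 bytes, which agrees with the range listed for the uint32 engines.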
=head1 EXTENDING THE LIBRARY
This section demonstrates how to write a "meta engine" for the B<OSSP
xds> framework. The example engine will encode a complex data structure
consisting of four elementary members. The structure is defined as
follows:
    struct mystruct {
        xds_int32_t   small;
        xds_int64_t   big;
        xds_uint32_t  positive;
        char          text[16];
    };
Some readers might wonder why the structure is defined using these
unfamiliar data types rather than the familiar ones like C<int>, C<long>,
etc. The reason is that the size of the familiar types is
implementation-defined. A C<long> variable will have, say, 32 bits when
compiled on the average 32-bit Unix machine, but when the same program is
compiled on a 64-bit machine like Tru64 Unix, it will have a size of 64
bits. This is a problem when those structures have to be exchanged
between entirely different systems, because the structures are binary
incompatible -- something even B<OSSP xds> cannot remedy, because it is
impossible to construct a bidirectional and lossless mapping in this case.
In order to encode an instance of this structure, we write an encoding
engine:
    static int
    encode_mystruct(
        xds_t *xds, void *engine_context,
        void *buffer, size_t buffer_size,
        size_t *used_buffer_size,
        va_list *args)
    {
        struct mystruct *ms;

        ms = va_arg(*args, struct mystruct*);
        return xds_encode(xds, "int32 int64 uint32 octetstream",
                          ms->small, ms->big, ms->positive,
                          ms->text, sizeof(ms->text));
    }
This engine takes the address of the I<mystruct> structure from the
stack and then uses xds_encode() to handle all elements of I<mystruct>
separately -- which is fine, because these data types are supported
by B<OSSP xds> already (both by the shipped B<XDR> and B<XML> engines). It
is worth noting, though, that we refer to the other engines by name,
meaning that these engines must have been registered in I<xds> under
those names beforehand!
What is very nice, though, is the fact that this encoding engine does
not even need to know which particular engines are used to encode the
actual values! If the user registers the B<XDR> engines under the
appropriate names, I<mystruct> will be encoded in B<XDR>. If the user
registers the B<XML> engines under the appropriate names, I<mystruct>
will be encoded in B<XML>. Because of that property, we call such an
engine a "meta engine".
Of course, you need not necessarily implement an engine as a "meta
engine": Rather than going through xds_encode(), it would be possible
to execute the appropriate encoding engines directly. This would have
the advantage of not depending on those engines being registered at all,
but it would tie the custom engine to one particular set of elementary
engines, which is an unnecessary limitation.
One more word about the engine syntax and semantics: As has been
mentioned earlier, any function that adheres to the interface shown
above is potentially an engine. These parameters have the following
meaning:
=over 4
=item I<xds>
This is the B<OSSP xds> context that was originally provided to the
xds_encode() call, which in turn executed the engine. It may be used,
for example, to call xds_encode() again, as the example engine above
does.
=item I<engine_context>
The engine context can be used by the engine to store any type of
internal information. The value the engine will receive must have been
provided when the engine was registered by xds_register(). Engines
may ignore this parameter if they don't need a context of
their own -- all engines included in the distribution do so.
=item I<buffer>
This parameter points to the buffer the encoded data should be written
to. In decoding mode, I<buffer> points to the encoded data, which should
be decoded; the location where the results should be stored at can be
found on the stack then.
=item I<buffer_size>
The number of bytes available in I<buffer>. In encoding mode, this means
"free space", in decoding mode, I<buffer_size> determines how many bytes
of encoded data are available in I<buffer> for consumption.
=item I<used_buffer_size>
This parameter points to a variable which the callback must set before
returning in order to let the framework know how many bytes it used
in I<buffer>. A callback encoding, say, an int32 number into an 8-byte
text representation would set the used_buffer_size to 8:
    *used_buffer_size = 8;
In encoding mode, this variable determines how many bytes the engine has
written into I<buffer>; in decoding mode the variable determines how
many bytes the engine has read from I<buffer>.
=item I<args>
This pointer points to an initialized variadic argument list.
Use the standard C macro va_arg(3) to fetch the actual data.
=back
A callback may return any of the following return codes, as defined in
F<xds.h>:
=over 4
=item C<XDS_OK>
No error.
=item C<XDS_ERR_NO_MEM>
Failed to allocate required memory.
=item C<XDS_ERR_OVERFLOW>
The buffer is too small to hold all encoded data. The callback may set
*I<used_buffer_size> to the number of bytes it needs in I<buffer>,
thereby giving the framework a hint by how many bytes it should
enlarge the buffer before trying the engine again. Leaving
*I<used_buffer_size> alone works too; it may just be a bit less
efficient in some cases. This return code does not make much sense in
decoding mode.
=item C<XDS_ERR_INVALID_ARG>
Unexpected or incorrect parameters.
=item C<XDS_ERR_TYPE_MISMATCH>
This return code will be returned in decoding mode in case the decoding
engine realizes that the data it is decoding does not fit what it is
expecting. Not all encoding formats allow this to be detected at all;
B<XDR>, for example, does not.
=item C<XDS_ERR_UNDERFLOW>
In decode mode, this error is returned when an engine needs, say, 4
bytes of data in order to decode a value but I<buffer>/I<buffer_size>
provides less.
=item C<XDS_ERR_UNKNOWN>
Any other reason to fail than those listed before. Catch all...
=back
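The parameter and return code conventions described above can be illustrated with a minimal elementary engine. The sketch is written so that it compiles without F<xds.h>: toy_ctx_t, TOY_OK, and TOY_ERR_OVERFLOW are stand-ins for C<xds_t>, C<XDS_OK>, and C<XDS_ERR_OVERFLOW>, the numeric values are arbitrary, and toy_call() merely imitates the way a framework might hand a C<va_list> pointer to an engine.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>

/* Stand-ins so the sketch compiles without xds.h; the real types and
 * return code values come from that header. */
typedef struct toy_ctx toy_ctx_t;   /* opaque, playing the role of xds_t */
enum { TOY_OK = 0, TOY_ERR_OVERFLOW = 1 };

/* Encode one 32-bit unsigned value as 4 big-endian bytes. */
static int toy_encode_uint32(toy_ctx_t *ctx, void *engine_context,
                             void *buffer, size_t buffer_size,
                             size_t *used_buffer_size, va_list *args)
{
    unsigned char *out = buffer;
    uint32_t value;

    (void)ctx;             /* no nested encode calls needed here */
    (void)engine_context;  /* this engine keeps no state of its own */
    assert(buffer != NULL && used_buffer_size != NULL && args != NULL);

    /* Report the space required; on overflow this acts as the hint. */
    *used_buffer_size = 4;
    if (buffer_size < 4)
        return TOY_ERR_OVERFLOW;

    /* Fetch the value to encode from the argument list. */
    value = va_arg(*args, uint32_t);
    out[0] = (unsigned char)((value >> 24) & 0xffu);
    out[1] = (unsigned char)((value >> 16) & 0xffu);
    out[2] = (unsigned char)((value >>  8) & 0xffu);
    out[3] = (unsigned char)( value        & 0xffu);
    return TOY_OK;
}

/* Variadic driver standing in for the framework's call into the engine. */
static int toy_call(void *buffer, size_t buffer_size, size_t *used, ...)
{
    va_list args;
    int rc;

    va_start(args, used);
    rc = toy_encode_uint32(NULL, NULL, buffer, buffer_size, used, &args);
    va_end(args);
    return rc;
}
```

Given only two bytes of free space, the engine returns the overflow code and leaves the hint 4 in the used size; given four bytes or more, it writes the value and reports that four bytes were used.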
Let's take a look at the corresponding decoding "meta engine" now:
    static int
    decode_mystruct(
        xds_t *xds, void *engine_context,
        void *buffer, size_t buffer_size,
        size_t *used_buffer_size,
        va_list *args)
    {
        struct mystruct *ms;
        size_t i;
        char *tmp;
        int rc;

        ms = (struct mystruct *)va_arg(*args, void *);
        rc = xds_decode(xds, "int32 int64 uint32 octetstream",
                        &(ms->small), &(ms->big), &(ms->positive),
                        &tmp, &i);
        if (rc == XDS_OK) {
            /* tmp was allocated by the octet stream engine */
            if (i == sizeof(ms->text))
                memmove(ms->text, tmp, i);
            else
                rc = XDS_ERR_TYPE_MISMATCH;
            free(tmp);
        }
        return rc;
    }
The engine simply calls xds_decode() to handle the separate data types.
The only complication is that the octet stream decoding engines return a
pointer to a malloc(3)'ed buffer, which is not what we need. Thus we have
to manually copy the contents of that buffer into the right place in the
structure and free the (now unused) buffer again.
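The copy-and-free step can also be factored into a small helper, which makes the ownership of the temporary buffer explicit. sketch_take_fixed_field() is a hypothetical function for illustration only; it is not part of B<OSSP xds> and returns plain 0 and -1 instead of the library's return codes.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: move a decoded, malloc(3)'ed octet stream into
 * a fixed-size field and release the temporary buffer in every case.
 * Returns 0 on success and -1 if the lengths do not match. */
static int sketch_take_fixed_field(void *field, size_t field_size,
                                   void *decoded, size_t decoded_len)
{
    int rc = -1;

    assert(field != NULL);
    if (decoded != NULL && decoded_len == field_size) {
        memcpy(field, decoded, field_size);
        rc = 0;
    }
    free(decoded);   /* free(NULL) is a no-op */
    return rc;
}
```

A decoding engine could call such a helper with C<ms-E<gt>text>, C<sizeof(ms-E<gt>text)>, and the pointer and length obtained from the octet stream engine, and map a result of -1 to C<XDS_ERR_TYPE_MISMATCH>.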
A complete example program encoding and decoding C<mystruct> can be
found as F<docs/extended.c> in the B<OSSP xds> source distribution.
=head1 SEE ALSO
RFC 1832: `XDR: External Data Representation Standard',
R. Srinivasan, August 1995
XML-RPC Home Page: http://www.xmlrpc.org/
=head1 HISTORY
B<OSSP xds> was initially written by Peter Simons
E<lt>simons@crypt.toE<gt> in August 2001 under contract with the B<OSSP>
sponsor B<Cable & Wireless>.
=cut