OSSP CVS Repository

ossp - Check-in [4278]

Check-in Number: 4278
Date: 2001-Aug-02 15:58:07 (local)
2001-Aug-02 13:58:07 (UTC)
User:simons
Branch:
Comment: Currently, the manual contains an introduction, a description of the architecture and two example programs demonstrating how to encode and decode. Much remains to be written.
Tickets:
Inspections:
Files:
ossp-pkg/xds/docs/libxds.tex      added-> 1.1

ossp-pkg/xds/docs/libxds.tex -> 1.1

*** /dev/null    Sat Nov 23 06:10:44 2024
--- -    Sat Nov 23 06:10:48 2024
***************
*** 0 ****
--- 1,571 ----
+ % -*- mode: LaTeX; fill-column: 75; -*-
+ %
+ % $Id: libxds.tex,v 1.1 2001/08/02 13:58:07 simons Exp $
+ %
+ \documentclass[a4paper,10pt,pointlessnumbers,bibtotoc]{scrartcl}
+ \usepackage[dvips,xdvi]{graphicx}
+ \usepackage{fancyvrb}
+ \typearea[2cm]{12}
+ \fussy
+ 
+ \begin{document}
+ 
+ \subject{Cable \& Wireless Application Development}
+ \title{XDS -- eXtensible Data Serialization}
+ \author{Peter Simons $<$simons@computer.org$>$}
+ \date{2001-08-01}
+ \maketitle
+ 
+ \section{Introduction}
+ 
+ In today's networked world, computer systems of all brands and flavours
+ communicate with each other. Unfortunately, these systems are far from
+ being identical: Many systems use different internal representations for
+ the same thing. Look at the (hexadecimal) number \$1234 for instance: On a
+ big endian machine, this number will be stored in memory the way you'd
+ intuitively expect: \$12~\$34 --- the more significant byte precedes the
+ less significant one. On a little endian machine, though, the number \$1234
+ will be stored like this: \$34~\$12 --- exactly the other way round.
+ 
+ As a result, you cannot just write the number \$1234 to a socket and expect
+ the other end to understand it correctly, because if the byte orders differ,
+ the reader will read a different number than the writer sent. Things will
+ get even more complicated when you start exchanging floating point numbers,
+ for which about a dozen different encodings exist!
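
The byte-order problem described above can be sidestepped by never copying an integer's memory image directly and instead writing its bytes in one agreed order. The following is a minimal sketch of that idea in plain C, not code taken from libxds; the names put_u32_be() and get_u32_be() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Store a 32-bit value most significant byte first, regardless of how
 * the host keeps integers in memory. (Hypothetical helper.) */
void put_u32_be(uint32_t value, unsigned char out[4])
{
    assert(out != NULL);
    out[0] = (unsigned char)((value >> 24) & 0xff);
    out[1] = (unsigned char)((value >> 16) & 0xff);
    out[2] = (unsigned char)((value >> 8) & 0xff);
    out[3] = (unsigned char)(value & 0xff);
}

/* Rebuild the value from four bytes that were written by put_u32_be().
 * Shifts operate on the value, not on memory, so the result is the same
 * on big endian and little endian hosts. (Hypothetical helper.) */
uint32_t get_u32_be(const unsigned char in[4])
{
    assert(in != NULL);
    return ((uint32_t)in[0] << 24) | ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |  (uint32_t)in[3];
}
```

Writing \$1234 with put_u32_be() yields the bytes 00 00 12 34 on every host, and get_u32_be() returns \$1234 from them on every host.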
+ 
+ Solving these problems is the domain of libxds; its purpose is to encode
+ data in a way that allows this data to be exchanged between computer
+ systems of different types. Assume you'd want to reliably transfer the
+ value \$1234 from host A to host B. Then you would encode the value using
+ libxds, transfer the encoded data via the network, and decode the value
+ again at the other end. Every application that follows this process will
+ read the correct value no matter what native representation its hosting
+ platform uses internally.
+ 
+ \begin{figure}[tbh]
+     \begin{center}
+         \includegraphics[width=\textwidth]{data-exchange.eps}
+         \caption{Data exchange using libxds}
+         \label{data exchange}
+     \end{center}
+ \end{figure}
+ 
+ There is a rich variety of applications for such a functionality: libxds
+ may be used to encode data before it is written to disk or read from the
+ disk, it may be used to encode data to be exchanged between processes over
+ the network, etc. Because of this variety, special attention has been paid
+ to the library design.
+ 
+ \paragraph{The library has been designed to be extensible.}
+ The functionality is split into a generic encoding and decoding framework
+ and a set of formatting engines. These engines can be plugged into the
+ framework at run-time to actually encode and decode data. Because of this
+ architecture, libxds can be customized to deploy any data format the
+ developer sees fit. Included in the distribution are formatting engines for
+ the XDR format specified in \cite{xdr} and for the XML format specified in
+ \cite{xml}.
+ 
+ \paragraph{The library is convenient to use.}
+ An arbitrary number of variables can be encoded or decoded with one single
+ function call. All memory management is done by libxds; the developer
+ doesn't need to bother allocating or managing buffers for the encoded or
+ decoded data. Automatic buffer management can be switched off at run-time,
+ though, for maximum performance.
+ 
+ \paragraph{Performance.}
+ Since all transferred data has to pass through libxds, the library has
+ been written to encode and decode with maximum performance. The generic
+ encoding framework adds almost no run-time overhead to the encoding
+ process. If non-automatic buffer management has been selected, hardly
+ anything but the actual formatting engines is executed.
+ 
+ \paragraph{Robustness.}
+ In order to verify that the library is working correctly, a set of
+ regression tests is included in the distribution. The test suites will ---
+ among other things --- encode known values and compare the result with the
+ expected (correct) values. Running them on a given platform verifies that
+ libxds works correctly there.
+ 
+ \paragraph{Use standard formats.}
+ The supported formats XDR and XML are widely known and accepted formats,
+ which are most likely interoperable with other marshaling implementations.
+ For XDR for example, it would be possible to encode data with libxds and to
+ decode it with an entirely different XDR implementation or vice versa.
+ 
+ \paragraph{Portability.}
+ libxds has been written with portability in mind. Development took place on
+ FreeBSD, Linux and Solaris; other platforms have been used to test the
+ results. It is expected that libxds will compile and function on virtually
+ any POSIX.1-compliant system with a moderately modern ISO-C compiler. The
+ GNU C compiler (gcc) is known to compile the library just fine. For maximum
+ portability, GNU autoconf has been used to determine the target system's
+ properties.
+ 
+ \section{Architecture of libxds}
+ 
+ \begin{figure}[htb]
+     \begin{center}
+         \includegraphics[width=\textwidth]{architecture.eps}
+         \caption{Components of libxds}
+         \label{libxds components}
+     \end{center}
+ \end{figure}
+ 
+ The architecture of libxds is illustrated in figure~\ref{libxds
+ components}. libxds consists of three components: The generic encoding and
+ decoding framework, a set of formatting engines to encode and decode values
+ in a certain format, and a run-time context, which is used to manage
+ buffers, registered engines, etc.
+ 
+ In order to use the library, the first thing the developer has to do is to
+ create a valid XDS context by calling {\sf xds\_init()}. The routine
+ requires one parameter that determines whether to operate in encoding or
+ decoding mode. A context can be used for encoding or decoding only; it is
+ not possible to use the same context for both operations. Once a valid XDS
+ context has been obtained, the routine {\sf xds\_register()} can be used to
+ register an arbitrary number of formatting engines within the context.
+ 
+ A set of formatting engines has been included in the library. These
+ routines will handle any elementary datatype included in the ISO-C language
+ such as 32-bit integers, 64-bit integers, unsigned integers (of both 32-
+ and 64-bit), floating point numbers, strings and octet streams.
+ 
+ Once all required formatting engines are registered, the routines {\sf
+ xds\_encode()} or {\sf xds\_\-decode()} may be used to actually perform the
+ encoding or decoding process. Any data type for which a formatting engine
+ has been registered can be handled by the library.
+ 
+ This means that it is possible for the developer to write custom
+ formatting engines for any data type he desires to use and to register them
+ in the context as long as these engines adhere to the {\sf xds\_engine\_t}
+ interface defined in {\sf xds.h}.
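
To make the idea of registering engines under a name more concrete, here is a minimal sketch of a name-to-function registry in plain C. It is not the libxds implementation: the engine signature engine_fn, the per-engine context pointer, the fixed table size and all registry_*() names are assumptions made for this illustration, and the real xds_engine_t interface in xds.h may look quite different.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical engine signature; the real xds_engine_t may differ. */
typedef int (*engine_fn)(void *engine_context);

#define MAX_ENGINES 16   /* arbitrary limit chosen for the sketch */

typedef struct {
    const char *name;    /* not copied; must outlive the registry */
    engine_fn   fn;
    void       *context; /* passed to fn on every call */
} engine_entry_t;

typedef struct {
    engine_entry_t entry[MAX_ENGINES];
    size_t         count;
} registry_t;

void registry_init(registry_t *r)
{
    assert(r != NULL);
    r->count = 0;
}

/* Register fn under name, replacing an earlier entry of the same name.
 * Returns 0 on success and -1 if the table is full. */
int registry_add(registry_t *r, const char *name, engine_fn fn, void *context)
{
    size_t i;
    assert(r != NULL && name != NULL && fn != NULL);
    for (i = 0; i < r->count; ++i) {
        if (strcmp(r->entry[i].name, name) == 0)
            break;                       /* overwrite existing entry */
    }
    if (i == MAX_ENGINES)
        return -1;
    r->entry[i].name = name;
    r->entry[i].fn = fn;
    r->entry[i].context = context;
    if (i == r->count)
        r->count++;
    return 0;
}

/* Look up name and run its engine; returns the engine's result,
 * or -1 if no engine has been registered under that name. */
int registry_call(const registry_t *r, const char *name)
{
    size_t i;
    assert(r != NULL && name != NULL);
    for (i = 0; i < r->count; ++i) {
        if (strcmp(r->entry[i].name, name) == 0)
            return r->entry[i].fn(r->entry[i].context);
    }
    return -1;
}

/* Demo engine: counts how often it has been invoked. */
int count_calls(void *engine_context)
{
    ++*(int *)engine_context;
    return 0;
}
```

Because callers refer to engines only by name, the function behind a name can be exchanged without touching the calling code.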
+ 
+ In particular it is possible to register meta formatting engines. That is a
+ formatting engine designed to encode or decode structures --- data types
+ which consist of several elementary data types. The formatting engine for
+ such a structure will simply re-use the existing engines in order to encode
+ or decode the whole structure. The trick here is that the meta engine
+ doesn't even need to know \emph{which} low-level formatting engines are
+ registered in order to use them. Hence, a meta engine may format the whole
+ structure in XDR, XML, or any other format without needing to know anything
+ about the details.
+ 
+ This topic is addressed in great detail in section~\ref{meta engines} of
+ this document, but before we come to that rather advanced topic, let us
+ start by studying two simple examples of how data is encoded and decoded
+ using libxds.
+ 
+ \section{Using the XDS library}
+ 
+ \subsection{Encoding}
+ 
+ The following example program will encode three variables using the XDR
+ formatting engines. The result of the process will then be written to the
+ standard output stream, which can be redirected to a file or piped into the
+ decoding program described in the next section. Take a look at the
+ source code for a moment; we will then go on to discuss all relevant
+ sections line by line.
+ 
+ \begin{Verbatim}[numbers=left,fontsize=\small,frame=lines]
+ #include <stdio.h>
+ #include <unistd.h>
+ #include <string.h>
+ #include <errno.h>
+ #include <xds.h>
+ 
+ static void error_exit(int rc, const char* msg, ...)
+     {
+     va_list args;
+     va_start(args, msg);
+     vfprintf(stderr, msg, args);
+     va_end(args);
+     exit(rc);
+     }
+ 
+ int main()
+     {
+     xds_t* xds;
+     char*  buffer;
+     size_t buffer_size;
+ 
+     xds_int32_t  int32  = -42;
+     xds_uint32_t uint32 = 0x12345678;
+     const char*  string = "This is a test.";
+ 
+     xds = xds_init(XDS_ENCODE);
+     if (xds == NULL)
+         error_exit(1, "Failed to initialize XDS context: %s\n", strerror(errno));
+ 
+     if (xds_register(xds, "int32",  &xdr_encode_int32, NULL) != XDS_OK ||
+         xds_register(xds, "uint32", &xdr_encode_uint32, NULL) != XDS_OK ||
+         xds_register(xds, "string", &xdr_encode_string, NULL) != XDS_OK)
+         error_exit(1, "Failed to register my encoding engines!\n");
+ 
+     if (xds_encode(xds, "int32 uint32 string", int32, uint32, string) != XDS_OK)
+         error_exit(1, "xds_encode() failed!\n");
+ 
+     if (xds_getbuffer(xds, XDS_GIFT, (void**)&buffer, &buffer_size) != XDS_OK)
+         error_exit(1, "getbuffer() failed.\n");
+ 
+     xds_destroy(xds);
+ 
+     write(STDOUT_FILENO, buffer, buffer_size);
+ 
+     free(buffer);
+ 
+     fprintf(stderr, "Encoded data:\n");
+     fprintf(stderr, "\tint32   = %d\n", int32);
+     fprintf(stderr, "\tuint32 = 0x%x\n", uint32);
+     fprintf(stderr, "\tstring = \"%s\"\n", string);
+ 
+     return 0;
+     }
+ \end{Verbatim}
+ 
+ \paragraph{Lines 1--5.}
+ The program starts by including several system headers, which define the
+ prototypes for some routines we use. The most interesting header in our
+ case is of course {\sf xds.h} --- the header of libxds. Please note that
+ all declarations required to use libxds are included in that file.
+ 
+ \paragraph{Lines 7--13.}
+ The {\sf error\_exit()} routine is not relevant for the example; we just
+ define it to make the rest of the source code shorter and easier to read.
+ 
+ \paragraph{Lines 16--53.}
+ This is the interesting part: The {\sf main()} routine. This function will
+ create the variables to be encoded on the stack, assign values to them,
+ initialize the XDS library, use it to encode the values, and write the
+ result of the encoding process to the standard output stream. Read on for
+ further details.
+ 
+ \paragraph{Lines 26--28.}
+ First of all we have to obtain an XDS context for all further operations.
+ This is done by calling {\sf xds\_init()}. Since we intend to \emph{encode}
+ data, we initialize the context in encoding mode. The only other mode of
+ operation would be decoding mode, but this is demonstrated in the next
+ section.
+ 
+ All routines in libxds return a code from a small list of return codes
+ defined in {\sf xds.h}, but {\sf xds\_init()} is different: It will return
+ a pointer to an {\sf xds\_t} in case of success and {\sf NULL} in case of
+ failure. One reason why {\sf xds\_init()} would fail is because it can't
+ allocate the memory required to initialize the context. In this case, the
+ system variable {\sf errno} is set to {\sf ENOMEM}. Another reason why {\sf
+ xds\_init()} would fail is because the mode parameter is invalid, in which
+ case {\sf errno} would be set to {\sf EINVAL}. If libxds has been compiled
+ with assertions enabled, such an error would result in an assertion error,
+ terminating the program with a diagnostic message immediately.
+ 
+ \paragraph{Lines 30--33.}
+ Once we have obtained a valid XDS context, we register the formatting
+ engines we need. In this example, we'll encode a signed 32-bit integer, an
+ unsigned 32-bit integer, and a string. We'll be using XDR encoding in this
+ case, so the engines to register are {\sf xdr\_encode\_int32()}, {\sf
+ xdr\_encode\_uint32()}, and {\sf xdr\_encode\_string()}. (A complete list
+ of available formatting engines can be found in {\sf xds.h} or in
+ sections~\ref{xdr}~and~\ref{xml}.) Please note that we could switch the
+ deployed encoding format simply by using the corresponding {\sf
+ xml\_encode\_XXX()} engines here. We could even mix XDR and XML encoding as
+ we see fit, but it's hard to think of a case where this would make sense.
+ 
+ As you can see in the code, the developer is free to choose a name he'd
+ like to register the engine under. These names may only contain
+ alphanumerical characters plus the hyphen (``\verb#-#'') and the underscore
+ (``\verb#_#''). You can choose any name you want, but it is recommended to
+ follow the naming scheme of the corresponding formatting engine. Why this
+ is recommended will be seen in section~\ref{meta engines}.
+ 
+ \paragraph{Lines 35--36.}
+ This is the place where the actual encoding takes place. As parameters,
+ {\sf xds\_encode()} requires a valid encoding context plus a format string
+ that describes how the following parameters are to be interpreted. While
+ the concept is obviously identical to {\sf sprintf()}, the syntax is
+ different. The format string may contain an arbitrary number of names,
+ which are delimited by an arbitrary number of any character that is not a
+ legal character for engine names. Thus you can delimit the names by colons,
+ blanks, or whatever you like.
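
The splitting rule for the format string can be sketched in a few lines of C. This is only an illustration of the rule as stated above (names consist of alphanumerical characters, hyphen and underscore; everything else is a delimiter), not the parser used inside libxds; next_name() and is_name_char() are hypothetical helpers.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Characters that are legal in an engine name according to the text. */
static int is_name_char(int c)
{
    return isalnum((unsigned char)c) || c == '-' || c == '_';
}

/* Copy the next engine name found at *cursor into out and advance
 * *cursor past it. Returns 1 if a name was found, 0 if the end of the
 * format string was reached, and -1 if the name does not fit into out. */
int next_name(const char **cursor, char *out, size_t outsize)
{
    const char *p;
    size_t len = 0;

    assert(cursor != NULL && *cursor != NULL && out != NULL && outsize > 0);
    p = *cursor;

    /* Skip any run of delimiter characters. */
    while (*p != '\0' && !is_name_char(*p))
        ++p;
    if (*p == '\0') {
        *cursor = p;
        out[0] = '\0';
        return 0;
    }

    /* Collect name characters; the terminating NUL is not one of them. */
    while (is_name_char(*p)) {
        if (len + 1 >= outsize)
            return -1;
        out[len++] = *p++;
    }
    out[len] = '\0';
    *cursor = p;
    return 1;
}
```

Calling next_name() repeatedly on the string "int32:uint32,  string" yields "int32", "uint32" and "string" in turn, then reports the end of the string.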
+ 
+ For each valid engine name in the format string, a corresponding parameter
+ must follow. What these parameters mean depends on the engine you're using.
+ The engines provided with the libxds library will expect the value to
+ encode, but theoretically developers are free to write formatting engines
+ that expect virtually any kind of information here. More about this will
+ be explained in section~\ref{meta engines}.
+ 
+ \paragraph{Lines 38--39.}
+ We have encoded all values we wanted to encode, now we can get the result
+ from the library. This happens by calling {\sf xds\_getbuffer()}. The
+ routine will store the buffer's address and length at the locations we
+ provided as parameters. Please note that we can choose whether we want the
+ buffer as a ``gift'' ({\sf XDS\_GIFT}) or as a ``loan'' ({\sf XDS\_LOAN}).
+ 
+ The buffer being a ``loan'' means that the buffer is still owned by the
+ library; we're only allowed to peek at it. Any call to a libxds
+ routine may potentially modify the buffer or even change the buffer's
+ location. Hence the result of an {\sf xds\_getbuffer()} call with loan
+ semantics is only valid until the next libxds routine is called. After
+ that, it is invalid.
+ 
+ If we choose the gift semantics, the buffer we receive will be owned by us;
+ the library will not touch the buffer again. This means, of course, that
+ we're responsible for {\sf free()}ing the buffer when we don't need it
+ anymore.
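
The difference between the two ownership models can be illustrated without libxds at all. The sketch below uses a made-up holder_t type that owns a heap buffer; none of these names exist in libxds, and the library's internals may well be organized differently. It only shows why a gifted buffer survives the destruction of its former owner while a loaned one does not.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef enum { HOLDER_LOAN, HOLDER_GIFT } holder_mode_t;  /* hypothetical */

typedef struct {
    char  *data;   /* heap buffer owned by the holder, or NULL */
    size_t len;
} holder_t;

/* Create a holder that owns a private copy of src.
 * Returns 0 on success and -1 if memory could not be allocated. */
int holder_init(holder_t *h, const char *src, size_t len)
{
    assert(h != NULL && src != NULL);
    h->data = malloc(len > 0 ? len : 1);
    if (h->data == NULL)
        return -1;
    memcpy(h->data, src, len);
    h->len = len;
    return 0;
}

/* Hand out the buffer. With HOLDER_LOAN the holder keeps ownership and
 * the pointer is valid only as long as the holder leaves it alone. With
 * HOLDER_GIFT the holder forgets the buffer and the caller must free() it. */
void holder_get(holder_t *h, holder_mode_t mode, char **buf, size_t *len)
{
    assert(h != NULL && buf != NULL && len != NULL);
    *buf = h->data;
    *len = h->len;
    if (mode == HOLDER_GIFT) {
        h->data = NULL;
        h->len = 0;
    }
}

/* Release whatever the holder still owns; free(NULL) is a no-op. */
void holder_destroy(holder_t *h)
{
    assert(h != NULL);
    free(h->data);
    h->data = NULL;
    h->len = 0;
}
```

After a gift, holder_destroy() has nothing left to free, so the caller's pointer stays valid until the caller releases it.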
+ 
+ \paragraph{Line 41.}
+ Destroy the XDS context and all data associated with it. This is possible
+ because we requested the buffer as ``gift''; the buffer is not associated
+ with libxds anymore.
+ 
+ \paragraph{Line 43.}
+ Write the buffer with the encoded data to the standard output stream.
+ 
+ \paragraph{Line 45.}
+ Now that we don't need the buffer anymore, we have to return the memory it
+ uses to the system. libxds won't do that for us.
+ 
+ \paragraph{Lines 47--50.}
+ Write a short report of what we have done to the standard error channel.
+ 
+ \bigskip
+ Finally, let us compile and execute the example program shown above. For
+ convenience, it is included in the distribution under the name {\sf
+ docs/encode.c}. You can compile and execute the program as follows:
+ 
+ \begin{quote}
+ \begin{verbatim}
+ simons@dev13:~/libxds$ cd docs
+ simons@dev13:~/libxds/docs$ gcc -I.. encode.c -o encode -L.. -lxds
+ simons@dev13:~/libxds/docs$ ./encode >output
+ Encoded data:
+         int32   = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ simons@dev13:~/libxds/docs$ ls -l output
+ -rw-r--r--  1 simons  simons  28 Aug  2 15:21 output
+ \end{verbatim}
+ \end{quote}
+ 
+ The result of executing the program --- the file {\sf output} --- can be
+ displayed with {\sf hexdump(1)} or {\sf od(1)} and should look like this:
+ 
+ \begin{quote}
+ \begin{Verbatim}[fontsize=\small]
+ simons@dev13:~/libxds/docs$ hexdump -C output
+ 00000000  ff ff ff d6 12 34 56 78  00 00 00 0f 54 68 69 73  |.....4Vx....This|
+ 00000010  20 69 73 20 61 20 74 65  73 74 2e 00              | is a test..|
+ 0000001c
+ \end{Verbatim}
+ \end{quote}
+ 
+ \noindent
+ We will also re-use this file in the next section, where we'll read it and
+ decode those values again.
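
The 28 bytes in the dump can be reproduced by hand from the XDR rules: every integer occupies four bytes in big endian order, and a string is stored as a four-byte length followed by its characters, padded with zero bytes to a multiple of four. The following sketch rebuilds the same byte sequence in plain C. It does not use libxds, and the helper names put_u32(), put_string() and encode_example() are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Append a 32-bit quantity in big endian order; returns the new offset. */
static size_t put_u32(unsigned char *buf, size_t off, uint32_t v)
{
    buf[off + 0] = (unsigned char)(v >> 24);
    buf[off + 1] = (unsigned char)(v >> 16);
    buf[off + 2] = (unsigned char)(v >> 8);
    buf[off + 3] = (unsigned char)v;
    return off + 4;
}

/* Append a string: length word, the characters without the trailing
 * NUL, then zero padding up to the next multiple of four bytes. */
static size_t put_string(unsigned char *buf, size_t off, const char *s)
{
    size_t len = strlen(s);
    size_t pad = (4 - len % 4) % 4;

    off = put_u32(buf, off, (uint32_t)len);
    memcpy(buf + off, s, len);
    memset(buf + off + len, 0, pad);
    return off + len + pad;
}

/* Encode the three values of the example program into buf, which must
 * provide at least 28 bytes. Returns the number of bytes written. */
size_t encode_example(unsigned char *buf)
{
    size_t off = 0;

    assert(buf != NULL);
    off = put_u32(buf, off, (uint32_t)(int32_t)-42);   /* ff ff ff d6 */
    off = put_u32(buf, off, 0x12345678u);              /* 12 34 56 78 */
    off = put_string(buf, off, "This is a test.");     /* 4 + 15 + 1  */
    return off;
}
```

The string accounts for 20 bytes (4 for the length, 15 characters, 1 byte of padding), which together with the two integers gives the file size of 28 bytes reported by ls.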
+ 
+ 
+ \subsection{Decoding}
+ 
+ The following example program will read the result of the encoding example
+ shown in the previous section and decode the values back into the native
+ representation. Then it will print those values to the standard error
+ stream so that the user can see the values are correct. Please take a look
+ at the source now; we'll discuss all relevant details in the following
+ paragraphs.
+ 
+ \begin{Verbatim}[numbers=left,fontsize=\small,frame=lines]
+ #include <stdio.h>
+ #include <unistd.h>
+ #include <string.h>
+ #include <errno.h>
+ #include <xds.h>
+ 
+ static void error_exit(int rc, const char* msg, ...)
+     {
+     va_list args;
+     va_start(args, msg);
+     vfprintf(stderr, msg, args);
+     va_end(args);
+     exit(rc);
+     }
+ 
+ int main()
+     {
+     xds_t* xds;
+     char   buffer[1024];
+     size_t buffer_len;
+     int rc;
+ 
+     xds_int32_t  int32;
+     xds_uint32_t uint32;
+     char*        string;
+ 
+     buffer_len = 0;
+     do
+         {
+         rc = read(STDIN_FILENO, buffer + buffer_len, sizeof(buffer) - buffer_len);
+         if (rc < 0)
+             error_exit(1, "read() failed: %s\n", strerror(errno));
+         else if (rc > 0)
+             buffer_len += rc;
+         }
+     while (rc > 0 && buffer_len < sizeof(buffer));
+ 
+     if (buffer_len >= sizeof(buffer))
+         error_exit(1, "Too much input data for our buffer.\n");
+ 
+     xds = xds_init(XDS_DECODE);
+     if (xds == NULL)
+         error_exit(1, "Failed to initialize XDS context: %s\n", strerror(errno));
+ 
+     if (xds_register(xds, "int32",  &xdr_decode_int32, NULL) != XDS_OK ||
+         xds_register(xds, "uint32", &xdr_decode_uint32, NULL) != XDS_OK ||
+         xds_register(xds, "string", &xdr_decode_string, NULL) != XDS_OK)
+         error_exit(1, "Failed to register my decoding engines!\n");
+ 
+     if (xds_setbuffer(xds, XDS_LOAN, buffer, buffer_len) != XDS_OK)
+         error_exit(1, "setbuffer() failed.\n");
+ 
+     if (xds_decode(xds, "int32 uint32 string", &int32, &uint32, &string) != XDS_OK)
+         error_exit(1, "xds_decode() failed!\n");
+ 
+     xds_destroy(xds);
+ 
+     fprintf(stderr, "Decoded data:\n");
+     fprintf(stderr, "\tint32   = %d\n", int32);
+     fprintf(stderr, "\tuint32 = 0x%x\n", uint32);
+     fprintf(stderr, "\tstring = \"%s\"\n", string);
+ 
+     free(string);
+ 
+     return 0;
+     }
+ \end{Verbatim}
+ 
+ \paragraph{Lines 1--25.}
+ Include the required header files, define the {\sf error\_exit()} helper
+ function, and create the required variables on the stack.
+ 
+ \paragraph{Lines 27--39.}
+ These instructions will read an unspecified number of bytes from the
+ standard input stream --- as long as the input does not exceed the size of
+ the {\sf buffer} variable. In order to provide the program with the
+ appropriate input, redirect the standard input stream to the file {\sf
+ output} created in the previous section or connect the encoding and
+ decoding programs directly by a pipe.
+ 
+ \paragraph{Lines 41--43.}
+ Create a context for decoding the values. The semantics are identical to
+ those described in the previous section.
+ 
+ \paragraph{Lines 45--48.}
+ Register the decoding engines in the context. Please note that obviously
+ the decoding engines must correspond to the encoding engines used to create
+ the data we're about to process. Using, say, an XML engine to decode XDR
+ data will at best return with an error --- in the worst case, it will
+ return incorrect results!
+ 
+ \paragraph{Lines 50--51.}
+ Here we do not get a buffer from the library, we \emph{set} the buffer
+ we've read earlier in the context for decoding. Please note that we use
+ loan semantics in this case, not gift semantics. This is necessary because
+ {\sf buffer} has not been allocated by {\sf malloc()} --- the variable
+ lives on the stack. This means that we cannot give it to libxds because
+ libxds expects to be able to {\sf free()} the buffer when the context is
+ destroyed.
+ 
+ Loan semantics are fine, though; all we have to do is to take care that we
+ don't erase or modify the contents of {\sf buffer} while libxds operates on
+ it. The library itself will never touch the buffer in decode mode, no
+ matter whether loan or gift semantics have been chosen.
+ 
+ \paragraph{Lines 53--54.}
+ Here comes the actual decoding of the buffer's contents using {\sf
+ xds\_decode()}. The syntax is identical to {\sf xds\_encode()}'s, the only
+ difference is that the decoding engines do not expect the values --- like
+ the encoding engines did --- but the location where to store the value.
+ Thus we pass the addresses of the appropriate variables here. If the routine
+ returns with {\sf XDS\_OK}, the decoded values will have been stored in
+ those locations.
+ 
+ It should be noted that the decoded string cannot trivially be returned
+ this way. Instead, {\sf xds\_decode()} will use {\sf malloc()} to allocate
+ a buffer just large enough to hold the string. The address of that buffer
+ is then stored in the pointer {\sf string}. Of course this means that the
+ application has to {\sf free()} the string once it's not required anymore.
+ 
+ \paragraph{Line 56.}
+ We don't need the context anymore, so we destroy it and free all used
+ resources. This does not affect {\sf buffer} in any way because we used
+ loan semantics.
+ 
+ \paragraph{Lines 58--61.}
+ Print the decoded values to the standard error stream for the user to take
+ a look at them.
+ 
+ \paragraph{Line 63.}
+ Now that we don't need the contents of {\sf string} anymore, we must return
+ the buffer allocated in {\sf xds\_decode()} to the system.
+ 
+ \bigskip
+ Like the encoding program described earlier, the source code to this
+ program is included in the library distribution as {\sf docs/decode.c}. You
+ can compile and execute the program like this:
+ 
+ \begin{quote}
+ \begin{verbatim}
+ simons@dev13:~/libxds$ cd docs
+ simons@dev13:~/libxds/docs$ gcc -I.. decode.c -o decode -L.. -lxds
+ simons@dev13:~/libxds/docs$ ./decode <output
+ Decoded data:
+         int32   = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ \end{verbatim}
+ \end{quote}
+ 
+ Of course we assume that the {\sf output} file has been created as
+ described in the previous section, otherwise you cannot trivially use the
+ example program. Alternatively, you could execute both programs like this:
+ 
+ \begin{quote}
+ \begin{Verbatim}[fontsize=\small]
+ simons@dev13:~/libxds/docs$ ./encode | ./decode
+ Encoded data:
+         int32   = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ Decoded data:
+         int32   = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ \end{Verbatim}
+ \end{quote}
+ 
+ \noindent
+ This will encode and decode the values without the need for a temporary
+ file.
+ 
+ \section{Extending the XDS library}
+ \label{meta engines}
+ 
+ 
+ \section{The XDS Framework}
+ \label{xds}
+ 
+ \section{The XDR Engines}
+ \label{xdr}
+ 
+ \section{The XML Engines}
+ \label{xml}
+ 
+ \begin{thebibliography}{xxx}
+ 
+ \bibitem{xdr} RFC 1832: ``XDR: External Data Representation Standard'',
+ R.~Srinivasan, August~1995
+ 
+ \bibitem{xml} {\sf http://www.ossp.org/pkg/xds/xds-xml.dtd}
+ 
+ 
+ \end{thebibliography}
+ 
+ \end{document}
