OSSP: CVS Repository: Check-in [4278]

Check-in Number:

4278

Date:

2001-Aug-02 15:58:07 (local)
2001-Aug-02 13:58:07 (UTC)

User:

simons

Branch:

Comment:

Currently, the manual contains an introduction, a description of the architecture and two example programs demonstrating how to encode and decode. Much remains to be written.

Tickets:

Inspections:

Files:

ossp-pkg/xds/docs/libxds.tex

added-> 1.1

ossp-pkg/xds/docs/libxds.tex -> 1.1

*** /dev/null Tue May 20 16:23:40 2025
--- - Tue May 20 16:23:47 2025
***************
*** 0 ****
--- 1,571 ----
+ % -*- mode: LaTeX; fill-column: 75; -*-
+ %
+ % $Id: libxds.tex,v 1.1 2001/08/02 13:58:07 simons Exp $
+ %
+ \documentclass[a4paper,10pt,pointlessnumbers,bibtotoc]{scrartcl}
+ \usepackage[dvips,xdvi]{graphicx}
+ \usepackage{fancyvrb}
+ \typearea[2cm]{12}
+ \fussy
+
+ \begin{document}
+
+ \subject{Cable \& Wireless Application Development}
+ \title{XDS -- eXtensible Data Serialization}
+ \author{Peter Simons $<$simons@computer.org$>$}
+ \date{2001-08-01}
+ \maketitle
+
+ \section{Introduction}
+
+ In today's networked world, computer systems of all brands and flavours
+ communicate with each other. Unfortunately, these systems are far from
+ being identical: many systems use different internal representations for
+ the same thing. Look at the (hexadecimal) number \$1234 for instance: On a
+ big-endian machine, this number will be stored in memory the way you'd
+ intuitively expect: \$12~\$34 --- the more significant byte precedes the
+ less significant one. On a little-endian machine, though, the number \$1234
+ will be stored like this: \$34~\$12 --- exactly the other way round.
+
+ As a result, you cannot just write the number \$1234 to a socket and expect
+ the other end to understand it correctly, because if the byte orders differ,
+ the reader will read a different number than the writer sent. Things will
+ get even more complicated when you start exchanging floating point numbers,
+ for which about a dozen different encodings exist!
+
+ Solving these problems is the domain of libxds; its purpose is to encode
+ data in a way that allows this data to be exchanged between computer
+ systems of different types. Assume you'd want to reliably transfer the
+ value \$1234 from host A to host B. Then you would encode the value using
+ libxds, transfer the encoded data via the network, and decode the value
+ again at the other end.
+ Every application that follows this process will
+ read the correct value no matter what native representation its hosting
+ platform uses internally.
+
+ \begin{figure}[tbh]
+ \begin{center}
+ \includegraphics[width=\textwidth]{data-exchange.eps}
+ \caption{Data exchange using libxds}
+ \label{data exchange}
+ \end{center}
+ \end{figure}
+
+ There is a rich variety of applications for such functionality: libxds
+ may be used to encode data before it is written to disk or read from the
+ disk, it may be used to encode data to be exchanged between processes over
+ the network, etc. Because of this variety, special attention has been paid
+ to the library design.
+
+ \paragraph{The library has been designed to be extensible.}
+ The functionality is split into a generic encoding and decoding framework
+ and a set of formatting engines. These engines can be plugged into the
+ framework at run-time to actually encode and decode data. Because of this
+ architecture, libxds can be customized to use any data format the
+ developer sees fit. Included in the distribution are formatting engines for
+ the XDR format specified in \cite{xdr} and for the XML format specified in
+ \cite{xml}.
+
+ \paragraph{The library is convenient to use.}
+ An arbitrary number of variables can be encoded or decoded with one single
+ function call. All memory management is done by libxds; the developer
+ doesn't need to bother allocating or managing buffers for the encoded or
+ decoded data. Automatic buffer management can be switched off at run-time,
+ though, for maximum performance.
+
+ \paragraph{Performance.}
+ Since all transferred data has to pass through libxds, the library has
+ been written to encode and decode with maximum performance. The generic
+ encoding framework adds almost no run-time overhead to the encoding
+ process. If non-automatic buffer management has been selected, hardly
+ anything but the actual formatting engines is executed.
+
+ \paragraph{Robustness.}
+ In order to verify that the library is working correctly, a set of
+ regression tests is included in the distribution. The test suites will ---
+ among other things --- encode known values and compare the result with the
+ expected (correct) values. This helps to confirm that libxds works
+ correctly on the platform it was built on.
+
+ \paragraph{Use standard formats.}
+ The supported formats XDR and XML are widely known and accepted formats,
+ which are most likely interoperable with other marshaling implementations.
+ For XDR, for example, it would be possible to encode data with libxds and to
+ decode it with an entirely different XDR implementation or vice versa.
+
+ \paragraph{Portability.}
+ libxds has been written with portability in mind. Development took place on
+ FreeBSD, Linux and Solaris; other platforms have been used to test the
+ results. It is expected that libxds will compile and function on virtually
+ any POSIX.1-compliant system with a moderately modern ISO-C compiler. The
+ GNU C compiler (gcc) is known to compile the library just fine. For maximum
+ portability, GNU autoconf has been used to determine the target system's
+ properties.
+
+ \section{Architecture of libxds}
+
+ \begin{figure}[htb]
+ \begin{center}
+ \includegraphics[width=\textwidth]{architecture.eps}
+ \caption{Components of libxds}
+ \label{libxds components}
+ \end{center}
+ \end{figure}
+
+ The architecture of libxds is illustrated in figure~\ref{libxds
+ components}. libxds consists of three components: the generic encoding and
+ decoding framework, a set of formatting engines to encode and decode values
+ in a certain format, and a run-time context, which is used to manage
+ buffers, registered engines, etc.
+
+ In order to use the library, the first thing the developer has to do is to
+ create a valid XDS context by calling {\sf xds\_init()}. The routine
+ requires one parameter that determines whether to operate in encoding or
+ decoding mode.
+ A context can be used for encoding or decoding only; it is
+ not possible to use the same context for both operations. Once a valid XDS
+ context has been obtained, the routine {\sf xds\_register()} can be used to
+ register an arbitrary number of formatting engines within the context.
+
+ A set of formatting engines has been included in the library. These
+ routines will handle any elementary data type included in the ISO-C language
+ such as 32-bit integers, 64-bit integers, unsigned integers (of both 32-
+ and 64-bit), floating point numbers, strings and octet streams.
+
+ Once all required formatting engines are registered, the routines {\sf
+ xds\_encode()} or {\sf xds\_\-decode()} may be used to actually perform the
+ encoding or decoding process. Any data type for which a formatting engine
+ has been registered can be handled by the library.
+
+ This means that it is possible for the developer to write custom
+ formatting engines for any data type he desires to use and to register them
+ in the context as long as these engines adhere to the {\sf xds\_engine\_t}
+ interface defined in {\sf xds.h}.
+
+ In particular it is possible to register meta formatting engines. A meta
+ formatting engine is designed to encode or decode structures --- data types
+ which consist of several elementary data types. The formatting engine for
+ such a structure will simply re-use the existing engines in order to encode
+ or decode the whole structure. The key point here is that the meta engine
+ doesn't even need to know \emph{which} low-level formatting engines are
+ registered in order to use them. Hence, a meta engine may format the whole
+ structure in XDR, XML, or any other format without needing to know anything
+ about the details.
+
+ This topic is addressed in great detail in section~\ref{meta engines} of
+ this document, but before we come to that rather advanced topic, let us
+ start by studying two simple examples of how data is encoded and decoded
+ using libxds.
+
+ \section{Using the XDS library}
+
+ \subsection{Encoding}
+
+ The following example program will encode three variables using the XDR
+ formatting engines. The result of the process will then be written to the
+ standard output stream, which can be redirected to a file or piped into the
+ decoding program described in the next section. Just take a look at the
+ source code for a moment; we will then go on to discuss all relevant
+ sections line by line.
+
+ \begin{Verbatim}[numbers=left,fontsize=\small,frame=lines]
+ #include <stdio.h>
+ #include <unistd.h>
+ #include <string.h>
+ #include <errno.h>
+ #include <xds.h>
+
+ static void error_exit(int rc, const char* msg, ...)
+ {
+     va_list args;
+     va_start(args, msg);
+     vfprintf(stderr, msg, args);
+     va_end(args);
+     exit(rc);
+ }
+
+ int main()
+ {
+     xds_t* xds;
+     char* buffer;
+     size_t buffer_size;
+
+     xds_int32_t int32 = -42;
+     xds_uint32_t uint32 = 0x12345678;
+     const char* string = "This is a test.";
+
+     xds = xds_init(XDS_ENCODE);
+     if (xds == NULL)
+         error_exit(1, "Failed to initialize XDS context: %s\n", strerror(errno));
+
+     if (xds_register(xds, "int32", &xdr_encode_int32, NULL) != XDS_OK ||
+         xds_register(xds, "uint32", &xdr_encode_uint32, NULL) != XDS_OK ||
+         xds_register(xds, "string", &xdr_encode_string, NULL) != XDS_OK)
+         error_exit(1, "Failed to register my encoding engines!\n");
+
+     if (xds_encode(xds, "int32 uint32 string", int32, uint32, string) != XDS_OK)
+         error_exit(1, "xds_encode() failed!\n");
+
+     if (xds_getbuffer(xds, XDS_GIFT, (void**)&buffer, &buffer_size) != XDS_OK)
+         error_exit(1, "getbuffer() failed.\n");
+
+     xds_destroy(xds);
+
+     write(STDOUT_FILENO, buffer, buffer_size);
+
+     free(buffer);
+
+     fprintf(stderr, "Encoded data:\n");
+     fprintf(stderr, "\tint32 = %d\n", int32);
+     fprintf(stderr, "\tuint32 = 0x%x\n", uint32);
+     fprintf(stderr, "\tstring = \"%s\"\n", string);
+
+     return 0;
+ }
+ \end{Verbatim}
+
+ \paragraph{Lines 1--5.}
+ The program starts by including several system
+ headers, which define the
+ prototypes for some routines we use. The most interesting header in our
+ case is of course {\sf xds.h} --- the header of libxds. Please note that
+ all declarations required to use libxds are included in that file.
+
+ \paragraph{Lines 7--13.}
+ The {\sf error\_exit()} routine is not relevant for the example; we just
+ define it to make the rest of the source code shorter and easier to read.
+
+ \paragraph{Lines 16--53.}
+ This is the interesting part: The {\sf main()} routine. This function will
+ create the variables to be encoded on the stack, assign values to them,
+ initialize the XDS library, use it to encode the values, and write the
+ result of the encoding process to the standard output stream. Read on for
+ further details.
+
+ \paragraph{Lines 26--28.}
+ First of all we have to obtain an XDS context for all further operations.
+ This is done by calling {\sf xds\_init()}. Since we intend to \emph{encode}
+ data, we initialize the context in encoding mode. The only other mode of
+ operation would be decoding mode, but this is demonstrated in the next
+ section.
+
+ Most routines in libxds return a code from a small list of return codes
+ defined in {\sf xds.h}, but {\sf xds\_init()} is different: It will return
+ a pointer to an {\sf xds\_t} in case of success and {\sf NULL} in case of
+ failure. One reason why {\sf xds\_init()} may fail is that it can't
+ allocate the memory required to initialize the context. In this case, the
+ system variable {\sf errno} is set to {\sf ENOMEM}. Another reason why {\sf
+ xds\_init()} may fail is that the mode parameter is invalid, in which
+ case {\sf errno} would be set to {\sf EINVAL}. If libxds has been compiled
+ with assertions enabled, such an error would result in an assertion error,
+ terminating the program with a diagnostic message immediately.
+
+ \paragraph{Lines 30--33.}
+ Once we have obtained a valid XDS context, we register the formatting
+ engines we need.
+ In this example, we'll encode a signed 32-bit integer, an
+ unsigned 32-bit integer, and a string. We'll be using XDR encoding in this
+ case, so the engines to register are {\sf xdr\_encode\_int32()}, {\sf
+ xdr\_encode\_uint32()}, and {\sf xdr\_encode\_string()}. (A complete list
+ of available formatting engines can be found in {\sf xds.h} or in
+ sections~\ref{xdr}~and~\ref{xml}.) Please note that we could switch the
+ deployed encoding format simply by using the corresponding {\sf
+ xml\_encode\_XXX()} engines here. We could even mix XDR and XML encoding as
+ we see fit, but it's hard to think of a case where this would make sense.
+
+ As you can see in the code, the developer is free to choose a name he'd
+ like to register the engine under. These names may only contain
+ alphanumeric characters plus the hyphen (``\verb#-#'') and the underscore
+ (``\verb#_#''). You can choose any name you want, but it is recommended to
+ follow the naming scheme of the corresponding formatting engine. Why this
+ is recommended will be seen in section~\ref{meta engines}.
+
+ \paragraph{Lines 35--36.}
+ This is the place where the actual encoding takes place. As parameters,
+ {\sf xds\_encode()} requires a valid encoding context plus a format string
+ that describes how the following parameters are to be interpreted. While
+ the concept is obviously identical to {\sf sprintf()}, the syntax is
+ different. The format string may contain an arbitrary number of names,
+ which are delimited by an arbitrary number of any character that is not a
+ legal character for engine names. Thus you can delimit the names by colons,
+ blanks, or whatever you like.
+
+ For each valid engine name in the format string, a corresponding parameter
+ must follow. What these parameters mean depends on the engine you're using.
+ The engines provided with the libxds library will expect the value to
+ encode, but theoretically developers are free to write formatting engines
+ that expect virtually any kind of information here. More about this will
+ be explained in section~\ref{meta engines}.
+
+ \paragraph{Lines 38--39.}
+ We have encoded all values we wanted to encode; now we can get the result
+ from the library. This happens by calling {\sf xds\_getbuffer()}. The
+ routine will store the buffer's address and length at the locations we
+ provided as parameters. Please note that we can choose whether we want the
+ buffer as a ``gift'' ({\sf XDS\_GIFT}) or as a ``loan'' ({\sf XDS\_LOAN}).
+
+ The buffer being a ``loan'' means that the buffer is still owned by the
+ library -- we're only allowed to peek at it. But any call to a libxds
+ routine may potentially modify the buffer or even change the buffer's
+ location. Hence the result of an {\sf xds\_getbuffer()} call with loan
+ semantics is only valid until the next libxds routine is called. After
+ that, it is invalid.
+
+ If we choose the gift semantics, the buffer we receive will be owned by us;
+ the library will not touch the buffer again. This means, of course, that
+ we're responsible for {\sf free()}ing the buffer when we don't need it
+ anymore.
+
+ \paragraph{Line 41.}
+ Destroy the XDS context and all data associated with it. This is possible
+ because we requested the buffer as ``gift''; the buffer is not associated
+ with libxds anymore.
+
+ \paragraph{Line 43.}
+ Write the buffer with the encoded data to the standard output stream.
+
+ \paragraph{Line 45.}
+ Now that we don't need the buffer anymore, we have to return the memory it
+ uses to the system. libxds won't do that for us.
+
+ \paragraph{Lines 47--50.}
+ Write a short report of what we have done to the standard error channel.
+
+ \bigskip
+ Finally, let us compile and execute the example program shown above.
+ For convenience, it is included in the distribution under the name {\sf
+ docs/encode.c}. You can compile and execute the program as follows:
+
+ \begin{quote}
+ \begin{verbatim}
+ simons@dev13:~/libxds$ cd docs
+ simons@dev13:~/libxds/docs$ gcc -I.. encode.c -o encode -L.. -lxds
+ simons@dev13:~/libxds/docs$ ./encode >output
+ Encoded data:
+         int32 = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ simons@dev13:~/libxds/docs$ ls -l output
+ -rw-r--r-- 1 simons simons 28 Aug 2 15:21 output
+ \end{verbatim}
+ \end{quote}
+
+ The result of executing the program --- the file {\sf output} --- can be
+ displayed with {\sf hexdump(1)} or {\sf od(1)} and should look like this:
+
+ \begin{quote}
+ \begin{Verbatim}[fontsize=\small]
+ simons@dev13:~/libxds/docs$ hexdump -C output
+ 00000000  ff ff ff d6 12 34 56 78  00 00 00 0f 54 68 69 73  |.....4Vx....This|
+ 00000010  20 69 73 20 61 20 74 65  73 74 2e 00              | is a test..|
+ 0000001c
+ \end{Verbatim}
+ \end{quote}
+
+ \noindent
+ We will also re-use this file in the next section, where we'll read it and
+ decode those values again.
+
+
+ \subsection{Decoding}
+
+ The following example program will read the result of the encoding example
+ shown in the previous section and decode the values back into the native
+ representation. Then it will print those values to the standard error
+ stream so that the user can see the values are correct. Please take a look
+ at the source now; we'll discuss all relevant details in the following
+ paragraphs.
+
+ \begin{Verbatim}[numbers=left,fontsize=\small,frame=lines]
+ #include <stdio.h>
+ #include <unistd.h>
+ #include <string.h>
+ #include <errno.h>
+ #include <xds.h>
+
+ static void error_exit(int rc, const char* msg, ...)
+ {
+     va_list args;
+     va_start(args, msg);
+     vfprintf(stderr, msg, args);
+     va_end(args);
+     exit(rc);
+ }
+
+ int main()
+ {
+     xds_t* xds;
+     char buffer[1024];
+     size_t buffer_len;
+     int rc;
+
+     xds_int32_t int32;
+     xds_uint32_t uint32;
+     char* string;
+
+     buffer_len = 0;
+     do
+     {
+         rc = read(STDIN_FILENO, buffer + buffer_len, sizeof(buffer) - buffer_len);
+         if (rc < 0)
+             error_exit(1, "read() failed: %s\n", strerror(errno));
+         else if (rc > 0)
+             buffer_len += rc;
+     }
+     while (rc > 0 && buffer_len < sizeof(buffer));
+
+     if (buffer_len >= sizeof(buffer))
+         error_exit(1, "Too much input data for our buffer.\n");
+
+     xds = xds_init(XDS_DECODE);
+     if (xds == NULL)
+         error_exit(1, "Failed to initialize XDS context: %s\n", strerror(errno));
+
+     if (xds_register(xds, "int32", &xdr_decode_int32, NULL) != XDS_OK ||
+         xds_register(xds, "uint32", &xdr_decode_uint32, NULL) != XDS_OK ||
+         xds_register(xds, "string", &xdr_decode_string, NULL) != XDS_OK)
+         error_exit(1, "Failed to register my decoding engines!\n");
+
+     if (xds_setbuffer(xds, XDS_LOAN, buffer, buffer_len) != XDS_OK)
+         error_exit(1, "setbuffer() failed.\n");
+
+     if (xds_decode(xds, "int32 uint32 string", &int32, &uint32, &string) != XDS_OK)
+         error_exit(1, "xds_decode() failed!\n");
+
+     xds_destroy(xds);
+
+     fprintf(stderr, "Decoded data:\n");
+     fprintf(stderr, "\tint32 = %d\n", int32);
+     fprintf(stderr, "\tuint32 = 0x%x\n", uint32);
+     fprintf(stderr, "\tstring = \"%s\"\n", string);
+
+     free(string);
+
+     return 0;
+ }
+ \end{Verbatim}
+
+ \paragraph{Lines 1--25.}
+ Include the required header files, define the {\sf error\_exit()} helper
+ function, and create the required variables on the stack.
+
+ \paragraph{Lines 27--39.}
+ These instructions will read an unspecified number of bytes from the
+ standard input stream --- as long as the input does not exceed the size of
+ the {\sf buffer} variable.
+ In order to provide the program with the
+ appropriate input, redirect the standard input stream to the file {\sf
+ output} created in the previous section or connect the encoding and
+ decoding programs directly by a pipe.
+
+ \paragraph{Lines 41--43.}
+ Create a context for decoding the values. The semantics are identical to
+ those described in the previous section.
+
+ \paragraph{Lines 45--48.}
+ Register the decoding engines in the context. Please note that obviously
+ the decoding engines must correspond to the encoding engines used to create
+ the data we're about to process. Using, say, an XML engine to decode XDR
+ data will at best return with an error --- in the worst case, it will
+ return incorrect results!
+
+ \paragraph{Lines 50--51.}
+ Here we do not get a buffer from the library, we \emph{set} the buffer
+ we've read earlier in the context for decoding. Please note that we use
+ loan semantics in this case, not gift semantics. This is necessary because
+ {\sf buffer} has not been allocated by {\sf malloc()} --- the variable
+ lives on the stack. This means that we cannot give it to libxds because
+ libxds expects to be able to {\sf free()} the buffer when the context is
+ destroyed.
+
+ Loan semantics are fine, though; all we have to do is to take care that we
+ don't erase or modify the contents of {\sf buffer} while libxds operates on
+ it. The library itself will never touch the buffer in decode mode, no
+ matter whether loan or gift semantics have been chosen.
+
+ \paragraph{Lines 53--54.}
+ Here comes the actual decoding of the buffer's contents using {\sf
+ xds\_decode()}. The syntax is identical to {\sf xds\_encode()}'s; the only
+ difference is that the decoding engines do not expect the values --- like
+ the encoding engines did --- but the location where to store the value.
+ Thus we pass the addresses of the appropriate variables here. If the routine
+ returns with {\sf XDS\_OK}, the decoded values will have been stored in
+ those locations.
+
+ It should be noted that the decoded string cannot trivially be returned
+ this way. Instead, {\sf xds\_decode()} will use {\sf malloc()} to allocate
+ a buffer just large enough to hold the string. The address of that buffer
+ is then stored in the pointer {\sf string}. Of course this means that the
+ application has to {\sf free()} the string once it's not required anymore.
+
+ \paragraph{Line 56.}
+ We don't need the context anymore, so we destroy it and free all used
+ resources. This does not affect {\sf buffer} in any way because we used
+ loan semantics.
+
+ \paragraph{Lines 58--61.}
+ Print the decoded values to the standard error stream for the user to take
+ a look at them.
+
+ \paragraph{Line 63.}
+ Now that we don't need the contents of {\sf string} anymore, we must return
+ the buffer allocated in {\sf xds\_decode()} to the system.
+
+ \bigskip
+ Like the encoding program described earlier, the source code to this
+ program is included in the library distribution as {\sf docs/decode.c}. You
+ can compile and execute the program like this:
+
+ \begin{quote}
+ \begin{verbatim}
+ simons@dev13:~/libxds$ cd docs
+ simons@dev13:~/libxds/docs$ gcc -I.. decode.c -o decode -L.. -lxds
+ simons@dev13:~/libxds/docs$ ./decode <output
+ Decoded data:
+         int32 = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ \end{verbatim}
+ \end{quote}
+
+ Of course we assume that the {\sf output} file has been created as
+ described in the previous section, otherwise you cannot trivially use the
+ example program. Alternatively, you could execute both programs like this:
+
+ \begin{quote}
+ \begin{Verbatim}[fontsize=\small]
+ simons@dev13:~/libxds/docs$ ./encode | ./decode
+ Encoded data:
+         int32 = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ Decoded data:
+         int32 = -42
+         uint32 = 0x12345678
+         string = "This is a test."
+ \end{Verbatim}
+ \end{quote}
+
+ \noindent
+ This will encode and decode the values without the need for a temporary
+ file.
+
+ \section{Extending the XDS library}
+ \label{meta engines}
+
+
+ \section{The XDS Framework}
+ \label{xds}
+
+ \section{The XDR Engines}
+ \label{xdr}
+
+ \section{The XML Engines}
+ \label{xml}
+
+ \begin{thebibliography}{xxx}
+
+ \bibitem{xdr} RFC 1832: ``XDR: External Data Representation Standard'',
+ R.~Srinivasan, August~1995
+
+ \bibitem{xml} {\sf http://www.ossp.org/pkg/xds/xds-xml.dtd}
+
+
+ \end{thebibliography}
+
+ \end{document}
