Index: ossp-pkg/xds/docs/libxds.tex RCS File: /v/ossp/cvs/ossp-pkg/xds/docs/libxds.tex,v rcsdiff -q -kk '-r1.4' '-r1.5' -u '/v/ossp/cvs/ossp-pkg/xds/docs/libxds.tex,v' 2>/dev/null --- libxds.tex 2001/08/08 13:12:28 1.4 +++ libxds.tex 2001/08/09 12:58:08 1.5 @@ -10,8 +10,8 @@ \begin{document} -\subject{Cable \& Wireless Application Development} -\title{XDS -- eXtensible Data Serialization} +\titlehead{Cable \& Wireless Deutschland GmbH\\Application Services\\Development Team} +\title{OSSP XDS ---\\eXtensible Data Serialization} \author{Peter Simons $<$simons@computer.org$>$} \date{2001-08-01} \maketitle @@ -33,24 +33,24 @@ get even more complicated when you start exchanging floating point numbers, for which about a dozen different encodings exist! -Solving these problems is the domain of libxds; its purpose is to encode -data in a way that allows this data to be exchanged between computer -systems of different types. Assume you'd want to reliably transfer the -value \$1234 from host A to host B. Then you would encode the value using -libxds, transfer the encoded data via the network, and decode the value -again at the other end. Every application that follows this process will -read the correct value no matter what native representation its hosting -platform uses internally. - \begin{figure}[tbh] \begin{center} \includegraphics[width=\textwidth]{data-exchange.eps} - \caption{Data exchange using libxds} + \caption{Data exchange using XDS} \label{data exchange} \end{center} \end{figure} -There is a rich variety of applications for such a functionality: libxds +Solving these problems is the domain of XDS; its purpose is to encode data +in a way that allows this data to be exchanged between computer systems of +different types. Assume you'd want to reliably transfer the value \$1234 +from host A to host B. Then you would encode the value using XDS, transfer +the encoded data over the network, and decode the value again at the other +end. 
Every application that follows this process will read the correct +value no matter what native representation its hosting platform uses +internally. + +There is a rich variety of applications for such a functionality: XDS may be used to encode data before it is written to disk or read from the disk, it may be used to encode data to be exchanged between processes over the network, etc. Because of this variety, special attention has been paid @@ -58,113 +58,112 @@ \paragraph{The library has been designed to be extensible.} The functionality is split into a generic encoding and decoding framework -and a set of formatting engines. These engines can be plugged into the -framework at run-time to actually encode and decode data. Because of this -architecture, libxds can be customized to deploy any data format the -developer sees fit. Included in the distribution are formatting engines for -the XDR format specified in \cite{xdr} and for the XML format specified in -\cite{xml}. +and a set of encoding and decoding engines. These engines can be plugged +into the framework at run-time to actually do the encoding and decoding of +data. Because of this architecture, XDS can be customized to deploy any +data format the developer sees fit. Included in the distribution are +engines for the XDR format specified in \cite{xdr} and for the XML format +specified in \cite{xml}. \paragraph{The library is convenient to use.} An arbitrary number of variables can be encoded or decoded with one single -function call. All memory management is done by libxds, the developer +function call. All memory management is done by XDS, the developer doesn't need bother to allocate or to manage buffers for the encoded or decoded data. Automatic buffer management can be switched off at run-time, though, for maximum performance. \paragraph{Performance.} -Since all transferred data has to wander through libxds, the library has -been written to encode and decode with maximim performance. 
The generic -encoding framework adds almost no run-time overhead to the encoding -process. If non-automatic buffer management has been selected, hardly -anything but the actual formatting engines is executed. +Since all transferred data has to wander through XDS, the library has been +written to encode and decode with maximum performance. The generic encoding +framework adds almost no run-time overhead to the encoding process. If +non-automatic buffer management has been selected, hardly anything but the +actual encoding/decoding engines is executed. \paragraph{Robustness.} In order to verify that the library is working correctly, a set of -regression tests is included in the distribution. The test suits will --- +regression tests is included in the distribution. The test suites will --- among other things --- encode known values and compare the result with the -expected (correct) values. This ensures that libxds works correctly on any +expected (correct) values. This ensures that XDS works correctly on any platform. \paragraph{Use standard formats.} -The supported formats XDR and XML and widely known and accepted formats, -which are most likely interoperable with other marshaling implementations. -For XDR for example, it would be possible to encode data with libxds and to -decode it with an entirely different XDR implementation or vice versa. +The supported XDR and XML formats are widely known and accepted, meaning +that they are interoperable with other marshaling implementations. For XDR +for instance, it would be possible to encode data with XDS and to decode it +with an entirely different XDR implementation or vice versa. \paragraph{Portability.} -libxds has been written with portability in mind. Development took place on -FreeBSD, Linux and Solaris; other platforms has been used to test the -results. It is expected that libxds will compile and function on virtually -any POSIX.1-compilant system with a moderately modern ISO-C compiler. 
Gnu's -CC (gcc) is known to compile the library just fine. For maximum -portability, GNU autoconf has been used to determine the target system's +XDS has been written with portability in mind. Development took place on +FreeBSD, Linux and Solaris; other platforms were used to test the results. +It is expected that XDS will compile and function on virtually any +POSIX.1-compliant system with a moderately modern ISO-C compiler. GNU's +C~Compiler~(gcc) is known to compile the library just fine. For maximum +portability, GNU Autoconf has been used to determine the target system's properties. -\section{Architecture of libxds} +\section{Architecture of XDS} \begin{figure}[htb] \begin{center} \includegraphics[width=\textwidth]{architecture.eps} - \caption{Components of libxds} - \label{libxds components} + \caption{Components of XDS} + \label{XDS components} \end{center} \end{figure} -The architecture of libxds is illustrated in figure~\ref{libxds -components}. libxds consists of three components: The generic encoding and -decoding framework, a set of formatting engines to encode and decode values -in a certain forman, and a run-time context, which is used to manage -buffers, registered engines, etc. +The architecture of XDS is illustrated in figure~\ref{XDS components}. XDS +consists of three components: The generic encoding and decoding framework, +a set of engines to encode and decode values in a certain format, and a +run-time context, which is used to manage buffers, registered engines, etc. In order to use the library, the first thing the developer has to do is to -create a valid XDS context by calling {\sf xds\_init()}. The routine +create a valid XDS context by calling \textsf{xds\_init()}. The routine requires one parameter that determines whether to operate in encoding- or decoding mode. A context can be used for encoding or decoding only; it is not possible to use the same context for both operations. 
Once a valid XDS -context has been obtained, the routine {\sf xds\_register()} can be used to -register an arbitrary number of formatting engines within the context. - -A set of formatting engines has been included in the library. These -routines will handle any elementary datatype included in the ISO-C language -such as 32-bit integers, 64-bit integers, unsigned integers (of both 32- -and 64-bit), floating point numbers, strings and octet streams. - -Once all required formatting engines are registered, the routines {\sf -xds\_encode()} or {\sf xds\_\-decode()} may be used to actually perform the -encoding or decoding process. Any data type for which a formatting engine +context has been obtained, the routine \textsf{xds\_register()} can be used to +register an arbitrary number of encoding or decoding engines within the +context. + +A set of engines has been included in the library. These routines will +handle any elementary datatype included in the ISO-C language such as +32-bit integers, 64-bit integers, unsigned integers (of both 32- and +64-bit), floating point numbers, strings and octet streams. + +Once all required encoding/decoding engines are registered, the routines +\textsf{xds\_encode()} or \textsf{xds\_\-decode()} may be used to actually +perform the encoding or decoding process. Any data type for which an engine has been registered can be handled by the library. -This means, that it is possible for the developer to write custom -formatting engines for any data type he desires to use and to register them -in the context as long as these engines adhere to the {\sf xds\_engine\_t} -interface defined in {\sf xds.h}. - -In particular it is possible to register meta formatting engines. That is a -formatting engine designed to encode or decode structures --- data types -which consist of several elementary data types. The formatting engine for -such a structure will simply re-use the existing engines in order to encode -or decode the whole structure. 
The clou here is that the meta engine -doesn't even need to know \emph{which} low-level formatting engines are -registered in order to use them. Hence, a meta engine may format the whole -structure in XDR, XML, or any other format without needing to know anything -about the details. +This means, that it is possible for the developer to write custom engines +for any data type he desires to use and to register them in the context as +long as these engines adhere to the \textsf{xds\_engine\_t} interface defined +in \textsf{xds.h}. + +In particular it is possible to register meta engines. That is an engine +designed to encode or decode structures --- data types which consist of +several elementary data types. The engine for such a structure will simply +re-use the existing engines in order to encode or decode the whole +structure. The clou here is that the meta engine doesn't even need to know +\emph{which} low-level engines are registered in order to use them. Hence, +a meta engine may format the whole structure in XDR, XML, or any other +format without needing to know anything about the details. This topic is addressed in great detail in section~\ref{meta engines} of this document, but before we come to that rather advanced topic, let us start by studying two simple examples of how data is encoded and decoded -using libxds. +using XDS. \section{Using the XDS library} \subsection{Encoding} The following example program will encode three variables using the XDR -formatting engines. The result of the process will then be written to the -standard output stream, which can be redirected to a file or piped into the -decoding program described in the next section. Just take a look at the -source code for a moment, we will then go on to discuss all relevant -sections line by line. +engines. The result of the process will then be written to the standard +output stream, which can be redirected to a file or piped into the decoding +program described in the next section. 
Just take a look at the source code +for a moment, we will then go on to discuss all relevant sections line by +line. \begin{Verbatim}[numbers=left,fontsize=\small,frame=lines] #include @@ -225,62 +224,64 @@ \paragraph{Lines 1--5.} The program starts by including several system headers, which define the prototypes for some routines we use. The most interesting header in our -case is of course {\sf xds.h} --- the header of libxds. Please note that -all declarations required to use libxds are included in that file. +case is of course \textsf{xds.h} --- the header of XDS. Please note that +all declarations required to use XDS are included in that file. \paragraph{Lines 7--13.} -The {\sf error\_exit()} routine is not relevant for the example; we just +The \textsf{error\_exit()} routine is not relevant for the example; we just define it to make the rest of the source code shorter and easier to read. \paragraph{Lines 16--53.} -This is the interesting part: The {\sf main()} routine. This function will +This is the interesting part: The \textsf{main()} routine. This function will create the variables to be encoded on the stack, assign values to them, initialize the XDS library, use it to encode the values, and write the result of the encoding process to the standard output stream. Read on for further details. \paragraph{Lines 26--28.} -First of all we have to obtain a XDS context for all further operation. -This is done by calling {\sf xds\_init()}. Since we intend to \emph{encode} +First of all we have to obtain an XDS context for all further operation. +This is done by calling \textsf{xds\_init()}. Since we intend to \emph{encode} data, we initialize the context in encoding mode. The only other mode of operation would be decoding mode, but this is demonstrated in the next section. 
-All routines in libxds return a code from a small list of return codes -defined in {\sf xds.h}, but {\sf xds\_init()} is different: It will return -a pointer to an {\sf xds\_t} in case of success and {\sf NULL} in case of -failure. One reason why {\sf xds\_init()} would fail is because it can't -allocate the memory required to initialize the context. In this case, the -system variable {\sf errno} is set to {\sf ENOMEM}. Another reason why {\sf -xds\_init()} would fail is because the mode parameter is invalid, in which -case {\sf errno} woulde be set to {\sf EINVAL}. If libxds has been compiled -with assertions enabled, such an error would result in an assertion error, -terminating the program with a diagnostic message immediately. +All routines in XDS return a code from a small list of return codes defined +in \textsf{xds.h}, but \textsf{xds\_init()} is different: It will return a +pointer to an \textsf{xds\_t} in case of success and \textsf{NULL} in case +of failure. One reason why \textsf{xds\_init()} would fail is because it +can't allocate the memory required to initialize the context. In this case, +the system variable \textsf{errno} is set to \textsf{ENOMEM}. Another +reason why \textsf{xds\_init()} would fail is because the mode parameter is +invalid, in which case \textsf{errno} would be set to \textsf{EINVAL}. If +XDS has been compiled with assertions enabled, such an error would result +in an assertion error, terminating the program with a diagnostic message +immediately. \paragraph{Lines 30--33.} -Once we have obtained a valid XDS context, we register the formatting -engines we need. In this example, we'll encode a signed 32-bit integer, an -unsigned 32-bit integer, and a string. We'll be using XDR encoding in this -case, so the engines to register are {\sf xdr\_encode\_int32()}, {\sf -xdr\_encode\_uint32()}, and {\sf xdr\_encode\_string()}. 
(A complete list -of available formatting engines can be found in {\sf xds.h} or in -section~\ref{xdr}~and~\ref{xml}. Please note that we could switch the -deployed encoding format simply be using the corresponding {\sf -xml\_encode\_XXX()} engines here. We could even mix XDR and XML encoding as -we see fit but it's hard to think of a case where this would make sense. +Once we have obtained a valid XDS context, we register the engines we need. +In this example, we'll encode a signed 32-bit integer, an unsigned 32-bit +integer, and a string. We'll be using XDR encoding in this case, so the +engines to register are \textsf{xdr\_encode\_int32()}, +\textsf{xdr\_encode\_uint32()}, and \textsf{xdr\_encode\_string()}. (A +complete list of available engines can be found in \textsf{xds.h}, in +section~\ref{xdr}~and~\ref{xml}, or in the manual pages for the library.) +Please note that we could switch the deployed encoding format simply by +using the corresponding \textsf{xml\_encode\_XXX()} engines here. We could +even mix XDR and XML encoding as we see fit but it's hard to think of a +case where this would make sense. As you can see in the code, the developer is free to choose a name he'd like to register the engine under. These names may only contain -alphanumerical characters plus the hyphon (``\verb#-#'') and the underscore +alphanumerical characters plus the hyphen (``\verb#-#'') and the underscore (``\verb#_#''). You can choose any name you want, but it is recommended to -follow the naming scheme of the corresponding formatting engine. Why this -is recommended will be seen in section~\ref{meta engines}. +follow the naming scheme of the corresponding engine. Why this is +recommended will be seen in section~\ref{meta engines}. \paragraph{Lines 35--36.} This is the place where the actual encoding takes place. 
As parameters, -{\sf xds\_encode()} requires a valid encoding context plus a format string +\textsf{xds\_encode()} requires a valid encoding context plus a format string that describes how the following parameters are to be interpreted. While -the concept is obviously identical to {\sf sprintf()}, the syntax is +the concept is obviously identical to \textsf{sprintf()}, the syntax is different. The format string may contain an arbitrary number of names, which are delimited by an arbitrary number of any character that is not a legal character for engine names. Thus you can delimit the names by colons, @@ -288,70 +289,69 @@ For each valid engine name in the format string, a corresponding parameter must follow. What these parameters mean depends on the engine you're using. -The engines provided with the libxds library will expect the value to -encode, but theoretically developers are free to write formatting engines -that expect virtually any kind of information here. More about this will -explained in section~\ref{meta engines}. +The engines provided with the XDS library will expect the value to encode, +but theoretically developers are free to write encoding and decoding +engines that expect virtually any kind of information here. More about this +will be explained in section~\ref{meta engines}. \paragraph{Lines 38--39.} We have encoded all values we wanted to encode, now we can get the result -from the library. This happens by calling {\sf xds\_getbuffer()}. The +from the library. This happens by calling \textsf{xds\_getbuffer()}. The routine will store the buffer's address and length at the locations we provided as parameters. Please note that we can choose whether we want the -buffer as a ``gift'' ({\sf XDS\_GIFT}) or as a ``loan'' ({\sf XDS\_LOAN}). +buffer as a ``gift'' (\textsf{XDS\_GIFT}) or as a ``loan'' (\textsf{XDS\_LOAN}). The buffer being a ``loan'' means that the buffer is still owned by the -library -- we're only allowed to peak at it. 
But any call to an libxds +library --- we're only allowed to peek at it. But any call to an XDS routine may potentially modify the buffer or even change the buffers -location. Hence the result of a {\sf xds\_getbuffer()} call with loaning -semantics is only valid until the next libxds routine is called. After +location. Hence the result of a \textsf{xds\_getbuffer()} call with loaning +semantics is only valid until the next XDS routine is called. After that, it is invalid. If we choose the gift semantics, the buffer we receive will be owned by us; the library will not touch the buffer again. This means of course, that -we're responsible for {\sf free()}ing the buffer when we don't need it +we're responsible for \textsf{free()}ing the buffer when we don't need it anymore. \paragraph{Line 41.} Destroy the XDS context and all data associated with it. This is possible because we requested the buffer as ``gift''; the buffer is not associated -with libxds anymore. +with XDS anymore. \paragraph{Line 43.} Write the buffer with the encoded data to the standard output stream. \paragraph{Line 45.} Now that we don't need the buffer anymore, we have to return the memory it -uses to the system. libxds won't do that for us. +uses to the system. XDS won't do that for us. \paragraph{Lines 47--50.} Write a short report of what we have done to the standard error channel. \bigskip Finally, let us compile and execute the example program shown above. For -convenience, it is included in the distribution under the name {\sf -docs/encode.c}. You can compile and execute the program as follows: +convenience, it is included in the distribution under the name +\textsf{docs/encode.c}. You can compile and execute the program as follows: \begin{quote} \begin{verbatim} -simons@dev13:~/libxds$ cd docs -simons@dev13:~/libxds/docs$ gcc -I.. encode.c -o encode -L.. -lxds -simons@dev13:~/libxds/docs$ ./encode >output +$ gcc -I.. encode.c -o encode -L.. 
-lxds +$ ./encode >output Encoded data: int32 = -42 uint32 = 0x12345678 string = "This is a test." -simons@dev13:~/libxds/docs$ ls -l output +$ ls -l output -rw-r--r-- 1 simons simons 28 Aug 2 15:21 output \end{verbatim} \end{quote} -The result of executing the programm --- the file {\sf output} --- can be -displayed with {\sf hexdump(1)} or {\sf od(1)} and should look like this: +The result of executing the program --- the file \textsf{output} --- can be +displayed with \textsf{hexdump(1)} or \textsf{od(1)} and should look like this: \begin{quote} \begin{Verbatim}[fontsize=\small] -simons@dev13:~/libxds/docs$ hexdump -C output +$ hexdump -C output 00000000 ff ff ff d6 12 34 56 78 00 00 00 0f 54 68 69 73 |.....4Vx....This| 00000010 20 69 73 20 61 20 74 65 73 74 2e 00 | is a test..| 0000001c @@ -442,15 +442,15 @@ \end{Verbatim} \paragraph{Lines 1--25.} -Include the required header files, define the {\sf error\_exit()} helper +Include the required header files, define the \textsf{error\_exit()} helper function, and create the required variables on the stack. \paragraph{Lines 27--39.} These instructions will read an unspecified number of bytes from the standard input stream --- as long as the input does not exceed the size of -the {\sf buffer} variable. In order to provide the program with the -apropriate input, redirect the standard input stream to the file {\sf -output} created in the previous section or connect the encoding and +the \textsf{buffer} variable. In order to provide the program with the +appropriate input, redirect the standard input stream to the file +\textsf{output} created in the previous section or connect the encoding and decoding programs directly by a pipe. \paragraph{Lines 41-43.} @@ -468,34 +468,34 @@ Here we do not get a buffer from the library, we \emph{set} the buffer we've read earlier in the context for decoding. Please note that we use loan semantics in this case, not gift semantics. This is necessary because 
This is necessary because -{\sf buffer} has not been allocated by {\sf malloc()} --- the variable -lives on the stack. This means that we cannot give it to libxds because -libxds expects to be able to {\sf free()} the buffer when the context is +\textsf{buffer} has not been allocated by \textsf{malloc()} --- the variable +lives on the stack. This means that we cannot give it to XDS because +XDS expects to be able to \textsf{free()} the buffer when the context is destroyed. Loan semantics are fine, though, all we have to do is to take care that we -don't erase or modify the contents of {\sf buffer} while libxds operates on +don't erase or modify the contents of \textsf{buffer} while XDS operates on it. The library itself will never touch the buffer in decode mode, no matter whether loan or gift semantics have been chosen. \paragraph{Lines 53--54.} Here come the actual decoding of the buffer's contents using {\sf -xds\_decode()}. The syntax is identical to {\sf xds\_encode()}'s, the only +xds\_decode()}. The syntax is identical to \textsf{xds\_encode()}'s, the only difference is that the decoding engines do not expect the values --- like the encoding engines did --- but the location where to store the value. -Thus we pass the addresses of the apropriate variables here. If the routine -returns with {\sf XDS\_OK}, the decoded values will have been stored in +Thus we pass the addresses of the appropriate variables here. If the routine +returns with \textsf{XDS\_OK}, the decoded values will have been stored in those locations. It should be noted that the decoded string cannot trivially be returned -this way. Instead, {\sf xds\_decode()} will use {\sf malloc()} to allocate +this way. Instead, \textsf{xds\_decode()} will use \textsf{malloc()} to allocate a buffer barely large enough to hold the string. The address of that buffer -is then stored in the pointer {\sf string}. 
Of course this means that the -application has to {\sf free()} the string once it's not required anymore. +is then stored in the pointer \textsf{string}. Of course this means that the +application has to \textsf{free()} the string once it's not required anymore. \paragraph{Line 56.} We don't need the context anymore, so we destroy it and free all used -resources. This does not affect {\sf buffer} in any way because we used +resources. This does not affect \textsf{buffer} in any way because we used loan semantics. \paragraph{Lines 58-61.} @@ -503,19 +503,18 @@ a look at them. \paragraph{Line 63.} -Now that we don't need the contents of {\sf string} anymore, we must return -the buffer allocated in {\sf xds\_decode()} to the system. +Now that we don't need the contents of \textsf{string} anymore, we must return +the buffer allocated in \textsf{xds\_decode()} to the system. \bigskip Like the encoding program described earlier, the source code to this -program is included in the library distribution as {\sf docs/decode.c}. You -can compile and execute the program like this: +program is included in the library distribution as \textsf{docs/decode.c}. +You can compile and execute the program like this: \begin{quote} \begin{verbatim} -simons@dev13:~/libxds$ cd docs -simons@dev13:~/libxds/docs$ gcc -I.. decode.c -o decode -L.. -lxds -simons@dev13:~/libxds/docs$ ./decode