##
##  OSSP sa - Socket Abstraction
##  Copyright (c) 2001-2004 Ralf S. Engelschall
##  Copyright (c) 2001-2004 The OSSP Project
##  Copyright (c) 2001-2004 Cable & Wireless
##
##  This file is part of OSSP sa, a socket abstraction library which
##  can be found at http://www.ossp.org/pkg/lib/sa/.
##
##  Permission to use, copy, modify, and distribute this software for
##  any purpose with or without fee is hereby granted, provided that
##  the above copyright notice and this permission notice appear in all
##  copies.
##
##  THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
##  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
##  MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
##  IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR
##  CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
##  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
##  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
##  USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
##  ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
##  OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
##  OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
##  SUCH DAMAGE.
##
##  sa.pod: socket abstraction library manual page
##

=pod

=head1 NAME

B<OSSP sa> - Socket Abstraction

=head1 VERSION

B<OSSP sa>

=head1 SYNOPSIS

=over 4

=item B<Data Types>:

sa_rc_t,
sa_addr_t,
sa_t.

=item B<Address Object Operations>:

sa_addr_create,
sa_addr_destroy.

=item B<Address Operations>:

sa_addr_u2a,
sa_addr_s2a,
sa_addr_a2u,
sa_addr_a2s,
sa_addr_match.

=item B<Socket Object Operations>:

sa_create,
sa_destroy.

=item B<Socket Parameter Operations>:

sa_type,
sa_timeout,
sa_buffer,
sa_option,
sa_syscall.

=item B<Socket Connection Operations>:

sa_bind,
sa_connect,
sa_listen,
sa_accept,
sa_getremote,
sa_getlocal,
sa_shutdown.

=item B<Socket Input/Output Operations (Stream Communication)>:

sa_getfd,
sa_read,
sa_readln,
sa_write,
sa_writef,
sa_flush.

=item B<Socket Input/Output Operations (Datagram Communication)>:

sa_recv,
sa_send,
sa_sendf.

=item B<Socket Error Handling>:

sa_error.

=back

=head1 DESCRIPTION

B<OSSP sa> is an abstraction library for the Unix I<socket> networking
application programming interface (API), featuring stream and datagram
oriented communication over I<Unix Domain> and I<Internet Domain> (TCP
and UDP) sockets.

It provides the following key features:

=over 4

=item B<Stand-Alone, Self-Contained, Embeddable>

Although there are various Open Source libraries available which provide
a similar abstraction approach, they all either lack important features
or unfortunately depend on other companion libraries. B<OSSP sa> fills
this gap by providing all important features (see following points) as a
stand-alone and fully self-contained library. This way B<OSSP sa> can be
trivially embedded as a sub-library into other libraries. It especially
provides additional support for namespace-safe embedding of its API in
order to avoid symbol conflicts (see C<SA_PREFIX> in F<sa.h>).

=item B<Address Abstraction>

Most of the ugliness in the Unix I<socket> API comes from having to
deal with the various address structures (C<struct sockaddr_xx>), which
exist because of both the different communication types and addressing
schemes. B<OSSP sa> fully hides this by providing an abstract and opaque
address type (C<sa_addr_t>) together with utility functions which allow
one to convert from the traditional C<struct sockaddr> or URI
specification to the C<sa_addr_t> and vice versa without having to deal
with special cases related to the particular underlying C<struct
sockaddr_xx>. B<OSSP sa> supports I<Unix Domain> and both IPv4 and IPv6
I<Internet Domain> addressing.

=item B<Type Abstraction>

Some other subtle details in the Unix I<socket> API make life hard in
practice: C<socklen_t> and C<ssize_t>. These two types originally were
(and on some platforms still are) plain integers or unsigned integers,
while POSIX later introduced its own types for them (and even revised
these types again after some time). This is nasty, because for 100%
type-correct API usage (especially important on 64-bit machines, where
pointers to different integer types cause trouble), every application
has to check whether the newer types exist and, if not, provide its own
definitions which map to the integer type actually used on the
underlying platform. B<OSSP sa> hides most of this in its API and
provides a backward-compatibility definition for C<socklen_t>. Instead
of C<ssize_t> it can use C<size_t>, because B<OSSP sa> does not use
traditional Unix return code semantics.

=item B<I/O Timeouts>

Each I/O function in B<OSSP sa> is aware of timeouts (set by
sa_timeout(3)), i.e., all I/O operations return C<SA_ERR_TMT> if the
timeout expired before the I/O operation was able to succeed. This
allows one to easily program less-blocking network services.
B<OSSP sa> internally implements these timeouts either through the
C<SO_>{C<SND>,C<RCV>}C<TIMEO> feature on more modern I<socket>
implementations or through traditional select(2). This way high
performance is achieved on modern platforms while the full functionality
is still available on older platforms.
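The select(2) fallback mentioned above can be sketched as follows. This is only a minimal illustration of the general technique, not the library's actual implementation: the names C<timed_read> and C<timed_rc_t> and the C<TR_*> codes are hypothetical and exist only for this example.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes for this sketch (not the library's sa_rc_t). */
typedef enum { TR_OK, TR_TIMEOUT, TR_EOF, TR_ERROR } timed_rc_t;

/*
 * Read up to len bytes from fd, waiting at most sec seconds plus usec
 * microseconds for data to become readable. A timeout of 0/0 means
 * "no timeout", i.e. a plain blocking read. The number of bytes read
 * is stored in *done.
 */
static timed_rc_t timed_read(int fd, void *buf, size_t len,
                             long sec, long usec, size_t *done)
{
    ssize_t n;

    *done = 0;
    if (sec != 0 || usec != 0) {
        fd_set rset;
        struct timeval tv;
        int rv;

        /* Wait for readability; the full timeout restarts after EINTR. */
        do {
            FD_ZERO(&rset);
            FD_SET(fd, &rset);
            tv.tv_sec  = sec;
            tv.tv_usec = usec;
            rv = select(fd + 1, &rset, NULL, NULL, &tv);
        } while (rv == -1 && errno == EINTR);
        if (rv == 0)
            return TR_TIMEOUT;  /* nothing became readable in time */
        if (rv == -1)
            return TR_ERROR;
    }

    /* The descriptor is readable (or no timeout was requested). */
    do {
        n = read(fd, buf, len);
    } while (n == -1 && errno == EINTR);
    if (n == -1)
        return TR_ERROR;
    if (n == 0)
        return TR_EOF;
    *done = (size_t)n;
    return TR_OK;
}
```

A caller would invoke it as C<timed_read(fd, buf, sizeof(buf), 30, 0, &n)> and treat C<TR_TIMEOUT> the way an application using this library treats C<SA_ERR_TMT>.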

=item B<I/O Stream Buffering>

If B<OSSP sa> is used for stream communication, internally all I/O
operations can be performed through input and/or output buffers (set
by sa_buffer(3)) to achieve higher I/O performance by doing I/O
operations on larger aggregated messages and with fewer system calls.
Additionally, if B<OSSP sa> is used for stream communication,
line-oriented reading (sa_readln(3)) and formatted writing (see
sa_writef(3)) are provided for convenience, modelled after STDIO's
fgets(3) and fprintf(3). Both features fully benefit from the I/O
buffering.

=back

=head1 DATA TYPES

B<OSSP sa> uses three data types in its API:

=over 4

=item B<sa_rc_t> (Return Code Type)

This is an exported enumerated integer type with the following possible
values:

 SA_OK       Everything Ok
 SA_ERR_ARG  Invalid Argument
 SA_ERR_USE  Invalid Use Or Context
 SA_ERR_MEM  Not Enough Memory
 SA_ERR_MTC  Matching Failed
 SA_ERR_EOF  End Of Communication
 SA_ERR_TMT  Communication Timeout
 SA_ERR_SYS  Operating System Error (see errno)
 SA_ERR_IMP  Implementation Not Available
 SA_ERR_INT  Internal Error

=item B<sa_addr_t> (Socket Address Abstraction Type)

This is an opaque data type representing a socket address.
Only pointers to this abstract data type are used in the API.

=item B<sa_t> (Socket Abstraction Type)

This is an opaque data type representing a socket.
Only pointers to this abstract data type are used in the API.

=back

=head1 FUNCTIONS

B<OSSP sa> provides a set of API functions, all modelled after the
same prototype:

C<sa_rc_t> B<sa_>I<name>C<(sa_>[C<addr>]C<_t *,> ...C<)>

This means every function returns an C<sa_rc_t> return code to indicate
its success (C<SA_OK>) or failure (C<SA_ERR_>I<XXX>). The corresponding
descriptive text can be determined by passing this return code to
sa_error(3). Each function name starts with the common prefix C<sa_>
and receives an C<sa_t> (or C<sa_addr_t>) object handle on which it
operates as its first argument.

=head2 Address Object Operations

This API part provides operations for the creation and destruction of
address abstraction C<sa_addr_t>.
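The calling convention described above (an opaque handle, results passed back through pointer arguments, and a return code from every function) can be sketched in isolation. The following is only an illustration of the pattern under stated assumptions, not code from the library: all C<demo_*> names and C<DEMO_*> codes are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical return codes, loosely following the convention above. */
typedef enum { DEMO_OK, DEMO_ERR_ARG, DEMO_ERR_MEM } demo_rc_t;

/* In a real library only this forward declaration would be public. */
typedef struct demo_addr_st demo_addr_t;

/* The structure layout stays private to the implementation. */
struct demo_addr_st {
    char *uri;
};

/* Create an empty address object; the handle is returned via *addrp. */
static demo_rc_t demo_addr_create(demo_addr_t **addrp)
{
    demo_addr_t *a;

    if (addrp == NULL)
        return DEMO_ERR_ARG;
    if ((a = malloc(sizeof *a)) == NULL)
        return DEMO_ERR_MEM;
    a->uri = NULL;
    *addrp = a;
    return DEMO_OK;
}

/* Store a private copy of uri in the object. */
static demo_rc_t demo_addr_set(demo_addr_t *a, const char *uri)
{
    char *copy;

    if (a == NULL || uri == NULL)
        return DEMO_ERR_ARG;
    if ((copy = malloc(strlen(uri) + 1)) == NULL)
        return DEMO_ERR_MEM;
    strcpy(copy, uri);
    free(a->uri);
    a->uri = copy;
    return DEMO_OK;
}

/* Destroy the object; the handle is invalid afterwards. */
static demo_rc_t demo_addr_destroy(demo_addr_t *a)
{
    if (a == NULL)
        return DEMO_ERR_ARG;
    free(a->uri);
    free(a);
    return DEMO_OK;
}
```

Because the return value is reserved for the status code, every result travels through an output argument, which is why the creation functions in this API take a pointer to a pointer.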

=over 4

=item sa_rc_t B<sa_addr_create>(sa_addr_t **I<saa>);

Create a socket address abstraction object.
The object is stored in I<saa> on success.

=item sa_rc_t B<sa_addr_destroy>(sa_addr_t *I<saa>);

Destroy a socket address abstraction object.
The object I<saa> is invalid after this call succeeded.

=back

=head2 Address Operations

This API part provides operations for working with the address
abstraction C<sa_addr_t>.

=over 4

=item sa_rc_t B<sa_addr_u2a>(sa_addr_t *I<saa>, const char *I<uri>, ...);

Import an address by converting from a URI specification to the
corresponding address abstraction.

The supported syntax for I<uri> is: "C<unix:>I<path>" for I<Unix Domain>
addresses and "C<inet://>I<addr>C<:>I<port>[C<#>I<protocol>]" for
I<Internet Domain> addresses.

In the URI, I<path> can be an absolute or relative filesystem path to
an existing or non-existing file. I<addr> can be an IPv4 address in
dotted decimal notation ("C<127.0.0.1>"), an IPv6 address in
colon-separated (optionally abbreviated) hexadecimal notation
("C<::1>") or a to-be-resolved hostname. I<port> has to be either a
decimal port in the range C<1>...C<65535> or a port name. If I<port>
is specified as a name, it is resolved as a TCP port by default. To
force resolving a I<port> name via a particular protocol, I<protocol>
can be specified as either "C<tcp>" or "C<udp>".

The result is stored in I<saa> on success.

=item sa_rc_t B<sa_addr_s2a>(sa_addr_t *I<saa>, const struct sockaddr *I<sabuf>, socklen_t I<salen>);

Import an address by converting from a traditional C<struct sockaddr>
object to the corresponding address abstraction.

The accepted addresses for I<sabuf> are: C<struct sockaddr_un>
(C<AF_LOCAL>), C<struct sockaddr_in> (C<AF_INET>) and C<struct
sockaddr_in6> (C<AF_INET6>). The I<salen> is the corresponding
C<sizeof(...)> of the particular underlying structure.

The result is stored in I<saa> on success.

=item sa_rc_t B<sa_addr_a2u>(sa_addr_t *I<saa>, char **I<uri>);

Export an address by converting from the address abstraction to the
corresponding URI specification.

The result is a string of the form "C<unix:>I<path>" for I<Unix Domain>
addresses and "C<inet://>I<addr>C<:>I<port>" for I<Internet Domain>
addresses. Notice that I<addr> and I<port> are returned in numerical
(unresolved) form. Additionally, because one usually cannot map
bidirectionally between TCP or UDP port names and the numerical value,
there is no distinction between TCP and UDP here.

The result is stored in I<uri> on success. The caller has to free(3) the
I<uri> buffer later.

=item sa_rc_t B<sa_addr_a2s>(sa_addr_t *I<saa>, struct sockaddr **I<sabuf>, socklen_t *I<salen>);

Export an address by converting from the address abstraction to the
corresponding traditional C<struct sockaddr> object.

The result is one of the following particular underlying address
structures: C<struct sockaddr_un> (C<AF_LOCAL>), C<struct sockaddr_in>
(C<AF_INET>) and C<struct sockaddr_in6> (C<AF_INET6>).

The result is stored in I<sabuf> and I<salen> on success. The caller has
to free(3) the I<sabuf> buffer later.

=item sa_rc_t B<sa_addr_match>(sa_addr_t *I<saa1>, sa_addr_t *I<saa2>, size_t I<prefixlen>);

Match two address abstractions up to a specified prefix.

This compares the addresses I<saa1> and I<saa2> by only taking the
prefix part of length I<prefixlen> into account. I<prefixlen> is the
number of filesystem path characters for I<Unix Domain> addresses and
the number of bits for I<Internet Domain> addresses. In case of
I<Internet Domain> addresses, the addresses are matched in network byte
order and the port (counting as an additional bit/item of length 1) is
virtually appended to the address for matching. Specifying I<prefixlen>
as C<-1> means matching the whole address (but without the virtually
appended port) without having to know how long the underlying address
representation is (length of path for Unix Domain addresses, 32+1 [IPv4]
or 128+1 [IPv6] for Internet Domain addresses). Specifying I<prefixlen>
as C<-2> is equal to C<-1>, but additionally the port is matched, too.

This can especially be used to implement Access Control Lists (ACL)
without having to fiddle around with the underlying representation.
For this, make I<saa1> the address to be checked and I<saa2> plus
I<prefixlen> the ACL pattern, as shown in the following example.

Example:

 sa_t      *srv_sa;
 sa_addr_t *clt_saa;
 sa_t      *clt_sa;
 sa_addr_t *acl_saa;
 char      *acl_addr = "192.168.0.0";
 int        acl_len  = 24;
 ...
 sa_addr_u2a(&acl_saa, "inet://%s:0", acl_addr);
 ...
 while (sa_accept(srv_sa, &clt_saa, &clt_sa) == SA_OK) {
     if (sa_addr_match(clt_saa, acl_saa, acl_len) != SA_OK) {
         /* connection refused */
         ...
         sa_addr_destroy(clt_saa);
         sa_destroy(clt_sa);
         continue;
     }
     ...
 }
 ...

=back

=head2 Socket Object Operations

This API part provides operations for the creation and destruction of
socket abstraction C<sa_t>.

=over 4

=item sa_rc_t B<sa_create>(sa_t **I<sa>);

Create a socket abstraction object.
The object is stored in I<sa> on success.

=item sa_rc_t B<sa_destroy>(sa_t *I<sa>);

Destroy a socket abstraction object.
The object I<sa> is invalid after this call succeeded.

=back

=head2 Socket Parameter Operations

This API part provides operations for parameterizing the socket
abstraction C<sa_t>.

=over 4

=item sa_rc_t B<sa_type>(sa_t *I<sa>, sa_type_t I<type>);

Assign a particular communication protocol type to the socket
abstraction object.

A socket can only be assigned a single protocol type at any time.
Nevertheless one can switch the type of a socket abstraction object at
any time in order to reuse it for a different communication. Just keep
in mind that switching the type will stop a still ongoing communication
by closing the underlying socket.

Possible values for I<type> are C<SA_TYPE_STREAM> (stream communication)
and C<SA_TYPE_DATAGRAM> (datagram communication). The default
communication protocol type is C<SA_TYPE_STREAM>.

=item sa_rc_t B<sa_timeout>(sa_t *I<sa>, sa_timeout_t I<id>, long I<sec>, long I<usec>);

Assign one or more communication timeouts to the socket abstraction
object.

Possible values for I<id> are: C<SA_TIMEOUT_ACCEPT> (affecting
sa_accept(3)), C<SA_TIMEOUT_CONNECT> (affecting sa_connect(3)),
C<SA_TIMEOUT_READ> (affecting sa_read(3), sa_readln(3) and sa_recv(3))
and C<SA_TIMEOUT_WRITE> (affecting sa_write(3), sa_writef(3),
sa_send(3), and sa_sendf(3)). Additionally you can set all four timeouts
at once by using C<SA_TIMEOUT_ALL>. The default is that no communication
timeouts are used, which is equal to I<sec>=C<0>/I<usec>=C<0>.

=item sa_rc_t B<sa_buffer>(sa_t *I<sa>, sa_buffer_t I<id>, size_t I<size>);

Assign I/O communication buffers to the socket abstraction object.
Possible values for I<id> are: C<SA_BUFFER_READ> (affecting sa_read(3)
and sa_readln(3)) and C<SA_BUFFER_WRITE> (affecting sa_write(3) and
sa_writef(3)). The default is that no communication buffers are used,
which is equal to I<size>=C<0>.

=item sa_rc_t B<sa_option>(sa_t *I<sa>, sa_option_t I<id>, ...);

Adjust various options of the socket abstraction object.

The adjusted option is controlled by I<id>. The number and type of the
expected following argument(s) depend on the particular option.
Currently the following options are implemented (option arguments in
parentheses):

C<SA_OPTION_NAGLE> (C<int> I<yesno>) for enabling (I<yesno> == C<1>)
or disabling (I<yesno> == C<0>) Nagle's Algorithm (see RFC 896).

C<SA_OPTION_LINGER> (C<int> I<amount>) for enabling (I<amount> ==
I<seconds> E<gt> C<0>) or disabling (I<amount> == C<0>) lingering on
close (see C<SO_LINGER> of setsockopt(2)).

C<SA_OPTION_REUSEADDR> (C<int> I<yesno>) for enabling (I<yesno> ==
C<1>) or disabling (I<yesno> == C<0>) the reusability of the address on
binding via sa_bind(3) (see C<SO_REUSEADDR> of setsockopt(2)).

C<SA_OPTION_REUSEPORT> (C<int> I<yesno>) for enabling (I<yesno> ==
C<1>) or disabling (I<yesno> == C<0>) the reusability of the port on
binding via sa_bind(3) (see C<SO_REUSEPORT> of setsockopt(2)).

C<SA_OPTION_NONBLOCK> (C<int> I<yesno>) for enabling (I<yesno> == C<1>)
or disabling (I<yesno> == C<0>) non-blocking I/O mode (see
C<O_NONBLOCK> of fcntl(2)).

=item sa_rc_t B<sa_syscall>(sa_t *I<sa>, sa_syscall_t I<id>, void (*I<fptr>)(), void *I<fctx>);

Divert I/O communication related system calls to user supplied callback
functions.

This allows you to override almost all I/O related system calls
B<OSSP sa> internally performs while communicating. This can be used to
adapt B<OSSP sa> to different run-time environments and requirements
without having to change the source code. Usually this is used to divert
the system calls to the variants of a user-land multithreading facility
like B<GNU pth>.

The function supplied as I<fptr> is required to fulfill the API of the
replaced system call, i.e., it has to have the same prototype (if
I<fctx> is C<NULL>). If I<fctx> is not C<NULL>, this prototype has to be
extended to accept an additional first argument of type C<void *> which
receives the value of I<fctx>. It is up to the callback function whether
to pass the call through to the replaced actual system call or not.

Possible values for I<id> are (expected prototypes behind I<fptr> are
given in parentheses):

C<SA_SYSCALL_CONNECT>: "C<int (*)([void *,] int, const struct sockaddr *, socklen_t)>",
see connect(2).

C<SA_SYSCALL_ACCEPT>: "C<int (*)([void *,] int, struct sockaddr *, socklen_t *)>",
see accept(2).

C<SA_SYSCALL_SELECT>: "C<int (*)([void *,] int, fd_set *, fd_set *, fd_set *, struct timeval *)>",
see select(2).

C<SA_SYSCALL_READ>: "C<ssize_t (*)([void *,] int, void *, size_t)>",
see read(2).

C<SA_SYSCALL_WRITE>: "C<ssize_t (*)([void *,] int, const void *, size_t)>",
see write(2).

C<SA_SYSCALL_RECVFROM>: "C<ssize_t (*)([void *,] int, void *, size_t, int, struct sockaddr *, socklen_t *)>",
see recvfrom(2).

C<SA_SYSCALL_SENDTO>: "C<ssize_t (*)([void *,] int, const void *, size_t, int, const struct sockaddr *, socklen_t)>",
see sendto(2).

Example:

 FILE *trace_fp = ...;

 /* read(2) replacement which logs every call to the trace file */
 ssize_t trace_read(void *ctx, int fd, void *buf, size_t len)
 {
     FILE *fp = (FILE *)ctx;
     ssize_t rv;
     int errno_saved;

     rv = read(fd, buf, len);
     errno_saved = errno;  /* fprintf(3) may clobber errno */
     fprintf(fp, "read(%d, %p, %lu) = %ld\n",
             fd, buf, (unsigned long)len, (long)rv);
     errno = errno_saved;
     return rv;
 }

 sa_syscall(sa, SA_SYSCALL_READ, (void (*)())trace_read, trace_fp);

=back

=head2 Socket Connection Operations

This API part provides connection operations for stream-oriented data
communication through the socket abstraction C<sa_t>.

=over 4

=item sa_rc_t B<sa_bind>(sa_t *I<sa>, sa_addr_t *I<laddr>);

Bind socket abstraction object to a local protocol address.

This assigns the local protocol address I<laddr>. When a socket is
created, it exists in an address family space but has no protocol
address assigned. This call requests that I<laddr> be used as the local
address. For servers this is the address they later listen on (see
sa_listen(3)) for incoming connections, for clients this is the address
used for outgoing connections (see sa_connect(3)). Internally this
directly maps to bind(2).

=item sa_rc_t B<sa_connect>(sa_t *I<sa>, sa_addr_t *I<raddr>);

Initiate an outgoing connection on a socket abstraction object.

This performs a connect to the remote address I<raddr>. If the socket
is of type C<SA_TYPE_DATAGRAM>, this call specifies the peer with which
the socket is to be associated; this address is that to which datagrams
are to be sent, and the only address from which datagrams are to be
received. If the socket is of type C<SA_TYPE_STREAM>, this call attempts
to make a connection to the remote socket. Internally this directly maps
to connect(2).

=item sa_rc_t B<sa_listen>(sa_t *I<sa>, int I<backlog>);

Listen for incoming connections on a socket abstraction object.
A willingness to accept incoming connections and a queue limit for
incoming connections are specified by this call. The I<backlog> argument
defines the maximum length the queue of pending connections may grow to.
Internally this directly maps to listen(2).

=item sa_rc_t B<sa_accept>(sa_t *I<sa>, sa_addr_t **I<caddr>, sa_t **I<csa>);

Accept incoming connection on a socket abstraction object.

This accepts an incoming connection by extracting the first connection
request on the queue of pending connections. It creates a new socket
abstraction object (returned in I<csa>) and a new socket address
abstraction object (returned in I<caddr>) describing the connection. The
caller has to destroy these objects later. If no pending connections are
present on the queue, it blocks the caller until a connection is
present.

Example:

 sa_addr_t *clt_saa;
 sa_t      *clt_sa;
 ...
 while (sa_accept(srv_sa, &clt_saa, &clt_sa) == SA_OK) {
     ...
 }

=item sa_rc_t B<sa_getremote>(sa_t *I<sa>, sa_addr_t **I<raddr>);

Get address abstraction of remote side of communication.

This determines the address of the communication peer and creates a new
socket address abstraction object (returned in I<raddr>) describing
the peer address. The application has to destroy I<raddr> later with
sa_addr_destroy(3). Internally this maps to getpeername(2).

=item sa_rc_t B<sa_getlocal>(sa_t *I<sa>, sa_addr_t **I<laddr>);

Get address abstraction of local side of communication.

This determines the address of the local communication side and creates
a new socket address abstraction object (returned in I<laddr>)
describing the local address. The application has to destroy I<laddr>
later with sa_addr_destroy(3). Internally this maps to getsockname(2).

=item sa_rc_t B<sa_shutdown>(sa_t *I<sa>, char *I<flags>);

Shut down part of the full-duplex connection.

This performs a shut down of the connection described in I<sa>.
The flags string can be either "C<r>" (indicating that only the read
channel of the communication is shut down), "C<w>" (indicating that
only the write channel of the communication is shut down), or "C<rw>"
(indicating that both the read and write channels of the communication
are shut down). Internally this directly maps to shutdown(2).

=back

=head2 Socket Input/Output Operations (Stream Communication)

This API part provides I/O operations for stream-oriented data
communication through the socket abstraction C<sa_t>.

=over 4

=item sa_rc_t B<sa_getfd>(sa_t *I<sa>, int *I<fd>);

Get underlying socket filedescriptor.

This peeks into the underlying socket filedescriptor B<OSSP sa>
allocated internally for the communication. This can be used for
adjusting the socket communication (via fcntl(2), setsockopt(2), etc)
directly.

Think twice before using this, then think once more. After all that,
think again. With enough thought, the need for directly manipulating
the underlying socket can often be eliminated. At least remember that
all your direct socket operations fully bypass B<OSSP sa> and this way
can lead to nasty side-effects.

=item sa_rc_t B<sa_read>(sa_t *I<sa>, char *I<buf>, size_t I<buflen>, size_t *I<bufdone>);

Read a chunk of data from socket into own buffer.

This reads from the socket (optionally through the internal read buffer)
up to a maximum of I<buflen> bytes into buffer I<buf>. The actual number
of read bytes is stored in I<bufdone>. This internally maps to read(2).

=item sa_rc_t B<sa_readln>(sa_t *I<sa>, char *I<buf>, size_t I<buflen>, size_t *I<bufdone>);

Read a line of data from socket into own buffer.

This reads from the socket (optionally through the internal read buffer)
up to a maximum of I<buflen> bytes into buffer I<buf>, but only as long
as no line terminating newline character (0x0a) was found. The line
terminating newline character is stored in the buffer, followed by a
(not counted) terminating C<NUL> character ('C<\0>'). The actual number
of read bytes is stored in I<bufdone>. This internally maps to
sa_read(3).
Keep in mind that for efficiency reasons, line-oriented I/O should
usually be performed with a read buffer (see sa_buffer(3) and
C<SA_BUFFER_READ>). Without such a read buffer, the performance is
cruel, because single character read(2) operations would be performed on
the underlying socket.

=item sa_rc_t B<sa_write>(sa_t *I<sa>, const char *I<buf>, size_t I<buflen>, size_t *I<bufdone>);

Write a chunk of data to socket from own buffer.

This writes to the socket (optionally through the internal write buffer)
I<buflen> bytes from buffer I<buf>. In case of a partial write, the
actual number of written bytes is stored in I<bufdone>. This internally
maps to write(2).

=item sa_rc_t B<sa_writef>(sa_t *I<sa>, const char *I<fmt>, ...);

Write formatted data to socket.

This formats a string according to the printf(3)-style format
specification I<fmt> and sends the result to the socket (optionally
through the internal write buffer). In case of a partial socket write,
the unwritten data of the formatted string is internally discarded.
Hence using a write buffer is strongly recommended here (see
sa_buffer(3) and C<SA_BUFFER_WRITE>). This internally maps to
sa_write(3).

The underlying string formatting engine is just a minimal one and for
security and independence reasons intentionally not directly based on
s[n]printf(3). It understands only the following format specifications:
"C<%%>", "C<%c>" (C<char>), "C<%s>" (C<char *>) and "C<%d>" (C<int>)
without any precision and padding possibilities. It is intended for
minimal formatting only. If you need more sophisticated formatting, you
have to format first into an own buffer via s[n]printf(3) and then write
this to the socket via sa_write(3) instead.

=item sa_rc_t B<sa_flush>(sa_t *I<sa>);

Flush still pending outgoing data to socket.

This writes all still pending outgoing data in the internal write
buffer (see sa_buffer(3) and C<SA_BUFFER_WRITE>) to the socket. This
internally maps to write(2).

=back

=head2 Socket Input/Output Operations (Datagram Communication)

This API part provides I/O operations for datagram-oriented data
communication through the socket abstraction C<sa_t>.

=over 4

=item sa_rc_t B<sa_recv>(sa_t *I<sa>, sa_addr_t **I<raddr>, char *I<buf>, size_t I<buflen>, size_t *I<bufdone>);

Receive a chunk of data from remote address via socket into own buffer.

This receives via the socket up to a maximum of I<buflen> bytes into
buffer I<buf>. The remote address the data came from is returned in
I<raddr>. The actual number of received bytes is stored in I<bufdone>.
This internally maps to recvfrom(2).

=item sa_rc_t B<sa_send>(sa_t *I<sa>, sa_addr_t *I<raddr>, const char *I<buf>, size_t I<buflen>, size_t *I<bufdone>);

Send a chunk of data to remote address via socket from own buffer.

This sends to the remote address specified in I<raddr> via the socket
I<buflen> bytes from buffer I<buf>. The actual number of sent bytes is
stored in I<bufdone>. This internally maps to sendto(2).

=item sa_rc_t B<sa_sendf>(sa_t *I<sa>, sa_addr_t *I<raddr>, const char *I<fmt>, ...);

Send formatted data to remote address via socket.

This formats a string according to the printf(3)-style format
specification I<fmt> and sends the result to the socket as a single
chunk of data. In case of a partial socket write, the unwritten data of
the formatted string is internally discarded.

The underlying string formatting engine is just a minimal one and for
security and independence reasons intentionally not directly based on
s[n]printf(3). It understands only the following format specifications:
"C<%%>", "C<%c>" (C<char>), "C<%s>" (C<char *>) and "C<%d>" (C<int>)
without any precision and padding possibilities. It is intended for
minimal formatting only. If you need more sophisticated formatting, you
have to format first into an own buffer via s[n]printf(3) and then send
this to the remote address via sa_send(3) instead.

=back

=head2 Socket Error Handling

This API part provides error handling operations only.
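A lookup from return code to descriptive text might be organised as in the following sketch. It is only an illustration and not the library's implementation: the type C<ex_rc_t>, the C<EX_*> codes and the function C<ex_error> are hypothetical names introduced here, and only a few codes are shown.

```c
#include <stddef.h>

/* Hypothetical subset of return codes for this sketch. */
typedef enum {
    EX_OK,
    EX_ERR_ARG,
    EX_ERR_TMT,
    EX_ERR_SYS
} ex_rc_t;

/*
 * Map a return code to a constant, read-only string. The caller must
 * neither modify nor free the result, because it points to static data.
 */
static const char *ex_error(ex_rc_t rc)
{
    switch (rc) {
        case EX_OK:      return "Everything Ok";
        case EX_ERR_ARG: return "Invalid Argument";
        case EX_ERR_TMT: return "Communication Timeout";
        case EX_ERR_SYS: return "Operating System Error";
    }
    /* Reached only for values outside the enumeration. */
    return "Invalid Result Code";
}
```

Returning pointers to static strings keeps the function free of allocation and of failure modes of its own, which suits a routine that is typically called while already handling an error.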

=over 4

=item char *B<sa_error>(sa_rc_t I<rv>);

Return the string representation corresponding to the return code value
I<rv>. The returned string has to be treated as read-only by the
application and does not need to be deallocated.

=back

=head1 SEE ALSO

=head2 Standards

R. Gilligan, S. Thomson, J. Bound, W. Stevens:
I<"Basic Socket Interface Extensions for IPv6">,
B<RFC 2553>, March 1999.

W. Stevens:
I<"Advanced Sockets API for IPv6">,
B<RFC 2292>, February 1998.

R. Fielding, L. Masinter, T. Berners-Lee:
I<"Uniform Resource Identifiers: Generic Syntax">,
B<RFC 2396>, August 1998.

R. Hinden, S. Deering:
I<"IP Version 6 Addressing Architecture">,
B<RFC 2373>, July 1998.

R. Hinden, B. Carpenter, L. Masinter:
I<"Format for Literal IPv6 Addresses in URL's">,
B<RFC 2732>, December 1999.

=head2 Papers

Stuart Sechrest:
I<"An Introductory 4.4BSD Interprocess Communication Tutorial">,
FreeBSD 4.4 (/usr/share/doc/psd/20.ipctut/).

Samuel J. Leffler, Robert S. Fabry, William N. Joy, Phil Lapsley:
I<"An Advanced 4.4BSD Interprocess Communication Tutorial">,
FreeBSD 4.4 (/usr/share/doc/psd/21.ipc/).

Craig Metz:
I<"Protocol Independence Using the Sockets API">,
http://www.usenix.org/publications/library/proceedings/usenix2000/freenix/metzprotocol.html,
USENIX Annual Technical Conference, June 2000.

=head2 Manual Pages

socket(2),
accept(2),
bind(2),
connect(2),
getpeername(2),
getsockname(2),
getsockopt(2),
ioctl(2),
listen(2),
read(2),
recv(2),
select(2),
send(2),
shutdown(2),
socketpair(2),
write(2),
getprotoent(3),
protocols(4).

=head1 HISTORY

B<OSSP sa> was invented in August 2001 by Ralf S. Engelschall
E<lt>rse@engelschall.comE<gt> under contract with Cable & Wireless
Germany E<lt>http://www.cw.com/deE<gt> for use inside the OSSP project.
Its creation was prompted by the requirement to implement an SMTP
logging channel for B<OSSP l2> (logging library). Its initial code was
derived from a predecessor sub-library originally written for socket
address abstraction inside B<OSSP lmtp2nntp>.

=head1 AUTHOR

 Ralf S. Engelschall
 rse@engelschall.com
 www.engelschall.com

=cut