Names:
o SIO = simple/streams/stacked I/O
o FIO = fast I/O
o IOL = IO Lite (caution)
o LIO = Lite/Layered IO
o AIO = Abstracted I/O

Connectors/Plugs:
o Memory
  SIO_PLUG *sio_plug_mem(uchar *buf, ulong len, uchar *(cb)(ulong));
  - connect to preallocated buffer + callback function for more buffer allocs
  - connect internally to auto-growing buffer(s)
o File
  constructors:
  - from fd
  - from FILE
  - from pathname
o Socket
  - UDP
  - TCP
o FILE*-2-SIO
o fd-2-SIO
o file-2-SIO
o TCP/socket-2-SIO
o UDP/socket-2-SIO
o Pipe/FIFO-2-SIO
o Null Discard Plug
  SIO *sio_plug_null(void)

Filters:
o Transparent; sio_pipe_trans
o Regex-Matching-Filter
o Regex-Subst-Filter
o MD5/SHA1/DES/IDEA
o Zlib/LZO
o SSL

IO Models (according to Stevens: Unix Network programming I, p. 144)
o blocking I/O
o nonblocking I/O (NO_HANG, NONBLK)
o I/O multiplexing (select and poll)
o signal driven I/O (SIGIO)
o asynchronous I/O (POSIX aio_xxx)

Data Structures
o Rings internal
o malloc(3) and mm_malloc(3) aware

Buffering-Modes:
o flush after timeout, close and exit() + explicit via flush()
o no buffering at all
o Chunking for HTTP Support!
o only read is buffered
o only write is buffered
o read & write is buffered

Buffering:
o filters can set the window/buffer size
o filters declare whether they shrink, expand or preserve the data size,
  or whether the size change is random
o the source connector may be able to use the user's buffer directly
o conversely, the user's buffer may also be used by the target connector
o on write, the application could also indicate that the buffer may be
  destroyed and that it may be larger than the data written from it
  -> useful for expanding filters on output.

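
The last Buffering item (a destroyable write buffer that is larger than its
data) can be illustrated with a small sketch. It assumes a hypothetical
expanding filter that turns LF into CRLF directly inside the caller's buffer,
using the spare capacity instead of allocating a second buffer. The function
name toy_expand_lf_inplace and its parameters are made up for illustration;
nothing here is part of an existing SIO API.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical expanding filter (illustration only): rewrite every '\n'
 * in buf[0..len) as "\r\n" in place. The caller grants the whole buffer
 * of cap bytes and accepts that its contents are destroyed.
 * Naive on purpose: an already present "\r\n" gets a second '\r'.
 * Returns 0 and stores the new length in *outlen, or -1 if the spare
 * capacity is too small (the buffer is left untouched in that case).
 */
int toy_expand_lf_inplace(char *buf, size_t len, size_t cap, size_t *outlen)
{
    size_t extra = 0, i, src, dst;

    assert(buf != NULL && outlen != NULL && len <= cap);

    /* first pass: count how many bytes the expansion adds */
    for (i = 0; i < len; i++)
        if (buf[i] == '\n')
            extra++;
    if (len + extra > cap)
        return -1;

    /* second pass: copy backwards so unread input is never overwritten */
    src = len;
    dst = len + extra;
    while (src > 0) {
        char c = buf[--src];
        buf[--dst] = c;
        if (c == '\n')
            buf[--dst] = '\r';
    }
    *outlen = len + extra;
    return 0;
}
```

A caller would pass a buffer whose capacity exceeds the data length, for
example an 8 byte array holding the 4 bytes "a\nb\n"; when the call returns
-1 the filter has to fall back to a separately allocated output buffer.
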
Optimizations:
o mmap must be possible
o if target connector = socket & source connector = file
  => try to use sendfile(2)
o if target connector = file & source connector = file
  => try to use mmap(2)
o if I/O runs without filters, then no buffering, or buffering only if
  chunks are too small
o use writev() and readv() whenever possible to couple buffers during I/O
o sockets have a low water mark for read/write() (see setsockopt)
o FreeBSD 4.0's accept_filter(9) mechanism to check for HTTP request

NOTES:
o timeouts must be supported in general
o error handling optional with callback function + void
o callback functions for exceptional things
o the socket connector must allow e.g. shutdown()
o one should be able to query and set meta-data (mtime, inode, etc.)
  of an SIO connector
o chunking must be possible (HTTP?)
o read should read until:
  - number of bytes read
  - found a char of class [a-z]
  - found a string "foo"
  - found a regex "..."
o one should be able to seek() on every SIO
o should it be possible to convert SIOs into FILEs???
o for socket plugs, shutdown() of only one side must be possible

Architecture:
o three types of objects (see SIO.fig):
  - plugs (connectors) for socket, file, mem, etc.
  - pipes (connections)
  - pipelines (the combination of plugs and pipes into one unit)
o buffered I/O: there is a buffer-pipe, that is all;
  unbuffered I/O is thus a pipeline that contains no buffer-pipe

Conversions:
- ASCII2EBCDIC and vice versa conversions

o SIO Disciplines for Virtual Filesystem Stuff
  URLs as pathname
  open various network things, tarballs, etc.
  file, http, ftp, extfs
  ?? vfs, libfetch, libcurl ??

Performance Gains:
- use sendfile()
- use TCP_CORK
- use ...

Filter classes (from Apache discussions):
1) content-generator (files, CGI, database, etc)
2) content-filter/munger/processor (SSI, PHP, etc)
3) content-encoding (gzip?)
4) digest/message processor?
   (mod_auth_digest)
5) transport-encoding (HTTP chunking)
6) socket-encoding (SSL)

For Zero-Copy:
The data the user program writes must be page sized and start on a page
boundary in order for it to be run through the zero copy send code.

Ideas:
http://sourceforge.net/projects/paip

-----------------

API Functions:

sio_rc_t sio_attach    (sio_t **sio, int fd);
sio_rc_t sio_deattach  (sio_t *sio, int *fd);
sio_rc_t sio_setbuffer (sio_t *sio, size_t newsize, size_t *oldsize);
sio_rc_t sio_readvec   (sio_t *sio, void **vec, size_t *veclen);
sio_rc_t sio_read      (sio_t *sio, void *buf, size_t *buflen);
sio_rc_t sio_readline  (sio_t *sio, void *buf, size_t *buflen);
sio_rc_t sio_readchar  (sio_t *sio, char *c);
sio_rc_t sio_putback   (sio_t *sio, void *buf, size_t buflen);
sio_rc_t sio_undo      (sio_t *sio);
sio_rc_t sio_writevec  (sio_t *sio, void **vec, size_t veclen);
sio_rc_t sio_write     (sio_t *sio, void *buf, size_t buflen);
sio_rc_t sio_writestr  (sio_t *sio, char *str);
sio_rc_t sio_writeline (sio_t *sio, void *buf, size_t buflen);
sio_rc_t sio_writechar (sio_t *sio, char c);
sio_rc_t sio_print     (sio_t *sio, char *fmt, ...);
sio_rc_t sio_printv    (sio_t *sio, char *fmt, va_list ap);
sio_rc_t sio_flush     (sio_t *sio);
sio_rc_t sio_error     (sio_t *sio, char **error);

sio_rc_t sio_read      (sio_t *sio, sio_ioflags_t flags, ...);
sio_rc_t sio_write     (sio_t *sio, sio_ioflags_t flags, ...);

SIO_CHR | SIO_STR | SIO_BUF | SIO_FMT | SIO_STRVEC | SIO_BUFVEC
              type of objects: character, NUL-terminated string,
              buffer+size, or format string based
SIO_MULT      multiple objects are passed in call
SIO_NULLEND   whether size of vector is indicated by NULL or given
SIO_COPY      objects are copied to library
              (internally it is SIO_GIFT after the copy!)
SIO_GIFT      objects are gifted to library
SIO_LOAN      objects are just loaned to library

sio_read(sio, SIO_BUF, buf, buflen, &readlen);
sio_read(sio, SIO_LINE, buf, buflen, &readlen);

sio_write(sio, SIO_STR, "foo");
sio_write(sio, SIO_STR|SIO_MULT, "foo", "bar", NULL);
sio_write(sio, SIO_VEC|SIO_STR, vec, veclen);
sio_write(sio, SIO_VEC|SIO_STR|SIO_NULLEND, vec);
sio_write(sio, SIO_BUF, buf, buflen);
sio_write(sio, SIO_BUF|SIO_MULT, buf, buflen, buf2, buflen2, NULL);
sio_write(sio, SIO_VEC|SIO_BUF, vec, veclen);
sio_write(sio, SIO_VEC|SIO_BUF|SIO_NULLEND, vec);
sio_write(sio, SIO_STR|SIO_MULT, line, "\r\n", NULL);
sio_write(sio, SIO_CHR, c);
sio_write(sio, SIO_BUF, &c, 1);
sio_write(sio, SIO_FMT, "%c%s%S%b%B", c, cp, cpvec, cpvecsize,
          buf, bufsize, bufvec, bufvecsize);

API Comfort:
    sio_writestr("..") -> sio_write(SIO_STR, "..");
API Standard:
    sio_write(SIO_STR, char *x) -> sio_output(x, strlen(x));
    sio_write(SIO_VEC, char *x) -> for... sio_output(x.ptr, x.len); done
API Basic:
    sio_output

1. What about seekable fds (files!)?
   seek, tell,
2. Top-level filtering and chaining?
   tie, untie

-----------------

IDEA:
- SIO (Socket IO)
- BIO (Buffered/Filtered I/O)
- BA/BB (Buffer Aggregates, Bucket Brigades -- ACT)

-----------------

brigade ::= bucket *
bucket  ::=

-----------------

/* SFIO: sfreserve?
 *       sfpool ?
 */

/*
 * Data structures
 */

/* the general SIO API */
typedef struct sio_st sio_t;

/* I/O vector entry (similar to POSIX struct iovec) for sio_{read,write}v() */
typedef struct sio_iovec_st {
    char   *iov_base;  /* Base address. */
    size_t  iov_len;   /* Length.
                        */
} sio_iovec_t;

typedef long sio_off_t;

typedef sio_uint8_t;
typedef sio_uint16_t;
typedef sio_uint32_t;

#define SIO_SEEK_SET
#define SIO_SEEK_CUR
#define SIO_SEEK_END

/*
 * Values
 */

#define SIO_EOF (-1)

/*
 * Stream Disciplines
 */

sio_disc_t *sio_disc_null   (void);
sio_disc_t *sio_disc_anon   (void);
sio_disc_t *sio_disc_fd     (int fd, ...);
sio_disc_t *sio_disc_socket (int fd, int type /*tcp,udp*/, ...);
sio_disc_t *sio_disc_pipe   (int fd, ...);
sio_disc_t *sio_disc_file   (FILE *fp, ...);
sio_disc_t *sio_disc_url    (const char *url, ...);

/*
 * Stream Handling
 */

sio_t      *sio_new         (sio_disc_t *disc);
sio_t      *sio_dup         (sio_t *sio);
int         sio_free        (sio_t *sio);

/*
 * I/O Operations
 */

sio_size_t  sio_read        (sio_t *sio, void *buf, size_t bytes);
sio_size_t  sio_write       (sio_t *sio, void *buf, size_t bytes);
sio_size_t  sio_readv       (sio_t *sio, const sio_iovec_t *iov, int iovcnt);
sio_size_t  sio_writev      (sio_t *sio, const sio_iovec_t *iov, int iovcnt);
sio_off_t   sio_seek        (sio_t *sio, sio_off_t offset, int type);
sio_size_t  sio_move        (sio_t *siow, sio_t *sior, int n, int rsc);
int         sio_getc        (sio_t *sio);
int         sio_putc        (sio_t *sio, int c);
int         sio_nputc       (sio_t *sio, int c, sio_size_t n);
int         sio_ungetc      (sio_t *sio, int c);
char       *sio_getr        (sio_t *sio, int rsc, int type);
sio_size_t *sio_putr        (sio_t *sio, const char *str, int rsc);

/*
 * Data Formatting
 */

sio_size_t  sio_printf      (sio_t *sio, const char *fmt, ...);
sio_size_t  sio_vprintf     (sio_t *sio, const char *fmt, va_list ap);
sio_size_t  sio_scanf       (sio_t *sio, const char *fmt, ...);
sio_size_t  sio_vscanf      (sio_t *sio, const char *fmt, va_list ap);

/*
 * Buffering & Synchronization
 */

int         sio_sync        (sio_t *sio);
int         sio_purge       (sio_t *sio);
int         sio_poll        (sio_poll_t *pl, sio_size_t pn, sio_time_t timeout);
int         sio_eof         (sio_t *sio);

/*
 * Stream Control
 */

long        sio_ctrl        (sio_t *sio, int cmd, ...);
int         sio_top         (sio_t *sio);
int         sio_push        (sio_t *sio, sio_t *top);
int         sio_pop         (sio_t *sio);
int         sio_swap        (sio_t *sio1, sio_t *sio2);

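
The Stream Control prototypes do not say what pushing and popping mean in
detail. One possible reading, sketched here purely as an assumption, is a
stack of layers in which a written byte enters at the topmost layer and is
handed down to the bottom. All toy_* names are hypothetical, and each layer
is reduced to a per-byte transform to keep the example short.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define TOY_MAX_LAYERS 8

/* A layer transforms one byte (illustration only, not a real sio_t). */
typedef int (*toy_layer_fn)(int c);

typedef struct {
    toy_layer_fn layer[TOY_MAX_LAYERS];  /* layer[depth-1] is the top */
    size_t depth;
} toy_stack_t;

void toy_init(toy_stack_t *s)
{
    assert(s != NULL);
    s->depth = 0;
}

/* Put a new layer on top; returns -1 when the stack is full. */
int toy_push(toy_stack_t *s, toy_layer_fn fn)
{
    assert(s != NULL && fn != NULL);
    if (s->depth == TOY_MAX_LAYERS)
        return -1;
    s->layer[s->depth++] = fn;
    return 0;
}

/* Remove the topmost layer; returns -1 when the stack is empty. */
int toy_pop(toy_stack_t *s)
{
    assert(s != NULL);
    if (s->depth == 0)
        return -1;
    s->depth--;
    return 0;
}

/* Pass one byte from the topmost layer down to the bottom layer. */
int toy_putc(const toy_stack_t *s, int c)
{
    size_t i;

    assert(s != NULL);
    for (i = s->depth; i > 0; i--)
        c = s->layer[i - 1](c);
    return c;
}

/* Two sample layers. */
int toy_upper(int c)
{
    return toupper((unsigned char)c);
}

int toy_rot13(int c)
{
    if (c >= 'a' && c <= 'z')
        return 'a' + (c - 'a' + 13) % 26;
    if (c >= 'A' && c <= 'Z')
        return 'A' + (c - 'A' + 13) % 26;
    return c;
}
```

With toy_rot13 pushed first and toy_upper pushed on top, a written byte is
uppercased before it is rotated; popping once leaves only the rotation. A
real implementation would stack whole streams with their buffers rather
than byte functions.
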
###########################################################################

mlelstv sez:

sio pipe:

- two chains of handlers for reading and writing
- "bucket brigade"-buffers, degenerates to static buffer(ptr,len)
- handler degenerates to (handle,read(2),write(2))

? dispatching:

- sio_strategy(chain)

    h = chain;
    while (h) {
        if h->call(sio)     // reader or writer
            h = h->next;
        else
            h = h->prev;
    }

? eof/error propagation -> global flags, (byte offset ?)

- sio_write(buf)

    push sio->writebuffer, buf;
    sio_strategy(sio->writers);
    buf = pop sio->writebuffer;

- sio_read(bound)

    sio->readbuffer->append(buf);
    sio_strategy(sio->readers);
    pop sio->readbuffer;

buffer:

- bytestream(policy for chunks ?)
- drop bytestream
- append bytes to bytestream (ptr, len) == copy
- prepend bytes to bytestream (ptr, len) == copy
- append static buffer to bytestream
- splice bytestream into new bytestream (offset, length, dest)
- truncate bytestream (offset, length) == splice but drop data
- traverse chunks of bytestream forward (offset, length)
- traverse chunks of bytestream reverse (offset, length)
- splice last chunk from bytestream (convenience function)
- splice first chunk from bytestream (convenience function)
- flatten chunks of bytestream into target buffer (convenience function)
- garbage collection ? pullup ? seek cache ?
- read consistency while traversing ?

bytestream, chunks, buffer
chunk memory allocator needed
buffer memory allocator ?

---

stdio "emulation"
fixed buffer <-> bytestream
EOF handling

###########################################################################

mlelstv sez:

ideas for error/eof/urgent-data handling

-> multiple assembly lines per stream
   - too big ?
   - synchronization between "bands" required

-> "urgent" pointer similar to TCP ?
   + trivial implementation
   - only single condition for multiple causes

-> in-band multiplexing of labeled streams
   + synchronization is implicit
   + any number of conditions
   - needs support in AL (coalescing/merging)
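
The sio_strategy(chain) dispatch loop from the "dispatching" notes above can
be written out as compilable C. This is only a sketch under assumptions: the
toy_* types, the fields of toy_sio and the two sample handlers are invented
for illustration. A handler that returns nonzero hands control to the next
handler in the chain; one that returns zero sends control back to the
previous handler. The loop ends when it walks off either end of the chain.

```c
#include <assert.h>
#include <stddef.h>

typedef struct toy_sio     toy_sio_t;
typedef struct toy_handler toy_handler_t;

struct toy_handler {
    int (*call)(toy_sio_t *sio);    /* nonzero: go on, zero: go back */
    toy_handler_t *next;
    toy_handler_t *prev;
};

/* Hypothetical stream state shared by the handlers. */
struct toy_sio {
    size_t pending;     /* bytes waiting between the two handlers */
    size_t produced;    /* number of producer calls */
    size_t batches;     /* number of 3-byte batches passed on */
};

/* The dispatch loop from the notes. */
void toy_strategy(toy_handler_t *chain, toy_sio_t *sio)
{
    toy_handler_t *h = chain;

    assert(sio != NULL);
    while (h != NULL)
        h = h->call(sio) ? h->next : h->prev;
}

/* Sample handler: makes one more byte available, then moves on. */
int toy_produce(toy_sio_t *sio)
{
    sio->pending++;
    sio->produced++;
    return 1;
}

/* Sample handler: needs 3 bytes, otherwise sends control back. */
int toy_batch3(toy_sio_t *sio)
{
    if (sio->pending < 3)
        return 0;
    sio->pending -= 3;
    sio->batches++;
    return 1;
}

/* Build the two-element chain producer <-> batcher and run it once. */
void toy_run(toy_sio_t *sio)
{
    toy_handler_t a, b;

    a.call = toy_produce; a.prev = NULL; a.next = &b;
    b.call = toy_batch3;  b.prev = &a;   b.next = NULL;
    sio->pending = 0;
    sio->produced = 0;
    sio->batches = 0;
    toy_strategy(&a, sio);
}
```

In this run the batcher bounces control back twice, so the producer is
called three times before one batch leaves the chain. If the first handler
of a chain returns zero, the loop ends at once, which a caller could treat
as "more input needed".
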