_ ___ ____ ____ ____ _ ____ |_|_ _ / _ \/ ___/ ___|| _ \ | |___ \ _|_||_| | | | \___ \___ \| |_) | | | __) | |_||_|_| | |_| |___) |__) | __/ | |/ __/ |_|_|_| \___/|____/____/|_| |_|_____| OSSP l2 -- Flexible Logging TODO ==== 20030611 thl Fix Problem where l2 is not thread safe when multiple threads use the same env as is true with with fsl. http://cvs.ossp.org/tktview?tn=24 20030324 thl The sanity checking in the l2_ch_syslog.c:hook_configure() implementation causes an artifical sensibility to the order of parameters when a l2spec is used. A possible solution could be to relocate the sanity checks into hook_open(). The following two lines show simple examples, the first one works, the second doesn't. $ echo "hello" | l2tool 'syslog(facility=mail, ident=info, remotehost="127.0.0.1", remoteport="514", target="remote")' #works $ echo "hello" | l2tool 'syslog(facility=mail, ident=info, target="remote", remotehost="127.0.0.1", remoteport="514")' #error 20030127 thl Fixed a bug in l2_channel.c:l2_channel_destroy() where chD was passing a pointer to l2_channel_downstream() to find a sibling of a previously free(3)d node. This caused a bus error on FreeBSD5 where free(3) filled the free area with nonzero data. All other implementations seem to keep the data and l2_channel_downstream() traversed through still valid pointers in unmalloc(3)ed areas. This caused openssh using fsl v1.0.6 to fail on FreeBSD5. The fix was implemented in l2_channel_destroy() through look-ahead. This is once again a basic iteration problem which can catch us in other places as well. This needs to be verified. Structure of channels and documentation [thl]. It should be possible to drag the documentation out of a channel's source code. Everything else is error prone and a documentation nightmare. Currently, most (noop and null doesn't) channels have a code segment that looks like /* feed and call generic parameter parsing engine */ L2_PARAM_SET(pa[0], size, INT, &cfg->bufsize); L2_PARAM_SET(pa[1], interval, INT, &cfg->bufinterval); L2_PARAM_SET(pa[2], levelflush, INT, &cfg->levelflush); L2_PARAM_END(pa[3]); l2_channel_env(ch, &env); I recommend the L2_PARAM_SET macro should be modified to - set the default value of each paramater, if any exists - force the internal structure name to follow the revealed name - declare a channel to be a filter or output - declare parameters optional or mandatory Also i recommend an organizational enforcement of putting the channel description before such a code segment and a short description (i.e. unit, 0=deactivate) of each parameter/value just behind the macro. This could change the example above to look something like: /* * The buffer channel buffers messages poured in from upper channels. * It flushes the messages down to the lower channels when the buffered * space exceeds the buffer size, when a given time interval is reached * (0 disables this feature) or when a newly arrived message has a * level that matches the levelflush mask. * /* feed and call generic parameter parsing engine */ L2_PARAM_SET(pa[0], size, INT, 4096); /* o: bytes */ L2_PARAM_SET(pa[1], interval, INT, 0 ); /* o: sec, 0=disable */ L2_PARAM_SET(pa[2], levelflush, INT, 0 ); /* o: levelmask */ L2_PARAM_END(pa[3]); l2_channel_env(ch, &env); /**/ Neh idea as thl discussed with rse a few minutes ago - l2 should calculate the global logging mask automatically based on the OR'ing the masks of all output channels. This should increase performance as useless requests are captured at an early stage. - l2 should provide a function to set the logging masks including the global logging mask manually. This would require a mechanism to re-calculate the maks at any time when manually set to 'auto'. - l2 should provide a function to get the logging masks including the global logging mask. This could be used by the caller to check if it is worth calling the actual log and allow him to bypass wholly code blocks which exist for logging purposes only. Next steps: - libl2syslog: OpenPKG fakesyslog is nasty, because it doesn't provide logging to multiple files or filter out some messages. Additionally the application has to be restarted in order to reopen the logfile which is nasty for MTAs like Postfix in case of very high loads (because they start again processing the queue from scratch). What we need is a new L2-based libl2syslog which maps from syslog(3) API to l2(3) API. We later can add the reopen feature in L2 or sends the messages via Unix Domain socket to an L2 daemon which in turn logs to targets via l2tool, etc. - signal and process handling (l2_env*) - asynchronous channel (l2_ch_async.c) - manual page (l2.pod) New Channels ---------------------------------------------- - l2_ch_rotfile for a cronolog-style file writing, i.e., it changes the file according to a strftime-based filename expansion. - l2_ch_bofh does something cool but is undocumented and has its source mangled/scrambled ;) - l2_ch_asynch spawns a new process and communicates with the stemming channels via shared memory. Might be the way to implement autoflush as well, or rather with its own shared memory and sub-process. - l2_ch_smart is a smart debug buffer channel which flushes only if errors occur. - l2_ch_repeat prints 'last message repeated 100 times.' - l2_ch_action is similar to pipe channel, but executes a given command conditionally of the incoming priority or a regex. -> Might be implemented by combining a pcre, filter, and pipe (ms) - l2_ch_proxy sniffs a tcp connection and dumps its traffic to standard out or to the next channel, allowing for parsing through an unknown protocol. - l2_ch_nntp for logging messages to NNTP (news) service. support post and feed modes - l2_ch_snmp logs a stream to an SNMP listener. - ls_ch_trigger pulls the trigger to flush downstream if a certain critera is met - l2_ch_ring keeps messages in a ring buffer until someone pulls the trigger (backtrace) Existing Channels ----------------------------------------- - l2_ch_socket bind and udp parameters need work. (RSE: why and what?) - l2_ch_socket consider taking a fd at configuration. - l2_ch_pcre is too large and might benefit from a smaller set of pattern matching logic. - l2_ch_syslog should inherit socket logic to directly implement syslog communication, allowing for remote syslog operations. - l2_ch_syslog should be trimmed to necessary functionality, not necessarily everything that the system syslog(3) implements. - l2_ch_irc has a problem pinging its host to stay alive. - l2_ch_prefix produces implicitly two log messages if a following buffer channel would not accumulate them. - l2_ch_prefix should have printf style without strftime. - l2_ch_prefix should apply __FILE__, __LINE__, (__FUNCTION__) as options in case we are building in custom striptease mode. - l2_ch_file should optionally support file locking via fcntl(3). (RSE: why? Unix guarranties that up to 512 bytes one can write atomically) - l2_ch_pipe needs an overhaul once l2_ch_async is implemented. Also needs a review for dangling descriptors. During configure only checks existance of command in non-shell mode. Remove hard coded 256 in exec arguments. - l2_ch_pipe should use a standardized url like prg:/path/to/program. - l2_ch_pipe might consider taking a fd at configuration. - l2_ch_buffer buffer is duplicated when the process forks. This means the buffer has to be flushed in advance or the content is dumped twice. If the buffer would remember the pid of the last writer, it could discard the contents of the buffer when the pid changes. This is because the parent retains the pid and the buffer content while the child changes the pid and discards the content. Should be an optionial feature. -> Can be immediately implemented (rse) - l2_ch_buffer needs a L2_OK and L2_OK_PASS consistency check - l2_ch_buffer should not use different param types in alarm and setitimer(2). - l2_ch_buffer should incorporate multiplexer logic to properly flush when multiple buffers are in use. Can be implemented via a environment level timer and a lowest common denominator alarm function. For example, tmr1 9secs; tmr2 6secs; envtmr expires every lcd(9,6) = 3secs - l2_ch_buffer should not implement an autoflush, rather L2 should have a multiplexed timer object to serve timer requests from incoming callbacks. Library-wide changes -------------------------------------- - Add nonreentrant log method like mm. - Add striptease target to Makefile. - l2tool needs an [addr:]port option - API access through the LogManager logm; - Jump to a callback in case of error, throw/exit. - Offer the option to reopen the logfile on each write. - Optimize log operations from log(..) {} to log(....); - Implicit level decision should default to >=, then follow with <= = ; * - '\r\n' Uebersetzung Bei Input \r und \n wegstrippen, und von jedem Output Channel entweder \r\n (smtp) oder nur \n wieder anhaengen lassen. Oder als Kanaloption, wobei \r, \r\n, \n, gleich String, oder gar nix. - Spec-facility To debug an application, sometimes it's overkill to log everything at DEBUG level. I see an improvement when an additional facility can be specified. Example: DEBUG/LMTP but don't care about NNTP in the lmtp2nntp program. Possibly could be implemented as a second mask. -> Needs more consideration before implementation should start (rse) - Prelogging L2 should log even before it is initialized. Maybe the log function should buffer everything as long as a NULL l2-context is passed and if ever a non-NULL context is passed every remembered message should be logged afterwards or if destroy/flush is executed with a NULL- context it should print the buffered stuff to stderr. -> Can be immediately implemented, but one has to be carefully here. I want to think about this a little bit more in-depth. (rse) - Config file Might be used with the previous prelogging principle in mind. Namely, a file that describes a set of channel specs for one or many configurations. Path is searched in the following order: 1. /etc/liblog.conf 2. (in ., .., ../..) 3. $HOME/.liblog.conf - hook_write methods should receive a null-termined string instead of buf+size, because some channels like syslog otherwise have to rebuffer the message and append the null terminator character. Spec-parsing ---------------------------------------------- - implement location tracking - target=remote has to be currently after host=xxx because of single-configuration procedure - bugfix l2tool and after it works 100% remove l2_test in favor of l2tool and add instead a test shell script which calls l2tool with various input specifications. Channel-Only Revamping ------------------------------------ - l2_objects update - syscall override ala OSSP SA in l2_env_t - perhaps rename l2_env to l2_ctx and l2_channel_ to just l2_ - API cleanup for open semantics Documentation --------------------------------------------- - l2_ch_buffer How the buffer object behaves in relation to up/downstream channels. When does it pass its data to the next channel, when does it erase, what happens to its data when it is over written or flushed. - l2_ch_syslog Describe the options, and mention that more info is found in the man page for syslog(3). Identify features that change from one system to the next. - l2 errors Describe how an l2 channel reports errors when it fails during an operation, and how a user should interpret the error message? Describe whether a channel continues passing data on downstream if it somehow fails, and if the data will be corrupt. - l2_util_fmt_string This can be used like %s, but instead of fetching only a "char *" from the var-args stack, it fetches a "char *" plus a "size_t" and this way allows one to log only a sub-string of a larger string without the need for any temporary buffers, etc. - l2_util_fmt_dump: This can be used as "%{type}X" for dumping arbitrary octets. The parameter "type" can be either "text" (the default if only "%X" is used) for dumping the octets as text but with non-printable characters replaced by "\xXX" constructs; "hex" for dumping the octets in hexadecimal as "XX:XX:XX:XX" or "base64" for dumping the octets Base64 encoded. All three are intended for making it easier to produce reasonable L2_LEVEL_DEBUG messages without having to fiddle around with temporary buffers and having to care with non-printable characters. - formatter example l2_stream_formatter(st, 'D', l2_util_fmt_dump, NULL); : l2_stream_vlog(st, L2_LEVEL_DEBUG, "%{text}D %{hex}D %{base64}D\n", "foo", 12345, "foo\1bar", 7, "foo\1bar", 7, "foo\1bar", 7); : ...produces "foo\x01bar 66:6f:6f:01:62:61:72 Zm9vAWJhcg==" High-level configuration interface ================================== Config-File (OSSP): logging { channel 0 null ALL; channel 1 prefix CUSTOM1 { prefix="%Y-%m-%d/%H:%M:%S" }; channel 2 buffer { size=8k }; channel 3 file { path=/path/to/ossp.stat.log; mode=0644 }; channel 4 prefix wmask=DEBUG { prefix="%Y-%m-%d/%H:%M:%S %L" }; channel 5 buffer { size=4k }; channel 6 file { path=/path/to/ossp.error.log; mode=0644 }; channel 7 syslog wmask=ERROR { peer=remote; host=loghost }; channel 8 filter { regex="(memory|foo)" }; channel 9 smtp { host=loghost; rcpt="rse@engelschall.com" }; stream { 0->(4,7); 4->(5,7); 5->6; 8->9; 1->2->3 }; log access 0; log stat 1; }; Alternative: log access { error: syslog; prefix -> { buffer(size=4k) -> file(path=/path/to/log, mode=0644); panic: smtp(host=en1, rcpt=rse@engelschall.com); } } Command-Line (lmtp2nntp): $ lmtp2nntp -l 'INFO:' { prefix(prefix="%Y-%m-%d/%H:%M:%S %L") ->{ buffer(size=4k) ->file(path=/path/to/log,mode=0644); smtp(host=en1,rcpt=rse@engelschall.com) }; error:syslog } Brainstorming --------------------------------------------- - Debugging can be implemented as a special case of logging - Tracing can be implemented as a special case of debugging *** *** *** *** *** *** Out of date *** *** *** *** *** *** *** *** *** *** *** *** Out of date *** *** *** *** *** *** *** *** *** *** *** *** Out of date *** *** *** *** *** *** Backend channels ------------------------------------------ - Level -> N Channels - file (append) - program (stdin) - syslog - stderr/stdout - null (discard, nicht nur /dev/null) - filedescriptor (escape/ext) - callback function Aufbau ---------------------------------------------------- Layer C++ API log.hpp, log.cpp Layer C API log.h log.c Layer C Backend backend.h backend.c - Ein Wort noch zu variablen Argumentlisten in cpp-Makros: gcc unterstützt dies in in der GNU- und der C99-Ausführung. Das heißt, der "..." Parameter kann im Makro respektive über "args" und über "__VA_ARGS__" angesprochen werden. Wichtig ist dabei, daß "..." nicht leer sein -- also kein Argument enthalten -- darf, da sonst der Präprocessor an einem eventuell vorhandenen Komma scheitert. Dies kann beim gcc durch Voranstellen von "##" vor dem "__VA_ARGS__" umgangen werden. Ouch. Beide Erweiterungen sind derzeit nicht aktiv, wenn mit -ansi compiliert wird. Explizit anschalten läßt sich die standard-konforme Erweiterung über "-std=c9x", bzw. "-std=c99" bei neueren gccs. Log Messages: - raw - optional prefixes (inclusive of order) string facility level timestamp pid (tid) - errno (like syslog %m) - eigene %{foo}x mit callback function mit context - automatisch number -> string mapping (fuer error strings) - !debug -> !code - add support for prefixes of all non-static symbols to better support embedding