ossp-pkg/l2/TODO
_ ___ ____ ____ ____ _ ____
|_|_ _ / _ \/ ___/ ___|| _ \ | |___ \
_|_||_| | | | \___ \___ \| |_) | | | __) |
|_||_|_| | |_| |___) |__) | __/ | |/ __/
|_|_|_| \___/|____/____/|_| |_|_____|
OSSP l2 -- Flexible Logging
TODO
====
20030611 thl
Fix Problem where l2 is not thread safe when multiple
threads use the same env as is true with with fsl.
http://cvs.ossp.org/tktview?tn=24
20030324 thl
The sanity checking in the l2_ch_syslog.c:hook_configure()
implementation causes an artifical sensibility to the order of
parameters when a l2spec is used. A possible solution could be to
relocate the sanity checks into hook_open(). The following two
lines show simple examples, the first one works, the second doesn't.
$ echo "hello" | l2tool 'syslog(facility=mail, ident=info, remotehost="127.0.0.1", remoteport="514", target="remote")' #works
$ echo "hello" | l2tool 'syslog(facility=mail, ident=info, target="remote", remotehost="127.0.0.1", remoteport="514")' #error
20030127 thl
Fixed a bug in l2_channel.c:l2_channel_destroy() where chD was
passing a pointer to l2_channel_downstream() to find a sibling of a
previously free(3)d node. This caused a bus error on FreeBSD5 where
free(3) filled the free area with nonzero data. All other
implementations seem to keep the data and l2_channel_downstream()
traversed through still valid pointers in unmalloc(3)ed areas. This
caused openssh using fsl v1.0.6 to fail on FreeBSD5. The fix was
implemented in l2_channel_destroy() through look-ahead. This is once
again a basic iteration problem which can catch us in other places
as well. This needs to be verified.
Structure of channels and documentation [thl]. It should be possible to
drag the documentation out of a channel's source code. Everything else
is error prone and a documentation nightmare. Currently, most (noop and
null doesn't) channels have a code segment that looks like
/* feed and call generic parameter parsing engine */
L2_PARAM_SET(pa[0], size, INT, &cfg->bufsize);
L2_PARAM_SET(pa[1], interval, INT, &cfg->bufinterval);
L2_PARAM_SET(pa[2], levelflush, INT, &cfg->levelflush);
L2_PARAM_END(pa[3]);
l2_channel_env(ch, &env);
I recommend the L2_PARAM_SET macro should be modified to
- set the default value of each paramater, if any exists
- force the internal structure name to follow the revealed name
- declare a channel to be a filter or output
- declare parameters optional or mandatory
Also i recommend an organizational enforcement of putting the
channel description before such a code segment and a short description
(i.e. unit, 0=deactivate) of each parameter/value just behind the macro.
This could change the example above to look something like:
/*<man description=channel, type=filter>
* The buffer channel buffers messages poured in from upper channels.
* It flushes the messages down to the lower channels when the buffered
* space exceeds the buffer size, when a given time interval is reached
* (0 disables this feature) or when a newly arrived message has a
* level that matches the levelflush mask.
*
/* feed and call generic parameter parsing engine */
L2_PARAM_SET(pa[0], size, INT, 4096); /* o: bytes */
L2_PARAM_SET(pa[1], interval, INT, 0 ); /* o: sec, 0=disable */
L2_PARAM_SET(pa[2], levelflush, INT, 0 ); /* o: levelmask */
L2_PARAM_END(pa[3]);
l2_channel_env(ch, &env);
/*</man>*/
Neh idea as thl discussed with rse a few minutes ago
- l2 should calculate the global logging mask automatically based on
the OR'ing the masks of all output channels. This should increase
performance as useless requests are captured at an early stage.
- l2 should provide a function to set the logging masks including
the global logging mask manually. This would require a mechanism
to re-calculate the maks at any time when manually set to 'auto'.
- l2 should provide a function to get the logging masks including
the global logging mask. This could be used by the caller to
check if it is worth calling the actual log and allow him to
bypass wholly code blocks which exist for logging purposes only.
Next steps:
- libl2syslog:
OpenPKG fakesyslog is nasty, because it doesn't provide logging
to multiple files or filter out some messages. Additionally the
application has to be restarted in order to reopen the logfile which
is nasty for MTAs like Postfix in case of very high loads (because
they start again processing the queue from scratch). What we need is a
new L2-based libl2syslog which maps from syslog(3) API to l2(3) API.
We later can add the reopen feature in L2 or sends the messages via
Unix Domain socket to an L2 daemon which in turn logs to targets via
l2tool, etc.
- signal and process handling (l2_env*)
- asynchronous channel (l2_ch_async.c)
- manual page (l2.pod)
New Channels ----------------------------------------------
- l2_ch_rotfile for a cronolog-style file writing, i.e., it changes the
file according to a strftime-based filename expansion.
- l2_ch_bofh does something cool but is undocumented and has its
source mangled/scrambled ;)
- l2_ch_asynch spawns a new process and communicates with
the stemming channels via shared memory. Might be the
way to implement autoflush as well, or rather with its
own shared memory and sub-process.
- l2_ch_smart is a smart debug buffer channel which flushes
only if errors occur.
- l2_ch_repeat prints 'last message repeated 100 times.'
- l2_ch_action is similar to pipe channel, but executes a given
command conditionally of the incoming priority or a regex.
-> Might be implemented by combining a pcre, filter, and pipe (ms)
- l2_ch_proxy sniffs a tcp connection and dumps its traffic to
standard out or to the next channel, allowing for parsing
through an unknown protocol.
- l2_ch_nntp for logging messages to NNTP (news) service.
support post and feed modes
- l2_ch_snmp logs a stream to an SNMP listener.
- ls_ch_trigger pulls the trigger to flush downstream if a certain critera is met
- l2_ch_ring keeps messages in a ring buffer until someone pulls the trigger (backtrace)
Existing Channels -----------------------------------------
- l2_ch_socket bind and udp parameters need work. (RSE: why and what?)
- l2_ch_socket consider taking a fd at configuration.
- l2_ch_pcre is too large and might benefit from a
smaller set of pattern matching logic.
- l2_ch_syslog should inherit socket logic to directly implement
syslog communication, allowing for remote syslog operations.
- l2_ch_syslog should be trimmed to necessary functionality, not
necessarily everything that the system syslog(3) implements.
- l2_ch_irc has a problem pinging its host to stay alive.
- l2_ch_prefix produces implicitly two log messages if a
following buffer channel would not accumulate them.
- l2_ch_prefix should have printf style without strftime.
- l2_ch_prefix should apply __FILE__, __LINE__, (__FUNCTION__)
as options in case we are building in custom striptease mode.
- l2_ch_file should optionally support file locking via fcntl(3).
(RSE: why? Unix guarranties that up to 512 bytes one can write atomically)
- l2_ch_pipe needs an overhaul once l2_ch_async is implemented.
Also needs a review for dangling descriptors. During configure
only checks existance of command in non-shell mode. Remove
hard coded 256 in exec arguments.
- l2_ch_pipe should use a standardized url like prg:/path/to/program.
- l2_ch_pipe might consider taking a fd at configuration.
- l2_ch_buffer buffer is duplicated when the process forks. This
means the buffer has to be flushed in advance or the content
is dumped twice. If the buffer would remember the pid of the
last writer, it could discard the contents of the buffer when
the pid changes. This is because the parent retains the pid
and the buffer content while the child changes the pid and
discards the content. Should be an optionial feature.
-> Can be immediately implemented (rse)
- l2_ch_buffer needs a L2_OK and L2_OK_PASS consistency check
- l2_ch_buffer should not use different param types in alarm and setitimer(2).
- l2_ch_buffer should incorporate multiplexer logic to properly flush
when multiple buffers are in use. Can be implemented via a environment
level timer and a lowest common denominator alarm function. For example,
tmr1 9secs; tmr2 6secs; envtmr expires every lcd(9,6) = 3secs
- l2_ch_buffer should not implement an autoflush, rather L2 should
have a multiplexed timer object to serve timer requests from
incoming callbacks.
Library-wide changes --------------------------------------
- Add nonreentrant log method like mm.
- Add striptease target to Makefile.
- l2tool needs an [addr:]port option
- API access through the LogManager logm;
- Jump to a callback in case of error, throw/exit.
- Offer the option to reopen the logfile on each write.
- Optimize log operations from log(..) {}
to log(....);
- Implicit level decision should default to >=,
then follow with <=
=
;
*
- '\r\n' Uebersetzung
Bei Input \r und \n wegstrippen, und von jedem Output Channel
entweder \r\n (smtp) oder nur \n wieder anhaengen lassen. Oder
als Kanaloption, wobei \r, \r\n, \n, gleich String, oder gar nix.
- Spec-facility
To debug an application, sometimes it's overkill to log everything
at DEBUG level. I see an improvement when an additional facility
can be specified. Example: DEBUG/LMTP but don't care about NNTP in
the lmtp2nntp program. Possibly could be implemented as a second mask.
-> Needs more consideration before implementation should start (rse)
- Prelogging
L2 should log even before it is initialized. Maybe the log function
should buffer everything as long as a NULL l2-context is passed and
if ever a non-NULL context is passed every remembered message should
be logged afterwards or if destroy/flush is executed with a NULL-
context it should print the buffered stuff to stderr.
-> Can be immediately implemented, but one has to be carefully here.
I want to think about this a little bit more in-depth. (rse)
- Config file
Might be used with the previous prelogging principle in mind.
Namely, a file that describes a set of channel specs for one
or many configurations. Path is searched in the following order:
1. /etc/liblog.conf
2. (in ., .., ../..)
3. $HOME/.liblog.conf
- hook_write methods should receive a null-termined string
instead of buf+size, because some channels like syslog
otherwise have to rebuffer the message and append the
null terminator character.
Spec-parsing ----------------------------------------------
- implement location tracking
- target=remote has to be currently after host=xxx because of
single-configuration procedure
- bugfix l2tool and after it works 100% remove l2_test in
favor of l2tool and add instead a test shell script
which calls l2tool with various input specifications.
Channel-Only Revamping ------------------------------------
- l2_objects update
- syscall override ala OSSP SA in l2_env_t
- perhaps rename l2_env to l2_ctx and l2_channel_ to just l2_
- API cleanup for open semantics
Documentation ---------------------------------------------
- l2_ch_buffer
How the buffer object behaves in relation to up/downstream
channels. When does it pass its data to the next channel,
when does it erase, what happens to its data when it is
over written or flushed.
- l2_ch_syslog
Describe the options, and mention that more info is found
in the man page for syslog(3). Identify features that
change from one system to the next.
- l2 errors
Describe how an l2 channel reports errors when it fails during
an operation, and how a user should interpret the error message?
Describe whether a channel continues passing data on downstream
if it somehow fails, and if the data will be corrupt.
- l2_util_fmt_string
This can be used like %s, but instead of fetching only a "char *"
from the var-args stack, it fetches a "char *" plus a "size_t" and
this way allows one to log only a sub-string of a larger string
without the need for any temporary buffers, etc.
- l2_util_fmt_dump:
This can be used as "%{type}X" for dumping arbitrary octets. The
parameter "type" can be either "text" (the default if only "%X"
is used) for dumping the octets as text but with non-printable
characters replaced by "\xXX" constructs; "hex" for dumping the
octets in hexadecimal as "XX:XX:XX:XX" or "base64" for dumping the
octets Base64 encoded. All three are intended for making it easier to
produce reasonable L2_LEVEL_DEBUG messages without having to fiddle
around with temporary buffers and having to care with non-printable
characters.
- formatter example
l2_stream_formatter(st, 'D', l2_util_fmt_dump, NULL);
:
l2_stream_vlog(st, L2_LEVEL_DEBUG, "%{text}D %{hex}D %{base64}D\n",
"foo", 12345, "foo\1bar", 7, "foo\1bar", 7, "foo\1bar", 7);
:
...produces "foo\x01bar 66:6f:6f:01:62:61:72 Zm9vAWJhcg=="
High-level configuration interface
==================================
Config-File (OSSP):
logging {
channel 0 null ALL;
channel 1 prefix CUSTOM1 { prefix="%Y-%m-%d/%H:%M:%S" };
channel 2 buffer { size=8k };
channel 3 file { path=/path/to/ossp.stat.log; mode=0644 };
channel 4 prefix wmask=DEBUG { prefix="%Y-%m-%d/%H:%M:%S %L" };
channel 5 buffer { size=4k };
channel 6 file { path=/path/to/ossp.error.log; mode=0644 };
channel 7 syslog wmask=ERROR { peer=remote; host=loghost };
channel 8 filter { regex="(memory|foo)" };
channel 9 smtp { host=loghost; rcpt="rse@engelschall.com" };
stream { 0->(4,7); 4->(5,7); 5->6; 8->9; 1->2->3 };
log access 0;
log stat 1;
};
Alternative:
log access {
error: syslog;
prefix -> {
buffer(size=4k) ->
file(path=/path/to/log, mode=0644);
panic: smtp(host=en1, rcpt=rse@engelschall.com);
}
}
Command-Line (lmtp2nntp):
$ lmtp2nntp -l 'INFO:'
{ prefix(prefix="%Y-%m-%d/%H:%M:%S %L")
->{ buffer(size=4k)
->file(path=/path/to/log,mode=0644);
smtp(host=en1,rcpt=rse@engelschall.com)
};
error:syslog
}
Brainstorming ---------------------------------------------
- Debugging can be implemented as a special case of logging
- Tracing can be implemented as a special case of debugging
*** *** *** *** *** *** Out of date *** *** *** *** *** ***
*** *** *** *** *** *** Out of date *** *** *** *** *** ***
*** *** *** *** *** *** Out of date *** *** *** *** *** ***
Backend channels ------------------------------------------
- Level -> N Channels
- file (append)
- program (stdin)
- syslog
- stderr/stdout
- null (discard, nicht nur /dev/null)
- filedescriptor (escape/ext)
- callback function
Aufbau ----------------------------------------------------
Layer C++ API log.hpp, log.cpp
Layer C API log.h log.c
Layer C Backend backend.h backend.c
- Ein Wort noch zu variablen Argumentlisten in cpp-Makros: gcc
unterstützt dies in in der GNU- und der C99-Ausführung. Das heißt,
der "..." Parameter kann im Makro respektive über "args" und über
"__VA_ARGS__" angesprochen werden. Wichtig ist dabei, daß "..."
nicht leer sein -- also kein Argument enthalten -- darf, da sonst
der Präprocessor an einem eventuell vorhandenen Komma scheitert.
Dies kann beim gcc durch Voranstellen von "##" vor dem "__VA_ARGS__"
umgangen werden. Ouch. Beide Erweiterungen sind derzeit nicht aktiv,
wenn mit -ansi compiliert wird. Explizit anschalten läßt sich die
standard-konforme Erweiterung über "-std=c9x", bzw. "-std=c99" bei
neueren gccs.
Log Messages:
- raw
- optional prefixes (inclusive of order)
string
facility
level
timestamp
pid
(tid)
- errno (like syslog %m)
- eigene %{foo}x mit callback function mit context
- automatisch number -> string mapping (fuer error strings)
- !debug -> !code
- add support for prefixes of all non-static symbols to better support embedding