OSSP CVS Repository

ossp - ossp-pkg/l2/TODO 1.58

   _        ___  ____ ____  ____    _ ____
  |_|_ _   / _ \/ ___/ ___||  _ \  | |___ \
  _|_||_| | | | \___ \___ \| |_) | | | __) |
 |_||_|_| | |_| |___) |__) |  __/  | |/ __/
  |_|_|_|  \___/|____/____/|_|     |_|_____|

  OSSP l2 -- Flexible Logging

  TODO
  ====

Structure of channels and documentation [thl]. It should be possible to
extract the documentation from a channel's source code. Everything else
is error prone and a documentation nightmare. Currently, most channels
(noop and null do not) have a code segment that looks like

    /* feed and call generic parameter parsing engine */
    L2_PARAM_SET(pa[0], size,       INT, &cfg->bufsize);
    L2_PARAM_SET(pa[1], interval,   INT, &cfg->bufinterval);
    L2_PARAM_SET(pa[2], levelflush, INT, &cfg->levelflush);
    L2_PARAM_END(pa[3]);
    l2_channel_env(ch, &env);

I recommend the L2_PARAM_SET macro should be modified to

- set the default value of each parameter, if any exists
- force the internal structure name to follow the revealed name
- declare a channel to be a filter or output
- declare parameters optional or mandatory

I also recommend enforcing the convention of putting the channel
description before such a code segment and a short description
(e.g. unit, 0=deactivate) of each parameter/value right after the macro.
This could change the example above to look something like:

    /*<man description=channel, type=filter>
     * The buffer channel buffers messages poured in from upper channels.
     * It flushes the messages down to the lower channels when the buffered
     * space exceeds the buffer size, when a given time interval is reached
     * (0 disables this feature) or when a newly arrived message has a
     * level that matches the levelflush mask.
     *
    /* feed and call generic parameter parsing engine */
    L2_PARAM_SET(pa[0], size,       INT, 4096); /* o: bytes */
    L2_PARAM_SET(pa[1], interval,   INT, 0   ); /* o: sec, 0=disable */
    L2_PARAM_SET(pa[2], levelflush, INT, 0   ); /* o: levelmask */
    L2_PARAM_END(pa[3]);
    l2_channel_env(ch, &env);
    /*</man>*/

New idea, as thl discussed with rse a few minutes ago:

- l2 should calculate the global logging mask automatically by
  OR'ing the masks of all output channels. This should increase
  performance as useless requests are caught at an early stage.

- l2 should provide a function to set the logging masks, including
  the global logging mask, manually. This would require a mechanism
  to re-calculate the mask at any time it is manually set to 'auto'.

- l2 should provide a function to get the logging masks, including
  the global logging mask. The caller could use this to check
  whether it is worth calling the actual log function, and to
  bypass entire code blocks which exist for logging purposes only.
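
A minimal sketch of the three points above, assuming each output channel
carries a plain bit mask of levels. All names here (log_env_t,
out_channel_t, env_mask_recalc, env_mask_set, env_mask_wants, the LVL_*
bits) are hypothetical and not part of the l2 API:

```c
#include <stddef.h>

/* Hypothetical level bits; the real l2 level constants may differ. */
enum {
    LVL_DEBUG = 1 << 0,
    LVL_INFO  = 1 << 1,
    LVL_ERROR = 1 << 2
};

/* Hypothetical, stripped-down output channel: only its level mask. */
typedef struct {
    unsigned int levelmask;
} out_channel_t;

/* Hypothetical environment that caches the global mask. */
typedef struct {
    const out_channel_t *out;   /* output channels */
    size_t nout;                /* number of output channels */
    unsigned int globalmask;    /* cached global mask */
    int automatic;              /* non-zero: calculated, zero: set manually */
} log_env_t;

/* 'auto' mode: OR the masks of all output channels into the global mask. */
void env_mask_recalc(log_env_t *env)
{
    unsigned int mask = 0;
    size_t i;

    for (i = 0; i < env->nout; i++)
        mask |= env->out[i].levelmask;
    env->globalmask = mask;
    env->automatic = 1;
}

/* Manual mode: the given mask stays until env_mask_recalc() is called. */
void env_mask_set(log_env_t *env, unsigned int mask)
{
    env->globalmask = mask;
    env->automatic = 0;
}

/* Cheap early check a caller can use to skip logging-only code blocks. */
int env_mask_wants(const log_env_t *env, unsigned int level)
{
    return (env->globalmask & level) != 0;
}
```

A caller would wrap expensive logging-only code in
"if (env_mask_wants(&env, LVL_DEBUG)) { ... }". Whether the recalculation
should also be triggered whenever a channel mask changes is left open here.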

Next steps:

- libl2syslog:
  OpenPKG fakesyslog is nasty, because it doesn't provide logging
  to multiple files or filter out some messages. Additionally the
  application has to be restarted in order to reopen the logfile which
  is nasty for MTAs like Postfix in case of very high loads (because
  they start again processing the queue from scratch). What we need is a
  new L2-based libl2syslog which maps from syslog(3) API to l2(3) API.
  We can later add the reopen feature to L2 or send the messages via a
  Unix Domain socket to an L2 daemon which in turn logs to targets via
  l2tool, etc.
- signal and process handling (l2_env*)
- asynchronous channel (l2_ch_async.c)
- manual page (l2.pod)

New Channels ----------------------------------------------
- l2_ch_rotfile for a cronolog-style file writing, i.e., it changes the
  file according to a strftime-based filename expansion.

- l2_ch_bofh does something cool but is undocumented and has its
    source mangled/scrambled ;)

- l2_ch_asynch spawns a new process and communicates with
    the stemming channels via shared memory. Might be the
    way to implement autoflush as well, or rather with its
    own shared memory and sub-process.

- l2_ch_smart is a smart debug buffer channel which flushes
    only if errors occur.

- l2_ch_repeat prints 'last message repeated 100 times.'

- l2_ch_action is similar to the pipe channel, but executes a given
    command conditionally on the incoming priority or a regex.
    -> Might be implemented by combining a pcre, filter, and pipe (ms)

- l2_ch_proxy sniffs a tcp connection and dumps its traffic to
    standard out or to the next channel, allowing for parsing
    through an unknown protocol.

- l2_ch_nntp for administration through news.

- l2_ch_snmp logs a stream to an SNMP listener.
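
The cronolog-style behavior proposed for l2_ch_rotfile above could look
roughly like this. It is only a sketch built on the standard strftime(3);
the rotfile_t type and the rotfile_* functions are hypothetical and the
real channel would sit behind the l2 channel hooks:

```c
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Hypothetical state of a rotating file channel. */
typedef struct {
    const char *pattern;   /* strftime(3) pattern for the file name */
    char current[256];     /* name of the file currently open */
    FILE *fp;              /* NULL while no file is open */
} rotfile_t;

/* Expand the pattern for time 'now'; returns 0 on success, -1 on failure. */
int rotfile_expand(const char *pattern, time_t now, char *buf, size_t len)
{
    struct tm *tm = localtime(&now);

    if (tm == NULL || strftime(buf, len, pattern, tm) == 0)
        return -1;
    return 0;
}

/* Write one message, switching files when the expanded name changes. */
int rotfile_write(rotfile_t *rf, time_t now, const char *msg)
{
    char name[sizeof(rf->current)];

    if (rotfile_expand(rf->pattern, now, name, sizeof(name)) != 0)
        return -1;
    if (rf->fp == NULL || strcmp(name, rf->current) != 0) {
        if (rf->fp != NULL)
            fclose(rf->fp);               /* close the previous period */
        if ((rf->fp = fopen(name, "a")) == NULL)
            return -1;
        strcpy(rf->current, name);        /* both buffers have equal size */
    }
    return (fputs(msg, rf->fp) == EOF) ? -1 : 0;
}
```

With a pattern such as "/path/to/app-%Y%m%d.log" (a made-up path) the
file would change once per day without restarting the application.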

Existing Channels -----------------------------------------
- l2_ch_socket bind and udp parameters need work. (RSE: why and what?)
- l2_ch_socket consider taking a fd at configuration.

- l2_ch_pcre is too large and might benefit from a
    smaller set of pattern matching logic.

- l2_ch_syslog should inherit socket logic to directly implement
    syslog communication, allowing for remote syslog operations.
- l2_ch_syslog should be trimmed to necessary functionality, not
    necessarily everything that the system syslog(3) implements.

- l2_ch_irc has a problem pinging its host to stay alive.

- l2_ch_prefix implicitly produces two log messages unless a
    following buffer channel accumulates them.
- l2_ch_prefix should have printf style without strftime.
- l2_ch_prefix should apply __FILE__, __LINE__, (__FUNCTION__)
    as options in case we are building in custom striptease mode.

- l2_ch_file should optionally support file locking via fcntl(3).
  (RSE: why? Unix guarantees that one can write up to 512 bytes atomically)

- l2_ch_pipe needs an overhaul once l2_ch_async is implemented.
    Also needs a review for dangling descriptors. During configure
    it only checks the existence of the command in non-shell mode.
    Remove the hard-coded 256 in exec arguments.
- l2_ch_pipe should use a standardized url like prg:/path/to/program.
- l2_ch_pipe might consider taking a fd at configuration.

- l2_ch_buffer buffer is duplicated when the process forks. This
    means the buffer has to be flushed in advance or the content
    is dumped twice. If the buffer remembered the pid of the
    last writer, it could discard its contents when the pid
    changes. This works because the parent retains the pid
    and the buffer content while the child changes the pid and
    discards the content. Should be an optional feature.
    -> Can be immediately implemented (rse)
- l2_ch_buffer needs an L2_OK and L2_OK_PASS consistency check
- l2_ch_buffer should not use different param types in alarm and setitimer(2).
- l2_ch_buffer should incorporate multiplexer logic to properly flush
    when multiple buffers are in use. Can be implemented via an
    environment-level timer and a greatest common divisor alarm function.
    For example,
        tmr1 9secs; tmr2 6secs; envtmr expires every gcd(9,6) = 3secs
- l2_ch_buffer should not implement an autoflush, rather L2 should
    have a multiplexed timer object to serve timer requests from
    incoming callbacks.
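
Two of the l2_ch_buffer items above lend themselves to a short sketch:
discarding inherited buffer content after a fork by remembering the pid
of the last writer, and choosing the tick of a shared timer as the
greatest divisor common to all intervals (3 seconds for the 9 and 6
second timers in the example). The pidbuf_t type and all function names
are hypothetical; the real buffer channel is organized differently:

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical buffer state. */
typedef struct {
    char data[4096];
    size_t used;
    pid_t owner;          /* pid of the last writer */
} pidbuf_t;

/* Discard inherited content if the pid changed since the last write. */
void pidbuf_check_owner(pidbuf_t *b, pid_t self)
{
    if (b->owner != self) {
        b->used = 0;      /* child: the parent still holds this content */
        b->owner = self;
    }
}

/* Append a message; returns -1 if it does not fit. */
int pidbuf_write(pidbuf_t *b, const char *msg, size_t len)
{
    pidbuf_check_owner(b, getpid());
    if (len > sizeof(b->data) - b->used)
        return -1;
    memcpy(b->data + b->used, msg, len);
    b->used += len;
    return 0;
}

/* Greatest common divisor (Euclid); timer_gcd(0, x) == x. */
unsigned int timer_gcd(unsigned int a, unsigned int b)
{
    while (b != 0) {
        unsigned int t = a % b;
        a = b;
        b = t;
    }
    return a;
}

/* Tick in seconds for one timer serving n intervals; 0 entries
   (disabled timers) are ignored, and 0 is returned if all are disabled. */
unsigned int timer_tick(const unsigned int *intervals, size_t n)
{
    unsigned int g = 0;
    size_t i;

    for (i = 0; i < n; i++)
        g = timer_gcd(g, intervals[i]);
    return g;
}
```

The pid check costs one getpid() call per write, which is one reason to
keep it optional as the item suggests.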

Library-wide changes --------------------------------------
- Add nonreentrant log method like mm.
- Add striptease target to Makefile.
- l2tool needs an [addr:]port option
- API access through the LogManager logm;
- Jump to a callback in case of error, throw/exit.
- Offer the option to reopen the logfile on each write.
- Optimize log operations from log(..) {}
    to log(....);

- Implicit level decision should default to >=,
    then follow with <=
                      =
                      ;
                      *

- '\r\n' translation
    On input, strip \r and \n, and have each output channel append
    either \r\n (smtp) or just \n again. Or make it a channel option
    with \r, \r\n, \n, an arbitrary string, or nothing at all.

- Spec-facility
    To debug an application, sometimes it's overkill to log everything
    at DEBUG level. I see an improvement when an additional facility
    can be specified. Example: DEBUG/LMTP but don't care about NNTP in
    the lmtp2nntp program. Possibly could be implemented as a second mask.
    -> Needs more consideration before implementation should start (rse)

- Prelogging
    L2 should log even before it is initialized. Maybe the log function
    should buffer everything as long as a NULL l2-context is passed and
    if ever a non-NULL context is passed every remembered message should
    be logged afterwards or if destroy/flush is executed with a NULL-
    context it should print the buffered stuff to stderr.
    -> Can be immediately implemented, but one has to be careful here.
       I want to think about this a little bit more in-depth. (rse)

- Config file
    Might be used with the previous prelogging principle in mind.
    Namely, a file that describes a set of channel specs for one
    or many configurations. Path is searched in the following order:
      1. /etc/liblog.conf
      2. (in ., .., ../..)
      3. $HOME/.liblog.conf

- hook_write methods should receive a null-terminated string
    instead of buf+size, because some channels like syslog
    otherwise have to rebuffer the message and append the
    null terminator character.
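
The Prelogging item above could be approached along these lines. This is
a sketch under simplifying assumptions (a fixed-size queue, a context
that is just a FILE pointer, no thread safety); log_ctx_t, log_msg,
log_flush and prelog_drain are hypothetical names, not l2 functions:

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical logging context; stands in for a real l2 context. */
typedef struct {
    FILE *out;
} log_ctx_t;

#define PRELOG_MAX 64

static char  *prelog_msg[PRELOG_MAX];   /* remembered messages */
static size_t prelog_cnt = 0;

/* Write all remembered messages to 'out' and forget them. */
void prelog_drain(FILE *out)
{
    size_t i;

    for (i = 0; i < prelog_cnt; i++) {
        fputs(prelog_msg[i], out);
        free(prelog_msg[i]);
    }
    prelog_cnt = 0;
}

/* Log a message; with a NULL context it is only remembered. */
int log_msg(log_ctx_t *ctx, const char *msg)
{
    if (ctx == NULL) {
        char *copy;

        if (prelog_cnt >= PRELOG_MAX)
            return -1;                  /* queue full: message is dropped */
        if ((copy = malloc(strlen(msg) + 1)) == NULL)
            return -1;
        strcpy(copy, msg);
        prelog_msg[prelog_cnt++] = copy;
        return 0;
    }
    prelog_drain(ctx->out);             /* replay earlier messages first */
    return (fputs(msg, ctx->out) == EOF) ? -1 : 0;
}

/* Flush; with a NULL context the remembered messages go to stderr. */
void log_flush(log_ctx_t *ctx)
{
    if (ctx == NULL) {
        prelog_drain(stderr);
        return;
    }
    prelog_drain(ctx->out);
    fflush(ctx->out);
}
```

The care mentioned in the item shows up even here: the queue needs a
bound, and a process-wide static queue is shared by every caller that
passes NULL.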

Spec-parsing ----------------------------------------------
- implement location tracking

- target=remote has to be currently after host=xxx because of
    single-configuration procedure

- fix the bugs in l2tool and, once it works 100%, remove l2_test in
    favor of l2tool and add a test shell script instead
    which calls l2tool with various input specifications.

Channel-Only Revamping ------------------------------------
- l2_objects update
- syscall override ala OSSP SA in l2_env_t
- perhaps rename l2_env to l2_ctx and l2_channel_ to just l2_
- API cleanup for open semantics

Documentation ---------------------------------------------
- l2_ch_buffer
    How the buffer object behaves in relation to up/downstream
    channels. When does it pass its data to the next channel,
    when does it erase, what happens to its data when it is
    overwritten or flushed.

- l2_ch_syslog
    Describe the options, and mention that more info is found
    in the man page for syslog(3). Identify features that
    change from one system to the next.

- l2 errors
    Describe how an l2 channel reports errors when it fails during
    an operation, and how a user should interpret the error message.
    Describe whether a channel continues passing data on downstream
    if it somehow fails, and if the data will be corrupt.

- l2_util_fmt_string
    This can be used like %s, but instead of fetching only a "char *"
    from the var-args stack, it fetches a "char *" plus a "size_t" and
    this way allows one to log only a sub-string of a larger string
    without the need for any temporary buffers, etc.

- l2_util_fmt_dump:
    This can be used as "%{type}X" for dumping arbitrary octets. The
    parameter "type" can be either "text" (the default if only "%X"
    is used) for dumping the octets as text but with non-printable
    characters replaced by "\xXX" constructs; "hex" for dumping the
    octets in hexadecimal as "XX:XX:XX:XX" or "base64" for dumping the
    octets Base64 encoded. All three are intended for making it easier to
    produce reasonable L2_LEVEL_DEBUG messages without having to fiddle
    around with temporary buffers or worry about non-printable
    characters.

- formatter example
    l2_stream_formatter(st, 'D', l2_util_fmt_dump, NULL);
        :
    l2_stream_log(st, L2_LEVEL_DEBUG, "%{text}D %{hex}D %{base64}D\n",
        "foo\1bar", 7, "foo\1bar", 7, "foo\1bar", 7);
        :
    ...produces "foo\x01bar 66:6f:6f:01:62:61:72 Zm9vAWJhcg=="
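
To illustrate what the "text" and "hex" dump types described above
produce, here is a standalone sketch that does not use the l2 formatter
interface at all. dump_hex and dump_text are hypothetical helpers; the
lowercase hex digits follow the example output above:

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Dump 'len' octets as "xx:xx:..." into 'out'.
   Returns the number of characters written, or -1 if 'out' is too small. */
int dump_hex(char *out, size_t outlen, const unsigned char *data, size_t len)
{
    size_t i, pos = 0;

    if (outlen == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        if (outlen - pos < 4)           /* ":xx" plus terminator */
            return -1;
        pos += (size_t)sprintf(out + pos, "%s%02x",
                               (i > 0) ? ":" : "", (unsigned int)data[i]);
    }
    return (int)pos;
}

/* Dump octets as text, replacing non-printable ones with "\xXX".
   Returns the number of characters written, or -1 if 'out' is too small. */
int dump_text(char *out, size_t outlen, const unsigned char *data, size_t len)
{
    size_t i, pos = 0;

    if (outlen == 0)
        return -1;
    out[0] = '\0';
    for (i = 0; i < len; i++) {
        if (outlen - pos < 5)           /* "\xXX" plus terminator */
            return -1;
        if (isprint(data[i])) {
            out[pos++] = (char)data[i];
            out[pos] = '\0';
        } else {
            pos += (size_t)sprintf(out + pos, "\\x%02x",
                                   (unsigned int)data[i]);
        }
    }
    return (int)pos;
}
```

The sub-string behavior of l2_util_fmt_string resembles what standard
printf offers with "%.*s", which takes an int precision followed by the
"char *" (note the reversed argument order and the int type).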

High-level configuration interface
==================================
Config-File (OSSP):
logging {
    channel 0 null ALL;

    channel 1 prefix CUSTOM1 { prefix="%Y-%m-%d/%H:%M:%S" };
    channel 2 buffer { size=8k };
    channel 3 file { path=/path/to/ossp.stat.log; mode=0644 };

    channel 4 prefix wmask=DEBUG { prefix="%Y-%m-%d/%H:%M:%S %L" };

    channel 5 buffer { size=4k };
    channel 6 file { path=/path/to/ossp.error.log; mode=0644 };

    channel 7 syslog wmask=ERROR { peer=remote; host=loghost };

    channel 8 filter { regex="(memory|foo)" };
    channel 9 smtp { host=loghost; rcpt="rse@engelschall.com" };

    stream { 0->(4,7); 4->(5,7); 5->6; 8->9; 1->2->3 };

    log access 0;
    log stat 1;
};

Alternative:
log access {
    error: syslog;
    prefix -> {
        buffer(size=4k) ->
            file(path=/path/to/log, mode=0644);
        panic: smtp(host=en1, rcpt=rse@engelschall.com);
    }
}

Command-Line (lmtp2nntp):
$ lmtp2nntp -l 'INFO:'
  { prefix(prefix="%Y-%m-%d/%H:%M:%S %L")
    ->{ buffer(size=4k)
        ->file(path=/path/to/log,mode=0644);
        smtp(host=en1,rcpt=rse@engelschall.com)
      };
    error:syslog
  }

Brainstorming ---------------------------------------------
- Debugging can be implemented as a special case of logging
- Tracing can be implemented as a special case of debugging


*** *** *** *** *** *** Out of date *** *** *** *** *** ***
*** *** *** *** *** *** Out of date *** *** *** *** *** ***
*** *** *** *** *** *** Out of date *** *** *** *** *** ***


Backend channels ------------------------------------------
- Level -> N Channels
- file (append)
- program (stdin)
- syslog
- stderr/stdout
- null (discard, not just /dev/null)
- filedescriptor (escape/ext)
- callback function

Structure -------------------------------------------------
Layer C++ API      log.hpp,  log.cpp
Layer C   API      log.h     log.c
Layer C   Backend  backend.h backend.c

- A word on variable argument lists in cpp macros: gcc supports them
  in both the GNU and the C99 flavor. That is, the "..." parameter can
  be referenced in the macro via "args" and via "__VA_ARGS__",
  respectively. It is important that "..." is not empty, i.e. that it
  contains at least one argument, because otherwise the preprocessor
  trips over a comma that may be present. With gcc this can be worked
  around by putting "##" in front of "__VA_ARGS__". Ouch. Neither
  extension is currently active when compiling with -ansi. The
  standard-conforming extension can be enabled explicitly via
  "-std=c9x", or "-std=c99" with newer gccs.
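
The three macro forms discussed above, side by side. This is a small
sketch that assumes gcc or a compatible compiler, since the named
variadic parameter and the "##" workaround are GNU extensions; the
LOG_* macros and logbuf are made up for illustration:

```c
#include <stdio.h>
#include <string.h>

static char logbuf[128];

/* C99 form: "..." is referenced as __VA_ARGS__. In strict C99 at least
   one argument must follow 'fmt', else a dangling comma remains. */
#define LOG_C99(fmt, ...) \
    snprintf(logbuf, sizeof(logbuf), fmt, __VA_ARGS__)

/* GNU form: the variable part is given a name ("args"). */
#define LOG_GNU(fmt, args...) \
    snprintf(logbuf, sizeof(logbuf), fmt, args)

/* gcc workaround: "##" removes the preceding comma when no variable
   arguments are passed, so LOG_ANY("text") expands cleanly. */
#define LOG_ANY(fmt, ...) \
    snprintf(logbuf, sizeof(logbuf), fmt, ##__VA_ARGS__)
```

Only LOG_ANY can be called with a format string alone; LOG_C99("text")
would expand to a call with a trailing comma before the closing
parenthesis.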

Log Messages:
- raw
- optional prefixes (inclusive of order)
    string
    facility
    level
    timestamp
    pid
    (tid)
- errno (like syslog %m)
- custom %{foo}x with a callback function and context
- automatic number -> string mapping (for error strings)
- !debug -> !code


- add support for prefixes of all non-static symbols to better support embedding
