OSSP CVS Repository

ossp - ossp-pkg/l2/TODO 1.39

OSSP L2
=======

Spec-Parsing:
- location tracking
- target=remote currently has to come after host=xxx because of the
  single-configuration procedure

Channel-Only Revamping:
- syscall override ala OSSP SA in l2_env_t
- l2_objects.fig update
- perhaps rename l2_env to l2_ctx and l2_channel_ to just l2_

- Perhaps we should later also write an l2_ch_bofh.c which does
  something very cool but is undocumented and has its source
  mangled/scrambled ;)

- Problem: OpenPKG fakesyslog is nasty, because it can neither log
  to multiple files nor filter out some messages. Additionally
  the app has to be restarted in order to reopen the logfile, which
  is nasty for MTAs like Postfix under very high load (because
  they start processing the queue again from scratch). What we need
  is a new L2-based libsyslog.a which sends the stuff via a Unix Domain
  socket to an L2 daemon which in turn logs to targets via L2.

- "smart debug buffer channel": 
   extra debug buffer, flush only if error occurs

- "repeat channel": "last message repeated 100 times"
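  The "repeat channel" idea could be sketched roughly as below. The type
  repeat_t and the function repeat_feed() are hypothetical names made up
  for illustration; nothing like this exists in L2 yet, and a real channel
  would also have to report pending repeats on flush/close.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical state for a "repeat" filter; not an existing L2 channel. */
typedef struct {
    char     last[256];  /* most recent distinct message (maybe truncated) */
    unsigned repeats;    /* how often it has been repeated since */
    int      seen;       /* non-zero once last[] holds a message */
} repeat_t;

/*
 * Feed one message. Returns 0 if the message was swallowed as a repeat,
 * 1 if out now holds text for the downstream channel. Messages that only
 * differ after the first 255 bytes are treated as equal.
 */
static int repeat_feed(repeat_t *r, const char *msg, char *out, size_t outsz)
{
    assert(r != NULL && msg != NULL && out != NULL && outsz > 0);
    out[0] = '\0';
    if (r->seen && strncmp(r->last, msg, sizeof(r->last) - 1) == 0) {
        r->repeats++;                       /* swallow the duplicate */
        return 0;
    }
    if (r->repeats > 0)
        snprintf(out, outsz, "last message repeated %u times\n%s\n",
                 r->repeats, msg);
    else
        snprintf(out, outsz, "%s\n", msg);
    strncpy(r->last, msg, sizeof(r->last) - 1);
    r->last[sizeof(r->last) - 1] = '\0';
    r->repeats = 0;
    r->seen = 1;
    return 1;
}
```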

- "\r\n" problem: proposal: strip \r and \n on input, and let each
  output channel re-append either \r\n (smtp) or just \n.

- thl: log facility
  To debug an application, sometimes it's overkill to log everything at
  DEBUG level. I see an improvement when an additional facility can be
  specified. Example: DEBUG/LMTP but don't care about NNTP in the
  lmtp2nntp program. Possibly could be implemented as a second mask.
  -> needs more consideration before implementation should start (rse)

- thl: buffer fork() awareness
  When the process forks, the buffer is duplicated. Currently this means
  the buffer has to be flushed in advance or the content is dumped twice.
  If the buffer remembered the pid of the last writer, it could
  discard the contents of the buffer when the pid changes. This works because
  the parent retains the pid and the buffer content while the child
  gets a new pid and discards the content. Should be an optional feature.
  -> can be immediately implemented (rse)
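  A minimal sketch of the pid check for the fork() awareness item, assuming
  a flat fixed-size buffer. forkbuf_t, forkbuf_init() and forkbuf_write()
  are hypothetical names and do not reflect the real l2_ch_buffer layout.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical fork-aware buffer; not the real l2_ch_buffer structure. */
typedef struct {
    char   data[4096];
    size_t used;
    pid_t  owner;   /* pid of the last process that wrote into the buffer */
} forkbuf_t;

static void forkbuf_init(forkbuf_t *b)
{
    assert(b != NULL);
    b->used  = 0;
    b->owner = getpid();
}

/* Append n bytes; returns 0 on success, -1 if the data does not fit. */
static int forkbuf_write(forkbuf_t *b, const char *src, size_t n)
{
    pid_t self = getpid();

    assert(b != NULL && src != NULL);
    if (b->owner != self) {
        /* The pid changed, so we are a child of the last writer. The
           inherited content belongs to the parent: drop it instead of
           dumping it a second time. */
        b->used  = 0;
        b->owner = self;
    }
    if (n > sizeof(b->data) - b->used)
        return -1;
    memcpy(b->data + b->used, src, n);
    b->used += n;
    return 0;
}
```

  The check costs one getpid() call per write, which is one reason to keep
  it optional.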

- thl: prelog
  I want to log everything, even things that happen before L2 is
  initialized. Complicated, I know. Maybe the log function should buffer
  everything as long as a NULL l2-context is passed; if a non-NULL
  context is ever passed, every remembered message should then be logged,
  and if destroy/flush is executed with a NULL context, it should print the
  buffered stuff to stderr.
  -> can be immediately implemented, but one has to be careful here.
     Seems like I want to think about this a little bit more in-depth. (rse)
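  One possible shape for the prelog idea, as a sketch only. logctx_t is a
  hypothetical stand-in for an L2 context, and prelog_log(), prelog_flush()
  and collect() are invented names. Replaying the remembered messages
  before the current one is an assumption about the desired ordering.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for an L2 context: just a write callback. */
typedef struct {
    void (*write)(const char *msg, void *arg);
    void  *arg;
} logctx_t;

#define PRELOG_MAX 64

static char  *prelog_msg[PRELOG_MAX];  /* messages seen before init */
static size_t prelog_cnt = 0;

/* Replay buffered messages into ctx, or to stderr if ctx is NULL. */
static void prelog_flush(logctx_t *ctx)
{
    size_t i;

    for (i = 0; i < prelog_cnt; i++) {
        if (ctx != NULL)
            ctx->write(prelog_msg[i], ctx->arg);
        else
            fprintf(stderr, "%s\n", prelog_msg[i]);
        free(prelog_msg[i]);
    }
    prelog_cnt = 0;
}

/* Log msg; with a NULL context the message is remembered for later. */
static int prelog_log(logctx_t *ctx, const char *msg)
{
    assert(msg != NULL);
    if (ctx == NULL) {
        char *copy;

        if (prelog_cnt == PRELOG_MAX)
            return -1;                  /* early buffer is full */
        copy = malloc(strlen(msg) + 1);
        if (copy == NULL)
            return -1;
        strcpy(copy, msg);
        prelog_msg[prelog_cnt++] = copy;
        return 0;
    }
    prelog_flush(ctx);                  /* earlier messages go out first */
    ctx->write(msg, ctx->arg);
    return 0;
}

/* Example sink: appends each message plus '|' to a caller-supplied string. */
static void collect(const char *msg, void *arg)
{
    char *out = arg;

    strcat(out, msg);
    strcat(out, "|");
}
```

  The static array makes this non-reentrant, which is part of why one has
  to be careful here.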

o Another great rse idea: make a proxy server channel that sniffs a TCP
  connection and dumps its traffic to standard output. Optionally, it could
  pass this to the next channel, allowing for parsing through an unknown
  protocol.

RSE:
- channel API cleanup: open semantics
- bind and udp parameters in socket channel
- prefix channels:
  - printf style without strftime;
- l2tool [addr:]port
- autoflush via shared memory and sub-process?
- NNTP channel
- SNMP trap channel
- perhaps replace too large PCRE stuff with smaller pattern
  matching stuff
- file channel should optionally support file locking 
  via fcntl(3).
- idea of an asynch channel, that spawns a new process and
  communicates with the stemming channels via shared memory

Lunchtime:
- Correct DNS resolve blocking problem by using a funky asynchronous DNS lib.
  This leads to easy reimplementation of the prefix channel (asynchronous)

MS:
- pipe channel may need a big overhaul if we redesign
  it around the asynch channel principle
- review pipe handler for dangling descriptors
- configure only checks the existence of the command in non-shell mode
- find an alternative to the exec argument list, whose size is hard-coded to 256
- signal handler chaining, save old signal handler and call it after our own
- consider adding options such as PCRE_CASELESS to filter channel
- implement "action" channel, can be based on pipe channel
- correct problem with multiple buffer channels using the timer
- solve problem with buffer (timer), irc (ping), and buffer (autoflush), by
  creating sleep/ping threads. The disadvantage is that we must depend on a
  pth installation, and force the parent app to be multithreaded.
  Alternatively, we can spawn a management process in l2_stream_create(), who
  owns management resources globally available to all channels. Or write the
  l2 mini-protocol :-(

ISSUES
------

o hook_write's should perhaps receive a NUL-terminated string
  instead of buf+size, because otherwise syslog has to re-buffer it
  in order to append the NUL terminator character.
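
  To make the cost concrete: with a buf+size interface, a syslog-style
  handler needs something like the helper below before it can call
  syslog(3). rebuffer_with_nul() is a hypothetical name used only for this
  sketch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Copy size bytes into a fresh buffer just to append the NUL terminator.
 * Returns a malloc'ed string the caller must free, or NULL on failure.
 */
static char *rebuffer_with_nul(const char *buf, size_t size)
{
    char *str;

    assert(buf != NULL);
    str = malloc(size + 1);
    if (str == NULL)
        return NULL;
    memcpy(str, buf, size);
    str[size] = '\0';
    return str;
}
```

  The result would then be passed on as syslog(priority, "%s", str) and
  freed, i.e. one extra allocation and copy per message.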

o Stream Members: channels static array
  - consider dynamic

o buffer
  user needs to know how a buffer object behaves in
  relation to up/downstream channels. When does it
  pass its data to the next channel, when does it
  erase, what happens to its data when it is over
  written or flushed...

o buffer timer
  How on earth to use a C-style exception handler to flush
  our buffer? We can't directly call ANY function from an
  exception state, so this might not work at all.

  Depends on SIGALRM, for which only one handler may exist.
  A more robust implementation would not use such a precious
  resource, and would guard against signal collisions with other channels.
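
  One common way around the "can't call anything from the handler" problem
  is to let the handler only set a flag and do the flush later from normal
  code, while chaining to whatever handler was installed before. The sketch
  below assumes POSIX sigaction(2); timer_install(), timer_flush_due() and
  on_alarm() are hypothetical names, not L2 functions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <stddef.h>

static volatile sig_atomic_t flush_pending = 0;  /* written by the handler */
static struct sigaction      old_alrm;           /* previously installed action */
static int                   installed = 0;

/* Handler: only record the event, then chain to the old plain handler. */
static void on_alarm(int sig)
{
    flush_pending = 1;
    if (!(old_alrm.sa_flags & SA_SIGINFO)
        && old_alrm.sa_handler != SIG_DFL
        && old_alrm.sa_handler != SIG_IGN)
        old_alrm.sa_handler(sig);
}

/* Install our handler, remembering whatever was there before. */
static int timer_install(void)
{
    struct sigaction sa;

    assert(!installed);  /* a second install would make on_alarm chain to itself */
    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_alrm) != 0)
        return -1;
    installed = 1;
    return 0;
}

/* Called from ordinary code (e.g. on the next write): 1 if a flush is due. */
static int timer_flush_due(void)
{
    if (!flush_pending)
        return 0;
    flush_pending = 0;
    return 1;
}
```

  This does not solve the collision problem when the application itself
  installs a SIGALRM handler later; it only keeps our side well-behaved.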

o syslog
  many options need documentation, and we should mention to
  the user that more info is found in the man page
  for syslog(), because after all that is what is
  doing all the work in our implementation. Also,
  can we really properly document these features
  if they change from one system's syslog to the next?

o errors
  when a channel fails during an operation, how
  does it report this? How should a user interpret
  the error message or other data? Do we need more
  accurate or detailed error messages in the channel
  code? When a channel fails, does it continue
  passing data on to downstream channels? Is it
  corrupt data?

o Syslog channel
  - Trim down to what will be used, right now the
    channel supports ALL functionality through syslog(3)

BRAINSTORMING
-------------

Braindump:
- debugging is special case of logging
- tracing is special case of debugging

Channel Handler Configuration:
o l2_handler_null
  - no configuration at all
o l2_handler_fd
  - mode="unix|stdio"
  - fd=int|FILE*
o l2_handler_file
  - mode="unix|stdio"
  - path=char*
  - append="yes|no"
o l2_handler_pipe
  - url="prg:/path/to/program"
  - fd=int
o l2_handler_socket
  - url="tcp://hostname:port"
  - fd=int
o l2_handler_syslog
  - ident=char*
  - should have its own logic
    and not use unix lib syslog()
    thus able to write to a remote
    syslog daemon
o l2_handler_filter
  - pattern=char*
o l2_handler_prefix
  - prefix=char*
o l2_handler_buffer
  - size=size_t
o all output channels
  - should they have downstream, or be true endpoints?
o l2_ch_socket
  - write should handle partial send()
    thus check the return of send
o l2_ch_buffer
  - write() must implicitly flush() when incoming
    data is larger than remaining buffer capacity
o l2_ch_action
  - new action channel could accept a regexp or L2_LEVEL as configuration,
    and then run a callback function or exec a binary depending on incoming
    l2 stream messages. For example, lmtp2nntp could conditionally run code
    via the new action channel to monitor spam or alert an admin about a
    flawed or curious LMTP header. However, the action channel is also
    appealing to other applications that don't run as daemons or at the
    system level.
  - l2_channel_config(l2Act, "regexp,run", "(nobody|cz)", "callback");
  - l2_channel_config(l2Act, "funcptr", pfnAddtcpwrap);
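
For the l2_ch_socket note above (write should handle partial send()), the
usual loop looks roughly like this. send_all() is a hypothetical helper name;
only send(2) itself is a real call, and blocking sockets are assumed.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/*
 * Send all len bytes, looping over partial send() results.
 * Returns 0 on success, -1 on error (errno is left as set by send()).
 */
static int send_all(int fd, const char *buf, size_t len)
{
    size_t done = 0;

    assert(buf != NULL || len == 0);
    while (done < len) {
        ssize_t n = send(fd, buf + done, len - done, 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted before any data went out */
            return -1;
        }
        done += (size_t)n;      /* may be less than requested: keep going */
    }
    return 0;
}
```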

License:
- ISC/MIT/BSD

Language:
- C++
- C

Structure:
1. Layer C++ API      log.hpp,  log.cpp
2. Layer C   API      log.h     log.c
3. Layer C   Backend  backend.h backend.c
- "make striptease"

- optimization:
log(..)
{
}
:
log(....);
:

API Levels:
- PANIC    (-> LOG_EMERG)
- CRITICAL (-> LOG_CRIT)
- ERROR    (-> LOG_ERR)
- WARNING
- NOTICE
- INFO
- TRACE    (-> LOG_DEBUG)
- DEBUG    (-> LOG_DEBUG)
- ALERT
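
A sketch of the level list as a mapping onto syslog(3) priorities. Only the
mappings written out above (PANIC, CRITICAL, ERROR, TRACE, DEBUG) come from
the list; the ones for WARNING, NOTICE, INFO and ALERT are assumptions based
on the obvious same-named syslog priorities. The enum and function names are
hypothetical.

```c
#include <assert.h>
#include <syslog.h>

/* Hypothetical enum, in the same order as the list above. */
typedef enum {
    LVL_PANIC, LVL_CRITICAL, LVL_ERROR, LVL_WARNING, LVL_NOTICE,
    LVL_INFO, LVL_TRACE, LVL_DEBUG, LVL_ALERT
} log_level;

/* Map an API level to a syslog(3) priority. */
static int level_to_syslog(log_level lvl)
{
    switch (lvl) {
    case LVL_PANIC:    return LOG_EMERG;    /* from the list */
    case LVL_CRITICAL: return LOG_CRIT;     /* from the list */
    case LVL_ERROR:    return LOG_ERR;      /* from the list */
    case LVL_WARNING:  return LOG_WARNING;  /* assumption */
    case LVL_NOTICE:   return LOG_NOTICE;   /* assumption */
    case LVL_INFO:     return LOG_INFO;     /* assumption */
    case LVL_TRACE:    return LOG_DEBUG;    /* from the list */
    case LVL_DEBUG:    return LOG_DEBUG;    /* from the list */
    case LVL_ALERT:    return LOG_ALERT;    /* assumption */
    }
    assert(0 && "unknown level");
    return LOG_DEBUG;
}
```

Note that TRACE and DEBUG collapse into the same syslog priority, so a syslog
backend cannot tell them apart without an extra prefix.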

Level decisions:
>= (default)
<=
=
;
*

Backend Channels:
1 Level -> N Channels
- file (append)
- program (stdin)
- syslog
- stderr/stdout
- null (discard, not just /dev/null)
- filedescriptor (escape/ext)
- callback function

Log Messages:
- raw
- optional prefixes (inclusive order):
  string
  facility
  level
  timestamp
  pid
  (tid)
- errno (like syslog %m)
- custom %{foo}x with callback function with context
- automatic: number -> string mapping (for error strings)
- __FILE__, __LINE__, (__FUNCTION__)

Configuration:
- via C/C++ API
- additionally a config file
  1. /etc/liblog.conf
  2. (in ., .., ../..)
  3. $HOME/.liblog.conf

- !debug -> !code

API C (ala MM):
- reentrant: log_xxx
- non-reentrant: Log_xxx

Message Filtering/Masking:
- facility and/or levels and/or wildcard pattern

API Using:
- C++:
  LogManager logm;
  logm.debug1("test");
  logm.configure("

- C:
  log lh;
  lh = log_init(LOG_CFGFILE|LOG_CFGPARENT|LOG_XXX|..., "foo" (=facility));
  log_configure(lh, "foo", LOG_WARN|LOG_LESSER, null);
  log_cb(lh, "x", func, ctx);
  int func(void *ctx, char *str, ...);
  log_msg(lh, LOG_WARN, "..%{foo}x %s...%E..", cp);
  log_dbg(lh, "..%{foo}x %s...%E..", cp);
  log_kill(lh);

- Buffered I/O:
  non-buffered for some channels (debug, errors)
  but buffered for some others (access log, performance)
  solution: I/O via callbacks (3x: open, write, close), e.g. RRDTool

- Varargs:
  log is only a wrapper for vlog
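
  The wrapper pattern from the Varargs item, sketched with hypothetical
  names my_log()/my_vlog() and a static string as a stand-in for a real
  channel:

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

static char sink[256];   /* stand-in for a real channel */

/* The real worker takes a va_list so that other wrappers can reuse it. */
static int my_vlog(const char *fmt, va_list ap)
{
    assert(fmt != NULL);
    return vsnprintf(sink, sizeof(sink), fmt, ap);
}

/* The "..." front end is only a thin wrapper around the va_list version. */
static int my_log(const char *fmt, ...)
{
    va_list ap;
    int rc;

    va_start(ap, fmt);
    rc = my_vlog(fmt, ap);
    va_end(ap);
    return rc;
}
```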

- Error Handling:
  o log has no return code
  o but an error callback function (inside it, in C++: throw, in C: exit)

- Newline Handling:
  option per channel: \r, \r\n, \n or an arbitrary string,
  with the possibility of nothing at all (string="")
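
  A sketch of per-channel newline handling: strip whatever line ending
  came in, then let the channel append its own terminator. strip_eol() and
  append_eol() are hypothetical helper names.

```c
#include <assert.h>
#include <string.h>

/* Strip any trailing '\r' and '\n' in place; returns the new length. */
static size_t strip_eol(char *msg)
{
    size_t n;

    assert(msg != NULL);
    n = strlen(msg);
    while (n > 0 && (msg[n - 1] == '\r' || msg[n - 1] == '\n'))
        msg[--n] = '\0';
    return n;
}

/*
 * Append the channel's own terminator ("\r\n", "\n", "" ...) to msg,
 * whose total capacity is cap bytes. Returns 0 on success, -1 if the
 * result would not fit.
 */
static int append_eol(char *msg, size_t cap, const char *eol)
{
    size_t n, e;

    assert(msg != NULL && eol != NULL);
    n = strlen(msg);
    e = strlen(eol);
    if (n + e + 1 > cap)
        return -1;
    memcpy(msg + n, eol, e + 1);   /* e + 1 also copies the NUL */
    return 0;
}
```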

- Perhaps:
  optionally reopen logfile on each write

- An optional syslog(3) compatible API for converting syslog-only based
  applications (like sendmail) to (restricted) liblog-based applications.

- One more word on variable argument lists in cpp macros: gcc
  supports them in both the GNU and the C99 form. That is, the "..."
  parameter can be referenced in the macro as "args" and as
  "__VA_ARGS__", respectively. It is important that "..." must not
  be empty (that is, contain no argument), because otherwise the
  preprocessor fails on a comma that may be present.
  With gcc this can be worked around by putting "##" in front of
  "__VA_ARGS__". Ouch.
  Both extensions are currently not active when compiling with -ansi.
  The standard-conforming extension can be enabled explicitly with
  "-std=c9x", or "-std=c99" with newer gccs.
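
  One way to sidestep the empty "..." problem in plain C99, without the
  gcc "##" extension, is to make the format string itself part of the
  variable arguments. The macro LOG and the function log_at() below are
  hypothetical names for this sketch only.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>
#include <string.h>

static char last_line[512];  /* stand-in for a real log target */

/* Hypothetical backend: prefixes the message with its source location. */
static void log_at(const char *file, int line, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(file != NULL && fmt != NULL);
    n = snprintf(last_line, sizeof(last_line), "%s:%d: ", file, line);
    if (n < 0 || (size_t)n >= sizeof(last_line))
        return;
    va_start(ap, fmt);
    vsnprintf(last_line + n, sizeof(last_line) - (size_t)n, fmt, ap);
    va_end(ap);
}

/*
 * C99 form. The format string is part of "...", so the variable part is
 * never empty and no dangling comma can be left behind.
 */
#define LOG(...) log_at(__FILE__, __LINE__, __VA_ARGS__)
```

  The price is that the macro cannot name the format string separately,
  e.g. to paste a fixed prefix onto it at compile time.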

High-Level Configuration Interface
==================================

Config-File (OSSP):

logging {
    channel 0 null ALL; 

    channel 1 prefix CUSTOM1 { prefix="%Y-%m-%d/%H:%M:%S" };
    channel 2 buffer { size=8k };
    channel 3 file { path=/path/to/ossp.stat.log; mode=0644 };

    channel 4 prefix wmask=DEBUG { prefix="%Y-%m-%d/%H:%M:%S %L" };

    channel 5 buffer { size=4k };
    channel 6 file { path=/path/to/ossp.error.log; mode=0644 };

    channel 7 syslog wmask=ERROR { peer=remote; host=loghost };

    channel 8 filter { regex="(memory|foo)" };
    channel 9 smtp { host=loghost; rcpt="rse@engelschall.com" };

    stream { 0->(4,7); 4->(5,7); 5->6; 8->9; 1->2->3 };

    log access 0;
    log stat 1;

};

Alternative: 

log access {
    error: syslog;
    prefix -> { 
        buffer(size=4k) -> 
            file(path=/path/to/log, mode=0644);
        panic: smtp(host=en1, rcpt=rse@engelschall.com);
    }
}

Command-Line (lmtp2nntp):

$ lmtp2nntp -l 'INFO:'
  { prefix(prefix="%Y-%m-%d/%H:%M:%S %L")
    ->{ buffer(size=4k)
        ->file(path=/path/to/log,mode=0644);
        smtp(host=en1,rcpt=rse@engelschall.com)
      };
    error:syslog
  }

