##
##  OSSP cfg - Configuration Parsing
##  Copyright (c) 2002-2006 Ralf S. Engelschall <rse@engelschall.com>
##  Copyright (c) 2002-2006 The OSSP Project (http://www.ossp.org/)
##
##  This file is part of OSSP cfg, a configuration parsing
##  library which can be found at http://www.ossp.org/pkg/lib/cfg/.
##
##  Permission to use, copy, modify, and distribute this software for
##  any purpose with or without fee is hereby granted, provided that
##  the above copyright notice and this permission notice appear in all
##  copies.
##
##  THIS SOFTWARE IS PROVIDED ``AS IS'' AND ANY EXPRESSED OR IMPLIED
##  WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF
##  MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED.
##  IN NO EVENT SHALL THE AUTHORS AND COPYRIGHT HOLDERS AND THEIR
##  CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
##  SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
##  LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF
##  USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND
##  ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
##  OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT
##  OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
##  SUCH DAMAGE.
##
##  cfg.pod: manual page
##

=pod

=head1 NAME

B<OSSP cfg> - Configuration Parsing

=head1 VERSION

OSSP cfg CFG_VERSION_STR

=head1 SYNOPSIS

=over 4

=item B<API Header:>

cfg.h

=item B<API Types:>

cfg_t,
cfg_rc_t,
cfg_node_type_t,
cfg_node_t,
cfg_node_attr_t,
cfg_fmt_t,
cfg_data_t,
cfg_data_ctrl_t,
cfg_data_cb_t,
cfg_data_attr_t

=item B<API Functions:>

cfg_create,
cfg_destroy,
cfg_error,
cfg_version,
cfg_import,
cfg_export,
cfg_node_create,
cfg_node_destroy,
cfg_node_clone,
cfg_node_set,
cfg_node_get,
cfg_node_root,
cfg_node_select,
cfg_node_find,
cfg_node_apply,
cfg_node_cmp,
cfg_node_link,
cfg_node_unlink,
cfg_data_set,
cfg_data_get,
cfg_data_ctrl

=back

=head1 DESCRIPTION

B<OSSP cfg> is an ISO C library for parsing arbitrary C/C++-style
configuration files. A configuration is a sequence of directives. Each
directive consists of zero or more tokens. Each token can be either a
string or, in turn, a complete sequence. The configuration syntax
therefore has a recursive structure, which allows configurations
with arbitrarily nested sections.

Additionally, the configuration syntax provides complex
single/double/balanced quoting of tokens, hexadecimal/octal/decimal
character encodings, character escaping, C/C++ and Shell-style comments,
etc. The library API allows importing a configuration text into an
Abstract Syntax Tree (AST), traversing the AST and optionally exporting
the AST again as a configuration text.

=head2 CONFIGURATION SYNTAX

The configuration syntax is described by the following context-free
(Chomsky-2) grammar:

B<sequence>   ::= I<empty>
             | B<directive>
             | B<directive> B<SEP> B<sequence>

B<directive>  ::= B<token>
             | B<token> B<directive>

B<token>      ::= B<OPEN> B<sequence> B<CLOSE>
             | B<string>

B<string>     ::= B<DQ_STRING>   # double quoted string
             | B<SQ_STRING>   # single quoted string
             | B<FQ_STRING>   # flexible quoted string
             | B<PT_STRING>   # plain text string
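
The recursive structure of the productions above can be illustrated with
a small recursive-descent recognizer. The following C sketch is not taken
from the B<OSSP cfg> sources. It assumes that every string is a plain
text string, ignores comments and quoting, and uses hypothetical function
names (C<parse_sequence>, C<parse_directive>, C<parse_token>,
C<sketch_accepts>) that are not part of the library API.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers for illustration; not part of the OSSP cfg API. */

static const char *skip_ws(const char *p)
{
    while (*p == ' ' || *p == '\t' || *p == '\n')
        p++;
    return p;
}

/* Plain-text character: anything except the special characters. */
static int is_pt_char(char c)
{
    return c != '\0' && strchr(" \t\n;{}\"'", c) == NULL;
}

static const char *parse_sequence(const char *p);

/* token ::= OPEN sequence CLOSE | string (plain text only in this sketch) */
static const char *parse_token(const char *p)
{
    if (*p == '{') {
        p = parse_sequence(p + 1);
        if (p == NULL)
            return NULL;
        p = skip_ws(p);
        return (*p == '}') ? p + 1 : NULL;
    }
    if (!is_pt_char(*p))
        return NULL;
    while (is_pt_char(*p))
        p++;
    return p;
}

/* directive ::= token | token directive */
static const char *parse_directive(const char *p)
{
    p = parse_token(p);
    while (p != NULL) {
        const char *q = skip_ws(p);
        if (*q != '{' && !is_pt_char(*q))
            break;                      /* no further token follows */
        p = parse_token(q);
    }
    return p;
}

/* sequence ::= empty | directive | directive SEP sequence */
static const char *parse_sequence(const char *p)
{
    p = skip_ws(p);
    if (*p != '{' && !is_pt_char(*p))
        return p;                       /* empty sequence */
    p = parse_directive(p);
    if (p == NULL)
        return NULL;
    p = skip_ws(p);
    if (*p == ';')
        return parse_sequence(p + 1);
    return p;
}

/* Returns 1 if the whole text is one valid sequence, 0 otherwise. */
int sketch_accepts(const char *text)
{
    const char *end = parse_sequence(text);
    return end != NULL && *skip_ws(end) == '\0';
}
```

With this sketch, C<sketch_accepts("foo { bar; baz } quux;")> returns 1,
while an unbalanced input such as C<foo { bar> yields 0.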

The remaining terminal symbols are themselves defined by the following
set of grammar productions (regular sub-grammars for character
sequences, given as Perl-style regular expressions "/I<regex>/"):

B<SEP>        ::= /;/

B<OPEN>       ::= /{/

B<CLOSE>      ::= /}/

B<DQ_STRING>  ::= /"/ B<DQ_CHARS> /"/

B<DQ_CHARS>   ::= I<empty>
             | B<DQ_CHAR> B<DQ_CHARS>

B<DQ_CHAR>    ::= /\\"/               # escaped quote
             | /\\x\{[0-9a-fA-F]+\}/ # hex-char group
             | /\\x[0-9a-fA-F]{2}/ # hex-char
             | /\\[0-7]{1,3}/      # octal character
             | /\\[nrtbfae]/       # special character
             | /\\\n[ \t]*/        # line continuation
             | /\\\\/              # escaped escape
             | /./                 # any other char
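
How the B<DQ_CHAR> alternatives might be turned into bytes can be
sketched in C as follows. This is an illustration only, not the library's
decoder: the function name C<dq_decode> is hypothetical, C<\e> is assumed
to mean the ESC character (octal 033), and the hex-char group form
C<\x{...}> is deliberately left out because its byte semantics are not
spelled out here. Octal values above 255 are simply truncated by the cast.

```c
#include <stddef.h>
#include <string.h>

static int hexval(char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Decode the body of a double-quoted string (quotes already stripped).
   out needs room for strlen(in) + 1 bytes. Returns the decoded length.
   Hypothetical helper; not part of the OSSP cfg API. */
size_t dq_decode(const char *in, char *out)
{
    static const char names[]  = "nrtbfae";
    static const char values[] = "\n\r\t\b\f\a\033";   /* \e assumed = ESC */
    size_t n = 0;

    while (*in != '\0') {
        const char *s;
        if (*in != '\\') {                       /* any other char */
            out[n++] = *in++;
        } else if (in[1] == '"' || in[1] == '\\') {
            out[n++] = in[1];                    /* escaped quote or escape */
            in += 2;
        } else if (in[1] == 'x' && hexval(in[2]) >= 0 && hexval(in[3]) >= 0) {
            out[n++] = (char)(hexval(in[2]) * 16 + hexval(in[3]));
            in += 4;                             /* two-digit hex-char */
        } else if (in[1] >= '0' && in[1] <= '7') {
            int v = 0, k = 0;                    /* one to three octal digits */
            for (in++; k < 3 && *in >= '0' && *in <= '7'; k++)
                v = v * 8 + (*in++ - '0');
            out[n++] = (char)v;
        } else if (in[1] != '\0' && (s = strchr(names, in[1])) != NULL) {
            out[n++] = values[s - names];        /* special character */
            in += 2;
        } else if (in[1] == '\n') {              /* line continuation */
            in += 2;
            while (*in == ' ' || *in == '\t')
                in++;
        } else {
            out[n++] = *in++;                    /* lone backslash kept as-is */
        }
    }
    out[n] = '\0';
    return n;
}
```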

B<SQ_STRING>  ::= /'/ B<SQ_CHARS> /'/

B<SQ_CHARS>   ::= I<empty>
             | B<SQ_CHAR> B<SQ_CHARS>

B<SQ_CHAR>    ::= /\\'/               # escaped quote
             | /\\\n[ \t]*/        # line continuation
             | /\\\\/              # escaped escape
             | /./                 # any other char

B<FQ_STRING>  ::= /q/ B<FQ_OPEN> B<FQ_CHARS> B<FQ_CLOSE>

B<FQ_CHARS>   ::= I<empty>
             | B<FQ_CHAR> B<FQ_CHARS>

B<FQ_CHAR>    ::= /\\/ B<FQ_OPEN>        # escaped open
             | /\\/ B<FQ_CLOSE>       # escaped close
             | /\\\n[ \t]*/        # line continuation
             | /./                 # any other char

B<FQ_OPEN>    ::= /[!"#$%&'()*+,-./:;<=>?@\[\\\]^_`{|}~]/

B<FQ_CLOSE>   ::= E<lt>E<lt> B<FQ_OPEN> or corresponding closing char
                  ('}])>') if B<FQ_OPEN> is a char of '{[(<' E<gt>E<gt>
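
Because B<FQ_CLOSE> is defined informally, a short C sketch may help. The
function name C<fq_close_for> is hypothetical and not part of the library
API; it only maps an opening delimiter to its closing counterpart and
does not check that the argument is a valid B<FQ_OPEN> character.

```c
/* Return the closing delimiter that matches a flexible-quote opener.
   Hypothetical helper; not part of the OSSP cfg API. */
char fq_close_for(char open)
{
    switch (open) {
    case '{': return '}';
    case '[': return ']';
    case '(': return ')';
    case '<': return '>';
    default:  return open;   /* any other opener closes itself */
    }
}
```

Under this reading, C<q{foo bar}> and C<q!foo bar!> both denote the
string C<foo bar>.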

B<PT_STRING>  ::= B<PT_CHAR> B<PT_CHARS>

B<PT_CHARS>   ::= I<empty>
             | B<PT_CHAR> B<PT_CHARS>

B<PT_CHAR>    ::= /[^ \t\n;{}"']/     # none of specials

Additionally, white-space B<WS> and comment B<CO> tokens are allowed at
any position in the above productions of the previous grammar part.

B<WS>         ::= /[ \t\n]+/

B<CO>         ::= B<CO_C>                # style of C
             | B<CO_CXX>              # style of C++
             | B<CO_SH>               # style of /bin/sh

B<CO_C>       ::= /\/\*([^*]|\*(?!\/))*\*\//

B<CO_CXX>     ::= /\/\/[^\n]*/

B<CO_SH>      ::= /#[^\n]*/
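
The three comment styles can be recognized without a regular expression
engine. The following C sketch uses a hypothetical function
C<comment_len> (not part of the library API) that returns the length of
the comment starting at a given position, or 0 if no complete comment
starts there.

```c
#include <stddef.h>
#include <string.h>

/* Length of the comment starting at p, or 0 if there is none.
   Hypothetical helper; not part of the OSSP cfg API. */
size_t comment_len(const char *p)
{
    if (p[0] == '/' && p[1] == '*') {                   /* CO_C */
        const char *end = strstr(p + 2, "*/");
        return end != NULL ? (size_t)(end + 2 - p) : 0; /* 0 if unterminated */
    }
    if ((p[0] == '/' && p[1] == '/') || p[0] == '#')    /* CO_CXX, CO_SH */
        return strcspn(p, "\n");                        /* up to the newline */
    return 0;
}
```

As in the B<CO_C> production, a C-style comment ends at the first C<*/>,
so such comments do not nest.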

Finally, any configuration line can have a trailing backslash character
(C<\>) just before the newline character for simple line continuation.
The backslash, the newline and (optionally) the leading whitespace on
the following line are silently absorbed and, as a side effect, continue
the first line with the contents of the second line.
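
The effect of line continuation can be sketched in C as an in-place
rewrite of the input buffer. The function name C<join_continuations> is
hypothetical and not part of the library API, and the sketch treats every
backslash directly followed by a newline as a continuation, without
checking whether the backslash is itself escaped.

```c
#include <stddef.h>

/* Remove each backslash-newline pair and the leading blanks of the
   following line, in place. Returns the new string length.
   Hypothetical helper; not part of the OSSP cfg API. */
size_t join_continuations(char *s)
{
    char *r = s;   /* read position */
    char *w = s;   /* write position */

    while (*r != '\0') {
        if (r[0] == '\\' && r[1] == '\n') {
            r += 2;                             /* backslash and newline */
            while (*r == ' ' || *r == '\t')
                r++;                            /* leading whitespace */
        } else {
            *w++ = *r++;
        }
    }
    *w = '\0';
    return (size_t)(w - s);
}
```

For example, the two lines C<foo \> and C<    bar;> are joined into the
single line C<foo bar;>.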

=head2 CONFIGURATION EXAMPLE

A more intuitive description of the configuration syntax is perhaps given by
the following example, which shows the most common features at once:

 /* single word */
 foo;

 /* multi word */
 foo bar quux;

 /* nested structure */
 foo { bar; baz } quux;

 /* quoted strings */
 'foo bar'
 "foo\x0a\t\n\
  bar"

=head1 APPLICATION PROGRAMMING INTERFACE (API)

...

=head1 NODE SELECTION SPECIFICATION

The B<cfg_node_select> function takes a I<node selection specification>
string B<select> for locating the intended nodes. This specification is
defined as:

B<select>           ::= I<empty>
                   | B<select-step> B<select>

B<select-step>      ::= B<select-direction>
                     B<select-pattern>
                     B<select-filter>

B<select-direction> ::= "./"        # current node
                   | "../"       # parent node
                   | "..../"     # ancestor nodes
                   | "-/"        # previous sibling node
                   | "--/"       # preceding sibling nodes
                   | "+/"        # next sibling node
                   | "++/"       # following sibling nodes
                   | "/"         # child nodes
                   | "//"        # descendant nodes

B<select-pattern>   ::= /</ B<regex> />/
                   | B<token>

B<select-filter>    ::= I<empty>
                   | /\[/ B<filter-range> /\]/

B<filter-range>     ::= B<num>           # short for: num,num
                   | B<num> /,/          # short for: num,-1
                   | /,/ B<num>          # short for: 1,num
                   | B<num> /,/ B<num>

B<num>              ::= /^[+-]?[0-9]+/

B<regex>            ::= E<lt>E<lt> Regular Expression (PCRE-based) E<gt>E<gt>

B<token>            ::= E<lt>E<lt> Plain-Text Token String E<gt>E<gt>
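
The shorthand forms of B<filter-range> can be made explicit with a small
C sketch. The function name C<parse_filter_range> is hypothetical and not
part of the library API. It takes the text between the square brackets
and fills in the defaults listed in the grammar comments; how negative
numbers such as -1 are interpreted afterwards is outside the scope of
this sketch.

```c
#include <stdlib.h>

/* Parse the text between '[' and ']' of a select-filter.
   Returns 0 on success and -1 on a syntax error.
   Hypothetical helper; not part of the OSSP cfg API.
   Note: strtol() also skips leading white-space, which the
   num production does not allow. */
int parse_filter_range(const char *s, long *lo, long *hi)
{
    char *end;
    int have_lo = 0;

    *lo = 1;                              /* ",num" is short for "1,num" */
    if (*s != ',') {
        *lo = strtol(s, &end, 10);
        if (end == s)
            return -1;                    /* no number found */
        have_lo = 1;
        s = end;
        if (*s == '\0') {                 /* "num" is short for "num,num" */
            *hi = *lo;
            return 0;
        }
        if (*s != ',')
            return -1;
    }
    s++;                                  /* skip the comma */
    if (*s == '\0') {
        if (!have_lo)
            return -1;                    /* a bare "," is not in the grammar */
        *hi = -1;                         /* "num," is short for "num,-1" */
        return 0;
    }
    *hi = strtol(s, &end, 10);
    return (end != s && *end == '\0') ? 0 : -1;
}
```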

=head1 IMPLEMENTATION ISSUES

The implementation pursues the following goals:

=over 4

=item *

non-hard-coded syntax tokens, only hard-coded syntax structure

=item *

time-efficient parsing

=item *

space-efficient storage

=item *

representation of configuration as AST

=item *

manipulation (annotation, etc) of AST via API

=item *

dynamic syntax verification

=back

=head1 HISTORY

B<OSSP cfg> was implemented in many small steps over a very
long time. The first ideas date back to 1995, when
Ralf S. Engelschall attended his first compiler construction lessons at
university. He first completed it in summer 2002 for use
in the B<OSSP> project.

=head1 AUTHOR

 Ralf S. Engelschall
 rse@engelschall.com
 www.engelschall.com

=cut

