ffe(1)                                                                  ffe(1)




NAME

       ffe - flat file extractor


SYNOPSIS

       ffe [options]...



DESCRIPTION

       ffe  is a program for extracting fields from flat file records and dis-
       playing them in different formats. ffe relies on the configuration file
       to control input file structure and the output format.


OPTIONS

       ffe accepts the following options:

       -c, --configuration=file
              Read the configuration from file, default is ~/.fferc.

       -s, --structure=STRUCTURE
              Input file is processed using the structure STRUCTURE.

       -p, --print=FORMAT
              Use  output format FORMAT for printing. All printing can be sup-
              pressed using format no. Original data is printed  using  format
              raw.

       -o, --output=NAME
              Write output to NAME instead of standard output.

       -f, --field-list=LIST
              Print  only  fields  and  constants specified in comma separated
              list LIST.

       -e, --expression=EXPRESSION
              Print only those records for which the EXPRESSION  evaluates  to
              true.

       -a, --and
              Expressions  are  combined  with logical and, default is logical
              or.

       -X, --casecmp
              Expressions are evaluated case-insensitively.

       -v, --invert-match
              Print only those records which don't match the expression.

       -l, --loose
              An invalid input line does not cause program to abort.

       -r, --replace=FIELD=VALUE
              Replace FIELD's contents with VALUE in the output. VALUE can
              contain the same directives as the output option data.

       -d, --debug
              All invalid input lines are written to file ffe_error_<pid>.log.

       -I, --info
              Show the structure information in configuration file and exit.

       -?, --help
              List all available options and their meanings and exit.

       -V, --version
              Show version of program and exit.


       All remaining arguments are names of input files; if no input files are
       specified, then the standard input is read.


   Expressions (option -e, --expression)
       Expressions can be used to select specific records by comparing field
       values.

       If the value starts with the string "file:", the rest of the value is
       treated as a file name. Every line in the file is used as a value in
       the comparison. A record is selected if one or more of the values
       evaluates to true.

       Expression notation:


       field=value
              A  record  will  be  selected if the field field is equal to the
              value value.

       field^value
              A record will be selected if the field  field  starts  with  the
              value value.

       field~value
              A  record will be selected if the field field contains the value
              value.

       field!value
              A record will be selected if the field field is not equal to the
              value value.

       field?value
              A record will be selected if the field field matches the regular
              expression in value.
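
       As a sketch only, the configuration below defines a small comma
       separated structure, and the comment lines show how each operator
       might be used on the command line. The file names people.rc,
       people.csv and names.txt and all structure and field names are
       hypothetical. The sketch assumes that -e may be repeated, as the
       description of -a suggests. The expressions are quoted because
       characters such as ?, !, ~ and ^ can be special to the shell.

       structure people {
           type separated ,
           record person {
               field FirstName
               field LastName
               field Age
           }
       }

       output default {
           data "%n=%t\n"
       }

       # Possible invocations, one operator each:
       #   ffe -c people.rc -s people -e 'LastName=Tiger' people.csv
       #   ffe -c people.rc -s people -e 'LastName^Ti' people.csv
       #   ffe -c people.rc -s people -e 'FirstName~ar' people.csv
       #   ffe -c people.rc -s people -e 'Age!23' people.csv
       #   ffe -c people.rc -s people -e 'LastName?^T.*r$' people.csv
       # Values read from a file, one per line:
       #   ffe -c people.rc -s people -e 'LastName=file:names.txt' people.csv
       # Two expressions that must both match:
       #   ffe -c people.rc -s people -a -e 'Age!23' -e 'LastName^T' people.csv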





FFE CONFIGURATION

       ffe uses the configuration file for extracting fields  from  the  input
       file  and  for  formatting  the fields for output. Every line or binary
       block of the input file is considered as a record.  Default  configura-
       tion file is ~/.fferc but another file can be given with '-c' option.

       Configuration  file  for ffe is a text file. The file may contain empty
       lines. Commands are case-sensitive. Comments  begin  with  the  #-char-
       acter  and  end at the end of the line. The string and char definitions
       can be enclosed in double quotation '"' characters. char  is  a  single
       character.   string  and  char  can  contain  following  escape  codes:
       '\a','\b','\t','\n','\v','\f', '\r', '\"' and '\#'. Character  '\'  can
       be escaped as '\\'.

       Command Substitution allows the output of a command to replace parts of
       the configuration file. Syntax for command substitution is:
       `command`
       The command is executed and the `command` is substituted with the stan-
       dard output of the command, with any trailing newlines deleted. Command
       substitutions may not be nested.

       Before executing the command, ffe sets a few environment variables:

       FFE_STRUCTURE
              The name of the structure given using -s,--structure.

       FFE_OUTPUT
              The name of the output file given using -o,--output.

       FFE_FORMAT
              The name of the output format given using -p,--print.

       FFE_FIRST_FILE
              The name of the first input file.

       FFE_FILES
              A list of all input files.

       If a variable is already set, it is not replaced.
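
       A minimal sketch of command substitution follows, assuming that the
       date and echo commands are available and that the command is run
       through a shell so that $FFE_FIRST_FILE is expanded. The substitutions
       are written as unquoted values because this page does not say whether
       they are also expanded inside quoted strings. The constant and field
       names are hypothetical.

       # date prints a single word here, so the value needs no quoting
       const RunDate `date +%Y-%m-%d`
       # a file name containing spaces would probably need extra care
       const SourceFile `echo $FFE_FIRST_FILE`

       output default {
           data "%n=%t\n"
           field-list FirstName,LastName,RunDate,SourceFile
       }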


   Input file structure
       Input file structures are specified with keyword structure:

       structure name {options...}

       Each option must end with a newline. The options are:


       type fixed|binary|separated [char] [*]
              Fields in the input are fixed length text fields,  fixed  length
              binary  fields  or text fields separated by char. If * is given,
              multiple sequential separators are considered  as  one.  Default
              separator is comma.

       quoted [char]
              Fields may be quoted with char; the default quotation mark is
              the double quotation mark '"'. A quotation mark inside a field
              is assumed to be escaped in the input either as \char or by
              doubling it as charchar. Unescaped quotation marks are not
              preserved in the output.

       header first|all|no
              Controls the occurrence of the header line. Default  is  no.  If
              set  as  first or all, the first line of the first input file is
              considered as header line containing the names  of  the  fields.
              First  means  that  only  the first file has a header, all means
              that all files have a header, although the names are still taken
              from the header of the first file. The header line is handled
              according to the record definition, meaning that the name
              positions, separators etc. are the same as for the fields.

       output name
              All records belonging to this structure are printed according
              to output format name. The default is the output named
              'default'.

       record name {options...}
              Defines one record for a structure. A structure can contain sev-
              eral record types.
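
       The structure options above might be combined as in the following
       sketch. The structure and record names are hypothetical; the first
       structure assumes comma separated input with a header line, the
       second assumes columns separated by one or more spaces.

       structure people_csv {
           # comma separated, fields optionally enclosed in double quotes
           type separated ,
           quoted
           # field names are taken from the first line of the first file
           header first
           output default
           record person {
               field *
               field *
               field *
           }
       }

       structure columns {
           # runs of spaces count as a single separator
           type separated " " *
           record row {
               field *
               field *
           }
       }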

   Record options:
       id position string

       rid position regexp
              Identifies a record in the input file. Records are identified by
              the string or by the  regular  expression  in  regexp  in  input
              record  position position. For fixed length and binary input the
              position is the byte position of the input record and for  sepa-
              rated  input  the  position  means  the position'th field of the
              input record. Positions start from one.

              Id's are required  only  if  input  structure  contains  several
              record  types  with equal lengths or field counts. Non printable
              characters can be escaped as \xnn where nn  is  the  hexadecimal
              value of the character.

              A record definition can contain several id's; in that case all
              id's must match the input line (id's are combined with logical
              and).

              In a multi-record binary structure every  record  must  have  at
              least one id.

       field name|FILLER|* [length]|* [lookup]|* [output]
              Specifies  one field in a text input structure. length is manda-
              tory for fixed length input structure except for the last field.
              If  the  last field of a fixed length input structure has a * in
              place of length then the last field can have arbitrary length.

              Length is also used for printing fields in fixed length format
              using the %D, %C or %L directive. The order of the fields in
              the configuration file is significant: it specifies the field
              order in a record.

              If '*' is given instead of the name, then the name will be the
              ordinal number of the field, or, if the 'header' option has the
              value 'first' or 'all', the name of the field will be taken
              from the header line (first line of the input).

              If lookup is given, the field's contents are used to make a
              lookup in lookup table lookup. If length is not needed
              (separated format) but lookup is needed, use an asterisk (*) in
              place of the length definition.

              If output is given, the field is printed using output format
              output. Use an asterisk in place of lookup if lookup is not
              needed.

              Naming the field FILLER causes the field not to be printed in
              the output.

       field name|FILLER|* [length]|type [lookup]|* [output]
              Specifies one field in a binary input structure. All other
              features are the same as for the text structure except the type
              parameter. type specifies the field data type and length and
              can have the following values:


              char Printable character.

              short Short integer having current system length and byte order.

              int Integer having current system length and byte order.

              long Long integer having current system length and byte order.

              llong  Long  long  integer having current system length and byte
              order.

              ushort Unsigned short integer having current system  length  and
              byte order.

              uint  Unsigned  integer  having  current  system length and byte
              order.

              ulong Unsigned long integer having  current  system  length  and
              byte order.

              ullong  Unsigned  long long integer having current system length
              and byte order.

              int8 8 bit integer.

              int16_be Big endian 16 bit integer.

              int32_be Big endian 32 bit integer.

              int64_be Big endian 64 bit integer.

              int16_le Little endian 16 bit integer.

              int32_le Little endian 32 bit integer.

              int64_le Little endian 64 bit integer.

              uint8 Unsigned 8 bit integer.

              uint16_be Unsigned big endian 16 bit integer.

              uint32_be Unsigned big endian 32 bit integer.

              uint64_be Unsigned big endian 64 bit integer.

              uint16_le Unsigned little endian 16 bit integer.

              uint32_le Unsigned little endian 32 bit integer.

              uint64_le Unsigned little endian 64 bit integer.

              float Float having current system length and byte order.

              float_be Float having current system length and big endian  byte
              order.

              float_le  Float  having  current system length and little endian
              byte order.

              double Double having current system length and byte order.

              double_be Double having current system  length  and  big  endian
              byte order.

              double_le  Double having current system length and little endian
              byte order.

              bcd_be_len Bcd number having  length  len  and  nybbles  in  big
              endian order.

              bcd_le_len  Bcd  number  having length len and nybbles in little
              endian order.

              hex_be_len Hexadecimal data in big endian  order  having  length
              len.

              hex_le_len Hexadecimal data in little endian order having length
              len.

              If length is given instead of the type, the field is assumed to
              be a printable string of length length. The string is printed
              until length characters have been printed or a NUL character is
              found.

              Bcd  number  (bcd_be_len  and  bcd_le_len)  is printed until len
              bytes are read or a nybble having hexadecimal value f is  found.
              Bcd  number  having  big  endian order is printed in order: most
              significant nybble first and least significant nybble second and
              bcd number having little endian order is printed in order: least
              significant nybble first and  most  significant  nybble  second.
              Bytes are always read in big endian order.

              Hexadecimal data (hex_be_len and hex_le_len) is printed as hexa-
              decimal values. Big endian data is  printed  starting  from  the
              lower  address  and  little  endian data starting from the upper
              address.


       field-count number
              Same effect as having field * number times.  Because  length  is
              not specified, this works only with separated structure.

       fields-from record
              Fields for this record are the same as for record record.

       output name
              This record is printed according to output format name. The
              default is the output format specified in the structure.

       level number [element_name|*] [group_name]
              Level can be used if the contents of a file should be printed
              as a hierarchical, multi-level nested document. Use * instead
              of the element name if it is not needed. number is the level of
              the record, starting from one (the highest level), element_name
              is the name for the record, and group_name is used to group
              records in the same and lower levels. Only number is mandatory.

       record-length strict|minimum

              strict Input record length and field count must match the record
              definition  in order to get it processed. This is default value.

              minimum Input record length and field count can be the same as
              or longer than defined for the record. The rest of the input
              line is ignored.
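
       The record options above could be used as in the following sketch,
       which is an illustration under stated assumptions rather than a
       tested configuration. All structure, record and field names are
       hypothetical. The first structure assumes a semicolon separated file
       whose three record types all have three fields, so each record needs
       an id; the second assumes a binary file with one record type.

       structure billing {
           type separated ";"
           record header {
               # first field must be H
               id 1 H
               field FILLER
               field bill_no
               field customer
           }
           record line {
               id 1 L
               field FILLER
               field item
               field amount
               # extra trailing fields on a line are ignored
               record-length minimum
           }
           record trailer {
               id 1 T
               field FILLER
               field line_count
               field total
           }
       }

       structure sensor_log {
           type binary
           record sample {
               # 4 byte printable string
               field tag 4
               field seq uint16_be
               field value float_le
               # 3 byte BCD number, most significant nybble first
               field serial bcd_be_3
           }
       }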


   Output definitions
       There can be several output definitions in the configuration file. The
       format can be selected with the '-p' option. The default format is
       named 'default'.

       output name|default {options...}
              Defines one output format. Output named  as  'default'  will  be
              used  if none is given for structure or record, or none is given
              with option '-p'.

              There are two predefined output formats, no and raw. no
              suppresses all printing and raw prints the original input data.

   Output options
       Pictures in output definition can contain printf-style %-directives:


       %f     Name of the input file.

       %s     Name of the current structure.

       %r     Name of the current record.

       %o     Input record number in current file.

       %O     Input record number starting from the first file.

       %i     Byte  offset  of  the current record in the current file. Starts
              from zero.

       %I     Byte offset of the current record starting from the first  file.
              Starts from zero.

       %n     Field name.

       %t     Field contents, without leading and trailing whitespace.

       %d     Field  contents.  Binary  integer is printed as a decimal value.
              Floating point number is printed in the style [-]ddd.ddd,  where
              the number of digits after the decimal-point character is 6. Bcd
              number is printed as a decimal number and  hexadecimal  data  as
              consecutive hexadecimal values.

       %D     Field  contents,  right  padded  to  the  field length (requires
              length definition for the field).

       %C     Field contents, right padded to the field length (requires
              length definition for the field). The output field is cut if
              the input field is longer than the field length.

       %x     Unsigned hexadecimal value of a binary integer. Other fields are
              printed using directive %d.

       %l     Value from lookup.

       %L     Value  from  lookup,  right padded to the field length (requires
              length definition for the field).

       %e     Does not print anything, but still causes the "field empty"
              check to be performed. Can be used when only the names of
              non-empty fields should be printed.

       %p     Field's start position in a record. For a fixed structure this
              is the field's byte position in the input line and for a
              separated structure this is the ordinal number of the field.
              Starts from one.

       %h     Hexadecimal dump of a field. Byte values are printed as consecu-
              tive  xnn  values,  where  the  nn is the hexadecimal value of a
              byte. Data is printed before any endian conversion.

       %g     Group name given by the keyword group_name in record definition.

       %m     Element name given by the keyword element_name in record defini-
              tion.

       %%     Percent sign.
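
       As an illustration, the directives above might be combined in an
       output definition such as the following. The output name listing is
       hypothetical, and the sketch assumes that the file and record
       directives are accepted in the pictures where they are used here,
       which this page does not state explicitly.

       output listing {
           # input file name, printed once
           file_header "File %f\n"
           # record number in the file, byte offset and record name
           record_header "Record %o at byte offset %i (%r)\n"
           # field position, name and trimmed contents
           data "%p: %n=%t\n"
           indent "  "
       }

       # Possible invocation (file names are hypothetical):
       #   ffe -c people.rc -s people -p listing people.csv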


       file_header picture
              Picture is printed once before file contents.

       file_trailer picture
              Picture is printed once after file contents.

       header picture
              If specified, a header line describing the field names is
              printed before the records. Every field name is printed
              according to the picture, using the same separator and field
              lengths as defined for the fields. Picture can contain only the
              %n directive.

       data picture
              Field contents are printed according to picture.

       lookup picture
              If  field  is  mapped to lookup table, this picture will be used
              instead of picture from data option. If not given, then  picture
              from data will be used.

       separator string
              All  fields  are  terminated by string, except the last field of
              the record. Default is not to print separator.

       record_header picture
              picture is printed before the record content. Default is not  to
              print header.

       record_trailer picture
              picture is printed after the record content. Default is newline.

       justify left|right|char
              Fields are left or right justified. char justifies output
              according to the first occurrence of char in the data picture.
              Default is left.

       indent string
              Record contents are indented by string. Field contents are
              indented by two times the string. Default is not to indent.

       field-list name1,name2,...
              Only fields or constants named name1,name2,... are printed;
              this has the same effect as the '-f' option. Default is to
              print all the fields. Fields are also printed in the same order
              as they are listed.

       no-data-print yes|no
              When set as no and field-list is given, suppresses  printing  of
              record_header  and  record_trailer  in case where current record
              contains none of the fields specified in field-list.

       field-empty-print yes|no
              When set as no, nothing is  printed  for  fields  which  consist
              entirely  of  characters from empty-chars. If none of the fields
              of a record are printed then the printing of  record_trailer  is
              also suppressed. Default is yes.

       empty-chars string
              string  specifies  a  set  of characters which define an "empty"
              field. Default is " \f\n\r\t\v" (space, form-feed, newline, car-
              riage return, horizontal tab and vertical tab).

       output-file file
              Output is written to file instead of the default output. If - is
              given the standard output is used.

       group_header string
              If a record has a  level  and  group  name  defined,  string  is
              printed  before  the  first  record  in the same group or if the
              group name has changed in the same level.

       group_trailer string
              If a record has a  level  and  group  name  defined,  string  is
              printed  after  the records in lower levels or if the group name
              has changed in the same level or if a  higher  level  record  is
              found.

       element_header string
              If a record has a level and element name defined, string is
              printed before the record's contents.

       element_trailer string
              If a record has a level and element name defined, string is
              printed after the record's contents.

       hex-caps yes|no
              Print hexadecimal numbers in capital letters. Default is no.
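
       The level, group and element options might work together as in the
       following sketch of a nested document. All names are hypothetical.
       The input is assumed to be comma separated, with P or C in the first
       field marking parent and child records, and element_trailer is taken
       to be the closing counterpart of element_header.

       structure family {
           type separated ,
           record parent {
               id 1 P
               field FILLER
               field Name
               # highest level, element name only
               level 1 parent
           }
           record child {
               id 1 C
               field FILLER
               field Name
               # second level, grouped under the name children
               level 2 child children
           }
       }

       output nested {
           file_header "<family>\n"
           file_trailer "</family>\n"
           group_header "<%g>\n"
           group_trailer "</%g>\n"
           element_header "<%m>\n"
           element_trailer "</%m>\n"
           data "<%n>%t</%n>\n"
           indent " "
       }

       # Possible invocation (file names are hypothetical):
       #   ffe -c family.rc -s family -p nested family.csv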


   Lookup definitions
       lookup name {options...}
              Defines one lookup table.


   Lookup options:
       search exact|longest
              The search type for lookup table.

       default-value value
              value is printed if the lookup is not successful.

       pair key value
              One key/value pair for the lookup table.

       file name [separator]
              Key/value  pairs  are read from file name. Every line is consid-
              ered as a key/value pair separated by separator. Default separa-
              tor is semicolon.
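
       A lookup table and a field that refers to it could look like the
       sketch below. The lookup, structure and field names are hypothetical.
       The asterisk stands in place of the length, which a separated
       structure does not need, and the lookup picture is used for the
       mapped field instead of the data picture.

       lookup RecordType {
           search exact
           pair H Header
           pair L "Invoice line"
           pair T Trailer
           default-value "Unknown record type"
       }

       structure billing {
           type separated ";"
           record any {
               field type * RecordType
               field bill_no
               field text
           }
       }

       output default {
           data "%n=%t\n"
           # printed for the field type: looked up value, then raw contents
           lookup "%n=%l (%t)\n"
       }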


   Constants
       In addition to input fields, constant values can be printed using
       option -f,--field-list or output option field-list. A constant is
       printed using the data output option.

       Constants are specified as

       const name value
              When the name appears in a field list, value will be printed for
              every record as if the name were one of the input fields.
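
       For illustration, a constant might be defined and listed as follows.
       The names source and tagged and the field names are hypothetical, and
       the sketch assumes that const is written at the top level of the
       configuration file, like structure, output and lookup.

       const source "legacy-hr"

       output tagged {
           data "%n=%t\n"
           # source is printed for every record after the two input fields
           field-list FirstName,LastName,source
       }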


   Input Preprocessor
       It is possible to define an input preprocessor for ffe. An input
       preprocessor is simply an executable program which writes the contents
       of the input file to standard output, which is then read by ffe. If the
       input preprocessor does not write any characters to its standard
       output, ffe uses the original file.

       To set up an input preprocessor, set the FFEOPEN  environment  variable
       to  a command line which will invoke your input preprocessor. This com-
       mand line should include one occurrence of the string %s, which will be
       replaced  by  the input filename when the input preprocessor command is
       invoked.

       The input preprocessor is not used if ffe is reading standard input.



EXAMPLES

       Example of fixed length flat file containing fields  'FirstName','Last-
       Name' and 'Age':

       John     Ripper       23
       Scott    Tiger        45
       Mary     Moore        41


       This file can be printed in XML with the following configuration:

       structure personnel {
           type fixed
           output XML
           record person {
               field FirstName 9
               field LastName  13
               field Age 2
           }
       }

       output XML {
           file_header "<?xml version=\"1.0\" encoding=\"ISO-8859-1\"?>\n"
           data "<%n>%d</%n>\n"
           record_header "<%r>\n"
           record_trailer "</%r>\n"
           indent " "
       }
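
       As a further sketch, the same structure could be printed as comma
       separated values by adding a second output definition. The name CSV
       is hypothetical. Selecting it with -p CSV assumes that the command
       line option takes precedence over the output XML line in the
       structure; if it does not, that line would have to be changed.

       output CSV {
           # trimmed field contents, fields separated by commas
           data "%t"
           separator ","
           # print the fields in a different order than in the input
           field-list LastName,FirstName,Age
       }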







SEE ALSO

       More examples are in the Texinfo manual. If info and ffe are properly
       installed, the command



              info ffe



       should give more information.


AUTHOR

       Timo Savinen <tjsa@iki.fi>



Timo Savinen                      2011-04-06                            ffe(1)

ffe 3.7.0 - Generated Sun Jan 29 06:32:20 CST 2017