4 How `ffe` works

Following examples use two different input files:

Fixed length example

Fixed length personnel file with header and trailer, line (record) is identified by the first byte (H = Header, E = Employee, B = Boss, T = trailer).

$cat personnel.fix
H2006-02-25
EJohn     Ripper       23
BScott    Tiger        45
EMary     Moore        41
ERidge    Forrester    31
T0004
$

Structure for reading file above. Note that record ‘boss’ reuses fields from ‘employee’.

structure personel_fix {
    type fixed
    record header {
        id 1 H
        field type 1
        field date 10
    }
    record employee {
        id 1 E
        field EmpType 1
        field FirstName 9
        field LastName  13
        field Age 2
    }
    record boss {
        id 1 B
        fields-from employee
    }
    record trailer {
        id 1 T
        field type 1
        field count 4
    }
}

Separated example

Same file as above, but now separated by comma.

$cat personnel.sep
H,2006-02-25
E,john,Ripper,23
B,Scott,Tiger,45
E,Mary,Moore,41
E,Ridge,Forrester,31
T,0004
$

Structure for reading file above. Note that the field lengths are not needed in separated format. Length is need if the separated data is to be printed in fixed length format.

structure personel_sep {
    type separated ,
    record header {
        id 1 H
        field type 
        field date 
    }
    record employee {
        id 1 E
        field type 
        field FirstName 
        field LastName 
        field Age 
    }
    record boss {
        id 1 B
        fields-from employee
    }
    record trailer {
        id 1 T
        field type 
        field count
    }
}

Printing in XML format

Data in examples above can be printed in XML using output definition like:

output xml {
    file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    data "<%n>%t</%n>\n"
    record_header "<%r>\n"
    record_trailer "</%r>\n"
    indent " "
}

Example output using command (assuming definitions above are saved in ~/.fferc)

ffe -p xml personnel.sep

<?xml version="1.0" encoding="UTF-8"?>
 <header>
  <type>H</type>
  <date>2006-02-25</date>
 </header>
 <employee>
  <type>E</type>
  <FirstName>john</FirstName>
  <LastName>Ripper</LastName>
  <Age>23</Age>
 </employee>
 <boss>
  <type>B</type>
  <FirstName>Scott</FirstName>
  <LastName>Tiger</LastName>
  <Age>45</Age>
 </boss>
 <employee>
  <type>E</type>
  <FirstName>Mary</FirstName>
  <LastName>Moore</LastName>
  <Age>41</Age>
 </employee>
 <employee>
  <type>E</type>
  <FirstName>Ridge</FirstName>
  <LastName>Forrester</LastName>
  <Age>31</Age>
 </employee>
 <trailer>
  <type>T</type>
  <count>0004</count>
 </trailer>

Printing sql commands

Data in examples above can be loaded to database by generated sql commands. Note that the header and trailer are not loaded, because only fields ‘FirstName’,‘LastName’ and ‘Age’ are printed and ‘no-data-print’ is set as no. This prevents the ‘record_header’ and ‘record_trailer’ to be printed for file header and trailer.

output sql {
    file_header "delete table boss;\ndelete table employee;\n"
    record_header "insert into %r values("
    data "'%t'"
    separator ","
    record_trailer ");\n"
    file_trailer "commit\nquit\n"
    no-data-print no
    field-list FirstName,LastName,Age
}

Output from command

ffe -p sql personnel.sep

delete table boss;
delete table employee;
insert into employee values('john','Ripper','23');
insert into boss values('Scott','Tiger','45');
insert into employee values('Mary','Moore','41');
insert into employee values('Ridge','Forrester','31');
commit
quit

Human readable output

This output format shows the fields in format suitable for displaying in screen or printing.

output nice {
    record_header "%s - %r - %f - %o\n"
    data "%n=%t\n"
    justify =
    indent " "
}

Output from command

ffe -p nice personnel.fix

 personel - header - personnel.fix - 1
  type=H
  date=2006-02-25
 
 personel - employee - personnel.fix - 2
    EmpType=E
  FirstName=John
   LastName=Ripper
        Age=23
 
 personel - boss - personnel.fix - 3
    EmpType=B
  FirstName=Scott
   LastName=Tiger
        Age=45
 
 personel - employee - personnel.fix - 4
    EmpType=E
  FirstName=Mary
   LastName=Moore
        Age=41
 
 personel - employee - personnel.fix - 5
    EmpType=E
  FirstName=Ridge
   LastName=Forrester
        Age=31
 
 personel - trailer - personnel.fix - 6
   type=T
  count=0004

HTML table

Personnel data can be displayed as HTML table using output like:

output html {
    file_header "<html>\n<head>\n</head>\n<body>\n<table border=\"1\">\n<tr>\n"
    header "<th>%n</th>\n"
    record_header "<tr>\n"
    data "<td>%t</td>\n"
    file_trailer "</table>\n</body>\n</html>\n"
    no-data-print no
}

Output from command

ffe -p html -f FirstName,LastName,Age personnel.fix

<html>
<head>
</head>
<body>
<table border="1">
<tr>
<th>FirstName</th>
<th>LastName</th>
<th>Age</th>

<tr>
<td>John</td>
<td>Ripper</td>
<td>23</td>

<tr>
<td>Scott</td>
<td>Tiger</td>
<td>45</td>

<tr>
<td>Mary</td>
<td>Moore</td>
<td>41</td>

<tr>
<td>Ridge</td>
<td>Forrester</td>
<td>31</td>

</table>
</body>
</html>

Using expression

Printing only Scott’s record using expression with previous example:

ffe -p html -f FirstName,LastName,Age -e FirstName^Scott personnel.fix

<html>
<head>
</head>
<body>
<table border="1">
<tr>
<th>FirstName</th>
<th>LastName</th>
<th>Age</th>

<tr>
<td>Scott</td>
<td>Tiger</td>
<td>45</td>

</table>
</body>
</html>

Using replace

Make all bosses and write a new personnel file printing the fields in fixed length format using directive %D:

Output definition:

output fixed 
{
    data "%D"
}

Write a new file:

$ffe -p fixed -r EmpType=B -o personnel.fix.new personnel.fix
$cat personnel.fix.new
H2006-02-25
BJohn     Ripper       23
BScott    Tiger        45
BMary     Moore        41
BRidge    Forrester    31
T0004
$

Using constant

The length of the fields FirstName and LastName in fixed length format will be made two bytes longer. This will be done by printing a constant after those two fields. We use dots instead of spaces in order to make change more visible.

Because we do not want to change header and trailer we need specially crafted configuration file. Employee and boss records will be printed using new output fixed2 and other records will be printed using output default.

New definition file ‘new_fixed.rc’:

const 2dots ".."

structure personel_fix {
    type fixed
    record header {
        id 1 H 
        field type 1 
        field date 10
    }
    record employee {
        id 1 E
        field EmpType 1 
        field FirstName 9
        field LastName  13
        field Age 2
        output fixed2
    }
    record boss {
        id 1 B
        fields-from employee
        output fixed2
    }
    record trailer {
        id 1 T
        field type 1 
        field count 4
    }
}

output default
{
    data "%D"
}

output fixed2
{
    data "%D"
    field-list Emptype,FirstName,2dots,LastName,2dots,Age
}

Print new flat file:

$ ffe -c new_fixed.rc personel_fix
H2006-02-25
EJohn     ..Ripper       ..23
BScott    ..Tiger        ..45
EMary     ..Moore        ..41
ERidge    ..Forrester    ..31
T0004
$

Using lookup table

Lookup table is used to explain the EmpTypes contents in output format nice:

Lookup definition:

lookup Type
{
    search exact
    pair H Header
    pair B "He is a Boss!"
    pair E "Not a Boss!"
    pair T Trailer
    default-value "Unknown record type!"
}

Mapping the EmpType field to lookup:

structure personel_fix {
    type fixed
    record header {
        id 1 H
        field type 1
        field date 10
    }
    record employee {
        id 1 E
        field EmpType 1 Type
        field FirstName 9
        field LastName  13
        field Age 2
    }
    record boss {
        id 1 B
        fields-from employee
    }
    record trailer {
        id 1 T
        field type 1
        field count 4
    }
}

Adding the lookup option to output definition nice.

output nice {
    record_header "%s - %r - %f - %o\n"
    data "%n=%t\n"
    lookup "%n=%t (%l)\n"
    justify =
    indent " "
}

Running ffe:

 $ffe -p nice personnel.fix
 personel_fix - header - personel_fix - 1
  type=H
  date=2006-02-25
 
 personel_fix - employee - personel_fix - 2
    EmpType=E (Not a Boss!)
  FirstName=John
   LastName=Ripper
        Age=23
 
 personel_fix - boss - personel_fix - 3
    EmpType=B (He is a Boss!)
  FirstName=Scott
   LastName=Tiger
        Age=45
 
 personel_fix - employee - personel_fix - 4
    EmpType=E (Not a Boss!)
  FirstName=Mary
   LastName=Moore
        Age=41
 
 personel_fix - employee - personel_fix - 5
    EmpType=E (Not a Boss!)
  FirstName=Ridge
   LastName=Forrester
        Age=31
 
 personel_fix - trailer - personel_fix - 6
   type=T
  count=0004

External lookup file

In previous example the lookup data could be read from external file like:

$cat lookupdata
H;Header
B;He is a Boss!
E;Not a Boss!
T;Trailer
$

Lookup definition using file above:

lookup Type
{
    search exact
    file lookupdata
    default-value "Unknown record type!"
}

Making universal csv reader using command substitution

Command substitution can be used to make a configuration for reading any csv file. The number of fields will be read from the first file using awk. Input file names and date are printed in the file header:

structure csv {
    type separated ,
    header first
    record csv {
        field-count `awk "-F," 'FNR == 1 {print NF;exit;}' $FFE_FIRST_FILE`
    }
}

output default {
    file_header "Files: `echo $FFE_FILES`\n`date`\n"
    data "%n=%d\n"
    justify =
}

Reading binary data

A binary block having a 3 byte text (ABC) in 5 bytes long space, one byte integer (35), a 32 bit integer (12345678), a double (345.385), a 3 byte bcd number (45112) and a 4 byte hexadecimal data (f15a9188) can be read using following configuration:

structure bin_data
{
    type binary
    record b
    {
        field text 5
        field byte_int int8
        field integer int
        field number double
        field bcd_number bcd_be_3
        field hex hex_be_4
    }
}

output default
{
    data "%n = %d (%h)\n"
}

The %h directive gives a hex dump of the input data.

Hexadecimal dump of the data:

$ od -t x1 example_bin
0000000 41 42 43 00 08 23 4e 61 bc 00 5c 8f c2 f5 28 96
0000020 75 40 45 11 2f f1 5a 91 88
0000031

Using ffe:

$ffe -c example_bin.fferc -s bin_data example_bin
text = ABC (x41x42x43x00x08)
byte_int = 35 (x23)
integer = 12345678 (x4ex61xbcx00)
number = 345.385000 (x5cx8fxc2xf5x28x96x75x40)
bcd_number = 45112 (x45x11x2f)
hex = f15a9188 (xf1x5ax91x88)

Note that the text has only 3 characters before NULL byte. Because this example was made in little endian machine, same result can be achieved with different configuration:

structure bin_data
{
    type binary
    record b
    {
        field text 5
        field byte_int int8
        field integer int32_le
        field number double_le
        field bcd_number bcd_be_3
        field hex hex_be_4
    }
}

This configuration is more portable in case the same data is to be read in a different architecture because endianess of integer and double are explicit given.

If the bcd number is read with bcd_le_3 it would look as

bcd_number = 5411 (x45x11x2f)

Note that nybbles are swapped and last byte is handled as f2 (f stops the printing) causing only first two bytes to be printed.

and if hexadecimal data is read with hex_le_4 it would look as

hex = 88915af1 (xf1x5ax91x88)

Bytes are printed starting from the end of the data.

Printing nested XML

The keyword level in record definition can be used to print data in multi-level nested form. In this example a parent row is in level one and a child row is in level two. Children after a parent row belongs to the parent before child rows, so they are enclosed in a parent element.

Example data:

P,John Smith,3 
C,Kathren,6,Blue 
C,Jimmy,4,Red
C,Peter,2,Green
P,Margaret Eelers,2
C,Aden,16,White
C,Amanda,20,Black

A parent row consistd of ID (P), parent name, and the count of the children. A child row consists of id (C), child name, age and favorite color.

This can be printed in nested XML using rc file:

structure family
{
    type separated ,
    record parent
    {
        id 1 P
        field FILLER
        field Name
        field Child_count
        level 1 parent
    }

    record child
    {
        id 1 C
        field FILLER
        field Name
        field Age
        field FavoriteColor
        level 2 child children
    }
}

output nested_xml
{
    file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    data "<%n>%t</%n>\n"
    indent " "
    record_trailer ""
    group_header "<%g>\n"
    group_trailer "</%g>\n"
    element_header "<%m>\n"
    element_trailer "</%m>\n"
}

Output:

<?xml version="1.0" encoding="UTF-8"?>
 <parent>
  <Name>John Smith</Name>
  <Child_count>3</Child_count>
  <children>
   <child>
    <Name>Kathren</Name>
    <Age>6</Age>
    <FavoriteColor>Blue</FavoriteColor>
   </child>
   <child>
    <Name>Jimmy</Name>
    <Age>4</Age>
    <FavoriteColor>Red</FavoriteColor>
   </child>
   <child>
    <Name>Peter</Name>
    <Age>2</Age>
    <FavoriteColor>Green</FavoriteColor>
   </child>
  </children>
 </parent>
 <parent>
  <Name>Margaret Eelers</Name>
  <Child_count>2</Child_count>
  <children>
   <child>
    <Name>Aden</Name>
    <Age>16</Age>
    <FavoriteColor>White</FavoriteColor>
   </child>
   <child>
    <Name>Amanda</Name>
    <Age>20</Age>
    <FavoriteColor>Black</FavoriteColor>
   </child>
  </children>
 </parent>

Some examples put in a single file

structure personel_fix {
    type fixed
    record header {
        id 1 H
        field type 1
        field date 10
    }
    record employee {
        id 1 E
        field EmpType 1 Type
        field FirstName 9
        field LastName  13
        field Age 2
    }
    record boss {
        id 1 B
        fields-from employee
    }
    record trailer {
        id 1 T
        field type 1
        field count 4
    }
}

structure personel_sep {
    type separated ,
    record header {
        id 1 H
        field type 
        field date 
    }
    record employee {
        id 1 E
        field type 
        field FirstName 
        field LastName  
        field Age 
    }
    record boss {
        id 1 B
        fields-from employee
    }
        record trailer {
        id 1 T
        field type 
        field count
    }
}

structure bin_data
{
    type binary
    record b
    {
        field text 5
        field byte_int int8
        field integer int32_le
        field number double_le
        field bcd_number bcd_be_3
        field hex hex_be_4
    }
}

output xml {
    file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
    data "<%n>%t</%n>\n"
    record_header "<%r>\n"
    record_trailer "</%r>\n"
    indent " "
}

output sql {
    file_header "delete table boss;\ndelete table employee;\n"
    record_header "insert into %r values("
    data "'%t'"
    separator ","
    record_trailer ");\n"
    file_trailer "commit\nquit\n"
    no-data-print no
    field-list FirstName,LastName,Age
}

output nice {
    record_header "%s - %r - %f - %o\n"
    data "%n=%t\n"
    lookup "%n=%t (%l)\n"
    justify =
    indent " "
}

output html {
    file_header "<html>\n<head>\n</head>\n<body>\n<table border=\"1\">\n<tr>\n"
    header "<th>%n</th>\n"
    record_header "<tr>\n"
    data "<td>%t</td>\n"
    file_trailer "</table>\n</body>\n</html>\n"
    no-data-print no
}

output fixed 
{
    data "%D"
}

lookup Type
{
    search exact
    pair H Header
    pair B "He is a Boss!"
    pair E "Not a Boss!"
    pair T Trailer
    default-value "Unknown record type!"
}

Using `ffe` to test file integrity

ffe can be used to check flat file integrity, because ffe checks for all lines the line length and id’s for fixed length structure and field count and id’s for separated structure.

Integrity can be checked using command

ffe -p no -l inputfiles…

Because option ‘-p’ has value no nothing is printed to output except the error messages. Option ‘-l’ causes all erroneous lines to be reported, not just the first one.

Example output:

ffe: Invalid input line in file 'inputfileB', line 14550
ffe: Invalid input line in file 'inputfileD', line 12

This document was generated on March 30, 2014 using texi2html 5.0.

4 How ffe works

Fixed length example

Separated example

Printing in XML format

Printing sql commands

Human readable output

HTML table

Using expression

Using replace

Using constant

Using lookup table

External lookup file

Making universal csv reader using command substitution

Reading binary data

Printing nested XML

Some examples put in a single file

Using ffe to test file integrity

4 How `ffe` works

Using `ffe` to test file integrity