[ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
4 How ffe
works
Following examples use two different input files:
Fixed length example
Fixed length personnel file with header and trailer, line (record) is identified by the first byte (H = Header, E = Employee, B = Boss, T = trailer).
$cat personnel.fix H2006-02-25 EJohn Ripper 23 BScott Tiger 45 EMary Moore 41 ERidge Forrester 31 T0004 $
Structure for reading file above. Note that record ‘boss’ reuses fields from ‘employee’.
structure personel_fix { type fixed record header { id 1 H field type 1 field date 10 } record employee { id 1 E field EmpType 1 field FirstName 9 field LastName 13 field Age 2 } record boss { id 1 B fields-from employee } record trailer { id 1 T field type 1 field count 4 } }
Separated example
Same file as above, but now separated by comma.
$cat personnel.sep H,2006-02-25 E,john,Ripper,23 B,Scott,Tiger,45 E,Mary,Moore,41 E,Ridge,Forrester,31 T,0004 $
Structure for reading file above. Note that the field lengths are not needed in separated format. Length is need if the separated data is to be printed in fixed length format.
structure personel_sep { type separated , record header { id 1 H field type field date } record employee { id 1 E field type field FirstName field LastName field Age } record boss { id 1 B fields-from employee } record trailer { id 1 T field type field count } }
Printing in XML format
Data in examples above can be printed in XML using output definition like:
output xml { file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" data "<%n>%t</%n>\n" record_header "<%r>\n" record_trailer "</%r>\n" indent " " }
Example output using command (assuming definitions above are saved in ~/.fferc)
ffe -p xml personnel.sep
<?xml version="1.0" encoding="UTF-8"?> <header> <type>H</type> <date>2006-02-25</date> </header> <employee> <type>E</type> <FirstName>john</FirstName> <LastName>Ripper</LastName> <Age>23</Age> </employee> <boss> <type>B</type> <FirstName>Scott</FirstName> <LastName>Tiger</LastName> <Age>45</Age> </boss> <employee> <type>E</type> <FirstName>Mary</FirstName> <LastName>Moore</LastName> <Age>41</Age> </employee> <employee> <type>E</type> <FirstName>Ridge</FirstName> <LastName>Forrester</LastName> <Age>31</Age> </employee> <trailer> <type>T</type> <count>0004</count> </trailer>
Printing sql commands
Data in examples above can be loaded to database by generated sql commands. Note that the header and trailer are not loaded, because only fields ‘FirstName’,‘LastName’ and ‘Age’ are printed and ‘no-data-print’ is set as no. This prevents the ‘record_header’ and ‘record_trailer’ to be printed for file header and trailer.
output sql { file_header "delete table boss;\ndelete table employee;\n" record_header "insert into %r values(" data "'%t'" separator "," record_trailer ");\n" file_trailer "commit\nquit\n" no-data-print no field-list FirstName,LastName,Age }
Output from command
ffe -p sql personnel.sep
delete table boss; delete table employee; insert into employee values('john','Ripper','23'); insert into boss values('Scott','Tiger','45'); insert into employee values('Mary','Moore','41'); insert into employee values('Ridge','Forrester','31'); commit quit
Human readable output
This output format shows the fields in format suitable for displaying in screen or printing.
output nice { record_header "%s - %r - %f - %o\n" data "%n=%t\n" justify = indent " " }
Output from command
ffe -p nice personnel.fix
personel - header - personnel.fix - 1 type=H date=2006-02-25 personel - employee - personnel.fix - 2 EmpType=E FirstName=John LastName=Ripper Age=23 personel - boss - personnel.fix - 3 EmpType=B FirstName=Scott LastName=Tiger Age=45 personel - employee - personnel.fix - 4 EmpType=E FirstName=Mary LastName=Moore Age=41 personel - employee - personnel.fix - 5 EmpType=E FirstName=Ridge LastName=Forrester Age=31 personel - trailer - personnel.fix - 6 type=T count=0004
HTML table
Personnel data can be displayed as HTML table using output like:
output html { file_header "<html>\n<head>\n</head>\n<body>\n<table border=\"1\">\n<tr>\n" header "<th>%n</th>\n" record_header "<tr>\n" data "<td>%t</td>\n" file_trailer "</table>\n</body>\n</html>\n" no-data-print no }
Output from command
ffe -p html -f FirstName,LastName,Age personnel.fix
<html> <head> </head> <body> <table border="1"> <tr> <th>FirstName</th> <th>LastName</th> <th>Age</th> <tr> <td>John</td> <td>Ripper</td> <td>23</td> <tr> <td>Scott</td> <td>Tiger</td> <td>45</td> <tr> <td>Mary</td> <td>Moore</td> <td>41</td> <tr> <td>Ridge</td> <td>Forrester</td> <td>31</td> </table> </body> </html>
Using expression
Printing only Scott’s record using expression with previous example:
ffe -p html -f FirstName,LastName,Age -e FirstName^Scott personnel.fix
<html> <head> </head> <body> <table border="1"> <tr> <th>FirstName</th> <th>LastName</th> <th>Age</th> <tr> <td>Scott</td> <td>Tiger</td> <td>45</td> </table> </body> </html>
Using replace
Make all bosses and write a new personnel file printing the fields in fixed length format
using directive %D
:
Output definition:
output fixed { data "%D" }
Write a new file:
$ffe -p fixed -r EmpType=B -o personnel.fix.new personnel.fix $cat personnel.fix.new H2006-02-25 BJohn Ripper 23 BScott Tiger 45 BMary Moore 41 BRidge Forrester 31 T0004 $
Using constant
The length of the fields FirstName and LastName in fixed length format will be made two bytes longer. This will be done by printing a constant after those two fields. We use dots instead of spaces in order to make change more visible.
Because we do not want to change header and trailer we need specially crafted configuration file. Employee and boss records will be printed using new output fixed2 and other records will be printed using output default.
New definition file ‘new_fixed.rc’:
const 2dots ".." structure personel_fix { type fixed record header { id 1 H field type 1 field date 10 } record employee { id 1 E field EmpType 1 field FirstName 9 field LastName 13 field Age 2 output fixed2 } record boss { id 1 B fields-from employee output fixed2 } record trailer { id 1 T field type 1 field count 4 } } output default { data "%D" } output fixed2 { data "%D" field-list Emptype,FirstName,2dots,LastName,2dots,Age }
Print new flat file:
$ ffe -c new_fixed.rc personel_fix H2006-02-25 EJohn ..Ripper ..23 BScott ..Tiger ..45 EMary ..Moore ..41 ERidge ..Forrester ..31 T0004 $
Using lookup table
Lookup table is used to explain the EmpTypes contents in output format nice
:
Lookup definition:
lookup Type { search exact pair H Header pair B "He is a Boss!" pair E "Not a Boss!" pair T Trailer default-value "Unknown record type!" }
Mapping the EmpType field to lookup:
structure personel_fix { type fixed record header { id 1 H field type 1 field date 10 } record employee { id 1 E field EmpType 1 Type field FirstName 9 field LastName 13 field Age 2 } record boss { id 1 B fields-from employee } record trailer { id 1 T field type 1 field count 4 } }
Adding the lookup option to output definition nice
.
output nice { record_header "%s - %r - %f - %o\n" data "%n=%t\n" lookup "%n=%t (%l)\n" justify = indent " " }
Running ffe:
$ffe -p nice personnel.fix personel_fix - header - personel_fix - 1 type=H date=2006-02-25 personel_fix - employee - personel_fix - 2 EmpType=E (Not a Boss!) FirstName=John LastName=Ripper Age=23 personel_fix - boss - personel_fix - 3 EmpType=B (He is a Boss!) FirstName=Scott LastName=Tiger Age=45 personel_fix - employee - personel_fix - 4 EmpType=E (Not a Boss!) FirstName=Mary LastName=Moore Age=41 personel_fix - employee - personel_fix - 5 EmpType=E (Not a Boss!) FirstName=Ridge LastName=Forrester Age=31 personel_fix - trailer - personel_fix - 6 type=T count=0004
External lookup file
In previous example the lookup data could be read from external file like:
$cat lookupdata H;Header B;He is a Boss! E;Not a Boss! T;Trailer $
Lookup definition using file above:
lookup Type { search exact file lookupdata default-value "Unknown record type!" }
Making universal csv reader using command substitution
Command substitution can be used to make a configuration for reading any csv file. The number of fields will be read from the first file using awk. Input file names and date are printed in the file header:
structure csv { type separated , header first record csv { field-count `awk "-F," 'FNR == 1 {print NF;exit;}' $FFE_FIRST_FILE` } } output default { file_header "Files: `echo $FFE_FILES`\n`date`\n" data "%n=%d\n" justify = }
Reading binary data
A binary block having a 3 byte text (ABC) in 5 bytes long space, one byte integer (35), a 32 bit integer (12345678), a double (345.385), a 3 byte bcd number (45112) and a 4 byte hexadecimal data (f15a9188) can be read using following configuration:
structure bin_data { type binary record b { field text 5 field byte_int int8 field integer int field number double field bcd_number bcd_be_3 field hex hex_be_4 } } output default { data "%n = %d (%h)\n" }
The %h
directive gives a hex dump of the input data.
Hexadecimal dump of the data:
$ od -t x1 example_bin 0000000 41 42 43 00 08 23 4e 61 bc 00 5c 8f c2 f5 28 96 0000020 75 40 45 11 2f f1 5a 91 88 0000031
Using ffe:
$ffe -c example_bin.fferc -s bin_data example_bin text = ABC (x41x42x43x00x08) byte_int = 35 (x23) integer = 12345678 (x4ex61xbcx00) number = 345.385000 (x5cx8fxc2xf5x28x96x75x40) bcd_number = 45112 (x45x11x2f) hex = f15a9188 (xf1x5ax91x88)
Note that the text has only 3 characters before NULL byte. Because this example was made in little endian machine, same result can be achieved with different configuration:
structure bin_data { type binary record b { field text 5 field byte_int int8 field integer int32_le field number double_le field bcd_number bcd_be_3 field hex hex_be_4 } }
This configuration is more portable in case the same data is to be read in a different architecture because endianess of integer and double are explicit given.
If the bcd number is read with bcd_le_3
it would look as
bcd_number = 5411 (x45x11x2f)
Note that nybbles are swapped and last byte is handled as f2
(f
stops the printing) causing only first two bytes to be printed.
and if hexadecimal data is read with hex_le_4
it would look as
hex = 88915af1 (xf1x5ax91x88)
Bytes are printed starting from the end of the data.
Printing nested XML
The keyword level
in record definition can be used to print data in multi-level nested form. In this
example a parent row is in level one and a child row is in level two. Children after a parent row belongs
to the parent before child rows, so they are enclosed in a parent element.
Example data:
P,John Smith,3 C,Kathren,6,Blue C,Jimmy,4,Red C,Peter,2,Green P,Margaret Eelers,2 C,Aden,16,White C,Amanda,20,Black
A parent row consistd of ID (P), parent name, and the count of the children. A child row consists of id (C), child name, age and favorite color.
This can be printed in nested XML using rc file:
structure family { type separated , record parent { id 1 P field FILLER field Name field Child_count level 1 parent } record child { id 1 C field FILLER field Name field Age field FavoriteColor level 2 child children } } output nested_xml { file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" data "<%n>%t</%n>\n" indent " " record_trailer "" group_header "<%g>\n" group_trailer "</%g>\n" element_header "<%m>\n" element_trailer "</%m>\n" }
Output:
<?xml version="1.0" encoding="UTF-8"?> <parent> <Name>John Smith</Name> <Child_count>3</Child_count> <children> <child> <Name>Kathren</Name> <Age>6</Age> <FavoriteColor>Blue</FavoriteColor> </child> <child> <Name>Jimmy</Name> <Age>4</Age> <FavoriteColor>Red</FavoriteColor> </child> <child> <Name>Peter</Name> <Age>2</Age> <FavoriteColor>Green</FavoriteColor> </child> </children> </parent> <parent> <Name>Margaret Eelers</Name> <Child_count>2</Child_count> <children> <child> <Name>Aden</Name> <Age>16</Age> <FavoriteColor>White</FavoriteColor> </child> <child> <Name>Amanda</Name> <Age>20</Age> <FavoriteColor>Black</FavoriteColor> </child> </children> </parent>
Some examples put in a single file
structure personel_fix { type fixed record header { id 1 H field type 1 field date 10 } record employee { id 1 E field EmpType 1 Type field FirstName 9 field LastName 13 field Age 2 } record boss { id 1 B fields-from employee } record trailer { id 1 T field type 1 field count 4 } } structure personel_sep { type separated , record header { id 1 H field type field date } record employee { id 1 E field type field FirstName field LastName field Age } record boss { id 1 B fields-from employee } record trailer { id 1 T field type field count } } structure bin_data { type binary record b { field text 5 field byte_int int8 field integer int32_le field number double_le field bcd_number bcd_be_3 field hex hex_be_4 } } output xml { file_header "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n" data "<%n>%t</%n>\n" record_header "<%r>\n" record_trailer "</%r>\n" indent " " } output sql { file_header "delete table boss;\ndelete table employee;\n" record_header "insert into %r values(" data "'%t'" separator "," record_trailer ");\n" file_trailer "commit\nquit\n" no-data-print no field-list FirstName,LastName,Age } output nice { record_header "%s - %r - %f - %o\n" data "%n=%t\n" lookup "%n=%t (%l)\n" justify = indent " " } output html { file_header "<html>\n<head>\n</head>\n<body>\n<table border=\"1\">\n<tr>\n" header "<th>%n</th>\n" record_header "<tr>\n" data "<td>%t</td>\n" file_trailer "</table>\n</body>\n</html>\n" no-data-print no } output fixed { data "%D" } lookup Type { search exact pair H Header pair B "He is a Boss!" pair E "Not a Boss!" pair T Trailer default-value "Unknown record type!" }
Using ffe
to test file integrity
ffe
can be used to check flat file integrity, because ffe
checks for all lines the line length and id’s for fixed length structure
and field count and id’s for separated structure.
Integrity can be checked using command
ffe -p no -l inputfiles…
Because option ‘-p’ has value no
nothing is printed to output except the error messages.
Option ‘-l’ causes all erroneous lines to be reported, not just the first one.
Example output:
ffe: Invalid input line in file 'inputfileB', line 14550 ffe: Invalid input line in file 'inputfileD', line 12
[ << ] | [ < ] | [ Up ] | [ > ] | [ >> ] | [Top] | [Contents] | [Index] | [ ? ] |
This document was generated on March 30, 2014 using texi2html 5.0.