File: gawk.info,  Node: Input Summary,  Next: Input Exercises,  Prev: Command-line directories,  Up: Reading Files

4.14 Summary
============

   * Input is split into records based on the value of 'RS'.  The
     possibilities are as follows:

     Value of 'RS'        Records are split on ...       'awk' / 'gawk'
     ---------------------------------------------------------------------------
     Any single           That character                 'awk'
     character
     The empty string     Runs of two or more            'awk'
     ('""')               newlines
     A regexp             Text that matches the          'gawk'
                          regexp

   * 'FNR' indicates how many records have been read from the current
     input file; 'NR' indicates how many records have been read in
     total.

   * 'gawk' sets 'RT' to the text matched by 'RS'.

   * After splitting the input into records, 'awk' further splits the
     records into individual fields, named '$1', '$2', and so on.  '$0'
     is the whole record, and 'NF' indicates how many fields there are.
     The default way to split fields is between whitespace characters.

   * Fields may be referenced using a variable, as in '$NF'.  Fields may
     also be assigned values, which causes the value of '$0' to be
     recomputed when it is later referenced.  Assigning to a field with
     a number greater than 'NF' creates the field and rebuilds the
     record, using 'OFS' to separate the fields.  Incrementing 'NF' does
     the same thing.  Decrementing 'NF' throws away fields and rebuilds
     the record.

   * Field splitting is more complicated than record splitting:

     Field separator value        Fields are split ...            'awk' /
                                                                  'gawk'
     ---------------------------------------------------------------------------
     'FS == " "'                  On runs of whitespace           'awk'
     'FS == ANY SINGLE            On that character               'awk'
     CHARACTER'
     'FS == REGEXP'               On text matching the regexp     'awk'
     'FS == ""'                   Such that each individual       'gawk'
                                  character is a separate
                                  field
     'FIELDWIDTHS == LIST OF      Based on character position     'gawk'
     COLUMNS'
     'FPAT == REGEXP'             On the text surrounding         'gawk'
                                  text matching the regexp

   * Using 'FS = "\n"' causes the entire record to be a single field
     (assuming that newlines separate records).

   * 'FS' may be set from the command line using the '-F' option.  This
     can also be done using command-line variable assignment.

   * Use 'PROCINFO["FS"]' to see how fields are being split.

   * Use 'getline' in its various forms to read additional records from
     the default input stream, from a file, or from a pipe or coprocess.

   * Use 'PROCINFO[FILE, "READ_TIMEOUT"]' to cause reads to time out for
     FILE.

   * Directories on the command line are fatal for standard 'awk';
     'gawk' ignores them if not in POSIX mode.
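
A few of the points above can be tried directly from the shell.  The
following is a minimal sketch, not a definitive reference: the sample
input and the shell variable names ('input', 'counts', 'paragraphs',
'extended', 'with_option', 'with_assignment') are made up for
illustration, and only features marked 'awk' in the tables above are
used, so any POSIX-conforming 'awk' should behave the same way.

```shell
# Illustrative sample data; all shell variable names here are hypothetical.
input='alpha beta gamma
delta epsilon'

# NR counts records, NF counts fields, and $NF is the last field.
counts=$(printf '%s\n' "$input" | awk '{ print NR, NF, $NF }')
printf '%s\n' "$counts"

# RS = "" selects paragraph mode: blank lines separate records.
paragraphs=$(printf 'a\nb\n\nc\n' | awk 'BEGIN { RS = "" } { print NR, NF }')
printf '%s\n' "$paragraphs"

# Assigning beyond NF creates the field and rebuilds $0 using OFS.
extended=$(printf 'a b\n' |
    awk 'BEGIN { OFS = ":" } { $4 = "d"; print; print NF }')
printf '%s\n' "$extended"

# The -F option and a command-line variable assignment both set FS.
with_option=$(printf 'x:y:z\n' | awk -F: '{ print $2 }')
with_assignment=$(printf 'x:y:z\n' | awk '{ print $2 }' FS=:)
printf '%s %s\n' "$with_option" "$with_assignment"
```

The third command prints 'a:b::d' followed by '4': the intervening
field '$3' is created empty, and the record is rebuilt with 'OFS'
between the fields.  The last line prints 'y y', since both ways of
setting 'FS' split the record on the colon.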