
gawk: Ranges

 7.1.3 Specifying Record Ranges with Patterns
 A "range pattern" is made of two patterns separated by a comma, in the
 form 'BEGPAT, ENDPAT'.  It is used to match ranges of consecutive input
 records.  The first pattern, BEGPAT, controls where the range begins,
 while ENDPAT controls where the range ends.  For example, the following:
      awk '$1 == "on", $1 == "off"' myfile
 prints every record in 'myfile' between 'on'/'off' pairs, inclusive.
    A range pattern starts out by matching BEGPAT against every input
 record.  When a record matches BEGPAT, the range pattern is "turned on",
 and the range pattern matches this record as well.  As long as the range
 pattern stays turned on, it automatically matches every input record
 read.  The range pattern also matches ENDPAT against every input record;
 when this succeeds, the range pattern is "turned off" again for the
 following record.  Then the range pattern goes back to checking BEGPAT
 against each record.
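    The on/off cycle described above can be observed on a small input.
 The following is only a sketch: the sample records and the shell
 variable name 'range_out' are made up for illustration.  Each matched
 record is printed with its record number, so it is easy to see where
 the range opens, closes, and opens again:

```shell
# Hypothetical input: two on/off pairs with other records around them.
range_out=$(printf '%s\n' a on b off c on d off e |
    awk '$1 == "on", $1 == "off" { print NR ": " $0 }')

# Show which records the range pattern matched.
printf '%s\n' "$range_out"
```

 Records 1, 5, and 9 are not printed.  They fall outside both ranges,
 at a time when the range pattern is only checking BEGPAT.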
    The record that turns on the range pattern and the one that turns it
 off both match the range pattern.  If you don't want to operate on these
 records, you can write 'if' statements in the rule's action to
 distinguish them from the records you are interested in.
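    One way to write such an 'if' statement is sketched here.  It
 assumes that the boundary records can be recognized by the same tests
 that make up BEGPAT and ENDPAT; the sample input and the variable name
 'inner_out' are hypothetical:

```shell
inner_out=$(printf '%s\n' a on b c off d | awk '
$1 == "on", $1 == "off" {
    # Skip the two boundary records; print only what lies between them.
    if ($1 != "on" && $1 != "off")
        print
}')

printf '%s\n' "$inner_out"
```

 With this input, only the records 'b' and 'c' are printed.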
    It is possible for a range pattern to be turned on and off by the
 same record.  If the record satisfies both conditions, then the action is
 executed for just that record.  For example, suppose there is text
 between two identical markers (e.g., the '%' symbol), each on its own
 line, that should be ignored.  A first attempt would be to combine a
 range pattern that describes the delimited text with the 'next'
 statement (not discussed yet; see Next Statement).  This causes
 'awk' to skip any further processing of the current record and start
 over again with the next input record.  Such a program looks like this:
      /^%$/,/^%$/    { next }
                     { print }
 This program fails because the range pattern is both turned on and
 turned off by the first line, which just has a '%' on it.  To accomplish
 this task, write the program in the following manner, using a flag:
      /^%$/     { skip = ! skip; next }
      skip == 1 { next } # skip lines with `skip' set
                { print } # print all other lines
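    The difference between the two programs can be checked on a small
 made-up input.  This sketch assumes that the flag version ends with the
 same bare '{ print }' rule as the first attempt; the sample lines and
 the shell variable names are illustrative only:

```shell
# Hypothetical input: two lines of text hidden between '%' markers.
sample=$(printf '%s\n' keep1 % hidden1 hidden2 % keep2)

# First attempt: each '%' line opens and closes the range by itself,
# so only the marker lines are skipped and the hidden text leaks out.
range_try=$(printf '%s\n' "$sample" | awk '
/^%$/,/^%$/    { next }
               { print }')

# Flag version: each '%' line toggles skip, and lines are dropped
# for as long as skip is set.
flag_try=$(printf '%s\n' "$sample" | awk '
/^%$/     { skip = ! skip; next }
skip == 1 { next }
          { print }')

printf 'range pattern:\n%s\n\nflag:\n%s\n' "$range_try" "$flag_try"
```

 The range-pattern version still prints 'hidden1' and 'hidden2', while
 the flag version prints only 'keep1' and 'keep2'.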
    In a range pattern, the comma (',') has the lowest precedence of all
 the operators (i.e., it is evaluated last).  Thus, the following program
 attempts to combine a range pattern with another, simpler test:
      echo Yes | awk '/1/,/2/ || /Yes/'
    The intent of this program is '(/1/,/2/) || /Yes/'.  However, 'awk'
 interprets this as '/1/, (/2/ || /Yes/)'.  This cannot be changed or
 worked around; range patterns do not combine with other patterns:
      $ echo Yes | gawk '(/1/,/2/) || /Yes/'
      error-> gawk: cmd. line:1: (/1/,/2/) || /Yes/
      error-> gawk: cmd. line:1:           ^ syntax error
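    The way 'awk' actually groups the first program can be seen by
 feeding it input that also contains a line matching '/1/'.  This is a
 sketch only; the input lines and the variable names are made up:

```shell
# With "Yes" as the only record, BEGPAT (/1/) never matches,
# so the range never turns on and nothing is printed.
only_yes=$(echo Yes | awk '/1/,/2/ || /Yes/')

# Here "1" turns the range on, and "Yes" turns it off, because
# ENDPAT is the whole expression (/2/ || /Yes/).
mixed=$(printf '%s\n' 1 foo Yes bar 2 | awk '/1/,/2/ || /Yes/')

printf 'only_yes=[%s]\n' "$only_yes"
printf '%s\n' "$mixed"
```

 The second command prints '1', 'foo', and 'Yes'.  The later 'bar' and
 '2' records are not printed, because the range has already been turned
 off and neither record matches BEGPAT.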
    As a minor point of interest, although it is poor style, POSIX allows
 you to put a newline after the comma in a range pattern.  (d.c.)
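    For illustration only, such a program might look like the
 following sketch.  The input and the variable name 'nl_out' are
 hypothetical, and the behavior is assumed to hold for any
 POSIX-compliant 'awk':

```shell
# The range pattern is split across two lines, right after the comma.
nl_out=$(printf '%s\n' a on b off c | awk '$1 == "on",
$1 == "off"')

printf '%s\n' "$nl_out"
```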