4.4 Changing the Contents of a Field
The contents of a field, as seen by awk, can be changed within an awk
program; this changes what awk perceives as the current input record.
(The actual input is untouched; awk never modifies the input file.)
Consider the following example and its output:
    $ awk '{ nboxes = $3 ; $3 = $3 - 10
    >        print nboxes, $3 }' inventory-shipped
    -| 25 15
    -| 32 22
    -| 24 14
    …
The program first saves the original value of field three in the
variable nboxes. The ‘-’ sign represents subtraction, so this program
reassigns field three, $3, as the original value of field three minus
ten: ‘$3 - 10’. (See section Arithmetic Operators.) Then it prints the
original and new values for field three. (Someone in the warehouse made
a consistent mistake while inventorying the red boxes.)
For this to work, the text in field $3 must make sense as a number; the
string of characters must be converted to a number for the computer to
do arithmetic on it. The number resulting from the subtraction is
converted back to a string of characters that then becomes field three.
See section Conversion of Strings and Numbers.
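As a rough illustration of what happens when the text does not make sense as a number, the sketch below feeds awk two made-up records (they are not taken from ‘inventory-shipped’). It relies on the usual awk conversion rule that a string with no leading numeric text is treated as zero in arithmetic.

```shell
# Hypothetical sample input: the third field of the first record is not numeric.
out=$(printf 'Jan 13 abc\nFeb 15 32\n' | awk '{ $3 = $3 - 10; print }')
echo "$out"
# "abc" converts to 0, so the first record ends in -10;
# "32" converts to 32, so the second record ends in 22.
```

No error is reported for the non-numeric field, so a check on the input may be worth adding when the data is not trusted.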
When the value of a field is changed (as perceived by awk), the text of
the input record is recalculated to contain the new field where the old
one was. In other words, $0 changes to reflect the altered field. Thus,
this program prints a copy of the input file, with 10 subtracted from
the second field of each line:
    $ awk '{ $2 = $2 - 10; print $0 }' inventory-shipped
    -| Jan 3 25 15 115
    -| Feb 5 32 24 226
    -| Mar 5 24 34 228
    …
It is also possible to assign contents to fields that are out of range. For example:
    $ awk '{ $6 = ($5 + $4 + $3 + $2)
    >        print $6 }' inventory-shipped
    -| 168
    -| 297
    -| 301
    …
We’ve just created $6, whose value is the sum of fields $2, $3, $4, and
$5. The ‘+’ sign represents addition. For the file ‘inventory-shipped’,
$6 represents the total number of parcels shipped for a particular month.
Creating a new field changes awk’s internal copy of the current input
record, which is the value of $0. Thus, if you do ‘print $0’ after
adding a field, the record printed includes the new field, with the
appropriate number of field separators between it and the previously
existing fields.
This recomputation affects and is affected by NF (the number of fields;
see section Examining Fields). For example, the value of NF is set to
the number of the highest field you create. The exact format of $0 is
also affected by a feature that has not been discussed yet: the output
field separator, OFS, used to separate the fields (see section Output
Separators).
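One visible consequence of this recomputation is that the original spacing of the record is lost once any field is assigned. The following sketch uses a made-up record with irregular whitespace and the default values of FS and OFS; the square brackets are only there to make leading and trailing blanks visible.

```shell
# Hypothetical input with extra blanks around and between the fields.
out=$(printf '  a   b   c  \n' |
      awk '{ print "[" $0 "]"      # record as read
             $2 = "X"              # assigning a field rebuilds $0
             print "[" $0 "]" }')  # fields now joined by OFS (one space)
echo "$out"
```

The first line printed keeps the input spacing; the second shows the fields joined by single spaces, with the surrounding blanks gone.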
Note, however, that merely referencing an out-of-range field does not
change the value of either $0 or NF. Referencing an out-of-range field
only produces an empty string. For example:
    if ($(NF+1) != "")
        print "can't happen"
    else
        print "everything is normal"
should print ‘everything is normal’, because NF+1 is certain to be out
of range. (See section The if-else Statement, for more information
about awk’s if-else statements. See section Variable Typing and
Comparison Expressions, for more information about the ‘!=’ operator.)
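A small check of the same point, written as a complete command: the sketch below reads a field two positions past the end of a made-up three-field record, then prints NF and $0 to show that neither has changed. The variable name x is arbitrary.

```shell
# Reading $(NF + 2) yields an empty string and creates nothing.
out=$(echo a b c | awk '{ x = $(NF + 2); print NF; print $0 }')
echo "$out"
```

Contrast this with an assignment to the same field, which would extend the record and raise NF.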
It is important to note that making an assignment to an existing field
changes the value of $0 but does not change the value of NF, even when
you assign the empty string to a field. For example:
    $ echo a b c d | awk '{ OFS = ":"; $2 = ""
    >                       print $0; print NF }'
    -| a::c:d
    -| 4
The field is still there; it just has an empty value, denoted by the two colons between ‘a’ and ‘c’. This example shows what happens if you create a new field:
    $ echo a b c d | awk '{ OFS = ":"; $2 = ""; $6 = "new"
    >                       print $0; print NF }'
    -| a::c:d::new
    -| 6
The intervening field, $5, is created with an empty value (indicated by
the second pair of adjacent colons), and NF is updated with the value six.
Decrementing NF throws away the values of the fields after the new
value of NF and recomputes $0. (d.c.) Here is an example:
    $ echo a b c d e f | awk '{ print "NF =", NF;
    >                           NF = 3; print $0 }'
    -| NF = 6
    -| a b c
CAUTION: Some versions of awk don’t rebuild $0 when NF is decremented.
Caveat emptor.
Finally, there are times when it is convenient to force awk to rebuild
the entire record, using the current value of the fields and OFS. To do
this, use the seemingly innocuous assignment:
    $1 = $1   # force record to be reconstituted
    print $0  # or whatever else with $0
This forces awk to rebuild the record. It does help to add a comment,
as we’ve shown here.
There is a flip side to the relationship between $0 and the fields. Any
assignment to $0 causes the record to be reparsed into fields using the
current value of FS. This also applies to any built-in function that
updates $0, such as sub() and gsub() (see section String-Manipulation
Functions).
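The reparsing can be observed by printing NF before and after $0 is updated. This sketch uses a made-up colon-separated record with the default FS, so the record starts out as a single field; gsub() is called without a third argument, which makes it operate on $0.

```shell
out=$(echo 'a:b:c' |
      awk '{ print NF              # one field: no blanks in the record
             gsub(/:/, " ")        # updates $0, which is then resplit
             print NF, $2
             $0 = "w x y z"        # direct assignment also resplits
             print NF, $4 }')
echo "$out"
```

Each update to $0 replaces the old fields entirely; values assigned to individual fields beforehand do not survive the reparse.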
Advanced Notes: Understanding $0
It is important to remember that $0 is the full record, exactly as it
was read from the input. This includes any leading or trailing
whitespace, and the exact whitespace (or other characters) that
separate the fields.
It is a not-uncommon error to try to change the field separators in a
record simply by setting FS and OFS, and then expecting a plain ‘print’
or ‘print $0’ to print the modified record.
But this does not work, since nothing was done to change the record itself. Instead, you must force the record to be rebuilt, typically with a statement such as ‘$1 = $1’, as described earlier.
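The difference is easy to see side by side. In this sketch, OFS is set in a BEGIN rule and the same made-up record is printed twice, once untouched and once after the ‘$1 = $1’ assignment.

```shell
out=$(echo 'a b c' |
      awk 'BEGIN { OFS = "-" }
           { print          # $0 not rebuilt: original separators remain
             $1 = $1        # force record to be reconstituted
             print }')      # $0 rebuilt with the current OFS
echo "$out"
```

Only the second line printed uses the new separator.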