File: gawk.info,  Node: Feature History,  Next: Common Extensions,  Prev: POSIX/GNU,  Up: Language History

A.6 History of 'gawk' Features
==============================

This minor node describes the features in 'gawk' over and above those in
POSIX 'awk', in the order they were added to 'gawk'.

   Version 2.10 of 'gawk' introduced the following features:

   * The 'AWKPATH' environment variable for specifying a path search for
     the '-f' command-line option (*note Options::).

   * The 'IGNORECASE' variable and its effects (*note
     Case-sensitivity::).

   * The '/dev/stdin', '/dev/stdout', '/dev/stderr' and '/dev/fd/N'
     special file names (*note Special Files::).

   Version 2.13 of 'gawk' introduced the following features:

   * The 'FIELDWIDTHS' variable and its effects (*note Constant Size::).

   * The 'systime()' and 'strftime()' built-in functions for obtaining
     and printing timestamps (*note Time Functions::).

   * Additional command-line options (*note Options::):

        - The '-W lint' option to provide error and portability checking,
          both for the source code and at runtime.

        - The '-W compat' option to turn off the GNU extensions.

        - The '-W posix' option for full POSIX compliance.

   Version 2.14 of 'gawk' introduced the following feature:

   * The 'next file' statement for skipping to the next data file (*note
     Nextfile Statement::).

   Version 2.15 of 'gawk' introduced the following features:

   * New variables (*note Built-in Variables::):

        - 'ARGIND', which tracks the movement of 'FILENAME' through
          'ARGV'.

        - 'ERRNO', which contains the system error message when
          'getline' returns -1 or 'close()' fails.

   * The '/dev/pid', '/dev/ppid', '/dev/pgrpid', and '/dev/user' special
     file names.  These have since been removed.

   * The ability to delete all of an array at once with 'delete ARRAY'
     (*note Delete::).

   * Command-line option changes (*note Options::):

        - The ability to use GNU-style long-named options that start
          with '--'.

        - The '--source' option for mixing command-line and library-file
          source code.

   Version 3.0 of 'gawk' introduced the following features:

   * New or changed variables:

        - 'IGNORECASE' changed, now applying to string comparison as
          well as regexp operations (*note Case-sensitivity::).

        - 'RT', which contains the input text that matched 'RS' (*note
          Records::).

   * Full support for both POSIX and GNU regexps (*note Regexp::).

   * The 'gensub()' function for more powerful text manipulation (*note
     String Functions::).

   * The 'strftime()' function acquired a default time format, allowing
     it to be called with no arguments (*note Time Functions::).

   * The ability for 'FS' and for the third argument to 'split()' to be
     null strings (*note Single Character Fields::).

   * The ability for 'RS' to be a regexp (*note Records::).

   * The 'next file' statement became 'nextfile' (*note Nextfile
     Statement::).

   * The 'fflush()' function from BWK 'awk' (then at Bell Laboratories;
     *note I/O Functions::).

   * New command-line options:

        - The '--lint-old' option to warn about constructs that are not
          available in the original Version 7 Unix version of 'awk'
          (*note V7/SVR3.1::).

        - The '-m' option from BWK 'awk'.  (Brian was still at Bell
          Laboratories at the time.)  This was later removed from both
          his 'awk' and from 'gawk'.

        - The '--re-interval' option to provide interval expressions in
          regexps (*note Regexp Operators::).

        - The '--traditional' option was added as a better name for
          '--compat' (*note Options::).

   * The use of GNU Autoconf to control the configuration process (*note
     Quick Installation::).

   * Amiga support.  This has since been removed.

   Version 3.1 of 'gawk' introduced the following features:

   * New variables (*note Built-in Variables::):

        - 'BINMODE', for non-POSIX systems, which allows binary I/O for
          input and/or output files (*note PC Using::).

        - 'LINT', which dynamically controls lint warnings.

        - 'PROCINFO', an array for providing process-related
          information.

        - 'TEXTDOMAIN', for setting an application's
          internationalization text domain (*note
          Internationalization::).

   * The ability to use octal and hexadecimal constants in 'awk' program
     source code (*note Nondecimal-numbers::).

   * The '|&' operator for two-way I/O to a coprocess (*note Two-way
     I/O::).

   * The '/inet' special files for TCP/IP networking using '|&' (*note
     TCP/IP Networking::).

   * The optional second argument to 'close()' that allows closing one
     end of a two-way pipe to a coprocess (*note Two-way I/O::).

   * The optional third argument to the 'match()' function for capturing
     text-matching subexpressions within a regexp (*note String
     Functions::).

   * Positional specifiers in 'printf' formats for making translations
     easier (*note Printf Ordering::).

   * A number of new built-in functions:

        - The 'asort()' and 'asorti()' functions for sorting arrays
          (*note Array Sorting::).

        - The 'bindtextdomain()', 'dcgettext()' and 'dcngettext()'
          functions for internationalization (*note Programmer i18n::).

        - The 'extension()' function and the ability to add new built-in
          functions dynamically.  This has since been removed.  It was
          replaced by the new extension mechanism.  *Note Dynamic
          Extensions::.

        - The 'mktime()' function for creating timestamps (*note Time
          Functions::).

        - The 'and()', 'or()', 'xor()', 'compl()', 'lshift()',
          'rshift()', and 'strtonum()' functions (*note Bitwise
          Functions::).

   * The support for 'next file' as two words was removed completely
     (*note Nextfile Statement::).

   * Additional command-line options (*note Options::):

        - The '--dump-variables' option to print a list of all global
          variables.

        - The '--exec' option, for use in CGI scripts.

        - The '--gen-po' command-line option and the use of a leading
          underscore to mark strings that should be translated (*note
          String Extraction::).

        - The '--non-decimal-data' option to allow non-decimal input
          data (*note Nondecimal Data::).

        - The '--profile' option and 'pgawk', the profiling version of
          'gawk', for producing execution profiles of 'awk' programs
          (*note Profiling::).

        - The '--use-lc-numeric' option to force 'gawk' to use the
          locale's decimal point for parsing input data (*note
          Conversion::).

   * The use of GNU Automake to help in standardizing the configuration
     process (*note Quick Installation::).

   * The use of GNU 'gettext' for 'gawk''s own message output (*note
     Gawk I18N::).

   * BeOS support.  This was later removed.

   * Tandem support.  This was later removed.

   * The Atari port became officially unsupported and was later removed
     entirely.

   * The source code changed to use ISO C standard-style function
     definitions.

   * POSIX compliance for 'sub()' and 'gsub()' (*note Gory Details::).

   * The 'length()' function was extended to accept an array argument
     and return the number of elements in the array (*note String
     Functions::).

   * The 'strftime()' function acquired a third argument to enable
     printing times as UTC (*note Time Functions::).

   Version 4.0 of 'gawk' introduced the following features:

   * Variable additions:

        - 'FPAT', which allows you to specify a regexp that matches the
          fields, instead of matching the field separator (*note
          Splitting By Content::).

        - If 'PROCINFO["sorted_in"]' exists, 'for (iggy in foo)' loops
          sort the indices before looping over them.  The value of this
          element provides control over how the indices are sorted
          before the loop traversal starts (*note Controlling
          Scanning::).

        - 'PROCINFO["strftime"]', which holds the default format for
          'strftime()' (*note Time Functions::).

   * The special files '/dev/pid', '/dev/ppid', '/dev/pgrpid' and
     '/dev/user' were removed.

   * Support for IPv6 was added via the '/inet6' special file.  '/inet4'
     forces IPv4 and '/inet' chooses the system default, which is
     probably IPv4 (*note TCP/IP Networking::).

   * The use of '\s' and '\S' escape sequences in regular expressions
     (*note GNU Regexp Operators::).

   * Interval expressions became part of default regular expressions
     (*note Regexp Operators::).

   * POSIX character classes work even with '--traditional' (*note
     Regexp Operators::).

   * 'break' and 'continue' became invalid outside a loop, even with
     '--traditional' (*note Break Statement::, and also see *note
     Continue Statement::).

   * 'fflush()', 'nextfile', and 'delete ARRAY' are allowed even with
     '--posix' or '--traditional', since they are all now part of POSIX.

   * An optional third argument to 'asort()' and 'asorti()', specifying
     how to sort (*note String Functions::).

   * The behavior of 'fflush()' changed to match BWK 'awk' and for
     POSIX; now both 'fflush()' and 'fflush("")' flush all open output
     redirections (*note I/O Functions::).

   * The 'isarray()' function, which determines whether an item is an
     array, to make it possible to traverse arrays of arrays (*note
     Type Functions::).

   * The 'patsplit()' function which gives the same capability as
     'FPAT', for splitting (*note String Functions::).

   * An optional fourth argument to the 'split()' function, which is an
     array to hold the values of the separators (*note String
     Functions::).

   * Arrays of arrays (*note Arrays of Arrays::).

   * The 'BEGINFILE' and 'ENDFILE' special patterns (*note
     BEGINFILE/ENDFILE::).

   * Indirect function calls (*note Indirect Calls::).

   * 'switch' / 'case' are enabled by default (*note Switch
     Statement::).

   * Command-line option changes (*note Options::):

        - The '-b' and '--characters-as-bytes' options which prevent
          'gawk' from treating input as a multibyte string.

        - The redundant '--compat', '--copyleft', and '--usage' long
          options were removed.

        - The '--gen-po' option was finally renamed to the correct
          '--gen-pot'.

        - The '--sandbox' option which disables certain features.

        - All long options acquired corresponding short options, for use
          in '#!' scripts.

   * Directories named on the command line now produce a warning, not a
     fatal error, unless '--posix' or '--traditional' are used (*note
     Command-line directories::).

   * The 'gawk' internals were rewritten, bringing the 'dgawk' debugger
     and possibly improved performance (*note Debugger::).

   * Per the GNU Coding Standards, dynamic extensions must now define a
     global symbol indicating that they are GPL-compatible (*note Plugin
     License::).

   * In POSIX mode, string comparisons use 'strcoll()' / 'wcscoll()'
     (*note POSIX String Comparison::).

   * The option for raw sockets was removed, since it was never
     implemented (*note TCP/IP Networking::).

   * Ranges of the form '[d-h]' are treated as if they were in the C
     locale, no matter what kind of regexp is being used, and even if
     '--posix' (*note Ranges and Locales::).

   * Support was removed for the following systems:

        - Atari

        - Amiga

        - BeOS

        - Cray

        - MIPS RiscOS

        - MS-DOS with the Microsoft Compiler

        - MS-Windows with the Microsoft Compiler

        - NeXT

        - SunOS 3.x, Sun 386 (Road Runner)

        - Tandem (non-POSIX)

        - Prestandard VAX C compiler for VAX/VMS

   Version 4.1 of 'gawk' introduced the following features:

   * Three new arrays: 'SYMTAB', 'FUNCTAB', and
     'PROCINFO["identifiers"]' (*note Auto-set::).

   * The three executables 'gawk', 'pgawk', and 'dgawk' were merged
     into one, named just 'gawk'.  As a result the command-line options
     changed.

   * Command-line option changes (*note Options::):

        - The '-D' option invokes the debugger.

        - The '-i' and '--include' options load 'awk' library files.

        - The '-l' and '--load' options load compiled dynamic
          extensions.

        - The '-M' and '--bignum' options enable MPFR.

        - The '-o' option only does pretty-printing.

        - The '-p' option is used for profiling.

        - The '-R' option was removed.

   * Support for high precision arithmetic with MPFR (*note Arbitrary
     Precision Arithmetic::).

   * The 'and()', 'or()' and 'xor()' functions changed to allow any
     number of arguments, with a minimum of two (*note Bitwise
     Functions::).

   * The dynamic extension interface was completely redone (*note
     Dynamic Extensions::).

   * Redirected 'getline' became allowed inside 'BEGINFILE' and
     'ENDFILE' (*note BEGINFILE/ENDFILE::).

   * The 'where' command was added to the debugger (*note Execution
     Stack::).

   * Support for Ultrix was removed.

   Version 4.2 of 'gawk' introduced the following changes:

   * Changes to 'ENVIRON' are reflected into 'gawk''s environment and
     that of programs that it runs.  *Note Auto-set::.

   * 'FIELDWIDTHS' was enhanced to allow skipping characters before
     assigning a value to a field (*note Splitting By Content::).

   * The 'PROCINFO["argv"]' array.  *Note Auto-set::.

   * The maximum number of hexadecimal digits in '\x' escapes is now
     two.  *Note Escape Sequences::.

   * Strongly typed regexp constants of the form '@/.../' (*note Strong
     Regexp Constants::).

   * The bitwise functions changed, making negative arguments into a
     fatal error (*note Bitwise Functions::).

   * The 'mktime()' function now accepts an optional second argument
     (*note Time Functions::).

   * The 'typeof()' function (*note Type Functions::).

   * Optimizations are enabled by default.  Use '-s' / '--no-optimize'
     to disable optimizations.

   * For many years, POSIX specified that default field splitting only
     allowed spaces and tabs to separate fields, and this was how 'gawk'
     behaved with '--posix'.  As of 2013, the standard restored
     historical behavior, and now default field splitting with '--posix'
     also allows newlines to separate fields.

   * Nonfatal output with 'print' and 'printf'.  *Note Nonfatal::.

   * Retryable I/O via 'PROCINFO[INPUT-FILE, "RETRY"]' (*note Retrying
     Input::).

   * Changes to the pretty-printer (*note Profiling::):

        - The '--pretty-print' option no longer runs the 'awk' program
          too.

        - Comments in the source program are preserved and placed into
          the output file.

        - Explicit parentheses for expressions in the input are
          preserved in the generated output.

   * Improvements to the extension API (*note Dynamic Extensions::):

        - The 'get_file()' function to access open redirections.

        - The 'nonfatal()' function for generating nonfatal error
          messages.

        - Support for GMP and MPFR values.

        - Input parsers can now override the default field parsing
          mechanism by specifying explicit locations.

   * Shell startup files are supplied with the distribution and
     installed by 'make install' (*note Shell Startup Files::).

   * The 'igawk' program and its manual page are no longer installed
     when 'gawk' is built.  *Note Igawk Program::.

   * Support for MirBSD was removed.

   * Support for GNU/Linux on Alpha was removed.

   Version 5.0 added the following features:

   * The 'PROCINFO["platform"]' array element, which allows you to write
     code that takes the operating system / platform into account.

   Version 5.1 was created to release 'gawk' with a correct major
version number for the API. This was overlooked for version 5.0,
unfortunately.  It added the following features:

   * The index for this manual was completely reworked.

   * Support was added for MSYS2.

   * 'asort()' and 'asorti()' were changed to allow 'FUNCTAB' and
     'SYMTAB' as the first argument if a second destination array is
     supplied (*note String Functions::).

   * The '-I'/'--trace' options were added to print a trace of the byte
     codes as they execute (*note Options::).

   * '$0' and the fields are now cleared before starting a 'BEGINFILE'
     rule (*note BEGINFILE/ENDFILE::).

   * Several example programs in the manual were updated to their modern
     POSIX equivalents.

   * The "no effect" lint warnings from '--lint' were fixed up and now
     behave more sanely (*note Options::).

   * Handling of Infinity and NaN values was improved.  *Note Math
     Definitions::, and also see *note POSIX Floating Point Problems::.

   Version 5.2 added the following features:

   * The 'mkbool()' built-in function (*note Boolean Functions::).

   * Interval expressions in regular expressions are enabled by default
     (*note Interval Expressions::).

   * Support for the FNV1-A hash algorithm for its hash function (*note
     Other Environment Variables::).

   * The 'gawkbug' script for reporting bugs (*note Bug address::).

   * Terence Kelly's persistent memory allocator (PMA) was added,
     allowing the use of persistent data on certain systems (*note
     Persistent Memory::).
