File: gawk.info,  Node: Basic Data Typing,  Prev: Basic High Level,  Up: Basic Concepts

D.2 Data Values in a Computer
=============================

In a program, you keep track of information and values in things called
"variables".  A variable is just a name for a given value, such as
'first_name', 'last_name', 'address', and so on.  'awk' has several
predefined variables, and it has special names to refer to the current
input record and the fields of the record.  You may also group multiple
associated values under one name, as an array.
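
   As a minimal sketch of these ideas, the following shell function
(the name 'field_demo' is made up for illustration) feeds one record to
'awk'.  It uses ordinary variables, the predefined variable 'NF', the
record '$0', the fields '$1' and '$2', and an array.  It assumes
whitespace-separated input with at least two fields:

```sh
# field_demo: hypothetical helper, not part of awk or gawk.
field_demo() {
    printf '%s\n' "$1" | awk '{
        first_name = $1                 # ordinary variables naming two values
        last_name = $2
        seen[last_name] = first_name    # an array groups values under one name
        print "record: " $0             # $0 is the current input record
        print "fields: " NF             # NF is a predefined variable
        print "lookup: " seen[last_name]
    }'
}

field_demo "Arnold Robbins"
```

   With the input shown, this should print the whole record, a field
count of 2, and the first name stored in the array.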

   Data, particularly in 'awk', consists of either numeric values, such
as 42 or 3.1415927, or string values.  String values are essentially
anything that's not a number, such as a name.  Strings are sometimes
referred to as "character data", since they store the individual
characters that make them up.  Individual values, whether numeric or
string, are referred to as "scalar" values.  Groups of values, such as
arrays, are not scalars.
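
   The difference between the two kinds of scalar can be sketched as
follows.  The function name 'scalar_demo' and the variable names are
hypothetical, chosen only for this illustration:

```sh
# scalar_demo: hypothetical helper showing one numeric and one string scalar.
scalar_demo() {
    awk 'BEGIN {
        answer = 42                     # a numeric value
        name = "Arnold"                 # a string value (character data)
        print answer + 1                # arithmetic uses the number
        print name " has " length(name) " characters"
    }'
}

scalar_demo
```

   The first line of output is the result of arithmetic on the number;
the second is built by joining strings, and reports how many characters
the string holds.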

   *note Computer Arithmetic::, provided a basic introduction to numeric
types (integer and floating-point) and how they are used in a computer.
Please review that information, including a number of caveats that were
presented.

   While you are probably used to the idea of a number that represents
nothing (i.e., zero), it takes a bit more getting used to the idea of
zero-length character data.  Nevertheless, such a thing exists.  It is
called the "null string".  The null string is character data that
contains no characters.  In other words, it is empty.  It is written in
'awk' programs like this: '""'.
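
   A small sketch of the null string in use follows.  The function
name 'null_string_demo' is hypothetical.  The last test relies on the
fact that a variable which has never been assigned acts as both the
null string and zero; the variable name 'unset' is just an ordinary
name picked for this example:

```sh
# null_string_demo: hypothetical helper, not part of awk or gawk.
null_string_demo() {
    awk 'BEGIN {
        empty = ""                      # the null string
        print length(empty)             # it contains no characters
        print "[" empty "]"             # nothing appears between the brackets
        # A never-assigned variable compares equal to both "" and 0.
        print ((unset == "" && unset == 0) ? "unset is null and zero" : "unexpected")
    }'
}

null_string_demo
```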

   Humans are used to working in decimal; i.e., base 10.  In base 10,
the digits in each column go from 0 to 9, and then "roll over" into the
next column.  (Remember grade school?  42 = (4 x 10) + 2.)

   There are other number bases though.  Computers commonly use base 2
or "binary", base 8 or "octal", and base 16 or "hexadecimal".  In
binary, each column represents two times the value in the column to its
right.  Each column may contain either a 0 or a 1.  Thus, binary 1010
represents (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1), or decimal 10.  Octal
and hexadecimal are discussed more in *note Nondecimal-numbers::.
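
   The column arithmetic described above can be sketched in 'awk'
itself.  The function name 'place_value' is hypothetical, and the
sketch assumes a base of 10 or less, so that every digit is one of the
characters 0 through 9.  Working left to right, it multiplies the
running total by the base and adds the next digit, which gives the same
result as summing each digit times the value of its column:

```sh
# place_value DIGITS BASE: hypothetical helper; handles bases 2 through 10.
place_value() {
    awk -v digits="$1" -v base="$2" 'BEGIN {
        n = 0
        for (i = 1; i <= length(digits); i++)
            n = n * base + substr(digits, i, 1)   # shift one column, add digit
        print n
    }'
}

place_value 42 10      # decimal: (4 x 10) + 2
place_value 1010 2     # binary: (1 x 8) + (0 x 4) + (1 x 2) + (0 x 1)
```

   The first call prints 42 and the second prints 10, matching the
worked examples in the text.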

   At the very lowest level, computers store values as groups of binary
digits, or "bits".  Modern computers group bits into groups of eight,
called "bytes".  Advanced applications sometimes have to manipulate bits
directly, and 'gawk' provides functions for doing so.
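
   As a brief sketch, the following uses 'gawk''s built-in 'and()',
'or()', 'xor()', 'lshift()', and 'rshift()' functions on small values.
These are 'gawk' extensions, so the example assumes that 'gawk' is
installed; the wrapper name 'bit_demo' is hypothetical:

```sh
# bit_demo: hypothetical helper; requires gawk, since these functions
# are gawk extensions and are not part of POSIX awk.
bit_demo() {
    gawk 'BEGIN {
        print and(10, 12)       # binary 1010 AND 1100 = 1000
        print or(10, 12)        # binary 1010 OR  1100 = 1110
        print xor(10, 12)       # binary 1010 XOR 1100 = 0110
        print lshift(1, 3)      # 1 shifted left by three bits
        print rshift(10, 1)     # 1010 shifted right by one bit
    }'
}

# Run the demonstration only when gawk is available.
if command -v gawk >/dev/null 2>&1; then
    bit_demo
fi
```

   In decimal, the five results are 8, 14, 6, 8, and 5.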

   Programs are written in programming languages.  Hundreds, if not
thousands, of programming languages exist.  One of the most popular is
the C programming language.  The C language had a very strong influence
on the design of the 'awk' language.

   There have been several versions of C.  The first is often referred
to as "K&R" C, after the initials of Brian Kernighan and Dennis Ritchie,
the authors of the first book on C.  (Dennis Ritchie created the
language, and Brian Kernighan was one of the creators of 'awk'.)

   In the mid-1980s, an effort began to produce an international
standard for C.  This work culminated in 1989, with the production of
the ANSI standard for C.  This standard became an ISO standard in 1990.
In 1999, a revised ISO C standard was approved and released.  Where it
makes sense, POSIX 'awk' is compatible with 1999 ISO C.
