
File: coreutils.info,  Node: Translating,  Next: Squeezing and deleting,  Prev: Character arrays,  Up: tr invocation

9.1.2 Translating
-----------------

‘tr’ performs translation when STRING1 and STRING2 are both given and
the ‘--delete’ (‘-d’) option is not given.  ‘tr’ translates each
character of its input that is in ARRAY1 to the corresponding character
in ARRAY2.  Characters not in ARRAY1 are passed through unchanged.

   As a GNU extension to POSIX, when a character appears more than once
in ARRAY1, only the final instance is used.  For example, these two
commands are equivalent:

     tr aaa xyz
     tr a z
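
   The equivalence can be checked on a sample word.  This is a minimal
sketch that assumes GNU ‘tr’; the input word and the variable names are
illustrative only, and other implementations may treat repeated
characters differently.

```shell
# Assumes GNU tr: only the final 'a' in ARRAY1 counts, so a -> z.
out_repeated=$(printf 'banana\n' | tr aaa xyz)
out_single=$(printf 'banana\n' | tr a z)

# Both lines printed here should be identical.
printf '%s\n%s\n' "$out_repeated" "$out_single"
```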

   A common use of ‘tr’ is to convert lowercase characters to uppercase.
This can be done in many ways.  Here are three of them:

     tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
     tr a-z A-Z
     tr '[:lower:]' '[:upper:]'

However, ranges like ‘a-z’ are not portable outside the C locale.
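
   As a rough illustration, the explicit-list and character-class forms
can be run on the same input.  The sample string and variable names are
assumptions made for this sketch; the range form is left out because its
behavior depends on the locale.

```shell
sample='Hello, World'

# Explicit list of all 26 letters: verbose, but no ranges involved.
upper_explicit=$(printf '%s\n' "$sample" |
  tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ)

# Character classes: short, and independent of range ordering.
upper_class=$(printf '%s\n' "$sample" | tr '[:lower:]' '[:upper:]')

printf '%s\n%s\n' "$upper_explicit" "$upper_class"
```

Characters that are not lowercase letters, such as the comma and the
space, pass through unchanged in both cases.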

   When ‘tr’ is performing translation, ARRAY1 and ARRAY2 typically have
the same length.  If ARRAY1 is shorter than ARRAY2, the extra characters
at the end of ARRAY2 are ignored.
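
   A small sketch of the shorter-ARRAY1 case, using an arbitrary input
chosen for illustration: ‘a’ maps to ‘x’, ‘b’ maps to ‘y’, and the
trailing ‘z’ in ARRAY2 has no partner and is never used.

```shell
# ARRAY1 is "ab" (2 characters); ARRAY2 is "xyz" (3 characters).
# The extra 'z' is ignored, and 'c' is not in ARRAY1, so it is unchanged.
short_set1=$(printf 'abc\n' | tr ab xyz)
printf '%s\n' "$short_set1"
```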

   On the other hand, making ARRAY1 longer than ARRAY2 is not portable;
POSIX says that the result is undefined.  In this situation, BSD ‘tr’
pads ARRAY2 to the length of ARRAY1 by repeating the last character of
ARRAY2 as many times as necessary.  System V ‘tr’ truncates ARRAY1 to
the length of ARRAY2.

   By default, GNU ‘tr’ handles this case like BSD ‘tr’.  When the
‘--truncate-set1’ (‘-t’) option is given, GNU ‘tr’ handles this case
like the System V ‘tr’ instead.  This option is ignored for operations
other than translation.
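
   The two behaviors can be compared side by side.  This sketch assumes
GNU ‘tr’; the input string and variable names are illustrative.

```shell
# Default (BSD-like): ARRAY2 "xy" is padded to "xyyyy" by repeating 'y'.
padded=$(printf 'abcde\n' | tr abcde xy)

# With -t (System V-like): ARRAY1 is truncated to "ab", so c, d, e pass through.
truncated=$(printf 'abcde\n' | tr -t abcde xy)

printf 'default: %s\nwith -t: %s\n' "$padded" "$truncated"
```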

   Acting like System V ‘tr’ in this case breaks the relatively common
BSD idiom:

     tr -cs A-Za-z0-9 '\012'

because it converts only zero bytes (the first element in the complement
of ARRAY1), rather than all non-alphanumerics, to newlines.
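
   The breakage can be reproduced with GNU ‘tr’ by adding ‘-t’ to the
idiom.  This is a sketch under the assumption of the C locale (forced
here with ‘LC_ALL=C’ so that the ranges are well defined); the input
text is arbitrary.

```shell
# Default padding: every non-alphanumeric becomes a newline.
bsd_style=$(printf 'one two\n' | LC_ALL=C tr -cs A-Za-z0-9 '\012')

# With -t: the complement is truncated to its first element (the zero
# byte), so the space is left alone and the line is not split.
sysv_style=$(printf 'one two\n' | LC_ALL=C tr -cst A-Za-z0-9 '\012')

printf '%s\n---\n%s\n' "$bsd_style" "$sysv_style"
```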

By the way, the above idiom is not portable because it uses ranges, and
it assumes that the octal code for newline is 012.  Here is a better way
to write it:

     tr -cs '[:alnum:]' '[\n*]'
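
   For instance, the portable form can split a line into one word per
line.  The sample text and the variable name are assumptions made for
this sketch.

```shell
# -c complements ARRAY1, so every non-alphanumeric character is translated
# to a newline; -s then squeezes each run of newlines into a single one.
words=$(printf 'hello, world! 42\n' | tr -cs '[:alnum:]' '[\n*]')
printf '%s\n' "$words"
```

Here ‘[\n*]’ repeats the newline as many times as needed to match the
length of the complemented ARRAY1, so the result does not depend on how
an implementation handles a short ARRAY2.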
