-
-
-
- TR(1L) TR(1L)
-
-
- NAME
- tr - translate or delete characters
-
- SYNOPSIS
- tr [-cst] [--complement] [--squeeze-repeats]
- [--truncate-set1] string1 string2
- tr {-s,--squeeze-repeats} [-c] [--complement] string1
- tr {-d,--delete} [-c] string1
- tr {-d,--delete} {-s,--squeeze-repeats} [-c]
- [--complement] string1 string2
-
- GNU tr also accepts the --help and --version options.
-
- DESCRIPTION
- This manual page documents the GNU version of tr. tr copies
- the standard input to the standard output, performing one of
- the following operations:
-
- o translate, and optionally squeeze repeated characters in
-   the result
- o squeeze repeated characters
- o delete characters
- o delete characters, then squeeze repeated characters from
-   the result.
-
- The string1 and (if given) string2 arguments define ordered
- sets of characters, referred to below as set1 and set2. These
- sets are the characters of the input that tr operates on. The
- --complement (-c) option replaces set1 with its complement
- (all of the characters that are not in set1).
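-
- As a rough illustration of the four operations and of
- --complement, the sketch below runs tr on one fixed input. It
- assumes a POSIX shell; the variable names are illustrative
- only and are not part of tr.
-
```shell
# Illustrative sketch; variable names are hypothetical.
input='aabbcc'

translated=$(printf '%s\n' "$input" | tr abc ABC)     # translate a, b, c to A, B, C
squeezed=$(printf '%s\n' "$input" | tr -s ab)         # squeeze runs of a or b
deleted=$(printf '%s\n' "$input" | tr -d ab)          # delete every a and b
del_squeezed=$(printf '%s\n' "$input" | tr -ds a c)   # delete a, then squeeze runs of c
kept=$(printf '%s\n' "$input" | tr -cd ab)            # delete everything that is NOT a or b

printf '%s\n' "$translated" "$squeezed" "$deleted" "$del_squeezed" "$kept"
# prints AABBCC, abcc, cc, bbc and aabb, one per line
```
-
- Note that the last command also deletes the trailing newline,
- because the newline belongs to the complement of set1.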
-
- SPECIFYING SETS OF CHARACTERS
- The format of the string1 and string2 arguments resembles the
- format of regular expressions; however, they are not regular
- expressions, only lists of characters. Most characters simply
- represent themselves in these strings, but the strings can
- contain the shorthands listed below, for convenience. Some of
- them can be used only in string1 or string2, as noted below.
-
- Backslash escapes. A backslash followed by a character not
- listed below causes an error message.
-
- \a Control-G.
-
- \b Control-H.
-
- \f Control-L.
-
- \n Control-J.
-
- \r Control-M.
-
- \t Control-I.
-
- \v Control-K.
-
- \ooo The character with the value given by ooo, which is
- 1 to 3 octal digits.
-
- \\ A backslash.
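-
- A minimal sketch of the escapes in use, assuming a POSIX
- shell. The strings are single-quoted so that the shell passes
- the backslashes through to tr untouched; the variable names
- are illustrative only.
-
```shell
# Illustrative sketch; variable names are hypothetical.
tab_to_newline=$(printf 'a\tb\tc\n' | tr '\t' '\n')           # \t and \n escapes
space_to_underscore=$(printf 'a b c\n' | tr '\040' '_')       # octal 040 is a space
backslash_to_slash=$(printf '%s\n' 'a\b\c' | tr '\\' '/')     # \\ is one backslash

printf '%s\n' "$tab_to_newline" "$space_to_underscore" "$backslash_to_slash"
# prints a, b and c on separate lines, then a_b_c, then a/b/c
```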
-
- Ranges. The notation `m-n' expands to all of the characters
- from m through n, in ascending order. m should collate before
- n; if it doesn't, an error results. As an example, `0-9' is
- the same as `0123456789'. Although GNU tr does not support the
- System V syntax that uses square brackets to enclose ranges,
- translations specified in that format will still work as long
- as the brackets in string1 correspond to identical brackets in
- string2.
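-
- The range notation, and the way the System V bracket form
- still works, can be sketched as follows. In the bracketed
- command the brackets are ordinary characters that simply map
- to themselves. The variable names are illustrative only.
-
```shell
# Illustrative sketch; variable names are hypothetical.
shifted=$(printf 'hal\n' | tr a-y b-z)                  # each letter moves up by one
shifted_sysv=$(printf 'hal\n' | tr '[a-y]' '[b-z]')     # same mapping, [ -> [ and ] -> ]
digits_to_letters=$(printf 'room 101\n' | tr 0-9 a-j)   # 0 -> a, 1 -> b, ...

printf '%s\n' "$shifted" "$shifted_sysv" "$digits_to_letters"
# prints ibm, ibm and room bab, one per line
```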
-
- Repeated characters. The notation `[c*n]' in string2 expands
- to n copies of character c. Thus, `[y*6]' is the same as
- `yyyyyy'. The notation `[c*]' in string2 expands to as many
- copies of c as are needed to make set2 as long as set1. If n
- begins with a 0, it is interpreted in octal, otherwise in
- decimal.
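-
- A short sketch of the repeat notation, assuming GNU tr. The
- last command relies on the octal rule: `[x*010]' is eight
- copies of x, so the ninth character of set1 maps to y. The
- variable names are illustrative only.
-
```shell
# Illustrative sketch; variable names are hypothetical.
masked=$(printf '2024-01-15\n' | tr 0-9 '[#*]')            # [#*] fills set2 to the length of set1
halves=$(printf 'abc-def\n' | tr a-f '[x*3][y*3]')         # set2 is xxxyyy
octal_count=$(printf 'abcdefghi\n' | tr a-i '[x*010]y')    # 010 is octal, i.e. 8 copies of x

printf '%s\n' "$masked" "$halves" "$octal_count"
# prints ####-##-##, xxx-yyy and xxxxxxxxy, one per line
```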
-
- Character classes. The notation `[:class-name:]' expands to
- all of the characters in the (predefined) class named
- class-name. The characters expand in no particular order,
- except for the `upper' and `lower' classes, which expand in
- ascending order. When the --delete (-d) and --squeeze-repeats
- (-s) options are both given, any character class can be used
- in string2. Otherwise, only the character classes `lower' and
- `upper' are accepted in string2, and then only if the
- corresponding character class (`upper' and `lower',
- respectively) is specified in the same relative position in
- string1. Doing this specifies case conversion. The class names
- are given below; an error results when an invalid class name
- is given.
-
- alnum Letters and digits.
-
- alpha Letters.
-
- blank Horizontal whitespace.
-
- cntrl Control characters.
-
- digit Digits.
-
- graph Printable characters, not including space.
-
- lower Lowercase letters.
-
- print Printable characters, including space.
-
- punct Punctuation characters.
-
- space Horizontal or vertical whitespace.
-
- upper Uppercase letters.
-
- xdigit Hexadecimal digits.
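-
- A few of the classes above in use, as a sketch that assumes
- the C locale and a POSIX shell. The last command gives both -d
- and -s, which is the case where any class is allowed in
- string2. The variable names are illustrative only.
-
```shell
# Illustrative sketch; variable names are hypothetical.
no_digits=$(printf 'a1b2c3\n' | tr -d '[:digit:]')               # delete digits
upper=$(printf 'Hello World\n' | tr '[:lower:]' '[:upper:]')     # case conversion
no_punct=$(printf 'a,b;c.\n' | tr -d '[:punct:]')                # delete punctuation
tidy=$(printf 'a1  b2\n' | tr -ds '[:digit:]' '[:space:]')       # delete digits, squeeze whitespace

printf '%s\n' "$no_digits" "$upper" "$no_punct" "$tidy"
# prints abc, HELLO WORLD, abc and "a b" (without quotes), one per line
```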
-
- Equivalence classes. The syntax `[=c=]' expands to all of the
- characters that are equivalent to c, in no particular order.
- Equivalence classes are a recent invention intended to support
- non-English alphabets. But there seems to be no standard way
- to define them or determine their contents. Therefore, they
- are not fully implemented in GNU tr; each character's
- equivalence class consists only of that character, which makes
- this a useless construction currently.
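-
- Under the behaviour described above, `[=e=]' and a plain `e'
- select the same single character. A minimal sketch, assuming
- GNU tr; the variable names are illustrative only.
-
```shell
# Illustrative sketch; variable names are hypothetical.
with_class=$(printf 'resume\n' | tr -d '[=e=]')   # equivalence class of e
with_char=$(printf 'resume\n' | tr -d e)          # the character e itself

printf '%s\n' "$with_class" "$with_char"
# prints rsum twice
```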
-
- TRANSLATING
- tr performs translation when string1 and string2 are both
- given and the --delete (-d) option is not given. tr translates
- each character of its input that is in set1 to the
- corresponding character in set2. Characters not in set1 are
- passed through unchanged. When a character appears more than
- once in set1 and the corresponding characters in set2 are not
- all the same, only the final one is used. For example, these
- two commands are equivalent:
- tr aaa xyz
- tr a z
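-
- The equivalence of those two commands can be checked on a
- sample input, as in this sketch (the input and the variable
- names are illustrative only):
-
```shell
# Illustrative sketch; variable names are hypothetical.
with_repeats=$(printf 'banana\n' | tr aaa xyz)   # a appears three times; the final mapping, a -> z, wins
single=$(printf 'banana\n' | tr a z)             # the same mapping written once

printf '%s\n' "$with_repeats" "$single"
# prints bznznz twice
```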
-
- A common use of tr is to convert lowercase characters to
- uppercase. This can be done in many ways. Here are three
- of them:
- tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
- tr a-z A-Z
- tr '[:lower:]' '[:upper:]'
-
- When tr is performing translation, set1 and set2 should
- normally have the same length. If set1 is shorter than
- set2, the extra characters at the end of set2 are ignored.
-
- On the other hand, making set1 longer than set2 is not
- portable; POSIX.2 says that the result is undefined. In
- this situation, the BSD tr pads set2 to the length of set1
- by repeating the last character of set2 as many times as
- necessary. The System V tr truncates set1 to the length
- of set2.
-
- By default, GNU tr handles this case like the BSD tr does.
- When the --truncate-set1 (-t) option is given, GNU tr handles
- this case like the System V tr instead. This option is ignored
- for operations other than translation.
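-
- The three length cases can be sketched as follows. This
- assumes GNU tr, since the second and third results are
- exactly the behaviour that POSIX.2 leaves undefined. The
- variable names are illustrative only.
-
```shell
# Illustrative sketch, assuming GNU tr; variable names are hypothetical.
shorter_set1=$(printf 'abc\n' | tr ab xyz)        # extra z in set2 is ignored
padded=$(printf 'abcdef\n' | tr a-f xy)           # BSD style: set2 padded to xyyyyy
truncated=$(printf 'abcdef\n' | tr -t a-f xy)     # System V style: set1 truncated to ab

printf '%s\n' "$shorter_set1" "$padded" "$truncated"
# prints xyc, xyyyyy and xycdef, one per line
```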
-
- Acting like the System V tr in this case breaks the relatively
- common BSD idiom:
- tr -cs A-Za-z0-9 '\012'
- because it converts only zero bytes (the first element in the
- complement of set1), rather than all non-alphanumerics, to
- newlines.
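-
- The difference shows up on any input that contains no zero
- bytes. A sketch, assuming GNU tr; the sample text and the
- variable names are illustrative only.
-
```shell
# Illustrative sketch, assuming GNU tr; variable names are hypothetical.
text='one, two;;three'

# BSD-style padding: every non-alphanumeric becomes a newline, then repeats are squeezed.
words=$(printf '%s\n' "$text" | tr -cs A-Za-z0-9 '\012')

# With -t, the complement of set1 is cut down to its first element (the zero byte),
# so this input passes through unchanged.
unchanged=$(printf '%s\n' "$text" | tr -cst A-Za-z0-9 '\012')

printf '%s\n' "$words" "$unchanged"
# prints one, two and three on separate lines, then: one, two;;three
```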
-
- SQUEEZING REPEATS AND DELETING
- When given just the --delete (-d) option, tr removes any
- input characters that are in set1.
-
- When given just the --squeeze-repeats (-s) option, tr
- replaces each input sequence of a repeated character that
- is in set1 with a single occurrence of that character.
-
- When given both the --delete and the --squeeze-repeats
- options, tr first performs any deletions using set1, then
- squeezes repeats from any remaining characters using set2.
-
- The --squeeze-repeats option may also be used when translating,
- in which case tr first performs translation, then squeezes
- repeats from any remaining characters using set2.
-
- Here are some examples to illustrate various combinations
- of options:
-
- Remove all zero bytes:
- tr -d '\000'
-
- Put all words on lines by themselves. This converts all
- non-alphanumeric characters to newlines, then squeezes
- each string of repeated newlines into a single newline:
- tr -cs 'a-zA-Z0-9' '[\n*]'
-
- Convert each sequence of repeated newlines to a single
- newline:
- tr -s '\n'
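-
- The order of operations in the combined modes can be seen on
- small inputs, as in this sketch (POSIX shell assumed; the
- inputs and the variable names are illustrative only):
-
```shell
# Illustrative sketch; variable names are hypothetical.
no_nul=$(printf 'a\000b\n' | tr -d '\000')                    # remove the zero byte
del_then_squeeze=$(printf 'aaa...bbb\n' | tr -ds . b)         # delete dots (set1), then squeeze b (set2)
trans_then_squeeze=$(printf 'a  b   c\n' | tr -s ' ' '_')     # spaces -> _, then squeeze runs of _
one_newline=$(printf 'x\n\n\ny\n' | tr -s '\n')               # collapse blank lines

printf '%s\n' "$no_nul" "$del_then_squeeze" "$trans_then_squeeze" "$one_newline"
# prints ab, aaab, a_b_c, then x and y on separate lines
```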
-
- GNU tr also accepts the following options in any combination
- with the others.
-
- --help Print a usage message and exit with a non-zero status.
-
- --version
- Print version information on standard error then
- exit.
-
- WARNING MESSAGES
- Setting the environment variable POSIXLY_CORRECT turns off
- several warning and error messages, for strict compliance
- with POSIX.2. The messages normally occur in the following
- circumstances:
-
- 1. When the --delete option is given but --squeeze-repeats is
- not, and string2 is given, GNU tr by default prints a usage
- message and exits, because string2 would not be used. The
- POSIX specification says that string2 must be ignored in this
- case. Silently ignoring arguments is a bad idea.
-
- 2. When an ambiguous octal escape is given. For example,
- \400 is actually \40 followed by the digit 0, because the
- value 400 octal does not fit into a single byte.
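-
- Both circumstances can be reproduced with the sketch below.
- It assumes GNU tr with POSIXLY_CORRECT unset; the diagnostics
- go to standard error and are discarded here. The variable
- names are illustrative only.
-
```shell
# Illustrative sketch, assuming GNU tr; variable names are hypothetical.

# Case 2: '\400' is read as \40 (a space) followed by the digit 0,
# so set1 is " 0" and maps to "xy" position by position.
ambiguous=$(printf ' 0\n' | tr '\400' 'xy' 2>/dev/null)

# Case 1: string2 given with --delete but without --squeeze-repeats.
if printf 'abc\n' | tr -d a b >/dev/null 2>&1; then
    extra_operand=accepted
else
    extra_operand=rejected
fi

printf '%s\n' "$ambiguous" "$extra_operand"
# prints xy and rejected, one per line
```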
-
- Note that GNU tr does not provide complete BSD or System V
- compatibility. For example, there is no option to disable
- interpretation of the POSIX constructs [:alpha:], [=c=],
- and [c*10]. Also, GNU tr does not delete zero bytes
- automatically, unlike traditional UNIX versions, which provide
- no way to preserve zero bytes.
-
- FSF GNU Text Utilities 5
-
-
-