NAME
colldef - convert collation sequence source definition
SYNOPSIS
colldef [-I map_dir] [-o out_file] [filename]
DESCRIPTION
The colldef utility converts a collation sequence source definition into a format usable by the strxfrm() and strcoll() functions. It is used to define the many ways in which strings can be ordered and collated. The strxfrm() function transforms its second argument and places the result in its first argument. The transformed string is such that it can be correctly ordered with other transformed strings by using strcmp(), strncmp(), or memcmp(). The strcoll() function transforms its arguments and does a comparison.
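The relationship between the two functions can be sketched in C. The helper names compare_via_strxfrm and compare_via_strcoll are hypothetical and are not part of colldef or the C library; the sketch only assumes the standard strxfrm(), strcoll(), and strcmp() interfaces, and both helpers should agree on the sign of the result.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return -1, 0, or 1 according to the sign of v. */
static int sign(int v) { return (v > 0) - (v < 0); }

/*
 * Compare a and b by transforming each with strxfrm() and then
 * comparing the transformed strings with strcmp().
 * Returns -1, 0, or 1; returns 2 if memory allocation fails.
 */
int compare_via_strxfrm(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* With a size of 0, strxfrm() only reports the length required. */
    size_t na = strxfrm(NULL, a, 0) + 1;
    size_t nb = strxfrm(NULL, b, 0) + 1;

    char *ta = malloc(na);
    char *tb = malloc(nb);
    if (ta == NULL || tb == NULL) {
        free(ta);
        free(tb);
        return 2;
    }

    /* Destination is the first argument, source is the second. */
    strxfrm(ta, a, na);
    strxfrm(tb, b, nb);
    int result = sign(strcmp(ta, tb));

    free(ta);
    free(tb);
    return result;
}

/* Compare a and b directly with strcoll(). Returns -1, 0, or 1. */
int compare_via_strcoll(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);
    return sign(strcoll(a, b));
}
```

A program would normally call setlocale(LC_COLLATE, "") before using either helper so that the collation database of the current locale is consulted; without that call the default "C" locale applies.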
The colldef utility reads the collation sequence source definition from filename, or from the standard input if no filename is given, and stores the converted definition in out_file. The output file produced contains the database with collating sequence information in a form usable by system commands and routines.
The following options are available:
-I map_dir
- Set the directory where charmap files can be found; the current directory by default.
-o out_file
- Set the output file name; LC_COLLATE by default.
The collation sequence definition specifies a set of collating elements and the rules defining how strings containing these should be ordered. This is most useful for different language definitions.
The specification file can consist of three statements: charmap, substitute and order.
Of these, only the order statement is required. When charmap or substitute is supplied, these statements must be ordered as above. Any statements after the order statement are ignored.
Lines in the specification file beginning with a ‘#’ are treated as comments and are ignored. Blank lines are also ignored.
charmap charmapfile
Charmap defines where a mapping of the character and collating element symbols to the actual character encoding can be found.
The format of charmapfile is shown below. Symbol names are separated from their values by TAB or SPACE characters. Symbol-value can be specified in a hexadecimal (\x??) or octal (\???) representation, and can be only one character in length.
symbol-name1 symbol-value1
symbol-name2 symbol-value2
...
Symbol names cannot be specified in substitute fields.
The charmap statement is optional.
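As an illustration of the charmapfile format described above, the following C sketch parses a single line into a name and a one-byte value. It is not colldef's own parser: the struct charmap_entry type, the parse_charmap_line function, and the 63-character name limit are all assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical record type for one charmap entry (not colldef's own). */
struct charmap_entry {
    char name[64];        /* symbol name, e.g. "letterA" */
    unsigned char value;  /* single-byte symbol value */
};

/*
 * Parse one line of the form "symbol-name<TAB or SPACE>symbol-value",
 * where symbol-value is \x?? (hexadecimal), \??? (octal), or a single
 * literal character. Returns 0 on success, -1 on a malformed line.
 */
int parse_charmap_line(const char *line, struct charmap_entry *out)
{
    assert(line != NULL && out != NULL);

    char value[16];
    /* %63s and %15s stop at white space, so either separator works. */
    if (sscanf(line, "%63s %15s", out->name, value) != 2)
        return -1;

    if (value[0] != '\\') {
        /* Literal value: must be exactly one character. */
        if (value[1] != '\0')
            return -1;
        out->value = (unsigned char)value[0];
        return 0;
    }

    /* Escaped value: base 16 after "\x", base 8 otherwise. */
    int base = (value[1] == 'x') ? 16 : 8;
    const char *digits = value + ((base == 16) ? 2 : 1);
    if (!isxdigit((unsigned char)*digits))
        return -1;

    char *end;
    unsigned long v = strtoul(digits, &end, base);
    /* Reject trailing junk and anything wider than one byte. */
    if (*end != '\0' || v > 0xFF)
        return -1;

    out->value = (unsigned char)v;
    return 0;
}
```

For example, the line "letterA \023" yields the name letterA with the value 19 (octal 023), which an order list could then reference as <letterA>.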
substitute "symbol" with "repl_string"
The substitute statement substitutes the character symbol with the string repl_string. Symbol names cannot be specified in the repl_string field. The substitute statement is optional.
order order_list
Order_list is a list of symbols, separated by semicolons, that defines the collating sequence. The special symbol ... specifies, in a short-hand form, symbols that are sequential in machine code order.
An order list element can be represented in any one of the following ways:
- The symbol itself (for example, a for the lower-case letter a).
- The symbol in octal representation (for example, \141 for the letter a).
- The symbol in hexadecimal representation (for example, \x61 for the letter a).
- The symbol name as defined in the charmap file (for example, <letterA> for letterA \023 record in charmapfile). If a character map name contains a > character, it must be escaped as />; a single / must be escaped as //.
- Symbols \a, \b, \f, \n, \r, \v are permitted with their usual C-language meanings.
- The symbol chain (for example: abc, <letterA><letterB>c, \xf1b\xf2)
- The symbol range (for example, a;...;z).
- Comma-separated symbols, ranges and chains enclosed in parentheses (for example ( sym1, sym2, ... )) are assigned the same primary ordering but different secondary ordering.
- Comma-separated symbols, ranges and chains enclosed in curly brackets (for example { sym1, sym2, ... }) are assigned the same primary ordering only.
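The difference between the parenthesized and curly-bracket groups can be illustrated with a small two-level comparison in C. This is only a sketch under stated assumptions: the weight table corresponds to a made-up order list (a,A);{b,B};c, the struct weight type and function names are hypothetical, and it assumes that "the same primary ordering only" means the grouped symbols carry no distinguishing secondary weight. The actual LC_COLLATE layout and the comparison performed by strcoll() may differ.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical weight pair; the real LC_COLLATE layout may differ. */
struct weight {
    int prim;  /* primary ordering */
    int sec;   /* secondary ordering, used only to break primary ties */
};

/*
 * Illustrative weights, assumed to correspond to the order list
 *     (a,A);{b,B};c
 * "a" and "A" share a primary weight but differ in secondary weight;
 * "b" and "B" share both weights, so they compare equal.
 */
static struct weight weight_of(unsigned char c)
{
    switch (c) {
    case 'a': return (struct weight){1, 1};
    case 'A': return (struct weight){1, 2};
    case 'b': return (struct weight){2, 0};
    case 'B': return (struct weight){2, 0};
    case 'c': return (struct weight){3, 0};
    /* Unlisted bytes sort after the listed ones, in byte order. */
    default:  return (struct weight){1000 + c, 0};
    }
}

/*
 * Compare two strings by primary weight first; the first secondary
 * difference is remembered and used only if all primaries are equal.
 * Returns -1, 0, or 1.
 */
int two_level_compare(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    int sec_diff = 0;
    for (; *a != '\0' && *b != '\0'; a++, b++) {
        struct weight wa = weight_of((unsigned char)*a);
        struct weight wb = weight_of((unsigned char)*b);
        if (wa.prim != wb.prim)
            return wa.prim < wb.prim ? -1 : 1;
        if (sec_diff == 0 && wa.sec != wb.sec)
            sec_diff = wa.sec < wb.sec ? -1 : 1;
    }
    if (*a != '\0')
        return 1;   /* a is longer, so it sorts after b */
    if (*b != '\0')
        return -1;  /* b is longer, so it sorts after a */
    return sec_diff;
}
```

With these weights, "a" sorts before "A" on the secondary level only, "b" and "B" compare equal, and "Ab" sorts before "ac" because the primary difference between b and c outranks the secondary difference between A and a.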
The backslash character \ is used for continuation. In this case, no characters are permitted after the backslash character.
DIAGNOSTICS
The colldef utility exits with one of the following values:
0
- No errors were found and the output was successfully created.
!=0
- Errors were found.
FILES
- /usr/share/locale/⟨language⟩/LC_COLLATE
- The standard shared location for collation orders under the locale ⟨language⟩.