6.6.5.7 String Comparison
The procedures in this section are similar to the character ordering predicates (see section Characters), but are defined on character sequences.
The first set is specified in R5RS and has names that end in a question mark (?).
The second set is specified in SRFI-13 and has names that do not end in a question mark.
The predicates whose names contain -ci ignore character case
when comparing strings. For now, case-insensitive comparison is done
using the R5RS rules, where every lower-case character that has a
single-character upper-case form is converted to upper case before
comparison. See the (ice-9 i18n) module for locale-dependent string
comparison.
- Scheme Procedure: string=? s1 s2 s3 …
Lexicographic equality predicate; return #t if all strings are the same length and contain the same characters in the same positions, otherwise return #f.
The procedure string-ci=? treats upper and lower case letters as though they were the same character, but string=? treats upper and lower case as distinct characters.
- Scheme Procedure: string<? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1.
- Scheme Procedure: string<=? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1.
- Scheme Procedure: string>? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1.
- Scheme Procedure: string>=? s1 s2 s3 …
Lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1.
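As a rough illustration of the predicates above, the following sketch compares a few literal strings and uses the variadic form to test whether a list is sorted. The helper name sorted-strings? is hypothetical and not part of Guile.

```scheme
;; Hypothetical helper (not part of Guile): true if the strings in LST
;; are in strictly increasing lexicographic order.
(define (sorted-strings? lst)
  (apply string<? lst))

(string=? "guile" "guile" "guile")             ; => #t
(string=? "guile" "Guile")                     ; => #f, case is significant
(string<? "apple" "banana" "cherry")           ; => #t
(string<=? "apple" "apple" "banana")           ; => #t
(string>? "cherry" "banana" "banana")          ; => #f, last pair is equal
(sorted-strings? '("apple" "cherry" "banana")) ; => #f
```

Passing more than two arguments checks each consecutive pair, so a single call can validate a whole sequence.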
- Scheme Procedure: string-ci=? s1 s2 s3 …
Case-insensitive string equality predicate; return #t if all strings are the same length and their component characters match (ignoring case) at each position; otherwise return #f.
- Scheme Procedure: string-ci<? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than str_i+1 regardless of case.
- Scheme Procedure: string-ci<=? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically less than or equal to str_i+1 regardless of case.
- Scheme Procedure: string-ci>? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than str_i+1 regardless of case.
- Scheme Procedure: string-ci>=? s1 s2 s3 …
Case-insensitive lexicographic ordering predicate; return #t if, for every pair of consecutive string arguments str_i and str_i+1, str_i is lexicographically greater than or equal to str_i+1 regardless of case.
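The difference between the case-sensitive and case-insensitive predicates is easiest to see with mixed-case input. The short sketch below uses only ASCII strings, for which upper-case letters have smaller character codes than lower-case ones.

```scheme
;; #\Z (code 90) sorts before #\a (code 97), so the case-sensitive
;; predicate puts "Zebra" first; ignoring case, "apple" comes first.
(string<? "Zebra" "apple")              ; => #t
(string-ci<? "Zebra" "apple")           ; => #f
(string-ci=? "Guile" "GUILE" "guile")   ; => #t
(string-ci<=? "apple" "APPLE" "Banana") ; => #t
```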
- Scheme Procedure: string-compare s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_compare (s1, s2, proc_lt, proc_eq, proc_gt, start1, end1, start2, end2)
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position that does not match.
- Scheme Procedure: string-compare-ci s1 s2 proc_lt proc_eq proc_gt [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_compare_ci (s1, s2, proc_lt, proc_eq, proc_gt, start1, end1, start2, end2)
Apply proc_lt, proc_eq, proc_gt to the mismatch index, depending upon whether s1 is less than, equal to, or greater than s2. The mismatch index is the largest index i such that for every 0 <= j < i, s1[j] = s2[j] – that is, i is the first position where the lowercased letters do not match.
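A minimal sketch of how string-compare and string-compare-ci might be used follows. The helper names describe-comparison and describe-comparison-ci are hypothetical; each one tags the mismatch index with a symbol naming the procedure that was selected.

```scheme
;; Hypothetical helpers: return a list of a symbol describing how S1
;; relates to S2 and the mismatch index passed by string-compare.
(define (describe-comparison s1 s2)
  (string-compare s1 s2
                  (lambda (i) (list 'less i))
                  (lambda (i) (list 'equal i))
                  (lambda (i) (list 'greater i))))

(define (describe-comparison-ci s1 s2)
  (string-compare-ci s1 s2
                     (lambda (i) (list 'less i))
                     (lambda (i) (list 'equal i))
                     (lambda (i) (list 'greater i))))

(describe-comparison "abc" "abd")          ; => (less 2)
(describe-comparison "abcde" "abXde")      ; => (greater 2), #\c > #\X
(describe-comparison-ci "abcde" "ABXde")   ; => (less 2), c < x ignoring case
(car (describe-comparison "abc" "abc"))    ; => equal
```

Because all three procedures receive the mismatch index, the caller can both classify the comparison and locate the first differing position in one pass.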
- Scheme Procedure: string= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_eq (s1, s2, start1, end1, start2, end2)
Return #f if s1 and s2 are not equal, a true value otherwise.
- Scheme Procedure: string<> s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_neq (s1, s2, start1, end1, start2, end2)
Return #f if s1 and s2 are equal, a true value otherwise.
- Scheme Procedure: string< s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_lt (s1, s2, start1, end1, start2, end2)
Return #f if s1 is greater than or equal to s2, a true value otherwise.
- Scheme Procedure: string> s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_gt (s1, s2, start1, end1, start2, end2)
Return #f if s1 is less than or equal to s2, a true value otherwise.
- Scheme Procedure: string<= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_le (s1, s2, start1, end1, start2, end2)
Return #f if s1 is greater than s2, a true value otherwise.
- Scheme Procedure: string>= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ge (s1, s2, start1, end1, start2, end2)
Return #f if s1 is less than s2, a true value otherwise.
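The optional start and end arguments restrict each comparison to a substring without copying. The sketch below assumes only what is documented above, namely that the result is either #f or some true value, so a hypothetical helper true? is used to reduce results to #t or #f.

```scheme
;; Hypothetical helper: collapse any true value to #t.
(define (true? x) (if x #t #f))

;; Compare "foo" (s1, indices 0 to 3) with "foo" (s2, indices 3 to 6).
(true? (string= "foobar" "barfoo" 0 3 3 6))         ; => #t
;; Compare "pie" (indices 6 to 9) with "tart" (indices 6 to 10).
(true? (string< "apple pie" "apple tart" 6 9 6 10)) ; => #t
(true? (string<> "abc" "abc"))                      ; => #f
(true? (string>= "abc" "abd"))                      ; => #f
```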
- Scheme Procedure: string-ci= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_eq (s1, s2, start1, end1, start2, end2)
Return #f if s1 and s2 are not equal, a true value otherwise. The character comparison is done case-insensitively.
- Scheme Procedure: string-ci<> s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_neq (s1, s2, start1, end1, start2, end2)
Return #f if s1 and s2 are equal, a true value otherwise. The character comparison is done case-insensitively.
- Scheme Procedure: string-ci< s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_lt (s1, s2, start1, end1, start2, end2)
Return #f if s1 is greater than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
- Scheme Procedure: string-ci> s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_gt (s1, s2, start1, end1, start2, end2)
Return #f if s1 is less than or equal to s2, a true value otherwise. The character comparison is done case-insensitively.
- Scheme Procedure: string-ci<= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_le (s1, s2, start1, end1, start2, end2)
Return #f if s1 is greater than s2, a true value otherwise. The character comparison is done case-insensitively.
- Scheme Procedure: string-ci>= s1 s2 [start1 [end1 [start2 [end2]]]]
- C Function: scm_string_ci_ge (s1, s2, start1, end1, start2, end2)
Return #f if s1 is less than s2, a true value otherwise. The character comparison is done case-insensitively.
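One possible use of the range arguments with the case-insensitive procedures is a prefix test, sketched below. The helper has-prefix-ci? is hypothetical and not part of Guile; it compares the first n characters of both strings with string-ci=.

```scheme
;; Hypothetical helper: true if S starts with PREFIX, ignoring case.
(define (has-prefix-ci? prefix s)
  (let ((n (string-length prefix)))
    (and (<= n (string-length s))
         ;; Compare prefix[0,n) with s[0,n); reduce the result to #t/#f.
         (if (string-ci= prefix s 0 n 0 n) #t #f))))

(has-prefix-ci? "content-" "Content-Type") ; => #t
(has-prefix-ci? "content-" "Accept")       ; => #f, s is too short
(has-prefix-ci? "content-" "Connection")   ; => #f
```

The length check comes first so that the end indices passed to string-ci= never exceed the length of the shorter string.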
- Scheme Procedure: string-hash s [bound [start [end]]]
- C Function: scm_substring_hash (s, bound, start, end)
Compute a hash value for s. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
- Scheme Procedure: string-hash-ci s [bound [start [end]]]
- C Function: scm_substring_hash_ci (s, bound, start, end)
Compute a hash value for s, ignoring case. The optional argument bound is a non-negative exact integer specifying the range of the hash function. A positive value restricts the return value to the range [0,bound).
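The following sketch exercises the properties stated above: a positive bound limits the result to [0,bound), equal strings hash to the same value, and string-hash-ci gives the same value for strings that differ only in case. The actual hash values are implementation details and are deliberately not shown.

```scheme
;; With a bound of 16 the result is an integer in the range [0,16).
(define h (string-hash "hello" 16))
(and (>= h 0) (< h 16))                               ; => #t

;; Equal strings always produce equal hash values.
(= (string-hash "hello") (string-hash "hello"))       ; => #t

;; The case-insensitive variant ignores differences in case.
(= (string-hash-ci "Hello") (string-hash-ci "hELLO")) ; => #t
```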
Because the same visual appearance of an abstract Unicode character can
be obtained via multiple sequences of Unicode characters, even the
case-insensitive string comparison functions described above may return
#f when presented with strings containing different
representations of the same character. For example, the Unicode
character “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” can be
represented with a single character (U+1E69) or by the character “LATIN
SMALL LETTER S” (U+0073) followed by the combining marks “COMBINING
DOT BELOW” (U+0323) and “COMBINING DOT ABOVE” (U+0307).
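The example in the preceding paragraph can be reproduced directly. The sketch below builds both representations from their code points; the variable names are chosen here for illustration.

```scheme
;; U+1E69 as a single precomposed character.
(define precomposed (string (integer->char #x1E69)))
;; The same abstract character as "s" plus two combining marks.
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

(string-length precomposed)          ; => 1
(string-length decomposed)           ; => 3
(string=? precomposed decomposed)    ; => #f
(string-ci=? precomposed decomposed) ; => #f
```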
For this reason, it is often desirable to ensure that the strings to be compared use a mutually consistent representation for every character. The Unicode standard defines two methods of normalizing the contents of strings: decomposition, which breaks composite characters into a set of constituent characters with an ordering defined by the Unicode Standard; and composition, which performs the converse.
There are two decomposition operations. “Canonical decomposition” produces character sequences that share the same visual appearance as the original characters, while “compatibility decomposition” produces ones whose visual appearances may differ from the originals but which represent the same abstract character.
These operations are encapsulated in the following set of normalization forms:
- NFD
Characters are decomposed to their canonical forms.
- NFKD
Characters are decomposed to their compatibility forms.
- NFC
Characters are decomposed to their canonical forms, then composed.
- NFKC
Characters are decomposed to their compatibility forms, then composed.
The functions below put their arguments into one of the forms described above.
- Scheme Procedure: string-normalize-nfd s
- C Function: scm_string_normalize_nfd (s)
Return the NFD normalized form of s.
- Scheme Procedure: string-normalize-nfkd s
- C Function: scm_string_normalize_nfkd (s)
Return the NFKD normalized form of s.
- Scheme Procedure: string-normalize-nfc s
- C Function: scm_string_normalize_nfc (s)
Return the NFC normalized form of s.
- Scheme Procedure: string-normalize-nfkc s
- C Function: scm_string_normalize_nfkc (s)
Return the NFKC normalized form of s.
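As a sketch of how normalization makes comparison reliable, the example below normalizes two representations of “LATIN SMALL LETTER S WITH DOT BELOW AND DOT ABOVE” before comparing them, and contrasts canonical with compatibility decomposition using the ligature U+FB01 (LATIN SMALL LIGATURE FI). The helper nfc=? and the variable names are hypothetical.

```scheme
(define precomposed (string (integer->char #x1E69)))
(define decomposed
  (string #\s (integer->char #x0323) (integer->char #x0307)))

;; Hypothetical helper: compare two strings after NFC normalization.
(define (nfc=? a b)
  (string=? (string-normalize-nfc a) (string-normalize-nfc b)))

(string=? precomposed decomposed)                        ; => #f
(string=? (string-normalize-nfd precomposed) decomposed) ; => #t
(string=? (string-normalize-nfc decomposed) precomposed) ; => #t
(nfc=? precomposed decomposed)                           ; => #t

;; The "fi" ligature has only a compatibility decomposition, so NFD
;; leaves it unchanged while NFKD expands it to two letters.
(define ligature (string (integer->char #xFB01)))
(string=? (string-normalize-nfd ligature) ligature)      ; => #t
(string=? (string-normalize-nfkd ligature) "fi")         ; => #t
```

Either form works for comparison as long as both strings are normalized the same way; the compatibility forms discard distinctions such as ligatures, which may or may not be desirable.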
This document was generated on April 20, 2013 using texi2html 5.0.