6.1.10 Unicode (UCS-2) Strings
UCS-2 strings cannot be read by the standard reader but UTF-8 strings
can. The special syntax for UTF-8 is described by the
regular expression:
#u"([^"]|\")*".
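As a minimal sketch of the literal syntax above, assuming that the reader turns a #u"..." literal into a UCS-2 string object (the text does not state the resulting type explicitly), such a literal can be passed directly to the procedures listed below:

```scheme
;; Assumption: #u"..." is read as a UCS-2 string object.
(define s #u"hello")

(print (ucs2-string-length s))          ;; number of UCS-2 characters
(print (ucs2-string->utf8-string s))    ;; back to an ordinary UTF-8 string
```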
The library functions for Unicode string processing are:
- bigloo procedure: make-ucs2-string k
- bigloo procedure: make-ucs2-string k char
- bigloo procedure: ucs2-string k …
- bigloo procedure: ucs2-string-length s-ucs2
- bigloo procedure: ucs2-string-ref s-ucs2 k
- bigloo procedure: ucs2-string-set! s-ucs2 k char
- bigloo procedure: ucs2-string=? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string-ci=? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string<? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string>? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string<=? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string>=? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string-ci<? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string-ci>? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string-ci<=? s-ucs2a s-ucs2b
- bigloo procedure: ucs2-string-ci>=? s-ucs2a s-ucs2b
- bigloo procedure: subucs2-string s-ucs2 start end
- bigloo procedure: ucs2-string-append s-ucs2 …
- bigloo procedure: ucs2-string->list s-ucs2
- bigloo procedure: list->ucs2-string chars
- bigloo procedure: ucs2-string-copy s-ucs2
- bigloo procedure: ucs2-string-fill! s-ucs2 char
- Stores char in every element of the given s-ucs2 and returns an unspecified value. 
- bigloo procedure: ucs2-string-downcase s-ucs2
- Builds a newly allocated ucs2-string with lower case letters. 
- bigloo procedure: ucs2-string-upcase s-ucs2
- Builds a newly allocated ucs2-string with upper case letters. 
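The following sketch exercises a few of the procedures listed so far. It builds its UCS-2 strings with utf8-string->ucs2-string (documented in this section) to avoid UCS-2 character literals, and it assumes that subucs2-string treats end as exclusive, as substring does; the text does not say so explicitly.

```scheme
;; Build a UCS-2 string from an ASCII (hence valid UTF-8) literal.
(define greeting (utf8-string->ucs2-string "Hello, World"))

;; Assumption: END is exclusive, so this selects characters 0 to 4.
(define hello (subucs2-string greeting 0 5))
(define shout (ucs2-string-upcase hello))        ;; newly allocated string
(define twice (ucs2-string-append hello hello))

(print (ucs2-string-length greeting))
(print (ucs2-string->utf8-string shout))
(print (ucs2-string-ci=? hello shout))           ;; case-insensitive comparison
(print (ucs2-string=? hello shout))              ;; case-sensitive comparison
(print (ucs2-string-length twice))
```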
- bigloo procedure: ucs2-string->utf8-string s-ucs2
- bigloo procedure: utf8-string->ucs2-string string
- Convert UCS-2 strings to (or from) UTF-8 encoded ascii strings. 
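A short round-trip sketch of the two conversion procedures follows. The sample text is an assumption chosen for illustration: the word "café" written with \x escapes so that the source file encoding does not matter. It shows that the UTF-8 form counts bytes while the UCS-2 form counts characters.

```scheme
;; "café" in UTF-8: the last character takes two bytes (C3 A9).
(define utf8 "caf\xc3\xa9")
(define ucs2 (utf8-string->ucs2-string utf8))
(define back (ucs2-string->utf8-string ucs2))

(print (string-length utf8))         ;; bytes in the UTF-8 encoding
(print (ucs2-string-length ucs2))    ;; characters in the UCS-2 string
(print (string=? utf8 back))         ;; the round trip preserves the text
```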
- bigloo procedure: utf8-string? string
- Returns #t if and only if the argument string is a well formed UTF8 string. Otherwise returns #f.
- bigloo procedure: utf8-string-length string
- Returns the number of characters of a UTF8 string. It raises an error if the string is not a well formed UTF8 string (i.e., it does not satisfy the utf8-string? predicate).
- bigloo procedure: utf8-string-ref string i
- Returns the character (represented as a UTF8 string) at position i in string. 
- library procedure: utf8-substring string start [end]
- string must be a string, and start and end must be exact integers satisfying: 0 <= START <= END <= (string-length STRING). The optional argument end defaults to (utf8-string-length STRING). utf8-substring returns a newly allocated string formed from the characters of STRING beginning with index START (inclusive) and ending with index END (exclusive). If the argument string is not a well formed UTF8 string an error is raised. Otherwise, the result is also a well formed UTF8 string. 
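The UTF-8 procedures above index by character, not by byte. The sketch below illustrates the difference on an assumed sample string ("café" written with \x escapes). The one-byte string "\xc3" is used as an example of input that is presumably rejected by utf8-string?, since it is a lead byte with no continuation byte.

```scheme
(define s "caf\xc3\xa9")                ;; 4 characters, 5 bytes

(print (utf8-string? s))                ;; well formed
(print (utf8-string? "\xc3"))           ;; truncated multi-byte sequence
(print (string-length s))               ;; byte count
(print (utf8-string-length s))          ;; character count

;; Character indices: 3 (inclusive) to 4 (exclusive) selects the last character.
(define last-char (utf8-substring s 3 4))
(print (string-length last-char))       ;; its encoding is two bytes long
(print (string=? last-char (utf8-string-ref s 3)))
```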
- bigloo procedure: iso-latin->utf8 string
- bigloo procedure: iso-latin->utf8! string
- bigloo procedure: utf8->iso-latin string
- bigloo procedure: utf8->iso-latin! string
- bigloo procedure: utf8->iso-latin-15 string
- bigloo procedure: utf8->iso-latin-15! string
- Encode and decode iso-latin strings into utf8. The functions iso-latin->utf8!, utf8->iso-latin! and utf8->iso-latin-15! may return, as result, the string they receive as argument.
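A minimal sketch of an iso-latin round trip, using an assumed sample string in which byte E9 is the iso-latin-1 code of "é". Because the destructive variants may or may not return their argument, code that calls them should always use the returned value rather than the original binding.

```scheme
(define latin "caf\xe9")                     ;; iso-latin-1, 4 bytes
(define utf8 (iso-latin->utf8 latin))        ;; E9 becomes C3 A9

(print (string-length latin))
(print (string-length utf8))
(print (string=? (utf8->iso-latin utf8) latin))

;; With a ! variant, keep the result: it may be the argument itself.
(define latin2 (utf8->iso-latin! (iso-latin->utf8 "caf\xe9")))
(print (string=? latin2 latin))
```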
- bigloo procedure: cp1252->utf8 string
- bigloo procedure: cp1252->utf8! string
- bigloo procedure: utf8->cp1252 string
- bigloo procedure: utf8->cp1252! string
- Encode and decode cp1252 strings into utf8. The functions cp1252->utf8! and utf8->cp1252! may return, as result, the string they receive as argument.
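The difference from iso-latin shows up for codes 0x80 to 0x9f. As a sketch, assuming the CP1252 mapping in which byte 80 is the euro sign (UTF-8 bytes E2 82 AC):

```scheme
(define win "\x80 5")                        ;; CP1252 text: euro sign, space, 5
(define utf8 (cp1252->utf8 win))

(print (string-length win))                  ;; bytes before conversion
(print (string-length utf8))                 ;; bytes after conversion
(print (string=? (utf8->cp1252 utf8) win))   ;; round trip
```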
- bigloo procedure: 8bits->utf8 string table
- bigloo procedure: 8bits->utf8! string table
- bigloo procedure: utf8->8bits string invtable
- bigloo procedure: utf8->8bits! string inv-table
- These are the general conversion routines used internally by iso-latin->utf8 and cp1252->utf8. They convert any 8 bits string into its equivalent UTF-8 representation and vice versa.

  The argument table should be either #f, which means that the basic (i.e., iso-latin-1) 8bits -> UTF8 conversion is used, or it must be a vector of at most 127 entries containing strings of characters. This table contains the encodings for the 8 bits characters whose codes range from 128 to 255.

  The table is not required to be complete. That is, it is not required to give the whole character encoding set. Only the characters that need a non-iso-latin canonical representation must be given. For instance, the CP1252 table can be defined as:

    (define cp1252
       '#("\xe2\x82\xac" ;; 0x80
          ""             ;; 0x81
          "\xe2\x80\x9a" ;; 0x82
          "\xc6\x92"     ;; 0x83
          "\xe2\x80\x9e" ;; 0x84
          "\xe2\x80\xa6" ;; 0x85
          "\xe2\x80\xa0" ;; 0x86
          "\xe2\x80\xa1" ;; 0x87
          "\xcb\x86"     ;; 0x88
          "\xe2\x80\xb0" ;; 0x89
          "\xc5\xa0"     ;; 0x8a
          "\xe2\x80\xb9" ;; 0x8b
          "\xc5\x92"     ;; 0x8c
          ""             ;; 0x8d
          "\xc5\xbd"     ;; 0x8e
          ""             ;; 0x8f
          ""             ;; 0x90
          "\xe2\x80\x98" ;; 0x91
          "\xe2\x80\x99" ;; 0x92
          "\xe2\x80\x9c" ;; 0x93
          "\xe2\x80\x9d" ;; 0x94
          "\xe2\x80\xa2" ;; 0x95
          "\xe2\x80\x93" ;; 0x96
          "\xe2\x80\x94" ;; 0x97
          "\xcb\x9c"     ;; 0x98
          "\xe2\x84\xa2" ;; 0x99
          "\xc5\xa1"     ;; 0x9a
          "\xe2\x80\xba" ;; 0x9b
          "\xc5\x93"     ;; 0x9c
          ""             ;; 0x9d
          "\xc5\xbe"     ;; 0x9e
          "\xc5\xb8"))   ;; 0x9f

  The argument inv-table is an inverse table that can be built from a table using the function inverse-utf8-table.
- procedure: inverse-utf8-table vector
- Inverts a UTF8 table into an object suitable for utf8->8bits and utf8->8bits!.
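The table-driven routines can be tried with a deliberately incomplete table. This is a sketch under stated assumptions: euro-table and euro-inv are hypothetical names, the table overrides only code 0x80 (index 0 of the vector), and the inverse table is assumed to restore exactly the bytes that the forward table translated.

```scheme
;; Hypothetical one-entry table: byte 0x80 maps to the euro sign.
(define euro-table '#("\xe2\x82\xac"))       ;; index 0 <-> code 0x80
(define euro-inv (inverse-utf8-table euro-table))

(define raw "5 \x80")                        ;; 8 bits text, 3 bytes
(define utf8 (8bits->utf8 raw euro-table))
(define back (utf8->8bits utf8 euro-inv))

(print (string-length raw))
(print (string-length utf8))
(print (string=? back raw))

;; With #f as table, the plain iso-latin-1 conversion is used.
(print (string=? (8bits->utf8 "caf\xe9" #f) "caf\xc3\xa9"))
```

Building the inverse table once and reusing it avoids recomputing it on every call to utf8->8bits.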
 
  This document was generated on October 23, 2011 using texi2html 5.0.
 
 
