Qore Programming Language  0.9.16
QoreEncoding.h File Reference
#include <qore/common.h>
#include <qore/QoreThreadLock.h>
#include <cstring>
#include <map>
#include <string>
#include <strings.h>

Classes

class  QoreEncoding
 defines string encoding functions in Qore
 
class  QoreEncodingManager
 manages encodings in Qore
 

Typedefs

typedef qore_offset_t(* mbcs_charlen_t) (const char *str, qore_size_t valid_len)
 for multi-byte encodings: gives the total number of bytes in the character, given one or more of its leading bytes
 
typedef qore_size_t(* mbcs_end_t) (const char *str, const char *end, qore_size_t num_chars, bool &invalid)
 for multi-byte character set encodings: gives the number of bytes occupied by the given number of characters
 
typedef unsigned(* mbcs_get_unicode_t) (const char *p)
 returns the Unicode code point for the given character; assumes that there is enough data for the character and that the character is valid (both must be checked before calling)
 
typedef qore_size_t(* mbcs_length_t) (const char *str, const char *end, bool &invalid)
 for multi-byte character set encodings: gives the length of the string in characters
 
typedef qore_size_t(* mbcs_pos_t) (const char *str, const char *ptr, bool &invalid)
 for multi-byte character set encodings: gives the character position of ptr within the string
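
As an illustration of the shapes of mbcs_length_t, mbcs_pos_t and mbcs_get_unicode_t, the following is a minimal UTF-8 sketch and not Qore's own implementation. The `sketch_*` names are hypothetical, `qore_size_t` is replaced by a local stand-in (assumed to be an unsigned size type), and the choice to return 0 when `invalid` is set is an assumption of this example.

```cpp
#include <cstddef>

// Stand-in for the Qore type (assumption; the real one comes from <qore/common.h>).
typedef std::size_t qore_size_t;

// Same shapes as the typedefs listed above.
typedef qore_size_t (*mbcs_length_t)(const char* str, const char* end, bool& invalid);
typedef qore_size_t (*mbcs_pos_t)(const char* str, const char* ptr, bool& invalid);
typedef unsigned (*mbcs_get_unicode_t)(const char* p);

// Hypothetical helper: byte length implied by a UTF-8 lead byte, 0 if not a lead byte.
qore_size_t sketch_lead_len(unsigned char c) {
    if (c < 0x80) return 1;
    if ((c & 0xe0) == 0xc0) return 2;
    if ((c & 0xf0) == 0xe0) return 3;
    if ((c & 0xf8) == 0xf0) return 4;
    return 0;
}

// Counts the characters in [str, end); sets invalid and returns 0 on bad or truncated data.
// Overlong forms and surrogates are not rejected in this sketch.
qore_size_t sketch_utf8_length(const char* str, const char* end, bool& invalid) {
    invalid = false;
    qore_size_t chars = 0;
    while (str < end) {
        const qore_size_t n = sketch_lead_len(static_cast<unsigned char>(*str));
        if (n == 0 || static_cast<qore_size_t>(end - str) < n) {
            invalid = true;
            return 0;
        }
        for (qore_size_t i = 1; i < n; ++i) {
            // every trailing byte must look like 10xxxxxx
            if ((static_cast<unsigned char>(str[i]) & 0xc0) != 0x80) {
                invalid = true;
                return 0;
            }
        }
        str += n;
        ++chars;
    }
    return chars;
}

// The character position of ptr is the number of characters in [str, ptr).
qore_size_t sketch_utf8_pos(const char* str, const char* ptr, bool& invalid) {
    return sketch_utf8_length(str, ptr, invalid);
}

// Decodes one character; the caller must already have validated it.
unsigned sketch_utf8_get_unicode(const char* p) {
    const unsigned char* u = reinterpret_cast<const unsigned char*>(p);
    switch (sketch_lead_len(u[0])) {
        case 2: return ((u[0] & 0x1fu) << 6) | (u[1] & 0x3fu);
        case 3: return ((u[0] & 0x0fu) << 12) | ((u[1] & 0x3fu) << 6) | (u[2] & 0x3fu);
        case 4: return ((u[0] & 0x07u) << 18) | ((u[1] & 0x3fu) << 12)
                       | ((u[2] & 0x3fu) << 6) | (u[3] & 0x3fu);
        default: return u[0];
    }
}
```

Because each function matches its typedef exactly, it can be assigned to a pointer of that type, for example `mbcs_length_t fn = sketch_utf8_length;`.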
 

Variables

const DLLEXPORT QoreEncoding * QCS_DEFAULT
 the default encoding for the Qore library
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_1
 latin-1, Western European encoding
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_10
 latin-6, Nordic character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_11
 Thai character set.
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_13
 latin-7, Baltic rim character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_14
 latin-8, Celtic character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_15
 latin-9, Western European with euro symbol
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_16
 latin-10, Southeast European character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_2
 latin-2, Central European encoding
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_3
 latin-3, Southern European character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_4
 latin-4, Northern European character set
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_5
 Cyrillic character set.
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_6
 Arabic character set.
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_7
 Greek character set.
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_8
 Hebrew character set.
 
const DLLEXPORT QoreEncoding * QCS_ISO_8859_9
 latin-5, Turkish character set
 
const DLLEXPORT QoreEncoding * QCS_KOI7
 Russian: Kod Obmena Informatsiey, 7 bit characters.
 
const DLLEXPORT QoreEncoding * QCS_KOI8_R
 Russian: Kod Obmena Informatsiey, 8 bit.
 
const DLLEXPORT QoreEncoding * QCS_KOI8_U
 Ukrainian: Kod Obmena Informatsiey, 8 bit.
 
const DLLEXPORT QoreEncoding * QCS_USASCII
 ASCII encoding
 
const DLLEXPORT QoreEncoding * QCS_UTF16
 UTF-16 (only UTF-8 and UTF-16* are multi-byte encodings)
 
const DLLEXPORT QoreEncoding * QCS_UTF16BE
 UTF-16BE (only UTF-8 and UTF-16* are multi-byte encodings)
 
const DLLEXPORT QoreEncoding * QCS_UTF16LE
 UTF-16LE (only UTF-8 and UTF-16* are multi-byte encodings)
 
const DLLEXPORT QoreEncoding * QCS_UTF8
 UTF-8 multi-byte encoding (only UTF-8 and UTF-16* are multi-byte encodings)
 
DLLEXPORT QoreEncodingManager QEM
 the QoreEncodingManager object
 

Detailed Description

provides definitions related to character encoding support in Qore, including the QoreEncoding class and QCS_DEFAULT, the default encoding for the Qore library

Typedef Documentation

◆ mbcs_charlen_t

typedef qore_offset_t(* mbcs_charlen_t) (const char *str, qore_size_t valid_len)

for multi-byte encodings: gives the total number of bytes in the character, given one or more of its leading bytes

Parameters
str        a pointer to the character data to check
valid_len  the number of valid bytes at the start of the character pointer
Returns
0 = invalid; positive = total number of bytes in the character; negative = number of additional bytes needed to perform the check
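
The return convention can be illustrated with a small UTF-8 function of the same shape. This is only a sketch under stated assumptions, not Qore's own code: `sketch_utf8_charlen` is a hypothetical name, the two integer types are local stand-ins for `qore_offset_t` and `qore_size_t`, and treating `valid_len == 0` as "one more byte needed" is a choice made for this example.

```cpp
#include <cstddef>

// Stand-ins for the Qore integer types (assumption: a signed offset type and
// an unsigned size type; the real definitions come from <qore/common.h>).
typedef long qore_offset_t;
typedef std::size_t qore_size_t;

// Same shape as the typedef documented above.
typedef qore_offset_t (*mbcs_charlen_t)(const char* str, qore_size_t valid_len);

// Hypothetical UTF-8 function following the documented return convention:
//   0   = invalid data
//   > 0 = total number of bytes in the character
//   < 0 = number of additional bytes needed before the check can complete
// Overlong forms and surrogates are not rejected in this sketch.
qore_offset_t sketch_utf8_charlen(const char* str, qore_size_t valid_len) {
    if (valid_len == 0)
        return -1; // nothing to inspect yet: at least the lead byte is needed

    const unsigned char lead = static_cast<unsigned char>(str[0]);
    qore_size_t total;
    if (lead < 0x80)
        total = 1;
    else if ((lead & 0xe0) == 0xc0)
        total = 2;
    else if ((lead & 0xf0) == 0xe0)
        total = 3;
    else if ((lead & 0xf8) == 0xf0)
        total = 4;
    else
        return 0; // a continuation byte or an unsupported lead byte

    // Check the continuation bytes that are actually available.
    const qore_size_t have = valid_len < total ? valid_len : total;
    for (qore_size_t i = 1; i < have; ++i) {
        if ((static_cast<unsigned char>(str[i]) & 0xc0) != 0x80)
            return 0;
    }

    if (have < total)
        return -static_cast<qore_offset_t>(total - have);
    return static_cast<qore_offset_t>(total);
}
```

A caller that reads data incrementally can use a negative result to decide how many more bytes to fetch before calling the function again with a larger `valid_len`.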