Qore Programming Language  0.9.3.1
QoreEncoding Class Reference

defines string encoding functions in Qore

#include <QoreEncoding.h>

Public Member Methods

DLLEXPORT qore_size_t getByteLen (const char *p, const char *end, qore_size_t c, bool &invalid) const
 gives the number of bytes for the number of chars in the string or up to the end of the string
 
DLLEXPORT qore_size_t getByteLen (const char *p, const char *end, qore_size_t c, ExceptionSink *xsink) const
 gives the number of bytes for the number of chars in the string or up to the end of the string
 
DLLEXPORT qore_offset_t getCharLen (const char *p, qore_size_t valid_len) const
 gives the total number of bytes for the character, given one or more of its leading bytes
 
DLLEXPORT qore_size_t getCharPos (const char *p, const char *end, bool &invalid) const
 gives the character position (number of characters) starting from the first pointer to the second
 
DLLEXPORT qore_size_t getCharPos (const char *p, const char *end, ExceptionSink *xsink) const
 gives the character position (number of characters) starting from the first pointer to the second
 
DLLEXPORT const char * getCode () const
 returns the string code (ex: "UTF-8") for the encoding
 
DLLEXPORT const char * getDesc () const
 returns the description for the encoding
 
DLLEXPORT qore_size_t getLength (const char *p, const char *end, bool &invalid) const
 gives the length of the string in characters
 
DLLEXPORT qore_size_t getLength (const char *p, const char *end, ExceptionSink *xsink) const
 gives the length of the string in characters
 
DLLEXPORT int getMaxCharWidth () const
 returns the maximum character width in bytes for the encoding
 
DLLEXPORT unsigned getMinCharWidth () const
 returns the minimum character width in bytes for the encoding
 
DLLEXPORT int getUnicode (const char *p, const char *end, unsigned &clen, ExceptionSink *xsink) const
 returns the Unicode code point for the given character; if there are any errors (invalid character, not enough space in the string, etc), a Qore-language exception is thrown
 
DLLEXPORT bool isAsciiCompat () const
 returns true if the character encoding is backwards-compatible with ASCII
 
DLLEXPORT bool isMultiByte () const
 returns true if the encoding is a multi-byte encoding
 

Detailed Description

defines string encoding functions in Qore

for performance reasons this is not a class hierarchy with virtual methods; this ugly implementation with function pointers is much faster. Only encodings where a single character can be more than 1 byte need to have functions implemented.
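
The function-pointer design described above can be pictured roughly as follows. This is a minimal sketch under stated assumptions, not Qore's actual implementation: `EncodingSketch`, `length_fn_t`, and `utf8_length_sketch` are hypothetical names, and the UTF-8 counter performs no validation. It only shows how a single-byte encoding can leave its callback unset while a multi-byte encoding supplies one.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical callback type for counting characters; not Qore's actual typedef.
typedef std::size_t (*length_fn_t)(const char* p, const char* end, bool& invalid);

// Illustrative stand-in for an encoding descriptor; all field names are assumptions.
struct EncodingSketch {
    const char* code;      // encoding code, e.g. "UTF-8"
    int max_width;         // maximum character width in bytes
    length_fn_t length_fn; // null for single-byte encodings

    std::size_t getLength(const char* p, const char* end, bool& invalid) const {
        assert(p <= end);
        invalid = false;
        // Single-byte encodings need no callback: one byte is one character.
        if (!length_fn)
            return static_cast<std::size_t>(end - p);
        return length_fn(p, end, invalid);
    }
};

// Counts UTF-8 characters by skipping continuation bytes (10xxxxxx).
// Simplification: malformed input is not detected, so invalid stays false.
inline std::size_t utf8_length_sketch(const char* p, const char* end, bool& invalid) {
    invalid = false;
    std::size_t n = 0;
    for (; p < end; ++p)
        if ((static_cast<unsigned char>(*p) & 0xC0) != 0x80)
            ++n;
    return n;
}
```

With this layout, a descriptor built as `{"ISO-8859-1", 1, nullptr}` answers length queries with plain pointer arithmetic, while one built as `{"UTF-8", 4, utf8_length_sketch}` goes through a single indirect call and no virtual dispatch.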

Note
only encodings that are backwards compatible with ASCII are supported by Qore; currently the only multi-byte encoding completely supported by Qore is UTF-8 (UTF-16* encodings are not properly supported yet)
the default encoding is represented by QCS_DEFAULT; unless another encoding is explicitly given, all strings will be tagged with QCS_DEFAULT
See also
QCS_DEFAULT

Member Function Documentation

◆ getByteLen() [1/2]

DLLEXPORT qore_size_t QoreEncoding::getByteLen ( const char *  p,
const char *  end,
qore_size_t  c,
bool &  invalid 
) const

gives the number of bytes for the number of chars in the string or up to the end of the string

Parameters
p        a pointer to the character data
end      a pointer to the next byte after the end of the character data
c        the number of characters to check
invalid  if true after executing the function, invalid input was given and the return value should be ignored
Returns
the number of bytes for the given number of characters in the string or up to the end of the string
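
To make the "number of characters or up to the end of the string" behaviour concrete, here is a minimal sketch for UTF-8 only. The names `utf8_width_sketch` and `utf8_byte_len_sketch` are hypothetical and are not part of the Qore API. The sketch trusts lead bytes and does not verify continuation bytes, so it is looser than a real validator.

```cpp
#include <cassert>
#include <cstddef>

// Width in bytes implied by a UTF-8 lead byte, or 0 if the byte cannot start a character.
inline std::size_t utf8_width_sketch(unsigned char c) {
    if (c < 0x80) return 1;
    if ((c & 0xE0) == 0xC0) return 2;
    if ((c & 0xF0) == 0xE0) return 3;
    if ((c & 0xF8) == 0xF0) return 4;
    return 0;
}

// Hypothetical stand-in: number of bytes occupied by the first c characters of
// [p, end), or by the whole range if it holds fewer than c characters.
inline std::size_t utf8_byte_len_sketch(const char* p, const char* end,
                                        std::size_t c, bool& invalid) {
    assert(p <= end);
    invalid = false;
    const char* start = p;
    while (c > 0 && p < end) {
        std::size_t w = utf8_width_sketch(static_cast<unsigned char>(*p));
        // A byte that cannot start a character, or a character cut off by end,
        // is reported through the flag; the return value is then meaningless.
        if (w == 0 || w > static_cast<std::size_t>(end - p)) {
            invalid = true;
            return 0;
        }
        p += w;
        --c;
    }
    return static_cast<std::size_t>(p - start);
}
```

As with the documented method, the caller is expected to check `invalid` before using the returned byte count.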

◆ getByteLen() [2/2]

DLLEXPORT qore_size_t QoreEncoding::getByteLen ( const char *  p,
const char *  end,
qore_size_t  c,
ExceptionSink *  xsink
) const

gives the number of bytes for the number of chars in the string or up to the end of the string

Parameters
p      a pointer to the character data
end    a pointer to the next byte after the end of the character data
c      the number of characters to check
xsink  Qore-language exceptions will be raised using this argument
Returns
the number of bytes for the given number of characters in the string or up to the end of the string

◆ getCharLen()

DLLEXPORT qore_offset_t QoreEncoding::getCharLen ( const char *  p,
qore_size_t  valid_len 
) const

gives the total number of bytes for the character, given one or more of its leading bytes

always returns 1 for single-byte encodings

Parameters
p          a pointer to the character data to check
valid_len  the number of valid bytes at the start of the character pointer
Returns
0 = invalid; positive = the number of bytes needed to represent the character (and all are present); negative = the number of additional bytes needed to perform the check
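
The three-way return convention above is easiest to see in code. The following is a minimal sketch for UTF-8 only, written as a free function with the hypothetical name `utf8_char_len_sketch`; it is not Qore's implementation. It assumes a negative result is the negated count of missing bytes, which is one reading of the description above, and it does not reject overlong or out-of-range sequences.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper following the documented return convention:
//    0 = invalid lead byte or invalid continuation byte
//   >0 = total bytes in the character, all present within valid_len
//   <0 = negated number of additional bytes still needed (assumed interpretation)
inline long utf8_char_len_sketch(const char* p, std::size_t valid_len) {
    assert(p && valid_len > 0);
    unsigned char c = static_cast<unsigned char>(*p);
    long need;
    if (c < 0x80) need = 1;
    else if ((c & 0xE0) == 0xC0) need = 2;
    else if ((c & 0xF0) == 0xE0) need = 3;
    else if ((c & 0xF8) == 0xF0) need = 4;
    else return 0; // stray continuation byte or invalid lead byte

    // Number of bytes of this character that are actually available.
    long have = valid_len < static_cast<std::size_t>(need)
        ? static_cast<long>(valid_len) : need;

    // Every available trailing byte must be a continuation byte (10xxxxxx).
    for (long i = 1; i < have; ++i)
        if ((static_cast<unsigned char>(p[i]) & 0xC0) != 0x80)
            return 0;

    return have < need ? have - need : need;
}
```

A caller reading from a stream could use the negative result to decide how many more bytes to fetch before retrying the check.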

◆ getCharPos() [1/2]

DLLEXPORT qore_size_t QoreEncoding::getCharPos ( const char *  p,
const char *  end,
bool &  invalid 
) const

gives the character position (number of characters) starting from the first pointer to the second

Parameters
p        a pointer to the character data
end      a pointer to the next byte after the end of the character data
invalid  if true after executing the function, invalid input was given and the return value should be ignored
Returns
the character position (number of characters) from the first pointer to the second

◆ getCharPos() [2/2]

DLLEXPORT qore_size_t QoreEncoding::getCharPos ( const char *  p,
const char *  end,
ExceptionSink *  xsink
) const

gives the character position (number of characters) starting from the first pointer to the second

Parameters
p      a pointer to the character data
end    a pointer to the next byte after the end of the character data
xsink  Qore-language exceptions will be raised using this argument
Returns
the character position (number of characters) from the first pointer to the second

◆ getLength() [1/2]

DLLEXPORT qore_size_t QoreEncoding::getLength ( const char *  p,
const char *  end,
bool &  invalid 
) const

gives the length of the string in characters

Parameters
p        a pointer to the character data
end      a pointer to the next byte after the end of the character data
invalid  if true after executing the function, invalid input was given and the return value should be ignored
Returns
the number of characters in the string

◆ getLength() [2/2]

DLLEXPORT qore_size_t QoreEncoding::getLength ( const char *  p,
const char *  end,
ExceptionSink *  xsink
) const

gives the length of the string in characters

Parameters
p      a pointer to the character data
end    a pointer to the next byte after the end of the character data
xsink  Qore-language exceptions will be raised using this argument
Returns
the number of characters in the string

◆ getMinCharWidth()

DLLEXPORT unsigned QoreEncoding::getMinCharWidth ( ) const

returns the minimum character width in bytes for the encoding

Since
Qore 0.8.12

◆ getUnicode()

DLLEXPORT int QoreEncoding::getUnicode ( const char *  p,
const char *  end,
unsigned &  clen,
ExceptionSink *  xsink
) const

returns the Unicode code point for the given character; if there are any errors (invalid character, not enough space in the string, etc), a Qore-language exception is thrown

Parameters
p      a pointer to a character
end    a pointer to the next byte after the end of the character data
clen   the length of the character
xsink  Qore-language exceptions will be raised using this argument
Returns
the Unicode code point for the character or -1 in case of an error
Since
Qore 0.8.13
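
As an illustration of the decoding this method describes, here is a minimal sketch for UTF-8 only. The function name `utf8_code_point_sketch` is hypothetical, and the sketch drops the `xsink` argument: where the real method raises a Qore-language exception, this version only returns -1. It assumes `clen` receives the character length in bytes, and it does not reject overlong encodings or surrogate values.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in: decodes the UTF-8 character at p, stores its length
// in bytes in clen, and returns the code point, or -1 on any error.
inline int utf8_code_point_sketch(const char* p, const char* end, unsigned& clen) {
    assert(p <= end);
    clen = 0;
    if (p == end)
        return -1; // no data at all

    unsigned char c = static_cast<unsigned char>(*p);
    unsigned len;
    int cp;
    if (c < 0x80)                { len = 1; cp = c; }
    else if ((c & 0xE0) == 0xC0) { len = 2; cp = c & 0x1F; }
    else if ((c & 0xF0) == 0xE0) { len = 3; cp = c & 0x0F; }
    else if ((c & 0xF8) == 0xF0) { len = 4; cp = c & 0x07; }
    else return -1; // invalid lead byte

    // "not enough space in the string": the character would run past end
    if (static_cast<std::size_t>(end - p) < len)
        return -1;

    // Fold in six payload bits from each continuation byte (10xxxxxx).
    for (unsigned i = 1; i < len; ++i) {
        unsigned char cc = static_cast<unsigned char>(p[i]);
        if ((cc & 0xC0) != 0x80)
            return -1; // invalid continuation byte
        cp = (cp << 6) | (cc & 0x3F);
    }
    clen = len;
    return cp;
}
```

Because `clen` reports how many bytes were consumed, a caller can advance `p` by `clen` after each successful call to walk a string one code point at a time.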

◆ isAsciiCompat()

DLLEXPORT bool QoreEncoding::isAsciiCompat ( ) const

returns true if the character encoding is backwards-compatible with ASCII

Since
Qore 0.8.12

The documentation for this class was generated from the following file: