GNUstep CoreBase Library 0.2
Character Utilities

Detailed Description

Unicode Code Point Functions

Boolean GSCharacterIsASCII (const UTF32Char c)
 Determine if a character is an ASCII character (less than 128).
 
Boolean GSCharacterIsWhitespace (const UTF32Char c)
 Determine if a character is a whitespace character.
 
Boolean GSCharacterIsInSupplementaryPlane (const UTF32Char c)
 Determine if character is in one of the supplementary planes.
 
Boolean GSCharacterIsSurrogate (const UTF32Char c)
 Determine if character is a surrogate code point.
 
Boolean GSCharacterIsLeadSurrogate (const UTF32Char c)
 Determine if character is a leading surrogate code point.
 
Boolean GSCharacterIsTrailSurrogate (const UTF32Char c)
 Determine if character is a trailing surrogate code point.
 

UTF-8 Utilities

CFIndex GSUTF8CharacterTrailBytesCount (const UTF8Char c)
 Determine the number of trailing bytes for a UTF-8 character based on the leading code unit.
 
Boolean GSUTF8CharacterIsTrailing (const UTF8Char c)
 Determines if the specified UTF-8 code unit is a trailing code unit.
 
CFIndex GSUTF8CharacterLength (const UTF32Char c)
 Determine the number of UTF-8 code units required to represent the specified Unicode code point.
 
CFIndex GSUTF8CharacterAppendByteOrderMark (UTF8Char *d, const UTF8Char *limit)
 Append the UTF-8 Byte Order Mark to the string buffer.
 
Boolean GSUTF8CharacterSkipByteOrderMark (const UTF8Char **s, const UTF8Char *limit)
 Determine if a UTF-8 string buffer has a Byte Order Mark.
 
CFIndex GSUTF8CharacterAppend (UTF8Char *d, const UTF8Char *limit, UTF32Char c)
 Append a character to a UTF-8 string buffer.
 
CFIndex GSUTF8CharacterGet (const UTF8Char *s, const UTF8Char *limit, UTF32Char *c)
 Get a Unicode code point from a UTF-8 string buffer.
 
#define kGSUTF8CharacterMaximumLength   4
 The maximum number of UTF-8 code units required to represent the highest Unicode code point.
 

UTF-16 Utilities

CFIndex GSUTF16CharacterAppend (UTF16Char *d, const UTF16Char *limit, UTF32Char c)
 Append a character to a UTF-16 string buffer.
 
CFIndex GSUTF16CharacterGet (const UTF16Char *s, const UTF16Char *limit, UTF32Char *c)
 Get a Unicode code point from a UTF-16 string buffer.
 
#define kGSUTF16CharacterMaximumLength   2
 The maximum number of UTF-16 code units required to represent the highest Unicode code point.
 
#define kGSUTF16CharacterByteOrderMark   0xFEFF
 The Byte Order Mark for UTF-16 strings.
 
#define kGSUTF16CharacterSwappedByteOrderMark   0xFFFE
 The swapped Byte Order Mark for UTF-16 strings.
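
As a rough illustration of how the two UTF-16 Byte Order Mark values are typically used, the sketch below inspects the first code unit of a buffer to decide whether the remaining code units need byte swapping. The names `sketch_utf16_detect`, `sketch_swap16` and the `sketch_order` enum are hypothetical and are not part of CoreBase; plain `uint16_t` stands in for `UTF16Char` so the fragment compiles on its own.

```c
#include <stdint.h>

/* Hypothetical result type for this sketch only. */
enum sketch_order { SKETCH_NO_BOM, SKETCH_NATIVE, SKETCH_SWAPPED };

/* Classify the first code unit of a UTF-16 buffer. */
static enum sketch_order sketch_utf16_detect(uint16_t first)
{
  if (first == 0xFEFF)   /* same value as kGSUTF16CharacterByteOrderMark */
    return SKETCH_NATIVE;
  if (first == 0xFFFE)   /* same value as kGSUTF16CharacterSwappedByteOrderMark */
    return SKETCH_SWAPPED;
  return SKETCH_NO_BOM;
}

/* Exchange the two bytes of a UTF-16 code unit. */
static uint16_t sketch_swap16(uint16_t u)
{
  return (uint16_t)((u << 8) | (u >> 8));
}
```

If the swapped mark is seen, every following code unit would be passed through `sketch_swap16` before being interpreted.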
 

UTF-32 Utilities

#define kGSUTF32CharacterByteOrderMark   0x0000FEFF
 The Byte Order Mark for UTF-32 strings.
 
#define kGSUTF32CharacterSwappedByteOrderMark   0xFFFE0000
 The swapped Byte Order Mark for UTF-32 strings.
 

Function Documentation

◆ GSCharacterIsASCII()

Boolean GSCharacterIsASCII ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
Returns true if the character is an ASCII character.

◆ GSCharacterIsWhitespace()

Boolean GSCharacterIsWhitespace ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
True if the character is whitespace.

◆ GSCharacterIsInSupplementaryPlane()

Boolean GSCharacterIsInSupplementaryPlane ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
Returns true if the character is in one of the supplementary planes and false if it is in the Basic Multilingual Plane.
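
The range checks behind the ASCII and supplementary-plane tests can be sketched as follows. This is a minimal stand-alone approximation, not the CoreBase implementation: the `sketch_` names are hypothetical, `uint32_t` stands in for `UTF32Char`, and rejecting values above U+10FFFF is an assumption that this page does not confirm.

```c
#include <stdbool.h>
#include <stdint.h>

/* ASCII covers U+0000 through U+007F. */
static bool sketch_is_ascii(uint32_t c)
{
  return c < 0x80;
}

/* Supplementary planes cover U+10000 through U+10FFFF.
   Assumption: values above U+10FFFF are treated as "not supplementary". */
static bool sketch_is_supplementary(uint32_t c)
{
  return c >= 0x10000 && c <= 0x10FFFF;
}
```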

◆ GSCharacterIsSurrogate()

Boolean GSCharacterIsSurrogate ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
Returns true if the character is a surrogate and false otherwise.

◆ GSCharacterIsLeadSurrogate()

Boolean GSCharacterIsLeadSurrogate ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
Returns true if the character is a leading surrogate and false otherwise.

◆ GSCharacterIsTrailSurrogate()

Boolean GSCharacterIsTrailSurrogate ( const UTF32Char  c)
Parameters
[in] c  Character to test.
Returns
Returns true if the character is a trailing surrogate and false otherwise.
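
The three surrogate tests correspond to fixed Unicode ranges: surrogates occupy U+D800 through U+DFFF, with the leading half in U+D800 through U+DBFF and the trailing half in U+DC00 through U+DFFF. A minimal sketch using bit masks is shown below. The `sketch_` names are hypothetical and `uint32_t` stands in for `UTF32Char`; how CoreBase itself implements the tests is not shown on this page.

```c
#include <stdbool.h>
#include <stdint.h>

/* U+D800..U+DFFF: the top 21 bits of the value match 0xD800. */
static bool sketch_is_surrogate(uint32_t c)
{
  return (c & 0xFFFFF800u) == 0xD800u;
}

/* U+D800..U+DBFF: leading (high) surrogates. */
static bool sketch_is_lead_surrogate(uint32_t c)
{
  return (c & 0xFFFFFC00u) == 0xD800u;
}

/* U+DC00..U+DFFF: trailing (low) surrogates. */
static bool sketch_is_trail_surrogate(uint32_t c)
{
  return (c & 0xFFFFFC00u) == 0xDC00u;
}
```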

◆ GSUTF8CharacterTrailBytesCount()

CFIndex GSUTF8CharacterTrailBytesCount ( const UTF8Char  c)
Parameters
[in] c  Leading code unit to test.
Returns
The number of trailing bytes.
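
In UTF-8 the leading code unit's high bits encode how many trailing bytes follow. The sketch below shows that mapping under stated assumptions: `sketch_utf8_trail_count` is a hypothetical name, `uint8_t` stands in for `UTF8Char`, and returning -1 for a byte that cannot start a sequence is a choice made for this example only, since this page does not say what the library returns in that case.

```c
#include <stdint.h>

/* Number of trailing bytes implied by a UTF-8 leading code unit. */
static int sketch_utf8_trail_count(uint8_t lead)
{
  if (lead < 0x80)
    return 0;                 /* 0xxxxxxx: single-byte sequence */
  if ((lead & 0xE0) == 0xC0)
    return 1;                 /* 110xxxxx */
  if ((lead & 0xF0) == 0xE0)
    return 2;                 /* 1110xxxx */
  if ((lead & 0xF8) == 0xF0)
    return 3;                 /* 11110xxx */
  return -1;                  /* trailing byte or invalid lead (assumed error value) */
}
```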

◆ GSUTF8CharacterIsTrailing()

Boolean GSUTF8CharacterIsTrailing ( const UTF8Char  c)
Parameters
[in] c  The code unit to test.
Returns
Returns true if this UTF-8 code unit is a trailing code unit.
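
A trailing UTF-8 code unit always has the bit pattern 10xxxxxx, which makes the test a single mask and compare. The following sketch uses that test to count code points in a buffer by skipping trailing units. Both function names are hypothetical, `uint8_t` stands in for `UTF8Char`, and the counting helper assumes the input is well-formed UTF-8.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* True for bytes of the form 10xxxxxx. */
static bool sketch_utf8_is_trailing(uint8_t c)
{
  return (c & 0xC0) == 0x80;
}

/* Count code points by counting the code units that are not trailing.
   Assumes well-formed input; no validation is performed. */
static size_t sketch_utf8_count_code_points(const uint8_t *s, size_t n)
{
  size_t count = 0;
  size_t i;

  for (i = 0; i < n; i++)
    if (!sketch_utf8_is_trailing(s[i]))
      count++;
  return count;
}
```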

◆ GSUTF8CharacterLength()

CFIndex GSUTF8CharacterLength ( const UTF32Char  c)
Parameters
[in] c  The Unicode code point to test.
Returns
The number of UTF-8 code units required.
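
The number of UTF-8 code units follows directly from the code point's magnitude. A minimal sketch is given below; `sketch_utf8_length` is a hypothetical name, `uint32_t` stands in for `UTF32Char`, and returning 0 for values above U+10FFFF is an assumption. The sketch does not treat surrogates specially, and this page does not say whether the library does.

```c
#include <stdint.h>

/* UTF-8 code units needed for a code point; 0 if out of range (assumed). */
static int sketch_utf8_length(uint32_t c)
{
  if (c < 0x80)
    return 1;
  if (c < 0x800)
    return 2;
  if (c < 0x10000)
    return 3;
  if (c <= 0x10FFFF)
    return 4;     /* equals kGSUTF8CharacterMaximumLength */
  return 0;
}
```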

◆ GSUTF8CharacterAppendByteOrderMark()

CFIndex GSUTF8CharacterAppendByteOrderMark ( UTF8Char *  d,
const UTF8Char *  limit 
)
Parameters
[in,out] d  A pointer to the current position of the string buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer.
Returns
True if the function was successful and false otherwise.
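
The UTF-8 Byte Order Mark is the three-byte sequence EF BB BF, the UTF-8 encoding of U+FEFF. The sketch below writes it with a bounds check against `limit`. The name `sketch_utf8_append_bom` is hypothetical, `uint8_t` stands in for `UTF8Char`, and returning the number of code units written (0 on failure) is an assumption made for this example rather than documented library behavior.

```c
#include <stddef.h>
#include <stdint.h>

/* Write EF BB BF at d if at least three code units fit before limit.
   Returns the number of code units written, or 0 if there is no room. */
static size_t sketch_utf8_append_bom(uint8_t *d, const uint8_t *limit)
{
  if (limit - d < 3)
    return 0;
  d[0] = 0xEF;
  d[1] = 0xBB;
  d[2] = 0xBF;
  return 3;
}
```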

◆ GSUTF8CharacterSkipByteOrderMark()

Boolean GSUTF8CharacterSkipByteOrderMark ( const UTF8Char **  s,
const UTF8Char *  limit 
)
Parameters
[in,out] s  A pointer to the current position of the string buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer. The caller must ensure this parameter is beyond the string buffer pointed to by s.
Returns
True if a Byte Order Mark is found and false otherwise.
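
The in/out behavior of `s` can be illustrated with a short sketch: when the buffer starts with EF BB BF, the pointer is advanced past those three bytes; otherwise it is left unchanged. `sketch_utf8_skip_bom` is a hypothetical stand-in, not the library function, and `uint8_t` replaces `UTF8Char` so the fragment is self-contained.

```c
#include <stdbool.h>
#include <stdint.h>

/* Advance *s past a UTF-8 Byte Order Mark if one is present. */
static bool sketch_utf8_skip_bom(const uint8_t **s, const uint8_t *limit)
{
  const uint8_t *p = *s;

  if (limit - p >= 3 && p[0] == 0xEF && p[1] == 0xBB && p[2] == 0xBF)
    {
      *s = p + 3;   /* caller's pointer now refers to the first real code unit */
      return true;
    }
  return false;     /* *s is left untouched */
}
```

A caller would typically invoke this once at the start of a buffer and then continue decoding from the updated pointer.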

◆ GSUTF8CharacterAppend()

CFIndex GSUTF8CharacterAppend ( UTF8Char *  d,
const UTF8Char *  limit,
UTF32Char  c 
)
Parameters
[in] d  A pointer to the current position of the string buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer.
[in] c  The Unicode code point to write.
Returns
The number of code units written to the destination buffer. Will return 0 if c is a surrogate or an invalid code point.
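
The encoding step can be sketched as follows. This is an illustrative stand-alone encoder, not the CoreBase source: `sketch_utf8_append` is a hypothetical name, `uint8_t` and `uint32_t` stand in for `UTF8Char` and `UTF32Char`, and returning 0 when the buffer is too small is an assumption (the text above only mentions surrogates and invalid code points).

```c
#include <stddef.h>
#include <stdint.h>

/* Encode c as UTF-8 at d. Returns the number of code units written,
   or 0 for surrogates, out-of-range values, or insufficient space. */
static size_t sketch_utf8_append(uint8_t *d, const uint8_t *limit, uint32_t c)
{
  size_t n;

  if (c < 0x80)
    n = 1;
  else if (c < 0x800)
    n = 2;
  else if (c >= 0xD800 && c <= 0xDFFF)
    return 0;                              /* surrogate code point */
  else if (c < 0x10000)
    n = 3;
  else if (c <= 0x10FFFF)
    n = 4;
  else
    return 0;                              /* beyond the Unicode range */

  if (d >= limit || (size_t)(limit - d) < n)
    return 0;                              /* assumed: no partial writes */

  switch (n)
    {
    case 1:
      d[0] = (uint8_t)c;
      break;
    case 2:
      d[0] = (uint8_t)(0xC0 | (c >> 6));
      d[1] = (uint8_t)(0x80 | (c & 0x3F));
      break;
    case 3:
      d[0] = (uint8_t)(0xE0 | (c >> 12));
      d[1] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
      d[2] = (uint8_t)(0x80 | (c & 0x3F));
      break;
    default:
      d[0] = (uint8_t)(0xF0 | (c >> 18));
      d[1] = (uint8_t)(0x80 | ((c >> 12) & 0x3F));
      d[2] = (uint8_t)(0x80 | ((c >> 6) & 0x3F));
      d[3] = (uint8_t)(0x80 | (c & 0x3F));
      break;
    }
  return n;
}
```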

◆ GSUTF8CharacterGet()

CFIndex GSUTF8CharacterGet ( const UTF8Char *  s,
const UTF8Char *  limit,
UTF32Char *  c 
)
Parameters
[in,out] s  A pointer to the current position of the source buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer. Must be at least s + 1.
[out] c  On return, the character.
Returns
A valid Unicode code point. Will return 0 if:
  1. The UTF-8 code unit is also a 0.
  2. An invalid code point or code unit is encountered and loss was not specified.
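
For orientation, a strict UTF-8 decoding step might look like the sketch below. It is not the library's implementation and its return convention is an assumption: the hypothetical `sketch_utf8_get` stores the decoded code point in `*c` and returns the number of code units consumed, or 0 when the sequence is truncated, malformed, overlong, a surrogate, or out of range. `uint8_t` and `uint32_t` stand in for `UTF8Char` and `UTF32Char`.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one code point starting at s. Returns code units consumed, 0 on error. */
static size_t sketch_utf8_get(const uint8_t *s, const uint8_t *limit, uint32_t *c)
{
  /* Smallest code point that legitimately needs 1, 2, 3 or 4 code units. */
  static const uint32_t min_value[4] = { 0, 0x80, 0x800, 0x10000 };
  uint32_t cp;
  size_t trail, i;

  if (s >= limit)
    return 0;
  if (s[0] < 0x80)
    { cp = s[0]; trail = 0; }
  else if ((s[0] & 0xE0) == 0xC0)
    { cp = s[0] & 0x1F; trail = 1; }
  else if ((s[0] & 0xF0) == 0xE0)
    { cp = s[0] & 0x0F; trail = 2; }
  else if ((s[0] & 0xF8) == 0xF0)
    { cp = s[0] & 0x07; trail = 3; }
  else
    return 0;                              /* trailing byte or invalid lead */

  if ((size_t)(limit - s) < trail + 1)
    return 0;                              /* sequence is cut off by limit */

  for (i = 1; i <= trail; i++)
    {
      if ((s[i] & 0xC0) != 0x80)
        return 0;                          /* expected a 10xxxxxx byte */
      cp = (cp << 6) | (uint32_t)(s[i] & 0x3F);
    }

  if (cp < min_value[trail] || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
    return 0;                              /* overlong, out of range, or surrogate */

  *c = cp;
  return trail + 1;
}
```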

◆ GSUTF16CharacterAppend()

CFIndex GSUTF16CharacterAppend ( UTF16Char *  d,
const UTF16Char *  limit,
UTF32Char  c 
)
Parameters
[in,out] d  A pointer to the current position of the buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer.
[in] c  The Unicode code point to write.
Returns
True if the function was successful, and false if there is not enough space left in the string buffer or the code point is invalid.
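
UTF-16 stores Basic Multilingual Plane code points as a single code unit and supplementary code points as a leading and trailing surrogate pair. A minimal sketch of that encoding is shown below. The name `sketch_utf16_append` is hypothetical, `uint16_t` and `uint32_t` stand in for `UTF16Char` and `UTF32Char`, and returning the count of code units written (0 on failure) is an assumption chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/* Encode c as UTF-16 at d. Returns code units written, or 0 on failure. */
static size_t sketch_utf16_append(uint16_t *d, const uint16_t *limit, uint32_t c)
{
  if (c < 0x10000)
    {
      if (c >= 0xD800 && c <= 0xDFFF)
        return 0;                          /* lone surrogates are not characters */
      if (limit - d < 1)
        return 0;                          /* no room for one code unit */
      d[0] = (uint16_t)c;
      return 1;
    }
  if (c > 0x10FFFF || limit - d < 2)
    return 0;                              /* invalid, or no room for a pair */

  c -= 0x10000;                            /* 20 bits remain */
  d[0] = (uint16_t)(0xD800 + (c >> 10));   /* high 10 bits: leading surrogate */
  d[1] = (uint16_t)(0xDC00 + (c & 0x3FF)); /* low 10 bits: trailing surrogate */
  return 2;                                /* kGSUTF16CharacterMaximumLength */
}
```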

◆ GSUTF16CharacterGet()

CFIndex GSUTF16CharacterGet ( const UTF16Char *  s,
const UTF16Char *  limit,
UTF32Char *  c 
)
Parameters
[in] s  A pointer to the current position of the buffer. This value is updated after a call to the function.
[in] limit  The position just after the end of the buffer. Must be at least s + 1.
[out] c  On return, the character.
Returns
A valid Unicode code point. Will return 0 if:
  1. The UTF-16 code unit is also a 0.
  2. The UTF-16 code unit pointed to by s is not a leading code unit.
  3. The leading UTF-16 code unit does not have a trailing pair.
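
The failure cases listed above can be made concrete with a small decoding sketch. It is an approximation under stated assumptions, not the CoreBase code: the hypothetical `sketch_utf16_get` stores the code point in `*c` and returns the number of code units consumed, with 0 signalling a stray trailing surrogate or a leading surrogate without its pair. `uint16_t` and `uint32_t` stand in for `UTF16Char` and `UTF32Char`.

```c
#include <stddef.h>
#include <stdint.h>

/* Decode one code point starting at s. Returns code units consumed, 0 on error. */
static size_t sketch_utf16_get(const uint16_t *s, const uint16_t *limit, uint32_t *c)
{
  uint16_t lead;

  if (s >= limit)
    return 0;
  lead = s[0];

  if ((lead & 0xF800) != 0xD800)
    {
      *c = lead;                           /* ordinary BMP code unit */
      return 1;
    }
  if ((lead & 0xFC00) != 0xD800)
    return 0;                              /* trailing surrogate came first */
  if (limit - s < 2 || (s[1] & 0xFC00) != 0xDC00)
    return 0;                              /* leading surrogate has no pair */

  *c = 0x10000
    + ((uint32_t)(lead & 0x3FF) << 10)
    + (uint32_t)(s[1] & 0x3FF);
  return 2;
}
```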