mbrtoc32() — Convert a multibyte character to a char32_t character

Standards

Standards / Extensions    C or C++    Dependencies
ISO C Amendment, C11      both        z/OS® V2R1

Format

#include <uchar.h>

size_t mbrtoc32(char32_t * restrict pc32, 
                const char * restrict s, 
                size_t n, 
                mbstate_t * restrict ps);

General description

The mbrtoc32() function converts a multibyte character to a wide character of type char32_t, and returns the number of bytes of the multibyte character.

If s is not a null pointer, the mbrtoc32() function inspects at most n bytes beginning with the byte pointed to by s to determine the number of bytes needed to complete the next multibyte character (including any shift sequences). If the function determines that the next multibyte character is complete and valid, it determines the values of the corresponding wide characters and then, if pc32 is not a null pointer, stores the value of the first (or only) such character in the object pointed to by pc32. Subsequent calls will store successive wide characters without consuming any additional input until all the characters have been stored. If the corresponding wide character is the null wide character, the resulting state described is the initial conversion state.
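The calling pattern described above can be sketched as a loop. This is a minimal illustration, not part of the library: the helper name mbs_to_c32s and its parameters are hypothetical, and the conversion is assumed to follow whatever locale or CCSID is current when the function is called.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper (not a library function): converts the multibyte
 * string src into out[], storing at most cap characters. The terminating
 * null is not stored. Returns the number of characters stored, or
 * (size_t)-1 if the input is invalid or ends in an incomplete character. */
size_t mbs_to_c32s(char32_t *out, size_t cap, const char *src)
{
    assert(out != NULL && src != NULL);

    mbstate_t st;
    memset(&st, 0, sizeof st);          /* initial conversion state */

    size_t left = strlen(src) + 1;      /* bytes available, including the null */
    size_t count = 0;

    while (count < cap) {
        char32_t c;
        size_t rc = mbrtoc32(&c, src, left, &st);

        if (rc == (size_t)-1 || rc == (size_t)-2)
            return (size_t)-1;          /* encoding error or truncated input */
        if (rc == 0)
            break;                      /* converted the null character */

        out[count++] = c;
        if (rc != (size_t)-3) {         /* (size_t)-3 consumes no input bytes */
            src += rc;
            left -= rc;
        }
    }
    return count;
}
```

A caller would pass a buffer and its capacity, for example mbs_to_c32s(buf, 8, "abc"), and treat a result of (size_t)-1 as a failed conversion.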

If s is a null pointer, the mbrtoc32() function is equivalent to the call mbrtoc32(NULL,"",1,ps). In this case, the values of the parameters pc32 and n are ignored.

If ps is a null pointer, mbrtoc32() uses its own internal object to track the shift state. Otherwise *ps must be a valid mbstate_t object. An mbstate_t object *ps can be initialized to the initial state by assigning 0 to it, or by calling mbrtoc32(NULL,NULL,0,ps).
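The following sketch shows one way to prepare and reset a caller-owned state object. The helper names fresh_state and reset_state are hypothetical. The sketch uses memset() rather than assigning 0 so that it also compiles on platforms where mbstate_t is a structure, and it uses the standard mbsinit() function from <wchar.h>, which this description does not otherwise mention, to confirm the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>
#include <wchar.h>

/* Hypothetical helper: returns a state object that describes the
 * initial conversion state (all bytes zero). */
mbstate_t fresh_state(void)
{
    mbstate_t st;
    memset(&st, 0, sizeof st);
    return st;
}

/* Hypothetical helper: returns an already valid state object to the
 * initial conversion state by using the reset call described above.
 * Returns 1 if *ps is in the initial state afterward, 0 otherwise. */
int reset_state(mbstate_t *ps)
{
    assert(ps != NULL);
    (void)mbrtoc32(NULL, NULL, 0, ps);  /* same as mbrtoc32(NULL,"",1,ps) */
    return mbsinit(ps) != 0;
}
```

Note that reset_state() assumes *ps already holds a valid state; an uninitialized object should first be set up as in fresh_state().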

Usage notes

  1. To use the mbrtoc32() function, compile the source code with the LANGLVL(EXTC1X) option.
  2. The mbrtoc32() function only supports the CCSIDs that are provided by Unicode Services.
  3. The result of converting multiple strings alternately in one thread by using multiple mbstate_t objects (including the internal one) is undefined.

Returned value

The mbrtoc32() function returns the first of the following that applies (given the current conversion state):
0
   If the next n or fewer bytes complete the multibyte character that corresponds to the null wide character (which is the value stored).
between 1 and n inclusive
   If the next n or fewer bytes complete a valid multibyte character (which is the value stored); the value returned is the number of bytes that complete the multibyte character.
(size_t)(-3)
   If the next character resulting from a previous call has been stored (no bytes from the input have been consumed by this call).
(size_t)(-2)
   If the next n bytes contribute to an incomplete (but potentially valid) multibyte character, and all n bytes have been processed (no value is stored). When n has at least the value of the MB_CUR_MAX macro, this case can only occur if s points to a sequence of redundant shift sequences (for implementations with state-dependent encodings).
(size_t)(-1)
   If an encoding error occurs (when the next n or fewer bytes do not contribute to a complete and valid multibyte character). The value of the macro EILSEQ is stored in errno, and the conversion state is unspecified.
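As an illustration of how these return values interact, the sketch below feeds input to mbrtoc32() one byte at a time, as might happen when data arrives in pieces. The helper name count_chars_bytewise is hypothetical. The sketch assumes that the state object carries any partially consumed character between calls, which is the behavior the (size_t)(-2) case describes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <uchar.h>

/* Hypothetical helper: decodes buf[0..len) by passing a single byte to
 * each mbrtoc32() call. Returns the number of complete characters
 * produced (a null character counts as one), or (size_t)-1 on an
 * encoding error. */
size_t count_chars_bytewise(const char *buf, size_t len)
{
    assert(buf != NULL);

    mbstate_t st;
    memset(&st, 0, sizeof st);          /* initial conversion state */

    size_t chars = 0;
    size_t i = 0;

    while (i < len) {
        char32_t c;
        size_t rc = mbrtoc32(&c, buf + i, 1, &st);

        if (rc == (size_t)-1)
            return (size_t)-1;          /* encoding error; errno is EILSEQ */
        if (rc == (size_t)-2) {
            i++;                        /* byte absorbed into st; need more */
            continue;
        }
        chars++;                        /* a character was stored in c */
        if (rc != (size_t)-3)
            i++;                        /* 0 or 1: this byte was consumed */
    }
    return chars;
}
```

Because only (size_t)(-3) leaves the current byte unconsumed, the index advances in every other successful case, including a return value of 0 for the null character.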

Example

#include <stdio.h>
#include <stdlib.h>
#include <uchar.h>

int main(void)
{
   char32_t c32;
   char mbs[] = "a" ;    /* string containing the multibyte character */
   mbstate_t ss = 0 ;    /* set shift state to the initial state */
   size_t length = 0 ;   /* mbrtoc32() returns size_t, not int */
   
   /* Determine the length of the multibyte character pointed to by */
   /* mbs. Store the converted character in the char32_t object */
   /* called c32. */

   length = mbrtoc32(&c32, mbs, MB_CUR_MAX, &ss);
   if (length == (size_t)-1 || length == (size_t)-2 ||
       length == (size_t)-3) {
      /* (size_t)-2 and (size_t)-3 cannot occur when converting 'a' */
      perror("mbrtoc32() fails to convert");
      exit(EXIT_FAILURE);
   }
   
   printf(" mbs:\"%s\"\n", mbs);
   printf(" length: %zu \n", length);
   printf(" c32: 0x%08x \n", (unsigned int)c32);
}
Output:
mbs:"a"
length: 1
c32: 0x00000061

Related information