iconv() — Code conversion

Standards

Standards / Extensions                  C or C++   Dependencies

XPG4                                    both
XPG4.2
Single UNIX Specification, Version 3

Format

#include <iconv.h>

size_t iconv(iconv_t cd, char **__restrict__ inbuf,
             size_t *__restrict__ inbytesleft, char **__restrict__ outbuf,
             size_t *__restrict__ outbytesleft);

General description

Converts a sequence of characters, indirectly pointed to by inbuf, from one encoded character set into a sequence of corresponding characters in another encoded character set. The resulting character sequence is then stored into the array indirectly pointed to by outbuf. The encoded character sets are those specified in the iconv_open() call that returned the conversion descriptor, cd. If the descriptor refers to the state-dependent encoding, then before it is first used, the cd descriptor is in its initial shift state.

The inbuf argument points to a variable that points to the first character in the input buffer. inbytesleft indicates the number of bytes to the end of the buffer to be converted. The outbuf argument points to a variable that points to the first character in the output buffer. outbytesleft indicates the number of available bytes to the end of the buffer.

If the output character set refers to the state-dependent encoding—if it contains the multibyte characters with shift-states—the conversion descriptor cd is placed in its initial state by a call for which inbuf is a NULL pointer, or for which inbuf points to a NULL pointer. When iconv() is called in this way, and if outbuf is not a NULL pointer or a pointer to a NULL pointer, and outbytesleft points to a positive value, iconv() places in the output buffer the byte sequence to change the output buffer to the initial shift state. If the output buffer is not large enough to hold the entire reset sequence, iconv() fails, and sets errno to E2BIG. Subsequent calls with inbuf as other than a NULL pointer or a pointer to a NULL pointer cause conversion from the current state of the conversion descriptor.
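
The reset call described above can be sketched as follows. This is a minimal illustration, not a definitive implementation: reset_shift_state() is a hypothetical helper name, and the caller is assumed to have obtained cd from a successful iconv_open().

```c
#include <iconv.h>
#include <stddef.h>

/*
 * Hypothetical helper: return cd to its initial shift state and store any
 * reset sequence in out. Returns the number of bytes written to out, or
 * (size_t)-1 with errno set by iconv() (E2BIG if out is too small).
 */
size_t reset_shift_state(iconv_t cd, char *out, size_t outsize)
{
    char  *outptr  = out;
    size_t outleft = outsize;

    /* A NULL inbuf requests the reset; no input is read on this call. */
    if (iconv(cd, NULL, NULL, &outptr, &outleft) == (size_t)-1)
        return (size_t)-1;

    /* For an encoding without shift states, nothing is written. */
    return outsize - outleft;
}
```

A caller would typically invoke such a helper after the last block of input has been converted, so that the output ends in the initial shift state.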

If a sequence of input bytes does not form a valid character in the specified encoded character set, conversion stops after the previous successfully converted character, and iconv() sets errno to EILSEQ. If the input buffer ends with an incomplete character or shift sequence, conversion stops after the previous successfully converted bytes, and iconv() sets errno to EINVAL. If the output buffer is not large enough to hold the entire converted input, conversion stops just before the input bytes that would cause the output buffer to overflow.
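
The three stop conditions above can be told apart by inspecting errno immediately after the call. The following sketch assumes a hypothetical enum stop_reason and helper convert_once(); neither is part of <iconv.h>, and the code set names are supplied by the caller because the accepted names differ between platforms.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>

/* Hypothetical labels for the ways a single iconv() call can end. */
enum stop_reason {
    STOP_DONE,        /* all input was converted                      */
    STOP_ILSEQ,       /* EILSEQ: invalid input byte sequence          */
    STOP_INCOMPLETE,  /* EINVAL: input ends inside a character        */
    STOP_OUT_FULL,    /* E2BIG: output buffer has no room left        */
    STOP_OTHER        /* iconv_open() failed or another errno value   */
};

/* Hypothetical helper: open, convert once, classify the result, close. */
enum stop_reason convert_once(const char *tocode, const char *fromcode,
                              const char *in, size_t inlen,
                              char *out, size_t outsize)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return STOP_OTHER;

    char  *inptr   = (char *)in;  /* iconv() reads but does not modify input */
    char  *outptr  = out;
    size_t inleft  = inlen;
    size_t outleft = outsize;
    enum stop_reason reason = STOP_DONE;

    if (iconv(cd, &inptr, &inleft, &outptr, &outleft) == (size_t)-1) {
        /* Read errno before any other library call can change it. */
        switch (errno) {
        case EILSEQ: reason = STOP_ILSEQ;      break;
        case EINVAL: reason = STOP_INCOMPLETE; break;
        case E2BIG:  reason = STOP_OUT_FULL;   break;
        default:     reason = STOP_OTHER;      break;
        }
    }

    iconv_close(cd);
    return reason;
}
```

For instance, on a system that accepts the names "ASCII" and "UTF-8" (an assumption; z/OS code set names such as IBM-1047 differ), converting from "UTF-8" with an input that ends in the lone lead byte 0xC3 would be expected to yield STOP_INCOMPLETE.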

The variable pointed to by inbuf is updated to point to the byte following the last byte of a successfully converted character. The value pointed to by inbytesleft is decremented to reflect the number of bytes still not converted in the input buffer. The variable pointed to by outbuf is updated to point to the byte following the last byte of converted output data. The value pointed to by outbytesleft is decremented to reflect the number of bytes still available in the output buffer. For state-dependent encoding, the conversion descriptor is updated to reflect the shift state in effect at the end of the last successfully converted byte sequence.
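
Because iconv() advances the caller's pointers and decrements the counters, the amount of work done by a call can be derived from either. The sketch below illustrates this bookkeeping; struct progress and measure_progress() are hypothetical names introduced only for this example.

```c
#include <iconv.h>
#include <stddef.h>

/* Hypothetical record of what one iconv() call accomplished. */
struct progress {
    size_t consumed;  /* input bytes successfully converted */
    size_t produced;  /* output bytes stored                */
};

/* Hypothetical helper: run one conversion and report the byte counts. */
struct progress measure_progress(const char *tocode, const char *fromcode,
                                 const char *in, size_t inlen,
                                 char *out, size_t outsize)
{
    struct progress p = { 0, 0 };
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return p;

    char  *inptr   = (char *)in;
    char  *outptr  = out;
    size_t inleft  = inlen;
    size_t outleft = outsize;

    /* The counts are meaningful even when the call stops early. */
    (void)iconv(cd, &inptr, &inleft, &outptr, &outleft);

    /* Pointer movement and counter decrease describe the same amounts. */
    p.consumed = (size_t)(inptr - in);      /* equals inlen - inleft    */
    p.produced = (size_t)(outptr - out);    /* equals outsize - outleft */

    iconv_close(cd);
    return p;
}
```

The remaining input, if any, starts at in + consumed, which is where a follow-up call would resume.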

If iconv() encounters a character in the input buffer that is valid, but for which a conversion is not defined in the conversion descriptor, cd, then iconv() performs a nonidentical conversion on this character. The conversion is implementation-defined.

The <iconv.h> header file declares the iconv_t type, which is a pointer to an object capable of storing the information about the converters used to convert characters from one coded character set to another. For state-dependent encoding, the object must also be capable of storing the encoded information about the current shift state.

Special considerations for bidirectional language support: If the _BIDION environment variable is set to TRUE, iconv() performs bidirectional layout transformation on the converted characters. The required attributes for bidirectional layout transformation can be specified using the environment variable _BIDIATTR (for example, export _BIDIATTR="@ls typeoftext=visual:implicit, orientation=ltr:ltr,numerals=nominal:national"). If the environment variable _BIDIATTR is not set, the default values are used. For a detailed description of the bidirectional layout transformation, see “Bidirectional Language Support” in z/OS XL C/C++ Programming Guide.

iconv() can perform bidirectional layout transformation while converting the data from the fromCodePage to the toCodePage. Bidirectional layout transformation takes place only if bidirectional language support is activated; see iconv_open() — Allocate code conversion descriptor for more information about activating bidirectional layout transformation. If iconv() encounters an error in the input or output buffers during the bidirectional part, it bypasses the bidirectional layout transformation and continues with its normal conversion.

Special behavior for POSIX C: In the POSIX environment, a conversion descriptor returned from a successful iconv_open() may be used safely within a single thread. In addition, it may be opened on one thread (iconv_open()), used on a second thread (iconv()), and closed (iconv_close()) on a third thread. However, you must ensure correct cross-thread sequencing and synchronization (that is: iconv_open(), followed by optional iconv() calls, followed by iconv_close()). The use of a shared conversion descriptor by iconv() across multiple threads may result in undefined behavior.

Returned value

If successful, iconv() updates the variables pointed to by the arguments to reflect the extent of the conversion and returns the number of nonidentical conversions performed.

If the entire string in the input buffer is converted, the value pointed to by inbytesleft will be 0. If the input conversion is stopped because of any conditions mentioned above, the value pointed to by inbytesleft will be nonzero and errno is set to indicate the condition.

If an error occurs, iconv() returns (size_t)-1 and sets errno to one of the following values:
Error Code
Description
EBADF
cd is not a valid descriptor.
ECUNNOENV
A CUN_RS_NO_UNI_ENV error was issued by Unicode Conversion Services. See z/OS Unicode Services User's Guide and Reference documentation for user action.
ECUNNOCONV
A CUN_RS_NO_CONVERSION error was issued by Unicode Conversion Services. See z/OS Unicode Services User's Guide and Reference documentation for user action.
ECUNNOTALIGNED
A CUN_RS_TABLE_NOT_ALIGNED error was issued by Unicode Conversion Services. See z/OS Unicode Services User's Guide and Reference documentation for user action.
ECUNERR
Function iconv() encountered an unexpected error while using Unicode Conversion Services. See message EDC6258 for additional information.

EILSEQ
Input conversion stopped due to an input byte that does not belong to the input codeset.
EINVAL
Input conversion stopped due to an incomplete character or shift sequence at the end of the input buffer.
E2BIG
Input conversion stopped due to lack of space in the output buffer.
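
Of the errors listed, E2BIG is the one a caller can usually recover from by emptying the output buffer and calling iconv() again. The following is a minimal sketch of such a loop, assuming a hypothetical helper convert_chunked() and a deliberately small scratch buffer (CHUNK) so that E2BIG actually occurs. All other errors are treated as fatal here.

```c
#include <errno.h>
#include <iconv.h>
#include <stddef.h>
#include <string.h>

#define CHUNK 4  /* hypothetical scratch size, kept tiny to force E2BIG */

/*
 * Hypothetical helper: convert in[0..inlen) piecewise through a small
 * scratch buffer, appending each piece to dest. Returns the total number
 * of bytes stored in dest, or (size_t)-1 on any failure.
 */
size_t convert_chunked(const char *tocode, const char *fromcode,
                       const char *in, size_t inlen,
                       char *dest, size_t destsize)
{
    iconv_t cd = iconv_open(tocode, fromcode);
    if (cd == (iconv_t)-1)
        return (size_t)-1;

    char  *inptr  = (char *)in;
    size_t inleft = inlen;
    size_t total  = 0;
    int    failed = 0;

    while (inleft > 0 && !failed) {
        char   chunk[CHUNK];
        char  *outptr  = chunk;
        size_t outleft = sizeof chunk;

        size_t rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
        size_t n  = sizeof chunk - outleft;   /* bytes produced this round */

        if (rc == (size_t)-1 && (errno != E2BIG || n == 0)) {
            failed = 1;   /* EILSEQ, EINVAL, other error, or no progress */
        } else if (n > destsize - total) {
            failed = 1;   /* dest cannot hold the converted data */
        } else {
            memcpy(dest + total, chunk, n);   /* drain scratch, then retry */
            total += n;
        }
    }

    iconv_close(cd);
    return failed ? (size_t)-1 : total;
}
```

The loop relies on iconv() having advanced inptr and decremented inleft before it reported E2BIG, so each retry resumes at the first unconverted byte. For a state-dependent output encoding, a final call with a NULL inbuf would also be needed to emit the reset sequence; that step is omitted from this sketch.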

Example

CELEBI01
/* CELEBI01

   This example converts an array of characters coded in encoded character
   set IBM-1047 to an array of characters coded in encoded character set
   IBM-037.
   Input is in inbuf, output will be in outbuf.

 */
#include <iconv.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
   char             inbuf[20] = "ABCDEFGH!@#$1234";  /* input buffer */
   unsigned char    outbuf[20];  /* output buffer                  */
   char            *inptr;       /* Pointer used for input buffer  */
   char            *outptr;      /* Pointer used for output buffer */
   iconv_t          cd;          /* conversion descriptor          */
   size_t           inleft;      /* number of bytes left in inbuf  */
   size_t           outleft;     /* number of bytes left in outbuf */
   size_t           rc;          /* return value of iconv()        */

   if ((cd = iconv_open("IBM-037", "IBM-1047")) == (iconv_t)(-1)) {
      fprintf(stderr, "Cannot open converter from %s to %s\n",
                      "IBM-1047", "IBM-037");
      exit(8);
   }

   inleft = strlen(inbuf);      /* 16 bytes of input */
   outleft = sizeof(outbuf);
   inptr = inbuf;
   outptr = (char *)outbuf;

   /* iconv() returns size_t, so compare against (size_t)-1 */
   rc = iconv(cd, &inptr, &inleft, &outptr, &outleft);
   if (rc == (size_t)-1) {
      fprintf(stderr, "Error in converting characters\n");
      iconv_close(cd);
      exit(8);
   }
   iconv_close(cd);
   return 0;
}

Related information