fscanf(), scanf(), sscanf() — Read and format data

Standards

Standards / Extensions	C or C++	Dependencies
ISO C; POSIX.1; XPG4; XPG4.2; Single UNIX Specification, Version 3; C/C++ DFP; Language Environment	both	z/OS® V1R8

Format

#include <stdio.h>

int fscanf(FILE *__restrict__ stream, const char *__restrict__ format-string, ...);
int scanf(const char *__restrict__ format-string, ...);
int sscanf(const char *__restrict__ buffer, const char *__restrict__ format, ...);

#define _OPEN_SYS_UNLOCKED_EXT 1
#include <stdio.h>

int fscanf_unlocked(FILE *__restrict__ stream,
                    const char *__restrict__ format-string, ...);
int scanf_unlocked(const char *__restrict__ format-string, ...);

General description

These three related functions are referred to as the fscanf family.

The fscanf() function reads data from the current position of the specified stream into the locations given by the entries in the argument list, if any. The argument list, if it exists, follows the format string. The fscanf() function cannot be used for a file opened with type=record or type=blocked.

The scanf() function reads data from the standard input stream stdin into the locations given by each entry in the argument list. The argument list, if it exists, follows the format string. scanf() cannot be used if stdin has been reopened as a type=record or type=blocked file.

The sscanf() function reads data from buffer into the locations given by argument-list. Reaching the end of the string pointed to by buffer is equivalent to fscanf() reaching EOF. If the strings pointed to by buffer and format overlap, behavior is undefined.

fscanf() and scanf() have the same restriction as any read operation for a read immediately following a write or a write immediately following a read. Between a write and a subsequent read, there must be an intervening flush or reposition. Between a read and a subsequent write, there must also be an intervening flush or reposition unless an EOF has been reached.

For all three functions, each entry in the argument list must be a pointer to a variable of a type that matches the corresponding conversion specification in format-string. If the types do not match, the results are undefined.

For all three functions, the format-string controls the interpretation of the argument list. The format-string can contain multibyte characters beginning and ending in the initial shift state.

The format string pointed to by format-string can contain one or more of the following:

White space characters, as specified by isspace(), such as blanks and newline characters. A white space character causes fscanf(), scanf(), and sscanf() to read, but not to store, all consecutive white space characters in the input up to the next character that is not white space. One white space character in format-string matches any combination of white space characters in the input.
Characters that are not white space, except for the percent sign character (%). A non-white space character causes fscanf(), scanf(), and sscanf() to read, but not to store, a matching non-white space character. If the next character in the input stream does not match, the function ends.
Conversion specifications which are introduced by the percent sign (%) or the sequence (%n$) where n is a decimal integer in the range [1,NL_ARGMAX]. A conversion specification causes fscanf(), scanf(), and sscanf() to read and convert characters in the input into values of a conversion specifier. The value is assigned to an argument in the argument list.

All three functions read format-string from left to right. Characters outside of conversion specifications are expected to match the sequence of characters in the input stream; the matched characters in the input stream are scanned but not stored. If a character in the input stream conflicts with format-string, the function ends, terminating with a “matching” failure. The conflicting character is left in the input stream as if it had not been read.

When the first conversion specification is found, the value of the first input field is converted according to the conversion specification and stored in the location specified by the first entry in the argument list. The second conversion specification converts the second input field and stores it in the second entry in the argument list, and so on through the end of format-string.

fscanf_unlocked() is functionally equivalent to fscanf() with the exception that it is not thread-safe. This function can safely be used in a multithreaded application if and only if it is called while the invoking thread owns the (FILE*) object, as is the case after a successful call to either the flockfile() or ftrylockfile() function.

scanf_unlocked() is functionally equivalent to scanf() with the exception that it is not thread-safe. This function can safely be used in a multithreaded application if and only if it is called while the invoking thread owns the (FILE*) object, as is the case after a successful call to either the flockfile() or ftrylockfile() function.

An input field is defined as:

All characters until a white space character (space, tab, or newline) is encountered
All characters until a character is encountered that cannot be converted according to the conversion specification
All characters until the field width is reached.

If there are too many arguments for the conversion specifications, the extra arguments are evaluated but otherwise ignored. The results are undefined if there are not enough arguments for the conversion specifications.


Syntax of Conversion Specification for fscanf(), scanf(), and sscanf()

>>-%--+---+--+-------+--+----+--conversion specifier-----------><
      '-*-'  '-width-'  +-h--+                         
                        +-hh-+                         
                        +-l--+                         
                        +-ll-+                         
                        +-j--+                         
                        +-t--+                         
                        +-z--+                         
                        +-D--+                         
                        +-DD-+                         
                        +-H--+                         
                        '-L--'

Each field of the conversion specification is a single character or a number signifying a particular format option. The conversion specifier, which appears after the last optional format field, determines whether the input field is interpreted as a character, a string, or a number. The simplest conversion specification contains only the percent sign and a conversion specifier (for example, %s).

Each field of the format specification is discussed in detail below.

Other than conversion specifiers, you should avoid using the percent sign (%), except to specify the percent sign: %%. Currently, the percent sign is treated as the start of a conversion specifier. Any unrecognized specifier is treated as an ordinary sequence of characters. If, in the future, z/OS XL C/C++ permits a new conversion specifier, it could match a section of your format string, be interpreted incorrectly, and result in undefined behavior. See Table 1 for a list of conversion specifiers.

An asterisk (*) following the percent sign suppresses assignment of the next input field, which is interpreted as a field of the specified conversion specifier. The field is scanned but not stored.

width is a positive decimal integer controlling the maximum number of characters to be read. No more than width characters are converted and stored at the corresponding argument.

Fewer than width characters are read if a white space character (space, tab, or newline), or a character that cannot be converted according to the given format occurs before width is reached.

Optional prefix: The optional prefix characters used to indicate the size of the argument expected are explained:

Prefix: Meaning
h: Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to short or unsigned short.
hh: Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to signed char or unsigned char.
l: (ell) Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long or unsigned long; the following a, A, e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to double; and the following c, s, or [ conversion specifier applies to an argument with type pointer to wchar_t.
ll: (ell-ell) Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to long long or unsigned long long.
j: Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to intmax_t or uintmax_t.
t: Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to ptrdiff_t or the corresponding unsigned type.
z: Specifies that the d, i, o, u, x, X, or n conversion specifier applies to an argument with type pointer to size_t or the corresponding signed integer type.
D: Specifies that the e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to _Decimal64 float.
DD: Specifies that the e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to _Decimal128 float.
H: Specifies that the e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to _Decimal32 float.
L: Specifies that the a, A, e, E, f, F, g, or G conversion specifier applies to an argument with type pointer to long double.

Conversion specifier: Table 1 explains the valid conversion specifiers and their meanings.

Table 1. Conversion Specifiers in fscanf Family
Conversion Specifier	Type of Input Expected	Type of Argument
d	Decimal integer	Pointer to `int`
o	Octal integer	Pointer to `unsigned int`
x X	Hexadecimal integer	Pointer to `unsigned int`
i	Decimal, hexadecimal, or octal integer	Pointer to `int`
u	Unsigned decimal integer	Pointer to `unsigned int`
e E f F g G	Floating-point value consisting of an optional sign (+ or -); a series of one or more decimal digits possibly containing a decimal-point; and an optional exponent (e or E) followed by a possibly signed integer value	Pointer to `float`
a A	Matches an optionally signed floating-point number, infinity, or NaN, whose format is the same as expected for the subject sequence of strtod(). In the absence of a size modifier, the application shall ensure that the corresponding argument is a pointer to float.	Pointer to `float`
D(n,p)	Fixed-point value consisting of an optional sign (+ or -); a series of one or more decimal digits possibly containing a decimal-point.	Pointer to decimal
c	Sequence of one or more characters as specified by field width; white space characters that are ordinarily skipped are read when `%c` is specified. No terminating null is added.	Pointer to `char` large enough for input field.
C or lc	The input is a sequence of one or more multibyte characters as specified by the field width, beginning in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc() function. The conversion state described by the mbstate_t object is initialized to zero before the first multibyte character is converted. The corresponding argument is a pointer to the initial element of an array of `wchar_t` large enough to accept the resulting sequence of wide characters. No NULL wide character is added.	`C` or `lc` uses a pointer to `wchar_t`.
s	Like `c`, a sequence of bytes of type `char` (signed or unsigned), except that white space characters are not allowed, and a terminating null is always added.	Pointer to character array large enough for input field, plus a terminating NULL character (`\0`) that is automatically appended.
S or ls	A sequence of multibyte characters that begins and ends in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc() function, with the conversion state described by the mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument is a pointer to the initial element of an array of `wchar_t` large enough to accept the sequence and the terminating NULL wide character, which is added automatically.	`S` or `ls` uses a pointer to `wchar_t` string.
n	No input read from stream or buffer.	Pointer to `int`, into which is stored the number of characters successfully read from the stream or buffer up to that point in the call to either fscanf() or to scanf().
p	Pointer to `void` converted to series of characters. For the specific format of the input, see the individual system reference guides.	Pointer to `void`.
[	A non-empty sequence of bytes to be matched against a set of expected bytes (the scanset), which form the conversion specification. White space characters that are ordinarily skipped are read when `%[` is specified. Consider the following situations: `[^bytes]`: the scanset contains all bytes that do not appear between the circumflex and the right square bracket. `[]abc]` or `[^]abc]`: in both cases the right square bracket is part of the list (in the first case the scanset is `]abc`; in the second case the scanset is every byte except `]abc`). `[a-z]`: in EBCDIC, the `-` is in the scanset and the characters b through y are not; in ASCII, the `-` is not in the scanset and the characters b through y are. The code points for the square brackets ([ and ]) and the caret (^) vary among the EBCDIC encoded character sets. The default C locale expects these characters to use the code points for encoded character set Latin-1 / Open Systems 1047. Conversion proceeds one byte at a time: there is no conversion to wide characters.	Pointer to the initial byte of an array of char, signed char, or unsigned char large enough to accept the sequence and a terminating byte, which will be added automatically.
l[	If an `l` length modifier is present, input is a sequence of multibyte characters that begins and ends in the initial shift state. Each multibyte character in the sequence is converted to a wide character as if by a call to the mbrtowc() function, with the conversion state described by the mbstate_t object initialized to zero before the first multibyte character is converted. The corresponding argument is a pointer to the initial element of an array of `wchar_t` large enough to accept the sequence and the terminating NULL wide character, which is added automatically.	`l[` uses a pointer to `wchar_t` string

When the LC_SYNTAX category is set using setlocale(), the format strings passed to the fscanf(), scanf(), or sscanf() functions must use the same encoded character set as is specified for the LC_SYNTAX category.

To read strings not delimited by space characters, substitute a set of characters in square brackets ([ ]) for the s (string) conversion specifier. The corresponding input field is read up to the first character that does not appear in the bracketed character set. If the first character in the set is a logical not (¬), the effect is reversed: the input field is read up to the first character that does appear in the rest of the character set.

To store a string without storing an ending NULL character (\0), use the specification %ac, where a is a decimal integer. In this instance, the c conversion specifier means that the argument is a pointer to a character array. The next a characters are read from the input stream into the specified location, and no NULL character is added.

The input for a %x conversion specifier is interpreted as a hexadecimal number.

All three functions, fscanf(), scanf(), and sscanf(), scan each input field character by character. A function might stop reading a particular input field before it reaches a space character, either because the specified width is reached or because the next character cannot be converted as specified. When a conflict occurs between the specification and the input character, the next input field begins at the first unread character. The conflicting character, if there is one, is considered unread and is the first character of the next input field or the first character in subsequent read operations on the input stream.

Special behavior for XPG4.2:

When the %n$ conversion specification is found, the value of the input field is converted according to the conversion specification and stored in the location specified by the nth argument in the argument list. Numbered arguments in the argument list can only be referenced once from format-string.
The format-string can contain either form of the conversion specification, that is, % or %n$ but the two forms cannot be mixed within a single format-string except that %% or %* can be mixed with the %n$ form.

Floating-point and the fscanf family of formatted input functions: The fscanf family functions match e, E, f, F, g or G conversion specifiers to floating-point number substrings in the input stream. The fscanf family functions convert each input substring matched by an e, E, f, F, g or G conversion specifier to a float, double or long double value depending on a size modifier preceding the e, E, f, F, g or G conversion specifier.

The floating-point value produced is hexadecimal floating-point or IEEE Binary Floating-Point format depending on the floating-point mode of the thread invoking the fscanf family function. The fscanf family functions use __isBFP() to determine the floating-point mode of invoking threads.

Many z/OS XL C/C++ formatted input functions, including the fscanf family, recognize special infinity and NaN floating-point number input sequences when the invoking thread is in IEEE Binary Floating-Point mode as determined by __isBFP().

The special sequence for infinity input is an optional plus or minus sign, then the character sequence INF, where the individual characters may be uppercase or lowercase, and then a white space character (space, tab, or newline), a NULL character (\0) or EOF.
The special sequence for NaN input is an optional plus or minus sign, then the character sequence NANS for a signalling NaN or NANQ for a quiet NaN, where the individual characters may be uppercase or lowercase, then an optional NaN ordinal sequence, and then a white space character (space, tab, or newline), a NULL character (\0) or EOF.
For binary floating point NANs: A NaN ordinal sequence is a left-parenthesis character, “(”, followed by a digit sequence representing an integer n, where 1 <= n <= INT_MAX-1, followed by a right-parenthesis character, “)”. If the NaN ordinal sequence is omitted, NaN ordinal sequence (1) is assumed. The integer value, n, corresponding to a NaN ordinal sequence determines what IEEE Binary Floating-Point NaN fraction bits are produced by formatted input functions.

For a signalling NaN, these functions produce NaN fraction bits (left to right) by reversing the bits (right to left) of the even integer value 2*n.

For a quiet NaN they produce NaN fraction bits (left to right) by reversing the bits (right to left) of the odd integer value 2*n-1.

For decimal floating point NANs: A NaN ordinal sequence is a left parenthesis character, "(", followed by a decimal digit sequence of up to 6 digits for a _Decimal32 output number, up to 15 digits for a _Decimal64 output value, or up to 33 digits for a _Decimal128 output value, followed by a right parenthesis, ")". If the NaN ordinal sequence is omitted, NaN ordinal sequence "(0)" is assumed. If the NaN ordinal sequence is shorter than 6, 15, or 33 digits, it will be padded on the left with "0" digits so that the length becomes 6, 15, or 33 digits for _Decimal32, _Decimal64, and _Decimal128 values respectively.

For decimal floating point numbers, the digits are not reversed, and both odd and even NaN ordinal sequences can be specified for either a quiet NaN or a signalling NaN.

Usage note

To use IEEE decimal floating-point, the hardware must have the Decimal Floating-Point Facility installed.

Returned value

All three functions, fscanf(), scanf(), and sscanf(), return the number of input items that were successfully matched and assigned. The returned value does not include conversions that were performed but not assigned (for example, suppressed assignments). The functions return EOF if there is an input failure before any conversion, or if EOF is reached before any conversion. Thus a returned value of 0 means that no fields were assigned: there was a matching failure before any conversion. Also, if there is an input failure, then the file error indicator is set, which is not the case for a matching failure.

The ferror() and feof() functions are used to distinguish between a read error and an EOF. Note that EOF is only reached when an attempt is made to read “past” the last byte of data. Reading up to and including the last byte of data does not turn on the EOF indicator.

Examples

CELEBF42

/* CELEBF42

   This example scans various types of data.

 */
#include <stdio.h>                                                              
                                                                                
int main(void)                                                                  
{                                                                               
   int i;                                                                       
   float fp;                                                                    
   char c, s[81];                                                               
                                                                                
   printf("Enter an integer, a real number, a character "                       
          "and a string : \n");                                                 
   if (scanf("%d %f %c %s", &i, &fp, &c, s) != 4)                               
      printf("Not all of the fields were assigned\n");                          
   else                                                                         
   {                                                                            
      printf("integer = %d\n", i);                                              
      printf("real number = %f\n", fp);                                         
      printf("character = %c\n", c);                                            
      printf("string = %s\n",s);                                                
   }                                                                            
}

Output

If input is: 12 2.5 a yes, then output would be:

Enter an integer, a real number, a character and a string:
integer = 12
real number = 2.500000
character = a
string = yes

CELEBF43

/* CELEBF43
   This example converts a hexadecimal integer to a decimal integer.
   The while loop ends if the input value is not a hexadecimal integer.
 */
#include <stdio.h>

int main(void)
{
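   unsigned int number;   /* %x requires a pointer to unsigned int */

   printf("Enter a hexadecimal number or anything else to quit:\n");
   /* Compare with 1: EOF is nonzero and would otherwise keep the loop running. */
   while (scanf("%x", &number) == 1)
      {
      printf("Hexadecimal Number = %x\n",number);
      printf("Decimal Number     = %u\n",number);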
      }
}

Output

If input is: 0x231 0xf5e 0x1 q, then output would be:

Enter a hexadecimal number or anything else to quit:
Hexadecimal Number = 231
Decimal Number     = 561
Hexadecimal Number = f5e
Decimal Number     = 3934
Hexadecimal Number = 1
Decimal Number     = 1

CELEBF44

/* CELEBF44
   The next example illustrates the use of scanf() to input fixed-point
   decimal data types. This example works under C only, not C++.
 */
#include <stdio.h>
#include <decimal.h>

decimal(15,4) pd01;
decimal(10,2) pd02;
decimal(5,5) pd03;

int main(void) {
  printf("\nFirst time :-------------------------------\n");
  printf("Enter three fixed-point decimal number\n");
  printf("  (15,4) (10,2) (5,5)\n");
  if (scanf("%D(15,4) %D(10,2) %D(5,5)", &pd01, &pd02, &pd03) != 3) {
    printf("Error found in scanf\n");
  } else {
    printf("pd01 = %D(15,4)\n", pd01);
    printf("pd02 = %D(10,2)\n", pd02);
    printf("pd03 = %D(5,5)\n", pd03);
  }
  printf("\nSecond time :------------------------------\n");
  printf("Enter three fixed-point decimal number\n");
  printf("  (15,4) (10,2) (5,5)\n");
  if (scanf("%D(15,4) %D(10,2) %D(5,5)", &pd01, &pd02, &pd03) != 3) {
    printf("Error found in scanf\n");
  } else {
    printf("pd01 = %D(15,4)\n", pd01);
    printf("pd02 = %D(10,2)\n", pd02);
    printf("pd03 = %D(5,5)\n", pd03);
  }
  return(0);
}

Output

First time :-------------------------------
Enter three fixed-point decimal number
  (15,4) (10,2) (5,5)
12345678901.2345 -987.6 .24680
pd01 = 12345678901.2345
pd02 = -987.60
pd03 = 0.24680

Second time :------------------------------
Enter three fixed-point decimal number
  (15,4) (10,2) (5,5)
123456789013579.24680 123.4567890 987
pd01 = 12345678901.3579
pd02 = 123.45
pd03 = 0.98700

CELEBF46

/* CELEBF46
   The next example opens the file myfile.dat for reading and then scans
   this file for a string, a long integer value, a character, and a
   floating-point value.
 */
#include <stdio.h>
#define  MAX_LEN  80

int main(void)
{
   FILE *stream;
   long l;
   float fp;
   char s[MAX_LEN + 1];
   char c;

   stream = fopen("myfile.dat", "r");
   if (stream == NULL) {
      perror("myfile.dat");
      return 1;
   }

   /* Read in various data. */
   fscanf(stream, "%80s", &s[0]);   /* width of MAX_LEN keeps the read within s */
   fscanf(stream, "%ld", &l);
   fscanf(stream, "%c", &c);
   fscanf(stream, "%f", &fp);
   fclose(stream);

   printf("string = %s\n", s);
   printf("long integer = %ld\n", l);
   printf("char = %c\n", c);
   printf("float = %f\n", fp);
   return 0;
}

Output

If myfile.dat contains abcdefghijklmnopqrstuvwxyz 343.2, then the expected output is:

string = abcdefghijklmnopqrstuvwxyz
long integer = 343
char = .
float = 2.000000

CELEBS32

/* CELEBS32
   This example uses sscanf() to read various data from the string
   tokenstring, and then displays the data.
 */
#include <stdio.h>
#define  SIZE  81

int main(void)
{
   char *tokenstring = "15 12 14";
   int i;
   float fp;
   char s[SIZE];
   char c;

   /* Input various data                              */
   printf("No. of conversions=%d\n",
      sscanf(tokenstring, "%s %c%d%f", s, &c, &i, &fp));

   /* If there were no space between %s and %c,       */
   /* sscanf would read the first character following */
   /* the string, which is a blank space.             */

   /* Display the data */
   printf("string = %s\n",s);
   printf("character = %c\n",c);
   printf("integer = %d\n",i);
   printf("floating-point number = %f\n",fp);
}

Output

You would see this output from example CELEBS32.

No. of conversions=4
string = 15
character = 1
integer = 2
floating-point number = 14.000000

Related information

See the topic about internationalization of locales and character sets in z/OS XL C/C++ Programming Guide.
locale.h
stdio.h
fprintf(), printf(), sprintf() — Format and write data
__isBFP() — Determine application floating-point format
localtime(), localtime64() — Convert time and correct for local time
setlocale() — Set locale