Name
scanf, fscanf, sscanf, vscanf, vfscanf, vsscanf - convert
formatted input
Synopsis
#include <stdio.h>
int scanf(const char *restrict format...);
int fscanf(FILE *restrict stream, const char *restrict format...);
int sscanf(const char *restrict s, const char *restrict format...);
#include <stdarg.h>
#include <stdio.h>
int vscanf(const char *format, va_list arg);
int vfscanf(FILE *stream, const char *format, va_list arg);
int vsscanf(const char *s, const char *format, va_list arg);
Description
The scanf() function reads from the standard input stream
stdin.
The fscanf() function reads from the named input stream.
The sscanf() function reads from the string s.
The vscanf(), vfscanf(), and vsscanf() functions are
equivalent to the scanf(), fscanf(), and sscanf() functions,
respectively, except that instead of being called with a
variable number of arguments, they are called with an argu-
ment list as defined by the <stdarg.h> header . These func-
tions do not invoke the va_end() macro. Applications using
these functions should call va_end(ap) afterwards to clean
up.
Each function reads bytes, interprets them according to a
format, and stores the results in its arguments. Each
expects, as arguments, a control string format described
below, and a set of pointer arguments indicating where the
converted input should be stored. The result is undefined if
there are insufficient arguments for the format. If the for-
mat is exhausted while arguments remain, the excess argu-
ments are evaluated but are otherwise ignored.
Conversions can be applied to the nth argument after the
format in the argument list, rather than to the next unused
argument. In this case, the conversion character % (see
below) is replaced by the sequence %n$, where n is a decimal
integer in the range [1, NL_ARGMAX]. This feature provides
for the definition of format strings that select arguments
in an order appropriate to specific languages. In format
strings containing the %n$ form of conversion specifica-
tions, it is unspecified whether numbered arguments in the
argument list can be referenced from the format string more
than once.
The format can contain either form of a conversion specifi-
cation, that is, % or %n$, but the two forms cannot normally
be mixed within a single format string. The only exception
to this is that %% or %* can be mixed with the %n$ form.
The scanf() function in all its forms allows for detection
of a language-dependent radix character in the input string.
The radix character is defined in the program's locale
(category LC_NUMERIC). In the POSIX locale, or in a locale
where the radix character is not defined, the radix charac-
ter defaults to a period (.).
The format is a character string, beginning and ending in
its initial shift state, if any, composed of zero or more
directives. Each directive is composed of one of the follow-
ing:
o one or more white-space characters (space, tab,
newline, vertical-tab or form-feed characters);
o an ordinary character (neither % nor a white-space
character); or
o a conversion specification.
Conversion Specifications
Each conversion specification is introduced by the character
% or the character sequence %n$, after which the following
appear in sequence:
o An optional assignment-suppressing character *.
o An optional non-zero decimal integer that specifies
the maximum field width.
o An option length modifier that specifies the size
of the receiving object.
o A conversion specifier character that specifies the
type of conversion to be applied. The valid conver-
sion characters are described below.
The scanf() functions execute each directive of the format
in turn. If a directive fails, as detailed below, the func-
tion returns. Failures are described as input failures (due
to the unavailability of input bytes) or matching failures
(due to inappropriate input).
A directive composed of one or more white-space characters
is executed by reading input until no more valid input can
be read, or up to the first byte which is not a white-space
character which remains unread.
A directive that is an ordinary character is executed as
follows. The next byte is read from the input and compared
with the byte that comprises the directive; if the com-
parison shows that they are not equivalent, the directive
fails, and the differing and subsequent bytes remain unread.
A directive that is a conversion specification defines a set
of matching input sequences, as described below for each
conversion character. A conversion specification is executed
in the following steps:
Input white-space characters (as specified by isspace(3C))
are skipped, unless the conversion specification includes a
[, c, C, or n conversion character.
An item is read from the input unless the conversion specif-
ication includes an n conversion character. The length of
the item read is limited to any specified maximum field
width, which is interpreted in either characters or bytes
depending on the conversion character. In Solaris default
mode, the input item is defined as the longest sequence of
input bytes that forms a matching sequence. In some cases,
scanf() might need to read several extra characters beyond
the end of the input item to find the end of a matching
sequence. In C99/SUSv3 mode, the input item is defined as
the longest sequence of input bytes that is, or is a prefix
of, a matching sequence. With this definition, scanf() need
only read at most one character beyond the end of the input
item. Therefore, in C99/SUSv3 mode, some sequences that are
acceptable to strtod(3C), strtol(3C), and similar functions
are unacceptable to scanf(). In either mode, scanf()
attempts to push back any excess bytes read using
ungetc(3C). Assuming all such attempts succeed, the first
byte, if any, after the input item remains unread. If the
length of the input item is 0, the conversion fails. This
condition is a matching failure unless end-of-file, an
encoding error, or a read error prevented input from the
stream, in which case it is an input failure.
Except in the case of a % conversion character, the input
item (or, in the case of a %n conversion specification, the
count of input bytes) is converted to a type appropriate to
the conversion character. If the input item is not a match-
ing sequence, the execution of the conversion specification
fails; this condition is a matching failure. Unless assign-
ment suppression was indicated by a *, the result of the
conversion is placed in the object pointed to by the first
argument following the format argument that has not already
received a conversion result if the conversion specification
is introduced by %, or in the nth argument if introduced by
the character sequence %n$. If this object does not have an
appropriate type, or if the result of the conversion cannot
be represented in the space provided, the behavior is unde-
fined.
Length Modifiers
The length modifiers and their meanings are:
hh
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to signed char or
unsigned char.
h
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to short or
unsigned short.
l (ell)
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to long or
unsigned long; that a following a, A, e, E,
f, F, g, or G conversion specifier applies
to an argument with type pointer to double;
or that a following c, s, or [ conversion
specifier applies to an argument with type
pointer to wchar_t.
ll (ell-ell)
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to long long or
unsigned long long.
j
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to intmax_t or
uintmax_t.
z
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to size_t or the
corresponding signed integer type.
t
Specifies that a following d, i, o, u, x, X,
or n conversion specifier applies to an
argument with type pointer to ptrdiff_t or
the corresponding unsigned type.
L
Specifies that a following a, A, e, E, f, F,
g, or G conversion specifier applies to an
argument with type pointer to long double.
If a length modifier appears with any conversion specifier
other than as specified above, the behavior is undefined.
Conversion Characters
The following conversion characters are valid:
d
Matches an optionally signed decimal integer,
whose format is the same as expected for the sub-
ject sequence of strtol(3C) with the value 10 for
the base argument. In the absence of a size
modifier, the corresponding argument must be a
pointer to int.
i
Matches an optionally signed integer, whose for-
mat is the same as expected for the subject
sequence of strtol() with 0 for the base argu-
ment. In the absence of a size modifier, the
corresponding argument must be a pointer to int.
o
Matches an optionally signed octal integer, whose
format is the same as expected for the subject
sequence of strtoul(3C) with the value 8 for the
base argument. In the absence of a size modifier,
the corresponding argument must be a pointer to
unsigned int.
u
Matches an optionally signed decimal integer,
whose format is the same as expected for the sub-
ject sequence of strtoul() with the value 10 for
the base argument. In the absence of a size
modifier, the corresponding argument must be a
pointer to unsigned int.
x
Matches an optionally signed hexadecimal integer,
whose format is the same as expected for the sub-
ject sequence of strtoul() with the value 16 for
the base argument. In the absence of a size
modifier, the corresponding argument must be a
pointer to unsigned int.
a,e,f,g
Matches an optionally signed floating-point
number, infinity, or NaN, whose format is the
same as expected for the subject sequence of
strtod(3C). In the absence of a size modifier,
the corresponding argument must be a pointer to
float. The e, f, and g specifiers match hexade-
cimal floating point values only in C99/SUSv3
(see standards(5)) mode, but the a specifier
always matches hexadecimal floating point values.
These conversion specifiers match any subject
sequence accepted by strtod(3C), including the
INF, INFINITY, NAN, and NAN(n-char-sequence)
forms. The result of the conversion is the same
as that of calling strtod() (or strtof() or
strtold()) with the matching sequence, including
the raising of floating point exceptions and the
setting of errno to ERANGE, if applicable.
s
Matches a sequence of bytes that are not white-
space characters. The corresponding argument must
be a pointer to the initial byte of an array of
char, signed char, or unsigned char large enough
to accept the sequence and a terminating null
character code, which will be added automati-
cally.
If an l (ell) qualifier is present, the input is
a sequence of characters that begins in the ini-
tial shift state. Each character is converted to
a wide-character as if by a call to the
mbrtowc(3C) function, with the conversion state
described by an mbstate_t object initialized to
zero before the first character is converted.
The corresponding argument must be a pointer to
an array of wchar_t large enough to accept the
sequence and the terminating null wide-
character, which will be added automatically.
[
Matches a non-empty sequence of characters from a
set of expected characters (the scanset). The
normal skip over white-space characters is
suppressed in this case. The corresponding argu-
ment must be a pointer to the initial byte of an
array of char, signed char, or unsigned char
large enough to accept the sequence and a ter-
minating null byte, which will be added automati-
cally.
If an l (ell) qualifier is present, the input is
a sequence of characters that begins in the ini-
tial shift state. Each character in the sequence
is converted to a wide-character as if by a call
to the mbrtowc() function, with the conversion
state described by an mbstate_t object initial-
ized to zero before the first character is con-
verted. The corresponding argument must be a
pointer to an array of wchar_t large enough to
accept the sequence and the terminating null
wide-character, which will be added automati-
cally.
The conversion specification includes all subse-
quent characters in the format string up to and
including the matching right square bracket (]).
The characters between the square brackets (the
scanlist) comprise the scanset, unless the char-
acter after the left square bracket is a circum-
flex (^), in which case the scanset contains all
characters that do not appear in the scanlist
between the circumflex and the right square
bracket. If the conversion specification begins
with [] or [^], the right square bracket is
included in the scanlist and the next right
square bracket is the matching right square
bracket that ends the conversion specification;
otherwise the first right square bracket is the
one that ends the conversion specification. If a
- is in the scanlist and is not the first charac-
ter, nor the second where the first character is
a ^, nor the last character, it indicates a range
of characters to be matched.
c
Matches a sequence of characters of the number
specified by the field width (1 if no field width
is present in the conversion specification). The
corresponding argument must be a pointer to the
initial byte of an array of char, signed char, or
unsigned char large enough to accept the
sequence. No null byte is added. The normal skip
over white-space characters is suppressed in this
case.
If an l (ell) qualifier is present, the input is
a sequence of characters that begins in the ini-
tial shift state. Each character in the sequence
is converted to a wide-character as if by a call
to the mbrtowc() function, with the conversion
state described by an mbstate_t object initial-
ized to zero before the first character is con-
verted. The corresponding argument must be a
pointer to an array of wchar_t large enough to
accept the resulting sequence of wide-characters.
No null wide-character is added.
p
Matches the set of sequences that is the same as
the set of sequences that is produced by the %p
conversion of the corresponding printf(3C) func-
tions. The corresponding argument must be a
pointer to a pointer to void. If the input item
is a value converted earlier during the same pro-
gram execution, the pointer that results will
compare equal to that value; otherwise the
behavior of the %p conversion is undefined.
n
No input is consumed. The corresponding argument
must be a pointer to the integer into which is to
be written the number of bytes read from the
input so far by this call to the scanf()
functions. Execution of a %n conversion specifi-
cation does not increment the assignment count
returned at the completion of execution of the
function.
C
Same as lc.
S
Same as ls.
%
Matches a single %; no conversion or assignment
occurs. The complete conversion specification
must be %%.
If a conversion specification is invalid, the behavior is
undefined.
The conversion characters A, E, F, G, and X are also valid
and behave the same as, respectively, a, e, f, g, and x.
If end-of-file is encountered during input, conversion is
terminated. If end-of-file occurs before any bytes matching
the current conversion specification (except for %n) have
been read (other than leading white-space characters, where
permitted), execution of the current conversion specifica-
tion terminates with an input failure. Otherwise, unless
execution of the current conversion specification is ter-
minated with a matching failure, execution of the following
conversion specification (if any) is terminated with an
input failure.
Reaching the end of the string in sscanf() is equivalent to
encountering end-of-file for fscanf().
If conversion terminates on a conflicting input, the offend-
ing input is left unread in the input. Any trailing white
space (including newline characters) is left unread unless
matched by a conversion specification. The success of
literal matches and suppressed assignments is only directly
determinable via the %n conversion specification.
The fscanf() and scanf() functions may mark the st_atime
field of the file associated with stream for update. The
st_atime field will be marked for update by the first suc-
cessful execution of fgetc(3C), fgets(3C), fread(3C),
fscanf(), getc(3C), getdelim(3C), getline(3C), getchar(3C),
gets(3C), or scanf() using stream that returns data not sup-
plied by a prior call to ungetc(3C).
Return Values
Upon successful completion, these functions return the
number of successfully matched and assigned input items;
this number can be 0 in the event of an early matching
failure. If the input ends before the first matching
failure or conversion, EOF is returned. If a read error
occurs the error indicator for the stream is set, EOF is
returned, and errno is set to indicate the error.
Errors
For the conditions under which the scanf() functions will
fail and may fail, refer to fgetc(3C) or fgetwc(3C).
In addition, fscanf() may fail if:
EILSEQ
Input byte sequence does not form a valid charac-
ter.
EINVAL
There are insufficient arguments.
Usage
If the application calling the scanf() functions has any
objects of type wint_t or wchar_t, it must also include the
header <wchar.h> to have these objects defined.
Examples
Example 1 The call:
int i, n; float x; char name[50];
n = scanf("%d%f%49s", &i, &x, name);
with the input line:
25 54.32E-1 Hamster
will assign to n the value 3, to i the value 25, to x the
value 5.432, and name will contain the string Hamster.
Example 2 The call:
int i; float x; char name[50];
(void) scanf("%2d%f%*d %49[0123456789]", &i, &x, name);
with input:
56789 0123 56a72
will assign 56 to i, 789.0 to x, skip 0123, and place the
string 56\0 in name. The next call to getchar(3C) will
return the character a.
Attributes
See attributes(5) for descriptions of the following attri-
butes:
tab() box; lw(2.75i) |lw(2.75i) lw(2.75i) |lw(2.75i) ATTRI-
BUTE TYPEATTRIBUTE VALUE _ CSIEnabled _ Interface
StabilityCommitted _ MT-LevelMT-Safe _ StandardSee stan-
dards(5).
See Also
fgetc(3C), fgets(3C), fgetwc(3C), fread(3C), getdelim(3C),
getline(3C), isspace(3C), printf(3C), setlocale(3C),
strtod(3C), strtol(3C), strtoul(3C), wcrtomb(3C),
ungetc(3C), attributes(5), standards(5)
Warnings
The use of %s or %[ without length modifiers will result in
buffer overflows if the input string contains a buffer
longer than the memory allocation provided by the user.
Notes
The behavior of the conversion specifier %% has changed for
all of the functions described on this manual page. Previ-
ously the %% specifier accepted a % character from input
only if there were no preceding white space characters. The
new behavior accepts % even if there are preceding white
space characters. This new behavior now aligns with the
description on this manual page and in various standards. If
the old behavior is desired, the conversion specification
%*[%] can be used.