Basic Types
Integer Types
C supports two fundamentally different kinds of numeric types: integer types
and floating types. Values of an integer type are whole numbers, while values
of a floating type can have a fractional part as well. The integer types, in
turn, are divided into two categories: signed and unsigned.
C's integer types come in different sizes. The int type is usually 32 bits, but
may be 16 bits on older CPUs. Since some programs require numbers that are too
large to store in int form, C also provides long integers. At times, we may
need to conserve memory by instructing the compiler to store a number in less
space than normal; such a number is called a short integer.
To construct an integer type that exactly meets our needs, we can specify that
a variable is long or short, signed or unsigned. We can even combine specifiers
(e.g., long unsigned int). However, only the following six combinations
actually produce different types:
short int
unsigned short int
int
unsigned int
long int
unsigned long int
Other combinations are synonyms for one of these six types. (For example, long
signed int is the same as long int, since integers are always signed unless
otherwise specified.) Incidentally, the order of the specifiers doesn't matter;
unsigned short int is the same as short unsigned int.
C allows us to abbreviate the names of integer types by dropping the word int.
For example, unsigned short int may be abbreviated to unsigned short, and long
int may be abbreviated to just long. Omitting int is a widespread practice
among C programmers, and some newer C-based languages (including Java) actually
require the programmer to write short or long rather than short int or long
int. For these reasons, I'll often omit the word int when it's not strictly
necessary.
The range of values represented by each of the six integer types varies from
one machine to another. However, there are a couple of rules that all compilers
must obey. First, the C standard requires that short int, int, and long int
each cover a certain minimum range of values. Second, the standard requires
that int not be shorter than short int, and long int not be shorter than int.
However, it's possible that short int represents the same range of values as
int; also, int may have the same range as long int.
The following table shows the usual range of values for the integer types on a
16-bit machine; note that short int and int have identical ranges.
Type               | Smallest Value | Largest Value
short int          | -32,768        | 32,767
unsigned short int | 0              | 65,535
int                | -32,768        | 32,767
unsigned int       | 0              | 65,535
long int           | -2,147,483,648 | 2,147,483,647
unsigned long int  | 0              | 4,294,967,295
The following table shows the usual ranges on a 32-bit machine; here int and
long int have identical ranges.
Type               | Smallest Value | Largest Value
short int          | -32,768        | 32,767
unsigned short int | 0              | 65,535
int                | -2,147,483,648 | 2,147,483,647
unsigned int       | 0              | 4,294,967,295
long int           | -2,147,483,648 | 2,147,483,647
unsigned long int  | 0              | 4,294,967,295
In recent years, 64-bit CPUs have become more common. Table 7.3 shows typical
ranges for the integer types on a 64-bit machine (especially under UNIX).
Type               | Smallest Value             | Largest Value
short int          | -32,768                    | 32,767
unsigned short int | 0                          | 65,535
int                | -2,147,483,648             | 2,147,483,647
unsigned int       | 0                          | 4,294,967,295
long int           | -9,223,372,036,854,775,808 | 9,223,372,036,854,775,807
unsigned long int  | 0                          | 18,446,744,073,709,551,615
Integer Constants
Let's turn our attention to constants: numbers that appear in the text of a
program, not numbers that are read, written, or computed. C allows integer
constants to be written in decimal (base 10), octal (base 8), or hexadecimal
(base 16).
Octal and Hexadecimal Numbers
An octal number is written using only the digits 0 through 7. Each position in
an octal number represents a power of 8 (just as each position in a decimal
number represents a power of 10). Thus, the octal number 237 represents the
decimal number 2 × 8² + 3 × 8¹ + 7 × 8⁰ = 128 + 24 + 7 = 159.
A hexadecimal (or hex) number is written using the digits 0 through 9 plus the
letters A through F, which stand for 10 through 15, respectively. Each position
in a hex number represents a power of 16; the hex number 1AF has the decimal
value 1 × 16² + 10 × 16¹ + 15 × 16⁰ = 256 + 160 + 15 = 431.
Decimal constants contain digits between 0 and 9, but must not begin with a
zero:
15 255 32767
Octal constants contain only digits between 0 and 7, and must begin with a
zero:
017 0377 077777
Hexadecimal constants contain digits between 0 and 9 and letters between a and
f, and always begin with 0x:
0xf 0xff 0x7fff
The letters in a hexadecimal constant may be either upper or lower case:
0xff 0xfF 0xFf 0xFF 0Xff 0XfF 0XFf 0XFF
Keep in mind that octal and hexadecimal are nothing more than an alternative
way of writing numbers; they have no effect on how the numbers are actually
stored. (Integers are always stored in binary, regardless of what notation
we've used to express them.) We can switch from one notation to another at any
time, and even mix them: 10 + 015 + 0x20 has the value 55 (decimal). Octal and
hex are most convenient for writing low-level programs.
The type of a decimal integer constant is normally int. However, if the value
of the constant is too large to store as an int, the constant has type long int
instead. In the unlikely case that the constant is too large to store as a long
int, the compiler will try unsigned long int as a last resort. The rules for
determining the type of an octal or hexadecimal constant are slightly
different: the compiler will go through the types int, unsigned int, long int,
and unsigned long int until it finds one capable of representing the constant.
To force the compiler to treat a constant as a long integer, just follow it
with the letter L (or l):
15L 0377L 0x7fffL
To indicate that a constant is unsigned, put the letter U (or u) after it:
15U 0377U 0x7fffU
L and U may be used in combination to show that a constant is both long and
unsigned: 0xffffffffUL. (The order of the L and U doesn't matter, nor does
their case.)
Integer Constants in C99
In C99, integer constants that end with either LL or ll (the case of the two
letters must match) have type long long int. Adding the letter U (or u) before
or after the LL or ll denotes a constant of type unsigned long long int.
C99's general rules for determining the type of an integer constant are a bit
different from those in C89. The type of a decimal constant with no suffix (U,
u, L, l, LL, or ll) is the "smallest" of the types int, long int, or long long
int that can represent the value of that constant. For an octal or hexadecimal
constant, however, the list of possible types is int, unsigned int, long int,
unsigned long int, long long int, and unsigned long long int, in that order.
Any suffix at the end of a constant changes the list of possible types. For
example, a constant that ends with U (or u) must have one of the types unsigned
int, unsigned long int, or unsigned long long int. A decimal constant that ends
with L (or l) must have one of the types long int or long long int. There's
also a provision for a constant to have an extended integer type if it's too
large to represent using one of the standard integer types.
Integer Overflow
When arithmetic operations are performed on integers, it's
possible that the result will be too large to represent. For example, when an
arithmetic operation is performed on two int values, the result must be able
to be represented as an int. If the result can't be represented as an int
(because it requires too many bits), we say that overflow has
occurred.
The behavior when integer overflow occurs depends on whether the operands were
signed or unsigned. When overflow occurs during an operation on signed
integers, the program's behavior is undefined. Most likely the result of the
operation will simply be wrong, but the program could crash or exhibit other
undesirable behavior.
When overflow occurs during an operation on unsigned integers, though, the
result is defined: we get the correct answer modulo 2ⁿ, where n is the number
of bits used to store the result. For example, if we add 1 to the unsigned
16-bit number 65,535, the result is guaranteed to be 0.
Reading and Writing Integers
Suppose that a program isn't working because one of its int variables is
overflowing. Our first thought is to change the type of the variable from int
to long int. But we're not done yet; we need to see how the change will affect
the rest of the program. In particular, we must check whether the variable is
used in a call of printf or scanf. If so, the format string in the call will
need to be changed, since the %d conversion works only for the int type.
Reading and writing unsigned, short, and long integers requires several new
conversion specifiers:
When reading or writing an unsigned integer, use the letter u, o, or x instead
of d in the conversion specification. If the u specifier is present, the number
is read (or written) in decimal notation; o indicates octal notation, and x
indicates hexadecimal notation.
unsigned int u;

scanf("%u", &u);     /* reads u in base 10 */
printf("%u", u);     /* writes u in base 10 */
scanf("%o", &u);     /* reads u in base 8 */
printf("%o", u);     /* writes u in base 8 */
Floating Types
The integer types aren't suitable for all applications. Sometimes we'll need
variables that can store numbers with digits after the decimal point, or
numbers that are exceedingly large or small. Numbers like these are stored in
floating-point format (so called because the decimal point "floats"). C
provides three floating types, corresponding to different floating-point
formats:
float         Single-precision floating-point
double        Double-precision floating-point
long double   Extended-precision floating-point
float is suitable when the amount of precision isn't critical (calculating
temperatures to one decimal point, for example). double provides greater
precision, enough for most programs. long double, which supplies the ultimate
in precision, is rarely used.
The C standard doesn't state how much precision the float, double, and long
double types provide, since different computers may store floating-point
numbers in different ways. Most modern computers follow the specifications in
IEEE Standard 754 (also known as IEC 60559).
Floating Constants
Floating constants can be written in a variety of ways. The following
constants, for example, are all valid ways of writing the number 57.0:
57.0 57. 57.0e0 57E0 5.7e1 5.7e+1 .57e2 570.e-1
A floating constant must contain a decimal point and/or an exponent; the
exponent indicates the power of 10 by which the number is to be scaled. If an
exponent is present, it must be preceded by the letter E (or e). An optional +
or - sign may appear after the E (or e).
By default, floating constants are stored as double-precision numbers. In other
words, when a C compiler finds the constant 57.0 in a program, it arranges for
the number to be stored in memory in the same format as a double variable. This
rule generally causes no problems, since double values are converted
automatically to float when necessary.
On occasion, it may be necessary to force the compiler to store a floating
constant in float or long double format. To indicate that only single precision
is desired, put the letter F (or f) at the end of the constant (for example,
57.0F). To indicate that a constant should be stored in long double format, put
the letter L (or l) at the end (57.0L).
C99 has a provision for writing floating constants in hexadecimal. Such a
constant begins with 0x or 0X (like a hexadecimal integer constant). This
feature is rarely used.
Character Types
The only remaining basic type is char, the
character type. The values of type char can vary from one computer to another,
because different machines may have different underlying character sets.
Character Sets
Today's most popular character set is ASCII (American Standard Code for
Information Interchange), a 7-bit code capable of representing 128 characters.
In ASCII, the digits 0 to 9 are represented by the codes 0110000-0111001, and
the uppercase letters A to Z are represented by 1000001-1011010. ASCII is often
extended to a 256-character code known as Latin-1 that provides the characters
necessary for Western European and many African languages.
Operations on Characters
Working with characters in C is simple, because of one fact: C treats
characters as small integers. After all, characters are encoded in binary, and
it doesn't take much imagination to view these binary codes as integers. In
ASCII, for example, character codes range from 0000000 to 1111111, which we can
think of as the integers from 0 to 127. The character 'a' has the value 97, 'A'
has the value 65, '0' has the value 48, and ' ' has the value 32. The
connection between characters and integers in C is so strong that character
constants actually have int type rather than char type (an interesting fact,
but not one that will often matter to us). When a character appears in a
computation, C simply uses its integer value.
Signed and Unsigned Characters
Since C allows characters to be used as integers, it shouldn't be surprising
that the char type, like the integer types, exists in both signed and unsigned
versions. Signed characters normally have values between -128 and 127, while
unsigned characters have values between 0 and 255.
The C standard doesn't specify whether ordinary char is a signed or an unsigned
type; some compilers treat it as a signed type, while others treat it as an
unsigned type. (Some even allow the programmer to select, via a compiler
option, whether char should be signed or unsigned.)
Most of the time, we don't really care whether char is signed or unsigned. Once
in a while, though, we do, especially if we're using a character variable to
store a small integer. For this reason, C allows the use of the words signed
and unsigned to modify char:
signed char sch;
unsigned char uch;
Don't assume that char is either signed or unsigned by default. If it matters,
use signed char or unsigned char instead of char.
In light of the close relationship between characters and
integers, C89 uses the term integral types to refer to both the integer
types and the character types. Enumerated types are also integral types.
C99 doesn't use the term "integral types." Instead, it expands the meaning of
"integer types" to include the character types and the enumerated types. C99's
_Bool type is considered to be an unsigned integer type.
Arithmetic Types
The integer types and floating types are collectively known as arithmetic
types. Here's a summary of the arithmetic types in C89, divided into categories
and subcategories:
Integral types
  char
  Signed integer types (signed char, short int, int, long int)
  Unsigned integer types (unsigned char, unsigned short int, unsigned int,
  unsigned long int)
  Enumerated types
Floating types (float, double, long double)
Escape Sequences
A character constant is usually one character enclosed in single quotes, as
we've seen in previous examples. However, certain special characters (including
the new-line character) can't be written in this way, because they're invisible
(nonprinting) or because they can't be entered from the keyboard. So that
programs can deal with every character in the underlying character set, C
provides a special notation, the escape sequence.
There are two kinds of escape sequences: character escapes and numeric escapes.
The character escapes are:
Name            | Escape Sequence
Alert (bell)    | \a
Backspace       | \b
Form feed       | \f
New line        | \n
Carriage return | \r
Horizontal tab  | \t
Vertical tab    | \v
Backslash       | \\
Question mark   | \?
Single quote    | \'
Double quote    | \"
Reading and Writing Characters using scanf and printf
The %c conversion specification allows scanf and printf to read and write
single characters:
char ch;

scanf("%c", &ch);    /* reads a single character */
printf("%c", ch);    /* writes a single character */
scanf doesn't skip white-space characters before reading a character. If the
next unread character is a space, then the variable ch in the previous example
will contain a space after scanf returns. To force scanf to skip white space
before reading a character, put a space in its format string just before %c:
scanf(" %c", &ch);   /* skips white space, then reads ch */
Since scanf doesn't normally skip white space, it's easy to detect the end of
an input line: check to see if the character just read is the new-line
character.