ZeePedia

The C in C++:Creating functions, Controlling execution, Introduction to operators

<< Making & Using Objects:Tools for separate compilation, Reading and writing files
Data Abstraction:The basic object, Abstract data typing, Header file etiquette >>
img
3: The C in C++
Since C++ is based on C, you must be familiar with the
syntax of C in order to program in C++, just as you
must be reasonably fluent in algebra in order to tackle
calculus.
121
img
If you've never seen C before, this chapter will give you a decent
background in the style of C used in C++. If you are familiar with
the style of C described in the first edition of Kernighan & Ritchie
(often called K&R C), you will find some new and different features
in C++ as well as in Standard C. If you are familiar with Standard
C, you should skim through this chapter looking for features that
are particular to C++. Note that there are some fundamental C++
features introduced here, which are basic ideas that are akin to the
features in C or often modifications to the way that C does things.
The more sophisticated C++ features will not be introduced until
later chapters.
This chapter is a fairly fast coverage of C constructs and
introduction to some basic C++ constructs, with the understanding
that you've had some experience programming in another
language. A more gentle introduction to C is found in the CD ROM
packaged in the back of this book, titled Thinking in C: Foundations
for Java & C++ by Chuck Allison (published by MindView, Inc., and
also available at www.MindView.net). This is a seminar on a CD
ROM with the goal of taking you carefully through the
fundamentals of the C language. It focuses on the knowledge
necessary for you to be able to move on to the C++ or Java
languages rather than trying to make you an expert in all the dark
corners of C (one of the reasons for using a higher-level language
like C++ or Java is precisely so we can avoid many of these dark
corners). It also contains exercises and guided solutions. Keep in
mind that because this chapter goes beyond the Thinking in C CD,
the CD is not a replacement for this chapter, but should be used
instead as a preparation for this chapter and for the book.
Creating functions
In old (pre-Standard) C, you could call a function with any number
or type of arguments and the compiler wouldn't complain.
122
Thinking in C++
img
Everything seemed fine until you ran the program. You got
mysterious results (or worse, the program crashed) with no hints as
to why. The lack of help with argument passing and the enigmatic
bugs that resulted is probably one reason why C was dubbed a
"high-level assembly language." Pre-Standard C programmers just
adapted to it.
Standard C and C++ use a feature called function prototyping. With
function prototyping, you must use a description of the types of
arguments when declaring and defining a function. This
description is the "prototype." When the function is called, the
compiler uses the prototype to ensure that the proper arguments
are passed in and that the return value is treated correctly. If the
programmer makes a mistake when calling the function, the
compiler catches the mistake.
Essentially, you learned about function prototyping (without
naming it as such) in the previous chapter, since the form of
function declaration in C++ requires proper prototyping. In a
function prototype, the argument list contains the types of
arguments that must be passed to the function and (optionally for
the declaration) identifiers for the arguments. The order and type of
the arguments must match in the declaration, definition, and
function call. Here's an example of a function prototype in a
declaration:
int translate(float x, float y, float z);
You do not use the same form when declaring variables in function
prototypes as you do in ordinary variable definitions. That is, you
cannot say: float x, y, z You must indicate the type of each
.
argument. In a function declaration, the following form is also
acceptable:
int translate(float, float, float);
3: The C in C++
123
img
Since the compiler doesn't do anything but check for types when
the function is called, the identifiers are only included for clarity
when someone is reading the code.
In the function definition, names are required because the
arguments are referenced inside the function:
int translate(float x, float y, float z) {
x = y = z;
// ...
}
It turns out this rule applies only to C. In C++, an argument may be
unnamed in the argument list of the function definition. Since it is
unnamed, you cannot use it in the function body, of course.
Unnamed arguments are allowed to give the programmer a way to
"reserve space in the argument list." Whoever uses the function
must still call the function with the proper arguments. However,
the person creating the function can then use the argument in the
future without forcing modification of code that calls the function.
This option of ignoring an argument in the list is also possible if
you leave the name in, but you will get an annoying warning
message about the value being unused every time you compile the
function. The warning is eliminated if you remove the name.
C and C++ have two other ways to declare an argument list. If you
have an empty argument list, you can declare it as func( ) in C++,
which tells the compiler there are exactly zero arguments. You
should be aware that this only means an empty argument list in
C++. In C it means "an indeterminate number of arguments (which
is a "hole" in C since it disables type checking in that case). In both
C and C++, the declaration func(void);means an empty argument
list. The void keyword means "nothing" in this case (it can also
mean "no type" in the case of pointers, as you'll see later in this
chapter).
The other option for argument lists occurs when you don't know
how many arguments or what type of arguments you will have;
124
Thinking in C++
img
this is called a variable argument list. This "uncertain argument list"
is represented by ellipses (...). Defining a function with a variable
argument list is significantly more complicated than defining a
regular function. You can use a variable argument list for a function
that has a fixed set of arguments if (for some reason) you want to
disable the error checks of function prototyping. Because of this,
you should restrict your use of variable argument lists to C and
avoid them in C++ (in which, as you'll learn, there are much better
alternatives). Handling variable argument lists is described in the
library section of your local C guide.
Function return values
A C++ function prototype must specify the return value type of the
function (in C, if you leave off the return value type it defaults to
int). The return type specification precedes the function name. To
specify that no value is returned, use the void keyword. This will
generate an error if you try to return a value from the function.
Here are some complete function prototypes:
int f1(void); // Returns an int, takes no arguments
int f2(); // Like f1() in C++ but not in Standard C!
float f3(float, int, char, double); // Returns a float
void f4(void); // Takes no arguments, returns nothing
To return a value from a function, you use the return statement.
return exits the function back to the point right after the function
call. If return has an argument, that argument becomes the return
value of the function. If a function says that it will return a
particular type, then each return statement must return that type.
You can have more than one return statement in a function
definition:
//: C03:Return.cpp
// Use of "return"
#include <iostream>
using namespace std;
char cfunc(int i) {
3: The C in C++
125
img
if(i == 0)
return 'a';
if(i == 1)
return 'g';
if(i == 5)
return 'z';
return 'c';
}
int main() {
cout << "type an integer: ";
int val;
cin >> val;
cout << cfunc(val) << endl;
} ///:~
In cfunc( ) the first if that evaluates to true exits the function via
,
the return statement. Notice that a function declaration is not
necessary because the function definition appears before it is used
in main( ), so the compiler knows about it from that function
definition.
Using the C function library
All the functions in your local C function library are available while
you are programming in C++. You should look hard at the function
library before defining your own function ­ there's a good chance
that someone has already solved your problem for you, and
probably given it a lot more thought and debugging.
A word of caution, though: many compilers include a lot of extra
functions that make life even easier and are tempting to use, but are
not part of the Standard C library. If you are certain you will never
want to move the application to another platform (and who is
certain of that?), go ahead ­use those functions and make your life
easier. If you want your application to be portable, you should
restrict yourself to Standard library functions. If you must perform
platform-specific activities, try to isolate that code in one spot so it
can be changed easily when porting to another platform. In C++,
126
Thinking in C++
img
platform-specific activities are often encapsulated in a class, which
is the ideal solution.
The formula for using a library function is as follows: first, find the
function in your programming reference (many programming
references will index the function by category as well as
alphabetically). The description of the function should include a
section that demonstrates the syntax of the code. The top of this
section usually has at least one #includeline, showing you the
header file containing the function prototype. Duplicate this
#includeline in your file so the function is properly declared. Now
you can call the function in the same way it appears in the syntax
section. If you make a mistake, the compiler will discover it by
comparing your function call to the function prototype in the
header and tell you about your error. The linker searches the
Standard library by default, so that's all you need to do: include the
header file and call the function.
Creating your own libraries with the librarian
You can collect your own functions together into a library. Most
programming packages come with a librarian that manages groups
of object modules. Each librarian has its own commands, but the
general idea is this: if you want to create a library, make a header
file containing the function prototypes for all the functions in your
library. Put this header file somewhere in the preprocessor's search
path, either in the local directory (so it can be found by #include
"header" or in the include directory (so it can be found by
)
#include <header> Now take all the object modules and hand
).
them to the librarian along with a name for the finished library
(most librarians require a common extension, such as .lib or .a).
Place the finished library where the other libraries reside so the
linker can find it. When you use your library, you will have to add
something to the command line so the linker knows to search the
library for the functions you call. You must find all the details in
your local manual, since they vary from system to system.
3: The C in C++
127
img
Controlling execution
This section covers the execution control statements in C++. You
must be familiar with these statements before you can read and
write C or C++ code.
C++ uses all of C's execution control statements. These include if-
else, while, do-while for, and a selection statement called switch.
,
C++ also allows the infamous goto, which will be avoided in this
book.
True and false
All conditional statements use the truth or falsehood of a
conditional expression to determine the execution path. An
example of a conditional expression is A == B. This uses the
conditional operator == to see if the variable A is equivalent to the
variable B. The expression produces a Boolean true or false (these
are keywords only in C++; in C an expression is "true" if it
evaluates to a nonzero value). Other conditional operators are >, <,
>=, etc. Conditional statements are covered more fully later in this
chapter.
if-else
The if-else statement can exist in two forms: with or without the
else. The two forms are:
if(expression)
statement
or
if(expression)
statement
else
statement
The "expression" evaluates to true or false. The "statement" means
either a simple statement terminated by a semicolon or a
128
Thinking in C++
img
.
compound statement, which is a group of simple statements
enclosed in braces. Any time the word "statement" is used, it
always implies that the statement is simple or compound. Note that
this statement can also be another if, so they can be strung together.
//: C03:Ifthen.cpp
// Demonstration of if and if-else conditionals
#include <iostream>
using namespace std;
int main() {
int i;
cout << "type a number and 'Enter'" << endl;
cin >> i;
if(i > 5)
cout << "It's greater than 5" << endl;
else
if(i < 5)
cout << "It's less than 5 " << endl;
else
cout << "It's equal to 5 " << endl;
cout << "type a number and 'Enter'" << endl;
cin >> i;
if(i < 10)
if(i > 5)  // "if" is just another statement
cout << "5 < i < 10" << endl;
else
cout << "i <= 5" << endl;
else // Matches "if(i < 10)"
cout << "i >= 10" << endl;
} ///:~
It is conventional to indent the body of a control flow statement so
the reader may easily determine where it begins and ends1.
1 Note that all conventions seem to end after the agreement that some sort of
indentation take place. The feud between styles of code formatting is unending. See
Appendix A for the description of this book's coding style.
3: The C in C++
129
img
while
while, do-while,and for control looping. A statement repeats until
the controlling expression evaluates to false. The form of a while
loop is
while(expression)
statement
The expression is evaluated once at the beginning of the loop and
again before each further iteration of the statement.
This example stays in the body of the while loop until you type the
secret number or press control-C.
//: C03:Guess.cpp
// Guess a number (demonstrates "while")
#include <iostream>
using namespace std;
int main() {
int secret = 15;
int guess = 0;
// "!=" is the "not-equal" conditional:
while(guess != secret) { // Compound statement
cout << "guess the number: ";
cin >> guess;
}
cout << "You guessed it!" << endl;
} ///:~
The while's conditional expression is not restricted to a simple test
as in the example above; it can be as complicated as you like as long
as it produces a true or false result. You will even see code where
the loop has no body, just a bare semicolon:
while(/* Do a lot here */)
;
In these cases, the programmer has written the conditional
expression not only to perform the test but also to do the work.
130
Thinking in C++
img
do-while
The form of do-whileis
do
statement
while(expression);
The do-whileis different from the while because the statement
always executes at least once, even if the expression evaluates to
false the first time. In a regular while, if the conditional is false the
first time the statement never executes.
If a do-whileis used in Guess.cpp the variable guess does not
,
need an initial dummy value, since it is initialized by the cin
statement before it is tested:
//: C03:Guess2.cpp
// The guess program using do-while
#include <iostream>
using namespace std;
int main() {
int secret = 15;
int guess; // No initialization needed here
do {
cout << "guess the number: ";
cin >> guess; // Initialization happens
}
while(guess != secret);
cout << "You got it!" << endl;
} ///:~
For some reason, most programmers tend to avoid do-whileand
just work with while.
for
A for loop performs initialization before the first iteration. Then it
performs conditional testing and, at the end of each iteration, some
form of "stepping." The form of the for loop is:
for(initialization; conditional; step)
3: The C in C++
131
img
statement
Any of the expressions initialization, conditional, or step may be
empty. The initialization code executes once at the very beginning.
The conditional is tested before each iteration (if it evaluates to false
at the beginning, the statement never executes). At the end of each
loop, the step executes.
for loops are usually used for "counting" tasks:
//: C03:Charlist.cpp
// Display all the ASCII characters
// Demonstrates "for"
#include <iostream>
using namespace std;
int main() {
for(int i = 0; i < 128; i = i + 1)
if (i != 26)  // ANSI Terminal Clear screen
cout << " value: " << i
<< " character: "
<< char(i) // Type conversion
<< endl;
} ///:~
You may notice that the variable i is defined at the point where it is
used, instead of at the beginning of the block denoted by the open
curly brace `{'. This is in contrast to traditional procedural
languages (including C), which require that all variables be defined
at the beginning of the block. This will be discussed later in this
chapter.
The break and continue keywords
Inside the body of any of the looping constructs while, do-while,or
for, you can control the flow of the loop using break and continue
.
break quits the loop without executing the rest of the statements in
the loop. continuestops the execution of the current iteration and
goes back to the beginning of the loop to begin a new iteration.
132
Thinking in C++
img
As an example of break and continue this program is a very
,
simple menu system:
//: C03:Menu.cpp
// Simple menu program demonstrating
// the use of "break" and "continue"
#include <iostream>
using namespace std;
int main() {
char c; // To hold response
while(true) {
cout << "MAIN MENU:" << endl;
cout << "l: left, r: right, q: quit -> ";
cin >> c;
if(c == 'q')
break; // Out of "while(1)"
if(c == 'l') {
cout << "LEFT MENU:" << endl;
cout << "select a or b: ";
cin >> c;
if(c == 'a') {
cout << "you chose 'a'" << endl;
continue; // Back to main menu
}
if(c == 'b') {
cout << "you chose 'b'" << endl;
continue; // Back to main menu
}
else {
cout << "you didn't choose a or b!"
<< endl;
continue; // Back to main menu
}
}
if(c == 'r') {
cout << "RIGHT MENU:" << endl;
cout << "select c or d: ";
cin >> c;
if(c == 'c') {
cout << "you chose 'c'" << endl;
continue; // Back to main menu
}
if(c == 'd') {
3: The C in C++
133
img
cout << "you chose 'd'" << endl;
continue; // Back to main menu
}
else {
cout << "you didn't choose c or d!"
<< endl;
continue; // Back to main menu
}
}
cout << "you must type l or r or q!" << endl;
}
cout << "quitting menu..." << endl;
} ///:~
If the user selects `q' in the main menu, the break keyword is used
to quit, otherwise the program just continues to execute
indefinitely. After each of the sub-menu selections, the continue
keyword is used to pop back up to the beginning of the while loop.
The while(true)statement is the equivalent of saying "do this loop
forever." The break statement allows you to break out of this
infinite while loop when the user types a `q.'
switch
A switch statement selects from among pieces of code based on the
value of an integral expression. Its form is:
switch(selector) {
case integral-value1
:
break;
statement;
case integral-value2
:
break;
statement;
case integral-value3
:
break;
statement;
case integral-value4
:
break;
statement;
case integral-value5
:
break;
statement;
(...)
default: statement;
}
Selector is an expression that produces an integral value. The switch
compares the result of selector to each integral value. If it finds a
match, the corresponding statement (simple or compound)
executes. If no match occurs, the default statement executes.
134
Thinking in C++
img
You will notice in the definition above that each case ends with a
break, which causes execution to jump to the end of the switch
body (the closing brace that completes the switch). This is the
conventional way to build a switch statement, but the break is
optional. If it is missing, your case "drops through" to the one after
it. That is, the code for the following case statements execute until a
break is encountered. Although you don't usually want this kind of
behavior, it can be useful to an experienced programmer.
The switch statement is a clean way to implement multi-way
selection (i.e., selecting from among a number of different
execution paths), but it requires a selector that evaluates to an
integral value at compile-time. If you want to use, for example, a
string object as a selector, it won't work in a switch statement. For
a string selector, you must instead use a series of if statements and
compare the string inside the conditional.
The menu example shown above provides a particularly nice
example of a switch:
//: C03:Menu2.cpp
// A menu using a switch statement
#include <iostream>
using namespace std;
int main() {
bool quit = false;  // Flag for quitting
while(quit == false) {
cout << "Select a, b, c or q to quit: ";
char response;
cin >> response;
switch(response) {
case 'a' : cout << "you chose 'a'" << endl;
break;
case 'b' : cout << "you chose 'b'" << endl;
break;
case 'c' : cout << "you chose 'c'" << endl;
break;
case 'q' : cout << "quitting menu" << endl;
quit = true;
3: The C in C++
135
img
break;
default
: cout << "Please use a,b,c or q!"
<< endl;
}
}
} ///:~
The quit flag is a bool, short for "Boolean," which is a type you'll
find only in C++. It can have only the keyword values true or false.
Selecting `q' sets the quit flag to true. The next time the selector is
evaluated, quit == falsereturns false so the body of the while does
not execute.
Using and misusing goto
The goto keyword is supported in C++, since it exists in C. Using
goto is often dismissed as poor programming style, and most of the
time it is. Anytime you use goto, look at your code and see if
there's another way to do it. On rare occasions, you may discover
goto can solve a problem that can't be solved otherwise, but still,
consider it carefully. Here's an example that might make a
plausible candidate:
//: C03:gotoKeyword.cpp
// The infamous goto is supported in C++
#include <iostream>
using namespace std;
int main() {
long val = 0;
for(int i = 1; i < 1000; i++) {
for(int j = 1; j < 100; j += 10) {
val = i * j;
if(val > 47000)
goto bottom;
// Break would only go to the outer 'for'
}
}
bottom: // A label
cout << val << endl;
} ///:~
136
Thinking in C++
img
.
The alternative would be to set a Boolean that is tested in the outer
for loop, and then do a break from the inner for loop. However, if
you have several levels of for or while this could get awkward.
Recursion
Recursion is an interesting and sometimes useful programming
technique whereby you call the function that you're in. Of course, if
this is all you do, you'll keep calling the function you're in until
you run out of memory, so there must be some way to "bottom
out" the recursive call. In the following example, this "bottoming
out" is accomplished by simply saying that the recursion will go
only until the cat exceeds `Z':2
//: C03:CatsInHats.cpp
// Simple demonstration of recursion
#include <iostream>
using namespace std;
void removeHat(char cat) {
for(char c = 'A'; c < cat; c++)
cout << "  ";
if(cat <= 'Z') {
cout << "cat " << cat << endl;
removeHat(cat + 1); // Recursive call
} else
cout << "VOOM!!!" << endl;
}
int main() {
removeHat('A');
} ///:~
In removeHat( ) you can see that as long as cat is less than `Z',
,
removeHat( )will be called from within removeHat( ) thus
,
effecting the recursion. Each time removeHat( )is called, its
2 Thanks to Kris C. Matson for suggesting this exercise topic.
3: The C in C++
137
img
argument is one greater than the current cat so the argument keeps
increasing.
Recursion is often used when evaluating some sort of arbitrarily
complex problem, since you aren't restricted to a particular "size"
for the solution ­ the function can just keep recursing until it's
reached the end of the problem.
Introduction to operators
You can think of operators as a special type of function (you'll learn
that C++ operator overloading treats operators precisely that way).
An operator takes one or more arguments and produces a new
value. The arguments are in a different form than ordinary function
calls, but the effect is the same.
From your previous programming experience, you should be
reasonably comfortable with the operators that have been used so
far. The concepts of addition (+), subtraction and unary minus (-),
multiplication (*), division (/), and assignment(=) all have
essentially the same meaning in any programming language. The
full set of operators is enumerated later in this chapter.
Precedence
Operator precedence defines the order in which an expression
evaluates when several different operators are present. C and C++
have specific rules to determine the order of evaluation. The easiest
to remember is that multiplication and division happen before
addition and subtraction. After that, if an expression isn't
transparent to you it probably won't be for anyone reading the
code, so you should use parentheses to make the order of
evaluation explicit. For example:
A = X + Y - 2/2 + Z;
138
Thinking in C++
img
has a very different meaning from the same statement with a
particular grouping of parentheses:
A = X + (Y - 2)/(2 + Z);
(Try evaluating the result with X = 1, Y = 2, and Z = 3.)
Auto increment and decrement
C, and therefore C++, is full of shortcuts. Shortcuts can make code
much easier to type, and sometimes much harder to read. Perhaps
the C language designers thought it would be easier to understand
a tricky piece of code if your eyes didn't have to scan as large an
area of print.
One of the nicer shortcuts is the auto-increment and auto-
decrement operators. You often use these to change loop variables,
which control the number of times a loop executes.
The auto-decrement operator is `--' and means "decrease by one
unit." The auto-increment operator is `++' and means "increase by
one unit." If A is an int, for example, the expression ++A is
equivalent to (A = A + 1 Auto-increment and auto-decrement
).
operators produce the value of the variable as a result. If the
operator appears before the variable, (i.e., ++A), the operation is
first performed and the resulting value is produced. If the operator
appears after the variable (i.e. A++), the current value is produced,
and then the operation is performed. For example:
//: C03:AutoIncrement.cpp
// Shows use of auto-increment
// and auto-decrement operators.
#include <iostream>
using namespace std;
int main() {
int i = 0;
int j = 0;
cout << ++i << endl; // Pre-increment
cout << j++ << endl; // Post-increment
3: The C in C++
139
img
cout << --i << endl; // Pre-decrement
cout << j-- << endl; // Post decrement
} ///:~
If you've been wondering about the name "C++," now you
understand. It implies "one step beyond C."
Introduction to data types
Data types define the way you use storage (memory) in the
programs you write. By specifying a data type, you tell the
compiler how to create a particular piece of storage, and also how
to manipulate that storage.
Data types can be built-in or abstract. A built-in data type is one
that the compiler intrinsically understands, one that is wired
directly into the compiler. The types of built-in data are almost
identical in C and C++. In contrast, a user-defined data type is one
that you or another programmer create as a class. These are
commonly referred to as abstract data types. The compiler knows
how to handle built-in types when it starts up; it "learns" how to
handle abstract data types by reading header files containing class
declarations (you'll learn about this in later chapters).
Basic built-in types
The Standard C specification for built-in types (which C++ inherits)
doesn't say how many bits each of the built-in types must contain.
Instead, it stipulates the minimum and maximum values that the
built-in type must be able to hold. When a machine is based on
binary, this maximum value can be directly translated into a
minimum number of bits necessary to hold that value. However, if
a machine uses, for example, binary-coded decimal (BCD) to
represent numbers, then the amount of space in the machine
required to hold the maximum numbers for each data type will be
different. The minimum and maximum values that can be stored in
the various data types are defined in the system header files
140
Thinking in C++
img
limits.hand float.h (in C++ you will generally #include <climits>
and <cfloat>instead).
C and C++ have four basic built-in data types, described here for
binary-based machines. A char is for character storage and uses a
minimum of 8 bits (one byte) of storage, although it may be larger.
An int stores an integral number and uses a minimum of two bytes
of storage. The float and double types store floating-point
numbers, usually in IEEE floating-point format. float is for single-
precision floating point and double is for double-precision floating
point.
As mentioned previously, you can define variables anywhere in a
scope, and you can define and initialize them at the same time.
Here's how to define variables using the four basic data types:
//: C03:Basic.cpp
// Defining the four basic data
// types in C and C++
int main() {
// Definition without initialization:
char protein;
int carbohydrates;
float fiber;
double fat;
// Simultaneous definition & initialization:
char pizza = 'A', pop = 'Z';
int dongdings = 100, twinkles = 150,
heehos = 200;
float chocolate = 3.14159;
// Exponential notation:
double fudge_ripple = 6e-4;
} ///:~
The first part of the program defines variables of the four basic data
types without initializing them. If you don't initialize a variable, the
Standard says that its contents are undefined (usually, this means
they contain garbage). The second part of the program defines and
initializes variables at the same time (it's always best, if possible, to
3: The C in C++
141
img
provide an initialization value at the point of definition). Notice the
use of exponential notation in the constant 6e-4, meaning "6 times
10 to the minus fourth power."
bool, true, & false
Before bool became part of Standard C++, everyone tended to use
different techniques in order to produce Boolean-like behavior.
These produced portability problems and could introduce subtle
errors.
The Standard C++ bool type can have two states expressed by the
built-in constants true (which converts to an integral one) and false
(which converts to an integral zero). All three names are keywords.
In addition, some language elements have been adapted:
Element
Usage with bool
&& || !
Take bool arguments and
produce bool results.
< > <=
Produce bool results.
>= == !=
if, for,
Conditional expressions
while, do
convert to bool values.
?:
First operand converts to bool
value.
Because there's a lot of existing code that uses an int to represent a
flag, the compiler will implicitly convert from an int to a bool
(nonzero values will produce true while zero values produce false).
Ideally, the compiler will give you a warning as a suggestion to
correct the situation.
An idiom that falls under "poor programming style" is the use of
++ to set a flag to true. This is still allowed, but deprecated, which
means that at some time in the future it will be made illegal. The
142
Thinking in C++
img
problem is that you're making an implicit type conversion from
bool to int, incrementing the value (perhaps beyond the range of
the normal bool values of zero and one), and then implicitly
converting it back again.
Pointers (which will be introduced later in this chapter) will also be
automatically converted to bool when necessary.
Specifiers
Specifiers modify the meanings of the basic built-in types and
expand them to a much larger set. There are four specifiers: long,
short, signed, and unsigned
.
long and short modify the maximum and minimum values that a
data type will hold. A plain int must be at least the size of a short.
The size hierarchy for integral types is: short int, int, long int. All
the sizes could conceivably be the same, as long as they satisfy the
minimum/maximum value requirements. On a machine with a 64-
bit word, for instance, all the data types might be 64 bits.
The size hierarchy for floating point numbers is: float, double, and
long double. "long float" is not a legal type. There are no short
floating-point numbers.
The signed and unsignedspecifiers tell the compiler how to use
the sign bit with integral types and characters (floating-point
numbers always contain a sign). An unsignednumber does not
keep track of the sign and thus has an extra bit available, so it can
store positive numbers twice as large as the positive numbers that
can be stored in a signed number. signed is the default and is only
necessary with char; char may or may not default to signed. By
specifying signed char, you force the sign bit to be used.
The following example shows the size of the data types in bytes by
using the sizeof operator, introduced later in this chapter:
//: C03:Specify.cpp
3: The C in C++
143
img
// Demonstrates the use of specifiers
#include <iostream>
using namespace std;
int main() {
char c;
unsigned char cu;
int i;
unsigned int iu;
short int is;
short iis; // Same as short int
unsigned short int isu;
unsigned short iisu;
long int il;
long iil;  // Same as long int
unsigned long int ilu;
unsigned long iilu;
float f;
double d;
long double ld;
cout
<< "\n char= " << sizeof(c)
<< "\n unsigned char = " << sizeof(cu)
<< "\n int = " << sizeof(i)
<< "\n unsigned int = " << sizeof(iu)
<< "\n short = " << sizeof(is)
<< "\n unsigned short = " << sizeof(isu)
<< "\n long = " << sizeof(il)
<< "\n unsigned long = " << sizeof(ilu)
<< "\n float = " << sizeof(f)
<< "\n double = " << sizeof(d)
<< "\n long double = " << sizeof(ld)
<< endl;
} ///:~
Be aware that the results you get by running this program will
probably be different from one machine/operating
system/compiler to the next, since (as mentioned previously) the
only thing that must be consistent is that each different type hold
the minimum and maximum values specified in the Standard.
When you are modifying an int with short or long, the keyword int
is optional, as shown above.
144
Thinking in C++
img
Introduction to pointers
Whenever you run a program, it is first loaded (typically from disk)
into the computer's memory. Thus, all elements of your program
are located somewhere in memory. Memory is typically laid out as
a sequential series of memory locations; we usually refer to these
locations as eight-bit bytes but actually the size of each space
depends on the architecture of the particular machine and is
usually called that machine's word size. Each space can be uniquely
distinguished from all other spaces by its address. For the purposes
of this discussion, we'll just say that all machines use bytes that
have sequential addresses starting at zero and going up to however
much memory you have in your computer.
Since your program lives in memory while it's being run, every
element of your program has an address. Suppose we start with a
simple program:
//: C03:YourPets1.cpp
#include <iostream>
using namespace std;
int dog, cat, bird, fish;
void f(int pet) {
cout << "pet id number: " << pet << endl;
}
int main() {
int i, j, k;
} ///:~
Each of the elements in this program has a location in storage when
the program is running. Even the function occupies storage. As
you'll see, it turns out that what an element is and the way you
define it usually determines the area of memory where that
element is placed.
There is an operator in C and C++ that will tell you the address of
an element. This is the `&' operator. All you do is precede the
3: The C in C++
145
img
identifier name with `&' and it will produce the address of that
identifier. YourPets1.cppcan be modified to print out the addresses
of all its elements, like this:
//: C03:YourPets2.cpp
#include <iostream>
using namespace std;
int dog, cat, bird, fish;
void f(int pet) {
cout << "pet id number: " << pet << endl;
}
int main() {
int i, j, k;
cout << "f(): " << (long)&f << endl;
cout << "dog: " << (long)&dog << endl;
cout << "cat: " << (long)&cat << endl;
cout << "bird: " << (long)&bird << endl;
cout << "fish: " << (long)&fish << endl;
cout << "i: " << (long)&i << endl;
cout << "j: " << (long)&j << endl;
cout << "k: " << (long)&k << endl;
} ///:~
The (long) is a cast. It says "Don't treat this as if it's normal type,
instead treat it as a long." The cast isn't essential, but if it wasn't
there, the addresses would have been printed out in hexadecimal
instead, so casting to a long makes things a little more readable.
The results of this program will vary depending on your computer,
OS, and all sorts of other factors, but it will always give you some
interesting insights. For a single run on my computer, the results
looked like this:
f(): 4198736
dog: 4323632
cat: 4323636
bird: 4323640
fish: 4323644
i: 6684160
146
Thinking in C++
img
j: 6684156
k: 6684152
You can see how the variables that are defined inside main( ) are in
a different area than the variables defined outside of main( ); you'll
understand why as you learn more about the language. Also, f( )
appears to be in its own area; code is typically separated from data
in memory.
Another interesting thing to note is that variables defined one right
after the other appear to be placed contiguously in memory. They
are separated by the number of bytes that are required by their data
type. Here, the only data type used is int, and cat is four bytes
away from dog, bird is four bytes away from cat, etc. So it would
appear that, on this machine, an int is four bytes long.
Other than this interesting experiment showing how memory is
mapped out, what can you do with an address? The most
important thing you can do is store it inside another variable for
later use. C and C++ have a special type of variable that holds an
address. This variable is called a pointer.
The operator that defines a pointer is the same as the one used for
multiplication: `*'. The compiler knows that it isn't multiplication
because of the context in which it is used, as you will see.
When you define a pointer, you must specify the type of variable it
points to. You start out by giving the type name, then instead of
immediately giving an identifier for the variable, you say "Wait, it's
a pointer" by inserting a star between the type and the identifier. So
a pointer to an int looks like this:
int* ip; // ip points to an int variable
The association of the `*' with the type looks sensible and reads
easily, but it can actually be a bit deceiving. Your inclination might
be to say "intpointer" as if it is a single discrete type. However,
with an int or other basic data type, it's possible to say:
3: The C in C++
147
img
int a, b, c;
whereas with a pointer, you'd like to say:
int* ipa, ipb, ipc;
C syntax (and by inheritance, C++ syntax) does not allow such
sensible expressions. In the definitions above, only ipa is a pointer,
but ipb and ipc are ordinary ints (you can say that "* binds more
tightly to the identifier"). Consequently, the best results can be
achieved by using only one definition per line; you still get the
sensible syntax without the confusion:
int* ipa;
int* ipb;
int* ipc;
Since a general guideline for C++ programming is that you should
always initialize a variable at the point of definition, this form
actually works better. For example, the variables above are not
initialized to any particular value; they hold garbage. It's much
better to say something like:
int a = 47;
int* ipa = &a;
Now both a and ipa have been initialized, and ipa holds the
address of a.
Once you have an initialized pointer, the most basic thing you can
do with it is to use it to modify the value it points to. To access a
variable through a pointer, you dereference the pointer using the
same operator that you used to define it, like this:
*ipa = 100;
Now a contains the value 100 instead of 47.
These are the basics of pointers: you can hold an address, and you
can use that address to modify the original variable. But the
148
Thinking in C++
img
question still remains: why do you want to modify one variable
using another variable as a proxy?
For this introductory view of pointers, we can put the answer into
two broad categories:
1.
To change "outside objects" from within a function. This is
perhaps the most basic use of pointers, and it will be
examined here.
2.
To achieve many other clever programming techniques,
which you'll learn about in portions of the rest of the book.
Modifying the outside object
Ordinarily, when you pass an argument to a function, a copy of
that argument is made inside the function. This is referred to as
pass-by-value. You can see the effect of pass-by-value in the
following program:
//: C03:PassByValue.cpp
#include <iostream>
using namespace std;
void f(int a) {
cout << "a = " << a << endl;
a = 5;
cout << "a = " << a << endl;
}
int main() {
int x = 47;
cout << "x = " << x << endl;
f(x);
cout << "x = " << x << endl;
} ///:~
In f( ), a is a local variable, so it exists only for the duration of the
function call to f( ). Because it's a function argument, the value of a
is initialized by the arguments that are passed when the function is
3: The C in C++
149
img
called; in main( ) the argument is x, which has a value of 47, so this
value is copied into a when f( ) is called.
When you run this program you'll see:
x
=
47
a
=
47
a
=
5
x
=
47
Initially, of course, x is 47. When f( ) is called, temporary space is
created to hold the variable a for the duration of the function call,
and a is initialized by copying the value of x, which is verified by
printing it out. Of course, you can change the value of a and show
that it is changed. But when f( ) is completed, the temporary space
that was created for a disappears, and we see that the only
connection that ever existed between a and x happened when the
value of x was copied into a.
When you're inside f( ), x is the outside object (my terminology), and
changing the local variable does not affect the outside object,
naturally enough, since they are two separate locations in storage.
But what if you do want to modify the outside object? This is where
pointers come in handy. In a sense, a pointer is an alias for another
variable. So if we pass a pointer into a function instead of an
ordinary value, we are actually passing an alias to the outside
object, enabling the function to modify that outside object, like this:
//: C03:PassAddress.cpp
#include <iostream>
using namespace std;
void f(int* p) {
cout << "p = " << p << endl;
cout << "*p = " << *p << endl;
*p = 5;
cout << "p = " << p << endl;
}
int main() {
150
Thinking in C++
img
int x =
47;
cout <<
"x = " << x << endl;
cout <<
"&x = " << &x << endl;
f(&x);
cout <<
"x = " << x << endl;
} ///:~
Now f( ) takes a pointer as an argument and dereferences the
pointer during assignment, and this causes the outside object x to
be modified. The output is:
x = 47
&x = 0065FE00
p = 0065FE00
*p = 47
p = 0065FE00
x=5
Notice that the value contained in p is the same as the address of x
­ the pointer p does indeed point to x. If that isn't convincing
enough, when p is dereferenced to assign the value 5, we see that
the value of x is now changed to 5 as well.
Thus, passing a pointer into a function will allow that function to
modify the outside object. You'll see plenty of other uses for
pointers later, but this is arguably the most basic and possibly the
most common use.
Introduction to C++ references
Pointers work roughly the same in C and in C++, but C++ adds an
additional way to pass an address into a function. This is pass-by-
reference and it exists in several other programming languages so it
was not a C++ invention.
Your initial perception of references may be that they are
unnecessary, that you could write all your programs without
references. In general, this is true, with the exception of a few
important places that you'll learn about later in the book. You'll
also learn more about references later, but the basic idea is the same
3: The C in C++
151
img
as the demonstration of pointer use above: you can pass the
address of an argument using a reference. The difference between
references and pointers is that calling a function that takes
references is cleaner, syntactically, than calling a function that takes
pointers (and it is exactly this syntactic difference that makes
references essential in certain situations). If PassAddress.cppis
modified to use references, you can see the difference in the
function call in main( ):
//: C03:PassReference.cpp
#include <iostream>
using namespace std;
void f(int& r) {
cout << "r = " << r << endl;
cout << "&r = " << &r << endl;
r = 5;
cout << "r = " << r << endl;
}
int main() {
int x = 47;
cout << "x = " << x << endl;
cout << "&x = " << &x << endl;
f(x); // Looks like pass-by-value,
// is actually pass by reference
cout << "x = " << x << endl;
} ///:~
In f( )'s argument list, instead of saying int* to pass a pointer, you
say int& to pass a reference. Inside f( ), if you just say `r' (which
would produce the address if r were a pointer) you get the value in
the variable that r references. If you assign to r, you actually assign to
the variable that r references. In fact, the only way to get the
address that's held inside r is with the `&' operator.
In main( ), you can see the key effect of references in the syntax of
the call to f( ), which is just f(x). Even though this looks like an
ordinary pass-by-value, the effect of the reference is that it actually
152
Thinking in C++
img
takes the address and passes it in, rather than making a copy of the
value. The output is:
x = 47
&x = 0065FE00
r = 47
&r = 0065FE00
r=5
x=5
So you can see that pass-by-reference allows a function to modify
the outside object, just like passing a pointer does (you can also
observe that the reference obscures the fact that an address is being
passed; this will be examined later in the book). Thus, for this
simple introduction you can assume that references are just a
syntactically different way (sometimes referred to as "syntactic
sugar") to accomplish the same thing that pointers do: allow
functions to change outside objects.
Pointers and references as modifiers
So far, you've seen the basic data types char, int, float, and double,
along with the specifiers signed, unsigned short, and long, which
,
can be used with the basic data types in almost any combination.
Now we've added pointers and references that are orthogonal to
the basic data types and specifiers, so the possible combinations
have just tripled:
//: C03:AllDefinitions.cpp
// All possible combinations of basic data types,
// specifiers, pointers and references
#include <iostream>
using namespace std;
void f1(char c, int i, float f, double d);
void f2(short int si, long int li, long double ld);
void f3(unsigned char uc, unsigned int ui,
unsigned short int usi, unsigned long int uli);
void f4(char* cp, int* ip, float* fp, double* dp);
void f5(short int* sip, long int* lip,
long double* ldp);
3: The C in C++
153
img
void f6(unsigned char* ucp, unsigned int* uip,
unsigned short int* usip,
unsigned long int* ulip);
void f7(char& cr, int& ir, float& fr, double& dr);
void f8(short int& sir, long int& lir,
long double& ldr);
void f9(unsigned char& ucr, unsigned int& uir,
unsigned short int& usir,
unsigned long int& ulir);
int main() {} ///:~
Pointers and references also work when passing objects into and
out of functions; you'll learn about this in a later chapter.
There's one other type that works with pointers: void. If you state
that a pointer is a void*, it means that any type of address at all can
be assigned to that pointer (whereas if you have an int*, you can
assign only the address of an int variable to that pointer). For
example:
//: C03:VoidPointer.cpp
int main() {
void* vp;
char c;
int i;
float f;
double d;
// The address of ANY type can be
// assigned to a void pointer:
vp = &c;
vp = &i;
vp = &f;
vp = &d;
} ///:~
Once you assign to a void* you lose any information about what
type it is. This means that before you can use the pointer, you must
cast it to the correct type:
//: C03:CastFromVoidPointer.cpp
int main() {
int i = 99;
154
Thinking in C++
img
void* vp = &i;
// Can't dereference a void pointer:
// *vp = 3; // Compile-time error
// Must cast back to int before dereferencing:
*((int*)vp) = 3;
} ///:~
The cast (int*)vptakes the void* and tells the compiler to treat it as
an int*, and thus it can be successfully dereferenced. You might
observe that this syntax is ugly, and it is, but it's worse than that ­
the void* introduces a hole in the language's type system. That is, it
allows, or even promotes, the treatment of one type as another
type. In the example above, I treat an int as an int by casting vp to
an int*, but there's nothing that says I can't cast it to a char* or
double*, which would modify a different amount of storage that
had been allocated for the int, possibly crashing the program. In
general, void pointers should be avoided, and used only in rare
special cases, the likes of which you won't be ready to consider
until significantly later in the book.
You cannot have a void reference, for reasons that will be explained
in Chapter 11.
Scoping
Scoping rules tell you where a variable is valid, where it is created,
and where it gets destroyed (i.e., goes out of scope). The scope of a
variable extends from the point where it is defined to the first
closing brace that matches the closest opening brace before the
variable was defined. That is, a scope is defined by its "nearest" set
of braces. To illustrate:
//: C03:Scope.cpp
// How variables are scoped
int main() {
int scp1;
// scp1 visible here
{
// scp1 still visible here
3: The C in C++
155
img
//.....
int scp2;
// scp2 visible here
//.....
{
// scp1 & scp2 still visible here
//..
int scp3;
// scp1, scp2 & scp3 visible here
// ...
} // <-- scp3 destroyed here
// scp3 not available here
// scp1 & scp2 still visible here
// ...
} // <-- scp2 destroyed here
// scp3 & scp2 not available here
// scp1 still visible here
//..
} // <-- scp1 destroyed here
///:~
The example above shows when variables are visible and when
they are unavailable (that is, when they go out of scope). A variable
can be used only when inside its scope. Scopes can be nested,
indicated by matched pairs of braces inside other matched pairs of
braces. Nesting means that you can access a variable in a scope that
encloses the scope you are in. In the example above, the variable
scp1 is available inside all of the other scopes, while scp3 is
available only in the innermost scope.
Defining variables on the fly
As noted earlier in this chapter, there is a significant difference
between C and C++ when defining variables. Both languages
require that variables be defined before they are used, but C (and
many other traditional procedural languages) forces you to define
all the variables at the beginning of a scope, so that when the
compiler creates a block it can allocate space for those variables.
While reading C code, a block of variable definitions is usually the
first thing you see when entering a scope. Declaring all variables at
156
Thinking in C++
img
the beginning of the block requires the programmer to write in a
particular way because of the implementation details of the
language. Most people don't know all the variables they are going
to use before they write the code, so they must keep jumping back
to the beginning of the block to insert new variables, which is
awkward and causes errors. These variable definitions don't
usually mean much to the reader, and they actually tend to be
confusing because they appear apart from the context in which they
are used.
C++ (not C) allows you to define variables anywhere in a scope, so
you can define a variable right before you use it. In addition, you
can initialize the variable at the point you define it, which prevents
a certain class of errors. Defining variables this way makes the code
much easier to write and reduces the errors you get from being
forced to jump back and forth within a scope. It makes the code
easier to understand because you see a variable defined in the
context of its use. This is especially important when you are
defining and initializing a variable at the same time ­ you can see
the meaning of the initialization value by the way the variable is
used.
You can also define variables inside the control expressions of for
loops and while loops, inside the conditional of an if statement,
and inside the selector statement of a switch. Here's an example
showing on-the-fly variable definitions:
//: C03:OnTheFly.cpp
// On-the-fly variable definitions
#include <iostream>
using namespace std;
int main() {
//..
{ // Begin a new scope
int q = 0; // C requires definitions here
//..
// Define at point of use:
for(int i = 0; i < 100; i++) {
3: The C in C++
157
img
q++; // q comes from a larger scope
// Definition at the end of the scope:
int p = 12;
}
int p = 1;  // A different p
} // End scope containing q & outer p
cout << "Type characters:" << endl;
while(char c = cin.get() != 'q') {
cout << c << " wasn't it" << endl;
if(char x = c == 'a' || c == 'b')
cout << "You typed a or b" << endl;
else
cout << "You typed " << x << endl;
}
cout << "Type A, B, or C" << endl;
switch(int i = cin.get()) {
case 'A': cout << "Snap" << endl; break;
case 'B': cout << "Crackle" << endl; break;
case 'C': cout << "Pop" << endl; break;
default: cout << "Not A, B or C!" << endl;
}
} ///:~
In the innermost scope, p is defined right before the scope ends, so
it is really a useless gesture (but it shows you can define a variable
anywhere). The p in the outer scope is in the same situation.
The definition of i in the control expression of the for loop is an
example of being able to define a variable exactly at the point you
need it (you can do this only in C++). The scope of i is the scope of
the expression controlled by the for loop, so you can turn around
and re-use i in the next for loop. This is a convenient and
commonly-used idiom in C++; i is the classic name for a loop
counter and you don't have to keep inventing new names.
Although the example also shows variables defined within while,
if, and switch statements, this kind of definition is much less
common than those in for expressions, possibly because the syntax
is so constrained. For example, you cannot have any parentheses.
That is, you cannot say:
158
Thinking in C++
img
while((char c = cin.get()) != 'q')
The addition of the extra parentheses would seem like an innocent
and useful thing to do, and because you cannot use them, the
results are not what you might like. The problem occurs because
`!=' has a higher precedence than `=', so the char c ends up
containing a bool converted to char. When that's printed, on many
terminals you'll see a smiley-face character.
In general, you can consider the ability to define variables within
while, if, and switch statements as being there for completeness,
but the only place you're likely to use this kind of variable
definition is in a for loop (where you'll use it quite often).
Specifying storage allocation
When creating a variable, you have a number of options to specify
the lifetime of the variable, how the storage is allocated for that
variable, and how the variable is treated by the compiler.
Global variables
Global variables are defined outside all function bodies and are
available to all parts of the program (even code in other files).
Global variables are unaffected by scopes and are always available
(i.e., the lifetime of a global variable lasts until the program ends). If
the existence of a global variable in one file is declared using the
extern keyword in another file, the data is available for use by the
second file. Here's an example of the use of global variables:
//: C03:Global.cpp
//{L} Global2
// Demonstration of global variables
#include <iostream>
using namespace std;
int globe;
void func();
int main() {
3: The C in C++
159
img
globe =
12;
cout <<
globe << endl;
func();
// Modifies globe
cout <<
globe << endl;
} ///:~
Here's a file that accesses globe as an extern:
//: C03:Global2.cpp {O}
// Accessing external global variables
extern int globe;
// (The linker resolves the reference)
void func() {
globe = 47;
} ///:~
Storage for the variable globe is created by the definition in
Global.cpp and that same variable is accessed by the code in
,
Global2.cpp Since the code in Global2.cppis compiled separately
.
from the code in Global.cpp the compiler must be informed that
,
the variable exists elsewhere by the declaration
extern int globe;
When you run the program, you'll see that the call to func( ) does
indeed affect the single global instance of globe.
In Global.cpp you can see the special comment tag (which is my
,
own design):
//{L} Global2
This says that to create the final program, the object file with the
name Global2 must be linked in (there is no extension because the
extension names of object files differ from one system to the next).
In Global2.cpp the first line has another special comment tag {O},
,
which says "Don't try to create an executable out of this file, it's
being compiled so that it can be linked into some other executable."
The ExtractCode.cppprogram in Volume 2 of this book
(downloadable at ) reads these tags and creates
160
Thinking in C++
img
the appropriate makefileso everything compiles properly (you'll
learn about makefile at the end of this chapter).
s
Local variables
Local variables occur within a scope; they are "local" to a function.
They are often called automatic variables because they automatically
come into being when the scope is entered and automatically go
away when the scope closes. The keyword auto makes this explicit,
but local variables default to auto so it is never necessary to declare
something as an auto.
Register variables
A register variable is a type of local variable. The registerkeyword
tells the compiler "Make accesses to this variable as fast as
possible." Increasing the access speed is implementation
dependent, but, as the name suggests, it is often done by placing
the variable in a register. There is no guarantee that the variable
will be placed in a register or even that the access speed will
increase. It is a hint to the compiler.
There are restrictions to the use of registervariables. You cannot
take or compute the address of a registervariable. A register
variable can be declared only within a block (you cannot have
global or static registervariables). You can, however, use a register
variable as a formal argument in a function (i.e., in the argument
list).
In general, you shouldn't try to second-guess the compiler's
optimizer, since it will probably do a better job than you can. Thus,
the registerkeyword is best avoided.
static
The static keyword has several distinct meanings. Normally,
variables defined local to a function disappear at the end of the
function scope. When you call the function again, storage for the
3: The C in C++
161
img
variables is created anew and the values are re-initialized. If you
want a value to be extant throughout the life of a program, you can
define a function's local variable to be static and give it an initial
value. The initialization is performed only the first time the
function is called, and the data retains its value between function
calls. This way, a function can "remember" some piece of
information between function calls.
You may wonder why a global variable isn't used instead. The
beauty of a static variable is that it is unavailable outside the scope
of the function, so it can't be inadvertently changed. This localizes
errors.
Here's an example of the use of static variables:
//: C03:Static.cpp
// Using a static variable in a function
#include <iostream>
using namespace std;
void func() {
static int i = 0;
cout << "i = " << ++i << endl;
}
int main() {
for(int x = 0; x < 10; x++)
func();
} ///:~
Each time func( ) is called in the for loop, it prints a different value.
If the keyword static is not used, the value printed will always be
`1'.
The second meaning of static is related to the first in the
"unavailable outside a certain scope" sense. When static is applied
to a function name or to a variable that is outside of all functions, it
means "This name is unavailable outside of this file." The function
name or variable is local to the file; we say it has file scope. As a
162
Thinking in C++
img
demonstration, compiling and linking the following two files will
cause a linker error:
//: C03:FileStatic.cpp
// File scope demonstration. Compiling and
// linking this file with FileStatic2.cpp
// will cause a linker error
// File scope means only available in this file:
static int fs;
int main() {
fs = 1;
} ///:~
Even though the variable fs is claimed to exist as an extern in the
following file, the linker won't find it because it has been declared
static in FileStatic.cpp
.
//: C03:FileStatic2.cpp {O}
// Trying to reference fs
extern int fs;
void func() {
fs = 100;
} ///:~
The static specifier may also be used inside a class. This
explanation will be delayed until you learn to create classes, later in
the book.
extern
The extern keyword has already been briefly described and
demonstrated. It tells the compiler that a variable or a function
exists, even if the compiler hasn't yet seen it in the file currently
being compiled. This variable or function may be defined in
another file or further down in the current file. As an example of
the latter:
//: C03:Forward.cpp
// Forward function & data declarations
3: The C in C++
163
img
#include <iostream>
using namespace std;
// This is not actually external, but the
// compiler must be told it exists somewhere:
extern int i;
extern void func();
int main() {
i = 0;
func();
}
int i; // The data definition
void func() {
i++;
cout << i;
} ///:~
When the compiler encounters the declaration `extern int i it
',
knows that the definition for i must exist somewhere as a global
variable. When the compiler reaches the definition of i, no other
declaration is visible, so it knows it has found the same i declared
earlier in the file. If you were to define i as static, you would be
telling the compiler that i is defined globally (via the extern), but it
also has file scope (via the static), so the compiler will generate an
error.
Linkage
To understand the behavior of C and C++ programs, you need to
know about linkage. In an executing program, an identifier is
represented by storage in memory that holds a variable or a
compiled function body. Linkage describes this storage as it is seen
by the linker. There are two types of linkage: internal linkage and
external linkage.
Internal linkage means that storage is created to represent the
identifier only for the file being compiled. Other files may use the
same identifier name with internal linkage, or for a global variable,
and no conflicts will be found by the linker ­ separate storage is
created for each identifier. Internal linkage is specified by the
keyword static in C and C++.
164
Thinking in C++
img
External linkage means that a single piece of storage is created to
represent the identifier for all files being compiled. The storage is
created once, and the linker must resolve all other references to that
storage. Global variables and function names have external linkage.
These are accessed from other files by declaring them with the
keyword extern. Variables defined outside all functions (with the
exception of const in C++) and function definitions default to
external linkage. You can specifically force them to have internal
linkage using the static keyword. You can explicitly state that an
identifier has external linkage by defining it with the extern
keyword. Defining a variable or function with extern is not
necessary in C, but it is sometimes necessary for const in C++.
Automatic (local) variables exist only temporarily, on the stack,
while a function is being called. The linker doesn't know about
automatic variables, and so these have no linkage.
Constants
In old (pre-Standard) C, if you wanted to make a constant, you had
to use the preprocessor:
#define PI 3.14159
Everywhere you used PI, the value 3.14159 was substituted by the
preprocessor (you can still use this method in C and C++).
When you use the preprocessor to create constants, you place
control of those constants outside the scope of the compiler. No
type checking is performed on the name PI and you can't take the
address of PI (so you can't pass a pointer or a reference to PI). PI
cannot be a variable of a user-defined type. The meaning of PI lasts
from the point it is defined to the end of the file; the preprocessor
doesn't recognize scoping.
C++ introduces the concept of a named constant that is just like a
variable, except that its value cannot be changed. The modifier
const tells the compiler that a name represents a constant. Any data
3: The C in C++
165
img
type, built-in or user-defined, may be defined as const. If you
define something as const and then attempt to modify it, the
compiler will generate an error.
You must specify the type of a const, like this:
const int x = 10;
In Standard C and C++, you can use a named constant in an
argument list, even if the argument it fills is a pointer or a reference
(i.e., you can take the address of a const). A const has a scope, just
like a regular variable, so you can "hide" a const inside a function
and be sure that the name will not affect the rest of the program.
The const was taken from C++ and incorporated into Standard C,
albeit quite differently. In C, the compiler treats a const just like a
variable that has a special tag attached that says "Don't change
me." When you define a const in C, the compiler creates storage for
it, so if you define more than one const with the same name in two
different files (or put the definition in a header file), the linker will
generate error messages about conflicts. The intended use of const
in C is quite different from its intended use in C++ (in short, it's
nicer in C++).
Constant values
In C++, a const must always have an initialization value (in C, this
is not true). Constant values for built-in types are expressed as
decimal, octal, hexadecimal, or floating-point numbers (sadly,
binary numbers were not considered important), or as characters.
In the absence of any other clues, the compiler assumes a constant
value is a decimal number. The numbers 47, 0, and 1101 are all
treated as decimal numbers.
A constant value with a leading 0 is treated as an octal number
(base 8). Base 8 numbers can contain only digits 0-7; the compiler
flags other digits as an error. A legitimate octal number is 017 (15 in
base 10).
166
Thinking in C++
img
A constant value with a leading 0x is treated as a hexadecimal
number (base 16). Base 16 numbers contain the digits 0-9 and a-f or
A-F. A legitimate hexadecimal number is 0x1fe (510 in base 10).
Floating point numbers can contain decimal points and exponential
powers (represented by e, which means "10 to the power of"). Both
the decimal point and the e are optional. If you assign a constant to
a floating-point variable, the compiler will take the constant value
and convert it to a floating-point number (this process is one form
of what's called implicit type conversion). However, it is a good idea
to use either a decimal point or an e to remind the reader that you
are using a floating-point number; some older compilers also need
the hint.
Legitimate floating-point constant values are: 1e4, 1.0001, 47.0, 0.0,
and -1.159e-77. You can add suffixes to force the type of floating-
point number: f or F forces a float, L or l forces a long double;
otherwise the number will be a double.
Character constants are characters surrounded by single quotes, as:
`A', `0', ` `. Notice there is a big difference between the character `0'
(ASCII 96) and the value 0. Special characters are represented with
the "backslash escape": `\n' (newline), `\t' (tab), `\\' (backslash),
`\r' (carriage return), `\"' (double quotes), `\'' (single quote), etc.
You can also express char constants in octal: `\17' or hexadecimal:
`\xff'.
volatile
Whereas the qualifier const tells the compiler "This never changes"
(which allows the compiler to perform extra optimizations), the
qualifier volatiletells the compiler "You never know when this will
change," and prevents the compiler from performing any
optimizations based on the stability of that variable. Use this
keyword when you read some value outside the control of your
code, such as a register in a piece of communication hardware. A
3: The C in C++
167
img
volatilevariable is always read whenever its value is required,
even if it was just read the line before.
A special case of some storage being "outside the control of your
code" is in a multithreaded program. If you're watching a
particular flag that is modified by another thread or process, that
flag should be volatileso the compiler doesn't make the
assumption that it can optimize away multiple reads of the flag.
Note that volatilemay have no effect when a compiler is not
optimizing, but may prevent critical bugs when you start
optimizing the code (which is when the compiler will begin looking
for redundant reads).
The const and volatilekeywords will be further illuminated in a
later chapter.
Operators and their use
This section covers all the operators in C and C++.
All operators produce a value from their operands. This value is
produced without modifying the operands, except with the
assignment, increment, and decrement operators. Modifying an
operand is called a side effect. The most common use for operators
that modify their operands is to generate the side effect, but you
should keep in mind that the value produced is available for your
use just as in operators without side effects.
Assignment
Assignment is performed with the operator =. It means "Take the
right-hand side (often called the rvalue) and copy it into the left-
hand side (often called the lvalue)." An rvalue is any constant,
variable, or expression that can produce a value, but an lvalue must
be a distinct, named variable (that is, there must be a physical space
in which to store data). For instance, you can assign a constant
168
Thinking in C++
img
value to a variable (A = 4;), but you cannot assign anything to
constant value ­ it cannot be an lvalue (you can't say 4 = A;).
Mathematical operators
The basic mathematical operators are the same as the ones available
in most programming languages: addition (+), subtraction (-),
division (/), multiplication (*), and modulus (%; this produces the
remainder from integer division). Integer division truncates the
result (it doesn't round). The modulus operator cannot be used
with floating-point numbers.
C and C++ also use a shorthand notation to perform an operation
and an assignment at the same time. This is denoted by an operator
followed by an equal sign, and is consistent with all the operators
in the language (whenever it makes sense). For example, to add 4 to
the variable x and assign x to the result, you say: x += 4;.
This example shows the use of the mathematical operators:
//: C03:Mathops.cpp
// Mathematical operators
#include <iostream>
using namespace std;
// A macro to display a string and a value.
#define PRINT(STR, VAR) \
cout << STR " = " << VAR << endl
int main() {
int i, j, k;
float u, v, w;  // Applies to doubles, too
cout << "enter an integer: ";
cin >> j;
cout << "enter another integer: ";
cin >> k;
PRINT("j",j);  PRINT("k",k);
i = j + k; PRINT("j + k",i);
i = j - k; PRINT("j - k",i);
i = k / j; PRINT("k / j",i);
i = k * j; PRINT("k * j",i);
3: The C in C++
169
img
i = k % j; PRINT("k % j",i);
// The following only works with integers:
j %= k; PRINT("j %= k", j);
cout << "Enter a floating-point number: ";
cin >> v;
cout << "Enter another floating-point number:";
cin >> w;
PRINT("v",v); PRINT("w",w);
u = v + w; PRINT("v + w", u);
u = v - w; PRINT("v - w", u);
u = v * w; PRINT("v * w", u);
u = v / w; PRINT("v / w", u);
// The following works for ints, chars,
// and doubles too:
PRINT("u", u); PRINT("v", v);
u += v; PRINT("u += v", u);
u -= v; PRINT("u -= v", u);
u *= v; PRINT("u *= v", u);
u /= v; PRINT("u /= v", u);
} ///:~
The rvalues of all the assignments can, of course, be much more
complex.
Introduction to preprocessor macros
Notice the use of the macro PRINT( )to save typing (and typing
errors!). Preprocessor macros are traditionally named with all
uppercase letters so they stand out ­ you'll learn later that macros
can quickly become dangerous (and they can also be very useful).
The arguments in the parenthesized list following the macro name
are substituted in all the code following the closing parenthesis.
The preprocessor removes the name PRINT and substitutes the
code wherever the macro is called, so the compiler cannot generate
any error messages using the macro name, and it doesn't do any
type checking on the arguments (the latter can be beneficial, as
shown in the debugging macros at the end of the chapter).
170
Thinking in C++
img
Relational operators
Relational operators establish a relationship between the values of
the operands. They produce a Boolean (specified with the bool
keyword in C++) true if the relationship is true, and false if the
relationship is false. The relational operators are: less than (<),
greater than (>), less than or equal to (<=), greater than or equal to
(>=), equivalent (==), and not equivalent (!=). They may be used
with all built-in data types in C and C++. They may be given
special definitions for user-defined data types in C++ (you'll learn
about this in Chapter 12, which covers operator overloading).
Logical operators
The logical operators and (&&) and or (||) produce a true or false
based on the logical relationship of its arguments. Remember that
in C and C++, a statement is true if it has a non-zero value, and
false if it has a value of zero. If you print a bool, you'll typically see
a `1' for true and `0' for false.
This example uses the relational and logical operators:
//: C03:Boolean.cpp
// Relational and logical operators.
#include <iostream>
using namespace std;
int main() {
int i,j;
cout << "Enter an integer: ";
cin >> i;
cout << "Enter another integer: ";
cin >> j;
cout << "i > j is " << (i > j) << endl;
cout << "i < j is " << (i < j) << endl;
cout << "i >= j is " << (i >= j) << endl;
cout << "i <= j is " << (i <= j) << endl;
cout << "i == j is " << (i == j) << endl;
cout << "i != j is " << (i != j) << endl;
cout << "i && j is " << (i && j) << endl;
cout << "i || j is " << (i || j) << endl;
3: The C in C++
171
img
cout << " (i < 10) && (j < 10) is "
<< ((i < 10) && (j < 10))  << endl;
} ///:~
You can replace the definition for int with float or double in the
program above. Be aware, however, that the comparison of a
floating-point number with the value of zero is strict; a number that
is the tiniest fraction different from another number is still "not
equal." A floating-point number that is the tiniest bit above zero is
still true.
Bitwise operators
The bitwise operators allow you to manipulate individual bits in a
number (since floating point values use a special internal format,
the bitwise operators work only with integral types: char, int and
long). Bitwise operators perform Boolean algebra on the
corresponding bits in the arguments to produce the result.
The bitwise and operator (&) produces a one in the output bit if
both input bits are one; otherwise it produces a zero. The bitwise or
operator (|) produces a one in the output bit if either input bit is a
one and produces a zero only if both input bits are zero. The
bitwise exclusive or, or xor (^) produces a one in the output bit if one
or the other input bit is a one, but not both. The bitwise not (~, also
called the ones complement operator) is a unary operator ­ it only
takes one argument (all other bitwise operators are binary
operators). Bitwise not produces the opposite of the input bit ­ a
one if the input bit is zero, a zero if the input bit is one.
Bitwise operators can be combined with the = sign to unite the
operation and assignment: &=, |=, and ^= are all legitimate
operations (since ~ is a unary operator it cannot be combined with
the = sign).
172
Thinking in C++
img
Shift operators
The shift operators also manipulate bits. The left-shift operator (<<)
produces the operand to the left of the operator shifted to the left
by the number of bits specified after the operator. The right-shift
operator (>>) produces the operand to the left of the operator
shifted to the right by the number of bits specified after the
operator. If the value after the shift operator is greater than the
number of bits in the left-hand operand, the result is undefined. If
the left-hand operand is unsigned, the right shift is a logical shift so
the upper bits will be filled with zeros. If the left-hand operand is
signed, the right shift may or may not be a logical shift (that is, the
behavior is undefined).
Shifts can be combined with the equal sign (<<= and >>=). The
lvalue is replaced by the lvalue shifted by the rvalue.
What follows is an example that demonstrates the use of all the
operators involving bits. First, here's a general-purpose function
that prints a byte in binary format, created separately so that it may
be easily reused. The header file declares the function:
//: C03:printBinary.h
// Display a byte in binary
void printBinary(const unsigned char val);
///:~
Here's the implementation of the function:
//: C03:printBinary.cpp {O}
#include <iostream>
void printBinary(const unsigned char val) {
for(int i = 7; i >= 0; i--)
if(val & (1 << i))
std::cout << "1";
else
std::cout << "0";
} ///:~
The printBinary( )function takes a single byte and displays it bit-
by-bit. The expression
3: The C in C++
173
img
(1 << i)
produces a one in each successive bit position; in binary: 00000001,
00000010, etc. If this bit is bitwise anded with val and the result is
nonzero, it means there was a one in that position in val.
Finally, the function is used in the example that shows the bit-
manipulation operators:
//: C03:Bitwise.cpp
//{L} printBinary
// Demonstration of bit manipulation
#include "printBinary.h"
#include <iostream>
using namespace std;
// A macro to save typing:
#define PR(STR, EXPR) \
cout << STR; printBinary(EXPR); cout << endl;
int main() {
unsigned int getval;
unsigned char a, b;
cout << "Enter a number between 0 and 255: ";
cin >> getval; a = getval;
PR("a in binary: ", a);
cout << "Enter a number between 0 and 255: ";
cin >> getval; b = getval;
PR("b in binary: ", b);
PR("a | b = ", a | b);
PR("a & b = ", a & b);
PR("a ^ b = ", a ^ b);
PR("~a = ", ~a);
PR("~b = ", ~b);
// An interesting bit pattern:
unsigned char c = 0x5A;
PR("c in binary: ", c);
a |= c;
PR("a |= c; a = ", a);
b &= c;
PR("b &= c; b = ", b);
b ^= a;
PR("b ^= a; b = ", b);
} ///:~
174
Thinking in C++
img
Once again, a preprocessor macro is used to save typing. It prints
the string of your choice, then the binary representation of an
expression, then a newline.
In main( ), the variables are unsigned This is because, in general,
.
you don't want signs when you are working with bytes. An int
must be used instead of a char for getval because the "cin >>"
statement will otherwise treat the first digit as a character. By
assigning getval to a and b, the value is converted to a single byte
(by truncating it).
The << and >> provide bit-shifting behavior, but when they shift
bits off the end of the number, those bits are lost (it's commonly
said that they fall into the mythical bit bucket, a place where
discarded bits end up, presumably so they can be reused...). When
manipulating bits you can also perform rotation, which means that
the bits that fall off one end are inserted back at the other end, as if
they're being rotated around a loop. Even though most computer
processors provide a machine-level rotate command (so you'll see
it in the assembly language for that processor), there is no direct
support for "rotate" in C or C++. Presumably the designers of C felt
justified in leaving "rotate" off (aiming, as they said, for a minimal
language) because you can build your own rotate command. For
example, here are functions to perform left and right rotations:
//: C03:Rotation.cpp {O}
// Perform left and right rotations
unsigned char rol(unsigned char val) {
int highbit;
if(val & 0x80) // 0x80 is the high bit only
highbit = 1;
else
highbit = 0;
// Left shift (bottom bit becomes 0):
val <<= 1;
// Rotate the high bit onto the bottom:
val |= highbit;
return val;
3: The C in C++
175
img
}
unsigned char ror(unsigned char val) {
int lowbit;
if(val & 1) // Check the low bit
lowbit = 1;
else
lowbit = 0;
val >>= 1; // Right shift by one position
// Rotate the low bit onto the top:
val |= (lowbit << 7);
return val;
} ///:~
Try using these functions in Bitwise.cpp Notice the definitions (or
.
at least declarations) of rol( ) and ror( ) must be seen by the
compiler in Bitwise.cppbefore the functions are used.
The bitwise functions are generally extremely efficient to use
because they translate directly into assembly language statements.
Sometimes a single C or C++ statement will generate a single line
of assembly code.
Unary operators
Bitwise not isn't the only operator that takes a single argument. Its
companion, the logical not (!), will take a true value and produce a
false value. The unary minus (-) and unary plus (+) are the same
operators as binary minus and plus; the compiler figures out which
usage is intended by the way you write the expression. For
instance, the statement
x = -a;
has an obvious meaning. The compiler can figure out:
x = a * -b;
but the reader might get confused, so it is safer to say:
x = a * (-b);
176
Thinking in C++
img
The unary minus produces the negative of the value. Unary plus
provides symmetry with unary minus, although it doesn't actually
do anything.
The increment and decrement operators (++ and --) were
introduced earlier in this chapter. These are the only operators
other than those involving assignment that have side effects. These
operators increase or decrease the variable by one unit, although
"unit" can have different meanings according to the data type ­ this
is especially true with pointers.
The last unary operators are the address-of (&), dereference (* and -
>), and cast operators in C and C++, and new and delete in C++.
Address-of and dereference are used with pointers, described in
this chapter. Casting is described later in this chapter, and new and
delete are introduced in Chapter 4.
The ternary operator
The ternary if-else is unusual because it has three operands. It is
truly an operator because it produces a value, unlike the ordinary
if-else statement. It consists of three expressions: if the first
expression (followed by a ?) evaluates to true, the expression
following the ? is evaluated and its result becomes the value
produced by the operator. If the first expression is false, the third
expression (following a :) is executed and its result becomes the
value produced by the operator.
The conditional operator can be used for its side effects or for the
value it produces. Here's a code fragment that demonstrates both:
a = --b ? b : (b = -99);
Here, the conditional produces the rvalue. a is assigned to the value
of b if the result of decrementing b is nonzero. If b became zero, a
and b are both assigned to -99. b is always decremented, but it is
assigned to -99 only if the decrement causes b to become 0. A
3: The C in C++
177
img
similar statement can be used without the "a =" just for its side
effects:
--b ? b : (b = -99);
Here the second B is superfluous, since the value produced by the
operator is unused. An expression is required between the ? and :.
In this case, the expression could simply be a constant that might
make the code run a bit faster.
The comma operator
The comma is not restricted to separating variable names in
multiple definitions, such as
int i, j, k;
Of course, it's also used in function argument lists. However, it can
also be used as an operator to separate expressions ­ in this case it
produces only the value of the last expression. All the rest of the
expressions in the comma-separated list are evaluated only for their
side effects. This example increments a list of variables and uses the
last one as the rvalue:
//: C03:CommaOperator.cpp
#include <iostream>
using namespace std;
int main() {
int a = 0, b = 1, c = 2, d = 3, e = 4;
a = (b++, c++, d++, e++);
cout << "a = " << a << endl;
// The parentheses are critical here. Without
// them, the statement will evaluate to:
(a = b++), c++, d++, e++;
cout << "a = " << a << endl;
} ///:~
In general, it's best to avoid using the comma as anything other
than a separator, since people are not used to seeing it as an
operator.
178
Thinking in C++
img
Common pitfalls when using operators
As illustrated above, one of the pitfalls when using operators is
trying to get away without parentheses when you are even the least
bit uncertain about how an expression will evaluate (consult your
local C manual for the order of expression evaluation).
Another extremely common error looks like this:
//: C03:Pitfall.cpp
// Operator mistakes
int main() {
int a = 1, b = 1;
while(a = b) {
// ....
}
} ///:~
The statement a = b will always evaluate to true when b is non-
zero. The variable a is assigned to the value of b, and the value of b
is also produced by the operator =. In general, you want to use the
equivalence operator == inside a conditional statement, not
assignment. This one bites a lot of programmers (however, some
compilers will point out the problem to you, which is helpful).
A similar problem is using bitwise and and or instead of their
logical counterparts. Bitwise and and or use one of the characters (&
or |), while logical and and or use two (&& and ||). Just as with =
and ==, it's easy to just type one character instead of two. A useful
mnemonic device is to observe that "Bits are smaller, so they don't
need as many characters in their operators."
Casting operators
The word cast is used in the sense of "casting into a mold." The
compiler will automatically change one type of data into another if
it makes sense. For instance, if you assign an integral value to a
floating-point variable, the compiler will secretly call a function (or
more probably, insert code) to convert the int to a float. Casting
3: The C in C++
179
img
allows you to make this type conversion explicit, or to force it when
it wouldn't normally happen.
To perform a cast, put the desired data type (including all
modifiers) inside parentheses to the left of the value. This value can
be a variable, a constant, the value produced by an expression, or
the return value of a function. Here's an example:
//: C03:SimpleCast.cpp
int main() {
int b = 200;
unsigned long a = (unsigned long int)b;
} ///:~
Casting is powerful, but it can cause headaches because in some
situations it forces the compiler to treat data as if it were (for
instance) larger than it really is, so it will occupy more space in
memory; this can trample over other data. This usually occurs
when casting pointers, not when making simple casts like the one
shown above.
C++ has an additional casting syntax, which follows the function
call syntax. This syntax puts the parentheses around the argument,
like a function call, rather than around the data type:
//: C03:FunctionCallCast.cpp
int main() {
float a = float(200);
// This is equivalent to:
float b = (float)200;
} ///:~
Of course in the case above you wouldn't really need a cast; you
could just say 200f (in effect, that's typically what the compiler will
do for the above expression). Casts are generally used instead with
variables, rather than constants.
180
Thinking in C++
img
C++ explicit casts
Casts should be used carefully, because what you are actually
doing is saying to the compiler "Forget type checking ­ treat it as
this other type instead." That is, you're introducing a hole in the
C++ type system and preventing the compiler from telling you that
you're doing something wrong with a type. What's worse, the
compiler believes you implicitly and doesn't perform any other
checking to catch errors. Once you start casting, you open yourself
up for all kinds of problems. In fact, any program that uses a lot of
casts should be viewed with suspicion, no matter how much you
are told it simply "must" be done that way. In general, casts should
be few and isolated to the solution of very specific problems.
Once you understand this and are presented with a buggy
program, your first inclination may be to look for casts as culprits.
But how do you locate C-style casts? They are simply type names
inside of parentheses, and if you start hunting for such things you'll
discover that it's often hard to distinguish them from the rest of
your code.
Standard C++ includes an explicit cast syntax that can be used to
completely replace the old C-style casts (of course, C-style casts
cannot be outlawed without breaking code, but compiler writers
could easily flag old-style casts for you). The explicit cast syntax is
such that you can easily find them, as you can see by their names:
static_cast
For "well-behaved" and
"reasonably well-behaved" casts,
including things you might now
do without a cast (such as an
automatic type conversion).
const_cast
To cast away const and/or
volatile
.
reinterpret_cast
To cast to a completely different
meaning. The key is that you'll
3: The C in C++
181
img
need to cast back to the original
type to use it safely. The type you
cast to is typically used only for
bit twiddling or some other
mysterious purpose. This is the
most dangerous of all the casts.
dynamic_cast
For type-safe downcasting (this
cast will be described in Chapter
15).
The first three explicit casts will be described more completely in
the following sections, while the last one can be demonstrated only
after you've learned more, in Chapter 15.
static_cast
A static_castis used for all conversions that are well-defined. These
include "safe" conversions that the compiler would allow you to do
without a cast and less-safe conversions that are nonetheless well-
defined. The types of conversions covered by static_castinclude
typical castless conversions, narrowing (information-losing)
conversions, forcing a conversion from a void*, implicit type
conversions, and static navigation of class hierarchies (since you
haven't seen classes and inheritance yet, this last topic will be
delayed until Chapter 15):
//: C03:static_cast.cpp
void func(int) {}
int main() {
int i = 0x7fff; // Max pos value = 32767
long l;
float f;
// (1) Typical castless conversions:
l = i;
f = i;
// Also works:
l = static_cast<long>(i);
f = static_cast<float>(i);
182
Thinking in C++
img
// (2) Narrowing conversions:
i = l; // May lose digits
i = f; // May lose info
// Says "I know," eliminates warnings:
i = static_cast<int>(l);
i = static_cast<int>(f);
char c = static_cast<char>(i);
// (3) Forcing a conversion from void* :
void* vp = &i;
// Old way produces a dangerous conversion:
float* fp = (float*)vp;
// The new way is equally dangerous:
fp = static_cast<float*>(vp);
// (4) Implicit type conversions, normally
// performed by the compiler:
double d = 0.0;
int x = d; // Automatic type conversion
x = static_cast<int>(d); // More explicit
func(d); // Automatic type conversion
func(static_cast<int>(d)); // More explicit
} ///:~
In Section (1), you see the kinds of conversions you're used to
doing in C, with or without a cast. Promoting from an int to a long
or float is not a problem because the latter can always hold every
value that an int can contain. Although it's unnecessary, you can
use static_castto highlight these promotions.
Converting back the other way is shown in (2). Here, you can lose
data because an int is not as "wide" as a long or a float; it won't
hold numbers of the same size. Thus these are called narrowing
conversions. The compiler will still perform these, but will often give
you a warning. You can eliminate this warning and indicate that
you really did mean it using a cast.
Assigning from a void* is not allowed without a cast in C++ (unlike
C), as seen in (3). This is dangerous and requires that programmers
3: The C in C++
183
img
know what they're doing. The static_cast at least, is easier to locate
,
than the old standard cast when you're hunting for bugs.
Section (4) of the program shows the kinds of implicit type
conversions that are normally performed automatically by the
compiler. These are automatic and require no casting, but again
static_casthighlights the action in case you want to make it clear
what's happening or hunt for it later.
const_cast
If you want to convert from a const to a nonconst or from a volatile
to a nonvolatile you use const_cast This is the only conversion
,
.
allowed with const_cast if any other conversion is involved it must
;
be done using a separate expression or you'll get a compile-time
error.
//: C03:const_cast.cpp
int main() {
const int i = 0;
int* j = (int*)&i; // Deprecated form
j  = const_cast<int*>(&i); // Preferred
// Can't do simultaneous additional casting:
//! long* l = const_cast<long*>(&i); // Error
volatile int k = 0;
int* u = const_cast<int*>(&k);
} ///:~
If you take the address of a const object, you produce a pointer to a
const, and this cannot be assigned to a nonconst pointer without a
cast. The old-style cast will accomplish this, but the const_castis
the appropriate one to use. The same holds true for volatile
.
reinterpret_cast
This is the least safe of the casting mechanisms, and the one most
likely to produce bugs. A reinterpret_cast
pretends that an object is
just a bit pattern that can be treated (for some dark purpose) as if it
were an entirely different type of object. This is the low-level bit
twiddling that C is notorious for. You'll virtually always need to
184
Thinking in C++
img
reinterpret_cast
back to the original type (or otherwise treat the
variable as its original type) before doing anything else with it.
//: C03:reinterpret_cast.cpp
#include <iostream>
using namespace std;
const int sz = 100;
struct X { int a[sz]; };
void print(X* x) {
for(int i = 0; i < sz; i++)
cout << x->a[i] << ' ';
cout << endl << "--------------------" << endl;
}
int main() {
X x;
print(&x);
int* xp = reinterpret_cast<int*>(&x);
for(int* i = xp; i < xp + sz; i++)
*i = 0;
// Can't use xp as an X* at this point
// unless you cast it back:
print(reinterpret_cast<X*>(xp));
// In this example, you can also just use
// the original identifier:
print(&x);
} ///:~
In this simple example, struct Xjust contains an array of int, but
when you create one on the stack as in X x, the values of each of the
ints are garbage (this is shown using the print( )function to display
the contents of the struct). To initialize them, the address of the X is
taken and cast to an int pointer, which is then walked through the
array to set each int to zero. Notice how the upper bound for i is
calculated by "adding" sz to xp; the compiler knows that you
actually want sz pointer locations greater than xp and it does the
correct pointer arithmetic for you.
The idea of reinterpret_cast that when you use it, what you get is
is
so foreign that it cannot be used for the type's original purpose
3: The C in C++
185
img
unless you cast it back. Here, we see the cast back to an X* in the
call to print, but of course since you still have the original identifier
you can also use that. But the xp is only useful as an int*, which is
truly a "reinterpretation" of the original X.
A reinterpret_cast
often indicates inadvisable and/or nonportable
programming, but it's available when you decide you have to use
it.
sizeof ­ an operator by itself
The sizeof operator stands alone because it satisfies an unusual
need. sizeof gives you information about the amount of memory
allocated for data items. As described earlier in this chapter, sizeof
tells you the number of bytes used by any particular variable. It can
also give the size of a data type (with no variable name):
//: C03:sizeof.cpp
#include <iostream>
using namespace std;
int main() {
cout << "sizeof(double) = " << sizeof(double);
cout << ", sizeof(char) = " << sizeof(char);
} ///:~
By definition, the sizeof any type of char (signed, unsignedor
plain) is always one, regardless of whether the underlying storage
for a char is actually one byte. For all other types, the result is the
size in bytes.
Note that sizeof is an operator, not a function. If you apply it to a
type, it must be used with the parenthesized form shown above,
but if you apply it to a variable you can use it without parentheses:
//: C03:sizeofOperator.cpp
int main() {
int x;
int i = sizeof x;
} ///:~
186
Thinking in C++
img
sizeof can also give you the sizes of user-defined data types. This is
used later in the book.
The asm keyword
This is an escape mechanism that allows you to write assembly
code for your hardware within a C++ program. Often you're able
to reference C++ variables within the assembly code, which means
you can easily communicate with your C++ code and limit the
assembly code to that necessary for efficiency tuning or to use
special processor instructions. The exact syntax that you must use
when writing the assembly language is compiler-dependent and
can be discovered in your compiler's documentation.
Explicit operators
These are keywords for bitwise and logical operators. Non-U.S.
programmers without keyboard characters like &, |, ^, and so on,
were forced to use C's horrible trigraphs, which were not only
annoying to type, but obscure when reading. This is repaired in
C++ with additional keywords:
Keyword
Meaning
and
&& (logical and)
or
|| (logical or)
not
! (logical NOT)
not_eq
!= (logical not-equivalent)
bitand
& (bitwise and)
and_eq
&= (bitwise and-assignment)
bitor
| (bitwise or)
or_eq
|= (bitwise or-assignment)
xor
^ (bitwise exclusive-or)
3: The C in C++
187
img
Keyword
Meaning
xor_eq
^= (bitwise exclusive-or-
assignment)
compl
~ (ones complement)
If your compiler complies with Standard C++, it will support these
keywords.
Composite type creation
The fundamental data types and their variations are essential, but
rather primitive. C and C++ provide tools that allow you to
compose more sophisticated data types from the fundamental data
types. As you'll see, the most important of these is struct, which is
the foundation for class in C++. However, the simplest way to
create more sophisticated types is simply to alias a name to another
name via typedef.
Aliasing names with typedef
This keyword promises more than it delivers: typedef suggests
"type definition" when "alias" would probably have been a more
accurate description, since that's what it really does. The syntax is:
typedef existing-type-description alias-name
People often use typedef when data types get slightly complicated,
just to prevent extra keystrokes. Here is a commonly-used typedef:
typedef unsigned long ulong;
Now if you say ulong the compiler knows that you mean unsigned
long. You might think that this could as easily be accomplished
using preprocessor substitution, but there are key situations in
which the compiler must be aware that you're treating a name as if
it were a type, so typedef is essential.
188
Thinking in C++
img
One place where typedef comes in handy is for pointer types. As
previously mentioned, if you say:
int* x, y;
This actually produces an int* which is x and an int (not an int*)
which is y. That is, the `*' binds to the right, not the left. However,
if you use a typedef:
typedef int* IntPtr;
IntPtr x, y;
Then both x and y are of type int*.
You can argue that it's more explicit and therefore more readable to
avoid typedefs for primitive types, and indeed programs rapidly
become difficult to read when many typedefs are used. However,
typedefs become especially important in C when used with struct.
Combining variables with struct
A struct is a way to collect a group of variables into a structure.
Once you create a struct, then you can make many instances of this
"new" type of variable you've invented. For example:
//: C03:SimpleStruct.cpp
struct Structure1 {
char c;
int i;
float f;
double d;
};
int main() {
struct Structure1 s1, s2;
s1.c = 'a'; // Select an element using a '.'
s1.i = 1;
s1.f = 3.14;
s1.d = 0.00093;
s2.c = 'a';
s2.i = 1;
s2.f = 3.14;
3: The C in C++
189
img
s2.d = 0.00093;
} ///:~
The struct declaration must end with a semicolon. In main( ), two
instances of Structure1are created: s1 and s2. Each of these has
their own separate versions of c, i, f, and d. So s1 and s2 represent
clumps of completely independent variables. To select one of the
elements within s1 or s2, you use a `.', syntax you've seen in the
previous chapter when using C++ class objects ­ since classes
evolved from structs, this is where that syntax arose from.
One thing you'll notice is the awkwardness of the use of Structure1
(as it turns out, this is only required by C, not C++). In C, you can't
just say Structure1when you're defining variables, you must say
struct Structure1This is where typedef becomes especially handy
.
in C:
//: C03:SimpleStruct2.cpp
// Using typedef with struct
typedef struct {
char c;
int i;
float f;
double d;
} Structure2;
int main() {
Structure2 s1, s2;
s1.c = 'a';
s1.i = 1;
s1.f = 3.14;
s1.d = 0.00093;
s2.c = 'a';
s2.i = 1;
s2.f = 3.14;
s2.d = 0.00093;
} ///:~
By using typedef in this way, you can pretend (in C; try removing
the typedef for C++) that Structure2is a built-in type, like int or
float, when you define s1 and s2 (but notice it only has data ­
190
Thinking in C++
img
characteristics ­ and does not include behavior, which is what we
get with real objects in C++). You'll notice that the struct identifier
has been left off at the beginning, because the goal is to create the
typedef. However, there are times when you might need to refer to
the struct during its definition. In those cases, you can actually
repeat the name of the struct as the struct name and as the typedef:
//: C03:SelfReferential.cpp
// Allowing a struct to refer to itself
typedef struct SelfReferential {
int i;
SelfReferential* sr; // Head spinning yet?
} SelfReferential;
int main() {
SelfReferential sr1, sr2;
sr1.sr = &sr2;
sr2.sr = &sr1;
sr1.i = 47;
sr2.i = 1024;
} ///:~
If you look at this for awhile, you'll see that sr1 and sr2 point to
each other, as well as each holding a piece of data.
Actually, the struct name does not have to be the same as the
typedef name, but it is usually done this way as it tends to keep
things simpler.
Pointers and structs
In the examples above, all the structs are manipulated as objects.
However, like any piece of storage, you can take the address of a
struct object (as seen in SelfReferential.cpp
above). To select the
elements of a particular struct object, you use a `.', as seen above.
However, if you have a pointer to a struct object, you must select
an element of that object using a different operator: the `->'. Here's
an example:
//: C03:SimpleStruct3.cpp
3: The C in C++
191
img
// Using pointers to structs
typedef struct Structure3 {
char c;
int i;
float f;
double d;
} Structure3;
int main() {
Structure3 s1, s2;
Structure3* sp = &s1;
sp->c = 'a';
sp->i = 1;
sp->f = 3.14;
sp->d = 0.00093;
sp = &s2; // Point to a different struct object
sp->c = 'a';
sp->i = 1;
sp->f = 3.14;
sp->d = 0.00093;
} ///:~
In main( ), the struct pointer sp is initially pointing to s1, and the
members of s1 are initialized by selecting them with the `->' (and
you use this same operator in order to read those members). But
then sp is pointed to s2, and those variables are initialized the same
way. So you can see that another benefit of pointers is that they can
be dynamically redirected to point to different objects; this
provides more flexibility in your programming, as you will learn.
For now, that's all you need to know about structs, but you'll
become much more comfortable with them (and especially their
more potent successors, classes) as the book progresses.
Clarifying programs with enum
An enumerated data type is a way of attaching names to numbers,
thereby giving more meaning to anyone reading the code. The
enum keyword (from C) automatically enumerates any list of
identifiers you give it by assigning them values of 0, 1, 2, etc. You
can declare enum variables (which are always represented as
192
Thinking in C++
img
integral values). The declaration of an enum looks similar to a
struct declaration.
An enumerated data type is useful when you want to keep track of
some sort of feature:
//: C03:Enum.cpp
// Keeping track of shapes
enum ShapeType {
circle,
square,
rectangle
};  // Must end with a semicolon like a struct
int main() {
ShapeType shape = circle;
// Activities here....
// Now do something based on what the shape is:
switch(shape) {
case circle:  /* circle stuff */ break;
case square:  /* square stuff */ break;
case rectangle:  /* rectangle stuff */ break;
}
} ///:~
shape is a variable of the ShapeTypeenumerated data type, and its
value is compared with the value in the enumeration. Since shape
is really just an int, however, it can be any value an int can hold
(including a negative number). You can also compare an int
variable with a value in the enumeration.
You should be aware that the example above of switching on type
turns out to be a problematic way to program. C++ has a much
better way to code this sort of thing, the explanation of which must
be delayed until much later in the book.
If you don't like the way the compiler assigns values, you can do it
yourself, like this:
enum ShapeType {
3: The C in C++
193
img
circle = 10, square = 20, rectangle = 50
};
If you give values to some names and not to others, the compiler
will use the next integral value. For example,
enum snap { crackle = 25, pop };
The compiler gives pop the value 26.
You can see how much more readable the code is when you use
enumerated data types. However, to some degree this is still an
attempt (in C) to accomplish the things that we can do with a class
in C++, so you'll see enum used less in C++.
Type checking for enumerations
C's enumerations are fairly primitive, simply associating integral
values with names, but they provide no type checking. In C++, as
you may have come to expect by now, the concept of type is
fundamental, and this is true with enumerations. When you create
a named enumeration, you effectively create a new type just as you
do with a class: The name of your enumeration becomes a reserved
word for the duration of that translation unit.
In addition, there's stricter type checking for enumerations in C++
than in C. You'll notice this in particular if you have an instance of
an enumeration color called a. In C you can say a++, but in C++
you can't. This is because incrementing an enumeration is
performing two type conversions, one of them legal in C++ and one
of them illegal. First, the value of the enumeration is implicitly cast
from a color to an int, then the value is incremented, then the int is
cast back into a color. In C++ this isn't allowed, because color is a
distinct type and not equivalent to an int. This makes sense,
because how do you know the increment of blue will even be in the
list of colors? If you want to increment a color, then it should be a
class (with an increment operation) and not an enum, because the
class can be made to be much safer. Any time you write code that
194
Thinking in C++
img
assumes an implicit conversion to an enum type, the compiler will
flag this inherently dangerous activity.
Unions (described next) have similar additional type checking in
C++.
Saving memory with union
Sometimes a program will handle different types of data using the
same variable. In this situation, you have two choices: you can
create a struct containing all the possible different types you might
need to store, or you can use a union. A union piles all the data
into a single space; it figures out the amount of space necessary for
the largest item you've put in the union, and makes that the size of
the union. Use a union to save memory.
Anytime you place a value in a union, the value always starts in
the same place at the beginning of the union, but only uses as much
space as is necessary. Thus, you create a "super-variable" capable
of holding any of the union variables. All the addresses of the
union variables are the same (in a class or struct, the addresses are
different).
Here's a simple use of a union. Try removing various elements and
see what effect it has on the size of the union. Notice that it makes
no sense to declare more than one instance of a single data type in a
union (unless you're just doing it to use a different name).
//: C03:Union.cpp
// The size and simple use of a union
#include <iostream>
using namespace std;
union Packed { // Declaration similar to a class
char i;
short j;
int k;
long l;
float f;
3: The C in C++
195
img
double d;
// The union will be the size of a
// double, since that's the largest element
};  // Semicolon ends a union, like a struct
int main() {
cout << "sizeof(Packed) = "
<< sizeof(Packed) << endl;
Packed x;
x.i = 'c';
cout << x.i << endl;
x.d = 3.14159;
cout << x.d << endl;
} ///:~
The compiler performs the proper assignment according to the
union member you select.
Once you perform an assignment, the compiler doesn't care what
you do with the union. In the example above, you could assign a
floating-point value to x:
x.f = 2.222;
and then send it to the output as if it were an int:
cout << x.i;
This would produce garbage.
Arrays
Arrays are a kind of composite type because they allow you to
clump a lot of variables together, one right after the other, under a
single identifier name. If you say:
int a[10];
You create storage for 10 int variables stacked on top of each other,
but without unique identifier names for each variable. Instead, they
are all lumped under the name a.
196
Thinking in C++
img
To access one of these array elements, you use the same square-
bracket syntax that you use to define an array:
a[5] = 47;
However, you must remember that even though the size of a is 10,
you select array elements starting at zero (this is sometimes called
zero indexing), so you can select only the array elements 0-9, like
this:
//: C03:Arrays.cpp
#include <iostream>
using namespace std;
int main() {
int a[10];
for(int i = 0; i < 10; i++) {
a[i] = i * 10;
cout << "a[" << i << "] = " << a[i] << endl;
}
} ///:~
Array access is extremely fast. However, if you index past the end
of the array, there is no safety net ­ you'll step on other variables.
The other drawback is that you must define the size of the array at
compile time; if you want to change the size at runtime you can't
do it with the syntax above (C does have a way to create an array
dynamically, but it's significantly messier). The C++ vector,
introduced in the previous chapter, provides an array-like object
that automatically resizes itself, so it is usually a much better
solution if your array size cannot be known at compile time.
You can make an array of any type, even of structs:
//: C03:StructArray.cpp
// An array of struct
typedef struct {
int i, j, k;
} ThreeDpoint;
3: The C in C++
197
img
int main() {
ThreeDpoint p[10];
for(int i = 0; i < 10; i++) {
p[i].i = i + 1;
p[i].j = i + 2;
p[i].k = i + 3;
}
} ///:~
Notice how the struct identifier i is independent of the for loop's i.
To see that each element of an array is contiguous with the next,
you can print out the addresses like this:
//: C03:ArrayAddresses.cpp
#include <iostream>
using namespace std;
int main() {
int a[10];
cout << "sizeof(int) = "<< sizeof(int) << endl;
for(int i = 0; i < 10; i++)
cout << "&a[" << i << "] = "
<< (long)&a[i] << endl;
} ///:~
When you run this program, you'll see that each element is one int
size away from the previous one. That is, they are stacked one on
top of the other.
Pointers and arrays
The identifier of an array is unlike the identifiers for ordinary
variables. For one thing, an array identifier is not an lvalue; you
cannot assign to it. It's really just a hook into the square-bracket
syntax, and when you give the name of an array, without square
brackets, what you get is the starting address of the array:
//: C03:ArrayIdentifier.cpp
#include <iostream>
using namespace std;
int main() {
198
Thinking in C++
img
int a[10];
cout << "a = " << a << endl;
cout << "&a[0] =" << &a[0] << endl;
} ///:~
When you run this program you'll see that the two addresses
(which will be printed in hexadecimal, since there is no cast to
long) are the same.
So one way to look at the array identifier is as a read-only pointer
to the beginning of an array. And although we can't change the
array identifier to point somewhere else, we can create another
pointer and use that to move around in the array. In fact, the
square-bracket syntax works with regular pointers as well:
//: C03:PointersAndBrackets.cpp
int main() {
int a[10];
int* ip = a;
for(int i = 0; i < 10; i++)
ip[i] = i * 10;
} ///:~
The fact that naming an array produces its starting address turns
out to be quite important when you want to pass an array to a
function. If you declare an array as a function argument, what
you're really declaring is a pointer. So in the following example,
func1( )and func2( )effectively have the same argument lists:
//: C03:ArrayArguments.cpp
#include <iostream>
#include <string>
using namespace std;
void func1(int a[], int size) {
for(int i = 0; i < size; i++)
a[i] = i * i - i;
}
void func2(int* a, int size) {
for(int i = 0; i < size; i++)
a[i] = i * i + i;
3: The C in C++
199
img
.
}
void print(int a[], string name, int size) {
for(int i = 0; i < size; i++)
cout << name << "[" << i << "] = "
<< a[i] << endl;
}
int main() {
int a[5], b[5];
// Probably garbage values:
print(a, "a", 5);
print(b, "b", 5);
// Initialize the arrays:
func1(a, 5);
func1(b, 5);
print(a, "a", 5);
print(b, "b", 5);
// Notice the arrays are always modified:
func2(a, 5);
func2(b, 5);
print(a, "a", 5);
print(b, "b", 5);
} ///:~
Even though func1( )and func2( )declare their arguments
differently, the usage is the same inside the function. There are
some other issues that this example reveals: arrays cannot be
passed by value3, that is, you never automatically get a local copy
of the array that you pass into a function. Thus, when you modify
an array, you're always modifying the outside object. This can be a
bit confusing at first, if you're expecting the pass-by-value
provided with ordinary arguments.
3 Unless you take the very strict approach that "all argument passing in C/C++ is by
value, and the `value' of an array is what is produced by the array identifier: it's
address." This can be seen as true from the assembly-language standpoint, but I don't
think it helps when trying to work with higher-level concepts. The addition of
references in C++ makes the "all passing is by value" argument more confusing, to
the point where I feel it's more helpful to think in terms of "passing by value" vs.
"passing addresses."
200
Thinking in C++
img
You'll notice that print( )uses the square-bracket syntax for array
arguments. Even though the pointer syntax and the square-bracket
syntax are effectively the same when passing arrays as arguments,
the square-bracket syntax makes it clearer to the reader that you
mean for this argument to be an array.
Also note that the size argument is passed in each case. Just passing
the address of an array isn't enough information; you must always
be able to know how big the array is inside your function, so you
don't run off the end of that array.
Arrays can be of any type, including arrays of pointers. In fact,
when you want to pass command-line arguments into your
program, C and C++ have a special argument list for main( ),
which looks like this:
int main(int argc, char* argv[]) { // ...
The first argument is the number of elements in the array, which is
the second argument. The second argument is always an array of
char*, because the arguments are passed from the command line as
character arrays (and remember, an array can be passed only as a
pointer). Each whitespace-delimited cluster of characters on the
command line is turned into a separate array argument. The
following program prints out all its command-line arguments by
stepping through the array:
//: C03:CommandLineArgs.cpp
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
cout << "argc = " << argc << endl;
for(int i = 0; i < argc; i++)
cout << "argv[" << i << "] = "
<< argv[i] << endl;
} ///:~
3: The C in C++
201
img
You'll notice that argv[0] is the path and name of the program
itself. This allows the program to discover information about itself.
It also adds one more to the array of program arguments, so a
common error when fetching command-line arguments is to grab
argv[0] when you want argv[1].
You are not forced to use argc and argv as identifiers in main( );
those identifiers are only conventions (but it will confuse people if
you don't use them). Also, there is an alternate way to declare argv:
int main(int argc, char** argv) { // ...
Both forms are equivalent, but I find the version used in this book
to be the most intuitive when reading the code, since it says,
directly, "This is an array of character pointers."
All you get from the command-line is character arrays; if you want
to treat an argument as some other type, you are responsible for
converting it inside your program. To facilitate the conversion to
numbers, there are some helper functions in the Standard C library,
declared in <cstdlib> The simplest ones to use are atoi( ), atol( ),
.
and atof( ) to convert an ASCII character array to an int, long, and
double floating-point value, respectively. Here's an example using
atoi( ) (the other two functions are called the same way):
//: C03:ArgsToInts.cpp
// Converting command-line arguments to ints
#include <iostream>
#include <cstdlib>
using namespace std;
int main(int argc, char* argv[]) {
for(int i = 1; i < argc; i++)
cout << atoi(argv[i]) << endl;
} ///:~
In this program, you can put any number of arguments on the
command line. You'll notice that the for loop starts at the value 1 to
skip over the program name at argv[0]. Also, if you put a floating-
202
Thinking in C++
img
point number containing a decimal point on the command line,
atoi( ) takes only the digits up to the decimal point. If you put non-
numbers on the command line, these come back from atoi( ) as
zero.
Exploring floating-point format
The printBinary( )function introduced earlier in this chapter is
handy for delving into the internal structure of various data types.
The most interesting of these is the floating-point format that
allows C and C++ to store numbers representing very large and
very small values in a limited amount of space. Although the
details can't be completely exposed here, the bits inside of floats
and doubles are divided into three regions: the exponent, the
mantissa, and the sign bit; thus it stores the values using scientific
notation. The following program allows you to play around by
printing out the binary patterns of various floating point numbers
so you can deduce for yourself the scheme used in your compiler's
floating-point format (usually this is the IEEE standard for floating
point numbers, but your compiler may not follow that):
//: C03:FloatingAsBinary.cpp
//{L} printBinary
//{T} 3.14159
#include "printBinary.h"
#include <cstdlib>
#include <iostream>
using namespace std;
int main(int argc, char* argv[]) {
if(argc != 2) {
cout << "Must provide a number" << endl;
exit(1);
}
double d = atof(argv[1]);
unsigned char* cp =
reinterpret_cast<unsigned char*>(&d);
for(int i = sizeof(double); i > 0 ; i -= 2) {
printBinary(cp[i-1]);
printBinary(cp[i]);
}
3: The C in C++
203
img
} ///:~
First, the program guarantees that you've given it an argument by
checking the value of argc, which is two if there's a single argument
(it's one if there are no arguments, since the program name is
always the first element of argv). If this fails, a message is printed
and the Standard C Library function exit( ) is called to terminate
the program.
The program grabs the argument from the command line and
converts the characters to a double using atof( ). Then the double is
treated as an array of bytes by taking the address and casting it to
an unsigned char* Each of these bytes is passed to printBinary( )
.
for display.
This example has been set up to print the bytes in an order such
that the sign bit appears first ­ on my machine. Yours may be
different, so you might want to re-arrange the way things are
printed. You should also be aware that floating-point formats are
not trivial to understand; for example, the exponent and mantissa
are not generally arranged on byte boundaries, but instead a
number of bits is reserved for each one and they are packed into the
memory as tightly as possible. To truly see what's going on, you'd
need to find out the size of each part of the number (sign bits are
always one bit, but exponents and mantissas are of differing sizes)
and print out the bits in each part separately.
Pointer arithmetic
If all you could do with a pointer that points at an array is treat it as
if it were an alias for that array, pointers into arrays wouldn't be
very interesting. However, pointers are more flexible than this,
since they can be modified to point somewhere else (but remember,
the array identifier cannot be modified to point somewhere else).
Pointer arithmetic refers to the application of some of the arithmetic
operators to pointers. The reason pointer arithmetic is a separate
subject from ordinary arithmetic is that pointers must conform to
204
Thinking in C++
img
special constraints in order to make them behave properly. For
example, a common operator to use with pointers is ++, which
"adds one to the pointer." What this actually means is that the
pointer is changed to move to "the next value," whatever that
means. Here's an example:
//: C03:PointerIncrement.cpp
#include <iostream>
using namespace std;
int main() {
int i[10];
double d[10];
int* ip = i;
double* dp = d;
cout << "ip = "
<< (long)ip << endl;
ip++;
cout << "ip = "
<< (long)ip << endl;
cout << "dp = "
<< (long)dp << endl;
dp++;
cout << "dp = "
<< (long)dp << endl;
} ///:~
For one run on my machine, the output is:
ip
=
6684124
ip
=
6684128
dp
=
6684044
dp
=
6684052
What's interesting here is that even though the operation ++
appears to be the same operation for both the int* and the double*,
you can see that the pointer has been changed only 4 bytes for the
int* but 8 bytes for the double*. Not coincidentally, these are the
sizes of int and double on my machine. And that's the trick of
pointer arithmetic: the compiler figures out the right amount to
change the pointer so that it's pointing to the next element in the
array (pointer arithmetic is only meaningful within arrays). This
even works with arrays of structs:
//: C03:PointerIncrement2.cpp
3: The C in C++
205
img
#include <iostream>
using namespace std;
typedef struct {
char c;
short s;
int i;
long l;
float f;
double d;
long double ld;
} Primitives;
int main() {
Primitives p[10];
Primitives* pp = p;
cout << "sizeof(Primitives) = "
<< sizeof(Primitives) << endl;
cout << "pp = " << (long)pp << endl;
pp++;
cout << "pp = " << (long)pp << endl;
} ///:~
The output for one run on my machine was:
sizeof(Primitives) = 40
pp = 6683764
pp = 6683804
So you can see the compiler also does the right thing for pointers to
structs (and classes and unions).
Pointer arithmetic also works with the operators --, +, and -, but the
latter two operators are limited: you cannot add two pointers, and
if you subtract pointers the result is the number of elements
between the two pointers. However, you can add or subtract an
integral value and a pointer. Here's an example demonstrating the
use of pointer arithmetic:
//: C03:PointerArithmetic.cpp
#include <iostream>
using namespace std;
206
Thinking in C++
img
#define P(EX) cout << #EX << ": " << EX << endl;
int main() {
int a[10];
for(int i = 0; i < 10; i++)
a[i] = i; // Give it index values
int* ip = a;
P(*ip);
P(*++ip);
P(*(ip + 5));
int* ip2 = ip + 5;
P(*ip2);
P(*(ip2 - 4));
P(*--ip2);
P(ip2 - ip); // Yields number of elements
} ///:~
It begins with another macro, but this one uses a preprocessor
feature called stringizing (implemented with the `#' sign before an
expression) that takes any expression and turns it into a character
array. This is quite convenient, since it allows the expression to be
printed, followed by a colon, followed by the value of the
expression. In main( ) you can see the useful shorthand that is
produced.
Although pre- and postfix versions of ++ and -- are valid with
pointers, only the prefix versions are used in this example because
they are applied before the pointers are dereferenced in the
expressions above, so they allow us to see the effects of the
operations. Note that only integral values are being added and
subtracted; if two pointers were combined this way the compiler
would not allow it.
Here is the output of the program above:
*ip: 0
*++ip: 1
*(ip + 5): 6
*ip2: 6
*(ip2 - 4): 2
*--ip2: 5
3: The C in C++
207
img
In all cases, the pointer arithmetic results in the pointer being
adjusted to point to the "right place," based on the size of the
elements being pointed to.
If pointer arithmetic seems a bit overwhelming at first, don't worry.
Most of the time you'll only need to create arrays and index into
them with [ ], and the most sophisticated pointer arithmetic you'll
usually need is ++ and --. Pointer arithmetic is generally reserved
for more clever and complex programs, and many of the containers
in the Standard C++ library hide most of these clever details so you
don't have to worry about them.
Debugging hints
In an ideal environment, you have an excellent debugger available
that easily makes the behavior of your program transparent so you
can quickly discover errors. However, most debuggers have blind
spots, and these will require you to embed code snippets in your
program to help you understand what's going on. In addition, you
may be developing in an environment (such as an embedded
system, which is where I spent my formative years) that has no
debugger available, and perhaps very limited feedback (i.e. a one-
line LED display). In these cases you become creative in the ways
you discover and display information about the execution of your
program. This section suggests some techniques for doing this.
Debugging flags
If you hard-wire debugging code into a program, you can run into
problems. You start to get too much information, which makes the
bugs difficult to isolate. When you think you've found the bug you
start tearing out debugging code, only to find you need to put it
back in again. You can solve these problems with two types of
flags: preprocessor debugging flags and runtime debugging flags.
208
Thinking in C++
img
Preprocessor debugging flags
By using the preprocessor to #define one or more debugging flags
(preferably in a header file), you can test a flag using an #ifdef
statement and conditionally include debugging code. When you
think your debugging is finished, you can simply #undef the
flag(s) and the code will automatically be removed (and you'll
reduce the size and runtime overhead of your executable file).
It is best to decide on names for debugging flags before you begin
building your project so the names will be consistent. Preprocessor
flags are traditionally distinguished from variables by writing them
in all upper case. A common flag name is simply DEBUG (but be
careful you don't use NDEBUG, which is reserved in C). The
sequence of statements might be:
#define DEBUG // Probably in a header file
//...
#ifdef DEBUG // Check to see if flag is defined
/* debugging code here */
#endif // DEBUG
Most C and C++ implementations will also let you #define and
#undef flags from the compiler command line, so you can re-
compile code and insert debugging information with a single
command (preferably via the makefile, a tool that will be described
shortly). Check your local documentation for details.
Runtime debugging flags
In some situations it is more convenient to turn debugging flags on
and off during program execution, especially by setting them when
the program starts up using the command line. Large programs are
tedious to recompile just to insert debugging code.
To turn debugging code on and off dynamically, create bool flags:
//: C03:DynamicDebugFlags.cpp
#include <iostream>
#include <string>
using namespace std;
3: The C in C++
209
img
// Debug flags aren't necessarily global:
bool debug = false;
int main(int argc, char* argv[]) {
for(int i = 0; i < argc; i++)
if(string(argv[i]) == "--debug=on")
debug = true;
bool go = true;
while(go) {
if(debug) {
// Debugging code here
cout << "Debugger is now on!" << endl;
} else {
cout << "Debugger is now off." << endl;
}
cout << "Turn debugger [on/off/quit]: ";
string reply;
cin >> reply;
if(reply == "on") debug = true; // Turn it on
if(reply == "off") debug = false; // Off
if(reply == "quit") break; // Out of 'while'
}
} ///:~
This program continues to allow you to turn the debugging flag on
and off until you type "quit" to tell it you want to exit. Notice it
requires that full words are typed in, not just letters (you can
shorten it to letter if you wish). Also, a command-line argument can
optionally be used to turn debugging on at startup ­ this argument
can appear anyplace in the command line, since the startup code in
main( ) looks at all the arguments. The testing is quite simple
because of the expression:
string(argv[i])
This takes the argv[i] character array and creates a string, which
then can be easily compared to the right-hand side of the ==. The
program above searches for the entire string --debug=on You can
.
also look for --debug=and then see what's after that, to provide
more options. Volume 2 (available from )
devotes a chapter to the Standard C++ string class.
210
Thinking in C++
img
Although a debugging flag is one of the relatively few areas where
it makes a lot of sense to use a global variable, there's nothing that
says it must be that way. Notice that the variable is in lower case
letters to remind the reader it isn't a preprocessor flag.
Turning variables and expressions into strings
When writing debugging code, it is tedious to write print
expressions consisting of a character array containing the variable
name, followed by the variable. Fortunately, Standard C includes
the stringize operator `#', which was used earlier in this chapter.
When you put a # before an argument in a preprocessor macro, the
preprocessor turns that argument into a character array. This,
combined with the fact that character arrays with no intervening
punctuation are concatenated into a single character array, allows
you to make a very convenient macro for printing the values of
variables during debugging:
#define PR(x) cout << #x " = " << x << "\n";
If you print the variable a by calling the macro PR(a), it will have
the same effect as the code:
cout << "a = " << a << "\n";
This same process works with entire expressions. The following
program uses a macro to create a shorthand that prints the
stringized expression and then evaluates the expression and prints
the result:
//: C03:StringizingExpressions.cpp
#include <iostream>
using namespace std;
#define P(A) cout << #A << ": " << (A) << endl;
int main() {
int a = 1, b = 2, c = 3;
P(a); P(b); P(c);
P(a + b);
3: The C in C++
211
img
P((c - a)/b);
} ///:~
You can see how a technique like this can quickly become
indispensable, especially if you have no debugger (or must use
multiple development environments). You can also insert an #ifdef
to cause P(A) to be defined as "nothing" when you want to strip
out debugging.
The C assert( ) macro
In the standard header file <cassert>you'll find assert( ) which is a
,
convenient debugging macro. When you use assert( ) you give it
,
an argument that is an expression you are "asserting to be true."
The preprocessor generates code that will test the assertion. If the
assertion isn't true, the program will stop after issuing an error
message telling you what the assertion was and that it failed.
Here's a trivial example:
//: C03:Assert.cpp
// Use of the assert() debugging macro
#include <cassert>  // Contains the macro
using namespace std;
int main() {
int i = 100;
assert(i != 100); // Fails
} ///:~
The macro originated in Standard C, so it's also available in the
header file assert.h
.
When you are finished debugging, you can remove the code
generated by the macro by placing the line:
#define NDEBUG
in the program before the inclusion of <cassert> or by defining
,
NDEBUG on the compiler command line. NDEBUG is a flag used
in <cassert>to change the way code is generated by the macros.
212
Thinking in C++
img
Later in this book, you'll see some more sophisticated alternatives
to assert( )
.
Function addresses
Once a function is compiled and loaded into the computer to be
executed, it occupies a chunk of memory. That memory, and thus
the function, has an address.
C has never been a language to bar entry where others fear to tread.
You can use function addresses with pointers just as you can use
variable addresses. The declaration and use of function pointers
looks a bit opaque at first, but it follows the format of the rest of the
language.
Defining a function pointer
To define a pointer to a function that has no arguments and no
return value, you say:
void (*funcPtr)();
When you are looking at a complex definition like this, the best
way to attack it is to start in the middle and work your way out.
"Starting in the middle" means starting at the variable name, which
is funcPtr. "Working your way out" means looking to the right for
the nearest item (nothing in this case; the right parenthesis stops
you short), then looking to the left (a pointer denoted by the
asterisk), then looking to the right (an empty argument list
indicating a function that takes no arguments), then looking to the
left (void, which indicates the function has no return value). This
right-left-right motion works with most declarations.
To review, "start in the middle" ("funcPtr is a ..."), go to the right
(nothing there ­ you're stopped by the right parenthesis), go to the
left and find the `*' ("... pointer to a ..."), go to the right and find the
empty argument list ("... function that takes no arguments ... "), go
3: The C in C++
213
img
to the left and find the void ("funcPtr is a pointer to a function that
takes no arguments and returns void").
You may wonder why *funcPtrrequires parentheses. If you didn't
use them, the compiler would see:
void *funcPtr();
You would be declaring a function (that returns a void*) rather
than defining a variable. You can think of the compiler as going
through the same process you do when it figures out what a
declaration or definition is supposed to be. It needs those
parentheses to "bump up against" so it goes back to the left and
finds the `*', instead of continuing to the right and finding the
empty argument list.
Complicated declarations & definitions
As an aside, once you figure out how the C and C++ declaration
syntax works you can create much more complicated items. For
instance:
//: C03:ComplicatedDefinitions.cpp
/* 1. */
void * (*(*fp1)(int))[10];
/* 2. */
float (*(*fp2)(int,int,float))(int);
/* 3. */
typedef double (*(*(*fp3)())[10])();
fp3 a;
/* 4. */
int (*(*f4())[10])();
int main() {} ///:~
Walk through each one and use the right-left guideline to figure it
out. Number 1 says "fp1 is a pointer to a function that takes an
integer argument and returns a pointer to an array of 10 void
pointers."
214
Thinking in C++
img
Number 2 says "fp2 is a pointer to a function that takes three
arguments (int, int, and float) and returns a pointer to a function
that takes an integer argument and returns a float."
If you are creating a lot of complicated definitions, you might want
to use a typedef. Number 3 shows how a typedef saves typing the
complicated description every time. It says "An fp3 is a pointer to a
function that takes no arguments and returns a pointer to an array
of 10 pointers to functions that take no arguments and return
doubles." Then it says "a is one of these fp3 types." typedef is
generally useful for building complicated descriptions from simple
ones.
Number 4 is a function declaration instead of a variable definition.
It says "f4 is a function that returns a pointer to an array of 10
pointers to functions that return integers."
You will rarely if ever need such complicated declarations and
definitions as these. However, if you go through the exercise of
figuring them out you will not even be mildly disturbed with the
slightly complicated ones you may encounter in real life.
Using a function pointer
Once you define a pointer to a function, you must assign it to a
function address before you can use it. Just as the address of an
array arr[10] is produced by the array name without the brackets
(arr), the address of a function func() is produced by the function
name without the argument list (func). You can also use the more
explicit syntax &func(). To call the function, you dereference the
pointer in the same way that you declared it (remember that C and
C++ always try to make definitions look the same as the way they
are used). The following example shows how a pointer to a
function is defined and used:
//: C03:PointerToFunction.cpp
// Defining and using a pointer to a function
#include <iostream>
3: The C in C++
215
img
using namespace std;
void func() {
cout << "func() called..." << endl;
}
int main() {
void (*fp)();  // Define a function pointer
fp = func;  // Initialize it
(*fp)();
// Dereferencing calls the function
void (*fp2)() = func;  // Define and initialize
(*fp2)();
} ///:~
After the pointer to function fp is defined, it is assigned to the
address of a function func() using fp = func(notice the argument
list is missing on the function name). The second case shows
simultaneous definition and initialization.
Arrays of pointers to functions
One of the more interesting constructs you can create is an array of
pointers to functions. To select a function, you just index into the
array and dereference the pointer. This supports the concept of
table-driven code; instead of using conditionals or case statements,
you select functions to execute based on a state variable (or a
combination of state variables). This kind of design can be useful if
you often add or delete functions from the table (or if you want to
create or change such a table dynamically).
The following example creates some dummy functions using a
preprocessor macro, then creates an array of pointers to those
functions using automatic aggregate initialization. As you can see,
it is easy to add or remove functions from the table (and thus,
functionality from the program) by changing a small amount of
code:
//: C03:FunctionTable.cpp
// Using an array of pointers to functions
#include <iostream>
216
Thinking in C++
img
using namespace std;
// A macro to define dummy functions:
#define DF(N) void N() { \
cout << "function " #N " called..." << endl; }
DF(a); DF(b); DF(c); DF(d); DF(e); DF(f); DF(g);
void (*func_table[])() = { a, b, c, d, e, f, g };
int main() {
while(1) {
cout << "press a key from 'a' to 'g' "
"or q to quit" << endl;
char c, cr;
cin.get(c); cin.get(cr); // second one for CR
if ( c == 'q' )
break; // ... out of while(1)
if ( c < 'a' || c > 'g' )
continue;
(*func_table[c - 'a'])();
}
} ///:~
At this point, you might be able to imagine how this technique
could be useful when creating some sort of interpreter or list
processing program.
Make: managing separate
compilation
When using separate compilation (breaking code into a number of
translation units), you need some way to automatically compile
each file and to tell the linker to build all the pieces ­ along with the
appropriate libraries and startup code ­ into an executable file.
Most compilers allow you to do this with a single command-line
statement. For the GNU C++ compiler, for example, you might say
g++ SourceFile1.cpp SourceFile2.cpp
3: The C in C++
217
img
The problem with this approach is that the compiler will first
compile each individual file, regardless of whether that file needs to
be rebuilt or not. With many files in a project, it can become
prohibitive to recompile everything if you've changed only a single
file.
The solution to this problem, developed on Unix but available
everywhere in some form, is a program called make. The make
utility manages all the individual files in a project by following the
instructions in a text file called a makefile When you edit some of
.
the files in a project and type make, the make program follows the
guidelines in the makefileto compare the dates on the source code
files to the dates on the corresponding target files, and if a source
code file date is more recent than its target file, make invokes the
compiler on the source code file. make only recompiles the source
code files that were changed, and any other source-code files that
are affected by the modified files. By using make, you don't have to
re-compile all the files in your project every time you make a
change, nor do you have to check to see that everything was built
properly. The makefilecontains all the commands to put your
project together. Learning to use make will save you a lot of time
and frustration. You'll also discover that make is the typical way
that you install new software on a Linux/Unix machine (although
those makefile tend to be far more complicated than the ones
s
presented in this book, and you'll often automatically generate a
makefilefor your particular machine as part of the installation
process).
Because make is available in some form for virtually all C++
compilers (and even if it isn't, you can use freely-available makes
with any compiler), it will be the tool used throughout this book.
However, compiler vendors have also created their own project
building tools. These tools ask you which files are in your project
and determine all the relationships themselves. These tools use
something similar to a makefile generally called a project file, but
,
the programming environment maintains this file so you don't
218
Thinking in C++
img
have to worry about it. The configuration and use of project files
varies from one development environment to another, so you must
find the appropriate documentation on how to use them (although
project file tools provided by compiler vendors are usually so
simple to use that you can learn them by playing around ­ my
favorite form of education).
The makefile used within this book should work even if you are
s
also using a specific vendor's project-building tool.
Make activities
When you type make (or whatever the name of your "make"
program happens to be), the make program looks in the current
directory for a file named makefile which you've created if it's
,
your project. This file lists dependencies between source code files.
make looks at the dates on files. If a dependent file has an older
date than a file it depends on, make executes the rule given after the
dependency.
All comments in makefile start with a # and continue to the end of
s
the line.
As a simple example, the makefilefor a program called "hello"
might contain:
# A comment
hello.exe: hello.cpp
mycompiler hello.cpp
This says that hello.exe(the target) depends on hello.cpp When
.
hello.cpphas a newer date than hello.exe make executes the
,
"rule" mycompiler hello.cppThere may be multiple dependencies
.
and multiple rules. Many make programs require that all the rules
begin with a tab. Other than that, whitespace is generally ignored
so you can format for readability.
3: The C in C++
219
img
The rules are not restricted to being calls to the compiler; you can
call any program you want from within make. By creating groups
of interdependent dependency-rule sets, you can modify your
source code files, type make and be certain that all the affected files
will be rebuilt correctly.
Macros
A makefilemay contain macros (note that these are completely
different from C/C++ preprocessor macros). Macros allow
convenient string replacement. The makefile in this book use a
s
macro to invoke the C++ compiler. For example,
CPP = mycompiler
hello.exe: hello.cpp
$(CPP) hello.cpp
The = is used to identify CPP as a macro, and the $ and parentheses
expand the macro. In this case, the expansion means that the macro
call $(CPP) will be replaced with the string mycompiler With the
.
macro above, if you want to change to a different compiler called
cpp, you just change the macro to:
CPP = cpp
You can also add compiler flags, etc., to the macro, or use separate
macros to add compiler flags.
Suffix Rules
It becomes tedious to tell make how to invoke the compiler for
every single cpp file in your project, when you know it's the same
basic process each time. Since make is designed to be a time-saver,
it also has a way to abbreviate actions, as long as they depend on
file name suffixes. These abbreviations are called suffix rules. A
suffix rule is the way to teach make how to convert a file with one
type of extension (.cpp, for example) into a file with another type of
extension (.obj or .exe). Once you teach make the rules for
producing one kind of file from another, all you have to do is tell
make which files depend on which other files. When make finds a
220
Thinking in C++
img
file with a date earlier than the file it depends on, it uses the rule to
create a new file.
The suffix rule tells make that it doesn't need explicit rules to build
everything, but instead it can figure out how to build things based
on their file extension. In this case it says "To build a file that ends
in exe from one that ends in cpp, invoke the following command."
Here's what it looks like for the example above:
CPP = mycompiler
.SUFFIXES: .exe .cpp
.cpp.exe:
$(CPP) $<
The .SUFFIXESdirective tells make that it should watch out for
any of the following file-name extensions because they have special
meaning for this particular makefile. Next you see the suffix rule
.cpp.exe,which says "Here's how to convert any file with an
extension of cpp to one with an extension of exe" (when the cpp file
is more recent than the exe file). As before, the $(CPP) macro is
used, but then you see something new: $<. Because this begins with
a `$' it's a macro, but this is one of make's special built-in macros.
The $< can be used only in suffix rules, and it means "whatever
prerequisite triggered the rule" (sometimes called the dependent),
which in this case translates to "the cpp file that needs to be
compiled."
Once the suffix rules have been set up, you can simply say, for
example, "make Union.exe and the suffix rule will kick in, even
,"
though there's no mention of "Union" anywhere in the makefile
.
Default targets
After the macros and suffix rules, make looks for the first "target"
in a file, and builds that, unless you specify differently. So for the
following makefile
:
CPP = mycompiler
.SUFFIXES: .exe .cpp
.cpp.exe:
3: The C in C++
221
img
$(CPP) $<
target1.exe:
target2.exe:
If you just type `make', then target1.exewill be built (using the
default suffix rule) because that's the first target that make
encounters. To build target2.exeyou'd have to explicitly say `make
target2.exe This becomes tedious, so you normally create a default
'.
"dummy" target that depends on all the rest of the targets, like this:
CPP = mycompiler
.SUFFIXES: .exe .cpp
.cpp.exe:
$(CPP) $<
all: target1.exe target2.exe
Here, `all' does not exist and there's no file called `all', so every
time you type make, the program sees `all' as the first target in the
list (and thus the default target), then it sees that `all' does not exist
so it had better make it by checking all the dependencies. So it looks
at target1.exeand (using the suffix rule) sees whether (1)
target1.exeexists and (2) whether target1.cppis more recent than
target1.exe and if so runs the suffix rule (if you provide an explicit
,
rule for a particular target, that rule is used instead). Then it moves
on to the next file in the default target list. Thus, by creating a
default target list (typically called `all' by convention, but you can
call it anything) you can cause every executable in your project to
be made simply by typing `make'. In addition, you can have other
non-default target lists that do other things ­ for example, you
could set it up so that typing `make debug rebuilds all your files
'
with debugging wired in.
Makefiles in this book
Using the program ExtractCode.cppfrom Volume 2 of this book,
all the code listings in this book are automatically extracted from
the ASCII text version of this book and placed in subdirectories
according to their chapters. In addition, ExtractCode.cppcreates
several makefile in each subdirectory (with different names) so
s
222
Thinking in C++
img
you can simply move into that subdirectory and type make -f
mycompiler.makefile
(substituting the name of your compiler for
`mycompiler the `-f' flag says "use what follows as the
',
makefile  Finally, ExtractCode.cppcreates a "master" makefile
").
in the root directory where the book's files have been expanded,
and this makefiledescends into each subdirectory and calls make
with the appropriate makefile This way you can compile all the
.
code in the book by invoking a single make command, and the
process will stop whenever your compiler is unable to handle a
particular file (note that a Standard C++ conforming compiler
should be able to compile all the files in this book). Because
implementations of make vary from system to system, only the
most basic, common features are used in the generated makefile
s.
An example makefile
As mentioned, the code-extraction tool ExtractCode.cpp
automatically generates makefile for each chapter. Because of this,
s
the makefile for each chapter will not be placed in the book (all
s
the makefiles are packaged with the source code, which you can
download from ). However, it's useful to see an
example of a makefile What follows is a shortened version of the
.
one that was automatically generated for this chapter by the book's
extraction tool. You'll find more than one makefilein each
subdirectory (they have different names; you invoke a specific one
with `make -f'). This one is for GNU C++:
CPP = g++
OFLAG = -o
.SUFFIXES : .o .cpp .c
.cpp.o :
$(CPP) $(CPPFLAGS) -c $<
.c.o :
$(CPP) $(CPPFLAGS) -c $<
all: \
Return \
Declare \
Ifthen \
3: The C in C++
223
img
Guess \
Guess2
# Rest of the files for this chapter not shown
Return: Return.o
$(CPP) $(OFLAG)Return Return.o
Declare: Declare.o
$(CPP) $(OFLAG)Declare Declare.o
Ifthen: Ifthen.o
$(CPP) $(OFLAG)Ifthen Ifthen.o
Guess: Guess.o
$(CPP) $(OFLAG)Guess Guess.o
Guess2: Guess2.o
$(CPP) $(OFLAG)Guess2 Guess2.o
Return.o: Return.cpp
Declare.o: Declare.cpp
Ifthen.o: Ifthen.cpp
Guess.o: Guess.cpp
Guess2.o: Guess2.cpp
The macro CPP is set to the name of the compiler. To use a different
compiler, you can either edit the makefileor change the value of
the macro on the command line, like this:
make CPP=cpp
Note, however, that ExtractCode.cpphas an automatic scheme to
automatically build makefile for additional compilers.
s
The second macro OFLAG is the flag that's used to indicate the
name of the output file. Although many compilers automatically
assume the output file has the same base name as the input file,
others don't (such as Linux/Unix compilers, which default to
creating a file called a.out).
You can see that there are two suffix rules here, one for cpp files
and one for .c files (in case any C source code needs to be
224
Thinking in C++
img
compiled). The default target is all, and each line for this target is
"continued" by using the backslash, up until Guess2, which is the
last one in the list and thus has no backslash. There are many more
files in this chapter, but only these are shown here for the sake of
brevity.
The suffix rules take care of creating object files (with a .o
extension) from cpp files, but in general you need to explicitly state
rules for creating the executable, because normally an executable is
created by linking many different object files and make cannot
guess what those are. Also, in this case (Linux/Unix) there is no
standard extension for executables so a suffix rule won't work for
these simple situations. Thus, you see all the rules for building the
final executables explicitly stated.
This makefiletakes the absolute safest route of using as few make
features as possible; it only uses the basic make concepts of targets
and dependencies, as well as macros. This way it is virtually
assured of working with as many make programs as possible. It
tends to produce a larger makefile but that's not so bad since it's
,
automatically generated by ExtractCode.cpp
.
There are lots of other make features that this book will not use, as
well as newer and cleverer versions and variations of make with
advanced shortcuts that can save a lot of time. Your local
documentation may describe the further features of your particular
make, and you can learn more about make from Managing Projects
with Make by Oram and Talbott (O'Reilly, 1993). Also, if your
compiler vendor does not supply a make or it uses a non-standard
make, you can find GNU make for virtually any platform in
existence by searching the Internet for GNU archives (of which
there are many).
3: The C in C++
225
img
Summary
This chapter was a fairly intense tour through all the fundamental
features of C++ syntax, most of which are inherited from and in
common with C (and result in C++'s vaunted backwards
compatibility with C). Although some C++ features were
introduced here, this tour is primarily intended for people who are
conversant in programming, and simply need to be given an
introduction to the syntax basics of C and C++. If you're already a
C programmer, you may have even seen one or two things about C
here that were unfamiliar, aside from the C++ features that were
most likely new to you. However, if this chapter has still seemed a
bit overwhelming, you should go through the CD ROM course
Thinking in C: Foundations for C++ and Java (which contains lectures,
exercises, and guided solutions), which is bound into this book, and
also available at .
Exercises
Solutions to selected exercises can be found in the electronic document The Thinking in C++ Annotated
Solution Guide, available for a small fee from .
1.
Create a header file (with an extension of `.h'). In this file,
declare a group of functions by varying the argument
lists and return values from among the following: void,
char, int, and float. Now create a .cpp file that includes
your header file and creates definitions for all of these
functions. Each definition should simply print out the
function name, argument list, and return type so you
know it's been called. Create a second .cpp file that
includes your header file and defines int main( )
,
containing calls to all of your functions. Compile and run
your program.
2.
Write a program that uses two nested for loops and the
modulus operator (%) to detect and print prime numbers
(integral numbers that are not evenly divisible by any
other numbers except for themselves and 1).
226
Thinking in C++
img
3.
Write a program that uses a while loop to read words
from standard input (cin) into a string. This is an
"infinite" while loop, which you break out of (and exit
the program) using a break statement. For each word
that is read, evaluate it by first using a sequence of if
statements to "map" an integral value to the word, and
then use a switch statement that uses that integral value
as its selector (this sequence of events is not meant to be
good programming style; it's just supposed to give you
exercise with control flow). Inside each case, print
something meaningful. You must decide what the
"interesting" words are and what the meaning is. You
must also decide what word will signal the end of the
program. Test the program by redirecting a file into the
program's standard input (if you want to save typing,
this file can be your program's source file).
4.
Modify Menu.cppto use switch statements instead of if
statements.
5.
Write a program that evaluates the two expressions in
the section labeled "precedence."
6.
Modify YourPets2.cppso that it uses various different
data types (char, int, float, double, and their variants).
Run the program and create a map of the resulting
memory layout. If you have access to more than one kind
of machine, operating system, or compiler, try this
experiment with as many variations as you can manage.
7.
Create two functions, one that takes a string* and one
that takes a string&. Each of these functions should
modify the outside string object in its own unique way.
In main( ), create and initialize a string object, print it,
then pass it to each of the two functions, printing the
results.
8.
Write a program that uses all the trigraphs to see if your
compiler supports them.
3: The C in C++
227
img
9.
Compile and run Static.cpp Remove the static keyword
.
from the code, compile and run it again, and explain
what happens.
10.
Try to compile and link FileStatic.cppwith
FileStatic2.cpp What does the resulting error message
.
mean?
11.
Modify Boolean.cppso that it works with double values
instead of ints.
12.
Modify Boolean.cppand Bitwise.cppso they use the
explicit operators (if your compiler is conformant to the
C++ Standard it will support these).
13.
Modify Bitwise.cppto use the functions from
Rotation.cpp Make sure you display the results in such a
.
way that it's clear what's happening during rotations.
14.
Modify Ifthen.cppto use the ternary if-else operator (?:).
15.
Create a struct that holds two string objects and one int.
Use a typedef for the struct name. Create an instance of
the struct, initialize all three values in your instance, and
print them out. Take the address of your instance and
assign it to a pointer to your struct type. Change the
three values in your instance and print them out, all
using the pointer.
16.
Create a program that uses an enumeration of colors.
Create a variable of this enum type and print out all the
numbers that correspond with the color names, using a
for loop.
17.
Experiment with Union.cppby removing various union
elements to see the effects on the size of the resulting
union. Try assigning to one element (thus one type) of
the union and printing out a via a different element (thus
a different type) to see what happens.
18.
Create a program that defines two int arrays, one right
after the other. Index off the end of the first array into the
second, and make an assignment. Print out the second
array to see the changes cause by this. Now try defining a
228
Thinking in C++
img
char variable between the first array definition and the
second, and repeat the experiment. You may want to
create an array printing function to simplify your coding.
19.
Modify ArrayAddresses.cpp work with the data types
to
char, long int float, and double.
,
20.
Apply the technique in ArrayAddresses.cpp print out
to
the size of the struct and the addresses of the array
elements in StructArray.cpp
.
21.
Create an array of string objects and assign a string to
each element. Print out the array using a for loop.
22.
Create two new programs starting from ArgsToInts.cpp
so they use atol( ) and atof( ), respectively.
23.
Modify PointerIncrement2.cpp it uses a union instead
so
of a struct.
24.
Modify PointerArithmetic.cpp work with long and
to
long double
.
25.
Define a float variable. Take its address, cast that address
to an unsigned char and assign it to an unsigned char
,
pointer. Using this pointer and [ ], index into the float
variable and use the printBinary( )function defined in
this chapter to print out a map of the float (go from 0 to
sizeof(float) Change the value of the float and see if
).
you can figure out what's going on (the float contains
encoded data).
26.
Define an array of int. Take the starting address of that
array and use static_castto convert it into an void*.
Write a function that takes a void*, a number (indicating
a number of bytes), and a value (indicating the value to
which each byte should be set) as arguments. The
function should set each byte in the specified range to the
specified value. Try out the function on your array of int.
27.
Create a const array of double and a volatilearray of
double. Index through each array and use const_castto
cast each element to non-const and non-volatile
,
respectively, and assign a value to each element.
3: The C in C++
229
img
28.
Create a function that takes a pointer to an array of
double and a value indicating the size of that array. The
function should print each element in the array. Now
create an array of double and initialize each element to
zero, then use your function to print the array. Next use
reinterpret_cast cast the starting address of your array
to
to an unsigned char* and set each byte of the array to 1
,
(hint: you'll need to use sizeof to calculate the number of
bytes in a double). Now use your array-printing function
to print the results. Why do you think each element was
not set to the value 1.0?
29.
(Challenging) Modify FloatingAsBinary.cpp that it
so
prints out each part of the double as a separate group of
bits. You'll have to replace the calls to printBinary( )with
your own specialized code (which you can derive from
printBinary( ) in order to do this, and you'll also have to
look up and understand the floating-point format along
with the byte ordering for your compiler (this is the
challenging part).
30.
Create a makefile that not only compiles YourPets1.cpp
and YourPets2.cpp(for your particular compiler) but
also executes both programs as part of the default target
behavior. Make sure you use suffix rules.
31.
Modify StringizingExpressions.cpp that P(A) is
so
conditionally #ifdefed to allow the debugging code to be
automatically stripped out by setting a command-line
flag. You will need to consult your compiler's
documentation to see how to define and undefine
preprocessor values on the compiler command line.
32.
Define a function that takes a double argument and
returns an int. Create and initialize a pointer to this
function, and call the function through your pointer.
33.
Declare a pointer to a function taking an int argument
and returning a pointer to a function that takes a char
argument and returns a float.
230
Thinking in C++
img
34.
Modify FunctionTable.cpp that each function returns
so
a string (instead of printing out a message) and so that
this value is printed inside of main( ).
35.
Create a makefilefor one of the previous exercises (of
your choice) that allows you to type make for a
production build of the program, and make debugfor a
build of the program including debugging information.
3: The C in C++
231
img
232