11: References & the Copy-Constructor

References are like constant pointers that are

automatically dereferenced by the compiler.

Although references also exist in Pascal, the C++ version was taken

from the Algol language. They are essential in C++ to support the

syntax of operator overloading (see Chapter 12), but they are also a

general convenience to control the way arguments are passed into

and out of functions.

This chapter will first look briefly at the differences between

pointers in C and C++, then introduce references. But the bulk of

the chapter will delve into a rather confusing issue for the new C++

programmer: the copy-constructor, a special constructor (requiring

references) that makes a new object from an existing object of the

same type. The copy-constructor is used by the compiler to pass

and return objects by value into and out of functions.

Finally, the somewhat obscure C++ pointer-to-member feature is

illuminated.

Pointers in C++

The most important difference between pointers in C and those in

C++ is that C++ is a more strongly typed language. This stands out

where void* is concerned. C doesn't let you casually assign a

pointer of one type to another, but it does allow you to accomplish

this through a void*. Thus,

bird* b;

rock* r;

void* v;

v = r;

b = v;

Because this "feature" of C allows you to quietly treat any type like

any other type, it leaves a big hole in the type system. C++ doesn't

allow this; the compiler gives you an error message, and if you

really want to treat one type as another, you must make it explicit,

both to the compiler and to the reader, using a cast. (Chapter 3

introduced C++'s improved "explicit" casting syntax.)

Thinking in C++

References in C++

A reference (&) is like a constant pointer that is automatically

dereferenced. It is usually used for function argument lists and

function return values. But you can also make a free-standing

reference. For example,

//: C11:FreeStandingReferences.cpp
#include <iostream>
using namespace std;

// Ordinary free-standing reference:
int y;
int& r = y;
// When a reference is created, it must
// be initialized to a live object.
// However, you can also say:
const int& q = 12;  // (1)
// References are tied to someone else's storage:
int x = 0;          // (2)
int& a = x;         // (3)

int main() {
  cout << "x = " << x << ", a = " << a << endl;
  a++;
  cout << "x = " << x << ", a = " << a << endl;
} ///:~

In line (1), the compiler allocates a piece of storage, initializes it

with the value 12, and ties the reference to that piece of storage. The

point is that any reference must be tied to someone else's piece of

storage. When you access a reference, you're accessing that storage.

Thus, if you write lines like (2) and (3), then incrementing a is

actually incrementing x, as is shown in main( ). Again, the easiest

way to think about a reference is as a fancy pointer. One advantage

of this "pointer" is that you never have to wonder whether it's been

initialized (the compiler enforces it) and how to dereference it (the

compiler does it).

There are certain rules when using references:

1. A reference must be initialized when it is created. (Pointers
   can be initialized at any time.)

2. Once a reference is initialized to an object, it cannot be
   changed to refer to another object. (Pointers can be pointed to
   another object at any time.)

3. You cannot have NULL references. You must always be able
   to assume that a reference is connected to a legitimate piece
   of storage.

References in functions

The most common place you'll see references is as function

arguments and return values. When a reference is used as a

function argument, any modification to the reference inside the

function will cause changes to the argument outside the function. Of

course, you could do the same thing by passing a pointer, but a

reference has much cleaner syntax. (You can think of a reference as

nothing more than a syntax convenience, if you want.)

If you return a reference from a function, you must take the same

care as if you return a pointer from a function. Whatever the

reference is connected to shouldn't go away when the function

returns, otherwise you'll be referring to unknown memory.

Here's an example:

//: C11:Reference.cpp
// Simple C++ references

int* f(int* x) {
  (*x)++;
  return x; // Safe, x is outside this scope
}

int& g(int& x) {
  x++; // Same effect as in f()
  return x; // Safe, outside this scope
}

int& h() {
  int q;
  //!  return q;  // Error
  static int x;
  return x; // Safe, x lives outside this scope
}

int main() {
  int a = 0;
  f(&a); // Ugly (but explicit)
  g(a);  // Clean (but hidden)
} ///:~

The call to f( ) doesn't have the convenience and cleanliness of

using references, but it's clear that an address is being passed. In

the call to g( ), an address is being passed (via a reference), but you

don't see it.

const references

The reference argument in Reference.cpp works only when the

argument is a non-const object. If it is a const object, the function

g( ) will not accept the argument, which is actually a good thing,

because the function does modify the outside argument. If you

know the function will respect the constness of an object, making

the argument a const reference will allow the function to be used in

all situations. This means that, for built-in types, the function will

not modify the argument, and for user-defined types, the function

will call only const member functions, and won't modify any

public data members.

The use of const references in function arguments is especially

important because your function may receive a temporary object.

This might have been created as a return value of another function

or explicitly by the user of your function. A temporary object

cannot be bound to a non-const reference, so if you don't use a

const reference, that argument won't be accepted by the compiler.

As a very simple example,

//: C11:ConstReferenceArguments.cpp
// Passing references as const

void f(int&) {}
void g(const int&) {}

int main() {
  //!  f(1); // Error
  g(1);
} ///:~

The call to f(1) causes a compile-time error because the compiler

must first create a reference. It does so by allocating storage for an

int, initializing it to one and producing the address to bind to the

reference. The storage must be a const because changing it would

make no sense: you can never get your hands on it again. With all

temporary objects you must make the same assumption: that

they're inaccessible. It's valuable for the compiler to tell you when

you're changing such data because the result would be lost

information.

Pointer references

In C, if you want to modify the contents of the pointer rather than

what it points to, your function declaration looks like:

void f(int**);

and you'd have to take the address of the pointer when passing it

in:

int i = 47;

int* ip = &i;

f(&ip);

With references in C++, the syntax is cleaner. The function

argument becomes a reference to a pointer, and you no longer have

to take the address of that pointer. Thus,

//: C11:ReferenceToPointer.cpp
#include <iostream>
using namespace std;

void increment(int*& i) { i++; }

int main() {
  int a[2] = { 0, 0 };
  int* i = a; // Point at real storage so the increment is well-defined
  cout << "i = " << i << endl;
  increment(i);
  cout << "i = " << i << endl;
} ///:~

By running this program, you'll prove to yourself that the pointer is

incremented, not what it points to.

Argument-passing guidelines

Your normal habit when passing an argument to a function should

be to pass by const reference. Although at first this may seem like

only an efficiency concern (and you normally don't want to concern

yourself with efficiency tuning while you're designing and

assembling your program), there's more at stake: as you'll see in

the remainder of the chapter, a copy-constructor is required to pass

an object by value, and this isn't always available.

The efficiency savings can be substantial for such a simple habit: to

pass an argument by value requires a constructor and destructor

call, but if you're not going to modify the argument then passing by

const reference only needs an address pushed on the stack.

In fact, virtually the only time passing an address isn't preferable is

when you're going to do such damage to an object that passing by

value is the only safe approach (rather than modifying the outside

object, something the caller doesn't usually expect). This is the

subject of the next section.

The copy-constructor

Now that you understand the basics of the reference in C++, you're

ready to tackle one of the more confusing concepts in the language:

the copy-constructor, often called X(X&) ("X of X ref"). This

constructor is essential to control passing and returning of user-

defined types by value during function calls. It's so important, in

fact, that the compiler will automatically synthesize a copy-

constructor if you don't provide one yourself, as you will see.

Passing & returning by value

To understand the need for the copy-constructor, consider the way

C handles passing and returning variables by value during function

calls. If you declare a function and make a function call,

int f(int x, char c);

int g = f(a, b);

how does the compiler know how to pass and return those

variables? It just knows! The range of the types it must deal with is

so small (char, int, float, double, and their variations) that this

information is built into the compiler.

If you figure out how to generate assembly code with your

compiler and determine the statements generated by the function

call to f( ), you'll get the equivalent of:

push b

push a

call f()

add sp,4

mov g, register a

This code has been cleaned up significantly to make it generic; the

expressions for b and a will be different depending on whether the

variables are global (in which case they will be _b and _a) or local

(the compiler will index them off the stack pointer). This is also true

for the expression for g. The appearance of the call to f( ) will

depend on your name-decoration scheme, and "register a" depends

on how the CPU registers are named within your assembler. The

logic behind the code, however, will remain the same.

In C and C++, arguments are first pushed on the stack from right to

left, then the function call is made. The calling code is responsible

for cleaning the arguments off the stack (which accounts for the

add sp,4). But notice that to pass the arguments by value, the

compiler simply pushes copies on the stack; it knows how big

they are and that pushing those arguments makes accurate copies

of them.

The return value of f( ) is placed in a register. Again, the compiler

knows everything there is to know about the return value type

because that type is built into the language, so the compiler can

return it by placing it in a register. With the primitive data types in

C, the simple act of copying the bits of the value is equivalent to

copying the object.

Passing & returning large objects

But now consider user-defined types. If you create a class and you

want to pass an object of that class by value, how is the compiler

supposed to know what to do? This is not a type built into the

compiler; it's a type you have created.

To investigate this, you can start with a simple structure that is

clearly too large to return in registers:

//: C11:PassingBigStructures.cpp
struct Big {
  char buf[100];
  int i;
  long d;
} B, B2;

Big bigfun(Big b) {
  b.i = 100; // Do something to the argument
  return b;
}

int main() {
  B2 = bigfun(B);
} ///:~

Decoding the assembly output is a little more complicated here

because most compilers use "helper" functions instead of putting

all functionality inline. In main( ), the call to bigfun( ) starts as you

might guess: the entire contents of B is pushed on the stack. (Here,

you might see some compilers load registers with the address of

the Big and its size, then call a helper function to push the Big onto

the stack.)

In the previous code fragment, pushing the arguments onto the

stack was all that was required before making the function call. In

PassingBigStructures.cpp, however, you'll see an additional

action: the address of B2 is pushed before making the call, even

though it's obviously not an argument. To comprehend what's

going on here, you need to understand the constraints on the

compiler when it's making a function call.

Function-call stack frame

When the compiler generates code for a function call, it first pushes

all the arguments on the stack, then makes the call. Inside the

function, code is generated to move the stack pointer down even

farther to provide storage for the function's local variables.

("Down" is relative here; your machine may increment or

decrement the stack pointer during a push.) But during the

assembly-language CALL, the CPU pushes the address in the

program code where the function call came from, so the assembly-

language RETURN can use that address to return to the calling

point. This address is of course sacred, because without it your

program will get completely lost. Here's what the stack frame looks

like after the CALL and the allocation of local variable storage in

the function:

Function arguments

Return address

Local variables

The code generated for the rest of the function expects the memory

to be laid out exactly this way, so that it can carefully pick from the

function arguments and local variables without touching the return

address. I shall call this block of memory, which is everything used

by a function in the process of the function call, the function frame.

You might think it reasonable to try to return values on the stack.

The compiler could simply push it, and the function could return

an offset to indicate how far down in the stack the return value

begins.

Re-entrancy

The problem occurs because functions in C and C++ support

interrupts; that is, the languages are re-entrant. They also support

recursive function calls. This means that at any point in the

execution of a program an interrupt can occur without breaking the

program. Of course, the person who writes the interrupt service

routine (ISR) is responsible for saving and restoring all the registers

that are used in the ISR, but if the ISR needs to use any memory

further down on the stack, this must be a safe thing to do. (You can

think of an ISR as an ordinary function with no arguments and

void return value that saves and restores the CPU state. An ISR

function call is triggered by some hardware event instead of an

explicit call from within a program.)

Now imagine what would happen if an ordinary function tried to

return values on the stack. You can't touch any part of the stack

that's above the return address, so the function would have to push

the values below the return address. But when the assembly-

language RETURN is executed, the stack pointer must be pointing

to the return address (or right below it, depending on your

machine), so right before the RETURN, the function must move the

stack pointer up, thus clearing off all its local variables. If you're

trying to return values on the stack below the return address, you

become vulnerable at that moment because an interrupt could

come along. The ISR would move the stack pointer down to hold

its return address and its local variables and overwrite your return

value.

To solve this problem, the caller could be responsible for allocating

the extra storage on the stack for the return values before calling

the function. However, C was not designed this way, and C++

must be compatible. As you'll see shortly, the C++ compiler uses a

more efficient scheme.

Your next idea might be to return the value in some global data

area, but this doesn't work either. Reentrancy means that any

function can be an interrupt routine for any other function,

including the same function you're currently inside. Thus, if you put

the return value in a global area, you might return into the same

function, which would overwrite that return value. The same logic

applies to recursion.

The only safe place to return values is in the registers, so you're

back to the problem of what to do when the registers aren't large

enough to hold the return value. The answer is to push the address

of the return value's destination on the stack as one of the function

arguments, and let the function copy the return information

directly into the destination. This not only solves all the problems,

it's more efficient. It's also the reason that, in

PassingBigStructures.cpp, the compiler pushes the address of B2

before the call to bigfun( ) in main( ). If you look at the assembly

output for bigfun( ), you can see it expects this hidden argument

and performs the copy to the destination inside the function.

Bitcopy versus initialization

So far, so good. There's a workable process for passing and

returning large simple structures. But notice that all you have is a

way to copy the bits from one place to another, which certainly

works fine for the primitive way that C looks at variables. But in

C++ objects can be much more sophisticated than a patch of bits;

they have meaning. This meaning may not respond well to having

its bits copied.

Consider a simple example: a class that knows how many objects of

its type exist at any one time. From Chapter 10, you know the way

to do this is by including a static data member:

//: C11:HowMany.cpp
// A class that counts its objects
#include <fstream>
#include <string>
using namespace std;
ofstream out("HowMany.out");

class HowMany {
  static int objectCount;
public:
  HowMany() { objectCount++; }
  static void print(const string& msg = "") {
    if(msg.size() != 0) out << msg << ": ";
    out << "objectCount = "
        << objectCount << endl;
  }
  ~HowMany() {
    objectCount--;
    print("~HowMany()");
  }
};

int HowMany::objectCount = 0;

// Pass and return BY VALUE:
HowMany f(HowMany x) {
  x.print("x argument inside f()");
  return x;
}

int main() {
  HowMany h;
  HowMany::print("after construction of h");
  HowMany h2 = f(h);
  HowMany::print("after call to f()");
} ///:~

The class HowMany contains a static int objectCount and a static

member function print( ) to report the value of that objectCount

along with an optional message argument. The constructor

increments the count each time an object is created, and the

destructor decrements it.

The output, however, is not what you would expect:

after construction of h: objectCount = 1

x argument inside f(): objectCount = 1

~HowMany(): objectCount = 0

after call to f(): objectCount = 0

~HowMany(): objectCount = -1

~HowMany(): objectCount = -2

After h is created, the object count is one, which is fine. But after

the call to f( ) you would expect to have an object count of two,

because h2 is now in scope as well. Instead, the count is zero, which

indicates something has gone horribly wrong. This is confirmed by

the fact that the two destructors at the end make the object count go

negative, something that should never happen.

Look at the point inside f( ), which occurs after the argument is

passed by value. This means the original object h exists outside the

function frame, and there's an additional object inside the function

frame, which is the copy that has been passed by value. However,

the argument has been passed using C's primitive notion of

bitcopying, whereas the C++ HowMany class requires true

initialization to maintain its integrity, so the default bitcopy fails to

produce the desired effect.

When the local object goes out of scope at the end of the call to f( ),

the destructor is called, which decrements objectCount, so outside

the function, objectCount is zero. The creation of h2 is also

performed using a bitcopy, so the constructor isn't called there

either, and when h and h2 go out of scope, their destructors cause

the negative values of objectCount.

Copy-construction

The problem occurs because the compiler makes an assumption

about how to create a new object from an existing object. When you

pass an object by value, you create a new object, the passed object

inside the function frame, from an existing object, the original

object outside the function frame. This is also often true when

returning an object from a function. In the expression

HowMany h2 = f(h);

h2, a previously unconstructed object, is created from the return

value of f( ), so again a new object is created from an existing one.

The compiler's assumption is that you want to perform this

creation using a bitcopy, and in many cases this may work fine, but

in HowMany it doesn't fly because the meaning of initialization

goes beyond simply copying. Another common example occurs if

the class contains pointers: what do they point to, and should you

copy them or should they be connected to some new piece of

memory?

Fortunately, you can intervene in this process and prevent the

compiler from doing a bitcopy. You do this by defining your own

function to be used whenever the compiler needs to make a new

object from an existing object. Logically enough, you're making a

new object, so this function is a constructor, and also logically

enough, the single argument to this constructor has to do with the

object you're constructing from. But that object can't be passed into

the constructor by value because you're trying to define the function

that handles passing by value, and syntactically it doesn't make

sense to pass a pointer because, after all, you're creating the new

object from an existing object. Here, references come to the rescue,

so you take the reference of the source object. This function is called

the copy-constructor and is often referred to as X(X&), which is its

appearance for a class called X.

If you create a copy-constructor, the compiler will not perform a

bitcopy when creating a new object from an existing one. It will

always call your copy-constructor. So, if you don't create a copy-

constructor, the compiler will do something sensible, but you have

the choice of taking over complete control of the process.

Now it's possible to fix the problem in HowMany.cpp:

//: C11:HowMany2.cpp
// The copy-constructor
#include <fstream>
#include <string>
using namespace std;
ofstream out("HowMany2.out");

class HowMany2 {
  string name; // Object identifier
  static int objectCount;
public:
  HowMany2(const string& id = "") : name(id) {
    ++objectCount;
    print("HowMany2()");
  }
  ~HowMany2() {
    --objectCount;
    print("~HowMany2()");
  }
  // The copy-constructor:
  HowMany2(const HowMany2& h) : name(h.name) {
    name += " copy";
    ++objectCount;
    print("HowMany2(const HowMany2&)");
  }
  void print(const string& msg = "") const {
    if(msg.size() != 0)
      out << msg << endl;
    out << '\t' << name << ": "
        << "objectCount = "
        << objectCount << endl;
  }
};

int HowMany2::objectCount = 0;

// Pass and return BY VALUE:
HowMany2 f(HowMany2 x) {
  x.print("x argument inside f()");
  out << "Returning from f()" << endl;
  return x;
}

int main() {
  HowMany2 h("h");
  out << "Entering f()" << endl;
  HowMany2 h2 = f(h);
  h2.print("h2 after call to f()");
  out << "Call f(), no return value" << endl;
  f(h);
  out << "After call to f()" << endl;
} ///:~

There are a number of new twists thrown in here so you can get a

better idea of what's happening. First, the string name acts as an

object identifier when information about that object is printed. In

the constructor, you can put an identifier string (usually the name

of the object) that is copied to name using the string constructor.

The default = "" creates an empty string. The constructor

increments the objectCount as before, and the destructor

decrements it.

Next is the copy-constructor, HowMany2(const HowMany2&).

The copy-constructor can create a new object only from an existing

one, so the existing object's name is copied to name, followed by

the word "copy" so you can see where it came from. If you look

closely, you'll see that the call name(h.name) in the constructor

initializer list is actually calling the string copy-constructor.

Inside the copy-constructor, the object count is incremented just as

it is inside the normal constructor. This means you'll now get an

accurate object count when passing and returning by value.

The print( ) function has been modified to print out a message, the

object identifier, and the object count. It must now access the name

data of a particular object, so it can no longer be a static member

function.

Inside main( ), you can see that a second call to f( ) has been added.

However, this call uses the common C approach of ignoring the

return value. But now that you know how the value is returned

(that is, code inside the function handles the return process, putting

the result in a destination whose address is passed as a hidden

argument), you might wonder what happens when the return

value is ignored. The output of the program will throw some

illumination on this.

Before showing the output, here's a little program that uses

iostreams to add line numbers to any file:

//: C11:Linenum.cpp
//{T} Linenum.cpp
// Add line numbers
#include "../require.h"
#include <vector>
#include <string>
#include <fstream>
#include <iostream>
#include <cmath>
using namespace std;

int main(int argc, char* argv[]) {
  requireArgs(argc, 1, "Usage: linenum file\n"
    "Adds line numbers to file");
  ifstream in(argv[1]);
  assure(in, argv[1]);
  string line;
  vector<string> lines;
  while(getline(in, line)) // Read in entire file
    lines.push_back(line);
  if(lines.size() == 0) return 0;
  int num = 0;
  // Number of lines in file determines width
  // (convert to double so the log10 overload is unambiguous):
  const int width = int(log10(double(lines.size()))) + 1;
  // size_t index matches the unsigned type of lines.size():
  for(size_t i = 0; i < lines.size(); i++) {
    cout.setf(ios::right, ios::adjustfield);
    cout.width(width);
    cout << ++num << ") " << lines[i] << endl;
  }
} ///:~

The entire file is read into a vector<string> using the same code

that you've seen earlier in the book. When printing the line

numbers, we'd like all the lines to be aligned with each other, and

this requires adjusting for the number of lines in the file so that the

width allowed for the line numbers is consistent. We can easily

determine the number of lines using vector::size( ), but what we

really need to know is whether there are more than 10 lines, 100

lines, 1,000 lines, etc. If you take the logarithm, base 10, of the

number of lines in the file, truncate it to an int and add one to the

value, you'll find out the maximum width that your line count will

be.

You'll notice a couple of strange calls inside the for loop: setf( ) and

width( ). These are ostream calls that allow you to control, in this

case, the justification and width of the output. However, the width

setting applies only to the next output operation, so it must be set

each time a line is output; that is why the calls are inside

the for loop. Volume 2 of this book has an entire chapter explaining

iostreams that will tell you more about these calls as well as other

ways to control iostreams.

When Linenum.cpp is applied to HowMany2.out, the result is

 1) HowMany2()
 2)   h: objectCount = 1
 3) Entering f()
 4) HowMany2(const HowMany2&)
 5)   h copy: objectCount = 2
 6) x argument inside f()
 7)   h copy: objectCount = 2
 8) Returning from f()
 9) HowMany2(const HowMany2&)
10)   h copy copy: objectCount = 3
11) ~HowMany2()
12)   h copy: objectCount = 2
13) h2 after call to f()
14)   h copy copy: objectCount = 2
15) Call f(), no return value
16) HowMany2(const HowMany2&)
17)   h copy: objectCount = 3
18) x argument inside f()
19)   h copy: objectCount = 3
20) Returning from f()
21) HowMany2(const HowMany2&)
22)   h copy copy: objectCount = 4
23) ~HowMany2()
24)   h copy: objectCount = 3
25) ~HowMany2()
26)   h copy copy: objectCount = 2
27) After call to f()
28) ~HowMany2()
29)   h copy copy: objectCount = 1
30) ~HowMany2()
31)   h: objectCount = 0

As you would expect, the first thing that happens is that the normal

constructor is called for h, which increments the object count to

one. But then, as f( ) is entered, the copy-constructor is quietly

called by the compiler to perform the pass-by-value. A new object

is created, which is the copy of h (thus the name "h copy") inside

the function frame of f( ), so the object count becomes two, courtesy

of the copy-constructor.

Line eight indicates the beginning of the return from f( ). But before

the local variable "h copy" can be destroyed (it goes out of scope at

the end of the function), it must be copied into the return value,

which happens to be h2. A previously unconstructed object (h2) is

created from an existing object (the local variable inside f( )), so of

course the copy-constructor is used again in line nine. Now the

name becomes "h copy copy" for h2's identifier because it's being

copied from the copy that is the local object inside f( ). After the

object is returned, but before the function ends, the object count

becomes temporarily three, but then the local object "h copy" is

destroyed. After the call to f( ) completes in line 13, there are only

two objects, h and h2, and you can see that h2 did indeed end up as

"h copy copy."

Temporary objects

Line 15 begins the call to f(h), this time ignoring the return value.

You can see in line 16 that the copy-constructor is called just as

before to pass the argument in. And also, as before, line 21 shows

the copy-constructor is called for the return value. But the copy-

constructor must have an address to work on as its destination (a

this pointer). Where does this address come from?

It turns out the compiler can create a temporary object whenever it

needs one to properly evaluate an expression. In this case it creates

one you don't even see to act as the destination for the ignored

return value of f( ). The lifetime of this temporary object is as short

as possible so the landscape doesn't get cluttered up with

temporaries waiting to be destroyed and taking up valuable

resources. In some cases, the temporary might immediately be

passed to another function, but in this case it isn't needed after the

function call, so as soon as the function call ends by calling the

destructor for the local object (lines 23 and 24), the temporary object

is destroyed (lines 25 and 26).

Finally, in lines 28-31, the h2 object is destroyed, followed by h, and

the object count goes correctly back to zero.

Default copy-constructor

Because the copy-constructor implements pass and return by value,

it's important that the compiler creates one for you in the case of

simple structures; effectively, the same thing it does in C.

However, all you've seen so far is the default primitive behavior: a

bitcopy.

When more complex types are involved, the C++ compiler will still

automatically create a copy-constructor if you don't make one.

Again, however, a bitcopy doesn't make sense, because it doesn't

necessarily implement the proper meaning.

Here's an example to show the more intelligent approach the

compiler takes. Suppose you create a new class composed of objects

of several existing classes. This is called, appropriately enough,

composition, and it's one of the ways you can make new classes from

existing classes. Now take the role of a naive user who's trying to

solve a problem quickly by creating a new class this way. You don't

know about copy-constructors, so you don't create one. The

example demonstrates what the compiler does while creating the

default copy-constructor for your new class:

//: C11:DefaultCopyConstructor.cpp

// Automatic creation of the copy-constructor

#include <iostream>

#include <string>

using namespace std;

class WithCC { // With copy-constructor

public:

// Explicit default constructor required:

WithCC() {}

WithCC(const WithCC&) {

cout << "WithCC(WithCC&)" << endl;

}

};

class WoCC { // Without copy-constructor

string id;

public:

WoCC(const string& ident = "") : id(ident) {}

void print(const string& msg = "") const {

if(msg.size() != 0) cout << msg << ": ";

cout << id << endl;

}

};

class Composite {

WithCC withcc; // Embedded objects

WoCC wocc;

public:

Composite() : wocc("Composite()") {}

void print(const string& msg = "") const {

wocc.print(msg);

}

};

int main() {

Composite c;

c.print("Contents of c");

cout << "Calling Composite copy-constructor"

<< endl;

Composite c2 = c; // Calls copy-constructor

c2.print("Contents of c2");

} ///:~

The class WithCC contains a copy-constructor, which simply

announces that it has been called, and this brings up an interesting

issue. In the class Composite an object of WithCC is created using

a default constructor. If there were no constructors at all in

WithCC, the compiler would automatically create a default

constructor, which would do nothing in this case. However, if you

add a copy-constructor, you've told the compiler you're going to

handle constructor creation, so it no longer creates a default

constructor for you and will complain unless you explicitly create a

default constructor as was done for WithCC.

The class WoCC has no copy-constructor, but its constructor will

store a message in an internal string that can be printed out using

print( ). This constructor is explicitly called in Composite's

constructor initializer list (briefly introduced in Chapter 8 and

covered fully in Chapter 14). The reason for this becomes apparent

later.

The class Composite has member objects of both WithCC and

WoCC (note the embedded object wocc is initialized in the

constructor-initializer list, as it must be), and no explicitly defined

copy-constructor. However, in main( ) an object is created using the

copy-constructor in the definition:

Composite c2 = c;

The copy-constructor for Composite is created automatically by the

compiler, and the output of the program reveals the way that it is

created:

Contents of c: Composite()

Calling Composite copy-constructor

WithCC(WithCC&)

Contents of c2: Composite()

To create a copy-constructor for a class that uses composition (and

inheritance, which is introduced in Chapter 14), the compiler

recursively calls the copy-constructors for all the member objects

and base classes. That is, if the member object also contains another

object, its copy-constructor is also called. So in this case, the

compiler calls the copy-constructor for WithCC. The output shows

this constructor being called. Because WoCC has no copy-constructor,

the compiler creates one for it that copies each member (here the

string member is copied by its own copy-constructor; for primitive

members this amounts to a bitcopy), and calls that inside the

Composite copy-constructor. The call to Composite::print( ) in

main( ) shows that this happens because

the contents of c2.wocc are identical to the contents of c.wocc. The

process the compiler goes through to synthesize a copy-constructor

is called memberwise initialization.

It's always best to create your own copy-constructor instead of

letting the compiler do it for you. This guarantees that it will be

under your control.

Alternatives to copy-construction

At this point your head may be swimming, and you might be

wondering how you could have possibly written a working class

without knowing about the copy-constructor. But remember: You

need a copy-constructor only if you're going to pass an object of

your class by value. If that never happens, you don't need a copy-

constructor.

Preventing pass-by-value

"But," you say, "if I don't make a copy-constructor, the compiler

will create one for me. So how do I know that an object will never

be passed by value?"

There's a simple technique for preventing pass-by-value: declare a

private copy-constructor. You don't even need to create a

definition, unless one of your member functions or a friend

function needs to perform a pass-by-value. If the user tries to pass

or return the object by value, the compiler will produce an error

message because the copy-constructor is private. It can no longer

create a default copy-constructor because you've explicitly stated

that you're taking over that job.

Here's an example:

//: C11:NoCopyConstruction.cpp

// Preventing copy-construction

class NoCC {

int i;

NoCC(const NoCC&); // No definition

public:

NoCC(int ii = 0) : i(ii) {}

};

void f(NoCC);

int main() {

NoCC n;

//! f(n); // Error: copy-constructor called

//! NoCC n2 = n; // Error: c-c called

//! NoCC n3(n); // Error: c-c called

} ///:~

Notice the use of the more general form

NoCC(const NoCC&);

using the const.

Functions that modify outside objects

Reference syntax is nicer to use than pointer syntax, yet it clouds

the meaning for the reader. For example, in the iostreams library

one overloaded version of the get( ) function takes a char& as an

argument, and the whole point of the function is to modify its

argument by inserting the result of the get( ). However, when you

read code using this function it's not immediately obvious to you

that the outside object is being modified:

char c;

cin.get(c);

Instead, the function call looks like a pass-by-value, which suggests

the outside object is not modified.

Because of this, it's probably safer from a code maintenance

standpoint to use pointers when you're passing the address of an

argument to modify. If you always pass addresses as const

references except when you intend to modify the outside object via

the address, where you pass by non-const pointer, then your code

is far easier for the reader to follow.

Pointers to members

A pointer is a variable that holds the address of some location. You

can change what a pointer selects at runtime, and the destination of

the pointer can be either data or a function. The C++

pointer-to-member follows this same concept, except that what it

selects is a location inside a class. The dilemma here is that a

pointer needs an address, but there is no "address" inside a class;

selecting a member of a class means offsetting into that class. You

can't produce an actual address until you combine that offset with

the starting address of a particular object. The syntax of pointers to

members requires that you select an object at the same time you're

dereferencing the pointer to member.

To understand this syntax, consider a simple structure, with a

pointer sp and an object so for this structure. You can select

members with the syntax shown:

//: C11:SimpleStructure.cpp

struct Simple { int a; };

int main() {

Simple so, *sp = &so;

sp->a;

so.a;

} ///:~

Now suppose you have an ordinary pointer to an integer, ip. To

access what ip is pointing to, you dereference the pointer with a *:

*ip = 4;

Finally, consider what happens if you have a pointer that happens

to point to something inside a class object, even if it does in fact

represent an offset into the object. To access what it's pointing at,

you must dereference it with *. But it's an offset into an object, so

you must also refer to that particular object. Thus, the * is combined

with the object dereference. So the new syntax becomes ->* for a

pointer to an object, and .* for the object or a reference, like this:

objectPointer->*pointerToMember = 47;

object.*pointerToMember = 47;

Now, what is the syntax for defining pointerToMember? Like any

pointer, you have to say what type it's pointing at, and you use a *

in the definition. The only difference is that you must say what

class of objects this pointer-to-member is used with. Of course, this

is accomplished with the name of the class and the scope resolution

operator. Thus,

int ObjectClass::*pointerToMember;

defines a pointer-to-member variable called pointerToMember that

points to any int inside ObjectClass. You can also initialize the

pointer-to-member when you define it (or at any other time):

int ObjectClass::*pointerToMember = &ObjectClass::a;

There is actually no "address" of ObjectClass::a because you're just

referring to the class and not an object of that class. Thus,

&ObjectClass::a can be used only as pointer-to-member syntax.

Here's an example that shows how to create and use pointers to

data members:

//: C11:PointerToMemberData.cpp

#include <iostream>

using namespace std;

class Data {

public:

int a, b, c;

void print() const {

cout << "a = " << a << ", b = " << b

<< ", c = " << c << endl;

}

};

int main() {

Data d, *dp = &d;

int Data::*pmInt = &Data::a;

dp->*pmInt = 47;

pmInt = &Data::b;

d.*pmInt = 48;

pmInt = &Data::c;

dp->*pmInt = 49;

dp->print();

} ///:~

Obviously, these are too awkward to use anywhere except for

special cases (which is exactly what they were intended for).

Also, pointers to members are quite limited: they can be assigned

only to a specific location inside a class. You could not, for example,

increment them or order them with < and > as you can with ordinary

pointers; only comparison for equality is allowed.

Functions

A similar exercise produces the pointer-to-member syntax for

member functions. A pointer to a function (introduced at the end of

Chapter 3) is defined like this:

int (*fp)(float);

The parentheses around (*fp) are necessary to force the compiler to

evaluate the definition properly. Without them this would appear

to be a function that returns an int*.

Parentheses also play an important role when defining and using

pointers to member functions. If you have a function inside a class,

you define a pointer to that member function by inserting the class

name and scope resolution operator into an ordinary function

pointer definition:

//: C11:PmemFunDefinition.cpp

class Simple2 {

public:

int f(float) const { return 1; }

};

int (Simple2::*fp)(float) const;

int (Simple2::*fp2)(float) const = &Simple2::f;

int main() {

fp = &Simple2::f;

} ///:~

In the definition for fp2 you can see that a pointer to member

function can also be initialized when it is created, or at any other

time. Unlike non-member functions, the & is not optional when

taking the address of a member function. However, you can give

the function identifier without an argument list, because overload

resolution can be determined by the type of the pointer to member.

An example

The value of a pointer is that you can change what it points to at

runtime, which provides an important flexibility in your

programming because through a pointer you can select or change

behavior at runtime. A pointer-to-member is no different; it allows

you to choose a member at runtime. Typically, your classes will

only have member functions publicly visible (data members are

usually considered part of the underlying implementation), so the

following example selects member functions at runtime.

//: C11:PointerToMemberFunction.cpp

#include <iostream>

using namespace std;

class Widget {

public:

void f(int) const { cout << "Widget::f()\n"; }

void g(int) const { cout << "Widget::g()\n"; }

void h(int) const { cout << "Widget::h()\n"; }

void i(int) const { cout << "Widget::i()\n"; }

};

int main() {

Widget w;

Widget* wp = &w;

void (Widget::*pmem)(int) const = &Widget::h;

(w.*pmem)(1);

(wp->*pmem)(2);

} ///:~

Of course, it isn't particularly reasonable to expect the casual user

to create such complicated expressions. If the user must directly

manipulate a pointer-to-member, then a typedef is in order. To

really clean things up, you can use the pointer-to-member as part of

the internal implementation mechanism. Here's the preceding

example using a pointer-to-member inside the class. All the user

needs to do is pass a number in to select a function. (Thanks to

Owen Mortensen for this example.)

//: C11:PointerToMemberFunction2.cpp

#include <iostream>

using namespace std;

class Widget {

void f(int) const { cout << "Widget::f()\n"; }

void g(int) const { cout << "Widget::g()\n"; }

void h(int) const { cout << "Widget::h()\n"; }

void i(int) const { cout << "Widget::i()\n"; }

enum { cnt = 4 };

void (Widget::*fptr[cnt])(int) const;

public:

Widget() {

fptr[0] = &Widget::f; // Full spec required

fptr[1] = &Widget::g;

fptr[2] = &Widget::h;

fptr[3] = &Widget::i;

}

void select(int i, int j) {

if(i < 0 || i >= cnt) return;

(this->*fptr[i])(j);

}

int count() { return cnt; }

};

int main() {

Widget w;

for(int i = 0; i < w.count(); i++)

w.select(i, 47);

} ///:~

In the class interface and in main( ), you can see that the entire

implementation, including the functions, has been hidden away.

The code must even ask for the count( ) of functions. This way, the

class implementer can change the quantity of functions in the

underlying implementation without affecting the code where the

class is used.

The initialization of the pointers-to-members in the constructor

may seem overspecified. Shouldn't you be able to say

fptr[1] = &g;

because the name g occurs in the member function, which is

automatically in the scope of the class? The problem is this doesn't

conform to the pointer-to-member syntax, which is required so

everyone, especially the compiler, can figure out what's going on.

Similarly, when the pointer-to-member is dereferenced, it seems

(this->*fptr[i])(j);

is also over-specified; this looks redundant. Again, the syntax

requires that a pointer-to-member always be bound to an object

when it is dereferenced.

Summary

Pointers in C++ are almost identical to pointers in C, which is good.

Otherwise, a lot of C code wouldn't compile properly under C++.

The only compile-time errors you will produce occur with

dangerous assignments. If these are in fact what are intended, the

compile-time errors can be removed with a simple (and explicit!)

cast.

C++ also adds the reference from Algol and Pascal, which is like a

constant pointer that is automatically dereferenced by the compiler.

A reference holds an address, but you treat it like an object.

References are essential for clean syntax with operator overloading

(the subject of the next chapter), but they also add syntactic

convenience for passing and returning objects for ordinary

functions.

The copy-constructor takes a reference to an existing object of the

same type as its argument, and it is used to create a new object

from an existing one. The compiler automatically calls the copy-

constructor when you pass or return an object by value. Although

the compiler will automatically create a copy-constructor for you, if

you think one will be needed for your class, you should always

define it yourself to ensure that the proper behavior occurs. If you

don't want the object passed or returned by value, you should

create a private copy-constructor.

Pointers-to-members have the same functionality as ordinary

pointers: You can choose a particular region of storage (data or

function) at runtime. Pointers-to-members just happen to work

with class members instead of with global data or functions. You

get the programming flexibility that allows you to change behavior

at runtime.

Exercises

Solutions to selected exercises can be found in the electronic document The Thinking in C++ Annotated

Solution Guide, available for a small fee from .

1. Turn the "bird & rock" code fragment at the beginning of

this chapter into a C program (using structs for the data

types), and show that it compiles. Now try to compile it

with the C++ compiler and see what happens.

2. Take the code fragments in the beginning of the section

titled "References in C++" and put them into a main( ).

Add statements to print output so that you can prove to

yourself that references are like pointers that are

automatically dereferenced.

3. Write a program in which you try to (1) Create a

reference that is not initialized when it is created. (2)

Change a reference to refer to another object after it is

initialized. (3) Create a NULL reference.

4. Write a function that takes a pointer argument, modifies

what the pointer points to, and then returns the

destination of the pointer as a reference.

5. Create a class with some member functions, and make

that the object that is pointed to by the argument of

Exercise 4. Make the pointer a const and make some of

the member functions const and prove that you can only

call the const member functions inside your function.

Make the argument to your function a reference instead

of a pointer.

6. Take the code fragments at the beginning of the section

titled "Pointer references" and turn them into a program.

7. Create a function that takes an argument of a reference to

a pointer to a pointer and modifies that argument. In

main( ), call the function.

8. Create a function that takes a char& argument and

modifies that argument. In main( ), print out a char

variable, call your function for that variable, and print it

out again to prove to yourself that it has been changed.

How does this affect program readability?

9. Write a class that has a const member function and a

non-const member function. Write three functions that

take an object of that class as an argument; the first takes

it by value, the second by reference, and the third by

const reference. Inside the functions, try to call both

member functions of your class and explain the results.

10. (Somewhat challenging) Write a simple function that

takes an int as an argument, increments the value, and

returns it. In main( ), call your function. Now discover

how your compiler generates assembly code and trace

through the assembly statements so that you understand

how arguments are passed and returned, and how local

variables are indexed off the stack.

11. Write a function that takes as its arguments a char, int,

float, and double. Generate assembly code with your

compiler and find the statements that push the

arguments on the stack before a function call.

12. Write a function that returns a double. Generate

assembly code and determine how the value is returned.

13. Produce assembly code for PassingBigStructures.cpp.

Trace through and demystify the way your compiler

generates code to pass and return large structures.

14. Write a simple recursive function that decrements its

argument and returns zero if the argument becomes zero,

otherwise it calls itself. Generate assembly code for this

function and explain how the way that the assembly code

is created by the compiler supports recursion.

15. Write code to prove that the compiler automatically

synthesizes a copy-constructor if you don't create one

yourself. Prove that the synthesized copy-constructor

performs a bitcopy of primitive types and calls the copy-

constructor of user-defined types.

16. Write a class with a copy-constructor that announces

itself to cout. Now create a function that passes an object

of your new class in by value and another one that

creates a local object of your new class and returns it by

value. Call these functions to prove to yourself that the

copy-constructor is indeed quietly called when passing

and returning objects by value.

17. Create a class that contains a double*. The constructor

initializes the double* by calling new double and

assigning a value to the resulting storage from the

constructor argument. The destructor prints the value

that's pointed to, assigns that value to -1, calls delete for

the storage, and then sets the pointer to zero. Now create

a function that takes an object of your class by value, and

call this function in main( ). What happens? Fix the

problem by writing a copy-constructor.

18. Create a class with a constructor that looks like a copy-

constructor, but that has an extra argument with a

default value. Show that this is still used as the copy-

constructor.

19. Create a class with a copy-constructor that announces

itself. Make a second class containing a member object of

the first class, but do not create a copy-constructor. Show

that the synthesized copy-constructor in the second class

automatically calls the copy-constructor of the first class.

20. Create a very simple class, and a function that returns an

object of that class by value. Create a second function that

takes a reference to an object of your class. Call the first

function as the argument of the second function, and

demonstrate that the second function must use a const

reference as its argument.

21. Create a simple class without a copy-constructor, and a

simple function that takes an object of that class by value.

Now change your class by adding a private declaration

(only) for the copy-constructor. Explain what happens

when your function is compiled.

22. This exercise creates an alternative to using the copy-

constructor. Create a class X and declare (but don't

define) a private copy-constructor. Make a public clone( )

function as a const member function that returns a copy

of the object that is created using new. Now write a

function that takes as an argument a const X& and clones

a local copy that can be modified. The drawback to this

approach is that you are responsible for explicitly

destroying the cloned object (using delete) when you're

done with it.

23. Explain what's wrong with both Mem.cpp and

MemTest.cpp from Chapter 7. Fix the problem.

24. Create a class containing a double and a print( ) function

that prints the double. In main( ), create pointers to

members for both the data member and the function in

your class. Create an object of your class and a pointer to

that object, and manipulate both class elements via your

pointers to members, using both the object and the

pointer to the object.

25. Create a class containing an array of int. Can you index

through this array using a pointer to member?

26. Modify PmemFunDefinition.cpp by adding an

overloaded member function f( ) (you can determine the

argument list that causes the overload). Now make a

second pointer to member, assign it to the overloaded

version of f( ), and call the function through that pointer.

How does the overload resolution happen in this case?

27. Start with FunctionTable.cpp from Chapter 3. Create a

class that contains a vector of pointers to functions, with

add( ) and remove( ) member functions to add and

remove pointers to functions. Add a run( ) function that

moves through the vector and calls all of the functions.

28. Modify the above Exercise 27 so that it works with

pointers to member functions instead.
