Pre-processor, include directive, define directive, Other Preprocessor Directives, Macros

CS201 Introduction to Programming

Lecture Handout

Introduction to Programming

Lecture No. 23

Reading Material

Deitel & Deitel - C++ How to Program

Chapter. 17

Summary

Pre-processor

include directive

define directive

Other Preprocessor Directives

Macros

Example

Tips

Preprocessor

C is a deliberately small language, so it relies on a separate tool, the preprocessor, to extend it. A preprocessor comes with every C compiler. It makes some changes to the code before compilation, and the compiler receives the modified source file. Normally we do not see what the preprocessor has inserted.

We have so far been using the #include preprocessor directive, as in #include <iostream.h>. What does #include actually do? When we write #include <somefile>, somefile is an ordinary text file of C code, and the line containing the directive is replaced by the text of that file. We do not see the included text in our own source file, but the compiler sees all of it when it starts its work. All preprocessor directives start with the # sign.

There are two ways to use #include. So far we have enclosed the file name in angle brackets, i.e. #include <somefile>. This form tells the preprocessor that the file lives in a particular folder (directory) and should be included from there. We have included iostream.h, stdlib.h, fstream.h, string.h and some other files this way. With the Dev-Cpp compiler you can look at this directory yourself: open the Dev-Cpp folder in Windows Explorer and you will see several subfolders, one of which is `include'. Expand it and you will see many files, most of them with the extension `h'. Here `h' stands for header. We normally include these files at the start (the head) of the program, which is why they are known as header files. A file can be included anywhere in the code, but the position must be logical and appropriate.

include directive

As you know, we have been using functions in our programs. If we want to call a function, its prototype must be declared before its use. The compiler must know the name of the function, the arguments it expects and its return type. With this information, the first pass of compilation can succeed. If we are using a library function, its code is added to our program at the time of linking. Library functions are available in compiled form, which the linker links with our program. After its first pass, the compiler converts the source code into object code. Object code is machine code, but it is not yet an executable program. The object code of our program is combined with the object code of the library functions that the program uses. Then some memory location information is added and we get the executable file. The linker performs this task, while the compiler records the names and arguments of the functions in the object code.

To check that functions are used correctly, the compiler needs to know the definition of the function, or at least its prototype. We have both options for our own functions. We can define the function at the start of the program and use it in the main program; in this case the definition serves as both prototype and definition. The compiler compiles the function and the main program, and then we can link and execute it. As the program grows, it becomes difficult to write the definitions of all the functions at the beginning. Sometimes we write the functions in a different file and compile it into an object file. We can then make the prototypes of these functions available to our program in different ways. One way is to write the prototypes of all these functions at the start, before the program itself. The better way is to make a header file (say myHeaderFile.h), write the prototypes of all the functions in it and save it as an ordinary text file. We then include it in our program using the #include directive. As this file is located in the same place as our source code, its name is not enclosed in angle brackets in the #include directive. It is written in quotation marks, as under:

#include "myHeaderFile.h"

The preprocessor will search for the file "myHeaderFile.h" in the current working directory. Let us see the difference between including a file in angle brackets and in quotation marks. When we include the file in angle brackets, the compiler looks in a specific directory. When the file is included in quotation marks, it looks in the current working directory. In the Dev-Cpp IDE, select Compiler Options under the Tools menu. In this dialogue box, we can specify the directories for libraries and include files. When we use angle brackets with #include, the compiler looks in the directories specified in the include directories option. If we write our own header file and save it in the `My Documents' folder along with our source code, the header file should be included with quotation marks.

When we compile our source code, the include directives are processed first, one by one. If the first directive is #include <iostream.h>, this file is searched for in the include directory. The complete header file is then inserted into our source code at the position where the include directive is written. If the second include directive names another file, that file is also inserted into the source code after iostream.h, and so on. The compiler gets this expanded source code for compilation. The expanded source code is not shown to us; we only get the executable file in the end.

Can we include a header file at a point other than the start of the program? Yes. There is no restriction, and we can include a file wherever we want. Normally we do this at the start of the program, as these are header files. It is also legal to write a portion of code in a different file and include that file somewhere in the middle of the program, but this is not common practice. We have so far discussed the include directive. Now we will discuss another important directive, i.e. the define directive.

define directive

We can define macros with the #define directive. A macro is a special name that is substituted in the code by its definition, giving us expanded code. For example, suppose we are writing a program that uses the constant Pi, a universal constant with the value 3.1415926. We could write 3.1415926 wherever it is needed, but it is better to define Pi once and use the name instead of the literal value. We could declare a variable, double Pi = 3.1415926, and use it in the program, but a variable can be re-assigned a new value. We want every occurrence of the name to be replaced by the actual value, and we want to be sure that this value cannot be changed. With the define directive, we can define PI as:

#define PI 3.1415926

We write the name of the symbolic constant and then its value, separated by a space. By convention, symbolic constants are written in capitals so that they are easy to identify in the code. When we ask the compiler to compile this file, the preprocessor looks for the define directives and replaces every name defined this way with its value. The compiler never sees PI: every use of PI has been replaced with 3.1415926 before compilation begins.

A small program showing the usage of #define:

/* Program to show the usage of define */
#include <iostream.h>

// Defining PI
#define PI 3.1415926

int main()
{
    int radius = 5;
    cout << "Area of circle with radius " << radius << " = " << PI * radius * radius;
    return 0;
}

What is the benefit of using it? Suppose we have written a program that uses the value of PI as 3.14, i.e. up to two decimal places. After checking the accuracy of the results, we decide that we need the value 3.1415926. If PI is not defined with #define, we have to search for 3.14 and replace it with 3.1415926 at each and every place in the source code. This `search and replace' task can go wrong: we can miss some place or replace something else. Suppose that somewhere 3.14 represents something else, like a tax rate. We may accidentally change this value too, mistaking it for the value of PI. So we cannot conduct a blind search and replace and expect it to work. It is better to define PI at the start of the program and use PI instead of its value. Now if we want to change the value of PI, we change it at only one place and the complete program gets the new value. Whatever we define with the #define directive is substituted with its value before the compiler compiles the file, so a change at one place updates the complete program.

We can also put this definition of PI in a header file. The benefit is that every program using the value of PI from this header file gets the updated value when the value in the header file is changed. For example, suppose we have five functions that use PI, defined in five different files. Without a header file, we would need to define PI (i.e. #define PI 3.1415926) in all five source files. Instead, we can define it in one header file and include this header file in all the source files. Each function then gets the value of PI from the header file, and by changing the value there, all the functions are updated with the new value. As preprocessor directives are not C statements, we do not put a semicolon at the end of the line. A semicolon written after a #define becomes part of the replacement text and is copied into every place where the symbol is used, which usually results in a syntax error. A semicolon does not belong at the end of an #include line either.

Other Preprocessor Directives

There are some other preprocessor directives. Here is the list of preprocessor directives.

#include <filename>

#include "filename"

#define

#undef

#ifdef

#ifndef

#if

#else

#elif

#endif

#error

#line

#pragma

#assert

All the preprocessor directives start with the sharp sign (#). We can also do conditional compilation with them, using #if, #else and #endif, with #elif for `else if'. We can also check whether a symbol defined with #define is available or not; #ifdef is used for this purpose. If we have defined PI, we can always say:

#ifdef PI

... Then do something

#endif

This is an example of conditional compilation. If a symbolic constant is already defined, defining it again with a different value is not permitted (compilers report it as an error or a warning), so it is better to check whether it is already defined. If it is defined and we want to give it another value, it should be undefined first with the #undef directive and then defined again with the new value. Another use of conditional compilation is debugging. A common technique is to put output statements at various points in the program to check the values of different variables and to verify that the program is working correctly. Removing all these output statements afterwards is extremely tedious. Conditional compilation overcomes this problem. We define a symbol at the start of the program as:

#define DEBUG

Here we have defined a symbol DEBUG with no value after it. The value is optional with the define directive. The output statements for debugging will be written as:

#ifdef DEBUG

cout << "Control is in the while loop of calculating average";

#endif

This statement is compiled into the program, and therefore executed, only if the DEBUG symbol is defined. Otherwise the preprocessor removes it before compilation and it is never executed.

Here is an example using the debug output statements:

// Program that shows the use of #define for debugging
// Comment out the #define DEBUG line and see the change in the output
#include <iostream.h>
#include <stdlib.h>

#define DEBUG

int main()
{
    int z;
    int arraySize = 100;
    int a[100];
    int i;

    // Initializing the array.
    for ( i = 0; i < arraySize; i++ )
    {
        a[i] = i;
    }

    // If the symbol DEBUG is defined then this code will be compiled and executed
#ifdef DEBUG
    for ( i = 0; i < arraySize; i++ )
        cout << "\t " << a[i];
#endif

    cout << " Please enter a positive integer ";
    cin >> z;
    int found = 0;

    // loop to search the number.
    for ( i = 0; i < arraySize; i++ )
    {
        if ( z == a[i] )
        {
            found = 1;
            break;
        }
    }   // end of the search loop

    if ( found == 1 )
        cout << " We found the integer at position " << i;
    else
        cout << " The number was not found ";

    return 0;
}

With preprocessor directives, we can carry out conditional compilation and macro translation, that is, the replacement of a symbol by the value written after it. We cannot redefine a symbol with a different value without undefining it first. To undefine a symbol, #undef is used; e.g. the symbol PI can be undefined as:

#undef PI

From this point onward in the program, the symbol PI is no longer available. The compiler will not see any definition for it and will report an error if we use PI in the program after undefining it.

As an exercise, open some header files and read them. For example, we have used the header file conio.h (i.e. #include <conio.h>) for console input and output in our programs. This is a legacy library for non-graphical systems. We have two variants of conio in Dev-Cpp, i.e. conio.h and conio.c (the folder is `Dev-Cpp\include'). Open and read them. Do not try to change anything, as it may cause problems. You now have enough knowledge to read them line by line. You will see different symbols starting with an underscore ( _ ). There are lots of internal constants and symbolic names starting with a double underscore. Therefore, we should not use variable names that start with an underscore. You can also find the declarations of different functions there, e.g. the function getche() (i.e. get character with echo) is declared in conio.h. If we try to use getche() without including conio.h, the compiler will give an error like `the function getche() undeclared'. There is another interesting construct in conio.h, i.e.

#ifdef __cplusplus

extern "C" {

#endif

If the symbol __cplusplus is defined, the line `extern "C" {' is included in the code. It contains an opening brace. Where is the closing brace? Go to the end of the same file and you will find the following:

#ifdef __cplusplus

}

#endif

This is an example of conditional compilation, i.e. if the symbol is defined, these lines are included in the code before compiling. Go through all the header files we have been using in our programs so that you can see how professional programmers write code. If you have the Linux operating system, it is free and comes with its source code. The source code of Linux is written in the C language, so you can see functions written by C programming gurus. You may find the code of string manipulation functions like string copy, string compare, etc.

Macros

Macros fall into two categories. The first type is the simple macro, a symbolic name mapped to a value with #define. The value of PI can be defined as:

#define PI 3.1415926

Here the symbol PI will be replaced with the actual value (i.e. 3.1415926) in the program.

These are simple macros like symbolic names mapped to constants.

In contrast, the second type of macro takes arguments. It is also called a parameterized macro. Consider the following:

#define square(x) x * x

Since this is not a C statement, it does not need a semicolon at the end. Before the compiler gets the file, the preprocessor replaces every occurrence of square(x) with x * x, putting the actual argument in place of x: square(i) is replaced by i * i and square(3) is replaced by 3 * 3. The compiler never sees square(x). It sees only the expanded multiplication and makes the executable file from that. There is a problem with this macro definition, as seen in the following statement.

square (i + j);

Here we have i+j as x in the definition of macro. When this is replaced with the macro

definition, we will get the statement as:

i+j*i+j

This is certainly not the square of i + j. It is evaluated as i + (j * i) + j due to the precedence of the operators. How can we overcome this problem? Whenever you write a parameterized macro, it is necessary to put parentheses in the definition of the macro. First, enclose the complete definition in parentheses, and then put each x in parentheses as well. The correct definition of the macro is:

#define square(x) ((x) * (x))

This macro will work fine. When the macro is replaced by its definition in the code, the parentheses are copied as well, making the computation correct.

Here is a sample program showing the use of a simple square macro:

/* Program to show the use of macro */
#include <iostream.h>

// Definition of macro square
#define square(x) ((x) * (x))

int main()
{
    int x;
    cout << endl;
    cout << " Please enter the value of x to calculate its square ";
    cin >> x;
    cout << " Square of x = " << square(x) << endl;
    cout << " Square of x+2 = " << square(x+2) << endl;
    cout << " Square of 7 = " << square(7);
    return 0;
}

We could also write a function square(x) to calculate the square of a number. What is the difference between using the square(x) macro and a square(x) function? Whenever we call a function, a lot of work has to be done during the execution of the program. Part of the machine's memory is used as a stack for the program. The state of the program (i.e. the values of its variables, the line currently being executed, etc.) is kept on the stack. Before calling the function, the arguments are written on the stack. Execution stops at the point of the call and jumps to the code of the function. The function picks up the values of the arguments from the stack, does its computation and returns control to the calling code, which continues with the next line. So there is overhead in every function call. If we call a function inside a loop, this overhead is incurred on every iteration, so it is multiplied by the number of times the loop executes, and computer time and resources are spent on it. There are of course many situations where we need functions, but in this simple example of calculating a square, if the program calls a square function 1000 times, the call overhead is paid 1000 times.

On the other hand, if we define a square macro and use it, the code written after the macro name is substituted at every place where the macro is used. The code is expanded before compilation and the compiler sees ordinary multiplication statements. No function call is involved, which makes the program run faster. We can also write complex parameterized macros. The advantage of macros is that there is no function call overhead. The disadvantage is that every use of a macro is replaced by its definition, so a program that uses many macros becomes bloated: the expanded source code grows, and so does the executable file. Sometimes it is better to write functions instead. For simple things like taking a square, it is fine to write macros that amount to a one-line substitution by the preprocessor.

Take care of a few things while defining macros. There must be no space between the macro name and the opening parenthesis. If we put a space there, it will be treated as a simple macro without parameters. We can use more than one argument in a macro by writing a comma-separated list. The names of the arguments follow the same rules as simple variable names. After the arguments, the closing parenthesis is written. There is always a space before the definition (body) of the macro begins.

Example

Suppose we have a program that uses the area of a circle many times. We will therefore write a macro for calculating the area of a circle. We know that the formula for the area of a circle is PI * r * r, where r is the radius. This formula is substituted wherever we refer to the macro. PI is also a natural constant, so we will define it first. Then we will define the macro for the area of the circle. For visibility, it is good to write the name of the macro in capitals, as CIRCLEAREA. We do not need to pass PI as an argument to it. The only thing that needs to be passed as an argument is the radius. So the macro will be written as CIRCLEAREA(X). We will write the formula for the calculation of the area of the circle as:

#define CIRCLEAREA(X) (PI * (X) * (X))

Here is the complete code of the program:

/* A simple program using the area of circle formula as macro */
#include <iostream.h>

// Defining the macros
#define PI 3.14159
#define CIRCLEAREA(X) (PI * (X) * (X))

int main()
{
    float radius;
    cout << " Enter radius of the circle: ";
    cin >> radius;
    cout << " Area of circle is " << CIRCLEAREA(radius);
    return 0;
}

CIRCLEAREA will be replaced by the macro definition, including all its parentheses, before compilation. Because we have used parentheses in the definition of the CIRCLEAREA macro, the statement for finding the area of a circle with double the radius can be written as under:

CIRCLEAREA(2 * radius);

The above statement calculates the correct area. As the argument uses only multiplication, it would work even without the parentheses. But if the argument contains addition or subtraction, as in CIRCLEAREA(radius + 2), and the macro definition does not contain the parentheses, the correct area will not be calculated. Therefore, always use parentheses when writing macros that take arguments.

There are some other things to know about header files. As a proficient programmer, perhaps writing your own operating system, you will be using them. Many operating systems are currently in use: Windows is a popular one, DOS is another operating system for PCs, and there are Linux, different varieties of Unix, Sun Solaris and mainframe operating systems. The majority of these operating systems have a C compiler available. C is a very elegant language for writing operating systems. It is very popular and available on almost every platform.

By and large, the source code we write in our programs does not change from machine to machine. What changes are the system header files. These files belong to the machine. The header files that we have written for our program stay with the source code, but the iostream, stdlib, stdio and string header files vary in their contents from machine to machine. Over the years, as the C language has evolved, the names of these header files have become standard. Some of you may have been using some other compiler, and you will have noticed that header files such as iostream.h have the same names there. (Non-standard headers such as conio.h are available with many compilers, but not all.) The same applies to operating systems. When we change operating systems, we use the local version of the C/C++ compiler, and the names of the header files remain the same. Therefore, if we port our code from one operating system to another, there is usually no need to change anything in it. It will automatically include the header files of that compiler. Compile it and run it; in the great majority of cases it will run without any error. There may be some change in behaviour: for example, the function getche() sometimes reads a character without waiting for Enter, and sometimes you have to type the character and press Enter. Nonetheless, these header files give us a lot of portability. You can write a program on one operating system and need not carry the system header files along with the code to the other operating system.

On the other hand, the header files of our own program also help portability, in the sense that we have all the function prototypes, symbolic definitions, conditional compilations and macros in one place. While writing a lot of code, we start writing header files for ourselves because of the style in which we work. We declare some common functions in our header files. When we change the operating system, these header files are ported along with the source code. Similarly, on starting a new program, we include such a header file because it gives us access to utility functions that we have already written.

Here is an interesting exercise with #define. If you think you are sharp, here is a challenge for you. Define your own vocabulary with #define and write the C code for each word after it. One can then write a poem using this vocabulary, which the preprocessor will replace with C code. All we need is to include one header file that contains this vocabulary. So an ordinary English poem is actually C code. Interesting things can be done using these techniques.

Tips

All the preprocessor directives start with the # sign

A symbol can not be redefined without undefining it first

The conditional compilation directives help in debugging the program

Do not declare variable names starting with underscore

Always use parentheses while defining macros that take arguments
