A: Coding Style

This appendix is not about indenting and placement of
parentheses and curly braces, although that will be
mentioned. It is about the general guidelines used in
this book for organizing the code listings.
Although many of these issues have been introduced throughout
the book, this appendix appears at the end so it can be assumed
that every topic is fair game, and if you don't understand
something you can look it up in the appropriate section.
All the decisions about coding style in this book have been
deliberately considered and made, sometimes over a period of
years. Of course, everyone has their reasons for organizing code the
way they do, and I'm just trying to tell you how I arrived at mine
and the constraints and environmental factors that brought me to
those decisions.
General
In the text of this book, identifiers (function, variable, and class
names) are set in bold. Most keywords will also be set in bold,
except for those keywords that are used so much that the bolding
can become tedious, such as "class" and "virtual."
I use a particular coding style for the examples in this book. It was
developed over a number of years, and was partially inspired by
Bjarne Stroustrup's style in his original The C++ Programming
Language.1 The subject of formatting style is good for hours of hot
debate, so I'll just say I'm not trying to dictate correct style via my
examples; I have my own motivation for using the style that I do.
Because C++ is a free-form programming language, you can
continue to use whatever style you're comfortable with.
That said, I will note that it is important to have a consistent
formatting style within a project. If you search the Internet, you will
find a number of tools that can be used to reformat all the code in
your project to achieve this valuable consistency.
1 Ibid.
Thinking in C++
The programs in this book are files that are automatically extracted
from the text of the book, which allows them to be tested to ensure
that they work correctly. Thus, the code files printed in the book
should all work without compile-time errors when compiled with
an implementation that conforms to Standard C++ (note that not all
compilers support all language features). The errors that should
cause compile-time error messages are commented out with the
comment //! so they can be easily discovered and tested using
automatic means. Errors discovered and reported to the author will
appear first in the electronic version of the book (at
) and later in updates of the book.
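The //! convention described above can be sketched roughly as follows. The names LIMIT and clampToLimit are invented for this illustration and do not come from the book. The deliberately illegal line stays in the listing but is disabled, so the file still compiles, while a tool can search for //! and re-enable the line to confirm that the compiler rejects it:

```cpp
// Hypothetical example; not a listing from the book.
const int LIMIT = 10;

int clampToLimit(int value) {
  //! LIMIT = 20; // Illegal: LIMIT is const
  return value > LIMIT ? LIMIT : value;
}
```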
One of the standards in this book is that all programs will compile
and link without errors (although they will sometimes cause
warnings). To this end, some of the programs, which demonstrate
only a coding example and don't represent stand-alone programs,
will have empty main( ) functions, like this:
int main() {}
This allows the linker to complete without an error.
The standard for main( ) is to return an int, but Standard C++
states that if there is no return statement inside main( ), the
compiler will automatically generate code to return 0. This option
(no return statement in main( )) will be used in this book (some
compilers may still generate warnings for this, but those are not
compliant with Standard C++).
File names
In C, it has been traditional to name header files (containing
declarations) with an extension of .h and implementation files (that
cause storage to be allocated and code to be generated) with an
extension of .c. C++ went through an evolution. It was first
developed on Unix, where the operating system was aware of
upper and lower case in file names. The original file names were
simply capitalized versions of the C extensions: .H and .C. This of
course didn't work for operating systems that didn't distinguish
upper and lower case, such as DOS. DOS C++ vendors used
extensions of hxx and cxx for header files and implementation files,
respectively, or hpp and cpp. Later, someone figured out that the
only reason you needed a different extension for a file was so the
compiler could determine whether to compile it as a C or C++ file.
Because the compiler never compiled header files directly, only the
implementation file extension needed to be changed. The custom,
across virtually all systems, is now to use cpp for
implementation files and h for header files. Note that when
including Standard C++ header files, the option of having no file
name extension is used, i.e.: #include <iostream>.
Begin and end comment tags
A very important issue with this book is that all code that you see
in the book must be verified to be correct (with at least one
compiler). This is accomplished by automatically extracting the
files from the book. To facilitate this, all code listings that are meant
to be compiled (as opposed to code fragments, of which there are
few) have comment tags at the beginning and end. These tags are
used by the code-extraction tool ExtractCode.cpp in Volume 2 of
this book (which you can find on the Web site)
to pull each code listing out of the plain-ASCII text version of this
book.
The end-listing tag simply tells ExtractCode.cpp that it's the end of
the listing, but the begin-listing tag is followed by information
about what subdirectory the file belongs in (generally organized by
chapters, so a file that belongs in Chapter 8 would have a tag of
C08), followed by a colon and the name of the listing file.
Because ExtractCode.cpp also creates a makefile for each
subdirectory, information about how a program is made and the
command-line used to test it is also incorporated into the listings. If
a program is stand-alone (it doesn't need to be linked with
anything else) it has no extra information. This is also true for
header files. However, if it doesn't contain a main( ) and is meant
to be linked with something else, then it has an {O} after the file
name. If this listing is meant to be the main program but needs to
be linked with other components, there's a separate line that begins
with //{L} and continues with all the files that need to be linked
(without extensions, since those can vary from platform to
platform).
You can find examples throughout the book.
If a file should be extracted but the begin- and end-tags should not
be included in the extracted file (for example, if it's a file of test
data) then the begin-tag is immediately followed by a `!'.
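A tagged listing might look roughly like the sketch below. The tag spellings //: and ///:~ are the ones I recall from the book's printed listings; this appendix does not spell them out, so treat them as an assumption. The file name Doubler.cpp and the function doubleIt are invented for illustration.

```cpp
//: C08:Doubler.cpp {O}
// Hypothetical listing (not from the book). It has
// no main( ) and is meant to be linked with another
// file, so the file name is followed by {O}.
int doubleIt(int x) { return x * 2; }
///:~
```

Under the same assumptions, a main program that links with this file would start with the begin-tag for its own file name (say, C08:DoublerTest.cpp) and then a separate line reading //{L} Doubler, which names the file to link without its extension.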
Parentheses, braces, and indentation
You may notice the formatting style in this book is different from
many traditional C styles. Of course, everyone thinks their own
style is the most rational. However, the style used here has a simple
logic behind it, which will be presented here mixed in with ideas on
why some of the other styles developed.
The formatting style is motivated by one thing: presentation, both
in print and in live seminars. You may feel your needs are different
because you don't make a lot of presentations. However, working
code is read much more than it is written, and so it should be easy
for the reader to perceive. My two most important criteria are
"scannability" (how easy it is for the reader to grasp the meaning of
a single line) and the number of lines that can fit on a page. This
latter may sound funny, but when you are giving a live
presentation, it's very distracting for the audience if the presenter
must shuffle back and forth between slides, and a few wasted lines
can cause this.
Everyone seems to agree that code inside braces should be
indented. What people don't agree on, and the place where there's
the most inconsistency within formatting styles, is this: Where
does the opening brace go? This one question, I think, is what
causes such variations among coding styles. (For an enumeration of
coding styles, see C++ Programming Guidelines, by Tom Plum and
Dan Saks, Plum Hall 1991.) I'll try to convince you that many of
today's coding styles come from pre-Standard C constraints (before
function prototypes) and are thus inappropriate now.
First, my answer to that key question: the opening brace should
always go on the same line as the "precursor" (by which I mean
"whatever the body is about: a class, function, object definition, if
statement, etc."). This is a single, consistent rule I apply to all of the
code I write, and it makes formatting much simpler. It also makes
the code easier to scan. When you look at this line:
int func(int a);
you know, by the semicolon at the end of the line, that this is a
declaration and it goes no further, but when you see the line:
int func(int a) {
you immediately know it's a definition because the line finishes
with an opening brace, not a semicolon. By using this approach,
there's no difference in where you place the opening brace
for a multi-line definition:
int func(int a) {
  int b = a + 1;
  return b * 2;
}
and for a single-line definition that is often used for inlines:
int func(int a) { return (a + 1) * 2; }
Similarly, for a class:
class Thing;
is a class name declaration, and
class Thing {
is a class definition. You can tell by looking at the single line in all
cases whether it's a declaration or definition. And of course,
putting the opening brace on the same line, instead of a line by
itself, allows you to fit more lines on a page.
So why do we have so many other styles? In particular, you'll
notice that most people create classes following the style above
(which Stroustrup uses in all editions of his book The C++
Programming Language from Addison-Wesley) but create function
definitions by putting the opening brace on a single line by itself
(which also engenders many different indentation styles).
Stroustrup does this except for short inline functions. With the
approach I describe here, everything is consistent: you name
whatever it is (class, function, enum, etc.) and on that same line
you put the opening brace to indicate that the body for this thing is
about to follow. Also, the opening brace is the same for short
inlines and ordinary function definitions.
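As a small sketch of the single rule applied across different constructs, the fragment below puts the opening brace on the precursor's line for an enum, a class, short inline members, an ordinary function, and the statements inside it. All of the names are invented for this illustration.

```cpp
// Hypothetical names; a sketch of the one-rule style.
enum Direction { UP, DOWN };

class Counter {
  int count;
public:
  Counter() : count(0) {}
  void step(Direction d) {
    if(d == UP) {
      count++;
    } else {
      count--;
    }
  }
  int value() const { return count; }
};

int sumTo(int n) {
  int total = 0;
  for(int i = 1; i <= n; i++) {
    total += i;
  }
  return total;
}
```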
I assert that the style of function definition used by many folks
comes from pre-function-prototyping C, in which you didn't
declare the arguments inside the parentheses, but instead between
the closing parenthesis and the opening curly brace (this shows C's
assembly-language roots):
void bar(x, y)
  int x;
  float y;
{
  /* body here */
}
Here, it would be quite ungainly to put the opening brace on the
same line, so no one did it. However, they did make various
decisions about whether the braces should be indented with the
body of the code or whether they should be at the level of the
"precursor." Thus, we got many different formatting styles.
There are other arguments for placing the brace on the line
immediately following the declaration (of a class, struct, function,
etc.). The following came from a reader, and is presented here so
you know what the issues are:
Experienced `vi' (vim) users know that typing the `]' key twice
will take the user to the next occurrence of `{` (or ^L) in column
0. This feature is extremely useful in navigating code (jumping
to the next function or class definition). [My comment: when I
was initially working under Unix, GNU Emacs was just
appearing and I became enmeshed in that. As a result, `vi' has
never made sense to me, and thus I do not think in terms of
"column 0 locations." However, there is a fair contingent of `vi'
users out there, and they are affected by this issue.]
Placing the `{` on the next line eliminates some confusing code
in complex conditionals, aiding in the scannability. Example:
if(cond1
   && cond2
   && cond3) {
  statement;
}
The above [asserts the reader] has poor scannability. However,
if (cond1
    && cond2
    && cond3)
{
  statement;
}
breaks up the `if' from the body, resulting in better readability.
[Your opinions on whether this is true will vary depending on
what you're used to.]
Finally, it's much easier to visually align braces when they are
aligned in the same column. They visually "stick out" much
better. [End of reader comment]
The issue of where to put the opening curly brace is probably the
most discordant issue. I've learned to scan both forms, and in the
end it comes down to what you've grown comfortable with.
However, I note that the official Java coding standard (found on
Sun's Java Web site) is effectively the same as the one I present here.
Since more folks are beginning to program in both languages, the
consistency between coding styles may be helpful.
The approach I use removes all the exceptions and special cases,
and logically produces a single style of indentation as well. Even
within a function body, the consistency holds, as in:
for(int i = 0; i < 100; i++) {
  cout << i << endl;
  cout << x * i << endl;
}
The style is easy to teach and to remember: you use a single,
consistent rule for all your formatting, not one for classes, two for
functions (one-line inlines vs. multi-line), and possibly others for
for loops, if statements, etc. The consistency alone, I think, makes it
worthy of consideration. Above all, C++ is a newer language than
C, and although we must make many concessions to C, we
shouldn't be carrying too many artifacts with us that cause
problems in the future. Small problems multiplied by many lines of
code become big problems. For a thorough examination of the
subject, albeit in C, see C Style: Standards and Guidelines, by David
Straker (Prentice-Hall 1992).
The other constraint I must work under is the line width, since the
book has a limitation of 50 characters. What happens when
something is too long to fit on one line? Well, again I strive to have
a consistent policy for the way lines are broken up, so they can be
easily viewed. As long as something is part of a single definition,
argument list, etc., continuation lines should be indented one level
in from the beginning of that definition, argument list, etc.
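The continuation rule can be sketched as follows. The function makeLabel is invented for this illustration, and it uses std::to_string, which assumes a C++11 or later compiler (newer than the compilers the book targets). Both the argument list and the return expression are too long for a 50-character line, so each continuation line is indented one level from the start of the construct it belongs to:

```cpp
#include <string>

// Hypothetical function; not from the book.
std::string makeLabel(const std::string& prefix,
  const std::string& name, int serialNumber) {
  return prefix + ":" + name + "#" +
    std::to_string(serialNumber);
}
```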
Identifier names
Those familiar with Java will notice that I have switched to using
the standard Java style for all identifier names. However, I cannot
be completely consistent here because identifiers in the Standard C
and C++ libraries do not follow this style.
The style is quite straightforward. The first letter of an identifier is
only capitalized if that identifier is a class. If it is a function or
variable, then the first letter is lowercase. The rest of the identifier
consists of one or more words, run together but distinguished by
capitalizing each word. So a class looks like this:
class FrenchVanilla : public IceCream {
an object identifier looks like this:
FrenchVanilla myIceCreamCone(3);
and a function looks like this:
void eatIceCreamCone();
(for either a member function or a regular function).
The one exception is for compile-time constants (const or #define),
in which all of the letters in the identifier are uppercase.
The value of the style is that capitalization has meaning: you can
see from the first letter whether you're talking about a class or an
object/method. This is especially useful when static class members
are accessed.
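To see how the capitalization helps with static members, here is a minimal sketch that fills in the fragments above. The member names (conesMade, scoopCount, totalCones) and the constant MAXSCOOPS are invented for this illustration.

```cpp
// Hypothetical classes; not listings from the book.
const int MAXSCOOPS = 3; // Compile-time constant

class IceCream {
  static int conesMade; // Shared by all objects
  int scoops;
public:
  IceCream(int n) : scoops(n) { conesMade++; }
  int scoopCount() const { return scoops; }
  static int totalCones() { return conesMade; }
};

int IceCream::conesMade = 0;

class FrenchVanilla : public IceCream {
public:
  FrenchVanilla(int n) : IceCream(n) {}
};
```

Given an object created with FrenchVanilla myIceCreamCone(MAXSCOOPS), the call myIceCreamCone.scoopCount( ) starts with a lowercase letter and so goes through an object, while IceCream::totalCones( ) starts with an uppercase letter and so goes through the class itself.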
Order of header inclusion
Headers are included in order from "the most specific to the most
general." That is, any header files in the local directory are included
first, then any of my own "tool" headers, such as require.h, then
any third-party library headers, then the Standard C++ Library
headers, and finally the C library headers.
The justification for this comes from John Lakos in Large-Scale C++
Software Design (Addison-Wesley, 1996):
Latent usage errors can be avoided by ensuring that the .h file of a
component parses by itself - without externally-provided declarations
or definitions... Including the .h file as the very first line of the .c file
ensures that no critical piece of information intrinsic to the physical
interface of the component is missing from the .h file (or, if there is,
that you will find out about it as soon as you try to compile the .c file).
If the order of header inclusion goes "from most specific to most
general," then it's more likely that if your header doesn't parse by
itself, you'll find out about it sooner and prevent annoyances down
the road.
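The ordering might look like the sketch below. The first three header names are hypothetical placeholders (the path to require.h in particular is an assumption), and they are commented out so that the fragment compiles on its own. The function sameLength is invented only to show the two active headers in use.

```cpp
// Inclusion order, most specific to most general.
// #include "Inventory.h"      // 1. Local directory
// #include "../require.h"     // 2. "Tool" headers
// #include <somelib/Widget.h> // 3. Third-party
#include <string>              // 4. Standard C++
#include <cstring>             // 5. C library

// Uses one facility from each active header.
bool sameLength(const std::string& a,
  const char* b) {
  return a.size() == std::strlen(b);
}
```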
Include guards on header files
Include guards are always used inside header files to prevent
multiple inclusion of a header file during the compilation of a
single .cpp file. The include guards are implemented using a
preprocessor #define and checking to see that a name hasn't
already been defined. The name used for the guard is based on the
name of the header file, with all letters of the file name uppercase
and replacing the `.' with an underscore. For example:
// IncludeGuard.h
#ifndef INCLUDEGUARD_H
#define INCLUDEGUARD_H
// Body of header file here...
#endif // INCLUDEGUARD_H
The identifier on the last line is included for clarity. Although some
preprocessors ignore any characters after an #endif, that isn't
standard behavior and so the identifier is commented.
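To see what the guard buys you, the following sketch simulates including a hypothetical Point.h twice by pasting its guarded contents twice into one file. Without the guard the second copy would be a redefinition error; with it, the preprocessor skips the second body because POINT_H is already defined. The names Point and sumCoordinates are invented for this illustration.

```cpp
// First "inclusion" of a hypothetical Point.h
#ifndef POINT_H
#define POINT_H
struct Point { int x, y; };
#endif // POINT_H

// Second "inclusion": POINT_H is already defined,
// so everything up to the #endif is skipped.
#ifndef POINT_H
#define POINT_H
struct Point { int x, y; };
#endif // POINT_H

int sumCoordinates(const Point& p) {
  return p.x + p.y;
}
```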
Use of namespaces
In header files, any "pollution" of the namespace in which the
header is included must be scrupulously avoided. That is, if you
change the namespace outside of a function or class, you will cause
that change to occur for any file that includes your header,
resulting in all kinds of problems. No using declarations of any
kind are allowed outside of function definitions, and no global
using directives are allowed in header files.
In cpp files, any global using directives will only affect that file,
and so in this book they are generally used to produce more easily-
readable code, especially in small programs.
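A header that follows these rules might contain code like the sketch below. The function totalLength is invented for this illustration. Names are fully qualified at namespace scope, and the only using declaration sits inside a function body, so nothing leaks into a file that includes the header:

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical header-style code; not from the book.
inline std::size_t totalLength(
  const std::vector<std::string>& words) {
  using std::size_t; // Visible only in this function
  size_t total = 0;
  for(size_t i = 0; i < words.size(); i++) {
    total += words[i].size();
  }
  return total;
}
```

In a small cpp file, by contrast, a global using namespace std; affects only that one file.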
Use of require( ) and assure( )
The require( ) and assure( ) functions defined in require.h are used
consistently throughout most of the book, so that they may
properly report problems. If you are familiar with the concepts of
preconditions and postconditions (introduced by Bertrand Meyer) you
will recognize that require( ) and assure( ) more or less
provide preconditions (usually) and postconditions (occasionally).
Thus, at the beginning of a function, before any of the "core" of the
function is executed, the preconditions are checked to make sure
everything is proper and that all of the necessary conditions are
correct. Then the "core" of the function is executed, and sometimes
some postconditions are checked to make sure that the new state of
the data is within defined parameters. You'll notice that the
postcondition checks are rare in this book, and assure( ) is
primarily used to make sure that files were opened successfully.
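The precondition, core, postcondition shape can be sketched as follows. Because require.h is not reproduced here, the sketch uses a hypothetical stand-in named requireThat, which throws an exception so that a failure can be observed; it is not the book's require( ) and may behave differently from it. The function withdraw is likewise invented for illustration.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stand-in, NOT the book's require.h.
inline void requireThat(bool condition,
  const std::string& message) {
  if(!condition) throw std::logic_error(message);
}

int withdraw(int balance, int amount) {
  // Preconditions: checked before the core runs
  requireThat(amount >= 0, "negative amount");
  requireThat(amount <= balance,
    "insufficient funds");
  // The "core" of the function
  int newBalance = balance - amount;
  // Postcondition: new state is within bounds
  requireThat(newBalance >= 0, "negative balance");
  return newBalance;
}
```

With this stand-in, withdraw(10, 3) returns 7, while withdraw(1, 5) fails its second precondition and throws before the core is executed.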