mail@c-xx.com

  Version 0.11  ▪  Draft     Feedback is warmly welcome  

C++ Coding Conventions

Consistency is the last refuge of the unimaginative.
—Oscar Wilde

Elegance is not an ornament worthy of man.
—Seneca

Introduction
{

This document enumerates a number of conventions aimed at consistency and elegance in C++ code. In general, the focus is not on the technical aspects of programming but on look-and-feel issues.

Several well-known practices are put together, combined with personal feelings, preferences and thoughts. By no means should this be considered a teaching exercise. And, to emphasize it once for the whole document, sentences like "Don’t do..." should be taken just as a brief form of something like "in general, there is a tendency to conclude that...".

The document is neither academically polished nor complete. Consider it a partial memory dump with highlighting.

}

Basic principles
{

Before starting, let’s clarify overall predilections.

Universality vs. Customs

Should the coding style be universal and invariant, regardless of the circumstances? Or should it be adapted to the environment (operating system, framework, libraries in use, etc.)?

There is no simple answer (this statement applies to many other places, so let’s not repeat it anymore), but as a rule of thumb let’s decide in favor of customs. If the environment is well-designed, it is perhaps better to follow its spirit and guidelines.

Does such an approach make general coding conventions rather irrelevant? Well, partly yes, but definitely not completely. In certain cases the goal is environment-independent development. And even if the environment shapes the code, it usually leaves a number of style-related questions open.

In any case, there should be a place for personal preferences. Formal and unconditional application of (rather cosmetic) rules may seriously damage the joy of coding.

Logic vs. Physiologic

Let’s use the term "logic" to refer to the semantic side of the code (roles of objects, their behavior, et al.), "physics" for its technical nature (variable types, the way parameters are passed, et al.), and "physiologic" for both together.

Nobody doubts that logic needs to be represented, but should it be accompanied by physics (this question is especially prominent in naming discussions)?

Focus on logic. Do not overburden names with additional technical information.

Homogeneity vs. Recognizability

Should the same structure be re-used in different contexts? Well, sounds too fuzzy :-). Let’s consider concrete cases. Should {}-bracket policy be the same for classes, functions and control flow statements? Should variables be named depending on their scope? Should spacing rules be applied similarly to [], () and {}?

By default, homogeneity is advocated. If the language itself uses the same lexemes (e.g. {}-brackets) to denote different types of blocks, those brackets should come with the same formatting. If the comma is used to separate items, it should be pre- or post-spaced in the same way, regardless of context of enumeration.

On the other hand, if the language uses the same lexemes for completely different things (e.g. "<" for comparison and for template definition), the homogeneity rule does not apply :-).

}

Naming
{

Opportunities

Let’s see first what is in our arsenal.

Notation                            Example
Standard C / GNU notation           mouse_weight
Hungarian notation                  iMouseWeight
Camel case (or lower Camel case)    mouseWeight
Pascal case (or upper Camel case)   MouseWeight
All capitals case                   MOUSE_WEIGHT

In addition, there are a number of different prefixes/suffixes which are often used to designate scope, constness or other attributes of a named object.

In this document, mostly Camel and Pascal cases are suggested; all capitals case also has its limited role.

Hungarian notation is not included, as it needlessly emphasizes the types of variables (besides the fact that it hardly fits with Camel/Pascal cases).

Class names

Use Pascal case

This is a common practice, despite the fact that for the standard C++ and STL types GNU notation is used (see also the remark just above).
Examples: MainWindow, LinkedList.

Do not use prefixes like capital "C" (for "class")

If one does not use "f" for functions or "n" for namespaces, why should one denote classes in a special way?

Prefer avoiding prefixes like "Q" (for "Qt")

It is perhaps better to avoid prefixes for identification that a class belongs to a certain library; consider using namespaces instead.

Of course, introducing a short prefix to designate elements of a library, like classes, constants or global functions, has certain advantages, especially for multi-functional toolkits or rich frameworks, which are designed to be used throughout other projects. Moreover, if namespaces were not available, using library prefixes would be one of the best approaches.

However, if namespaces are already used to split things, as they are supposed to be, such prefixes cause redundancy and a kind of non-normalized code structure. Also, general recommendations are expected to be widely reusable, while unique short prefixes would run out pretty soon.
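
A minimal sketch of the namespace alternative follows. The library name Gui, its Button class and the function describeButton are hypothetical, invented for illustration only.

```cpp
#include <string>

// Hypothetical library: the namespace plays the role a "G" prefix would.
namespace Gui
{
    class Button
    {
    public:
        explicit Button(const std::string& text) : _text(text) {}
        const std::string& getText() const { return _text; }

    private:
        std::string _text;
    };
}

// Client code names the library once, at the point of use.
std::string describeButton()
{
    Gui::Button okButton("OK"); // rather than GButton okButton("OK")
    return okButton.getText();
}
```

The class keeps its plain name, and the origin is still visible wherever the class is used.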

Use Interface-suffix for interfaces

Although technically, in terms of C++, interfaces are a particular case of classes, semantically they are rather different. Use the Interface suffix to emphasize this. To some extent it can be considered compensation for the missing keyword interface, which languages like Java or C# have.

Give standard interface implementation the same name, but without Interface-suffix

In case there is a standard, default implementation of an interface, give it the same name, but omit the suffix. For instance, if there is a standard GUI class which implements ButtonInterface, call it Button.
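
As a rough sketch of this pairing, an interface can be modeled as a class with only pure virtual functions. The members press and getPressCount, as well as the helper pressTwice, are assumptions made up for the example.

```cpp
// Hypothetical interface: pure virtual functions and a virtual destructor only.
class ButtonInterface
{
public:
    virtual ~ButtonInterface() {}
    virtual void press() = 0;
    virtual int getPressCount() const = 0;
};

// The default implementation takes the same name without the suffix.
class Button : public ButtonInterface
{
public:
    Button() : _pressCount(0) {}
    virtual void press() { ++_pressCount; }
    virtual int getPressCount() const { return _pressCount; }

private:
    int _pressCount;
};

// Client code depends on the interface only.
int pressTwice(ButtonInterface& button)
{
    button.press();
    button.press();
    return button.getPressCount();
}
```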

Use a noun-based singular form

Plural forms for classes are somewhat confusing; avoid them. If a class represents a set, for instance, a collection of stamps, use a name like StampCollection instead of Stamps.

Please note that the meaning is "use singular Collection instead of plural Stamps", but not "use singular Stamp in StampCollection instead of plural Stamps". The singular form Stamp is used in StampCollection just because of the English language; StampsCollection would not sound right.

Another remark is that this recommendation is not intended to encourage the reader to use custom containers instead of STL or Boost ones by default.

Consider adding an ancestor class name to the class name

If a class is a derived class that belongs to a certain category but does not deserve a brand-new name, consider adding the name of the class that originates the category. For instance, if a class is derived from SeaTurtle, which is derived from Turtle, which is derived from Reptile, call it GreenTurtle, but not GreenSeaTurtle or GreenReptile.

Type (typedef) names

Apply class rules
... in particular, do not use suffix "T" or Type for types

Types are somewhat similar to classes and should be treated alike.

Variable names

Use Camel case

Camel case for variables is both elegant and traditional.
Examples: fileName, i.

Use :: for global variables

It is important to emphasize the scope of variables. Instead of naming global variables differently (Pascal case, for example, could be considered), the scope operator is recommended. If a variable sits, for instance, within a namespace ns, use it with ns::. Such an approach does not introduce extra naming complexity and keeps "name" and "scope" as two different entities.
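
The following sketch shows the same name living in three scopes. The variable retryLimit, the namespace Network and the function sumLimits are hypothetical.

```cpp
int retryLimit = 3;     // global variable

namespace Network
{
    int retryLimit = 5; // namespace-scope variable
}

int sumLimits()
{
    int retryLimit = 1; // local variable, hides the global one

    // ::name is the global, Network::name the namespace one, plain name the local.
    return ::retryLimit + Network::retryLimit + retryLimit;
}
```

The name stays the same everywhere; only the qualifier tells the reader which scope is meant.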

Use "_" for member variables

Use "_" as a prefix or suffix of member variables. This recommendation requires a longer discussion.

There are good reasons why member variables should be distinguishable from regular variables or function parameters. Firstly, the scope as such is an essential characteristic. Secondly, it is very common to name function parameters, regular variables and corresponding member variables similarly, so there is a need to distinguish them.

The appropriate thing would be a class-scope operator (similar to the global scope ::-operator). However, there is no such thing in C++. Instead, member variables can be referred to using this->. A similar practice is common in Java and C# to resolve ambiguity when a member variable has the same name as a regular variable or a function parameter. But in C++ it is visually a bit cumbersome and uncommon. Another approach is to use a prefix or suffix.

A popular pattern suggests a prefix "m" or "m_". However, an abbreviation of the word "member" hardly finds a way to the heart, similarly to C for class or c for constant. If we can avoid it in other cases, introducing such abbreviations for member variables does not sound right. So a simple "_" is suggested, and it can be considered a simulation of the missing class-scope operator. (Cannot resist mentioning that a good symbol for such a hypothetical operator would be ".", and "_" somewhat approximates it :-). )

So far so good, the "_"-prefix for member variables looks attractive. But there is one thing to have in mind. There are a number of recommendations not to start variable names with the underscore.

Is it too restrictive or not? Indeed, variable names should not begin with "_ _" (double underscore) or "_#", where "#" is a capital letter.

Each name that contains a double underscore (_ _) or begins with an underscore followed by an uppercase letter (2.11) is reserved to the implementation for any use.
Each name that begins with an underscore is reserved to the implementation for use as a name in the global namespace.
—C++ Standard

However, if a name starts with "_#", where "#" is a lowercase letter, it looks perfectly qualified for a member variable: class members are not names in the global namespace. Still, if "_" proves to be an inappropriate prefix, use it as a suffix.
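
A small sketch of the prefix form is given below; the class Counter is hypothetical. The suffix form would simply spell the member count_ instead.

```cpp
class Counter
{
public:
    explicit Counter(int count) : _count(count) {}

    // The parameter and the member share a base name without clashing.
    void setCount(int count) { _count = count; }
    int getCount() const { return _count; }

private:
    int _count; // "_" followed by a lowercase letter, at class scope
};
```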

Use noun-based singular and plural forms appropriately

Use singular form for a singular object, e.g.
city = new York;

Use plural form for arrays, lists or other collections, e.g.
Polyhedron platonicSolids[5];

Use well-known names for well-known roles

It is hard to come up with anything better than i, j, and k for general indexing purpose. Use them. Use c for character and x, y, z for coordinates. Another practice is to use e for exceptions. (A personal preference is to use x for exceptions, keeping e for events – OK, it is coming from C#.) This recommendation applies only to "quite local" variables.

Consider adding the principal class name to the variable name

Similarly to class naming, consider using "category class name" as a suffix, e.g. cancelButton or titleLabel.

Do not use class/category name based prefixes

Avoid using names like btnCancel or lblTitle. It is similar to (if not exactly) Hungarian notation. This recommendation is especially applicable to GUI member variables.

Do not use special notation for pointers or references

Prefixes like p for pointers or r for references, suffixes like Ptr for pointers et al. are discouraged, as they emphasize the technical nature of the implementation and obscure the logical side.

This also applies to instances of various pointer classes, e.g. shared_ptr or QPointer.

Examples:

Node& head;                       // not rHead
shared_ptr<Vertex> root;          // not rootPtr
void swap(int* left, int* right); // not pLeft, pRight

Constants

Use either Camel case or Pascal case for constants

It is disputable whether constness is such an essential characteristic that it needs to be prominently emphasized. Basically, constants are, or at least should be considered as, unmodifiable variables. Looking from this angle, Camel case applies. There is a tangible desire not to reflect constness in names, and if an answer had to be given, it would be "do not distinguish".

However, firstly, nobody seems to be forced to decide right now, and secondly, there is another question. Even if there is no big need to emphasize constness in general, does it still make sense to denote "terminal constants", like Pi or DarkGray? There is a feeling that it does. However, since "terminal constants" are rather a semantic thing, it is hard to draw a formal line. Let's leave the issue open for now, allowing both Camel and Pascal cases.

Enumerations and Enumerators

Use Pascal case for enumerations

Enumeration is a particular type. As for classes and typedef'ed types, Pascal case should be used.

Apply the same rules to enumerators and to constants

Enumerators (the values within enumerations) are named constants. They should perhaps be treated alike.

Consider adding enumeration name to all its enumerators

It is a common situation when two or more enumerations have homonymous enumerators, e.g.

enum State
{
    Unknown,
    Off,
    On
};

enum Direction
{
    Unknown, // clashes with Unknown from State
    Left,
    Right
};

A way to resolve ambiguity could be to use State::Unknown and Direction::Unknown. But, although some compilers allow using enumeration names as qualifiers (at least VC compiles with a warning), it is beyond the C++ standard (C++0x is supposed to allow it, and even to require explicit scoping for type-safe enumerations).
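
For comparison, the qualified form can be sketched as follows, assuming a compiler that implements the scoped enumerations planned for C++0x (standardized as C++11). The isKnown helpers are hypothetical.

```cpp
// Scoped enumerations: enumerators live inside the enumeration's scope,
// so the two Unknown enumerators no longer collide.
enum class State
{
    Unknown,
    Off,
    On
};

enum class Direction
{
    Unknown,
    Left,
    Right
};

bool isKnown(State state)
{
    return state != State::Unknown;
}

bool isKnown(Direction direction)
{
    return direction != Direction::Unknown;
}
```

Where such a compiler cannot be assumed, the naming approach described next remains the portable option.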

Consider adding enumeration name, as follows:

enum State
{
    UnknownState,
    OffState,
    OnState
};

enum Direction
{
    UnknownDirection,
    LeftDirection,
    RightDirection
};

It looks tempting to use enumeration names as prefixes. However, despite some tactical advantages of such an approach, it introduces a certain inconsistency; compare with the recommendation "Consider adding the principal class name to the variable name" (above) and the more general "Use words in the normal language order" (below).

Use noun-based singular and plural forms appropriately

If an enumeration consists of non-combinable enumerators (not intended to be used as flags), use a singular form, e.g.
enum Animal {QuickFox, LazyDog};

Otherwise, use plural form, e.g.
enum Styles {Visible, Enabled};
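
To make the flag-like case concrete, here is a hedged sketch. The bit values and the helper functions hasStyle and makeDefaultStyles are assumptions; the example above does not assign any values.

```cpp
// Combinable enumerators get distinct bits (values assumed for illustration).
enum Styles
{
    Visible = 1,
    Enabled = 2
};

// Tests whether a combination of styles contains the given one.
bool hasStyle(int styles, Styles style)
{
    return (styles & style) != 0;
}

int makeDefaultStyles()
{
    return Visible | Enabled; // several styles at once, hence the plural name
}
```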

Function names

Use Camel case

It is debatable whether Camel or Pascal case fits function names better; both practices are reasonable and widely used. Since Camel case looks more elegant, let's stick to it.
Examples: getCount(), sleep().

Use :: for global functions

Similar to variables, use :: wherever appropriate.

Namespaces and namespace aliases

Use Pascal case for namespaces

As Ian Joyner wrote, "In pure OO languages, namespaces are not needed; classes themselves are namespaces."
Indeed, namespaces can be considered a kind of meta-class, intended mostly for grouping together related classes, non-member functions, variables, et al. This puts a namespace somewhat close to a particular case of a class, which already has a similar grouping feature (though not across different files). Therefore, the same notation as for classes is recommended.

Use Camel case for namespace aliases, keep it short

A nickname is supposed to be short and light.
In practice, it is convenient to use one word or an abbreviation, so Camel case is not distinguishable from GNU notation. Still, this is Camel case :-).
Example:
namespace math = Science::Mathematics;
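
Put into a complete, if tiny, context, the alias might be used like this. The functions square and squareOfThree are hypothetical.

```cpp
namespace Science
{
    namespace Mathematics
    {
        int square(int x) { return x * x; }
    }
}

// Short Camel case alias for the long Pascal case path.
namespace math = Science::Mathematics;

int squareOfThree()
{
    return math::square(3);
}
```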

Preprocessor elements

Use all capitals style

Preprocessor elements are considered mostly as legacy features and a necessary evil. They do not really belong to the language, and should be reduced to a reasonable minimum. Capitalization is supposed to emphasize the exceptional use of the preprocessor, besides the fact that it is a traditional convention. Underscores drastically improve readability. WATCH_QUICK_FOX is better than WATCHQUICKFOX.

Files

Name files after containing classes

The typical approach is to put the class declaration in a separate .h file and the class definition in a separate .c++ file. (When classes represent GUI elements, it might be convenient to have yet another file, like a .ui file in the case of Qt.) It is logical and intuitive to name such files after the class name, keeping the same naming convention as for classes.

General rules

Avoid abbreviations unless they are de-facto standard
Use de-facto standard abbreviations

The motivation is, of course, to prevent confusion while keeping names brief and readable.

Use entire abbreviation as a word

Capitalize abbreviations as if they were regular words. E.g. a variable can be called htmlText. Alternatives like HTMLText are less readable (and can, as in this case, violate the Camel case convention).

Use noun-based phrases for namespaces, classes, types, variables, constants, enumerations and enumerators
Use verb-based phrases for functions

In general, classes, types and enumerations represent categories; variables, constants and enumerators are certain objects within these categories. In human languages, they are described by noun-based phrases.
Functions usually represent actions, and this is what verbs are for.

Avoid unneeded negation

Be positive. Prefer names like isDone() to names like isNotDone().

Use words in the normal language order

So that names are naturally readable and pronounceable, use normal language forms (e.g. OakLeaf, not LeafOak). Well, conceptually it would be better to use reverse naming, since it reflects the logical order of specification: base first, derivatives next. For instance, "Bambusa vulgaris" would be somewhat better than "common bamboo". But maybe that is too much.

}

Formatting
{

Tabulation

Use spaces instead of tabs

(I have to change my habits of using tabs :-).)

Set indent size to 4

This seems to be good for readability and not too much for nested constructions. In the dispute "3 vs 4", 4 is advocated due to the fundamental practice of preferring powers of 2.

Lines

Ensure the last line is terminated with EOL, not EOF

Having the last line end with EOF is slightly inconsistent and can be less convenient for some operations, like copy-pasting the entire file contents.

Avoid trailing whitespace

Trailing whitespace definitely does not improve readability and violates normalization. This applies to whitespace at line ends and to empty lines at the end of the file as well.

Avoid long lines

There are some recommendations to limit line lengths to 80 characters or so to prevent breaking lines while printing.

The concrete numbers are questionable, but in general it is good practice to keep line lengths limited in a way adequate to the environment. Side-by-side visual comparison of two versions of a file is a good example of when shorter lines fit better.

Brackets

Choose bracket policy and use it consistently
{}-brackets policy

Let's list a few options.

Option A:

●●●●
{
    ●●●●;
    ●●●●;
}

Option B:

●●●● {
    ●●●●;
    ●●●●;
}

Option C (NOK):

●●●●
  {
    ●●●●;
    ●●●●;
  }

Option D:

●●●●
{   ●●●●;
    ●●●●;
}

Options A and B are the most popular.

The advantages of the first one are that it is more readable and that it makes it easier to move entire blocks (each of which is a "set of sequential lines", not "a bracket at the end" plus a "set of sequential lines").

The second one saves some screen space.

Option C is not recommended, because it introduces unnecessary indentation.

Option D seems to combine the advantages of A and B, but looks unusual.

Decide for yourself and use the chosen option consistently for all structures (classes, functions, control statements, et al.).

()-brackets policy

Some options, not necessarily mutually exclusive, are enumerated below.

Variant A:

●● = ●●●●(●, ●, ●);

Variant B:

●● = ●●●●(●,
    ●, ●);

Variant C:

●● = ●●●●(
    ●,
    ●,
    ●
);

Variant D:

●● = ●●●●
(   ●,
    ●,
    ●
);

Variant A is most appropriate for "short lines"; B is not much more than A broken into several lines and is OK as far as line breaking in general is OK.

With multiple and/or long-named parameters, variant C can provide a readable alternative to B; feel free to use it.

Similarly to {}-brackets, variant D could be used. In fact, using variant D for ()-brackets in combination with variant D for {}-brackets provides a very consistent code structure. But, once again, it is uncommon :-).

Omit {}-brackets for single statements when applicable

Although some sources recommend always using {}-brackets, this practice seems to bloat code without giving considerable benefits.

Do not put else or catch on the same line as }-bracket

Having if-else or try-catch pairs misaligned looks unbalanced and can be disturbing for readers.
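
The two recommendations above can be seen together in the following sketch, where single statements go without {}-brackets and else and catch start their own lines. The functions requirePositive and classify are hypothetical.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper, invented for this sketch.
void requirePositive(int value)
{
    if(value <= 0)
        throw std::invalid_argument("value must be positive");
}

std::string classify(int value)
{
    try
    {
        requirePositive(value);
        if(value % 2 == 0)
            return "even";
        else
            return "odd";
    }
    catch(const std::invalid_argument&)
    {
        return "invalid";
    }
}
```

Each if lines up with its else, and try lines up with catch, so the pairs can be matched at a glance.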

Spacing

Consider natural language rules for spacing

Readability is the goal. In general, "natural language" rules are advocated, including:

  • no space before ",", ";";
  • a space (or EOL) after.
... but be reasonable

Do not add extra spaces after dots; they should be considered separators rather than regular punctuation marks.

Use table style judiciously

Table style basically means using extra spaces to introduce vertical alignment, like in the following case:

int    i = 0;   // some comment
double d = 0.0; // another comment

Table style is acceptable for pragmatic readability reasons, though in theory it is perhaps wrong in the same way as the following construction

    printf("\
+=================+\n\
|    t a b l e    |\n\
+=================+\n");

which mixes code-centric and data-centric formatting.

Be aware of the auto-formatting features of source code editors. They may not appreciate the table style.

Spacing and brackets
Consider having no spaces after (-bracket and before )-bracket
Consider having no spaces after [-bracket and before ]-bracket

It is not absolutely clear, and is perhaps subjective, whether a space after a (-bracket or [-bracket and before a )-bracket or ]-bracket makes the code easier to read or harder.

For lack of means to measure readability objectively and find a "statistically better" way, let’s leave the choice to the reader. If you cannot define a clear winner, then it is better to omit spaces. Shorter code is better, if equally readable. This is also in line with the "natural language" recommendation.

Spacing and control flow statements
Decide whether it is better to have a space before the opening bracket in control flow statements

There is a common practice of adding a space between if, for, etc. and the following opening bracket. It is a tradition to be respected. Though, answering the usual argument that "this is done to distinguish between operators and functions", I would ask to be shown a person for whom such a distinction is not obvious :-).

(And please allow absence of such spaces in my code, let me keep my accent :-).)

Spacing and types
Group "*" and "&" together with type

Use constructions like int* x instead of int *x (and ditto for the reference sign &). Of course, avoid confusing constructions like int* x, y, where only x is a pointer.
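
One way to keep "*" with the type and still avoid the trap is to declare one variable per line, as in this sketch (the function and variable names are hypothetical):

```cpp
int sumThroughPointers()
{
    int first = 1;
    int second = 2;

    // "int* left, right;" would declare right as a plain int, not a pointer.
    // One declaration per line avoids the question entirely.
    int* left = &first;
    int* right = &second;

    return *left + *right;
}
```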

}

Class and File Layout
{

Declare public class members first, protected next and private last

This is a common practice. It simplifies using a header file as a form of class documentation. Regular users of the class are interested mostly in the public section, and those who derive from the class in the public and protected sections.

Using headers in this way is perhaps not very conceptual, but quite convenient.

Put static class members first within public, protected and private sections

Static class members have a different nature and belong to the class itself, not to instances of the class. It is useful to emphasize this by placing them in front of non-static members.
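
A sketch of a class laid out this way is shown below. The class Session, its members and the function countOpenSessions are hypothetical.

```cpp
class Session
{
public:
    static int getOpenCount() { return _openCount; } // static members first

    Session() : _id(++_openCount) {}
    ~Session() { --_openCount; }
    int getId() const { return _id; }

protected:
    void reset() { _id = 0; }

private:
    static int _openCount; // static first within the private section too

    int _id;
};

int Session::_openCount = 0;

// Both sessions are alive when the count is read.
int countOpenSessions()
{
    Session first;
    Session second;
    return Session::getOpenCount();
}
```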

Keep the same order of class members in headers and implementation files

This is reasonable both to simplify browsing of source files and for the sake of consistency.

Place own header first

It is common practice to start a C++ file with its own header. This helps to check that the header contains all necessary include directives and declarations. A side advantage is that such an #include can be used as a "tag", making it easy to guess the file name when reading (e.g. printed) code.

Group similar headers together

Split the #include sequence into groups (e.g. framework includes, system includes). Within each group, sort includes alphabetically.

The typical order looks as follows:

However to be consistent with the "own header first" recommendation, one may perhaps also consider the following unusual order:

}

Miscellanea
{

Use the virtual keyword for overridden functions

Unfortunately, the C++ language does not ensure that overridden functions are marked explicitly (as e.g. C# does). This could result in funny bugs, for instance if a maintainer of the base class adds a virtual function which is homonymous to one in a derived class.

Although using the keyword virtual explicitly for implicitly virtual functions does not really solve problems, it at least makes things more visible.
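
In code, the recommendation amounts to repeating the keyword in the derived class, as in this sketch. The classes Shape and Circle and the function nameOf are hypothetical.

```cpp
#include <string>

class Shape
{
public:
    virtual ~Shape() {}
    virtual std::string getName() const { return "shape"; }
};

class Circle : public Shape
{
public:
    // Already virtual through Shape; the keyword is repeated for the reader.
    virtual std::string getName() const { return "circle"; }
};

// Dispatches through the base class reference.
std::string nameOf(const Shape& shape)
{
    return shape.getName();
}
```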

Avoid non-empty throw() function attribute
Consider avoiding even empty throw() attribute

Maybe the recommendation should be formulated as "do not use the throw() attribute before reading Herb Sutter's article". It gives an excellent explanation of the subject.

Avoid accompanying function definitions with comments like /*virtual*/ or /*static*/

It might look appropriate to have a comment next to the function definition saying that the function is virtual or static. However, it makes code more cumbersome, and in addition brings the risk of desynchronization.

Generally avoid using values of assignment operator, increment and decrement

Embedding assignments, increments and decrements within other operations can be illustrated with the following samples.

if((c = a[i]) >= '0' && c <= '9') // embedded assignment
    d = c - '0';
while(--count > 0)   // embedded decrement
    a[i++] = b[j++]; // embedded increments

Although such patterns might look quite usual, in many (if not most) cases it is better to avoid them. No doubt, it is tempting to exploit the fact that the assignment, increment and decrement operators return a value. It can make code shorter and may look somewhat acrobatic. However, such embedding is more technical than algorithmic, tends to bring unneeded complexity and decreases maintainability.
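
For illustration, the two samples could be unfolded roughly as follows. The function names, the parameter lists and the -1 "not a digit" marker are assumptions added to make the fragments self-contained.

```cpp
// The assignment becomes a statement of its own.
int toDigit(const char* a, int i)
{
    char c = a[i];
    if(c >= '0' && c <= '9')
        return c - '0';
    return -1; // hypothetical "not a digit" marker
}

// Mirrors the original loop step by step, including the fact
// that it copies count - 1 items.
void copyItems(int* a, int i, const int* b, int j, int count)
{
    --count;
    while(count > 0)
    {
        a[i] = b[j];
        ++i;
        ++j;
        --count;
    }
}
```

The unfolded loop is longer, but each line does one thing, and the off-by-one behavior of the counter is much easier to notice.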

Still, there are some situations when embedding might be helpful. Chain assignment is one such case. There is nothing wrong with constructions like the following:

x = y = z = 0;

Include namespace names in include guards

Quite often, include guards are constructed based on the class name, for instance:

#ifndef MATRIX_H
#define MATRIX_H // hardly unique name
class Matrix
{
    //....
};
#endif

It could easily lead to a conflict if two classes (maybe within different libraries) have the same name. One alternative is to add a GUID to the define, to get something similar to MATRIX_H_E4865904_C72C_4975_A445_30D1FAAB7546. This is reliable, but bulky.

Another way is to include the name of the namespace to which the class belongs, or, if the class belongs to a nested namespace, the names of the entire namespace hierarchy. For instance, if the Matrix class belongs to the namespace Math::Algebra, the guarding define can be called MATH_ALGEBRA_MATRIX. It brings a good balance between readability and uniqueness.
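
A header guarded in this way might look like the sketch below. The members of Matrix are hypothetical and serve only to make the example complete.

```cpp
#ifndef MATH_ALGEBRA_MATRIX
#define MATH_ALGEBRA_MATRIX

namespace Math
{
    namespace Algebra
    {
        class Matrix
        {
        public:
            Matrix(int rowCount, int columnCount)
                : _rowCount(rowCount), _columnCount(columnCount) {}
            int getCellCount() const { return _rowCount * _columnCount; }

        private:
            int _rowCount;
            int _columnCount;
        };
    }
}

#endif
```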

Do type casting via _cast operators

Casting in C++ can be regarded much like goto in C: try to avoid it. And if you have to use it, write the appropriately cumbersome _cast construction, not a ()-cast.
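
Two of the _cast operators are sketched below. The classes Animal and Dog and the functions toWholePercent and isDog are hypothetical.

```cpp
#include <cstddef>

class Animal
{
public:
    virtual ~Animal() {}
};

class Dog : public Animal
{
};

// A truncating conversion, made explicit and easy to search for.
int toWholePercent(double ratio)
{
    return static_cast<int>(ratio * 100.0); // not (int)(ratio * 100.0)
}

// A checked downcast: yields a null pointer if animal is not a Dog.
bool isDog(Animal* animal)
{
    return dynamic_cast<Dog*>(animal) != NULL;
}
```

Unlike a ()-cast, each operator states which kind of conversion is intended, and a search for "_cast" finds all of them.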

For pointers, use NULL instead of 0

This issue is rather controversial. Of the several arguments, let’s choose the following: C++0x is going to introduce nullptr, ensuring that the null pointer and 0 are conceptually different, and using NULL seriously helps in a later find-and-replace procedure.

}

References
{

C++ Programming Style Guidelines
http://geosoft.no/development/cppstyle.html

C++ Coding Standard
http://www.possibility.com/Cpp/CppCodingStandard.html

C++0x in Wikipedia
http://en.wikipedia.org/wiki/C%2B%2B0x

Design Guidelines for Developing Class Libraries
http://msdn.microsoft.com/en-us/library/ms229042.aspx
(This document refers to C#, but many recommendations can be applied to C++ as well.)

Ian Joyner. C++?? : A Critique of C++
http://burks.bton.ac.uk/burks/pcinfo/progdocs/cppcrit/

Herb Sutter. A Pragmatic Look at Exception Specifications
http://www.gotw.ca/publications/mill22.htm

}