The goal of coding standards are to increase the business value of the code. The most obvious (and indeed most important) way to do this is to make the code robust and low defect. Equally important, but more subtle goals include reducing coder friction and maintainability. As such, standards should be kept minimal -- simple enough to actually follow, and important enough to remember.
These standards should be used when building new source files. When an existing file needs to be changed, that is an appropriate time to bring it up to standard. However, it's never a good time to edit a file merely to bring it up to standard. If it ain't broke, don't "fix it" and remember to always "Keep it Working".
You'll notice that I don't touch on the classic "Religious" points:
- tabs vs. spaces
- indentation style
- curly brace style
- etc...
Consistency within a file is important and improves readability. But allowing coders to express themselves is also important. So, if you edit a file, either conform to the religion of that file, or convert the whole file to a new, consistent format. If you convert the whole file, you are effectively taking ownership of it, so be prepared to be the go-to person, or leave it as is.
General Rules:
- Eliminate duplicate code by refactoring as necessary. Any code that appears more than twice should be refactored. Remove any code that is and forever will be unused.
- Take some time to choose great names and well understood metaphors; they stay with the code much longer than you expect.
- Whenever an unreviewed source file crosses your path, pair with someone and review it completely, commenting and making it conform to these standards. Sign the comments at the top of the source file with names of both reviewers and the date reviewed. Eventually all files will have been reviewed, and we will occasionally revisit files that have not been reviewed in awhile. If you touch it, you comment it, and bring it up to standard.
- Comment the what, not the how. The source code should be self-explanatory as to the method for accomplishing some task. But if the task to be accomplished is unclear, call it out explicitly in the comments.
- Code following these guidelines will usually be more maintainable. However, exceptions exist to most of these guidelines, and where they conflict, choose readability and correctness, for foolish consistency hobgobbles little minds.
- No Compiler errors or warnings!
Good code is:
- Straightforward and not clever
- Understandable to someone who has not seen the code before
- As simple as possible, but not simpler
- Consistent
Comments:
Comments are generally a good thing. If you think your code needs a comment, it probably does. I used to be crazy about comments, but my thoughts have changed. Good code is self commenting, with good names for functions, methods, classes, and variables. Functions and methods are verbs, variables and members are nouns. Coding constructs are used to help clarify what's going on, and the expressiveness of the language is judiciously used to even further clarify. For example, do...while is used rather than a primed while when you want the reader to know it will always execute once. Cute or hard code is avoided, or commented. Maintain an appropriate scope with your comments including interfaces, tests, and non-local changes. Describe why you wrote the code, not how you wrote it.
The major problem with comments is duplication of information. Comments should not duplicate what the code says. Later in life, the code will change, and the comment won't. Programmers will read the comment and be mislead, bad bad bad! Instead, comments should clarify the reasons for the choices the code implements. Why use a polling loop vs. a callback architecture? Why are you discarding that buffer periodically, when there appears it may still have salient data?
In the rare event where the code really is obtuse, and you can't figure out how to make it clearer, then a comment block may be required. However, in this case it is best to indicate that this is a good candidate for refactoring. For the occasional code that has to be obtuse (e.g.. for real-world performance tuning), indicate as such. And make sure there are great Unit Tests exercising all the functionality!
Files:
- Header files should use #ifndef NAMESPACE_FILENAME_H to avoid header collision
- Source files should include all necessary headers directly, not indirectly through other (or corresponding) header files.
- Source files should never include other source files.
- File names should be the same as class names. One class per file, please! Header files and source files should be in the same place.
- Unit test files should be named for the class they test, and the filename should begin with the word test.
Format Standards:
- ClassesAreNamedLikeThis
- IInterfacesHaveAnIInThem
- CONSTANTS_LOOK_LIKE_THIS
- MACROS_TOO(x)
- memberFunctionsStartLowerCase()
- NamespacesAreDefinedLikeThis
- GlobalFunctions()
- Accessors always follow the convention of getMemberVariable and setMemberVariable, even though memberVariable starts with lower case. External API consistency is more important than naming consistency across one class's visibility level.
- GlobalProcedureName
- methodName()
- Use StudlyCaps or camelCaps to indicate word separations, rather than underlines. Exception: C/C++ MACRO_DEFINITIONS should be all uppercase with underlines separating words.
Defensive Coding Conventions
Naming
- Choose great metaphors first; choose meaningful and distinct names second.
-
Procedure and function names begin with a verb; functions returning bool begin with 'is'. Accessors and settors begin with 'get' and 'set'. 'find' prepends methods that look up a data member by a key. 'n' prefixes variables that indicate the "count of" elements.
-
Member variables begin with m_, global variables with g_, and static class variables with s_.
-
Do not use single letter variables (or other vague variable names, like 'foo' and 'tmp'), except for iterators (use i, j, k) and common arguments (t for time, n for number of elements).
-
Declare variables in the most limited scope possible.
-
The length of a variable name should be proportional to its scope. In other words, long-running global variables should have significantly longer names than an iterator existing for only a single loop structure.
-
Create symbolic type names for every distinct use of standard data types.
-
Create symbolic constants for all numbers with the possible exception of 0 and 1-even then, you should probably create a symbolic constant for each distinct type that uses these numbers.
-
Create a distinct typedef in a header for all instances of STL and other template-enabled containers, with all code using the symbolic type. Identify these template instances by appending _t.
- Name objects uniquely at construction, and use that name in all output.
- No one or two letter names. Prefer long, descriptive names. Be precise!
C++ Defensive Coding Conventions
- const is your friend. Use it liberally. Embrace the const virus and let it propagate! Members which don't change the state of the object must be marked const. Parameters which are not changed by the function and method must also be marked const.
Declare member variables and functions 'const' where appropriate. Prefer a mutable modifier on a single non-essential member variable to making an entire member function non-const. Pass object arguments as const references by default.
Declare the following as 'const':
-
-
all global constants
-
all member variables that will not change for the duration of the object
-
all pass-by-reference arguments that will not be changed by the function
-
all member functions that do not change the internal state of the object
- Also use enum when appropriate.
- Constants and enumerated types directly related to a class should be declared within that class. If they are of larger scope, they should be declared in the header which covers the scope of need and no more.
- All classes and interfaces (pure virtual classes) must have virtual destructors if and only if that class has any virtual functions, and must not rely on compiler provided ones. Classes intended to be constructed by factories should declare their default constructor non-public to prevent people (or compilers) from newing the object.
- The assignment operator -- operator= -- should always be paired with a copy constructor and vice versa.
- Classes should be derived from at most one concrete class.
- Classes can implement a multitude of "interfaces"
- An interface is a class in which:
-
- the only functions are public pure virtual functions.
- there are no data members.
- they inherit from nothing
- Member Variables should be protected or private. Accessors should be used to get or set their values.
- Never omit scope delineation, even when K&R allows for it. In other words, no if(1) go(); instead if(1){go();}
-
Similarly, declare only one variable per line.
- Prefer size-limited types like u8 and s32. Use int when you really want the default processor word size (e.g.. for a length).
-
Initialize all local variables at declaration. Initialize all member variables in the class' constructor. Prefer class initialization lists to constructor initialization
- ASSERT() invariants liberally. But make sure that these invariants are actually invariant. Asserts are a way of describing expected behavior, logic checking, and helping bring bugs to the surface faster, but do not let them get into release compiled code.
- When commenting out unused code with #ifdef <WHY_CODE_IS_UNUSED> instead of #if 0. If it's unused for a while, delete it, to reduce the clutter and source of confusion!
- Do not use the 'using' directive in a header file and especially not 'using namespace'.
- Inline simple or empty methods-accessors, most constructors, and destructors-directly in the header file.
- Constructors should only initialize member variables and thus should never error. Use an init() member function with standard error semantics for initialization code that can fail.
- Prefer aggregation or composition to inheritance.
- Virtual should be used only when multiple alternatives could be interchanged at runtime, or at startup without recompilation. Consider creating a template that uses a strategy object; the resulting code will be bigger but faster.
- Prefer inline functions, templates, and constants to macros where reasonable.
- (Recommended) Memory Semantic:
The use of pointers and references in C++ is error prone. When should I pass by reference, and when as a pointer? Should I delete the pointer you just passed me, or expect the caller to? I've used the following method with great success in the past. It found several bugs, and effectively eliminated memory leaks.
-
-
Passing a pointer to a method means that object or method owns the pointer. It is responsible for feeing the pointer.
-
Passing an object by reference means the caller retains ownership of the object.
-
Objects passed as pointers may want to support a "IClonable" interface so the caller can retain a copy if needed.
-
Return value semantics are the same. Returning a pointer is giving someone ownership of an object. Returning a reference is keeping the object.
Hungarian Notation, Name Decoration:
Personally, with the few exceptions mentioned in "Naming" above, I'm not a fan. With modern tools, such as what we use (Visual Studio), types are easily discerned by hovering the mouse pointer. Changing the type of a variable and then having to change the name everywhere in the code is problematic. And Hungarian notion mandates duplication of information, which leads to synchronization problems, such as when the type changes and the name doesn't. As such, you can never rely on the notation, and so it is just cumbersome and makes things less readable (how many extraneous letters are prepended until I know what the descriptive variable name is?). Hungarian Notation was intended to indicate the kind, not the type, and this may have some utility, such as distinguishing pointers from simple data types. However, some people like this, and in the spirit of consistency, try to be consistent within a file.
UPDATE (10/14/05): The Hungarian notation described in the various "Clique Coding Standards" documentation is what I'm talking about. Adding prefixes which describe the storage type of a variable is inane. On the other hand, using prefixes to describe the kind of data represented by the variable is useful. The following are useful Hungarian Notation conventions, and are worthy of consideration:
p pointer
b True/False
sz Null terminated string
cb count of bytes
cch count of chars
cw count of words
and the like... However, if we decide to adopt this, we should remove the storage type of Hungarian notation, as it is confusing, and contradictory. If we are to use Hungarian notation, we need to make it a company policy, for its utility is only realized when all the code conforms. Incomplete or incompatible conformity makes things worse. Here is an interesting link describing the differences between the useful kind of Hungarian notation and the asinine type.