SNAP C++ Programming Guide

Revision 02-06-2013

Derived from Google C++ Style Guide
Heavily customized for the SNAP Library

Background

Stanford Network Analysis Platform (SNAP) is a general purpose network analysis and graph mining library. It easily scales to massive networks with hundreds of millions of nodes, and billions of edges.

SNAP is written in the C++ programming language. This programming guide describes a set of conventions for the SNAP C++ code as well as the most important constructs that are used in the code. To see an example of SNAP programming style, see file graph.h.

C++ has many powerful features, but this power brings with it complexity, which can make code more error-prone and harder to read and maintain. The goal of this guide is to manage this complexity by describing the rules of writing SNAP code. These rules exist to keep the code base consistent and easier to manage while still allowing productive use of C++ features.

Code consistency is important to keep the code base manageable. It is very important that any programmer be able to look at another's code and quickly understand it. Maintaining a uniform style and following conventions means that we can more easily use "pattern-matching" to infer what various symbols are and what they do, which makes code much easier to understand. In some cases there might be good arguments for changing certain style rules, but we nonetheless keep things as they are in order to preserve consistency.

Another issue this guide addresses is that of C++ feature bloat. C++ is a huge language with many advanced features. In some cases we constrain, or even ban, use of certain features. We do this to keep code simple and to avoid the various common errors and problems that these features can cause. This guide lists these features and explains why their use is restricted.

Note that this guide is not a C++ tutorial. We assume that you are familiar with the language.

Formatting

Coding style and formatting can be pretty arbitrary, but code is much easier to follow and learn if everyone uses the same style. Not everyone may agree with every aspect of the formatting rules, but it is important that all SNAP contributors follow the style rules so that we can all read and understand everyone's code easily.

Line Length

Try to keep each line of text in your code at most 80 characters long.

Spaces vs. Tabs

Use only spaces, and indent 2 spaces at a time.

We use spaces for indentation. Do not use tabs in your code. You should set your editor to emit 2 spaces when you hit the tab key.

Conditionals

Prefer no spaces inside parentheses. The else keyword goes on the same line as the preceding closing brace.

if (condition) {     // no spaces inside parentheses
  ...  // 2 space indent.
} else if (...) {    // The else goes on the same line as the closing brace.
  ...
} else {
  ...
}

You must have a space between the if and the open parenthesis. You must also have a space between the close parenthesis and the curly brace.

if(condition)     // Bad - space missing after IF.
if (condition) {  // Good - proper space after IF and before {.

Short conditional statements may be written on one line if this enhances readability. You may use this only when the line is brief and the statement does not use the else clause. Always use the curly brace:

if (x == kFoo) { return new Foo(); }
if (x == kBar) { return new Bar(); }

Single-line statements without curly braces are prohibited:

if (condition)
  DoSomething();

In most cases, conditional or loop statements with complex conditions or statements are more readable with curly braces.

if (condition) {
  DoSomething();  // 2 space indent.
}

Loops and Switch Statements

Use curly braces for loops:

while (condition) {
  ...           // 2 space indent
}
for (int i = 0; i < Num; i++) {
  ...           // 2 space indent
}
while (condition);  // Bad - looks like part of do/while loop.

case blocks in switch statements can have curly braces or not, depending on your preference. If you do include curly braces they should be placed as shown below.

If the condition is not an enumerated value, switch statements should always have a default case:

switch (var) {
  case 0: {  // 2 space indent
    ...      // 4 space indent
    break;
  }
  case 1: {
    ...
    break;
  }
  default: {
    ...
  }
}

Pointer and Reference Expressions

Do not include spaces around period or arrow. Pointer operators do not have trailing spaces.

The following are examples of correctly-formatted pointer and reference expressions:

x = *p;
p = &x;
x = r.y;
x = r->y;

Note that:

  • There are no spaces around the period or arrow when accessing a member.
  • Pointer operators have no space after the * or &.

When declaring a pointer or reference variable or argument, place the asterisk * or the ampersand & adjacent to the type:

char* C;
const int& P;

Boolean Expressions

When you have a long boolean expression, put the operators at the line ends:

if ((ThisOneThing > ThisOtherThing) &&
    (AThirdThing == AFourthThing) &&
    YetAnother && LastOne) {
  ...
}

Function Calls

Place the call on one line if it fits; otherwise, wrap arguments at the parenthesis.

Function calls have the following format:

bool RetVal = DoSomething(Arg1, Arg2, Arg3);

If the arguments do not all fit on one line, they should be broken up onto multiple lines. Do not add spaces after the open paren or before the close paren:

DoSomethingThatRequiresALongFunctionName(Argument1,
 Argument2, Argument3, Argument4);

If the parameter names are very long and there is not much space left due to line indentation, you may place all arguments on subsequent lines:

DoSomethingElseThatRequiresAEvenLongerFunctionName(
 Argument1,
 Argument2,
 Argument3,
 Argument4);

Function Declarations and Definitions

Place the function return type and parameters on the same line as function name, if they fit.

Functions look like this:

ReturnType ClassName::FunctionName(Type ParName1, Type ParName2) {
  DoSomething();
  ...
}

If you have too much text to fit on one line, split the code over several lines:

ReturnType ClassName::ReallyLongFunctionName(Type ParName1,
 Type ParName2, Type ParName3) {
  DoSomething();
  ...
}

Some points to note:

  • The return type is always on the same line as the function name.
  • The open parenthesis is always on the same line as the function name.
  • There is never a space between the function name and the open parenthesis.
  • There is never a space between the parentheses and the parameters.
  • The open curly brace is always at the end of the same line as the last parameter.
  • The close curly brace is either on the last line by itself or (if other style rules permit) on the same line as the open curly brace.
  • There should be a space between the close parenthesis and the open curly brace.
  • All parameters should be named, with identical names in the declaration and implementation.
  • All parameters should be aligned if possible.
  • Default indentation for parameters is 1 space.

Return Values

Do not needlessly surround the return expression with parentheses. Parentheses are ok to make a complex expression more readable:

return Result;                  // No parentheses in the simple case.
return (SomeLongCondition &&    // Parentheses ok to make a complex
        AnotherCondition);      //   expression more readable.

Class Format

The basic format for a class declaration is shown below (comments are omitted; see Class Comments for a discussion of what comments are needed):

class TMyClass : public TOtherClass {
public:
  typedef TMyClass TDef;                      // typedefs
  typedef enum { meOne, meTwo, ... } TMyEnum; // enums
public:
  class TPubClass1 {                          // public subclasses
    ...
  };
private:
  class TPriClass2 {                          // private subclasses
    ...
  };
private:
  TInt Var;                                   // private data
  ...
private:
  TInt GetTmpData();                          // private methods
  ...
public:
  TMyClass();                                 // constructors
  ...
  int SetStats(const int N);                  // public methods
  ...
  friend class TMyOtherClass;                 // friends
};

Each public SNAP class must define the following methods: a default constructor, a copy constructor, a TSIn constructor, a Load() method, a Save() method and an assignment operator =:

class TMyClass : public TOtherClass {
...
public:
  TMyClass();                                 // default constructor
  explicit TMyClass(int Var);                 // an explicit constructor (optional)
  TMyClass(const TMyClass& MCVar);            // copy constructor
  TMyClass(TSIn& SIn);                        // TSIn constructor
  void Load(TSIn& SIn);                       // Load() method
  void Save(TSOut& SOut) const;               // Save() method
  TMyClass& operator = (const TMyClass& MCVar);   // '=' operator
  ...
  int GetVar() const;                         // get value of Var
  int SetVar(const int N);                    // set value of Var
  ...
};

Make data members private and provide access to them through Get...() and Set...() methods.

More complex classes with support for "smart" pointers have additional requirements. See Smart Pointers for details.

For a class format example in the SNAP code, see file graph.h:TUNGraph.

Constructor Initializer Lists

Constructor initializer lists can be all on one line or, if the list is too long to fit, split over multiple lines:

MyClass::MyClass(int Var) : SomeVar(Var), SomeOtherVar(Var + 1) { }

Make sure that the members in the list appear in the same order in which they are declared. A list order that is different from the declaration order can produce errors that are hard to find.
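
As a minimal sketch of why the order matters, consider a hypothetical class TRange (not part of SNAP) whose second member is computed from the first. C++ initializes members in declaration order, regardless of the order written in the list, so the list below is safe only because it matches the declarations. The standard assert stands in for a SNAP assertion here.

```cpp
#include <cassert>

// Hypothetical class for illustration; it is not part of SNAP.
class TRange {
private:
  int Begin;  // declared first, therefore initialized first
  int Len;    // declared second, may depend on Begin
public:
  // Good: the list follows the declaration order (Begin, then Len).
  TRange(const int First, const int Last) :
   Begin(First), Len(Last - Begin + 1) {
    assert(Last >= First);
  }
  // Bad: a list written as "Len(...), Begin(First + Len)" would still
  // initialize Begin first, so Begin would read Len before Len is set.
  int GetBegin() const { return Begin; }
  int GetLen() const { return Len; }
};
```

With this definition, TRange(3, 7) has a Begin of 3 and a Len of 5.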

Templates

Use one line for template forward declarations, if possible. Use two lines for template implementations as shown below.

An example of a template forward declaration (see alg.h for more):

template <class PGraph> int GetMxDegNId(const PGraph& Graph);

The corresponding template implementation is:

template <class PGraph>
int GetMxDegNId(const PGraph& Graph) {
  ...
}

Vertical Whitespace

Minimize use of vertical whitespace.

This is more a principle than a rule: don't use blank lines when you don't have to. In particular, don't put more than one blank line between functions, resist starting functions with a blank line, don't end functions with a blank line, and be discriminating with your use of blank lines inside functions.

The basic principle is: The more code that fits on one screen, the easier it is to follow and understand the control flow of the program. Of course, readability can suffer from code being too dense as well as too spread out, so use your judgement. But in general, minimize use of vertical whitespace.

Preprocessor Directives

The hash mark that starts a preprocessor directive should always be at the beginning of the line, even when preprocessor directives are within the body of indented code.
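
The following sketch shows only the formatting: the directives stay in the first column even though the surrounding code is indented. The function GetPositiveSum() and the flag GUIDE_DEMO_DOUBLE are hypothetical names made up for this illustration, and the standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>
#include <cstddef>

// GUIDE_DEMO_DOUBLE is a hypothetical flag that exists only in this sketch.
int GetPositiveSum(const int* ValV, const int Vals) {
  assert(ValV != NULL || Vals == 0);
  int Sum = 0;
  for (int i = 0; i < Vals; i++) {
    if (ValV[i] > 0) {
#ifdef GUIDE_DEMO_DOUBLE   // Good - the hash mark is in the first column.
      Sum += 2 * ValV[i];
#else
      Sum += ValV[i];
#endif
    }
  }
  return Sum;
}
```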

Non-ASCII Characters

Non-ASCII characters should be rare, and must use UTF-8 formatting.

In certain cases it is appropriate to include non-ASCII characters in your code. For example, if your code parses data files from foreign sources, it may be appropriate to hard-code the non-ASCII string(s) used in those data files as delimiters. In such cases, you should use UTF-8, since this encoding is understood by most tools able to handle more than just ASCII. Hex encoding is also OK, and encouraged where it enhances readability — for example, "\xEF\xBB\xBF" is the Unicode zero-width no-break space character, which would be invisible if included in the source as straight UTF-8.

Naming

SNAP code uses a range of conventions to name entities. It is important to follow these conventions in your code to keep the code compact and consistent.

General Naming Rules

Function names, variable names, and filenames should be short and concise.

Type and variable names should typically be nouns, such as ErrCnt. Function names should typically be "command" verbs, such as OpenFile().

SNAP code uses an extensive list of abbreviations, which make the code easy to understand once you get familiar with them:

  • T...: a type (TInt).
  • P...: a smart pointer (PUNGraph).
  • ...V: a vector (variable InNIdV).
  • ...VV: a matrix (variable FltVV, type TFltVV with floating point elements).
  • ...H: a hash table (variable NodeH, type TIntStrH with Int keys, Str values).
  • ...HH: a hash of hashes (variable NodeHH, type TIntIntHH with Int key 1 and Int key 2).
  • ...I: an iterator (NodeI).
  • ...Pt: an address pointer, used rarely (NetPt).
  • Get...: an access method (GetDeg()).
  • Set...: a set method (SetXYLabel()).
  • ...Int: an integer operation (GetValInt()).
  • ...Flt: a floating point operation (GetValFlt()).
  • ...Str: a string operation (DateStr()).
  • Id: an identifier (GetUId()).
  • NId: a node identifier (GetNIdV()).
  • EId: an edge identifier (GetEIdV()).
  • Nbr: a neighbour (GetNbrNId()).
  • Deg: a node degree (GetOutDeg()).
  • Src: a source node (GetSrcNId()).
  • Dst: a destination node (GetDstNId()).
  • Err: an error (AvgAbsErr).
  • Cnt: a counter (LinksCnt).
  • Mx: a maximum (GetMxWcc()).
  • Mn: a minimum (MnWrdLen).
  • NonZ: a non-zero (NonZNodes).

File Names

Filenames should be all lowercase with no underscores (_) or dashes (-). C++ files should end in .cpp and header files should end in .h.

Examples of acceptable file names:

graph.cpp
bignet.h

Type Names

Type names start with a capital letter "T" and have a capital letter for each new word, with no underscores: TUNGraph.

Variable Names

Variable names start with a capital letter and have a capital letter for each new word, with no underscores: NIdV. An exception for lowercase names is the use of short index names for loop iterations, such as i, j, k.

Variable names should typically be nouns, such as ErrCnt.

Function Names

Function names start with a capital letter and have a capital letter for each new word, with no underscores: GetInNId().

Function names should typically be "command" verbs, such as OpenFile().

Enumerator Names

Enumerator names start with a lowercase prefix formed from the initial letters of the words in the corresponding type name (sr for TStopReason), followed by capitalized words, with no underscores:

typedef enum { srUndef, srOk, srFlood, srTimeLimit } TStopReason;

Namespace Names

Do not define any new namespaces.

See Namespaces for a discussion about the SNAP namespaces.

Macro Names

In general, macros should not be used. However, if they are absolutely needed, then they should be named with all capitals and underscores:

#define ROUND(x) ...
#define PI_ROUNDED 3.14

Comments and Documentation

Comments are absolutely vital to keeping the code readable. But remember: while comments are very important, the best code is self-documenting. Giving sensible names to types and variables is much better than using obscure names that you must then explain through comments.

Comments in the source code are also used to generate reference documentation for SNAP automatically. A few simple guidelines below show how you can write comments that result in high quality reference documentation.

When writing your comments, write for your audience: the next contributor who will need to understand your code. Be generous — the next one may be you in a few months!

Documentation

SNAP reference documentation is generated from the source code using the Doxygen documentation system. Each entity in the code can have two types of descriptions in the reference documentation: a brief description and a detailed description. Both are optional. Text for a brief description is located directly in the source code. Text for a detailed description is kept in a separate file; only a tag name is given in the source code.

A brief description consists of ///, followed by one line of text:

/// Returns ID of the current node.

A detailed description consists of a brief description, followed by ##<tag_name>:

/// Returns ID of NodeN-th neighboring node. ##TNodeI::GetNbrNId

Text for <tag_name> from file <source_file> is placed in file doc/<source_file>.txt. Tag format is:

/// <tag_name>
...<detailed description>
///

For example, a detailed description for ##TNodeI::GetNbrNId from file snap-core/graph.h is in file snap-core/doc/graph.h.txt (see these files for more examples):

/// TNodeI::GetNbrNId
Range of NodeN: 0 <= NodeN < GetNbrDeg(). Since the graph is undirected
GetInNId(), GetOutNId() and GetNbrNId() all give the same output.
///

Additional Documentation Commands

SNAP documentation also uses the following Doxygen commands:

  • ///<: for comments associated with variables.
  • @param: for comments associated with function parameters.
  • \c: specifies a typewriter font for the next word.
  • <tt>: specifies a typewriter font for the enclosed text.

More details on how to use these commands are provided in specific sections below.

Class Comments

Every class definition should begin with a description of what the class is for and how it should be used. Start the description with a 50-character visual marker as shown below:

//#///////////////////////////////////////////////
/// Undirected graph. ##Undirected_graph
class TUNGraph {
  ...
};

Function Comments

At each function declaration in the *.h file, include a one-line, one-sentence description of what the function is used for:

/// Deletes node of ID \c NId from the graph. ##TUNGraph::DelNode
void DelNode(const int& NId);

Use \c to specify typewriter font when you refer to variables or functions.

If the description requires more than one sentence, which should happen often, then create a tag ##<class>::<function> at the end of the line and put the remainder of the description in the doc/*.h.txt file.

Function Declarations

Every function declaration should be immediately preceded by a description of what the function does and how to use it. In general, the description does not explain how the function performs its task. That should be left to comments in the function definition.

Function Definitions

Each function definition should have a comment describing what the function does if there's anything tricky about how it does its job. For example, in the definition comment you might describe any coding tricks you use, give an overview of the steps you go through, or explain why you chose to implement the function in the way you did rather than using a viable alternative. If you implemented an algorithm from literature, this is a good place to provide a reference.

Note you should not just repeat the comments given with the function declaration, in the .h file or wherever. It's okay to recapitulate briefly what the function does, but the focus of the comments should be on how it does it.

Function Parameters

It is important that you comment the meaning of input parameters. Use the @param construct in the comments to do that. For the function void DelNode(const int& NId); shown above, the parameter is documented in file doc/*.h.txt as follows:

/// TUNGraph::DelNode
@param NId Node Id to be deleted.
///

Variable Comments

In general the actual name of the variable should be descriptive enough to give a good idea of what the variable is used for. In certain cases, comments are required.

To associate a comment with a variable, start the comment with ///<:

TInt NFeatures; ///< Number of features per node.

The comment is placed to the right of the variable declaration.

Class Data Members

Each class data member (also called an instance variable or member variable) should have a comment describing what it is used for.

Implementation Comments

In your implementation you should have comments in tricky, non-obvious, interesting, or important parts of your code.

Punctuation, Spelling and Grammar

Pay attention to punctuation, spelling, and grammar; it is easier to read well-written comments than badly written ones.

TODO Comments

Use TODO comments for code that is temporary, a short-term solution, or good-enough but not perfect.

TODOs should include the string TODO in all caps, followed by the name, e-mail address, or other identifier of the person who can best provide context about the problem referenced by the TODO. The main purpose is to have a consistent TODO format that can be searched to find the person who can provide more details upon request.

Comment Style

For other comments, use the // syntax to document the code, wherever possible:

// This line illustrates a code comment.

SNAP-Specific Magic

Smart Pointers

Use "smart" pointers instead of class objects, whenever a smart pointer type is defined for a class. Smart pointers are implemented for large, complex classes, like graphs and networks, but not for simpler classes, like vectors.

Smart pointers are objects that act like pointers, but automate management of the underlying memory. They are extremely useful for preventing memory leaks, and are essential for writing exception-safe code.

By convention, class names in SNAP start with letter "T" and their corresponding smart pointer types have "T" replaced with "P".

In the following example, variable Graph is defined as an undirected graph. TUNGraph is the base type and PUNGraph is its corresponding smart pointer type:

PUNGraph Graph = TUNGraph::New();

The following example shows how an undirected graph is loaded from a file:

{ TFIn FIn("input.graph"); PUNGraph Graph2 = TUNGraph::Load(FIn); }

To implement smart pointers for a new class, only a few lines need to be added to the class definition.

The original class definition:

class TUNGraph {
  ...
};

The class definition after smart pointer support is added:

class TUNGraph;

typedef TPt<TUNGraph> PUNGraph;

class TUNGraph {
  ...
private:
  TCRef CRef;
  ...
public:
  ...
  static PUNGraph New();                 // New()  method
  static PUNGraph Load(TSIn& SIn);       // Load() method
  ...
  friend class TPt<TUNGraph>;
};

The new code declares PUNGraph, a smart pointer type for the original class TUNGraph. A few new definitions have been added to TUNGraph: CRef, a reference counter for garbage collection; New(), a method to create an instance; and a friend declaration for TPt<TUNGraph>. The Load() method for a smart pointer class returns a pointer to a new object instance, whereas Load() for a regular class returns no result.

Example definitions of the New() and Load() methods for TUNGraph are shown here:

static PUNGraph New() { return new TUNGraph(); }
static PUNGraph Load(TSIn& SIn) { return PUNGraph(new TUNGraph(SIn)); }
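
To make the roles of CRef, New() and the friend declaration more concrete, here is a minimal sketch of an intrusive reference-counted pointer. TCRefDemo, TPtDemo and TCounterDemo are simplified, hypothetical stand-ins written for this guide. They are not SNAP's TCRef and TPt, which are more complete and implemented differently. The standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-ins for illustration only; not SNAP's TCRef and TPt.
class TCRefDemo {
private:
  int Refs;
public:
  TCRefDemo() : Refs(0) { }
  void MkRef() { Refs++; }
  void UnRef() { assert(Refs > 0); Refs--; }
  bool NoRef() const { return Refs == 0; }
  int GetRefs() const { return Refs; }
};

template <class TRec>
class TPtDemo {
private:
  TRec* Addr;
  void MkRef() const { if (Addr != NULL) { Addr->CRef.MkRef(); } }
  void UnRef() const {  // deletes the object when the last reference goes
    if (Addr != NULL) {
      Addr->CRef.UnRef();
      if (Addr->CRef.NoRef()) { delete Addr; }
    }
  }
public:
  TPtDemo() : Addr(NULL) { }
  TPtDemo(TRec* Rec) : Addr(Rec) { MkRef(); }
  TPtDemo(const TPtDemo& Pt) : Addr(Pt.Addr) { MkRef(); }
  ~TPtDemo() { UnRef(); }
  TPtDemo& operator = (const TPtDemo& Pt) {
    if (this != &Pt) { Pt.MkRef(); UnRef(); Addr = Pt.Addr; }
    return *this;
  }
  TRec* operator -> () const { assert(Addr != NULL); return Addr; }
  int GetRefs() const { return Addr != NULL ? Addr->CRef.GetRefs() : 0; }
};

class TCounterDemo;
typedef TPtDemo<TCounterDemo> PCounterDemo;

class TCounterDemo {
private:
  TCRefDemo CRef;  // reference counter, managed by the smart pointer
  int Val;
public:
  TCounterDemo() : CRef(), Val(0) { }
  static PCounterDemo New() { return new TCounterDemo(); }
  void Inc() { Val++; }
  int GetVal() const { return Val; }
  friend class TPtDemo<TCounterDemo>;
};
```

In this sketch the friend declaration is what lets the pointer class reach the private CRef member, and the object is deleted automatically when the last pointer to it goes out of scope.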

Streams

SNAP defines its own streams. Use those instead of C++ streams, like cin, cout, cerr. For console output, use printf().

The SNAP-defined streams are:

  • TSIn is an input stream.
  • TSOut is an output stream.
  • TSInOut is an input/output stream.
  • TStdIn is the standard input stream.
  • TStdOut is the standard output stream.
  • TFIn is a file input stream.
  • TFOut is a file output stream.
  • TFInOut is a file input/output stream.
  • TZipIn is a compressed file input.
  • TZipOut is a compressed file output.

Assertions

SNAP defines a rich set of assertions to verify that the program works as expected. Use these assertions as often as possible, but avoid assertions that can significantly degrade program performance.

Assertion names in SNAP use the following convention for the first letter:

  • Assert: compiled only in the debug mode, aborts when the assertion is false.
  • IAssert: always compiled, aborts if the assertion is false.
  • EAssert: always compiled, does not abort, but throws an exception.

If the assertion name ends with the letter R, a failed assertion provides a reason for the failure.

Some common SNAP assertions are:

  • Assert verifies the condition. This is the basic assertion.
  • AssertR verifies the condition, provides a reason when the condition fails.

SNAP also implements assertions that always fail. These are used when the program identifies a critical error, such as being out of memory. Fail assertions are:

  • EFailR throws an exception with a reason.
  • FailR prints the reason and terminates the program.
  • Fail terminates the program.

Examples of assertion usage:

AssertR(IsNode(NId), TStr::Fmt("NodeId %d does not exist", NId));
EFailR(TStr::Fmt("JSON Error: Unknown escape sequence: '%s'", Beg).CStr());
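
The difference between the three flavors can be sketched with a few stand-in macros. All names below (DEMO_ASSERT, DEMO_IASSERT_R, DEMO_EASSERT_R, DemoFailR, GetQuotient) are hypothetical and exist only in this sketch. SNAP's own macros are defined differently, and the sketch uses std::runtime_error only because SNAP's exception classes are not available in a standalone example.

```cpp
#include <cassert>
#include <stdexcept>
#include <stdio.h>
#include <stdlib.h>

// Hypothetical stand-ins; SNAP's assertion macros are defined differently.

// Prints the failed condition with a reason and terminates the program.
inline void DemoFailR(const char* CondStr, const char* Reason) {
  fprintf(stderr, "Assertion failed: %s (%s)\n", CondStr, Reason);
  abort();
}

// Debug-only flavor: standard assert is compiled out when NDEBUG is defined.
#define DEMO_ASSERT(Cond) assert(Cond)

// Always compiled, aborts with a reason when the condition is false.
#define DEMO_IASSERT_R(Cond, Reason) \
  ((Cond) ? static_cast<void>(0) : DemoFailR(#Cond, Reason))

// Always compiled, throws an exception instead of aborting.
#define DEMO_EASSERT_R(Cond, Reason) \
  ((Cond) ? static_cast<void>(0) : throw std::runtime_error(Reason))

// Returns Num / Den; a zero denominator is reported as an exception.
int GetQuotient(const int Num, const int Den) {
  DEMO_EASSERT_R(Den != 0, "denominator is zero");
  return Num / Den;
}
```

The throwing flavor suits errors that a caller can reasonably recover from, while the aborting flavors suit conditions that indicate a bug in the program.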

Integer Types

Of the built-in C++ integer types, use only int in your code. If a program needs a variable of a different size, use one of these precise-width integer types:

  • int8, uint8: signed, unsigned 8-bit integers.
  • int16, uint16: signed, unsigned 16-bit integers.
  • int32, uint32: signed, unsigned 32-bit integers.
  • int64, uint64: signed, unsigned 64-bit integers.

Use int for integers that are not going to be too large, e.g., loop counters. You can assume that an int is at least 32 bits, but don't assume that it has more than 32 bits.

Functions should not return values of type TSize (size_t). Instead, use fixed-size types for function return values, like int32 or int64.

Do not use the unsigned integer types, unless the quantity you are representing is really a bit pattern rather than a number, or unless you need well-defined wrap-around (modulo 2^N) overflow. In particular, do not use unsigned types to say a number will never be negative. Instead, use assertions for this purpose.
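
A short sketch of the reasoning: a loop that counts down needs its index to become negative in order to stop, which an unsigned index can never do. GetTailSum() is a hypothetical helper, not a SNAP function, and the standard assert stands in for a SNAP assertion that checks the non-negative precondition.

```cpp
#include <cassert>

// Hypothetical helper for illustration; it is not part of SNAP.
// Returns the sum of the last Cnt elements of ValV, which has Vals elements.
int GetTailSum(const int* ValV, const int Vals, const int Cnt) {
  assert(Cnt >= 0 && Cnt <= Vals);  // states "never negative" explicitly
  int Sum = 0;
  // The signed index reaches -1 when Cnt == Vals and the loop stops.
  // An unsigned index would wrap around to a huge value instead.
  for (int i = Vals - 1; i >= Vals - Cnt; i--) {
    Sum += ValV[i];
  }
  return Sum;
}
```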

To print 64-bit integers, use TUInt64::GetStr() for conversion to string and the %s print formatting conversion:

int64 Val = 123456789012345;
TStr Note = TStr::Fmt("64-bit integer value is %s", TUInt64::GetStr(Val).CStr());

Exceptions

SNAP defines its own exceptions. Use SNAP defined exceptions. Do not use C++ exceptions.

SNAP exceptions are implemented with TExcept::Throw and PExcept.

TExcept::Throw throws an exception:

TExcept::Throw("Empty blog url");

PExcept catches an exception:

try {
  ...
} catch (PExcept Except) {
  SaveToErrLog(Except->GetStr().CStr());
}

Other C++ Features

Use of const

Use const whenever it makes sense to do so.

Declared variables and parameters can be preceded by the keyword const to indicate the variables are not changed (e.g., const int Foo). Class functions can have the const qualifier to indicate the function does not change the state of the class member variables (e.g., class Foo { int Bar(char Ch) const; };).

const variables, data members, methods and arguments add a level of compile-time type checking. It is better to detect errors as soon as possible. In some cases const can also help the compiler generate faster code. Therefore we strongly recommend that you use const whenever it makes sense to do so:

  • If a function does not modify an argument passed by reference or by pointer, that argument should be const.
  • Declare methods to be const whenever possible. Accessors should almost always be const. Other methods should be const if they do not modify any data members, do not call any non-const methods, and do not return a non-const pointer or non-const reference to a data member.
  • Consider making data members const whenever they do not need to be modified after construction.

Put const at the beginning of a definition as in const int* Foo, not in the middle as in int const *Foo.

Note that const is viral: if you pass a const variable to a function, that function must have const in its prototype (or the variable will need a const_cast).
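
The rules above can be sketched with a small hypothetical class, TScoreDemo, which is not part of SNAP. The accessors are const methods, the value parameter is const, and the nonmember function takes its argument by const reference, so it can call only the const methods. The standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>

// Hypothetical class for illustration; it is not part of SNAP.
class TScoreDemo {
private:
  int MxVal;
  int Val;
public:
  explicit TScoreDemo(const int Mx) : MxVal(Mx), Val(0) { }
  int GetVal() const { return Val; }      // accessors are const methods
  int GetMxVal() const { return MxVal; }
  void SetVal(const int N) { assert(N <= MxVal); Val = N; }
};

// Score is only read, so it is passed as a const reference. Because const
// is viral, only the const methods of TScoreDemo can be called here.
int GetRemaining(const TScoreDemo& Score) {
  return Score.GetMxVal() - Score.GetVal();
}
```

If GetVal() were not declared const, the call inside GetRemaining() would fail to compile, which is the kind of early error detection the rule aims for.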

Preprocessor Macros

Avoid using macros, if possible. Prefer inline functions, enums, and const variables to macros.

Macros are not nearly as necessary in C++ as they are in C. Instead of using a macro to inline performance-critical code, use an inline function. Instead of using a macro to store a constant, use a const variable. Instead of using a macro to "abbreviate" a long variable name, use a reference. Instead of using a macro to conditionally compile code ... well, don't do that at all (except, of course, for the #define guards to prevent double inclusion of header files). It makes testing much more difficult.
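
As a brief illustration of the alternatives, the sketch below contrasts a macro with an inline function and a const variable. All names (BAD_SQR, GetSqr, PiRounded, GetCircleArea) are hypothetical and exist only in this example. The standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>

// All names below are hypothetical and exist only in this sketch.

// Macro version (avoid): the argument is pasted in as text, so
// BAD_SQR(1 + 2) expands to 1 + 2 * 1 + 2.
#define BAD_SQR(x) x * x

// Preferred: an inline function evaluates its argument exactly once.
inline int GetSqr(const int Val) { return Val * Val; }

// Preferred: a typed constant instead of "#define PI_ROUNDED 3.14".
const double PiRounded = 3.14;

// Uses the constant like any other variable; the compiler checks its type.
inline double GetCircleArea(const double Radius) {
  assert(Radius >= 0.0);
  return PiRounded * Radius * Radius;
}
```

Here BAD_SQR(1 + 2) evaluates to 5 because of operator precedence, while GetSqr(1 + 2) evaluates to 9 as intended.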

0 and NULL

Use 0 for integers, 0.0 for reals, NULL for pointers, and '\0' for chars.

sizeof

Use sizeof(varname) instead of sizeof(type) whenever possible.

Use sizeof(varname) because it will update appropriately if the type of the variable changes. sizeof(type) may make sense in some cases, but should generally be avoided because it can fall out of sync if the variable's type changes.

Struct Data;
memset(&Data, 0, sizeof(Data));    // Good - follows the type of Data.
memset(&Data, 0, sizeof(Struct));  // Bad - breaks if the type of Data changes.

Casting

Use C++ casts like static_cast<>():

int Cnt = static_cast<int>(ValType);

Do not use other cast formats like int y = (int)x; or int y = int(x);.
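
A small sketch of a typical use follows. GetAvg() is a hypothetical helper, not a SNAP function, and the standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>

// Hypothetical helper for illustration; it is not part of SNAP.
// Returns the average of Sum over Cnt items as a floating point value.
double GetAvg(const int Sum, const int Cnt) {
  assert(Cnt > 0);
  // Without the cast, Sum / Cnt would be an integer division.
  return static_cast<double>(Sum) / Cnt;
}
```

A static_cast is easy to search for and states the intended conversion explicitly, which is harder to do with the C-style forms.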

Scoping

Namespaces

SNAP uses namespace TSnap to encapsulate global functions. Define all SNAP global functions within that namespace.

Use namespace TSnapDetail to encapsulate local functions.

Do not define any new namespaces; use TSnap for global functions and TSnapDetail for local functions.

Do not use a using-directive to make all names from a namespace available:

// Forbidden -- This pollutes the namespace.
using namespace Foo;

Nonmember, Static Member, and Global Functions

Prefer nonmember functions within a namespace or static member functions to global functions; use completely global functions rarely.

Use namespace TSnap for global functions and namespace TSnapDetail for local functions. See file alg.h for an example.

If you must define a nonmember function and it is only needed locally in its .cpp file, use static linkage: static int Foo() {...}, or namespace TSnapDetail to limit its scope:

namespace TSnapDetail {                // This is in a .cpp file.

// The content of a namespace is not indented
enum { kUnused, kEOF, kError };        // Commonly used tokens.
bool AtEof() { return pos_ == kEOF; }  // Uses our namespace's EOF.

}  // namespace TSnapDetail
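
The split between the two namespaces can be sketched as follows. GetNonNeg() and GetNonNegSum() are hypothetical names used only to show the layout; they are not existing SNAP functions. The standard assert stands in for a SNAP assertion.

```cpp
#include <cassert>

// GetNonNeg() and GetNonNegSum() are hypothetical names used only to show
// the layout; they are not existing SNAP functions.
namespace TSnapDetail {

// Local helper, not meant to be called from outside the library.
inline int GetNonNeg(const int Val) { return Val < 0 ? 0 : Val; }

}  // namespace TSnapDetail

namespace TSnap {

// Global function: sums the values in ValV, counting negative values as 0.
inline int GetNonNegSum(const int* ValV, const int Vals) {
  assert(Vals >= 0);
  int Sum = 0;
  for (int i = 0; i < Vals; i++) {
    Sum += TSnapDetail::GetNonNeg(ValV[i]);
  }
  return Sum;
}

}  // namespace TSnap
```

Callers refer to the function by its qualified name, TSnap::GetNonNegSum(), rather than through a using-directive.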

Static and Global Variables

Static or global variables of class type are forbidden: they cause hard-to-find bugs due to indeterminate order of construction and destruction.

Functions

Functions are a fundamental unit of C++ classes. This section lists the guidelines you should follow when writing a function.

Function Grouping

Group function declarations in *.h files, so that functions relating to common functionality are grouped together. Function definitions in the corresponding *.cpp file should be in the same order as function declarations.

Function Parameters

Parameters to functions should be passed by reference, where appropriate. Passing by reference avoids unnecessary copying and makes code execution more efficient:

double GetDegreeCentr(const PUNGraph& Graph, const int& NId);

When defining a function, parameter order is: inputs, then outputs.

Input parameters are usually values or const references, while output and input/output parameters are non-const references. In the following example, Graph is an input parameter, and InDegV and OutDegV are output parameters:

void GetDegSeqV(const PGraph& Graph, TIntV& InDegV, TIntV& OutDegV);

When ordering function parameters, put all input-only parameters before any output parameters. In particular, do not add new parameters to the end of the function just because they are new; place new input-only parameters before the output parameters.

This is not a hard-and-fast rule. Parameters that are both input and output (often classes/structs) muddy the waters, and, as always, consistency with related functions may require you to bend the rule.

Default Arguments

The use of default function arguments is discouraged.

Write Short Functions

Prefer small and focused functions.

We recognize that long functions are sometimes appropriate, so no hard limit is placed on function length. If a function exceeds about 40 lines, think about whether it can be broken up without harming the structure of the program.

Even if your long function works perfectly now, someone modifying it in a few months may add new behavior. This could result in bugs that are hard to find. Keeping your functions short and simple makes it easier for other people to read and modify your code.

Classes

Classes are the fundamental unit of code in C++. Naturally, we use them extensively. This section lists the main dos and don'ts you should follow when writing a class.

Class Definitions

Each public SNAP class must define the following methods: a default constructor, a copy constructor, a TSIn constructor, an assignment operator =, and Save() and Load() methods. Classes that support "smart" pointers must also define a New() method. See Class Format for additional details on class formatting.

Declaration Order

Class declarations generally should be in the following order:
  • Typedefs and Enums
  • Constants (static const data members)
  • Private/Public Classes
  • Data Members (except static const data members)
  • Private Methods
  • Constructors
  • Destructor
  • Public Methods, including static methods
  • Friends

Group the methods so that methods relating to common functionality are grouped together. Method definitions in the corresponding .cpp file should follow the declaration order as much as possible.
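
The declaration order can be sketched with a small hypothetical class. The class name, its members, and the befriended test class are all invented for illustration; only the order of the sections is meant to carry over.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical class; only the order of the sections matters here.
class TToyCounter {
public:
  // Typedefs and enums
  typedef std::vector<int> TValV;
  // Constants (static const data members)
  static const int MaxVals = 1000;
private:
  // Private classes
  struct TRange {
    int MnVal;
    int MxVal;
  };
  // Data members
  TValV ValV;
  TRange Range;
  // Private methods
  void UpdateRange(const int& Val) {
    if (ValV.empty() || Val < Range.MnVal) { Range.MnVal = Val; }
    if (ValV.empty() || Val > Range.MxVal) { Range.MxVal = Val; }
  }
public:
  // Constructors
  TToyCounter() : ValV(), Range() { }
  // Destructor
  ~TToyCounter() { }
  // Public methods
  bool Add(const int& Val) {
    if ((int) ValV.size() >= MaxVals) { return false; }
    UpdateRange(Val);  // must run before the value is stored
    ValV.push_back(Val);
    return true;
  }
  int Len() const { return (int) ValV.size(); }
  int GetMn() const { return Range.MnVal; }
  int GetMx() const { return Range.MxVal; }
  // Friends
  friend class TToyCounterTest;
};
```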

Doing Work in Constructors

In general, constructors should merely set member variables to their initial values. Any complex initialization should go in an explicit Init() method.

If your object requires non-trivial initialization, consider having an explicit Init() method. In particular, constructors should not call virtual functions, attempt to raise errors, access potentially uninitialized global variables, etc.
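
A minimal sketch of this split follows. The class TToyEdgeList is hypothetical, and reporting failure through a bool return value from Init() is an assumption made for the example, not a SNAP convention stated in this guide.

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical class: the constructor only sets members to initial values,
// while parsing, which can fail, happens in Init().
class TToyEdgeList {
private:
  std::vector<int> SrcV;
  std::vector<int> DstV;
  bool Ok;
public:
  TToyEdgeList() : SrcV(), DstV(), Ok(false) { }
  // Parses whitespace-separated node id pairs, e.g. "0 1 1 2".
  // Returns false on malformed input or an odd number of ids.
  bool Init(const std::string& EdgeStr) {
    SrcV.clear();
    DstV.clear();
    std::istringstream Strm(EdgeStr);
    std::vector<int> ValV;
    int Val = 0;
    while (Strm >> Val) { ValV.push_back(Val); }
    // Reading must have stopped at the end of input, not at a bad token.
    Ok = Strm.eof() && ValV.size() % 2 == 0;
    if (!Ok) { return false; }
    for (std::size_t i = 0; i < ValV.size(); i += 2) {
      SrcV.push_back(ValV[i]);
      DstV.push_back(ValV[i + 1]);
    }
    return true;
  }
  bool IsOk() const { return Ok; }
  int GetEdges() const { return (int) SrcV.size(); }
};
```

The caller constructs the object first and then checks the result of Init(), so a failed initialization never leaves the constructor in the position of having to signal an error.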

Access Control

Make data members private, and provide access to them through accessor functions as needed. Typically a variable would be called Foo and the accessor function GetFoo(). You may also want a mutator function SetFoo().
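
For instance, a hypothetical class TToyNode might expose its members as follows. The class is invented for illustration; Id gets only an accessor because nothing needs to change it after construction.

```cpp
// Hypothetical class illustrating the Foo / GetFoo() / SetFoo() pattern.
class TToyNode {
private:
  int Id;
  double Weight;
public:
  TToyNode() : Id(-1), Weight(0.0) { }
  explicit TToyNode(const int& NId) : Id(NId), Weight(0.0) { }
  int GetId() const { return Id; }                               // accessor only
  double GetWeight() const { return Weight; }                    // accessor
  void SetWeight(const double& NewWeight) { Weight = NewWeight; }  // mutator
};
```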

Structs vs. Classes

Use a struct only for passive objects that carry data; everything else is a class.

The struct and class keywords behave almost identically in C++. We add our own semantic meanings to each keyword, so you should use the appropriate keyword for the data-type you're defining.

If in doubt, make it a class.
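
The distinction might look like this in practice. Both types are hypothetical: TToyPoint only carries data, so it is a struct, while TToyBBox maintains an invariant over its members, so it is a class.

```cpp
// Passive data carrier: a struct with public fields and no behavior.
struct TToyPoint {
  double X;
  double Y;
};

// Maintains state that must stay consistent, so it is a class.
class TToyBBox {
private:
  TToyPoint MnPt;
  TToyPoint MxPt;
  bool Empty;
public:
  TToyBBox() : MnPt(), MxPt(), Empty(true) { }
  // Grows the box so that it contains Pt.
  void Extend(const TToyPoint& Pt) {
    if (Empty || Pt.X < MnPt.X) { MnPt.X = Pt.X; }
    if (Empty || Pt.Y < MnPt.Y) { MnPt.Y = Pt.Y; }
    if (Empty || Pt.X > MxPt.X) { MxPt.X = Pt.X; }
    if (Empty || Pt.Y > MxPt.Y) { MxPt.Y = Pt.Y; }
    Empty = false;
  }
  bool IsEmpty() const { return Empty; }
  double GetWidth() const { return MxPt.X - MnPt.X; }
  double GetHeight() const { return MxPt.Y - MnPt.Y; }
};
```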

Multiple Inheritance

In general, multiple inheritance is not allowed. Only very rarely is multiple implementation inheritance actually useful. We allow multiple inheritance only when at most one of the base classes has an implementation; all other base classes must be pure interface classes tagged with the Interface suffix.
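
A sketch of the permitted form is shown below, with invented class names. TToyIdSet is the single base class with an implementation; the other two bases are pure interfaces and carry the Interface suffix.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Pure interface classes: no data members, only pure virtual methods.
class TPrintableInterface {
public:
  virtual ~TPrintableInterface() { }
  virtual std::string GetStr() const = 0;
};

class TSizedInterface {
public:
  virtual ~TSizedInterface() { }
  virtual int Len() const = 0;
};

// The one base class that has an implementation.
class TToyIdSet {
private:
  std::vector<int> IdV;
public:
  TToyIdSet() : IdV() { }
  virtual ~TToyIdSet() { }
  void AddId(const int& Id) { IdV.push_back(Id); }
  int GetIds() const { return (int) IdV.size(); }
  int GetId(const int& IdN) const { return IdV[IdN]; }
};

// Allowed: one implementation base plus pure interfaces.
class TToyNodeSet : public TToyIdSet, public TPrintableInterface, public TSizedInterface {
public:
  TToyNodeSet() : TToyIdSet() { }
  int Len() const { return GetIds(); }
  std::string GetStr() const {
    std::ostringstream Strm;
    for (int IdN = 0; IdN < GetIds(); IdN++) {
      if (IdN > 0) { Strm << ","; }
      Strm << GetId(IdN);
    }
    return Strm.str();
  }
};
```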

Header Files

In general, every .cpp file should have an associated .h file. There are some common exceptions, such as unit tests and small .cpp files containing just a main() function.

Correct use of header files can make a huge difference to the readability, size and performance of your code.

The #define Guard

All header files should have #define guards to prevent multiple inclusion. The format of the symbol name should be snap_<file>_h:
#ifndef snap_agm_h
#define snap_agm_h
...

#endif  // snap_agm_h

Names and Order of Includes

Use standard order for readability and to avoid hidden dependencies:
  1. C system files.
  2. C++ system files.
  3. Other libraries' .h files.
  4. Your project's .h files.

All of a project's header files should be listed as descendants of the project's source directory without use of UNIX directory shortcuts . (the current directory) or .. (the parent directory). For example, snap-awesome-algorithm/src/base/logging.h should be included as:

#include "base/logging.h"

Within each section it is nice to order the includes alphabetically.
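
Putting the guard and the include order together, a header might be laid out roughly as follows. The file name toyalg.h, the library header otherlib/io.h, and the function GetAvgDeg are hypothetical; the non-system includes are commented out so that the sketch compiles on its own.

```cpp
#ifndef snap_toyalg_h
#define snap_toyalg_h

// 1. C system files.
#include <stddef.h>

// 2. C++ system files.
#include <vector>

// 3. Other libraries' .h files (hypothetical name, commented out).
// #include "otherlib/io.h"

// 4. Your project's .h files, relative to the project's source directory.
// #include "base/logging.h"

// Hypothetical function, defined inline only to keep the sketch
// self-contained. Returns the average of the values in DegV, or 0 if empty.
inline double GetAvgDeg(const std::vector<int>& DegV) {
  if (DegV.empty()) { return 0.0; }
  double Sum = 0.0;
  for (size_t i = 0; i < DegV.size(); i++) { Sum += DegV[i]; }
  return Sum / DegV.size();
}

#endif  // snap_toyalg_h
```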

Exceptions to the Rules

The coding conventions described above are mandatory. However, like all good rules, these sometimes have exceptions, which we discuss here.

Existing Non-conformant Code

Some of the SNAP code was written before this style guide was created and does not conform to it.

If you need to change such code, the best option is to rewrite it, so that it conforms to the guide. If that is not possible, because rewriting would require more time and effort than you have available, then stay consistent with the local conventions in that code.

Windows Code

Windows programmers have developed their own set of coding conventions, mainly derived from the conventions in Windows headers and other Microsoft code. You should not use any Windows-specific constructs within the SNAP code.

SNAP does not contain any Windows-specific code. All such code is encapsulated within the GLIB library, which SNAP uses. If you must implement some Windows-specific functionality, contact the SNAP maintainers.

Parting Words

Use common sense and BE CONSISTENT.

If you are writing new code, follow this style guide. To see an example of SNAP programming style, see file graph.h.

If you are editing existing code, take a few minutes to look at it and determine its style. If your code looks drastically different from the existing code around it, the discontinuity makes it harder for others to understand it. Try to avoid this.

OK, enough writing about writing code; the code itself is much more interesting. Have fun!

