Modern C++

Programming

17. Debugging and Testing

Federico Busato

2026-01-06

Table of Contents

1 Debugging Overview

Errors, Defects, and Failures

Cost of Software Defects

Software Defects Classification

Program Error and Classification

Software Defect Analysis

2 Assertions

Run-time Assertions

Contracts

std::stacktrace


3 Execution Debugging

Breakpoints

Watchpoints / Catchpoints

Control Flow

Stack and Info

Disassemble

std::breakpoint


4 Memory Debugging

valgrind

5 Hardening Techniques

Stack Usage

Standard C Library Hardening

Standard C++ Library Hardening

Undefined Behavior Protections

Control Flow Protections


6 Sanitizers

Address Sanitizer

Leak Sanitizer

Memory Sanitizers

Undefined Behavior Sanitizer

Type Sanitizer

Sampling-Based Sanitizer

7 Debugging Summary

8 Compiler Warnings


9 Static Analysis

Compiler-Provided Static Analyzers

Open-Source Static Analyzers

Proprietary Static Analyzers

10 Code Testing

Unit Testing

Test-Driven Development (TDD)

Code Coverage

Fuzz Testing


11 Code Quality

clang-tidy

12 Code Complexity

Cyclomatic Complexity

Cognitive Complexity

6/107

Feature Complete

7/107

Debugging Overview

Is this a bug?

for (int i = 0; i <= (2^32) - 1; i++) {
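A short note on why the loop header above is suspicious: in C++ the `^` operator is bitwise XOR, not exponentiation, so `(2^32) - 1` evaluates to 33. The sketch below checks this at compile time; the function names are hypothetical and introduced only for illustration.

```cpp
#include <cstdint>

// In C++, '^' is bitwise XOR: 2 ^ 32 == 0b000010 XOR 0b100000 == 34.
constexpr int xor_bound() { return (2 ^ 32) - 1; }

// What was most likely intended: 2 to the power of 32, minus 1.
// A 64-bit type is used because the value does not fit in a 32-bit int.
constexpr std::uint64_t pow_bound() { return (std::uint64_t{1} << 32) - 1; }

static_assert(xor_bound() == 33, "the loop only covers i = 0..33");
static_assert(pow_bound() == 4294967295u, "largest 32-bit unsigned value");
```

Note that, on platforms with a 32-bit `int`, even the intended bound would not help: an `int` counter can never exceed it, so the loop would run into signed overflow instead of terminating.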

“Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects”

from: John Regehr (on Twitter)

The Debugging Mindset

8/107

Errors, Defects, and Failures

• An error is a human mistake. Errors lead to software defects

• A software defect is an unexpected behavior of the software (correctness, performance, etc.). Defects potentially lead to software failures

• A software failure is an observable incorrect behavior. A program error is a failure

9/107

Cost of Software Defects 1/2

10/107

Cost of Software Defects 2/2

Some examples:

• The Millennium Bug (2000): $100 billion

• The Morris Worm (1988): $10 million (single student)

• Ariane 5 (1996): $370 million

• Knight’s unintended trades (2012): $440 million

• Bitcoin exchange error (2011): $1.5 million

• Pentium FDIV Bug (1994): $475 million

• Boeing 737 MAX (2019): $3.9 million

see also:

11 of the most costly software errors in history

Historical Software Accidents and Errors

List of software bugs

11/107

Software Defects Classification

Ordered by fix complexity (time to fix):

(1) Typos, Syntax, Formatting (seconds)

(2) Compilation Warnings/Errors (seconds, minutes)

(3) Logic, Arithmetic, Runtime Errors (minutes, hours, days)

(4) Resource Errors (minutes, hours, days)

(5) Accuracy Errors (hours, days)

(6) Performance Errors (days)

(7) Design Errors (weeks, months)

12/107

Causes of Bugs

• C++ is a very error-prone language, see 60 terrible tips for a C++ developer

• C++ is a memory-unsafe language: incorrect use of memory resources leads to undefined behavior instead of a well-defined failure. Memory-related undefined behavior causes non-deterministic behavior, which makes the program much harder to debug

• Human behavior, e.g. copying and pasting code is a very common practice and can introduce subtle bugs → check the code carefully and make sure its behavior is deeply understood

13/107

Program Error and Classification

A program error is a set of conditions that produce an incorrect result or unexpected behavior, including performance regression, memory consumption, early termination, etc.

We can distinguish between two kinds of errors:

Recoverable: Conditions that are not under the control of the program. They indicate “exceptional” run-time conditions, e.g. file not found, bad allocation, wrong user input, etc.

Unrecoverable: A synonym of a bug. It indicates a problem in the program logic. The program must terminate and be modified, e.g. out-of-bound access, division by zero, etc.

A recoverable error should be considered unrecoverable if it is extremely rare and difficult to handle, e.g. bad allocation due to an out-of-memory error
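The distinction can be sketched in code. In this minimal example, `parse_int` and `element_at` are hypothetical names: the first reports a recoverable condition (wrong user input) to the caller, while the second treats an out-of-bound index as a bug and guards it with an assertion.

```cpp
#include <cassert>
#include <cstddef>
#include <exception>
#include <optional>
#include <string>
#include <vector>

// Recoverable: malformed input is outside the program's control, so the
// failure is reported to the caller instead of terminating the program.
std::optional<int> parse_int(const std::string& text) {
    try {
        std::size_t consumed = 0;
        int value = std::stoi(text, &consumed);
        if (consumed != text.size())
            return std::nullopt;  // trailing characters, e.g. "12abc"
        return value;
    } catch (const std::exception&) {  // invalid_argument or out_of_range
        return std::nullopt;
    }
}

// Unrecoverable: an out-of-bound index is a bug in the caller's logic.
int element_at(const std::vector<int>& values, std::size_t index) {
    assert(index < values.size());  // logic error if violated
    return values[index];
}
```

The caller of `parse_int` can ask the user again; a failed assertion in `element_at` instead means the source code must be fixed.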

14/107

Software Defect Analysis

Dynamic Analysis A mitigation strategy that acts on the runtime state of a program.

Techniques: Print, run-time debugging, sanitizers, fuzzing, unit test support,

performance regression tests

Limitations: Infeasible to cover all program states

Static Analysis A proactive strategy that examines the source code for (potential)

errors.

Techniques: Warnings, static analysis tool, compile-time checks

Limitations: Turing’s undecidability theorem, exponential code paths

How programmers make sure that their software is correct

15/107

Assertions

Unrecoverable Errors and Assertions

Assertions are conditions used to detect logic errors (unrecoverable errors) and potentially prevent them in production.

Assertions express preconditions, invariants, and postconditions

C++ assertions are statements that detect violations of assumptions.

The language provides two kinds of assertions:

• At run-time with assert and contracts C++26

• At compile-time with static_assert C++11, see "Template and Metaprogramming II" lecture
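A minimal sketch of the two kinds of assertions side by side. The function `safe_divide` is a hypothetical example, not part of the lecture material.

```cpp
#include <cassert>
#include <cstdint>

// Compile-time assertion: the assumption is verified by the compiler and
// has no run-time cost.
static_assert(sizeof(std::int32_t) == 4, "int32_t must be 4 bytes");

// Run-time assertion: the assumption depends on a value that is known
// only when the program runs.
int safe_divide(int numerator, int denominator) {
    assert(denominator != 0);  // logic error if the caller passes zero
    return numerator / denominator;
}
```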

16/107

Run-time Assertions 1/5

A run-time assertion is deﬁned with the macro assert  in the <cassert> header

# define assert(EXPR)

Note: Assertions capture logic errors and are intended to facilitate debugging.

• They don’t check recoverable errors, e.g. ﬁle not found

• Assertion failures should never be exposed in the normal program execution (e.g.

release mode)

• Assertions may slow down the execution and increase the binary size

17/107

Run-time Assertions 2/5

Assertions are enabled by default and can be disabled by defining the NDEBUG macro. It is good practice to define the macro for the whole translation unit with the compiler flags:

• Clang/Gcc: -DNDEBUG

• MSVC: /DNDEBUG

A failed assert calls std::abort(), which causes immediate program termination

assert can be used in constexpr functions starting from C++11. However, a failed assertion is not allowed in a constant evaluation (core constant expression)

assert must not contain side effects, to avoid inconsistent behavior between release and debug execution

assert(++n > 0); // incremented only in debug builds
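One way to avoid the problem is to perform the side effect outside the macro and assert only on the result. A minimal sketch, where `increment_checked` is a hypothetical function name:

```cpp
#include <cassert>

// Problematic pattern (shown only as a comment):
//     assert(++n > 0);   // with -DNDEBUG the increment disappears
//
// Safer pattern: perform the side effect unconditionally, then assert
// on the result.
int increment_checked(int n) {
    ++n;            // always executed, in debug and release builds
    assert(n > 0);  // removed when NDEBUG is defined, without changing n
    return n;
}
```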

18/107

Run-time Assertions - Example 3/5

# include <cassert>

// Compute ceil(value / multiple) * multiple

int round_up(int value, int multiple) {

assert(value >= 0); // precondition

assert(multiple > 0); // precondition

auto div_result = ceil_div(value, multiple);

assert(check_mul_overflow(div_result, multiple)); // internal

auto result = div_result * multiple;

assert(result >= value); // postcondition

return result;

}
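The example relies on helpers that the slide does not define. The following self-contained sketch fills them in under stated assumptions: `ceil_div` and `mul_fits_in_int` are hypothetical implementations written for this illustration (the second one stands in for the slide's overflow check, whose exact semantics are not given).

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: ceil(value / divisor) for value >= 0, divisor > 0.
int ceil_div(int value, int divisor) {
    return value / divisor + (value % divisor != 0 ? 1 : 0);
}

// Hypothetical helper: true if a * b fits in an int (a >= 0, b > 0).
bool mul_fits_in_int(int a, int b) {
    return a <= std::numeric_limits<int>::max() / b;
}

// Compute ceil(value / multiple) * multiple
int round_up(int value, int multiple) {
    assert(value >= 0);                             // precondition
    assert(multiple > 0);                           // precondition
    int div_result = ceil_div(value, multiple);
    assert(mul_fits_in_int(div_result, multiple));  // internal invariant
    int result = div_result * multiple;
    assert(result >= value);                        // postcondition
    return result;
}
```

For instance, `round_up(5, 4)` yields 8 and `round_up(8, 4)` yields 8.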

19/107

Run-time Assertions - Limitations 4/5

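assert is a single-argument macro and not a keyword.

The preprocessor splits its argument at every top-level comma, so expressions with commas inside template angle brackets or braced initializer lists are not supported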

# include <cassert> // compiler explorer 

template<int, int>

bool g() { return true; }

struct A {

bool x, y = true;

};

int main() {

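assert(g<3, 4>()); // error: the preprocessor sees two arguments, "g<3" and "4>()"

assert(A{true, 1}.x); // error: the preprocessor sees two arguments, "A{true" and "1}.x"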

}
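A common workaround is to wrap the whole expression in an extra pair of parentheses, so that the commas are no longer at the top level of the macro argument. A minimal sketch, where `checks_pass` is a hypothetical wrapper function:

```cpp
#include <cassert>

template <int, int>
bool g() { return true; }

struct A {
    bool x, y = true;
};

bool checks_pass() {
    // Extra parentheses hide the commas from the preprocessor, so the
    // macro receives a single argument.
    assert((g<3, 4>()));
    assert((A{true, true}.x));
    return true;
}
```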

Make assert() macro user friendly for C and C++, P2264

20/107

Assertion Enhancements 5/5

boost.org/libs/assert provides an enhanced version of assert to help the debugging process

The library provides the BOOST_ASSERT(expr) macro. When BOOST_ENABLE_ASSERT_HANDLER is defined, a failed assertion calls the following function, which the user implements and customizes

void boost::assertion_failed(

const char* expr, // failed expression

const char* function, // function name of the failed assertion

const char* file, // file name of the failed assertion

long line); // line number of the failed assertion

21/107

Contracts 1/4

A contract assertion is a set of conditions that expresses expectations of a component related to the correct execution of a program.

C++26 introduces three new keywords to express assertions:

contract_assert: Generic assertion

pre: Function preconditions, namely conditions that hold before the

function execution

post: Function postconditions, namely conditions that hold after the

function execution

• Contracts for C++ explained in 5 minutes, Timur Doumler

• Contracts for C++, P2900

22/107

Contracts - Example 2/4

// Compute ceil(value / multiple) * multiple

int round_up(int value, int multiple)

pre(value >= 0)

pre(multiple > 0)

post(result: result >= value) {

auto div_result = ceil_div(value, multiple);

contract_assert(check_mul_overflow(div_result, multiple));

return div_result * multiple;

}

23/107

Contracts - Evaluation Semantic 3/4

Contract assertions provide four evaluation semantics to define their behavior:

Semantic Termination Diagnostic

ignore no no

observe no yes

enforce yes yes

quick-enforce yes no

The evaluation semantic is set with the compiler ﬂag

-fcontract-semantic=<semantic>
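The observable effect of each semantic on a violated contract can be modeled in plain C++. The sketch below is NOT the C++26 contracts API: `Semantic`, `Outcome`, and `evaluate` are hypothetical names used only to illustrate which semantics check the predicate, produce a diagnostic, and terminate.

```cpp
#include <functional>

// Hypothetical model of the four evaluation semantics (illustration only).
enum class Semantic { ignore, observe, enforce, quick_enforce };

struct Outcome {
    bool predicate_evaluated;  // was the condition checked at all?
    bool handler_called;       // was a diagnostic produced?
    bool terminates;           // does the program stop?
};

// Describes what happens when 'predicate' is checked under semantic 's'.
Outcome evaluate(Semantic s, const std::function<bool()>& predicate) {
    if (s == Semantic::ignore)
        return {false, false, false};  // the predicate is not checked
    if (predicate())
        return {true, false, false};   // contract holds: nothing happens
    switch (s) {
        case Semantic::observe:       return {true, true, false};
        case Semantic::enforce:       return {true, true, true};
        case Semantic::quick_enforce: return {true, false, true};
        default:                      return {true, false, false};
    }
}
```

In this model, `observe` reports the violation and continues, `enforce` reports and terminates, and `quick_enforce` terminates without calling the violation handler.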

24/107

Contracts - Contract-Violation Handler ⋆ 4/4

The diagnostic behavior prints a default message, which can be customized by defining the function

void handle_contract_violation(const std::contracts::contract_violation&) .

std::contracts::contract_violation provides the following methods:

• const char* comment() : textual representation of the predicate

• detection_mode detection_mode() : predicate_false or

evaluation_exception

• exception_ptr evaluation_exception() : pointer to the exception raised during

predicate evaluation

• bool is_terminating() : true if terminating semantic

• assertion_kind kind() : pre , post , assert

• source_location location() : source location of the assertion

• evaluation_semantic semantic() : ignore , observe , enforce ,

quick_enforce

25/107

std::stacktrace 1/2

C++23 introduces the std::stacktrace library to get the current function call stack, namely the sequence of calls from the main() entry point

# include <print>

# include <stacktrace> // the program must be linked with the library

// -lstdc++_libbacktrace

// (-lstdc++exp with gcc-14 trunk)

void g() {

auto call_stack = std::stacktrace::current();

for (const auto& entry : call_stack)

std::print("{}\n", entry);

}

void f() { g(); }

int main() { f(); }

26/107

std::stacktrace 2/2

The previous code prints:

g() at /app/example.cpp:6

f() at /app/example.cpp:11

main at /app/example.cpp:13

at :0

__libc_start_main at :0

_start at :0

The library also provides additional member functions of entry for fine-grained control of the output: description() , source_file() , source_line()

for (const auto& entry : call_stack) { // same output

std::print("{} at {}:{}\n", entry.description(), entry.source_file(),

entry.source_line());

}

27/107

Boost Stacktrace

boost.org/libs/stacktrace is a third-party library for capturing and printing the stacktrace.

boost::stacktrace::stacktrace() captures the current call stack; the resulting object can be streamed to an output stream or converted to a string

This function can be combined with boost::assertion_failed , exception

handling, or signal handling to enhance debugging information

0# bar(int) at /path/to/source/file.cpp:70

1# bar(int) at /path/to/source/file.cpp:70

2# bar(int) at /path/to/source/file.cpp:70

3# bar(int) at /path/to/source/file.cpp:70

4# main at /path/to/main.cpp:93

5# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6

6# _start

28/107

Cpptrace and Backward

https://github.com/jeremy-rifkin/cpptrace  is a simple and portable C++

stacktrace library supporting C++11.

Backward  is a beautiful stack trace pretty printer for C++.

29/107

Execution

Debugging

Execution Debugging (gdb) 1/2

How to compile and run for debugging:

g++ -O0 -g [-g3] <program.cpp> -o program

gdb [--args] ./program <args...>

-O0 Disable any code optimization to help the debugger. It is the default for most compilers

-g Enable debugging

- stores the symbol table information in the executable (mapping between assembly and source code lines)

- for some compilers, it may disable certain optimizations

- slows down the compilation phase and the execution

-g3 Produces enhanced debugging information, e.g. macro definitions. Available for most compilers. Suggested instead of -g

30/107

Execution Debugging (gdb) 2/2

Additional ﬂags:

-ggdb3 Generate speciﬁc debugging information for gdb.

Equivalent to -g3 with gcc

-fno-omit-frame-pointer Do not remove information that can be used to

reconstruct the call stack

-fasynchronous-unwind-tables Allow precise stack unwinding

How to build highly-debuggable C++ binaries

31/107

gdb - Breakpoints

Command Abbr. Description

break <ﬁle>:<line> b Insert a breakpoint in a speciﬁc line

break <function_name> b Insert a breakpoint in a speciﬁc function

break <func/line> if <condition> b Insert a breakpoint with a conditional statement

delete d Delete all breakpoints or watchpoints

delete <breakpoint_number> d Delete a speciﬁc breakpoint

clear [function_name/line_number] Delete a speciﬁc breakpoint

enable/disable <breakpoint_number> Enable/Disable a speciﬁc breakpoint

info breakpoints info b List all active breakpoints

32/107

gdb - Watchpoints / Catchpoints

Command Abbr. Description

watch <expression> Stop execution when the value of expression changes (variable, comparison, etc.)

rwatch <variable/location> Stop execution when variable/location is read

delete <watchpoint_number> d Delete a speciﬁc watchpoint

info watchpoints List all active watchpoints

catch throw Stop execution when an exception is thrown

33/107

gdb - Control Flow

Command Abbr. Description

run [args] r Run the program

continue c Continue the execution

finish fin Continue until the end of the current function

step s Execute next line of code (follow function calls)

next n Execute next line of code (step over function calls)

until <program_point> Continue until a line number, function name, address, etc. is reached

CTRL+C Stop the execution (not quit)

quit q Exit

help [<command>] h Show help about command

34/107

gdb - Stack and Info

Command Abbr. Description

list l Print code

list <function or #start,#end> l Print function/range code

up Move up in the call stack

down do Move down in the call stack

backtrace [full] bt Prints stack backtrace (call stack) [local vars]

info args Print current function arguments

info locals Print local variables

info variables Print all variables

info <breakpoints/watchpoints/registers>

Show information about program

breakpoints/watchpoints/registers

35/107

gdb - Print

Command Abbr. Description

print <variable> p Print variable

print/x <variable> p/x Print variable in hex

print/t <variable> p/t Print variable in binary

print/w <address> p/w Print address in binary

p /s <char array/address> Print char array

p *array_var@n Print n array elements

p (int[4])<address> Print four elements of type int

p *(char**)&<std::string> Print std::string
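A small practice target can help when trying the commands from the tables above. The sketch below is hypothetical: the file name `factorial.cpp` and the assumption that a `main()` calling `factorial(5)` exists in the same file are made only for illustration.

```cpp
// Hypothetical practice target. Build: g++ -O0 -g3 factorial.cpp -o factorial
// (a main() that calls factorial(5) is assumed to exist in the same file).
//
// Example session using commands from the tables above:
//   (gdb) break factorial if n == 1   # conditional breakpoint
//   (gdb) run
//   (gdb) backtrace                   # chain of recursive calls
//   (gdb) print n                     # value of n in the current frame
//   (gdb) finish                      # run until the current call returns
//   (gdb) continue

long factorial(int n) {
    if (n <= 1)
        return 1;  // base case: the conditional breakpoint stops here
    return n * factorial(n - 1);
}
```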

36/107

gdb - Disassemble

Command Description

disassemble <function_name> Disassemble a speciﬁed function

disassemble <0xStart,0xEnd addr> Disassemble function range

nexti [<count>] Execute next machine instruction (step over function calls)

stepi [<count>] Execute next machine instruction (follow function calls)

x/nfu <address>

Examine address

n number of elements,

f format (d: int, f: ﬂoat, etc.),

u data size (b: byte, w: word, etc.)

37/107

std::breakpoint

C++26 provides the <debugging> library, which allows interaction with a debugger

directly from the source code, without relying on platform-speciﬁc intrinsic instructions

• breakpoint() attempts to temporarily halt the execution of the program and

transfer control to the debugger. The behavior is implementation-deﬁned

• breakpoint_if_debugging() halts the execution if a debugger is detected

• is_debugger_present() returns true if the program is executed under a

debugger, false otherwise

38/107

gdb - Notes

The debugger automatically stops on:

• a breakpoint (set through the debugger)

• an assertion failure

• a segmentation fault

• a software breakpoint trigger (e.g. SIGTRAP on Linux)

github.com/scottt/debugbreak

Full story: www.yolinux.com/TUTORIALS/GDB-Commands.html (it also contains a script for dereferencing STL containers)

gdb reference card V5 link

39/107

Memory Debugging

Memory Vulnerabilities 1/3

“70% of all the vulnerabilities in Microsoft products are memory safety

issues"

Matt Miller, Microsoft Security Engineer

“Chrome: 70% of all security bugs are memory safety issues"

Chromium Security Report

“you can expect at least 65% of your security vulnerabilities to be

caused by memory unsafety"

What science can tell us about C and C++’s security

Microsoft: 70% of all security bugs are memory safety issues

Chrome: 70% of all security bugs are memory safety issues

What science can tell us about C and C++’s security

40/107

Memory Vulnerabilities 2/3

“Memory Unsafety in Apple’s OS represents 66.3%- 88.2% of all the

vulnerabilities"

“Out of bounds (OOB) reads/writes comprise ∼70% of all the vulnerabilities in Android"

Jeff Vander, Google, Android Media Team

“Memory corruption issues are the root-cause of 68% of listed CVEs"

Ben Hawkes, Google, Project Zero

Memory Unsafety in Apple’s Operating Systems

Google Security Blog: Queue the Hardening Enhancements

Google Project Zero

41/107

Memory Vulnerabilities 3/3

Terms like buﬀer overﬂow, race condition, page fault, null pointer, stack exhaustion,

heap exhaustion/corruption, use-after-free, or double free – all describe memory

safety vulnerabilities

Mitigation:

• Run-time check

• Static analysis

• Avoid unsafe language constructs
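As an illustration of the run-time check mitigation, the sketch below replaces unchecked indexing with `std::vector::at()`, which throws `std::out_of_range` on an invalid index. The wrapper name `read_checked` is hypothetical.

```cpp
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// Unchecked: values[index] with index >= values.size() is undefined behavior.
// Checked: std::vector::at() throws std::out_of_range instead.
std::optional<int> read_checked(const std::vector<int>& values,
                                std::size_t index) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return std::nullopt;  // the out-of-bounds access was detected
    }
}
```

The check turns a silent memory-safety violation into a well-defined, observable failure that the caller can handle or report.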

42/107

valgrind 1/9

valgrind  is a tool suite to automatically detect many

memory management and threading bugs

How to install the latest version:

$ wget ftp://sourceware.org/pub/valgrind/valgrind-3.26.tar.bz2

$ tar xf valgrind-3.26.tar.bz2

$ cd valgrind-3.26

$ ./configure --enable-lto

$ make -j 12

$ sudo make install

$ sudo apt install libc6-dbg #if needed

Some Linux distributions provide the package through apt install valgrind , but it could be an old version

43/107

valgrind 2/9

Basic usage:

• compile with -g

• $ valgrind ./program <args...>

Output example 1:

==60127== Invalid read of size 4 !!out-of-bound access

==60127== at 0x100000D9E: f(int) (main.cpp:86)

==60127== by 0x100000C22: main (main.cpp:40)

==60127== Address 0x10042c148 is 0 bytes after a block of size 40 alloc’d

==60127== at 0x1000161EF: malloc (vg_replace_malloc.c:236)

==60127== by 0x100000C88: f(int) (main.cpp:75)

==60127== by 0x100000C22: main (main.cpp:40)

44/107

valgrind 3/9

Output example 2:

!!memory leak

==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1

==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)

==19182== by 0x8048385: f (main.cpp:5)

==19182== by 0x80483AB: main (main.cpp:11)

==60127== HEAP SUMMARY:

==60127== in use at exit: 4,184 bytes in 2 blocks

==60127== total heap usage: 3 allocs, 1 frees, 4,224 bytes allocated

==60127==

==60127== LEAK SUMMARY:

==60127== definitely lost: 128 bytes in 1 blocks !!memory leak

==60127== indirectly lost: 0 bytes in 0 blocks

==60127== possibly lost: 0 bytes in 0 blocks

==60127== still reachable: 4,184 bytes in 2 blocks !!not deallocated

==60127== suppressed: 0 bytes in 0 blocks

45/107

valgrind 4/9

Memory leaks are divided into four categories:

• Deﬁnitely lost

• Indirectly lost

• Still reachable

• Possibly lost

When a program terminates, it releases all heap memory allocations. Despite this,

leaving memory leaks is considered a bad practice and makes the program unsafe with

respect to multiple internal iterations of a functionality. If a program has memory leaks

for a single iteration, is it safe for multiple iterations?

A robust program prevents any memory leak even when abnormal conditions occur
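One common way to stay leak-free under abnormal conditions is RAII: ownership is tied to an object whose destructor releases the memory, also during stack unwinding. A minimal sketch, assuming C++17; `Tracked`, `process`, and `live_after_failure` are hypothetical names used to make the effect observable.

```cpp
#include <memory>
#include <stdexcept>

// Hypothetical resource that counts how many instances are alive.
struct Tracked {
    static inline int live = 0;  // C++17 inline variable
    Tracked()  { ++live; }
    ~Tracked() { --live; }
};

// Fails half-way through; the unique_ptr still releases the allocation
// during stack unwinding.
void process(bool fail) {
    auto resource = std::make_unique<Tracked>();
    if (fail)
        throw std::runtime_error("abnormal condition");
}

// Returns the number of live objects after a failed call (0 = no leak).
int live_after_failure() {
    try {
        process(true);
    } catch (const std::runtime_error&) {
        // the error is handled here; the resource is already released
    }
    return Tracked::live;
}
```

With a raw `new` in `process`, the same exception would skip the matching `delete` and leave the block definitely lost.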

46/107

valgrind 5/9

Deﬁnitely lost indicates blocks that are not deleted at the end of the program (return

from the main() function). The common case is local variables pointing to newly

allocated heap memory

void f() {

int* y = new int[3]; // 12 bytes definitely lost

}

int main() {

int* x = new int[10]; // 40 bytes definitely lost

f();

}

47/107

valgrind 6/9

Indirectly lost indicates blocks that are reachable only through other heap blocks that are themselves lost.

The common case is global variables pointing to newly allocated heap memory

struct A {

int* array;

};

int main() {

A* x = new A; // 8 bytes definitely lost

x->array = new int[4]; // 16 bytes indirectly lost

}

48/107

valgrind 7/9

Still reachable indicates blocks that are not deleted but they are still reachable at the

end of the program

int* array;

int main() {

array = new int[3];

}

// 12 bytes still reachable (global static class could delete it)

# include <cstdlib>

int main() {

int* array = new int[3];

std::abort(); // early abnormal termination

// 12 bytes still reachable

... // maybe it is deleted here

}

49/107

valgrind 8/9

Possibly lost indicates blocks that are still reachable but pointer arithmetic makes the

deletion more complex, or even not possible

# include <cstdlib>

int main() {

int* array = new int[3];

array++; // pointer arithmetic

std::abort(); // early abnormal termination

// 12 bytes possibly lost

... // maybe it is deleted here, but the pointer arithmetic

// must be reverted first

}

50/107

valgrind 9/9

Advanced ﬂags:

• --leak-check=full print details for each "definitely lost" or "possibly lost" block, including where it was allocated

• --show-leak-kinds=all to combine with --leak-check=full. Print all leak kinds

• --track-fds=yes list open file descriptors on exit (not closed)

• --track-origins=yes tracks the origin of uninitialized values (very slow execution)

valgrind --leak-check=full --show-leak-kinds=all --track-fds=yes --track-origins=yes ./program <args...>

Track stack usage:

valgrind --tool=drd --show-stack-usage=yes ./program <args...>

51/107

Hardening

Techniques

Overview and References

Hardening techniques are compiler and linker options that enhance the security and

reliability of applications by mitigating vulnerabilities such as memory safety issues,

undeﬁned behavior, and exploitation risks

• Compiler Options Hardening Guide for C and C++ [March, 2024]

• Hardened mode of standard library implementations

52/107

Compile-time Stack Usage

• -Wstack-usage=<byte-size> Warn if the stack usage of a function might

exceed byte-size. The computation done to determine the stack usage is

conservative (no VLA)

• -fstack-usage Makes the compiler output stack usage information for the

program, on a per-function basis

• -Wvla Warn if a variable-length array is used in the code

• -Wvla-larger-than=<byte-size> Warn for declarations of variable-length

arrays whose size is either unbounded, or bounded by an argument that allows the

array size to exceed byte-size bytes

Use compiler flags for stack protection in GCC and Clang

53/107

Compile-time Stack Protection

• -Wtrampolines Check whether the compiler generates trampolines for pointers

to nested functions which may interfere with stack virtual memory protection

• -Wl,-z,noexecstack Enable data execution prevention by marking stack

memory as non-executable

54/107

Run-time Stack Usage

• -fstack-clash-protection Enables run-time checks for variable-size stack

allocation validity

• -fstack-protector-strong Enables run-time checks for stack-based buﬀer

overﬂows using strong heuristic

• -fstack-protector-all Enables run-time checks for stack-based buﬀer

overﬂows for all functions

55/107

Standard C Library Hardening 1/2

Hardening the standard C library (libc) enables checks for buffer overflows in fundamental C functions

_FORTIFY_SOURCE macro: enables buffer overflow checks for the following functions:

memcpy , mempcpy , memmove , memset , strcpy , stpcpy , strncpy , strcat , strncat , sprintf , vsprintf , snprintf , vsnprintf , gets .

Recent compilers (e.g. GCC 12+, Clang 9+) can detect buffer overflows with enhanced coverage, e.g. dynamic pointers, with _FORTIFY_SOURCE=3 *

*GCC’s new fortification level: The gains and costs

56/107

Standard C Library Hardening 2/2

# include <cstring> // std::memset

# include <string> // std::stoi

int main(int argc, char** argv) {

int size = std::stoi(argv[1]);

char buffer[24];

std::memset(buffer, 0xFF, size);

}

$ g++ -O1 -D_FORTIFY_SOURCE program.cpp -o program

$ ./program 12 # OK

$ ./program 32 # Wrong

*** buffer overflow detected ***: ./program terminated

57/107

Standard C++ Library Hardening 1/2

The standard C++ library provides run-time precondition checks for library calls, such

as bounds-checks for strings ( std::string , std::string_view ) and containers

( std::vector , std::span , std::optional , etc.), null-pointer checks, etc.

“Adoption of standard C++ library hardening led to a 30% reduction in segmentation fault rate in production code at Google at the cost of only 0.3% performance slowdown. This technique would prevent 1,000 to 2,000 new bugs yearly"

Retrofitting spatial safety to hundreds of millions of lines of C++

58/107

Standard C++ Library Hardening 2/2

GCC _GLIBCXX_ASSERTIONS (libstdc++)

LLVM _LIBCPP_HARDENING_MODE=<Hardening Level> (libc++)

MSVC _MSVC_STL_HARDENING (Microsoft STL)

<Hardening Level> :

_LIBCPP_HARDENING_MODE_FAST Lightweight security-critical checks

_LIBCPP_HARDENING_MODE_EXTENSIVE Non-security-critical lightweight checks

_LIBCPP_HARDENING_MODE_DEBUG Enables all the available checks in the library

C++26 requires the option to enable C++ standard library hardening P3471 

Libc++ Hardening Modes

59/107

Safe Buﬀers

Clang can be used to harden C++ code against buﬀer overﬂows. The technique

enforces the usage of standard containers and views instead of raw pointers

std::array , std::vector , std::string , std::span , std::string_view

Compiler ﬂags:

• -Wunsafe-buffer-usage : emit a warning for any operation applied to a raw

pointer: array indexing, pointer arithmetic, bounds-unsafe standard C functions

such as std::memcpy() , smart pointer operations

• -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_FAST : enforce

bounds-safe containers and views

C++ Safe Buffers

60/107

Undefined Behavior Protections 1/2

• -fno-strict-overflow Prevent code optimizations (e.g. code elimination) that rely on signed integer overflow being undefined behavior

• -fwrapv Signed integer overflow has a well-defined wrap-around behavior, as for unsigned integers

• -ftrapv Terminate the program if signed integer overflow occurs. Also useful for debugging

• -fno-strict-aliasing Strict aliasing allows the compiler to assume that objects of different, incompatible types never share the same memory address; accessing an object through an incompatible type is undefined behavior. The flag disables this assumption
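Independently of these flags, signed overflow can also be avoided in the source code by checking the operands before the operation. A minimal sketch, where `checked_add` is a hypothetical helper:

```cpp
#include <limits>
#include <optional>

// Returns a + b, or std::nullopt if the sum does not fit in an int.
// The check is done before the addition, so no signed overflow occurs.
std::optional<int> checked_add(int a, int b) {
    constexpr int max = std::numeric_limits<int>::max();
    constexpr int min = std::numeric_limits<int>::min();
    if (b > 0 && a > max - b) return std::nullopt;  // would overflow
    if (b < 0 && a < min - b) return std::nullopt;  // would underflow
    return a + b;
}
```

This approach keeps the behavior well-defined on every compiler, at the cost of explicit checks at each call site.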

61/107

Undefined Behavior Protections 2/2

• -fno-delete-null-pointer-checks NULL pointer dereferencing is undefined behavior and the compiler can assume that it never happens. The flag disables this optimization

• -ftrivial-auto-var-init=<zero|pattern> Ensures that automatic variables without an initializer are initialized with zeros or with a fixed byte pattern. Variables that must stay uninitialized require an explicit attribute, e.g. [[clang::uninitialized]] in Clang

62/107

Control Flow Protections

• -fcf-protection=full Enable control ﬂow protection to counter Return

Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks

on many x86 architectures

• -mbranch-protection=standard Enable branch protection to counter Return

Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks

on AArch64

63/107

Other Run-time Checks

• -fPIE -pie Position-Independent Executable enables the support for address

space layout randomization, which makes exploits more diﬃcult

• -Wl,-z,relro,-z,now Prevents modiﬁcation of the Global Oﬀset Table

(locations of functions from dynamically linked libraries) after the program startup

• -Wl,-z,nodlopen Restrict dlopen(3) calls to shared objects

64/107

Sanitizers

Address Sanitizer

Sanitizers are compiler-based instrumentation components to perform dynamic

analysis

Sanitizers are used during development and testing to discover and diagnose memory

misuse bugs and potentially dangerous undeﬁned behavior

Sanitizers are implemented in Clang (from 3.1), gcc (from 4.8), MSVC, and Xcode

Examples of projects using Sanitizers: Chromium, Firefox, Linux kernel, Android

Memory error checking in C and C++: Comparing Sanitizers and Valgrind

65/107

Address Sanitizer

Address Sanitizer  is a memory error detector

• heap/stack/global out-of-bounds

• memory leaks

• use-after-free, use-after-return, use-after-scope

• double-free, invalid free

• initialization order bugs

* Similar to valgrind but faster (about 2X slowdown, compared with up to 50X for valgrind)

clang++ -O1 -g -fsanitize=address -fno-omit-frame-pointer <program>

-O1 disable inlining

-g generate symbol table

• github.com/google/sanitizers/wiki/AddressSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html

• MSVC AddressSanitizer

66/107

Leak Sanitizer

LeakSanitizer  is a run-time memory leak detector

• integrated into AddressSanitizer, can be used as standalone tool

* almost no performance overhead until the very end of the process

clang++ -O1 -g -fsanitize=leak -fno-omit-frame-pointer <program>

• github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html

67/107

Memory Sanitizers

Memory Sanitizer  is a detector of uninitialized reads

• stack/heap-allocated memory read before it is written

* Similar to valgrind but faster (3X slowdown)

clang++ -O1 -g -fsanitize=memory -fno-omit-frame-pointer <program>

-fsanitize-memory-track-origins=2

track origins of uninitialized values

Note: not compatible with Address Sanitizer

• github.com/google/sanitizers/wiki/MemorySanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html

68/107

Undefined Behavior Sanitizer

UndefinedBehaviorSanitizer  is an undeﬁned behavior detector

• signed integer overﬂow, ﬂoating-point types overﬂow, enumerated not in range

• out-of-bounds array indexing, misaligned address

• divide by zero

• etc.

* Not included in valgrind

clang++ -O1 -g -fsanitize=undefined -fno-omit-frame-pointer <program>

gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html

69/107

Undefined Behavior Sanitizer

-fsanitize=<options> :

undefined All of the checks other than float-divide-by-zero,

unsigned-integer-overflow, implicit-conversion,

local-bounds and the nullability-* group of checks

float-divide-by-zero Undeﬁned behavior in C++, but deﬁned by Clang and IEEE-754

integer Checks for undeﬁned or suspicious integer behavior (e.g. unsigned

integer overﬂow)

implicit-conversion Checks for suspicious behavior of implicit conversions

local-bounds Out of bounds array indexing, in cases where the array bound can be

statically determined

nullability Checks passing null as a function parameter, assigning null to an

lvalue, and returning null from a function

70/107

Type Sanitizer

Type Sanitizer is a strict aliasing rule violation detector.

Violations of the strict aliasing rule can lead to incorrect optimizations and hard-to-find bugs. clang/gcc provide the flag -fno-strict-aliasing to disable the optimizations that rely on the strict aliasing rule, sacrificing performance

int x = 100;

float& y = reinterpret_cast<float&>(x);

y += 2.0f; // strict aliasing violation

clang++ -O1 -g -fsanitize=type -fno-omit-frame-pointer <program>
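A violation like the one above can usually be avoided by copying the bytes instead of accessing the object through a reference of an unrelated type. The sketch below uses `std::memcpy`; the function name `bits_to_float` is hypothetical, and the example assumes a 32-bit IEEE-754 `float`.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes 32-bit float");

// Reinterprets the bits of an integer as a float without violating the
// strict aliasing rule: the bytes are copied instead of being accessed
// through a reference of an unrelated type.
float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}
```

Compilers typically optimize the `std::memcpy` call away, so the well-defined version does not have to be slower than the cast.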

71/107

Sampling-Based Sanitizer

GWPSan  is a framework to implement low-overhead sampling-based dynamic binary

instrumentation, designed for detecting various bugs where more expensive dynamic

analysis would otherwise not be feasible

• tsan (thread-sanitizer) data races

• uar use-after-return bugs

• lmsan Uninitialized variables

clang++ -fexperimental-sanitize-metadata=atomics,uar <program>

72/107

Sanitizers vs. Valgrind

Valgrind - A neglected tool from the shadows or a serious debugging tool?

73/107

Debugging Summary

How to Debug Common Errors

Segmentation fault

• gdb, valgrind, sanitizers

• Segmentation fault right after entering a function → likely stack overflow

Double free or corruption

• gdb, valgrind, sanitizers

Inﬁnite execution

• gdb + (CTRL + C)

Incorrect results

• valgrind + assertion + gdb + sanitizers

74/107

Compiler Warnings

Compiler Warnings - GCC and Clang

Enable speciﬁc warnings:

g++ -W<warning> <args...>

Disable speciﬁc warnings:

g++ -Wno-<warning> <args...>

Common warning ﬂags to minimize accidental mismatches:

-Wall Enables many standard warnings (∼50 warnings)

-Wextra Enables some extra warning ﬂags that are not enabled by -Wall (∼15 warnings)

-Wpedantic Issue all the warnings demanded by strict ISO C/C++

-Werror Treat warnings as errors

Enable ALL warnings, only clang: -Weverything
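
The sketch below shows two kinds of code that these flags typically diagnose, written in their warning-free form, with the problematic variant described in the comments. The function names are hypothetical, and the exact flag that reports each case can differ between compilers and versions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Counts the elements strictly smaller than `limit`.
// Writing the loop as `for (int i = 0; i < values.size(); ++i)` compares a
// signed with an unsigned integer, which GCC typically reports through
// -Wsign-compare; using std::size_t for the index avoids the mismatch.
std::size_t count_below(const std::vector<int>& values, int limit) {
    std::size_t count = 0;
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] < limit) {
            ++count;
        }
    }
    return count;
}

// An unused parameter is typically reported through -Wunused-parameter.
// [[maybe_unused]] (C++17) documents that ignoring it is intentional.
int scale(int value, [[maybe_unused]] int reserved) {
    return value * 2;
}

void warnings_example() {
    const std::vector<int> values{1, 5, 10};
    assert(count_below(values, 6) == 2);
    assert(scale(21, 0) == 42);
}
```

Fixing the code, rather than disabling the warning with -Wno-<warning>, keeps the diagnostic available for future changes.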

75/107

Compiler Warnings - MSVC

Enable speciﬁc warnings:

cl.exe /w<level><warning_id> <args...>

Disable speciﬁc warnings:

cl.exe /wd<warning_id> <args...>

Common warning ﬂags to minimize accidental mismatches:

/W1 Severe warnings

/W2 Signiﬁcant warnings

/W3 Production quality warnings

/W4 Informational warnings

/Wall All warnings

/WX Treat warnings as errors

76/107

The Impact of Compiler Warnings on Code Quality 2/2

low value → higher code quality

The Impact of Compiler Warnings on Code Quality in C++ Projects

77/107

Static Analysis

Overview

Static analysis is the process of source code examination to ﬁnd potential issues

Beneﬁts of static code analysis:

• Problem identiﬁcation before the execution

• Analyze the program outside the execution environment

• The analysis is independent of the run-time tests

• Enforce code quality and compliance by ensuring that the code follows speciﬁc

rules and standards

• Identify security vulnerabilities

78/107

Compiler-Provided Static Analyzers

The Clang Static Analyzer  (LLVM suite) ﬁnds bugs by reasoning

about the semantics of code (may produce false positives)

scan-build make

The GCC Static Analyzer  can diagnose various kinds of problems

in C/C++ code at compile-time (e.g. double-free, use-after-free, stdio

related, etc.) by adding the -fanalyzer ﬂag

The MSVC Static Analyzer performs code analysis at compile-time (e.g. double-free, use-after-free, stdio related, etc.) by adding the /analyze flag
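
To make the defect classes above concrete, the sketch below describes a leak on an early return and a use-after-free in a comment, followed by a version in which ownership is handled by std::unique_ptr. The function first_plus_last is hypothetical, and whether a given analyzer reports this exact pattern depends on the tool and its version.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>

// Defect pattern (shown only as a comment, not compiled):
//     int* buffer = new int[n];
//     if (n < 2) return -1;        // early return: buffer is leaked
//     ...
//     delete[] buffer;
//     return buffer[0];            // use-after-free
//
// Hypothetical fix: std::unique_ptr owns the buffer, so it is released
// exactly once on every path and cannot be read after being freed.
int first_plus_last(std::size_t n) {
    if (n < 2) {
        return -1;                              // nothing allocated yet
    }
    auto buffer = std::make_unique<int[]>(n);   // elements start at 0
    for (std::size_t i = 0; i < n; ++i) {
        buffer[i] = static_cast<int>(i);
    }
    return buffer[0] + buffer[n - 1];           // buffer still alive here
}

void analyzer_example() {
    assert(first_plus_last(1) == -1);
    assert(first_plus_last(5) == 4);
}
```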

79/107

Open-Source Static Analyzers 1/2

cppcheck provides code analysis to detect bugs, undefined behavior, and dangerous coding constructs. The goal is to detect only real errors in the code (i.e. have very few false positives)

cppcheck --enable=warning,performance,style,portability,information,error <file>

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

cppcheck --enable=<enable_flags> --project=compile_commands.json

Debian source code test case:

• Buﬀer overﬂows → ∼ 1 900

• Uninitialized variables → ∼ 16 000

• Null pointer dereference → ∼ 8 000

• Number of “error” → 94 275

Building Cppcheck - What We Learned from 17 Years of Development, Daniel Marjamäki

80/107

Open-Source Static Analyzers 2/2

FBInfer is a static analysis tool (also available online) that checks for null pointer dereferences, memory leaks, coding conventions, unavailable APIs, etc.

NASA IKOS (Inference Kernel for Open Static Analyzers) is a static analyzer for C/C++ based on the theory of Abstract Interpretation.

81/107

Proprietary Static Analyzers 1/2

PVS-Studio  is a high-quality proprietary (free for open source projects)

static code analyzer supporting C, C++

Customers: IBM, Intel, Adobe, Microsoft, Nvidia, Bosch, IdGames, Epic Games, etc.

SonarSource  is a static analyzer which inspects source code for bugs,

code smells, and security vulnerabilities for multiple languages (C++, Java,

etc.)

SonarLint plugin is available for Visual Studio, Visual Studio Code, Eclipse, and IntelliJ IDEA

Customers: Amazon AWS, Facebook/Oculus, Instagram, WhatsApp, Mozilla, Spotify, Uber, Sky, etc.

82/107

Proprietary Static Analyzers 2/2

deepCode  is an AI-powered code review system, with machine learning

systems trained on billions of lines of code from open-source projects

Available for Visual Studio Code, Sublime, IntelliJ IDEA, and Atom

see also: A curated list of static analysis tools

83/107

Static Analysis Tools Eﬀectiveness

Evaluation over a dataset which comprises 319 real-world vulnerabilities from 815

vulnerability-contributing commits (VCCs) in 92 C and C++ projects.

An Empirical Study of Static Analysis Tools for Secure Code Review

84/107

Code Testing

see Case Study 4: The $440 Million Software Error at Knight Capital

from: Kat Maddox (on Twitter)

85/107

Code Testing

“A QA engineer walks into a bar.

Orders a beer, 0 beers, 99999999999 beers, a lizard, -1 beers and ueicbksjdhd.

The first real customer walks in and asks where the bathroom is. The bar bursts into flames.”

86/107

Code Testing

Unit Test: A unit is the smallest piece of code that can be logically isolated in a system. Unit testing refers to the verification of a unit. It assumes full knowledge of the code under test (white-box testing)

Goals: meet specifications/requirements, fast development/debugging

Functional Test: Validates the output rather than the internal structure (black-box testing)

Goals: performance, regression (same functionalities of previous version), stability, security (e.g. sanitizers), composability (e.g. integration test)

87/107

Unit Testing 1/3

Unit testing involves breaking your program into pieces, and subjecting each piece to

a series of tests

Unit testing should observe the following key features:

• Isolation: Each unit test should be independent and avoid external interference

from other parts of the code

• Automation: No user interaction; easy to run and manage

• Small Scope: Unit tests focus on small portions of code or speciﬁc

functionalities, making it easier to identify bugs

Popular C++ Unit testing frameworks:

Catch2, doctest, Google Test, CppUnit, Boost.Test

88/107

Unit Testing 2/3

89/107

Unit Testing 3/3

JetBrains C++ Developer Ecosystem 2022

90/107

Test-Driven Development (TDD)

Unit testing is often associated with the Test-Driven Development (TDD)

methodology. The practice involves the deﬁnition of automated functional tests before

implementing the functionality

The process consists of the following steps:

1. Write a test for a new functionality

2. Write the minimal code to pass the test

3. Improve/Refactor the code iterating with the test veriﬁcation

4. Go to 1.
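
The cycle above can be sketched with plain assertions, without any test framework. The function clamp_percent is a hypothetical example chosen only for illustration: its interface and test are written first, and the implementation follows.

```cpp
#include <cassert>

// Interface agreed on first (hypothetical function, used only for illustration).
int clamp_percent(int value);

// Step 1: the test is written before the implementation exists
// and encodes the expected behavior.
void test_clamp_percent() {
    assert(clamp_percent(-5) == 0);     // below the range
    assert(clamp_percent(42) == 42);    // inside the range
    assert(clamp_percent(250) == 100);  // above the range
}

// Step 2: the minimal code that makes the test pass.
int clamp_percent(int value) {
    if (value < 0) return 0;
    if (value > 100) return 100;
    return value;
}
```

Step 3 would then refactor clamp_percent (for example in terms of std::clamp) while re-running test_clamp_percent after each change.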

91/107

Test-Driven Development (TDD) - Main advantages

• Software design. Strong focus on interface deﬁnition, expected behavior,

speciﬁcations, and requirements before working at lower level

• Maintainability/Debugging Cost. Small, incremental changes allow you to catch bugs as they are introduced. Later refactoring or the introduction of new features still relies on well-defined tests

• Understandable behavior. New users can learn how the system works and its properties from the tests

• Increase conﬁdence. Developers are more conﬁdent that their code will work as

intended because it has been extensively tested

• Faster development. Incremental changes, high conﬁdence, and automation

make it easy to move through diﬀerent functionalities or enhance existing ones

92/107

catch 1/2

Catch2  is a multi-paradigm test framework for C++

Catch2 features

• Header only and no external dependencies (Catch2 v2.x; v3 is built as a compiled library)

• Assertion macro

• Floating point tolerance comparisons

Basic usage:

• Create the test program

• Run the test

$ ./test_program [<TestName>]

• github.com/catchorg/Catch2

• The Little Things: Testing with Catch2

93/107

catch 2/2

#define CATCH_CONFIG_MAIN // This tells Catch to provide a main(); only do this in one cpp file
#include "catch.hpp"

unsigned Factorial(unsigned number) {
    return number <= 1 ? 1 : Factorial(number - 1) * number; // 0! == 1! == 1
}

// TEST_CASE arguments: test description and tag name
TEST_CASE( "Factorials are computed", "[Factorial]" ) {
    REQUIRE( Factorial(0) == 1 );
    REQUIRE( Factorial(1) == 1 );
    REQUIRE( Factorial(2) == 2 );
    REQUIRE( Factorial(3) == 6 );
    REQUIRE( Factorial(10) == 3628800 );
}

float floatComputation() { ... }

TEST_CASE( "floatCmp computed", "[floatComputation]" ) {
    REQUIRE( floatComputation() == Approx( 2.1 ) ); // comparison with tolerance
}

94/107

Code Coverage 1/3

Code coverage is a measure used to describe the degree to which the source code of

a program is executed when a particular execution/test suite runs

gcov and llvm-profdata/llvm-cov are tools used in conjunction with compiler

instrumentation (gcc, clang) to interpret and visualize the raw code coverage

generated during the execution

gcovr and lcov are utilities for managing gcov/llvm-cov at a higher level and generating code coverage results

Steps for code coverage:

• Compile with the --coverage flag (objects + linking)

• Run the program / test

• Visualize the results with gcovr, llvm-cov, lcov

95/107

Code Coverage 2/3

program.cpp:

#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    int value = std::stoi(argv[1]);
    if (value % 3 == 0)
        std::cout << "first\n";
    if (value % 2 == 0)
        std::cout << "second\n";
}

$ g++ -g --coverage program.cpp -o program

$ ./program 9

first

$ gcovr -r <program_path> --html --html-details -o coverage.html # generate html

# or

$ lcov --capture --directory <program_path> --output-file coverage.info

$ genhtml coverage.info --output-directory <program_path> # generate html

96/107

Code Coverage 3/3

        1:    4:int main(int argc, char* argv[]) {
        1:    5:    int value = std::stoi(argv[1]);
        1:    6:    if (value % 3 == 0)
        1:    7:        std::cout << "first\n";
        1:    8:    if (value % 2 == 0)
    #####:    9:        std::cout << "second\n";
        4:   10:}
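
The report shows that the "second" line was never executed with the input 9. One way to close such a gap, sketched below under the assumption that the logic can be moved into a testable function, is to add inputs that reach the missing branch. The names describe and test_describe are hypothetical.

```cpp
#include <cassert>
#include <string>

// Same branching logic as program.cpp, returning the text instead of
// printing it so that tests can check the result.
std::string describe(int value) {
    std::string out;
    if (value % 3 == 0) {
        out += "first\n";
    }
    if (value % 2 == 0) {
        out += "second\n";
    }
    return out;
}

// Inputs chosen so that every line of describe() is executed at least once.
void test_describe() {
    assert(describe(9) == "first\n");          // only the first branch
    assert(describe(6) == "first\nsecond\n");  // both branches
    assert(describe(7).empty());               // neither branch
}
```

Full line coverage only shows that each line ran at least once; it does not by itself prove that the results are correct for all inputs.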

97/107

Coverage-Guided Fuzz Testing

A fuzzer is a specialized tool that tracks which areas of the code are reached, and

generates mutations on the corpus of input data in order to maximize the code

coverage

libFuzzer is a fuzzing library provided by LLVM that feeds fuzzed inputs to the code under test via a specific fuzzing entry point

The fuzz target function accepts an array of bytes and does something interesting with these

bytes using the API under test:

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data,

size_t Size) {

DoSomethingInterestingWithMyAPI(Data, Size);

return 0;

}
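
A slightly more concrete fuzz target could look like the sketch below. The function parse_length_prefixed is a hypothetical API under test, invented for this example; the target checks an invariant with assert so that a violation aborts the process and is recorded by the fuzzer as a crash.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical API under test: the first byte declares the payload length.
// Returns the payload length, or 0 if the input is empty or truncated.
std::size_t parse_length_prefixed(const std::uint8_t* data, std::size_t size) {
    if (size == 0) {
        return 0;
    }
    const std::size_t declared = data[0];
    if (declared > size - 1) {
        return 0;  // truncated input: reject instead of reading out of bounds
    }
    return declared;
}

// Fuzzing entry point: called repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* data,
                                      std::size_t size) {
    const std::size_t payload = parse_length_prefixed(data, size);
    assert(payload <= size);  // invariant: never report more bytes than given
    return 0;
}
```

With clang, such a target is usually built with -fsanitize=fuzzer, often combined with other sanitizers (e.g. -fsanitize=fuzzer,address); libFuzzer then supplies main().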

98/107

Code Quality

Linters - clang-tidy 1/2

lint: the term derives from the name of the undesirable bits of fiber and fluff shed by clothing

clang-tidy  provides an extensible framework for diagnosing and ﬁxing typical

programming errors, like style violations, interface misuse, or bugs that can be deduced

via static analysis

$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

$ clang-tidy -p . <file>

clang-tidy searches for the .clang-tidy configuration file in the closest parent directory of the input file

clang-tidy is included in the LLVM suite

99/107

Linters - clang-tidy 2/2

Coding Guidelines:

• CERT Secure Coding Guidelines

• C++ Core Guidelines

• High Integrity C++ Coding Standard

Supported Code Conventions:

• Fuchsia

• Google

• LLVM

Bug Related:

• Android related

• Boost library related

• Misc

• Modernize

• Performance

• Readability

• clang-analyzer checks

• bugprone code constructs

.clang-tidy

Checks: 'android-*,boost-*,bugprone-*,cert-*,cppcoreguidelines-*,

clang-analyzer-*,fuchsia-*,google-*,hicpp-*,llvm-*,misc-*,modernize-*,

performance-*,readability-*'
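
As an illustration of what the modernize-* group targets, the sketch below shows code in the form that checks such as modernize-loop-convert and modernize-use-nullptr aim for, with the older form described in comments. The functions sum and find_first_negative are hypothetical examples.

```cpp
#include <cassert>
#include <vector>

// Older form (comment only):
//     for (std::size_t i = 0; i < values.size(); ++i)
//         total += values[i];
// modernize-loop-convert suggests a range-based for loop instead.
int sum(const std::vector<int>& values) {
    int total = 0;
    for (int value : values) {
        total += value;
    }
    return total;
}

// Older form (comment only): `return NULL;` or `return 0;`
// modernize-use-nullptr suggests nullptr for null pointer constants.
const int* find_first_negative(const std::vector<int>& values) {
    for (const int& value : values) {
        if (value < 0) {
            return &value;  // points into the caller's vector
        }
    }
    return nullptr;
}

void modernize_example() {
    const std::vector<int> values{3, -1, 4};
    assert(sum(values) == 6);
    assert(*find_first_negative(values) == -1);
}
```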

100/107

Code Complexity

Cyclomatic Complexity 1/2

Cyclomatic Complexity (CCN) is a software metric used to indicate the complexity of

a program. It is a quantitative measure of the number of linearly independent paths

through a program source code

It was originally intended “to identify software modules that will be diﬃcult to test or

maintain”

Thomas J. McCabe, "A Complexity Measure", IEEE Transactions on Software

Engineering, 1976

101/107

Cyclomatic Complexity 2/2

CCN = 3
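
A value of 3 can also be obtained on a small function by counting the decision points and adding one. The function sign below is a hypothetical example and is not necessarily the program depicted on the slide.

```cpp
#include <cassert>

// Two decision points (the two `if` statements) plus one: CCN = 3.
// The three linearly independent paths correspond to the three results.
int sign(int x) {
    if (x < 0) {   // decision 1
        return -1;
    }
    if (x > 0) {   // decision 2
        return 1;
    }
    return 0;
}

// One test per independent path.
void test_sign() {
    assert(sign(-7) == -1);
    assert(sign(7) == 1);
    assert(sign(0) == 0);
}
```

The CCN is a lower bound on the number of test cases needed to exercise every independent path, which is why it is used as an indicator of testing effort.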

102/107

Cyclomatic Complexity Analyzer - lizard

CC Risk Evaluation

1-10 a simple program, without much risk

11-20 more complex, moderate risk

21-50 complex, high risk

> 50 untestable program, very high risk

CC Guidelines

1-5 The routine is probably ﬁne

6-10 Start to think about ways to simplify the routine

> 10 Break part of the routine into a separate routine

blog.feabhas.com/2018/07/code-quality-cyclomatic-complexity

103/107

Cyclomatic Complexity Analyzer - lizard

Lizard  is an extensible Cyclomatic Complexity Analyzer for many programming

languages including C/C++

> lizard my_project/

==============================================================
  NLOC    CCN   token   param   function@line@file
--------------------------------------------------------------
    10      2      29       2   start_new_player@26@./html_game.c
     6      1       3       0   set_shutdown_flag@449@./httpd.c
    24      3      61       1   server_main@454@./httpd.c
--------------------------------------------------------------

• CCN: cyclomatic complexity (should not exceed a threshold)

• NLOC: lines of code without comments

• token: Number of tokens in the function

• param: Parameter count of functions

Default CCN warning threshold: Lizard: 15, OCLint: 10

104/107

Cognitive Complexity 1/3

Cognitive complexity has been introduced to address the weak points of cyclomatic

complexity

• Cyclomatic complexity has been formulated in a Fortran environment. It doesn’t

include modern language structures like try/catch, and lambdas

• It doesn’t take into account the complexity of a class as a whole

Cognitive complexity has been formulated to address modern language structures,

and to produce values that are meaningful at the class and application levels. More

importantly, it aims to measure the cognitive eﬀort required to understand the program

ﬂows

Cognitive Complexity - A new way of measuring understandability

105/107

Cognitive Complexity 2/3

Cyclomatic complexity issues:

// Java example (labeled continue has no direct C++ equivalent)
int sumOfPrimes(int max) {                  // +1
    int total = 0;
    OUT: for (int i = 1; i <= max; ++i) {   // +1
        for (int j = 2; j < i; ++j) {       // +1
            if (i % j == 0) {               // +1
                continue OUT;
            }
        }
        total += i;
    }
    return total;
}                                           // Cyclomatic Complexity 4

String getWords(int number) {               // +1
    switch (number) {
        case 1:                             // +1
            return "one";
        case 2:                             // +1
            return "a couple";
        case 3:                             // +1
            return "a few";
        default:
            return "lots";
    }
}                                           // Cyclomatic Complexity 4

106/107

Cognitive Complexity 3/3

// Java example (labeled continue has no direct C++ equivalent)
int sumOfPrimes(int max) {
    int total = 0;
    OUT: for (int i = 1; i <= max; ++i) {   // +1
        for (int j = 2; j < i; ++j) {       // +2
            if (i % j == 0) {               // +3
                continue OUT;               // +1
            }
        }
        total += i;
    }
    return total;
}                                           // Cognitive Complexity: 7

String getWords(int number) {
    switch (number) {                       // +1
        case 1:
            return "one";
        case 2:
            return "a couple";
        case 3:
            return "a few";
        default:
            return "lots";
    }
}                                           // Cognitive Complexity: 1

Tools: clang-tidy , SonarSource 
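
C++ has no labeled continue, so a port of sumOfPrimes typically extracts the inner loop into a helper, which also reduces nesting. The sketch below is one possible port, not the version from the cited paper; the helper has_divisor is hypothetical, and the annotated increments are an estimate based on the nesting rule illustrated above. As in the original, the value 1 is included in the sum.

```cpp
#include <cassert>

// Hypothetical helper extracted from the inner loop:
// true if any j in [2, i) divides i.
bool has_divisor(int i) {
    for (int j = 2; j < i; ++j) {   // +1
        if (i % j == 0) {           // +2 (nested one level)
            return true;
        }
    }
    return false;
}                                   // estimated Cognitive Complexity: 3

// Same result as the labeled-continue version: adds every i in [1, max]
// that has no divisor in [2, i).
int sumOfPrimes(int max) {
    int total = 0;
    for (int i = 1; i <= max; ++i) {  // +1
        if (!has_divisor(i)) {        // +2 (nested one level)
            total += i;
        }
    }
    return total;
}                                     // estimated Cognitive Complexity: 3

void test_sum_of_primes() {
    assert(sumOfPrimes(10) == 18);    // 1 + 2 + 3 + 5 + 7
}
```

Each function stays below the score of the original single function because no structure is nested more than one level deep.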

107/107