Modern C++ Programming

14. Debugging and Testing

Federico Busato

2024-03-29

Table of Contents

1 Debugging Overview
2 Assertions
3 Execution Debugging
    Breakpoints
    Watchpoints / Catchpoints
    Control Flow
    Stack and Info
    Disassemble
4 Memory Debugging
    valgrind
5 Hardening Techniques
    Stack Usage
    Standard Library Checks
    Undefined Behavior Protections
    Control Flow Protections
6 Sanitizers
    Address Sanitizer
    Leak Sanitizer
    Memory Sanitizers
    Undefined Behavior Sanitizer
7 Debugging Summary
8 Compiler Warnings
9 Static Analysis
10 Code Testing
    Unit Testing
    Test-Driven Development (TDD)
    Code Coverage
    Fuzz Testing
11 Code Quality
    clang-tidy

Feature Complete


Debugging Overview

Is this a bug?

for (int i = 0; i <= (2^32) - 1; i++) {
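A minimal sketch of why the loop header above deserves a second look, assuming the intent was "2 to the power of 32": in C++ the `^` operator is bitwise XOR, so the bound is 33. The names `xor_bound`, `max_u32`, and `count_inclusive` are hypothetical and only serve the illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// '^' is bitwise XOR in C++, not exponentiation: 2 ^ 32 == 34
constexpr int xor_bound() { return (2 ^ 32) - 1; }
static_assert(xor_bound() == 33, "the loop bound is 33, not 4294967295");

// The real 2^32 - 1 is the largest 32-bit unsigned value
constexpr std::uint64_t max_u32 = std::numeric_limits<std::uint32_t>::max();
static_assert(max_u32 == 4294967295ULL, "2^32 - 1");

// Hypothetical helper: visits every value in [0, last] with a counter that is
// wider than the bound, so that the condition "i <= last" can become false
constexpr std::uint64_t count_inclusive(std::uint32_t last) {
    std::uint64_t iterations = 0;
    for (std::uint64_t i = 0; i <= last; ++i) {
        ++iterations;
    }
    return iterations;
}
```

Even with the intended bound, an `int` counter could never exceed it, which is why the sketch uses a 64-bit counter.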

“Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects”

from: John Regehr (on Twitter)

The Debugging Mindset


Errors, Defects, and Failures

• An error is a human mistake. Errors lead to software defects

• A defect is an unexpected behavior of the software (correctness, performance, etc.). Defects potentially lead to software failures

• A failure is an observable incorrect behavior


Cost of Software Defects 1/2


Cost of Software Defects 2/2

Some examples:

• The Millennium Bug (2000): $100 billion

• The Morris Worm (1988): $10 million (single student)

• Ariane 5 (1996): $370 million

• Knight’s unintended trades (2012): $440 million

• Bitcoin exchange error (2011): $1.5 million

• Pentium FDIV Bug (1994): $475 million

• Boeing 737 MAX (2019): $3.9 million

see also:

11 of the most costly software errors in history

Historical Software Accidents and Errors

List of software bugs


Types of Software Defects

Ordered by fix complexity (time to fix):

(1) Typos, Syntax, Formatting (seconds)

(2) Compilation Warnings/Errors (seconds, minutes)

(3) Logic, Arithmetic, Runtime Errors (minutes, hours, days)

(4) Resource Errors (minutes, hours, days)

(5) Accuracy Errors (hours, days)

(6) Performance Errors (days)

(7) Design Errors (weeks, months)


Causes of Bugs

• C++ is a very error-prone language, see 60 terrible tips for a C++ developer

• Human behavior, e.g. copying and pasting code is a very common practice and can introduce subtle bugs → check the code carefully and make sure you deeply understand its behavior


Program Errors

A program error is a set of conditions that produce an incorrect result or unexpected

behavior, including performance regression, memory consumption, early termination,

etc.

We can distinguish between two kinds of errors:

Recoverable    Conditions that are not under the control of the program. They indicate “exceptional” run-time conditions, e.g. file not found, bad allocation, wrong user input, etc.

Unrecoverable  A synonym for a bug. It indicates a problem in the program logic. The program must terminate and be modified, e.g. out-of-bounds access, division by zero, etc.

A recoverable error should be considered unrecoverable if it is extremely rare and difficult to handle, e.g. bad allocation due to an out-of-memory error
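The distinction can be sketched in code, assuming C++17. Wrong user input is a recoverable condition and is reported to the caller, while an out-of-bounds index is a bug and is guarded by an assertion. The function names `parse_int` and `element_at` are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <string>
#include <vector>

// Recoverable: the input comes from outside the program, so the failure is
// reported to the caller, who decides how to continue
std::optional<int> parse_int(const std::string& text) {
    try {
        std::size_t consumed = 0;
        const int value = std::stoi(text, &consumed);
        if (consumed != text.size()) {
            return std::nullopt;  // trailing characters, e.g. "12x"
        }
        return value;
    } catch (const std::invalid_argument&) {
        return std::nullopt;      // not a number
    } catch (const std::out_of_range&) {
        return std::nullopt;      // does not fit in an int
    }
}

// Unrecoverable: an invalid index is a defect in the calling code, so it is
// checked with an assertion instead of being handled
int element_at(const std::vector<int>& data, std::size_t index) {
    assert(index < data.size());
    return data[index];
}
```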


Dealing with Software Defects

Software defects can be identified by:

Dynamic Analysis  A mitigation strategy that acts on the runtime state of a program.
Techniques: print statements, run-time debugging, sanitizers, fuzzing, unit tests, performance regression tests
Limitations: infeasible to cover all program states

Static Analysis  A proactive strategy that examines the source code for (potential) errors.
Techniques: warnings, static analysis tools, compile-time checks
Limitations: Turing’s undecidability theorem, exponential number of code paths

How programmers make sure that their software is correct


Assertions

Unrecoverable Errors and Assertions

Unrecoverable errors cannot be handled. They should be prevented by using assertions to ensure pre-conditions and post-conditions

An assertion is a statement to detect a violated assumption. An assertion represents an invariant in the code

It can be checked both at run-time ( assert ) and at compile-time ( static_assert ).

Run-time assertion failures should never be exposed in the normal program execution (e.g. release/public)


Assertion

#include <cassert>     // <-- needed for "assert"
#include <cmath>       // std::isfinite
#include <type_traits> // std::is_arithmetic_v

template<typename T>
T sqrt(T value) {
    static_assert(std::is_arithmetic_v<T>,       // precondition
                  "T must be an arithmetic type");
    assert(std::isfinite(value) && value >= 0);  // precondition
    T ret = ... // sqrt computation
    assert(std::isfinite(ret) && ret >= 0 &&     // postcondition
           (value < 1 || ret <= value));         // sqrt(x) <= x only holds for x >= 1
    return ret;
}

Assertions may slow down the execution. They can be disabled by defining the NDEBUG macro before including <cassert>

#define NDEBUG // or with the flag "-DNDEBUG"
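Because the whole assert expression disappears when NDEBUG is defined, an assertion should only check a condition and never perform work. A minimal sketch, with the hypothetical helper `pop_last`:

```cpp
#include <cassert>
#include <vector>

// Hypothetical helper: removes and returns the last element
int pop_last(std::vector<int>& values) {
    assert(!values.empty());  // precondition check only, no side effects
    const int last = values.back();
    values.pop_back();        // the side effect stays outside the assertion
    return last;
}

// Wrong pattern: in "assert(try_pop(values));" the call to try_pop() would not
// be executed at all when the code is compiled with -DNDEBUG
```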


Execution Debugging

Execution Debugging (gdb)

How to compile and run for debugging:

g++ -O0 -g [-g3] <program.cpp> -o program

gdb [--args] ./program <args...>

-O0 Disable any code optimization to help the debugger. It is the default for most compilers

-g Enable debugging
    - stores the symbol table information in the executable (mapping between assembly and source code lines)
    - for some compilers, it may disable certain optimizations
    - slows down the compilation phase and the execution

-g3 Produces enhanced debugging information, e.g. macro definitions. Available for most compilers. Suggested instead of -g


gdb - Breakpoints

Command                              Abbr.    Description
break <file>:<line>                  b        insert a breakpoint at a specific line
break <function name>                b        insert a breakpoint at a specific function
break <ref> if <condition>           b        insert a breakpoint with a conditional statement
delete                               d        delete all breakpoints or watchpoints
delete <breakpoint number>           d        delete a specific breakpoint
clear [function name/line number]             delete the breakpoints at a specific function/line
enable/disable <breakpoint number>            enable/disable a specific breakpoint
info breakpoints                     info b   list all active breakpoints


gdb - Watchpoints / Catchpoints

Command                       Abbr.   Description
watch <expression>                    stop execution when the value of expression changes (variable, comparison, etc.)
rwatch <variable/location>            stop execution when variable/location is read
delete <watchpoint number>    d       delete a specific watchpoint
info watchpoints                      list all active watchpoints
catch throw                           stop execution when an exception is thrown
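A small program fragment to practice the commands above. The function `sum_non_negative` is hypothetical; the gdb commands in the comments are the ones listed in the tables.

```cpp
#include <cassert>
#include <stdexcept>
#include <vector>

// (gdb) break sum_non_negative   stop when the function is entered
// (gdb) watch total              stop every time "total" changes
// (gdb) catch throw              stop when the exception below is thrown
int sum_non_negative(const std::vector<int>& values) {
    int total = 0;
    for (int value : values) {
        if (value < 0) {
            throw std::invalid_argument("negative value");
        }
        total += value;
    }
    return total;
}
```

The watchpoint on `total` can only be set once execution is inside the function, because the variable is local.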


gdb - Control Flow

Command                  Abbr.   Description
run [args]               r       run the program
continue                 c       continue the execution
finish                   fin     continue until the end of the current function
step                     s       execute the next line of code (follow function calls)
next                     n       execute the next line of code (step over function calls)
until <program point>            continue until reaching a line number, function name, address, etc.
CTRL+C                           stop the execution (not quit)
quit                     q       exit
help [<command>]         h       show help about command


gdb - Stack and Info

Command                                    Abbr.   Description
list                                       l       print code
list <function or #start,#end>             l       print function/range code
up                                                 move up in the call stack
down                                               move down in the call stack
backtrace [full]                           bt      print the stack backtrace (call stack) [local vars]
info args                                          print current function arguments
info locals                                        print local variables
info variables                                     print all global and static variables
info <breakpoints/watchpoints/registers>           show information about program breakpoints/watchpoints/registers


gdb - Print

Command                      Abbr.   Description
print <variable>             p       print variable
print/x <variable>           p/x     print variable in hex
x/<n>tb &<variable>                  print variable in binary (n bytes)
x/tw <address>                       print the word at address in binary
p/s <char array/address>             print char array
p *array_var@n                       print n array elements
p (int[4])<address>                  print four elements of type int
p *(char**)&<std::string>            print std::string


gdb - Disassemble

Command                        Description
disassemble <function name>    disassemble a specified function
disassemble <start>,<end>      disassemble an address range
nexti [count]                  execute the next machine instruction (step over function calls)
stepi [count]                  execute the next machine instruction (follow function calls)
x/nfu <address>                examine address
                               n number of elements,
                               f format (d: int, f: float, etc.),
                               u data size (b: byte, w: word, etc.)


gdb - Notes

The debugger automatically stops when:

• a breakpoint is reached (set by using the debugger)

• an assertion fails

• a segmentation fault occurs

• a software breakpoint is triggered (e.g. SIGTRAP on Linux)

github.com/scottt/debugbreak

Full story: www.yolinux.com/TUTORIALS/GDB-Commands.html (it also contains a script to dereference STL Containers)

gdb reference card V5 link
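A software breakpoint can be sketched with the POSIX signal SIGTRAP, assuming a Linux-like system. The helpers `debug_break_if` and `checked_divide` are hypothetical and are not the API of the debugbreak project linked above.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical helper: raises SIGTRAP when the condition holds.
// Under gdb the program stops at this point; without a debugger the default
// action of SIGTRAP terminates the process
inline void debug_break_if(bool condition) {
    if (condition) {
        std::raise(SIGTRAP);
    }
}

int checked_divide(int numerator, int denominator) {
    debug_break_if(denominator == 0);  // stop in the debugger before the error
    assert(denominator != 0);
    return numerator / denominator;
}
```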


Memory Debugging

Memory Vulnerabilities 1/3

“70% of all the vulnerabilities in Microsoft products are memory safety

issues”

Matt Miller, Microsoft Security Engineer

“Chrome: 70% of all security bugs are memory safety issues”

Chromium Security Report

“you can expect at least 65% of your security vulnerabilities to be

caused by memory unsafety”

What science can tell us about C and C++’s security

Microsoft: 70% of all security bugs are memory safety issues

Chrome: 70% of all security bugs are memory safety issues

What science can tell us about C and C++’s security


Memory Vulnerabilities 2/3

“Memory Unsafety in Apple’s OS represents 66.3%-88.2% of all the vulnerabilities”

“Out of bounds (OOB) reads/writes comprise ∼70% of all the vulnerabilities in Android”

Jeff Vander, Google, Android Media Team

“Memory corruption issues are the root-cause of 68% of listed CVEs”

Ben Hawkes, Google, Project Zero

Memory Unsafety in Apple’s Operating Systems

Google Security Blog: Queue the Hardening Enhancements

Google Project Zero


Memory Vulnerabilities 3/3

Terms like buffer overflow, race condition, page fault, null pointer, stack exhaustion, heap exhaustion/corruption, use-after-free, or double free all describe memory safety vulnerabilities

Mitigation:

• Run-time check

• Static analysis

• Avoid unsafe language constructs


valgrind 1/9

valgrind  is a tool suite to automatically detect many

memory management and threading bugs

How to install the latest version:

$ wget ftp://sourceware.org/pub/valgrind/valgrind-3.21.tar.bz2

$ tar xf valgrind-3.21.tar.bz2

$ cd valgrind-3.21

$ ./configure --enable-lto

$ make -j 12

$ sudo make install

$ sudo apt install libc6-dbg # if needed

Some Linux distributions provide the package through apt install valgrind , but it could be an old version


valgrind 2/9

Basic usage:

• compile with -g

• $ valgrind ./program <args...>

Output example 1:

==60127== Invalid read of size 4 !!out-of-bound access

==60127== at 0x100000D9E: f(int) (main.cpp:86)

==60127== by 0x100000C22: main (main.cpp:40)

==60127== Address 0x10042c148 is 0 bytes after a block of size 40 alloc'd

==60127== at 0x1000161EF: malloc (vg_replace_malloc.c:236)

==60127== by 0x100000C88: f(int) (main.cpp:75)

==60127== by 0x100000C22: main (main.cpp:40)


valgrind 3/9

Output example 2:

!!memory leak

==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1

==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)

==19182== by 0x8048385: f (main.cpp:5)

==19182== by 0x80483AB: main (main.cpp:11)

==60127== HEAP SUMMARY:

==60127== in use at exit: 4,184 bytes in 2 blocks

==60127== total heap usage: 3 allocs, 1 frees, 4,224 bytes allocated

==60127==

==60127== LEAK SUMMARY:

==60127== definitely lost: 128 bytes in 1 blocks !!memory leak

==60127== indirectly lost: 0 bytes in 0 blocks

==60127== possibly lost: 0 bytes in 0 blocks

==60127== still reachable: 4,184 bytes in 2 blocks !!not deallocated

==60127== suppressed: 0 bytes in 0 blocks


valgrind 4/9

Memory leaks are divided into four categories:

• Definitely lost

• Indirectly lost

• Still reachable

• Possibly lost

When a program terminates, it releases all heap memory allocations. Despite this,

leaving memory leaks is considered a bad practice and makes the program unsafe with

respect to multiple internal iterations of a functionality. If a program has memory leaks

for a single iteration, is it safe for multiple iterations?

A robust program prevents any memory leak even when abnormal conditions occur
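One way to stay leak-free under abnormal conditions is to let an owning object release the memory on every exit path. A minimal sketch, assuming C++14 or later; the function `sum_first` and its `fail` parameter are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <stdexcept>

// Hypothetical function that may fail after the allocation
int sum_first(std::size_t count, bool fail) {
    // std::unique_ptr frees the array on every exit path, including a throw
    auto buffer = std::make_unique<int[]>(count);
    for (std::size_t i = 0; i < count; ++i) {
        buffer[i] = static_cast<int>(i);
    }
    if (fail) {
        throw std::runtime_error("abnormal condition");  // no leak here
    }
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        sum += buffer[i];
    }
    return sum;
}
```

With a raw `new int[count]` the early `throw` would skip the matching `delete[]`, and every failing call would leak the buffer.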


valgrind 5/9

Definitely lost indicates blocks that are not deleted at the end of the program (return from the main() function). The common case is local variables pointing to newly allocated heap memory

void f() {
    int* y = new int[3]; // 12 bytes definitely lost
}

int main() {
    int* x = new int[10]; // 40 bytes definitely lost
    f();
}


valgrind 6/9

Indirectly lost indicates blocks that are reachable only through other heap blocks that are themselves lost. The common case is a definitely lost object whose members point to newly allocated heap memory

struct A {
    int* array;
};

int main() {
    A* x = new A;          // 8 bytes definitely lost
    x->array = new int[4]; // 16 bytes indirectly lost
}


valgrind 7/9

Still reachable indicates blocks that are not deleted but are still reachable at the end of the program

int* array;

int main() {
    array = new int[3];
}
// 12 bytes still reachable (global static class could delete it)

#include <cstdlib>

int main() {
    int* array = new int[3];
    std::abort(); // early abnormal termination
                  // 12 bytes still reachable
    ...           // maybe it is deleted here
}


valgrind 8/9

Possibly lost indicates blocks that are still reachable but pointer arithmetic makes the deletion more complex, or even not possible

#include <cstdlib>

int main() {
    int* array = new int[3];
    array++;      // pointer arithmetic
    std::abort(); // early abnormal termination
                  // 12 bytes possibly lost
    ...           // maybe it is deleted here but you should be able
                  // to revert pointer arithmetic
}


valgrind 9/9

Advanced flags:

• --leak-check=full print details for each “definitely lost” or “possibly lost” block, including where it was allocated

• --show-leak-kinds=all to combine with --leak-check=full. Print all leak kinds

• --track-fds=yes list open file descriptors on exit (not closed)

• --track-origins=yes tracks the origin of uninitialized values (very slow execution)

valgrind --leak-check=full --show-leak-kinds=all

--track-fds=yes --track-origins=yes ./program <args...>

Track stack usage:

valgrind --tool=drd --show-stack-usage=yes ./program <args...>


Hardening Techniques

References

• Compiler Options Hardening Guide for C and C++ [March, 2024]

• Hardened mode of standard library implementations


Compile-time Stack Usage

• -Wstack-usage=<byte-size> Warn if the stack usage of a function might

exceed byte-size. The computation done to determine the stack usage is

conservative (no VLA)

• -fstack-usage Makes the compiler output stack usage information for the

program, on a per-function basis

• -Wvla Warn if a variable-length array is used in the code

• -Wvla-larger-than=<byte-size> Warn for declarations of variable-length

arrays whose size is either unbounded, or bounded by an argument that allows the

array size to exceed byte-size bytes

Use compiler flags for stack protection in GCC and Clang
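The stack usage flags above mostly matter for buffers whose size is only known at run-time. A minimal sketch of an alternative with bounded stack usage; the function `sum_sequence` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// A variable-length array such as "long buffer[count];" is a compiler
// extension in C++: it is reported by -Wvla and its stack usage depends on a
// run-time value. std::vector keeps the buffer on the heap instead
long sum_sequence(std::size_t count) {
    std::vector<long> buffer(count);              // heap allocation
    std::iota(buffer.begin(), buffer.end(), 1L);  // 1, 2, ..., count
    return std::accumulate(buffer.begin(), buffer.end(), 0L);
}
```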


Compile-time Stack Protection

• -Wtrampolines Warn when the compiler generates trampolines for pointers to nested functions, which may interfere with stack virtual memory protection

• -Wl,-z,noexecstack Enable data execution prevention by marking stack

memory as non-executable


Run-time Stack Usage

• -fstack-clash-protection Enables run-time checks for variable-size stack allocation validity

• -fstack-protector-strong Enables run-time checks for stack-based buffer overflows using a strong heuristic

• -fstack-protector-all Enables run-time checks for stack-based buffer overflows for all functions


libc Buffer Overflow Checks 1/2

_FORTIFY_SOURCE define: the compiler provides buffer overflow checks for the following functions:

memcpy , mempcpy , memmove , memset , strcpy , stpcpy , strncpy , strcat , strncat , sprintf , vsprintf , snprintf , vsnprintf , gets .

Recent compilers (e.g. GCC 12+, Clang 9+) can detect buffer overflows with enhanced coverage, e.g. dynamic pointers, with _FORTIFY_SOURCE=3 *

*GCC’s new fortification level: The gains and costs


libc Buffer Overflow Checks 2/2

#include <cstring> // std::memset
#include <string>  // std::stoi

int main(int argc, char** argv) {
    int size = std::stoi(argv[1]);
    char buffer[24];
    std::memset(buffer, 0xFF, size);
}

$ g++ -O1 -D_FORTIFY_SOURCE program.cpp -o program
$ ./program 12 # OK
$ ./program 32 # Wrong
*** buffer overflow detected ***: ./program terminated


Standard Library Preconditions

The standard library provides run-time precondition checks for library calls, such as bounds-checks for strings and containers, null-pointer checks, etc.

-D_GLIBCXX_ASSERTIONS for libstdc++ (GCC)

-D_LIBCPP_ASSERT , _LIBCPP_HARDENING_MODE_EXTENSIVE for libc++ (LLVM)
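As an illustration of what such precondition checks protect against: `values[index]` with an invalid index is undefined behavior in a normal build, and a hardened library build is expected to abort instead. A portable sketch that does not depend on those macros uses `at()`, which is always bounds-checked. The helper `get_checked` is hypothetical and assumes C++17.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <stdexcept>
#include <vector>

// at() throws std::out_of_range for an invalid index, independently of the
// hardening macros; operator[] has no such guarantee
std::optional<int> get_checked(const std::vector<int>& values,
                               std::size_t index) {
    try {
        return values.at(index);
    } catch (const std::out_of_range&) {
        return std::nullopt;
    }
}
```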


Undefined Behavior Protections 1/2

• -fno-strict-overflow Prevent code optimization (code elimination) that relies on signed integer overflow being undefined behavior

• -fwrapv Signed integer overflow gets well-defined wrap-around semantics, as for unsigned integers

• -fno-strict-aliasing Strict aliasing lets the compiler assume that pointers to objects of different types never refer to the same memory address; accessing an object through an incompatible type is undefined behavior. The flag disables this assumption
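A sketch of the kind of code affected by the signed overflow flags. The first function relies on undefined behavior and is shown only for illustration; the second is a well-defined alternative that needs no special flag. Both names are hypothetical.

```cpp
#include <cassert>
#include <limits>

// Relies on signed overflow: an optimizing compiler may fold "x + 1 < x" to
// false unless -fwrapv or -fno-strict-overflow is used.
// It must not be called with the maximum int value in a normal build
bool increment_overflows_ub(int x) { return x + 1 < x; }

// Well-defined alternative: compare against the limit before incrementing
bool increment_overflows(int x) {
    return x == std::numeric_limits<int>::max();
}
```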


Undefined Behavior Protections 2/2

• -fno-delete-null-pointer-checks NULL pointer dereferencing is undefined behavior and the compiler can assume that it never happens. The flag disables this optimization

• -ftrivial-auto-var-init=[zero|pattern] Ensures that automatic (local) variables without an initializer are filled with zeros or with a fixed byte pattern. Variables intentionally left uninitialized require the uninitialized attribute, e.g. [[clang::uninitialized]]


Control Flow Protections

• -fcf-protection=full Enable control flow protection to counter Return

Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks

on many x86 architectures

• -mbranch-protection=standard Enable branch protection to counter Return

Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks

on AArch64


Other Run-time Checks

• -fPIE -pie Position-Independent Executable enables the support for address space layout randomization, which makes exploits more difficult.

• -Wl,-z,relro,-z,now Prevents modification of the Global Offset Table (locations of functions from dynamically linked libraries) after the program startup

• -Wl,-z,nodlopen Restrict dlopen(3) calls to shared objects


Sanitizers

Address Sanitizer

Sanitizers are compiler-based instrumentation components to perform dynamic

analysis

Sanitizers are used during development and testing to discover and diagnose memory misuse bugs and potentially dangerous undefined behavior

Sanitizers are implemented in Clang (from 3.1), GCC (from 4.8), and Xcode

Projects using Sanitizers:

• Chromium

• Firefox

• Linux kernel

• Android

Memory error checking in C and C++: Comparing Sanitizers and Valgrind


Address Sanitizer

Address Sanitizer  is a memory error detector

• heap/stack/global out-of-bounds

• memory leaks

• use-after-free, use-after-return, use-after-scope

• double-free, invalid free

• initialization order bugs

* Similar to valgrind but faster (about 2X slowdown, compared to up to 50X for valgrind)

clang++ -O1 -g -fsanitize=address -fno-omit-frame-pointer <program>

-O1 disable inlining

-g generate symbol table

• github.com/google/sanitizers/wiki/AddressSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html
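A sketch of a typical defect in this category. The buggy function is never called here; when a program that calls it is built with -fsanitize=address, the read past the end is expected to be reported as a heap-buffer-overflow. Both function names are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Bug: the condition "<=" reads data[count], one element past the allocation
int sum_buggy(std::size_t count) {
    int* data = new int[count]();   // zero-initialized heap array
    int sum = 0;
    for (std::size_t i = 0; i <= count; ++i) {
        sum += data[i];
    }
    delete[] data;
    return sum;
}

// Fixed version: the index stays in [0, count)
int sum_fixed(std::size_t count) {
    int* data = new int[count]();
    int sum = 0;
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = static_cast<int>(i);
        sum += data[i];
    }
    delete[] data;
    return sum;
}
```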


Leak Sanitizer

LeakSanitizer  is a run-time memory leak detector

• integrated into AddressSanitizer, can be used as standalone tool

* almost no performance overhead until the very end of the process

g++ -O1 -g -fsanitize=address -fno-omit-frame-pointer <program>

clang++ -O1 -g -fsanitize=leak -fno-omit-frame-pointer <program>

• github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Memory Sanitizers

Memory Sanitizer  is a detector of uninitialized reads

• stack/heap-allocated memory read before it is written

* Similar to valgrind but faster (3X slowdown)

clang++ -O1 -g -fsanitize=memory -fno-omit-frame-pointer <program>

-fsanitize-memory-track-origins=2

track origins of uninitialized values

Note: not compatible with Address Sanitizer

• github.com/google/sanitizers/wiki/MemorySanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Undefined Behavior Sanitizer

UndefinedBehaviorSanitizer is an undefined behavior detector

• signed integer overflow, floating-point overflow, enumeration values not in range

• out-of-bounds array indexing, misaligned address

• divide by zero

• etc.

* Not included in valgrind

clang++ -O1 -g -fsanitize=undefined -fno-omit-frame-pointer <program>

gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Undefined Behavior Sanitizer

-fsanitize=<options> :

undefined             All of the checks other than float-divide-by-zero, unsigned-integer-overflow, implicit-conversion, local-bounds and the nullability-* group of checks
float-divide-by-zero  Undefined behavior in C++, but defined by Clang and IEEE-754
integer               Checks for undefined or suspicious integer behavior (e.g. unsigned integer overflow)
implicit-conversion   Checks for suspicious behavior of implicit conversions
local-bounds          Out of bounds array indexing, in cases where the array bound can be statically determined
nullability           Checks passing null as a function parameter, assigning null to an lvalue, and returning null from a function
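Two integer operations in the scope of the undefined group are division by zero and the overflowing division of the minimum int value by -1. A minimal sketch of a wrapper that rejects both before dividing, assuming C++17; the helper `safe_divide` is hypothetical.

```cpp
#include <cassert>
#include <limits>
#include <optional>

// Returns no value for the two inputs that make "numerator / denominator"
// undefined behavior for int
std::optional<int> safe_divide(int numerator, int denominator) {
    if (denominator == 0) {
        return std::nullopt;  // division by zero
    }
    if (numerator == std::numeric_limits<int>::min() && denominator == -1) {
        return std::nullopt;  // the result does not fit in an int
    }
    return numerator / denominator;
}
```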


Sanitizers vs. Valgrind

Valgrind - A neglected tool from the shadows or a serious debugging tool?


Debugging Summary

How to Debug Common Errors

Segmentation fault

• gdb, valgrind, sanitizers

• A segmentation fault right after entering a function → stack overflow

Double free or corruption

• gdb, valgrind, sanitizers

Infinite execution

• gdb + (CTRL + C)

Incorrect results

• valgrind + assertion + gdb + sanitizers


Compiler Warnings

Compiler Warnings - GCC and Clang

Enable specific warnings:

g++ -W<warning> <args...>

Disable specific warnings:

g++ -Wno-<warning> <args...>

Common warning flags to minimize accidental mismatches:

-Wall       Enables many standard warnings (∼50 warnings)
-Wextra     Enables some extra warning flags that are not enabled by -Wall (∼15 warnings)
-Wpedantic  Issue all the warnings demanded by strict ISO C/C++

Enable ALL warnings, only clang: -Weverything
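As an example of an accidental mismatch caught by these flags, GCC reports a comparison between signed and unsigned values with -Wsign-compare, which -Wall enables for C++. A sketch of the warning-free form; the function `count_positive` is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

int count_positive(const std::vector<int>& values) {
    int count = 0;
    // "for (int i = 0; i < values.size(); ++i)" would compare a signed index
    // with an unsigned size and trigger -Wsign-compare
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] > 0) {
            ++count;
        }
    }
    return count;
}
```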


Compiler Warnings - MSVC

Enable specific warnings:

cl.exe /w<level><warning id> <args...>

Disable specific warnings:

cl.exe /wd<warning id> <args...>

Common warning flags to minimize accidental mismatches:

/W1    Severe warnings
/W2    Significant warnings
/W3    Production quality warnings
/W4    Informational warnings
/Wall  All warnings


Static Analysis

Overview

Static analysis is the process of source code examination to find potential issues

Benefits of static code analysis:

• Problem identification before the execution

• Analyze the program outside the execution environment

• The analysis is independent from the run-time tests

• Enforce code quality and compliance by ensuring that the code follows specific rules and standards

• Identify security vulnerabilities


Static Analyzers - Clang and GCC

The Clang Static Analyzer (LLVM suite) finds bugs by reasoning about the semantics of code (may produce false positives)

void test() {
    int i, a[10];
    int x = a[i]; // warning: array subscript is undefined
}

scan-build make

The GCC Static Analyzer can diagnose various kinds of problems in C/C++ code at compile-time (e.g. double-free, use-after-free, stdio related, etc.) by adding the -fanalyzer flag


Static Analyzers - MSVC and cppcheck

The MSVC Static Analyzer enables code analysis and control options (e.g. double-free, use-after-free, stdio related, etc.) by adding the /analyze flag

cppcheck provides code analysis to detect bugs, undefined behavior, and dangerous coding constructs. The goal is to detect only real errors in the code (i.e. have very few false positives)

cppcheck --enable=warning,performance,style,portability,information,error

<src_file/directory>

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

cppcheck --enable=<enable_flags> --project=compile_commands.json


Popular Static Analyzers - PVS-Studio, SonarLint

PVS-Studio is a high-quality proprietary (free for open source projects) static code analyzer supporting C, C++

Customers: IBM, Intel, Adobe, Microsoft, Nvidia, Bosch, IdGames, EpicGames, etc.

SonarSource is a static analyzer which inspects source code for bugs, code smells, and security vulnerabilities for multiple languages (C++, Java, etc.)

SonarLint plugin is available for Visual Studio, Visual Studio Code, Eclipse, and IntelliJ IDEA


Other Static Analyzers - FBInfer, DeepCode

FBInfer is a static analysis tool (also available online) to check for null pointer dereferencing, memory leaks, coding conventions, unavailable APIs, etc.

Customers: Amazon AWS, Facebook/Oculus, Instagram, WhatsApp, Mozilla, Spotify, Uber, Sky, etc.

deepCode is an AI-powered code review system, with machine learning systems trained on billions of lines of code from open-source projects

Available for Visual Studio Code, Sublime, IntelliJ IDEA, and Atom

see also: A curated list of static analysis tools


Code Testing

see Case Study 4: The $440 Million Software Error at Knight Capital

from: Kat Maddox (on Twitter)


Code Testing

Unit Test  A unit is the smallest piece of code that can be logically isolated in a system. Unit test refers to the verification of a unit. It assumes full knowledge of the code under test (white-box testing)

Goals: meet specifications/requirements, fast development/debugging

Functional Test  Output validation instead of the internal structure (black-box testing)

Goals: performance, regression (same functionalities of previous version), stability, security (e.g. sanitizers), composability (e.g. integration test)


Unit Testing 1/3

Unit testing involves breaking your program into pieces, and subjecting each piece to

a series of tests

Unit testing should observe the following key features:

• Isolation: Each unit test should be independent and avoid external interference

from other parts of the code

• Automation: No user interaction, easy to run and manage

• Small Scope: Unit tests focus on small portions of code or specific functionalities, making it easier to identify bugs

Popular C++ Unit testing frameworks:

catch, doctest, Google Test, CppUnit, Boost.Test


Unit Testing 2/3


Unit Testing 3/3

JetBrains C++ Developer Ecosystem 2022


Test-Driven Development (TDD)

Unit testing is often associated with the Test-Driven Development (TDD) methodology. The practice involves the definition of automated functional tests before implementing the functionality

The process consists of the following steps:

1. Write a test for a new functionality

2. Write the minimal code to pass the test

3. Improve/Refactor the code iterating with the test verification

4. Go to 1.
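The cycle can be sketched with plain assertions, without a test framework. The FizzBuzz exercise and the names `fizzbuzz` and `test_fizzbuzz` are hypothetical choices for the illustration.

```cpp
#include <cassert>
#include <string>

// Step 2: minimal implementation, written only after the test below existed
// and failed
std::string fizzbuzz(int number) {
    if (number % 15 == 0) return "FizzBuzz";
    if (number % 3 == 0) return "Fizz";
    if (number % 5 == 0) return "Buzz";
    return std::to_string(number);
}

// Step 1: the test defines the expected behavior before the implementation.
// Step 3: it is run again after every refactoring of fizzbuzz()
void test_fizzbuzz() {
    assert(fizzbuzz(1) == "1");
    assert(fizzbuzz(3) == "Fizz");
    assert(fizzbuzz(5) == "Buzz");
    assert(fizzbuzz(15) == "FizzBuzz");
}
```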


Test-Driven Development (TDD) - Main advantages

• Software design. Strong focus on interface definition, expected behavior, specifications, and requirements before working at a lower level

• Maintainability/Debugging cost. Small, incremental changes allow you to catch bugs as they are introduced. Later refactoring or the introduction of new features still relies on well-defined tests

• Understandable behavior. New users can learn how the system works and its properties from the tests

• Increased confidence. Developers are more confident that their code will work as intended because it has been extensively tested

• Faster development. Incremental changes, high confidence, and automation make it easy to move through different functionalities or enhance existing ones


catch 1/2

Catch2  is a multi-paradigm test framework for C++

Catch2 features

• Header only and no external dependencies

• Assertion macro

• Floating point tolerance comparisons

Basic usage:

• Create the test program

• Run the test

$ ./test_program [<TestName>]

• github.com/catchorg/Catch2

• The Little Things: Testing with Catch2


catch 2/2

#define CATCH_CONFIG_MAIN // This tells Catch to provide a main()
#include "catch.hpp"      // only do this in one cpp file

unsigned Factorial(unsigned number) {
    return number <= 1 ? number : Factorial(number - 1) * number;
}

//         test description           tag name
TEST_CASE( "Factorials are computed", "[Factorial]" ) {
    REQUIRE( Factorial(1) == 1 );
    REQUIRE( Factorial(2) == 2 );
    REQUIRE( Factorial(3) == 6 );
    REQUIRE( Factorial(10) == 3628800 );
}

float floatComputation() { ... }

TEST_CASE( "floatCmp computed", "[floatComputation]" ) {
    REQUIRE( floatComputation() == Approx( 2.1 ) );
}


Code Coverage 1/3

Code coverage is a measure used to describe the degree to which the source code of

a program is executed when a particular execution/test suite runs

gcov and llvm-profdata/llvm-cov are tools used in conjunction with compiler

instrumentation (gcc, clang) to interpret and visualize the raw code coverage

generated during the execution

gcovr and lcov are utilities for managing gcov/llvm-cov at higher level and

generating code coverage results

Steps for code coverage:

• Compile with the --coverage flag (objects + linking)

• Run the program / test

• Visualize the results with gcovr, llvm-cov, lcov


Code Coverage 2/3

program.cpp:

#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    int value = std::stoi(argv[1]);
    if (value % 3 == 0)
        std::cout << "first\n";
    if (value % 2 == 0)
        std::cout << "second\n";
}

$ g++ -g --coverage program.cpp -o program
$ ./program 9
first
$ gcovr -r . --html --html-details -o <path> # generate html
# or
$ lcov --capture --directory . --output-file coverage.info
$ genhtml coverage.info --output-directory <path> # generate html


Code Coverage 3/3

    1:    4:int main(int argc, char* argv[]) {
    1:    5:    int value = std::stoi(argv[1]);
    1:    6:    if (value % 3 == 0)
    1:    7:        std::cout << "first\n";
    1:    8:    if (value % 2 == 0)
#####:    9:        std::cout << "second\n";
    4:   10:}
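The report above shows that the input 9 never executes line 9. A minimal sketch of how to close the gap, assuming the branch logic is moved into a testable function (the name `describe` is hypothetical): inputs 9 and 4 together execute every line, while 6 and 7 exercise the remaining branch combinations.

```cpp
#include <cassert>
#include <string>

// Hypothetical refactoring of the example: same two branches as main(), but
// the output is returned so that a test can check it
std::string describe(int value) {
    std::string output;
    if (value % 3 == 0) {
        output += "first\n";
    }
    if (value % 2 == 0) {
        output += "second\n";
    }
    return output;
}
```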


Coverage-Guided Fuzz Testing

A fuzzer is a specialized tool that tracks which areas of the code are reached, and

generates mutations on the corpus of input data in order to maximize the code

coverage

LibFuzzer is the library provided by LLVM and feeds fuzzed inputs to the library via a specific fuzzing entrypoint

The fuzz target function accepts an array of bytes and does something interesting with these

bytes using the API under test:

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data,
                                      size_t         Size) {
    DoSomethingInterestingWithMyAPI(Data, Size);
    return 0;
}
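A more concrete fuzz target might look as follows. The function `parse_key_length` is a hypothetical API under test; only the entrypoint name and signature come from LibFuzzer.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical API under test: for a "key=value" string it returns the length
// of the key, or -1 when there is no '='
int parse_key_length(const std::string& text) {
    const std::size_t pos = text.find('=');
    return pos == std::string::npos ? -1 : static_cast<int>(pos);
}

extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* Data,
                                      std::size_t         Size) {
    const std::string input(reinterpret_cast<const char*>(Data), Size);
    const int length = parse_key_length(input);
    // Property checked for every generated input: the key fits in the input
    assert(length >= -1 && static_cast<std::size_t>(length + 1) <= Size);
    return 0;
}
```

The target has no main() because LibFuzzer provides it; a typical build command is clang++ -g -fsanitize=fuzzer,address on the file containing the target.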


Code Quality

Linters - clang-tidy 1/2

lint: The term was derived from the name of the undesirable bits of fiber

clang-tidy provides an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis

$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

$ clang-tidy -p .

clang-tidy searches for the configuration file .clang-tidy located in the closest parent directory of the input file

clang-tidy is included in the LLVM suite


Linters - clang-tidy 2/2

Coding Guidelines:

• CERT Secure Coding Guidelines

• C++ Core Guidelines

• High Integrity C++ Coding Standard

Supported Code Conventions:

• Fuchsia

• Google

• LLVM

Bug Related:

• Android related

• Boost library related

• Misc

• Modernize

• Performance

• Readability

• clang-analyzer checks

• bugprone code constructs

.clang-tidy

Checks: 'android-*,boost-*,bugprone-*,cert-*,cppcoreguidelines-*,

clang-analyzer-*,fuchsia-*,google-*,hicpp-*,llvm-*,misc-*,modernize-*,

performance-*,readability-*'
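As an illustration of what the modernize-* group reports, the comments below show two older constructs and the checks expected to flag them, followed by the modernized form. The function `sum_values` is hypothetical.

```cpp
#include <cassert>
#include <vector>

// Before:
//   for (std::vector<int>::const_iterator it = values.begin();
//        it != values.end(); ++it) ...     // modernize-loop-convert
//   const int* offset = NULL;              // modernize-use-nullptr
//
// After: range-based for loop and nullptr
int sum_values(const std::vector<int>& values, const int* offset = nullptr) {
    int sum = (offset != nullptr) ? *offset : 0;
    for (int value : values) {
        sum += value;
    }
    return sum;
}
```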
