Modern C++

Programming

1. Introduction

Federico Busato

2025-04-14

Table of Contents

1 A Little History of C/C++ Programming Language

2 Areas of Application and Popularity

3 C++ Philosophy

4 C++ Weaknesses

C++ Alternatives

Why Switching to a New Language is Hard?

5 The Course

1/55

About Motivation 1/5

“When recruiting research assistants, I look at grades as the last indicator. I find that imagination, ambition, initiative, curiosity, drive, are far better predictors of someone who will do useful work with me. Of course, these characteristics are themselves correlated with high grades, but there is something to be said about a student who decides that a given course is a waste of time and that he works on a side project instead.

Breakthroughs don’t happen in regular scheduled classes, they happen in side projects. We want people who complete the work they were assigned, but we also need people who can reflect critically on what is genuinely important”

Daniel Lemire, Prof. at the University of Quebec

2/55

About Motivation 2/5

Academic excellence is not a strong predictor

of career excellence

“Across industries, research shows that the correlation between grades

and job performance is modest in the ﬁrst year after college and trivial

within a handful of years...

Academic grades rarely assess qualities like creativity, leadership and teamwork skills, or social, emotional and political intelligence. Yes, straight-A students master cramming information and regurgitating it on exams.

But career success is rarely about ﬁnding the right solution to a

problem — it’s more about ﬁnding the right problem to solve...”

3/55

About Motivation 3/5

“Getting straight A’s requires conformity. Having an inﬂuential

career demands originality.

This might explain why Steve Jobs ﬁnished high school with a 2.65

G.P.A., J.K. Rowling graduated from the University of Exeter with

roughly a C average, and the Rev. Dr. Martin Luther King Jr. got only

one A in his four years at Morehouse.

If your goal is to graduate without a blemish on your transcript, you

end up taking easier classes and staying within your comfort zone. If

you’re willing to tolerate the occasional B...You gain experience coping

with failures and setbacks, which builds resilience”

4/55

About Motivation 4/5

“Straight-A students also miss out socially. More time studying in

the library means less time to start lifelong friendships, join new clubs or

volunteer...Looking back, I don’t wish my grades had been higher. If I

could do it over again, I’d study less”

Adam Grant, the New York Times

www.nytimes.com/2018/12/08/opinion/college-gpa-career-success.html

5/55

About Motivation 5/5

“Got a 2.4 GPA my first semester in college. Thought maybe I wasn’t cut out for engineering. Today I’ve landed two spacecraft on Mars, and designing one for the moon.

STEM is hard for everyone. Grades ultimately aren’t what matters.

Curiosity and persistence matter”

Ben Cichy, Chief Software Engineer,

NASA Mars Science Laboratory

twitter.com/bencichy/status/1197752802929364992?s=20

6/55

About Programming 1/2

“And programming computers was so fascinating. You create your

own little universe, and then it does what you tell it to do”

Vint Cerf, TCP/IP co-inventor and Turing Award winner

“Most good programmers do programming not because they expect to

get paid or get adulation by the public, but because it is fun to program”

Linus Torvalds, principal developer of the Linux kernel

“You might not think that programmers are artists, but programming

is an extremely creative profession. It’s logic-based creativity”

John Romero, co-founder of id Software

7/55

About Programming 2/2

Creativity: Programming is extremely creative. It requires the ability to perceive a problem in a novel way and to provide new and original solutions. Creativity allows recognizing and generating alternatives

Form of Art: Art is the expression of human creative skills. Every programmer has their own style. Code and algorithms show elegance and beauty in the same way as painting or music

Learn: Programming gives the opportunity to learn new things every day and to improve one's own skills and knowledge

Challenge: Programming is a challenge. A challenge against yourself, the problem, and the environment

8/55

Knowledge-Experience Relation

9/55

Learning and Thinking

“In software development, learning is not a big part of the job.

It is the job."

Woody Zuill

“Programming is not about typing, it’s about thinking."

Rich Hickey

10/55

A Little History of C/C++ Programming Language

The Assembly Programming Language

A long time ago, in a galaxy far,

far away....there was Assembly

• Extremely simple instructions

• Requires lots of code to do simple tasks

• Can express anything your computer can do

• Hard to read and write

• ...redundant, boring programming, bug proliferation

main:

.Lfunc_begin0:

push rbp

.Lcfi0:

.Lcfi1:

mov rbp, rsp

.Lcfi2:

sub rsp, 16

movabs rdi, .L.str

.Ltmp0:

mov al, 0

call printf

xor ecx, ecx

mov dword ptr [rbp - 4], eax

mov eax, ecx

add rsp, 16

pop rbp

ret

.Ltmp1:

.Lfunc_end0:

.L.str:
    .asciz "Hello World\n"

11/55

A Little History of C 1/3

In 1969, Dennis M. Ritchie and Ken Thompson (AT&T, Bell Labs) worked on developing an operating system for a large computer that could be used by a thousand users. The new operating system was called UNIX

The whole system was still written in assembly code. Besides assembler and Fortran, UNIX also had an interpreter for the programming language B. A high-level language like B made it possible to express a task that took many pages of assembly in just a few lines of code. In this way, code could be produced much faster than in assembly

A drawback of the B language was that it had no data types (everything was expressed in machine words). Another feature that B did not provide was “structures”. These shortcomings led Dennis M. Ritchie to develop the programming language C. The ANSI C standard was ratified in 1989

12/55

A Little History of C 2/3

Dennis M. Ritchie and Ken Thompson

#include <stdio.h>

int main() {
    printf("Hello World\n");
}

13/55

A Little History of C 3/3

Areas of Application:

• UNIX operating system

• Computer games

• Thanks to its power and ease of use, C was used to program the special effects for Star Wars

Star Wars - The Empire Strikes Back

14/55

A Little History of C++ 1/3

The C++ programming language (originally named “C with Classes”) was devised by Bjarne Stroustrup, also an employee of Bell Labs (AT&T). Stroustrup started working on C with Classes in 1979. (++ is the increment operator of the C language)

The first commercial release of the C++ language was in October 1985

15/55

A Little History of C++ 2/3

The roots of C++

“The Evolution of C++: Past, Present, and Future”, B. Stroustrup, CppCon16

16/55

A Little History of C++ 3/3

17/55

About Evolution

“If you’re teaching today what you were teaching ﬁve

years ago, either the ﬁeld is dead or you are”

Noam Chomsky

18/55

Areas of Application and Popularity

Most popular compilers:

• Microsoft Visual C++ (MSVC) is the compiler offered by Microsoft
• The GNU Compiler Collection (GCC) contains the most popular C++ compiler on Linux
• Clang is a C++ compiler based on the LLVM infrastructure, available for Linux/Windows/Apple (default) platforms

Suggested compiler on Linux for beginners: Clang

• Comparable performance with GCC/MSVC and low memory usage
• Expressive diagnostics (with examples and proposed corrections)
• Strict C++ compliance. GCC/MSVC compatibility (the inverse direction is not ensured)
• Includes very useful tools: memory sanitizer, static code analyzer, automatic formatting, linter, etc.

10/23

Install the Compiler on Linux

Install the latest gcc/g++ (v14)

$ sudo add-apt-repository ppa:ubuntu-toolchain-r/test

$ sudo apt update

$ sudo apt install gcc-14 g++-14

$ gcc-14 --version

Install the latest clang/clang++ (v19)

$ wget https://apt.llvm.org/llvm.sh

$ chmod +x llvm.sh

$ sudo ./llvm.sh 19

$ clang++ --version

11/23

Install the Compiler on Windows

Microsoft Visual Studio

• Direct Installer: Visual Studio Community 2022

Clang on Windows

Two ways:

• Windows Subsystem for Linux (WSL)
    • Run → optionalfeatures
    • Select Windows Subsystem for Linux, Hyper-V, Virtual Machine Platform
    • Run → ms-windows-store: → Search and install Ubuntu 24.04 LTS
• Clang + MSVC Build Tools
    • Download Build Tools for Visual Studio
    • Install Desktop development with C++

12/23

What Editor/IDE/Compiler Should I Use? 1/3

Popular C++ IDEs (Integrated Development Environments):

• Microsoft Visual Studio (MSVC) (link). Most popular IDE for Windows
• CLion (link). Free for students. Powerful IDE with a lot of options
• Qt Creator (link). Fast (written in C++), simple
• Xcode. Default on macOS
• Cevelop (Eclipse) (link)

Standalone GUI-based coding editors:

• Microsoft Visual Studio Code (VSCode) (link)
• Sublime (link)
• Lapce (link)
• Zed (link)

13/23

What Editor/IDE/Compiler Should I Use? 2/3

Standalone text-based coding editors (powerful, but they require expertise):

• Vim
• Emacs
• NeoVim (link)
• Helix (link)

Not suggested: Notepad, Gedit, and other similar editors (lack of support for programming)

14/23

What Editor/IDE/Compiler Should I Use? 3/3

StackOverflow Developer Survey 2024

15/23

How to compile?

How to Compile?

Compile C++11, C++14, C++17, C++20, C++23, C++26 programs:

g++ -std=c++11 <program.cpp> -o program

g++ -std=c++14 <program.cpp> -o program

g++ -std=c++<version> <program.cpp> -o program

Any C++ standard is backward compatible*

C++ is also backward compatible with C in most cases, except when the C code uses C++ keywords as identifiers (new, template, class, typename, etc.)

We can potentially compile a pure C program in C++26

*except for very minor deprecated features

16/23

C++ Standard

            C++11           C++14           C++17           C++20
Compiler    Core   Library  Core   Library  Core   Library  Core    Library
g++         4.8.1  5.1      5.1    5.1      7.1    9.0      11      14
clang++     3.3    3.3      3.4    3.5      5.0    11.0     19+     19+
MSVC        19.0   19.0     19.10  19.0     19.15  19.15    19.29+  19.29

C++23 and C++26 support is a work in progress

en.cppreference.com/w/cpp/compiler_support

17/23

Hello World

Hello World 1/2

C code with printf:

#include <stdio.h>

int main() {
    printf("Hello World!\n");
}

printf prints on standard output

C++ code with streams:

#include <iostream>

int main() {
    std::cout << "Hello World!\n";
}

cout represents the standard output stream

18/23

Hello World 2/2

The previous example can be written by bringing the std namespace into the global scope:

#include <iostream>

using namespace std;

int main() {
    cout << "Hello World!\n";
}

Note: For the sake of space and to improve readability, we intentionally omit the std namespace in most slides

19/23

I/O Stream (std::cout) 1/3

std::cout is an example of an output stream. Data is sent to a destination, which in this case is the standard output

#include <stdio.h>

int main() {

int a = 4;

double b = 3.0;

char c[] = "hello";

printf("%d %f %s\n", a, b, c);

}

#include <iostream>

int main() {

int a = 4;

double b = 3.0;

char c[] = "hello";

std::cout << a << " " << b << " " << c << "\n";

}

20/23


I/O Stream (Why should we prefer I/O stream?) 2/3

• Type-safe: The type of object provided to the I/O stream is known statically by the compiler. In contrast, printf uses % fields to figure out the types dynamically

• Less error prone: With I/O Stream, there are no redundant % tokens that have to be consistent with the actual objects passed to the I/O stream. Removing redundancy removes a class of errors

• Extensible: The C++ I/O Stream mechanism allows new user-defined types to be passed to the I/O stream without breaking existing code

• Comparable performance: If used correctly, it may be faster than C I/O (printf, scanf, etc.)
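The "Extensible" point can be illustrated with a minimal sketch. The Point type and the to_text helper below are hypothetical names introduced only for this example; they are not part of the slides.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical user-defined type, used only for illustration.
struct Point {
    int x;
    int y;
};

// Overloading operator<< teaches every output stream how to print a Point.
std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << "(" << p.x << ", " << p.y << ")";
}

// Formats a Point into a string through the same operator.
std::string to_text(const Point& p) {
    std::ostringstream oss;
    oss << p;
    return oss.str();
}
```

With this overload in place, `std::cout << Point{1, 2}` works like any built-in type, and existing stream code does not need to change.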

21/23

I/O Stream (Common C errors) 3/3

• Forget the number of parameters:

    printf("long phrase %d long phrase %d", 3);

• Use the wrong format:

    int a = 3;
    ...many lines of code...
    printf(" %f", a);

• The %c conversion specifier does not automatically skip any leading white space:

    scanf("%d", &var1);
    scanf(" %c", &var2);

22/23

std::print

C++23 introduces std::print, an improved version of the printf function based on format strings that provides all the benefits of C++ streams and is less verbose

#include <print>

int main() {
    std::print("Hello World! {}, {}, {}\n", 3, 4ll, "aa");
    // prints "Hello World! 3, 4, aa"
}

This will be the default way to print when the C++23 standard is widely adopted

23/23

Modern C++

Programming

3. Basic Concepts I

Type System, Fundamental Types, and Operators

Federico Busato

2025-04-14

Table of Contents

1 The C++ Type System

Type Categories

Type Properties

⋆

2 Fundamental Types Overview

Arithmetic Types

Suﬃx and Preﬁx

Non-Standard Arithmetic Types

void Type

nullptr

1/29

Table of Contents

3 Conversion Rules

4 auto Keyword

5 C++ Operators

Operators Precedence

Preﬁx/Postﬁx Increment/Decrement Semantic

Assignment, Compound, and Comma Operators

Spaceship Operator <=>

⋆

Safe Comparison Operators

⋆

2/29

The C++ Type

System

The C++ Type System

C++ is a strongly typed and statically typed language

Every entity has a type and that type never changes

Every variable, function, or expression has a type in order to be compiled. Users can

introduce new types with class or struct

The type speciﬁes:

• The amount of memory allocated for the variable (or expression result)

• The kinds of values that may be stored and how the compiler interprets the bit

patterns in those values

• The operations that are permitted for those entities and provides semantics

3/29

Type Categories

C++ organizes the language types in two main categories:

• Fundamental types (often called primitive types): Types provided by the language itself that don’t require additional headers
    • Arithmetic types: integer and floating point
    • void
    • nullptr C++11
• Compound types: Composition or references to other types
    • Pointers
    • References
    • Enumerations
    • Arrays
    • struct, class, union
    • Functions

4/29

Type Properties

⋆

1/2

C++ types can also be classified based on their properties:

• Objects:
    • size: sizeof is defined
    • alignment requirement: alignof is defined
    • storage duration: describes when an object is allocated and deallocated
    • lifetime, bounded by storage duration or temporary
    • value, potentially indeterminate
    • optionally, a name

  Types: Arithmetic, Pointers and nullptr, Enumerations, Arrays, struct, class, union

5/29

Type Properties

⋆

2/2

• Scalar:

• Hold a single value and is not composed of other objects

• Trivially Copyable: can be copied bit for bit

• Standard Layout: compatible with C functions and structs

• Implicit Lifetime: no user-provided constructor or destructor

Types: Arithmetic, Pointers and nullptr, Enumerations

• Trivial types: Trivial default/copy constructor, copy assignment operator, and

destructor → Trivially Copyable

Types: Scalar, trivial class types, arrays of such types

• Incomplete types: A type that has been declared but not yet deﬁned

Types: void , incompletely-deﬁned object types, e.g. struct A; , array of elements of

incomplete type

6/29

C++ Types Summary

7/29

Fundamental Types

Overview

Arithmetic Types - Integral

Native Type              Bytes   Range                    Fixed width types
bool                     1       true, false
char †                   1       implementation defined
signed char              1       -128 to 127              int8_t
unsigned char            1       0 to 255                 uint8_t
short                    2       -2^15 to 2^15-1          int16_t
unsigned short           2       0 to 2^16-1              uint16_t
int                      4       -2^31 to 2^31-1          int32_t
unsigned int             4       0 to 2^32-1              uint32_t
long int                 4/8 ∗                            int32_t/int64_t
long unsigned int        4/8 ∗                            uint32_t/uint64_t
long long int            8       -2^63 to 2^63-1          int64_t
long long unsigned int   8       0 to 2^64-1              uint64_t

∗ 4 bytes on Windows64 systems, † signed/unsigned, two's complement from C++20

8/29

Arithmetic Types - Floating-Point

Native Type   IEEE   Bytes   Range                                Fixed width types (C++23 <stdfloat>)
(bfloat16)    N      2       ±1.18 × 10^-38 to ±3.4 × 10^+38      std::bfloat16_t
(float16)     Y      2       0.00006 to 65,504                    std::float16_t
float         Y      4       ±1.18 × 10^-38 to ±3.4 × 10^+38      std::float32_t
double        Y      8       ±2.23 × 10^-308 to ±1.8 × 10^+308    std::float64_t

9/29

Arithmetic Types - Short Name

Signed Type              Short name
signed char              /
signed short int         short
signed int               int
signed long int          long
signed long long int     long long

Unsigned Type            Short name
unsigned char            /
unsigned short int       unsigned short
unsigned int             unsigned
unsigned long int        unsigned long
unsigned long long int   unsigned long long

10/29

Arithmetic Types - Suﬃx (Literals)

Type                     Suffix        Example   Notes
int                      /             2
unsigned int             u, U          3u
long int                 l, L          8L
long unsigned            ul, UL        2ul
long long int            ll, LL        4ll
long long unsigned int   ull, ULL      7ULL
float                    f, F          3.0f      only decimal numbers
double                                 3.0       only decimal numbers

C++23 Type               Suffix        Example   Notes
std::bfloat16_t          bf16, BF16    3.0bf16   only decimal numbers
std::float16_t           f16, F16      3.0f16    only decimal numbers
std::float32_t           f32, F32      3.0f32    only decimal numbers
std::float64_t           f64, F64      3.0f64    only decimal numbers
std::float128_t          f128, F128    3.0f128   only decimal numbers

11/29

Arithmetic Types - Preﬁx (Literals)

Representation PREFIX Example

Binary C++14 0b 0b010101

Octal 0 0307

Hexadecimal 0x or 0X 0xFFA010

C++14 also allows digit separators for improving the readability 1'000'000
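The suffix and prefix rules above can be checked at compile time. The following is a small sketch (C++14 or later); the variable names are chosen only for this example.

```cpp
#include <type_traits>

// Prefixes select the base; all four values are plain int.
constexpr int binary_value = 0b010101;   // C++14 binary literal
constexpr int octal_value  = 0307;       // a leading 0 means octal
constexpr int hex_value    = 0xFFA010;   // hexadecimal
constexpr int one_million  = 1'000'000;  // C++14 digit separators

// Suffixes select the type of the literal.
static_assert(std::is_same<decltype(3u), unsigned int>::value, "u -> unsigned int");
static_assert(std::is_same<decltype(8L), long>::value, "L -> long");
static_assert(std::is_same<decltype(7ULL), unsigned long long>::value, "ULL -> unsigned long long");
static_assert(std::is_same<decltype(3.0f), float>::value, "f -> float");
static_assert(std::is_same<decltype(3.0), double>::value, "no suffix -> double");
```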

12/29

Other Arithmetic Types

• C++ also provides long double (no IEEE-754) of size 8/12/16 bytes depending on the implementation

• Reduced-precision floating-point support before C++23:
    - Some compilers provide support for half (16-bit floating-point) (GCC for ARM: __fp16, LLVM compiler: half)
    - Some modern CPUs and GPUs provide half instructions
    - Software support: OpenGL, Photoshop, Lightroom, half.sourceforge.net

• C++ does not provide 128-bit integers even if some architectures support them. clang and gcc allow 128-bit integers as a compiler extension (__int128)

13/29

void Type

void is an incomplete type (not defined) without a value

• void also indicates a function with no return value or no parameters, e.g. void f(), f(void)

• In C, sizeof(void) == 1 (GCC extension), while in C++ sizeof(void) does not compile!!

int main() {

// sizeof(void); // compile error

}

14/29

nullptr Keyword

C++11 introduces the keyword nullptr to represent a null pointer (0x0) and replace the NULL macro

nullptr is an object of type nullptr_t → safer

int* p1 = NULL;    // ok, equal to int* p1 = 0l
int* p2 = nullptr; // ok, nullptr is convertible to a pointer
int  n1 = NULL;    // ok, we are assigning 0 to n1
//int n2 = nullptr;  // compile error, nullptr is not convertible to an integer
//int* p3 = true ? 0 : nullptr; // compile error, incompatible types
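One place where the extra type safety of nullptr shows up is overload resolution. This is a minimal sketch; the which overloads and the two wrapper functions are hypothetical and exist only for this example.

```cpp
#include <string>

// Two hypothetical overloads: one takes an integer, one takes a pointer.
std::string which(int)  { return "int"; }
std::string which(int*) { return "pointer"; }

// The literal 0 is an int, so the integer overload is selected.
std::string call_with_zero() { return which(0); }

// nullptr has type std::nullptr_t and converts only to pointer types.
std::string call_with_nullptr() { return which(nullptr); }
```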

15/29

Conversion Rules

Implicit type conversion rules, applied in order, before any operation:

⊗: any operation (*, +, /, -, %, etc.)

(A) Floating point promotion
    floating_type ⊗ integer_type → floating_type

(B) Implicit integer promotion
    small_integral_type := any signed/unsigned integral type smaller than int
    small_integral_type ⊗ small_integral_type → int

(C) Size promotion
    small_type ⊗ large_type → large_type

(D) Sign promotion
    signed_type ⊗ unsigned_type → unsigned_type

16/29

Examples and Common Errors

float f = 1.0f;

unsigned u = 2;

int i = 3;

short s = 4;

uint8_t c = 5; // unsigned char

f * u; // float × unsigned → float: 2.0f

s * c; // short × unsigned char → int: 20

u * i; // unsigned × int → unsigned: 6u

+c; // unsigned char → int: 5

Integers are not ﬂoating points!

int b = 7;

float a = b / 2; // a = 3 not 3.5!!

int c = b / 2.0; // again c = 3 not 3.5!!
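The result types listed above can be verified with decltype, and the integer-division pitfall can be isolated in two small functions. This is a sketch; the function names are assumptions made for this example.

```cpp
#include <cstdint>
#include <type_traits>

// The result type of each expression follows the implicit conversion rules.
static_assert(std::is_same<decltype(1.0f * 2u), float>::value, "float x unsigned -> float");
static_assert(std::is_same<decltype(short{4} * std::uint8_t{5}), int>::value, "small x small -> int");
static_assert(std::is_same<decltype(2u * 3), unsigned>::value, "unsigned x int -> unsigned");
static_assert(std::is_same<decltype(3 * 4LL), long long>::value, "int x long long -> long long");

// The integer division happens first; only its result is converted to float.
float half_truncated(int b) { return b / 2; }

// Making one operand floating point forces a floating-point division.
float half_exact(int b) { return b / 2.0f; }
```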

17/29

Implicit Promotion

Integral data types smaller than 32-bit are implicitly promoted to int, regardless of whether they are signed or unsigned

• Unary +, -, ~ and binary +, -, &, etc. promotion:

char a = 48; // '0'

cout << a; // print '0'

cout << +a; // print '48'

cout << (a + 0); // print '48'

uint8_t a1 = 255;

uint8_t b1 = 255;

cout << (a1 + b1); // print '510' (no overflow)
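The same promotion can be shown without I/O. In this sketch (function names are hypothetical), the sum is computed in int, and the wraparound appears only when the result is converted back to uint8_t.

```cpp
#include <cstdint>

// Both operands are promoted to int before the addition, so nothing wraps.
int sum_promoted(std::uint8_t a, std::uint8_t b) { return a + b; }

// Converting the int result back to uint8_t keeps only the low 8 bits.
std::uint8_t sum_truncated(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}
```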

18/29

auto Keyword

auto Keyword 1/3

C++11 The auto keyword speciﬁes that the type of the variable will be automatically

deduced by the compiler (from its initializer)

auto a = 1 + 2; // 1 is int, 2 is int, 1 + 2 is int!

// -> 'a' is "int"

auto b = 1 + 2.0; // 1 is int, 2.0 is double. 1 + 2.0 is double

// -> 'b' is "double"

auto can be very useful for maintainability and for hiding complex type deﬁnitions

for (auto i = k; i < size; i++)

...

On the other hand, it may make the code less readable if excessively used because of

type hiding

Example: auto x = 0; in general makes no sense ( x is int )

19/29

auto Keyword - Functions

⋆

2/3

In C++11/C++14, auto (as well as decltype) can be used to define function return types

auto g(int x) -> int { return x * 2; } // C++11
// "-> int" is the trailing return type

// a better way to express it is:

auto g2(int x) -> decltype(x * 2) { return x * 2; } // C++11

auto h(int x) { return x * 2; } // C++14

//--------------------------------------------------------------

int x = g(3); // C++11

20/29

auto Keyword - Functions

⋆

3/3

In C++20, auto can also be used to declare function parameters

void f(auto x) {}
// shorthand for a function template (abbreviated function template)

//--------------------------------------------------------------

f(3); // 'x' is int

f(3.0); // 'x' is double

21/29

C++ Operators

Operators Overview

Precedence  Operator                Description                                   Associativity
1           a++ a--                 Suffix/postfix increment and decrement        Left-to-right
2           +a -a ++a --a ! not ~   Plus/minus, prefix increment/decrement,       Right-to-left
                                    logical/bitwise NOT
3           a*b a/b a%b             Multiplication, division, and remainder       Left-to-right
4           a+b a-b                 Addition and subtraction                      Left-to-right
5           << >>                   Bitwise left shift and right shift            Left-to-right
6           < <= > >=               Relational operators                          Left-to-right
7           == !=                   Equality operators                            Left-to-right
8           &                       Bitwise AND                                   Left-to-right
9           ^                       Bitwise XOR                                   Left-to-right
10          |                       Bitwise OR                                    Left-to-right
11          && and                  Logical AND                                   Left-to-right
12          || or                   Logical OR                                    Left-to-right
13          = += -= *= /= %=        Assignment and compound operators             Right-to-left
            <<= >>= &= ^= |=

22/29

Operators Precedence 1/2

Operators precedence:

• Unary operators have higher precedence than binary operators

• Standard math operators (+, *, etc.) have higher precedence than comparison, bitwise, and logic operators

• Comparison operators have higher precedence than bitwise (&, ^, |) and logic operators

• Bitwise operators have higher precedence than logic operators

• Assignment and compound assignment operators =, +=, -=, *=, /=, %=, ^=, |=, &=, >>=, <<= have lower priority

• The comma operator has the lowest precedence (see next slides)

23/29

Operators Precedence 2/2

Examples:

a + b * 4; // a + (b * 4)

a * b / c % d; // ((a * b) / c) % d

a + b < 3 >> 4; // (a + b) < (3 >> 4)

a && b && c || d; // (a && b && c) || d

a and b and c or d; // (a && b && c) || d

a | b & c || e && d; // (a | (b & c)) || (e && d)

Important: sometimes parentheses can make an expression verbose... but they can help!
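A classic case where parentheses help is testing a bit with &. The sketch below uses hypothetical function names; the second function spells out how the compiler parses `x & 1 == 0`, since == binds tighter than &.

```cpp
// Intended check: is the lowest bit of x clear?
constexpr bool is_even(int x) { return (x & 1) == 0; }

// How `x & 1 == 0` is actually parsed: x & (1 == 0), i.e. x & 0,
// which is false for every x.
constexpr bool is_even_as_parsed(int x) { return x & (1 == 0); }

// Multiplication binds tighter than addition.
static_assert(1 + 2 * 4 == 9, "1 + (2 * 4)");
```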

24/29

Preﬁx/Postﬁx Increment Semantic

Prefix Increment/Decrement ++i , --i

(1) Update the value

(2) Return the new (updated) value

Postfix Increment/Decrement i++ , i--

(1) Save the old value (temporary)

(2) Update the value

(3) Return the old (original) value

Preﬁx/Postﬁx increment/decrement semantic applies not only to built-in types but

also to objects
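For objects, the two semantics map onto two distinct operator overloads. The following is a minimal sketch with a hypothetical Counter type; it mirrors the steps listed above.

```cpp
// Hypothetical counter type, used only for illustration.
struct Counter {
    int value = 0;

    // Prefix: update the value, then return the updated object itself.
    Counter& operator++() {
        ++value;
        return *this;
    }

    // Postfix: save a copy, update the value, then return the old copy.
    Counter operator++(int) {
        Counter old = *this;
        ++value;
        return old;
    }
};

// Value observed by the expression ++c when c starts at `start`.
int prefix_result(int start) {
    Counter c;
    c.value = start;
    return (++c).value;
}

// Value observed by the expression c++ when c starts at `start`.
int postfix_result(int start) {
    Counter c;
    c.value = start;
    return (c++).value;
}
```

The postfix form has to create a temporary copy, which is one reason the prefix form is often preferred when the old value is not needed.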

25/29

Operation Ordering Undeﬁned Behavior

⋆

Expressions with undefined or unspecified behavior:

int i = 0;
i = ++i + 2;     // until C++11: undefined behavior
                 // since C++11: i = 3
i = 0;
i = i++ + 2;     // until C++17: undefined behavior
                 // since C++17: i = 2
f(i = 2, i = 1); // until C++17: undefined behavior
                 // since C++17: unspecified evaluation order, i = 1 or i = 2
i = 0;
a[i] = ++i;      // until C++17: undefined behavior
                 // since C++17: a[1] = 1
f(++i, ++i);     // until C++17: undefined behavior
                 // since C++17: unspecified evaluation order
i = ++i + i++;   // undefined behavior

26/29

Assignment, Compound, and Comma Operators

Assignment and compound assignment operators have right-to-left associativity and their expressions return the assigned value

int y = 2;
int x = y = 3; // y=3, then x=3
               // the same as x = (y = 3)
if (x = 4)     // assign x=4 and evaluate to true

The comma operator ⋆ has left-to-right associativity. It evaluates the left expression, discards its result, and returns the right expression

int a = 5, b = 7;
int x = (3, 4); // discards 3, then x=4
int y = 0;
int z;
z = y, x;       // z=y (0), then returns x (4)

27/29

Spaceship Operator <=>

⋆

C++20 provides the three-way comparison operator <=>, also called the spaceship operator, which allows comparing two objects similarly to strcmp. The operator returns an object that can be directly compared with the literal 0 to test whether the result is negative, zero, or positive

(3 <=> 5) == 0; // false

('a' <=> 'a') == 0; // true

(3 <=> 5) < 0; // true

(7 <=> 5) < 0; // false

The semantic of the spaceship operator can be extended to any object (see next

lectures) and can greatly simplify the comparison operators overloading

28/29

Safe Comparison Operators

⋆

C++20 introduces a set of functions in <utility> to safely compare integers of different types (signed, unsigned)

bool cmp_equal(T1 a, T2 b)

bool cmp_not_equal(T1 a, T2 b)

bool cmp_less(T1 a, T2 b)

bool cmp_greater(T1 a, T2 b)

bool cmp_less_equal(T1 a, T2 b)

bool cmp_greater_equal(T1 a, T2 b)

example:

#include <utility>

unsigned a = 4;

int b = -3;

bool v1 = (a > b); // false!!!, see next slides

bool v2 = std::cmp_greater(a, b); // true

How to compare signed and unsigned integers in C++20?
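Before C++20, the same result can be obtained by checking the sign explicitly. The sketch below is not the standard library implementation; the function names are assumptions, and builtin_greater only makes explicit the conversion that the built-in `a > b` performs implicitly.

```cpp
// What the built-in comparison does: the int is converted to unsigned first,
// so -3 becomes a very large value.
bool builtin_greater(unsigned a, int b) {
    return a > static_cast<unsigned>(b);
}

// Sign-aware comparison: any unsigned value is greater than a negative int.
bool safe_greater(unsigned a, int b) {
    return b < 0 || a > static_cast<unsigned>(b);
}
```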

29/29

Modern C++

Programming

4. Basic Concepts II

Integral and Floating-point Types

Federico Busato

2025-04-14

Table of Contents

1 Integral Data Types

Fixed Width Integers

size_t

ptrdiff_t

⋆

uintptr_t

⋆

Arithmetic Operation Semantics

Promotion, Truncation

Undeﬁned Behavior

Saturation Arithmetic

⋆

1/83

Table of Contents

2 Floating-point Types and Arithmetic

IEEE Floating-point Standard and Other Representations

Normal/Denormal Values

Inﬁnity (∞)

Not a Number (NaN)

Machine Epsilon

Units at the Last Place (ULP)

Cheatsheet

Limits and Useful Functions

2/83

Table of Contents

Arithmetic Properties

Special Values Behavior

Floating-Point Undeﬁned Behavior

Detect Floating-point Errors

⋆

3 Floating-point Issues

Catastrophic Cancellation

Floating-point Comparison

3/83

Integral Data Types

A Firmware Bug

“Certain SSDs have a ﬁrmware bug causing them to irrecoverably fail after

exactly 32,768 hours of operation. SSDs that were put into service at the

same time will fail simultaneously, so RAID won’t help”

HPE SAS Solid State Drives - Critical Firmware Upgrade

Via twitter.com/martinkl/status/1202235877520482306?s=19

4/83

Overﬂow Implementations

Note: Computing the average in the right way is not trivial, see On finding the average

of two unsigned integers without overflow

related operations: ceiling division, rounding division

ai.googleblog.com/2006/06/extra-extra-read-all-about-it-nearly.html
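As a sketch of why the average is not trivial, compare the naive formula with one well-known overflow-free bit trick. The function names are hypothetical, and the example assumes int is at most 32 bits wide so that uint32_t arithmetic is not promoted.

```cpp
#include <cstdint>

// Naive version: a + b can exceed 2^32 - 1 and wrap around.
std::uint32_t average_naive(std::uint32_t a, std::uint32_t b) {
    return (a + b) / 2;
}

// Overflow-free version: the bits shared by a and b, plus half of the bits
// in which they differ. The result is rounded down.
std::uint32_t average_safe(std::uint32_t a, std::uint32_t b) {
    return (a & b) + ((a ^ b) >> 1);
}
```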

5/83

Potentially Catastrophic Failure

51 days = 51 · 24 · 60 · 60 · 1000 = 4 406 400 000 ms, which exceeds the largest 32-bit unsigned value (2^32 − 1 = 4 294 967 295)

Boeing 787s must be turned off and on every 51 days to prevent ‘misleading data’

being shown to pilots

6/83

C++ Data Model

Number of Bits

C++ Data Model                          short   int   long   long long   pointer
ILP32            Windows/Unix 32-bit    16      32    32     64          32
LLP64            Windows 64-bit         16      32    32     64          64
LP64             Linux 64-bit           16      32    64     64          64

char is always 1 byte

LP32: Windows 16-bit APIs (no longer used)

C++ Fundamental types

7/83

Fixed Width Integers 1/3

int*_t <cstdint>

C++11 provides fixed width integer types. They have the same size on any architecture:

int8_t, uint8_t
int16_t, uint16_t
int32_t, uint32_t
int64_t, uint64_t

Good practice: Prefer fixed-width integers to native types. int and unsigned can be used directly, as they are 32-bit in all common C++ data models

8/83

Fixed Width Integers 2/3

int*_t types are not “real” types, they are merely typedefs to appropriate fundamental types

The C++ standard does not ensure a one-to-one mapping:

• There are five distinct fundamental types (char, short, int, long, long long)

• There are four int*_t overloads (int8_t, int16_t, int32_t, and int64_t)
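The mismatch can be observed with type traits. This sketch assumes a mainstream platform where int64_t maps to either long or long long; the constant names are chosen only for this example.

```cpp
#include <cstdint>
#include <type_traits>

// int64_t is an alias of either long or long long, depending on the platform
// (assumption: one of the two holds, as on mainstream compilers).
constexpr bool int64_is_alias =
    std::is_same<std::int64_t, long>::value ||
    std::is_same<std::int64_t, long long>::value;

// long and long long remain distinct types even when both are 64-bit,
// so one of them is not covered by any int*_t overload.
constexpr bool long_types_distinct = !std::is_same<long, long long>::value;
```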

ithare.com/c-on-using-int_t-as-overload-and-template-parameters

9/83

Fixed Width Integers 3/3

Warning: I/O Stream interprets uint8_t and int8_t as char and not as integer values

int8_t var;
cin >> var;      // read '2'
cout << var;     // print '2'
int a = var * 2;
cout << a;       // print '100' !!
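A common workaround when printing is to promote the value to int first, for example with unary +. The sketch below uses hypothetical helper names and a string stream so that the output can be inspected; it assumes int8_t is an alias of signed char, as on mainstream platforms.

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Streams treat int8_t as a character type: 50 is printed as '2'.
std::string print_as_char(std::int8_t v) {
    std::ostringstream oss;
    oss << v;
    return oss.str();
}

// Unary + promotes the value to int, so the number itself is printed.
std::string print_as_number(std::int8_t v) {
    std::ostringstream oss;
    oss << +v;
    return oss.str();
}
```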

10/83

size_t

size_t <cstddef>

size_t is an alias data type capable of storing the size of the largest object that can exist on the current architecture

• size_t is an unsigned integer type (of at least 16-bit)

• size_t is the return type of sizeof() and is commonly used to represent size measures

• size_t is 4 bytes on 32-bit architectures, and 8 bytes on 64-bit architectures

• C++23 adds uz / UZ literals for size_t, e.g. 5uz

11/83

ptrdiff_t

⋆

ptrdiff_t <cstddef>

ptrdiff_t is an alias data type used to store the result of the difference between pointers or iterators

• ptrdiff_t is the signed counterpart of size_t, commonly used for computing pointer differences

• ptrdiff_t is 4 bytes on 32-bit architectures, and 8 bytes on 64-bit architectures

• C++23 adds z / Z literals for ptrdiff_t, e.g. 5z
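The roles of size_t and ptrdiff_t can be sketched as follows. The function names are hypothetical; the static_asserts only confirm the types produced by sizeof and by pointer subtraction.

```cpp
#include <cstddef>
#include <type_traits>

// sizeof yields std::size_t.
static_assert(std::is_same<decltype(sizeof(int)), std::size_t>::value,
              "sizeof -> size_t");

// Subtracting two pointers into the same array yields std::ptrdiff_t.
std::ptrdiff_t element_distance() {
    int data[8] = {};
    int* first = &data[2];
    int* last  = &data[5];
    static_assert(std::is_same<decltype(last - first), std::ptrdiff_t>::value,
                  "pointer difference -> ptrdiff_t");
    return last - first;
}

// The difference is signed, so the reverse order gives a negative value.
std::ptrdiff_t reverse_distance() {
    int data[8] = {};
    return &data[2] - &data[5];
}
```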

12/83

uintptr_t

⋆

uintptr_t <cstdint>

uintptr_t (C++11) is an integer type that can be converted from and to a void pointer

• uintptr_t is an unsigned type

• sizeof(uintptr_t) == sizeof(void*)

• uintptr_t is an optional type of the standard and compilers may not provide it

13/83

Arithmetic Operation Semantics

Overflow: The result of an arithmetic operation exceeds the word length, namely the largest positive/negative values

Wraparound: The result of an arithmetic operation is reduced modulo 2^N, where N is the number of bits of the word

Saturation: The result of an arithmetic operation is constrained within a fixed range defined by a minimum and maximum value. If the result of an operation exceeds this range, it is “clamped” to the boundary value
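The difference between wraparound and saturation can be sketched on 8-bit unsigned values (N = 8). The function names are assumptions made for this example.

```cpp
#include <cstdint>

// Wraparound: the result is reduced modulo 2^8.
std::uint8_t add_wrapping(std::uint8_t a, std::uint8_t b) {
    return static_cast<std::uint8_t>(a + b);
}

// Saturation: the result is clamped to the largest representable value.
std::uint8_t add_saturating(std::uint8_t a, std::uint8_t b) {
    int sum = a + b;  // computed in int, so it cannot overflow here
    return static_cast<std::uint8_t>(sum > 255 ? 255 : sum);
}
```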

14/83

Signed/Unsigned Integer Characteristics

Setting undefined behavior aside, signed and unsigned integers use the exact same hardware, and they are equivalent at the binary level thanks to the two's complement representation

#include <climits> // INT_MAX

int a1 = INT_MAX;

int b1 = a1 + 4; // 10000000000000000000000000000011

unsigned a2 = INT_MAX;

unsigned b2 = a2 + 4; // 10000000000000000000000000000011

However, signed and unsigned integers have diﬀerent semantics in C++. The

compiler can exploit undeﬁned behavior to optimize the code even if such operations

are well-deﬁned at hardware level

15/83

Signed Integer

Represent positive, negative, and zero values (ℤ)

Properties: Commutative, reﬂexive, not associative (overﬂow/underﬂow)

Represent the human intuition of numbers

All bitwise operations are well-deﬁned, except shift

16/83

Signed Integer - Problems

More negative values (2^31) than positive (2^31 − 1)

Even multiplication, division, and modulo by -1 can fail, e.g. INT_MIN * -1

Overflow/underflow semantic → undefined behavior

Possible behavior: overflow: (2^31 − 1) + 1 → min, underflow: −2^31 − 1 → max

Shift could lead to undefined behavior x << y

• undefined behavior if y is greater than or equal to the number of bits of x

• implementation-defined if x is negative (until C++20)

• undefined behavior if y is negative

17/83

Unsigned Integer

Represent only non-negative values (ℕ)

Properties: commutative, reflexive, associative

Discontinuity in 0, 2^32 − 1

Wraparound semantic → well-defined (modulo 2^32)

Bit-wise operations are well-defined, except shift

Shift could lead to undefined behavior x << y

• undefined behavior if y is greater than or equal to the number of bits of x

18/83

When Use Signed/Unsigned Integer? 1/2

Google Style Guide

Because of historical accident, the C++ standard also uses unsigned integers to

represent the size of containers - many members of the standards body believe this

to be a mistake, but it is eﬀectively impossible to ﬁx at this point

Solution: use int64_t

max value: $2^{63} - 1$ = 9,223,372,036,854,775,807 or 9 quintillion (9 billion billions), about 292 years in nanoseconds, 9 million terabytes

19/83

When to Use Signed/Unsigned Integer? 2/2

When to use signed integer?

• if it can be mixed with negative values, e.g. subtracting byte sizes

• prefer expressing non-negative values with signed integer and assertions

• optimization purposes, e.g. exploit undefined behavior for overflow or in loops

When to use unsigned integer?

• if the quantity can never be mixed with negative values (?)

• bitmask values

• optimization purposes, e.g. division, modulo

• safety-critical system, signed integer overﬂow could be “non-deterministic”

Subscripts and sizes should be signed, Bjarne Stroustrup

Don’t add to the signed/unsigned mess, Bjarne Stroustrup

Integer Type Selection in C++: in Safe, Secure and Correct Code, Robert C. Seacord

20/83

Arithmetic Type Limits

Query properties of arithmetic types in C++11:

#include <limits>

std::numeric_limits<int>::max(); // 2^31 − 1
std::numeric_limits<uint16_t>::max(); // 65,535
std::numeric_limits<int>::min(); // −2^31
std::numeric_limits<unsigned>::min(); // 0

* this syntax will be explained in the next lectures

21/83

Promotion and Truncation

Promotion to a larger type keeps the sign

int16_t x = -1;

int y = x; // sign extend

cout << y; // print -1

Truncation to a smaller type is implemented as a modulo operation with respect to

the number of bits of the smaller type

int x = 65537; // 2^16 + 1

int16_t y = x; // x % 2^16

cout << y; // print 1

int z = 32769; // 2^15 + 1 (does not fit in a int16_t)

int16_t w = z; // (int16_t) (z % 2^16 = 32769)

cout << w; // print -32767

22/83

Mixing Signed/Unsigned Errors 1/3

unsigned a = 10; // array is small

int b = -1;

array[10ull + a * b] = 0; // ?

Segmentation fault!

int f(int a, unsigned b, int* array) { // array is small

if (a > b)

return array[a - b]; // ?

return 0;

}

Segmentation fault for a < 0 !

// v.size() return unsigned

for (size_t i = 0; i < v.size() - 1; i++)

array[i] = 3; // ?

Segmentation fault for v.size() == 0 !

23/83

Mixing Signed/Unsigned Errors  2/3

Easy case:

unsigned x = 32; // x can be also a pointer

x += 2u - 4; // 2u - 4 = 2 + (2^32 - 4)

// = 2^32 - 2

// (32 + (2^32 - 2)) % 2^32

cout << x; // print 30 (as expected)

What about the following code?

uint64_t x = 32; // x can be also a pointer
x += 2u - 4;
cout << x;
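The two cases above can be compared side by side with a minimal sketch. It assumes a 32-bit unsigned (checked by the static_assert), and the helper names are hypothetical. The expression 2u - 4 is evaluated in unsigned arithmetic before it is widened, so the 64-bit variable does not wrap back to 30.

```cpp
#include <cstdint>

static_assert(sizeof(unsigned) == 4, "this sketch assumes a 32-bit unsigned");

// Hypothetical helper: same statement as on the slide, with a 32-bit x
std::uint32_t add_offset_32(std::uint32_t x) {
    x += 2u - 4; // 2u - 4 == 2^32 - 2, and the sum wraps modulo 2^32
    return x;
}

// Hypothetical helper: same statement, but x is 64-bit wide
std::uint64_t add_offset_64(std::uint64_t x) {
    x += 2u - 4; // 2^32 - 2 is zero-extended to 64 bits: no wraparound
    return x;
}

// One possible fix: perform the whole computation in the 64-bit type
std::uint64_t add_offset_64_fixed(std::uint64_t x) {
    x = x + 2u - 4; // evaluated left to right as (x + 2) - 4 in uint64_t
    return x;
}
```

With x = 32, the 32-bit version yields 30, while the 64-bit version yields 32 + (2^32 - 2).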

24/83

Mixing Signed/Unsigned Errors  3/3

A real-world case:

// allocate a zero rtx vector of N elements

// sizeof(struct rtvec_def) == 16

// sizeof(rtunion) == 8

rtvec rtvec_alloca(int n) {

rtvec rt;

int i;

rt = (rtvec)obstack_alloc(

rtl_obstack,

sizeof(struct rtvec_def) + ((n - 1) * sizeof(rtunion)));

// ...

return rt;

}

Garbage In, Garbage Out: Arguing about Undefined Behavior with Nasal Daemons,

Chandler Carruth, CppCon 2016

25/83

Undeﬁned Behavior 1/7

The C++ standard does not prescribe any speciﬁc behavior (undeﬁned behavior) for

several integer/unsigned arithmetic operations

• Signed integer overﬂow/underﬂow

int x = std::numeric_limits<int>::max() + 20;

• More negative values than positive

int x = std::numeric_limits<int>::max() * -1; // (2^31 -1) * -1

cout << x; // -2^31 +1 ok

int y = std::numeric_limits<int>::min() * -1; // -2^31 * -1

cout << y; // hard to see in complex examples // 2^31 overflow!!

The Usual Arithmetic Confusions

26/83

Undeﬁned Behavior 2/7

• Initializing a signed integer with a value larger than its range is not portable: the result is implementation-defined until C++20

int z = 3000000000; // implementation-defined until C++20!!

• Shift operations on signed integer types with a negative operand are undefined behavior

int y = -1 << 12; // undefined behavior!!, until C++20

int z = 1 << -12; // undefined behavior!!

• Shift larger than #bits of the data type is undeﬁned behavior even for unsigned

unsigned y = 1u << 32u; // undefined behavior!!

• Undeﬁned behavior in implicit conversion

uint16_t a = 65535; // 0xFFFF

uint16_t b = 65535; // 0xFFFF expected: 4'294'836'225

cout << (a * b); // print '-131071' undefined behavior!! (int overflow)

27/83

Undeﬁned Behavior - Signed Overﬂow Example 1 3/7

# include <climits>

# include <cstdio>

void f(int* ptr, int pos) {

pos++;

if (pos < 0) // <-- the compiler could assume that signed overflow never
return;      //     happens and "simplify" (remove) the condition check

ptr[pos] = 0;

}

int main() { // the code compiled with optimizations, e.g. -O3

int* tmp = new int[10]; // leads to segmentation faults with clang, while

f(tmp, INT_MAX); // it terminates correctly with gcc

printf("%d\n", tmp[0]);

}

28/83

Undeﬁned Behavior - Signed Overﬂow Example 2 4/7

s/open.c of the Linux kernel

int do_fallocate(..., loff_t offset, loff_t len) {

inode* inode = ...;

if (offset < 0 || len <= 0)

return -EINVAL;

/* Check for wrap through zero too */

if ((offset + len > inode->i_sb->s_maxbytes) || (offset + len < 0))

return -EFBIG; // the compiler is able to infer that both 'offset' and

... // 'len' are non-negative and can eliminate this check,

} // without checking for integer overflow

29/83

Undeﬁned Behavior - Division by Zero Example 5/7

src/backend/utils/adt/int8.c of PostgreSQL

if (arg2 == 0) {

ereport(ERROR, (errcode(ERRCODE_DIVISION_BY_ZERO),

// the compiler is not aware

errmsg("division by zero"))); // that this function

} // doesn't return

/* No overflow is possible */

PG_RETURN_INT32((int32) arg1 / arg2); // the compiler assumes that the divisor is

// non-zero and can move this statement on

// the top (always executed)

Undefined Behavior: What Happened to My Code?

30/83

Undeﬁned Behavior - Implicit Overﬂow Example 6/7

Even worse example:

# include <iostream>

int main() {

for (int i = 0; i < 4; ++i)

std::cout << i * 1000000000 << std::endl;

}

// with optimizations, it is an infinite loop
// --> 1000000000 * i > INT_MAX: undefined behavior!!
// the compiler translates the multiplication by a constant into an addition

Why does this loop produce undefined behavior?

31/83

Undeﬁned Behavior - Common Loops  7/7

Is the following loop safe?

void f(int size) {

for (int i = 0; i < size; i += 2)

...

}

• What happens if size is equal to INT_MAX ?

• How to make the previous loop safe?

• i >= 0 && i < size is not the solution because of undefined behavior of signed overflow

• Can we generalize the solution when the increment is i += step ?
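One possible answer to the questions above, as a sketch rather than the only solution: clamp the increment so that i never exceeds size. The helper name count_iterations is hypothetical, and a positive step is assumed.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical helper: counts the iterations of "for (i = 0; i < size; i += step)"
// without ever evaluating an expression that overflows
long long count_iterations(int size, int step) {
    assert(step > 0);
    long long iterations = 0;
    for (int i = 0; i < size; i += std::min(step, size - i)) {
        // inside the loop 0 <= i < size holds, so "size - i" is in [1, INT_MAX]
        ++iterations;
    }
    return iterations;
}
```

Because the increment is at most size - i, the loop variable reaches size exactly and never goes past it, even when size is INT_MAX.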

32/83

Overﬂow / Underﬂow

Detecting wraparound for unsigned integral types is not trivial

// some examples

bool is_add_overflow(unsigned a, unsigned b) {

return (a + b) < a || (a + b) < b;

}

bool is_mul_overflow(unsigned a, unsigned b) {

unsigned x = a * b;

return a != 0 && (x / a) != b;

}

Detecting overﬂow/underﬂow for signed integral types is even harder and must be

checked before performing the operation
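A minimal sketch of such a check for signed addition, performed before the operation so that no overflowing expression is ever evaluated. The function name is hypothetical.

```cpp
#include <limits>

// Hypothetical helper: true if "a + b" would overflow or underflow an int
bool is_signed_add_overflow(int a, int b) {
    constexpr int max = std::numeric_limits<int>::max();
    constexpr int min = std::numeric_limits<int>::min();
    if (b > 0)
        return a > max - b; // max - b cannot overflow because b > 0
    if (b < 0)
        return a < min - b; // min - b cannot overflow because b < 0
    return false;           // adding zero never overflows
}
```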

33/83

Saturation Arithmetic

⋆

C++26 adds four main functions to perform saturation arithmetic with integer types in the <numeric> library. In other words, the undefined behavior or the wrap-around behavior for overflow/underflow is replaced by saturation values, namely the minimum or maximum values of the result type

• T add_sat(T x, T y)

• T sub_sat(T x, T y)

• T mul_sat(T x, T y)

• T div_sat(T x, T y)

• R saturate_cast<R>(T x)
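For compilers without C++26 support, the behavior of a saturating addition can be approximated by hand. The sketch below is not the standard library implementation: add_sat_sketch is a hypothetical name, and it handles only 32-bit signed operands by computing the exact sum in a wider type.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical stand-in for a saturating 32-bit addition (pre-C++26 compilers)
std::int32_t add_sat_sketch(std::int32_t x, std::int32_t y) {
    constexpr std::int64_t max = std::numeric_limits<std::int32_t>::max();
    constexpr std::int64_t min = std::numeric_limits<std::int32_t>::min();
    // the exact sum of two 32-bit values always fits in 64 bits
    const std::int64_t sum = static_cast<std::int64_t>(x) + y;
    if (sum > max)
        return static_cast<std::int32_t>(max); // clamp to the largest value
    if (sum < min)
        return static_cast<std::int32_t>(min); // clamp to the lowest value
    return static_cast<std::int32_t>(sum);
}
```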

34/83

Floating-point Types

and Arithmetic

IEEE Floating-Point Standard

IEEE754 is the technical standard for ﬂoating-point arithmetic

The standard deﬁnes the binary format, operations behavior, rounding rules, exception

handling, etc.

First Release : 1985

Second Release : 2008. Add 16-bit, 128-bit, 256-bit ﬂoating-point types

Third Release : 2019. Specify min/max behavior

see The IEEE Standard 754: One for the History Books

IEEE754 technical document:

754-2019 - IEEE Standard for Floating-Point Arithmetic

In general, C/C++ adopts IEEE754 ﬂoating-point standard:

en.cppreference.com/w/cpp/types/numeric_limits/is_iec559

35/83

32/64-bit Floating-Point

• IEEE754 Single-precision (32-bit) float

Sign 1-bit | Exponent 8-bit | Mantissa (or significand) 23-bit

• IEEE754 Double-precision (64-bit) double

Sign 1-bit | Exponent 11-bit | Mantissa (or significand) 52-bit

36/83

128/256-bit Floating-Point

• IEEE754 Quad-Precision (128-bit) std::float128_t C++23

Sign 1-bit | Exponent 15-bit | Mantissa (or significand) 112-bit

• IEEE754 Octuple-Precision (256-bit) (not standardized in C++)

Sign 1-bit | Exponent 19-bit | Mantissa (or significand) 236-bit

37/83

16-bit Floating-Point

• IEEE754 16-bit Floating-point ( std::float16_t ) C++23 → GPU, Arm7

Sign 1-bit | Exponent 5-bit | Mantissa 10-bit

• Google 16-bit Floating-point ( std::bfloat16_t ) C++23 → TPU, GPU, Arm8

Sign 1-bit | Exponent 8-bit | Mantissa 7-bit

half-precision-arithmetic-fp16-versus-bfloat16

38/83

8-bit Floating-Point (Non-Standardized in C++/IEEE)

• E4M3

Sign 1-bit | Exponent 4-bit | Mantissa 3-bit

• E5M2

Sign 1-bit | Exponent 5-bit | Mantissa 2-bit

• Floating Point Formats for Machine Learning, IEEE draft

• FP8 Formats for Deep Learning, Intel, Nvidia, Arm

39/83

Other Real Value Representations (Non-standardized in C++/IEEE) 1/2

• TensorFloat-32 (TF32) Specialized ﬂoating-point format for deep learning

applications

• Posit (John Gustafson, 2017), also called unum III (universal number), represents floating-point values with variable-width of exponent and mantissa.

It is implemented in experimental platforms

• NVIDIA Hopper Architecture In-Depth

• Beating Floating Point at its Own Game: Posit Arithmetic

• Posits, a New Kind of Number, Improves the Math of AI

• Comparing posit and IEEE-754 hardware cost

40/83

Other Real Value Representations (Non-standardized in C++/IEEE) 2/2

• Microscaling Formats (MX) Speciﬁcation for low-precision ﬂoating-point

formats deﬁned by AMD, Arm, Intel, Meta, Microsoft, NVIDIA, and Qualcomm.

It includes FP8, FP6, FP4, (MX)INT8

• Fixed-point representation has a fixed number of digits after the radix point (decimal point). The gaps between adjacent numbers are always equal. The range of their values is significantly limited compared to floating-point numbers.

It is widely used on embedded systems

• OCP Microscaling Formats (MX) Specification

41/83

Floating-point Representation 1/2

Floating-point number:

• Radix (or base): β

• Precision (or digits): p

• Exponent (magnitude): e

• Mantissa: M

$n = M \times \beta^{e}$ (M has p digits) → IEEE754: $1.M \times 2^{e}$

float f1 = 1.3f; // 1.3
float f2 = 1.1e2f; // 1.1 · 10^2
float f3 = 3.7E4f; // 3.7 · 10^4
float f4 = .3f; // 0.3
double d1 = 1.3; // without "f"
double d2 = 5E3; // 5 · 10^3

42/83

Floating-point Representation 2/2

Exponent Bias

In IEEE754 floating point numbers, the exponent value is offset from the actual value by the exponent bias

• The exponent is stored as an unsigned value suitable for comparison

• Floating point values are lexicographically ordered

• For a single-precision number, the exponent is stored in the range [1, 254] (0 and 255 have special meanings), and is biased by subtracting 127 to get an exponent value in the range [−126, +127]

0 10000111 11000000000000000000000

sign: +
exponent: 2^(135−127) = 2^8
mantissa: 2^−1 + 2^−2 = 0.5 + 0.25 = 0.75 → (normal) 1.75
value: +1.75 ∗ 2^8 = 448.0
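The decomposition above can be reproduced in code. This is a minimal sketch assuming a 32-bit IEEE754 float; FloatFields and decode are hypothetical names introduced only for illustration.

```cpp
#include <cstdint>
#include <cstring>

struct FloatFields {        // hypothetical helper type
    std::uint32_t sign;     // 1 bit
    std::uint32_t exponent; // 8 bits, biased by 127
    std::uint32_t mantissa; // 23 bits, without the implicit leading 1
};

FloatFields decode(float value) {
    static_assert(sizeof(float) == sizeof(std::uint32_t), "assumes a 32-bit float");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits)); // well-defined way to read the bits
    return {bits >> 31, (bits >> 23) & 0xFFu, bits & 0x7FFFFFu};
}
```

For 448.0f the stored exponent is 135 (that is, 8 + 127) and the mantissa field is 0x600000, namely the two leading bits 11 followed by zeros.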

43/83

Floating-point - Normal/Denormal 1/2

Normal number

A normal number is a ﬂoating point value that can be represented with at least one

bit set in the exponent or the mantissa has all 0s

Denormal number

Denormal (or subnormal) numbers ﬁll the underﬂow gap around zero in

ﬂoating-point arithmetic. Any non-zero number with magnitude smaller than the

smallest normal number is denormal

A denormal number is a ﬂoating point value that can be represented with all 0s in

the exponent, but the mantissa is non-zero

44/83

Floating-point - Normal/Denormal 2/2

Why denormal numbers make sense: (↓ normal numbers)

The problem: distance values from zero (↓ denormal numbers)

Floating-point representation, by Carl Burch

45/83

Inﬁnity ∞ 1/2

Inﬁnity

In the IEEE754 standard, inf (inﬁnity value) is a numeric data type value that

exceeds the maximum (or minimum) representable value

Operations generating inf:

• ±∞ · ±∞

• ±∞ · ±finite_value (non-zero)

• finite_value op finite_value > max_value

• non-zero finite value / ±0

+inf and -inf each have a single representation

Comparison: (inf == finite_value) → false

(±inf == ±inf) → true

46/83

Inﬁnity 2/2

cout << 5.0 / 0.0; // print "inf"
cout << -5.0 / 0.0; // print "-inf"

auto inf = std::numeric_limits<float>::infinity();
cout << (-0.0 == 0.0); // true, 0 == 0
cout << ((5.0f / inf) == (-5.0f / inf)); // true, 0 == 0
cout << ((10e40f) == (10e40f + 9999999.0f)); // true, inf == inf
cout << ((10e40) == (10e40f + 9999999.0f)); // false, 10e40 != inf

47/83

Not a Number (NaN) 1/2

NaN

In the IEEE754 standard, NaN (not a number) is a numeric data type value

representing an undeﬁned or non-representable value

Floating-point operations generating

NaN :

• Operations with a NaN as at least one operand

• ±∞ · ∓∞ , 0 · ∞

• 0/0, ∞/∞

• $\sqrt{x}$, $\log(x)$ for $x < 0$

• $\sin^{-1}(x)$, $\cos^{-1}(x)$ for $x < -1$ or $x > 1$

Comparison: (NaN == x) → false, for every x

(NaN == NaN) → false

48/83

Not a Number (NaN) 2/2

There are many representations for NaN (e.g. $2^{24} - 2$ for float)

The speciﬁc (bitwise) NaN value returned by an operation is implementation/compiler

speciﬁc

cout << 0 / 0; // undefined behavior

cout << 0.0 / 0.0; // print "nan" or "-nan"

quiet_NaN

49/83

Machine Epsilon

Machine epsilon

Machine epsilon $\varepsilon$ (or machine accuracy) is the gap between 1.0 and the next larger representable number

IEEE 754 Single precision: $\varepsilon = 2^{-23} \approx 1.19209 \times 10^{-7}$

IEEE 754 Double precision: $\varepsilon = 2^{-52} \approx 2.22045 \times 10^{-16}$
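The two values above can be cross-checked against the standard library. The sketch below assumes IEEE754 float and double; gap_after_one is a hypothetical helper name.

```cpp
#include <cmath>
#include <limits>

// Gap between 1.0 and the next representable value of type T
template<typename T>
T gap_after_one() {
    return std::nextafter(T{1}, T{2}) - T{1};
}
```

On an IEEE754 platform, gap_after_one<float>() is expected to match std::numeric_limits<float>::epsilon(), and likewise for double.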

50/83

Units at the Last Place (ULP)

ULP

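Units at the Last Place is the gap between consecutive floating-point numbers

$\mathrm{ULP}(p, e) = \beta^{e-(p-1)} \rightarrow 2^{e-(p-1)}$

Example:

$\beta = 10$, $p = 3$

$\pi = 3.1415926\ldots \rightarrow x = 3.14 \times 10^{0}$

$\mathrm{ULP}(3, 0) = 10^{-2} = 0.01$

Relation with $\varepsilon$:

• $\varepsilon = \mathrm{ULP}(p, 0)$

• $\mathrm{ULP}_x = \varepsilon \cdot \beta^{e(x)}$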

51/83

Floating-Point Representation of a Real Number

The machine floating-point representation fl(x) of a real number x is expressed as $\mathrm{fl}(x) = x(1 + \delta)$, where $\delta$ is a small constant

The approximation of a real number x has the following properties:

Absolute Error: $|\mathrm{fl}(x) - x| \le \frac{1}{2} \cdot \mathrm{ULP}_x$

Relative Error: $\left|\frac{\mathrm{fl}(x) - x}{x}\right| \le \frac{1}{2} \cdot \varepsilon$

52/83

Floating-point - Cheatsheet 1/3

• NaN (mantissa ≠ 0)

∗ 11111111 ***********************

• ± infinity

∗ 11111111 00000000000000000000000

• Lowest/Largest ($\pm 3.40282 \times 10^{+38}$)

∗ 11111110 11111111111111111111111

• Minimum (normal) ($\pm 1.17549 \times 10^{-38}$)

∗ 00000001 00000000000000000000000

• Denormal number ($< 2^{-126}$) (minimum: $1.4 \times 10^{-45}$)

∗ 00000000 ***********************

• ±0

∗ 00000000 00000000000000000000000

53/83

Floating-point - Cheatsheet 2/3

                   E4M3                     E5M2                      float16_t
Exponent           4-bit [0*-14] (no inf)   5-bit [0*-30]             5-bit [0*-30]
Bias               7                        15                        15
Mantissa           3-bit                    2-bit                     10-bit
Largest (±)        1.75 ∗ 2^8 = 448         1.75 ∗ 2^15 = 57,344      65,504 (just below 2^16 = 65,536)
Smallest (±)       2^−6 = 0.015625          2^−14 ≈ 0.00006           2^−14 ≈ 0.00006
Smallest Denormal  2^−9 = 0.001953125       2^−16 ≈ 1.5258 ∗ 10^−5    2^−24 ≈ 6.0 · 10^−8
Epsilon            2^−3 = 0.125             2^−2 = 0.25               2^−10 ≈ 0.00098

Floating-point - Cheatsheet 3/3

                   bfloat16_t               float                     double
Exponent           8-bit [0*-254]           8-bit [0*-254]            11-bit [0*-2046]
Bias               127                      127                       1023
Mantissa           7-bit                    23-bit                    52-bit
Largest (±)        ≈ 2^128 ≈ 3.4 · 10^38    ≈ 2^128 ≈ 3.4 · 10^38     ≈ 2^1024 ≈ 1.8 · 10^308
Smallest (±)       2^−126 ≈ 1.2 · 10^−38    2^−126 ≈ 1.2 · 10^−38     2^−1022 ≈ 2.2 · 10^−308
Smallest Denormal  /                        2^−149 ≈ 1.4 · 10^−45     2^−1074 ≈ 4.9 · 10^−324
Epsilon            2^−7 ≈ 0.0078            2^−23 ≈ 1.2 · 10^−7       2^−52 ≈ 2.2 · 10^−16

Floating-point - Limits

#include <limits>

// T: float or double
std::numeric_limits<T>::max(); // largest value
std::numeric_limits<T>::lowest(); // lowest value (C++11)
std::numeric_limits<T>::min(); // smallest positive normal value
std::numeric_limits<T>::denorm_min(); // smallest positive (denormal) value
std::numeric_limits<T>::epsilon(); // epsilon value
std::numeric_limits<T>::infinity(); // infinity
std::numeric_limits<T>::quiet_NaN(); // NaN

58/83

Floating-point - Useful Functions

#include <cmath> // C++11

bool std::isnan(T value) // check if value is NaN
bool std::isinf(T value) // check if value is ±infinity
bool std::isfinite(T value) // check if value is not NaN and not ±infinity
bool std::isnormal(T value) // check if value is Normal
T std::ldexp(T x, int p) // exponent shift x ∗ 2^p
int std::ilogb(T value) // extracts the exponent of value
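A short sketch of how std::ilogb and std::ldexp fit together: one extracts the exponent, the other scales by a power of two without rounding (as long as the result stays normal). The helper split_and_rebuild is hypothetical and assumes a finite, non-zero, normal input.

```cpp
#include <cmath>

// Hypothetical helper: splits "value" into exponent and mantissa, then rebuilds it
float split_and_rebuild(float value) {
    int   exponent = std::ilogb(value);            // floor(log2(|value|))
    float mantissa = std::ldexp(value, -exponent); // value * 2^-exponent, |mantissa| in [1, 2)
    return std::ldexp(mantissa, exponent);         // mantissa * 2^exponent
}
```

For 448.0f the extracted exponent is 8 and the mantissa is 1.75, matching the bit-level example of the exponent bias.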

59/83

Floating-point Arithmetic Properties 1/3

Floating-point operations are written

• ⊕ addition

• ⊖ subtraction

• ⊗ multiplication

• ⊘ division

⊙ ∈ {⊕, ⊖, ⊗, ⊘}

op ∈ {+, −, ∗, /} denotes exact precision operations

60/83

Floating-point Arithmetic Properties 2/3

(P1) In general, a op b ≠ a ⊙ b

(P2) Not Reflexive a ≠ a

• Reflexive without NaN

(P3) Not Commutative a ⊙ b ≠ b ⊙ a

• Commutative without NaN (NaN ≠ NaN)

(P4) In general, Not Associative (a ⊙ b) ⊙ c ≠ a ⊙ (b ⊙ c)

• even excluding NaN and inf in intermediate computations

(P5) In general, Not Distributive (a ⊕ b) ⊗ c ≠ (a ⊗ c) ⊕ (b ⊗ c)

• even excluding NaN and inf in intermediate computations

61/83

Floating-point Arithmetic Properties 3/3

(P6) Identity on operations is not ensured

• (a ⊖ b) ⊕ b ≠ a

• (a ⊘ b) ⊗ b ≠ a

(P7) Overflow/Underflow Floating-point has “saturation” values inf, -inf

• as opposed to integer arithmetic with wrap-around behavior
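Properties P4 and P6 can be observed with three ordinary float values. This is a minimal sketch assuming IEEE754 single precision with round-to-nearest and no fast-math style optimizations; the helper names are hypothetical.

```cpp
// Hypothetical helpers: the same three values summed in two different orders
float sum_left(float a, float b, float c)  { return (a + b) + c; }
float sum_right(float a, float b, float c) { return a + (b + c); }

// Hypothetical helper: subtract and add back the same value
float sub_then_add(float a, float b) { return (a - b) + b; }
```

With a = 1e20f, b = -1e20f, c = 1.0f, the left-to-right sum gives 1 while the right-to-left sum gives 0, because 1.0f is absorbed when added to -1e20f. For the same reason sub_then_add(1.0f, 1e20f) returns 0 instead of 1.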

62/83

Special Values Behavior

Zero behavior

• a ⊘ 0 = inf, a ∈ {finite − 0} [IEEE-754], undefined behavior in C++

• 0 ⊘ 0, inf ⊘ 0 = NaN [IEEE-754], undefined behavior in C++

• 0 ⊗ inf = NaN

• +0 = -0 but they have a diﬀerent binary representation

Inf behavior

• inf ⊙ a = inf, a ∈ {ﬁnite − 0}

• inf ⊕⊗ inf = inf

• inf ⊖⊘ inf = NaN

• ± inf ⊙ ∓ inf = NaN

• ± inf = ± inf

NaN behavior

• NaN ⊙ a = NaN

• NaN = a

63/83

Floating-Point Undeﬁned Behavior

• Division by zero

e.g., 10

/0.0

• Conversion to a narrower ﬂoating-point type:

e.g.,

0.1 double → float

• Conversion from ﬂoating-point to integer:

e.g.,

float → int

• Operations on signaling NaNs: Arithmetic operations that cause an “invalid

operation” exception to be signaled

e.g.,

inf - inf

• Incorrectly assuming IEEE-754 compliance for all platforms:

e.g., Some embedded Linux distribution on ARM

64/83

Detect Floating-point Errors

⋆

1/2

C++11 allows determining if a floating-point exceptional condition has occurred by using floating-point exception facilities provided in <cfenv>

#include <cfenv>

// MACRO

FE_DIVBYZERO // division by zero

FE_INEXACT // rounding error

FE_INVALID // invalid operation, i.e. NaN

FE_OVERFLOW // overflow (result too large, reach saturation value ±inf)

FE_UNDERFLOW // underflow (result too small, rounded to a denormal or zero)

FE_ALL_EXCEPT // all exceptions

// functions

std::feclearexcept(FE_ALL_EXCEPT); // clear exception status

std::fetestexcept(<macro>); // returns a value != 0 if an

// exception has been detected

65/83

Detect Floating-point Errors

⋆

2/2

#include <cfenv> // floating point exceptions

#include <iostream>

#pragma STDC FENV_ACCESS ON // tell the compiler to manipulate the floating-point

// environment (not supported by all compilers)

// gcc: yes, clang: no

int main() {

std::feclearexcept(FE_ALL_EXCEPT); // clear

auto x = 1.0 / 0.0; // all compilers

std::cout << (bool) std::fetestexcept(FE_DIVBYZERO); // print true

std::feclearexcept(FE_ALL_EXCEPT); // clear

auto x2 = 0.0 / 0.0; // all compilers

std::cout << (bool) std::fetestexcept(FE_INVALID); // print true

std::feclearexcept(FE_ALL_EXCEPT); // clear

auto x4 = 1e38f * 10; // gcc: ok

std::cout << (bool) std::fetestexcept(FE_OVERFLOW); // print true

}

see What is the difference between quiet NaN and signaling NaN?

66/83

Floating-point Issues

Some Examples... 1/4

Ariane 5: data conversion from 64-bit floating point value to 16-bit signed integer → $137 million

Patriot Missile: small chopping error

at each operation, 100 hours activity

→ 28 deaths

67/83

Some Examples... 2/4

Integer type is more accurate than ﬂoating type for large numbers

cout << 16777217; // print 16777217

cout << (int) 16777217.0f; // print 16777216!!

cout << (int) 16777217.0; // print 16777217, double ok

float numbers are diﬀerent from double numbers

cout << (1.1 != 1.1f); // print true !!!

68/83

Some Examples... 3/4

The ﬂoating point precision is ﬁnite!

cout << setprecision(20);

cout << 3.33333333f; // print 3.333333254!!

cout << 3.33333333; // print 3.333333333

cout << (0.1 + 0.1 + 0.1 + 0.1 + 0.1 + 0.1); // print 0.59999999999999998

Floating point arithmetic is not associative

cout << (0.1 + (0.2 + 0.3) == (0.1 + 0.2) + 0.3); // print false

IEEE754 Floating-point computation guarantees to produce deterministic output,

namely the exact bitwise value for each run, if and only if the order of the operations

is always the same

→ same result on any machine and for all runs

69/83

Some Examples... 4/4

“Using a double-precision ﬂoating-point value, we can represent easily the

number of atoms in the universe.

If your software ever produces a number so large that it will not ﬁt in a

double-precision ﬂoating-point value, chances are good that you have a bug”

Daniel Lemire, Prof. at the University of Quebec

“ NASA uses just 15 digits of π to calculate interplanetary travel.

With 40 digits, you could calculate the circumference of a circle the size of the

visible universe with an accuracy that would fall by less than the diameter of

a single hydrogen atom”

Latest in space, Twitter

Number of atoms in the universe versus floating-point values

70/83

Floating-point Algorithms

• addition algorithm (simpliﬁed):

(1) Compare the exponents of the two numbers. Shift the smaller number to the right until its

exponent would match the larger exponent

(2) Add the mantissas

(3) Normalize the sum if needed (shift right/left the exponent by 1)

• multiplication algorithm (simpliﬁed):

(1) Multiplication of mantissas. The number of bits of the result is twice the size of the operands

(46 + 2 bits, with +2 for implicit normalization)

(2) Normalize the product if needed (shift right/left the exponent by 1)

(3) Addition of the exponents

• fused multiply-add (fma):

• Recent architectures (also GPUs) provide fma to compute addition and multiplication in a single

instruction (performed by the compiler in most cases)

• The rounding error of fma(x , y, z) is less than (x ⊗ y ) ⊕ z
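The single rounding of fma can be used to expose the error of a plain multiplication. The sketch below uses std::fma from <cmath>; the helper name is hypothetical, and it assumes float expressions are evaluated in single precision with round-to-nearest.

```cpp
#include <cmath>

// Hypothetical helper: rounding error committed by the float product x * x.
// std::fma evaluates x * x - product exactly and rounds only once at the end
float square_rounding_error(float x) {
    float product = x * x;           // rounded to float
    return std::fma(x, x, -product); // exact product minus the rounded one
}
```

For x = 1 + 2^-12 the exact square is 1 + 2^-11 + 2^-24, which does not fit in 24 significant bits. The plain product drops the last term, and the fma call recovers it as 2^-24.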

71/83

Catastrophic Cancellation 1/5

Catastrophic Cancellation

Catastrophic cancellation (or loss of significance) refers to loss of relevant information in a floating-point computation that cannot be reversed

Two cases:

(C1) a ± b, where a ≫ b or b ≫ a. The value (or part of the value) of the smaller

number is lost

(C2) a − b, where a, b are approximation of exact values and a ≈ b, namely a loss of

precision in both a and b. a − b cancels most of the relevant part of the result

because a ≈ b. It implies a small absolute error but a large relative error

72/83

Catastrophic Cancellation (case 1) - Granularity 2/5

Intersection = 16,777,216 = $2^{24}$

73/83

Catastrophic Cancellation (case 1) 3/5

How many iterations does the following code perform?

while (x > 0)

x = x - y;

How many iterations?

float: x = 10,000,000 y = 1 -> 10,000,000

float: x = 30,000,000 y = 1 -> does not terminate

float: x = 200,000 y = 0.001 -> does not terminate

bfloat: x = 256 y = 1 -> does not terminate !!

74/83

Catastrophic Cancellation (case 1) 4/5

Floating-point increment

float x = 0.0f;

for (int i = 0; i < 20000000; i++)

x += 1.0f;

What is the value of x at the end of the loop?

Ceiling division





// std::ceil((float) 101 / 2.0f) -> 50.5f -> 51

float x = std::ceil((float) 20000001 / 2.0f);

What is the value of

x ?
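Both questions can be checked with a small sketch, assuming IEEE754 single precision and round-to-nearest. The helper names are hypothetical, and the integer version is one possible alternative, not the only one.

```cpp
#include <cmath>

// Hypothetical helper: repeats "x += 1.0f" the given number of times
float increment_n_times(int iterations) {
    float x = 0.0f;
    for (int i = 0; i < iterations; i++)
        x += 1.0f; // has no effect once x reaches 2^24 = 16,777,216
    return x;
}

// Hypothetical helper: ceiling division by two computed in float
float ceil_half_float(int value) {
    return std::ceil(static_cast<float>(value) / 2.0f);
}

// Integer-only alternative (for non-negative values below INT_MAX)
int ceil_half_int(int value) {
    return (value + 1) / 2;
}
```

After 20,000,000 iterations the accumulator stops at 16,777,216. The float ceiling division of 20,000,001 returns 10,000,000 because the conversion to float already rounds the input to 20,000,000, while the integer version returns 10,000,001.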

75/83

Catastrophic Cancellation (case 2) 5/5

Let’s solve a quadratic equation:

$x_{1,2} = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}$

$x^2 + 5000x + 0.25$

(-5000 + std::sqrt(5000.0f * 5000.0f - 4.0f * 1.0f * 0.25f)) / 2 // x2

(-5000 + std::sqrt(25000000.0f - 1.0f)) / 2 // catastrophic cancellation (C1)

(-5000 + std::sqrt(25000000.0f)) / 2

(-5000 + 5000) / 2 = 0 // catastrophic cancellation (C2)

// correct result: 0.00005!!

relative error: $\frac{|0 - 0.00005|}{0.00005} = 100\%$
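A common way to avoid this cancellation, shown here as a sketch and not taken from the slides, is to compute first the root where −b and the square root have the same sign, and then derive the other root from the product of the roots, $x_1 x_2 = c / a$. Roots and solve_quadratic are hypothetical names.

```cpp
#include <cmath>

struct Roots { float x1, x2; }; // hypothetical helper type

// Assumes a != 0, b != 0, and a non-negative discriminant
Roots solve_quadratic(float a, float b, float c) {
    float sqrt_disc = std::sqrt(b * b - 4.0f * a * c);
    // b and the signed square root have the same sign: the sum cannot cancel
    float q = -0.5f * (b + std::copysign(sqrt_disc, b));
    return {q / a, c / q}; // second root from the product x1 * x2 = c / a
}
```

For the equation above the discriminant still rounds to 25,000,000, but the small root is computed as 0.25 / −5000 instead of (−5000 + 5000) / 2, so its magnitude 0.00005 is preserved.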

76/83

Floating-point Comparison 1/3

The problem

cout << (0.11f + 0.11f < 0.22f); // print true!!

cout << (0.1f + 0.1f > 0.2f); // print true!!

Do not use absolute error margins!!

bool areFloatNearlyEqual(float a, float b) {

if (std::abs(a - b) < epsilon) // epsilon is fixed by the user

return true;

return false;

}

Problems:

• Fixed epsilon "looks small" but it could be too large when the numbers being compared are very small

• If the compared numbers are very large, the epsilon could end up being smaller than the

smallest rounding error, so that the comparison always returns false

77/83

Floating-point Comparison 2/3

Solution: Use relative error

$\frac{|a - b|}{b} < \varepsilon$

bool areFloatNearlyEqual(float a, float b) {

if (std::abs(a - b) / b < epsilon) // epsilon is fixed

return true;

return false;

}

Problems:

• a=0, b=0 The division is evaluated as 0.0/0.0 and the whole if statement is (nan < epsilon) which always returns false

• b=0 The division is evaluated as abs(a)/0.0 and the whole if statement is (+inf < epsilon) which always returns false

• a and b very small. The result should be true but the division by b may produce wrong results

• It is not commutative. We always divide by b

78/83

Floating-point Comparison 3/3

Possible solution:

$\frac{|a - b|}{\max(|a|, |b|)} < \varepsilon$

bool areFloatNearlyEqual(float a, float b) {

constexpr float normal_min = std::numeric_limits<float>::min();

constexpr float relative_error = <user_defined>;

if (!std::isfinite(a) || !std::isfinite(b)) // a = ±∞, NaN or b = ±∞, NaN

return false;

float diff = std::abs(a - b);

// if "a" and "b" are near to zero, the relative error is less effective

if (diff <= normal_min) // or also: user_epsilon * normal_min

return true;

float abs_a = std::abs(a);

float abs_b = std::abs(b);

return (diff / std::max(abs_a, abs_b)) <= relative_error;

}

79/83

Minimize Error Propagation - Summary

• Prefer multiplication/division rather than addition/subtraction

• Try to reorganize the computation to keep near numbers with the same scale

(e.g. sorting numbers)

• Consider setting very small numbers (under a threshold) to zero. Common application: iterative algorithms

• Scale by a power of two is safe

• Switch to log scale. Multiplication becomes Add, and Division becomes

Subtraction

• Use a compensation algorithm like Kahan summation, Dekker’s FastTwoSum,

Rump’s AccSum
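As an illustration of the last point, here is a minimal sketch of Kahan summation next to a naive sum. It assumes the code is compiled without fast-math style optimizations, which may remove the compensation term. The function names are chosen for this example.

```cpp
#include <vector>

// Naive accumulation: the rounding error grows with the running sum
float naive_sum(const std::vector<float>& values) {
    float sum = 0.0f;
    for (float v : values)
        sum += v;
    return sum;
}

// Kahan (compensated) summation
float kahan_sum(const std::vector<float>& values) {
    float sum = 0.0f;
    float compensation = 0.0f;        // running estimate of the lost low-order bits
    for (float v : values) {
        float y = v - compensation;   // apply the correction to the next term
        float t = sum + y;            // low-order bits of y may be lost here
        compensation = (t - sum) - y; // recover what was lost (zero in exact arithmetic)
        sum = t;
    }
    return sum;
}
```

Summing one million copies of 0.1f, the compensated version is expected to stay very close to 100,000, while the naive version drifts upward as the running sum grows.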

80/83

References

Suggested readings:

• What Every Computer Scientist Should Know About Floating-Point Arithmetic

• Do Developers Understand IEEE Floating Point?

• Yet another floating point tutorial

• Unavoidable Errors in Computing

Floating-point Comparison readings:

• The Floating-Point Guide - Comparison

• Comparing Floating Point Numbers, 2012 Edition

• Some comments on approximately equal FP comparisons

• Comparing Floating-Point Numbers Is Tricky

Floating point tools:

• IEEE754 visualization/converter

• Find and fix floating-point problems

81/83

System/360 Model 44

Ken Shirriff: Want to adjust your computer’s floating point precision by turning

a knob? You could do that on the System/360 Model 44

82/83

On Floating-Point

83/83

Modern C++

Programming

5. Basic Concepts III

Entities and Control Flow

Federico Busato

2025-04-14

Table of Contents

1 Entities

2 Declaration and Deﬁnition

3 Enumerators

4 struct, Bitﬁeld, and union

struct

Anonymous and Unnamed struct

⋆

Bitﬁeld

union

1/65

Table of Contents

5 Control Flow

if Statement

for and while Loops

Range-based for Loop

switch

goto

Avoid Unused Variable Warning

2/65

Table of Contents

6 Namespace

Explicit Global Namespace

Namespace Alias

using-Declaration

using namespace-Directive

inline Namespace

⋆

3/65

Table of Contents

7 Attributes

⋆

[[nodiscard]]

[[maybe_unused]]

[[deprecated]]

[[noreturn]]

4/65

Entities

A C++ program is a set of language-specific keywords (for, if, new, true, etc.), identifiers (symbols for variables, functions, structures, namespaces, etc.), expressions defined as sequences of operators, and literals (constant value tokens)

C++ Entity

An entity is a value, object, reference, function, enumerator, type, class member, or

template

Identiﬁers and user-deﬁned operators are the names used to refer to entities

Entities also capture the result(s) of an expression

Preprocessor macros are not C++ entities

5/65

Declaration and

Deﬁnition

Declaration/Deﬁnition

Declaration/Prototype

A declaration (or prototype) introduces an entity with an identiﬁer describing its

type and properties

A declaration is what the compiler and the linker need to accept references (usage) to that identifier

Entities can be declared multiple times. All declarations are the same

Deﬁnition/Implementation

An entity deﬁnition is the implementation of a declaration. It deﬁnes the properties

and the behavior of the entity

For each entity, only a single deﬁnition is allowed

6/65

Declaration/Deﬁnition Function Example

void f(int a, const char* b); // function declaration

void f(int a, const char*) { // function definition
    ... // "b" can be omitted if not used
}

void f(int a, const char* b); // function declaration
                              // multiple declarations are valid

f(3, "abc"); // usage

void g(); // function declaration

g(); // linking error "g" is not defined

7/65

Declaration/Deﬁnition struct Example

A declaration without a concrete implementation is an incomplete type (as void )

struct A; // declaration 1

struct A; // declaration 2 (ok)

struct B { // declaration and deﬁnition

int b;

// A x; // compile error incomplete type

A* y; // ok, pointer to incomplete type

};

struct A { // deﬁnition

char c;

};

8/65

Enumerators

Enumerator - enum

Enumerator

An enumerator enum is a data type that groups a set of named integral constants

enum color_t { BLACK, BLUE, GREEN };

color_t color = BLUE;

cout << (color == BLACK); // print false

The problem:

enum color_t { BLACK, BLUE, GREEN };

enum fruit_t { APPLE, CHERRY };

color_t color = BLACK; // int: 0

fruit_t fruit = APPLE; // int: 0

bool b = (color == fruit); // print 'true'!!

// and, most importantly, does the match between a color and

// a fruit make any sense?

9/65

Strongly Typed Enumerator - enum class

enum class (C++11)

enum class (scoped enum) data type is a type safe enumerator that is not implicitly

convertible to int

enum class Color { BLACK, BLUE, GREEN };

enum class Fruit { APPLE, CHERRY };

Color color = Color::BLUE;

Fruit fruit = Fruit::APPLE;

// bool b = (color == fruit) compile error we are trying to match colors with fruits

// BUT, they are different things entirely

// int a1 = Color::GREEN; compile error

// int a2 = Color::BLUE + Color::GREEN; compile error

int a3 = (int) Color::GREEN; // ok, explicit conversion
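The explicit conversion can also be written generically. This is a sketch with a hypothetical helper to_integer, using std::underlying_type_t (C++14); C++23 provides std::to_underlying for the same purpose.

```cpp
#include <type_traits>

enum class Color { BLACK, BLUE, GREEN };

// Hypothetical helper: explicit conversion of a scoped enum to its underlying type
template<typename Enum>
constexpr std::underlying_type_t<Enum> to_integer(Enum value) {
    return static_cast<std::underlying_type_t<Enum>>(value);
}

static_assert(to_integer(Color::GREEN) == 2, "enumerators are numbered from 0");
```

Compared with a C-style cast to int, the helper always converts to the enum's actual underlying type, so it stays correct if that type is later changed.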

10/65

enum/enum class Features

• enum/enum class can be compared

enum class Color { RED, GREEN, BLUE };

cout << (Color::RED < Color::GREEN); // print true

• enum/enum class are automatically enumerated in increasing order

enum class Color { RED, GREEN = -1, BLUE, BLACK };

// (0) (-1) (0) (1)

Color::RED == Color::BLUE; // true

• enum/enum class can contain alias

enum class Device { PC = 0, COMPUTER = 0, PRINTER };

• C++11 enum/enum class allows setting the underlying type

enum class Color : int8_t { RED, GREEN, BLUE };

11/65

enum class Features - C++17

• C++17 enum class supports direct-list-initialization

enum class Color { RED, GREEN, BLUE };

Color a{2}; // ok, equal to Color::BLUE

• C++17 enum/enum class support attributes

enum class Color { RED, GREEN, BLUE [[deprecated]] };

auto x = Color::BLUE; // compiler warning

12/65

enum class Features - C++20

• C++20 allows introducing the enumerator identiﬁers into the local scope to

decrease the verbosity

enum class Color { RED, GREEN, BLUE };

switch (x) {

using enum Color; // C++20

case RED:

case GREEN:

case BLUE:

}

The same behavior can be emulated in older C++ versions with

enum class Color { RED, GREEN, BLUE };

constexpr auto RED = Color::RED;

13/65

enum/enum class - Common Errors

• enum/enum class variables should always be initialized

enum class Color { RED, GREEN, BLUE };

Color my_color;

// "my_color" may be outside RED, GREEN, BLUE!!

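• C++17 Casting a value outside the range of the enumeration values leads to undefined behavior for an enum without a fixed underlying type. With a fixed underlying type (always the case for enum class), the value is first converted to that type

enum Color { RED, GREEN, BLUE }; // representable values: [0, 3]

Color value = (Color) 256; // undefined behavior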

14/65

enum/enum class and constexpr

⋆

• C++17 constexpr expressions don’t allow out-of-range values for (only) enum

without explicit underlying type

enum Color { RED };

enum Fruit : int { APPLE };

enum class Device { PC };

// constexpr Color a1 = (Color) -1; compile error

const Color a2 = (Color) -1; // ok

constexpr Fruit a3 = (Fruit) -1; // ok

constexpr Device a4 = (Device) -1; // ok

Construction Rules for enum class Values

15/65

struct, Bitfield, and union

struct 1/2

A struct (structure) aggregates diﬀerent variables into a single unit

struct A {

int x;

char y;

};

It is possible to declare one or more variables after the definition of a struct

struct A {

int x;

} a, b;

Enumerators can be declared within a struct without a name

struct A {

    enum {X, Y};

};

A::X;

16/65

struct 2/2

It is possible to declare a struct in a local scope (with some restrictions), e.g.

function scope

int f() {

struct A {

int x;

} a;

return a.x;

}

17/65

Anonymous and Unnamed struct

⋆

Unnamed struct : a structure without a name, but with an associated type

Anonymous struct : a structure without a name and type

The C++ standard allows unnamed struct but, contrary to C, does not allow anonymous struct (i.e. without a name)

struct {

int x;

} my_struct; // unnamed struct, ok

struct S {

int x;

struct { int y; }; // anonymous struct, compiler warning with -Wpedantic

}; // -Wpedantic: diagnose use of non-strict ISO C++ extensions

18/65

Bitﬁeld

A bitfield is a variable of a structure with a predefined bit width. A bitfield can hold bits instead of bytes

struct S1 {

unsigned b1 : 10; // range [0, 1023]

unsigned b2 : 10; // range [0, 1023]

unsigned b3 : 8; // range [0, 255]

}; // sizeof(S1): 4 bytes

struct S2 {

int b1 : 10;

int : 0; // reset: force the next field

int b2 : 10; // to start at bit 32

}; // sizeof(S2): 8 bytes
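The value range of a bitfield can be illustrated with a short sketch. The struct Packed and the function store_low are hypothetical names, not from the slides. The value is masked explicitly so that the result does not depend on how a given compiler handles a store that does not fit in the field.

```cpp
// Hypothetical example type: two unsigned bitfields packed together
struct Packed {
    unsigned low  : 10;  // range [0, 1023]
    unsigned high : 6;   // range [0, 63]
};

// Stores "value" in the 10-bit field and returns what the field holds.
// Only the 10 least significant bits are kept (explicit mask 0x3FF).
unsigned store_low(unsigned value) {
    Packed p{};
    p.low = value & 0x3FFu;
    return p.low;
}
```

For instance, store_low(1023) returns 1023, while store_low(1024) returns 0 because bit 10 is outside the field.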

19/65

union 1/2

Union

A union is a special data type that allows storing different data types in the same memory location

• The union is only as big as necessary to hold its largest data member

• The union is a kind of “overlapping” storage

20/65

union 2/2

union A {

int x;

char y;

};

// sizeof(A): 4

A a;

a.x = 1023; // bits: 00..000001111111111

a.y = 0; // bits: 00..000001100000000

cout << a.x; // print 512 + 256 = 768

NOTE: Little-Endian encoding maps the bytes of a value in memory in the reverse order. y maps to the last byte of x

Contrary to struct , C++ allows anonymous union (i.e. without a name)

C++17 introduces std::variant to represent a type-safe union
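A minimal sketch of std::variant as a replacement for the union A shown earlier (requires C++17). The alias IntOrChar and the function describe are hypothetical names introduced only for illustration.

```cpp
#include <string>
#include <variant>

// Type-safe counterpart of "union A { int x; char y; }"
using IntOrChar = std::variant<int, char>;

// Hypothetical helper: reports which alternative is currently stored.
// Unlike a raw union, the variant remembers its active member.
std::string describe(const IntOrChar& v) {
    if (std::holds_alternative<int>(v))
        return "int: " + std::to_string(std::get<int>(v));
    return std::string("char: ") + std::get<char>(v);
}
```

Reading the wrong alternative with std::get throws std::bad_variant_access instead of silently reinterpreting the bytes.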

21/65

Control Flow

if Statement

The if statement executes the ﬁrst branch if the speciﬁed condition is evaluated to

true , the second branch otherwise

• Short-circuiting:

if (<true expression> || array[-1] == 0)

...

// no error!! even though index is -1

// left-to-right evaluation

• Ternary operator:

<cond> ? <expression1> : <expression2>

<expression1> and <expression2> must return a value of the same or convertible type

int value = (a == b) ? a : (b == c ? b : 3); // nested
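Short-circuit evaluation is commonly used to guard an access that would otherwise be invalid. A minimal sketch, where is_zero_at is a hypothetical function name:

```cpp
// Returns true only when "index" is inside [0, size) and the element is zero.
// Operands are evaluated left to right: array[index] is never read
// when one of the bounds checks has already failed.
bool is_zero_at(const int* array, int size, int index) {
    return index >= 0 && index < size && array[index] == 0;
}
```

With this ordering, a call with index -1 returns false without touching the array.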

22/65

for and while Loops

• for

for ([init]; [cond]; [increment]) {

...

}

Use when the number of iterations is known

• while

while (cond) {

...

}

Use when the number of iterations is not known

• do while

do {

...

} while (cond);

Use when the number of iterations is not known, but there is at least one iteration

23/65

for Loop Features and Jump Statements

• C++ allows multiple initializations and increments in the declaration:

for (int i = 0, k = 0; i < 10; i++, k += 2)

...

• Inﬁnite loop:

for (;;) // also while(true);

...

• Jump statements (break, continue, return):

for (int i = 0; i < 10; i++) {

if (<condition>)

break; // exit from the loop

if (<condition>)

continue; // continue with a new iteration and exec. i++

return; // exit from the function

}

24/65

Range-based for Loop 1/3

C++11 introduces the range-based for loop to reduce the verbosity of traditional for loop constructs. It is equivalent to a for loop operating over a range of values, but safer

The range-based for loop removes the need to specify the start, end, and increment of the loop

for (int v : { 3, 2, 1 }) // INITIALIZER LIST

cout << v << " "; // print: 3 2 1

int values[] = { 3, 2, 1 };

for (int v : values) // ARRAY OF VALUES

cout << v << " "; // print: 3 2 1

for (auto c : "abcd") // RAW STRING

cout << c << " "; // print: a b c d

25/65

Range-based for Loop ⇝ 2/3

Range-based for loop can be applied in three cases:

• Fixed-size array int array[3] , "abcd"

• Braced initializer list {1, 2, 3}

• Any object with begin() and end() methods

std::vector vec{1, 2, 3, 4};

for (auto x : vec)

    cout << x << ", "; // print: "1, 2, 3, 4, "

int matrix[2][4];

for (auto& row : matrix) {

for (auto element : row)

cout << "@";

cout << "\n";

}

// print: @@@@

// @@@@
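The third case, an object with begin() and end() methods, can be sketched with a user-defined type. IntRange, its nested Iterator, and sum are hypothetical names. The iterator provides only the three operations that the range-based for loop needs: dereference, pre-increment, and inequality.

```cpp
// Hypothetical type: the half-open integer interval [first, last)
struct IntRange {
    int first;
    int last;

    struct Iterator {
        int value;
        int operator*() const { return value; }
        Iterator& operator++() { ++value; return *this; }
        bool operator!=(const Iterator& other) const {
            return value != other.value;
        }
    };

    Iterator begin() const { return {first}; }
    Iterator end() const { return {last}; }
};

// Sums all values of the interval with a range-based for loop
int sum(const IntRange& range) {
    int total = 0;
    for (int v : range)
        total += v;
    return total;
}
```

For example, sum(IntRange{1, 5}) visits 1, 2, 3, 4.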

26/65

Range-based for Loop ⇝ 3/3

C++17 extends the range-based for loop with structured bindings

struct A {

int x;

int y;

};

A array[] = { {1,2}, {5,6}, {7,1} };

for (auto [x1, y1] : array)

cout << x1 << "," << y1 << " "; // print: 1,2 5,6 7,1

27/65

switch 1/2

The switch statement evaluates an expression ( int , char , enum class , enum ) and executes the statement associated with the matching case value

char x = ...

switch (x) {

case 'a': y = 1; break;

default: return -1;

}

return y;

Switch scope:

int x = 1;

switch (1) {

case 0: int x; // nearest scope

case 1: cout << x; // undefined!!

case 2: { int y; } // ok

// case 3: cout << y; // compile error

}

28/65

switch 2/2

Fall-through:

MyEnum x = ...

int y = 0;

switch (x) {

case MyEnum::A: // fall-through

case MyEnum::B: // fall-through

case MyEnum::C: return 0;

default: return -1;

}

C++17 [[fallthrough]] attribute

char x = ...

switch (x) {

case 'a': x++;

[[fallthrough]]; // C++17: avoid warning

case 'b': return 0;

default: return -1;

}

29/65

Control Flow with Initializing Statement

Control flow with initializing statement aims at simplifying complex actions before the condition evaluation and at restricting the scope of a variable, which is visible only in the control flow body

C++17 introduces if statement with initializer

if (int ret = x + y; ret < 10)

    cout << ret;

C++17 introduces switch statement with initializer

switch (auto i = f(); x) {

    case 1: return i + x;

}

C++20 introduces range-for loop statement with initializer

for (int i = 0; auto x : {'A', 'B', 'C'})

cout << i++ << ":" << x << " "; // print: 0:A 1:B 2:C
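A typical use of the if statement with initializer is a container lookup, where the iterator is needed only inside the branch. The sketch below requires C++17. The function value_or is a hypothetical name.

```cpp
#include <map>
#include <string>

// Hypothetical lookup helper: returns the value stored under "key",
// or "fallback" when the key is missing.
int value_or(const std::map<std::string, int>& table,
             const std::string& key, int fallback) {
    // "it" is visible only inside the if statement
    if (auto it = table.find(key); it != table.end())
        return it->second;
    return fallback;
}
```

Without the initializer, the iterator would have to be declared before the if and would remain visible for the rest of the function.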

30/65

goto 1/4

When goto could be useful:

bool flag = true;

for (int i = 0; i < N && flag; i++) {

    for (int j = 0; j < M && flag; j++) {

        if (<condition>)

            flag = false;

    }

}

becomes:

for (int i = 0; i < N; i++) {

    for (int j = 0; j < M; j++) {

        if (<condition>)

            goto LABEL;

    }

}

LABEL: ;

31/65

goto 2/4

Best solution:

bool my_function(int N, int M) {

    for (int i = 0; i < N; i++) {

        for (int j = 0; j < M; j++) {

            if (<condition>)

                return false;

        }

    }

    return true;

}
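A concrete instance of the function-based approach, as a sketch: the search is moved into its own function so that a single return leaves both loops. The name contains and the use of std::vector are assumptions made for the example.

```cpp
#include <vector>

// Returns true if "target" appears anywhere in the matrix.
// The return statement exits both loops at once: no flag, no goto.
bool contains(const std::vector<std::vector<int>>& matrix, int target) {
    for (const auto& row : matrix) {
        for (int value : row) {
            if (value == target)
                return true;
        }
    }
    return false;
}
```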

32/65


Avoid Unused Variable Warning 1/3

Most compilers issue a warning when a variable is unused. There are diﬀerent

situations where a variable is expected to be unused

// EXAMPLE 1: macro dependency

int f(int value) {

int x = value;

# if defined(ENABLE_SQUARE_PATH)

return x * x;

# else

return 0;

# endif

}

35/65

Avoid Unused Variable Warning ⋆ 2/3

// EXAMPLE 2: constexpr dependency (MSVC)

template<typename T>

int f(T value) {

if constexpr (sizeof(value) >= 4)

return 1;

else

return

}

// EXAMPLE 3: decltype dependency (MSVC)

template<typename T>

int g(T value) {

using R = decltype(value);

return R{};

}

36/65

Avoid Unused Variable Warning 3/3

There are different ways to solve the problem depending on the standard used

• Before C++17: static_cast<void>(var)

• C++17 [[maybe_unused]] attribute

• C++26 auto _

[[maybe_unused]] int x = value;

int y = 3;

static_cast<void>(y);

auto _ = 3;

auto _ = 4; // _ repetition is not an error

void f([[maybe_unused]] int x) {}

37/65

Namespace

Overview

The problem: Named entities, such as variables, functions, and compound types declared outside any block have global scope, meaning that their names/symbols are valid anywhere in the code

Namespaces allow grouping named entities that otherwise would have global scope into narrower scopes, giving them namespace scope

Namespaces provide a method for preventing name conﬂicts in large projects. Symbols

declared inside a namespace block are placed in a named scope that prevents them from

being mistaken for symbols with identical names

38/65

Namespace Syntax

namespace [<name>] {

<identifier> // variable, function, struct, type, etc.

} // namespace <name>

<name>::<identifier> // use the identifier

The operator :: is called scope resolution operator and it allows accessing

identiﬁers that are deﬁned in other namespaces

39/65

Namespace Example 1

# include <iostream>

namespace my_namespace1 {

void f() {

    std::cout << "my_namespace1" << std::endl;

}

} // namespace my_namespace1

namespace my_namespace2 {

void f() {

    std::cout << "my_namespace2" << std::endl;

}

} // namespace my_namespace2

int main () {

    my_namespace1::f(); // print "my_namespace1"

    my_namespace2::f(); // print "my_namespace2"

    // f(); // compile error f() is not visible

}

40/65

Namespace - Alternative Syntax

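It is also possible to define an entity of a preexisting namespace outside the namespace block by adding the namespace name as a prefix. The entity must already be declared in the namespace:

namespace <name> { <declaration> }

<name>::<identifier> // definition

# include <iostream>

namespace my_namespace1 { void f(); }

void my_namespace1::f() { std::cout << "my_namespace1" << std::endl; }

int main () {

    my_namespace1::f(); // print "my_namespace1"

}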

41/65

Special Namespaces

• All functionalities and data types provided with the standard library (distributed along with the compiler) are declared within the std namespace

• The global namespace can be specified with ::identifier and can be useful to prevent conflicts with surrounding namespaces

• It is also possible to define a namespace without a name. This is called an anonymous (or unnamed) namespace

See "Translation Unit I" lecture for more details

42/65

Nested Namespaces

namespace my_namespace1 {

void f() { cout << "my_namespace1::f()"; }

namespace my_namespace2 {

void f() { cout << "my_namespace1::my_namespace2::f()"; }

} // namespace my_namespace2

} // namespace my_namespace1

my_namespace1::my_namespace2::f();

C++17 allows nested namespace definitions with a less verbose syntax:

namespace my_namespace1::my_namespace2 {

void h();

}

43/65

Explicit Global Namespace

The explicit global namespace syntax ::identifier can be useful to prevent conﬂicts

with surrounding namespaces

void f() { cout << "global::f()"; }

namespace my_namespace {

void f() { cout << "my_namespace::f()"; }

void g() {

f(); // print "my_namespace::f()"

::f(); // print "global::f()"

}

} // namespace my_namespace

44/65

Namespace Alias

Namespace alias allows declaring an alternate name for an existing namespace

namespace very_long_namespace {

namespace even_longer {

void g() {}

} // namespace even_longer

} // namespace very_long_namespace

namespace ns1 = very_long_namespace::even_longer; // namespace alias

int main() {

namespace ns2 = very_long_namespace::even_longer; // namespace alias

// available only in this scope

ns1::g();

ns2::g();

}

45/65

using-Declaration 1/2

The using -declaration introduces a specific name/symbol from a namespace into the current scope. This is useful for improving code readability and reducing verbosity

The using -declaration is roughly equivalent to declaring the name/symbol in the current scope

Syntax:

namespace <name> {

}

using <name>::<identifier>;

<identifier>;

46/65

using-Declaration 2/2

namespace my_namespace {

void f() { cout << "my_namespace::f()"; }

struct S {};

using T = int;

} // namespace my_namespace

using my_namespace::f;

using my_namespace::S;

using my_namespace::T;

f(); // print "my_namespace::f()"

S s;

T x;

// struct S {}; // compile error "struct S" already defined by my_namespace::S

47/65

using namespace-Directive

The using namespace -directive introduces all the identifiers of a namespace into a scope without having to specify them explicitly with the namespace name

Similarly to the using -declaration, it is useful for improving code readability and reducing verbosity. On the other hand, it could make the code bug-prone because of the complex name lookup rules, especially if coupled with function overloading

It is generally recommended not to write using namespace , especially at the global level. Otherwise, it defeats the purpose of the namespace

48/65

using namespace-Directive

namespace my_namespace {

void f() { cout << "my_namespace::f()"; }

struct S {};

} // namespace my_namespace

int main () {

using namespace my_namespace;

f(); // print "my_namespace::f()"

S s;

}

49/65

using namespace-Directive vs. using-declaration

namespace A { int x = 0; }

namespace B {

int y = 3;

int x = 7;

}

int main () {

using namespace A;

int x = 3; // ok!! even if "x" is already defined in namespace A

using B::y;

// int y = 5; // compiler error!! "y" is already defined in this scope

}

void f() {

using B::x;

using namespace A;

cout << x; // print 7, B::x has higher priority

}
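The priority rule of the last function can be checked with a small self-contained sketch. It reuses the namespaces A and B of the example, while read_x is a hypothetical function name.

```cpp
namespace A { int x = 0; }

namespace B {
int y = 3;
int x = 7;
}

// The using-declaration declares "x" in the function scope, while the
// using-directive only makes A::x visible as if it were declared in the
// global namespace. Name lookup reaches the function scope first.
int read_x() {
    using B::x;
    using namespace A;
    return x;  // refers to B::x
}
```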

50/65

using namespace-Directive Transitive Property 1/3

The using namespace -directive has the transitive property for its identifiers when used inside another namespace

namespace A {

void f() { cout << "A::f()"; }

}

namespace B {

using namespace A;

}

int main() {

using namespace B;

f(); // ok, print "A::f()"

}

51/65

using namespace-Directive Transitive Property ⋆ 2/3

The unqualified name lookup is the mechanism by which the compiler searches for the declaration of an identifier without using any explicit scope qualifiers like the :: operator

Unqualified name lookup and using namespace-Directive:

Every name from namespace-name is visible as if it is declared in the nearest enclosing namespace which contains both the using -directive and namespace-name

52/65

using namespace-Directive Transitive Property ⋆ 3/3

namespace A { int i = 0; }

namespace C {

int i = 3;

namespace B {

using namespace A; // unqualified name lookup of A within B:

int x = i; // it is the nearest enclosing namespace which contains

} // namespace B // both A and B -> global namespace

// "int x = i" -> "int x = C::i" because C has higher

} // namespace C // precedence than the global namespace

int main() {

using namespace C;

cout << C::B::x; // print "3"

}

53/65

inline Namespace

⋆

inline namespace is a concept similar to library versioning. It is a mechanism that

makes a nested namespace look and act as if all its declarations were in the

surrounding namespace

namespace my_namespace1 {

inline namespace V99 { void f(int) {} } // most recent version

namespace V98 { void f(int) {} }

} // namespace my_namespace1

using namespace my_namespace1;

V98::f(1); // call V98

V99::f(1); // call V99

f(1); // call default version (V99)
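The versioning idea can be sketched with functions that return a value, so that the selected version is observable. The names my_library, v1, v2, and version are hypothetical.

```cpp
namespace my_library {

inline namespace v2 {  // current version: reachable without the "v2::" prefix
int version() { return 2; }
}  // namespace v2

namespace v1 {         // previous version: reachable only with the "v1::" prefix
int version() { return 1; }
}  // namespace v1

}  // namespace my_library
```

Client code that calls my_library::version() picks up the inline namespace. Moving the inline keyword to another version changes the default without touching the callers.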

54/65

Attributes

⋆

C++ Attribute Overview

C++ attributes provide additional information to the compiler to enforce constraints

or enable code optimization

Attributes are annotations on top of standard code that can be applied to functions, variables, classes, enumerators, types, etc.

C++11 introduces a standardized syntax for attributes: [[my-attribute]]

__attribute__((always_inline)) // < C++11, GCC/Clang/GNU compilers

__forceinline // < C++11, MSVC

[[gnu::always_inline]] // C++11, GCC/Clang/GNU compilers

[[msvc::forceinline]] // C++11, MSVC

In addition, C++11 and later add standard attributes such as maybe_unused , deprecated , and nodiscard

55/65

[[nodiscard]] Attribute 1/3

C++17 introduces the attribute [[nodiscard]] to issue a warning if the return

value of a function is discarded (not handled)

C++20 extends the attribute by allowing a reason to be added

[[nodiscard("reason")]]

[[nodiscard]] bool empty();

empty(); // WARNING "discard return value"

C++23 adds the [[nodiscard]] attribute to lambda expressions

auto lambda = [] [[nodiscard]] (){ return 4; };

lambda(); // compiler warning

auto x = lambda(); // ok

56/65

[[nodiscard]] Attribute 2/3

[[nodiscard]] can also be applied to enumerators enum / enum class and structures struct / class

enum class [[nodiscard]] MyEnum { EnumValue };

struct [[nodiscard]] MyStruct {};

MyEnum f() { return MyEnum::EnumValue; }

MyStruct g() {

MyStruct s;

return s;

}

f(); // WARNING "discard return value"

g(); // WARNING "discard return value"

57/65

[[nodiscard]] Attribute 3/3

[[nodiscard]] can also be applied to class constructors

struct MyStruct {

    [[nodiscard]] MyStruct() {}

    [[nodiscard]] MyStruct(const MyStruct&) {}

};

MyStruct{}; // WARNING "discard return value"

MyStruct s{};

static_cast<MyStruct>(s); // WARNING "discard return value" for

// MyStruct(const MyStruct&)

58/65

[[maybe_unused]] Attribute 1/2

[[maybe_unused]] applies to

• Variables

• Structured bindings

• Function parameters and return value

• Types

• Classes and structures

• Enumerators and single value enumerators

The limits of [[maybe_unused]]

59/65

[[maybe_unused]] Attribute 2/2

[[maybe_unused]] int x1;

[[maybe_unused]] auto [x2, x3] = ...;

[[maybe_unused]] int f([[maybe_unused]] int x4);

struct [[maybe_unused]] S {};

using MyInt [[maybe_unused]] = int;

enum [[maybe_unused]] Enum {

    E1 [[maybe_unused]]

};

enum class [[maybe_unused]] EnumClass {

    E2 [[maybe_unused]]

};

60/65

[[deprecated]] Attribute 1/4

C++14 allows deprecating, namely discouraging, the use of entities by adding the [[deprecated]] attribute, optionally with a message [[deprecated("reason")]] . It applies to:

• Functions

• Variables

• Classes and structures

• Enumerators

• Single value enumerator in C++17

• Types

• Namespaces

61/65

[[deprecated]] Attribute 2/4

[[deprecated]] void f() {}

struct [[deprecated]] S1 {};

using MyInt [[deprecated]] = int;

struct S2 {

[[deprecated]] int var = 3;

[[deprecated]]

static constexpr int var2 = 4;

};

f(); // compiler warning

S1 s1; // compiler warning

MyInt i; // compiler warning

S2{}.var; // compiler warning

S2::var2; // compiler warning

62/65

[[deprecated]] Attribute and Enumerator 3/4

C++17 allows deprecating individual enumerator values

enum [[deprecated]] E { EnumValue }; // C++14

enum class MyEnum { A, B [[deprecated]] = 42 }; // C++17

auto x = EnumValue; // compiler warning

MyEnum::B; // compiler warning

63/65

[[deprecated]] Attribute and Namespace 4/4

C++17 allows defining attributes on namespaces

namespace [[deprecated("please use my_namespace_v2")]] my_namespace {

void f() {}

} // namespace my_namespace

my_namespace::f(); // compiler warning

64/65

[[noreturn]] Attribute

[[noreturn]] indicates that a function does not return (e.g. program termination). The compiler should issue a warning if the code contains statements after the call, because they can never be executed and likely reveal a wrong user intention

[[noreturn]] void g() { std::exit(0); }

g();

// WARNING: no code should be executed after calling this function

y = x + 1;

65/65

Modern C++

Programming

6. Basic Concepts IV

Memory Concepts

Federico Busato

2025-04-14

Table of Contents

1 Pointers

Pointer Operations

Address-of operator &

struct Member Access

void Pointer

Pointer Conversion

Pointer Arithmetic

Wild and Dangling Pointers

1/93

Table of Contents

2 Heap and Stack

Stack Memory

new, delete

Non-Allocating Placement Allocation

⋆

Non-Throwing Allocation

⋆

Memory Leak

2/93

Table of Contents

3 Initialization

Variable Initialization

Uniform Initialization

Array Initialization

Structure Initialization

Structure Binding

Dynamic Memory Initialization

4 References

3/93

Table of Contents

5 const and Constant Expressions

Constants and Literals

const

constexpr

consteval

constinit

if constexpr

std::is_constant_evaluated()

if consteval

4/93

Table of Contents

6 volatile Keyword

⋆

7 Explicit Type Conversion

static_cast

const_cast

reinterpret_cast

Type Punning

std::bit_cast

Uniform Initialization Conversion

gsl::narrow_cast

⋆

5/93

Table of Contents

8 sizeof Operator

[[no_unique_address]]

6/93

Pointers

Pointers and Pointer Operations 1/3

Pointer

A pointer T* is a value referring to a location in memory

Pointer Dereferencing

Pointer dereferencing (*ptr) means obtaining the value stored at the location referred to by the pointer

Subscript Operator []

The subscript operator (ptr[]) allows accessing the element at a given position relative to the pointer

7/93

Pointers and Pointer Operations 2/3

A pointer (e.g. void* ) is represented as an unsigned integer of 32-bit/64-bit depending on the underlying architecture

• It only supports the operators +, -, ++, -- , comparisons ==, !=, <, <=, >, >= , subscript [] , and dereferencing *

• A pointer can be explicitly converted to an integer type

void* x;

size_t y = (size_t) x; // ok (explicit conversion)

// size_t y = x; // compile error (implicit conversion)

8/93

Pointers and Pointer Operations 3/3

Dereferencing:

int* ptr1 = new int;

*ptr1 = 4; // dereferencing (assignment)

int a = *ptr1; // dereferencing (get value)

Array subscript:

int* ptr2 = new int[10];

ptr2[2] = 3;

int var = ptr2[4];

Common error:

int *ptr1, ptr2; // one pointer and one integer!!

int *ptr1, *ptr2; // ok, two pointers

9/93

Address-of operator &

The address-of operator (&) returns the address of a variable

int a = 3;

int* b = &a; // address-of operator,

// 'b' is equal to the address of 'a'

a++;

cout << *b; // print 4;

Not to be confused with the reference syntax: T& var = ...

10/93

struct Member Access

• The dot (.) operator is applied to local objects and references (see next slides)

• The arrow operator (->) is used with a pointer to an object

struct A {

int x;

};

A a; // local object

a.x; // dot syntax

A* ptr = &a; // pointer

ptr->x; // arrow syntax: same as (*ptr).x

11/93

void Pointer - Generic Pointer

Instead of declaring different types of pointer variables, it is possible to declare a single pointer variable which can act as any pointer type

• void* can be compared

• Common pointer operations are not allowed because there is no specific pointed type

cout << (sizeof(void*) == sizeof(int*)); // print true

int array[] = { 2, 3, 4 };

void* ptr;

cout << (array == ptr);

// *ptr; // compile error

// ptr + 2; // compile error

12/93

Pointer Conversion

• Any pointer type can be implicitly converted to void*

• Conversions between non- void pointer types must be explicit

• static_cast (see next slides) does not allow pointer conversion for safety reasons, except for void*

int* ptr1 = ...;

void* ptr2 = ptr1; // int* -> void*, implicit conversion

void* ptr3 = ...;

int* ptr4 = (int*) ptr3; // void* -> int*, explicit conversion required

// static_cast allowed

int* ptr5 = ...;

char* ptr6 = (char*) ptr5; // int* -> char*, explicit conversion required,

// static_cast not allowed, dangerous

13/93

1 + 1 = 2 : Pointer Arithmetic 1/3

Subscript operator meaning:

ptr[i] is equal to *(ptr + i)

Note: subscript operator accepts also negative values

Pointer arithmetic rule:

address(ptr + i) = address(ptr) + (sizeof(T) * i)

where T is the type of elements pointed by ptr

int array[4] = {1, 2, 3, 4};

cout << array[1]; // print 2

cout << *(array + 1); // print 2

cout << array; // print 0xFFFAFFF2

cout << array + 1; // print 0xFFFAFFF6!!

int* ptr = array + 2;

cout << ptr[-1]; // print 2
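The pointer arithmetic rule can be verified by measuring the distance in bytes between ptr + i and ptr. A minimal sketch, where byte_offset is a hypothetical helper name:

```cpp
#include <cstddef>

// Returns the distance in bytes between (ptr + i) and ptr.
// Both pointers are viewed as "const char*" so that their difference is
// expressed in bytes instead of elements of type T.
template <typename T>
std::ptrdiff_t byte_offset(const T* ptr, std::ptrdiff_t i) {
    const char* base  = reinterpret_cast<const char*>(ptr);
    const char* moved = reinterpret_cast<const char*>(ptr + i);
    return moved - base;
}
```

For an int array, byte_offset(array, 1) equals sizeof(int), which matches the 0xFFFAFFF2 to 0xFFFAFFF6 step shown above when int is 4 bytes wide.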

14/93

1 + 1 = 2 : Pointer Arithmetic 2/3

char arr[4] = "abc"

value  address
'a'    0x0  ← arr[0]
'b'    0x1  ← arr[1]
'c'    0x2  ← arr[2]
'\0'   0x3  ← arr[3]

int arr[3] = {4,5,6}

value  address
4      0x0  ← arr[0]
       0x1
       0x2
       0x3
5      0x4  ← arr[1]
       0x5
       0x6
       0x7
6      0x8  ← arr[2]
       0x9
       0xA
       0xB

15/93

Pointer Arithmetic - Undefined Behavior ⋆ 3/3

lib/vsprintf.c of the Linux kernel

int vsnprintf(char *buf, size_t size, ...) {

char *end;

/* Reject out-of-range values early

Large positive sizes are used for unknown buffer sizes */

if (WARN_ON_ONCE((int) size < 0))

return 0;

end = buf + size;

/* Make sure end is always >= buf */

if (end < buf) { ... } // Even if pointers are represented with unsigned values,

... // pointer overflow is undefined behavior.

// Both GCC and Clang will simplify the overflow check

// buf + size < buf to size < 0 by eliminating

} // the common term buf

16/93

Wild and Dangling Pointers

A wild pointer is an uninitialized pointer

int* ptr; // wild pointer

A dangling pointer points to a deallocated memory region

int* array = new int[10];

delete[] array; // ok -> "array" now is a dangling pointer

*array; // Potential segmentation fault

delete[] array; // double free or corruption!!

17/93

Heap and Stack

Process Address Space

higher memory addresses (0x00FFFFFF)

Stack (grows downward ↓): stack memory, e.g. int data[10]

Heap (grows upward ↑): dynamic memory, e.g. new int[10] , malloc(40)

BSS and Data Segment (.bss/.data): static/global data, e.g. int data[10] (global scope)

Code (.text)

lower memory addresses (0x00FF0000)

18/93

Data and BSS Segment

int data[] = {1, 2}; // DATA segment memory

int big_data[1000000] = {}; // BSS segment memory

// (zero-initialized)

int main() {

int A[] = {1, 2, 3}; // stack memory

}

Data/BSS (Block Started by Symbol) segments are larger than stack memory (max

≈ 1GB in general) but slower

19/93

Stack and Heap Memory Overview

Property | Stack | Heap

Memory organization | Contiguous (LIFO) | Contiguous within an allocation, fragmented between allocations (relies on virtual memory)

Max size | Small (8MB on Linux, 1MB on Windows) | Whole system memory

If exceeded | Program crash at function entry (hard to debug) | Exception or nullptr

Allocation | Compile-time | Run-time

Locality | High | Low

Thread view | Each thread has its own stack | Shared among threads

20/93

Stack Memory

A local variable is either in the stack memory or CPU registers

int x = 3; // not on the stack (data segment)

struct A {

int k; // depends on where the instance of A is

};

int main() {

int y = 3; // on stack

char z[] = "abc"; // on stack

A a; // on stack (also k)

void* ptr = malloc(4); // variable "ptr" is on the stack

}

The organization of the stack memory enables much higher performance. On the

other hand, this memory space is limited!!

21/93

Stack Memory Data

Types of data stored in the stack:

• Local variables: variables in a local scope

• Function arguments: data passed from caller to a function

• Return addresses: data passed from a function to a caller

• Compiler temporaries: compiler-specific instructions

• Interrupt contexts

22/93

Stack Memory

Every object which resides in the stack is not valid outside its scope!!

int* f() {

int array[3] = {1, 2, 3};

return array;

}

int* ptr = f();

cout << ptr[0]; // Illegal memory access!!

void g(bool x) {

const char* str = "abc";

if (x) {

char xyz[] = "xyz";

str = xyz;

}

cout << str; // if "x" is true, then Illegal memory access!!

}

23/93

Heap Memory - new, delete Keywords

new, delete

new/new[] and delete/delete[] are C++ keywords that perform dynamic

memory allocation/deallocation, and object construction/destruction at runtime

malloc and free are C functions and they only allocate and free memory blocks

(expressed in bytes)

24/93

new, delete Advantages

• Language keywords, not functions → safer

• Return type: new returns the exact data type, while malloc() returns void*

• Failure: new throws an exception, while malloc() returns a NULL pointer → it cannot be ignored, zero-size allocations do not need special code

• Allocation size: The number of bytes is calculated by the compiler with the new keyword, while the user must manually calculate the size for malloc()

• Initialization: new can be used to initialize in addition to allocating

• Polymorphism: objects with virtual functions must be allocated with new to initialize the virtual table pointer

25/93

Dynamic Memory Allocation

• Allocate a single element

int* value = (int*) malloc(sizeof(int)); // C

int* value = new int; // C++

• Allocate N elements

int* array = (int*) malloc(N * sizeof(int)); // C

int* array = new int[N]; // C++

• Allocate N structures

MyStruct* array = (MyStruct*) malloc(N * sizeof(MyStruct)); // C

MyStruct* array = new MyStruct[N]; // C++

• Allocate and zero-initialize N elements

int* array = (int*) calloc(N, sizeof(int)); // C

int* array = new int[N](); // C++

26/93

Dynamic Memory Deallocation

• Deallocate a single element

int* value = (int*) malloc(sizeof(int)); // C

free(value);

int* value = new int; // C++

delete value;

• Deallocate N elements

int* value = (int*) malloc(N * sizeof(int)); // C

free(value);

int* value = new int[N]; // C++

delete[] value;

27/93

Allocation/Deallocation Properties

Fundamental properties:

• Each object allocated with malloc() must be deallocated with free()

• Each object allocated with new must be deallocated with delete

• Each object allocated with new[] must be deallocated with delete[]

• malloc() , new , new[] never produce NULL pointer in the success case, except for zero-size allocations (implementation-defined)

• free() , delete , and delete[] applied to NULL / nullptr pointers do not produce errors

Mixing new , new[] , malloc with something diﬀerent from their counterparts leads

to undeﬁned behavior

28/93

2D Memory Allocation 1/2

Easy on the stack - dimensions known at compile-time:

int A[3][4]; // C/C++ uses row-major order: move on row elements, then columns

Dynamic Memory 2D allocation/deallocation - dimensions known at run-time:

int** A = new int*[3]; // array of pointers allocation

for (int i = 0; i < 3; i++)

A[i] = new int[4]; // inner array allocations

for (int i = 0; i < 3; i++)

delete[] A[i]; // inner array deallocations

delete[] A; // array of pointers deallocation
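The allocation and deallocation steps above can be grouped into two functions so that they always stay paired. This is a sketch with hypothetical names (allocate_matrix, free_matrix). It additionally zero-initializes the rows, which the slide code does not do.

```cpp
// Allocates a rows x cols matrix as an array of row pointers.
// Each row is a separate allocation, zero-initialized by "()".
int** allocate_matrix(int rows, int cols) {
    int** matrix = new int*[rows];      // array of pointers allocation
    for (int i = 0; i < rows; i++)
        matrix[i] = new int[cols]();    // inner array allocations
    return matrix;
}

// Releases the matrix in the reverse order: rows first, then the pointers.
void free_matrix(int** matrix, int rows) {
    for (int i = 0; i < rows; i++)
        delete[] matrix[i];             // inner array deallocations
    delete[] matrix;                    // array of pointers deallocation
}
```

A matrix obtained with allocate_matrix(3, 4) is accessed as matrix[i][j] and must be released with free_matrix(matrix, 3).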

29/93

2D Memory Allocation ⋆ 2/2

Dynamic memory 2D allocation/deallocation C++11:

auto A = new int[3][4]; // allocate 3 objects of type int[4]

int n = 3; // dynamic value

auto B = new int[n][4]; // ok

// auto C = new int[n][n]; // compile error

delete[] A; // same for B, C

30/93

Non-Allocating Placement ⋆ 1/2

The non-allocating placement new (ptr) type allows explicitly specifying the memory location (previously allocated) of individual objects

// STACK MEMORY

alignas(int) char buffer[8]; // aligned to hold an int

int* x = new (buffer) int;

short* y = new (x + 1) short[2];

// no need to deallocate x, y

// HEAP MEMORY

unsigned* buffer2 = new unsigned[2];

double* z = new (buffer2) double;

delete[] buffer2; // ok

// delete[] z; // ok, but bad practice

31/93

Non-Allocating Placement and Objects ⋆ 2/2

Placement allocation of non-trivial objects requires explicitly calling the object destructor, as the runtime is not able to detect when the object is out-of-scope

struct A {

    ~A() { cout << "destructor"; }

};

};

char buffer[10];

auto x = new (buffer) A();

// delete x; // runtime error 'x' is not a valid heap memory pointer

x->~A(); // print "destructor"

C++23 introduces a type safe placement allocation function std::start_lifetime_as()
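The pairing of placement new and explicit destructor call can be sketched as follows. The global counter alive, the struct Tracked, and the function placement_round_trip are hypothetical names used only to make the object lifetime observable.

```cpp
#include <new>  // placement new

int alive = 0;  // hypothetical counter of currently constructed objects

struct Tracked {
    Tracked()  { ++alive; }
    ~Tracked() { --alive; }
};

// Constructs a Tracked object inside a stack buffer and destroys it
// explicitly. Returns true if both steps were observed.
bool placement_round_trip() {
    alignas(Tracked) unsigned char buffer[sizeof(Tracked)];
    Tracked* obj = new (buffer) Tracked();  // no memory is allocated here
    const bool constructed = (alive == 1);
    obj->~Tracked();                        // required: no delete for placement new
    const bool destroyed = (alive == 0);
    return constructed && destroyed;
}
```

The buffer is declared with alignas so that its address satisfies the alignment requirement of the constructed type.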

32/93

Non-Throwing Allocation

⋆

The new operator allows a non-throwing allocation by passing the std::nothrow

object. It returns a null pointer instead of throwing a std::bad_alloc exception if

the memory allocation fails

int* array = new (std::nothrow) int[very_large_size];

Notes:

new (std::nothrow) can return a null pointer even if the requested size is 0

std::nothrow doesn't mean that the constructor of the allocated object(s) cannot throw

an exception

struct A {

A() { throw std::runtime_error{"error"}; }

};

A* array = new (std::nothrow) A; // throw std::runtime_error

33/93

Memory Leak

A memory leak is a dynamically allocated entity in the heap memory that is

no longer used by the program, but is still kept allocated for its whole execution

Problems:

• Illegal memory accesses → segmentation fault/wrong results

• Undefined values and their propagation → segmentation fault/wrong results

• Additional memory consumption (potential segmentation fault)

int main() {

int* array = new int[10];

array = nullptr; // memory leak!!

} // the memory can no longer be deallocated!!

Note: memory leaks are especially difficult to detect in complex code and when objects are

widely used
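One common way to avoid the leak shown above is to give the allocation an owner that releases it automatically. The sketch below assumes a hypothetical `Resource` type that counts live instances and uses std::unique_ptr from the standard library; it is an illustration, not a technique taken from the slides.

```cpp
#include <cassert>
#include <memory>

// Hypothetical resource that counts live instances, used to observe leaks
struct Resource {
    static int alive;
    Resource() { ++alive; }
    ~Resource() { --alive; }
};
int Resource::alive = 0;

// Returns the number of live Resource objects while the owner is in scope
inline int use_resource() {
    std::unique_ptr<Resource> owner(new Resource);
    assert(owner != nullptr);
    owner.reset(new Resource); // the previous object is deleted, not leaked
    return Resource::alive;
} // 'owner' goes out of scope here and deletes the remaining object
```

Reassigning the owner does not lose the first object, unlike assigning nullptr to a raw pointer.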

34/93

Dynamic Memory Allocation and OS

A program does not allocate memory directly; it asks the OS for a chunk of memory.

The OS provides the memory at the granularity of memory pages (virtual

memory), e.g. 4 KB on Linux

Implication: out-of-bound accesses do not always lead to segmentation fault (lucky

case). The worst case is an execution with undeﬁned behavior

int* x = new int;

int num_iters = 4096 / sizeof(int); // 4 KB

for (int i = 0; i < num_iters; i++)

x[i] = 1; // potential segmentation fault

35/93

Initialization

Variable Initialization

C++03:

int a1; // default initialization (undefined value)

int a2(2); // direct (or value) initialization

int a3(0); // direct (or value) initialization (zero-initialization)

// int a4(); // a4 is a function

int a5 = 2; // copy initialization

int a6 = 2u; // copy initialization (+ implicit conversion)

int a7 = int(2); // copy initialization

int a8 = int(); // copy initialization (zero-initialization)

int a9 = {2}; // copy list initialization, brace-initialization/braced-init-list syntax

36/93

Uniform Initialization

C++11 Uniform Initialization syntax allows initializing different entities

(variables, objects, structures, etc.) in a consistent way with brace-initialization or

braced-init-list syntax:

int b1{2}; // direct list (or value) initialization

int b2{}; // direct list (or value) initialization (default constructor/

// zero-initialization)

int b3 = int{}; // copy initialization (default constr./zero-initialization)

int b4 = int{4}; // copy initialization

int b5 = {}; // copy list initialization (default constr./zero-initialization)

37/93

Brace Initialization Advantages

Uniform initialization can also be used to safely convert arithmetic types,

preventing implicit narrowing, i.e., potential value loss. The syntax is also more concise

than modern casts

int b4 = -1; // ok

int b5{-1}; // ok

unsigned b6 = -1; // ok

//unsigned b7{-1}; // compile error

float f1{10e30}; // ok

float f2 = 10e40; // ok, "inf" value

//float f3{10e40}; // compile error

38/93

Array Initialization 1/2

Arrays are aggregate types and can be initialized with brace-initialization syntax, also

called braced-init-list or aggregate-initialization

One dimension:

int a[3] = {1, 2, 3}; // explicit size

int b[] = {1, 2, 3}; // implicit size

char c[] = "abcd"; // implicit size

int d[3] = {1, 2}; // d[2] = 0 -> zero/default value

int e[4] = {0}; // all values are initialized to 0

int f[3] = {}; // all values are initialized to 0 (C++11)

int g[3] {}; // all values are initialized to 0 (C++11)

39/93

Array Initialization 2/2

Two dimensions:

int a[][2] = { {1,2}, {3,4}, {5,6} }; // ok

int b[][2] = { 1, 2, 3, 4 }; // ok

// "a" has type int[3][2], "b" has type int[2][2]

// int c[][] = ...; // compile error

// int d[2][] = ...; // compile error

40/93

Structure Initialization - C++03 1/4

Structures are also aggregate types and can be initialized with brace-initialization

syntax, also called braced-init-list or aggregate-initialization

struct S {

unsigned x;

unsigned y;

};

S s1; // default initialization, x,y undefined values

S s2 = {}; // copy list initialization, x,y default constr./zero-init

S s3 = {1, 2}; // copy list initialization, x=1, y=2

S s4 = {1}; // copy list initialization, x=1, y default constr./zero-init

//S s5(3, 5); // compiler error, constructor not found

S f() {

S s6 = {1, 2}; // verbose

return s6;

}

41/93

Structure Initialization - C++11 2/4

struct S {

unsigned x;

unsigned y;

void* ptr;

};

S s1{}; // direct list (or value) initialization

// x,y,ptr default constr./zero-initialization

S s2{1, 2}; // direct list (or value) initialization

// x=1, y=2, ptr default constr./zero-initialization

// S s3{1, -2}; // compile error, narrowing conversion

S f() { return {3, 2}; } // non-verbose

42/93

Structure Initialization - Brace or Equal Initialization 3/4

Non-Static Data Member Initialization (NSDMI), also called brace or equal

initialization:

struct S1 {

unsigned x = 3; // equal initialization

unsigned y = 2; // equal initialization

// auto z = 3; // auto is not allowed for non-static member variables

};

struct S2 {

unsigned x {3}; // brace initialization

};

//----------------------------------------------------------------------------------

S1 s1; // call the default constructor (x=3, y=2)

S1 s2{}; // call the default constructor (x=3, y=2)

S1 s3{1, 4}; // set x=1, y=4

S2 s4; // call the default constructor (x=3)

S2 s5{5}; // set x=5

43/93

Structure Initialization - Designated Initializer List 4/4

C++20 introduces the designated initializer list

struct A {

int x, y, z;

};

A a1{1, 2, 3}; // is the same as

A a2{.x = 1, .y = 2, .z = 3}; // designated initializer list

Designated initializer list can be very useful for improving code readability

void f1(bool a, bool b, bool c, bool d, bool e) {}

// long list of the same data type -> error-prone

struct B {

bool a, b, c, d, e;

};

void f2(B b) {}

f2({.a = true, .c = true}); // b, d, e = false

44/93

Structured Binding

A structured binding declaration (C++17) binds the specified names to the elements of the

initializer:

struct A {

int x = 1;

int y = 2;

} a;

A f() { return A{4, 5}; }

// Case (1): struct

auto [x1, y1] = a; // x1=1, y1=2

auto [x2, y2] = f(); // x2=4, y2=5

int b[2] = {1,2}; // Case (2): raw arrays

auto [x3, y3] = b; // x3=1, y3=2

auto [x4, y4] = std::tuple<float, int>{3.0f, 2}; // Case (3): tuples

// constexpr auto [x1, y1] = a; // constexpr structured binding is not allowed

// because it relies on references
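A typical use of the struct case above is returning several values from one function. The following sketch assumes a hypothetical `Division` type and the helper names `divide` and `recompose`, none of which appear in the slides.

```cpp
#include <cassert>

// Hypothetical result type with two public data members
struct Division {
    int quotient;
    int remainder;
};

// returns both results of an integer division
inline Division divide(int a, int b) {
    assert(b != 0);
    return {a / b, a % b};
}

// reassembles the dividend from the two bound names
inline int recompose(int a, int b) {
    auto [q, r] = divide(a, b); // q, r bind to the members in declaration order
    return q * b + r;
}
```

The names are bound by position, not by member name, so reordering the members of the struct silently changes what each name refers to.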

45/93

Dynamic Memory Initialization

Dynamic memory initialization follows the same rules as the object that is allocated

C++03:

int* a1 = new int; // undefined

int* a2 = new int(); // zero-initialization, call "= int()"

int* a3 = new int(4); // allocate a single value equal to 4

int* a4 = new int[4]; // allocate 4 elements with undefined values

int* a5 = new int[4](); // allocate 4 elements zero-initialized, call "= int()"

// int* a6 = new int[4](3); // not valid

C++11:

int* b1 = new int[4]{}; // allocate 4 elements zero-initialized, call "= int{}"

int* b2 = new int[4]{1, 2}; // first two elements set to 1, 2; the others zero-initialized

46/93

Initialization - Undeﬁned Behavior Example

⋆

lib/libc/stdlib/rand.c of the FreeBSD libc

struct timeval tv;

unsigned long junk; // not initialized, undefined value

/* XXX left uninitialized on purpose */

gettimeofday(&tv, NULL);

srandom((getpid() << 16) ^ tv.tv_sec ^ tv.tv_usec ^ junk);

// A compiler can assign any value not only to the variable,

// but also to expressions derived from the variable

// GCC assigns junk to a register. Clang further eliminates computation

// derived from junk completely, and generates code that does not use

// either gettimeofday or getpid

47/93

References

Reference 1/2

Reference

A variable reference T& is an alias, namely another name for an already existing

variable. Both the variable and its reference can be used to refer to the value of the

variable

• A pointer has its own memory address and size on the stack, while a reference shares the

same memory address (with the original variable)

• The compiler can internally implement references as pointers, but treats them in a

very different way

48/93

Reference 2/2

References are safer than pointers:

• References cannot have NULL value. You must always be able to assume that a

reference is connected to a legitimate storage

• References cannot be changed. Once a reference is initialized to an object, it

cannot be changed to refer to another object

(Pointers can be pointed to another object at any time)

• References must be initialized when they are created

(Pointers can be initialized at any time)

49/93

Reference - Examples

Reference syntax: T& var = ...

//int& a; // compile error, no initialization

//int& b = 3; // compile error, "3" is not a variable

int c = 2;

int& d = c; // reference. ok valid initialization

int& e = d; // ok. the reference of a reference is a reference

++d; // increment

++e; // increment

cout << c; // print 4

int a = 3;

int* b = &a; // pointer

int* c = &a; // pointer

++b; // change the value of the pointer 'b'

++*c; // change the value of 'a' (a = 4)

int& d = a; // reference

++d; // change the value of 'a' (a = 5)

50/93

Reference - Function Arguments 1/2

Reference vs. pointer arguments:

void f(int* value) {} // value may be a nullptr

void g(int& value) {} // value is never a nullptr

int a = 3;

f(&a); // ok

f(0); // dangerous but it works!! (but not with other numbers)

//f(a); // compile error, "a" is not a pointer

g(a); // ok

//g(3); // compile error "3" is not a reference of something

//g(&a); // compile error, "&a" is not a reference

51/93

Reference - Function Arguments 2/2

References can be used to indicate fixed size arrays:

void f(int (&array)[3]) { // accepts only arrays of size 3

cout << sizeof(array);

}

void g(int array[]) {

cout << sizeof(array); // any surprise?

}

int A[3], B[4];

int* C = A;

//------------------------------------------------------

f(A); // ok

// f(B); // compile error B has size 4

// f(C); // compile error C is a pointer

g(A); // ok

g(B); // ok

g(C); // ok
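The fixed-size restriction of f can be lifted with a template parameter while still avoiding pointer decay. This is a sketch with hypothetical names (`array_length`, `sum`); it relies only on the array-reference syntax shown above.

```cpp
#include <cassert>
#include <cstddef>

// N is deduced from the array type because the parameter is a reference to
// an array, so no decay to pointer happens
template<typename T, std::size_t N>
constexpr std::size_t array_length(const T (&)[N]) {
    return N;
}

// sums all elements of a fixed-size array of any length
template<std::size_t N>
int sum(const int (&array)[N]) {
    int total = 0;
    for (int value : array) // range-based for works because the size is known
        total += value;
    return total;
}
```

As with f(C) above, passing a plain pointer to either function fails to compile, because no array size can be deduced from it.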

52/93

Reference - Arrays

⋆

int A[4];

int (&B)[4] = A; // ok, reference to array

int C[10][3];

int (&D)[10][3] = C; // ok, reference to 2D array

auto c = new int[3][4]; // type is int (*)[4]

// read as "pointer to arrays of 4 int"

// int (&d)[3][4] = c; // compile error

// int (*e)[3] = c; // compile error

int (*f)[4] = c; // ok

int array[4];

// &array is a pointer to an array of size 4

int size1 = (&array)[1] - array;

int size2 = *(&array + 1) - array;

cout << size1; // print 4

cout << size2; // print 4

53/93

const and Constant

Expressions

Constants and Literals

A constant expression is an expression that can be evaluated at compile-time

A literal is a fixed value that can be assigned to a constant

Formally, “Literals are the tokens of a C++ program that represent constant values

embedded in the source code”

Literal types:

• Concrete values of the scalar types bool , char , int , float , double , e.g.

true , 'a' , 3 , 2.0f

• String literal of type const char[] , e.g. "literal"

• nullptr

• User-deﬁned literals, e.g. 2s

54/93

const Keyword

const keyword

The const keyword declares an object that never changes value after the

initialization. A const variable must be initialized when declared

A const variable is evaluated at compile-time if the right-hand expression is also

evaluated at compile-time

int size = 3; // 'size' is dynamic

int A[size] = {1, 2, 3}; // accepted by some compilers as an extension, but variable-size

// stack arrays are not standard C++ and are considered BAD programming

const int SIZE = 3;

// SIZE = 4; // compile error, SIZE is const

int B[SIZE] = {1, 2, 3}; // ok

const int size2 = size; // 'size2' is dynamic

55/93

const Keyword and Pointers 1/3

• int* can be implicitly converted to const int*

• const int* cannot be implicitly converted to int*

void read(const int* array) {} // the values of 'array' cannot be modified

void write(int* array) {}

int* ptr = new int;

const int* const_ptr = new int;

read(ptr); // ok

write(ptr); // ok

read(const_ptr); // ok

// write(const_ptr); // compile error

56/93

const Keyword and Pointers 2/3

• int* pointer to int

- The value of the pointer can be modified

- The elements referred by the pointer can be modified

• const int* pointer to const int. Read as (const int)*

- The value of the pointer can be modified

- The elements referred by the pointer cannot be modified

• int *const const pointer to int

- The value of the pointer cannot be modified

- The elements referred by the pointer can be modified

• const int *const const pointer to const int

- The value of the pointer cannot be modified

- The elements referred by the pointer cannot be modified

Note: const int* (West notation) is equal to int const* (East notation)

Tip: pointer types should be read from right to left

57/93

const Keyword and Pointers

⋆

3/3

Common error: adding const to a pointer is not the same as adding const to a

type alias of a pointer

using ptr_t = int*;

using const_ptr_t = const int*;

void f1(const int* ptr) { // read as '(const int)*'

// ptr[0] = 0; // not allowed: pointer to const objects

ptr = nullptr; // allowed

}

void f2(const_ptr_t ptr) {} // same as before

void f3(const ptr_t ptr) { // warning!! equal to 'int* const'

ptr[0] = 0; // allowed!!

// ptr = nullptr; // not allowed: const pointer to modifiable objects

}
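The difference between the two spellings can be checked by the compiler itself. The sketch below uses std::is_same from <type_traits>; the variable and function names are hypothetical.

```cpp
#include <cassert>
#include <type_traits>

using ptr_t = int*;

// 'const ptr_t' applies const to the pointer itself: int* const
constexpr bool alias_is_const_pointer =
    std::is_same<const ptr_t, int* const>::value;

// it is NOT a pointer to const int
constexpr bool alias_is_pointer_to_const =
    std::is_same<const ptr_t, const int*>::value;

// writes through a 'const ptr_t': allowed, the pointed object is modifiable
inline void set_first(const ptr_t ptr, int value) {
    assert(ptr != nullptr);
    ptr[0] = value;
}
```

The first constant is true and the second is false, which matches the behavior of f3.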

58/93

constexpr Keyword

constexpr (C++11)

The constexpr specifier declares that a variable or a function can be evaluated at

compile-time

• constexpr can improve performance and memory usage

• constexpr can potentially impact the compilation time

59/93

constexpr Variable

constexpr variables are always evaluated at compile-time

• const guarantees the value of a variable cannot change after the initialization

• constexpr implies const

const int v1 = 3; // compile-time evaluation

const int v2 = v1 * 2; // compile-time evaluation

int a = 3; // "a" is dynamic

const int v3 = a; // run-time evaluation!!

constexpr int c1 = v1; // ok

// constexpr int c2 = v3; // compile error, "v3" is a run-time variable

60/93

constexpr Function 1/4

constexpr Function

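constexpr enables compile-time evaluation of a function as long as all its

arguments are evaluated at compile-time. The evaluation is guaranteed only where a

constant expression is required, e.g. the initialization of a constexpr variable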

constexpr int square(int value) {

return value * value;

}

square(4); // compile-time evaluation, '4' is a literal

int a = 4; // "a" is dynamic

square(a); // run-time evaluation

• C++11: must contain exactly one return statement, and no loops or switch

• C++14: most restrictions removed, e.g. local variables, loops, and multiple statements are allowed
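A short sketch of a C++14-style constexpr function, with a loop and a local variable, used both at compile-time and at run-time. The names `factorial`, `fact5`, and `factorial_runtime` are illustrative and not part of the slides.

```cpp
#include <cassert>

// C++14-style constexpr function: local variable and loop are allowed
constexpr unsigned long long factorial(unsigned n) {
    unsigned long long result = 1;
    for (unsigned i = 2; i <= n; i++)
        result *= i;
    return result;
}

// a constexpr variable requires a constant expression,
// so this call must be evaluated at compile-time
constexpr unsigned long long fact5 = factorial(5);
static_assert(fact5 == 120, "evaluated by the compiler");

// the same function also accepts run-time values
inline unsigned long long factorial_runtime(unsigned n) {
    return factorial(n);
}
```

The static_assert is the proof of compile-time evaluation: it would not compile if factorial(5) were not a constant expression.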

61/93

constexpr Function - Constraints 2/4

A constexpr function is always evaluated at run-time if it:

• receives run-time arguments with a lifetime that begins with the expression, even

if the function doesn't depend on them

constexpr int f(int v) { return 3; }

constexpr int g(int& v) { return 3; }

int v = ...

f(v); // run-time evaluation

g(v); // compile-time evaluation, lifetime of 'v' began outside the expression

• contains run-time functions, namely non- constexpr functions

(detected with -Winvalid-constexpr )

• contains references to run-time global variables

62/93

constexpr Function - Limitations

⋆

3/4

• cannot contain run-time features such as exceptions and RTTI

• cannot contain assert() until C++14

• cannot be a virtual member function or a destructor ~T until C++20

• cannot contain try-catch blocks or asm statements until C++20

• cannot contain static variables or goto until C++23

• undefined behavior code is not allowed, e.g. reinterpret_cast , unsafe usage

of union , signed integer overflow, etc.

63/93

constexpr Function - Limitations

⋆

4/4

constexpr non-static member functions of run-time objects cannot be used at

compile-time if they access data members or call non-compile-time functions

Note: static constexpr member functions don't present this issue because they don't

depend on a specific instance

struct A {

int v = 3;

constexpr int f() const { return v; }

static constexpr int g() { return 3; }

};

A a1;

// constexpr int x = a1.f(); // compile error, f() is evaluated at run-time

constexpr int y = a1.g(); // ok, same as 'A::g()'

constexpr A a2;

constexpr int x = a2.f(); // ok

64/93

consteval Keyword

consteval (C++20)

consteval, or immediate function, guarantees compile-time evaluation.

A run-time value always produces a compile error

consteval int square(int value) {

return value * value;

}

square(4); // compile-time evaluation

int v = 4; // "v" is at run-time

// square(v); // compile error

65/93

constinit Keyword

constinit (C++20)

constinit guarantees compile-time initialization of a variable. A run-time

initialization value always produces a compile error

• The value of a variable can change during the execution

• const constinit does not imply constexpr , while the opposite is true

constexpr int square(int value) {

return value * value;

}

constinit int v1 = square(4); // compile-time evaluation

v1 = 3; // ok, v1 can change

int a = 4; // "a" is dynamic

// constinit int v2 = square(a); // compile error

66/93

if constexpr

if constexpr (C++17) allows conditionally compiling code based on a

compile-time predicate

The if constexpr statement forces the compiler to evaluate the branch at

compile-time (similarly to the #if preprocessor)

auto f() {

if constexpr (sizeof(void*) == 8)

return "hello"; // const char*

else

return 3; // int, never compiled

}

Note: the ternary (conditional) operator does not provide a constexpr variant

67/93

if constexpr Example

constexpr int fib(int n) {

return (n == 0 || n == 1) ? 1 : fib(n - 1) + fib(n - 2);

}

int main() {

if constexpr (sizeof(void*) == 8)

return fib(5);

else

return fib(3);

}

Generated assembly code (x64 OS):

main:

mov eax, 8

ret

Advanced example: C++17 Compile-time Quick-Sort

68/93

if constexpr Pitfalls

if constexpr only works with explicit if/else statements

auto f1() {

if constexpr (my_constexpr_fun() == 1)

return 1;

// return 2.0; // compile error: not part of the if constexpr, conflicting return type

}

else if branch requires constexpr

auto f2() {

if constexpr (my_constexpr_fun() == 1)

return 1;

else if (my_constexpr_fun() == 2) // -> else if constexpr

// return 2.0; // compile error, this is not part of constexpr

else

return 3L;

}
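A sketch of a chain in which every branch is written with if constexpr, so each instantiation keeps exactly one return statement and the return type can differ per type. The function name `classify` is hypothetical; the example assumes C++17 and the standard type traits.

```cpp
#include <cassert>
#include <type_traits>

// Every branch of the chain uses 'if constexpr', so only the branch selected
// for T is instantiated and contributes to the deduced return type
template<typename T>
auto classify(T) {
    if constexpr (std::is_integral_v<T>)
        return 1; // int
    else if constexpr (std::is_floating_point_v<T>)
        return 2.0; // double
    else
        return 3L; // long
}
```

Replacing the second condition with a plain else if would make the last two branches part of the same instantiation, and return type deduction would fail as in f2.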

69/93

std::is_constant_evaluated()

C++20 provides the std::is_constant_evaluated() utility to check whether the

current function is evaluated at compile time

#include <type_traits> // std::is_constant_evaluated

constexpr int f(int n) {

if (std::is_constant_evaluated())

return 0;

return 4;

}

constexpr int r = f(3); // return 0 (constant evaluation is required here)

int v = 3;

f(v); // return = 4

70/93

if consteval 1/2

std::is_constant_evaluated() has two problems that if consteval (C++23)

solves:

(1) A consteval function cannot be called within a constexpr function if it

is called with a run-time parameter

consteval int g(int n) { return n * 3; }

constexpr int f(int n) {

if (std::is_constant_evaluated()) // it works with if consteval

return g(n);

return 4;

}

// f(3); compiler error

71/93

if consteval 2/2

(2) if constexpr (std::is_constant_evaluated()) is a bug because it is

always evaluated to true

constexpr int f(int x) {

if constexpr (std::is_constant_evaluated()) // if consteval avoids this error

return 3;

return 4;

}

constexpr int g(int x) {

if consteval {

return 3;

}

return 4;

}

72/93

volatile Keyword

⋆

volatile Keyword

volatile

volatile is a hint to the compiler to avoid aggressive memory optimizations

involving a pointer or an object

Use cases:

• Low-level programming: driver development, interaction with assembly, etc.

(force writing to a speciﬁc memory location)

• Multi-thread program: variables shared between threads/processes to

communicate (don’t optimize, delay variable update)

• Benchmarking: some operations need to not be optimized away

Note: volatile reads/writes can still be reordered with respect to non-volatile ones

73/93

volatile Keyword - Example

The following code compiled with -O3 (full optimization) and without volatile

could work ﬁne

volatile int* ptr = new int[1]; // actual allocation size is much

int pos = 128 * 1024 / sizeof(int); // larger, typically 128 KB

ptr[pos] = 4; // segfault

74/93

volatile Deprecation

C++20 deprecates volatile outside single load and store operations

volatile int v = 3;

auto v1 = v + 4; // ok, one load

v = 4; // ok, one store

v += 4; // deprecated, load + store

volatile int f() {} // deprecated, volatile return value

void g1(volatile int) {} // deprecated, volatile argument

void g2(volatile int*) {} // ok

struct A {

volatile int x = 4; // deprecated, volatile data member

};

75/93

Explicit Type

Conversion

static_cast 1/3

static_cast converts between types and performs compile-time (not run-time) type

checking

For value semantics, it is equivalent to the old-style casts (T) var or T(var)

int a = 6;

short b1 = (short) a; // the compiler can issue a warning without

short b2 = short(a); // explicit cast

short b3 = static_cast<short>(a);

long c = a; // not needed

76/93

static_cast 2/3

static_cast prevents accidental/unsafe conversions between pointer types,

especially across classes in a hierarchy

char* a = new char[4]{1, 2, 3, 4};

int* b = (int*) a; // ok

cout << b[0]; // print 67305985, not 1!!

//int* c = static_cast<int*>(a); // compile error, unsafe conversion

static_cast also prevents accidental/unsafe const conversions

const char* a = new char;

char* b = (char*) a; // ok

//char* c = static_cast<char*>(a); // compile error, unsafe conversion

77/93

static_cast 3/3

static_cast prevents accidental/unsafe conversions between unrelated classes

struct A {};

struct B : A {};

struct C {};

A a;

B b;

auto x1 = (A&) b; // ok

auto x2 = (C&) a; // ok

auto x3 = (C*) &a; // ok

auto x4 = static_cast<A&>(b); // ok

//auto x5 = static_cast<C&>(a); // compile error, unsafe conversion

//auto x6 = static_cast<C*>(&a); // compile error, unsafe conversion

Note: (T&) v is equal to *((T*) &v)

78/93

const_cast

const_cast can add or cast away (remove) constness or volatility

const int* ptr = new int[4];

auto x1 = (int*) ptr ; // ok

auto x2 = (char*) ptr; // ok

auto x3 = const_cast<int*>(ptr); // ok

//auto x4 = const_cast<char*>(ptr); // compile error, unsafe conversion

const int a = 5;

const_cast<int&>(a) = 3; // ok, but undefined behavior

int b = 5;

const_cast<volatile int&>(b) = 3; // ok

79/93

reinterpret_cast

reinterpret_cast allows a subset of unsafe conversion:

• between pointers/references of different type with same constness

• between pointers and integer types

float b = 3.0f; // bits: 01000000010000000000000000000000

int c = reinterpret_cast<int&>(b); // bits: 01000000010000000000000000000000

const int* ptr = new int;

//reinterpret_cast<int*>(ptr); // compile error

uintptr_t my_int = reinterpret_cast<uintptr_t>(ptr); // ok

// ARRAY RESHAPING

int a[3][4];

int (&b)[2][6] = reinterpret_cast<int (&)[2][6]>(a);

int (*c)[6] = reinterpret_cast<int (*)[6]>(a);

80/93

Type Punning 1/2

Pointer Aliasing

One pointer aliases another when they both point to the same memory location

Type Punning

Type punning refers to circumventing the type system of a programming language to

achieve an eﬀect that would be diﬃcult or impossible to achieve within the bounds of

the formal language

The compiler assumes that the strict aliasing rule is never violated: Accessing a value

using a type which is diﬀerent from the original one is not allowed and it is classiﬁed as

undeﬁned behavior

81/93

Type Punning 2/2

// slow without optimizations. The branch breaks the CPU instruction pipeline

float abs(float x) {

return (x < 0.0f) ? -x : x;

}

// optimized with bitwise operation

float abs(float x) {

unsigned uvalue = reinterpret_cast<unsigned&>(x);

unsigned tmp = uvalue & 0x7FFFFFFF; // clear the sign bit (most significant bit)

return reinterpret_cast<float&>(tmp);

}

// this is undefined behavior!!

GCC warning (not clang): -Wstrict-aliasing

• blog.qt.io/blog/2011/06/10/type-punning-and-strict-aliasing

• What is the Strict Aliasing Rule and Why do we care?

• Type Punning In C++17

82/93

std::bit_cast

The right way to avoid undeﬁned behavior is by using memcpy

#include <cstring> // std::memcpy

float v1 = 32.3f;

unsigned v2;

std::memcpy(&v2, &v1, sizeof(float));

Problems: memcpy is unsafe if the variables do not have the same size or are not trivially

copyable. Also, it doesn't work at compile-time ( constexpr )

C++20 std::bit_cast provides a safe alternative to reinterpret_cast and

memcpy that also works at compile-time

#include <bit> // std::bit_cast

constexpr float v1 = 32.3f;

constexpr unsigned v2 = std::bit_cast<unsigned>(v1);
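Before C++20, the memcpy approach can be wrapped in small helpers that check the size assumption at compile-time. The sketch below rewrites the bitwise absolute value without violating strict aliasing; it assumes a 32-bit IEEE 754 float, and the helper names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring> // std::memcpy

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// copies the object representation of a float into an integer
inline std::uint32_t float_to_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));
    return bits;
}

// inverse operation
inline float bits_to_float(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof(value));
    return value;
}

// branch-free absolute value: clears the sign bit (assumes IEEE 754 binary32)
inline float abs_bitwise(float x) {
    return bits_to_float(float_to_bits(x) & 0x7FFFFFFFu);
}
```

Compilers usually turn these fixed-size memcpy calls into plain register moves, so the defined-behavior version is not expected to cost more than the reinterpret_cast one.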

83/93

Uniform Initialization Conversion

A narrowing conversion occurs when the destination type may not be able to

represent all the values of the source type

Brace initialization

{} C++11 disallows narrowing conversions

// RUN-TIME VALUES

int a = 3;

long long x1{a}; // ok

//unsigned x2{a}; // compile error, 'a' could be negative

//float x3{a}; // compile error, 'a' could not be representable with float

double b = 3;

//long long x4{b}; // compile error, 'b' could be a number with decimals

//float x5{b}; // compile error, 'b' could not be representable with float

gcc issues a warning instead of a compile error for run-time narrowing conversions

84/93

Uniform Initialization Conversion

// COMPILE-TIME VALUES

constexpr int c = 3;

unsigned x6{c}; // ok

constexpr int d = -1;

//unsigned x7{d}; // compile error, 'd' is negative

constexpr float e = 4;

//int x8{e}; // compile error, 'float' cannot be narrowed to 'int'

constexpr double f = std::numbers::pi_v<double>; // π, C++20 <numbers>

float x9{f}; // ok

constexpr double g = 1e+40;

//float x10{g}; // compile error, too large for 'float'

85/93

gsl::narrow_cast

⋆

The Guidelines Support Library (GSL) contains functions and types that are

suggested for use by the C++ Core Guidelines maintained by the Standard C++

Foundation

GSL offers the narrow_cast operation for specifying that narrowing is acceptable, and

narrow (“narrow if”) that throws an exception if a narrowing would throw away legal

values

#include <gsl/gsl>

double a = 1.1;

int x1 = gsl::narrow_cast<int>(a); // ok, explicit narrowing: x1 becomes 1

int x2 = gsl::narrow<int>(a); // compiles, but throws 'narrowing_error' at run-time
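To make the idea behind a checked narrowing concrete without depending on the GSL, here is a simplified sketch. `checked_narrow` is a hypothetical helper, not the GSL implementation: it converts, converts back, and throws if the value or its sign changed. It assumes that floating-point inputs are within the range of the destination type, since an out-of-range floating-point to integer conversion is undefined behavior.

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical helper, NOT the GSL implementation: converts 'value' to To and
// throws if converting back does not give the original value
template<typename To, typename From>
To checked_narrow(From value) {
    To result = static_cast<To>(value);
    if (static_cast<From>(result) != value)
        throw std::runtime_error("narrowing changed the value");
    // a sign change (e.g. -1 -> unsigned) also counts as a loss
    if ((result < To{}) != (value < From{}))
        throw std::runtime_error("narrowing changed the sign");
    return result;
}
```

With this helper, converting 3.0 to int succeeds, while converting 1.1 to int or -1 to unsigned throws.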

86/93

sizeof Operator

sizeof operator

sizeof

sizeof is a compile-time operator that determines the size, in bytes, of a

variable or data type

• sizeof returns a value of type size_t

• sizeof(anything) never returns 0 (*except for arrays of size 0)

• sizeof(char) always returns 1

• When applied to structures, it also takes into account the internal padding

• When applied to a reference, the result is the size of the referenced type

• sizeof(incomplete type) produces compile error, e.g. void

• sizeof(bitfield member) produces compile error

* gcc allows array of size 0 (not allowed by the C++ standard)

87/93

sizeof - Pointer 1/5

sizeof(int); // 4 bytes

sizeof(int*) // 8 bytes on a 64-bit OS

sizeof(void*) // 8 bytes on a 64-bit OS

sizeof(size_t) // 8 bytes on a 64-bit OS

void f(int array[]) { // dangerous!!

cout << sizeof(array);

}

int array1[10];

int* array2 = new int[10];

cout << sizeof(array1); // sizeof(int) * 10 = 40 bytes

cout << sizeof(array2); // sizeof(int*) = 8 bytes

f(array1); // 8 bytes (64-bit OS)

88/93

sizeof - struct 2/5

struct A {

int x; // 4-byte alignment

char y; // offset 4

};

sizeof(A); // 8 bytes: 4 + 1 (+ 3 padding), must be aligned to its largest member

struct B {

int x; // offset 0 -> 4-byte alignment

char y; // offset 4 -> 1-byte alignment

short z; // offset 6 -> 2-byte alignment

};

sizeof(B); // 8 bytes : 4 + 1 (+ 1 padding) + 2

struct C {

short z; // offset 0 -> 2-byte alignment

int x; // offset 4 -> 4-byte alignment

char y; // offset 8 -> 1-byte alignment

};

sizeof(C); // 12 bytes : 2 (+ 2 padding) + 4 + 1 (+ 3 padding)
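The padding cost of member ordering can be inspected with sizeof, alignof, and offsetof. The sketch below compares the layout of C with a reordered variant; the type names and the `padding_bytes` helper are hypothetical. The concrete sizes depend on the platform ABI (typically 12 and 8 bytes on x86-64), so the code only states relations that do not depend on exact values.

```cpp
#include <cassert>
#include <cstddef> // offsetof, std::size_t

// same members as 'C' above, in the original order
struct Padded {
    short z;
    int   x;
    char  y;
};

// same members sorted by decreasing alignment
struct Reordered {
    int   x;
    short z;
    char  y;
};

// the size of a type is always a multiple of its alignment
static_assert(sizeof(Padded) % alignof(Padded) == 0, "");
static_assert(sizeof(Reordered) % alignof(Reordered) == 0, "");

// number of padding bytes in a type holding one short, one int, one char
template<typename T>
constexpr std::size_t padding_bytes() {
    return sizeof(T) - (sizeof(short) + sizeof(int) + sizeof(char));
}
```

Sorting members by decreasing alignment is a simple heuristic that usually minimizes padding on common ABIs.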

89/93

sizeof - Reference and Array 3/5

char a;

char& b = a;

sizeof(&a); // 8 bytes in a 64-bit OS (pointer)

sizeof(b); // 1 byte, equal to sizeof(char)

// NOTE: a reference is not a pointer

struct S1 {

void* p;

};

sizeof(S1); // 8 bytes

struct S2 {

char& c;

};

sizeof(S2); // 8 bytes, same as sizeof(void*)

sizeof(S2::c); // 1 byte

90/93

sizeof - Special Cases 4/5

struct A {};

sizeof(A); // 1 : sizeof never returns 0

A array1[10];

sizeof(array1); // 10 : array of empty structures

int array2[0]; // C++ doesn't allow array of size 0, as opposed to C

// only gcc, compiler error for other compilers

sizeof(array2); // 0 : special case

91/93

[[no_unique_address]] 5/5

C++20 [[no_unique_address]] allows a structure member to be overlapped with

other data members of a diﬀerent type

struct Empty {}; // empty class, sizeof(Empty) == 1

struct A { // sizeof(A) == 8 (4 + 1 + 3 for padding)

int i;

Empty e;

};

struct B { // sizeof(B) == 4, 'e' overlaps with 'i'

int i;

[[no_unique_address]] Empty e;

};

Notes: [[no_unique_address]] is ignored by MSVC even in C++20 mode;

instead, [[msvc::no_unique_address]] is provided

92/93

sizeof and Size of a Byte

Interesting: C++ does not explicitly deﬁne the size of a byte (see Exotic

architectures the standards committees care about)

93/93

Modern C++

Programming

7. Basic Concepts V

Functions and Preprocessing

Federico Busato

2025-04-14

Table of Contents

1 Functions

Pass by-Value

Pass by-Pointer

Pass by-Reference

Function Signature and Overloading

Overloading and =delete

Default Parameters

2 Function Pointers and Function Objects

Function Pointer

Function Object (or Functor)

3 Lambda Expressions

Capture List

Lambda Expression and Function Relation

Parameter Notes

Composability

Recursion

constexpr/consteval

template

mutable

Capture List and Classes

4 Preprocessing

Preprocessors

Common Errors

Source Location Macros

Condition Compiling Macros

Stringizing Operator #

#error and #warning

#pragma

Token-Pasting Operator ## ⋆

Variadic Macro ⋆

Functions

Overview

A function (procedure or routine) is a piece of code that performs a specific

task

Purpose

• Avoiding code duplication: less code for the same functionality → fewer

bugs

• Readability: better express what the code does

• Organization: break the code into separate modules

5/62

Function Parameter and Argument

Function Parameter [formal]

A parameter is the variable which is part of the method signature

Function Argument [actual]

An argument is the actual value (instance) of the variable that gets passed to the

function

void f(int a, const char* b); // parameters: int a, const char* b

// return type: void

f(3, "abc"); // arguments: 3, "abc"

6/62

Pass by-Value

Call-by-value

The object is copied and assigned to input arguments of the method f(T x)

Advantages:

• Changes made to the parameter inside the function have no eﬀect on the argument

Disadvantages:

• Performance penalty if the copied arguments are large (e.g. a structure with several data

members)

When to use:

• Built-in data type and small objects (≤ 8 bytes)

When not to use:

• Fixed size arrays which decay into pointers

• Large objects

7/62

Pass by-Pointer

Call-by-pointer

The address of a variable is copied and assigned to input arguments of the method

f(T* x)

Advantages:

• Allows a function to change the value of the argument

• The argument is not copied (fast)

Disadvantages:

• The argument may be a null pointer

• Dereferencing a pointer is slower than accessing a value directly

When to use:

• Raw arrays (use const T* if read-only)

When not to use:

• All other cases

8/62

Pass by-Reference

Call-by-reference

The reference of a variable is copied and assigned to input arguments of the method

f(T& x)

Advantages:

• Allows a function to change the value of the argument (better readability compared with

pointers)

• The argument is not copied (fast)

• References must be initialized (no null pointer)

• Avoids implicit conversions (a non-const T& does not bind to a converted temporary, const T& does)

When to use:

• All cases except raw pointers

When not to use:

• Pass by-value could give performance advantages and improve the readability with built-in

data type and small objects that are trivially copyable

9/62

Examples

struct MyStruct;

void f1(int a); // pass by-value

void f2(int& a); // pass by-reference

void f3(const int& a); // pass by-const reference

void f4(MyStruct& a); // pass by-reference

void f5(int* a); // pass by-pointer

void f6(const int* a); // pass by-const pointer

void f7(MyStruct* a); // pass by-pointer

void f8(int*& a); // pass a pointer by-reference

//--------------------------------------------------------------

char c = 'a';

f1(c); // ok, pass by-value (implicit conversion)

// f2(c); // compile error different types

f3(c); // ok, pass by-const reference (bound to a temporary int, implicit conversion)
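The effect of the three passing modes can be observed with a minimal sketch. The helper names (by_value, by_pointer, by_reference, by_const_ref) are hypothetical and only serve this illustration.

```cpp
#include <cassert>

// Hypothetical helpers, named only for this illustration
void by_value(int x)            { x = 10; }                     // modifies a local copy only
void by_pointer(int* x)         { if (x != nullptr) *x = 20; }  // must guard against null
void by_reference(int& x)       { x = 30; }                     // modifies the caller's variable
int  by_const_ref(const int& x) { return x + 1; }               // read-only access

void demo() {
    int v = 1;
    by_value(v);
    assert(v == 1);       // the argument is unchanged
    by_pointer(&v);
    assert(v == 20);
    by_pointer(nullptr);  // accepted by the type system, hence the check above
    by_reference(v);
    assert(v == 30);
    assert(by_const_ref('a') == 'a' + 1); // a temporary int is bound to the const reference
}
```

Only the pointer version needs a null check, which is one reason the slides suggest references for everything except raw arrays.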

10/62

Function Signature and Overloading 1/2

Signature

Function signature deﬁnes the input types for a (specialized) function and the

inputs + outputs types for a template function

A function signature includes the number of arguments, the types of arguments, and

the order of the arguments

• The C++ standard prohibits a function declaration that only diﬀers in the return

type

• Function declarations with diﬀerent signatures can have distinct return types

Overloading

Function overloading allows having distinct functions with the same name but with

diﬀerent signatures

11/62

Function Signature and Overloading 2/2

void f(int a, char* b); // signature: (int, char*)

// char f(int a, char* b); // compile error: same signature

// but different return type

void f(const int a, char* b); // same signature, ok

// const int == int

void f(int a, const char* b); // overloading with signature: (int, const char*)

int f(float); // overloading with signature: (float)

// the return type is different

GCC 14 adds the ﬂag -fdiagnostics-all-candidates to show all function candidates when

overload resolution failure occurs

New C++ features in GCC 14

12/62

Overloading Resolution Rules

• An exact match

• A promotion (e.g. char to int)

• A standard type conversion (e.g. between float and int)

• A constructor or user-deﬁned type conversion ⇝

void f(int a);

void f(float b); // overload

void f(float b, char c); // overload

//--------------------------------------------------------------

f(0); // exact match

f('a'); // promotion from char to int

// f(3LL); // compile error ambiguous match

f(2.3f); // exact match

// f(2.3); // compile error ambiguous match

f(2.3, 'a'); // standard type conversion, ambiguity is not possible here
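The resolution rules can be checked at run time with a small sketch in which every overload reports which one was selected. The function name which is hypothetical; the overload set mirrors the one on this slide.

```cpp
#include <cassert>
#include <string>

// Hypothetical overload set: each overload reports which one was selected
std::string which(int)         { return "int"; }
std::string which(float)       { return "float"; }
std::string which(float, char) { return "float, char"; }

void demo() {
    assert(which(0)    == "int");              // exact match
    assert(which('a')  == "int");              // promotion char -> int beats conversion char -> float
    assert(which(2.3f) == "float");            // exact match
    assert(which(2.3, 'a') == "float, char");  // conversion double -> float, single viable candidate
    // which(2.3);  // would not compile: double -> int and double -> float rank equally
}
```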

13/62

Overloading and =delete

=delete can be used to prevent calling the wrong overload

void g(int) {}

void g(double) = delete;

g(3); // ok

g(3.0); // compile error

# include <cstddef> // std::nullptr_t

void f(int*) {}

void f(std::nullptr_t) = delete;

f(nullptr); // compile error

14/62

Function Default Parameters

Default/Optional parameter

A default parameter is a function parameter that has a default value

• If the user does not supply a value for this parameter, the default value will be used

• All default parameters must be the rightmost parameters

• Default parameters must be declared only once

• Default parameters can improve compile time and avoid redundant code because they

avoid deﬁning other overloaded functions

void f(int a, int b = 20); // declaration

//void f(int a, int b = 10) { ... } // compile error, already set in the declaration

void f(int a, int b) { ... } // definition, default value of "b" is already set

f(5); // b is 20

15/62

Function Pointers

and Function

Objects

Function Pointer - Function as Argument 1/2

Standard C achieves generic programming capabilities and composability through the

concept of function pointer

A function can be passed as a pointer to another function and behaves as an “indirect

call”

#include <stdlib.h> // qsort

int descending(const void* a, const void* b) {

int x = *((const int*) a), y = *((const int*) b);

return (y > x) - (y < x); // negative if "a" must come before "b"

}

int array[] = {7, 2, 5, 1};

qsort(array, 4, sizeof(int), descending);

// array: { 7, 5, 2, 1 }

16/62

Function Pointer - Function as Argument 2/2

int eval(int a, int b, int (*f)(int, int)) {

return f(a, b);

}

// type: int (*)(int, int)

int add(int a, int b) { return a + b; }

int sub(int a, int b) { return a - b; }

cout << eval(4, 3, add); // print 7

cout << eval(4, 3, sub); // print 1

Problems:

Safety There is no check of the argument type in the generic case (e.g. qsort )

Performance Any operation requires an indirect call to the original function. Function

inlining is often not possible

17/62

Function Object (or Functor) 1/2

Function Object

A function object, or functor, is a callable object that can be treated as a

parameter

C++ provides a more eﬃcient and convenient way to pass “procedure” to other

functions called function object

#include <algorithm> // for std::sort

struct Descending { // <-- function object

bool operator()(int a, int b) const { // function call operator

return a > b;

}

};

int array[] = {7, 2, 5, 1};

std::sort(array, array + 4, Descending{});

// array: { 7, 5, 2, 1 }

18/62

Function Object (or Functor) 2/2

Advantages:

Safety Argument type checking is always possible. It could involve templates

Performance The compiler injects operator() in the code of the destination function

and then compiles the routine. Operator inlining is the standard behavior

C++11 simpliﬁes the concept by providing less verbose function objects called

lambda expressions
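Unlike a plain function pointer, a function object can also carry state. The following is a minimal sketch, assuming a hypothetical GreaterThan functor and a helper count_greater that are not part of the slides.

```cpp
#include <algorithm> // std::count_if
#include <cassert>

// Hypothetical function object carrying state (the threshold)
struct GreaterThan {
    int limit;
    bool operator()(int value) const { return value > limit; }
};

// Counts the elements of data[0..size) that are greater than limit
int count_greater(const int* data, int size, int limit) {
    return static_cast<int>(std::count_if(data, data + size, GreaterThan{limit}));
}

void demo() {
    int array[] = {7, 2, 5, 1};
    assert(count_greater(array, 4, 4) == 2); // 7 and 5
    assert(count_greater(array, 4, 7) == 0);
}
```

The argument type of operator() is checked at compile time, and the call can be inlined into std::count_if.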

19/62

Lambda Expressions

Lambda Expression

A C++11 lambda expression is an inline local-scope function object

auto x = [capture clause] (parameters) { body }

• The [capture clause] marks the declaration of the lambda and how the local

scope arguments are captured (by-value, by-reference, etc.)

• The parameters of the lambda are normal function parameters (an empty parameter list can be omitted*)

• The body of the lambda is a normal function body

The expression to the right of = is the lambda expression, and the runtime object

x created by that expression is the closure

* always allowed when no specifier follows; C++23 also allows it before specifiers such as mutable

20/62

Lambda Expression Examples

# include <algorithm> // for std::sort

int array[] = {7, 2, 5, 1};

auto lambda = [](int a, int b){ return a > b; }; // named lambda

std::sort(array, array + 4, lambda);

// array: { 7, 5, 2, 1 }

// alternatively, in one line of code (unnamed lambda):

std::sort(array, array + 4, [](int a, int b){ return a > b; });

// array: { 7, 5, 2, 1 }

auto lambda2 = []{ return 3; }; // no parameter list (valid since C++11)

21/62

Capture List

Lambda expressions capture external variables used in the body of the lambda in two

ways:

• Capture by-value

• Capture by-reference (can modify external variable values)

Capture list can be passed as follows

• [] no capture

• [=] captures all variables by-value

• [&] captures all variables by-reference

• [var1] captures only var1 by-value

• [&var2] captures only var2 by-reference

• [var1, &var2] captures var1 by-value and var2 by-reference
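The difference between the two capture modes becomes visible when the captured variable changes after the lambda has been created. A minimal sketch (the variable and lambda names are only illustrative):

```cpp
#include <cassert>

void demo() {
    int limit = 3;
    auto by_value     = [limit](int v)  { return v > limit; }; // copies the current value (3)
    auto by_reference = [&limit](int v) { return v > limit; }; // refers to the variable itself

    limit = 10; // changed after the lambdas were created

    assert(by_value(5) == true);      // still compares against 3
    assert(by_reference(5) == false); // compares against 10
}
```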

22/62

Capture List Examples

// GOAL: find the first element greater than "limit"

# include <algorithm> // for std::find_if

int limit = ...

auto lambda1 = [=](int value) { return value > limit; }; // by-value

auto lambda2 = [&](int value) { return value > limit; }; // by-reference

auto lambda3 = [limit](int value) { return value > limit; }; // "limit" by-value

auto lambda4 = [&limit](int value) { return value > limit; }; // "limit" by-reference

// auto lambda5 = [](int value) { return value > limit; }; // no capture

// compile error

int array[] = {7, 2, 5, 1};

std::find_if(array, array + 4, lambda1);

23/62

Capture List - Other Cases

• [=, &var1] captures all variables used in the body of the lambda by-value,

except var1 that is captured by-reference

• [&, var1] captures all variables used in the body of the lambda by-reference,

except var1 that is captured by-value

• A lambda expression can read a variable without capturing it if the variable is

constexpr

constexpr int limit = 5;

int var1 = 3, var2 = 4;

auto lambda1 = [](int value){ return value > limit; };

auto lambda2 = [=, &var2]() { return var1 > var2; };

24/62

Lambda Expression and Function Relation

A lambda expression can be converted to a function (stateless) if its capture list is

empty

// lambda_func is equivalent to

// int lambda_func(int first, int second){ return first + second; };

void f(int (lambda_func)(int, int)) {

cout << lambda_func(2, 3);

}

auto lambda = [](int first, int second){ return first + second; };

f(lambda); // print 5

25/62

Parameter Notes

C++14 Lambda expression parameters can be automatically deduced

auto x = [](auto value) { return value + 4; };

C++14 Lambda expression parameters can be initialized

auto x = [](int i = 6) { return i + 4; };

26/62

Composability 1/2

Lambda expressions can be composed

auto lambda1 = [](int value){ return value + 4; };

auto lambda2 = [](int value){ return value * 2; };

auto lambda3 = [&](int value){ return lambda2(lambda1(value)); };

// returns (value + 4) * 2

A function can return a lambda

(dynamic dispatch is also possible if the capture list is empty)

auto f() {

return [](int value){ return value + 4; };

}

auto lambda = f();

cout << lambda(2); // print "6"

27/62

Composability 2/2

A lambda expression can contain another lambda expression

auto lambda1 = [](auto value) {

int x = 5;

auto lambda2 = [=](auto v) { return x * value + v; };

return lambda2(3);

};

cout << lambda1(2); // print "13"

28/62

Recursion

⋆

Lambda expressions can be called recursively

auto factorial = [](int n, auto fac) {

return (n <= 1) ? 1 : n * fac(n - 1, fac);

};

factorial(5, factorial);

C++23 allows a lambda to refer to its own closure object by declaring an explicit object

parameter, this auto, as first parameter

auto factorial = [](this auto self, int n) -> int { // or 'this auto&&'

return (n <= 1) ? 1 : n * self(n - 1);

};

factorial(5);

29/62

constexpr/consteval Lambda Expression

C++17 Lambda expressions support constexpr

C++20 Lambda expressions support consteval

// constexpr lambda

auto factorial = [](int value) constexpr {

int ret = 1;

for (int i = 2; i <= value; i++)

ret *= i;

return ret;

};

auto mul = [](int v) consteval { return v * 2; };

constexpr int v1 = factorial(4) + mul(5); // '24' + '10'
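A constexpr lambda can be used both where a constant expression is required and with run-time values. The sketch below assumes a C++17 compiler; the names square and array are hypothetical.

```cpp
#include <cassert>

// C++17: the call operator can be evaluated at compile time
constexpr auto square = [](int v) constexpr { return v * v; };

static_assert(square(4) == 16, "evaluated by the compiler");

int array[square(3)]; // usable where a constant expression is required

void demo() {
    int n = 5;                                      // run-time value
    assert(square(n) == 25);                        // the same lambda also works at run time
    assert(sizeof(array) / sizeof(array[0]) == 9);
}
```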

30/62

template Lambda Expression ⇝ 1/2

C++20 Lambda expression supports template and requires clause

auto lambda = []<typename T>(T value)

requires std::is_arithmetic_v<T> {

return value * 2;

};

auto v = lambda(3.4); // v: 6.8 (double)

// lambda(nullptr); // compiler error

31/62

template Lambda Expression ⇝ 2/2

Before C++20, template arguments could be emulated with auto + decltype

auto lambda = [](auto value) {

using T = decltype(value); // T: double

};

lambda(3.4);

Lambda and template without automatic deduction needs the full syntax

auto lambda = []<typename T>(int value) {

return value * sizeof(T);

};

// lambda<double>(3); // compiler error

lambda.operator()<double>(3); // ok

32/62

mutable Lambda Expression

Lambda capture is by-const-value

The mutable specifier allows the lambda to modify the variables captured by-value (its own copies)

int var = 1;

auto lambda1 = [&](){ var = 4; }; // ok

lambda1();

cout << var; // print '4'

// auto lambda2 = [=](){ var = 3; }; // compile error

// lambda operator() is const

auto lambda3 = [=]() mutable { var = 3; }; // ok

lambda3();

cout << var; // print '4', lambda3 captures by-value
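A typical use of mutable is a closure that keeps its own state between calls. A minimal sketch (the names start and next are only illustrative):

```cpp
#include <cassert>

void demo() {
    int start = 10;
    // 'mutable' makes the captured copy modifiable; the copy lives inside the closure
    auto next = [start]() mutable { return ++start; };

    assert(next() == 11);
    assert(next() == 12); // the closure keeps its own state between calls
    assert(start == 10);  // the original variable is untouched
}
```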

33/62

Capture List and Classes ⇝

• [this] captures the current object (*this) by-reference (implicit in C++17)

• [x = x] captures the current object member x by-value C++14

• [&x = x] captures the current object member x by-reference C++14

• [=] default capture of this pointer by value has been deprecated C++20

class A {

int data = 1;

void f() {

int var = 2; // <-- local variable

auto lambda1 = [=]() { return var; }; // copy by-value, return 2

auto lambda2 = [=]() { int var = 3; return var; }; // return 3 (nearest scope)

auto lambda3 = [this]() { return data; }; // capture by-reference, return 1

auto lambda4 = [*this]() { return data; }; // copy by-value (C++17), return 1

// auto lambda5 = [data]() { return data; }; // compile error 'data' is not visible

auto lambda6 = [data = data]() { return data; }; // return 1

}

};
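The practical difference between [this] and [*this] shows up when the object changes after the lambda has been created. The sketch below assumes C++17 and uses a hypothetical class Counter.

```cpp
#include <cassert>

struct Counter { // hypothetical class for this illustration
    int data = 1;

    auto by_reference() { return [this]()  { return data; }; } // stores the pointer
    auto by_copy()      { return [*this]() { return data; }; } // stores a copy of the object (C++17)
};

void demo() {
    Counter c;
    auto ref  = c.by_reference();
    auto copy = c.by_copy();
    c.data = 5;
    assert(ref()  == 5); // sees the update through 'this'
    assert(copy() == 1); // snapshot taken when the lambda was created
}
```

With [this], the lambda must not outlive the object; with [*this], it owns an independent copy.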

34/62

Preprocessing

Preprocessing and Macro

A preprocessor directive is any line preceded by a hash symbol (#) which tells the

compiler how to interpret the source code before

compiling it

Macros are preprocessor directives which substitute any occurrence of an identifier in

the rest of the code with a replacement

Macros are evil:

Do not use macro expansion!!

...or use as little as possible

• Macros cannot be directly debugged

• Macro expansions can have unexpected side effects

• Macros have no namespace or scope

35/62

Preprocessors

All statements starting with #

• #include "my_file.h"

Inject the code in the current ﬁle

• #define MACRO <expression>

Define a new macro

• #undef MACRO

Undefine a macro

(a macro should be undefined as early as possible for safety reasons)

Multi-line Preprocessing:

\ at the end of the line

Indent: # define

36/62

Conditional Compiling

• #if <condition>

code

#elif <condition>

code

#else

code

#endif

• #if defined(MACRO) equal to #ifdef MACRO

#elif defined(MACRO) equal to #elifdef MACRO C++23

Check if a macro is deﬁned

• #if !defined(MACRO) equal to #ifndef MACRO

#elif !defined(MACRO) equal to #elifndef MACRO C++23

Check if a macro is not deﬁned

37/62

Common Error 1

Defining macros in header files and before includes!!

# include <iostream>

# define value // very dangerous!!

# include "big_lib.hpp"

int main() {

std::cout << f(4); // should print 7, but it always prints 3

}

big_lib.hpp:

int f(int value) { // 'value' disappears

return value + 3;

}

It is very hard to see this problem when the macro is in a header

38/62

Common Error 2

#if defined can introduce bugs related to macro visibility

// #include "macro_definition.hpp" // forget to add the header that defines ENABLE_DEBUG

# if defined(ENABLE_DEBUG)

int f(int v) { cout << v << endl; return v * 3; }

# else

int f(int v) { return v * 3; }

# endif

# if ENABLE_DEBUG // evaluated to 0 or 1

int f(int v) { cout << v << endl; return v * 3; }

# else

int f(int v) { return v * 3; }

# endif

39/62

Common Error 3

Forgetting to use parentheses in macro definitions!!

# include <iostream>

# define SUB1(a, b) a - b // WRONG

# define SUB2(a, b) (a - b) // WRONG

# define SUB3(a, b) ((a) - (b)) // correct

int main() {

std::cout << (5 * SUB1(2, 1)); // print 9 not 5!!

std::cout << SUB2(3 + 3, 2 + 2); // print 6 not 2!!

std::cout << SUB3(3 + 3, 2 + 2); // print 2

}

40/62

Common Error 4

Macros make it hard to find compile errors!!

1: # include <iostream>

3: # define F(a) { \

4: ... \

5: ... \

6: return v;

8: int main() {

9: F(3); // compile error at line 9!!

10: }

• In which line is the error??!*

* modern compilers are able to show the macro expansion in the error message

41/62

Common Error 5

Macro can introduce bugs related to the evaluation of their expressions!!

# if defined(DEBUG)

# define CHECK(EXPR)

// do something with EXPR

void check(bool b) { /* do something with b */ }

# else

# define CHECK(EXPR)

// do nothing

void check(bool) {} // do nothing

# endif

bool f() { /* return a boolean value */ }

check( f() )

CHECK( f() ) // <-- problem here

• What happens when DEBUG is not deﬁned?

f() is not evaluated by using the macro
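The difference between the function and the macro can be observed by counting how many times f() runs. This is a sketch of the release-like configuration only (DEBUG not defined); the counter calls is a hypothetical addition.

```cpp
#include <cassert>

int calls = 0;                     // counts how many times f() runs
bool f() { ++calls; return true; } // hypothetical function with a side effect

// Release-like configuration: both variants ignore the result
#define CHECK(EXPR)  // the argument text is discarded by the preprocessor
void check(bool) {}  // the argument is still evaluated by the caller

void demo() {
    int before = calls;
    check(f());
    assert(calls == before + 1); // f() was called
    CHECK(f());
    assert(calls == before + 1); // f() was never called: the expression vanished
}
```

If f() performs work the program relies on, that work silently disappears in the macro version.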

42/62

Common Error 6

Forgetting curly brackets in multi-line macros!!

# include <iostream>

# include <nuclear_explosion.hpp>

# define NUCLEAR_EXPLOSION /* missing { */ \

std::cout << "start nuclear explosion"; \

nuclear_explosion(); /* missing } */

int main() {

bool never_happen = false;

if (never_happen)

NUCLEAR_EXPLOSION

} // BOOM!!

The second statement of the macro is always executed!!

43/62

Common Error 7

Macros do not have scope!!

# include <iostream>

void f() {

# define value 4

std::cout << value;

}

int main() {

f(); // 4

std::cout << value; // 4

# define value 3

f(); // 4

std::cout << value; // 3

}

* In general, compilers raise a warning for multiple deﬁnitions of the same macro

44/62

Common Error 8

Macros can have side effects!!

# define MIN(a, b) ((a) < (b) ? (a) : (b))

int main() {

int array1[] = { 1, 5, 2 };

int array2[] = { 6, 3, 4 };

int i = 0;

int j = 0;

int v1 = MIN(array1[i++], array2[j++]); // v1 = 5!!

int v2 = MIN(array1[i++], array2[j++]); // reads array1[3]: undefined behavior,

} // possible segmentation fault

arne-mertz.de/2019/03/macro-evil
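A function evaluates each argument exactly once, which removes the problem. The sketch below compares the MIN macro with a hypothetical function template min_value (std::min from <algorithm> would serve the same purpose).

```cpp
#include <cassert>

#define MIN(a, b) ((a) < (b) ? (a) : (b)) // macro from the slide

template<typename T>
constexpr T min_value(T a, T b) { return a < b ? a : b; } // hypothetical replacement

void demo() {
    int array1[] = {1, 5, 2};
    int array2[] = {6, 3, 4};

    int i = 0, j = 0;
    int v1 = MIN(array1[i++], array2[j++]);
    assert(v1 == 5 && i == 2 && j == 1); // 'i' was incremented twice

    i = 0; j = 0;
    int v2 = min_value(array1[i++], array2[j++]);
    assert(v2 == 1 && i == 1 && j == 1); // each argument evaluated exactly once
}
```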

45/62

Common Error 9

Macros can have undeﬁned behavior themselves!!

# define MY_MACRO defined(EXTERNAL_MACRO)

# if MY_MACRO

# define MY_VALUE 1

# else

# define MY_VALUE 0

# endif

int f() { return MY_VALUE; } // undefined behavior

46/62

When Preprocessors are Necessary

• Conditional compiling : diﬀerent architectures, compiler features, etc.

• Mixing diﬀerent languages: code generation (example: asm assembly)

• Complex name replacing: see template programming

Otherwise, prefer const and constexpr for constant values and functions

# define SIZE 3 // replaced with

const int SIZE = 3; // only C++11 at global scope

# define SUB(a, b) ((a) - (b)) // replaced with

constexpr int sub(int a, int b) {

return a - b;

}

Are We Macro free Yet, CppCon2019

47/62

Source Location Macros 1/3

__LINE__ Integer value representing the current line in the source code ﬁle

being compiled

__FILE__ A string literal containing the name of the source ﬁle being

compiled

__FUNCTION__ (non-standard, gcc, clang) A string literal containing the name of

the function in the ‘macro scope’

__PRETTY_FUNCTION__ (non-standard, gcc, clang) A string literal containing the full

signature of the function in the ‘macro scope’

__func__ (C++11 predefined identifier) A string containing the name of the function in

the ‘macro scope’

48/62

Source Location Macros 2/3

source.cpp:

# include <iostream>

void f(int p) {

std::cout << __FILE__ << ":" << __LINE__; // print 'source.cpp:4'

std::cout << __FUNCTION__; // print 'f'

std::cout << __func__; // print 'f'

}

// see template lectures

template<typename T>

float g(T p) {

std::cout << __PRETTY_FUNCTION__; // print 'float g(T) [T = int]'

return 0.0f;

}

void g1() { g(3); }

49/62

Source Location Macros 3/3

C++20 provides source location utilities for replacing macro-based approach

#include <source_location>

current() get source location info (static member)

line() source code line

column() line column

file_name() current ﬁle name

function_name() current function name

# include <source_location>

void f(std::source_location s = std::source_location::current()) {

cout << "function: " << s.function_name() << ", line " << s.line();

}

f();

// print the name of the calling function and the line of the call,

// because the default argument is evaluated at the call site

50/62

Conditional Compiling Macros 1/2

Select code depending on the C/C++ version

• #if defined(__cplusplus) C++ code

• #if __cplusplus == 199711L ISO C++ 1998/2003

• #if __cplusplus == 201103L ISO C++ 2011*

• #if __cplusplus == 201402L ISO C++ 2014*

• #if __cplusplus == 201703L ISO C++ 2017

Select code depending on the compiler

• #if defined(__GNUG__) The compiler is gcc/g++ †

• #if defined(__clang__) The compiler is clang/clang++

• #if defined(_MSC_VER) The compiler is Microsoft Visual C++

* MSVC defines __cplusplus == 199711L even for C++11/14 (unless /Zc:__cplusplus is used)

† __GNUC__ is defined by many compilers, e.g. clang

51/62

Conditional Compiling Macros 2/2

Select code depending on the operating system or environment

• #if defined(_WIN64) OS is Windows 64-bit

• #if defined(__linux__) OS is Linux

• #if defined(__APPLE__) OS is Mac OS

• #if defined(__MINGW32__) OS is MinGW 32-bit

• ...and many others

__DATE__ A string literal in the form "MMM DD YYYY" containing the date in which

the compilation process began

__TIME__ A string literal in the form "hh:mm:ss" containing the time at which the

compilation process began

52/62

Other Macros

Very comprehensive macro list:

• sourceforge.net/p/predef/wiki/Home/

• How to detect the operating system type using compiler predefined

macros

• Abseil platform macros

• Boost.Predef

53/62

Feature Testing Macro

C++17 introduces __has_include macro which returns 1 if header or source ﬁle

with the speciﬁed name exists

# if __has_include(<iostream>)

# include <iostream>

# endif

C++20 introduces a set of macros to evaluate if a given feature is supported by the

compiler

# if __cpp_constexpr

constexpr int square(int x) { return x * x; }

# endif

Feature Testing Macros

54/62

Common Error 10 ⇝

Macros depend on compilers and environment!!

struct A {

int x;

# if __cplusplus >= 201103 // enable C++11 code

A() = default;

# else

A() {}

# endif

};

// should return ≈ 10.0f

float safe_function() {

A a{}; // zero-initialization

for (int i = 0; i < 10; i++)

a.x += 1.0f;

return a.x;

}

// what is the behavior ???

The code works fine on Linux, but not under Windows MSVC. MSVC sets __cplusplus to

199711 even if the C++11/14/17 flag is set!! In this case a.x is left uninitialized and the

returned value is indeterminate

see Lecture "Object-Oriented Programming II - Zero Initialization" and "MSVC now correctly

reports __cplusplus"

55/62

Stringizing Operator (#)

The stringizing macro operator ( # ) causes the corresponding actual argument to be

enclosed in double quotation marks

# define STRING_MACRO(string) #string

cout << STRING_MACRO(hello); // equivalent to "hello"

# define INFO_MACRO(my_func) \

{ \

my_func; \

cout << "call " << #my_func << " at " \

<< __FILE__ << ":" << __LINE__; \

}

void g(int) {}

INFO_MACRO( g(3) ) // print: "call g(3) at my_file.cpp:7"
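The operator works on the source text of the argument, not on its value. A minimal sketch, with a hypothetical macro name STRINGIZE:

```cpp
#include <cassert>
#include <string>

#define STRINGIZE(x) #x // hypothetical name

void demo() {
    assert(std::string(STRINGIZE(hello)) == "hello");
    assert(std::string(STRINGIZE(a + b)) == "a + b"); // tokens, the names need not exist
    assert(std::string(STRINGIZE(1 + 2)) == "1 + 2"); // the expression is not evaluated
}
```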

56/62

Common Error 11

Code injection

# include <cstdio>

# define CHECK_ERROR(condition) \

{ \

if (condition) { \

std::printf("expr: " #condition " failed at line %d\n",\

__LINE__); \

} \

}

int t = 6, s = 3;

CHECK_ERROR(t > s) // print "expr: t > s failed at line 13"

CHECK_ERROR(t % s == 0) // segmentation fault!!!

// printf interprets "% s" as a format specifier
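One way to avoid the injection is to pass the stringized condition as data for a "%s" specifier instead of pasting it into the format string. The sketch below keeps the logic of the macro above but writes into a hypothetical global buffer message (via std::snprintf) so that the result can be inspected.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

char message[128]; // hypothetical buffer so the result can be inspected

// The stringized condition is an argument of "%s": it is never parsed as a format string
#define CHECK_ERROR(condition) \
    { \
        if (condition) { \
            std::snprintf(message, sizeof(message), \
                          "expr: %s failed at line %d", #condition, __LINE__); \
        } \
    }

void demo() {
    int t = 6, s = 3;
    CHECK_ERROR(t % s == 0)
    assert(std::string(message).find("expr: t % s == 0 failed at line ") == 0);
}
```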

57/62

#error and #warning

• #error "text" The directive emits a user-specified error message at compile

time when the compiler parses it and stops the compilation process

• C++23 #warning "text" The directive emits a user-specified warning message

at compile time when the compiler parses it without stopping the compilation

process

58/62

#pragma

The #pragma directive controls implementation-speciﬁc behavior of the compiler. In

general, it is not portable

• #pragma message "text" Display informational messages at compile time

(every time this instruction is parsed)

• #pragma GCC diagnostic ignored "-Wformat"

Disable a GCC warning

• _Pragma(<command>) (C++11)

It is an operator and can be embedded in a #define

#define MY_MESSAGE \

_Pragma("message(\"hello\")")

59/62

Token-Pasting Operator (##)

⋆

The token-concatenation (or pasting) macro operator ( ## ) allows combining two

tokens (without leaving blank spaces between them)

# define FUNC_GEN_A(tokenA, tokenB) \

void tokenA##tokenB() {}

# define FUNC_GEN_B(tokenA, tokenB) \

void tokenA##_##tokenB() {}

FUNC_GEN_A(my, function)

FUNC_GEN_B(my, function)

myfunction(); // ok, from FUNC_GEN_A

my_function(); // ok, from FUNC_GEN_B
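Token pasting is typically used to generate families of similar declarations. A minimal sketch, assuming a hypothetical generator macro DEFINE_GETTER:

```cpp
#include <cassert>

// Hypothetical generator: defines a function named get_<name> that returns <value>
#define DEFINE_GETTER(name, value) \
    int get_##name() { return value; }

DEFINE_GETTER(width, 640)  // defines get_width()
DEFINE_GETTER(height, 480) // defines get_height()

void demo() {
    assert(get_width() == 640);
    assert(get_height() == 480);
}
```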

60/62

Variadic Macro

⋆

A variadic macro C++11 is a special macro accepting a variable number of arguments

(separated by comma)

Each occurrence of the special identifier __VA_ARGS__ in the macro replacement list

is replaced by the passed arguments

Example:

void f(int a) { printf("%d", a); }

void f(int a, int b) { printf("%d %d", a, b); }

void f(int a, int b, int c) { printf("%d %d %d", a, b, c); }

# define PRINT(...) \

f(__VA_ARGS__);

PRINT(1, 2)

PRINT(1, 2, 3)
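The forwarding of __VA_ARGS__ can be verified with overloads that report how many arguments they received. The names arity and ARITY are hypothetical.

```cpp
#include <cassert>

int arity(int)           { return 1; } // hypothetical overloads that report
int arity(int, int)      { return 2; } // how many arguments they received
int arity(int, int, int) { return 3; }

#define ARITY(...) arity(__VA_ARGS__) // forwards every argument unchanged

void demo() {
    assert(ARITY(7) == 1);
    assert(ARITY(7, 8) == 2);
    assert(ARITY(7, 8, 9) == 3);
}
```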

61/62

Macro Trick

⋆

Convert a number literal to a string literal

# define TO_LITERAL_AUX(x) #x

# define TO_LITERAL(x) TO_LITERAL_AUX(x)

Motivation: avoid integer to string conversion (performance)

int main() {

int x1 = 3 * 10;

int y1 = __LINE__ + 4;

char x2[] = TO_LITERAL(3);

char y2[] = TO_LITERAL(__LINE__);

}
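The auxiliary macro is needed because the operand of # is not macro-expanded. The sketch below shows the difference with a hypothetical input macro ANSWER.

```cpp
#include <cassert>
#include <string>

#define TO_LITERAL_AUX(x) #x
#define TO_LITERAL(x) TO_LITERAL_AUX(x)
#define ANSWER 42 // hypothetical macro used as input

void demo() {
    assert(std::string(TO_LITERAL_AUX(ANSWER)) == "ANSWER"); // not expanded before stringizing
    assert(std::string(TO_LITERAL(ANSWER)) == "42");         // expanded first, then stringized
    assert(std::string(TO_LITERAL(3)) == "3");
}
```

The same mechanism is what turns __LINE__ into its number rather than the text "__LINE__".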

62/62

Modern C++

Programming

8. Object-Oriented

Programming I

Class Concepts

Federico Busato

2025-04-14

Table of Contents

1 C++ Classes

RAII Idiom

2 Class Hierarchy

3 Access speciﬁers

Inheritance Access Speciﬁers

When Use public/protected/private/ for Data Members?

1/67

Table of Contents

4 Class Constructor

Default Constructor

Class Initialization

Uniform Initialization for Objects

Delegate Constructor

explicit Keyword

2/67

Table of Contents

5 Copy Constructor

6 Class Destructor

7 Defaulted Constructors, Destructor, and Operators

(=default)

3/67

Table of Contents

8 Class Keywords

this

static

const

mutable

using

friend

delete

4/67

C++ Classes

C Structure

A C structure (struct) is a collection of variables of the same or diﬀerent data

types under a single name

C++ Class

A class (class) extends the concept of structure to hold functions as members

struct vs. class in C++

Structures and classes are semantically equivalent in C++. However, the keywords

should be used to distinguish between diﬀerent semantics:

• struct represents passive objects, namely the physical state (set of data)

• class represents active objects, namely the logical state (data abstraction)

5/67

Class Members - Data and Function Members

Data Member

Data within a class are called data members or class ﬁelds

Function Member

Functions within a class are called function members or methods

6/67

RAII Idiom - Resource Acquisition is Initialization

Holding a resource is a class invariant, and is tied to object

lifetime

The RAII idiom consists of three steps:

• Encapsulate a resource into a class (constructor)

• Use the resource via a local instance of the class

• The resource is automatically released when the object goes out of scope

(destructor)

Implication 1: C++ programming language does not require the garbage collector!!

Implication 2: The programmer has the responsibility to manage the resources
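The three steps can be sketched with a class whose "resource" is just a global counter. The names Resource and open_resources are hypothetical stand-ins for a real resource such as a file, a lock, or memory.

```cpp
#include <cassert>

int open_resources = 0; // hypothetical stand-in for a real resource

class Resource {
public:
    Resource()  { ++open_resources; } // acquire in the constructor
    ~Resource() { --open_resources; } // release in the destructor
    Resource(const Resource&) = delete; // copying is disabled to avoid a double release
    Resource& operator=(const Resource&) = delete;
};

void demo() {
    int before = open_resources;
    {
        Resource r; // use the resource via a local instance
        assert(open_resources == before + 1);
    } // scope ends: the destructor runs automatically
    assert(open_resources == before);
}
```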

7/67

struct/class Declaration and Deﬁnition

struct declaration and deﬁnition

struct A; // struct declaration

struct A { // struct deﬁnition

int x; // data member

void f(); // function member

};

class declaration and deﬁnition

class A; // class declaration

class A { // class deﬁnition

int x; // data member

void f(); // function member

};

8/67

struct/class Function Declaration and Deﬁnition

struct A {

void g(); // function member declaration

void f() { // function member declaration

cout << "f"; // inline definition

}

};

void A::g() { // function member definition

cout << "g"; // out-of-line definition

}

9/67

struct/class Members

struct B {

void g() { cout << "g"; } // function member

};

struct A {

int x; // data member

B b; // data member

void f() { cout << "f"; } // function member

};

A a;

a.x;

a.f();

a.b.g();

10/67

Class Hierarchy

Class Hierarchy 1/3

Child/Derived Class or Subclass

A new class that inherits variables and functions from another class is called a

derived or child class

Parent/Base Class

The closest class providing variables and functions of a derived class is called parent

or base class

Extending a base class refers to creating a new class which retains the characteristics of the

base class and, on top of that, can add (and never remove) its own members

Syntax:

class DerivedClass : [<inheritance attribute>] BaseClass {

11/67

Class Hierarchy 2/3

struct A { // base class

int value = 3;

void g() {}

};

struct B : A { // B is a derived class of A (B extends A)

int data = 4; // B inherits from A

int f() { return data; }

};

A a;

B b;

a.value;

b.g();

12/67

Class Hierarchy 3/3

struct A {};

struct B : A {};

void f(A a) {} // copy

void g(B b) {} // copy

void f_ref(A& a) {} // the same for A*

void g_ref(B& b) {} // the same for B*

A a;

B b;

f(a); // ok, also f(b), f_ref(a), g_ref(b)

g(b); // ok, also g_ref(b), but not g(a), g_ref(a)

A a1 = b; // ok, also A& a2 = b

// B b1 = a; // compile error

13/67

Access speciﬁers

Access speciﬁers 1/2

The access speciﬁers deﬁne the visibility of inherited members of the subsequent base

class. The keywords

public , private , and protected specify the sections of

visibility

The goal of the access speciﬁers is to prevent direct access to the internal

representation of the class for avoiding wrong usage and potential inconsistency

(access control)

• public: No restriction (function members, derived classes, outside the class)

• protected: Function members and derived classes access

• private: Function members only access (internal)

struct has default public members

class has default private members

14/67

Access speciﬁers 2/2

struct A1 {

int value; // public (by default)

protected:

void f1() {} // protected

private:

void f2() {} // private

};

class A2 {

int data; // private (by default)

};

struct B : A1 {

void h1() { f1(); } // ok, "f1" is visible in B

// void h2() { f2(); } // compile error "f2" is private in A1

};

A1 a;

a.value; // ok

// a.f1() // compile error protected

// a.f2() // compile error private

15/67

Inheritance Access Speciﬁers 1/3

The access speciﬁers are also used for deﬁning how the visibility is propagated from

the base class to a speciﬁc derived class in the inheritance

Member declaration   Inheritance     Derived classes

public                               public
protected        →   public      →   protected
private                              \ (not accessible)

public                               protected
protected        →   protected   →   protected
private                              \ (not accessible)

public                               private
protected        →   private     →   private
private                              \ (not accessible)

struct has default public members

16/67

Inheritance Access Speciﬁers 2/3

struct A {

int var1; // public

protected:

int var2; // protected

};

struct B : protected A {

int var3; // public

};

B b;

// b.var1; // compile error, var1 is protected in B

// b.var2; // compile error, var2 is protected in B

b.var3; // ok, var3 is public in B

17/67

Inheritance Access Speciﬁers 3/3

class A {

public:

int var1;

protected:

int var2;

};

class B1 : A {}; // private inheritance

class B2 : public A {}; // public inheritance

B1 b1;

// b1.var1; // compile error, var1 is private in B1

// b1.var2; // compile error, var2 is private in B1

B2 b2;

b2.var1; // ok, var1 is public in B2

// b2.var2; // compile error, var2 is protected in B2
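Since a compile error cannot be shown in a running program, the effect of the inheritance access specifier can be checked with type traits instead. This is a sketch with empty classes, relying on std::is_convertible being evaluated from outside both classes.

```cpp
#include <type_traits>

class A {};
class B1 : A {};        // private inheritance (default for class)
class B2 : public A {}; // public inheritance

// Outside code can treat a B2 as an A, but not a B1
static_assert(std::is_convertible<B2*, A*>::value,  "public base is accessible");
static_assert(!std::is_convertible<B1*, A*>::value, "private base is not accessible");

// The inheritance relation itself exists in both cases
static_assert(std::is_base_of<A, B1>::value, "B1 derives from A");
static_assert(std::is_base_of<A, B2>::value, "B2 derives from A");
```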

18/67

When Use public/protected/private/ for Data Members?

When to use protected/private data members:

• They are not part of the interface, namely the logical state of the object (not

useful for the user)

• They must preserve the

const correctness (e.g. for pointer), see Advanced

Concepts I

When to use public data members:

• They can potentially change any time

• const correctness is preserved for values and references, as opposite to pointers.

Data members should be preferred to member functions in this case

19/67

Class Constructor

Constructor [ctor]

A constructor is a special member function of a class that is executed when a new

instance of that class is created

Goals: initialization and resource acquisition

Syntax: T(...) same name as the class and no return type

• A constructor is supposed to initialize all data members

• We can deﬁne multiple constructors with diﬀerent signatures

• Any constructor can be constexpr

20/67

Default Constructor

The default constructor T() is a constructor with no argument

Every class always has either an implicit, explicit, or deleted default constructor

struct A {

A() {} // explicit default constructor

A(int) {} // user-defined (non-default) constructor

};

struct A {

int x = 3; // implicit default constructor

};

A a{}; // call the default constructor, equivalent to: A a;

Note: an implicit default constructor is constexpr

21/67

Default Constructor Examples

struct A {

A() { cout << "A"; } // default constructor

};

A a1; // call the default constructor

// A a2(); // interpreted as a function declaration!!

A a3{}; // ok, call the default constructor

// direct-list initialization (C++11)

A array[3]; // print "AAA"

A* ptr = new A[4]; // print "AAAA"

22/67

Deleted Default Constructor 1/2

The implicit default constructor of a class is marked as deleted if (simpliﬁed):

• It has any user-deﬁned constructor

struct A {

A(int x) {}

};

// A a; // compile error

• It has a non-static member/base class of reference/const type

struct NoDefault { // deleted default constructor

int& x;

const int y;

};

23/67

Deleted Default Constructor 2/2

• It has a non-static member/base class which has a deleted (or inaccessible)

default constructor

struct A {

NoDefault var;

// deleted default constructor

};

struct B : NoDefault {}; // deleted default constructor

• It has a non-static member/base class with a deleted or inaccessible destructor

struct A {

private:

~A() {}

};

24/67

Initializer List

The Initializer list is used to initialize the data members of a class or to explicitly call the base class constructor before entering the constructor body

(Not to be confused with std::initializer_list)

struct A {

int x, y;

A(int x1) : x(x1) {} // ": x(x1)" is the Initializer list

// direct initialization syntax

A(int x1, int y1) : // ": x{x1}, y{y1}"

x{x1}, // is the Initializer list

y{y1} {} // direct-list initialization syntax

}; // (C++11)

25/67

In-Class Member Initializer

C++11 In-class non-static data members initialization (NSDMI) allows initializing

the data members where they are declared. A user-deﬁned constructor can be used to

override their default values

struct A {

int x = 0; // in-class member initializer

const char* str = nullptr; // in-class member initializer

A() {} // "x" and "str" are well-defined if

// the default constructor is called

A(const char* str1) : str{str1} {}

};

26/67

Data Member Initialization

const and reference data members must be initialized by using the initialization list or by using in-class brace-or-equal-initializer syntax (C++11)

struct A {

int x;

const char y; // must be initialized

int& z; // must be initialized

int& v = x; // equal-initializer (C++11)

const int w{4}; // brace initializer (C++11)

A() : x(3), y('a'), z(x) {}

};

27/67

Initialization Order

Class member initialization follows the order of declarations and not the order in the

initialization list

struct ArrayWrapper {

int* array;

int size;

ArrayWrapper(int user_size) :

size{user_size},

array{new int[size]} {}

// wrong!!: "size" is still undefined when "array" is initialized

};

ArrayWrapper a(10);

cout << a.array[4]; // undefined behavior (e.g. segmentation fault)
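One possible fix, sketched under the assumption that the member order can be changed: declare "size" before "array" so that the declaration order matches the dependency. The name SafeArrayWrapper and the helper element_after_construction() are hypothetical and used only for illustration.

```cpp
#include <cassert>

// Hypothetical variant of ArrayWrapper: "size" is declared first,
// so it is already initialized when "array" is initialized.
struct SafeArrayWrapper {
    int  size;   // declared first  -> initialized first
    int* array;  // declared second -> may safely depend on "size"

    explicit SafeArrayWrapper(int user_size)
        : size{user_size},
          array{new int[size]()} {}   // "()" zero-initializes the elements

    ~SafeArrayWrapper() { delete[] array; }

    // copying is disabled to keep the sketch short (avoids a double delete)
    SafeArrayWrapper(const SafeArrayWrapper&)            = delete;
    SafeArrayWrapper& operator=(const SafeArrayWrapper&) = delete;
};

// small usage helper for this sketch
int element_after_construction() {
    SafeArrayWrapper a(10);
    assert(a.size == 10);
    return a.array[4];   // well-defined: the array is zero-initialized
}
```

Most compilers can also report the original mistake (e.g. GCC and Clang with -Wreorder, which is part of -Wall).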

28/67

Uniform Initialization for Objects

Uniform Initialization (C++11)

Uniform Initialization {}, also called list-initialization, is a way to fully initialize any

object independently of its data type

• Minimizing Redundant Typenames

- In function arguments

- In function returns

• Solving the “Most Vexing Parse" problem

- Constructor interpreted as function prototype

mbevin.wordpress.com/2012/11/16/uniform-initialization

29/67

Minimizing Redundant Typenames

struct Point {

int x, y;

Point(int x1, int y1) : x(x1), y(y1) {}

};

C++03

Point add(Point a, Point b) {

return Point(a.x + b.x, a.y + b.y);

}

Point c

= add(Point(1, 2), Point(3, 4));

C++11

Point add(Point a, Point b) {

return { a.x + b.x, a.y + b.y }; // here

}

auto c = add({1, 2}, {3, 4}); // here

30/67

“Most Vexing Parse" problem 1/2

struct A {

A(int) {}

};

struct B {

// A a(1); // compile error. It works in a function scope

A a{2}; // ok, call the constructor

};

31/67

"Most Vexing Parse" problem ⋆ 2/2

struct A {};

struct B {

B(A a) {}

void f() {}

};

B b( A() ); // "b" is interpreted as function declaration

// with a single argument A (*)() (func. pointer)

// b.f() // compile error "Most Vexing Parse" problem

// solved with B b{ A{} };

32/67

Constructors and Inheritance

Class constructors are never inherited

A Derived class must call implicitly or explicitly a Base constructor before the current class constructor

Class constructors are called in order from the top Base class to the most Derived class (C++ objects are constructed like onions)

struct A {

A() { cout << "A"; };

};

struct B1 : A { // call "A()" implicitly

int y = 3; // then, "y = 3"

};

struct B2 : A { // call "A()" explicitly

B2() : A() { cout << "B"; }

};

B1 b1; // print "A"

B2 b2; // print "A", then print "B"

33/67

Delegate Constructor

The problem:

Most constructors usually perform identical initialization steps before executing

individual operations

C++11 A delegate constructor calls another constructor of the same class to reduce

the repetitive code by adding a function that does all the initialization steps

struct A {

int a;

float b;

bool c;

// standard constructor:

A(int a1, float b1, bool c1) : a(a1), b(b1), c(c1) {

// do a lot of work

}

A(int a1, float b1) : A(a1, b1, false) {} // delegate constructor

A(float b1) : A(100, b1, false) {} // delegate constructor

};

34/67

explicit Keyword 1/2

explicit

The explicit keyword speciﬁes that a constructor or conversion operator (C++11)

does not allow implicit conversions or copy-initialization from single arguments or

braced initializers

The problem:

struct MyString {

MyString(int n); // (1) allocates n bytes for the string

MyString(const char *p); // (2) initializes starting from a raw string

};

MyString string = 'a'; // calls (1), implicit conversion!!

explicit cannot be applied to copy/move-constructors

Most C++ constructors should be explicit

35/67

explicit Keyword 2/2

struct A {

A() {}

A(int) {}

A(int, int) {}

};

void f(const A&) {}

A a1 = {}; // ok

A a2(2); // ok

A a3 = 1; // ok (implicit)

A a4{4, 5}; // ok. Selected A(int, int)

A a5 = {4, 5}; // ok. Selected A(int, int)

f({}); // ok

f(1); // ok

f({1}); // ok

struct B {

explicit B() {}

explicit B(int) {}

explicit B(int, int) {}

};

void f(const B&) {}

// B b1 = {}; // error, implicit conversion

B b2(2); // ok

// B b3 = 1; // error, implicit conversion

B b4{4, 5}; // ok. Selected B(int, int)

// B b5 = {4, 5}; // error, implicit conversion

B b6 = (B) 1; // OK: explicit cast

// f({}); // error, implicit conversion

// f(1); // error, implicit conversion

// f({1}); // error, implicit conversion

f(B{1}); // ok

36/67

Copy Constructor

A copy constructor T(const T&) creates a new object as a deep copy of an

existing object

struct A {

A() {} // default constructor

A(int) {} // non-default constructor

A(const A&) {} // copy constructor → direct initialization

};

37/67

Copy Constructor Details

• Every class always deﬁnes an implicit or explicit copy constructor, potentially

deleted

• The copy constructor implicitly calls the default Base class constructor

• Even the copy constructor is considered a user-deﬁned constructor

• The copy constructor doesn’t have template parameters, otherwise it is a standard

member function

• The copy constructor must not be confused with the assignment operator

operator=

MyStruct x;

MyStruct y{x}; // copy constructor

y = x; // call the assignment operator=, not the copy constructor

// → copy initialization, see next lecture

38/67

Copy Constructor Example

struct Array {

int size;

int* array;

Array(int size1) : size{size1} {

array = new int[size];

}

// copy constructor, ": size{obj.size}" initializer list

Array(const Array& obj) : size{obj.size} {

array = new int[size];

for (int i = 0; i < size; i++)

array[i] = obj.array[i];

}

};

Array x{100}; // do something with x.array ...

Array y{x}; // call "Array::Array(const Array&)"

39/67

Copy Constructor Usage

The copy constructor is used to:

• Initialize one object from another one having the same type

- Direct construction

- Assignment syntax (=)

A a1;

A a2(a1); // Direct copy initialization

A a3{a1}; // Direct copy initialization

A a4 = a1; // Copy initialization

A a5 = {a1}; // Copy list initialization

• Copy an object which is passed by-value as input parameter of a function

void f(A a);

• Copy an object which is returned as result from a function***

A f() { return A(3); } // *** without RVO optimization (see 'Advanced Concepts I' lecture)

40/67

Copy Constructor Usage Examples

struct A {

A() {}

A(const A& obj) { cout << "copy"; }

};

void f(A a) {} // pass by-value

A g1(A& a) { return a; }

A g2() { return A(); }

A a;

A b = a; // copy constructor (assignment) "copy"

A c(b); // copy constructor (direct) "copy"

f(b); // copy constructor (argument) "copy"

g1(a); // copy constructor (return value) "copy"

A d = g2(); // * see RVO optimization (Advanced Concepts I)

41/67

Pass by-value and Copy Constructor

struct A {

A() {}

A(const A& obj) { cout << "expensive copy"; }

};

struct B : A {

B() {}

B(const B& obj) { cout << "cheap copy"; }

};

void f1(B b) {}

void f2(A a) {}

B b1;

f1(b1); // cheap copy

f2(b1); // expensive copy!! It calls A(const A&) implicitly
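A minimal sketch of the usual remedy, assuming the callee only needs to read the object: take the parameter by const reference, so that no copy constructor runs and the object is not sliced. The names copy_log, by_value, by_reference, and run_calls are hypothetical helpers introduced to observe the behavior.

```cpp
#include <cassert>
#include <string>

std::string copy_log;   // hypothetical log, records which copy constructors run

struct A {
    A() {}
    A(const A&) { copy_log += "A-copy "; }
};

struct B : A {
    B() {}
    B(const B&) : A() { copy_log += "B-copy "; }
};

void by_value(A) {}             // copies (and slices) a B argument
void by_reference(const A&) {}  // binds directly to the B object: no copy

std::string run_calls() {
    copy_log.clear();
    B b;
    by_reference(b);            // nothing is logged
    assert(copy_log.empty());
    by_value(b);                // calls A(const A&) only
    return copy_log;
}
```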

42/67

Deleted Copy Constructor ⇝ 1/3

The implicit copy constructor of a class is marked as deleted if:

• The class has the move constructor (next lectures)

struct A {

A(A&&) {} // 'A' implicit copy constructor is deleted

};

• The class has a user-declared move assignment operator

struct A {

A& operator=(A&&) = default; // 'A' implicit copy constructor is deleted

};

43/67

Deleted Copy Constructor ⇝ 2/3

• It has a non-static member/base class with a deleted (or inaccessible) copy

constructor

# include <memory> // std::unique_ptr

struct A {

A(const A&) = delete; // explicitly deleted

};

struct B {

std::unique_ptr<int> ptr; // unique_ptr is non-copyable

}; // 'B' implicit copy constructor is deleted

class C {

C(const C&) {} // copy constructor is private

};

struct D1 : A {}; // 'D1' implicit copy constructor is deleted

struct D2 : C {}; // 'D2' implicit copy constructor is deleted

struct E {

A a;

};

// 'E' implicit copy constructor is deleted

44/67

Deleted Copy Constructor⇝ 3/3

• It has a non-static member/base class with a deleted (or inaccessible) destructor

struct A {

~A() = delete; // explicitly deleted

};

class B {

~B() {} // destructor is private

};

struct C1 : A {}; // 'C1' implicit copy constructor is deleted

struct C2 : B {}; // 'C2' implicit copy constructor is deleted

struct D {

A a;

};

// 'D' implicit copy constructor is deleted

45/67

Class Destructor

Class Destructor 1/3

Destructor [dtor]

A destructor is a special member function that is executed whenever an object goes out of scope or whenever the delete/delete[] expression is applied to a pointer of that class

Goals: releasing resources

Syntax: ~T() same name as the class and no return type

• Any object has exactly one destructor, which is always implicitly or explicitly declared

• C++20 The destructor can be constexpr

46/67

Class Destructor 2/3

struct Array {

int* array;

Array() { // constructor

array = new int[10];

}

~Array() { // destructor

delete[] array;

}

};

int main() {

Array a; // call the constructor

for (int i = 0; i < 5; i++)

Array b; // call 5 times the constructor + destructor

} // call the destructor of "a"

47/67

Class Destructor - Order of Calls 3/3

Class destructor is never inherited. Base class destructor is invoked after the

current class destructor

Class destructors are called in reverse order. From the most Derived to the top

Base class

struct A {

~A() { cout << "A"; }

};

struct B {

~B() { cout << "B"; }

};

struct C : A {

B b; // call ~B()

~C() { cout << "C"; }

};

int main() {

C b; // print "C", then "B", then "A"

}
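The construction and destruction orders can be observed together with a small trace. This is only a sketch: the global string trace and the types Base, Member, and Derived are hypothetical names, not part of the slides.

```cpp
#include <cassert>
#include <string>

std::string trace;   // hypothetical log of constructor/destructor calls

struct Base {
    Base()  { trace += "Base ";  }
    ~Base() { trace += "~Base "; }
};

struct Member {
    Member()  { trace += "Member ";  }
    ~Member() { trace += "~Member "; }
};

struct Derived : Base {
    Member m;
    Derived()  { trace += "Derived ";  }
    ~Derived() { trace += "~Derived "; }
};

std::string run_trace() {
    trace.clear();
    {
        Derived d;   // base first, then members, then the constructor body
        assert(trace == "Base Member Derived ");
    }                // destruction runs in the exact reverse order
    return trace;
}
```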

48/67

Defaulted Constructors, Destructor, and Operators (=default)

Defaulted Constructors, Destructor, and Operators (=default) 1/3

C++11 The compiler can automatically generate

• default/copy/move constructors

A() = default

A(const A&) = default

A(A&&) = default

• destructor

~A() = default

• copy/move assignment operators

A& operator=(const A&) = default

A& operator=(A&&) = default

• spaceship operator

auto operator<=>(const A&) const = default

= default implies constexpr , but not noexcept or explicit

49/67

Defaulted Constructors, Destructor, and Operators (=default) 2/3

When the compiler-generated constructors, destructors, and operators are useful:

• Change the visibility of non-user provided constructors and assignment operators (public, protected, private)

• Make visible the declarations of such members

The defaulted default constructor has a similar effect as a user-defined constructor with empty body and empty initializer list

When the compiler-generated constructor is useful:

• Any user-provided constructor disables implicitly-generated default constructor

• Force the default values for the class data members

50/67

Defaulted Constructors, Destructor, and Operators (=default) 3/3

struct A {

A(int v1) {} // delete implicitly-defined default ctor because

// a user-provided constructor is defined

A() = default; // now, A has the default constructor

};

struct B {

protected:

B() = default; // now it is protected

};

struct C {

int x;

// C() {} // 'x' is undefined

C() = default; // 'x' is zero with value-initialization, e.g. "C c{}"

};
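The difference between a user-provided empty constructor and a defaulted one can be checked at compile time. The sketch below uses two hypothetical types, UserProvided and Defaulted; note that the zero value is guaranteed only for value-initialization ("{}"), not for a plain "Defaulted d;".

```cpp
#include <cassert>
#include <type_traits>

struct UserProvided {
    int x;
    UserProvided() {}        // user-provided: "x" is left uninitialized
};

struct Defaulted {
    int x;
    Defaulted() = default;   // not user-provided
};

// A defaulted default constructor keeps the class trivially constructible
static_assert(std::is_trivially_default_constructible<Defaulted>::value, "");
static_assert(!std::is_trivially_default_constructible<UserProvided>::value, "");

int value_initialized_x() {
    Defaulted d{};           // value-initialization: "x" is zero-initialized
    assert(d.x == 0);
    return d.x;
}
```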

51/67

Class Keywords

this Keyword

this

Every object has access to its own address through the pointer this

Explicit usage is not mandatory (and not suggested)

this is necessary when:

• The name of a local variable is equal to some member name

• Return reference to the calling object

struct A {

int x;

void f(int x) {

this->x = x; // without "this" has no effect

}

const A& g() {

return *this;

}

};

52/67

static Keyword 1/5

static Keyword

The keyword static declares members (fields or methods) that are not bound to class instances. A static member is shared by all objects of the class

struct A {

int x;

int f() { return x; }

static int g() { return 3; } // g() cannot access 'x' as it is associated

}; // with class instances

A a{4};

a.f(); // call the class instance method

A::g(); // call the static class method

a.g(); // as an alternative, a class instance can access static class members

53/67

static Keyword - Constant Members 2/5

struct A {

static const int a = 4; // C++03

static constexpr float b = 4.2f; // better, C++11

// static const float c = 4.2f; // only GNU extension (GCC)

static constexpr int f() { return 1; } // ok, C++11

// static const int g() { return 1; } // 'const' refers to the return type

};

54/67

static Keyword - Mutable Members 3/5

Non- const static data members cannot be directly initialized “inline" before

C++17 (see also Translation Units I lecture)

struct A {

// static int a = 4; // compiler error

static int a; // ok, declaration only

static inline int b = 4; // ok from C++17

static int f() { return 2; }

static int g(); // ok, declaration only

};

int A::a = 4; // ok

int A::g() { return 3; } // ok

// NOTE: link error (undefined reference) without the two previous definitions

55/67

static Keyword - Example 4/5

struct A {

static int x; // declaration

static int f() { return x; }

static int& g() { return x; }

};

int A::x = 3; // definition

//---------------------------------------------------------------------------------

A::f(); // return 3

A::x++;

A::f(); // return 4

A::g() = 7;

A::f(); // return 7
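A common use of a shared static data member is counting live instances. The following is a minimal sketch; the class Tracked and its members are hypothetical names chosen for illustration.

```cpp
#include <cassert>

// Hypothetical class that counts its live instances
struct Tracked {
    static int count;                        // declaration, shared by all objects

    Tracked()               { ++count; }
    Tracked(const Tracked&) { ++count; }
    ~Tracked()              { assert(count > 0); --count; }

    static int alive() { return count; }     // no "this": only static members
};

int Tracked::count = 0;                      // definition (outside the class)
```

Tracked::alive() can be called without any instance, e.g. before the first object is created.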

56/67

static Keyword - Member Visibility 5/5

• A static member function can only access static class members

• A non-static member function can access static class members

struct A {

int x = 3;

static inline int y = 4;

int f1() { return x; } // ok

// static int f2() { return x; } // compiler error, 'x' is not visible

int g1() { return y; } // ok

static int g2() { return y; } // ok

struct B {

int h() { return y + g2(); } // ok

}; // 'x', 'f1()', 'g1()' are not visible within 'B'

};

57/67

const Keyword 1/3

Const member functions

Const member functions (inspectors or observers) are functions marked with const that are not allowed to change the object logical state

The compiler prevents inadvertently changing data members inside observer functions → all data members are treated as const within an observer method, and this becomes a pointer to a const object

• The physical state can still be modified, see the mutable keyword ⇝

• Member functions without a const suffix are called non-const member functions or mutators/modifiers

58/67

const Keyword 2/3

struct A {

int x = 3;

int* p;

int get() const {

// x = 2; // compile error, class variables cannot be modified

// p = nullptr; // compile error, class variables cannot be modified

p[0] = 3; // ok, p is 'int* const' -> its content is

// not protected

return x;

}

};

A common case where const member functions are useful is to enforce const correctness when

accessing pointers, see Advanced Concepts I, Const Correctness

59/67

const Keyword - const Overloading 3/3

The const keyword is part of the function signature. Therefore, a class can implement two similar methods, one which is called when the object is const, and one that is not

class A {

int x = 3;

public:

int& get1() { return x; } // read and write

int get1() const { return x; } // read only

int& get2() { return x; } // read and write

};

A a1;

cout << a1.get1(); // ok

cout << a1.get2(); // ok

a1.get1() = 4; // ok

const A a2;

cout << a2.get1(); // ok

// cout << a2.get2(); // compile error, "a2" is const

// a2.get1() = 5; // compile error, only "get1() const" is available

60/67

mutable Keyword

mutable

mutable data members of const class instances are modifiable. They should be part of the object physical state, but not of the logical state

• It is particularly useful if most of the members should be constant but a few need to be modified

• Conceptually, mutable members should not change anything that can be retrieved from the class interface

struct A {

int x = 3;

mutable int y = 5;

};

const A a;

// a.x = 3; // compiler error const

a.y = 5; // ok
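A typical case for mutable is a cache: it changes the physical state of the object, while the value observed through the interface stays the same. The sketch below assumes a hypothetical Circle class whose area is computed lazily.

```cpp
#include <cassert>

// Hypothetical class: the cached area is physical state, not logical state
class Circle {
    double radius_;
    mutable bool   cached_ = false;   // modifiable even in const objects
    mutable double area_   = 0.0;

public:
    explicit Circle(double r) : radius_{r} { assert(r >= 0.0); }

    double area() const {             // observer: the result never changes
        if (!cached_) {
            area_   = 3.141592653589793 * radius_ * radius_;
            cached_ = true;           // ok, "cached_" is mutable
        }
        return area_;
    }
};
```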

61/67

using Keyword for type declaration

The using keyword is used to declare a type alias tied to a speciﬁc class

struct A {

using type = int;

};

A::type x = 3; // "typename" is needed only for dependent names inside templates

struct B : A {};

B::type y = 4; // B can use "type" as it is public in A

62/67

using Keyword for Inheritance

The using keyword can also be used to change the access level of inherited data members and functions

struct A {

protected:

int x = 3;

};

struct B : A {

public:

using A::x;

};

B b;

b.x = 3; // ok, "b.x" is public

63/67

friend Keyword 1/3

friend Class

A friend class can access the private and protected members of the class in

which it is declared as a friend

Friendship properties:

• Not Symmetric: if class A is a friend of class B, class B is not automatically a

friend of class A

• Not Transitive: if class A is a friend of class B, and class B is a friend of class C,

class A is not automatically a friend of class C

• Not Inherited: if class Base is a friend of class X, subclass Derived is not

automatically a friend of class X; and if class X is a friend of class Base, class X is

not automatically a friend of subclass Derived

64/67

friend Keyword 2/3

class B; // class declaration

class A {

friend class B;

int x; // private

};

class B {

int f(A a) { return a.x; } // ok, B is friend of A

};

class C : B {

// int f(A a) { return a.x; } // compile error not inherited

};

65/67

friend Keyword 3/3

friend Method

A non-member function can access the private and protected members of a class

if it is declared a friend of that class

class A {

int x = 3; // private

friend int f(A a); // friendship declaration, no implementation

};

//'f' is not a member function of any class

int f(A a) {

return a.x; // ok, f(A) is a friend of A

}

friend functions are commonly used for implementing the stream operator operator<<
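A minimal sketch of a stream operator implemented as a friend function. The class Point and the helper format_point are hypothetical names; the helper exists only to make the output easy to check.

```cpp
#include <cassert>
#include <ostream>
#include <sstream>
#include <string>

class Point {
    int x_, y_;   // private
public:
    Point(int x, int y) : x_{x}, y_{y} {}

    // non-member function declared as friend: it can read x_ and y_
    friend std::ostream& operator<<(std::ostream& os, const Point& p);
};

std::ostream& operator<<(std::ostream& os, const Point& p) {
    return os << "(" << p.x_ << ", " << p.y_ << ")";
}

// helper used only in this sketch
std::string format_point(const Point& p) {
    std::ostringstream oss;
    oss << p;
    assert(oss.good());
    return oss.str();
}
```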

66/67

delete Keyword

delete Keyword (C++11)

The delete keyword explicitly marks a member function as deleted and any use

results in a compiler error. When it is applied to copy/move constructor or

assignment, it prevents the compiler from implicitly generating these functions

The default copy/move functions for a class can produce unexpected results. The

keyword

delete prevents these errors

struct A {

A() = default;

const A&) = delete; // e.g. deleted because unsafe or expensive

};

void f(A a) {} // implicit call to copy constructor

A a;

// f(a); // compile error marked as deleted
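Whether a class is copyable can be verified at compile time with the standard type traits. The sketch below assumes a hypothetical Handle type whose copy operations are deleted, and passes it by reference instead.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical resource handle: copying is disabled
struct Handle {
    int id = 7;
    Handle() = default;
    Handle(const Handle&)            = delete;   // no copy construction
    Handle& operator=(const Handle&) = delete;   // no copy assignment
};

static_assert(!std::is_copy_constructible<Handle>::value, "copy ctor is deleted");
static_assert(!std::is_copy_assignable<Handle>::value, "copy assignment is deleted");

// pass by const reference: no copy constructor is required
int read_id(const Handle& h) {
    assert(h.id > 0);
    return h.id;
}
```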

67/67

Modern C++

Programming

9. Object-Oriented

Programming II

Polymorphism and Operator Overloading

Federico Busato

2025-04-14

Table of Contents

1 Polymorphism

C++ Mechanisms for Polymorphism

virtual Methods

Virtual Table

override Keyword

final Keyword

Common Errors

Pure Virtual Method

Abstract Class and Interface

2 Inheritance Casting and Run-time Type Identiﬁcation

⋆

1/66

Table of Contents

3 Operator Overloading

Overview

Comparison Operator operator<

Spaceship Operator operator<=>

Subscript Operator operator[]

Multidimensional Subscript Operator operator[]

Function Call Operator operator()

static operator() and static operator[]

Conversion Operator operator T()

Return Type Overloading Resolution

⋆

2/66

Table of Contents

Increment and Decrement Operators operator++/--

Assignment Operator operator type=

Stream Operator operator<<

Operator Notes

4 C++ Object Layout

⋆

Aggregate

Trivial Class

Standard-Layout Class

Plain Old Data (POD)

Hierarchy

3/66

Polymorphism

Polymorphism (meaning “having multiple forms”) is the capability of an entity to change its behavior in accordance with the specific usage context

Polymorphism dispatch can be implemented at

• Compile-time (static polymorphism): when the called instance is known before the program starts

• Run-time (dynamic polymorphism): when the called instance is known only during the execution, i.e. depends on run-time values

In C++, the term polymorphic is strongly associated with dynamic polymorphism (overriding)

4/66

Function Binding

Connecting the function call to the function body is called Binding

• In Early Binding or Static Binding or Compile-time Binding, the compiler identifies the type of object at compile-time

- the program can jump directly to the function address

• In Late Binding or Dynamic Binding or Run-time Binding, the run-time identifies the type of object at execution-time and then matches the function call with the correct function definition

- the program has to read the address held in the pointer and then jump to that address (less efficient since it involves an extra level of indirection)

C++ achieves late binding by declaring a virtual function

5/66

Polymorphism Forms

• Ad-hoc polymorphism: when it applies to a set of individually specified types, e.g. function overloading

void f(int);

void f(double);

• Parametric polymorphism: when it involves generic types, e.g. templates

template<typename T>

void f(T);

• Subtyping: when it operates on elements of subtypes, e.g. virtual functions

// B : A

void f(A*); // also works for B if the called functions are virtual

6/66

C++ Mechanisms for Polymorphism 1/2

• Preprocessing

# define ADD(x, y) ((x) + (y)) // ADD(3, 4) or ADD(3.0, 4.0)

• Function/Operator overloading

void f(int);

void f(double);

• Templates

template<typename T>

void f(T); // f(3) or f(3.0)

• Virtual functions (see next slides)

7/66

C++ Mechanisms for Polymorphism 2/2

Mechanism                      Implementation   Form
Preprocessing                  static           Parametric
Function/Operator overloading  static           Ad-hoc
Template                       static           Parametric
Virtual function               dynamic          Subtyping

8/66

Dynamic Polymorphism in C++

• At run-time, objects of a base class behave as objects of a derived class

• A Base class may deﬁne and implement polymorphic methods, and derived

classes can override them, which means they provide their own implementations,

invoked at run-time depending on the context

struct A {

void f() { cout << "A"; }

};

struct B : A {

void f() { cout << "B"; }

};

void g(A& a) { a.f(); } // accepts A and B

// note: g(B&) would only accept B

A a; B b;

g(a); // print "A"

g(b); // print "A" not "B"!!!

9/66

Polymorphism - virtual method

struct A {

virtual void f() { cout << "A"; }

};

// now "f()" is virtual, evaluated at run-time

struct B : A {

void f() override { cout << "B"; }

// now B::f() overrides A::f(), run-time dispatch

// 'virtual void f()' is also valid

}; // 'override' is a c++11 feature, more details in the next slides

void g(A& a) { a.f(); } // accepts A and B

A a;

B b;

g(a); // print "A"

g(b); // NOW, print "B"!!!

10/66

When virtual works

struct A {

virtual void f() { cout << "A"; }

};

struct B : A {

void f() override { cout << "B"; }

};

void f(A& a) { a.f(); } // ok, print "B"

void g(A* a) { a->f(); } // ok, print "B"

void h(A a) { a.f(); } // does not work with pass-by value!! print "A"

B b;

f(b); // print "B"

g(&b); // print "B"

h(b); // print "A" (cast to A)
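The same three cases can be written so that the result is returned instead of printed, which makes the slicing effect easy to verify. Animal, Dog, and the by_* functions are hypothetical names used only in this sketch.

```cpp
#include <cassert>
#include <string>

struct Animal {
    virtual ~Animal() = default;
    virtual std::string name() const { return "animal"; }
};

struct Dog : Animal {
    std::string name() const override { return "dog"; }
};

// reference: dynamic dispatch
std::string by_ref(const Animal& a) { return a.name(); }

// pointer: dynamic dispatch
std::string by_ptr(const Animal* a) {
    assert(a != nullptr);
    return a->name();
}

// value: the argument is copied into a plain Animal (slicing)
std::string by_val(Animal a) { return a.name(); }
```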

11/66

Polymorphism Dynamic Behavior

struct A {

virtual void f() { cout << "A"; }

};

struct B : A {

void f() override { cout << "B"; }

};

A* get_object(bool selectA) {

return (selectA) ? new A() : new B();

}

get_object(true)->f(); // print "A"

get_object(false)->f(); // print "B"

12/66

Virtual Table 1/2

vtable

The virtual table (vtable) is a lookup table of functions used to resolve function

calls and support dynamic dispatch (late binding)

A virtual table contains one entry for each virtual function that can be called by objects of the class. Each entry in this table is simply a function pointer that points to the most-derived function accessible by that class

The compiler adds a hidden pointer to the base class which points to the virtual table for that class (sizeof considers the vtable pointer)

13/66

Virtual Table 2/2

struct A {

virtual void f();

virtual void g();

};

struct B : A {

void f() override;

};

14/66

Does the vtable really exist? (answer: YES)

struct A {

int x = 3;

virtual void f() { cout << "abc"; }

};

A* a1 = new A;

A* a2 = (A*) malloc(sizeof(A));

cout << a1->x; // print "3"

cout << a2->x; // undefined value!!

a1->f(); // print "abc"

a2->f(); // segmentation fault

Lesson learned: Never use malloc in C++

15/66

Virtual Method Notes

virtual classes allocate one extra pointer (hidden)

struct A {

virtual void f1();

virtual void f2();

};

class B : A {};

cout << sizeof(A); // 8 bytes on a typical 64-bit platform (vtable pointer)

cout << sizeof(B); // 8 bytes on a typical 64-bit platform (vtable pointer)

16/66

override Keyword 1/2

override Keyword (C++11)

The override keyword ensures that the function is virtual and is overriding a

virtual function from a base class

• It forces the compiler to check the base class to see if there is a virtual function with this exact signature

• override clearly expresses the intent of the function, making the code easier to

understand

override implies virtual ( virtual should be omitted)

17/66

override Keyword 2/2

struct A {

virtual void f(int a); // a "float" value is casted to "int"

}; // ∗ ∗ ∗

struct B : A {

void f(int a) override; // ok

void f(float a); // (still) very dangerous!!

// ∗ ∗ ∗

// void f(float a) override; // compile error, not safe

// void f(int a) const override; // compile error, not safe

};

// ∗ ∗ ∗ f(3.3f) has a different behavior between A and B

18/66

final Keyword

final Keyword (C++11)

The final keyword prevents inheriting from classes or overriding methods in

derived classes

struct A {

virtual void f(int a) final; // "final" method

};

struct B : A {

// void f(int a); // compile error f(int) is "final"

void f(float a); // dangerous (still possible)

}; // "override" prevents these errors

struct C final { // cannot be extended

};

// struct D : C { // compile error C is "final"

// };

19/66

Virtual Methods (Common Error 1)

All classes with at least one virtual method should declare a virtual

destructor

struct A {

~A() { cout << "A"; } // <-- here the problem (not virtual)

virtual void f(int a) {}

};

struct B : A {

int* array;

B() { array = new int[1000000]; }

~B() { delete[] array; }

};

//----------------------------------------------------------------------

void destroy(A* a) {

delete a; // call ~A()

}

B* b = new B;

destroy(b); // without virtual, ~B() is not called

// destroy() prints only "A" -> huge memory leak!!
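A sketch of the corrected hierarchy, in which the base destructor is virtual so that deleting through a base pointer runs the derived destructor first. Base, Derived, and dtor_log are hypothetical names used to record the calls.

```cpp
#include <cassert>
#include <string>

std::string dtor_log;   // hypothetical log of destructor calls

struct Base {
    virtual ~Base() { dtor_log += "~Base "; }   // virtual destructor
    virtual void f() {}
};

struct Derived : Base {
    int* array = new int[16];
    ~Derived() override {
        delete[] array;                          // the resource is released
        dtor_log += "~Derived ";
    }
};

void destroy(Base* p) {
    assert(p != nullptr);
    delete p;   // runs ~Derived(), then ~Base()
}
```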

20/66

Virtual Methods (Common Error 2)

Do not call virtual methods in constructor and destructor

• Constructor: The derived class is not ready until constructor is completed

• Destructor: The derived class is already destroyed

struct A {

A() { f(); } // what instance is called? "B" is not ready

// it calls A::f(), even though A::f() is virtual

virtual void f() { cout << "Explosion"; }

};

struct B : A {

B() = default; // call A(). Note: A() may be also implicit

void f() override { cout << "Safe"; }

};

B b; // call B(), print "Explosion", not "Safe"!!

21/66

Virtual Methods (Common Error 3)

Do not use default parameters in virtual methods

Default parameters are not inherited

struct A {

virtual void f(int i = 5) { cout << "A::" << i << "\n"; }

virtual void g(int i = 5) { cout << "A::" << i << "\n"; }

};

struct B : A {

void f(int i = 3) override { cout << "B::" << i << "\n"; }

void g(int i) override { cout << "B::" << i << "\n"; }

};

A a; B b;

a.f(); // ok, print "A::5"

b.f(); // ok, print "B::3"

A& ab = b;

ab.f(); // !!! print "B::5" // the virtual table of A

// contains f(int i = 5) and

ab.g(); // !!! print "B::5" // g(int i = 5) but it points

// to B implementations

22/66

Pure Virtual Method 1/2

Pure Virtual Method

A pure virtual method is a function that must be implemented in derived classes (concrete implementation)

Pure virtual functions may or may not have a body

struct A {

virtual void f() = 0; // pure virtual without body

virtual void g() = 0; // pure virtual with body

};

void A::g() {} // pure virtual implementation (body) for g()

struct B : A {

void f() override {} // must be implemented

void g() override {} // must be implemented

};

23/66

Pure Virtual Method 2/2

A class with at least one pure virtual function cannot be instantiated

struct A {

virtual void f() = 0;

};

struct B1 : A {

// virtual void f() = 0; // implicitly declared

};

struct B2 : A {

void f() override {}

};

// A a; // "A" has a pure virtual method

// B1 b1; // "B1" has a pure virtual method

B2 b2; // ok

24/66

Abstract Class and Interface

• A class is an interface if it has only pure virtual functions and optionally (suggested) a virtual destructor. Interfaces do not have implementation or data

• A class is abstract if it has at least one pure virtual function

struct A { // INTERFACE

virtual ~A(); // to implement

virtual void f() = 0;

};

struct B { // ABSTRACT CLASS

B() {} // abstract classes may have a constructor

virtual void g() = 0; // at least one pure virtual

protected:

int x; // additional data

};
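A minimal sketch of an interface and one concrete implementation. The names Shape, Rectangle, and total_area are hypothetical; the static_assert checks that the interface itself cannot be instantiated.

```cpp
#include <cassert>
#include <type_traits>

struct Shape {                           // interface: only pure virtual functions
    virtual ~Shape() = default;
    virtual double area() const = 0;
};

static_assert(std::is_abstract<Shape>::value, "Shape cannot be instantiated");

struct Rectangle : Shape {               // concrete class
    double w, h;
    Rectangle(double w1, double h1) : w{w1}, h{h1} {
        assert(w1 >= 0.0 && h1 >= 0.0);
    }
    double area() const override { return w * h; }
};

// works with any concrete Shape through the interface
double total_area(const Shape& a, const Shape& b) {
    return a.area() + b.area();
}
```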

25/66

Inheritance Casting and Run-time Type Identification ⋆

Hierarchy Casting

Class-casting allows implicit or explicit conversion of a class into another one across

its hierarchy

26/66

Hierarchy Casting

Upcasting: Conversion from a derived class reference or pointer to a base class

- It can be implicit or explicit

- It is safe

static_cast or dynamic_cast // see next slides

Downcasting: Conversion from a base class reference or pointer to a derived class

- It is only explicit

- It can be dangerous

static_cast or dynamic_cast

Sidecasting (Cross-cast): Conversion from a class reference or pointer to another class of the same hierarchy level

- It is only explicit

- It can be dangerous

dynamic_cast

27/66

Upcasting and Downcasting Example

struct A {

virtual void f() { cout << "A"; }

};

struct B : A {

int var = 3;

void f() override { cout << "B"; }

};

A a;

B b;

A& a1 = b; // implicit cast upcasting

static_cast<A&>(b).f(); // print "B" upcasting

static_cast<B&>(a).f(); // print "A" downcasting

cout << b.var; // print 3 (no cast)

cout << static_cast<B&>(a).var; // potential segfault!!! downcasting

// "var" does not exist in "A"

28/66

Sidecasting Example

struct A {

virtual void f() { cout << "A"; }

};

struct B1 : A {

void f() override { cout << "B1"; }

};

struct B2 : A {

void f() override { cout << "B2"; }

};

B1 b1;

B2 b2;

dynamic_cast<B2&>(b1).f(); // sidecasting, throw std::bad_cast

dynamic_cast<B1&>(b2).f(); // sidecasting, throw std::bad_cast

// static_cast<B1&>(b2).f(); // compile error

29/66

Run-time Type Identiﬁcation

RTTI

Run-Time Type Information (RTTI) is a mechanism that allows the type of an object to be determined at runtime

C++ expresses RTTI through three features:

• dynamic_cast keyword: conversion of polymorphic types

• typeid keyword: identifying the exact type of an object

• type_info class: type information returned by the typeid operator

RTTI is available only for classes that are polymorphic, which means they have at least one virtual method

30/66

type_info and typeid

type_info class has the method name() which returns the name of the type

struct A {

virtual void f() {}

};

struct B : A {};

A a;

B b;

A& a1 = b; // implicit upcasting

cout << typeid(a).name(); // print "1A"

cout << typeid(b).name(); // print "1B"

cout << typeid(a1).name(); // print "1B"
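
The strings returned by `name()` are implementation-defined ("1A" and "1B" are what GCC and Clang typically print), so comparing `type_info` objects is usually more portable than comparing names. A small sketch, where the helper `is_exactly_B` is a hypothetical name:

```cpp
#include <typeinfo>

struct A { virtual ~A() = default; };  // polymorphic: typeid inspects the dynamic type
struct B : A {};

// true when the run-time type of 'obj' is exactly B
bool is_exactly_B(const A& obj) {
    return typeid(obj) == typeid(B);
}
```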

31/66

dynamic_cast

dynamic_cast, unlike static_cast, uses RTTI to check the correctness of the output type

This operation happens at run-time and it is expensive

dynamic_cast<New>(Obj) has the following properties:

• Convert from a derived class Obj to a base class New → upcasting. New/Obj are both pointers or references

• Throw std::bad_cast if New/Obj are references and Obj cannot be converted to New

• Return nullptr if New/Obj are pointers and Obj cannot be converted to New

32/66

dynamic_cast Example 1

struct A {

virtual void f() { cout << "A"; }

};

struct B : A {

void f() override { cout << "B"; }

};

A a;

B b;

dynamic_cast<A&>(b).f(); // print "B" upcasting

// dynamic_cast<B&>(a).f(); // throw std::bad_cast, wrong downcasting

dynamic_cast<B*>(&a); // returns nullptr, wrong downcasting

33/66

dynamic_cast Example 2

struct A {

virtual void f() { cout << "A"; }

};

struct B : A {

void f() override { cout << "B"; }

};

A* get_object(bool selectA) {

return (selectA) ? new A() : new B();

}

void g(bool value) {

A* a = get_object(value);

B* b = dynamic_cast<B*>(a); // downcasting + check

if (b != nullptr)

b->f(); // executed only when it is safe

}

34/66

Operator

Overloading

Operator Overloading

Operator overloading is a special case of polymorphism in which some operators are treated as polymorphic functions and have different behaviors depending on the types of their arguments

struct Point {

int x, y;

Point operator+(const Point& p) const {

return {x + p.x, y + p.y};

}

};

Point a{1, 2};

Point b{5, 3};

Point c = a + b; // "c" is (6, 5)

35/66

Operator Overloading

Category                              Operators

Arithmetic                            + - * / % ++ --

Comparison                            == != < <= > >= <=>

Bitwise                               | & ^ ~ << >>

Logical                               ! && ||

Compound Assignment Arithmetic        += -= *= /= %=

Compound Assignment Bitwise           >>= <<= |= &= ^=

Subscript                             []

Function call                         ()

Address-of, Reference, Dereferencing  & -> ->* *

Memory                                new new[] delete delete[]

Comma                                 ,

• Categories not in bold are rarely used in practice

• Operators that cannot be overloaded: ?: . .* :: sizeof typeid

36/66

Comparison Operator operator<

Relational and comparison operators operator<, <=, ==, >=, > are used for comparing two objects

In particular, operator< is used to determine the ordering of a set of objects (e.g. sort)

#include <algorithm>

struct A {

int x;

bool operator<(A a) const {

return x * x < a.x * a.x;

}

};

A array[] = {5, -1, 4, -7};

std::sort(array, array + 4);

// array: {-1, 4, 5, -7}

37/66

Spaceship Operator operator<=> 1/4

C++20 allows overloading the spaceship operator <=> (also called three-way comparison) for replacing all comparison operators operator<, <=, ==, >=, >

struct A {

bool operator==(const A&) const; // *** equal comparison is special,

bool operator!=(const A&) const; // see next slides

bool operator<(const A&) const;

bool operator<=(const A&) const;

bool operator>(const A&) const;

bool operator>=(const A&) const;

};

// replaced by

struct B {

auto operator<=>(const B&) const;

};

38/66

Spaceship Operator operator<=> 2/4

struct Obj {

int x;

auto operator<=>(const Obj& other) const {

return x - other.x; // or even better "x <=> other.x"

}

};

Obj a{3};

Obj b{5};

a < b; // true, operator< is generated

(a <=> b) < 0; // true

Note: a non-defaulted operator<=> doesn’t generate the operators == and !=

(see next slide)

Looks Like a Duck, Swims Like a Duck, and Quacks Like operator==

39/66

Spaceship Operator operator<=> 3/4

The compiler can also generate the code for the spaceship operator with = default, even for multiple fields and arrays, by using the default comparison semantics of its members

struct Obj {

int x;

char y;

short z[2];

auto operator<=>(const Obj&) const = default;

// if x == other.x, then compare y

// if y == other.y, then compare z

// if z[0] == other.z[0], then compare z[1]

};

Obj a{3}, b{5};

a == b; // false, operator== is generated (= default)

a != b; // true, operator!= is generated (= default)

40/66

Spaceship Operator operator<=> 4/4

The spaceship operator returns one of the following ordering classes, defined in <compare>:

std::strong_ordering

• If a is equivalent to b, f(a) is also equivalent to f(b)

• Exactly one of <, ==, or > must be true

◦ e.g., integral types (int, char)

std::weak_ordering

• If a is equivalent to b, f(a) may not be equivalent to f(b)

• Exactly one of <, ==, or > must be true

◦ e.g., rectangles, R{2, 5} == R{5, 2}

std::partial_ordering

• If a is equivalent to b, f(a) may not be equivalent to f(b)

• <, ==, or > may all be false

◦ e.g., floating-point (float with NaN)

41/66

Subscript Operator operator[]

The array subscript operator[] allows accessing an object in an array-like fashion

The operator accepts any type as parameter, not just integers

struct A {

char permutation[6] {'c', 'b', 'd', 'a', 'h', 'y'}; // a member array needs an explicit size

char& operator[](char c) { // read/write

return permutation[c - 'a'];

}

char operator[](char c) const { // read only

return permutation[c - 'a'];

}

};

A a;

a['d'] = 't';

42/66

Multidimensional Subscript Operator operator[]

C++23 introduces the multidimensional subscript operator, which replaces the previous meaning of a comma inside the subscript (the comma operator)

struct A {

int operator[](int x) { return x; }

};

struct B {

int operator[](int x, int y) { return x * y; } // not allowed before C++23

};

int main() {

A a;

cout << a[3, 4]; // return 4 (bug)

B b;

cout << b[3, 4]; // return 12, C++23

}

43/66

Function Call Operator operator()

The function call operator operator() is generally overloaded to create objects which behave like functions, or for classes that have a primary operation (see Basic Concepts IV lecture)

#include <numeric> // for std::accumulate

struct Multiply {

int operator()(int a, int b) const {

return a * b;

}

};

int array[] = { 2, 3, 4 };

int factorial = std::accumulate(array, array + 3, 1, Multiply{});

cout << factorial; // 24

44/66

static operator() and static operator[]

C++23 introduces the static version of the function call operator operator()

and the subscript operator

operator[] to avoid passing the this pointer

#include <numeric> // for std::accumulate

struct Multiply {

// int operator()(int a, int b); // declaration only

static int operator()(int a, int b); // best efficiency, no need to access

}; // internal data members

struct MyArray {

// int operator[](int x);

static int operator[](int x); // best efficiency

};

int array[] = { 2, 3, 4 };

int factorial = std::accumulate(array, array + 3, 1, Multiply{});

45/66

Conversion Operator operator T() 1/2

The conversion operator operator T() allows objects to be either implicitly or

explicitly (casting) converted to another type

class MyBool {

int x;

public:

MyBool(int x1) : x{x1} {}

operator bool() const { // implicit return type

return x == 0;

}

};

MyBool my_bool{3};

bool b = my_bool; // b = false, call operator bool()

46/66

Conversion Operator operator T() 2/2

C++11 Conversion operators can be marked explicit to prevent implicit conversions. It is a good practice, as for class constructors

struct A {

operator bool() { return true; }

};

struct B {

explicit operator bool() { return true; }

};

A a;

B b;

bool c1 = a;

// bool c2 = b; // compile error: explicit

bool c3 = static_cast<bool>(b);

47/66

Return Type Overloading Resolution

⋆

struct A {

operator float() { return 3.0f; }

operator int() { return 2; }

};

auto f() {

return A{};

}

float x = f();

int y = f();

cout << x << " " << y; // x=3.0f, y=2

48/66

Increment and Decrement Operators operator++/--

The increment and decrement operators operator++, operator-- are used to update the value of a variable by one unit

struct A {

int* ptr;

int pos;

A& operator++() { // Prefix notation (++var):

++ptr; // returns the updated object by-reference

++pos;

return *this;

}

A operator++(int) { // Postfix notation (var++):

A tmp = *this; // returns the old copy of the object by-value

++ptr;

++pos;

return tmp;

}

};
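
The difference between the two forms is easiest to see on a reduced type. The following sketch uses a hypothetical `Counter` struct with a single `value` member in place of the pointer/position pair above.

```cpp
struct Counter {
    int value;

    Counter& operator++() {    // prefix: update, then return the object itself
        ++value;
        return *this;
    }

    Counter operator++(int) {  // postfix: the unnamed int only selects this overload
        Counter old = *this;   // snapshot of the previous state
        ++value;
        return old;            // returned by value
    }
};
```

Given `Counter c{5}`, the expression `c++` yields a copy holding 5 while `c` becomes 6, and `++c` returns a reference to `c` itself. The prefix form avoids the temporary copy.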

49/66

Assignment Operator operator= 1/3

The assignment operator operator= is used to copy values from one object to

another already existing object

#include <algorithm> // std::fill, std::copy

struct Array {

char* array;

int size;

Array(int size1, char value) : size{size1} {

array = new char[size];

std::fill(array, array + size, value);

}

~Array() { delete[] array; }

Array& operator=(const Array& x) { .... } // --> see next slide

};

Array a{5, 'o'}; // ["ooooo"]

Array b{3, 'b'}; // ["bbb"]

50/66

Assignment Operator operator= 2/3

• First option:

Array& operator=(const Array& x) {

if (this == &x) // (1) Check for self assignment

return *this;

delete[] array; // (2) Release class resources

size = x.size; // (3) Re-initialize class resources

array = new char[x.size];

std::copy(x.array, x.array + size, array); // (4) deep copy

return *this;

}

• Second option (less intuitive):

Array& operator=(Array x) { // pass by-value

swap(*this, x); // now we need a swap function for Array

return *this; // x is destroyed at the end

} // --> see next slide

51/66

Assignment Operator operator=

⋆

3/3

swap method:

friend void swap(Array& x, Array& y) {

using std::swap;

swap(x.size, y.size);

swap(x.array, y.array);

}

• Why using std::swap? If swap(x, y) finds a better match, it will use that instead of std::swap

• Why friend? It allows the function to be used from outside the structure/class scope

stackoverflow.com/questions/3279543

stackoverflow.com/questions/5695548
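
Putting the pieces of the second option together gives the sketch below. The copy constructor is an addition that the slides do not show: pass-by-value assignment relies on it to create the temporary copy.

```cpp
#include <algorithm>  // std::copy, std::fill
#include <utility>    // std::swap

struct Array {
    char* array;
    int   size;

    Array(int size1, char value) : array{new char[size1]}, size{size1} {
        std::fill(array, array + size, value);
    }
    // Assumed copy constructor (deep copy), needed by the by-value parameter below
    Array(const Array& x) : array{new char[x.size]}, size{x.size} {
        std::copy(x.array, x.array + x.size, array);
    }
    ~Array() { delete[] array; }

    Array& operator=(Array x) {  // 'x' is already a copy of the right-hand side
        swap(*this, x);          // take the resources of the copy
        return *this;            // the old resources are released together with 'x'
    }

    friend void swap(Array& x, Array& y) {
        using std::swap;
        swap(x.size, y.size);
        swap(x.array, y.array);
    }
};
```

No self-assignment check is needed here: assigning an object to itself only swaps it with a temporary copy of itself.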

52/66

Stream Operator operator<<

The stream operator operator<< can be overloaded to perform output for user-defined types (operator>> for input)

#include <iostream>

struct Point {

int x, y;

friend std::ostream& operator<<(std::ostream& stream,

const Point& point) {

stream << "(" << point.x << "," << point.y << ")";

return stream;

}

// the left operand is std::ostream, so operator<< cannot be a member of Point -> need friend

}; // declaration and definition can be split (not suggested for operator<<)

Point point{1, 2};

std::cout << point; // print "(1,2)"

53/66

Operators Precedence

Overloaded operators preserve the precedence of the built-in operators, but overloaded && and || do not preserve short-circuit evaluation

struct MyInt {

int x;

int operator^(int exp) { // exponential

int ret = 1;

for (int i = 0; i < exp; i++)

ret *= x;

return ret;

}

};

MyInt x{3};

int y = x^2;

cout << y; // 9

int z = x^2 + 2;

cout << z; // 81 !!!
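
Precedence is fixed by the grammar, but evaluation rules are not inherited: an overloaded `&&` or `||` is an ordinary function call, so both operands are always evaluated. The sketch below illustrates this with a hypothetical `Flag` type and a global `calls` counter (both are assumptions made for this example).

```cpp
struct Flag {
    bool value;
    // Overloaded &&: a regular function call, both operands are evaluated
    bool operator&&(const Flag& other) const { return value && other.value; }
};

int calls = 0;  // counts how many times the right-hand operand is evaluated

Flag make_true() {
    ++calls;
    return Flag{true};
}

// Built-in &&: the right-hand side is skipped when the left-hand side is false
bool builtin_and() { return false && make_true().value; }

// Overloaded &&: make_true() is called even though the left-hand side is false
bool overloaded_and() { return Flag{false} && make_true(); }
```

After calling `builtin_and()`, `calls` is unchanged. After calling `overloaded_and()`, it has been incremented once.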

54/66

Binary Operators Note

Binary operators should be implemented as friend (non-member) functions

struct A {}; struct C {};

struct B : A {

bool operator==(const A& x) { return true; }

};

struct D : C {

friend bool operator==(const C& x, const C& y) { return true; } // inline

};

// bool operator==(const C& x, const C& y) { return true; } // out-of-line

A a; B b; C c; D d;

b == a; // ok

// a == b; // compile error // "A" does not have == operator

c == d; // ok, use operator==(const C&, const C&)

d == c; // ok, use operator==(const C&, const C&)

55/66

C++ Object Layout

⋆

Overview

The term layout refers to how an object is arranged in memory

C++ deﬁnes four types of layouts:

• aggregate

• trivially copyable

• standard layout

• plain-old data (POD)

Such layouts are important to understand how the C++ objects interact with pure C

API and for optimization purposes, e.g. pass in registers, memcpy , and serialization

56/66

Aggregate 1/3

Aggregate

An aggregate is an array, struct, or class which supports aggregate initialization (form of list-initialization) through curly braces syntax {}

• No user-provided constructors

• No private / protected non-static data members and base classes

• No virtual functions

* No base classes, until C++17

* No brace-or-equal-initializers for non-static data members, until C++14

Apply recursively to base classes and non-static data members

No restrictions:

• Non-static uninitialized (until C++14) data and function members

• static data and function members

stackoverflow.com/questions/4178175

57/66

Aggregate - Examples 2/3

struct Aggregate {

int x; // ok, public member

int y[3]; // ok, arrays are also fine

int z { 3 }; // ok, since C++14

Aggregate() = default; // ok, defaulted constructor (until C++20)

Aggregate& operator=(const Aggregate&); // ok, function

private: // copy-assignment

void f() {} // ok, private function

};

struct NotAggregate1 {

NotAggregate1(); // !! user-provided constructor

virtual void f(); // !! virtual function

};

class NotAggregate2 : NotAggregate1 { // !! the base class is not an aggregate

int x; // !! x is private

NotAggregate1 y; // !! y is not an aggregate (recursive property)

};

58/66

Aggregate - Examples 3/3

struct Aggregate1 {

int x;

struct Aggregate2 {

int a;

int b[3];

} y;

};

int array1[3] = {1, 2, 3};

int array2[3] {1, 2, 3};

Aggregate1 agg1 = {1, {2, {3, 4, 5}}};

Aggregate1 agg2 {1, {2, {3, 4, 5}}};

Aggregate1 agg3 = {1, 2, 3, 4, 5};

59/66

Trivial Class 1/2

Trivial Class

A trivial class is a class that is trivially copyable (supports memcpy)

Trivially copyable:

• No user-provided copy/move/default constructors, destructor, and copy/move assignment operators

• No virtual functions

Apply recursively to base classes and non-static data members

No restrictions:

• User-declared constructors different from copy/move/default

• Functions, static data members, and non-static data member initialization

• protected / private members

60/66

Trivial Class - Examples 2/2

struct NonTrivial {

NonTrivial(); // !! user-provided constructor

virtual void f(); // !! virtual function

};

struct Trivial1 {

Trivial1() = default; // ok, defaulted constructor

Trivial1(int) {} // ok, user-provided constructor (not copy/move/default)

static int x; // ok, static member

void f(); // ok, function

private:

int z { 3 }; // ok, private and initialized

};

struct Trivial2 : Trivial1 { // ok, base class is trivial

Trivial1 array[3]; // ok, array of trivials is trivial

};

61/66

Standard-Layout Class 1/2

Standard-Layout

A standard-layout class is a class with the same memory layout as the equivalent C struct or union (useful for communicating with other languages)

• No virtual functions

• Only one access control (public / protected / private) for all non-static data members

• No base classes with non-static data members

• No base classes of the same type as the first non-static data member

Apply recursively to base classes and non-static data members

62/66

Standard-Layout Class (examples) 2/2

struct StandardLayout1 {

StandardLayout1(); // ok, user-provided constructor

void f(); // ok, non-virtual function

};

class StandardLayout2 : StandardLayout1 {

int x, y; // ok, both are private

StandardLayout1 z; // ok, 'z' is not the first data member

};

struct StandardLayout4 : StandardLayout1, StandardLayout2 {

// ok, can use multiple inheritance as long as only

// one class in the hierarchy has non-static data members

};

63/66

Plain Old Data (POD)

Plain Old Data (POD): Trivially copyable (T) + Standard-Layout (S)

(T) No user-provided copy/move/default constructors, destructor, and copy/move assignment operators

(S) Only one access control (public / protected / private) for all non-static data members

(S) No base classes with non-static data members

(S) No base classes of the same type as the first non-static data member

(T, S) No virtual functions

Apply recursively to base classes and non-static data members

64/66

C++ std Utilities

C++11 provides three utilities to check if a type is POD, Trivially Copyable, or Standard-Layout

• std::is_pod checks for POD, deprecated in C++20

• std::is_trivially_copyable checks for trivially copyable

• std::is_standard_layout checks for standard-layout

#include <type_traits>

struct A {

int x;

private:

int y;

};

cout << std::is_trivially_copyable_v<A>; // true

cout << std::is_standard_layout_v<A>; // false

cout << std::is_pod_v<A>; // false
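
A typical use of these traits is to guard byte-wise copies. The sketch below assumes a hypothetical `Header` struct and two helper functions, `serialize` and `deserialize`; it only illustrates the idea and is not a complete serialization scheme.

```cpp
#include <cstring>      // std::memcpy
#include <type_traits>  // std::is_trivially_copyable

struct Header {          // hypothetical message header
    int   id;
    float weight;
};

// Copy the object representation into a byte buffer
template<typename T>
void serialize(const T& obj, unsigned char* buffer) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "serialize requires a trivially copyable type");
    std::memcpy(buffer, &obj, sizeof(T));
}

// Rebuild an object from a byte buffer written by serialize()
template<typename T>
T deserialize(const unsigned char* buffer) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "deserialize requires a trivially copyable type");
    T obj;
    std::memcpy(&obj, buffer, sizeof(T));
    return obj;
}
```

Calling `serialize` on a type with a virtual function or a user-provided copy constructor fails at compile time instead of silently producing undefined behavior.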

65/66

Object Layout Hierarchy

66/66

Modern C++

Programming

10. Templates and

Meta-programming I

Function Templates and Compile-Time Utilities

Federico Busato

2025-04-14

Table of Contents

1 Function Template

Overview

Template Instantiation

Template Parameters

Template Parameters - Default Value

Overloading

Specialization

1/47

Table of Contents

2 Template Variable

3 Template Parameter Types

Generic Type Notes

auto Placeholder

Function Type

⋆

2/47

Table of Contents

4 Compile-Time Utilities

static_assert

using Keyword

decltype Keyword

5 Type Traits

Overview

Type Traits Library

Type Manipulation

3/47

Template Books

C++ Templates: The

Complete Guide (2nd)

D. Vandevoorde, N. M. Josuttis,

D. Gregor, 2017

4/47

Function Template

Template Overview

Template

A template is a mechanism for generic programming to provide a “schema” (or

placeholders) to represent the structure of an entity

In C++, templates are a compile-time functionality to represent:

• A family of functions

• A family of classes

• A family of variables (C++14)

5/47

Function Template 1/2

The problem: We want to deﬁne a function to handle diﬀerent types

int add(int a, int b) {

return a + b;

}

float add(float a, float b) { // overloading

return a + b;

}

char add(char a, char b) { ... } // overloading

ClassX add(ClassX a, ClassX b) { ... } // overloading

• Redundant code!!

• How many functions do we have to write!?

• If the user introduces a new type we have to write another function!!

6/47

Function Template 2/2

Function Template

A function template is a function schema that operates with generic types

(independent of any particular type) or concrete values

A function template works with multiple types without repeating the entire code for each of them

template<typename T> // or template<class T>

T add(T a, T b) {

return a + b;

}

int c1 = add(3, 4); // c1 = 7

float c2 = add(3.0f, 4.0f); // c2 = 7.0f

7/47

Templates: Beneﬁts and Drawbacks

Beneﬁts

• Generic Programming: Less code and reusable. Reduce redundancy, better maintainability and flexibility

• Performance. Computation can be done/optimized at compile-time → faster

Drawbacks

• Readability. “With respect to C++, the syntax and idioms of templates are esoteric compared to conventional C++ programming, and templates can be very difficult to understand” [Wikipedia] → hard to read, cryptic error messages

• Compile Time/Binary Size. Templates are implicitly instantiated for every distinct set of template arguments

8/47

Template Instantiation

The template instantiation is the substitution of template parameters with concrete

values or types

The compiler automatically generates a function implementation for

each template

instantiation

template<typename T>

T add(T a, T b) {

return a + b;

}

add(3, 4); // generates: int add(int, int)

add(3.0f, 4.0f); // generates: float add(float, float)

add(2, 6); // already generated

// other instances are not generated

// e.g. char add(char,char)

9/47

Implicit and Explicit Template Instantiation

Implicit Template Instantiation

Implicit template instantiation occurs when the compiler generates code

depending on the deduced argument types or the explicit template arguments and

only when the deﬁnition is needed

Explicit Template Instantiation

Explicit template instantiation occurs when the compiler generates code

depending only on the explicit template arguments speciﬁed in the

declaration.

Useful when dealing with multiple translation units to reduce the binary size
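
In a multi-file project this is commonly paired with an `extern template` declaration (C++11) in the header, so that other translation units reuse the single instantiation. The sketch below shows both parts in one listing; the file names and the `square` function are hypothetical.

```cpp
// --- header, e.g. a hypothetical "square.hpp" ---
template<typename T>
T square(T x) { return x * x; }

// Explicit instantiation declaration: do not instantiate square<int> here
extern template int square<int>(int);

// --- exactly one source file, e.g. a hypothetical "square.cpp" ---
// Explicit instantiation definition: the code for square<int> is generated once, here
template int square<int>(int);
```

Types without an explicit instantiation, such as `square(1.5)`, are still instantiated implicitly wherever they are used.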

10/47

Implicit and Explicit Template Instantiation

template<typename T>

void f(T a) {}

void g() {

f(3); // generates: void f(int) → implicit

f<short>(3.0); // generates: void f(short) → implicit

}

template void f<int>(int); // generates: void f(int) → explicit

11/47

Template Parameters

Template Parameters are the names following the template keyword

template<typename T>

void f() {}

f<int>();

typename T is the template parameter

int is the template argument

A template parameter can be a generic type, i.e. typename, as well as a non-type template parameter (NTTP), e.g. int, enum, etc.

The template argument of a generic type is a built-in or user-declared type, while the template argument of a non-type template parameter is a concrete value

12/47

Examples 1/2

int parameter

template<int A, int B>

int add_int() {

return A + B; // sum is computed at compile-time

} // e.g. add_int<3, 4>();

enum parameter

enum class Enum { Left, Right };

template<Enum Z>

int add_enum(int a, int b) {

return (Z == Enum::Left) ? a + b : a;

} // e.g. add_enum<Enum::Left>(3, 4);

13/47

Examples 2/2

• Ceiling division

template<int DIV, typename T>

T ceil_div(T value) {

return (value + DIV - 1) / DIV;

}

// e.g. ceil_div<5>(11); // returns 3

• Rounded division

template<int DIV, typename T>

T round_div(T value) {

return (value + DIV / 2) / DIV;

}

// e.g. round_div<5>(11); // returns 2 (2.2)

Since DIV is known at compile-time, the compiler can heavily optimize the division

(almost for every number, not just for power of two)

14/47

Template Parameters - Default Value 1/3

C++11 Template parameters can have default values

template<int A = 3, int B = 4>

void print1() { cout << A << ", " << B; }

template<int A = 3, int B> // still possible, but little sense

void print2() { cout << A << ", " << B; }

print1<2, 5>(); // print 2, 5

print1<2>(); // print 2, 4 (B: default)

print1<>(); // print 3, 4 (A,B: default)

print1(); // print 3, 4 (A,B: default)

print2<2, 5>(); // print 2, 5

// print2<2>(); compile error

// print2<>(); compile error

// print2(); compile error

15/47

Template Parameters - Default Value 2/3

Template parameters may have no name

void f() {}

template<typename = void>

void g() {}

int main() {

g(); // generated

}

f() is always generated in the ﬁnal code

g() is generated in the ﬁnal code only if it is called

16/47

Template Parameters - Default Value 3/3

C++11 Unlike function parameters, template parameters can be initialized by

previous values

template<int A, int B = A + 3>

void f() {

cout << B;

}

template<typename T, int S = sizeof(T)>

void g(T) {

cout << S;

}

f<3>(); // B is 6

g(3); // S is 4

17/47

Function Template Overloading

Template Functions can be overloaded

Concrete type overloading has higher precedence

template<typename T>

T add(T a, T b) { return a + b; } // e.g add(3, 4);

int add(int a, int b) { return a + b + 1; } // non-template overload: higher precedence over

// the generic version

// different number of parameters

template<typename T>

T add(T a, T b, T c) { return a + b + c;} // e.g add(3, 4, 5);

Also, templates themselves can be overloaded

template<int C, typename T> // it is not in conflict with

T add(T a, T b) { return a + b + C; } // T add(T a, T b)

// "C" is part of the signature
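
The resolution rules can be checked with a small set of overloads. The sketch below reuses the `add` name from the slide and assumes that the concrete overload is a plain, non-template function.

```cpp
template<typename T>
T add(T a, T b) { return a + b; }            // generic version

int add(int a, int b) { return a + b + 1; }  // non-template overload, preferred for (int, int)

template<int C, typename T>
T add(T a, T b) { return a + b + C; }        // distinct template: C must be given explicitly
```

With these definitions, `add(3, 4)` selects the non-template overload, `add(3.0, 4.0)` and `add<int>(3, 4)` select the generic version, and `add<10>(3, 4)` selects the template with the non-type parameter.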

18/47

Template Specialization 1/2

Template Specialization

Template specialization refers to the concrete implementation for a speciﬁc

combination of template parameters

The problem:

template<typename T>

bool compare(T a, T b) {

return a < b;

}

The direct comparison between two ﬂoating-point values is dangerous due to rounding

errors

19/47

Template Specialization 2/2

Solution: Template specialization

template<>

bool compare<float>(float a, float b) {

return ... // a better floating point implementation

}

Full Specialization: Function templates can be specialized only if ALL template

arguments are specialized
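
One possible shape for such a specialization is sketched below. The relative tolerance of 1e-5 is purely an illustrative assumption; the slides do not specify how the floating-point comparison should be implemented.

```cpp
#include <algorithm>  // std::max
#include <cmath>      // std::fabs

template<typename T>
bool compare(T a, T b) {  // generic version
    return a < b;
}

// Full specialization for float.
// The tolerance value is an assumption chosen for this example only.
template<>
bool compare<float>(float a, float b) {
    const float tolerance = 1e-5f * std::max(std::fabs(a), std::fabs(b));
    return (b - a) > tolerance;  // "a < b" only if the gap exceeds the tolerance
}
```

With this choice, two values that differ only by rounding noise are not reported as ordered, while integer arguments keep using the generic version.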

20/47

Template Variable

C++14 allows variables with templates

A template variable can be considered a special case of a class template (see next

lecture)

template<typename T>

constexpr T pi{ 3.1415926535897932385 }; // variable template

template<typename T>

T circular_area(T r) {

return pi<T> * r * r; // pi<T> is a variable template instantiation

}

circular_area(3.3f); // float

circular_area(3.3); // double

// circular_area(3); // compile error, narrowing conversion with "pi"

21/47

Template Parameter

Types

Template Parameter Types

Template parameters can be:

• integral type

• enum, enum class

• floating-point type (C++20)

• auto placeholder (C++17)

• class literals and concepts (C++20)

• generic type typename

and rarely:

• function

• reference/pointer to global static function or object

• pointer to member type

• nullptr_t (C++14)

22/47

Generic Type Notes

Pass multiple values and ﬂoating-point types

template<float V> // only in C++20

void print_float() {}

template<typename T>

void print() {

cout << T::x << ", " << T::y;

}

struct Multi {

static const int x = 1;

static constexpr float y = 2.0f;

};

print<Multi>(); // print "1, 2"

23/47

auto Placeholder

C++17 introduces automatic deduction of non-type template parameters with the

auto keyword

template<int X, int Y>

void f() {}

template<typename T1, T1 X, typename T2, T2 Y>

void g1() {} // before C++17

template<auto X, auto Y>

void g2() {}

f<2u, 2u>(); // X: int, Y: int

g1<int, 2, char, 'a'>(); // X: int, Y: char

g2<2, 'a'>(); // X: int, Y: char

24/47

Template Parameter - Function Type

⋆

2/2

Function

template<int (*F)(int, int)> // <-- signature of "f"

int apply1(int a, int b) {

return F(a, b);

}

int f(int a, int b) { return a + b; }

int g(int a, int b) { return a * b; }

template<decltype(f) F> // alternative syntax

int apply2(int a, int b) {

return F(a, b);

}

int main() {

apply1<f>(2, 3); // return 5

apply2<g>(2, 3); // return 6

}

25/47

Compile-Time

Utilities

static_assert

C++11 static_assert is used to test an assertion at compile-time, e.g. sizeof, literals, templates, constexpr

If the static assertion fails, the program does not compile

static_assert(2 + 2 == 4, "test1"); // ok, it compiles

static_assert(2 + 2 == 5, "test2"); // compile error, print "test2"

C++17: assertions without messages

template<typename T, typename R>

void f() { static_assert(sizeof(T) == sizeof(R)); }

f<int, unsigned>(); // ok, it compiles

// f<int, char>(); // compile error

C++26: assertions with text formatting

static_assert(sizeof(T) != 4, std::format("test1 with sizeof(T)={}", sizeof(T)));

26/47

using Keyword 1/2

using keyword (C++11)

The using keyword introduces an alias-declaration or alias-template

• using is an enhanced version of typedef with a more readable syntax

• using can be combined with templates, unlike typedef

• using is useful to simplify complex template expressions

• using allows introducing new names for partial and full specializations

typedef int distance_t; // equal to:

using distance_t = int;

typedef void (*function)(int, float); // equal to:

using function = void (*)(int, float);

27/47

using Keyword 2/2

Full/Partial specialization alias:

template<typename T, int Size>

struct Vector {}; // see next lecture for further details

// on class template

template<int Size>

using Bitset = Vector<bool, Size>; // partial specialization alias

using IntV4 = Vector<int, 4>; // full specialization alias

Accessing a type within a structure:

struct A {

using type = int;

};

using Alias = A::type;

28/47

decltype Keyword 1/4

C++11 decltype keyword deduces the type of an entity or expression

• decltype is always evaluated at compile-time

• decltype(entity) returns the declared type of the entity

• decltype(expression) returns the type of the expression

◦ A variable evaluated as an expression, i.e. decltype((var)), is deduced as an lvalue reference

◦ A general expression, e.g. decltype((a + b)), is deduced as its final type

29/47

decltype Keyword (value) 2/4

int x = 3;

int& y = x;

const int z = 4;

int array[2];

void f(int, float);

decltype(x); // int

decltype(2 + 3.0); // double

decltype(y); // int&

decltype(z); // const int

decltype(array); // int[2]

decltype(f(1, 2.0f)); // void, i.e. the return type of 'f'

decltype(f); // void (int, float), i.e. the signature of 'f'

decltype(x) w = 3; // 'w' is int

using T = decltype(y); // T is int&

30/47

decltype Keyword ((expression))

⋆

3/4

bool f(int);

struct A {

int x;

};

int x = 3;

const A a{4};

decltype(x) d1; // int

decltype((x)) d2 = x; // int&

decltype(f) d3; // bool (int)

decltype((f)) d4 = f; // bool (&)(int)

decltype(a.x) d5; // int

decltype((a.x)) d6 = x; // const int&

www.ibm.com/support/knowledgecenter

31/47

decltype Keyword + Function templates 4/4

C++11

template<typename T, typename R>

decltype(T{} + R{}) add(T x, R y) {

return x + y;

}

unsigned v1 = add(1, 2u);

double v2 = add(1.5, 2u);

C++14

template<typename T, typename R>

auto add(T x, R y) {

return x + y;

}
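
The C++11 form `decltype(T{} + R{})` requires both types to be default-constructible. A common alternative, sketched below, is a trailing return type, which can refer to the function parameters directly.

```cpp
#include <type_traits>

// C++11: the trailing return type can name the parameters 'x' and 'y'
template<typename T, typename R>
auto add(T x, R y) -> decltype(x + y) {
    return x + y;
}

// The deduced return types follow the usual arithmetic conversions
static_assert(std::is_same<decltype(add(1, 2u)), unsigned>::value,
              "int + unsigned -> unsigned");
static_assert(std::is_same<decltype(add(1.5, 2u)), double>::value,
              "double + unsigned -> double");
```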

32/47

Type Traits

Type Traits 1/4

Introspection

Introspection is the ability to inspect a type and query its properties

Reﬂection

Reﬂection is the ability of a computer program to examine, introspect, and modify

its own structure and behavior

C++ provides

compile-time reﬂection and introspection capabilities through

type traits

33/47

Type Traits 2/4

Type traits (C++11)

Type traits deﬁne a compile-time interface to query or modify the properties of

types

The problem:

template<typename T>

T integral_div(T a, T b) {

return a / b;

}

integral_div(7, 2); // returns 3 (int)

integral_div(7l, 2l); // returns 3 (long int)

integral_div(7.0, 3.0); // !!! a floating-point value is not an integral type

Two alternatives: (1) Specialize (2) Type Traits + static_assert

34/47

Type Traits 3/4

If we want to prevent floating-point/other objects division at compile-time, a first solution consists of specializing for all integral types

template<typename T>

T integral_div(T a, T b); // declaration (error for other types)

template<>

char integral_div<char>(char a, char b) { // specialization

return a / b;

}

template<>

int integral_div<int>(int a, int b) { // specialization

return a / b;

}

...unsigned char

...short

...

Very redundant!!

35/47

Type Traits 4/4

The best solution is to use type traits

#include <type_traits> // <-- std type traits library

template<typename T>

T integral_div(T a, T b) {

static_assert(std::is_integral<T>::value,

"integral_div accepts only integral types");

return a / b;

}

std::is_integral<T> is a struct with a static constexpr boolean ﬁeld value

value is true if T is bool , char , short , int , long , long long , false otherwise

C++17 provides utilities to improve the readability of type traits

std::is_integral_v<T>; // std::is_integral<T>::value

36/47

Type Traits Library - Query Fundamental and Scalar Types 1/3

• is_integral checks for an integral type ( bool , char , unsigned char , short , int , long , etc.)

• is_floating_point checks for a floating-point type ( float , double , long double )

• is_arithmetic checks for an integral or floating-point type

• is_signed checks for a signed type ( float , int , etc.)

• is_unsigned checks for an unsigned type ( unsigned , bool , etc.)

• is_enum checks for an enumeration type ( enum , enum class )

• is_void checks for void

• is_pointer checks for a pointer ( T* )

• is_null_pointer checks for the type of nullptr ( std::nullptr_t ) C++14

37/47

Type Traits Library - Query References, Functions, Objects 2/3

Entity type queries:

• is_reference checks for a reference ( T& , T&& )

• is_array checks for an array ( T[N] ); a reference to an array ( T (&)[N] ) is a reference, not an array

• is_function checks for a function type

Class queries:

• is_class checks for a class type ( struct , class )

• is_abstract checks for a class with at least one pure virtual function

• is_polymorphic checks for a class with at least one virtual function

38/47

Type Traits Library - Query Type Relation 3/3

Type property queries:

• is_const checks if a type is const

Type relation:

• is_same<T, R> checks if T and R are the same type

• is_base_of<T, R> checks if T is base of R

• is_convertible<T, R> checks if T can be converted to R

Full list: en.cppreference.com/w/cpp/header/type_traits
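The queries listed above can be checked directly with static_assert. The following is a minimal sketch, assuming C++17 (for the _v helpers); the types Color , Base , and Derived are hypothetical and exist only for illustration.

```cpp
#include <type_traits>

enum class Color { Red, Green };                     // hypothetical enumeration
struct Base { virtual void draw() = 0; };            // abstract and polymorphic
struct Derived : Base { void draw() override {} };

// Fundamental and scalar types
static_assert(std::is_integral_v<unsigned char>);
static_assert(!std::is_integral_v<float>);
static_assert(std::is_signed_v<float>);     // signedness also applies to floating-point
static_assert(std::is_unsigned_v<bool>);
static_assert(std::is_enum_v<Color>);

// Entity types: T[N] is an array, T (&)[N] is a reference
static_assert(std::is_array_v<int[3]>);
static_assert(!std::is_array_v<int (&)[3]>);
static_assert(std::is_reference_v<int (&)[3]>);

// Classes and type relations
static_assert(std::is_abstract_v<Base>);
static_assert(std::is_polymorphic_v<Derived>);
static_assert(std::is_base_of_v<Base, Derived>);
static_assert(std::is_convertible_v<Derived*, Base*>);
static_assert(!std::is_same_v<int, const int>);
```

Every check is evaluated at compile-time, so the translation unit compiles only if all the stated properties hold.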

39/47

Example - const Deduction

# include <type_traits>

template<typename T>

void f(T x) { cout << std::is_const_v<T>; }

template<typename T>

void g(T& x) { cout << std::is_const_v<T>; }

const int a = 3;

f(a); // print false, "const" is dropped when passing by-value

g(a); // print true

const int* b = nullptr;

g(b); // print false!! T: (const int)*, 'b' can be modified by 'g()'

int* const c = nullptr;

g(c); // print true!! T: const (int*), 'c' cannot be modified by 'g()'

40/47

Example - Type Relation

# include <type_traits>

template<typename T, typename R>

T add(T a, R b) {

static_assert(std::is_same_v<T, R>, "T and R must have the same type");

return a + b;

}

add(1, 2); // ok

// add(1, 2.0); // compile error, "T and R must have the same type"

# include <type_traits>

struct A {};

struct B : A {};

std::is_base_of_v<A, B>; // true

std::is_convertible_v<int, float>; // true

41/47

Type Manipulation

Type traits also allow manipulating types through the type member

Example: produce unsigned from int

# include <type_traits>

using U = typename std::make_unsigned<int>::type; // see next lecture to understand

// why 'typename' is needed here

U y = 5; // unsigned

C++14 provides utilities to improve the readability of type traits

std::make_unsigned_t<T>; // instead of 'typename std::make_unsigned<T>::type'

42/47

Type Traits Library - Type Manipulation 1/2

Signed and Unsigned types:

• make_signed makes a signed type

• make_unsigned makes an unsigned type

Pointers and References:

• remove_pointer remove pointer ( T* → T )

• remove_reference remove reference ( T& → T )

• add_pointer add pointer ( T → T* )

• add_lvalue_reference add reference ( T → T& )

43/47

Type Traits Library - Type Transformation 2/2

const specifiers:

• remove_const remove const ( const T → T )

• add_const add const ( T → const T )

Other type transformation:

• common_type<T, R> returns the common type between T and R

• conditional<pred, T, R> returns T if pred is true , R otherwise

•

decay<T> returns the same type as a function parameter passed by-value
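The transformations listed above can be verified by comparing their result with std::is_same_v . A minimal sketch, assuming C++17; the alias larger_t is a hypothetical helper introduced only for this illustration.

```cpp
#include <type_traits>

static_assert(std::is_same_v<std::make_unsigned_t<int>, unsigned>);
static_assert(std::is_same_v<std::remove_pointer_t<const char*>, const char>);
static_assert(std::is_same_v<std::remove_reference_t<int&>, int>);
static_assert(std::is_same_v<std::add_const_t<int>, const int>);
static_assert(std::is_same_v<std::remove_const_t<const int>, int>);

// common_type: the type both operands can be implicitly converted to
static_assert(std::is_same_v<std::common_type_t<int, float>, float>);

// conditional: compile-time selection between two types
template<typename T, typename R>
using larger_t = std::conditional_t<(sizeof(T) >= sizeof(R)), T, R>; // hypothetical helper

static_assert(std::is_same_v<larger_t<char, double>, double>);

// decay: the type deduced for a function parameter passed by-value
static_assert(std::is_same_v<std::decay_t<const int&>, int>);
static_assert(std::is_same_v<std::decay_t<int[3]>, int*>);
```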

44/47

Type Manipulation Example

# include <type_traits>

template<typename T>

void f(T ptr) {

using R = std::remove_pointer_t<T>;

R x = ptr[0]; // char

}

template<typename T>

void g(T x) {

using R = std::add_const_t<T>;

R y = 3;

// y = 4; // compile error

}

char a[] = "abc";

f(a); // T: char*

g(3); // T: int

45/47

std::common_type Example

# include <type_traits>

template<typename T, typename R>

std::common_type_t<R, T> // <-- return type

add(T a, R b) {

return a + b;

}

// we can also use decltype to derive the result type

using result_t = decltype(add(3, 4.0f));

result_t x = add(3, 4.0f);

46/47

std::conditional Example

# include <type_traits>

template<typename T, typename R>

auto f(T a, R b) {

constexpr bool pred = sizeof(T) > sizeof(R);

using S = std::conditional_t<pred, T, R>;

return static_cast<S>(a) + static_cast<S>(b);

}

f( 2, 'a'); // return 'int'

f( 2, 2ull); // return 'unsigned long long'

f(2.0f, 2ull); // return 'unsigned long long'

47/47

Modern C++

Programming

11. Templates and

Meta-programming II

Class Templates, SFINAE, and Concepts

Federico Busato

2025-04-14

Table of Contents

1 Class Template

Class Specialization

Class Template Constructor

2 Class Template Argument Deduction (CTAD)

1/84

Table of Contents

3 Class Template - Advanced Concepts

Class + Function - Specialization

Dependent Names - typename and template Keywords

Class Template Hierarchy and using

friend Keyword

Template Template Arguments

4 Template Meta-Programming

2/84

Table of Contents

5 SFINAE: Substitution Failure Is Not An Error

Function SFINAE

Class SFINAE

6 Variadic Templates

Homogeneous Variadic Parameters

Folding Expression

Variadic Class Template

⋆

3/84

Table of Contents

7 C++20 Concepts

Overview

concept Keyword

requires Clause

requires Expression

requires Expression + Clause

requires Clause + Expression

requires and constexpr

Nested requires

8 Template Debugging

4/84

Class Template

Similarly to function templates, class templates are used to build a family of classes

template<typename T>

struct A { // class template (typename template)

T x = 0;

};

template<int N1>

struct B { // class template (numeric template)

int N = N1;

};

A<int> a1; // a1.x is int x = 0

A<float> a2; // a2.x is float x = 0.0f

B<1> b1; // b1.N is 1

B<2> b2; // b2.N is 2

5/84

Class Template Specialization 1/2

The main difference with function templates is that class templates can be partially specialized

Note: Every class specialization (both partial and full) is a completely new class, and it does not share anything with the generic class

template<typename T, typename R>

struct A {}; // generic class template

template<typename T>

struct A<T, int> {}; // partial specialization

template<>

struct A<float, int> {}; // full specialization

6/84

Class Template Specialization 2/2

template<typename T, typename R>

struct A { // GENERIC class template

T x;

};

template<typename T>

struct A<T, int> { // PARTIAL specialization

T y;

};

A<float, float> a1;

a1.x; // ok, generic template

// a1.y; // compile error

A<float, int> a2;

a2.y; // ok, partial specialization

// a2.x; // compile error

7/84

Example 1: Implement a Simple Type Trait

template<typename T, typename R> // GENERIC template declaration

struct is_same {

static constexpr bool value = false;

};

template<typename T>

struct is_same<T, T> { // PARTIAL template specialization

static constexpr bool value = true;

};

cout << is_same< int, char>::value; // print false, generic template

cout << is_same<float, float>::value; // print true, partial template

8/84

Example 2: Check if a Pointer is const

# include <type_traits>

// std::true_type and std::false_type contain a field "value"

// set to true or false respectively

template<typename T>

struct is_pointer_to_const : std::false_type {}; // GENERIC template declaration

template<typename R> // PARTIAL specialization

struct is_pointer_to_const<const R*> : std::true_type {};

cout << is_pointer_to_const<int*>::value; // print false, generic template

cout << is_pointer_to_const<const int*>::value; // print true, partial template

cout << is_pointer_to_const<int* const>::value; // print false, generic template

9/84

Example 3: Compare Class Templates

# include <type_traits>

template<typename T>

struct A {};

template<typename T, typename R>

struct Compare : std::false_type {}; // GENERIC template declaration

template<typename T, typename R>

struct Compare<A<T>, A<R>> : std::true_type {}; // PARTIAL specialization

cout << Compare<int, float>::value; // false, generic template

cout << Compare<A<int>, A<int>>::value; // true, partial template

cout << Compare<A<int>, A<float>>::value; // true, partial template

10/84

Class Template Constructor

Within a class template, the template arguments don’t need to be repeated when they are the same as those of the enclosing class (injected class name)

template<typename T>

struct A {

A(const A& x); // A(const A<T>& x);

A f(); // A<T> f();

};

11/84

Class Template

Argument

Deduction (CTAD)

Class Template Argument Deduction (CTAD)

C++17 introduces automatic deduction of class template arguments in constructor calls

template<typename T, typename R>

struct A {

A(T x, R y) {}

};

A<int, float> a1(3, 4.0f); // < C++17

A a2(3, 4.0f); // C++17

// A<int> a{3, 5}; // compile error, a partial template argument list is not allowed
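The deduction performed by the constructor call above can be made visible with decltype . A minimal sketch, assuming C++17; Pair2 is a hypothetical class with the same shape as A , extended with data members and a constexpr constructor so that the result can be checked at compile-time.

```cpp
#include <type_traits>
#include <utility>   // std::pair

template<typename T, typename R>
struct Pair2 {       // hypothetical class, same shape as 'A' in the slide
    T first;
    R second;
    constexpr Pair2(T x, R y) : first(x), second(y) {}
};

constexpr Pair2<int, float> p1(3, 4.0f);   // explicit template arguments
constexpr Pair2             p2(3, 4.0f);   // C++17: T = int, R = float are deduced

static_assert(std::is_same_v<decltype(p1), decltype(p2)>);

// Class templates of the standard library take part in CTAD as well
constexpr std::pair p3(1, 2.5);            // std::pair<int, double>

static_assert(std::is_same_v<decltype(p3), const std::pair<int, double>>);
```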

12/84

CTAD - User-Defined Deduction Guides

A template deduction guide is a mechanism to instruct the compiler how to map constructor parameter types into class template parameters

template<typename T>

struct MyString {

MyString(T) {}

};

// constructor class instantiation

MyString(char const*) -> MyString<std::string>; // deduction guide

MyString s{"abc"}; // construct 'MyString<std::string>'

13/84

CTAD - User-Defined Deduction Guides - Aggregate Example

template<typename T>

struct A {

T x, y;

};

template<typename T>

A(T, T) -> A<T>; // deduction guide

// not required in C++20+ for aggregates

A a{1, 3}; // construct 'A<int>'

14/84

CTAD - User-Defined Deduction Guides - Independent Argument Example

template<int I>

struct A {

template<typename T>

A(T) {}

};

template<typename T>

A(T) -> A<sizeof(T)>; // deduction guide

A a{1}; // construct 'A<4>', 4 == sizeof(int)

15/84

CTAD - User-Defined Deduction Guides - Universal Reference Example

# include <type_traits> // std::remove_reference_t

template<typename T>

struct A {

template<typename R>

A(R&&) {}

};

template<typename R>

A(R&&) -> A<std::remove_reference_t<R>>; // deduction guide

int x;

A a{x}; // construct 'A<int>' instead of 'A<int&>'

16/84

CTAD - User-Defined Deduction Guides - Iterator Example

# include <iterator> // std::iterator_traits

# include <vector> // std::vector

template<typename T>

struct Container {

template<typename Iter>

Container(Iter beg, Iter end) {}

};

template<typename Iter>

Container(Iter b, Iter e) -> // deduction guide

Container<typename std::iterator_traits<Iter>::value_type>;

std::vector v{1, 2, 3};

Container c{v.begin(), v.end()}; // construct 'Container<int>'

17/84

CTAD - User-Defined Deduction Guides - Alias Template

Alias template deduction requires C++20

template<typename T>

struct A {

A(T) {}

};

template<typename T>

A(T) -> A<int>; // deduction guide

template<typename T>

using B = A<T>; // alias template

B c{3.0}; // alias template deduction

// construct 'A<int>'

18/84

CTAD User-Defined Deduction Guides - Limitation

Template deduction guide doesn’t work within the class scope

template<typename T>

struct MyString {

MyString(T) {}

MyString f() {

return MyString("abc"); } // create 'MyString<const char*>'

}; // not 'MyString<std::string>'

MyString(const char*) -> MyString<std::string>; // deduction guide

MyString<const char*> s{"abc"}; // construct 'MyString<const char*>'

The problem can be avoided by using a factory

template<typename T>

auto make_my_string(const T& x) { return MyString(x); }

19/84

Class Template -

Advanced Concepts

Class + Function - Specialization 1/3

Given a class template and a template member function

template<typename T, typename R>

struct A {

template<typename X, typename Y>

void f();

};

There are two ways to specialize the class/function:

• Generic class + generic function

• Full class specialization + generic/full specialization function

20/84

Class + Function - Specialization 2/3

template<typename T, typename R>

template<typename X, typename Y>

void A<T, R>::f() {}

// ok, A<T, R> and f<X, Y> are not specialized

template<>

template<typename X, typename Y>

void A<int, int>::f() {}

// ok, A<int, int> is fully specialized

// ok, f<X, Y> is not specialized

template<>

void A<int, int>::f<int, int>() {}

// ok, A<int, int> and f<int, int> are fully specialized

21/84

Class + Function - Specialization 3/3

template<typename T>

template<typename X, typename Y>

void A<T, int>::f() {}

// error A<T, int> is partially specialized

// (A<T, int> class must be defined before)

template<typename T, typename R>

template<typename X>

void A<T, R>::f<int, X>() {}

// error function members cannot be partially specialized

template<typename T, typename R>

template<>

void A<T, R>::f<int, int>() {}

// error function members of a non-specialized class cannot be specialized

// (requires a binding to a specific template instantiation at compile-time)

22/84

Accessing a Dependent Type - typename Keyword 1/2

Structure templates can have different data members for each specialization. The compiler needs to know in advance if a symbol within a structure is a type or a static member when the structure template depends on another template parameter

The keyword typename placed before the dependent name resolves this ambiguity

template<typename T>

struct A {

using type = int;

};

template<typename R>

void g() {

using X = typename A<R>::type; // "type" is a typename or

} // a data member depending on R
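The same rule applies whenever a nested name depends on a template parameter. A minimal sketch, assuming C++17; holds_int and MyRange are hypothetical names used only for this illustration.

```cpp
#include <type_traits>
#include <vector>

struct MyRange {                 // hypothetical user type exposing a nested type
    using value_type = int;
};

// 'Container::value_type' depends on the template parameter, so 'typename'
// tells the compiler that the name denotes a type
template<typename Container>
constexpr bool holds_int() {     // hypothetical helper
    using Elem = typename Container::value_type;
    return std::is_same_v<Elem, int>;
}

static_assert(holds_int<std::vector<int>>());
static_assert(!holds_int<std::vector<float>>());
static_assert(holds_int<MyRange>());
```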

23/84

Accessing a Dependent Type - typename Keyword 2/2

The using keyword can be used to simplify the expression to get the structure type

template<typename T>

struct A {

using type = int;

};

template<typename T>

using AType = typename A<T>::type;

template<typename R>

void g() {

using X = AType<R>;

}

24/84

Template Dependent Names - template Keyword

The template keyword tells the compiler that what follows is a template name (function or class)

note: recent compilers don’t strictly require this keyword in simple cases

template<typename T>

struct A {

template<typename R>

void g() {}

};

template<typename T> // A<T> is a dependent name (from T)

void f(A<T> a) {

// a.g<int>(); // compile error A<T> is dependent on T

// interpreted as: "a.g < int > ();"

// namely: "(a.g < int) > ();"

a.template g<int>(); // ok

}

25/84

Class Template Hierarchy and using

Members of class templates can be used internally in derived class templates by specifying the particular type of the base class with the keyword using

template<typename T>

struct A {

T x;

void f() {}

};

template<typename T>

struct B : A<T> {

using A<T>::x; // needed (otherwise it could be another specialization)

using A<T>::f; // needed

void g() {

x; // without 'using': this->x

f();

}

};

26/84

virtual Function and Template

Virtual functions cannot be templates (a member function template cannot be declared virtual)

• Templates are a compile-time feature

• Virtual functions are a run-time feature

Full story:

The reason for the language disallowing the particular construct is that there are potentially infinite different types that could be instantiating your template member function, and that in turn means that the compiler would have to generate code to dynamically dispatch those many types, which is infeasible

stackoverflow.com/a/79682130

27/84

friend Keyword

template<typename T> struct A {};

template<typename T, typename R> struct B {};

template<typename T> void f() {}

//----------------------------------------------------------------------------------

class C {

friend void f<int>(); // match only f<int>

template<typename T> friend void f(); // match all templates

friend struct A<int>; // match only A<int>

template<typename> friend struct A; // match all A templates

// template<typename T> friend struct B<int, T>;

// partial specialization cannot be declared as a friend

};

28/84

Template Template Arguments

Template template parameters match templates instead of concrete types

template<typename T> struct A {};

template< template<typename> class R >

struct B {

R<int> x;

R<float> y;

};

template< template<typename> class R, typename S >

void f(R<S> x) {} // works with every class with exactly one template parameter

B<A> y;

f( A<int>() );

The class and typename keywords are interchangeable in template template parameters since C++17

29/84

Template

Meta-Programming

Template Meta-Programming

“Metaprogramming is the writing of computer programs with the

ability to treat programs as their data. It means that a program could

be designed to read, generate, analyze or transform other programs, and

even modify itself while running”

“Template meta-programming refers to uses of the C++ template

system to perform computation at compile-time within the code.

Templates meta-programming include compile-time constants, data structures, and complete functions”

30/84

Template Meta-Programming

• Template Meta-Programming is fast (runtime)

Template Metaprogramming is computed at compile-time (nothing is computed at

run-time)

• Template Meta-Programming is Turing Complete

Template Metaprogramming is capable of expressing all tasks that standard

programming language can accomplish

• Template Meta-Programming requires longer compile time

Template recursion heavily slows down the compile time, and requires much more

memory than compiling standard code

• Template Meta-Programming is complex

Everything is expressed recursively. Hard to read, hard to write, and also very hard

to debug

31/84

Example 1: Factorial

template<int N>

struct Factorial { // GENERIC template: Recursive step

static constexpr int value = N * Factorial<N - 1>::value;

};

template<>

struct Factorial<0> { // FULL SPECIALIZATION: Base case

static constexpr int value = 1;

};

constexpr int x = Factorial<5>::value; // 120

// int y = Factorial<-1>::value; // Infinite recursion :)

32/84

Example 1: Factorial (Notes)

The previous example can be easily written as a constexpr in C++14

template<typename T>

constexpr T factorial(T value) {

T tmp = 1;

for (T i = 2; i <= value; i++)

tmp *= i;

return tmp;

}

Advantages

• Easy to read and write (easy to debug)

• Faster compile time (no recursion)

• Works with different types (typename T)

• Works at run-time and compile-time
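The advantages above can be observed by evaluating the same function in both contexts. A minimal sketch, assuming C++14 or later for the loop in a constexpr function and C++17 for the message-less static_assert; this variant returns T so that wider types do not overflow, and factorial_at_runtime is a hypothetical wrapper.

```cpp
template<typename T>
constexpr T factorial(T value) {
    T result = 1;
    for (T i = 2; i <= value; i++)
        result *= i;
    return result;
}

// Compile-time evaluation
static_assert(factorial(5) == 120);
static_assert(factorial(20ull) == 2432902008176640000ull); // needs a 64-bit type

// The same function is callable with run-time values
unsigned factorial_at_runtime(unsigned n) {   // hypothetical wrapper
    return factorial(n);
}
```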

33/84

Example 2: Log2

template<int N>

struct Log2 { // GENERIC template: Recursive step

static_assert(N > 0, "N must be greater than zero");

static constexpr int value = 1 + Log2<N / 2>::value;

};

template<>

struct Log2<1> { // FULL SPECIALIZATION: Base case

static constexpr int value = 0;

};

constexpr int x = Log2<20>::value; // 4

34/84

Example 3: Log

template<int A, int B>

struct Max { // utility

static constexpr int value = A > B ? A : B;

};

template<int N, int BASE>

struct Log { // GENERIC template: Recursive step

static_assert(N > 0, "N must be greater than zero");

static_assert(BASE > 1, "BASE must be greater than one");

// Max is used to avoid Log<0, BASE>

static constexpr int TMP = Max<1, N / BASE>::value;

static constexpr int value = 1 + Log<TMP, BASE>::value;

};

template<int BASE>

struct Log<1, BASE> { // PARTIAL SPECIALIZATION: Base case

static constexpr int value = 0;

};

constexpr int x = Log<20, 2>::value; // 4

35/84

Example 4: Unroll (Compile-time/Run-time Mix)

⋆

template<int NUM_UNROLL, int STEP = 0>

struct Unroll { // GENERIC template: Recursive step

template<typename Op>

static void run(Op op) {

op(STEP);

Unroll<NUM_UNROLL, STEP + 1>::run(op);

}

};

template<int NUM_UNROLL>

struct Unroll<NUM_UNROLL, NUM_UNROLL> { // PARTIAL SPECIALIZATION: Base case

template<typename Op>

static void run(Op) {}

};

auto lambda = [](int step) { cout << step << ", "; };

Unroll<5>::run(lambda); // print "0, 1, 2, 3, 4, "

36/84

SFINAE:

Substitution Failure

Is Not An Error

SFINAE

Substitution Failure Is Not An Error (SFINAE) applies during overload resolution of function templates. When substituting the deduced type for the template parameter fails, the specialization is discarded from the overload set instead of causing a compile error
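The rule can be observed with two overloads, one of which is valid only for some types. A minimal sketch, assuming C++17; has_value_type and has_value_type_impl are hypothetical names, and the int / long tag is just one common way to rank the two overloads.

```cpp
#include <vector>

// (1) viable only if 'T::value_type' names a type; otherwise the substitution
//     fails and this overload is silently discarded
template<typename T>
constexpr bool has_value_type_impl(int, typename T::value_type* = nullptr) {
    return true;
}

// (2) fallback: always viable, but a worse match for the argument '0' (int -> long)
template<typename T>
constexpr bool has_value_type_impl(long) {
    return false;
}

template<typename T>
constexpr bool has_value_type() {
    return has_value_type_impl<T>(0);
}

static_assert(has_value_type<std::vector<int>>());  // (1) is selected
static_assert(!has_value_type<int>());              // (1) is discarded, (2) is selected
```

For int , the expression int::value_type is invalid, but the program still compiles because the failure occurs while substituting the template parameter in the declaration of (1).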

37/84

The Problem

template<typename T>

T ceil_div(T value, T div);

template<>

unsigned ceil_div<unsigned>(unsigned value, unsigned div) {

return (value + div - 1) / div;

}

template<>

int ceil_div<int>(int value, int div) { // handle negative values

return (value > 0) ^ (div > 0) ?

(value / div) : (value + div - 1) / div;

}

What about long long int , long long unsigned , short , unsigned short , etc.?
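One possible way to cover every integral type with only two overloads is to constrain them with std::enable_if_t , the type trait presented in the following slides. This is a minimal sketch assuming C++17; the signed overload uses a remainder-based formulation chosen for this illustration, not the formula of the slide.

```cpp
#include <type_traits>

// Unsigned integral types: the classic round-up formula
template<typename T>
constexpr std::enable_if_t<std::is_unsigned_v<T>, T>
ceil_div(T value, T div) {
    return (value + div - 1) / div;
}

// Signed integral types: '/' truncates toward zero, so the quotient must be
// incremented only when the exact result is positive and not an integer
template<typename T>
constexpr std::enable_if_t<std::is_integral_v<T> && std::is_signed_v<T>, T>
ceil_div(T value, T div) {
    T quotient  = value / div;
    T remainder = value % div;
    bool round_up = (remainder != 0) && ((remainder > 0) == (div > 0));
    return round_up ? quotient + 1 : quotient;
}

static_assert(ceil_div(7u, 2u) == 4u);       // unsigned
static_assert(ceil_div(7, 2) == 4);          // int
static_assert(ceil_div(-7, 2) == -3);        // negative dividend
static_assert(ceil_div(-7LL, -2LL) == 4);    // long long, both negative
```

Floating-point arguments match neither overload, so a call such as ceil_div(7.0, 2.0) is rejected at compile-time.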

38/84

std::enable_if Type Trait

The common way to adopt SFINAE is using the std::enable_if/std::enable_if_t type traits

std::enable_if allows a function template or a class template specialization to include or exclude itself from a set of matching functions/classes

template<bool Condition, typename T = void>

struct enable_if {

// "type" is not defined if "Condition == false"

};

template<typename T>

struct enable_if<true, T> {

using type = T;

};

helper alias: std::enable_if_t<Condition, T> instead of typename std::enable_if<Condition, T>::type

39/84

Function SFINAE - Return type 1/5

# include <type_traits> // std::is_signed_v, std::enable_if_t

template<typename T>

std::enable_if_t<std::is_signed_v<T>>

f(T) {

cout << "signed";

}

template<typename T>

std::enable_if_t<!std::is_signed_v<T>>

f(T) {

cout << "unsigned";

}

f(1); // print "signed"

f(1u); // print "unsigned"

40/84

Function SFINAE - Parameter 2/5

# include <type_traits>

template<typename T>

void f(std::enable_if_t<std::is_signed_v<T>, T>) {

cout << "signed";

}

template<typename T>

void f(std::enable_if_t<!std::is_signed_v<T>, T>) {

cout << "unsigned";

}

// NOTE: explicit SFINAE on parameter prevents argument deduction

f<int>(1); // print "signed"

f<unsigned>(1u); // print "unsigned"

// f(1); // compile error

// f(1u); // compile error

41/84

Function SFINAE - Hidden Parameter 3/5

# include <type_traits>

template<typename T>

void f(T,

std::enable_if_t<std::is_signed_v<T>, int> = 0) {

cout << "signed";

}

template<typename T>

void f(T,

std::enable_if_t<!std::is_signed_v<T>, int> = 0) {

cout << "unsigned";

}

f(1); // print "signed"

f(1u); // print "unsigned"

42/84

Function SFINAE - Hidden Template Parameter 4/5

# include <type_traits>

template<typename T,

std::enable_if_t<std::is_signed_v<T>, int> = 0>

void f(T) {}

template<typename T,

std::enable_if_t<!std::is_signed_v<T>, int> = 0>

void f(T) {}

f(4);

f(4u);

43/84

Function SFINAE - decltype + return type 5/5

# include <type_traits>

template<typename T, typename R> // (1)

decltype(T{} + R{}) add(T a, R b) { // T{} + R{} is not possible with 'A'

return a + b;

}

template<typename T, typename R> // (2)

std::enable_if_t<std::is_class_v<T>, T> // 'int' is not a class

add(T a, R b) {

return a;

}

struct A {};

add(1, 2u); // call (1)

add(A{}, A{}); // call (2)

// if 'A' supports operator+, then we have a conflict

44/84

Function SFINAE Example - Array vs. Pointer

# include <type_traits>

template<typename T, int Size>

void f(T (&array)[Size]) {} // (1)

// template<typename T>

// void f(T array) {} // (2)

template<typename T>

std::enable_if_t<std::is_pointer_v<T>>

f(T ptr) {} // (3)

// void f(int* pointer) {} // (4) has the highest priority among (1), (2), and (3)

int array[3];

f(array);

// It is not possible to call (1) if (2) is present

// The reason is that 'array' decays to a pointer

// Now with (3), the code calls (1)

45/84

Function SFINAE Notes

The wrong way to achieve SFINAE

template<typename T, typename = std::enable_if_t<std::is_signed_v<T>>>

void f(T) {}

// template<typename T, typename = std::enable_if_t<!std::is_signed_v<T>>>

// void f(T) {}

// compile error: redefinition of 'f' (default template arguments are not part of the function signature)

Using std::enable_if_t for the return type prevents auto deduction

// template<typename T>

// std::enable_if_t<std::is_signed_v<T>, auto> f(T) {}

// compile error auto is not allowed here

46/84

Class SFINAE

# include <type_traits>

template<typename T, typename Enable = void>

struct A;

template<typename T>

struct A<T, std::enable_if_t<std::is_signed_v<T>>> {

};

template<typename T>

struct A<T, std::enable_if_t<!std::is_signed_v<T>>> {

};

A<int> a1;

A<unsigned> a2;

47/84

Check Struct Member

⋆

1/3

SFINAE can also be used to check if a structure has a specific data member or type

Consider the following structures:

struct A {

static int x;

int y;

using type = int;

};

struct B {};

48/84

Check Struct Member - Variable

⋆

2/3

# include <type_traits>

# include <utility> // std::declval

template<typename T, typename = void>

struct has_x : std::false_type {};

template<typename T>

struct has_x<T, decltype((void) T::x)> : std::true_type {};

template<typename T, typename = void>

struct has_y : std::false_type {};

template<typename T>

struct has_y<T, decltype((void) std::declval<T>().y)> : std::true_type {};

has_x< A >::value; // returns true

has_x< B >::value; // returns false

has_y< A >::value; // returns true

has_y< B >::value; // returns false

49/84

Check Struct Member - Type

⋆

3/3

template<typename...>

using void_t = void; // included in C++17 <type_traits> as std::void_t

template<typename T, typename = void>

struct has_type : std::false_type {};

template<typename T>

struct has_type<T, std::void_t<typename T::type> > : std::true_type {};

has_type< A >::value; // returns true

has_type< B >::value; // returns false
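The same pattern extends to member functions by placing an expression inside decltype . A minimal sketch, assuming C++17; the trait has_size is a hypothetical name introduced for this illustration.

```cpp
#include <type_traits>   // std::void_t, std::true_type, std::false_type
#include <utility>       // std::declval
#include <vector>

template<typename T, typename = void>
struct has_size : std::false_type {};     // hypothetical trait, generic case

// selected only if the expression 'std::declval<T>().size()' is well-formed
template<typename T>
struct has_size<T, std::void_t<decltype(std::declval<T>().size())>>
    : std::true_type {};

static_assert(has_size<std::vector<int>>::value);
static_assert(!has_size<int>::value);
```

std::void_t maps any well-formed list of types to void , which is what allows the partial specialization to match the default template argument.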

50/84

Support Trait for Stream Operator

⋆

template<typename T>

using EnableP = decltype( std::declval<std::ostream&>() <<

std::declval<T>() );

template<typename T, typename = void>

struct is_stream_supported : std::false_type {};

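// std::void_t maps the type of the stream expression (std::ostream&) to void,

// which is required to match the default template argument

template<typename T>

struct is_stream_supported<T, std::void_t<EnableP<T>>> : std::true_type {};

struct A {};

is_stream_supported<int>::value; // returns true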

is_stream_supported<A>::value; // returns false

51/84

SFINAE

https://twitter.com/IAmErikN/status/1252316405724336128

52/84

Variadic Templates

Variadic Template 1/2

Variadic template (C++11)

A variadic template captures a parameter pack of arguments, which holds an arbitrary number of values or types

template<typename... TArgs> // Variadic typename -> parameter pack: ... TArgs

void f(TArgs... args) {} // pack expansion -> pattern: TArgs

A parameter pack is introduced by an identifier TArgs prefixed by an ellipsis ... TArgs . Once captured, a parameter pack can later be used in a pattern expanded by an ellipsis ...

A pack expansion is equivalent to a comma-separated list of instances of the pattern

A pattern is a set of tokens containing the identifiers of one or more parameter packs.

When a pattern contains more than one parameter pack, all packs must have the same length
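The terms above can be illustrated with a pattern that is larger than the pack name alone. A minimal sketch, assuming C++17; twice , count_args , and doubled are hypothetical helpers.

```cpp
#include <array>
#include <cstddef>

template<typename T>
constexpr T twice(T x) { return x + x; }     // hypothetical helper

template<typename... TArgs>
constexpr std::size_t count_args(TArgs...) {
    return sizeof...(TArgs);                  // number of elements in the pack
}

template<typename... TArgs>
constexpr auto doubled(TArgs... args) {
    // the pattern 'twice(args)' expands to: twice(a1), twice(a2), ...
    return std::array<int, sizeof...(args)>{ twice(args)... };
}

static_assert(count_args() == 0);             // a pack can be empty
static_assert(count_args(1, 2.0, 'c') == 3);
static_assert(doubled(1, 2, 3)[2] == 6);
```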

53/84

Variadic Template 2/2

template<typename... TArgs>

void f(TArgs... args) { // Typename expansion

int values[] = {args...}; // Arguments expansion

}

f(1, 2, 3);

The pack TArgs expands in a template-argument-list, i.e. list of template arguments

The pack args expands in an initializer-list, i.e. list of values

The number of variadic arguments can be retrieved with the sizeof... operator

sizeof...(args) // e.g. 3

Note: variadic arguments must be the last one in the declaration

C++20 idioms for parameter packs

54/84

Example 1

// BASE CASE

template<typename T, typename R>

auto add(T a, R b) {

return a + b;

}

// RECURSIVE CASE

template<typename T, typename... TArgs> // Variadic typename

auto add(T a, TArgs... args) { // Typename expansion

return a + add(args...); // Arguments expansion

}

add(2, 3.0); // 5

add(2, 3.0, 4); // 9

add(2, 3.0, 4, 5); // 14

// add(2); // compile error, the base case accepts only two arguments

55/84

Example 2 - Function Unpack

template<typename T, typename... TArgs>

auto add(T a, TArgs... args); // see previous slides

struct A {

int v;

int f() { return v; }

};

template<typename... TArgs>

int f(TArgs... args) {

return add(args.f()...); // equivalent to: 'A{1}.f(), A{2}.f(), A{3}.f()'

}

f(A{1}, A{2}, A{3}); // return 6

56/84

Example 3 - Function Application

template<typename T, typename... TArgs>

auto add(T a, TArgs... args); // see previous slides

template<typename T>

T square(T value) { return value * value; }

//-----------------------------------------------------------

template<typename... TArgs>

auto add_square(TArgs... args) {

return add(square(args)...); // square() is applied to each

} // variadic argument

add_square(2, 2, 3.0f); // returns 17.0f

57/84

Example 4 - Type Expansion

template<typename... TArgs>

void g(TArgs... args) {}

template<typename... TArgs>

void f(TArgs... args) {

g<std::make_unsigned_t<TArgs>...>(args...); // g<unsigned, unsigned, unsigned>(1, 2, 3)

}

f(1, 2, 3);

58/84

Function Initializer List Types

template<typename... TArgs>

void f(TArgs... args) {} // pass by-value

template<typename... TArgs>

void g(const TArgs&... args) {} // pass by-const reference

template<typename... TArgs>

void h(TArgs*... args) {} // pass by-pointer

template<int... Sizes>

void l(int (&...arrays)[Sizes]) {} // pass a list of array references

int a[] = {1, 2};

int b[] = {1, 2, 3};

f(1, 2.0); // same as g()

h(a, b);

l(a, b);

59/84

Homogeneous Variadic Template Parameters

A parameter pack can also be used to create homogeneous variadic template parameters

template<int... IntSeq> // sequence of integers

void f() {}

f<1, 2, 3>();

template<int... IntSeq> // sequence of integers

class A {};

A<1, 2, 3> a{};

60/84

Other Usages

Variadic templates can also be applied to lambdas with generic parameters (C++14) and concepts (C++20)

auto lambda = [](auto... args) {};

lambda(1, 2u, 3.0f, 1ull);

void f(std::floating_point auto... args) {}

f(1.0, 2.0f); // ok

// f(1u, 2.0f); // compile error

61/84

Advanced Usages

⋆

Besides initializer-lists and template-argument-lists, parameter packs can be used in: capture list, constructor initializer-list, using declaration

template<typename... BaseClasses>

struct A : BaseClasses... { // : BaseClass_1, BaseClass_2, ...

A(int v) : BaseClasses{v}... {} // BaseClass_1{v}, BaseClass_2{v}, ...

using BaseClasses::f...; // C++17

// equivalent to:

// using BaseClass_1::f;

// using BaseClass_2::f;

// ...

};

void f(auto... args) {

auto lambda = [&args...](){}; // capture by-reference

}

62/84

Folding Expression 1/2

C++17 Folding expressions perform a fold of a template parameter pack over any binary operator in C++ ( + , * , , , += , && , <= etc.)

Unary/Binary folding

template<typename... Args>

auto add_unary(Args... args) { // Unary folding

return (... + args); // unfold: 1 + 2.0f + 3ll

}

template<typename... Args>

auto add_binary(Args... args) { // Binary folding

return (1 + ... + args); // unfold: 1 + 1 + 2.0f + 3ll

}

add_unary(1, 2.0f, 3ll); // returns 6.0f (float)

add_binary(1, 2.0f, 3ll); // returns 7.0f (float)
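The position of the ellipsis determines the associativity of the fold, which matters for non-associative operators. A minimal sketch, assuming C++17; sub_left , sub_right , and all_positive are hypothetical names.

```cpp
template<typename... Args>
constexpr auto sub_left(Args... args) {
    return (... - args);          // left fold:  ((a1 - a2) - a3)
}

template<typename... Args>
constexpr auto sub_right(Args... args) {
    return (args - ...);          // right fold: (a1 - (a2 - a3))
}

template<typename... Args>
constexpr bool all_positive(Args... args) {
    return ((args > 0) && ...);   // for an empty pack, a fold over '&&' is true
}

static_assert(sub_left(10, 3, 2) == 5);     // (10 - 3) - 2
static_assert(sub_right(10, 3, 2) == 9);    // 10 - (3 - 2)
static_assert(all_positive(1, 2.5, 3u));
static_assert(all_positive());
static_assert(!all_positive(1, -2));
```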

63/84

Example 1 - Extract The Last Argument

template<typename... TArgs>

int f(TArgs... args) {

return (args, ...); // the comma operator discards left values

} // same as (..., args)

f(1, 2, 3); // return 3

64/84

Example 2 - Function Application

Same example of “Variadic Template - Function Application” ... but shorter

template<typename T>

T square(T value) { return value * value; }

template<typename... TArgs>

auto add_square(TArgs... args) {

return (square(args) + ...); // square() is applied to each

} // variadic argument

add_square(2, 2, 3.0f); // returns 17.0f

65/84

Example 3 - Homogeneous Variadic Parameter Type

A parameter pack can be constrained to obtain a homogeneous variadic parameter type

template <typename ... TArgs>

std::enable_if_t<(std::is_same_v<TArgs, int> && ... && true)>

f(const TArgs ... args) {}

f(1, 2, 3); // ok

// f(1u, 2, 3); // compile error

66/84

Variadic Template and Classes

template<int... NArgs>

struct Add; // data structure declaration

template<int N1, int N2>

struct Add<N1, N2> { // BASE case

static constexpr int value = N1 + N2;

};

template<int N1, int... NArgs>

struct Add<N1, NArgs...> { // RECURSIVE case

static constexpr int value = N1 + Add<NArgs...>::value;

};

Add<2, 3, 4>::value; // returns 9

// Add<>; // compile error no match

// Add<2>::value; // compile error

// call Add<N1, NArgs...>, then Add<>

67/84

Variadic Class Template

⋆

Variadic Template can be used to build recursive data structures

template<typename... TArgs>

struct Tuple; // data structure declaration

template<typename T>

struct Tuple<T> { // base case

T value; // specialization with one parameter

};

template<typename T, typename... TArgs>

struct Tuple<T, TArgs...> { // recursive case

T value; // specialization with more

Tuple<TArgs...> tail; // than one parameter

};

Tuple<int, float, char> t1 { 2, 2.0, 'a' };

t1.value; // 2

t1.tail.value; // 2.0

t1.tail.tail.value; // 'a'
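Chaining `.tail` by hand becomes tedious for long tuples. The sketch below adds an index-based accessor on top of the same recursive structure, assuming C++17 (`if constexpr`); the function name `element` is a hypothetical helper, not part of the slides.

```cpp
template<typename... TArgs>
struct Tuple;                       // declaration

template<typename T>
struct Tuple<T> {                   // base case
    T value;
};

template<typename T, typename... TArgs>
struct Tuple<T, TArgs...> {         // recursive case
    T               value;
    Tuple<TArgs...> tail;
};

// Returns a reference to the I-th element by walking the 'tail' chain
// at compile time
template<int I, typename... TArgs>
constexpr auto& element(Tuple<TArgs...>& t) {
    if constexpr (I == 0)
        return t.value;
    else
        return element<I - 1>(t.tail);
}
```

An out-of-range index fails to compile, because the base case has no `tail` member.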

68/84

Variadic Template and Class Specialization

⋆

1/3

Get function arity at compile-time:

template <typename T>

struct GetArity;

// generic function pointer

template<typename R, typename... Args>

struct GetArity<R(*)(Args...)> {

static constexpr int value = sizeof...(Args);

};

// generic function reference

template<typename R, typename... Args>

struct GetArity<R(&)(Args...)> {

static constexpr int value = sizeof...(Args);

};

// generic function type

template<typename R, typename... Args>

struct GetArity<R(Args...)> {

static constexpr int value = sizeof...(Args);

};

69/84

Variadic Template and Class Specialization

⋆

2/3

void f(int, char, double) {}

int main() {

// function type

GetArity<decltype(f)>::value;

auto& g = f;

// function reference

GetArity<decltype(g)>::value;

// function reference

GetArity<decltype((f))>::value;

auto* h = f;

// function pointer

GetArity<decltype(h)>::value;

}

Get function arity from template parameter

70/84

Variadic Template and Class Specialization

⋆

3/3

Get operator() (and lambda) arity at compile-time:

template <typename T>

struct GetArity;

template<typename R, typename C, typename... Args>

struct GetArity<R(C::*)(Args...)> { // class member

static constexpr int value = sizeof...(Args);

};

template<typename R, typename C, typename... Args>

struct GetArity<R(C::*)(Args...) const> { // "const" class member

static constexpr int value = sizeof...(Args);

};

struct A {

void operator()(char, char) {}

void operator()(char, char) const {}

};

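// the specializations match member-function pointer types, not the class itself

GetArity<void (A::*)(char, char)>::value; // matches GetArity<R(C::*)(Args...)>

GetArity<void (A::*)(char, char) const>::value; // matches GetArity<R(C::*)(Args...) const>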
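For lambdas, one possible approach is a primary template that forwards to the type of the closure's `operator()`. This is a sketch under stated assumptions: `Arity` and `lambda_arity` are hypothetical names, and the technique does not work for generic lambdas, overloaded `operator()`, or `noexcept` call operators (whose types differ since C++17).

```cpp
// Primary template: for functors and lambdas, inspect their operator()
template<typename T>
struct Arity : Arity<decltype(&T::operator())> {};

// Non-const member function pointer
template<typename R, typename C, typename... Args>
struct Arity<R (C::*)(Args...)> {
    static constexpr int value = sizeof...(Args);
};

// Const member function pointer (the default for lambdas)
template<typename R, typename C, typename... Args>
struct Arity<R (C::*)(Args...) const> {
    static constexpr int value = sizeof...(Args);
};

inline int lambda_arity() {
    auto lambda = [](int, char, double) {};
    return Arity<decltype(lambda)>::value;  // number of lambda parameters
}
```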

71/84

C++20 Concepts

C++20 introduces concepts as an extension for templates to enforce constraints,

which specify the requirements on template arguments

Concepts allow performing compile-time validation of template arguments

Advantages compared to SFINAE (

std::enable_if ):

• Concepts are easier to read and write

• Clear compile-time messages for debugging

• Faster compile time

Keyword:

concept Constraint

requires Constraint list/Requirements, clause and expression

• The concept behind C++ concepts

• Constraints and concepts

• What are C++20 concepts and constraints? How to use them?

72/84

The Problem

Goal: deﬁne a function to sum only arithmetic types

template<typename T>

T add(T valueA, T valueB) {

return valueA + valueB;

}

struct A {};

add(3, 4); // ok

// add(A{}, A{}); // not supported

SFINAE solution (ugly, verbose):

template<typename T>

std::enable_if_t<std::is_arithmetic_v<T>, T>

add(T valueA, T valueB) {

return valueA + valueB;

}

73/84

concept Keyword

[template arguments]

concept [name] = [compile-time boolean expression];

Example: arithmetic type concept

template<typename T>

concept Arithmetic = std::is_arithmetic_v<T>;

• Template argument constraint

template<Arithmetic T>

T add(T valueA, T valueB) {

return valueA + valueB;

}

• auto deduction constraint (constrained auto )

auto add(Arithmetic auto valueA, Arithmetic auto valueB) {

return valueA + valueB;

}
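The constrained `add` can be checked at compile time with a small detection helper. The sketch below is an illustration only: it uses the concept when the compiler supports C++20 and falls back to the SFINAE version otherwise, so both variants behave the same. `NotANumber` and `CanAdd` are hypothetical names.

```cpp
#include <type_traits>
#include <utility>

#if defined(__cpp_concepts)          // C++20: constrain with a concept
template<typename T>
concept Arithmetic = std::is_arithmetic_v<T>;

template<Arithmetic T>
T add(T valueA, T valueB) { return valueA + valueB; }
#else                                // pre-C++20 fallback: SFINAE
template<typename T>
std::enable_if_t<std::is_arithmetic_v<T>, T>
add(T valueA, T valueB) { return valueA + valueB; }
#endif

struct NotANumber {};                // hypothetical non-arithmetic type

// Detection helper: true only if the call add(T, T) is well-formed
template<typename T, typename = void>
struct CanAdd : std::false_type {};

template<typename T>
struct CanAdd<T, std::void_t<decltype(add(std::declval<T>(), std::declval<T>()))>>
    : std::true_type {};

static_assert(CanAdd<int>::value);          // accepted
static_assert(!CanAdd<NotANumber>::value);  // rejected by the constraint
```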

74/84

requires Clause

requires [compile-time boolean expression or Concept]

it acts like SFINAE

• After template parameter list

template<typename T>

requires Arithmetic<T>

T add(T valueA, T valueB) {

return valueA + valueB;

}

• After function declaration

template<typename T>

T add(T valueA, T valueB) requires (sizeof(T) == 4) {

return valueA + valueB;

}

75/84

requires Clause and concept Notes

A concept definition accepts any compile-time boolean expression. In a requires clause,

the constraint must be a primary expression, e.g. a constexpr value or a parenthesized

expression (not a bare call to a constexpr function), or a sequence of primary expressions

joined with the operators && or ||

template<typename T>

concept Arithmetic2 = std::is_arithmetic_v<T> && sizeof(T) >= 4;

Concepts and requirements can be used together

template<Arithmetic T>

requires (sizeof(T) >= 4)

T add(T valueA, T valueB) {

return valueA + valueB;

}

76/84

requires Expression 1/2

A requires expression is a compile-time expression of type bool that deﬁnes the

constraints on template arguments

requires [(arguments)] {

[SFINAE constraint];

// or

requires [predicate];

} -> bool

template<typename T>

concept MyConcept = requires (T a, T b) { // First case: SFINAE constrains

a + b; // Req. 1 - support add operator

a[0]; // Req. 2 - support subscript operator

a.x; // Req. 3 - has "x" data member

a.f(); // Req. 4 - has "f" function member

typename T::type; // Req. 5 - has a nested type named "type"

};

77/84

requires Expression 2/2

Concept library

# include <concepts>

template<typename T>

concept MyConcept2 = requires (T a, T b) {

{*a + 1} -> std::convertible_to<float>; // Req. 6 - can be dereferenced and the sum

// with an integer is convertible

// to float

{a * a} -> std::same_as<int>; // Req. 7 - "a * a" must be valid and

// the result type is "int"

};

78/84

requires Expression + Clause

requires expression can be combined with requires clause

(see

requires deﬁnition, second case) to compute a boolean value starting from

SFINAE expressions

template<typename T>

concept Arithmetic = requires { // expression -> bool (zero args)

T::value; // clause -> direct SFINAE

requires std::is_arithmetic_v<T>; // clause -> SFINAE from boolean

};

template<typename T>

concept MyConcept = requires (T value) { // expression -> bool (one arg)

requires sizeof(value) >= 4; // clause -> SFINAE from boolean

requires std::is_floating_point_v<T>; // clause -> SFINAE from boolean

};

79/84

requires Clause + Expression

requires clause can be combined with requires expression to apply SFINAE

(functions, structures) starting from a compile-time boolean expressions

template<typename T>

void f(T a) requires requires { T::value; }

// clause -> SFINAE followed by

// expression -> bool (zero args)

{}

template<typename T>

T increment(T a) requires requires (T x) { x + 1; }

// clause -> SFINAE followed by

// expression -> bool (one arg)

{

return a + 1;

}

80/84

requires and constexpr

Some examples:

• constexpr bool has_member_x = requires(T v){ v.x; };

• if constexpr (MyConcept<T>)

• static_assert(requires(T v){ ++v; }, "no increment");

• template<typename Iter>

constexpr bool is_iterator() {

return requires(Iter it) { *it++; };

}

81/84

Nested requires

Nested requires example:

requires(Iter it) { // expression -> bool (one arg)

requires requires(typename Iter::value_type v) {

// clause -> SFINAE followed by

// expression -> bool (one arg)

v = *it; // read

*it = v; // write

};

}

82/84

Template Debugging

• -ftemplate-backtrace-limit=<N> Maximum number of template

instantiation notes for a single warning/error to N , default 10

N=1 is useful when looking only at the latest instantiation (much less verbose

output). N=100 (or higher) if you are looking at all template instantiations (rare)

•

-ftemplate-depth=<N> Set the maximum instantiation depth for template

classes to N , default 900

• -Wfatal-errors Abort compilation on the ﬁrst error occurred rather than

trying to keep going and printing further error messages

83/84

Template Debugging

• -fdiagnostics-show-template-tree Display the templates as an indented

text tree

map<
  [...],
  map<
    [float != double],
    [...]>>

84/84

Modern C++

Programming

12. Translation Units I

Linkage and One Definition Rule

Federico Busato

2025-04-14

Table of Contents

1 Basic Concepts

Translation Unit

Local and Global Scope

Linkage

2 Storage Class and Duration

Storage Duration

Storage Class

static Keyword

Anonymous Namespace

extern Keyword

3 Linkage of const and constexpr Variables

Static Initialization Order Fiasco

4 Linkage Summary

5 Dealing with Multiple Translation Units

Class in Multiple Translation Units

6 One Deﬁnition Rule (ODR)

Global Variable Issues

ODR - Point (3)

inline Functions/Variables

constexpr and inline

7 ODR - Function Template

Cases

extern Keyword

8 ODR - Class Template

Cases

extern Keyword

9 ODR Undeﬁned Behavior and Summary

5/54

Basic Concepts

Translation Unit

Header File and Source File

Header files allow defining interfaces (.h, .hpp, .hxx), while keeping the

implementation in separate source files (.c, .cpp, .cxx).

Translation Unit

A translation unit (or compilation unit) is the basic unit of compilation in C++. It

consists of the content of a single source ﬁle, plus the content of any header ﬁle

directly or indirectly included by it

A single translation unit can be compiled into an object ﬁle, library, or executable

program

6/54

Compile Process

7/54

Local and Global Scope

Scope

The scope of a variable/function/object is the region of the code within which the entity

can be accessed

Local Scope / Block Scope

Entities that are declared inside a function or a block are called local variables.

Their memory address is not valid outside their scope

Global Scope / File Scope / Namespace Scope

Entities that are deﬁned outside all functions.

They hold a single memory location throughout the life-time of the program

8/54

Local and Global Scope

int var1; // global scope

void f() {

int var2; // local scope

}

struct A {

int var3; // depends on where the instance of 'A' is used

};

9/54

Linkage

Linkage refers to the visibility of symbols to the linker

No Linkage

No linkage refers to symbols in the local scope of declaration and not visible to the

linker

Internal Linkage

Internal linkage refers to symbols visible only in scope of a single translation unit.

The same symbol name has a diﬀerent memory address in distinct translation units

External Linkage

External linkage refers to entities that exist (visible/accessible) outside a single

translation unit. They are accessible and have the same identical memory address

through the whole program, which is the combination of all translation units

10/54

Storage Class and

Duration

Storage Duration 1/2

Storage Duration

The storage duration (or duration class) determines the duration of a variable,

namely when it is created and destroyed

Storage Duration   Allocation          Deallocation
Automatic          Code block start    Code block end
Static             Program start       Program end
Dynamic            Memory allocation   Memory deallocation
Thread             Thread start        Thread end

en.cppreference.com/w/cpp/language/storage_duration

11/54

Storage Duration 2/2

• Automatic storage duration. Local variables temporarily allocated on registers or

stack (depending on compiler, architecture, etc.).

If not explicitly initialized, their value is indeterminate

•

Static storage duration. The storage of an object is allocated when the program

begins and deallocated when the program ends.

If not explicitly initialized, it is zero-initialized

• Dynamic storage duration. The object is allocated and deallocated by using

dynamic memory allocation functions (

new/delete ).

If not explicitly initialized, its memory content is undeﬁned

•

Thread storage duration C++11. The object is allocated when the thread

begins and deallocated when the thread ends. Each thread has its own instance of

the object

12/54

Storage Duration Examples

int v1; // static duration

void f() {

int v2; // automatic duration

auto v3 = 3; // automatic duration

auto array = new int[10]; // dynamic duration (allocation)

} // array, v2, v3 variables deallocation (from stack)

// the memory associated to "array" is not deallocated

int main() {

f();

}

// main end: v1 is deallocated
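The different lifetimes can be made visible by logging construction and destruction. This is a minimal sketch; `event_log`, `Tracer`, and `storage_demo` are hypothetical helpers introduced only for illustration.

```cpp
#include <string>

// The log itself has static storage duration: it lives until the program ends
inline std::string& event_log() {
    static std::string events;
    return events;
}

// Hypothetical helper type that records its own construction and destruction
struct Tracer {
    char id;
    explicit Tracer(char c) : id(c) { event_log() += id; }
    ~Tracer() { event_log() += '~'; event_log() += id; }
};

inline void storage_demo() {
    Tracer  a{'a'};               // automatic: destroyed at the end of the function
    Tracer* d = new Tracer{'d'};  // dynamic: destroyed only by 'delete'
    {
        Tracer b{'b'};            // automatic: destroyed at the end of the inner block
    }
    delete d;                     // without this line, the object would leak
}
```

After one call to `storage_demo()` on an empty log, the recorded sequence is `adb~b~d~a`: the inner-block object dies first, the dynamic object dies at `delete`, and the outer automatic object dies last.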

13/54

Storage Class

Storage Class Speciﬁer

The storage class for a variable declaration is a type speciﬁer that, together with

the scope, governs its storage duration and linkage

Storage Class      Notes              Scope    Storage Duration   Linkage
no storage class   local var decl.    Local    automatic          No linkage
no storage class   global var decl.   Global   static             External
static             -                  Local    static             Function dependent
static             -                  Global   static             Internal
extern             -                  Global   static             External
thread_local       C++11              any      thread local       any

14/54

Storage Class Examples

int v1; // no storage class

static int v2 = 2; // static storage class

extern int v3; // external storage class

thread_local int v4; // thread local storage class

thread_local static int v5; // thread local and static storage classes

int main() {

int v6; // auto storage class

auto v7 = 3; // auto storage class

static int v8; // static storage class

thread_local int v9; // thread local storage class (implies static at block scope)

auto array = new int[10]; // auto storage class ("array" variable)

}

15/54

static Keyword for Local Variables

static local variables are allocated when the program begins, initialized when the

function is called the ﬁrst time, and deallocated when the program ends

int f() {

static int val = 1;

val++;

return val;

}

int main() {

cout << f(); // print 2 ("val" is initialized)

cout << f(); // print 3

cout << f(); // print 4

}

16/54

static Keyword for Global Variables

static global variables or functions are visible only within the translation unit where

they are declared → internal linkage

• Non-static global variables or functions with the same name in different translation

units produce name collision (or name conflict) → multiple definitions at link-time

int var1 = 3; // external linkage

// (in conflict with variables in other

// translation units with the same name)

static int var2 = 4; // internal linkage (visible only in the

// current translation unit)

void f1() {} // external linkage (could conflict)

static void f2() {} // internal linkage

17/54

Anonymous Namespace 1/2

A namespace with no identiﬁer is called unnamed/anonymous namespace

Entities within an anonymous namespace have internal linkage and, therefore, are used

for declaring

unique identiﬁers, visible only in the same source ﬁle

Anonymous namespace vs. global static functions/variables:

• Entities within an anonymous namespace have the same properties as static

declarations at global scope

• In addition, anonymous namespaces allow type declarations and class definitions

• Anonymous namespaces are less verbose than static variables/functions, but

entities within an anonymous namespace are less visible if the scope contains

many lines

18/54

Anonymous Namespace 2/2

main.cpp

# include <iostream>

namespace { // anonymous, internal linkage

void f() { std::cout << "main"; }

using my_int = int; // not possible

// with 'static'

} // namespace

int main() {

f(); // print "main"

}

source.cpp

# include <iostream>

namespace { // anonymous, internal linkage

void f() { std::cout << "source"; }

using my_int = unsigned; // no conflicts

} // namespace

void g() {

f(); // print "source", no conflicts

}

19/54

extern Keyword

extern keyword is used to declare the existence of global variables or functions in

another translation unit → external linkage

• the variable or function must be deﬁned in one and only one translation unit

• it is redundant for functions

• it is necessary for variables to prevent the compiler from associating a memory location

in the current translation unit

Note: if the same identiﬁer within a translation unit appears with both internal and external

linkage, the behavior is undeﬁned

20/54

External Linkage Example

int var1 = 3; // external linkage

// (in conflict with variables in other

// translation units with the same name)

extern int var3; // external linkage

// (implemented in another translation unit)

void f1() {} // external linkage (could conflict)

extern void f4(); // external linkage

// (implemented in another translation unit)

21/54

Linkage of const

and constexpr

Variables

Linkage of const and constexpr Variables

const variables have internal linkage at global scope

constexpr variables imply const , which implies internal linkage

note: the same variable has diﬀerent memory addresses on diﬀerent translation units (code

bloat)

const int var1 = 3; // internal linkage

constexpr int var2 = 2; // internal linkage

static const int var3 = 3; // internal linkage (redundant)

static constexpr int var4 = 2; // internal linkage (redundant)

int main() {}

22/54

Static Initialization Order Fiasco 1/2

In C++, the order in which global variables of different translation units are initialized at run-time is not defined.

This introduces a subtle problem called static initialization order ﬁasco

source.cpp

int f() { return 3; } // run-time function

int x = f(); // run-time evaluation

main.cpp

extern int x;

int y = x; // run-time initialized

int main() {

cout << y; // print "3" or "0" depending on the linking order

}

23/54

Static Initialization Order Fiasco 2/2

source.cpp

constexpr int f() { return 3; } // compile-time/run-time function

constinit int x = f(); // compile-time initialized (C++20)

main.cpp

constinit extern int x; // compile-time initialized (C++20)

int y = x; // run-time initialized

int main() {

cout << y; // print "3"!!

}
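When the initializer cannot be evaluated at compile time, `constinit` is not applicable. A common alternative, not shown in the slides, is the "construct on first use" idiom, which wraps the global in a function-local static. A minimal sketch, assuming C++17 for the inline variable; `compute`, `global_x`, and `global_y` are hypothetical names.

```cpp
// Hypothetical run-time initializer (not constexpr)
inline int compute() { return 3; }

// Function-local static: initialized the first time control passes through
// its declaration, so every caller observes an initialized value
inline int& global_x() {
    static int value = compute();
    return value;
}

// Safe regardless of the initialization order of other translation units,
// because the call forces the initialization of 'value'
inline int global_y = global_x();
```

The trade-off is a function call (and a thread-safe initialization check) on each access instead of a plain variable read.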

24/54

Linkage Summary

Linkage Summary 1/2

No Linkage: Local variables, functions, classes

•

static local variable address depends on the linkage of its function

Internal Linkage:

(not accessible by other translation units, no conﬂicts, diﬀerent memory addresses)

• Global Variables:

◦

static

◦ non-inline, non-template, non-specialized, non-extern

const / constexpr

• Functions:

static

• Anonymous namespace content, even structures/classes

25/54

Linkage Summary 2/2

External Linkage:

(accessible by other translation units, potential conﬂicts, same memory address)

• Global Variables:

◦ no specifier, or

extern

◦

template/specialized C++14 (no conﬂicts for template , see ODR)

◦ inline const / constexpr C++17 (no conﬂicts, see ODR)

• Functions:

◦ no specifier (no conﬂicts with

inline , see ODR), or extern

◦ template/specialized (no conﬂicts for template , see ODR)

Note: inline , constexpr (which implies inline for functions) functions are not

accessible by other translation units even with external linkage

• Enumerators, Classes and their static, non-static members

26/54

Dealing with

Multiple Translation

Units

Code Structure 1

• one header, two source ﬁles → two translation units

• the header is included in both translation units

27/54

Code Structure 2

• two headers, two source ﬁles → two translation units

• one header for declarations (.hpp), and the other one for implementations

(.i.hpp)

• the header and the header implementation are included in both translation units

* separate header declaration and implementation is not mandatory, but it could help to better

organize the code

28/54

Class in Multiple Translation Units 1/2

header.hpp:

class A {

public:

void f();

static void g();

int x = 1; // default member initializer (C++11)

static int y; // declaration, defined in source.cpp

};

main.cpp:

# include "header.hpp"

# include <iostream>

int main() {

A a;

std::cout << a.x; // print 1

std::cout << A::y; // print 2

}

source.cpp:

# include "header.hpp"

void A::f() {}

void A::g() {}

int A::y = 2;

// int A::x = 1; // non-static data member

// cannot be defined out-of-line

29/54

Class in Multiple Translation Units 2/2

header.hpp:

struct A {

static int y1; // declaration (defined in source.cpp)

// static int y2 = 3; // compile error

// must be initialized out-of-class

inline static int y3 = 4; // inline initialization (C++17)

const int z = 3; // C++11 and later

// const int z; // compile error

// must be initialized

static const int w1; // declaration (defined in source.cpp)

static const int w2 = 4; // inline-init

};

source.cpp:

# include "header.hpp"

int A::y1 = 2;

const int A::w1 = 3;

30/54

One Deﬁnition Rule

(ODR)

One Deﬁnition Rule (ODR)

(1) In any (single) translation unit, a template, type, function, or object, cannot

have more than one deﬁnition

- Compiler error otherwise

- Any number of declarations are allowed

(2) In the entire program, an object or non-inline function cannot have more

than one deﬁnition

- Multiple deﬁnitions linking error otherwise

- Entities with internal linkage in diﬀerent translation units are allowed, even if their

names and types are the same

(3) A template, type, or inline functions/variables, can be deﬁned in

more than

one translation unit. For a given entity, each deﬁnition must be the same

- Undeﬁned behavior otherwise

- Common case: same header included in multiple translation units

31/54

ODR - Point (1), (2)

header.hpp:

void f(); // DECLARATION

main.cpp:

# include "header.hpp"

# include <iostream>

int a = 1; // external linkage

// int a = 7; // compiler error, Point (1)

extern int b;

static int c = 2; // internal linkage

int main() {

std::cout << a; // print 1

std::cout << b; // print 5

std::cout << c; // print 2

f();

}

source.cpp:

# include "header.hpp"

# include <iostream>

// linking error, multiple definitions

// int a = 2; // Point (2)

int b = 5; // ok

// internal linkage

static int c = 4; // ok

void f() { // DEFINITION

// std::cout << a; // 'a' is not visible

std::cout << b; // print 5

std::cout << c; // print 4

}

32/54

Global Variable Issues - ODR Point (2)

header.hpp:

# include <iostream>

struct A {

A() { std::cout << "A()"; }

~A() { std::cout << "~A()"; }

};

// A obj; // linking error multiple definitions, Point (2)

const A const_obj{}; // "const/constexpr" implies internal linkage

constexpr float PI = 3.14f;

source1.cpp:

# include "header.hpp"

void f() { std::cout << &PI; }

// address: 0x1234ABCD

// print "A()" the first time

// print "~A()" the first time

source2.cpp:

# include "header.hpp"

void g() { std::cout << &PI; } // distinct name: two definitions of f() would violate Point (2)

// print address: 0x3820FDAC !!

// print "A()" the second time!!

// print "~A()" the second time!!

33/54

Common Class Error - ODR Point (2)

header.hpp:

struct A {

void f() {}; // inline DEFINITION

void g(); // DECLARATION

void h(); // DECLARATION

};

void A::g() {} // DEFINITION

main.cpp:

# include "header.hpp"

// linking error

// multiple definitions of A::g()

int main() {}

source.cpp:

# include "header.hpp"

// linking error

// multiple definitions of A::g()

void A::h() {} // DEFINITION, ok

34/54

ODR - Point (3)

ODR Point (3): A template, type, or inline functions/variables, can be

deﬁned in

more than one translation unit

• The linker removes all deﬁnitions of an inline / template entity except one

• All deﬁnitions must be identical to avoid undeﬁned behavior due to arbitrary

linking order

•

inline / template entities have a unique memory address across all translation

units

•

inline / template entities have the same linkage as the corresponding

variables/functions without the speciﬁer

35/54

inline Functions/Variables 1/2

inline

inline speciﬁer allows a function or a variable (in C++17) to be identically

deﬁned (not only declared) in multiple translation units

•

inline is one of the most misunderstood features of C++

•

inline is a hint for the linker. Without it, the linker can emit “multiple

definitions” error

•

inline entities cannot be exported, namely, used by other translation units even

if they have external linkage (related warning: -Wundefined-inline )

•

inline doesn’t mean that the compiler is forced to perform function inlining. It

just increases the optimization heuristic threshold

36/54

inline Functions/Variables 2/2

void f() {}

inline void g() {}

f() :

• Cannot be deﬁned in a header included in multiple source ﬁles

• The linker issues a “multiple deﬁnitions” error

g() :

• Can be deﬁned in a header and included in multiple source ﬁles

37/54

constexpr and inline

constexpr functions are implicitly inline

constexpr variables are not implicitly inline . C++17 added inline variables

void f1() {} // external linkage

// potential multiple definitions error

constexpr void f2() {} // external linkage, implicitly inline

// multiple definitions allowed

constexpr int x = 3; // internal linkage

// different files allows distinct definitions

// -> different addresses, code bloat

inline constexpr int y = 3; // external linkage unique memory address

// -> potential undefined behavior

int main() {}

38/54

One Deﬁnition Rule - Point (3) 1/2

header.hpp:

inline void f() {} // the function is marked 'inline' (no linking error)

inline int v = 3; // the variable is marked 'inline' (no linking error) (C++17)

template<typename T>

void g(T x) {} // the function is a template (no linking error)

using var_t = int; // types can be defined multiple times (no linking error)

main.cpp:

# include "header.hpp"

int main() {

f();

g(3); // g<int> generated

}

source.cpp:

# include "header.hpp"

void h() {

f();

g(5); // g<int> generated

}

39/54

One Deﬁnition Rule - Point (3) 2/2

Alternative organization:

header.hpp:

inline void f(); // DECLARATION

extern inline int v; // DECLARATION ('inline int v;' alone would be a definition)

template<typename T>

void g(T x); // DECLARATION

using var_t = int; // type

# include "header.i.hpp"

header.i.hpp:

void f() {} // DEFINITION

inline int v = 3; // DEFINITION

template<typename T>

void g(T x) {} // DEFINITION

main.cpp:

# include "header.hpp"

int main() {

f();

g(3); // g<int> generated

}

source.cpp:

# include "header.hpp"

void h() {

f();

g(5); // g<int> generated

}

40/54

ODR - Function

Template

Function Template - Case 1

header.hpp:

template<typename T>

void f(T x) {}; // DECLARATION and DEFINITION

main.cpp:

# include "header.hpp"

int main() {

f(3); // call f<int>()

f(3.3f); // call f<float>()

f('a'); // call f<char>()

}

source.cpp:

# include "header.hpp"

void h() {

f(3); // call f<int>()

f(3.3f); // call f<float>()

f('a'); // call f<char>()

}

f<int>() , f<float>() , f<char>() are generated two times (in both translation units)

41/54

Function Template - Case 2

header.hpp:

template<typename T>

void f(T x); // DECLARATION

main.cpp:

# include "header.hpp"

int main() {

f(3); // call f<int>()

f(3.3f); // call f<float>()

// f('a'); // linking error

} // the specialization does not exist

source.cpp:

# include "header.hpp"

template<typename T>

void f(T x) {} // DEFINITION

// template INSTANTIATION

template void f<int>(int);

template void f<float>(float);

// any explicit instance is also

// fine, e.g. f<int>(3)

42/54

Function Template and Specialization

header.hpp:

template<typename T>

void f() {} // DECLARATION and DEFINITION

main.cpp:

# include "header.hpp"

int main() {

f<char>(); // use the generic function

f<int>(); // use the specialization

}

source.cpp:

# include "header.hpp"

template<>

void f<int>() {} // SPECIALIZATION

// DEFINITION

43/54

Function Template - extern Keyword

C++11

header.hpp:

template<typename T>

void f() {} // DECLARATION and DEFINITION

main.cpp:

# include "header.hpp"

extern template void f<int>();

// f<int>() is not generated by the

// compiler in this translation unit

int main() {

f<int>();

}

source.cpp:

# include "header.hpp"

void g() {

f<int>();

}

// or 'template void f<int>();'

44/54

ODR Function Template Common Error

header.hpp:

template<typename T>

void f(); // DECLARATION

// template<> // linking error

// void f<int>() {} // multiple definitions -> included twice

// full specializations are like standard functions

// it can be solved by adding "inline"

main.cpp:

# include "header.hpp"

int main() {}

source.cpp:

# include "header.hpp"

// some code

45/54

ODR - Class

Template

Class Template - Case 1

header.hpp:

template<typename T>

struct A {

T x = 3; // "inline" DEFINITION

static inline T y = 3; // "inline" DEFINITION (C++17)

void f() {}; // "inline" DEFINITION

};

main.cpp:

# include "header.hpp"

int main() {

A<int> a1; // ok

A<float> a2; // ok

A<char> a3; // ok

}

source.cpp:

# include "header.hpp"

void g() {

A<int> a1; // ok

A<float> a2; // ok

A<char> a3; // ok

}

46/54

Class Template - Case 2

header.hpp:

template<typename T>

struct A {

static T x;

void f(); // DECLARATION

};

# include "header.i.hpp"

header.i.hpp:

template<typename T>

T A<T>::x = 3; // DEFINITION

template<typename T>

void A<T>::f() {} // DEFINITION

main.cpp:

# include "header.hpp"

int main() {

A<int> a1; // ok

A<float> a2; // ok

A<char> a3; // ok

}

source.cpp:

# include "header.hpp"

void g() {

A<int> a1; // ok

A<float> a2; // ok

A<char> a3; // ok

}

47/54

Class Template - Case 3

header.hpp:

template<typename T>

struct A {

static T x;

void f(); // DECLARATION

};

main.cpp:

# include "header.hpp"

int main() {

A<int> a1; // ok

// A<char> a2; // linking error

} // 'f()' is undefined

// while 'x' has an undefined

// value for A<char>

source.cpp:

# include "header.hpp"

template<typename T>

T A<T>::x = 3; // initialization

template<typename T>

void A<T>::f() {} // DEFINITION

// template INSTANTIATION

template class A<int>;

48/54

Class Template - extern Keyword

C++11

header.hpp:

template<typename T>

struct A {

T x;

void f() {}

};

main.cpp:

# include "header.hpp"

extern template class A<int>;

// A<int> is not generated by the

// compiler in this translation unit

int main() {

A<int> a;

}

source.cpp:

# include "header.hpp"

// template INSTANTIATION

template class A<int>;

// or any instantiation of A<int>

49/54

ODR Undeﬁned

Behavior and

Summary

Undeﬁned Behavior - inline Function

main.cpp:

# include <iostream>

inline int f() { return 3; }

int g();

int main() {

std::cout << f(); // print 3

std::cout << g(); // print 3!!

} // not 5

source.cpp:

// same signature and inline

inline int f() { return 5; }

int g() { return f(); }

The linker can arbitrarily choose one of the two definitions of f() . With -O3 , the

compiler could inline f() in g() , so now g() returns 5

This issue is easy to detect in trivial examples but hard to find in a large codebase

Solution: static or anonymous namespace

50/54

Undeﬁned Behavior - Member Function

header.hpp:

# include <iostream>

struct A {

int f() { return 3; }

};

int g();

main.cpp:

# include "header.hpp"

int main() {

A a;

std::cout << a.f(); // print 3

std::cout << g(); // print 3!!

}

source.cpp:

struct A {

int f() { return 5; }

};

int g() {

A a;

return a.f();

}

51/54

Undeﬁned Behavior - Function Template

header.hpp:

template<typename T>

int f() {

return 3;

}

int g();

main.cpp:

# include "header.hpp"

int main() {

std::cout << f<int>(); // print 3

std::cout << g(); // print 3!!

}

source.cpp:

template<typename T>

int f() {

return 5;

}

int g() {

return f<int>();

}

52/54

Undeﬁned Behavior

Other ODR violations are even harder (if not impossible) to ﬁnd, see Diagnosing

Hidden ODR Violations in Visual C++

Some tools for partially detecting ODR violations:

•

-detect-odr-violations ﬂag for gold/llvm linker

•

-Wodr -flto ﬂag for GCC

• Clang address sanitizer + ASAN_OPTIONS=detect_odr_violation=2

(link)

Another solution could be to include all files in a single translation unit

53/54

ODR - Declarations and Deﬁnitions Summary

• Header: declaration of

- functions, structures, classes, types, alias

- template functions, structs, classes

- extern variables, functions

• Header (implementation): definition of

- inline variables/functions

- template variables/functions/classes

- global static, non-static const/constexpr variables and constexpr functions

• Source file: definition of

- functions, including template full specializations

- classes

- extern and static global variables/functions

54/54

Modern C++

Programming

13. Translation Units II

Include, Module, and Compilation

Federico Busato

2025-04-14

Table of Contents

1 #include Issues

Include Guard

Forward Declaration

Circular Dependencies

Common Linking Errors

1/47

Table of Contents

2 C++20 Modules

Overview

Terminology

Visibility and Reachability

Module Unit Types

Keywords

Global Module Fragment

Private Module Fragment

Header Module Unit

Module Partitions

2/47

Table of Contents

3 Compiling Multiple Translation Units

Fundamental Compiler Flags

Compile Methods

3/47

Table of Contents

4 Libraries in C++

Static Library

Building Static Libraries

Using Static Libraries

Dynamic Library

Building Dynamic Libraries

Using Dynamic Libraries

Application Binary Interface (ABI)

Demangling

Find Dynamic Library Dependencies

Analyze Object/Executable Symbols

4/47

#include Issues

Include Guard 1/3

The include guard avoids the problem of multiple inclusions of a header ﬁle in a

translation unit

header.hpp:

# ifndef HEADER_HPP // include guard

# define HEADER_HPP

... many lines of code ...

# endif // HEADER_HPP

The #pragma once preprocessor directive is an alternative to the include guard that forces the current file to be included only once in a translation unit

• #pragma once is non-standard, and therefore less portable, but it is less verbose and may compile faster than the include guard

The include guard/#pragma once should be used in every header ﬁle

5/47

Include Guard 2/3

Common case:

6/47

Include Guard 3/3

header_A.hpp:

# pragma once // prevent the "redefinition of struct A" compile error

struct A {

};

header_B.hpp:

# include "header_A.hpp" // included here

struct B {

A a;

};

main.cpp:

# include "header_A.hpp" // .. and included here

# include "header_B.hpp"

int main() {

A a; // ok, here we need "header_A.hpp"

B b; // ok, here we need "header_B.hpp"

}

7/47

Forward Declaration

A forward declaration is a declaration of an identifier for which a complete definition has not yet been given. "Forward" means that an entity is declared before it is defined

void f();  // function forward declaration
class A;   // class forward declaration

int main() {
    f();     // ok, f() is defined in the translation unit
    // A a;  // compiler error: no definition (incomplete type)
             // e.g. the compiler is not able to deduce the size of A
    A* a;    // ok
}

void f() {}  // definition of f()
class A {};  // definition of A

8/47

Forward Declaration vs. #include

Advantages:

• Forward declarations can save compile time, as #include forces the compiler to open more files and process more input

• Forward declarations can save on unnecessary recompilation. #include can force your code to be recompiled more often, due to unrelated changes in the header

Disadvantages:

• Forward declarations can hide a dependency, allowing user code to skip necessary recompilation when headers change

• A forward declaration may be broken by subsequent changes to the library

• Forward declaring multiple symbols from a header can be more verbose than simply #including the header

google.github.io/styleguide/cppguide.html#Forward_Declarations

9/47

Circular Dependencies 1/3

A circular dependency is a relation between two or more modules which either

directly or indirectly depend on each other to function properly

Circular dependencies can be solved by using forward declaration, or better, by

rethinking the project organization

10/47

Circular Dependencies 2/3

header_A.hpp:

# pragma once // first include

# include "header_B.hpp"

class A {

B* b;

};

header_B.hpp:

# pragma once // second include

# include "header_C.hpp"

class B {

C* c;

};

header_C.hpp:

# pragma once // third include

# include "header_A.hpp"

class C { // compile error "header_A.hpp": already included by "main.cpp"

A* a; // the compiler does not know the meaning of "A"

};

11/47

Circular Dependencies (ﬁx) 3/3

header_A.hpp:

# pragma once

class B; // forward declaration

// note: does not include "header_B.hpp"

class A {

B* b;

};

header_B.hpp:

# pragma once

class C; // forward declaration

class B {

C* c;

};

header_C.hpp:

# pragma once

class A; // forward declaration

class C {

A* a;

};

12/47

Common Linking Errors

Very common linking errors:

• undefined reference
  Solutions:
  - Check if the right headers and sources are included
  - Break circular dependencies (could be hard to find)

• multiple definitions
  Solutions:
  - Use inline function and variable definitions, or extern declarations
  - Add include guard/#pragma once to header files
  - Place template definitions in header files and full specializations in source files
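The "multiple definitions" solutions above can be sketched as a small header. The file name and all identifiers (counter.hpp, call_count, next_id, global_limit) are hypothetical and only illustrate the pattern; the whole example is shown as one file for brevity.

```cpp
// counter.hpp (hypothetical header)
#ifndef EXAMPLE_COUNTER_HPP   // include guard
#define EXAMPLE_COUNTER_HPP

// C++17 inline variable: a single instance shared by all translation units
inline int call_count = 0;

// inline function: can be defined in every translation unit that includes
// this header without a "multiple definitions" error at link time
inline int next_id() { return ++call_count; }

// extern declaration: no storage is allocated by the header
extern int global_limit;

#endif // EXAMPLE_COUNTER_HPP

// counter.cpp (hypothetical source file): the one and only definition
int global_limit = 10;
```

Without `inline`, including the header from two source files would define `call_count` and `next_id()` twice, and the linker would reject the program.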

13/47

C++20 Modules

C++20 Modules 1/2

The #include problem: duplication of work. The same header files are possibly parsed/compiled multiple times, and most of the compiled output is later thrown away by the linker

C++20 introduces modules as a robust replacement for plain #include

Module (C++20)

A module is a set of source code ﬁles that are compiled independently of the

translation units that import them

Modules allow deﬁning clearer interfaces with a ﬁne-grained control on what to

import and export (similar to Java, Python, Rust, etc.)

• A Practical Introduction to C++20’s Modules

• Modules the beginner’s guide

• Understanding C++ Modules

• Overview of modules in C++

14/47

C++20 Modules 2/2

Less error-prone than #include :

• No eﬀect on the compilation of the translation unit that imports the module

• Macros, preprocessor directives, and non-exported names declared in a module are

not visible outside the module

• Declarations in the importing translation unit do not participate in overload

resolution or name lookup in the imported module

Other beneﬁts:

• (Much) Faster compile time. After a module is compiled once, the results are

stored in a binary ﬁle that describes all the exported types, functions, and

templates

• Smaller binary size. Only the imported code is incorporated, instead of the whole content of an #include

15/47

Terminology

A module consists of one or more module units

A module unit is a translation unit that contains a module declaration

module my.module.example;

A module name is a concatenation of identiﬁers joined by dots (the dot carries no

meaning)

my.module.example

A module unit purview is the content of the translation unit, starting from the module declaration

A module purview is the set of purviews of a given module name

16/47

Visibility and Reachability

Visibility of names determines whether a symbol can be used by another translation unit. Visible also means a candidate for name lookup

Reachability of declarations means that the semantic properties of an entity are available

• Each visible declaration is also reachable

• Not all reachable declarations are also visible

17/47

Reachability Example

Common example: the members of a class are reachable (i.e. can be used) or the class

size is known, but not the class type itself

auto g() {

struct A {

void f() {}

};

return A{};

}

//---------------------------------------------------------------------------------

auto x = g();          // ok
// A y = g();          // compile error, "A" is unknown at this point
x.f();                 // ok
sizeof(x);             // ok
using T = decltype(x); // ok

18/47

Module Unit Types

• A module interface unit is a module unit that exports a symbol and/or module name or module partition name

• A primary module interface unit is a module interface unit that exports the module name. There must be one and only one primary module interface unit in a module

• A module implementation unit is a module unit that does not export a module name or module partition name

A module interface unit should contain only declarations if one or more module implementation units are present. A module implementation unit implements/defines the declarations of module interface units

19/47

Keywords

module speciﬁes that the ﬁle is a named module

module my.module; // first code line

import makes a module and its symbols visible in the current ﬁle

import my.module; // after module declaration and #include

export makes symbols visible to the ﬁles that import the current module

• export module <module_name> makes visible all the exported symbols of a module. It must appear once per module in the primary module interface unit

• export namespace <namespace> makes visible all symbols in a namespace

• export <entity> makes visible a specific function, class, or variable

• export {<code>} makes visible all symbols in a block

20/47

import Example

# include <iostream>

int main() {

std::cout << "Hello World";

}

Preprocessing size (-E): ~1MB

import <iostream>;

int main() {
    std::cout << "Hello World";
}

Preprocessing size: 236B (x500)

Compile time: 2x (up to 10x) less

g++-12 -std=c++20 -fmodules-ts main.cpp -x c++-system-header iostream

21/47

export Example - Single Primary Module Interface Unit

my_module.cpp

export module my.example; // make visible all module symbols

export int f1() { return 3; } // export function

export namespace my_ns { // export namespace and its content

int f2() { return 5; }

}

export { // export code block

int f3() { return 2; }

int f4() { return 8; }

}

void internal() {} // NOT exported. It can be used only internally

22/47

export Example - Two Module Interface Units

my_module1.cpp Primary Module Interface Unit

export module my.example;      // This is the only file that exports the module name
export import :part2;          // re-export the symbols of the second interface unit

export int f1() { return 3; }  // export function

my_module2.cpp Module Interface Unit (module partition, see Module Partitions)

export module my.example:part2; // 'export' is allowed only in module interface units

export namespace my_ns {        // export namespace
    int f2() { return 5; }
}

export {                        // export code block
    int f3() { return 2; }
    int f4() { return 8; }
}

23/47

export Example - Module Interface and Implementation Units

my_module1.cpp Primary Module Interface Unit

export module my.example; // This is the only file that exports all module symbols

export int f1(); // export function

export { // export code block

int f3();

int f4();

}

my_module2.cpp Module Implementation Unit

module my.example; // Module declaration but symbols are not exported

int f1() { return 3; }

int f3() { return 2; }

int f4() { return 8; }

24/47

Keyword Notes

import

• A module implementation unit can import another module, but cannot export any names. Symbols of the module interface unit are imported implicitly

• All import declarations must appear before any other declaration in that module unit, and after module; and export module (if present)

export

• Symbols with internal linkage or no linkage cannot be exported, e.g. entities in anonymous namespaces and static entities

• The export keyword is used in module interface units only

• The semantic properties associated to exported symbols become reachable

25/47

export import Declaration

Imported modules can be directly re-exported

export module main_module; // Top-level primary module interface unit

export import sub_module; // import and re-export "sub_module"

export module sub_module; // Primary module interface unit

export void f() {}

import main_module;

int main() {

f();

// ok, f() is visible

}

26/47

Global Module Fragment

A global module fragment (unnamed module) can be used to include header ﬁles in

a module interface when importing them is not possible or preprocessing directives are

needed

module;                   // start Global Module Fragment

#define ENABLE_FAST_MATH
#include "my_math.h"

export module my.module;  // end Global Module Fragment

Macro deﬁnitions or other preprocessing directives are not visible outside the ﬁle itself

27/47

Private Module Fragment

A private module fragment allows a module to be represented as a single translation

unit without making all the contents of the module reachable to importers

→ A modification of the private module fragment does not cause recompilation of the importers

If a module unit contains a private module fragment, it will be the only module unit of its module

export module my.example;

export int f();

module :private; // start private module fragment

int f() {        // definition not reachable from importers of f()
    return 42;
}

28/47

Header Module Unit

Legacy headers can be directly imported with import instead of #include

All declarations are implicitly exported and attached to the global module (fragment)

• Macros from the header are available for the importer, but macros defined in the importer have no effect on the imported header

• Importing compiled declarations is faster than #include

C++23 introduces modules for the standard library (import std; and import std.compat;)

29/47

Module Partitions 1/2

A module can be organized in isolated module partitions

Syntax:

export module module_name : partition_name;

• Declarations in any of the partitions are visible within the entire module

• Like common modules, a module partition consists of one module partition interface unit and zero or more module partition implementation units

• Module partitions are not visible outside the module

• Module partitions do not implicitly import the module interface

• All names exported by partition interface files must be imported and re-exported by the primary module interface file

30/47

Module Partitions 2/2

main_module.ixx

export module main_module;

export import :partition1; // re-export f() to importers of "main_module"

export import :partition2; // re-export g() to importers of "main_module"

export void h() { internal(); } // internal() can be directly used

partition1.ixx

export module main_module:partition1;

export void f() {}

partition2.ixx

export module main_module:partition2;

export void g() {}

void internal() {} // not exported

31/47

Compiling Multiple

Translation Units

Fundamental Compiler Flags

Include ﬂag: g++ -I include/ main.cpp -o main.x

• -I : Specify the include path for the project headers

• -isystem : Specify the include path for system (external) headers (warnings are not emitted)

They can be used multiple times

Important: include and library compiler flags, as well as multiple values in an environment variable, are evaluated in order from left to right. The first match suppresses the other ones

Compile to an object file: g++ -c source.cpp -o source.o

32/47

Compile Methods

Method 1

Compile all files together (naive):

g++ main.cpp source.cpp -o main.out

Method 2

Compile each translation unit into an object file:

g++ -c source.cpp -o source.o

g++ -c main.cpp -o main.o

Multiple object files can be compiled in parallel

Link all object files:

g++ main.o source.o -o main.out

33/47

Libraries in C++

Static Library

A static library is a set of object files (just the concatenation) that are directly linked into the final executable. If a program is compiled with a static library, all the functionality of the static library becomes part of the final executable

– A static library cannot be modified without re-linking the final executable

– Increases the size of the final executable

+ The linker can optimize the final executable (link time optimization)

Given the static library my_lib , the corresponding ﬁle is:

Linux libmy_lib.a

Windows my_lib.lib

34/47

Building Static Libraries

Steps to build a static library

• Compile object ﬁles for each translation unit (.cpp)

• Create the static library by using the archiver (ar) Linux utility

g++ -c source1.cpp -o source1.o

g++ -c source2.cpp -o source2.o

ar rvs libmystaticlib.a source1.o source2.o

35/47

Using Static Libraries

A static library has to be linked to the final executable:

Linux    g++ main.cpp -llibrary -o main

Windows  cl main.cpp <path_to_library>/library.lib /link /OUT:main.exe

The directories where to search for static libraries at compile-time are specified with environment variables:

Linux    LIBRARY_PATH  Search for .a files

Windows  LIB           Search for .lib files

It is also possible to specify additional library paths with compiler flags:

Linux    g++ -L<library_path> main.cpp -o main

Windows  cl main.cpp /link /LIBPATH:<library_path> /OUT:main.exe

36/47

Dynamic Library

A dynamic library, also called a shared library, consists of routines that are loaded into the application at run-time. If a program is compiled with a dynamic library, the library does not become part of the final executable. It remains a separate unit

+ A dynamic library can be modified without re-linking: bug fixing, new functionalities

– Dynamic library functions are called outside the executable. Neither the linker nor the compiler can optimize the code between shared libraries and the final executable

• The environment variables must be set to the right shared library path, otherwise the application fails at startup

Given the shared library my_lib , the corresponding file is:

Linux    libmy_lib.so

Windows  my_lib.dll + my_lib.lib

37/47

Building Dynamic Libraries

Steps to build a dynamic library

• Compile object files for each translation unit (.cpp). Since a library cannot store code at fixed addresses, the compiler must generate position independent code (-fPIC)

• Create the dynamic library

g++ -c source1.cpp -o source1.o -fPIC

g++ -c source2.cpp -o source2.o -fPIC

g++ source1.o source2.o -shared -o libmydynamiclib.so

38/47

Using Dynamic Libraries 1/2

Dynamic libraries need to be available when the program executes (run-time). The dynamic loader searches for them in system-specific locations, including the paths specified in the following environment variables:

Linux    Search for .so files

• LD_LIBRARY_PATH environment variable

• /lib64 and /usr/lib64 system directories

• RPATH and RUNPATH fields with custom values embedded in the executable

• /etc/ld.so.cache cache of library locations created by the ldconfig command. Can be inspected with ldconfig -p

39/47

Using Dynamic Libraries 2/2

Windows  Search for .dll files

• PATH environment variable

• Executable directory and current working directory

• %SystemRoot%\System32 , %SystemRoot% system directories

• HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\Session Manager\KnownDLLs list of known DLLs

40/47

Application Binary Interface (ABI)

An Application Binary Interface (ABI) deﬁnes the low-level details of how programs

composed of separately compiled modules work together. An ABI speciﬁes how

functions are called and how data is exchanged.

A stable ABI is essential to update the program's shared libraries without recompiling all the code

Some examples of ABI-breaking changes are changing the type or order of members

within a struct , modifying the return type or parameters of a function, or adding a

virtual function to a class that previously did not have one
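A minimal sketch of why such changes break the ABI. All struct names below are hypothetical; the point is that member offsets and object sizes are baked into code compiled against the old headers, so a binary built against version 1 would read the wrong bytes from a library built with version 2.

```cpp
#include <cstddef>  // offsetof, std::size_t

// Version 1 of a hypothetical library struct
struct PointV1 {
    int    id;
    double value;
};

// Version 2 swaps the member order: same source-level API, different layout
struct PointV2 {
    double value;
    int    id;
};

// Offsets that a client compiled against each version would use
constexpr std::size_t id_offset_v1 = offsetof(PointV1, id);
constexpr std::size_t id_offset_v2 = offsetof(PointV2, id);

static_assert(id_offset_v1 != id_offset_v2,
              "reordering members changes the member offsets");

// Adding the first virtual function typically adds a hidden vtable pointer,
// which changes the object size and the offsets of the existing members
struct Plain       { int x; };
struct WithVirtual { int x; virtual void f() {} };
```

The exact sizes are implementation-specific, which is why no concrete values are stated here; on common ABIs `sizeof(WithVirtual)` is larger than `sizeof(Plain)`.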

An ABI can also be checked across different shared library/header versions with specific tools, such as ABI Compliance Checker

41/47

Demangling

Name mangling is a technique that encodes additional information (e.g. namespaces, parameter types, template arguments) into symbol names, so that each entity gets a unique name for the linker

Transforming C++ ABI (Application Binary Interface) identifiers into the original source identifiers is called demangling

Example (linking error):

_ZNSt13basic_filebufIcSt11char_traitsIcEED1Ev

After demangling:

std::basic_filebuf<char, std::char_traits<char> >::~basic_filebuf()

How to demangle: echo <name> | c++filt

Online Demangler: https://demangler.com
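Demangling can also be done programmatically. The sketch below assumes a GCC or Clang toolchain, which provides abi::__cxa_demangle in <cxxabi.h>; the wrapper name demangle() is hypothetical and not part of any library.

```cpp
#include <cstdlib>   // std::free
#include <cxxabi.h>  // abi::__cxa_demangle (GCC/Clang, Itanium C++ ABI)
#include <string>

// Returns the demangled name, or the input unchanged if demangling fails
std::string demangle(const char* mangled) {
    int   status = 0;
    char* result = abi::__cxa_demangle(mangled, nullptr, nullptr, &status);
    if (status != 0 || result == nullptr)
        return mangled;
    std::string name(result);
    std::free(result);  // the returned buffer is allocated with malloc
    return name;
}
```

For instance, `demangle("_Z1gv")` yields `g()`. This function is not available with MSVC, which uses a different mangling scheme.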

42/47

Find Dynamic Library Dependencies

The ldd utility shows the shared objects (shared libraries) required by a program or

other shared objects

$ ldd /bin/ls
    linux-vdso.so.1 (0x00007ffcc3563000)
    libselinux.so.1 => /lib64/libselinux.so.1 (0x00007f87e5459000)
    libcap.so.2 => /lib64/libcap.so.2 (0x00007f87e5254000)
    libc.so.6 => /lib64/libc.so.6 (0x00007f87e4e92000)
    libpcre.so.1 => /lib64/libpcre.so.1 (0x00007f87e4c22000)
    libdl.so.2 => /lib64/libdl.so.2 (0x00007f87e4a1e000)
    /lib64/ld-linux-x86-64.so.2 (0x00005574bf12e000)
    libattr.so.1 => /lib64/libattr.so.1 (0x00007f87e4817000)
    libpthread.so.0 => /lib64/libpthread.so.0 (0x00007f87e45fa000)

Alternatively,

LD_DEBUG=libs can be used to print search and load paths of shared

libraries at runtime

43/47

Find Object/Executable Symbols

⋆

1/3

The nm utility provides information on the symbols being used in an object ﬁle or

executable ﬁle

$ nm -D -C something.so

w __gmon_start__

D __libc_start_main

D free

D malloc

D printf

# -C: Decode (demangle) low-level symbol names

# -D: Display the dynamic symbols (e.g. of a shared library)

44/47

Find Object/Executable Symbols

⋆

2/3

readelf displays information about ELF format object ﬁles

$ readelf --symbols something.so | c++filt
... OBJECT LOCAL DEFAULT  17 __frame_dummy_init_array_
... FILE   LOCAL DEFAULT ABS prog.cpp
... OBJECT LOCAL DEFAULT  14 CC1
... OBJECT LOCAL DEFAULT  14 CC2
... FUNC   LOCAL DEFAULT  12 g()

# --symbols: display symbol table

45/47

Find Object/Executable Symbols

⋆

3/3

objdump displays information about object ﬁles

$ objdump -t -C something.so | c++filt
... df *ABS*   ... prog.cpp
... O  .rodata ... CC1
... O  .rodata ... CC2
... F  .text   ... g()
... O  .rodata ... (anonymous namespace)::CC3
... O  .rodata ... (anonymous namespace)::CC4
... F  .text   ... (anonymous namespace)::h()
... F  .text   ... (anonymous namespace)::B::j1()
... F  .text   ... (anonymous namespace)::B::j2()

# -t: display symbols

# -C: Decode (demangle) low-level symbol names

46/47

References and Additional Material

• 20 ABI (Application Binary Interface) breaking changes every C++

developer should know

• Policies/Binary Compatibility Issues With C++

• 10 differences between static and dynamic libraries every C++

developer should know

47/47

Modern C++

Programming

14. Code Conventions

Part I

Federico Busato

2025-04-14

Table of Contents

1 C++ Project Organization

Project Directories

Project Files

“Common” Project Organization Notes

Alternative - “Canonical” Project Organization

2 Coding Styles and Conventions

Overview

Popular Coding Styles

1/76

Table of Contents

3 Header Files and #include

#include Guard

#include Syntax

Order of #include

Common Header/Source Filename Conventions

4 Preprocessing

Macro

Preprocessing Statements

2/76

Table of Contents

5 Variables

static Global Variables

Conversions

6 Enumerators

7 Arithmetic Types

Signed vs. Unsigned Integral Types

Integral Types Conversion

Integral Types: Size and Other Issues

Floating-Point Types

3/76

Table of Contents

8 Functions

Functions Parameters

Functions Arguments

Function Return Values

Function Speciﬁers

Lambda Expressions

4/76

Table of Contents

9 Structs and Classes

struct vs. class

Initialization

Braced Initializer Lists

Special Member Functions

=default, =delete

Other Issues

Inheritance

Style

5/76

C++ Project

Organization

“Common” Project Organization

6/76

Project Directories 1/2

Fundamental directories

include Project public header ﬁles

src Project source/implementation ﬁles and private headers

test (or tests) Source ﬁles for testing the project

Empty directories

bin Output executables

build All intermediate ﬁles

doc (or docs) Project documentation

7/76

Project Directories 2/2

Optional directories

submodules Project submodules

third_party (less often deps/external/extern) dependencies or external

libraries

data (or extras) Files used by the executables or for testing

examples Source ﬁles for showing project features

utils (or tools, or script) Scripts and utilities related to the project

cmake CMake submodules (.cmake)

8/76

Project Files

LICENSE Describes how this project can be used and distributed

README.md General information about the project in Markdown format *

CMakeLists.txt Describes how to compile the project

Doxyfile Configuration file used by doxygen to generate the documentation (see next lecture)

others .gitignore, .clang-format, .clang-tidy, etc.

* Markdown is a language with a syntax corresponding to a subset of HTML tags

github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet

9/76

Readme and License

README.md

• README template:

- Embedded Artistry README Template

- Your Project is Great, So Let’s Make Your README Great Too

LICENSE

• Choose an open source license:

choosealicense.com

• License guidelines:

Why your academic code needs a software license

10/76

File extensions

Common C++ ﬁle extensions:

• header .h .hh .hpp .hxx

• header implementation .i.h .i.hpp -inl.h .inl.hpp

(1) separate implementation from interface for inline functions and templates

(2) keep implementation “inline” in the header ﬁle

• source/implementation

.cc .cpp .cxx

11/76

“Common” Project Organization Notes

• Public header(s) in include/

• source ﬁles, private headers, header implementations in src/ directory

• The main file (if present) should be placed in src/ and called main.cpp

• Code tests, unit and functional tests can be placed in test/ . Alternatively, unit tests can appear in the same directory as the component under test, with the same filename plus the .test suffix, e.g. my_file.test.cpp

12/76

“Common” Project Organization Example

<project_name> (root)

include/

public_header.hpp

src/

private_header.hpp

templ_class.hpp

templ_class.i.hpp

(template/inline functions)

templ_class.cpp

(specialization)

subdir/

my_file.cpp

<project_name> (root)

README.md

CMakeLists.txt

Doxyfile

LICENSE

build/ (empty)

bin/ (empty)

doc/ (empty)

test/

my_test.hpp

my_test.cpp

...

13/76

“Common” Project Organization - Improvements

The "common" project organization can be improved by adding the name of the project as a subdirectory of include/

Some projects entirely avoid the include/ directory

This is particularly useful when the project is used as a submodule (part of a larger project) or imported as an external library

The includes now look like:

# include <my_project/public_header.hpp>

<project_name>

include/

<project_name>/

public_header.hpp

src/

private_file.cpp

14/76

Alternative - “Canonical” Project Organization 1/2

• Header and source files (or module interface and implementation files) are next to each other (no include/ and src/ split)

• Headers are included with <> and contain the project directory prefix, for example, <hello/hello.hpp> (no need of "" syntax)

• Header and source file extensions are .hpp / .cpp ( .mpp for module interfaces). No special characters other than _ and - in file names, with . only used for extensions

• A source file that implements a module's unit tests should be placed next to that module's files and be named with the module's name plus the .test second-level extension

• A project's functional/integration tests should go into the test/ subdirectory

15/76

Alternative - “Canonical” Project Organization 2/2

<project_name> (v1)

<project_name>/

public_header.hpp

private_header.hpp

my_file.cpp

my_file.mpp

my_file.test.cpp

test/

my_functional_test.cpp

build/

doc/

...

<project_name> (v2)

<project_name>/

public_header.hpp

private/

private_header.hpp

my_internal_file.cpp

my_internal_file.test.cpp

test/

my_functional_test.cpp

build/

doc/

...

16/76

References

• Kick-start your C++! A template for modern C++ projects

• The Pitchfork Layout

• Canonical Project Structure

17/76

Coding Styles and

Conventions

Overview

"One thing people should remember is there is what you can do in a language and what you should do"

Bjarne Stroustrup

18/76

Overview

Most important rule:

BE CONSISTENT!!

“The best code explains itself"

Google

19/76

Overview

“80% of the lifetime cost of a piece of

software goes to maintenance”

Unreal Engine

20/76

Code Quality

“The worst thing that can happen to a code base is size”

— Steve Yegge

21/76

Bad Code

What does my code look like to other people?

abstrusegoose.com/432

22/76

Coding Styles Overview

Coding styles are common guidelines to improve readability and maintainability, prevent common errors, and make the code more uniform

A consistent code base helps developers better understand code organization,

focus on program logic, and reduce the time spent interpreting other engineers’

intentions

Personal Comment: Don’t start a project that involves multiple engineers without establishing clear

guidelines that all engineers agree to. This is essential to avoid costly refactoring, personal style

discussions, and conﬂicts later on

This section, including the review of all coding styles, was updated in October 2024

23/76

Popular Coding Styles 1/3

• LLVM Coding Standards. llvm.org/docs/CodingStandards.html

• Google C++ Style Guide. google.github.io/styleguide/cppguide.html

• Webkit Coding Style. webkit.org/code-style-guidelines

• Mozilla Coding Style. firefox-source-docs.mozilla.org
  The Firefox code base adopts parts of the Google Coding style for C++ code (C++17, 2020), but not all of its rules

• Chromium Coding Style. chromium.googlesource.com
  Chromium follows the Google C++ Style Guide with some exceptions

24/76

Popular Coding Styles 2/3

• Unreal Engine - Coding Standard
  docs.unrealengine.com/en-us/Programming

• µOS++ (derived from MISRA 2018 and JSV)
  micro-os-plus.github.io/develop/coding-style
  micro-os-plus.github.io/develop/naming-conventions

More educational-oriented guidelines

• C++ Core Guidelines
  isocpp.github.io/CppCoreGuidelines/CppCoreGuidelines

25/76

Popular Coding Styles 3/3

Secure Coding

• High Integrity C++ Coding Standard. www.perforce.com/resources

• CERT C++ Secure Coding. wiki.sei.cmu.edu

Critical system coding standards

• MISRA C++17, 2023. www.misra.org.uk

• Autosar C++14, 2019 (based on MISRA:2008). www.autosar.org

• Joint Strike Fighter Air Vehicle (JSV) C++, 2005. JSF-AV-rule

26/76

Static Analysis Tools

• clang-tidy
  clang.llvm.org/extra/clang-tidy/checks/list.html

• PVS-Studio
  pvs-studio.com/en/docs/warnings

• SonarSource
  rules.sonarsource.com/cpp/

• Cppcheck
  sourceforge.net/p/cppcheck/wiki/ListOfChecks/

Note: each tool also provides the list of checks that are evaluated

27/76

Legend

※ → Important!

Highlight potential code issues such as bugs, ineﬃciency, or important

readability problems. Should not be ignored

∗ → Useful

It is not fundamental, but it emphasizes good practices and can help to prevent

bugs. Should be followed if possible

• → Minor / Obvious

Style choice, not very common issue, or hard to enforce

28/76

Header Files and

#include

Header Files 1/2

※ Every include must be self-contained
  - include every header you need directly
  - do not rely on recursive #include
  - the project must compile with any include order
  LLVM, Google, Unreal, µOS, CoreCpp

∗ Include as little as possible, especially in header files
  - do not include unneeded headers
  - minimize dependencies
  - minimize code in headers (e.g. use forward declarations)
  LLVM, Google, Chromium, Unreal, Hic, µOS, Mozilla, Clang-Tidy, CoreCpp

∗ Every source file should have an associated header file
  Google, CoreCpp

29/76

Header Files 2/2

∗ #include preprocessor directives should be placed immediately after the header comment and include guard
  LLVM, µOS, CoreCpp

∗ Use C++ headers instead of C headers. C++ headers define additional functions and their symbols are in the std namespace
  Hic

<cassert> instead of <assert.h>

<cmath> instead of <math.h>, etc.
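A small sketch of why the C++ headers are preferred. The function name abs_difference() is hypothetical; the relevant detail is that <cmath> provides std::abs overloads for floating-point types, whereas the C function abs() from <stdlib.h> only takes an int.

```cpp
#include <cmath>    // std::abs(double), std::fabs (instead of <math.h>)
#include <cstdlib>  // std::abs(int)                (instead of <stdlib.h>)

// std::abs selects the floating-point overload from <cmath>;
// the C function ::abs(int) would truncate the argument to an integer
double abs_difference(double a, double b) {
    return std::abs(a - b);
}

// The same name also works for integers, through the overload in <cstdlib>
int abs_difference(int a, int b) {
    return std::abs(a - b);
}
```

Qualifying the calls with `std::` makes the overload selection explicit and avoids silently picking up the C version.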

30/76

#include Guard

※ Always use an include guard
  LLVM, Google, Chromium, Unreal, CoreCpp

• macro include guard vs. #pragma once
  - Use macro include guard if portability is a very strong requirement
    LLVM, Google, Chromium, CoreCpp, Mozilla, Hic
  - #pragma once otherwise
    Webkit, Unreal

※ Ensure a unique name for the include guard, e.g. project_name + path
  Google

31/76

#include Syntax

"" syntax

∗ Should be absolute paths from the project include root

..........

Google,

...........

Mozilla,

.....

Hic

e.g. #include "directory1/header.hpp"

<> syntax

• Any external code

..........

Webkit

• Only where strictly required

..........

Google,

....

Hic,

...........

Mozilla,

............

CoreCpp

C/C++ standard library headers #include <iostream>

POSIX/Linux/Windows system headers (e.g. <unistd.h> and <windows.h>)

32/76

Order of #include

........

LLVM,

..........

Webkit,

...........

Mozilla,

............

CoreCpp

(1) Main module/interface header, if exists (it is only one)

• space

(2) Current project includes

• space

(3) Third party includes

• space

(4) System includes

Motivation: System/third party includes are self-contained, local includes might not

..........

Google: (4) → (3) → (2)

Note: headers within each section are lexicographically ordered

..........

Google,

..........

Webkit

33/76

#include - Other Issues

• Report at least one function used for each include. It helps to identify unused

headers

<iostream> // std::cout, std::cin

• Forward declarations vs. #includes

• Prefer forward declaration: reduce compile time, less dependency

..............

Chromium

• Prefer #include : safer

..........

Google

34/76

Common Header/Source Filename Conventions

• .h .c .cc

...........

Google,

......

µOS(.h)

•

.hh .cc (rare)

•

.hpp .cpp

.....

µOS(.cpp)

•

.hxx .cxx (rare)

35/76

Example

// [ LICENSE ]

# ifndef PROJECT_A_MY_HEADER

# define PROJECT_A_MY_HEADER

# include "my_class.hpp" // MyClass

[ blank line ]

# include "my_dir/my_headerA.hpp" // npA::ClassA, npB::f2()

# include "my_dir/my_headerB.hpp" // np::g()

[ blank line ]

# include <cmath> // std::fabs()

# include <iostream> // std::cout

# include <vector> // std::vector

// ..

# endif // PROJECT_A_MY_HEADER

36/76

Preprocessing

Macro 1/3

※ Avoid deﬁning macros, especially in headers

..........

Google

- Do not use macro for enumerators, constants, and functions

.....

µOS,

............

CoreCpp

............

CoreCpp

※ Always put macros after #include statements

.....

µOS

※ Macros should be unique names, e.g. use a preﬁx for all macros related to a

project

MYPROJECT_MACRO

..........

Google,

..........

Unreal,

............

CoreCpp

※ #undef macros wherever possible

..........

Google

- Even in the source ﬁles if unity build is used (merging multiple source ﬁles to

improve compile time)

37/76

Macro 2/3

※ Always use curly brackets for multi-line macro

................

Clang-Tidy

# define INCREMENT_TWO(x, y) (x)++; (y)++

if (do_increment)

INCREMENT_TWO(a, b);

// (b)++ will be executed unconditionally

//---------------------------------------------------------------

# define INCREMENT_TWOO(x, y) \

{ \

(x)++; \

(y)++; \

}

※ Macro shall not have side eﬀect

................

Clang-Tidy

# define MIN(X, Y) (X < Y ? X : Y) // MIN(i++, j) -> i may be incremented twice
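The side-effect problem of a MIN-style macro can be avoided with a function, in line with the earlier rule about not using macros for functions. The following is a minimal sketch; the names min_value and min_after_increment are hypothetical and exist only for this illustration.

```cpp
#include <cassert>

// Hypothetical replacement for a MIN(X, Y) macro.
// Each argument is evaluated exactly once before the call.
template <typename T>
constexpr T min_value(T x, T y) {
    return (x < y) ? x : y;
}

// Hypothetical helper that shows the single evaluation:
// 'i' is incremented once, no matter which branch is selected.
inline int min_after_increment(int& i, int j) {
    return min_value(i++, j);
}
```

With the macro version, the same call would expand to an expression that contains i++ twice.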

38/76

Macro 3/3

※ In the deﬁnition of a function-like macro, each instance of a parameter shall be

enclosed in parentheses to prevent unexpected expressions

......

µOS,

................

Clang-Tidy

# define ADD(x, y) ((x) + (y))

∗ Prefer checking macro values. It prevents mistakes deriving from missing

headers

# define MACRO 1 // defined in another header

//------------------------------------------

# if MACRO // instead of #if defined(MACRO)

• Put macros outside namespaces as they don’t have a scope

39/76

Preprocessing Statements 1/2

∗ Close #endif with a comment with the respective condition of the ﬁrst #if

# if defined(MACRO)

...

# endif // defined(MACRO)

∗ The hash mark that starts a preprocessor directive should always be at the

beginning of the line

..........

Google

# if defined(MACRO)

# define MACRO2

# endif

40/76

Preprocessing Statements 2/2

∗ Avoid conditional #include when possible

...........

Mozilla,

.............

Chromium

• Prefer #if defined(MACRO) instead of #ifdef MACRO

Improve readability, help grep-like utils, and it is uniform with multiple conditions

# if defined(MACRO1) && defined(MACRO2)

• Place the \ rightmost for multi-line preprocessing statements

# define MACRO2 \

macro_def...

41/76

Variables

Variables 1/2

※ Always initialize variables in the declaration

..........

Google,

............

CoreCpp,

.....

Hic,

.....

µOS,

............

SEI Cert,

................

Clang-Tidy

※

Place variables in the narrowest scope possible. Declare variables close to the

ﬁrst use

..........

Google,

............

CoreCpp

............

CoreCpp

............

CoreCpp

• It is allowed to declare multiple variables in the same line for improving the

readability, except for pointer or reference

..........

Google

(only one declaration per line)

............

CoreCpp

42/76

Variables 2/2

• Use assignment syntax = when performing “simple” initialization, {} otherwise

..............

Chromium,

............

CoreCpp

• Initialize variables with = , constructors with {}

...........

Mozilla

• Variables with narrow scope needed by if , while , for statements should

normally be declared within those statements

if (int* ptr = f()) .

Even better with C++17 initialization statements, e.g.

if (auto it = m.find(10); it != m.end())

..........

Google

∗ Precede boolean values with words like is and did

..........

Webkit,

.............

Chromium

• Use \0 to indicate the null character

..........

Google

char n = '\0';
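The rule about declaring variables inside if statements can be sketched as follows, assuming a C++17 compiler. The function name_or_default and its parameters are hypothetical names chosen for this example.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical lookup helper: the iterator 'it' is visible only inside
// the if statement, which is the narrowest scope that needs it.
inline std::string name_or_default(const std::map<int, std::string>& names,
                                   int                               key) {
    if (auto it = names.find(key); it != names.end()) {
        return it->second;
    }
    return "unknown";
}
```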

43/76

static Global Variables

∗ Avoid static global variables unless they are trivially destructible

..........

Google

e.g. a global std::string is not trivially destructible

- static local variables with dynamic initialization are allowed

∗ Avoid static global variables unless they are trivially constructible and

destructible

........

LLVM

∗ Avoid non- const static global variables

....

Hic,

...........

Mozilla,

............

CoreCpp

• Constant initialization of static global variables should be marked with

constexpr or constinit

..........

Google,

................

Clang-Tidy

• static global variables should only be initialized by constant expressions (e.g.

constexpr functions/lambdas)

..........

Google,

................

Clang-Tidy
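A short sketch of the rules above, under the assumption that a function-local static is an acceptable home for a non-trivially destructible object. The names kMaxRetries and default_name are hypothetical.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Constant-initialized and trivially destructible: fine as a global.
constexpr int kMaxRetries = 3;

static_assert(std::is_trivially_destructible<int>::value,
              "int globals need no destructor call at exit");
static_assert(!std::is_trivially_destructible<std::string>::value,
              "std::string globals run a destructor at exit");

// Function-local static with dynamic initialization:
// constructed on first use instead of at program start-up.
inline const std::string& default_name() {
    static const std::string name = "default";
    return name;
}
```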

44/76

Conversions

∗ Use static_cast instead of old-style cast

..........

Google

∗ Use const_cast to remove the const qualiﬁer only for pointers and references

..........

Google

• Avoid const_cast to remove const , except when implementing non- const

getters in terms of const getters

.............

Chromium

• Use reinterpret_cast to do unsafe conversions between pointer types, and

from/to integer types

..........

Google

∗ Use std::bit_cast to interpret the raw bits of a value using a diﬀerent type of

the same size

..........

Google

45/76

Enumerators

※ Prefer enumerators over macros

............

CoreCpp

∗ Prefer enum class over plain enum

.........

Unreal,

......

µOS,

............

CoreCpp

• Specify the underlying type and enumerator values only when necessary

............

CoreCpp

............

CoreCpp

enum class MyEnum : int16_t { Abc = 1, Def = 2 }; // bad

• Do not cast an expression to an enumeration type

Color c = static_cast<Color>(3);

.....

Hic

• Don’t use ALL_CAPS for enumerators

............

CoreCpp
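One reason to prefer enum class is that it does not convert implicitly to an integer. The sketch below checks this at compile time; Color, LegacyColor, and to_index are hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

enum class Color { Red, Green, Blue };        // scoped enumeration
enum LegacyColor { LegacyRed, LegacyGreen };  // plain enum, shown only for contrast

static_assert(!std::is_convertible<Color, int>::value,
              "enum class does not convert implicitly to int");
static_assert(std::is_convertible<LegacyColor, int>::value,
              "plain enum converts implicitly to int");

// The conversion is still possible, but it must be spelled out.
constexpr int to_index(Color c) {
    return static_cast<int>(c);
}
```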

46/76

Arithmetic Types

Signed vs. Unsigned Integral Types

※ Don’t mix signed and unsigned arithmetic

............

CoreCpp,

.....

µOS

∗ Prefer signed integers whenever possible

..........

Google,

......

µOS,

............

CoreCpp,

∗ Use unsigned integer only for bitwise operations

..........

Google,

......

µOS,

............

CoreCpp

∗ Do not shift ( << ) signed operands

.....

Hic,

......

µOS,

................

Clang-Tidy

∗ size_t vs. int64_t

- Use int64_t instead of size_t for object counts and loop indices

..........

Google

- Use size_t for object and allocation sizes, object counts, array and pointer oﬀsets,

vector indices, and so on (to avoid overﬂow undeﬁned behavior)

.............

Chromium

• Do not apply unary minus to operands of unsigned type, e.g. -1u

.....

Hic
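The reason for not mixing signed and unsigned arithmetic can be illustrated with a comparison. In an expression such as a < b with an int and an unsigned operand, the int is converted to unsigned first. The sketch below makes that conversion explicit; both function names are hypothetical.

```cpp
#include <cassert>

// Reproduces what the implicit conversion does in 'a < b'
// when 'a' is int and 'b' is unsigned.
constexpr bool less_after_conversion(int a, unsigned b) {
    return static_cast<unsigned>(a) < b;
}

// A comparison that gives the mathematically expected answer.
constexpr bool less_safe(int a, unsigned b) {
    return a < 0 || static_cast<unsigned>(a) < b;
}

static_assert(!less_after_conversion(-1, 1u), "-1 becomes a huge unsigned value");
static_assert(less_safe(-1, 1u), "-1 is less than 1");
```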

47/76

Integral Types Conversion

∗ Avoid silent narrowing conversions, e.g. int i = 0; i += 0.1;

................

Clang-Tidy

• Use brace initialization to convert/deﬁne constant arithmetic types

(narrowing) e.g.

int64_t{MyConstant}

..........

Google

• Use intptr_t to convert raw pointers to integers

..........

Google

• Be aware of implicit promotion to int (e.g. in arithmetic on char or int16_t operands)

48/76

Integral Types: Size and Other Issues

Size:

※ Except int , use ﬁxed-width integer type (e.g. int64_t , int8_t , etc.)

.............

Chromium,

..........

Unreal,

..........

Google,

.....

Hic,

......

µOS,

................

Clang-Tidy

∗ Prefer 32/64-bit signed integers over smaller data types

..........

Google

• 64-bit integers add no/little overhead on 64-bit platforms

Other issues:

• Avoid redundant type, e.g.

unsigned int , signed int

..........

Webkit

49/76

Floating-Point Types 1/2

∗ Floating point numbers shall not be converted to integers except through

use of standard library functions std::floor , std::ceil

......

µOS,

.....

Hic

double d = ...;

int i = d; // BAD, prefer std::floor(d)

∗ Don’t convert an expression of wider ﬂoating-point type to a narrower

ﬂoating-point type

.....

Hic

float f1 = 1.0; // Bad

float f2 = 1.0F; // Ok

50/76

Floating-Point Types 2/2

※ Do not directly compare ﬂoating point == , < , etc.

.....

Hic,

.....

µOS

• Floating-point literals should always have a radix point, with digits on both sides,

even if they use exponential notation

2.0f

..........

Google,

..........

Webkit (opposite)
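Instead of a direct == comparison, floating-point values are usually compared with a tolerance. The following is one possible sketch, not a universal solution: the function name almost_equal and the default tolerances are assumptions and should be tuned to the application.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>

// Hypothetical helper: true if 'a' and 'b' differ by less than a relative
// tolerance, with an absolute tolerance as a floor for values near zero.
inline bool almost_equal(double a, double b,
                         double rel_tol = 1e-9,
                         double abs_tol = 1e-12) {
    const double diff  = std::fabs(a - b);
    const double scale = std::max(std::fabs(a), std::fabs(b));
    return diff <= std::max(abs_tol, rel_tol * scale);
}
```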

51/76

Functions

Functions 1/2

※ A function should perform a single logical operation to promote simple

understanding, testing, and reuse

............

CoreCpp

※ Split up large functions (≥ 40 lines) into logical sub-functions for improving

readability and compile time

..........

Unreal,

..........

Google,

............

CoreCpp,

................

Clang-Tidy

∗ Prefer pure functions, namely functions that always return the same result

given the same input arguments (no external dependencies) and do not modify

any state or have side effects outside of returning a value

............

CoreCpp

52/76

Functions 2/2

∗ Limit overloaded functions. Prefer default arguments

...........

Google,

............

CoreCpp

(don’t use default arguments)

.....

Hic

∗ Overload a function when there are no semantic diﬀerences between

variants

..........

Google

53/76

Functions Parameters 1/4

※ Don’t declare functions with an excessive number of parameters. Use a

wrapper structure instead

.....

Hic,

...........

CoreCpp,

..........

Unreal,

.....

µOS

∗ Specify all input-only parameters before any output parameters

..........

Google

∗ Avoid adjacent parameters of the same type → easy to swap by mistake

............

CoreCpp

54/76

Functions Parameters - Input/Output 2/4

※ Pass input parameters that are not intended to be modified by the function

by const pointer or const reference

...........

Google,

..........

Unreal

• Use std::optional to represent optional by-value input parameters

..........

Google

∗ Pass-by-reference for input/output parameters

............

CoreCpp

∗ Pass-by-reference for output parameters, except rare cases where it is optional in

which case it should be passed-by-pointer

..........

Google

55/76

Functions Parameters - By-Value, By-Rvalue 3/4

• Prefer pass-by-value for small and trivially copyable types

............

CoreCpp,

.....

Hic

• Don’t pass-by- const -value, especially in the declaration (same signature as

pass-by-value)

..........

Google

(opposite) Autosar

∗ Don’t use rvalue references && except for move constructors and move

assignment operators

..........

Google

56/76

Functions Parameters 4/4

∗ Boolean parameters should be avoided

..........

Unreal

• Prefer enum to bool on function parameters

..........

Webkit,

.............

Chromium

• Parameter names should be the same for declaration and deﬁnition

................

Clang-Tidy,

.....

Hic

• All parameters should be aligned if they do not ﬁt in a single line (especially in the

declaration)

void f(int a,

const int* b);

57/76

Functions Arguments

• Consider introducing variables to describe the meaning of arguments

..........

Google

f(true); // BAD

bool enable_checks = true; // GOOD

f(enable_checks);

• Use argument comment to describe “magic number” arguments

...............

Clang-Tidy,

..........

Google

void f(bool enable_checks);

f(/*enable_checks=*/ true);

• All arguments should be aligned to the ﬁrst one if they do not ﬁt in a single line

..........

Google

my_function(my_var1, my_var2,

my_var3);

58/76

Function Return Values 1/2

∗ Prefer to return values rather than output parameters

...........

Google,

............

CoreCpp

∗ Prefer to return by-value

..........

Google

• Prefer to return a struct / structured binding to return multiple output values

............

CoreCpp

• Don’t return const values

............

CoreCpp

• Use trailing return types only where using the ordinary syntax is impractical or

much less readable

..........

Google,

..........

Webkit

int foo(int x) instead of auto foo(int x) -> int
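Returning a struct instead of using output parameters can be sketched as follows. DivResult and divide are hypothetical names; the caller can unpack the result with C++17 structured bindings.

```cpp
#include <cassert>

// Hypothetical result type: both outputs are returned by value.
struct DivResult {
    int quotient;
    int remainder;
};

// Precondition (assumed, not checked here): b != 0.
constexpr DivResult divide(int a, int b) {
    return {a / b, a % b};
}
```

A call site then reads: auto [q, r] = divide(7, 2);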

59/76

Function Return Values 2/2

※ Transfer ownership with smart pointers. Never return pointers for new objects.

Use std::unique_ptr instead

...........

Google,

..............

Chromium,

............

CoreCpp

int* f() { return new int[10]; } // wrong!!

std::unique_ptr<int[]> f() { return std::make_unique<int[]>(10); } // correct

void FooConsumer(std::unique_ptr<Foo> ptr); // correct

※ Never return reference/pointer for local objects. Return a pointer only to

indicate a position

............

CoreCpp

............

CoreCpp

..........

Google,

............

SEI Cert
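A minimal sketch of ownership transfer with std::unique_ptr, in both directions. Foo, make_foo, and consume_foo are hypothetical names used only for this illustration.

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct Foo {  // hypothetical type
    int value = 0;
};

// The caller receives ownership of the new object.
inline std::unique_ptr<Foo> make_foo(int value) {
    auto foo   = std::make_unique<Foo>();
    foo->value = value;
    return foo;
}

// The function takes ownership; the object is destroyed when it returns.
inline int consume_foo(std::unique_ptr<Foo> foo) {
    return foo->value;
}
```

The caller must use std::move to pass the pointer to consume_foo, which makes the transfer visible at the call site.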

60/76

Function Speciﬁers

• If a function might have to be evaluated at compile time, declare it constexpr

............

CoreCpp

............

CoreCpp

• Do not separate declaration and deﬁnition for

template and inline functions

..........

Google

• Use inline only for small functions (e.g. ≤ 10 lines, no loops or switch

statements)

..........

Google,

.....

Hic,

............

CoreCpp

• Do not use inline when declaring a function (only in the deﬁnition)

• Do not use inline when deﬁning a function in a class deﬁnition

........

LLVM

• Use noexcept when it is useful and correct

..........

Google

61/76

Lambda Expressions

∗ Prefer explicit captures if the lambda may escape the current scope

..........

Google

• Use default capture by reference ( [&] ) only when the lifetime of the lambda is

obviously shorter than any potential captures

...........

Google,

............

CoreCpp

• Do not capture variables implicitly in a lambda, e.g. [&]{body}

.....

Hic

• Omit parentheses for a C++ lambda whenever possible

[this] { return m_member; }

..........

Webkit

(opposite)

.....

Hic

int a[] { ++i }; // Not a lambda

[] { ++i; }; // A lambda
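The first rule above, explicit captures for lambdas that escape the current scope, can be sketched as follows. The function make_adder is a hypothetical example.

```cpp
#include <cassert>
#include <functional>

// The lambda outlives this function call, so 'step' is captured explicitly
// and by value. A [&] capture here would leave a dangling reference.
inline std::function<int(int)> make_adder(int step) {
    return [step](int x) { return x + step; };
}
```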

62/76

Structs and Classes

struct vs. class

∗ Use struct only for passive objects that carry data; everything else is

class

...........

Google,

............

CoreCpp

∗ Use class rather than struct if any member is non- public

............

CoreCpp

∗ Prefer struct instead of pair or tuple

..........

Google

63/76

Initialization

※ Objects are fully initialized by constructor calls and all resources acquired

must be released by the class’s destructor

..........

Google,

............

CoreCpp

............

CoreCpp

.....

Hic,

................

Clang-Tidy

∗ Prefer in-class initializers to member initializers

.............

Chromium,

............

CoreCpp

...........

CoreCpp

................

Clang-Tidy

∗ Initialize member variables in the order of member declaration

............

CoreCpp,

.....

Hic

∗ Prefer initialization to assignment in constructors

............

CoreCpp

struct A {

int _x;

A(int x) { _x = x; } // bad: assignment in the body instead of member initialization

};
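The initialization rules above can be combined in one small class. This is a sketch with a hypothetical Counter class: defaults come from in-class initializers, and the constructor uses the member initializer list rather than assignment.

```cpp
#include <cassert>

class Counter {  // hypothetical class
public:
    Counter() = default;

    // Initialization in the member initializer list, not in the body.
    explicit Counter(int start) : _count{start} {}

    void advance() { _count += _step; }

    int count() const { return _count; }

private:
    int _count = 0;  // in-class initializers provide the defaults
    int _step  = 1;
};
```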

64/76

Braced Initializer Lists

• Initialize variables with = , constructors with {}

...........

Mozilla

• Prefer braced initializer lists {} for constructors to clearly distinguish from

function calls, avoid implicit narrowing conversion, and avoid the most vexing

parse problem

............

CoreCpp

............

CoreCpp

............

CoreCpp

void f(float x) {

int v(int(x)); // function declaration

int v{int(x)}; // variable

}

• Do not use braced initializer lists {} for constructors (at least for containers, e.g.

std::vector ). It can be confused with std::initializer_list

........

LLVM
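The confusion with std::initializer_list mentioned in the last rule is easy to reproduce with std::vector. In the sketch below, the two hypothetical functions differ only in the kind of brackets.

```cpp
#include <cassert>
#include <vector>

// Parentheses select the size constructor: 10 elements, each equal to 0.
inline std::vector<int> ten_zeros() {
    return std::vector<int>(10);
}

// Braces select the std::initializer_list constructor: 1 element equal to 10.
inline std::vector<int> single_ten() {
    return std::vector<int>{10};
}
```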

65/76

Special Member Functions

∗ Use delegating constructors to represent common actions for all

constructors of a class

............

CoreCpp,

.....

Hic

∗ Mark destructor and move constructor/assignment noexcept

............

CoreCpp

............

CoreCpp

.....

Hic

.....

Hic

............

SEI Cert,

................

Clang-Tidy

∗ Avoid implicit conversions. Use the explicit keyword for conversion

operators and constructors, especially single argument constructors

..........

Google,

............

CoreCpp

............

CoreCpp

.....

Hic,

......

µOS,

................

Clang-Tidy

66/76

=default, =delete

∗ Indicate if a non-trivial class is copyable, move-only, or neither copyable

nor movable by using

= default / = delete for constructors and assignment

operators if not directly implemented

..........

Google,

...........

Mozilla,

..............

Chromium,

............

CoreCpp

∗ Prefer = default constructors over user-deﬁned / implicit default

constructors

..........

Mozilla,

.............

Chromium,

............

CoreCpp,

.....

Hic

∗ Use = delete to mark deleted functions

............

CoreCpp,

.....

Hic
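A move-only class that states its copy/move behavior explicitly might look like the sketch below. Handle is a hypothetical class; the static_assert lines only confirm what the declarations express.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class Handle {  // hypothetical move-only type
public:
    explicit Handle(int id) noexcept : _id{id} {}

    Handle(const Handle&)            = delete;  // not copyable
    Handle& operator=(const Handle&) = delete;

    Handle(Handle&&) noexcept            = default;  // movable
    Handle& operator=(Handle&&) noexcept = default;

    ~Handle() = default;

    int id() const noexcept { return _id; }

private:
    int _id;
};

static_assert(!std::is_copy_constructible<Handle>::value, "Handle is move-only");
static_assert(std::is_nothrow_move_constructible<Handle>::value, "move is noexcept");
```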

67/76

Structs and Classes - Other Issues 1/2

∗ Don’t return pointers or references to non- const objects from const

methods

.............

Chromium

∗ Use const functions wherever possible

..........

Google,

.............

Chromium,

......

µOS,

................

Clang-Tidy

∗ Make a function a member only if it needs direct access to the representation of a

class. Use a

static function or a free-function otherwise

............

CoreCpp

• Don’t deﬁne a class or enum and declare a variable of its type in the same

statement, e.g.

struct Data /*...*/ data;

............

CoreCpp

68/76

Structs and Classes - Other Issues 2/2

∗ Do not overload operators with special semantics: && , || , , (comma), unary & ,

operator"" (user-defined literals)

..........

Google,

.....

Hic,

.....

µOS

∗ Prefer to define non-modifying binary operators as non-member functions

e.g.

operator==

..........

Google,

.....

Hic

∗ Place free-functions that interact with a class in the same namespace, e.g.

operator==

............

CoreCpp

∗ Declare data members private , unless they are constants. This simpliﬁes

reasoning about invariants

..........

Google,

.....

Hic

69/76

Inheritance 1/2

※ Avoid virtual method calls in constructors

..........

Google,

............

CoreCpp,

............

SEI Cert

※ Default arguments are allowed only on non-virtual functions

..........

Google,

............

CoreCpp,

.....

Hic,

................

Clang-Tidy

※ A class with a virtual function should have a virtual or protected destructor

(e.g. interfaces and abstract classes)

............

CoreCpp

∗ Always use

override/final function member keywords

..........

Google,

..........

Webkit,

...........

Mozilla,

..........

Unreal,

....

Hic,

................

Clang-Tidy,

............

CoreCpp

• Do not use virtual with final/override (implicit)
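A small sketch of the destructor and override rules above. Shape and Circle are hypothetical classes; the base class has a virtual destructor so that deleting through a base pointer is well defined.

```cpp
#include <cassert>
#include <memory>
#include <string>

class Shape {  // hypothetical interface
public:
    virtual ~Shape() = default;  // virtual destructor for a polymorphic base

    virtual std::string name() const = 0;
};

class Circle final : public Shape {
public:
    // 'override' without a repeated 'virtual'
    std::string name() const override { return "circle"; }
};
```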

70/76

Inheritance 2/2

∗ Provide a virtual method anchor ( .cpp implementation) for classes in

headers

........

LLVM

∗ Multiple implementation inheritance is discouraged

..........

Google,

..............

Chromium,

.....

Hic,

................

Clang-Tidy

∗ Prefer composition to inheritance

..........

Google

∗ Inheritance should be public

..........

Google

∗ A polymorphic class should suppress public copy/move semantics

............

CoreCpp

71/76

Structs and Classes - Style 1/5

※ Declare class data members in a special way

- It helps to keep track of class variables and local function variables

- The ﬁrst character is helpful in ﬁltering through the list of available variables

Examples:

- Trailing underscore (e.g.

member_var_ )

..........

Google,

......

µOS,

.............

Chromium

- Leading underscore (e.g. _member_var ) .NET

- Public members (e.g. m_member_var , mVar )

..........

Webkit,

...........

Mozilla

- Static members (e.g. s_static_var , sVar )

..........

Webkit,

...........

Mozilla

Personal Comment: Prefer _member_var as it reads left-to-right and is less invasive

• Class members are indented

..........

Google

72/76

Structs and Classes - Style 2/5

∗ Class inheritance declarations order:

public , protected , private

..........

Google,

......

µOS,

............

CoreCpp

∗ Declarations order

..........

Google

(a) Types and type aliases

(b) (Optionally, for structs only) non-static data members

(c) Static constants

(d) Factory functions

(e) Constructors and assignment operators

(f) Destructor

(g) All other functions

(h) All other data members

73/76

Structs and Classes - Style 3/5

struct A { // passive data structure

int x;

float y;

};

class B {

public:

B();

void public_function();

protected:

int _a; // in general, it is not public in derived classes

void _protected_function(); // "protected_function()" is not wrong

// it may be public in derived classes

private:

int _x;

float _y;

void _private_function();

};

74/76

Structs and Classes - Style 4/5

• In the constructor, each member of the initializer list should be indented on a

separate line, e.g.

..........

Google,

..........

Webkit

A::A(int x1, int y1) :

x{x1}, // double indentation

y{y1} {

body

}

// or

A::A(int x1, int y1)

: x{x1},

y{y1} {

body

}

75/76

Structs and Classes - Style 5/5

• If possible, avoid this-> keyword

• Prefer empty() method over size() to check if a container has no items

...........

Mozilla

• Do not use get for observer methods ( const ) without parameters, e.g.

get_size() → size()

..........

Webkit

• Precede getters that return values via out-arguments with the word get

.............

Chromium

• Precede setters with the word set . Use bare words for getters

..........

Webkit,

.............

Chromium

76/76

Modern C++

Programming

15. Code Conventions

Part II

Federico Busato

2025-04-14

Table of Contents

1 auto

2 Templates and Type Deduction

3 Control Flow

Redundant Control Flow

if/else

Comparison

switch

for/while

1/78

Table of Contents

4 namespace

using namespace Directive

Anonymous/Unnamed Namespace

Namespace and Class Design

Style

5 Modern C++

Keywords

Features

Class

Library

2/78

Table of Contents

6 Maintainability

Code Comprehension

Functions

Template and Deduction

Library

7 Portability

3/78

Table of Contents

8 Naming

Entities

Variables

Functions

Style Conventions

Enforcing Naming Styles

4/78

Table of Contents

9 Readability and Formatting

Horizontal Spacing

Pointers/References

Vertical Spacing

Braces

Type Decorators

Reduce Code Verbosity

Other Issues

5/78

Table of Contents

10 Code Documentation and Comments

Function Documentation

Comment Syntax

File Documentation

6/78

auto

∗ Use auto to avoid type names that are noisy, obvious, or unimportant

auto array = new int[10];

auto var = static_cast<int>(value);

........

LLVM,

..........

Google

(only for lambdas, iterators, template expressions)

Unreal

∗ Do not excessively use auto for variable types. Use auto only when the

left type is easy to deduce looking at the right expression

..........

Google

• Don’t use plain auto when the type would be deduced to be a pointer type; use auto* instead

auto* v = new int;

.............

Chromium

• Use auto for return type deduction only with small/simple functions and lambda

expressions

..........

Google

7/78

Templates and Type

Deduction

Templates and Type Deduction

※ Avoid complicated template programming

..........

Google

∗ Prefer automatic template deduction f(0) instead of f<int>(0)

• Use class template argument deduction (CTAD) only with templates that provide

at least one explicit deduction guide

..........

Google

• Use trailing return types only where using the ordinary syntax is impractical or

much less readable

..........

Google,

..........

Webkit

int foo(int x) instead of auto foo(int x) -> int

8/78

Templates and Type Deduction

• Declare template specializations in the same ﬁle as the primary template they

specialize

.....

Hic

template<typename T>

void f(); // primary template

template<>

void f<int>(); // explicit specialization, declared in the same file

• Do not place spaces between the identiﬁer template and its angle brackets

..........

Webkit

template<typename U> struct Bar { };

9/78

Control Flow

※ Limit control ﬂow complexity (cyclomatic/cognitive complexity)

.....

Hic,

......

µOS,

................

Clang-Tidy

∗ Avoid

goto

......

µOS,

............

CoreCpp

10/78

Redundant Control Flow 1/3

∗ Avoid redundant control ﬂow (see next slides)

................

Clang-Tidy,

............

CoreCpp

- Do not use else after a return / break

........

LLVM,

..........

Webkit,

................

Clang-Tidy

- Avoid comparing boolean condition to true/false

...........

Mozilla

- Avoid return true/return false pattern

- Merge multiple conditional statements

11/78

Redundant Control Flow 2/3

if (condition) { // BAD

< body1 >

return; // <--

}

else // <-- redundant

< body2 >

if (condition) { // GOOD

< body1 >

return;

}

< body2 >

if (condition == true) // BAD

if (condition) // GOOD

12/78

Redundant Control Flow 3/3

if (condition) // BAD

return true;

else

return false;

return condition; // GOOD

if (condition1) {

if (condition2) {

if (condition3) { // BAD

if (condition1 && condition2 && condition3) { // GOOD

bool condition4 = condition1 && condition2 && condition3;

if (condition4) { // GOOD

13/78

Control Flow - if/else

∗ The if and else keywords belong on separate lines

if (c1) <statement1>; else <statement2>; // BAD

..........

Google,

..........

Webkit

• Don’t use the ternary operator ( ?: ) as a sub-expression

(i != 0) ? ((j != 0) ? 1 : 0) : 0;

.....

Hic

14/78

Control Flow - Comparison

※ Tests for null/non-null , and zero/non-zero should all be done with

equality comparisons

.....

Hic

(opposite)

...........

Mozilla,

..........

Webkit,

............

CoreCpp

if (!ptr)

return;

if (!count)

return;

if (ptr == nullptr)

return;

if (count == 0)

return;

※ Prefer (ptr == nullptr) and x > 0 over (nullptr == ptr) and

0 < x

.............

Chromium

15/78

Control Flow - switch

∗ Prefer switch to multiple if -statements

............

CoreCpp

∗ Don’t use default labels in fully covered switch over enumerations

.........

LLVM,

............

CoreCpp

∗ In all other cases, switch statements should always have a default case

..........

Google,

..........

Unreal,

.....

Hic,

................

Clang-Tidy
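A fully covered switch over an enumeration, written without a default label, can be sketched as follows. Level and to_percent are hypothetical names. Omitting default lets compilers that diagnose unhandled enumerators report a missing case when a new enumerator is added.

```cpp
#include <cassert>

enum class Level { Low, Medium, High };  // hypothetical enumeration

constexpr int to_percent(Level level) {
    switch (level) {
        case Level::Low:    return 10;
        case Level::Medium: return 50;
        case Level::High:   return 90;
    }
    // Not reached for valid enumerators; keeps every path returning a value.
    return 0;
}
```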

16/78

Control Flow - switch - Style

• case blocks in switch statements are indented twice

..........

Google

switch (var) {

case 0: {

Foo();

break;

}

• A case label should line up with its

switch statement. The case statement is

indented

..........

Webkit

switch (var) {

case 0:

Foo();

break;

}

17/78

Control Flow - for/while 1/3

※ Use range-based for loops whenever possible

.........

LLVM,

..........

Unreal,

...............

Clang-Tidy,

............

CoreCpp

............

CoreCpp

............

CoreCpp

∗ Prefer a for -statement to a while -statement when there is an obvious loop

variable

............

CoreCpp

∗ Prefer a while -statement to a for -statement when there is no obvious loop

variable

............

CoreCpp

• Avoid do-while loop

............

CoreCpp

18/78

Control Flow - for/while 2/3

• Use early exits ( continue , break , return ) to simplify the code

.........

LLVM,

............

CoreCpp

for (<condition1>) { // BAD

if (<condition2>)

...

}

for (<condition1>) { // GOOD

if (!<condition2>)

continue;

...

}

19/78

Control Flow - for/while 3/3

∗ Turn predicate loops into predicate functions

.........

LLVM,

............

CoreCpp

bool var = ...;

for (<loop_condition1>) { // should be an external

if (<condition2>) { // function

var = ...

break;

}
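The loop above computes a boolean flag and then breaks. Turned into a predicate function, it could look like the sketch below, where contains_negative is a hypothetical example of such a predicate.

```cpp
#include <cassert>
#include <vector>

// Predicate function: the early return replaces the flag and the break.
inline bool contains_negative(const std::vector<int>& values) {
    for (int value : values) {
        if (value < 0) {
            return true;
        }
    }
    return false;
}
```

The call site then becomes a single readable condition, e.g. if (contains_negative(values)).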

20/78

namespace

Namespace

※ Always place code in a namespace to avoid global namespace pollution

..........

Google

※

Do not use namespace aliases namespace nsA = other_namespace at

namespace/global scope in header ﬁles except in explicitly marked

internal-only namespaces

...........

Google,

...........

Mozilla

※

Do not declare anything in the namespace std

...........

Google,

............

SEI Cert,

................

Clang-Tidy,

............

CoreCpp

※ Do not use using namespace declarations of any kind to import names in the

std namespace

..........

Webkit

∗ Do not use

inline namespaces

..........

Google

21/78

using namespace Directive

※ Avoid using namespace -directives, especially at global scope

........

LLVM,

..........

Google,

..........

Webkit,

..........

Unreal,

....

Hic,

......

µOS,

............

CoreCpp

# include <cmath> // if 'header.hpp' contains

# include "header.hpp" // 'using namespace std;'

auto f(float a) { return abs(a) * 2; } // f(3.5) returns 7 instead of 6

∗ Limit using namespace -directives at local scope and prefer explicit

namespace entities declarations

..........

Google,

..........

Unreal,

.....

Hic,

................

Clang-Tidy

• using namespace is allowed in implementation ﬁles in nested namespaces

..........

Webkit

22/78

Anonymous/Unnamed Namespace

※ Avoid anonymous namespaces/ static in headers

...........

Google,

......

µOS,

............

SEI Cert,

................

Clang-Tidy,

............

CoreCpp

• anonymous namespace vs. static

- anonymous namespaces instead of static everywhere

....

Hic,

...............

Clang-Tidy,

............

CoreCpp

- anonymous namespaces only for struct / class declaration, static

otherwise (easy identiﬁcation)

........

LLVM,

...........

Mozilla,

......

µOS

∗ Anonymous namespaces and static in source ﬁles:

Items local to a source file (e.g. a .cpp file) should be wrapped in an anonymous

namespace or marked

static . Anonymous namespaces/ static restrict symbol visibility

to the translation unit, which can reduce function call cost and the size of entry point

tables

..........

Google,

.............

Chromium,

...........

CoreCpp,

.....

Hic,

......

µOS
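For a source file, the rule can be sketched as follows. The names helper and doubled_plus_one are hypothetical; only doubled_plus_one is visible to other translation units.

```cpp
#include <cassert>

namespace {

// Internal linkage: not visible outside this translation unit.
int helper(int x) {
    return x * 2;
}

}  // namespace

// External linkage: the only symbol this file exports.
int doubled_plus_one(int x) {
    return helper(x) + 1;
}
```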

23/78

Namespace and Class Design

※ All helper functions and operators of a class need to belong to the same

namespace of the class

∗ Prefer free functions in namespaces instead of classes, avoid global scope

functions

..........

Google

Namespaces & Interface Principle

24/78

Style 1/2

∗ The content of namespaces is not indented

........

LLVM,

..........

Google,

..........

Webkit

namespace ns {

void f() {}

}

∗ Close namespace declarations

.........

LLVM,

..........

Google,

..........

Webkit,

................

Clang-Tidy

} // namespace <namespace_identifier>

} // namespace (for anonymous namespaces)

• Namespaces should have unique names based on the project name

..........

Google

25/78

Style 2/2

• Prefer single-line nested namespace declarations ns1::ns2 C++17

...........

Google,

...........

Mozilla

• Minimize use of nested namespaces

.............

Chromium

• Namespaces can match hierarchy with ﬁle system hierarchy for consistency

include/

my_project/

core.hpp

detail/

helper.hpp

namespace my_project::detail

Using namespaces effectively

26/78

Modern C++

Use C++ over pure C and

use modern C++ wherever possible

27/78

Modern C++ Keywords 1/3

※ Use constexpr C++11 variables to deﬁne true constants (instead of macro)

..........

Google,

..........

Webkit,

............

CoreCpp

............

CoreCpp

※ Use consteval C++20 function to ensure compile-time evaluation

..........

Google

※

Use constinit C++20 to ensure constant initialization for non-constant

variables

..........

Google

※ static_assert compile-time assertion

..........

Unreal,

.....

Hic
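As a sketch of the first and last points, a constexpr variable can replace a macro constant and static_assert can check it at compile time. The names kBufferSize and aligned_size are hypothetical; consteval and constinit are left out because they need C++20.

```cpp
#include <cstddef>

constexpr std::size_t kBufferSize = 256;  // instead of `#define BUFFER_SIZE 256`

// Rounds n up to the next multiple of alignment (alignment must be > 0)
constexpr std::size_t aligned_size(std::size_t n, std::size_t alignment) {
    return (n + alignment - 1) / alignment * alignment;
}

// Both checks are evaluated by the compiler; a violation stops the build
static_assert(kBufferSize % 64 == 0, "buffer size must be a multiple of 64");
static_assert(aligned_size(100, 64) == 128, "aligned_size is usable at compile time");
```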

28/78

Modern C++ Keywords 2/3

※ Prefer enum class C++11 instead of plain enum

.........

Unreal,

......

µOS,

............

CoreCpp

∗ Use auto C++11 to avoid type names that are noisy, obvious, or

unimportant

auto array = new int[10];

auto var = static_cast<int>(value);

........

LLVM,

..........

Google,

....

Hic,

................

Clang-Tidy,

............

CoreCpp

(only for lambdas, iterators, template expressions)

..........

Unreal

※ nullptr C++11 instead of 0 or NULL for pointers

..........

Google,

..........

Unreal,

..........

Webkit,

...........

Mozilla,

....

Hic,

.....

µOS,

................

Clang-Tidy,

............

CoreCpp
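A short sketch of why enum class and nullptr are safer than their older counterparts. Color and which are hypothetical names used only for illustration.

```cpp
enum class Color { Red, Green };  // scoped: no implicit conversion to int

int which(int)   { return 0; }    // selected by which(0)
int which(char*) { return 1; }    // selected by which(nullptr)

// `int c = Color::Green;` would not compile: the conversion must be explicit
constexpr int green_value = static_cast<int>(Color::Green);
```

With a literal 0 the integer overload is chosen even if a null pointer was intended; nullptr removes that ambiguity.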

29/78

Modern C++ Keywords 3/3

∗ Use the explicit keyword for conversion operators C++11 and

constructors. Do not deﬁne implicit conversions

...........

Google,

...........

Mozilla,

.....

µOS

※ Use using C++11 instead of typedef

..........

Mozilla,

................

Clang-Tidy,

............

CoreCpp

∗ Avoid the throw function specifier. Use noexcept C++11 instead

Microsoft Blog
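The three keywords on this slide can be sketched together. Meters, Samples, and total are hypothetical names, not taken from any of the cited style guides.

```cpp
#include <numeric>
#include <type_traits>
#include <vector>

struct Meters {
    explicit Meters(double v) : value(v) {}             // no implicit double -> Meters
    explicit operator double() const { return value; }  // no implicit Meters -> double
    double value;
};

using Samples = std::vector<double>;  // alias declaration instead of typedef

// noexcept replaces the old `throw()` function specifier
double total(const Samples& samples) noexcept {
    return std::accumulate(samples.begin(), samples.end(), 0.0);
}

static_assert(!std::is_convertible_v<double, Meters>,
              "construction from double must be explicit");
```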

30/78

Modern C++ Features 1/2

※ lambda expression C++11

..........

Unreal

※ move semantic C++11

..........

Unreal

※ Use range-based for loops whenever possible C++11

.........

LLVM,

..........

Unreal,

...............

Clang-Tidy,

............

CoreCpp

............

CoreCpp

............

CoreCpp

∗ Prefer uniform (brace) initialization C++11 when it cannot be confused with

std::initializer_list

.............

Chromium

31/78

Modern C++ Features 2/2

∗ static_cast , reinterpret_cast , const_cast , std::bit_cast C++20,

instead of old style cast

(type)

........

LLVM,

..........

Google,

......

µOS,

.....

Hic,

................

Clang-Tidy

∗ Use [[deprecated]] C++14 / [[noreturn]] C++11 / [[nodiscard]]

C++17 to indicate deprecated functions / that do not return / result should not

be discarded

................

Clang-Tidy

∗ Use = delete C++11 to mark deleted functions

• Replace SFINAE with concepts C++20

................

Clang-Tidy

• Use structured bindings C++17
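A small sketch combining [[nodiscard]], = delete, and structured bindings. The functions div_mod, quotient_plus_remainder, and half are hypothetical examples.

```cpp
#include <utility>

// [[nodiscard]]: silently dropping the returned pair produces a warning
[[nodiscard]] constexpr std::pair<int, int> div_mod(int a, int b) {
    return {a / b, a % b};
}

int quotient_plus_remainder(int a, int b) {
    const auto [quotient, remainder] = div_mod(a, b);  // C++17 structured binding
    return quotient + remainder;
}

int half(int x) { return x / 2; }
int half(double) = delete;  // half(3.0) is rejected at compile time
```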

32/78

Modern C++ Features - Class 1/2

∗ Always use override C++11 and final function member keywords

..........

Google,

..........

Webkit,

...........

Mozilla,

..........

Unreal,

....

Hic,

................

Clang-Tidy,

............

CoreCpp

∗ Use = default C++11 constructors
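A minimal sketch of override, final, and = default on a hypothetical Shape/Square hierarchy:

```cpp
struct Shape {
    Shape() = default;            // compiler-generated default constructor
    virtual ~Shape() = default;   // virtual destructor for polymorphic use
    virtual int sides() const { return 0; }
};

struct Square final : Shape {               // final: no further derivation
    int sides() const override { return 4; }  // override: checked against the base
};
```

If the signature in Square drifted from the base declaration (for example by dropping const), override would turn the silent mismatch into a compile error.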

33/78

Modern C++ Features - Class 2/2

∗ Use braced direct-list-initialization or copy-initialization C++11 for setting

default data member value. Avoid initialization in constructors if possible

..........

Unreal

struct A {
    int x = 3;   // copy-initialization
    int y { 3 }; // direct-list-initialization
};

• Replace explicit constructor calls in a return statement with a braced initializer list

Clang-Tidy

Foo bar() { return Foo(3); } // before

Foo bar() { return {3}; } // after

34/78

Modern C++ Library

※ Avoid C-Style memory management malloc()/free() and use new/delete

............

CoreCpp,

................

Clang-Tidy

※ Except for int , use fixed-width integer types C++11 (e.g. int64_t , int8_t , etc.)

.............

Chromium,

..........

Unreal,

..........

Google,

.....

Hic,

......

µOS,

................

Clang-Tidy

• Use std::print C++23

................

Clang-Tidy

• Use modern type traits C++17

................

Clang-Tidy

std::is_integral<T>::value; // --> std::is_integral_v<T>

std::make_signed<unsigned>::type; // --> std::make_signed_t<unsigned>
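The shorter trait spellings combine naturally with fixed-width integers. The following is a sketch only; is_small_integer and SignedIndex are hypothetical names.

```cpp
#include <cstdint>
#include <type_traits>

// True for integral types of at most 16 bits
template<typename T>
constexpr bool is_small_integer = std::is_integral_v<T> && sizeof(T) <= 2;

// Signed counterpart of a fixed-width unsigned type
using SignedIndex = std::make_signed_t<std::uint32_t>;

static_assert(std::is_signed_v<SignedIndex> && sizeof(SignedIndex) == 4,
              "SignedIndex is a 32-bit signed integer");
```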

35/78

Maintainability

Maintainability 1/3

※ Document code (See code documentation section)

※ Don’t optimize without reason

............

CoreCpp

∗ Address compiler warnings. Compiler warning messages mean something is

wrong

..........

Unreal

∗ Compile-time and link-time errors should be preferred over run-time errors

......

µOS,

............

CoreCpp

36/78

Maintainability 2/3

∗ Avoid RTTI (dynamic_cast) and exceptions

........

LLVM,

..........

Google

...........

Google

...........

Mozilla

...........

Mozilla

.....

Hic

※

Do not use reserved names

............

SEI Cert,

................

Clang-Tidy

- double underscore followed by any character __var

- single underscore followed by uppercase

_VAR

• The

goto statement shall not be used

......

µOS,

................

Clang-Tidy

• Code that is not used (commented out) should be deleted

.....

µOS

• Code should not include unnecessary constructs: variables, types, unreachable

code

.....

µOS

37/78

Maintainability 3/3

※ Do not depend on the order of evaluation for side eﬀects

............

SEI Cert

f(i++, i++);

a[i++] = i;

• Do not perform assignments in conditional statements

............

SEI Cert,

................

Clang-Tidy

if (a = b)

∗ Prefer sizeof(variable/value) instead of sizeof(type)

..........

Google

∗ Avoid octal numbers, e.g. int v = 0010; //8

.....

Hic,

.....

µOS
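One possible way to follow the first and third guidelines is sketched below: side effects are sequenced in separate statements, and sizeof is applied to the variable rather than the type. Header, reset, and take_two are hypothetical names.

```cpp
#include <array>
#include <cstring>

struct Header { int id; int flags; };

// sizeof(h) stays correct even if the type of h changes later
void reset(Header& h) { std::memset(&h, 0, sizeof(h)); }

// Instead of f(i++, i++): each increment gets its own statement,
// so the result does not depend on the argument evaluation order
std::array<int, 2> take_two(int& i) {
    const int first  = i++;
    const int second = i++;
    return {first, second};
}
```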

38/78

Maintainability - Code Comprehension

※ Write self-documenting code

e.g. (x + y - 1) / y → ceil_div(x, y)

..........

Unreal

※

Use symbolic names instead of literal values in code (don’t use magic numbers)

....

Hic,

................

Clang-Tidy,

............

CoreCpp

double area1 = 3.14 * radius * radius; // BAD

constexpr auto Pi = 3.14; // correct

double area2 = Pi * radius * radius;

• Use parentheses in expressions to specify the intent of the expression,

especially with mixed operators

....

Hic,

.....

µOS,

................

Clang-Tidy,

............

CoreCpp

int r = i + j * k - 4 / 5; // BAD

if (((i != 0) && (j != 0)) || (k != 0)) // correct
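The self-documenting and symbolic-name guidelines can be sketched as follows. ceil_div matches the name used on the slide; kPi and circle_area are hypothetical, and ceil_div assumes x >= 0 and y > 0.

```cpp
// Names the intent of (x + y - 1) / y; assumes x >= 0 and y > 0
constexpr int ceil_div(int x, int y) { return (x + y - 1) / y; }

constexpr double kPi = 3.14159265358979323846;  // symbolic name, no magic number

constexpr double circle_area(double radius) { return kPi * radius * radius; }
```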

39/78

Maintainability - Constants 2/3

※ Enforce const -correctness

..........

Unreal

• Pass function arguments by const pointer or reference

............

CoreCpp

• Function members

............

CoreCpp

• Use const iteration over containers if the loop isn’t intended to modify the

container

• Declare an object

const or constexpr unless you want to modify its value

later on

............

CoreCpp

...........

CoreCpp

..........

Unreal

• but don’t

const all the things

............

CoreCpp

• Pass by- const value: almost useless (copy), ABI break

•

const return: useless (copy)

...............

Clang-Tidy,

..........

Unreal

• const data member: disable assignment and copy constructor

•

const local variables: verbose, rarely eﬀective

Don’t const all the things
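A sketch of where const tends to pay off: const member functions, const references, and const iteration. Accumulator and total_of are hypothetical names.

```cpp
#include <vector>

class Accumulator {
public:
    void add(int value) { values_.push_back(value); }  // mutates: not const

    int total() const {                                 // read-only member function
        int sum = 0;
        for (const int v : values_) sum += v;           // const iteration
        return sum;
    }

private:
    std::vector<int> values_;
};

// Pass by const reference: no copy, and the callee cannot modify the argument
int total_of(const Accumulator& acc) { return acc.total(); }
```

Note that the parameter of add is a plain by-value int: marking it const would add noise without protecting the caller.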

40/78

Maintainability - Functions

※ Use assert to document preconditions and assumptions

.........

LLVM,

............

CoreCpp

• Ensure that all statements are reachable for at least one combination of function

inputs

.....

Hic

• Prevent calling a function with nullptr when it does not accept it

............

CoreCpp

# include <cstddef> // std::nullptr_t

void f(void*);

void f(std::nullptr_t) = delete;

// f(nullptr) // compile error

41/78

Maintainability - Object Semantic

∗ Prefer RAII instead of manual resource management

............

CoreCpp

............

CoreCpp

void f(char* name) {

FILE* input = fopen(name, "r"); // use "ifstream input {name};" instead

if (something) return; // BAD: if something == true,

// ... // a file handle is leaked

fclose(input);

}

※ Never transfer ownership by a raw pointer (T*) or reference (T&) . Use

object semantics, unique_ptr , etc.

............

CoreCpp

∗ Avoid singletons. Use a static member function named singleton() to

access the instance of the singleton instead of a free function

..........

Webkit,

............

CoreCpp
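A hedged sketch of the RAII and ownership guidelines: the stream closes itself on every return path, and ownership is transferred through std::unique_ptr. The functions first_line and make_node and the type Node are hypothetical.

```cpp
#include <fstream>
#include <memory>
#include <string>

// RAII: `input` is closed automatically, including on the early return
bool first_line(const std::string& path, std::string& out) {
    std::ifstream input{path};
    if (!input) return false;  // nothing to release manually
    return static_cast<bool>(std::getline(input, out));
}

struct Node { int value; };

// Ownership is explicit in the return type; the caller cannot forget to free
std::unique_ptr<Node> make_node(int value) {
    return std::make_unique<Node>(Node{value});
}
```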

42/78

Maintainability - Template and Deduction

※ Avoid complicated template programming

..........

Google

∗ Be aware of bug-prone deductions

template<typename T, int N>

void f(const T&);

template<typename T>

void f(T); // same as f(T*) for array arguments

int array[3];

f(array);

// calls the second function (T = int*), not f(const T&): N cannot be deduced
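For comparison, a sketch of a signature from which the array size can be deduced: taking the array by reference prevents the decay to a pointer. The function name length is hypothetical.

```cpp
#include <cstddef>

// T and N are both deduced from the argument: no array-to-pointer decay
template<typename T, std::size_t N>
constexpr std::size_t length(const T (&)[N]) { return N; }
```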

43/78

Maintainability - Library

∗ Do not pass an array as a single pointer. Prefer std::span , std::mdspan

............

CoreCpp

∗ Prefer core-language features over library facilities, e.g. uint8_t vs.

std::byte

• Prefer std::array over plain array. It can be also used to return multiple values

of the same type from a function

............

CoreCpp

............

CoreCpp

• Use std::string_view to refer to character sequences

............

CoreCpp
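The last two points can be sketched in C++17 as follows (std::span and std::mdspan are left out because they need C++20 and C++23). min_max and has_prefix are hypothetical helpers.

```cpp
#include <array>
#include <string_view>

// std::array returns two values of the same type without output parameters
constexpr std::array<int, 2> min_max(int a, int b) {
    return a < b ? std::array<int, 2>{a, b} : std::array<int, 2>{b, a};
}

// std::string_view accepts literals and std::string without copying
constexpr bool has_prefix(std::string_view text, std::string_view prefix) {
    return text.substr(0, prefix.size()) == prefix;
}
```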

Prefer core-language features over library facilities

44/78

Portability

Portability 1/2

※ Ensure ISO C++ compliant code. Do not use non-standard extensions

see -Wpedantic

.....

Hic,

..........

Google

..........

Google

......

µOS,

............

CoreCpp

※ Do not use deprecated C++ features, or asm declarations, e.g. register ,

__attribute__ , throw (function qualiﬁer)

.....

Hic

※

Do not use reinterpret_cast or union for type punning

Prefer

std::bit_cast or std::memcpy

............

CoreCpp

............

CoreCpp

.....

Hic
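A minimal sketch of type punning with std::memcpy, which also works before C++20. float_bits is a hypothetical name, and the bit patterns mentioned in the comment assume IEEE-754 single precision.

```cpp
#include <cstdint>
#include <cstring>

// Copies the object representation instead of aliasing it through a cast
inline std::uint32_t float_bits(float f) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float must be 32 bits wide");
    std::uint32_t bits = 0;
    std::memcpy(&bits, &f, sizeof(bits));
    return bits;  // on IEEE-754 platforms, 1.0f maps to 0x3F800000
}
```

With C++20, std::bit_cast<std::uint32_t>(f) expresses the same operation in one call.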

※ Except int , use ﬁxed-width integer type (e.g. int64_t , int8_t , etc.)

.............

Chromium,

..........

Unreal,

..........

Google,

.....

Hic,

......

µOS,

................

Clang-Tidy

45/78

Portability 2/2

※ Don’t use long double

∗ Do not use UTF characters* for portability, prefer ASCII

..........

Google,

.....

µOS

∗ If UTF is needed, prefer UTF-8 encoding for portability

..........

Google,

.............

Chromium

∗ Use the same line ending (e.g. '\n' ) for all ﬁles

...........

Mozilla,

.............

Chromium

* Trojan Source attack for introducing invisible vulnerabilities

46/78

Naming

“Beyond basic mathematical aptitude, the difference between good programmers and great programmers is verbal ability”

Marissa Mayer

47/78

General Notes on Naming 1/2

∗ Naming is hard. Most of the time, code is shared with other developers. It is

worth spending a few seconds to ﬁnd the right name

∗ Think about the purpose to choose names

∗ Adopt names commonly used in real contexts (outside the code)

∗ Don’t use the same name for diﬀerent things. Use a speciﬁc name everywhere

• Prefer a single English word to an implementation-focused name, e.g.

UpdateConfigFile() → save()

• Use natural word pair, e.g. create()/destroy() , open()/close() ,

begin()/end() , source()/destination()

.....

µOS

48/78

General Notes on Naming 2/2

• Don’t overdecorate, e.g. Base/Impl , Factory/Singleton

• Don’t list the content, e.g. NameAndAddress → ContactInfo

• Don’t repeat class/enum names, e.g. Employee::EmployeeName

• Avoid temporal attributes, e.g.

PreLoad() , PostLoad()

• Use adjectives to enrich a name, e.g. Name → FullName , Salary →

AnnualSalary

Naming is Hard: Let’s Do Better, CppCon 2019, Kate Gregory

49/78

Entities Naming 1/2

∗ Abbreviations are generally bad, longer names are better in most cases (don’t

be lazy)

.....

µOS

※

Use whole words, except in the rare case where an abbreviation would be more

canonical and easier to understand, e.g. tmp

..........

Webkit

∗ Avoid short and very long names. Remember that the average word length in

English is 4.8

................

Clang-Tidy

50/78

Entities Naming 2/2

• Avoid names that are easily misread: similar or hard to pronounce

............

CoreCpp

※

Avoid ambiguous characters, o/O/0 , I/l/1 , s/S/5 , Z/2 , N/n/h , B/8

e.g.

hel1o

....

Hic,

......

µOS,

............

CoreCpp

• Do not abbreviate by deleting letters within a word

..........

Google

• If you are naming something that is analogous to an existing C or C++ entity

then you can follow the existing naming convention scheme

..........

Google

51/78

Variables Naming 1/2

∗ The length of a variable name should be proportional to the size of the scope that contains it. For example, i is fine within a loop

..........

Google,

............

CoreCpp

............

CoreCpp

• Names can be made singular or plural depending on whether they hold a single value or multiple values, thus arrays and collections should be plural

.....

µOS

int value;

int values[N];

52/78

Variables Naming 2/2

• Use common loop variable names

- i, j, k, l used in order

- it for iterators

• Make literals readable

............

CoreCpp

auto c = 299'792'458; // digit separation

auto interval = 100ms; // using <chrono>

53/78

Functions Naming

∗ Should be descriptive verbs (as they represent actions)

..........

Webkit

∗ Should describe their action or eﬀect instead of how they are

implemented, e.g. partial_sort() → top_n()

∗ Functions that return boolean values should start with boolean verbs, like

is, has, should, does

.....

µOS

empty() → is_empty()

54/78

Naming Style Conventions

Capital Uppercase ﬁrst word letter (sometimes called Pascal style or uppercase

Camel style) (less readable, shorter names)

CapitalStyle

Camel-Back Uppercase ﬁrst word letter except the ﬁrst one (less readable, shorter

names)

camelBack

Snake Lower case words separated by single underscore (good readability, longer

names)

snake_style

Macro Upper case words separated by single underscore (sometimes called All

Capitalized or Screaming style) (best readability, longer names)

MACRO_STYLE

55/78

Naming Style Conventions - Variables/Constant

Variable Variable names should be nouns

• Capital style e.g.

MyVar

........

LLVM,

..........

Unreal

• Snake style e.g. my_var

...........

Google,

..........

Webkit, Std,

......

µOS

• Global variable with g preﬁx, e.g. gVar

...........

Mozilla

• Arguments with a preﬁx, e.g. aVar

...........

Mozilla

Constant • Capital style + k preﬁx,

..........

Google,

...........

Mozilla

e.g. kConstantVar

• Snake style e.g.

my_var

......

µOS

• Macro style e.g. CONSTANT_VAR OpenStack

56/78

Naming Style Conventions - Function

• Camel-back style, e.g. myFunc()

........

LLVM

• Capital style, e.g. MyFunc()

..........

Google,

.............

Chromium,

...........

Mozilla,

..........

Unreal

• Snake style, e.g. my_func()

..........

Webkit, Std,

.....

µOS

• Snake style for accessor and mutator methods

..........

Google,

.............

Chromium

57/78

Naming Style Conventions - Enum/Type

Enum • Capital style + k

..........

Google

e.g. enum MyEnum { kEnumVar1, kEnumVar2 }

•

e preﬁx

...........

Mozilla

e.g. enum MyEnum { eVar1, eVar2 }

• Capital style

........

LLVM,

..........

Webkit,

..........

Unreal

e.g. enum MyEnum { EnumVar1, EnumVar2 }

• Snake style

......

µOS

e.g. enum MyEnum { enum_var1, enum_var2 }

Type Should be nouns

• Capital style (including classes, structs, enums, typedefs, template, etc.)

e.g.

HelloWorldClass

........

LLVM,

...........

Google,

..........

Webkit,

..........

Unreal

• Snake style µOS (class), Std

58/78

Naming Style Conventions - Namespace/Macro/File

Namespace • Snake style, e.g. my_namespace

...........

Google,

........

LLVM, Std

• Capital style, e.g. MyNamespace

..........

Webkit,

..........

Unreal

Macro Macro style, e.g.

MY_MACRO

..........

Google, Std,

..........

Unreal,

..........

Webkit,

...........

Mozilla,

...........

CoreCpp

Macro style should be used only for macros

............

CoreCpp

............

CoreCpp

............

CoreCpp

...........

CoreCpp

File • Snake style (

my_file )

..........

Google

• Capital style ( MyFile ), could lead to Windows/Linux conflicts

........

LLVM

59/78

Personal Comment

Personal Comment: Macro style needs to be used only for macros to avoid subtle bugs. I prefer

snake style for almost everything because it has the best readability. On the other hand, I don’t want

to confuse typenames and variables, so I use camel style for the former ones. Finally, I also use camel

style for compile-time constants because they are very relevant in my work and I need to quickly

identify them

60/78

Enforcing Naming Styles

Naming style conventions can also be enforced by using tools like

clang-tidy: readability-identifier-naming

.clang-tidy conﬁguration ﬁle

Checks: 'readability-identifier-naming'

HeaderFileExtensions: ['', 'h','hh','hpp','hxx']

ImplementationFileExtensions: ['c','cc','cpp','cxx']

CheckOptions:

readability-identifier-naming.ClassCase: 'lower_case'

readability-identifier-naming.MacroDefinitionCase: 'UPPER_CASE'

class MyClass {}; // before

# define my_macro

class my_class {}; // after

# define MY_MACRO

61/78

Readability and

Formatting

Horizontal Spacing 1/3

※ Limit line length (width) to be at most 80 characters long (or 100, or 120) →

help code view on a terminal

........

LLVM (80),

..........

Google (80),

.....

µOS(120)

Personal Comment: I was tempted several times to use a line length > 80 to reduce the

number of lines, and therefore improve the readability. Many of my colleagues use split-screens or

even the notebook during travels. A line length of 80 columns is a good compromise for everyone

• Is the 80 character limit still relevant in times of widescreen monitors?

• Linus Torvalds on 80 column limit

62/78

Horizontal Spacing 2/3

※ Use always the same indentation style

- tab → 2 spaces

..........

Google,

......

µOS

- tab → 4 spaces

........

LLVM,

..........

Webkit,

....

Hic, Python

- (actual) tab = 4 spaces

..........

Unreal

Personal Comment: I worked on projects with both two- and four-space tabs. I observed fewer bugs due to indentation and better readability with four-space tabs. ’Actual tabs’ break the line length convention and can introduce tabs in the middle of the code, producing a very different formatting from the original one

Style Guide for Python Code, Guido van Rossum, Barry Warsaw, Alyssa Coghlan

63/78

Horizontal Spacing 3/3

※ Separate commands, operators, etc., by a space

........

LLVM,

..........

Google

..........

Google

..........

Webkit,

............

CoreCpp

if(a*b<10&&c) // BAD

if (a * b < 10 && c) // good

∗ Prefer consecutive alignment

int var1 = ...

long long int longvar2 = ...

• Do not place spaces around unary operators

i ++

..........

Webkit

• Never put trailing white space or tabs at the end of a line

..........

Google

64/78

Pointers/References

• Declaration of pointer/reference variables or arguments may be placed with the

asterisk/ampersand adjacent to either the type or to the variable name for all

symbols in the same way

..........

Google

• char* c;

..........

Webkit,

.............

Chromium,

.........

Unreal,

............

CoreCpp

• char *c;

• char * c;

• Pointer and reference types and variables have no space after the

* or &

..........

Google

char * v; // BAD

auto & v = w; // BAD

* p = 3; // BAD

v. x + 2; // BAD

x = r-> y; // BAD

65/78

Vertical Spacing 1/2

∗ Do not write excessively long files

∗ Each statement should get its own line

..........

Webkit,

......

µOS,

............

CoreCpp

............

CoreCpp

....

Hic,

..........

Google

x++;

y++;

if (condition)

doIt();

What is your threshold for a long source file?

66/78

Vertical Spacing 2/2

∗ Minimize the number of empty rows. The more code that ﬁts on one screen,

the easier it is to follow and understand the control ﬂow of the program

..........

Google

• Close ﬁles with a blank line

..........

Unreal

67/78

Braces 1/2

∗ Multi-line statements and complex conditions require curly braces. Use an additional boolean variable if possible

..........

Google

..........

Google

..........

Webkit

if (c1 && ... &&

c2 && ...) { // correct

}

• Curly braces are not required for single-line statements (

for, while, if )

........

LLVM,

..........

Google,

..........

Webkit

if (c1) { // not mandatory

}

• Always use braces for all control statements

...........

Mozilla,

.............

Chromium,

.....

µOS

68/78

Braces 2/2

∗ Use always the same style for braces

• Same line, aka Kernighan & Ritchie

...........

Google

...........

Google

..........

Webkit (function only),

............

CoreCpp (except for functions)

• Its own line, aka Allman

..........

Unreal,

..........

Webkit (class, namespace, control ﬂow)

// Kernighan & Ritchie

int main() {

code

}

// Allman

int main()

{

code

}

Personal Comment: C++ is a very verbose language. The same-line convention helps to keep the code more compact, improving the readability

69/78

Type Decorators

• The same concept applies to const

• const int* West notation

..........

Google,

............

CoreCpp

•

int const* East notation Autosar (Rule A7-1-3)

Personal Comment: I prefer West notation to prevent unintentional cv-qualification (const/volatile) of a reference or pointer type, e.g. char &const p , see DCL52-CPP: Never qualify a reference type with const or volatile

• Prefer the common order of declaration static constexpr int var

.....

µOS

70/78

Reduce Code Verbosity

• Use the short name version of built-in types, e.g.

..........

Webkit

unsigned instead of unsigned int

long long instead of long long int

• Don’t

const all the things. Avoid Pass by- const , const return, const

data member,

const local variables

Don’t const all the things

71/78

Other Issues

※ Write all code in English, comments included

∗ Use true , false for boolean variables instead of numeric values 0, 1

..........

Webkit,

................

Clang-Tidy

• Boolean expressions at the same nesting level that span multiple lines should have

their operators on the left side of the line instead of the right side

..........

Webkit

return attribute.name() == srcAttr

|| attribute.name() == lowsrcAttr;

Final note: Most of the formatting guidelines can be enforced by using clang-tidy and clang-format

72/78

Code

Documentation and

Comments

Programmers vs. Documentation

73/78

Code Documentation

※ Comment what the code does and why

.........

LLVM,

............

CoreCpp

- Avoid how it is implemented at low level

- All ﬁles should report a brief description of their purpose

- Describe classes and methods

∗ Don’t say in comments what can be clearly stated in code

............

CoreCpp

∗ Document each entity (functions, classes, namespaces, deﬁnitions, etc.) and

only in the declarations, e.g. header ﬁles

74/78

Function Documentation

∗ The ﬁrst sentence (beginning with @brief ) is used as an abstract

∗ Document the inputs: @param[in] , @param[in,out] , and template parameters @tparam

∗ Document outputs: return value @return and output parameters @param[out]

...........

Google,

..........

Unreal

∗ Document preconditions: input ranges, impossible values (e.g. nullptr ),

status/return values meaning

..........

Unreal

∗ Document program state changes (e.g. static ), arguments with lifetime

beyond the duration of the method call (e.g. constructors), performance

implications

...........

Google,

..........

Unreal
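A sketch of a function documented with the tags listed above. split_seconds is a hypothetical function; the comment layout follows common Doxygen usage and is not prescribed by any single cited guide.

```cpp
#include <cassert>

/**
 * @brief Splits a duration into whole minutes and remaining seconds.
 *
 * @param[in]  total_seconds  duration in seconds, must be >= 0
 * @param[out] minutes        number of whole minutes in total_seconds
 * @return the remaining seconds, in the range [0, 59]
 */
inline int split_seconds(int total_seconds, int& minutes) {
    assert(total_seconds >= 0);  // documented precondition
    minutes = total_seconds / 60;
    return total_seconds % 60;
}
```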

75/78

Comment Syntax

∗ Prefer // comment instead of /* */ → prevent bugs and allow string-search

tools like grep to identify valid code lines

.....

Hic,

.....

µOS

• Use the same style of comment // , /// , //* , //! , etc.

• Multiple lines and single line comments can have diﬀerent styles

/**

* comment1

* comment2

*/

/// single line

• µOS++ Doxygen style guide link

• Teaching the art of great documentation, by Google

76/78

Other Comment Issues

• Use anchors for indicating special issues: TODO , FIXME , BUG , etc.

..........

Webkit,

.............

Chromium

• Only one space between statement and comment

..........

Webkit

77/78

File Documentation

∗ Every file should start with a license (even scripts)

..........

Google,

........

LLVM

• Each file should include

- @author name, surname, affiliation, email

- @date e.g. year and month

- @file the purpose of the file

in both header and source files

78/78

Modern C++

Programming

16. Debugging and Testing

Federico Busato

2025-04-14

Table of Contents

1 Debugging Overview

2 Assertions

3 Execution Debugging

Breakpoints

Watchpoints / Catchpoints

Control Flow

Stack and Info

Disassemble

std::breakpoint

1/82

Table of Contents

4 Memory Debugging

valgrind

5 Hardening Techniques

Stack Usage

Standard Library Checks

Undeﬁned Behavior Protections

Control Flow Protections

2/82

Table of Contents

6 Sanitizers

Address Sanitizer

Leak Sanitizer

Memory Sanitizers

Undeﬁned Behavior Sanitizer

Sampling-Based Sanitizer

7 Debugging Summary

8 Compiler Warnings

3/82

Table of Contents

9 Static Analysis

10 Code Testing

Unit Testing

Test-Driven Development (TDD)

Code Coverage

Fuzz Testing

11 Code Quality

clang-tidy

4/82

Feature Complete

5/82

Debugging Overview

Is this a bug?

for (int i = 0; i <= (2^32) - 1; i++) {

“Software developers spend 35-50 percent of their time validating and debugging software. The cost of debugging, testing, and verification is estimated to account for 50-75 percent of the total budget of software development projects”

from: John Regehr (on Twitter)

The Debugging Mindset

6/82

Errors, Defects, and Failures

• An error is a human mistake. Errors lead to software defects

• A defect is an unexpected behavior of the software (correctness, performance, etc.). Defects potentially lead to software failures

• A failure is an observable incorrect behavior

7/82

Cost of Software Defects 1/2

8/82

Cost of Software Defects 2/2

Some examples:

• The Millennium Bug (2000): $100 billion

• The Morris Worm (1988): $10 million (single student)

• Ariane 5 (1996): $370 million

• Knight’s unintended trades (2012): $440 million

• Bitcoin exchange error (2011): $1.5 million

• Pentium FDIV Bug (1994): $475 million

• Boeing 737 MAX (2019): $3.9 million

see also:

11 of the most costly software errors in history

Historical Software Accidents and Errors

List of software bugs

9/82

Types of Software Defects

Ordered by fix complexity (time to fix):

(1) Typos, Syntax, Formatting (seconds)

(2) Compilation Warnings/Errors (seconds, minutes)

(3) Logic, Arithmetic, Runtime Errors (minutes, hours, days)

(4) Resource Errors (minutes, hours, days)

(5) Accuracy Errors (hours, days)

(6) Performance Errors (days)

(7) Design Errors (weeks, months)

10/82

Causes of Bugs

• C++ is a very error-prone language, see 60 terrible tips for a C++ developer

• Human behavior, e.g. copying and pasting code is a very common practice and can introduce subtle bugs → check the code carefully and make sure you deeply understand its behavior

11/82

Program Errors

A program error is a set of conditions that produce an incorrect result or unexpected

behavior, including performance regression, memory consumption, early termination,

etc.

We can distinguish between two kinds of errors:

Recoverable Conditions that are not under the control of the program. They

indicate “exceptional" run-time conditions. e.g. ﬁle not found, bad

allocation, wrong user input, etc.

Unrecoverable It is a synonym of a bug. It indicates a problem in the program logic.

The program must terminate and be modiﬁed. e.g. out-of-bound,

division by zero, etc.

A recoverable error should be considered unrecoverable if it is extremely rare and difficult to handle, e.g. bad allocation due to an out-of-memory error
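The distinction can be sketched in code: a recoverable condition is reported to the caller, while an unrecoverable one is treated as a bug and guarded by an assertion. parse_digit and element_at are hypothetical functions.

```cpp
#include <cassert>
#include <optional>

// Recoverable: wrong user input is outside the program's control,
// so the caller gets a value it can test and react to
std::optional<int> parse_digit(char c) {
    if (c < '0' || c > '9') return std::nullopt;
    return c - '0';
}

// Unrecoverable: an out-of-bound index is a logic error in the caller,
// so it is checked with an assertion instead of being handled
int element_at(const int* data, int size, int index) {
    assert(data != nullptr && index >= 0 && index < size);
    return data[index];
}
```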

12/82

Dealing with Software Defects

Software defects can be identiﬁed by:

Dynamic Analysis A

mitigation strategy that acts on the runtime state of a program.

Techniques: Print, run-time debugging, sanitizers, fuzzing, unit test support, performance regression tests

Limitations: Infeasible to cover all program states

Static Analysis A

proactive strategy that examines the source code for (potential)

errors.

Techniques: Warnings, static analysis tool, compile-time checks

Limitations: Turing’s undecidability theorem, exponential code paths

13/82

Assertions

Unrecoverable Errors and Assertions

Unrecoverable errors cannot be handled. They should be prevented by using assertions to ensure pre-conditions and post-conditions

An assertion is a statement to detect a violated assumption. An assertion represents an invariant in the code

Assertions can be checked both at run-time ( assert ) and at compile-time ( static_assert ). Run-time assertion failures should never be exposed in normal program execution (e.g. release/public builds)

14/82

Assertion

# include <cassert>     // <-- needed for "assert"
# include <cmath>       // std::isfinite
# include <type_traits> // std::is_arithmetic_v

template<typename T>
T sqrt(T value) {
    static_assert(std::is_arithmetic_v<T>,          // precondition
                  "T must be an arithmetic type");
    assert(std::isfinite(value) && value >= 0);     // precondition
    T ret = ...                                     // sqrt computation
    assert(std::isfinite(ret) && ret >= 0 &&        // postcondition
           (value <= 1 || ret < value));            // sqrt shrinks values > 1
    return ret;
}

15/82

Assertion

Assertions may slow down the execution. They can be disabled by deﬁning the

NDEBUG macro

# define NDEBUG // or with the flag "-DNDEBUG"

Additionally, MSVC deﬁnes the _DEBUG macro when the /MTd or /MDd ﬂags are

provided to select the debug version of the C run-time library

16/82

Assertion Enhancements 1/2

boost.org/libs/assert provides an enhanced version of assert to help the debugging process

The library provides the BOOST_ASSERT(expr) macro which, when BOOST_ENABLE_ASSERT_HANDLER is defined, is mapped to the following function (to implement and customize)

void boost::assertion_failed(

const char* expr, // failed expression

const char* function, // function name of the failed assertion

const char* file, // file name of the failed assertion

long line); // line number of the failed assertion
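The same idea can be sketched without Boost to show how such a hook works. Everything here is hypothetical (the demo namespace, DEMO_ASSERT, and the failure counter); the handler only mirrors the parameter list shown above and is not Boost's implementation.

```cpp
#include <cstdio>

namespace demo {

inline int failure_count = 0;  // C++17 inline variable, used only for this sketch

// Custom handler: receives the same information as boost::assertion_failed
inline void assertion_failed(const char* expr, const char* function,
                             const char* file, long line) {
    std::fprintf(stderr, "%s:%ld: %s: assertion '%s' failed\n",
                 file, line, function, expr);
    ++failure_count;           // a real handler would typically log and abort
}

} // namespace demo

// Forwards the failed expression and its source location to the handler
#define DEMO_ASSERT(expr)                                             \
    ((expr) ? static_cast<void>(0)                                    \
            : demo::assertion_failed(#expr, __func__, __FILE__, __LINE__))
```

Routing failures through one function makes it possible to attach extra context, such as a stack trace, in a single place.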

17/82

Assertion Enhancements 2/2

boost.org/libs/stacktrace allows printing the stack trace at a given point of the program

boost::stacktrace::stacktrace() constructs an object holding the current stack trace, which can be streamed or converted to a string

It can be combined with boost::assertion_failed , exception handling, or signal handling to enhance debugging information

0# bar(int) at /path/to/source/file.cpp:70
1# bar(int) at /path/to/source/file.cpp:70
2# bar(int) at /path/to/source/file.cpp:70
3# bar(int) at /path/to/source/file.cpp:70
4# main at /path/to/main.cpp:93
5# __libc_start_main in /lib/x86_64-linux-gnu/libc.so.6
6# _start

18/82

Execution

Debugging

Execution Debugging (gdb) 1/2

How to compile and run for debugging:

g++ -O0 -g [-g3] <program.cpp> -o program

gdb [--args] ./program <args...>

-O0 Disable any code optimization to help the debugger. It is the default for most compilers

-g Enable debugging

- stores the symbol table information in the executable (mapping between assembly

and source code lines)

- for some compilers, it may disable certain optimizations

- slows down the compilation phase and the execution

-g3 Produces enhanced debugging information, e.g. macro deﬁnitions. Available for

most compilers. Suggested instead of -g

19/82

Execution Debugging (gdb) 2/2

Additional ﬂags:

-ggdb3 Generate speciﬁc debugging information for gdb.

Equivalent to -g3 with gcc

-fno-omit-frame-pointer Do not remove information that can be used to

reconstruct the call stack

-fasynchronous-unwind-tables Allow precise stack unwinding

How to build highly-debuggable C++ binaries

20/82

gdb - Breakpoints

Command                              Abbr.    Description
break <file>:<line>                  b        Insert a breakpoint at a specific line
break <function_name>                b        Insert a breakpoint at a specific function
break <func/line> if <condition>     b        Insert a breakpoint with a conditional statement
delete                               d        Delete all breakpoints or watchpoints
delete <breakpoint_number>           d        Delete a specific breakpoint
clear [function_name/line_number]             Delete the breakpoints at a specific function/line
enable/disable <breakpoint_number>            Enable/Disable a specific breakpoint
info breakpoints                     info b   List all active breakpoints

21/82

gdb - Watchpoints / Catchpoints

Command Abbr. Description

watch <expression> Stop execution when the value of expression changes (variable, comparison, etc.)

rwatch <variable/location> Stop execution when variable/location is read

delete <watchpoint_number> d Delete a speciﬁc watchpoint

info watchpoints List all active watchpoints

catch throw Stop execution when an exception is thrown
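A minimal sketch of code on which watchpoints and catchpoints are useful. The names `call_count` and `checked_get` are hypothetical and only serve the example.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Global counter: a convenient target for `watch call_count`.
int call_count = 0;

// Returns values[index]. std::vector::at throws std::out_of_range for an
// invalid index, which `catch throw` intercepts at the throw site.
int checked_get(const std::vector<int>& values, std::size_t index) {
    ++call_count;             // a watchpoint on call_count stops here
    return values.at(index);  // may throw std::out_of_range
}

// Possible gdb session:
//   (gdb) watch call_count     stop whenever the counter changes
//   (gdb) catch throw          stop when any exception is thrown
//   (gdb) info watchpoints     list the active watchpoints
```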


gdb - Control Flow

Command Abbr. Description

run [args] r Run the program

continue c Continue the execution

finish fin Continue until the end of the current function

step s Execute next line of code (follow function calls)

next n Execute next line of code

until <program_point> Continue until the given line number, function name, address, etc. is reached

CTRL+C Stop the execution (not quit)

quit q Exit

help [<command>] h Show help about command


gdb - Stack and Info

Command Abbr. Description

list l Print code

list <function or #start,#end> l Print function/range code

up Move up in the call stack

down do Move down in the call stack

backtrace [full] bt Prints stack backtrace (call stack) [local vars]

info args Print current function arguments

info locals Print local variables

info variables Print all variables

info <breakpoints/watchpoints/registers> Show information about program breakpoints/watchpoints/registers


gdb - Print

Command Abbr. Description

print <variable> p Print variable

print/x <variable> p/x Print variable in hex

print/t <variable> p/t Print variable in binary

x/tw <address> Print the word at address in binary

p /s <char array/address> Print char array

p *array_var@n Print n array elements

p (int[4])<address> Print four elements of type int

p *(char**)&<std::string> Print std::string


gdb - Disassemble

Command Description

disassemble <function_name> Disassemble a speciﬁed function

disassemble <0xStart,0xEnd addr> Disassemble function range

nexti [count] Execute the next machine instruction (step over function calls)

stepi [count] Execute the next machine instruction (follow function calls)

x/nfu <address> Examine address: n number of elements, f format (d: int, f: float, etc.), u data size (b: byte, w: word, etc.)


std::breakpoint

C++26 provides the <debugging> library, which allows interaction with a debugger directly from the source code, without relying on platform-specific intrinsic instructions

• breakpoint() attempts to temporarily halt the execution of the program and transfer control to the debugger. The behavior is implementation-defined

• breakpoint_if_debugging() halts the execution if a debugger is detected

• is_debugger_present() returns true if the program is executed under a debugger, false otherwise
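A hedged sketch of how these functions could be used. Standard library support for `<debugging>` is still limited, so the code falls back to a no-op when the header or the `__cpp_lib_debugging` feature-test macro is missing. The function `checked_divide` is a hypothetical example, not part of the library.

```cpp
#if __has_include(<debugging>)
#include <debugging>  // C++26, may define nothing in older language modes
#endif

// Divides a by b. On a zero divisor, pauses under a debugger (if one is
// attached and the library supports it) and returns 0.
int checked_divide(int a, int b) {
    if (b == 0) {
#if defined(__cpp_lib_debugging)
        std::breakpoint_if_debugging();  // no effect without a debugger
#endif
        return 0;
    }
    return a / b;
}
```

When no debugger is attached, `checked_divide(1, 0)` simply returns 0, so the same binary can be shipped and debugged.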


gdb - Notes

The debugger automatically stops when:

• a breakpoint is hit (set by using the debugger)

• an assertion fails

• a segmentation fault occurs

• a software breakpoint is triggered (e.g. SIGTRAP on Linux)

github.com/scottt/debugbreak

Full story: www.yolinux.com/TUTORIALS/GDB-Commands.html (it also contains a script for dereferencing STL containers)

gdb reference card V5 link


Memory Debugging

Memory Vulnerabilities 1/3

“70% of all the vulnerabilities in Microsoft products are memory safety

issues"

Matt Miller, Microsoft Security Engineer

“Chrome: 70% of all security bugs are memory safety issues"

Chromium Security Report

“you can expect at least 65% of your security vulnerabilities to be

caused by memory unsafety"

What science can tell us about C and C++’s security

Microsoft: 70% of all security bugs are memory safety issues

Chrome: 70% of all security bugs are memory safety issues

What science can tell us about C and C++’s security


Memory Vulnerabilities 2/3

“Memory Unsafety in Apple’s OS represents 66.3%- 88.2% of all the

vulnerabilities"

“Out of bounds (OOB) reads/writes comprise ∼70% of all the vulnerabilities in Android"

Jeﬀ Vander, Google, Android Media Team

“Memory corruption issues are the root-cause of 68% of listed CVEs"

Ben Hawkes, Google, Project Zero

Memory Unsafety in Apple’s Operating Systems

Google Security Blog: Queue the Hardening Enhancements

Google Project Zero


Memory Vulnerabilities 3/3

Terms like buﬀer overﬂow, race condition, page fault, null pointer, stack exhaustion,

heap exhaustion/corruption, use-after-free, or double free – all describe memory

safety vulnerabilities

Mitigation:

• Run-time check

• Static analysis

• Avoid unsafe language constructs
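As an illustration of the "run-time check" mitigation, the sketch below validates an index before using it. The function name `safe_read` and the fixed array size are assumptions chosen for brevity.

```cpp
#include <array>
#include <cstddef>
#include <optional>

// Run-time check: returns the element at `index`, or std::nullopt when the
// index is out of range, instead of reading past the end of the array.
std::optional<int> safe_read(const std::array<int, 4>& data,
                             std::size_t index) {
    if (index >= data.size()) {
        return std::nullopt;
    }
    return data[index];
}
```

The caller is forced to handle the failure case explicitly, which is the point of replacing a raw, unchecked access.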


valgrind 1/9

valgrind is a tool suite to automatically detect many memory management and threading bugs

How to install the latest version:

$ wget ftp://sourceware.org/pub/valgrind/valgrind-3.21.tar.bz2

$ tar xf valgrind-3.21.tar.bz2

$ cd valgrind-3.21

$ ./configure --enable-lto

$ make -j 12

$ sudo make install

$ sudo apt install libc6-dbg #if needed

Some Linux distributions provide the package through apt install valgrind, but it could be an old version


valgrind 2/9

Basic usage:

• compile with -g

• $ valgrind ./program <args...>

Output example 1:

==60127== Invalid read of size 4 !!out-of-bound access

==60127== at 0x100000D9E: f(int) (main.cpp:86)

==60127== by 0x100000C22: main (main.cpp:40)

==60127== Address 0x10042c148 is 0 bytes after a block of size 40 alloc'd

==60127== at 0x1000161EF: malloc (vg_replace_malloc.c:236)

==60127== by 0x100000C88: f(int) (main.cpp:75)

==60127== by 0x100000C22: main (main.cpp:40)


valgrind 3/9

Output example 2:

!!memory leak

==19182== 40 bytes in 1 blocks are definitely lost in loss record 1 of 1

==19182== at 0x1B8FF5CD: malloc (vg_replace_malloc.c:130)

==19182== by 0x8048385: f (main.cpp:5)

==19182== by 0x80483AB: main (main.cpp:11)

==60127== HEAP SUMMARY:

==60127== in use at exit: 4,184 bytes in 2 blocks

==60127== total heap usage: 3 allocs, 1 frees, 4,224 bytes allocated

==60127==

==60127== LEAK SUMMARY:

==60127== definitely lost: 128 bytes in 1 blocks !!memory leak

==60127== indirectly lost: 0 bytes in 0 blocks

==60127== possibly lost: 0 bytes in 0 blocks

==60127== still reachable: 4,184 bytes in 2 blocks !!not deallocated

==60127== suppressed: 0 bytes in 0 blocks


valgrind 4/9

Memory leaks are divided into four categories:

• Deﬁnitely lost

• Indirectly lost

• Still reachable

• Possibly lost

When a program terminates, the operating system reclaims all its heap memory allocations. Despite this, leaving memory leaks is considered bad practice and makes the program unsafe when a functionality is executed multiple times internally. If a program leaks memory in a single iteration, is it safe for multiple iterations?

A robust program prevents any memory leak even when abnormal conditions occur


valgrind 5/9

Definitely lost indicates blocks that are not deleted at the end of the program (return from the main() function). The common case is local variables pointing to newly allocated heap memory

void f() {

int* y = new int[3]; // 12 bytes definitely lost

}

int main() {

int* x = new int[10]; // 40 bytes definitely lost

f();

}
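One possible way to avoid this kind of leak is to let an owning object release the memory. The sketch below uses `std::vector` and `std::unique_ptr`; the function names are hypothetical.

```cpp
#include <memory>
#include <numeric>
#include <vector>

// The heap block is owned by the std::vector and released on return.
int sum_first(int n) {
    std::vector<int> y(n);
    std::iota(y.begin(), y.end(), 1);  // 1, 2, ..., n
    return std::accumulate(y.begin(), y.end(), 0);
}

// std::unique_ptr<int[]> calls delete[] when it goes out of scope.
int last_of_ten() {
    auto x = std::make_unique<int[]>(10);  // elements are zero-initialized
    x[9] = 42;
    return x[9];
}
```

With this style valgrind is expected to report no "definitely lost" blocks for these functions, since every allocation has an owner.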


valgrind 6/9

Indirectly lost indicates blocks pointed by other heap variables that are not deleted.

The common case is global variables pointing to newly allocated heap memory

struct A {

int* array;

};

int main() {

A* x = new A; // 8 bytes definitely lost

x->array = new int[4]; // 16 bytes indirectly lost

}
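A sketch of the same structure with ownership expressed through smart pointers, so that neither the outer object nor the array it points to is lost. The function `use_a` is a hypothetical name.

```cpp
#include <memory>

struct A {
    std::unique_ptr<int[]> array;  // owned: released by A's destructor
};

int use_a() {
    auto x = std::make_unique<A>();         // owner of the A object
    x->array = std::make_unique<int[]>(4);  // owner of the inner array
    x->array[0] = 7;
    return x->array[0];
}  // x is destroyed here: the A object is deleted, which releases array
```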


valgrind 7/9

Still reachable indicates blocks that are not deleted but they are still reachable at the

end of the program

int* array;

int main() {

array = new int[3];

}

// 12 bytes still reachable (global static class could delete it)

# include <cstdlib>

int main() {

int* array = new int[3];

std::abort(); // early abnormal termination

// 12 bytes still reachable

... // maybe it is deleted here

}


valgrind 8/9

Possibly lost indicates blocks that are still reachable but pointer arithmetic makes the

deletion more complex, or even not possible

# include <cstdlib>

int main() {

int* array = new int[3];

array++; // pointer arithmetic

std::abort(); // early abnormal termination

// 12 bytes possibly lost

... // maybe it is deleted here, but the pointer arithmetic

// must be reverted first

}


valgrind 9/9

Advanced ﬂags:

• --leak-check=full print details for each “definitely lost" or “possibly lost" block, including where it was allocated

• --show-leak-kinds=all to combine with --leak-check=full. Print all leak kinds

• --track-fds=yes list open file descriptors on exit (not closed)

• --track-origins=yes track the origin of uninitialized values (very slow execution)

valgrind --leak-check=full --show-leak-kinds=all

--track-fds=yes --track-origins=yes ./program <args...>

Track stack usage:

valgrind --tool=drd --show-stack-usage=yes ./program <args...>


Hardening Techniques

Overview and References

Hardening techniques are compiler and linker options that enhance the security and

reliability of applications by mitigating vulnerabilities such as memory safety issues,

undeﬁned behavior, and exploitation risks

• Compiler Options Hardening Guide for C and C++ [March, 2024]

• Hardened mode of standard library implementations


Compile-time Stack Usage

• -Wstack-usage=<byte-size> Warn if the stack usage of a function might exceed byte-size. The computation done to determine the stack usage is conservative (no VLA)

• -fstack-usage Makes the compiler output stack usage information for the program, on a per-function basis

• -Wvla Warn if a variable-length array is used in the code

• -Wvla-larger-than=<byte-size> Warn for declarations of variable-length arrays whose size is either unbounded, or bounded by an argument that allows the array size to exceed byte-size bytes

Use compiler flags for stack protection in GCC and Clang


Compile-time Stack Protection

• -Wtrampolines Check whether the compiler generates trampolines for pointers to nested functions, which may interfere with stack virtual memory protection

• -Wl,-z,noexecstack Enable data execution prevention by marking stack memory as non-executable


Run-time Stack Usage

• -fstack-clash-protection Enables run-time checks for variable-size stack allocation validity

• -fstack-protector-strong Enables run-time checks for stack-based buffer overflows using a strong heuristic

• -fstack-protector-all Enables run-time checks for stack-based buffer overflows for all functions


libc Buﬀer Overﬂow Checks 1/2

_FORTIFY_SOURCE define: the compiler provides buffer overflow checks for the following functions:

memcpy, mempcpy, memmove, memset, strcpy, stpcpy, strncpy, strcat, strncat, sprintf, vsprintf, snprintf, vsnprintf, gets.

Recent compilers (e.g. GCC 12+, Clang 9+) can detect buffer overflows with enhanced coverage, e.g. dynamic pointers, with _FORTIFY_SOURCE=3 *

*GCC’s new fortification level: The gains and costs


libc Buﬀer Overﬂow Checks 2/2

#include <cstring> // std::memset
#include <string>  // std::stoi

int main(int argc, char** argv) {
    int size = std::stoi(argv[1]);
    char buffer[24];
    std::memset(buffer, 0xFF, size);
}

$ g++ -O1 -D_FORTIFY_SOURCE=2 program.cpp -o program

$ ./program 12 # OK

$ ./program 32 # Wrong

*** buffer overflow detected ***: ./program terminated
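Fortification detects the overflow only at run time and terminates the program. A complementary approach, sketched below under the assumption that the caller knows the buffer capacity, is to validate the size explicitly. The helper `checked_fill` is hypothetical.

```cpp
#include <cstddef>
#include <cstring>

// Fills the first `size` bytes of `buffer` with 0xFF.
// Returns false and writes nothing if `size` is negative or does not fit
// in `capacity` bytes.
bool checked_fill(char* buffer, std::size_t capacity, long size) {
    if (size < 0 || static_cast<std::size_t>(size) > capacity) {
        return false;
    }
    std::memset(buffer, 0xFF, static_cast<std::size_t>(size));
    return true;
}
```

With `char buffer[24]`, a call such as `checked_fill(buffer, sizeof(buffer), 32)` is rejected instead of overflowing the stack.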


Standard Library Preconditions

The standard library provides run-time precondition checks for library calls, such as

bounds-checks for strings and containers, and null-pointer checks, etc.

-D_GLIBCXX_ASSERTIONS for libstdc++ (GCC)

-D_LIBCPP_ASSERT, -D_LIBCPP_HARDENING_MODE=_LIBCPP_HARDENING_MODE_EXTENSIVE for libc++ (LLVM)
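A small sketch of a library call with a precondition. The function `element` is a hypothetical wrapper; the behavior described in the comment for hardened builds is the expected one, not something this snippet verifies.

```cpp
#include <cstddef>
#include <vector>

// Precondition: index < values.size().
// By default operator[] does not verify it. When compiled with the macros
// above, the library is expected to abort with a diagnostic instead of
// reading out of bounds.
int element(const std::vector<int>& values, std::size_t index) {
    return values[index];
}
```

For instance, with libstdc++ one could build with `g++ -D_GLIBCXX_ASSERTIONS program.cpp` and call `element` with an invalid index to observe the check.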


Undeﬁned Behavior Protections 1/2

• -fno-strict-overflow Prevent code optimization (code elimination) due to signed integer undefined behavior

• -fwrapv Signed integers have the same semantics as unsigned integers, with a well-defined wrap-around behavior

• -fno-strict-aliasing Strict aliasing lets the compiler assume that two pointers of different types never refer to the same memory location; violating this assumption is undefined behavior. The flag disables this assumption
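The following sketch shows the kind of check that the first two flags protect, together with a portable alternative that does not depend on them. The function name `increment_overflows` is an assumption for illustration.

```cpp
#include <limits>

// Relies on wrap-around, which is undefined behavior for signed integers:
// an optimizing compiler may fold the test to `false` unless -fwrapv or
// -fno-strict-overflow is used.
//   bool increment_overflows(int x) { return x + 1 < x; }

// Portable alternative: compare against the limit before computing.
bool increment_overflows(int x) {
    return x == std::numeric_limits<int>::max();
}
```

The second form has the same meaning on every compiler and optimization level, so it does not rely on hardening flags.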


Undeﬁned Behavior Protections 2/2

• -fno-delete-null-pointer-checks NULL pointer dereferencing is undefined behavior and the compiler can assume that it never happens. The flag disables this optimization

• -ftrivial-auto-var-init[=<hex pattern>] Ensures that default initialization initializes variables with a fixed 1-byte pattern. Explicitly uninitialized variables require the [[uninitialized]] attribute


Control Flow Protections

• -fcf-protection=full Enable control flow protection to counter Return Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks on many x86 architectures

• -mbranch-protection=standard Enable branch protection to counter Return Oriented Programming (ROP) and Jump Oriented Programming (JOP) attacks on AArch64


Other Run-time Checks

• -fPIE -pie Position-Independent Executable enables the support for address space layout randomization, which makes exploits more difficult

• -Wl,-z,relro,-z,now Prevents modification of the Global Offset Table (locations of functions from dynamically linked libraries) after the program startup

• -Wl,-z,nodlopen Restrict dlopen(3) calls to shared objects


Sanitizers

Address Sanitizer

Sanitizers are compiler-based instrumentation components to perform dynamic

analysis

Sanitizers are used during development and testing to discover and diagnose memory

misuse bugs and potentially dangerous undeﬁned behavior

Sanitizers are implemented in Clang (from 3.1), gcc (from 4.8) and Xcode

Projects using Sanitizers:

• Chromium

• Firefox

• Linux kernel

• Android

Memory error checking in C and C++: Comparing Sanitizers and Valgrind


Address Sanitizer

Address Sanitizer is a memory error detector

• heap/stack/global out-of-bounds

• memory leaks

• use-after-free, use-after-return, use-after-scope

• double-free, invalid free

• initialization order bugs

* Similar to valgrind but faster (about 2X slowdown, versus up to 50X for valgrind)

clang++ -O1 -g -fsanitize=address -fno-omit-frame-pointer <program>

-O1 disable inlining

-g generate symbol table

• github.com/google/sanitizers/wiki/AddressSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html
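To observe an Address Sanitizer report, a deliberately faulty access is needed. The sketch below keeps the defect behind an argument, so that the function is well-defined for valid indices; `read_element` is a hypothetical name and the out-of-bounds case is intentional.

```cpp
// Returns data[index] from a 4-element heap array.
// Valid indices are 0..3. Passing index == 4 is an intentional
// out-of-bounds read, useful only to see the sanitizer report.
int read_element(int index) {
    int* data = new int[4]{10, 20, 30, 40};
    int value = data[index];  // heap-buffer-overflow when index == 4
    delete[] data;
    return value;
}
```

Calling `read_element(4)` in a program built with the command above is expected to stop with a heap-buffer-overflow report that points to the faulty line and to the allocation site.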


Leak Sanitizer

LeakSanitizer is a run-time memory leak detector

• integrated into AddressSanitizer, can be used as standalone tool

* almost no performance overhead until the very end of the process

clang++ -O1 -g -fsanitize=leak -fno-omit-frame-pointer <program>

• github.com/google/sanitizers/wiki/AddressSanitizerLeakSanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Memory Sanitizer

Memory Sanitizer is a detector of uninitialized reads

• stack/heap-allocated memory read before it is written

* Similar to valgrind but faster (3X slowdown)

clang++ -O1 -g -fsanitize=memory -fno-omit-frame-pointer <program>

-fsanitize-memory-track-origins=2 track origins of uninitialized values

Note: not compatible with Address Sanitizer

• github.com/google/sanitizers/wiki/MemorySanitizer

• gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Undeﬁned Behavior Sanitizer

UndefinedBehaviorSanitizer is an undefined behavior detector

• signed integer overflow, floating-point types overflow, enum value out of range

• out-of-bounds array indexing, misaligned address

• divide by zero

• etc.

* Not included in valgrind

clang++ -O1 -g -fsanitize=undefined -fno-omit-frame-pointer <program>

gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html


Undeﬁned Behavior Sanitizer

-fsanitize=<options> :

undefined All of the checks other than float-divide-by-zero, unsigned-integer-overflow, implicit-conversion, local-bounds and the nullability-* group of checks

float-divide-by-zero Undefined behavior in C++, but defined by Clang and IEEE-754

integer Checks for undefined or suspicious integer behavior (e.g. unsigned integer overflow)

implicit-conversion Checks for suspicious behavior of implicit conversions

local-bounds Out of bounds array indexing, in cases where the array bound can be statically determined

nullability Checks passing null as a function parameter, assigning null to an lvalue, and returning null from a function


Sampling-Based Sanitizer

GWPSan is a framework to implement low-overhead sampling-based dynamic binary instrumentation, designed for detecting various bugs where more expensive dynamic analysis would otherwise not be feasible

• tsan (thread-sanitizer) data races

• uar use-after-return bugs

• lmsan uninitialized variables

clang++ -fexperimental-sanitize-metadata=atomics,uar <program>


Sanitizers vs. Valgrind

Valgrind - A neglected tool from the shadows or a serious debugging tool?


Debugging Summary

How to Debug Common Errors

Segmentation fault

• gdb, valgrind, sanitizers

• Segmentation fault right after entering a function → likely a stack overflow

Double free or corruption

• gdb, valgrind, sanitizers

Inﬁnite execution

• gdb + (CTRL + C)

Incorrect results

• valgrind + assertion + gdb + sanitizers


Compiler Warnings

Compiler Warnings - GCC and Clang

Enable speciﬁc warnings:

g++ -W<warning> <args...>

Disable speciﬁc warnings:

g++ -Wno-<warning> <args...>

Common warning ﬂags to minimize accidental mismatches:

-Wall Enables many standard warnings (∼50 warnings)

-Wextra Enables some extra warning ﬂags that are not enabled by -Wall (∼15 warnings)

-Wpedantic Issue all the warnings demanded by strict ISO C/C++

-Werror Treat warnings as errors

Enable ALL warnings, only clang: -Weverything
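As a hedged illustration of an "accidental mismatch" that these flags are meant to catch, the sketch below avoids a signed/unsigned comparison in a loop. The function `count_above` is hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Counts the elements greater than `threshold`.
int count_above(const std::vector<int>& values, int threshold) {
    int count = 0;
    // Declaring the index as `int i` would compare a signed value with the
    // unsigned result of values.size(), which GCC reports in C++ through
    // -Wsign-compare (enabled by -Wall).
    for (std::size_t i = 0; i < values.size(); ++i) {
        if (values[i] > threshold) {
            ++count;
        }
    }
    return count;
}
```

Compiling with `g++ -Wall -Wextra -Wpedantic -Werror` turns such mismatches into build failures, so they cannot go unnoticed.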


Compiler Warnings - MSVC

Enable speciﬁc warnings:

cl.exe /w<level><warning_id> <args...>

Disable specific warnings:

cl.exe /wd<warning_id> <args...>

Common warning ﬂags to minimize accidental mismatches:

/W1 Severe warnings

/W2 Signiﬁcant warnings

/W3 Production quality warnings

/W4 Informational warnings

/Wall All warnings

/WX Treat warnings as errors


Static Analysis

Overview

Static analysis is the process of source code examination to ﬁnd potential issues

Beneﬁts of static code analysis:

• Problem identiﬁcation before the execution

• Analyze the program outside the execution environment

• The analysis is independent from the run-time tests

• Enforce code quality and compliance by ensuring that the code follows speciﬁc

rules and standards

• Identify security vulnerabilities


Static Analyzers - Clang and GCC

The Clang Static Analyzer (LLVM suite) finds bugs by reasoning about the semantics of code (may produce false positives)

void test() {

int i, a[10];

int x = a[i]; // warning: array subscript is undefined

}

scan-build make

The GCC Static Analyzer can diagnose various kinds of problems in C/C++ code at compile-time (e.g. double-free, use-after-free, stdio related, etc.) by adding the -fanalyzer flag


Static Analyzers - cppcheck

The MSVC Static Analyzer enables code analysis and control options (e.g. double-free, use-after-free, stdio related, etc.) by adding the /analyze flag

cppcheck provides code analysis to detect bugs, undefined behavior and dangerous coding constructs. The goal is to detect only real errors in the code (i.e. have very few false positives)

cppcheck --enable=warning,performance,style,portability,information,error <src_file/directory>

cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

cppcheck --enable=<enable_flags> --project=compile_commands.json


Popular Static Analyzers - PVS-Studio, SonarLint

PVS-Studio is a high-quality proprietary (free for open source projects) static code analyzer supporting C, C++

Customers: IBM, Intel, Adobe, Microsoft, Nvidia, Bosch, IdGames, EpicGames, etc.

SonarSource is a static analyzer which inspects source code for bugs, code smells, and security vulnerabilities for multiple languages (C++, Java, etc.)

SonarLint plugin is available for Visual Studio, Visual Studio Code, Eclipse, and IntelliJ IDEA


Other Static Analyzers - FBInfer, DeepCode

FBInfer is a static analysis tool (also available online) that checks for null pointer dereferencing, memory leaks, coding conventions, unavailable APIs, etc.

Customers: Amazon AWS, Facebook/Oculus, Instagram, WhatsApp, Mozilla, Spotify, Uber, Sky, etc.

deepCode is an AI-powered code review system, with machine learning systems trained on billions of lines of code from open-source projects

Available for Visual Studio Code, Sublime, IntelliJ IDEA, and Atom

see also: A curated list of static analysis tools


Code Testing

see Case Study 4: The $440 Million Software Error at Knight Capital

from: Kat Maddox (on Twitter)


Code Testing

Unit Test A unit is the smallest piece of code that can be logically isolated in a system. Unit test refers to the verification of a unit. It assumes full knowledge of the code under test (white-box testing)

Goals: meet specifications/requirements, fast development/debugging

Functional Test Output validation instead of the internal structure (black-box testing)

Goals: performance, regression (same functionalities of previous version), stability, security (e.g. sanitizers), composability (e.g. integration test)


Unit Testing 1/3

Unit testing involves breaking your program into pieces, and subjecting each piece to

a series of tests

Unit testing should observe the following key features:

• Isolation: Each unit test should be independent and avoid external interference

from other parts of the code

• Automation: No user interaction, easy to run and manage

• Small Scope: Unit tests focus on small portions of code or speciﬁc

functionalities, making it easier to identify bugs

Popular C++ Unit testing frameworks:

catch, doctest, Google Test, CppUnit, Boost.Test


Unit Testing 2/3


Unit Testing 3/3

JetBrains C++ Developer Ecosystem 2022


Test-Driven Development (TDD)

Unit testing is often associated with the Test-Driven Development (TDD) methodology. The practice involves the definition of automated functional tests before implementing the functionality

The process consists of the following steps:

1. Write a test for a new functionality

2. Write the minimal code to pass the test

3. Improve/Refactor the code iterating with the test veriﬁcation

4. Go to 1.
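The steps above can be sketched with plain `assert`, without any test framework. The function `clamp_to_range` and its tests are hypothetical and only illustrate the order in which the pieces would be written.

```cpp
#include <cassert>

// Step 2: minimal implementation, written after the tests were defined.
int clamp_to_range(int value, int low, int high) {
    if (value < low) {
        return low;
    }
    if (value > high) {
        return high;
    }
    return value;
}

// Step 1: the tests, written first, describe the expected behavior.
void test_clamp_to_range() {
    assert(clamp_to_range(5, 0, 10) == 5);    // inside the range
    assert(clamp_to_range(-3, 0, 10) == 0);   // below the range
    assert(clamp_to_range(42, 0, 10) == 10);  // above the range
}
```

Step 3 would then refactor `clamp_to_range` while re-running `test_clamp_to_range()` after each change.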


Test-Driven Development (TDD) - Main advantages

• Software design. Strong focus on interface definition, expected behavior, specifications, and requirements before working at a lower level

• Maintainability/Debugging cost. Small, incremental changes allow you to catch bugs as they are introduced. Later refactoring or the introduction of new features still relies on well-defined tests

• Understandable behavior. New users can learn how the system works and its properties from the tests

• Increased confidence. Developers are more confident that their code will work as intended because it has been extensively tested

• Faster development. Incremental changes, high confidence, and automation make it easy to move through different functionalities or enhance existing ones


catch 1/2

Catch2 is a multi-paradigm test framework for C++

Catch2 features

• Header only and no external dependencies

• Assertion macro

• Floating point tolerance comparisons

Basic usage:

• Create the test program

• Run the test

$ ./test_program [<TestName>]

• github.com/catchorg/Catch2

• The Little Things: Testing with Catch2


catch 2/2

#define CATCH_CONFIG_MAIN // This tells Catch to provide a main()
#include "catch.hpp"      // only do this in one cpp file

unsigned Factorial(unsigned number) {
    return number <= 1 ? number : Factorial(number - 1) * number;
}

// TEST_CASE arguments: "Test description" and "[tag name]"
TEST_CASE( "Factorials are computed", "[Factorial]" ) {
    REQUIRE( Factorial(1) == 1 );
    REQUIRE( Factorial(2) == 2 );
    REQUIRE( Factorial(3) == 6 );
    REQUIRE( Factorial(10) == 3628800 );
}

float floatComputation() { ... }

TEST_CASE( "floatCmp computed", "[floatComputation]" ) {
    REQUIRE( floatComputation() == Approx( 2.1 ) );
}


Code Coverage 1/3

Code coverage is a measure used to describe the degree to which the source code of

a program is executed when a particular execution/test suite runs

gcov and llvm-profdata/llvm-cov are tools used in conjunction with compiler

instrumentation (gcc, clang) to interpret and visualize the raw code coverage

generated during the execution

gcovr and lcov are utilities for managing gcov/llvm-cov at higher level and

generating code coverage results

Steps for code coverage:

• Compile with the --coverage flag (objects + linking)

• Run the program / test

• Visualize the results with gcovr, llvm-cov, lcov


Code Coverage 2/3

program.cpp:

# include <iostream>

# include <string>

int main(int argc, char* argv[]) {

int value = std::stoi(argv[1]);

if (value % 3 == 0)

std::cout << "first\n";

if (value % 2 == 0)

std::cout << "second\n";

}

$ g++ -g --coverage program.cpp -o program

$ ./program 9

first

$ gcovr -r <program_path> --html --html-details -o coverage.html # generate html

# or

$ lcov --capture --directory <program_path> --output-file coverage.info

$ genhtml coverage.info --output-directory <program_path> # generate html


Code Coverage 3/3

1: 4:int main(int argc, char* argv[]) {

1: 5: int value = std::stoi(argv[1]);

1: 6: if (value % 3 == 0)

1: 7: std::cout << "first\n";

1: 8: if (value % 2 == 0)

#####: 9: std::cout << "second\n";

4: 10:}


Coverage-Guided Fuzz Testing

A fuzzer is a specialized tool that tracks which areas of the code are reached, and

generates mutations on the corpus of input data in order to maximize the code

coverage

LibFuzzer is the library provided by LLVM; it feeds fuzzed inputs to the library under test via a specific fuzzing entry point

The fuzz target function accepts an array of bytes and does something interesting with these

bytes using the API under test:

extern "C" int LLVMFuzzerTestOneInput(const uint8_t* Data,

size_t Size) {

DoSomethingInterestingWithMyAPI(Data, Size);

return 0;

}
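A concrete version of the target could look like the sketch below. `parse_header` is a hypothetical stand-in for the API under test, with an invented "HDR" format; only `LLVMFuzzerTestOneInput` is the entry point expected by LibFuzzer.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical API under test: accepts a buffer that starts with "HDR",
// followed by one byte holding the number of remaining payload bytes.
bool parse_header(const std::uint8_t* data, std::size_t size) {
    if (size < 4) {
        return false;  // too short to contain the header
    }
    if (data[0] != 'H' || data[1] != 'D' || data[2] != 'R') {
        return false;  // wrong magic bytes
    }
    return static_cast<std::size_t>(data[3]) == size - 4;
}

// Fuzzing entry point: called repeatedly with mutated inputs.
extern "C" int LLVMFuzzerTestOneInput(const std::uint8_t* Data,
                                      std::size_t Size) {
    parse_header(Data, Size);  // result ignored: the fuzzer looks for crashes
    return 0;
}
```

A target like this is typically built without a `main()`, for example with `clang++ -g -fsanitize=fuzzer,address fuzz_target.cpp`, where the file name is an assumption.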


Code Quality

Linters - clang-tidy 1/2

lint: The term was derived from the name of the undesirable bits of ﬁber

clang-tidy provides an extensible framework for diagnosing and fixing typical programming errors, like style violations, interface misuse, or bugs that can be deduced via static analysis

$ cmake -DCMAKE_EXPORT_COMPILE_COMMANDS=ON .

$ clang-tidy -p .

clang-tidy searches for the .clang-tidy configuration file in the closest parent directory of the input file

clang-tidy is included in the LLVM suite


Linters - clang-tidy 2/2

Coding Guidelines:

• CERT Secure Coding Guidelines

• C++ Core Guidelines

• High Integrity C++ Coding Standard

Supported Code Conventions:

• Fuchsia

• Google

• LLVM

Bug Related:

• Android related

• Boost library related

• Misc

• Modernize

• Performance

• Readability

• clang-analyzer checks

• bugprone code constructs

.clang-tidy

Checks: 'android-*,boost-*,bugprone-*,cert-*,cppcoreguidelines-*,

clang-analyzer-*,fuchsia-*,google-*,hicpp-*,llvm-*,misc-*,modernize-*,

performance-*,readability-*'


Modern C++

Programming

17. C++ Ecosystem

CMake and Other Tools

Federico Busato

2025-04-14

Table of Contents

1 CMake

ctest

2 Code Documentation

doxygen

3 Code Statistics

Count Lines of Code

Cyclomatic Complexity Analyzer


Table of Contents

4 Other Tools

Code Formatting - clang-format

Compiler Explorer

Code Transformation - CppInsights

AI-Powered Code Completion

Local Code Search - ugrep, ripgrep, hypergrep

Code Search Engine - searchcode, grep.app

Code Benchmarking - Quick-Bench

Font for Coding


CMake

CMake Overview

CMake is an open-source, cross-platform family of tools designed to build, test and package software

CMake is used to control the software compilation process using simple platform- and compiler-independent configuration files, and generates native Makefile/Ninja files and workspaces that can be used in the compiler environment of your choice

CMake features:

• Turing complete language (if/else, loops, functions, etc.)

• Multi-platform (Windows, Linux, etc.)

• Open-Source

• Generate: makefile, ninja, etc.

• Supported by many IDEs: Visual Studio, CLion, Eclipse, etc.


CMake Books

Professional CMake: A Practical Guide (14th)

C. Scott, 2023

Modern CMake for C++ (2nd)

R. Świdziński, 2024


CMake - References

• 19 reasons why CMake is actually awesome

• An Introduction to Modern CMake

• Effective Modern CMake

• Awesome CMake

• Useful Variables


Install CMake

Using the Kitware APT repository

$ wget -O - https://apt.kitware.com/keys/kitware-archive-latest.asc 2>/dev/null |

gpg --dearmor - | sudo tee /etc/apt/trusted.gpg.d/kitware.gpg >/dev/null

$ sudo apt-add-repository 'deb https://apt.kitware.com/ubuntu/ focal main' # bionic, xenial

$ sudo apt update

$ sudo apt install cmake cmake-curses-gui

Using the installer or the pre-compiled binaries: cmake.org/download/

# download the latest cmake package, e.g. cmake-x.y.z-linux-x86_64.sh

$ sudo sh cmake-x.y.z-linux-x86_64.sh


A Minimal Example

CMakeLists.txt:

project(my_project) # project name

add_executable(program program.cpp) # compile command

# we are in the project root dir

$ mkdir build # 'build' dir is needed to isolate temporary files

$ cd build

$ cmake .. # search for CMakeLists.txt directory

$ cmake --build . # makefile automatically generated, -j to parallelize the build

Scanning dependencies of target program

[100%] Building CXX object CMakeFiles/program.dir/program.cpp.o

Linking CXX executable program

[100%] Built target program


Parameters and Message

CMakeLists.txt:

project(my_project)

add_executable(program program.cpp)

if (VAR)

message("VAR is set, NUM is ${NUM}")

else()

message(FATAL_ERROR "VAR is not set")

endif()

$ cmake ..

VAR is not set

$ cmake -DVAR=ON -DNUM=4 ..

VAR is set, NUM is 4

...

[100%] Built target program


Language Properties

cmake_minimum_required(VERSION 3.15) # must precede project()

project(my_project
        DESCRIPTION "Hello World"
        HOMEPAGE_URL "github.com/"
        LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 14) # force C++14
set(CMAKE_CXX_STANDARD_REQUIRED ON)
set(CMAKE_CXX_EXTENSIONS OFF) # no compiler extensions

add_executable(program ${PROJECT_SOURCE_DIR}/program.cpp)
# PROJECT_SOURCE_DIR is the root directory of the project


Target Commands

add_executable(program) # also add_library(program)

target_include_directories(program

PUBLIC include/

PRIVATE src/)

# target_include_directories(program SYSTEM ...) for system headers

target_sources(program # best way for specifying

PRIVATE src/program1.cpp # program sources and headers

PRIVATE src/program2.cpp

PUBLIC include/header.hpp)

target_compile_definitions(program PRIVATE MY_MACRO=ABCEF)

target_compile_options(program PRIVATE -g)

target_link_libraries(program PRIVATE boost_lib)

target_link_options(program PRIVATE -s)


Build Types

cmake_minimum_required(VERSION 3.15) # minimum version
project(my_project)                  # project name

add_executable(program program.cpp)

if (CMAKE_BUILD_TYPE STREQUAL "Debug") # "Debug" mode
    # cmake already adds "-g" (no optimization flags)
    message("DEBUG mode")
    if (CMAKE_COMPILER_IS_GNUCXX) # if compiler is gcc
        target_compile_options(program PRIVATE "-g3")
    endif()
elseif (CMAKE_BUILD_TYPE STREQUAL "Release") # "Release" mode
    message("RELEASE mode") # cmake already adds "-O3 -DNDEBUG"
endif()

$ cmake -DCMAKE_BUILD_TYPE=Debug ..


Custom Targets and File Managing

project(my_project)

add_executable(program)

add_custom_target(echo_target # makefile target name

COMMAND echo "Hello" # real command

COMMENT "Echo target")

# find all .cpp files in the src/ directory

file(GLOB_RECURSE SRCS ${PROJECT_SOURCE_DIR}/src/*.cpp)

# compile all *.cpp files

target_sources(program PRIVATE ${SRCS}) # prefer the explicit file list instead

$ cmake ..

$ make echo_target


Local and Cached Variables

Cached variables are stored in CMakeCache.txt and reused across multiple runs, while local variables are only visible in a single run. A cached variable set with FORCE is overwritten at every run, so a value passed on the command line is ignored

project(my_project)

set(VAR1 "var1") # local variable

set(VAR2 "var2" CACHE STRING "Description1") # cached variable

set(VAR3 "var3" CACHE STRING "Description2" FORCE) # cached variable

option(OPT "This is an option" ON) # boolean cached variable,

# same behavior as VAR2

message(STATUS "${VAR1}, ${VAR2}, ${VAR3}, ${OPT}")

$ cmake .. # var1, var2, var3, ON

$ cmake -DVAR1=a -DVAR2=b -DVAR3=c -DOPT=d .. # var1, b, var3, d

13/41

Manage Cached Variables

$ ccmake . # or 'cmake-gui'

14/41

Find Packages

cmake_minimum_required(VERSION 3.15) # minimum version

project(my_project) # project name

add_executable(program program.cpp)

find_package(Doxygen REQUIRED) # the configuration fails if Doxygen is not found

find_package(Boost 1.87.0) # search for version 1.87.0 or a compatible one

if (Boost_FOUND)

target_include_directories(program PUBLIC "${PROJECT_SOURCE_DIR}/include" ${Boost_INCLUDE_DIRS})

else()

message(FATAL_ERROR "Boost Lib not found")

endif()

15/41

Compile Commands

Generate JSON compilation database (compile_commands.json)

It contains the exact compiler calls for each ﬁle that are used by other tools

cmake_minimum_required(VERSION 3.15)

project(my_project)

set(CMAKE_EXPORT_COMPILE_COMMANDS ON) # <--

add_executable(program program.cpp)

Change the C/C++ compiler:

CC=clang CXX=clang++ cmake ..

16/41

ctest 1/2

CTest is a testing tool (integrated in CMake) that can be used to automate updating,

configuring, building, testing, memory checking, and coverage analysis

cmake_minimum_required(VERSION 3.5)

project(my_project)

add_executable(program program.cpp)

enable_testing()

add_test(NAME Test1 # check if "program" returns 0

WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/build

COMMAND ./program <args>) # command can be anything

add_test(NAME Test2 # check if "program" print "Correct"

WORKING_DIRECTORY ${PROJECT_SOURCE_DIR}/build

COMMAND ./program <args>)

set_tests_properties(Test2

PROPERTIES PASS_REGULAR_EXPRESSION "Correct")

17/41

ctest 2/2

Basic usage (call ctest):

$ make test # run all tests

ctest usage:

$ ctest -R Python # run all tests whose name contains the 'Python' string

$ ctest -E Iron # run all tests whose name does not contain the 'Iron' string

$ ctest -I 3,5 # run tests from 3 to 5

Each ctest command can be combined with other tools (e.g. valgrind)

18/41

ctest with Diﬀerent Compile Options

It is possible to combine a custom target with ctest to compile the same code with

diﬀerent compile options

add_custom_target(program-compile

COMMAND mkdir -p test-release test-ubsan test-asan # create dirs

COMMAND cmake .. -B test-release # -B sets the build directory

COMMAND cmake .. -B test-ubsan -DUBSAN=ON

COMMAND cmake .. -B test-asan -DASAN=ON

COMMAND make -C test-release -j20 program # -C run make in a

COMMAND make -C test-ubsan -j20 program # different dir

COMMAND make -C test-asan -j20 program)

enable_testing()

add_test(NAME Program-Compile

COMMAND make program-compile)

19/41

CMake Alternatives - xmake

xmake is a cross-platform build utility based on Lua.

Compared with Makefile/CMakeLists.txt, the configuration syntax is more concise

and intuitive. It is friendly to novices, who can get started quickly, and it

lets users focus on the actual project development

Comparison: xmake vs cmake

20/41

Code Documentation

doxygen 1/6

Doxygen is the de facto standard tool for generating documentation from annotated

C++ sources

Doxygen usage

• comment the code with /// or /** comment */

• generate doxygen base conﬁguration ﬁle

$ doxygen -g

• modify the conﬁguration ﬁle Doxyfile

• generate the documentation

$ doxygen <config_file>

21/41

doxygen 2/6

22/41

doxygen 3/6

Doxygen requires the following tags for generating the documentation:

• @file Document a file

• @brief Brief description for an entity

• @param Run-time parameter description

• @tparam Template parameter description

• @return Return value description

23/41

doxygen - Features 4/6

• Automatic cross references between functions, variables, etc.

• Specific highlight: code `<code>`, input/output parameters @param[in] <param>

• Latex/MathJax: $<code>$

• Markdown (Markdown Cheatsheet link): italic text *<code>*, bold text **<code>**, table, list, etc.

• Call/Hierarchy graph can be useful in large projects (requires graphviz)

HAVE_DOT = YES

GRAPHICAL_HIERARCHY = YES

CALL_GRAPH = YES

CALLER_GRAPH = YES

24/41

doxygen - Example 5/6

/**
 * @file
 * @copyright MyProject
 * license BSD3, Apache, MIT, etc.
 * @author MySelf
 * @version v3.14159265359
 * @date March, 2018
 */

/// @brief Namespace brief description
namespace my_namespace {

/// @brief "Class brief description"
/// @tparam R "Class template for"
template<typename R>
class A {};

/**
 * @brief "What the function does?"
 * @details "Some additional details",
 * Latex/MathJax: $\sqrt a$
 * @tparam T Type of input and output
 * @param[in] input Input array
 * @param[out] output Output array
 * @return `true` if correct,
 * `false` otherwise
 * @remark it is *useful* if ...
 * @warning the behavior is **undefined** if
 * @p input is `nullptr`
 * @see related_function
 */
template<typename T>
bool my_function(const T* input, T* output);

/// @brief
void related_function();

} // namespace my_namespace

25/41

doxygen - Call Graph 6/6

26/41

Doxygen Alternatives

M.CSS Doxygen C++ theme

Doxypress Doxygen fork

clang-doc LLVM tool

Sphinx Clear, Functional C++ Documentation with Sphinx + Breathe

+ Doxygen + CMake

standardese The nextgen Doxygen for C++ (experimental)

HDoc The modern documentation tool for C++ (alpha)

Adobe Hyde Utility to facilitate documenting C++

27/41

Code Statistics

Count Lines of Code - cloc

cloc counts blank lines, comment lines, and physical lines of source code in many

programming languages

$ cloc my_project/

4076 text files.

3883 unique files.

1521 files ignored.

http://cloc.sourceforge.net v 1.50 T=12.0 s (209.2 files/s, 70472.1 lines/s)

-------------------------------------------------------------------------------

Language            files     blank   comment      code

-------------------------------------------------------------------------------

C                     135     18718     22862    140483

C/C++ Header          147      7650     12093     44042

Bourne Shell          116      3402      5789     36882

Features: ﬁlter by-ﬁle/language, SQL database, archive support, line count diﬀ, etc.

28/41

Cyclomatic Complexity Analyzer - lizard 1/3

Lizard is an extensible Cyclomatic Complexity Analyzer for many programming

languages including C/C++

Cyclomatic Complexity is a software metric used to indicate the complexity of a program. It

is a quantitative measure of the number of linearly independent paths through the source code

of a program

$ lizard my_project/

==============================================================

NLOC CCN token param function@line@file

--------------------------------------------------------------

10 2 29 2 start_new_player@26@./html_game.c

6 1 3 0 set_shutdown_flag@449@./httpd.c

24 3 61 1 server_main@454@./httpd.c

--------------------------------------------------------------

• CCN: cyclomatic complexity (should not exceed a threshold)

• NLOC: lines of code without comments

• token: Number of tokens in the function

• param: Parameter count of functions
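The CCN column can be related to code with a small example. The function below is a hypothetical illustration (it is not taken from the analyzed project): with the usual counting rule, one base path plus one for each decision point, its cyclomatic complexity is 3. The exact value reported by a specific tool may depend on its counting rules.

```cpp
// Hypothetical example: counts the strictly positive values of an array.
// Cyclomatic complexity = 1 (base path) + 1 (for) + 1 (if) = 3
int count_positive(const int* values, int size) {
    int count = 0;
    for (int i = 0; i < size; ++i) {  // decision point: loop condition
        if (values[i] > 0)            // decision point: branch
            ++count;
    }
    return count;
}
```

Each additional `if`, loop, `case`, or short-circuit operator (`&&`, `||`) adds one more independent path to test.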

29/41

Cyclomatic Complexity Analyzer - lizard 2/3

CCN = 3

30/41

Cyclomatic Complexity Analyzer - lizard 3/3

CC Risk Evaluation

1-10 a simple program, without much risk

11-20 more complex, moderate risk

21-50 complex, high risk

> 50 untestable program, very high risk

CC Guidelines

1-5 The routine is probably ﬁne

6-10 Start to think about ways to simplify the routine

> 10 Break part of the routine

Default warning thresholds: Lizard: 15, OCLint: 10

• www.microsoftpressstore.com/store/code-complete-9780735619678

• blog.feabhas.com/2018/07/code-quality-cyclomatic-complexity

31/41

Other Tools

Code Formatting - clang-format

clang-format is a tool to automatically format C/C++ code (and other languages)

$ clang-format <file/directory>

clang-format searches for the configuration file .clang-format located in the

closest parent directory of the input file

clang-format example:

IndentWidth: 4

UseTab: Never

BreakBeforeBraces: Linux

ColumnLimit: 80

SortIncludes: true

32/41

Compiler Explorer (assembly and execution)

Compiler Explorer is an interactive tool that lets you type source code and see the

assembly output, control flow graph, optimization hints, etc.

Key features: supports multiple architectures and compilers

33/41

Code Transformation - CppInsights

CppInsights: see what your compiler does behind the scenes

34/41

AI-Powered Code Completion 1/2

AI-Powered Code Completion tools help writing code faster by drawing context

from comments and code to suggest individual lines and whole functions

Common features:

• Semantic completion

• Recognize common language patterns

• Use the documentation to infer the function name, return type, and arguments

• Suggest bug ﬁxes

• Generate comments, documentation, and even Pull Request text

35/41

AI-Powered Code Completion 2/2

They are commonly provided as plug-ins for the most popular editors and IDEs

• CoPilot

• TabNine

• Codeium

• Replit Ghostwriter

• CodeWhisperer

36/41

Local Code Search - ugrep, ripgrep, hypergrep

ugrep, Ripgrep, Hypergrep are code-search tools based on regex patterns

Features:

• Recursive search by default

• Skip .gitignore patterns, binary and hidden ﬁles/directories

• Windows, Linux, Mac OS support

• Up to 100x faster than GNU grep

37/41

Code Search Engine - searchcode

Searchcode is a free source code search engine

Features:

• Search over 20 billion lines of code from 7,000,000 projects

• Search sources: github, bitbucket, gitlab, google code, sourceforge, etc.

38/41

Code Search Engine - grep.app

grep.app searches across half a million GitHub repos

39/41

Code Benchmarking - Quick-Bench

Quick-benchmark is a micro benchmarking tool intended to quickly and simply

compare the performance of two or more code snippets. The benchmark runs on a

pool of AWS machines

40/41

Font for Coding

Many editors allow adding optimized fonts for programming which improve legibility

and provide extra symbols (ligatures)

Some examples:

• JetBrains Mono

• Fira Code

• Microsoft Cascadia

• Consolas Ligaturized

41/41

Modern C++

Programming

18. Utilities

Federico Busato

2025-04-14

Table of Contents

1 I/O Stream

Manipulator

ofstream/ifstream

2 Strings and std::print

std::string

Conversion from/to Numeric Values

std::string_view

std::format

std::print

3 View

std::span

4 Math Libraries

<cmath> Math Library

<limits> Numerical Limits

<numbers> Mathematical Constants

5 Random Number

Basic Concepts

C++ <random>

Seed

PRNG Period and Quality

Distribution

Recent Algorithms and Performance

Quasi-random

6 Time Measuring

Wall-Clock Time

User Time

System Time

7 Std Classes

std::pair

std::tuple

std::variant

std::optional

std::any

std::stacktrace

8 Filesystem Library

Query Methods

Modify Methods

6/88

I/O Stream

<iostream> input/output library refers to a family of classes and supporting

functions in the C++ Standard Library that implement stream-based input/output

capabilities

There are four predefined iostreams:

• cin standard input (stdin)

• cout standard output (stdout) [buffered]

• cerr standard error (stderr) [unbuffered]

• clog standard error (stderr) [buffered]

buffered: the content of the buffer is not written to disk / console until some event

occurs (e.g. the stream is flushed)

7/88

I/O Stream (manipulator) 1/3

Basic I/O Stream manipulators:

• flush flushes the output stream: cout << flush;

• endl shortcut for cout << "\n" << flush; → cout << endl;

• flush and endl force the program to synchronize with the terminal → very

slow operation!

8/88

I/O Stream (manipulator) 2/3

• Set integral representation: default: dec

cout << dec << 0xF; prints 15

cout << hex << 16; prints 10

cout << oct << 8; prints 10

• Print the underlying bit representation of a value:

#include <bit> // std::bit_cast, C++20

#include <bitset>

#include <cstdint>

std::cout << std::bitset<32>(std::bit_cast<std::uint32_t>(3.45f)); // (32: num. of bits)

// print 01000000010111001100110011001101

• Print true/false text (it only affects bool values):

cout << boolalpha << true; prints true

cout << boolalpha << false; prints false

9/88

I/O Stream (manipulator) 3/3

• Set the precision: default: 6

(significant digits with the default float representation, digits after the decimal point with fixed/scientific)

cout << setprecision(2) << 3.538; → 3.5

• Set float representation: default: std::defaultfloat

cout << setprecision(2) << fixed << 32.5; → 32.50

cout << setprecision(2) << scientific << 32.5; → 3.25e+01

• Set alignment: default: right

cout << right << setw(7) << "abc" << "##"; → ␣␣␣␣abc##

cout << left << setw(7) << "abc" << "##"; → abc␣␣␣␣##

(better than using tab \t)
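The manipulators can be tried without a terminal by writing into a std::ostringstream, which accepts the same manipulators as cout. The following is a minimal sketch; the helper name `format_demo` is hypothetical and only used for illustration.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a few values with the stream manipulators
std::string format_demo() {
    std::ostringstream out;
    out << std::hex << 255 << ' ';                             // "ff"
    out << std::dec << std::boolalpha << true << ' ';          // "true"
    out << std::fixed << std::setprecision(2) << 32.5 << ' ';  // "32.50"
    out << std::right << std::setw(5) << "ab" << '|';          // "   ab|"
    return out.str();
}
```

Note that setw applies only to the next output operation, while the other manipulators used here stay active until they are changed.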

10/88

I/O Stream - std::cin

std::cin is an example of input stream: data coming from a source is read by the program.

In this example the source is the standard input

#include <iostream>

int main() {
    int a;
    std::cout << "Please enter an integer value:" << std::endl;
    std::cin >> a;
    int b;
    float c;
    std::cout << "Please enter an integer value "
              << "followed by a float value:" << std::endl;
    std::cin >> b >> c; // read an integer and store it into "b",
                        // then read a float value and store it into "c"
}

11/88

I/O Stream - ofstream/ifstream 1/3

ifstream, ofstream are input and output file streams

• Open a file for reading

Open a file in input mode: ifstream my_file("example.txt")

• Open a file for writing

Open a file in output mode: ofstream my_file("example.txt")

Open a file in append mode: ofstream my_file("example.txt", ios::out | ios::app)

• Read a line: getline(my_file, string)

• Close a file: my_file.close()

• Check the stream integrity: my_file.good()

12/88

I/O Stream - ofstream/ifstream 2/3

• Peek the next character

char current_char = my_file.peek()

• Get the next character (and advance)

char current_char = my_file.get()

• Get the position of the current character in the input stream

int byte_offset = my_file.tellg()

• Set the char position in the input sequence

my_file.seekg(byte_offset) (absolute position)

my_file.seekg(byte_offset, position) (relative position)

where position can be:

ios::beg (the beginning), ios::end (the end), ios::cur (current position)

13/88

I/O Stream - ofstream/ifstream 3/3

• Ignore characters until the delimiter is found

my_file.ignore(max_stream_size, <delim>)

e.g. skip until end of line \n

• Get a pointer to the stream buﬀer object currently associated with the stream

my_file.rdbuf()

can be used to redirect ﬁle stream

14/88

I/O Stream - Example 1

Open a ﬁle and print line by line:

#include <iostream>
#include <fstream>
#include <string>

int main() {
    std::ifstream fin("example.txt");
    std::string str;
    while (std::getline(fin, str))
        std::cout << str << "\n";
    fin.close();
}

An alternative version with redirection:

#include <iostream>
#include <fstream>

int main() {
    std::ifstream fin("example.txt");
    std::cout << fin.rdbuf();
    fin.close();
}

Reading files line by line in C++ using ifstream

15/88

I/O Stream - Example 2

example.txt:

23␣70␣␣␣44\n

\t57\t89

The input stream is independent of the type of whitespace (multiple spaces, tabs,

newlines \n, \r\n, etc.)

Another example:

#include <iostream>
#include <fstream>

int main() {
    std::ifstream fin("example.txt");
    char c = fin.peek();  // c = '2'
    while (fin.good()) {  // note: assumes no whitespace after the last value
        int var;
        fin >> var;
        std::cout << var;
    }                     // print 2370445789
    fin.clear();          // reset the end-of-file state before seeking
    fin.seekg(4);
    c = fin.peek();       // c = '0'
    fin.close();
}

16/88

I/O Stream - Check the End of a File

• Check the current character

while (fin.peek() != std::char_traits<char>::eof()) // C: EOF

fin >> var;

• Check if the read operation fails

while (fin >> var)

...

• Check if the stream is past the end of the file

while (true) {

fin >> var;

if (fin.eof())

break;

}
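The second form, testing the result of the read itself, is usually the most robust: the loop body runs only after a successful extraction, so trailing whitespace or a final newline does not produce a spurious extra iteration. A minimal sketch, using a std::istringstream in place of a file so that it is self-contained; the function name `read_all_ints` is hypothetical.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: extracts all whitespace-separated integers from a text
std::vector<int> read_all_ints(const std::string& text) {
    std::istringstream in(text);  // behaves like an ifstream for extraction
    std::vector<int> values;
    int var;
    while (in >> var)             // stops at the first failed extraction
        values.push_back(var);
    return values;
}
```

The same loop works unchanged with a std::ifstream.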

17/88

I/O Stream (checkRegularFile)

Check if a file is not a directory and can be read

#include <fstream>
#include <sys/types.h>
#include <sys/stat.h>

bool checkRegularFile(const char* file_path) {
    struct stat info;
    if (::stat(file_path, &info) != 0)
        return false;                       // unable to access
    if (S_ISDIR(info.st_mode))
        return false;                       // is a directory
    std::ifstream fin(file_path);           // additional checking
    if (!fin.is_open() || !fin.good())
        return false;
    fin.exceptions(std::ios_base::badbit);  // throw on unrecoverable I/O errors
    try {                                   // try to read
        char c; fin >> c;
    } catch (const std::ios_base::failure&) {
        return false;
    }
    return true;
}

18/88

I/O Stream - File size

Get the file size in bytes in a portable way:

#include <fstream>

long long int fileSize(const char* file_path) {
    std::ifstream fin(file_path, std::ios::binary);  // open the file (no newline translation)
    fin.seekg(0, std::ios::beg);                     // move to the first byte
    std::istream::pos_type start_pos = fin.tellg();  // get the start offset
    fin.seekg(0, std::ios::end);                     // move past the last byte
    std::istream::pos_type end_pos = fin.tellg();    // get the end offset
    return end_pos - start_pos;                      // position difference
}

see C++17 ﬁle system utilities

19/88

Strings and std::print

std::string 1/4

std::string is a wrapper of character sequences

More ﬂexible and safer than raw char array but can be slower

#include <string>

int main() {

std::string a; // empty string

std::string b("first");

using namespace std::string_literals; // C++14

std::string c = "second"s; // C++14

}

std::string supports constexpr in C++20

20/88

std::string - Capacity and Search 2/4

• empty() returns true if the string is empty, false otherwise

• size() returns the number of characters in the string

• find(string) returns the position of the first substring equal to the given character sequence or npos if no substring is found

• rfind(string) returns the position of the last substring equal to the given character sequence or npos if no substring is found

• find_first_of(char_seq) returns the position of the first character equal to one of the characters in the given character sequence or npos if no character is found

• find_last_of(char_seq) returns the position of the last character equal to one of the characters in the given character sequence or npos if no character is found

npos special value returned by string methods

21/88

std::string - Operations 3/4

• new_string substr(start_pos) returns a new substring [start_pos, size())

new_string substr(start_pos, count) returns a new substring [start_pos, start_pos + count)

• clear() removes all characters from the string

• erase(pos) removes all characters from position pos to the end of the string

erase(start_pos, count) removes the characters at positions [start_pos, start_pos + count)

• replace(start_pos, count, new_string) replaces the part of the string indicated by [start_pos, start_pos + count) with new_string

• c_str() returns a pointer to the raw (null-terminated) char sequence

22/88

std::string - Overloaded Operators 4/4

• access specified character: string1[i]

• string copy: string1 = string2

• string compare: string1 == string2, works also with !=, <, <=, >, >=

• concatenate two strings: string_concat = string1 + string2

• append characters to the end: string1 += string2

23/88

Conversion from/to Numeric Values

Converts a string to a numeric value C++11:

• stoi(string) string to signed integer

• stol(string) string to long signed integer

• stoul(string) string to long unsigned integer

• stoull(string) string to long long unsigned integer

• stof(string) string to floating point value (float)

• stod(string) string to floating point value (double)

• stold(string) string to floating point value (long double)

• C++17 std::from_chars(start, end, result, base) fast string conversion (no allocation, no exception)

Converts a numeric value to a string:

• C++11 to_string(numeric_value) numeric value to string
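std::from_chars reports errors through its return value instead of throwing, so the caller has to check both the error code and how much input was consumed. A minimal sketch, assuming a C++17 compiler with <charconv>; the helper name `parse_int` is hypothetical.

```cpp
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses a base-10 integer.
// Returns true only if the whole input is a valid number that fits in int
bool parse_int(std::string_view text, int& result) {
    const char* first = text.data();
    const char* last  = text.data() + text.size();
    auto [ptr, ec] = std::from_chars(first, last, result);
    return ec == std::errc{} && ptr == last;  // no error and nothing left over
}
```

Unlike stoi, from_chars does not skip leading whitespace and does not accept a leading '+' sign.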

24/88

Examples

std::string str("si vis pacem para bellum");

cout << str.size(); // print 24

cout << str.find("vis"); // print 3

cout << str.find_last_of("bla"); // print 21, 'l' found

cout << str.substr(7, 5);// print "pacem", pos=7 and count=5

cout << str[1]; // print 'i'

cout << (str == "vis"); // print 0 (false)

cout << (str < "z"); // print 1 (true)

const char* raw_str = str.c_str();

cout << string("a") + "b"; // print "ab"

cout << string("ab").erase(0, 1); // print "b"

const char* str2 = "34";

int a = std::stoi(str2); // a = 34;

std::string str3 = std::to_string(a); // str3 = "34"

25/88

Tips

• Conversion from integer to char letter (e.g. 2 → 'C'):

static_cast<char>('A' + value)

value ∈ [0, 25] (English alphabet)

• Conversion from char letter to integer (e.g. 'C' → 2):

letter - 'A'

letter ∈ ['A', 'Z']

• Conversion from digit to char number (e.g. 3 → '3'):

static_cast<char>('0'+ value)

value ∈ [0, 9]

• char to string std::string(1, char_value)
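The conversions above can be wrapped in small functions. This is a sketch with hypothetical function names; it assumes an ASCII-compatible character set for the letters (the C++ standard guarantees contiguous codes only for the digits '0'-'9').

```cpp
// Hypothetical helpers for the char/integer conversions listed above

// index in [0, 25] -> uppercase letter, e.g. 2 -> 'C' (assumes ASCII)
char index_to_letter(int index) { return static_cast<char>('A' + index); }

// uppercase letter -> index in [0, 25], e.g. 'C' -> 2 (assumes ASCII)
int letter_to_index(char letter) { return letter - 'A'; }

// digit in [0, 9] -> character, e.g. 3 -> '3'
char digit_to_char(int digit) { return static_cast<char>('0' + digit); }
```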

26/88

std::string_view 1/3

C++17 std::string_view describes a minimum common interface to interact with

string data:

• const std::string&

• const char*

The purpose of std::string_view is to avoid copying data which is already owned

by the original object

#include <string>

#include <string_view>

std::string str = "abc"; // new memory allocation + copy

std::string_view sv = "abc"; // only the reference

27/88

std::string_view 2/3

std::string_view provides functionality similar to std::string

#include <iostream>

#include <string>

#include <string_view>

void string_op1(const std::string& str) {}

void string_op2(std::string_view str) {}

string_op1("abcdef"); // allocation + copy

string_op2("abcdef"); // reference

const char* str1 = "abcdef";

std::string str2(str1); // allocation + copy

std::cout << str2.substr(0, 3); // print "abc"

std::string_view str3(str1); // reference

std::cout << str3.substr(0, 3); // print "abc"
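Because substr on a std::string_view returns another view, operations such as trimming can be written without any allocation. A minimal sketch, assuming C++17; the helper name `trim` is hypothetical.

```cpp
#include <string_view>

// Hypothetical helper: returns the input without leading/trailing spaces.
// No character is copied: the result refers to the caller's buffer
std::string_view trim(std::string_view text) {
    const auto first = text.find_first_not_of(' ');
    if (first == std::string_view::npos)
        return {};  // empty input or only spaces
    const auto last = text.find_last_not_of(' ');
    return text.substr(first, last - first + 1);
}
```

The returned view is valid only as long as the original character data is alive, e.g. it must not outlive a temporary std::string passed as argument.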

28/88

std::string_view 3/3

std::string_view supports constexpr constructor and methods

constexpr std::string_view str1("abc");

constexpr std::string_view str2 = "abc";

constexpr char c = str1[0]; // 'a'

constexpr bool b = (str1 == str2); // 'true'

constexpr int size = str1.size(); // '3'

constexpr std::string_view str3 = str1.substr(0, 2); // "ab"

constexpr int pos = str1.find("bc"); // '1'

29/88

std::format 1/2

printf functions: no automatic type deduction, error prone, not extensible

stream objects: very verbose, hard to optimize

C++20 std::format provides python style formatting:

• Type-safe

• Support positional arguments

• Extensible (support user-deﬁned types)

• Return a std::string

30/88

std::format - Example 2/2

Integer formatting

std::format("{}", 3); // "3"

std::format("{:b}", 3); // "11"

Floating point formatting

std::format("{:.1f}", 3.273); // "3.3"

Alignment

std::format("{:>6}", 3.27); // "␣␣3.27"

std::format("{:<6}", 3.27); // "3.27␣␣"

Argument reordering

std::format("{1} - {0}", 1, 3); // "3 - 1"

31/88

std::print

C++23 introduces std::print() and std::println()

std::print("Hello, {}!\n", name);

std::println("Hello, {}!", name); // prints a newline

std::print in C++23

32/88

View

std::span 1/3

C++20 introduces std::span which is a non-owning view of an underlying sequence

or array

std::span can either have a static extent, in which case the number of elements

in the sequence is known at compile-time, or a dynamic extent

template<

class T,

std::size_t Extent = std::dynamic_extent

> class span;

33/88

std::span 2/3

#include <span>
#include <array>
#include <vector>

int array1[] = {1, 2, 3};
std::span s1{array1};              // static extent

std::array<int, 3> array2 = {1, 2, 3};
std::span s2{array2};              // static extent

auto array3 = new int[3];
std::span s3{array3, 3};           // dynamic extent

std::vector<int> v{1, 2, 3};
std::span s4{v.data(), v.size()};  // dynamic extent
std::span s5{v};                   // dynamic extent

34/88

std::span 3/3

void f(std::span<int> span) {
    for (auto x : span)                       // range-based loop (safe)
        std::cout << x;
    std::fill(span.begin(), span.end(), 3);   // std algorithms
}

int array1[] = {1, 2, 3};

f(array1);

auto array2 = new int[3];

f({array2, 3});

35/88

Math Libraries

<cmath> Math Library 1/2

<cmath> W

• abs(x) computes absolute value, $|x|$, C++11

• exp(x) returns e raised to the given power, $e^x$

• exp2(x) returns 2 raised to the given power, $2^x$, C++11

• log(x) computes natural (base e) logarithm, $\log_e(x)$

• log10(x) computes base 10 logarithm, $\log_{10}(x)$

• log2(x) computes base 2 logarithm, $\log_2(x)$, C++11

• pow(x, y) raises a number to the given power, $x^y$

• sqrt(x) computes square root, $\sqrt{x}$

• cbrt(x) computes cubic root, $\sqrt[3]{x}$, C++11

36/88

<cmath> Math Library 2/2

• sin(x) computes sine, $\sin(x)$

• cos(x) computes cosine, $\cos(x)$

• tan(x) computes tangent, $\tan(x)$

• ceil(x) nearest integer value not less than the given value, $\lceil x \rceil$

• floor(x) nearest integer value not greater than the given value, $\lfloor x \rfloor$

• round(x) rounds to the nearest integer value, halfway cases away from zero

Math functions in C++11 can be applied directly to integral types without implicit/explicit

casting (return type: floating point).

37/88

<limits> Numerical Limits

Get numeric limits of a given type:

<limits> W C++11

T numeric_limits<T>::max() // returns the maximum finite value representable

T numeric_limits<T>::min() // returns the minimum finite value representable for integral types,

// the smallest positive normalized value for floating-point types

T numeric_limits<T>::lowest() // returns the lowest finite value representable
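The difference between min() and lowest() matters only for floating-point types and is a common source of bugs, for example when initializing the running maximum of a reduction. A short sketch; the constant names are chosen here for illustration.

```cpp
#include <limits>

// For floating-point types min() is the smallest positive normalized value,
// so the most negative finite value must be obtained with lowest()
constexpr float float_min    = std::numeric_limits<float>::min();     // > 0
constexpr float float_lowest = std::numeric_limits<float>::lowest();  // == -max()
constexpr float float_max    = std::numeric_limits<float>::max();

// For integral types min() and lowest() are the same value
constexpr int int_min    = std::numeric_limits<int>::min();
constexpr int int_lowest = std::numeric_limits<int>::lowest();

static_assert(float_min > 0.0f, "min() is positive for float");
static_assert(int_min == int_lowest, "min() == lowest() for int");
```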

38/88

<numbers> Mathematical Constants

<numbers> W C++20

The header provides numeric constants

• e Euler number $e$

• pi $\pi$

• phi Golden ratio $\frac{1 + \sqrt{5}}{2}$

• sqrt2 $\sqrt{2}$

39/88

Integer Division

Integer ceiling division and rounded division:

• Ceiling Division: $\left\lceil \frac{value}{div} \right\rceil$

unsigned ceil_div(unsigned value, unsigned div) {

return (value + div - 1) / div;

} // note: may overflow

• Rounded Division: $\left\lfloor \frac{value}{div} + \frac{1}{2} \right\rfloor$

unsigned round_div(unsigned value, unsigned div) {

return (value + div / 2) / div;

} // note: may overflow

Note: do not use ﬂoating-point conversion (see Basic Concept I)
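Both formulas can overflow when value is close to the maximum representable value. One possible overflow-free variant, sketched below with hypothetical function names, combines the quotient and the remainder instead of adding to value first. It assumes div != 0.

```cpp
// Overflow-free ceiling division. Precondition: div != 0
unsigned ceil_div_safe(unsigned value, unsigned div) {
    return value / div + (value % div != 0 ? 1u : 0u);
}

// Overflow-free rounded division (halfway cases round up). Precondition: div != 0
unsigned round_div_safe(unsigned value, unsigned div) {
    unsigned quotient  = value / div;
    unsigned remainder = value % div;
    // round up when remainder / div >= 1/2, written without multiplications;
    // div - remainder cannot underflow because remainder < div
    return quotient + (remainder >= div - remainder ? 1u : 0u);
}
```

The price is an extra modulo operation, which compilers typically compute together with the division.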

40/88

Random Number

“Random numbers should not be generated with a method chosen at random”

— Donald E. Knuth

Applications: cryptography, simulations (e.g. Monte Carlo), etc.

41/88

Random Number

see Lavarand

42/88

Basic Concepts

• A pseudorandom sequence of numbers (produced by a pseudorandom number generator, PRNG)

satisfies most of the statistical properties of a truly random sequence but is generated by a

deterministic algorithm (deterministic finite-state machine)

• A quasirandom sequence of n-dimensional points is generated by a deterministic

algorithm designed to fill an n-dimensional space evenly

• The state of a PRNG describes the status of the generator (the values of its variables),

namely where the system is after a certain amount of transitions

• The seed is a value that initializes the starting state of a PRNG. The same seed always

produces the same sequence of results

• The oﬀset of a sequence is used to skip ahead in the sequence

• PRNGs produce uniformly distributed values. PRNGs can also generate values according

to a probability function (binomial, normal, etc.)

43/88

C++ <random> 1/2

The problem: C rand() function produces poor quality random numbers

• C++14 discourages the use of rand() and srand()

C++11 introduces pseudo random number generation (PRNG) facilities to produce

random numbers by using combinations of generators and distributions

A random generator requires four steps:

(1) Select the seed

(2) Deﬁne the random engine (optional)

<type_of_random_engine> generator(seed)

(3) Deﬁne the distribution

<type_of_distribution> distribution(range_start, range_end)

(4) Produce the random number

distribution(generator)
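The four steps can be sketched with a fixed seed, which makes the output reproducible and therefore easy to test. The helper name `draw` and the choice of std::mt19937 as engine are assumptions made for illustration.

```cpp
#include <random>
#include <vector>

// Hypothetical helper: draws `count` integers in [low, high] from a given seed
std::vector<int> draw(unsigned seed, int count, int low, int high) {
    std::mt19937 generator(seed);                                // (1) seed + (2) engine
    std::uniform_int_distribution<int> distribution(low, high);  // (3) distribution
    std::vector<int> values;
    for (int i = 0; i < count; ++i)
        values.push_back(distribution(generator));               // (4) random number
    return values;
}
```

Calling `draw` twice with the same seed returns the same values on a given standard library implementation; the sequence is not guaranteed to be identical across different implementations because the distribution algorithms are not specified by the standard.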

44/88

C++ <random> 2/2

Simplest example:

#include <iostream>

#include <random>

int main() {

std::random_device rd;

std::default_random_engine generator{rd()};

std::uniform_int_distribution<int> distribution{0, 9};

std::cout << distribution(generator); // first random number

std::cout << distribution(generator); // second random number

}

It generates two random integer numbers in the range [0, 9] by using the default

random engine

45/88

Seed 1/4

Given a seed, the generator always produces the same sequence

The seed could be selected randomly by using the current time:

#include <random>

#include <chrono>

unsigned seed = std::chrono::system_clock::now()
                    .time_since_epoch().count();

std::default_random_engine generator{seed};

chrono::system_clock::now() returns an object representing the current point in time

.time_since_epoch().count() returns the count of ticks that have elapsed since January 1, 1970

(midnight UTC/GMT)

Problem: Consecutive calls return very similar seeds

46/88

Seed 2/4

Pseudo seed: easy to guess, e.g. single source of randomness

Secure seed: hard to guess, e.g. multiple sources of randomness

How do I generate a random integer in C#?

47/88

Seed 3/4

A random device std::random_device is a uniformly distributed integer generator

that produces non-deterministic random numbers, e.g. from a hardware device or

an operating system source such as /dev/urandom

#include <random>

std::random_device rnd_device;

std::default_random_engine generator{rnd_device()};

Note: Not all OSs provide a random device

48/88

Seed 4/4

std::seed_seq consumes a sequence of integer-valued data and produces a number

of unsigned integer values in the range $[0, 2^{32} - 1]$. The produced values are

distributed over the entire 32-bit range even if the consumed values are close

#include <random>

#include <chrono>

unsigned seed1 = std::chrono::system_clock::now()
                     .time_since_epoch().count();

unsigned seed2 = seed1 + 1000;

std::seed_seq seq{seed1, seed2};

std::default_random_engine generator1{seq};

49/88

PRNG Period and Quality

PRNG Period

The period (or cycle length) of a PRNG is the length of the sequence of numbers that the

PRNG generates before repeating

PRNG Quality

(informal) If it is hard to distinguish a generator output from truly random sequences, we call it

a high quality generator. Otherwise, we call it low quality generator

Generator                        Quality   Period                  Randomness

Linear Congruential              Poor      $2^{31} \approx 10^9$   Statistical tests

Mersenne Twister 32/64-bit       High      $10^{6000}$             Statistical tests

Subtract-with-carry 24/48-bit    Highest   $10^{171}$              Mathematically proven

50/88

Randomness Quality

• On C++ Random Number Generator Quality

• It is high time we let go of the Mersenne Twister

51/88

Random Engines

• Linear congruential (LCG)

The simplest generator engine. Modulo-based algorithm:

$x_{i+1} = (\alpha x_i + c) \bmod m$ where α, c, m are implementation defined

C++ Generators: std::minstd_rand, std::minstd_rand0, std::knuth_b

• Mersenne Twister (M. Matsumoto and T. Nishimura, 1997)

Fast generation of high-quality pseudorandom numbers. It relies on a Mersenne prime number.

(used as default random generator in linux)

C++ Generators: std::mt19937, std::mt19937_64

• Subtract-with-carry (LF) (G. Marsaglia and A. Zaman, 1991)

Pseudo-random generation based on the Lagged Fibonacci algorithm (used for example by

physicists at CERN)

C++ Generators: std::ranlux24_base, std::ranlux48_base, std::ranlux24,

std::ranlux48
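The linear congruential recurrence is simple enough to write by hand. The sketch below uses the parameters that the C++ standard specifies for std::minstd_rand (multiplier 48271, increment 0, modulus 2^31 - 1) and compares the result with the standard engine. The names `MinimalLcg` and `matches_minstd_rand` are hypothetical, and the seed is assumed to be in [1, 2^31 - 2].

```cpp
#include <cstdint>
#include <random>

// Minimal LCG sketch: x_{i+1} = (a * x_i + c) mod m
// with the std::minstd_rand parameters a = 48271, c = 0, m = 2^31 - 1
struct MinimalLcg {
    std::uint64_t state;  // 64-bit to hold the product before the modulo

    explicit MinimalLcg(std::uint32_t seed) : state(seed) {}

    std::uint32_t next() {
        state = (48271u * state + 0u) % 2147483647u;
        return static_cast<std::uint32_t>(state);
    }
};

// Compares the sketch against the standard engine for `count` outputs
bool matches_minstd_rand(std::uint32_t seed, int count) {
    MinimalLcg lcg(seed);
    std::minstd_rand reference(seed);
    for (int i = 0; i < count; ++i) {
        if (lcg.next() != reference())
            return false;
    }
    return true;
}
```

The whole state is a single integer that is also the output, which is why this family is fast and compact but trivially predictable.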

52/88

Statistical Tests

The table shows after how many iterations the generator fails the statistical tests

Generator 256M 512M 1G 2G 4G 8G 16G 32G 64G 128G 256G 512G 1T

ranlux24_base ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗

ranlux48_base ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗

minstd_rand ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗

minstd_rand0 ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗

knuth_b ✓ ✓ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗ ✗

mt19937 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✗ ✗

mt19937_64 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✗ ✗

ranlux24 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓

ranlux48 ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓ ✓

53/88

Space and Performance

Generator Predictability State Performance

Linear Congruential Trivial 4-8 B Fast

Knuth Trivial 1 KB Fast

Mersenne Twister Trivial 2 KB Good

ranlux_base Trivial 8-16 B Slow

ranlux Unknown? ∼120 B Super slow

54/88

Distribution

• Uniform distribution

uniform_int_distribution<T>(range_start, range_end) where T is integral type

uniform_real_distribution<T>(range_start, range_end) where T is ﬂoating

point type

• Normal distribution P (x) =

√

2π

−

(x−µ)

2σ

normal_distribution<T>(mean, std_dev)

where T is ﬂoating point type

• Exponential distribution $P(x, \lambda) = \lambda e^{-\lambda x}$

exponential_distribution<T>(lambda)

where T is ﬂoating point type

55/88

Examples

unsigned seed = ...

// Original linear congruential

minstd_rand0 lc1_generator(seed);

// Linear congruential (better tuning)

minstd_rand lc2_generator(seed);

// Standard mersenne twister (64-bit)

mt19937_64 mt64_generator(seed);

// Subtract-with-carry (48-bit)

ranlux48_base swc48_generator(seed);

uniform_int_distribution<int> int_distribution(0, 10);

uniform_real_distribution<float> real_distribution(-3.0f, 4.0f);

exponential_distribution<float> exp_distribution(3.5f);

normal_distribution<double> norm_distribution(5.0, 2.0);
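The declarations above only construct engines and distributions. A number is produced by calling the distribution with the engine as argument. The following sketch shows this step. The function name `uniform_samples` is hypothetical and std::mt19937 is just one possible engine choice.

```cpp
#include <random>
#include <vector>

// Draws `count` integers uniformly distributed in [low, high] (both bounds
// inclusive) from a Mersenne Twister engine initialized with `seed`.
std::vector<int> uniform_samples(unsigned seed, int count, int low, int high) {
    std::mt19937 engine(seed);
    std::uniform_int_distribution<int> distribution(low, high);
    std::vector<int> samples;
    samples.reserve(count);
    for (int i = 0; i < count; ++i)
        samples.push_back(distribution(engine)); // the distribution consumes engine output
    return samples;
}
```

Using the same seed twice yields the same sequence, which is useful for reproducible tests.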

56/88

Recent Algorithms and Performance

Recent algorithms:

• PCG, A Family of Better Random Number Generators

• Xoshiro / Xoroshiro generators and the PRNG shootout

• The Xorshift128+ random number generator fails BigCrush

Parallel algorithms:

• Squares: A Fast Counter-Based RNG

• Parallel Random Numbers: As Easy as 1, 2, 3 (Philox)

• OpenRNG: New Random Number Generator Library for best performance

when porting to Arm

If strong random number quality properties are not needed, it is possible to generate a

random permutation of integer values (with period of 2

) in a very eﬃcient way by

using hashing functions Hash Function Prospector W

57/88

Performance Comparison

Random number generators for C++ performance tested

58/88

Quasi-random 1/2

The quasi-random numbers have the low-discrepancy property that is a measure of

uniformity for the distribution of the point for the multi-dimensional case

• Quasi-random sequence, in comparison to pseudo-random sequence, distributes

evenly, namely this leads to spread the number over the entire region

• The concept of low-discrepancy is associated with the property that the successive

numbers are added in a position as away as possible from the other numbers that

is, avoiding clustering (grouping of numbers close to each other)
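As an illustration of a low-discrepancy sequence, the sketch below computes the base-2 van der Corput sequence. This specific sequence is not mentioned in the slides and is chosen here only because it is one of the simplest quasi-random constructions. The function name is hypothetical.

```cpp
// Element `index` (index >= 1) of the base-2 van der Corput sequence: the
// binary digits of the index are mirrored around the radix point,
// e.g. 6 = 110b -> 0.011b = 0.375.
double van_der_corput(unsigned index) {
    double result       = 0.0;
    double digit_weight = 0.5;       // weight of the first fractional binary digit
    while (index > 0) {
        if (index & 1u)
            result += digit_weight;
        digit_weight *= 0.5;
        index >>= 1;
    }
    return result;
}
```

The first values are 0.5, 0.25, 0.75, 0.125, ...: each new point falls into the largest gap left by the previous ones, which is the "as far away as possible" behavior described above.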

59/88

Quasi-random 2/2

Pseudo-random vs. Quasi random

60/88

Time Measuring

Time Measuring 1/2

Wall-Clock/Real time

It is the human perception of the passage of time from the start to the completion of

a task

User/CPU time

The amount of time spent by the CPU to compute in user code

System time

The amount of time spent by the CPU to compute system calls (including I/O calls)

executed into kernel code

61/88

Time Measuring 2/2

The Wall-clock time measured on a concurrent process platform may include the time

elapsed for other tasks

The User/CPU time of a multi-thread program is the sum of the execution time of all

threads

If the system workload (except the current program) is very low and the program uses

only one thread then

Wall-clock time = User time + System time

62/88

Time Measuring - Wall-Clock Time 1/3

::gettimeofday() : time resolution 1µs

# include <time.h> //struct timeval

# include <sys/time.h> //gettimeofday()

struct timeval start, end; // timeval {second, microseconds}

::gettimeofday(&start, NULL);

...

// code

::gettimeofday(&end, NULL);

long start_time = start.tv_sec * 1000000 + start.tv_usec;

long end_time = end.tv_sec * 1000000 + end.tv_usec;

cout << "Elapsed: " << end_time - start_time; // in microsec

Problems: Linux only (not portable), the time is not monotonically increasing (timezone), and the time resolution is coarse

63/88

Time Measuring - Wall-Clock Time 2/3

std::chrono C++11

# include <chrono>

auto start_time = std::chrono::system_clock::now();

...

// code

auto end_time = std::chrono::system_clock::now();

std::chrono::duration<double> diff = end_time - start_time;

cout << "Elapsed: " << diff.count(); // in seconds

cout << std::chrono::duration_cast<std::chrono::milliseconds>(diff).count(); // in ms

Problems: The time is not monotonic increasing (timezone)

64/88

Time Measuring - Wall-Clock Time 3/3

An alternative of system_clock is steady_clock which ensures monotonic

increasing time.

steady_clock is implemented over clock_gettime on POSIX system and has 1ns

time resolution

# include <chrono>

auto start_time = std::chrono::steady_clock::now();

...

// code

auto end_time = std::chrono::steady_clock::now();

However, the overhead of C++ API is not always negligible, e.g.

Linux libstdc++ → 20ns, Mac libc++ → 41ns
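The start/stop pattern can be wrapped in a small helper so that the measured code is passed as a callable. This is a minimal sketch. The name `elapsed_seconds` is hypothetical, and a single run is measured without warm-up or repetition.

```cpp
#include <chrono>

// Runs `task` once and returns the elapsed wall-clock time in seconds,
// measured with the monotonic std::chrono::steady_clock.
template <typename Callable>
double elapsed_seconds(Callable&& task) {
    auto start_time = std::chrono::steady_clock::now();
    task();
    auto end_time = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end_time - start_time).count();
}
```

For very short tasks the clock overhead mentioned above dominates, so the task should be repeated many times inside the callable.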

Measuring clock precision

65/88

Time Measuring - User Time

std::clock , implemented over clock_gettime on POSIX system and has 1ns

time resolution

# include <ctime> // std::clock, CLOCKS_PER_SEC

clock_t start_time = std::clock();

...

// code

clock_t end_time = std::clock();

float diff = static_cast<float>(end_time - start_time) / CLOCKS_PER_SEC;

cout << "Elapsed: " << diff; // in seconds

66/88

Time Measuring - User/System Time

# include <sys/times.h> // ::times(), struct tms

# include <unistd.h> // ::sysconf()

struct ::tms start_time, end_time;

::times(&start_time);

...

// code

::times(&end_time);

auto user_diff = end_time.tms_utime - start_time.tms_utime;

auto sys_diff = end_time.tms_stime - start_time.tms_stime;

float user = static_cast<float>(user_diff) / ::sysconf(_SC_CLK_TCK);

float sys = static_cast<float>(sys_diff) / ::sysconf(_SC_CLK_TCK);

cout << "user time: " << user; // in seconds

cout << "system time: " << sys; // in seconds

67/88

Std Classes

std::pair 1/2

std::pair ( <utility> ) class couples together a pair of values, which may be of

diﬀerent types

Construct a std::pair

•

std::pair pair1(value1, value2) , C++17 CTAD

• std::pair<T1, T2> pair2(value1, value2)

•

std::pair<T1, T2> pair3 = {value1, value2}

•

auto pair4 = std::make_pair(value1, value2)

Data members:

• first access ﬁrst ﬁeld

•

second access second ﬁeld

Methods:

• comparison

==, <, >, ≥, ≤

• swap

std::swap

68/88

std::pair 2/2

# include <utility>

std::pair<int, std::string> pair1(3, "abc");

std::pair<int, std::string> pair2 = { 4, "zzz" };

auto pair3 = std::make_pair(3, "hgt");

cout << pair1.first; // print 3

cout << pair1.second; // print "abc"

std::swap(pair1, pair2);

cout << pair2.first; // print 3

cout << pair2.second; // print "abc"

cout << (pair1 > pair2); // print 1

Note: std::pair is not trivially copyable

69/88

std::tuple 1/3

std::tuple ( <tuple> ) is a ﬁxed-size collection of heterogeneous values. It is a

generalization of

std::pair . It allows any number of values

Construct a

std::tuple of size 3

# include <tuple>

std::tuple tuple1(value1, value2, value3); // C++17 CTAD

std::tuple<T1, T2, T3> tuple2(value1, value2, value3);

std::tuple<T1, T2, T3> tuple3 = {value1, value2, value3};

auto tuple4 = std::make_tuple(value1, value2, value3);

Get data members

std::get<I>(tuple); // returns the I-th value of the tuple

std::get<type>(tuple); // returns the tuple element with given type

// (compiles only if that type is unique)

Other methods: comparison ==, <, >, ≥, ≤ , swap std::swap

70/88

std::tuple 2/3

• auto t3 = std::tuple_cat(t1, t2)

concatenate two tuples

• const int size = std::tuple_size<TupleT>::value

returns the number of elements in a tuple at compile-time

• using T = typename std::tuple_element<I, TupleT>::type obtains the

type of the speciﬁed element

•

std::tie(value1, value2, value3) = tuple

creates a tuple of references to its arguments

•

std::ignore

an object of unspeciﬁed type such that any value can be assigned to it with no

eﬀect

71/88

std::tuple 3/3

# include <tuple>

std::tuple<int, float, char> f() { return {7, 0.1f, 'a'}; }

std::tuple<int, char, float> tuple1(3, 'c', 2.2f);

auto tuple2 = std::make_tuple(2, 'd', 1.5f);

cout << std::get<0>(tuple1); // print 3

cout << std::get<1>(tuple1); // print 'c'

cout << std::get<2>(tuple1); // print 2.2f

cout << (tuple1 > tuple2); // print true

auto concat = std::tuple_cat(tuple1, tuple2);

cout << std::tuple_size<decltype(concat)>::value; // print 6

using T = std::tuple_element<4, decltype(concat)>::type; // T is char

int value1; float value2;

std::tie(value1, value2, std::ignore) = f();

72/88

std::variant 1/3

<variant> C++17

std::variant represents a type-safe union as the corresponding objects know

which type is currently being held

It can be indexed by:

•

std::get<index>(variant) an integer

•

std::get<type>(variant) a type

# include <variant>

std::variant<int, float, bool> v(3.3f);

auto x = std::get<1>(v); // return 3.3f

auto y = std::get<float>(v); // return 3.3f

// std::get<0>(v); // member 0 is not active, run-time exception!!

73/88

std::variant 2/3

Another useful method is index() which returns the position of the type currently

held by the variant

# include <variant>

std::variant<int, float, bool> v(3.3f);

cout << v.index(); // return 1

v = true; // now 'v' holds a bool

cout << v.index(); // return 2

74/88

std::variant + Visitor 3/3

It is also possible to execute different code at run-time depending on the type currently being held by providing a visitor

# include <variant>

struct Visitor {

void operator()(int& value) { value *= 2; }

void operator()(float& value) { value += 3.0f; } // <--

void operator()(bool& value) { value = true; }

};

std::variant<int, float, bool> v(3.3f);

std::visit(Visitor{}, v); // the visitor is the first argument

cout << std::get<float>(v); // 6.3f

75/88

std::optional 1/2

<optional> C++17

std::optional provides facilities to represent potential “no value” states

As an example, it can be used for representing the state when an element is not found

in a set

# include <optional>

std::optional<int> find(const std::vector<int>& vector, int value_to_search) {

for (int i = 0; i < vector.size(); i++) {

if (vector[i] == value_to_search)

return i;

}

return {}; // std::nullopt;

}

76/88

std::optional 2/2

# include <optional>

std::vector<int> set = { 's', 'd', 'f', 's', 'l', 'g', 'f', 's', 'd', 'g' };

auto x = find(set, 'a'); // 'a' is not present

if (!x)

cout << "not found";

if (!x.has_value())

cout << "not found";

auto y = find(set, 'l');

cout << *y << " " << y.value(); // print '4' '4'

x.value_or('A'); // returns 'A' (no value is held)

y.value_or('A'); // returns 4 (the held value)

77/88

std::any

<any> C++17

std::any holds arbitrary values and provides type-safety

# include <any>

std::any var = 1; // int

cout << var.type().name(); // print 'i'

cout << std::any_cast<int>(var);

// cout << std::any_cast<float>(var); // exception!!

var = 3.14; // double

cout << std::any_cast<double>(var);

var.reset();

cout << var.has_value(); // print 'false'

78/88

std::stacktrace 1/2

C++23 introduces std::stacktrace library to get the current function call stack,

namely the sequence of calls from the

main() entry point

# include <print>

# include <stacktrace> // the program must be linked with the library

// -lstdc++_libbacktrace

// (-lstdc++exp with gcc-14 trunk)

void g() {

auto call_stack = std::stacktrace::current();

for (const auto& entry : call_stack)

std::print("{}\n", entry);

}

void f() { g(); }

int main() { f(); }

79/88

std::stacktrace 2/2

the previous code prints

g() at /app/example.cpp:6

f() at /app/example.cpp:11

main at /app/example.cpp:13

at :0

__libc_start_main at :0

_start at :0

The library also provides additional functions for entry to allow ﬁne-grained control

of the output

description() , source_file() , source_line()

for (const auto& entry : call_stack) { // same output

std::print("{} at {}:{}\n", entry.description(), entry.source_file(),

entry.source_line());

}

80/88

Filesystem Library

C++17 introduces abstractions and facilities for performing operations on ﬁle systems

and their components, such as paths, ﬁles, and directories

• Follow the Boost ﬁlesystem library

• Based on POSIX

• Fully-supported from clang 7, gcc 8, etc.

• Work on Windows, Linux, Android, etc.

81/88

Basic concepts

• ﬁle: a ﬁle system object that holds data

◦ directory a container of directory entries

◦ hard link associates a name with an existing ﬁle

◦ symbolic link associates a name with a path

◦ regular ﬁle a ﬁle that is not one of the other ﬁle types

• ﬁle name: a string of characters that names a ﬁle. Names

. (dot) and ..

(dot-dot) have special meaning at library level

• path: sequence of elements that identiﬁes a ﬁle

◦ absolute path: a path that unambiguously identiﬁes the location of a ﬁle

◦ canonical path: an absolute path that includes no symlinks,

. or .. elements

◦ relative path: a path that identiﬁes a ﬁle relative to some location on the ﬁle system

82/88

path Object

A path object stores the pathname in native form

# include <filesystem> // required

namespace fs = std::filesystem;

fs::path p1 = "/usr/lib/sendmail.cf"; // portable format

fs::path p2 = "C:\\users\\abcdef\\"; // native format on Windows

cout << "p1: " << p1; // /usr/lib/sendmail.cf

cout << "p2: " << p2; // C:\users\abcdef\

cout << "p3: " << p2 / "xyz\\"; // C:\users\abcdef\xyz\

83/88

path Methods

Decomposition (member) methods:

• Return root-name of the path

root_name()

• Return path relative to the root path

relative_path()

• Return the path of the parent path

parent_path()

• Return the ﬁlename path component

filename()

• Return the ﬁle extension path component

extension()
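The decomposition methods operate on the path string only and do not access the file system. The sketch below wraps some of them in small helpers (hypothetical names) and converts the results to std::string with `generic_string()`. The results described afterwards assume a POSIX-style path.

```cpp
#include <filesystem>
#include <string>

namespace fs = std::filesystem;

// Pure path manipulation: the file does not need to exist.
std::string parent_of(const fs::path& p)    { return p.parent_path().generic_string(); }
std::string filename_of(const fs::path& p)  { return p.filename().generic_string(); }
std::string extension_of(const fs::path& p) { return p.extension().generic_string(); }
std::string relative_of(const fs::path& p)  { return p.relative_path().generic_string(); }
```

For `/usr/tmp/my_file.txt` the helpers return `/usr/tmp`, `my_file.txt`, `.txt` (the dot is part of the extension), and `usr/tmp/my_file.txt`.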

84/88

Filesystem Methods - Query

• Check if a ﬁle or path exists

exists(path)

• Return the ﬁle size

file_size(path)

• Check if a ﬁle is a directory

is_directory(path)

• Check if a ﬁle (or directory) is empty

is_empty(path)

• Check if a ﬁle is a regular ﬁle

is_regular_file(path)

• Returns the current path

current_path()

85/88

Directory Iterators

Iterate over ﬁles of a directory (recursively/non-recursively)

# include <filesystem>

namespace fs = std::filesystem;

for(auto& path : fs::directory_iterator("/usr/tmp/"))

cout << path << '\n';

for(auto& path : fs::recursive_directory_iterator("/usr/tmp/"))

cout

<< path << '\n';

86/88

Filesystem Methods - Modify

• Copy ﬁles or directories

copy(path1, path2)

• Copy ﬁles

copy_file(src_path, dst_path, [fs::copy_options::overwrite_existing])

• Create new directory

create_directory(path)

• Remove a ﬁle or empty directory

remove(path)

• Remove a ﬁle or directory and all its contents, recursively

remove_all(path)

• Rename a ﬁle or directory

rename(old_path, new_path)
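The query and modify functions can be combined in a round trip that leaves the system unchanged. This is a sketch under stated assumptions: the directory name `modern_cpp_fs_example` is hypothetical, and the code assumes write permission in the system temporary directory.

```cpp
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// Creates a scratch directory, writes a file, copies and renames it, then
// removes everything. Returns true if every intermediate check succeeded.
bool filesystem_round_trip() {
    fs::path dir = fs::temp_directory_path() / "modern_cpp_fs_example";
    fs::remove_all(dir);                    // start from a clean state
    if (!fs::create_directory(dir))
        return false;

    fs::path original = dir / "a.txt";
    {
        std::ofstream out(original);
        out << "hello";                     // 5 bytes
    }                                       // stream closed (and flushed) here
    fs::path copy = dir / "b.txt";
    fs::copy_file(original, copy);
    fs::rename(copy, dir / "c.txt");

    bool ok = fs::exists(original) && !fs::exists(copy) &&
              fs::file_size(dir / "c.txt") == 5u && !fs::is_empty(dir);
    fs::remove_all(dir);                    // removes the directory and its contents
    return ok && !fs::exists(dir);
}
```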

87/88

Examples

# include <filesystem> // required

namespace fs = std::filesystem;

fs::path p1 = "/usr/tmp/my_file.txt";

cout << fs::exists(p1); // true

cout << p1.parent_path(); // "/usr/tmp"

cout << p1.filename(); // "my_file.txt"

cout << p1.stem(); // "my_file"

cout << p1.extension(); // ".txt"

cout << fs::is_directory(p1); // false

cout << fs::is_regular_file(p1); // true

fs::create_directory("/my_dir/");

fs::copy(p1.parent_path(), "/my_dir/", fs::copy_options::recursive);

fs::copy_file(p1, "/my_dir/my_file2.txt");

fs::remove(p1);

fs::remove_all(p1.parent_path());

88/88

Modern C++

Programming

19. Containers, Iterators,

Ranges, and Algorithms

Federico Busato

2025-04-14

Table of Contents

1 Containers and Iterators

Semantic

2 Sequence Containers

std::array

std::vector

std::deque

std::list

std::forward_list

1/69

Table of Contents

3 Associative Containers

std::set

std::map

std::multiset

4 Container Adaptors

std::stack, std::queue, std::priority_queue

5 Implement a Custom Iterator

Implement a Simple Iterator

2/69

Table of Contents

6 Iterator Notes

7 Iterator Utility Methods

std::advance, std::next

std::prev, std::distance

Container Access Methods

Iterator Traits

3/69

Table of Contents

8 Algorithms Library

std::find_if, std::sort

std::accumulate, std::generate, std::remove_if

4/69

Table of Contents

9 C++20 Ranges

Key Concepts

Range View

Range Adaptor

Range Factory

Range Algorithms

Range Actions

5/69

Containers and

Iterators

Containers and Iterators

Container

A container is a class, a data structure, or an abstract data type, whose instances

are collections of other objects

• Containers store objects following speciﬁc access rules

Iterator

An iterator is an object allowing to traverse a container

• Iterators are a generalization of pointers

• A pointer is the simplest iterator, and it supports all its operations

C++ Standard Template Library (STL) is strongly based on containers and

iterators

6/69

Reasons to use Standard Containers

• STL containers eliminate redundancy, and save time avoiding writing your own

code (productivity)

• STL containers are implemented correctly, so users do not need to spend time debugging them (reliability)

• STL containers are well-implemented and

fast

• STL containers do

not require external libraries

• STL containers

share common interfaces, making it simple to utilize diﬀerent

containers without looking up member function deﬁnitions

• STL containers are well-documented and

easily understood by other developers,

improving the understandability and maintainability

• STL containers are thread safe for concurrent read-only access. Concurrent modification of the same container requires external synchronization

7/69

Container Properties

C++ Standard Template Library (STL) Containers have the following properties:

• Default constructor

• Destructor

• Copy constructor and assignment (deep copy)

• Iterator methods

begin() , end()

• Support

std::swap

• Content-based and order equality (

==, != )

• Lexicographic order comparison ( >, >=, <, <= )

• size()

∗

, empty() , and max_size() methods

∗

except for std::forward_list

8/69

Iterator Concept

STL containers provide the following methods to get iterator objects:

• begin() returns an iterator pointing to the ﬁrst element

• end() returns an iterator pointing to the end of the container (i.e. the element after the

last element)

There are diﬀerent categories of iterators and each of them supports a subset of the

following operations:

Operation Example

Read *it

Write *it =

Increment it++

Decrement it--

Comparison

it1 < it2

Random access it + 4 , it[2]
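A function written only in terms of read, pre-increment, and comparison works with every container and with raw pointers. The sketch below illustrates this. The names `sum_range`, `sum_vector`, and `sum_list` are hypothetical.

```cpp
#include <list>
#include <vector>

// Sums the elements in [first, last) using only read (*it), pre-increment
// (++it), and inequality comparison, which every input iterator provides.
template <typename Iterator>
int sum_range(Iterator first, Iterator last) {
    int total = 0;
    for (Iterator it = first; it != last; ++it)
        total += *it;
    return total;
}

// The same algorithm applied to a random access and a bidirectional container.
int sum_vector(const std::vector<int>& values) { return sum_range(values.begin(), values.end()); }
int sum_list(const std::list<int>& values)     { return sum_range(values.begin(), values.end()); }
```

Operations such as `it + 4` or `it[2]` would restrict the function to random access iterators and exclude std::list.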

9/69

Iterator Categories/Tags

10/69

Iterator Semantic 1/2

Iterator

• Copy Constructible It(const It&)

• Copy Assignable It operator=(const It&)

• Destructible ~It()

• Dereferenceable It_value& operator*()

• Pre-incrementable

It& operator++()

Input/Output Iterator

• Satisfy Iterator

• Equality

bool operator==(const It&)

• Inequality bool operator!=(const It&)

• Post-incrementable

It operator++(int)

Forward Iterator

• Satisfy Input/Output Iterator

• Default constructible

It()

11/69

Iterator Semantics 2/2

Bidirectional Iterator

• Satisfy Forward Iterator

• Pre/post-decrementable It& operator--(), It operator--(int)

Random Access Iterator

• Satisfy Bidirectional Iterator

• Addition/Subtraction

void operator+(const It& it) , void operator+=(const It& it) ,

void operator-(const It& it) , void operator-=(const It& it)

• Comparison

bool operator<(const It& it) , bool operator>(const It& it) ,

bool operator<=(const It& it) , bool operator>=(const It& it)

• Subscripting It_value& operator[](int index)

anderberg.me/2016/07/04/c-custom-iterators/

12/69

Sequence Containers

Overview

Sequence containers are data structures storing objects of the same data type in a linear manner

The STL Sequence Container types are:

•

std::array provides a ﬁxed-size contiguous array (on stack)

•

std::vector provides a dynamic contiguous array ( constexpr in C++20)

•

std::list provides a double-linked list

•

std::deque provides a double-ended queue (implemented as array-of-array)

• std::forward_list provides a single-linked list

While std::string is not included in most container lists, it actually meets the requirements

of a Sequence Container

embeddedartistry.com

13/69

std::array

14/69

std::vector

Other methods:

• resize() resizes the allocated elements of the container

• capacity() number of allocated elements

• reserve() resizes the allocated memory of the container (not size)

• shrink_to_fit() reallocate to remove unused capacity

• clear() removes all elements from the container (no reallocation)
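The difference between the number of elements (size) and the allocated storage (capacity) can be observed directly. A minimal sketch with hypothetical helper names follows. The helpers take the vector by value so that the caller's object is not modified.

```cpp
#include <cstddef>
#include <vector>

struct SizeAndCapacity {
    std::size_t size;
    std::size_t capacity;
};

// reserve() changes only the capacity: the number of elements stays the same.
SizeAndCapacity after_reserve(std::vector<int> values, std::size_t new_capacity) {
    values.reserve(new_capacity);
    return {values.size(), values.capacity()};
}

// resize() changes the number of elements: new elements are
// value-initialized (0 for int), extra elements are destroyed.
std::vector<int> after_resize(std::vector<int> values, std::size_t new_size) {
    values.resize(new_size);
    return values;
}
```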

15/69

std::deque

Other methods:

• resize() resizes the allocated elements of the container

• shrink_to_fit() reallocate to remove unused capacity

• clear() removes all elements from the container (no reallocation)

16/69

std::list

Other methods:

• resize() changes the number of elements of the container

• clear() removes all elements from the container

• remove() removes all elements satisfying speciﬁc criteria

• reverse() reverses the order of the elements

• unique() removes all consecutive duplicate elements

• sort() sorts the container elements

17/69

std::forward_list

Other methods:

• resize() changes the number of elements of the container

• clear() removes all elements from the container

• remove() removes all elements satisfying speciﬁc criteria

• reverse() reverses the order of the elements

• unique() removes all consecutive duplicate elements

• sort() sorts the container elements

18/69

Supported Operations and Complexity

CONTAINERS operator[]/at front back

std::array O(1) O(1) O(1)

std::vector O(1) O(1) O(1)

std::list - O(1) O(1)

std::deque O(1) O(1) O(1)

std::forward_list - O(1) -

CONTAINERS push_front pop_front push_back pop_back insert(it) erase(it)

std::array - - - - - -

std::vector - - O(1)* O(1)* O(n) O(n)

std::list O(1) O(1) O(1) O(1) O(1) O(1)

std::deque O(1)* O(1) O(1) O(1) O(1)*/O(n)† O(1)

std::forward_list O(1) O(1) - - O(1) O(1)

* Amortized time

† Worst case (middle insertion)

19/69

std::array example

# include <algorithm> // std::sort

# include <array>

// std::array supports initialization only through initialization list

std::array<int, 3> arr1 = { 5, 2, 3 };

std::array<int, 4> arr2 = { 1, 2 }; // [2]: 0, [3]: 0

// std::array<int, 3> arr3 = { 1, 2, 3, 4 }; // compiler error

std::array<int, 3> arr4(arr1); // copy constructor

std::array<int, 3> arr5 = arr1; // assign operator

arr5.fill(3); // equal to { 3, 3, 3 }

std::sort(arr1.begin(), arr1.end()); // arr1: 2, 3, 5

cout << (arr1 >= arr5); // false (lexicographic: 2 < 3 at index 0)

cout << sizeof(arr1); // 12

cout << arr1.size(); // 3

for (const auto& it : arr1)

cout << it << ", "; // 2, 3, 5

cout << arr1[0]; // 2

cout << arr1.at(0); // 2, throw if the index is not within the range

cout << arr1.data()[0]; // 2 (raw array)

20/69

std::vector example

# include <vector>

# include <algorithm> // std::fill

std::vector<int> vec1 { 2, 3, 4 };

std::vector<std::string> vec2 = { "abc", "efg" };

std::vector<int> vec3(2); // [0, 0]

std::vector<int> vec4{2}; // [2]

std::vector<int> vec5(5, -1); // [-1, -1, -1, -1, -1]

std::fill(vec5.begin(), vec5.end(), 3); // equal to { 3, 3, 3, 3, 3 }

cout << sizeof(vec1); // 24

cout << vec1.size(); // 3

for (const auto& it : vec1)

cout << it << ", "; // 2, 3, 4

cout << vec1[0]; // 2

cout << vec1.at(0); // 2 (bound check)

cout << vec1.data()[0]; // 2 (raw array)

vec1.push_back(5); // [2, 3, 4, 5]

21/69

std::list example

# include <list>

# include <algorithm> // std::fill

std::list<int> list1 { 2, 3, 2 };

std::list<std::string> list2 = { "abc", "efg" };

std::list<int> list3(2); // [0, 0]

std::list<int> list4{2}; // [2]

std::list<int> list5(2, -1); // [-1, -1]

std::fill(list5.begin(), list5.end(), 3); // [3, 3]

list1.push_back(5); // [2, 3, 2, 5]

list1.sort(); // [2, 2, 3, 5]

list1.merge(list5); // [2, 2, 3, 3, 3, 5] merge two sorted lists

list1.remove(2); // [3, 3, 3, 5]

list1.unique(); // [3, 5]

list1.reverse(); // [5, 3]

22/69

std::deque example

# include <deque>

# include <algorithm> // std::fill

std::deque<int> queue1 { 2, 3, 2 };

std::deque<std::string> queue2 = { "abc", "efg" };

std::deque<int> queue3(2); // [0, 0]

std::deque<int> queue4{2}; // [2]

std::deque<int> queue5(2, -1); // [-1, -1]

std::fill(queue5.begin(), queue5.end(), 3); // [3, 3]

queue1.push_front(5); // [5, 2, 3, 2]

queue1[0]; // returns 5

23/69

std::forward_list example

# include <forward_list>

# include <algorithm> // std::fill

std::forward_list<int> flist1 { 2, 3, 2 };

std::forward_list<std::string> flist2 = { "abc", "efg" };

std::forward_list<int> flist3(2); // [0, 0]

std::forward_list<int> flist4{2}; // [2]

std::forward_list<int> flist5(2, -1); // [-1, -1]

std::fill(flist5.begin(), flist5.end(), 4); // [4, 4]

flist1.push_front(5); // [5, 2, 3, 2]

flist1.insert_after(flist1.begin(), 0); // [5, 0, 2, 3, 2]

flist1.erase_after(flist1.begin()); // [5, 2, 3, 2]

flist1.remove(2); // [5, 3]

flist1.unique(); // [5, 3]

flist1.reverse(); // [3, 5]

flist1.sort(); // [3, 5]

flist1.merge(flist5); // [3, 4, 4, 5] merge two sorted lists

24/69

Associative

Containers

Overview

An associative container is a collection of elements not necessarily indexed with

sequential integers and that supports eﬃcient retrieval of the stored elements through

keys

Keys are unique

• std::set is a collection of sorted unique elements (operator<)

•

std::unordered_set is a collection of unsorted unique keys

•

std::map is a collection of unique <key, value> pairs, sorted by keys

•

std::unordered_map is a collection of unique <key, value> pairs, unsorted

Multiple entries for the same key are permitted

• std::multiset is a collection of sorted elements (operator<)

• std::unordered_multiset is a collection of unsorted elements

•

std::multimap is a collection of <key, value> pairs, sorted by keys

•

std::unordered_multimap is a collection of <key, value> pairs, unsorted

25/69

Internal Representation

Sorted associative containers are typically implemented using red-black trees, while

unordered associative containers (C++11) are implemented using hash tables

Red-Black Tree

Hash Table

26/69

Supported Operations and Complexity

CONTAINERS insert erase count find lower_bound/upper_bound

Ordered Containers O(log(n)) O(log(n)) O(log(n)) O(log(n)) O(log(n))

Unordered Containers O(1)* O(1)* O(1)* O(1)* -

* O(n) worst case

•

count() returns the number of elements with key equal to a speciﬁed argument

• find() returns an iterator to the element with key equal to a specified argument (end() if not found)

•

lower_bound() returns an iterator pointing to the ﬁrst element that is not less than

key

•

upper_bound() returns an iterator pointing to the ﬁrst element that is greater than key

27/69

Other Methods

Ordered/Unordered containers:

• equal_range() returns a range containing all elements with the given key

std::map, std::unordered_map

• operator[]/at() returns a reference to the element having the speciﬁed key in the

container.

•

operator[] if the key is not found, it inserts a new value-initialized element and returns a reference to it

•

at() if the key is not found, raises an exception

Unordered containers:

•

bucket_count() returns the number of buckets in the container

• reserve() sets the number of buckets to the number needed to accommodate at least

count elements without exceeding maximum load factor and rehashes the container
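The different behavior of `operator[]` and `at()` on a missing key can be sketched with a word counter. The function names are hypothetical. The call to `reserve()` is an optional optimization, not a requirement.

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <vector>

// Counts occurrences: operator[] inserts a value-initialized (0) entry the
// first time a key is accessed.
std::unordered_map<std::string, int> count_words(const std::vector<std::string>& words) {
    std::unordered_map<std::string, int> counts;
    counts.reserve(words.size());   // enough buckets up front, no rehash while inserting
    for (const auto& word : words)
        ++counts[word];
    return counts;
}

// at() never inserts: a missing key raises std::out_of_range.
bool at_throws_for_missing_key(const std::unordered_map<std::string, int>& counts,
                               const std::string& key) {
    try {
        (void) counts.at(key);
        return false;
    } catch (const std::out_of_range&) {
        return true;
    }
}
```

Since `operator[]` may insert, it is not available on a const map. `at()` and `find()` are the read-only alternatives.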

28/69

std::set example

# include <set>

std::set<int> set1 { 5, 2, 3, 2, 7 };

std::set<int> set2 = { 2, 3, 2 };

std::set<std::string> set3 = { "abc", "efg" };

std::set<int> set4; // empty set

set2.erase(2); // [ 3 ]

set3.insert("hij"); // [ "abc", "efg", "hij" ]

for (const auto& it : set1)

cout << it << " "; // 2, 3, 5, 7 (sorted)

auto search = set1.find(2); // iterator

cout << (search != set1.end()); // true

auto it = set1.lower_bound(4);

cout << *it; // 5

set1.count(2); // 1, note: it can only be 0 or 1

auto it_pair = set1.equal_range(2); // iterator between [2, 3)

29/69

std::map example

# include <map>

std::map<std::string, int> map1 { {"bb", 5}, {"aa", 3} };

std::map<double, int> map2; // empty map

cout << map1["aa"]; // prints 3

map1["dd"] = 3; // insert <"dd", 3>

map1["dd"] = 7; // change <"dd", 7>

cout << map1["cc"]; // insert <"cc", 0>

for (const auto& it : map1)

cout << it.second << " "; // 3, 5, 0, 7

map1.insert( {"jj", 1} ); // insert pair

auto search = map1.find("jj"); // iterator

cout << (search != map1.end()); // true

auto it = map1.lower_bound("bb");

cout << (*it).second; // 5

30/69

std::multiset example

# include <set> // std::multiset

std::multiset<int> mset1 {1, 2, 5, 2, 2}; // 1, 2, 2, 2, 5

std::multiset<double> mset2; // empty set

mset1.insert(5);

for (const auto& it : mset1)

cout << it << " "; // 1, 2, 2, 2, 5, 5

cout << mset1.count(2); // 3

auto it = mset1.find(5); // iterator

cout << *it; // 5

it = mset1.lower_bound(4);

cout << *it; // 5

31/69

Container Adaptors

Overview

Container adaptors are interfaces for reducing the number of functionalities normally

available in a container

The underlying container of a container adaptor can be optionally specified in the declaration

The STL Container Adaptors are:

•

std::stack LIFO data structure

default underlying container: std::deque

•

std::queue FIFO data structure

default underlying container: std::deque

• std::priority_queue (max) priority queue

default underlying container: std::vector
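The underlying container is the second template argument of the adaptor. For std::priority_queue a third argument selects the comparison. The sketch below uses both to obtain a min-priority queue. The alias `MinQueue` and the function `drain_ascending` are hypothetical names.

```cpp
#include <functional>
#include <queue>
#include <vector>

// Second template argument: underlying container.
// Third template argument: comparison (std::greater puts the smallest on top).
using MinQueue = std::priority_queue<int, std::vector<int>, std::greater<int>>;

// Returns the values in ascending order by repeatedly extracting the top element.
std::vector<int> drain_ascending(const std::vector<int>& values) {
    MinQueue queue(values.begin(), values.end());
    std::vector<int> sorted;
    while (!queue.empty()) {
        sorted.push_back(queue.top());
        queue.pop();
    }
    return sorted;
}
```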

32/69

Container Adaptors Methods

std::stack interface for a FILO (ﬁrst-in, last-out) data structure

• top() accesses the top element

• push() inserts element at the top

• pop() removes the top element

std::queue interface for a FIFO (ﬁrst-in, ﬁrst-out) data structure

• front() access the ﬁrst element

• back() access the last element

• push() inserts element at the end

• pop() removes the ﬁrst element

std::priority_queue interface for a priority queue data structure (lookup to the

largest element by default)

• top() accesses the top element

• push() inserts an element on the proper, sorted position

• pop() removes the ﬁrst/top element

33/69

Container Adaptor Examples

# include <stack> // <--

# include <queue> // <-- also include priority_queue

std::stack<int> stack1;

stack1.push(1); stack1.push(4); // [1, 4]

stack1.top(); // 4

stack1.pop(); // [1]

std::queue<int> queue1;

queue1.push(1); queue1.push(4); // [1, 4]

queue1.front(); // 1

queue1.pop(); // [4]

std::priority_queue<int> pqueue1;

pqueue1.push(1); pqueue1.push(5); pqueue1.push(4); // [5, 4, 1]

pqueue1.top(); // 5

pqueue1.pop(); // [4, 1]

34/69

Implement a Custom

Iterator

Implement a Simple Iterator 1/6

Goal: implement a simple iterator to iterate over a List of elements:

# include <iostream>

# include <algorithm>

// !! List implementation here

int main() {

List list;

list.push_back(2);

list.push_back(4);

list.push_back(7);

std::cout << *std::find(list.begin(), list.end(), 4); // print 4

for (const auto& it : list) // range-based loop

std::cout << it << " "; // 2, 4, 7

}

Range-based loops require:

begin() , end() , pre-increment ++it , not equal comparison

it != end() , dereferencing *it

35/69

Implement a Simple Iterator (List declaration) 2/6

using value_t = int;

struct List {

struct Node { // Internal Node Structure

value_t _value; // Node value

Node* _next; // Pointer to next node

};

Node* _head { nullptr }; // head of the list

Node* _tail { nullptr }; // tail of the list

void push_back(const value_t& value); // insert a value at the end

// !! here we have to define the List iterator "It"

It begin() { return It{_head}; } // begin of the list

It end() { return It{nullptr}; } // end of the list

};

36/69

Implement a Simple Iterator (List deﬁnition) 3/6

void List::push_back(const value_t& value) {

auto new_node = new Node{value, nullptr};

if (_head == nullptr) { // empty list

_head = new_node; // head is updated

_tail = _head;

return;

}

assert(_tail != nullptr); // requires <cassert>

_tail->_next = new_node; // add new node at the end

_tail = new_node; // tail is updated

}

37/69

Implement a Simple Iterator (Iterator declaration) 4/6

struct It {

Node* _ptr; // internal pointer

It(Node* ptr); // Constructor

value_t& operator*(); // Dereferencing

// Not equal -> stop traversing

friend bool operator!=(const It& itA, const It& itB);

It& operator++(); // Pre-increment

It operator++(int); // Post-increment

// !! Type traits here

};

38/69

Implement a Simple Iterator (Iterator deﬁnition) 5/6

List::It::It(Node* ptr) :_ptr(ptr) {}

value_t& List::It::operator*() { return _ptr->_value; }

bool operator!=(const List::It& itA, const List::It& itB) {

return itA._ptr != itB._ptr;

}

List::It& List::It::operator++() {

_ptr = _ptr->_next;

return *this;

}

List::It List::It::operator++(int) {

auto tmp = *this;

++(*this);

return tmp;

}

39/69

Implement a Simple Iterator (Type Traits) 6/6

The type traits of an iterator describe its properties, e.g. the type of the value held, and they are widely used by the std algorithms

The std::iterator class template defines the type traits for an iterator. It has been deprecated in C++17, so users need to provide the type traits explicitly

# include <iterator>

// !! Type traits

using iterator_category = std::forward_iterator_tag;

using difference_type = std::ptrdiff_t;

using value_type = value_t;

using pointer = value_t*;

using reference = value_t&;

internalpointers.com/post/writing-custom-iterators-modern-cpp

Preparation for std::iterator Being Deprecated

40/69

Iterator Notes

Common Errors

Modifying a container while iterators to it are still “active”

# include <vector>

std::vector<int> vec{1, 2, 3, 4, 5};

for (auto x : vec)

vec.push_back(x);

// iterator invalidation!!

41/69

Iterator Utility

Methods

Iterator Operations 1/2

• std::advance(InputIt& it, Distance n)

Increments a given iterator it by n elements

- InputIt must support input iterator requirements

- Modifies the iterator

- Returns void

- More general than adding a value it + 4

- No performance loss if it satisfies random access iterator requirements

• std::next(ForwardIt it, Distance n) C++11

Returns the n-th successor of the iterator

- ForwardIt must support forward iterator requirements

- Does not modify the iterator

- More general than adding a value it + 4

- The compiler should optimize the computation if it satisfies random access iterator requirements

- Supports negative values if it satisfies bidirectional iterator requirements

42/69

Iterator Operations 2/2

• std::prev(BidirectionalIt it, Distance n) C++11

Returns the n-th predecessor of the iterator

- BidirectionalIt must support bidirectional iterator requirements

- Does not modify the iterator

- More general than subtracting a value it - 4

- The compiler should optimize the computation if it satisfies random access iterator requirements

• std::distance(InputIt start, InputIt end)

Returns the number of elements from start to end

- InputIt must support input iterator requirements

- Does not modify the iterator

- More general than computing the iterator difference it2 - it1

- The compiler should optimize the computation if it satisfies random access iterator requirements

- C++11 Supports negative values if it satisfies random access iterator requirements

43/69

Examples

# include <iterator>

# include <iostream>

# include <vector>

# include <forward_list>

int main() {

std::vector<int> vector { 1, 2, 3 }; // random access iterator

auto it1 = std::next(vector.begin(), 2);

auto it2 = std::prev(vector.end(), 2);

std::cout << *it1; // 3

std::cout << *it2; // 2

std::cout << std::distance(it2, it1); // 1

std::advance(it2, 1);

std::cout << *it2; // 3

//--------------------------------------

std::forward_list<int> list { 1, 2, 3 }; // forward iterator

// std::prev(list.end(), 1); // compile error

}

44/69

Container Access Methods

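C++11 provides a generic interface for containers, plain arrays, and std::initializer_list to access the corresponding iterators ( std::begin and std::end since C++11, the const and reverse variants since C++14)

The member functions .begin() , .end() , etc. are not available for plain arrays, and std::initializer_list provides only .begin() and .end()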

• std::begin begin iterator

• std::cbegin begin const iterator

• std::rbegin begin reverse iterator

• std::crbegin begin const reverse iterator

• std::end end iterator

• std::cend end const iterator

• std::rend end reverse iterator

• std::crend end const reverse iterator

# include <iterator>

# include <iostream>

int main() {

int array[] = { 1, 2, 3 };

for (auto it = std::crbegin(array); it != std::crend(array); it++)

std::cout << *it << ", "; // 3, 2, 1

}

45/69

Iterator Traits 1/2

std::iterator_traits allows retrieving iterator properties

• difference_type a type that can be used to identify the distance between iterators

• value_type the type of the values that can be obtained by dereferencing the iterator. This type is void for output iterators

• pointer defines a pointer to the type iterated over ( value_type )

• reference defines a reference to the type iterated over ( value_type )

• iterator_category the category of the iterator. Must be one of the iterator category tags

46/69

Iterator Traits 2/2

# include <iterator>

# include <list>

template<typename T>

void f(const T& list) {

using It = typename T::iterator; // traits are queried on the iterator type, not on the container

using D = typename std::iterator_traits<It>::difference_type; // D is std::ptrdiff_t

// (pointer difference)

// (signed counterpart of size_t)

using V = typename std::iterator_traits<It>::value_type; // V is double

using P = typename std::iterator_traits<It>::pointer; // P is double*

using R = typename std::iterator_traits<It>::reference; // R is double&

// C is std::bidirectional_iterator_tag

using C = typename std::iterator_traits<It>::iterator_category;

}

int main() {

std::list<double> list;

f(list);

}

47/69

Algorithms Library

STL Algorithms Library

C++ STL Algorithms library

The algorithm library provides functions for a variety of purposes (e.g. searching, sorting, counting, manipulating) that operate on ranges of elements

• The STL Algorithm library allows great flexibility, which makes the included functions suitable for solving real-world problems

• The user can adapt and customize the STL through the use of function objects

• Library functions work independently of the container type, including plain arrays

• Many of them support constexpr in C++20

48/69

Examples 1/2

# include <algorithm>

# include <vector>

struct Unary {

bool operator()(int value) {

return value <= 6 && value >= 3;

}

};

struct Descending {

bool operator()(int a, int b) {

return a > b;

}

};

int main() {

std::vector<int> vector { 7, 2, 9, 4 };

// returns an iterator pointing to the first element in the range[3, 6]

std::find_if(vector.begin(), vector.end(), Unary());

// sort in descending order : { 9, 7, 4, 2 };

std::sort(vector.begin(), vector.end(), Descending());

}

49/69

Examples 2/2

# include <algorithm> // std::generate, std::remove_if

# include <functional> // std::multiplies

# include <vector>

# include <cstdlib> // std::rand

# include <numeric> // std::accumulate

struct Unary {

bool operator()(int value) { return value > 100; }

};

int main() {

std::vector<int> vector { 7, 2, 9, 4 };

int product = std::accumulate(vector.begin(), vector.end(), // product = 504

1, std::multiplies<int>());

std::generate(vector.begin(), vector.end(), std::rand);

// now vector has 4 random values

// remove all values > 100 using Erase-remove idiom

auto new_end = std::remove_if(vector.begin(), vector.end(), Unary());

// elements are removed, but vector size is still unchanged

vector.erase(new_end, vector.end()); // shrink vector to finish removal

}

50/69

STL Algorithms Library (Possible Implementations)

std::find

template<class InputIt, class T>

InputIt find(InputIt first, InputIt last, const T& value) {

for (; first != last; ++first) {

if (*first == value)

return first;

}

return last;

}

std::generate

template<class ForwardIt, class Generator>

void generate(ForwardIt first, ForwardIt last, Generator g) {

while (first != last)

*first++ = g();

}

51/69

Algorithm Library 1/5

• swap(v1, v2) Swaps the values of two objects

• min(x, y) Finds the minimum value between x and y

• max(x, y) Finds the maximum value between x and y

• min_element(begin, end) (returns an iterator)

Finds the minimum element in the range [begin, end)

• max_element(begin, end) (returns an iterator)

Finds the maximum element in the range [begin, end)

• minmax_element(begin, end) C++11 (returns a pair of iterators <min,max>)

Finds the minimum and the maximum element in the range [begin, end)

en.cppreference.com/w/cpp/algorithm

52/69

Algorithm Library 2/5

• equal(begin1, end1, begin2)

Determines if two sequences are the same in [begin1, end1), [begin2, begin2 + end1 - begin1)

• mismatch(begin1, end1, begin2) (returns a pair of iterators <pos1,pos2>)

Finds the first position where two ranges differ in [begin1, end1), [begin2, begin2 + end1 - begin1)

• find(begin, end, value) (returns an iterator)

Finds the first element in the range [begin, end) equal to value

• count(begin, end, value)

Counts the number of elements in the range [begin, end) equal to value

53/69

Algorithm Library 3/5

• sort(begin, end) (in-place)

Sorts the elements in the range [begin, end) in ascending order

• merge(begin1, end1, begin2, end2, output)

Merges two sorted ranges [begin1, end1), [begin2, end2), and stores the result in [output, output + (end1 - begin1) + (end2 - begin2))

• unique(begin, end) (in-place)

Removes consecutive duplicate elements in the range [begin, end)

• binary_search(begin, end, value)

Determines if an element value exists in the (sorted) range [begin, end)

• accumulate(begin, end, value)

Sums up the range [begin, end) of elements with initial value (common case equal to zero)

• partial_sum(begin, end, output) (in-place if output is equal to begin)

Computes the inclusive prefix-sum of the range [begin, end)

54/69

Algorithm Library 4/5

• fill(begin, end, value)

Fills a range of elements [begin, end) with value

• iota(begin, end, value) C++11

Fills the range [begin, end) with successive increments of the starting value

• copy(begin1, end1, begin2)

Copies the range of elements [begin1, end1) to the new location [begin2, begin2 + end1 - begin1)

• swap_ranges(begin1, end1, begin2)

Swaps two ranges of elements [begin1, end1), [begin2, begin2 + end1 - begin1)

• remove(begin, end, value) (in-place)

Removes elements equal to value in the range [begin, end)

• includes(begin1, end1, begin2, end2)

Checks if the (sorted) set [begin2, end2) is a subset of [begin1, end1)

55/69

Algorithm Library 5/5

• set_difference(begin1, end1, begin2, end2, output)

Computes the difference between two (sorted) sets

• set_intersection(begin1, end1, begin2, end2, output)

Computes the intersection of two (sorted) sets

• set_symmetric_difference(begin1, end1, begin2, end2, output)

Computes the symmetric difference between two (sorted) sets

• set_union(begin1, end1, begin2, end2, output)

Computes the union of two (sorted) sets

• make_heap(begin, end) Creates a max heap out of the range of elements

• push_heap(begin, end) Adds an element to a max heap

• pop_heap(begin, end) Removes an element (top) from a max heap

56/69

Algorithm Library - Other Examples

#include <algorithm>

#include <numeric> // std::accumulate, std::partial_sum, std::iota

int a = std::max(2, 5); // a = 5

int array1[] = {7, 6, -1, 6, 3};

int array2[] = {8, 2, 0, 3, 7};

int b = *std::max_element(array1, array1 + 5); // b = 7

auto c = std::minmax_element(array1, array1 + 5);

//*c.first = -1, *c.second = 7

bool d = std::equal(array1, array1 + 5, array2); // d = false

std::sort(array1, array1 + 5); // [-1, 3, 6, 6, 7]

std::unique(array1, array1 + 5); // [-1, 3, 6, 7, ?], returns the new logical end

int e = std::accumulate(array1, array1 + 4, 0); // 15

std::partial_sum(array1, array1 + 4, array1); // [-1, 2, 8, 15]

std::iota(array1, array1 + 5, 2); // [2, 3, 4, 5, 6]

std::make_heap(array2, array2 + 5); // [8, 7, 0, 3, 2]

57/69

C++20 Ranges

Ranges are an abstraction that allows operating on the elements of data structures uniformly. They are an extension of the standard iterators

A range is an object that provides begin() and end() methods (an iterator + a sentinel)

begin() returns an iterator, which can be incremented until it reaches end()

template<typename T>

concept range = requires(T& t) {

ranges::begin(t);

ranges::end(t);

};

• An Overview of Standard Ranges

• Range, Algorithms, Views, and Actions - A Comprehensive Guide

• Eric Niebler - Range v3

• Range by Example

58/69

Key Concepts

Range View is a range deﬁned on top of another range

Range Adaptors are utilities to transform a range into a view

Range Factory is a view that generates its own elements instead of adapting an existing range

Range Algorithms are library-provided functions that directly operate on ranges

(corresponding to std iterator algorithm)

Range Action is an object that modiﬁes the underlying data of a range

59/69

Range View 1/2

A range view is a range deﬁned on top of another range that transforms the

underlying way to access internal data

• Views do not own any data

• Copy, move, and assignment operations are performed in constant time

• Views are composable

• Views are lazily evaluated

Syntax:

range/view | view

60/69

Range View 2/2

#include <iostream>

#include <ranges>

#include <vector>

std::vector<int> v{1, 2, 3, 4};

for (int x : v | std::views::reverse)

std::cout << x << " "; // print: "4 3 2 1"

auto rv2 = v | std::views::reverse; // cheap, it does not copy "v"

auto rv3 = v | std::views::drop(2) | // drop the first two elements

std::views::reverse;

for (int x : rv3) // lazy evaluated

std::cout << x << " "; // print: "4 3"

61/69

Range Adaptor 1/2

Range Adaptors are utilities to transform a range into a view with custom behaviors

• Range adaptors produce lazily evaluated views

• Range adaptors can be chained or composed (pipeline)

Syntax:

adaptor(range/view, args...)

adaptor(args...)(range/view)

range/view | adaptor(args...) // preferred syntax

62/69

Range Adaptor 2/2

# include <iostream>

# include <ranges>

# include <vector>

std::vector<int> v{1, 2, 3, 4};

for (int x : std::ranges::reverse_view(v)) // adaptor

std::cout << x << " "; // print: "4 3 2 1"

auto rv2 = std::ranges::reverse_view(v); // cheap, it does not copy "v"

auto rv3 = std::ranges::reverse_view(

std::ranges::drop_view(v, 2)); // drop the first two elements

for (int x : rv3) // lazy evaluated

std::cout << x << " "; // print: "4 3"

63/69

Range Factory

Range Factory produces a view that generates its own elements, without adapting an existing range

# include <iostream>

# include <ranges>

for (int x : std::ranges::iota_view{1, 4}) // factory (view type), half-open interval [1, 4)

std::cout << x << " "; // print: "1 2 3"

for (char x : std::views::repeat('a', 4)) // factory (C++23)

std::cout << x << " "; // print: "a a a a"

64/69

Range Algorithms

The range algorithms are almost identical to the corresponding iterator-pair

algorithms in the std namespace, except that they have concept-enforced constraints

and accept range arguments

• Range algorithms are immediately evaluated

• Range algorithms can work directly on containers ( begin() , end() are no longer explicitly needed) and views

#include <algorithm>

#include <vector>

std::vector<int> vec{3, 2, 1};

std::ranges::sort(vec); // 1, 2, 3

Std Library - Range Algorithms

65/69

Algorithm Operators and Projections

# include <algorithm>

# include <vector>

struct Data {

char value1;

int value2;

};

std::vector<int> vec{4, 2, 5};

auto cmp = [](auto a, auto b) { return a > b; }; // Binary boolean predicate (comparator)

std::ranges::sort(vec, cmp); // 5, 4, 2

std::vector<Data> vec2{{'a', 4}, {'b', 2}, {'c', 5}};

std::ranges::sort(vec2, {}, &Data::value2); // Projection: 2, 4, 5

// {'b', 2}, {'a', 4}, {'c', 5}

66/69

Algorithms and Views

// sum of the squares of the first 'count' positive integers (C++23)

auto sum_of_squares(int count) {

auto squares = std::views::iota(1, count + 1) | // 1, 2, ..., count

std::views::transform([](int x) { return x * x; });

// returns std::optional<int>, empty if count < 1

return std::ranges::fold_left_first(squares, std::plus{});

}

67/69

Range Actions 1/2

The range actions mimic std algorithms and range algorithms, adding the composability property

• Range actions are eagerly evaluated

• Range algorithms work directly on ranges

• Not included in the std library (they are provided by the Range v3 library)

68/69

Range Actions 2/2

# include <algorithm>

# include <vector>

std::vector<int> vec{3, 5, 6, 3, 5};

// in-place

vec = vec | actions::sort // 3, 3, 5, 5, 6

| actions::unique; // 3, 5, 6

vec |= actions::sort // 3, 3, 5, 5, 6

| actions::unique; // 3, 5, 6

// out-of-place

auto vec2 = std::move(vec) | actions::sort // 3, 3, 5, 5, 6

| actions::unique; // 3, 5, 6

69/69

Modern C++

Programming

20. Advanced Topics I

Federico Busato

2025-04-14

Table of Contents

1 Move Semantic

lvalues and rvalues references

Move Semantic

std::move

Class Declaration Semantic

2 Universal Reference and Perfect Forwarding

Universal Reference

Reference Collapsing Rules

Perfect Forwarding

1/59

Table of Contents

3 Value Categories

4 &, && Ref-qualiﬁers and volatile Overloading

&, && Ref-qualiﬁers Overloading

volatile Overloading

5 Copy Elision and RVO/NRVO

2/59

Table of Contents

6 Type Deduction

Pass-by-Reference

Pass-by-Pointer

Pass-by-Value

auto Deduction

auto(x): Decay-copy

7 const Correctness

3/59

Move Semantic

Overview

Move semantics refers to transferring ownership of resources from one object to another

Unlike copy semantics, move semantics does not duplicate the original resource

4/59

lvalue and rvalue 1/3

In C++ every expression is either an rvalue or an lvalue

• an lvalue (left) represents an expression that occupies some identifiable location in memory

• an rvalue (right) is an expression that does not represent an object occupying some identifiable location in memory

int x = 5; // "x" is an lvalue, "5" is an rvalue

int y = 10; // "y" is an lvalue

int z = (x * y); // "z" is an lvalue, (x * y) is an rvalue

5/59

lvalues and rvalues 2/3

C++11 introduces a new kind of reference called rvalue reference X&&

• An rvalue reference only binds to an rvalue, that is a temporary

• An lvalue reference only binds to an lvalue

• A const lvalue reference binds to both lvalue and rvalue

int x = 5; // "x" is an lvalue

int y = 10; // "y" is an lvalue

int& r1 = x; // "r1" is an lvalue reference

// int& r2 = 5; // compile error, "5" is an rvalue

const int& cr = (x * y); // "cr" is a const lvalue reference

int&& rv = (x * y); // "rv" is an rvalue reference, "(x * y)" is an rvalue

// int&& rv1 = x; // compile error, "x" is NOT an rvalue

6/59

lvalues and rvalues 3/3

struct A {};

void f(A& a) {} // lvalue reference

void g(const A& a) {} // const lvalue reference

void h(A&& a) {} // rvalue reference

A a;

f(a); // ok, f() can modify "a"

g(a); // ok, g() cannot modify "a"

// h(a); // compile error, h() does not accept lvalues

// f(A{}); // compile error, f() does not accept rvalues

g(A{}); // ok, g() cannot modify the object A{}

h(A{}); // ok, h() can modify the object A{}

7/59

Move Semantic - The Problem 1/2

# include <algorithm>

class Array { // Array Wrapper

public:

Array() = default;

Array(int size) : _size{size}, _array{new int[size]} {}

Array(const Array& obj) : _size{obj._size}, _array{new int[obj._size]} {

// EXPENSIVE COPY (deep copy)

std::copy(obj._array, obj._array + _size, _array);

}

~Array() { delete[] _array; }

private:

int _size{0}; // default member initializers keep "Array() = default" safe

int* _array{nullptr};

};

8/59

Move Semantic - The Problem 2/2

# include <vector>

int main() {

std::vector<Array> vector;

vector.push_back( Array{1000} ); // call push_back(const Array&)

} // expensive copy

Before C++11: Array{1000} is created, passed by const-reference, copied, and then destroyed

Note: Array{1000} is no longer used outside push_back

After C++11: Array{1000} is created, and moved to vector (fast!)

9/59

Move Semantic 1/3

Class prototype with support for move semantic:

class X {

public:

X(); // default constructor

X(const X& obj); // copy constructor

X(X&& obj); // move constructor

X& operator=(const X& obj); // copy assign operator

X& operator=(X&& obj); // move assign operator

~X(); // destructor

};

10/59

Move Semantic 2/3

Move constructor semantic

X(X&& obj);

(1) Shallow copy of obj data members (in contrast to deep copy)

(2) Release any obj resources and reset all data members (pointer to nullptr, size to 0, etc.)

Move assignment semantic

X& operator=(X&& obj);

(1) Release any resources of this

(2) Shallow copy of obj data members (in contrast to deep copy)

(3) Release any obj resources and reset all data members (pointer to nullptr, size to 0, etc.)

(4) Return *this

11/59

Move Semantic 3/3

Move constructor

Array(Array&& obj) {

_size = obj._size; // (1) shallow copy

_array = obj._array; // (1) shallow copy

obj._size = 0; // (2) release obj (no more valid)

obj._array = nullptr; // (2) release obj

}

Move assignment

Array& operator=(Array&& obj) {

if (this == &obj) return *this; // guard against self-move

delete[] _array; // (1) release this

_size = obj._size; // (2) shallow copy

_array = obj._array; // (2) shallow copy

obj._array = nullptr; // (3) release obj

obj._size = 0; // (3) release obj

return *this; // (4) return *this

}

12/59

std::move

C++11 provides the function std::move ( <utility> ) to indicate that an object may be “moved from”

It allows resources to be transferred efficiently from one object to another

# include <vector>

int main() {

std::vector<Array> vector;

vector.push_back( Array{1000} ); // call "push_back(Array&&)"

Array arr{1000};

vector.push_back( arr ); // call "push_back(const Array&)"

vector.push_back( std::move(arr) ); // call "push_back(Array&&)"

// efficient!!

// "arr" is no longer valid here

}

13/59

Move Semantic Notes

If an object requires the copy constructor/assignment, then it should also define the move constructor/assignment. The opposite is not necessarily true

The defaulted move constructor/assignment =default recursively applies the move semantic to its base class and data members.

Important: it does not release the resources (moving a raw pointer simply copies it). It is very dangerous for classes with manual resource management

// Suppose: Array(Array&&) = default;

Array x{10};

Array y = std::move(x); // call the move constructor

// "x" calls ~Array() when it is out of scope, but now the internal pointer

// "_array" is NOT nullptr -> double free or corruption!!

14/59

Move Semantic and Code Reuse

Some operations can be expressed as a function of the move semantic

A& operator=(const A& other) {

*this = A{other}; // copy constructor + move assignment

return *this;

}

void init(... /* any parameters */ ) {

*this = A{...}; // user-declared constructor + move assignment

}

15/59

Class Declaration Semantic - Compiler Implicit

Everything You Ever Wanted To Know About Move Semantics

A Quick Note of Copy and Move Control in C++

16/59

Class Declaration Semantic

User-declared Entity Meaning / Implications

non- static const members

Copy/Move constructors are not trivial (not provided by the

compiler). Copy/move assignment is not supported

reference members

Copy/Move constructors/assignment are not trivial (not

provided by the compiler)

destructor

The resource management is not trivial. Copy

constructor/assignment is very likely to be implemented

copy constructor/assignment

Resource management is not trivial. Move

constructors/assignment need to be implemented by the user

move constructor/assignment

There is an efficient way to move the object. The compiler-generated copy constructor/assignment would not be safe, so they are deleted

17/59

Universal Reference

and Perfect

Forwarding

Universal Reference 1/3

The && syntax has two diﬀerent meanings depending on the context it is used

• rvalue reference

• Universal reference: Either rvalue reference or lvalue reference

Universal references (also called forwarding references) are references declared with && in a type-deducing context.

T&& , auto&& accept any expression regardless of whether it is an lvalue or rvalue and preserve the const property

void f1(int&& t) {} // rvalue reference

template<typename T>

void f2(T&& t) {} // universal reference

int&& v1 = ...; // rvalue reference

auto&& v2 = ...; // universal reference

18/59

Universal Reference 2/3

int f_copy();

int& f_ref();

const int& f_const_ref();

auto c1 = f_copy(); // copy, type: int

auto c2 = f_ref(); // copy (the reference is dropped), type: int

auto c3 = f_const_ref(); // copy (const and reference are dropped), type: int

// auto& r1 = f_copy(); // compile error (cannot bind to an rvalue)

auto& r2 = f_ref(); // lvalue ref, type: int&

auto& r3 = f_const_ref(); // not modifiable, type: const int&

const auto& cr1 = f_copy(); // not modifiable, type: const int&

const auto& cr2 = f_ref(); // not modifiable, type: const int&

const auto& cr3 = f_const_ref(); // not modifiable, type: const int&

auto&& u1 = f_copy(); // type: int&& (binds to the temporary)

auto&& u2 = f_ref(); // type: int&

auto&& u3 = f_const_ref(); // not modifiable, type: const int&

19/59

Universal Reference 3/3

struct A {};

void f1(A&& a) {} // rvalue only

template<typename T>

void f2(T&& t) {} // universal reference

A a;

f1(A{}); // ok

// f1(a); // compile error (only rvalue)

f2(A{}); // universal reference

f2(a); // universal reference

A&& a2 = A{}; // ok

// A&& a3 = a; // compile error (only rvalue)

auto&& a4 = A{}; // universal reference

auto&& a5 = a; // universal reference

20/59

Universal Reference - Misleading Cases

template<typename T>

void f(std::vector<T>&&) {} // rvalue reference

template<typename T>

void f(const T&&) {} // rvalue reference (const)

const auto&& v = ...; // const rvalue reference

21/59

Reference Collapsing Rules

Before C++11 (C++98, C++03), it was not allowed to take a reference to a reference ( A& & causes a compile error)

C++11, by contrast, introduces the following reference collapsing rules:

template<typename T>

void f(T&) {} // compile error in C++98/03 (with gcc),

// no errors in C++11 (and clang with C++98/03)

int a = 3;

f<int&>(a); // T& with T = int& collapses to int&

Type Reference Result

A& & → A&

A& && → A&

A&& & → A&

A&& && → A&&

22/59

Perfect Forwarding

Perfect forwarding allows preserving argument value category and const/volatile

modiﬁers

std::forward ( <utility> ) forwards the argument to another function with the value category it had when passed to the calling function (perfect forwarding)

# include <iostream> // std::cout

# include <utility> // std::forward

template<typename T> void f(T& t) { std::cout << "lvalue"; }

template<typename T> void f(T&& t) { std::cout << "rvalue"; } // overloading

template<typename T> void g1(T&& obj) { f(obj); } // call only f(T&)

template<typename T> void g2(T&& obj) { f(std::forward<T>(obj)); }

struct A{};

f ( A{} ); // print "rvalue"

g1( A{} ); // print "lvalue"!!

g2( A{} ); // print "rvalue"

23/59

Value Categories

Taxonomy (simpliﬁed)

Every expression is either an rvalue or an lvalue

• An lvalue (left value of an assignment for historical reason or locator value)

represents an expression that occupies an identity, namely a memory location

(it has an address)

• An rvalue is movable; an lvalue is not

24/59

Taxonomy 1/2

glvalue (generalized lvalue) is an expression that has an identity

lvalue is a glvalue but it is not movable (it is not an xvalue). A named rvalue reference is an lvalue

xvalue (eXpiring) has an identity and it is movable. It is a glvalue that denotes an object whose resources can be reused. An unnamed rvalue reference is an xvalue

prvalue (pure rvalue) doesn’t have identity, but is movable. It is an expression

whose evaluation initializes an object or computes the value of an operand

of an operator

rvalue is movable. It is a prvalue or an xvalue

en.cppreference.com/w/cpp/language/value_category

25/59

Taxonomy 2/2

26/59

Examples

struct A {

int x;

};

void f(A&&) {}

A&& g();

//----------------------------------------------------------------

int a = 4; // "a" is an lvalue, "4" is a prvalue

f(A{4}); // "A{4}" is a prvalue

A&& b = A{3}; // "A&& b" is a named rvalue reference → lvalue

A c{4};

f(std::move(c)); // "std::move(c)" is an xvalue

int&& d = A{}.x; // "A{}.x" is an xvalue

g(); // "g()" returns A&& → xvalue

27/59

&, && Ref-qualiﬁers

and volatile

Overloading

&, && Ref-qualiﬁers Overloading 1/3

C++11 allows overloading member functions depending on the lvalue/rvalue property

of their object. This is also known as ref-qualiﬁers overloading and can be useful for

optimization purposes, namely, moving a variable instead of copying it

struct A {

// void f() {} // already covered by "f() &"

void f() & {}

void f() && {}

};

A a1;

a1.f(); // call "f() &"

A{}.f(); // call "f() &&"

std::move(a1).f(); // call "f() &&"

28/59

&, && Ref-qualiﬁers Overloading 2/3

Ref-qualiﬁers overloading can be also combined with const methods

struct A {

// void f() const {} // already covered by "f() const &"

void f() const & {}

void f() const && {}

};

const A a1;

a1.f(); // call "f() const &"

std::move(a1).f(); // call "f() const &&"

29/59

&, && Ref-qualiﬁers Overloading 3/3

A simple example where ref-qualiﬁers overloading is useful

struct ArrayWrapper {

ArrayWrapper( /*params*/ ) { /* something expensive */ }

ArrayWrapper copy() const & { /* expensive copy with std::copy() */ }

ArrayWrapper copy() && { /* just move the pointer as the original object is no longer used */ }

};

30/59

volatile Overloading

struct A {

void f() {}

void f() volatile {} // e.g. propagate volatile to data members

void f() const volatile {}

// void f() volatile & {} // combining ref-qualifier and volatile

// void f() const volatile & {} // overloading is also fine

// void f() volatile && {}

// void f() const volatile && {}

};

volatile A a1;

a1.f(); // call "f() volatile"

const volatile A a2;

a2.f(); // call "f() const volatile"

Member Function Overloading: Choices You Didn’t Know You Had

31/59

Copy Elision and

RVO/NRVO

Copy Elision and RVO/NRVO

Copy elision is a compiler optimization technique that eliminates unnecessary

creation, destruction, copying, moving of temporary objects

Copy elision can be also applied to avoid unnecessary object copies when returning

from functions. Such optimizations are:

• RVO (Return Value Optimization) means the compiler is allowed to avoid

creating temporary objects for return values

• NRVO (Named Return Value Optimization) means the compiler is allowed to

return an object (with automatic storage duration) without invoking copy/move

constructors

32/59

RVO Example

Returning an object from a function is very expensive without RVO/NRVO:

struct Obj {

Obj() = default;

Obj(const Obj&) { // non-trivial

std::cout << "copy constructor\n";

}

};

Obj f() { return Obj{}; } // first copy

auto x1 = f(); // second copy (create "x1")

If provided, the compiler uses the move constructor instead of copy constructor

33/59

RVO - Where it works

RVO Copy elision is always guaranteed if the operand is a prvalue of the same class

type and the copy constructor is trivial and non-deleted

struct Trivial {

Trivial() = default;

Trivial(const Trivial&) = default;

};

// single instance

Trivial f1() {

return Trivial{}; // Guarantee RVO

}

// distinct instances and run-time selection

Trivial f2(bool b) {

return b ? Trivial{} : Trivial{}; // Guarantee RVO

}

34/59

Guaranteed Copy Elision (C++17)

In C++17, RVO Copy elision is always guaranteed if the operand is a prvalue of the

same class type, even if the copy constructor is not trivial or deleted

struct S1 {

S1() = default;

S1(const S1&) = delete; // deleted

};

struct S2 {

S2() = default;

S2(const S2&) {} // non-trivial

};

S1 f() { return S1{}; }

S2 g() { return S2{}; }

auto x1 = f(); // compile error in C++14

auto x2 = g(); // RVO only in C++17

35/59

RVO Example - Where it does NOT work 1/3

NRVO is not always guaranteed even in C++17

Obj f1() {

Obj a;

return a; // most compilers apply NRVO

}

Obj f2(bool v) {

Obj a;

if (v)

return a; // copy/move constructor

return Obj{}; // RVO

}

GCC 14 adds the flag -Wnrvo to diagnose when NRVO is not possible

New C++ features in GCC 14

36/59

RVO Example - Where it does NOT work 2/3

Obj f3(bool v) {

Obj a, b;

return v ? a : b; // copy/move constructor

}

Obj f4() {

Obj a;

return std::move(a); // force move constructor

}

Obj f5() {

static Obj a;

return a; // only copy constructor is possible

}

37/59

RVO Example - Where it does NOT work 3/3

Obj f6(Obj& a) {

return a; // copy constructor (a reference cannot be elided)

}

Obj f7(const Obj& a) {

return a; // copy constructor (a reference cannot be elided)

}

Obj f8(const Obj a) {

return a; // copy constructor (a const object cannot be elided)

}

Obj f9(Obj&& a) {

return a; // copy constructor (the object is instantiated in the function)

}

38/59

Type Deduction

When you call a template function, you may omit any template argument that

the compiler can determine or deduce (inferred) by the usage and context of

that template function call [IBM]

• The compiler tries to deduce a template argument by comparing the type of the

corresponding template parameter with the type of the argument used in the func-

tion call

• Similar to default function parameters, template parameters can be deduced

only if they are at the end of the parameter list

Full Story: IBM Knowledge Center

39/59

Example

template<typename T>

int add1(T a, T b) { return a + b; }

template<typename T, typename R>

int add2(T a, R b) { return a + b; }

template<typename T, int B>

int add3(T a) { return a + B; }

template<int B, typename T>

int add4(T a) { return a + B; }

add1(1, 2); // ok

// add1(1, 2u); // the compiler expects the same type

add2(1, 2u); // ok (add2 is more generic)

add3<int, 2>(1); // "int" cannot be deduced: T precedes B, which must be explicit

add4<2>(1); // ok

40/59

Type Deduction - Pass by-Reference

Type deduction with references

template<typename T>

void f(T& a) {}

template<typename T>

void g(const T& a) {}

int x = 3;

int& y = x;

const int& z = x;

f(x); // T: int

f(y); // T: int

f(z); // T: const int // <-- !! it works, but it would not for "f(int& a)"

// (only non-const references)

g(x); // T: int

g(y); // T: int

g(z); // T: int // <-- note the difference

41/59

Type Deduction - Pass by-Pointer 1/2

Type deduction with pointers

template<typename T>

void f(T* a) {}

template<typename T>

void g(const T* a) {}

int* x = nullptr;

const int* y = nullptr;

auto z = nullptr;

f(x); // T: int

f(y); // T: const int

// f(z); // compile error, z: "nullptr_t != T*"

g(x); // T: int

g(y); // T: int <-- note the difference

// g(z); // compile error, z: "nullptr_t != T*"

42/59

Type Deduction - Pass by-Pointer 2/2

template<typename T>

void f(const T* a) {} // pointer to const-values

template<typename T>

void g(T* const a) {} // const pointer

int* x = nullptr;

const int* y = nullptr;

int* const z = nullptr;

const int* const w = nullptr;

f(x); // T: int

f(y); // T: int

f(z); // T: int

g(x); // T: int

g(y); // T: const int

g(z); // T: int

g(w); // T: const int

43/59

Type Deduction - Pass by-Value 1/2

Type deduction with values

template<typename T>

void f(T a) {}

template<typename T>

void g(const T a) {}

int x = 2;

const int y = 3;

const int& z = y;

f(x); // T: int

f(y); // T: int!! (drop const)

f(z); // T: int!! (drop const&)

g(x); // T: int

g(y); // T: int

g(z); // T: int!! (drop reference)

44/59

Type Deduction - Pass by-Value 2/2

template<typename T>

void f(T a) {}

int* x = nullptr;

const int* y = nullptr;

int* const z = x;

f(x); // T = int*

f(y); // T = const int*

f(z); // T = int* !! (const drop)

45/59

Type Deduction - Array

Type deduction with arrays

template<typename T, int N>

void f(T (&array)[N]) {} // type and size deduced

template<typename T>

void g(T array) {}

int x[3] = {};

const int y[3] = {};

f(x); // T: int, N: 3

f(y); // T: const int, N: 3

g(x); // T: int*

g(y); // T: const int*

46/59

Type Deduction - Conﬂicts 1/2

template<typename T>

void add(T a, T b) {}

template<typename T, typename R>

void add(T a, R b) {}

template<typename T>

void add(T a, char b) {}

add(2, 3.0f); // call add(T, R)

add(2, 3); // call add(T, T)

add<int>(2, 3); // call add(T, T)

add<int, int>(2, 3); // call add(T, R)

add(2, 'b'); // call add(T, char) -> most specialized match

47/59

Type Deduction - Conﬂicts 2/2

template<typename T, int N>

void f(T& array) {}

template<typename T>

void f(T* array) {}

int x[3];

f(x); // call f(T*) not f(T&) !!

template<typename T, int N>

void g(T& array) {}

template<typename T>

void g(T array) {}

int x[3];

g(x); // call g(T) not g(T&) !!

48/59

auto Deduction

• auto x = copy by-value/by-const-value

• auto& x = copy by-reference/by-const-reference

• auto* x = copy by-pointer/by-const-pointer

• auto&& x = copy by-universal reference

• decltype(auto) x = automatic type deduction

int f1(int& x) { return x; }

int& f2(int& x) { return x; }

auto f3(int& x) { return x; }

decltype(auto) f4(int& x) { return x; }

int v = 3;

int x1 = f1(v);

int& x2 = f2(v);

// int& x3 = f3(v); // compile error 'x' is copied by-value

int& x4 = f4(v);

49/59

auto(x): Decay-copy 1/3

The problem: implement a function to remove the ﬁrst element of a container

template<typename T>

void pop_v1(T& x) {

std::remove(x.begin(), x.end(), x.front()); // undefined behavior!!

}

This is undefined behavior because

• x.front() returns a reference

• std::remove takes the element to remove by-const-reference

• std::remove moves elements within the range while it runs, so the value referred

to can change. The reference must not be an element of the range [first, last)

50/59

auto(x): Decay-copy 2/3

Sub-optimal solutions:

template<typename T>

void pop_v2(T& x) {

auto tmp = x.front(); // lvalue copy

std::remove(x.begin(), x.end(), tmp); // ok

}

template<typename T>

void pop_v3(T& x) {

using R = std::decay_t<decltype(x.front())>; // verbose/non-trivial solution

std::remove(x.begin(), x.end(), R(x.front())); // ok, create a temporary (rvalue) copy

}

// decltype(x.front()) -> retrieve the type of x.front()

// std::decay_t -> get the 'decay' type as pass by-value,

// e.g. 'const int' to 'int'

51/59

auto(x): Decay-copy 3/3

C++23 introduces the auto(x) decay-copy utility to express the rvalue copy in a clear way

template<typename T>

void pop_v4(T& x) {

std::remove(x.begin(), x.end(), auto(x.front())); // ok, rvalue copy

} // equivalent to R(x.front())

auto(x): decay-copy in the language

52/59

const Correctness

const correctness means guaranteeing that objects/variables declared const stay unmodified

throughout their lifetime, which protects them from unintentional modifications

References:

• Isocpp: const-correctness

• GotW: Const-Correctness

• Abseil: Meaningful ‘const’ in Function Declarations

• const is a contract

• Why const Doesn’t Make C Code Faster

• Constant Optimization?

53/59

Basic Rules 1/2

• const entities do not change their values at run-time. This does not imply that

they are evaluated at compile-time

• const T* is different from T* const. The first case means “the content does

not change”, while the latter means “the value of the pointer does not change”

• Pass by-const-value and by-value parameters imply the same function signature

• Return by-const-value and by-value have diﬀerent meaning

• const_cast can break const-correctness

54/59

Basic Rules 2/2

const and member functions:

• const member functions do not change the internal status of an object

•

mutable ﬁelds can be modiﬁed by a const member function (they should not

change the external view)

const and code optimization:

•

const keyword purpose is for correctness (type safety), not for performance

• const may provide performance advantages in a few cases, e.g. non-trivial copy

semantic

55/59

Function Declarations Example

void f(int);

void f(const int); // the declaration is exactly the same as

// "void f(int)"!!

void f(int*);

void f(const int*); // different declaration

void f(int&);

void f(const int&); // different declaration

int f();

// const int f(); // compile error conflicting declaration

56/59

const Return Example

const int const_value = 3;

const int& f2() { return const_value; }

// int& f1() { return const_value; } // WRONG

int f3() { return const_value; } // ok

struct A {

void f() { cout << "non-const"; }

void f() const { cout << "const"; }

};

const A getA() { return A{}; }

auto a = getA(); // "a" is a copy

a.f(); // print "non-const"

getA().f(); // print "const"

57/59

struct Example

struct A { // struct A_const { // equal to "const A"

int* ptr; // int* const ptr;

int value; // const int value;

}; // };

void f(A a) {

a.value = 3;

a.ptr[0] = 3;

}

void g(const A a) { // the same with g(const A&)

// a.value = 3; // compile error

a.ptr[0] = 3; // "const" does not apply to the "ptr" content!!

}

A a{new int[10]};

f(a); // ok

g(a); // ok, it compiles: "const A" does not protect the data pointed to by "ptr"

58/59

Member Functions Example

struct A {

int value = 0;

int& f1() { return value; }

const int& f2() { return value; }

// int& f3() const { return value; } // compile error, const violation

const int& f4() const { return value; }

int f5() const { return value; } // ok, return by-copy

const int f6() const { return value; } // ok, return by-copy

};

59/59

Modern C++

Programming

21. Advanced Topics II

Federico Busato

2025-04-14

Table of Contents

1 Undeﬁned Behavior

Illegal Behavior

Platform Speciﬁc Behavior

Unspeciﬁed Behavior

Detecting Undeﬁned Behavior

1/57

Table of Contents

2 Error Handling

Recoverable Error Handling

Return Code

C++ Exceptions

Deﬁning Custom Exceptions

noexcept Keyword

Memory Allocation Issues

Return Code and Exception Summary

std::expected

Alternative Error Handling Approaches

2/57

Table of Contents

3 Smart pointers

std::unique_ptr

std::shared_ptr

std::weak_ptr

4 Concurrency

Thread Methods

Mutex

Atomic

Task-based parallelism

3/57

Undeﬁned Behavior

Undeﬁned Behavior Overview

Undefined behavior means that the semantics of certain operations are

• Unspecified behavior: outside the language/library specification, with two or more possible choices

• Illegal: the compiler presumes that such operations never happen, e.g. signed integer overflow

• Implementation-defined behavior: depends on the compiler and/or platform (not portable)

Motivations behind undeﬁned behavior:

• Compiler optimizations, e.g. signed overﬂow or NULL pointer dereferencing

• Simplify compile checks

• Unfeasible/expensive to check

• What Every C Programmer Should Know About Undefined Behavior, Chris Lattner

• What are all the common undefined behaviors that a C++ programmer should know

about?

• Enumerating Core Undefined Behavior

4/57

Illegal Behavior 1/3

• const_cast applied to a const variable

const int var = 3;

const_cast<int&>(var) = 4;

...

// use var

• Memory alignment

char* ptr = new char[512];

auto ptr2 = reinterpret_cast<uint64_t*>(ptr + 1);

ptr2[3]; // ptr2 is not aligned to 8 bytes (sizeof(uint64_t))

• Memory initialization

int var; // undefined value

auto var2 = new int; // undefined value

• Memory access-related: Out-of-bound access: the code could crash or not

depending on the platform/compiler

5/57

Illegal Behavior 2/3

• Strict aliasing

float x = 3;

auto y = reinterpret_cast<unsigned&>(x);

// x, y break the strict aliasing rule

• Lifetime issues

int* f() {

int tmp[10];

return tmp;

}

int* ptr = f();

ptr[0];

• One Deﬁnition Rule violation

- Diﬀerent deﬁnitions of inline functions in distinct translation units

6/57

Illegal Behavior 3/3

• Missing return statement

int f(float x) {

int y = x * 2;

}

• Dangling reference

int n = 1;

const int& r = std::max(n-1, n+1); // dangling

// GCC 13 experimental -Wdangling-reference (enabled by -Wall)

• Illegal arithmetic and conversion operations

- Division by zero 0 / 0 , fp_value / 0.0

- Floating-point to integer conversion

7/57

Platform Speciﬁc Behavior

• Memory access-related: NULL pointer dereferencing: the 0x0 address is valid

in some platforms

• Endianness

union U {

unsigned x;

char y;

};

• Type deﬁnition

long x = 1ul << 32u; // different behavior depending on the OS

• Intrinsic functions

8/57

Unspeciﬁed Behavior 1/2

Legal operations but the C++ standard does not document the result → diﬀerent

compilers/platforms can show diﬀerent behavior

• Signed shift of negative values -2 << x (before C++20), larger-than-type shift

3u << 32, etc.

• Floating-point narrowing conversion to ﬂoating-point or integer types with

unrepresentable values

double → float , float → int

• Arithmetic operation ordering f(i++, i++)

• Function evaluation ordering

auto x = f() + g(); // C++ doesn't ensure that f() is evaluated before g()

9/57

Unspeciﬁed Behavior 2/2

• Signed overﬂow

for (int i = 0; i <= N; i++)

if N is INT_MAX, the last iteration is undefined behavior. The compiler can assume that

the loop is finite and enable important optimizations, as opposed to unsigned (wrap

around)

• Trivial infinite loops, until C++26

int main() {

while (true) // -> std::this_thread::yield(); in C++26

;

}

void unreachable() { cout << "Hello world!" << endl; }

the code prints Hello world! with some clang versions

P2809R3: Trivial infinite loops are not Undefined Behavior

10/57

Detecting Undeﬁned Behavior

There are several ways to detect or prevent undeﬁned behavior at compile-time and at

run-time:

• Modify the compiler behavior, see Debugging and Testing: Hardening Techniques

• Using undeﬁned behavior sanitizer, see Debugging and Testing: Sanitizer

• Static analysis tools

• constexpr expressions don't allow undefined behavior

constexpr int x1 = 2147483647 + 1; // compile error

constexpr int x2 = (1 << 32); // compile error

constexpr int x3 = (1 << -1); // compile error

constexpr int x4 = 3 / 0; // compile error

constexpr int x5 = *((int*) nullptr); // compile error

constexpr int x6 = 6;

constexpr float x7 = reinterpret_cast<float&>(x6); // compile error

Exploring Undefined Behavior Using Constexpr

11/57

Error Handling

Recoverable Error Handling

Recoverable: conditions that are not under the control of the program. They indicate

“exceptional” run-time conditions, e.g. file not found, bad allocation, wrong user

input, etc.

A recoverable error should be considered unrecoverable if it is extremely rare and difficult to

handle, e.g. bad allocation due to out-of-memory error

The common ways for handling recoverable errors are:

Exceptions: robust, but slower and require more resources

Return code: fast, but difficult to handle in complex programs

12/57

Error Handling References

• Modern C++ best practices for exceptions and error handling

• Back to Basics: Exceptions - CppCon2020

• ISO C++ FAQ: Exceptions and Error Handling

• Zero-overhead deterministic exceptions: Throwing values, P0709

• C++ exceptions are becoming more and more problematic, P2544

• std::expected

• C++ Error Handling Strategies – Benchmarks and Performance

13/57

Return Code

Historically, C programs handled errors with return codes, even for unrecoverable errors

enum Status { IllegalValue, Success };

Status f(int* ptr) { return (ptr == nullptr) ? IllegalValue : Success; }

Why such behavior? Debugging → need to understand what / where / why the

program failed

A better approach in C++ involves std::source_location (C++20) and

std::stacktrace (C++23)

ABI related issues:

• Removing an enumerator value is an API breaking change

• Adding a new enumerator value associated to a return type is also problematic as it

causes ABI breaking change

14/57

C++ Exceptions - Advantages

C++ Exceptions provide a well-deﬁned mechanism to detect errors passing the

information up the call stack

• Exceptions cannot be ignored. Unhandled exceptions stop program execution

(call std::terminate())

• Intermediate functions are not forced to handle them. They don’t have to

coordinate with other layers and, for this reason, they provide good composability

• Throwing an exception acts like a return statement destroying all objects in the

current scope

• An exception enables a clean separation between the code that detects the error

and the code that handles the error

• Exceptions work well with object-oriented semantic (constructor)

15/57

C++ Exceptions - Disadvantages 1/2

• Code readability: Using exception can involve more code than the functionality

itself

• Code comprehension: Exception control ﬂow is invisible and it is not explicit in

the function signature

• Performance: Extreme performance overhead in the failure case (violates the

zero-overhead principle)

• Dynamic behavior: throw requires dynamic allocation and catch requires

RTTI. It is not suited for real-time, safety-critical, or embedded systems

• Code bloat: Exceptions could increase executable size by 5-15% (or more*)

*Binary size and exceptions

16/57

C++ Exceptions - Disadvantages 2/2

17/57

C++ Exception Basics

C++ provides three keywords for exception handling:

throw Throws an exception

try Code block containing potential throwing expressions

catch Code block for handling the exception

void f() { throw 3; }

int main() {

try {

f();

} catch (int x) {

cout << x; // print "3"

}

}

18/57

std Exceptions

throw can throw everything such as integers, pointers, objects, etc. The standard

way consists in using the std library exceptions

# include <stdexcept>

void f(bool b) {

if (b)

throw std::runtime_error("runtime error");

throw std::logic_error("logic error");

}

int main() {

try {

f(false);

}

catch (const std::runtime_error& e) {

cout << e.what();

} catch (const std::exception& e) {

cout << e.what(); // print: "logic error"

}

}

19/57

Exception Capture

NOTE: C++, diﬀerently from other programming languages, does not require explicit

dynamic allocation with the keyword

new for throwing an exception. The compiler

implicitly generates the appropriate code to construct and clean up the exception

object. Dynamically allocated objects require a

delete call

The right way to capture an exception is by const-reference. Capturing by-value is

also possible, but it involves a useless copy for non-trivial exception objects

catch(...) can be used to capture any thrown exception

int main() {

try {

throw "runtime error"; // throw const char*

} catch (...) {

cout << "exception"; // print "exception"

}

}

20/57

Exception Propagation

Exceptions are automatically propagated along the call stack. The user can also

control how they are propagated

int main() {

try {

...

}

catch (const std::runtime_error& e) {

throw e; // propagate a copy of the exception

} catch (const std::exception& e) {

throw; // propagate the exception

}

}

21/57

Deﬁning Custom Exceptions

# include <exception> // not to be confused with <stdexcept>

struct MyException : public std::exception {

const char* what() const noexcept override { // could be also "constexpr"

return "C++ Exception";

}

};

int main() {

try {

throw MyException();

} catch (const std::exception& e) {

cout << e.what(); // print "C++ Exception"

}

}

22/57

noexcept Keyword

C++03 allows listing the exceptions that a function might directly or indirectly throw,

e.g.

void f() throw(int, const char*);

C++11 deprecates throw specifications and introduces the noexcept keyword

void f1(); // may throw

void f2() noexcept; // does not throw

void f3() noexcept(true); // does not throw

void f4() noexcept(false); // may throw

template<bool X>

void f5() noexcept(X); // may throw if X is false

If a noexcept function throws an exception, the runtime calls std::terminate()

noexcept should be used when throwing an exception is impossible or unacceptable.

It is also useful when the function contains code outside user control, e.g. std

functions/objects

23/57

Function-try-block

Exception handlers can be defined around the body of a function.

The behavior is the same as using the try/catch blocks within the function scope

→ less verbose

void f() try {

...

// do something

} catch (const std::runtime_error& e) {

cout << e.what();

}

catch (...) { // other exception

...

}

24/57

Memory Allocation Issues 1/4

The new operator automatically throws an exception ( std::bad_alloc ) if it cannot

allocate the memory

delete never throws an exception (unrecoverable error)

int main() {

int* ptr = nullptr;

try {

ptr = new int[1000];

}

catch (const std::bad_alloc& e) {

cout << "bad allocation: " << e.what();

}

delete[] ptr;

}

25/57

Memory Allocation Issues 2/4

C++ also provides an overload of the new operator with non-throwing memory

allocation

# include <new> // std::nothrow

int main() {

int* ptr = new (std::nothrow) int[1000];

if (ptr == nullptr)

cout << "bad allocation";

}

26/57

Memory Allocation Issues 3/4

Throwing exceptions in constructors is fine, while destructors must not throw: they are

implicitly noexcept, so an exception escaping a destructor calls std::terminate()

struct A {

A() { new int[10]; }

~A() { throw -2; }

};

int main() {

try {

A a;

// could throw "bad_alloc"

// "a" is out-of-scope -> throw -2

} catch (...) {

// two exceptions at the same time

}

}

Destructors should be noexcept (the default since C++11)

27/57

Memory Allocation Issues 4/4

struct A {

int* ptr1, *ptr2;

A() {

ptr1 = new int[10];

ptr2 = new int[10]; // if bad_alloc here, ptr1 is lost

}

};

struct A {

std::unique_ptr<int[]> ptr1, ptr2;

A() {

ptr1 = std::make_unique<int[]>(10);

ptr2 = std::make_unique<int[]>(10); // if bad_alloc here,

} // ptr1 is deallocated

};

28/57

Return Code and Exception Summary

Exceptions: Pros

• Cannot be ignored

• Work well with object-oriented semantic

• Information: Exceptions can be arbitrarily rich

• Clean code: Conceptually, clean separation between the code that detects errors and the code that handles the error, but...*

• Non-Intrusive wrt. API: Proper communication channel

Return Code: Pros

• Visibility: prototype of the called function

• No performance overhead

• No code bloat

• Easy to debug

Exceptions: Cons

• Visibility: Not visible without further analysis of the code or documentation

• Clean code: *...handling exception can generate more code than the functionality itself

• Dynamic behavior: memory and RTTI

• Extreme performance overhead in the failure case

• Code bloat

• Non-trivial to debug

Return Code: Cons

• Easy to ignore, [[nodiscard]] can help

• Cannot be used with object-oriented semantic

• Information: Historically, a simple integer. Nowadays, richer error code

• Clean code: At least, an if statement after each function call

• Non-Intrusive wrt. API: Monopolization of the return channel

29/57

std::expected 1/2

C++23 introduces std::expected to get the best properties of return codes and

exceptions

The class template std::expected<T, E> contains either:

• A value of type T, the expected value type; or

• A value of type E, an error type used when an unexpected outcome occurred

enum class Error { Invalid };

std::expected<int, Error> f(int v) {

if (v > 0)

return 3;

return std::unexpected(Error::Invalid);

}

30/57

std::expected 2/2

The user chooses how to handle the error depending on the context

auto ret = f(n);

// Return code handling

if (!ret) { /* error handling */ }

int v = *ret + 3; // execute without checking

// Exception handling

ret.value(); // throw std::bad_expected_access if there is no value

// Monadic operations

auto lambda = [](int x) -> std::expected<int, Error> {

if (x > 3) return 4;

return std::unexpected(Error::Invalid);

};

ret.and_then(lambda) // pass the value to another function

.transform([](int x) { return x + 4; }) // transform the previous value

.transform_error([](Error e) { /* error handling */ return e; });

31/57

Alternative Error Handling Approaches 1/2

• Global state, e.g. errno

- Easily forget to check for failures

- Error propagation using if statements and early return is manual

- No compiler optimizations due to global state

• Simple error code, e.g. int, enum, etc.

- Easily forget to check for failures (workaround [[nodiscard]] )

- Error propagation using if statements and early return is manual

- Potential error propagation through diﬀerent contexts and losing initial error

information

- Constructor errors cannot be handled

32/57

Alternative Error Handling Approaches 2/2

• std::error_code , standardized error code

- Easily forget to check for failures (workaround [[nodiscard]])

- Error propagation using if statements and early return is manual

- Code bloating for adding new enumerators (see Your own error code)

- Constructor errors cannot be handled

• Supporting libraries, e.g. Boost Outcome, STX, etc.

- Require external dependencies

- Constructor errors cannot be handled in a direct way

- Extra logic for managing return values

33/57

Smart pointers

Smart Pointers

A smart pointer is a pointer-like type with some additional functionality, e.g. automatic

memory deallocation (when the pointer is no longer in use, the memory it points to is

deallocated), reference counting, etc.

C++11 provides three smart pointer types:

• std::unique_ptr

• std::shared_ptr

• std::weak_ptr

Smart pointers prevent most situations of memory leaks by making the memory

deallocation automatic

C++ Smart Pointers

34/57

Smart Pointers Beneﬁts

• If a smart pointer goes out-of-scope, the appropriate method to release resources

is called automatically. The memory is not left dangling

• Smart pointers will automatically be set to nullptr if not initialized or when

memory has been released

• std::shared_ptr provides automatic reference count

• If a special delete function needs to be called, it will be specified in the pointer

type and declaration, and will automatically be called on delete

35/57

std::unique_ptr - Unique Pointer 1/4

std::unique_ptr is used to manage any dynamically allocated object that is not

shared by multiple objects

# include <iostream>

# include <memory>

struct A {

A() { std::cout << "Constructor\n"; } // called when A()

~A() { std::cout << "Destructor\n"; } // called when u_ptr1,

}; // u_ptr2 are out-of-scope

int main() {

auto raw_ptr = new A();

std::unique_ptr<A> u_ptr1(new A());

std::unique_ptr<A> u_ptr2(raw_ptr);

// std::unique_ptr<A> u_ptr3(raw_ptr); // no compile error, but wrong!! (not unique)

// u_ptr1 = raw_ptr; // compile error (not unique)

// u_ptr1 = u_ptr2; // compile error (not unique)

u_ptr1 = std::move(u_ptr2); // delete the object owned by u_ptr1;

// u_ptr1 takes the object of u_ptr2;

// u_ptr2 = nullptr

}

36/57

std::unique_ptr - Unique Pointer 2/4

std::unique_ptr methods

• get() returns the underlying pointer

• operator* operator-> dereferences pointer to the managed object

• operator[] provides indexed access to the stored array (only for the array

form std::unique_ptr<T[]>)

• release() returns a pointer to the managed object and releases the ownership

• reset(ptr) replaces the managed object with ptr

Utility method: std::make_unique<T>() (C++14) creates a unique pointer to a class T that

manages a new object

37/57

std::unique_ptr - Unique Pointer 3/4

# include <iostream>

# include <memory>

struct A {

int value;

};

int main() {

std::unique_ptr<A> u_ptr1(new A());

u_ptr1->value; // dereferencing

(*u_ptr1).value; // dereferencing

auto u_ptr2 = std::make_unique<A>(); // create a new unique pointer

u_ptr1.reset(new A()); // reset

auto raw_ptr = u_ptr1.release(); // release

delete raw_ptr;

std::unique_ptr<A[]> u_ptr3(new A[10]);

auto& obj = u_ptr3[3]; // access

}

38/57

std::unique_ptr - Unique Pointer 4/4

Implement a custom deleter

# include <iostream>

# include <memory>

struct A {

int value;

};

int main() {

auto DeleteLambda = [](A* x) {

std::cout << "delete" << std::endl;

delete x;

};

std::unique_ptr<A, decltype(DeleteLambda)>

x(new A(), DeleteLambda);

}

// print "delete"

39/57

std::shared_ptr - Shared Pointer 1/3

std::shared_ptr is the pointer type to be used for memory that can be owned by

multiple resources at one time

std::shared_ptr maintains a reference count of pointer objects. Data managed by

std::shared_ptr is only freed when there are no remaining objects pointing to the data

# include <iostream>

# include <memory>

struct A {

int value;

};

int main() {

std::shared_ptr<A> sh_ptr1(new A());

std::shared_ptr<A> sh_ptr2(sh_ptr1);

std::shared_ptr<A> sh_ptr3(new A());

sh_ptr3 = nullptr; // allowed, the underlying pointer is deallocated

// sh_ptr3 : zero references

sh_ptr2 = sh_ptr1; // allowed. sh_ptr1, sh_ptr2: two references

sh_ptr2 = std::move(sh_ptr1); // allowed // sh_ptr1: zero references

} // sh_ptr2: one reference

40/57

std::shared_ptr - Shared Pointer 2/3

std::shared_ptr methods

• get() returns the underlying pointer

• operator* operator-> dereferences pointer to the managed object

• use_count() returns the number of objects referring to the same managed

object

• reset(ptr) replaces the managed object with ptr

Utility method: std::make_shared() creates a shared pointer that manages a new

object. It is more efficient than using the std::shared_ptr constructors because it

performs a single memory allocation instead of two

Difference in make_shared and normal shared_ptr in C++

41/57

std::shared_ptr - Shared Pointer 3/3

# include <iostream>

# include <memory>

struct A {

int value;

};

int main() {

std::shared_ptr<A> sh_ptr1(new A());

auto sh_ptr2 = std::make_shared<A>(); // std::make_shared

std::cout << sh_ptr1.use_count(); // print 1

sh_ptr1 = sh_ptr2; // copy

// std::shared_ptr<A> sh_ptr2(sh_ptr1); // copy (constructor)

std::cout << sh_ptr1.use_count(); // print 2

std::cout << sh_ptr2.use_count(); // print 2

auto raw_ptr = sh_ptr1.get(); // get

sh_ptr1.reset(new A()); // reset

(*sh_ptr1).value = 3; // dereferencing

sh_ptr1->value = 2; // dereferencing

}

42/57

std::weak_ptr - Weak Pointer 1/3

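A std::weak_ptr is a non-owning reference to an object managed by std::shared_ptr. It does

not contribute to the reference count, so it is allowed to dangle (the object can be

deallocated while the weak pointer still exists)

# include <memory>

std::shared_ptr<int> sh_ptr(new int);

std::weak_ptr<int> w_ptr = sh_ptr;

sh_ptr = nullptr;

cout << w_ptr.expired(); // print 'true'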

43/57

std::weak_ptr - Weak Pointer 2/3

It must be converted to std::shared_ptr in order to access the referenced object

std::weak_ptr methods

• use_count() returns the number of std::shared_ptr objects referring to the same managed

object

• reset() releases the reference to the managed object

• expired() checks whether the referenced object was already deleted (true,

false)

• lock() creates a std::shared_ptr that manages the referenced object

44/57

std::weak_ptr - Weak Pointer 3/3

# include <memory>

auto sh_ptr1 = std::make_shared<int>();

cout << sh_ptr1.use_count(); // print 1

std::weak_ptr<int> w_ptr = sh_ptr1;

cout << w_ptr.use_count(); // print 1

auto sh_ptr2 = w_ptr.lock();

cout << w_ptr.use_count(); // print 2 (sh_ptr1 + sh_ptr2)

sh_ptr1 = nullptr;

cout << w_ptr.expired(); // print false

sh_ptr2 = nullptr;

cout << w_ptr.expired(); // print true

45/57

Concurrency

Overview

C++11 introduces the Concurrency library to simplify managing OS threads

# include <iostream>

# include <thread>

void f() {

std::cout << "first thread" << std::endl;

}

int main(){

std::thread th(f);

th.join(); // stop the main thread until "th" completes

}

How to compile:

$ g++ -std=c++11 main.cpp -pthread

46/57

Example

# include <iostream>

# include <thread>

# include <vector>

void f(int id) {

std::cout << "thread " << id << std::endl;

}

int main() {

std::vector<std::thread> thread_vect; // thread vector

for (int i = 0; i < 10; i++)

thread_vect.push_back( std::thread(&f, i) );

for (auto& th : thread_vect)

th.join();

thread_vect.clear();

for (int i = 0; i < 10; i++) { // thread + lambda expression

thread_vect.push_back(

std::thread( [](){ std::cout << "thread\n"; } ) );

}

for (auto& th : thread_vect) // join before the threads are destroyed

th.join();

}

47/57

Thread Methods 1/2

Library methods:

• std::this_thread::get_id() returns the thread id

• std::this_thread::sleep_for( sleep_duration )

Blocks the execution of the current thread for at least the specified sleep_duration

• std::thread::hardware_concurrency() returns the number of concurrent threads

supported by the implementation

Thread object methods:

• get_id() returns the thread id

• join() waits for a thread to finish its execution

• detach() permits the thread to execute independently of the thread handle

48/57

Thread Methods 2/2

# include <chrono> // the following program could

# include <iostream> // produce the output (not deterministic):

# include <thread> // "child thread exit" (t_child < t_main)

// "main thread exit"

int main() {

using namespace std::chrono_literals;

std::cout << std::this_thread::get_id();

std::cout << std::thread::hardware_concurrency(); // e.g. print 6

auto lambda = []() {

std::this_thread::sleep_for(1s); // t_child

std::cout << "child thread exit\n";

};

std::thread child(lambda);

child.detach(); // without detach(), child must join() the

// main thread (run-time error otherwise)

std::this_thread::sleep_for(2s); // t_main

std::cout << "main thread exit\n";

}

49/57

Parameters Passing

Passing parameters by-value or by-pointer to a thread function works in the same way
as for a standard function. Pass-by-reference requires a special wrapper ( std::ref ,
std::cref ) because std::thread copies its arguments

# include <iostream>
# include <thread>

void f(int& a, const int& b) {
    a = 7 * b;
}

int main() {
    int a = 1, b = 2;

    // std::thread th1(f, a, b); // wrong!!! the arguments are copied and a copy
    //                           // cannot bind to "int&" (compile error)

    std::thread th2(f, std::ref(a), std::cref(b)); // correct
    th2.join();
    std::cout << a << std::endl; // print 14 (7 * b)
}

50/57

Mutex (The Problem) 1/4

The following code produces (in general) a value < 1000:

# include <chrono>

# include <iostream>

# include <thread>

# include <vector>

void f(int& value) {
    for (int i = 0; i < 10; i++) {
        value++; // data race: unsynchronized read-modify-write
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {

int value = 0;

std::vector<std::thread> th_vect;

for (int i = 0; i < 100; i++)

th_vect.push_back( std::thread(f, std::ref(value)) );

for (auto& it : th_vect)

it.join();

std::cout << value;

}

51/57

Mutex 2/4

C++11 provides the mutex class as a synchronization primitive to protect shared data
from being simultaneously accessed by multiple threads

mutex methods:

• lock() locks the mutex, blocks if the mutex is not available

• try_lock() tries to lock the mutex, returns immediately (with false) if the mutex is not available

• unlock() unlocks the mutex

More advanced mutex types can be found here: en.cppreference.com/w/cpp/thread

C++ includes three mutex wrappers that release the mutex automatically (RAII):

• lock_guard (C++11) implements a strictly scope-based mutex ownership wrapper

• unique_lock (C++11) implements a movable mutex ownership wrapper

• shared_lock (C++14) implements a movable shared mutex ownership wrapper

52/57

Mutex - Example 1 3/4

# include <mutex>

# include <thread> // + iostream, vector, chrono

void f(int& value, std::mutex& m) {
    for (int i = 0; i < 10; i++) {
        m.lock();
        value++; // other threads must wait
        m.unlock();
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {

std::mutex m;

int value = 0;

std::vector<std::thread> th_vect;

for (int i = 0; i < 100; i++)

th_vect.push_back( std::thread(f, std::ref(value), std::ref(m)) );

for (auto& it : th_vect)

it.join();

std::cout << value;

}

53/57

Mutex - Example 2 4/4

# include <mutex>

# include <thread> // + iostream, vector, chrono

void f(int& value, std::mutex& m) {
    for (int i = 0; i < 10; i++) {
        {
            const std::lock_guard<std::mutex> lock(m);
            value++; // other threads must wait
        } // the mutex is released here
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {

std::mutex m;

int value = 0;

std::vector<std::thread> th_vect;

for (int i = 0; i < 100; i++)

th_vect.push_back( std::thread(f, std::ref(value), std::ref(m)) );

for (auto& it : th_vect)

it.join();

std::cout << value;

}

54/57

Atomic

The std::atomic (C++11) class template defines atomic types whose operations are free
of data races. For primitive types, they are usually implemented with lock-free
operations (much faster than locks)

# include <atomic> // chrono, iostream, thread, vector

void f(std::atomic<int>& value) {
    for (int i = 0; i < 10; i++) {
        value++; // atomic read-modify-write
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}

int main() {

std::atomic<int> value(0);

std::vector<std::thread> th_vect;

for (int i = 0; i < 100; i++)

th_vect.push_back( std::thread(f, std::ref(value)) );

for (auto& it : th_vect)

it.join();

std::cout << value; // print 1000

}

55/57

Task-based parallelism 1/2

The future library provides facilities to obtain values that are returned and to catch

exceptions that are thrown by asynchronous tasks

Asynchronous call: std::future async(function, args...)

runs a function asynchronously (potentially in a new thread)

and returns a std::future object that will hold the result

std::future methods:

• T get() returns the result

• wait() waits for the result to become available

async() can be called with two launch policies:

• std::launch::async a new thread is launched to execute the task asynchronously

• std::launch::deferred the task is executed on the calling thread the first time its
  result is requested (lazy evaluation)

56/57

Task-based parallelism 2/2

# include <future> // numeric, algorithm, vector, iostream

template <typename RandomIt>

int parallel_sum(RandomIt beg, RandomIt end) {

auto len = end - beg;

if (len < 1000) // base case

return std::accumulate(beg, end, 0);

RandomIt mid = beg + len / 2;

auto handle = std::async(std::launch::async, // right side

parallel_sum<RandomIt>, mid, end);

int sum = parallel_sum(beg, mid); // left side

return sum + handle.get(); // left + right

}

int main() {

std::vector<int> v(10000, 1); // init all to 1

std::cout << "The sum is " << parallel_sum(v.begin(), v.end());

}

57/57

Modern C++

Programming

22. Performance Optimization I

Basic Concepts

Federico Busato

2025-04-14

Table of Contents

1 Introduction

Moore’s Law

Moore’s Law Limitations

Reasons for Optimizing

1/65

Table of Contents

2 Basic Concepts

Asymptotic Complexity

Time-Memory Trade-oﬀ

Developing Cycle

Amdahl’s Law

Throughput, Bandwidth, Latency

Performance Bounds

Arithmetic Intensity

2/65

Table of Contents

3 Basic Architecture Concepts

Instruction Throughput (IPC), In-Order, and Out-of-Order Execution

Instruction Pipelining

Instruction-Level Parallelism (ILP)

Little’s Law

Data-Level Parallelism (DLP) and Vector Instructions (SIMD)

Thread-Level Parallelism (TLP)

Single Instruction Multiple Threads (SIMT)

RISC, CISC Instruction Sets

3/65

Table of Contents

4 Memory Concepts

Memory Hierarchy Concepts

Memory Locality

Core-to-Core Latency and Thread Aﬃnity

Memory Ordering Model

4/65

Introduction

Performance and Technological Progress

5/65

Performance and Technological Progress

6/65

Moore’s Law 1/2

“The number of transistors incorporated in a chip will approximately

double every 24 months.” (40% per year)

Gordon Moore, Intel co-founder

7/65

Moore’s Law 2/2

Moore’s Law is not (yet) dead, but the same growth does not hold for clock
frequency, single-thread performance, power consumption, and cost

8/65

Single-Thread Performance Trend

9/65

Moore’s Law Limitations 1/3

Higher performance over time is not merely dictated by the number of transistors.

Specific hardware improvements, software engineering, and algorithms play a crucial
role in driving computer performance.

10/65

Moore’s Law Limitations - Some Examples 2/3

Specialized Hardware

Reduced precision, matrix multiplication engine, and sparsity provided orders

of magnitude performance improvement for AI applications

Forget Moore’s Law. Algorithms drive technology forward

“Algorithmic improvements make more eﬃcient use of existing resources and allow

computers to do a task faster, cheaper, or both. Think of how easy the smaller MP3

format made music storage and transfer. That compression was because of an algorithm.”

• There’s plenty of room at the Top: What will drive computer performance after

Moore’s law?

• Forget Moore’s Law

• Heeding Huang’s Law

11/65

Moore’s Law Limitations - Some Examples 3/3

Poisson’s equation solver on a cube of size N = n^3

Year  Method       Reference                Storage  Complexity
1947  GE (banded)  Von Neumann & Goldstine  n^5      n^7
1950  Optimal SOR  Young                    n^3      n^4 log n
1971  CG           Reid                     n^3      n^3.5 log n
1984  MG           Brandt                   n^3      n^3

Tile Low-rank Methods and Applications, David Keyes

12/65

Reasons for Optimizing

• In the ﬁrst decades, the computer performance was extremely limited. Low-level

optimizations were essential to fully exploit the hardware

• Modern systems provide much higher performance, but we can no longer rely on
hardware improvements in the short term

• Performance and eﬃciency add market value (fast program for a given task), e.g.

search, page loading, etc.

• Optimized code uses fewer resources, e.g. in a program that runs on a server for
months or years, a small reduction in the execution time/power consumption
translates into large overall savings

13/65

Software Optimization is Complex

from “Speed is Found in the Minds of People”,

Andrei Alexandrescu, CppCon 2019

14/65

Optimization Books

Hacker’s Delight (2nd)

H. S. Warren, 2016

Optimized C++

K. Guntheroth, 2014

15/65

References

• Awesome C/C++ performance optimization resources, Bartlomiej Filipek

• Optimizing C++, wikibook

• Optimizing software in C++, Agner Fog

• Algorithmica: Algorithms for Modern Hardware

• What scientists must know about hardware to write fast code

Figure references

• A Look Back at Single-Threaded CPU Performance

• Herb Sutter, The Free Lunch Is Over

• Genomic Analysis at Scale: Mapping Irregular Computations to Advanced

Architectures

• microprocessor-trend-data

• What is Moore’s Law?

16/65

Basic Concepts

Asymptotic Complexity 1/2

Asymptotic analysis estimates the execution time or memory usage as a
function of the input size (the order of growth)

The asymptotic behavior is opposed to a low-level analysis of the code

(instruction/loop counting/weighting, cache accesses, etc.)

Drawbacks:

• The worst-case is not the average-case

• Asymptotic complexity does not consider small inputs (think of insertion sort)

• The hidden constant can be relevant in practice

• Asymptotic complexity does not consider instructions cost and hardware details

17/65

Asymptotic Complexity 2/2

Be aware that only real-world problems with a small asymptotic complexity or small

size can be solved in a “user” acceptable time

Three examples:

• Sorting: O (n log n), try to sort an array of some billion elements

• Diameter of a (sparse) graph: O(V^2), just for graphs with a few hundred
thousand vertices it becomes impractical without advanced techniques

• Matrix multiplication: O(N^3), even for small sizes N (e.g. 8K, 16K), it requires
special accelerators (e.g. GPU, TPU, etc.) for achieving acceptable performance

18/65

Time-Memory Trade-oﬀ

The time-memory trade-oﬀ is a way of solving a problem or calculation in less time

by using more storage space (less often the opposite direction)

Examples:

• Memoization (e.g. used in dynamic programming): returning the cached result

when the same inputs occur again

• Hash table: number of entries vs. eﬃciency

• Lookup tables: precomputed data instead of branches

• Uncompressed data: bitmap image vs. jpeg

19/65

Developing Cycle 1/3

“If you’re not writing a program, don’t use a programming language”

Leslie Lamport, Turing Award

“First solve the problem, then write the code”

“Inside every large program is an algorithm trying to get out”

Tony Hoare, Turing Award

“Premature optimization is the root of all evil”

Donald Knuth, Turing Award

“Code for correctness ﬁrst, then optimize!”

20/65

Developing Cycle 2/3

21/65

Developing Cycle 3/3

• One of the most important phases of the optimization cycle is
application profiling, which finds the regions of code that are critical for
performance (hotspots)

→ Expensive code regions (absolute)

→ Code regions executed many times (cumulative)

• Most of the time, there is no perfect algorithm for all cases (e.g.
insertion, merge, radix sort). Optimizing also means finding the correct
heuristics for different program inputs/platforms instead of modifying the
existing code

22/65

Amdahl’s Law 1/3

Amdahl’s Law

Amdahl’s law expresses the maximum improvement possible by improving a
particular part of a system

Observation: The performance of any system is constrained by the speed of the
slowest point

P : portion of the system that is improved (fraction of the execution time)

S : improvement factor of the portion P

23/65

Amdahl’s Law 2/3

$$\text{Overall Improvement} = \frac{1}{(1 - P) + \frac{P}{S}}$$

P \ S 25% 50% 75% 2x 3x 4x 5x 10x ∞

10% 1.02x 1.03x 1.04x 1.05x 1.07x 1.08x 1.09x 1.10x 1.11x

20% 1.04x 1.07x 1.09x 1.11x 1.15x 1.18x 1.19x 1.22x 1.25x

30% 1.06x 1.11x 1.15x 1.18x 1.25x 1.29x 1.31x 1.37x 1.43x

40% 1.09x 1.15x 1.20x 1.25x 1.36x 1.43x 1.47x 1.56x 1.67x

50% 1.11x 1.20x 1.27x 1.33x 1.50x 1.60x 1.66x 1.82x 2.00x

60% 1.14x 1.25x 1.35x 1.43x 1.67x 1.82x 1.92x 2.17x 2.50x

70% 1.16x 1.30x 1.43x 1.54x 1.88x 2.10x 2.27x 2.70x 3.33x

80% 1.19x 1.36x 1.52x 1.67x 2.14x 2.50x 2.78x 3.57x 5.00x

90% 1.22x 1.43x 1.63x 1.82x 2.50x 3.08x 3.57x 5.26x 10.00x

24/65

Amdahl’s Law 3/3

note: s is the portion of the system that cannot be improved

25/65

Throughput, Bandwidth, Latency

The throughput is the rate at which operations are performed

Peak throughput:

(CPU speed in Hz) x (CPU instructions per cycle) x

(number of CPU cores) x (number of CPUs per node)

NOTE: modern processors have more than one computation unit

The memory bandwidth is the amount of data that can be loaded from or stored into

a particular memory space

Peak bandwidth:

(Frequency in Hz) x (Bus width in bit / 8) x (Pump rate, memory type multiplier)

The latency is the amount of time needed for an operation to complete

26/65

Performance Bounds 1/2

The performance of a program is bounded by one or more aspects of its computation.

This is also strictly related to the underlying hardware

• Memory-bound. The program spends its time primarily in performing memory

accesses. The performance is limited by the memory bandwidth (less commonly,
memory-bound also refers to the amount of memory available)

• Compute-bound (Math-bound). The program spends its time primarily in

computing arithmetic instructions. The performance is limited by the speed of the

CPU

27/65

Performance Bounds 2/2

• Latency-bound. The program spends its time primarily waiting for data to be
ready (instruction/memory dependencies). The performance is limited by the
latency of the CPU/memory

• I/O Bound. The program spends its time primarily in performing I/O operations

(network, user input, storage, etc.). The performance is limited by the speed of

the I/O subsystem

28/65

Arithmetic Intensity 1/6

Arithmetic Intensity

Arithmetic/Operational Intensity is the ratio of total operations to total data

movement (bytes or words)

The arithmetic intensity is a fundamental metric to understand the performance

limitations of a system, namely compute-bound or memory-bound

The rooﬂine model uses the arithmetic intensity to visually assess the performance of

a system and the algorithms/implementations that execute on it

29/65

Arithmetic Intensity - Ideal Matrix Multiplication 2/6

The naive matrix multiplication algorithm requires N^3 · 2 floating-point operations*
(multiplication + addition) and operates on (N^2 · 4B) · 3 data

Considering an ideal system, where each matrix entry is accessed only once, and the
float data type

$$R = \frac{\text{ops}}{\text{bytes}} = \frac{2N^3}{12N^2} = \frac{N}{6}$$

which means that for every byte accessed, the algorithm performs N/6 operations → compute-bound

* What Is a Flop?

30/65

Arithmetic Intensity - Real Matrix Multiplication, Basic Algorithm 3/6

Assuming N a large value (N ∗ N ≫ cache size), the basic algorithm is equivalent to a

dot product for each entry of the output matrix. The algorithm performs 2N

operations and involves N

∗ 4B data movement (excluding storing the results on C)

for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        float sum = 0;
        for (int k = 0; k < N; k++)
            sum += A[i][k] * B[k][j]; // row-major order
        C[i][j] = sum;
    }
}

ops

bytes

12N

→ memory-bound

31/65

Arithmetic Intensity - Real Matrix Multiplication, Blocked Algorithm 4/6

One of the main optimizations in matrix multiplication is to organize the computation

by partitioning the matrices into blocks (or tiles). The primary goal is to take

advantage of the memory hierarchy to improve data locality

While blocked matrix multiplication doesn’t change the number of operations, it

signiﬁcantly reduces data movement out of main memory

32/65

Arithmetic Intensity - Real Matrix Multiplication, Blocked Algorithm 5/6

By selecting blocks of optimal size, we can reduce the data movement by a factor

proportional to the block size. The computation can be viewed as a sequence of dot

products, one for each block in the output matrix

Considering an optimal block size B to fully

exploit the caches

ops

bytes





→ compute-bound

33/65

Arithmetic Intensity - Performance Numbers 6/6

N      Operations  Data Movement  Ratio  Exec. Time
512    268 · 10^6  3 MB           85     2 ms
1024   2 · 10^9    12 MB          170    21 ms
2048   17 · 10^9   50 MB          341    170 ms
4096   137 · 10^9  201 MB         682    1.3 s
8192   1 · 10^12   806 MB         1365   11 s
16384  9 · 10^12   3 GB           2730   90 s

A modern CPU performs 100 GFlops, and has about 50 GB/s memory bandwidth

34/65

Basic Architecture

Concepts

Instruction Throughput (IPC), In-Order, and Out-of-Order Execution

The processor throughput, namely the number of instructions that can be executed in
a unit of time, is measured in Instructions per Cycle (IPC).

It is worth noting that most instructions require multiple clock cycles (Cycles Per
Instruction, CPI). Therefore improving the IPC requires advanced hardware support

In-Order Execution (IOE) refers to the sequential processing of instructions in the
exact order they appear in the program

Out-of-Order Execution (OOE) refers to the execution of instructions based on the
availability of input data and execution units, rather than their original order in the
program

35/65

Instruction Pipelining 1/3

Out-of-order execution on a scalar processor (single instruction at a time) is
implemented through instruction pipelining, which consists of dividing instructions into
stages performed by different processor units, allowing different parts of instructions to
be processed in parallel

Instruction pipelining breaks up the processing of instructions into several steps, allowing
the processor to avoid stalls that occur when the data needed to execute an instruction
is not immediately available. The processor avoids stalls by filling slots with other
instructions that are ready

36/65

Instruction Pipelining 2/3

Fetch: The processor retrieves an instruction from memory

Decode: Instruction interpretation and preparation for execution, determining

what operations it calls for

Execute: The processor carries out the instruction

Memory Access: Reading from or writing to memory (if needed)

Write-back: The results of the instruction execution are written back to the

processor’s registers or memory

37/65

Instruction Pipelining 3/3

Microarchitecture  Pipeline stages
Core               14
Bonnell            16
Sandy Bridge       14
Silvermont         14 to 17
Haswell            14
Skylake            14
Kabylake           14

The pipeline eﬃciency is aﬀected by

• Instruction stalls, e.g. cache miss, an execution unit not available, etc.

• Bad speculation, branch misprediction

38/65

Instruction-Level Parallelism (ILP) 1/2

A superscalar processor is a type of microprocessor architecture that allows for the

execution of multiple instructions in parallel during a single clock cycle. This is

achieved by incorporating multiple execution units within the processor

The concept should not be confused with instruction pipelining, which decomposes the
instruction processing into stages. Modern processors combine both techniques to
improve the IPC

Instruction-Level Parallelism (ILP) is a measure of how many instructions in a

program can be executed simultaneously by issuing independent instructions in

sequence.

ILP is achieved with out-of-order execution or with the SIMT programming model (see

next slides)

39/65

Instruction-Level Parallelism (ILP) 2/2

for (int i = 0; i < N; i++) // with no optimizations, the loop

C[i] = A[i] * B[i]; // is executed in sequence

can be rewritten as:

for (int i = 0; i < N; i += 4) { // four independent multiplications

C[i] = A[i] * B[i]; // per iteration

C[i + 1] = A[i + 1] * B[i + 1]; // A, B, C do not alias

C[i + 2] = A[i + 2] * B[i + 2];

C[i + 3] = A[i + 3] * B[i + 3];

}

40/65

Instruction-Level Parallelism and Little’s Law

Little’s Law expresses the relation between latency and throughput. The
throughput of a system λ is equal to the number of elements in the system divided by
the average time spent (latency) W for each element in the system:

$$L = \lambda W \;\rightarrow\; \lambda = \frac{L}{W}$$

• L: average number of customers in a store

• λ: arrival rate (throughput)

• W : average time spent (latency)

41/65

Data-Level Parallelism (DLP) and Vector Instructions (SIMD)

Data-Level Parallelism (DLP) refers to the execution of the same operation on

multiple data in parallel

Vector processors or array processors provide SIMD (Single Instruction-Multiple Data)

or vector instructions for exploiting data-level parallelism

The popular vector instruction sets are:

MMX MultiMedia eXtension. 64-bit width (Intel, AMD)

SSE (SSE2, SSE3, SSE4) Streaming SIMD Extensions. 128-bit width (Intel, AMD)

AVX (AVX, AVX2, AVX-512) Advanced Vector Extensions. 256-bit width (AVX, AVX2), 512-bit width (AVX-512) (Intel, AMD)

NEON Media Processing Engine. 128-bit width (ARM)

SVE (SVE, SVE2) Scalable Vector Extension. 128-2048 bit width (ARM)

42/65

Thread-Level Parallelism (TLP)

A thread is a single sequential execution ﬂow within a program with its state

(instructions, data, PC, register state, and so on)

Thread-level parallelism (TLP) refers to the execution of separate computation
“threads” on different processing units (e.g. CPU cores)

43/65

Single Instruction Multiple Threads (SIMT)

An alternative approach to the classical data-level parallelism is Single Instruction

Multiple Threads (SIMT), where multiple threads execute the same instruction

simultaneously, with each thread operating on diﬀerent data.

GPUs are successful examples of SIMT architectures.

SIMT can be thought of as an evolution of SIMD (Single Instruction Multiple Data).

SIMD requires that all data processed by the instruction be of the same type and

requires no dependencies or inter-thread communication. On the other hand, SIMT is

more ﬂexible and does not have these restrictions. Each thread has access to its own

memory and can operate independently.

44/65

RISC, CISC Instruction Sets

The Instruction Set Architecture (ISA) is an abstract model of the CPU to

represent its behavior. It consists of addressing modes, instructions, data types,

registers, memory architecture, interrupt, etc.

It does not deﬁne how an instruction is processed

The microarchitecture (µarch) is the implementation of an ISA which includes

pipelines, caches, etc.

45/65

CISC

Complex Instruction Set Computer (CISC)

• Complex instructions for special tasks even if used infrequently

• Assembly instructions follow software. Little compiler eﬀort for translating

high-level language into assembly

• Initially designed for saving cost of computer memory and disk storage (1960)

• High number of instructions with diﬀerent size

• Instructions require complex micro-ops decoding (translation) for exploiting ILP

• Multiple low-level instructions per clock but with high latency

Hardware implications

• High number of transistors

• Extra logic for decoding. Heat dissipation

• Hard to scale

46/65

RISC

Reduced Instruction Set Computer (RISC)

• Simple instructions

• Small number of instructions with ﬁxed size

• 1 clock per instruction

• Assembly instructions do not follow software

• No instruction decoding

Hardware implications

• High ILP, easy to schedule

• Small number of transistors

• Little power consumption

• Easy to scale

47/65

Instruction Set Comparison

x86 Instruction set

MOV AX, 15; AH = 0x00, AL = 0x0F

AAA; AH = 0x01, AL = 0x05

RET

ARM Instruction set

MOV R3, #0x10

AND R2, R0, #0xF

CMP R2, R3

IT LT

BLT elsebranch

ADD R2, #0x6

ADD R1, #0x1

elsebranch:

END

48/65

CISC vs. RISC

• Hardware market:

- RISC (ARM, IBM): Qualcomm Snapdragon, Amazon Graviton, Nvidia Grace,
Nintendo Switch, Fujitsu Fugaku, Apple M1, Apple iPhone/iPod/Mac, Tesla
Full Self-Driving Chip, PowerPC

- CISC (Intel, AMD): all x86-64 processors

• Software market:

- RISC: Android, Linux, Apple OS, Windows

- CISC: Windows, Linux

• Power consumption:

- CISC: Intel i5 10th Generation: 64W

- RISC: Arm-based smartphone < 5W

49/65

ARM Quote

“Incidentally, the ﬁrst ARM1 chips required so little power, when the

ﬁrst one from the factory was plugged into the development system to

test it, the microprocessor immediately sprung to life by drawing current

from the IO interface – before its own power supply could be properly

connected.”

Happy birthday, ARM1. It is 35 years since Britain’s Acorn RISC Machine chip

sipped power for the first time

50/65

Memory Concepts

The Von Neumann Bottleneck

Access to memory dominates other costs in a processor

51/65

The Von Neumann Bottleneck

The eﬃciency of computer architectures is limited by the Memory Wall

problem, namely the memory is the slowest part of the system

Moving data to and from main memory consumes the vast majority of time and

energy of the system

52/65

Memory Hierarchy 1/5

53/65

Memory Hierarchy 2/5

Modern architectures rely on complex memory hierarchy (primary memory, caches,

registers, scratchpad memory, etc.). Each level has diﬀerent characteristics and

constraints (size, latency, bandwidth, concurrent accesses, etc.)

1 byte of RAM (1946) IBM 5MB hard drive (1956)

twitter.com/MIT_CSAIL

54/65

Memory Hierarchy 3/5

Source:

“Accelerating Linear Algebra on Small Matrices from Batched BLAS to Large Scale Solvers",

ICL, University of Tennessee

55/65

Memory Hierarchy 4/5

Intel Alder Lake 12th-gen Core-i9-12900k (Q1’21) + DDR4-3733 example:

Hierarchy level  Size    Latency     Latency Ratio  Bandwidth   Bandwidth Ratio
L1 cache         192 KB  1 ns        1.0x           1,600 GB/s  1.0x
L2 cache         1.5 MB  3 ns        3x             1,200 GB/s  1.3x
L3 cache         12 MB   6 - 20 ns   6-20x          900 GB/s    1.7x
DRAM             /       50 - 90 ns  50-90x         80 GB/s     20x
SSD Disk (swap)  /       70 µs       7·10^4x        2 GB/s      800x
HDD Disk (swap)  /       10 ms       10^7x          2 GB/s      800x

• en.wikichip.org/wiki/WikiChip

• Memory Bandwidth Napkin Math

56/65

Memory Hierarchy 5/5

“Thinking differently about memory accesses, a good start is to get rid of the
idea of O(1) memory access and replace it with O(√N)” - The Myth of RAM

Algorithmica: Memory Latency

57/65

Memory Hierarchy Concepts 1/4

A cache is a small and fast memory located close to the processor that stores

frequently used instructions and data. It is part of the processor package and takes 40

to 60 percent of the chip area

Characteristics and content:

Registers Program counter (PC), General purpose registers, Instruction Register

(IR), etc.

L1 Cache Instruction cache and data cache, private/exclusive per CPU core,

located on-chip

L2 Cache Private/exclusive per single CPU core or a cluster of cores, located

oﬀ-chip

L3 Cache Shared between all cores and located oﬀ-chip (e.g. motherboard), up to

128/256MB

58/65

Memory Hierarchy Concepts 2/4

59/65

Memory Hierarchy Concepts 3/4

A cache line or cache block is the unit of data transfer between the cache and main

memory, namely the memory is loaded at the granularity of a cache line. A cache line

can be further organized in banks or sectors

The typical size of the cache line is 64 bytes on x86-64 architectures (Intel, AMD),

while it is 128 bytes on Arm64

Cache access type:

Hot Closest-processor cached, L1

Warm L2 or L3 caches

Cold First load, cache empty

60/65

Memory Hierarchy Concepts 4/4

• A cache hit occurs when a requested data is successfully found in the cache

memory

• The cache hit rate is the number of cache hits divided by the number of memory
requests

• A cache miss occurs when a requested data is not found in the cache memory

• The miss penalty refers to the extra time required to load the data into cache

from the main memory when a cache miss occurs

• A page fault occurs when a requested data is in the process address space, but it

is not currently located in the main memory (swap/pageﬁle)

• Page thrashing occurs when page faults are frequent and the OS spends
significant time swapping data in and out of the physical RAM

61/65

Memory Locality

• Spatial Locality refers to the use of data elements within

relatively close storage locations e.g. scan arrays in increasing order, matrices by

row. It involves mechanisms such as memory prefetching and access granularity

When spatial locality is low, many words in the cache line are not used

• Temporal Locality refers to the reuse of the same data within a relatively
short time span, which, as a consequence, exploits the lower levels of the memory
hierarchy (caches), e.g. multiple sparse accesses

Heavily used memory locations can be accessed more quickly than less heavily

used locations

62/65

Core-to-Core Latency

The slowing of Moore’s Law and the collapse of Dennard scaling necessitated the

hierarchical organization of caches and processors in the CPU. Today, CPUs organize

their cores into clusters, chiplets, and multi-sockets. As a result, how execution threads

are mapped to cores has a signiﬁcant impact on the overall performance

Core-to-Core

Latency Heatmap:

63/65

Thread Aﬃnity

The thread aﬃnity refers to the binding of a thread to a speciﬁc execution unit. The

goal of thread aﬃnity is improving the application performance by taking advantage of

cache locality and optimizing resource usage

Setting CPU aﬃnity can be done programmatically, such as using the

pthread_setaffinity_np function for POSIX threads, or at OS level with the

taskset command and the sched_setaffinity system call on Linux

*Dennard Scaling: power is proportional to the area of the transistor

CPU Affinity: Because Even A Single Chip Is Nonuniform

64/65

Memory Ordering Model

• Source code order: The order in which the memory operations are speciﬁed in

the source code, e.g. subscript, dereferencing

• Program order: The order in which the memory operations are speciﬁed at

assembly level. Compilers can reorder instructions as part of the optimization

process

• Execution order: The order in which the individual memory-reference

instructions are executed on a given CPU, e.g., out-of-order execution

• Perceived order: The order in which a CPU perceives its memory operations.

The perceived order can diﬀer from the execution order due to caching,

interconnect, and memory-system optimizations

C++ Memory Model: Migrating from X86 to ARM

65/65

Modern C++

Programming

23. Performance Optimization II

Code Optimization

Federico Busato

2025-04-14

Table of Contents

1 I/O Operations

printf

Memory Mapped I/O

Speed Up Raw Data Loading

2 Memory Optimizations

Heap Memory

Stack Memory

Cache Utilization

Memory Alignment

Memory Prefetch

1/87

Table of Contents

3 Arithmetic Types

Data Types

Arithmetic Operations

Conversion

Floating-Point

Compiler Intrinsic Functions

Value in a Range

Lookup Table

2/87

Table of Contents

4 Control Flow

Branches

Branch Hints - [[likely]] / [[unlikely]]

Signed/Unsigned Integers

Loops

Loop Hoisting

Loop Unrolling

Assertions

Compiler Hints - [[assume]]/std::unreachable()

Recursion

3/87

Table of Contents

5 Functions

Function Call Cost

Argument Passing

Function Inlining

Function Attributes

Pointers Aliasing

6 Object-Oriented Programming

7 Std Library and Other Language Aspects

4/87

I/O Operations

I/O Operations are orders of magnitude slower than

memory accesses

5/87

I/O Streams

In general, input/output operations are among the most expensive operations

• Use endl for ostream only when it is strictly necessary (prefer \n )

• Disable synchronization with printf/scanf :
  std::ios_base::sync_with_stdio(false)

• Disable IO flushing when mixing istream/ostream calls:
  <istream_obj>.tie(nullptr);

• Increase the IO buffer size:
  file.rdbuf()->pubsetbuf(buffer_var, buffer_size);

6/87

I/O Streams - Example

#include <fstream> // std::ifstream
#include <iostream>

int main() {

std::ifstream fin;

// --------------------------------------------------------

std::ios_base::sync_with_stdio(false); // sync disable

fin.tie(nullptr); // flush disable

// buffer increase

const int BUFFER_SIZE = 1024 * 1024; // 1 MB

char buffer[BUFFER_SIZE];

fin.rdbuf()->pubsetbuf(buffer, BUFFER_SIZE);

// --------------------------------------------------------

fin.open(filename); // Note: open() after optimizations

// IO operations

fin.close();

}

7/87

printf

• printf is faster than ostream (see speed test link)

• A printf call with a simple format string ending with \n is converted to a puts() call

printf("Hello World\n");

printf("%s\n", string);

• No optimization if the string does not end with \n or if the format string contains % conversions other than the exact "%s\n" form shown above

www.ciselant.de/projects/gcc_printf/gcc_printf.html

8/87

Memory Mapped I/O

A memory-mapped file is a segment of virtual memory that has been assigned a direct byte-for-byte correlation with some portion of a file

Benefits:

• Orders of magnitude faster than system calls

• Input can be “cached” in RAM memory (page/file cache)

• A file requires disk access only when a new page boundary is crossed

• Memory-mapping may bypass the page/swap file completely

• Load and store raw data (no parsing/conversion)

9/87

Memory Mapped I/O - Example 1/2

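#if !defined(__linux__)
    #error It works only on linux
#endif
#include <cstdio>      // std::perror
#include <cstdlib>     // std::exit
#include <fcntl.h>     // ::open
#include <string>      // std::string, std::stoll
#include <sys/mman.h>  // ::mmap
#include <sys/stat.h>  // ::open
#include <sys/types.h> // ::open
#include <unistd.h>    // ::lseek, ::write, ::close

// ERROR is not defined in the slides: this is an assumed minimal definition
#define ERROR(msg) { std::perror(msg); std::exit(EXIT_FAILURE); }

// usage: ./exec <file> <byte_size> <mode>
int main(int argc, char* argv[]) {
    if (argc != 4)
        return EXIT_FAILURE; // expects <file> <byte_size> <mode>
    auto file_size = static_cast<size_t>(std::stoll(argv[2]));
    auto is_read   = std::string(argv[3]) == "READ";
    int  fd        = is_read ? ::open(argv[1], O_RDONLY) :
                     ::open(argv[1], O_RDWR | O_CREAT | O_TRUNC, S_IRUSR | S_IWUSR);
    if (fd == -1)
        ERROR("::open")
    // try to get the last byte
    if (::lseek(fd, static_cast<off_t>(file_size - 1), SEEK_SET) == -1)
        ERROR("::lseek")
    if (!is_read && ::write(fd, "", 1) != 1) // try to write
        ERROR("::write")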

10/87

Memory Mapped I/O - Example 2/2

    auto mm_mode = (is_read) ? PROT_READ : PROT_WRITE;
    // Open Memory Mapped file
    auto mmap_ptr = static_cast<char*>(
        ::mmap(nullptr, file_size, mm_mode, MAP_SHARED, fd, 0) );
    if (mmap_ptr == MAP_FAILED)
        ERROR("::mmap");
    // Advise sequential access
    if (::madvise(mmap_ptr, file_size, MADV_SEQUENTIAL) == -1)
        ERROR("::madvise");
    // MemoryMapped Operations
    // read from/write to "mmap_ptr" as a normal array: mmap_ptr[i]
    // Close Memory Mapped file
    if (::munmap(mmap_ptr, file_size) == -1)
        ERROR("::munmap");
    if (::close(fd) == -1)
        ERROR("::close");
} // end of main()

11/87

Low-Level Parsing 1/2

Consider using optimized (low-level) numeric conversion routines:

template<int N, unsigned MUL, int INDEX = 0>
struct fastStringToIntStr;

// NOTE: in a real source file, the definition of fastStringToIntStr (part 2/2)
// must appear before this function, because ::aux requires a complete type
inline unsigned fastStringToUnsigned(const char* str, int length) {
    switch(length) {
        case 10: return fastStringToIntStr<10, 1000000000>::aux(str);
        case 9:  return fastStringToIntStr< 9, 100000000>::aux(str);
        case 8:  return fastStringToIntStr< 8, 10000000>::aux(str);
        case 7:  return fastStringToIntStr< 7, 1000000>::aux(str);
        case 6:  return fastStringToIntStr< 6, 100000>::aux(str);
        case 5:  return fastStringToIntStr< 5, 10000>::aux(str);
        case 4:  return fastStringToIntStr< 4, 1000>::aux(str);
        case 3:  return fastStringToIntStr< 3, 100>::aux(str);
        case 2:  return fastStringToIntStr< 2, 10>::aux(str);
        case 1:  return fastStringToIntStr< 1, 1>::aux(str);
        default: return 0;
    }
}

12/87

Low-Level Parsing 2/2

template<int N, unsigned MUL, int INDEX>
struct fastStringToIntStr {
    static inline unsigned aux(const char* str) {
        return static_cast<unsigned>(str[INDEX] - '0') * MUL +
               fastStringToIntStr<N - 1, MUL / 10, INDEX + 1>::aux(str);
    }
};

template<unsigned MUL, int INDEX>
struct fastStringToIntStr<1, MUL, INDEX> {
    static inline unsigned aux(const char* str) {
        return static_cast<unsigned>(str[INDEX] - '0');
    }
};

Faster parsing: lemire.me/blog/tag/simd-swar-parsing
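As a standard-library point of comparison (not part of the template code above), C++17 provides std::from_chars, a locale-independent and non-allocating conversion routine. The sketch below wraps it in a hypothetical helper, parse_unsigned, whose name and signature are assumptions chosen to mirror fastStringToUnsigned. Unlike the template version, it validates the input and reports overflow.

```cpp
#include <cassert>
#include <charconv>      // std::from_chars (C++17)
#include <cstddef>
#include <optional>
#include <system_error>  // std::errc

// Hypothetical helper (not from the slides): parses a base-10 unsigned value
// from exactly `length` characters; returns std::nullopt on invalid input,
// trailing characters, or overflow.
inline std::optional<unsigned> parse_unsigned(const char* str, std::size_t length) {
    assert(str != nullptr);
    unsigned    value = 0;
    const char* last  = str + length;
    auto [ptr, ec]    = std::from_chars(str, last, value);
    if (ec != std::errc{} || ptr != last)
        return std::nullopt;
    return value;
}
```

Whether this is faster or slower than the hand-written template depends on the compiler and standard library; it is worth benchmarking both on the target platform.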

13/87

Speed Up Raw Data Loading 1/2

• Hard disk is orders of magnitude slower than RAM

• Parsing is faster than data reading

• Parsing can be avoided by using binary storage and mmap

• Decreasing the number of hard disk accesses improves the performance → compression

LZ4 is a lossless compression algorithm that provides extremely fast decompression (up to 35% of memcpy speed) and a good compression ratio

github.com/lz4/lz4

Another alternative is Facebook zstd

github.com/facebook/zstd
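A minimal sketch of the "binary storage" idea, assuming 32-bit integers and standard file streams: the values are written as raw bytes and read back with a single read() call, so no text parsing is involved. The helper names save_binary and load_binary are hypothetical, and the resulting file is not portable across machines with different endianness.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <fstream>
#include <string>
#include <vector>

// Hypothetical helper: writes the raw bytes of the values (no text formatting).
inline bool save_binary(const std::string& path,
                        const std::vector<std::int32_t>& values) {
    std::ofstream out(path, std::ios::binary | std::ios::trunc);
    out.write(reinterpret_cast<const char*>(values.data()),
              static_cast<std::streamsize>(values.size() * sizeof(std::int32_t)));
    out.flush();
    return out.good();
}

// Hypothetical helper: loads the whole file with one read() call (no parsing).
inline std::vector<std::int32_t> load_binary(const std::string& path) {
    std::ifstream in(path, std::ios::binary | std::ios::ate); // open at the end
    if (!in)
        return {};
    const std::streamoff end       = in.tellg();              // file size in bytes
    const auto           byte_size = static_cast<std::size_t>(end);
    assert(byte_size % sizeof(std::int32_t) == 0);            // whole values only
    std::vector<std::int32_t> values(byte_size / sizeof(std::int32_t));
    in.seekg(0);
    in.read(reinterpret_cast<char*>(values.data()),
            static_cast<std::streamsize>(byte_size));
    return values;
}
```

The same raw layout can be read through mmap or stored compressed (e.g. with LZ4) instead of going through ifstream.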

14/87

Speed Up Raw Data Loading 2/2

Performance comparison of different methods for loading a 4.8 GB file of integers. The values are stored as text for the ifstream and memory mapped + parsing cases, and as binary values for LZ4

Load Method                             Exec. Time    Speedup
ifstream + parsing                      102,667 ms    1.0x
memory mapped + parsing (first run)      30,235 ms    3.4x
memory mapped + parsing (second run)     22,509 ms    4.5x
memory mapped + lz4 (first run)           3,914 ms    26.2x
memory mapped + lz4 (second run)          1,261 ms    81.4x

NOTE: the size of the LZ4 compressed file is 1.8 GB

15/87

Memory

Optimizations

Heap Memory

• Dynamic heap allocation is expensive: it is implementation dependent and interacts with the operating system

• Many small heap allocations are more expensive than one large memory allocation

The default page size on Linux is 4 KB. For requests that are smaller than a page, or not a multiple of it, the allocator relies on a sub-allocator

• Allocations within the page size are faster than larger allocations (sub-allocator)
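One way to replace many small allocations with a single large one is sketched below: a fixed-capacity pool reserves all the storage up front and then hands out slots without touching the heap again. Point and PointPool are hypothetical names introduced for illustration. The pool never grows, so the returned pointers stay valid for its whole lifetime.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Point { float x = 0, y = 0, z = 0; };  // hypothetical small object

// One large allocation in the constructor, no heap traffic afterwards.
class PointPool {
public:
    explicit PointPool(std::size_t capacity) : storage_(capacity) {}

    // Returns the next free slot; the pool has a fixed capacity.
    Point* acquire() {
        assert(used_ < storage_.size() && "pool exhausted");
        return &storage_[used_++];
    }

    std::size_t used() const { return used_; }

private:
    std::vector<Point> storage_;  // single contiguous heap block
    std::size_t        used_ = 0;
};
```

Besides saving allocator calls, the elements are contiguous in memory, which also helps cache locality.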

16/87

Stack Memory

• Stack memory is faster than heap memory. The stack memory provides high locality, it is small (cache fit), and its size is known at compile-time

• static local arrays produce better code: they avoid filling the stack each time the function is reached

• constexpr arrays with dynamic indexing produce very inefficient code with GCC. Use static constexpr instead

int f(int x) {
    // bad performance with GCC
    // constexpr int array[] = {1,2,3,4,5,6,7,8,9};
    static constexpr int array[] = {1,2,3,4,5,6,7,8,9};
    return array[x];
}

17/87

Cache Utilization

Maximize cache utilization:

• Maximize spatial and temporal locality (see next examples)

• Prefer small data types

• For basic set query and insertion:

◦ Prefer std::vector<bool> over a dynamic array of bool

◦ Prefer std::bitset over std::vector<bool> if the data size is known in advance or bounded. A fixed-size array of bool should always be replaced by std::bitset

◦ Remember that common std algorithms may not be optimized for these containers, e.g. std::count_if , std::find

• Prefer stack data structures instead of heap data structures, e.g. std::vector vs. static_vector
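The set query/insertion advice can be sketched as follows, assuming the elements are small integer IDs with a known upper bound. IdSet and kMaxId are hypothetical names. Each element costs one bit instead of one byte, so the whole set fits in two 64-byte cache lines on mainstream implementations.

```cpp
#include <bitset>
#include <cassert>
#include <cstddef>

constexpr std::size_t kMaxId = 1024;  // hypothetical upper bound on the IDs

// Set of small integer IDs stored as one bit per possible element.
class IdSet {
public:
    void insert(std::size_t id) {
        assert(id < kMaxId);
        bits_.set(id);
    }
    bool contains(std::size_t id) const {
        assert(id < kMaxId);
        return bits_.test(id);
    }
    std::size_t size() const { return bits_.count(); }

private:
    std::bitset<kMaxId> bits_;
};

// Typically 128 bytes for the bitset versus 1024 bytes for bool[1024].
static_assert(sizeof(std::bitset<kMaxId>) < sizeof(bool[kMaxId]),
              "the bitset is expected to be more compact than a bool array");
```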

18/87

Spatial Locality Example 1/2

A, B, C matrices of size N × N

C = A * B

for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        int sum = 0;
        for (int k = 0; k < N; k++)
            sum += A[i][k] * B[k][j]; // row × column
        C[i][j] = sum;
    }
}

C = A * B^T

for (int i = 0; i < N; i++) {
    for (int j = 0; j < N; j++) {
        int sum = 0;
        for (int k = 0; k < N; k++)
            sum += A[i][k] * B[j][k]; // row × row
        C[i][j] = sum;
    }
}

19/87

Spatial Locality Example 2/2

Benchmark:

N          64       128     256     512      1024
A * B      < 1 ms   5 ms    29 ms   141 ms   1,030 ms
A * B^T    < 1 ms   2 ms    6 ms    48 ms    385 ms
Speedup    /        2.5x    4.8x    2.9x     2.7x
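A minimal sketch of how the row × row variant could be applied when B is given in its usual form: B is transposed once (O(N²) extra work) and the O(N³) product then reads both operands along rows. The flat row-major std::vector<int> layout and the names Matrix, transpose, and multiply_transposed are assumptions for illustration, not code from the slides.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

using Matrix = std::vector<int>;  // N x N, row-major: (i, j) is at [i * N + j]

// Returns the transpose of an N x N row-major matrix.
inline Matrix transpose(const Matrix& M, std::size_t N) {
    assert(M.size() == N * N);
    Matrix T(N * N);
    for (std::size_t i = 0; i < N; i++)
        for (std::size_t j = 0; j < N; j++)
            T[j * N + i] = M[i * N + j];
    return T;
}

// Computes C = A * B, reading A and the transposed B along rows only.
inline Matrix multiply_transposed(const Matrix& A, const Matrix& B, std::size_t N) {
    assert(A.size() == N * N && B.size() == N * N);
    const Matrix Bt = transpose(B, N);
    Matrix       C(N * N);
    for (std::size_t i = 0; i < N; i++) {
        for (std::size_t j = 0; j < N; j++) {
            int sum = 0;
            for (std::size_t k = 0; k < N; k++)
                sum += A[i * N + k] * Bt[j * N + k];  // row × row
            C[i * N + j] = sum;
        }
    }
    return C;
}
```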

20/87

Temporal-Locality Example

Speeding up a random-access function

for (int i = 0; i < N; i++)                  // V1
    out_array[i] = in_array[hash(i)];

for (int K = 0; K < N; K += CACHE) {         // V2
    for (int i = 0; i < N; i++) {
        auto x = hash(i);
        if (x >= K && x < K + CACHE)
            out_array[i] = in_array[x];
    }
}

V1 : 436 ms, V2 : 336 ms → 1.3x speedup (temporal locality improvement)

... but it needs a careful evaluation of CACHE , and it can even decrease the performance for other sizes

pre-sorted hash(i) : 135 ms → 3.2x speedup (spatial locality improvement)

lemire.me/blog/2019/04/27

21/87

Memory Alignment

Memory alignment refers to placing data in memory at addresses that conform to certain boundaries, typically powers of two (e.g., 1, 2, 4, 8, 16 bytes, etc.)

Note: For multidimensional data, alignment only means that the start address of the data is aligned, not that the start offsets of all dimensions are aligned. For example, for a 2D matrix, the fact that row[0][0] is aligned doesn't imply that row[1][0] is aligned too: the stride between rows also needs to be a multiple of the alignment

Data alignment is classified in:

• Internal alignment for struct/class layout optimization → reducing memory footprint, optimizing memory bandwidth, and minimizing cache-line misses

• External alignment across several elements of the same type → minimizing cache-line misses, vectorization (SIMD instructions)
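The note on multidimensional data can be illustrated with a small sketch, assuming a 64-byte alignment target and a 3 × 10 matrix of float. Aligning the first element is not enough: each row must also be padded so that the stride is a multiple of the alignment. The names kAlignment, is_aligned, padded_row_length, and matrix are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

constexpr std::size_t kAlignment = 64;  // assumed target alignment in bytes

// True if the address is a multiple of `alignment` (must be a power of two).
inline bool is_aligned(const void* ptr, std::size_t alignment) {
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

// Rounds a row length (in elements) up to a whole number of aligned blocks.
template <typename T>
constexpr std::size_t padded_row_length(std::size_t columns) {
    static_assert(kAlignment % sizeof(T) == 0, "element size must divide the alignment");
    constexpr std::size_t per_block = kAlignment / sizeof(T);
    return (columns + per_block - 1) / per_block * per_block;
}

constexpr std::size_t kRows   = 3;
constexpr std::size_t kCols   = 10;
constexpr std::size_t kStride = padded_row_length<float>(kCols);  // 16 if sizeof(float) == 4

// Row r starts at matrix[r * kStride]: aligned base + stride multiple of 64 bytes.
alignas(kAlignment) static float matrix[kRows * kStride];
```

With the unpadded stride (kCols elements, 40 bytes), only the first row would start on a 64-byte boundary.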

22/87

Internal Structure Alignment

struct A1 {
    char   x1; // offset 0
    double y1; // offset 8!! (not 1)
    char   x2; // offset 16
    double y2; // offset 24
    char   x3; // offset 32
    double y3; // offset 40
    char   x4; // offset 48
    double y4; // offset 56
    char   x5; // offset 64 (65 bytes used, 72 bytes with tail padding)
};

struct A2 { // internal alignment
    char   x1; // offset 0
    char   x2; // offset 1
    char   x3; // offset 2
    char   x4; // offset 3
    char   x5; // offset 4
    double y1; // offset 8
    double y2; // offset 16
    double y3; // offset 24
    double y4; // offset 32 (40 bytes)
};

(1) A1 wastes more than 40% of its memory: 72 bytes including tail padding versus 40 bytes for A2 (typical 64-bit ABI)

(2) Considering an array of structures (AoS) and a cache line of 64 bytes (x64 processors), every access to A1 involves two cache line operations (∼2x slower)
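The offsets annotated in A1 and A2 can be checked at compile time with offsetof and sizeof. The values below are an assumption that holds on typical 64-bit ABIs (x86-64, AArch64), where double is 8-byte aligned; other platforms may lay out the structures differently, in which case the static_asserts fail to compile. kSavedBytes is a hypothetical name.

```cpp
#include <cstddef>  // offsetof, std::size_t

struct A1 {
    char x1; double y1;
    char x2; double y2;
    char x3; double y3;
    char x4; double y4;
    char x5;
};

struct A2 {
    char   x1, x2, x3, x4, x5;
    double y1, y2, y3, y4;
};

// A1: each char is followed by 7 bytes of padding before the next double.
static_assert(offsetof(A1, y1) == 8,  "y1 starts at the next 8-byte boundary");
static_assert(offsetof(A1, x5) == 64, "x5 is the 65th byte");
static_assert(sizeof(A1) == 72,       "65 bytes used + 7 bytes of tail padding");

// A2: the five chars share a single 8-byte slot.
static_assert(offsetof(A2, y1) == 8,  "only 3 bytes of padding after x5");
static_assert(sizeof(A2) == 40,       "8 bytes for the chars + 4 doubles");

// Bytes saved per element by reordering the members.
constexpr std::size_t kSavedBytes = sizeof(A1) - sizeof(A2);
```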