Three C++ misconceptions from C programmers

In this post I’ll try to clarify some of the misconceptions about C++ which I often find in various code bases.

People have their habits

C++ is a complex language. Part of this complexity stems from legacy, and with legacy comes source code which is often bad: code that is difficult to maintain and relies on false assumptions. These assumptions were reinforced in programmers’ minds early on, when the language was at its peak popularity (which was probably some 20 years ago), or (even worse) were adopted from the C world by people who still think that C++ is just C with classes.

The language has evolved and it’s time to clarify some of these (at least for my own sake).

First misconception: static local variables as an optimisation attempt

I find this one quite often. The premise here is that since static local variables are initialised only once, time is saved when re-entering the function subsequently. Here’s an example:

int foo_with_static_int() {
    // here I save time because `i` will be initialised only once
    static const int i = 123;
    return i;
}

Of course, people often defend this approach when the local variable requires more complex initialisation, e.g.:

int vector_with_static() {
    static const std::vector<int> v = {
        1,2,3,4,5,6,7,8,9,10
    };

    // completely arbitrary index
    return v[4];
}

To prove that this assumption is false, let’s use google-benchmark. Here’s my test code (which can be found here as well):

#include <benchmark/benchmark.h>

#include <vector>

int foo_with_static_int() {
    static const int i = 123;
    return i;
}

int foo_without_static_int() {
    const int i = 123;
    return i;
}

int vector_with_static() {
    static const std::vector<int> v = {
        1,2,3,4,5,6,7,8,9,10
    };

    return v[4];
}

int vector_without_static() {
    const std::vector<int> v{
        1,2,3,4,5,6,7,8,9,10
    };

    return v[4];
}

static void BM_foo_with_static_int(benchmark::State& state) {
    for (auto _ : state) {
        foo_with_static_int();
    }
}

static void BM_foo_without_static_int(benchmark::State& state) {
    for (auto _ : state) {
        foo_without_static_int();
    }
}

static void BM_vector_with_static(benchmark::State& state) {
    for (auto _ : state) {
        vector_with_static();
    }
}

static void BM_vector_without_static(benchmark::State& state) {
    for (auto _ : state) {
        vector_without_static();
    }
}

BENCHMARK(BM_foo_with_static_int);
BENCHMARK(BM_foo_without_static_int);
BENCHMARK(BM_vector_without_static);
BENCHMARK(BM_vector_with_static);

BENCHMARK_MAIN();

And here are the results:

Run on (8 X 2500 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 6144 KiB (x1)
Load Average: 2.58, 2.00, 1.78
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_foo_with_static_int          4.81 ns         4.81 ns    145309010
BM_foo_without_static_int       4.81 ns         4.81 ns    145371174
BM_vector_without_static         372 ns          372 ns      1885227
BM_vector_with_static           7.38 ns         7.38 ns     93426760

So… I guess I was wrong? static indeed helps? No! This is a debug build. As soon as you enable optimisations:

meson configure --buildtype release bld

all the gains from that premature optimisation attempt disappear:

Run on (8 X 2500 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x4)
  L1 Instruction 32 KiB (x4)
  L2 Unified 256 KiB (x4)
  L3 Unified 6144 KiB (x1)
Load Average: 2.90, 2.02, 1.81
--------------------------------------------------------------------
Benchmark                          Time             CPU   Iterations
--------------------------------------------------------------------
BM_foo_with_static_int         0.000 ns        0.000 ns   1000000000
BM_foo_without_static_int      0.000 ns        0.000 ns   1000000000
BM_vector_without_static       0.000 ns        0.000 ns   1000000000
BM_vector_with_static          0.291 ns        0.291 ns   1000000000

It’s even visible that the code with static is worse: the compiler cannot optimise the whole lookup away, because it has to guarantee the lifetime of the object, so the check whether the variable has already been initialised (which static unavoidably introduces) will always be there.

It’s very clear what happens once you disassemble the code:

$ objdump --demangle --disassemble-functions="vector_without_static()" ./bld/static/static

./bld/static/static:	file format Mach-O 64-bit x86-64


Disassembly of section __TEXT,__text:

00000001000046e0 vector_without_static():
1000046e0: 55                          	pushq	%rbp
1000046e1: 48 89 e5                    	movq	%rsp, %rbp
1000046e4: b8 05 00 00 00              	movl	$5, %eax
1000046e9: 5d                          	popq	%rbp
1000046ea: c3                          	retq
1000046eb: 0f 1f 44 00 00              	nopl	(%rax,%rax)

… and the version with static:

$ objdump --demangle --disassemble-functions="vector_with_static()" ./bld/static/static

./bld/static/static:	file format Mach-O 64-bit x86-64


Disassembly of section __TEXT,__text:

00000001000045c0 vector_with_static():
1000045c0: 55                          	pushq	%rbp
1000045c1: 48 89 e5                    	movq	%rsp, %rbp
1000045c4: 53                          	pushq	%rbx
1000045c5: 50                          	pushq	%rax
1000045c6: 8a 05 a4 be 02 00           	movb	179876(%rip), %al
1000045cc: 84 c0                       	testb	%al, %al
1000045ce: 74 11                       	je	17 <__Z18vector_with_staticv+0x21>
1000045d0: 48 8b 05 81 be 02 00        	movq	179841(%rip), %rax
1000045d7: 8b 40 10                    	movl	16(%rax), %eax
1000045da: 48 83 c4 08                 	addq	$8, %rsp
1000045de: 5b                          	popq	%rbx
1000045df: 5d                          	popq	%rbp
1000045e0: c3                          	retq
1000045e1: 48 8d 3d 88 be 02 00        	leaq	179848(%rip), %rdi
1000045e8: e8 fd 40 02 00              	callq	147709 <dyld_stub_binder+0x1000286ea>
1000045ed: 85 c0                       	testl	%eax, %eax
1000045ef: 74 df                       	je	-33 <__Z18vector_with_staticv+0x10>
1000045f1: 48 c7 05 6c be 02 00 00 00 00 00    	movq	$0, 179820(%rip)
1000045fc: 48 c7 05 59 be 02 00 00 00 00 00    	movq	$0, 179801(%rip)
100004607: 48 c7 05 46 be 02 00 00 00 00 00    	movq	$0, 179782(%rip)
100004612: bf 28 00 00 00              	movl	$40, %edi
100004617: e8 9e 40 02 00              	callq	147614 <dyld_stub_binder+0x1000286ba>
10000461c: 48 89 05 35 be 02 00        	movq	%rax, 179765(%rip)
100004623: 48 8d 35 2e be 02 00        	leaq	179758(%rip), %rsi
10000462a: 48 89 c1                    	movq	%rax, %rcx
10000462d: 48 83 c1 28                 	addq	$40, %rcx
100004631: 48 89 0d 30 be 02 00        	movq	%rcx, 179760(%rip)
100004638: 48 8b 15 51 5c 02 00        	movq	154705(%rip), %rdx
10000463f: 48 89 50 20                 	movq	%rdx, 32(%rax)
100004643: 48 8b 15 3e 5c 02 00        	movq	154686(%rip), %rdx
10000464a: 48 89 50 18                 	movq	%rdx, 24(%rax)
10000464e: 48 8b 15 2b 5c 02 00        	movq	154667(%rip), %rdx
100004655: 48 89 50 10                 	movq	%rdx, 16(%rax)
100004659: 48 8b 15 18 5c 02 00        	movq	154648(%rip), %rdx
100004660: 48 89 50 08                 	movq	%rdx, 8(%rax)
100004664: 48 8b 15 05 5c 02 00        	movq	154629(%rip), %rdx
10000466b: 48 89 10                    	movq	%rdx, (%rax)
10000466e: 48 89 0d eb bd 02 00        	movq	%rcx, 179691(%rip)
100004675: 48 8d 3d 44 00 00 00        	leaq	68(%rip), %rdi
10000467c: 48 8d 15 7d b9 ff ff        	leaq	-18051(%rip), %rdx
100004683: e8 44 40 02 00              	callq	147524 <dyld_stub_binder+0x1000286cc>
100004688: 48 8d 3d e1 bd 02 00        	leaq	179681(%rip), %rdi
10000468f: e8 5c 40 02 00              	callq	147548 <dyld_stub_binder+0x1000286f0>
100004694: e9 37 ff ff ff              	jmp	-201 <__Z18vector_with_staticv+0x10>
100004699: 48 89 c3                    	movq	%rax, %rbx
10000469c: 48 8d 3d cd bd 02 00        	leaq	179661(%rip), %rdi
1000046a3: e8 3c 40 02 00              	callq	147516 <dyld_stub_binder+0x1000286e4>
1000046a8: 48 89 df                    	movq	%rbx, %rdi
1000046ab: e8 78 3e 02 00              	callq	147064 <dyld_stub_binder+0x100028528>
1000046b0: 0f 0b                       	ud2
1000046b2: 66 2e 0f 1f 84 00 00 00 00 00       	nopw	%cs:(%rax,%rax)
1000046bc: 0f 1f 40 00                 	nopl	(%rax)

So, please. Don’t just use static as a premature optimisation attempt on function local variables! It won’t buy you any faster code - it’s just wrong.

If you’ve got a local variable that really requires complex initialisation, think about your design. Maybe your function should become a class of its own?
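One way to read that suggestion is sketched below. It is only an illustration under my own assumptions: the class name `Lookup` and its interface are hypothetical, not something taken from the benchmark code. The data that used to be a function-local static becomes a member that is initialised once, when the object is constructed, so the caller decides how long it lives.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical class: owns the data that would otherwise have been
// a function-local static inside vector_with_static().
class Lookup {
public:
    // Default contents mirror the vector used in the examples above.
    Lookup() : values_{1, 2, 3, 4, 5, 6, 7, 8, 9, 10} {}

    // Alternatively, the caller can supply the (possibly expensive) data.
    explicit Lookup(std::vector<int> values) : values_(std::move(values)) {}

    // Bounds-checked access; throws std::out_of_range for a bad index.
    int at(std::size_t index) const { return values_.at(index); }

private:
    std::vector<int> values_;
};
```

The initialisation cost is now paid explicitly in the constructor, and no hidden "already initialised?" check is needed on each call to `at()`.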

Second misconception: iostreams are bad

There seems to be a strong preference to stick with the classical printf/FILE* APIs rather than rely on iostreams. This is difficult to explain from an objective standpoint. The classical C IO APIs use format specifiers that are not portable across data types and platforms, making work with different data types a nightmare. E.g.:

printf("size: %u", list.size());

What if size() returns a 64-bit type? Sure, you can use platform-dependent formatting macros like:

printf("size: %" PRIu64, list.size());

but this looks dodgy and is error prone.
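For comparison, here is a minimal sketch of the stream-based equivalent. The helper name `describe_size` is hypothetical and it writes to a `std::ostringstream` only so that the result is easy to inspect; the same insertion works with `std::cout`. The point is that `operator<<` is selected from the type of `size()`, so there is no format specifier to get wrong.

```cpp
#include <list>
#include <sstream>
#include <string>

// Hypothetical helper: formats the size of any container that has size().
template <typename Container>
std::string describe_size(const Container& c) {
    std::ostringstream out;
    // The right operator<< overload is chosen from the type returned by
    // size(), whether it is 32 or 64 bits wide on this platform.
    out << "size: " << c.size();
    return out.str();
}
```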

Another argument against streams I hear quite often is that the formatting directives “stick”. This is partially true, but it can easily be remedied when required, with a member function designed specifically for this, flags:

const auto origFlags = std::cout.flags();
std::cout << std::hex << "0x" << 123 << std::endl;
std::cout.flags(origFlags);

You can even create a nice RAII wrapper if needed. iostreams provide an API that is coherent with the rest of the STL. In large, complex software projects, consistency and design matter the most.
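Such a RAII wrapper could look roughly like the sketch below. The names `FlagsGuard` and `hex_then_decimal` are made up for illustration. The guard saves the flags in its constructor and restores them in its destructor, so the formatting is reset even on an early return or an exception.

```cpp
#include <ios>
#include <sstream>
#include <string>

// Hypothetical RAII guard: saves a stream's format flags on construction
// and restores them on destruction.
class FlagsGuard {
public:
    explicit FlagsGuard(std::ios_base& stream)
        : stream_(stream), saved_(stream.flags()) {}
    ~FlagsGuard() { stream_.flags(saved_); }

    // Copying a guard would restore the flags twice, so forbid it.
    FlagsGuard(const FlagsGuard&) = delete;
    FlagsGuard& operator=(const FlagsGuard&) = delete;

private:
    std::ios_base& stream_;
    std::ios_base::fmtflags saved_;
};

// Usage sketch: prints the value in hex inside the guarded scope,
// then in the stream's original (decimal) format.
std::string hex_then_decimal(int value) {
    std::ostringstream out;
    {
        FlagsGuard guard(out);
        out << std::hex << "0x" << value;
    }  // flags restored here
    out << ' ' << value;
    return out.str();
}
```

Note that flags() covers only the format flags (such as hex/dec); precision and the fill character are separate stream state and would need to be saved separately if you change them.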

Third misconception: Avoiding exceptions

This, again, is often discussed in the context of performance.

Exceptions are slow, therefore we should avoid them.

No! The repercussion of this approach is that the error handling within the system is more or less unspecified. Some parts of the code use error codes, some return errors via an argument. Nothing is consistent, and that inconsistency is a source of bugs. Before any optimisation attempt is made, one should first evaluate whether the problem exists at all. Sure, I agree, when C++ was in its infancy, it may have been the case that exceptions were unacceptably costly. This could’ve been additionally reinforced by the fact that machines were a lot slower as well. We’ve made a lot of progress since then though!
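To make the consistency argument a bit more concrete, here is a small sketch. The functions `parse_port` and `describe_port` are hypothetical examples of mine, not code from any particular project. Every failure is reported the same way, by throwing, and the caller handles all of them in one place instead of juggling return codes and output arguments.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical parser: every failure is reported by throwing
// std::invalid_argument, so there is a single error channel.
int parse_port(const std::string& text) {
    std::size_t consumed = 0;
    int port = 0;
    try {
        port = std::stoi(text, &consumed);
    } catch (const std::exception&) {
        // std::stoi throws std::invalid_argument or std::out_of_range;
        // translate both into one error type for the caller.
        throw std::invalid_argument("not a number: " + text);
    }
    if (consumed != text.size() || port < 1 || port > 65535) {
        throw std::invalid_argument("not a valid port: " + text);
    }
    return port;
}

// The caller deals with all failures in one handler.
std::string describe_port(const std::string& text) {
    try {
        return "port " + std::to_string(parse_port(text));
    } catch (const std::invalid_argument& e) {
        return std::string("error: ") + e.what();
    }
}
```

Whether the cost of the throwing path matters is something to measure in your own code before deciding; on the successful path here nothing is thrown at all.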

This one is quite a controversial topic, hence I’m gonna support myself with the C++ Core Guidelines again, which, I think, go into the details in this regard.