#3 The state of C++ package management: The build systems
Welcome to the third and concluding part of the series about dependency and package management in C++ projects, in which I'm gonna mostly focus on solutions built into the build systems themselves. If you haven't already, I encourage you to have a quick read of the first and the second part.
Overview
In this part I'll have a closer look at meson wraps, CMake's FetchContent, bazel's central registry and, a bit unconventionally, conda.
Meson WrapDB
Meson allows for integration of dependencies using subprojects. A subproject can be a git submodule or a wrap file - which describes how to get the subproject and what dependencies it provides. Meson can import CMake projects as subprojects as well using the CMake module. The integration is pretty seamless and works out of the box.
With support for both meson and CMake build systems within dependencies, it's theoretically possible to integrate other build systems (like autotools or bazel) by providing a CMake or meson integration layer.
Meson provides a WrapDB. It's a rather humble collection of packages that have been ported to meson by either the community or the meson team themselves. Installation from WrapDB is extremely simple:
meson wrap install gtest
That’s it. This will pull the wrap file from meson’s WrapDB. The project can now be integrated as a dependency.
Meson can build Rust code as well. Read more about it here.
Testing
I am a bit biased here as I really like meson. I've written about it in the past - you might wanna read my prior post about meson if you're interested in more details. That being said, I've tried integrating inja, my go-to test subject, into my meson test project and it was effortless and worked flawlessly. All it took was:
mkdir subprojects
git submodule add https://github.com/pantor/inja subprojects/inja
inja provides both CMake and meson toolchains. In my meson.build I wanted to try the CMake integration, so I've added the CMake module. My entire meson.build file ended up looking the following way:
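A minimal sketch of what such a meson.build can look like (the project name and options are mine; the CMake target name is assumed to be inja):

```meson
project('meson_test', 'cpp', default_options: ['cpp_std=c++17'])

# Import the CMake module and build the inja submodule as a CMake subproject.
cmake = import('cmake')
inja_proj = cmake.subproject('inja')
inja_dep = inja_proj.dependency('inja')

executable('main', 'main.cpp', dependencies: [inja_dep])
```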
That being said, meson has its own limitations as well that have to be mentioned.
Beyond your project-specific subprojects directory, there's no additional caching. So, if you've got two independent projects both using the same dependency, it will be built separately in each of them. And of course, meson's WrapDB is meson-specific, so it only caters to meson projects.
Summary
| Feature | Support | My verdict |
|---|---|---|
| Declarative dependencies | Supported via wrap files or indirectly by using git submodules. | ✔️ |
| Build reproducibility | Meson guarantees deterministic build order but the wrap file and subprojects define only direct dependencies. The dependencies themselves have to take care of their own dependencies, which means there's no exact version pinning defined anywhere. | ❌ |
| Inter-dependency mgmt | Packages have to define their own dependencies via their own build system. This is not guaranteed. | ❌ |
| Handling non-native packages | Provides WrapDB containing ready-made wrap files for most popular projects. Can build CMake and Rust dependencies (and many more) using provided meson modules. Provides converter tools. The dependencies can be ingested using a wrap file or directly using git submodules. | ✔️ |
| Project build systems supported | Meson only. | ❌ |
| Dependencies build systems supported | Meson, CMake, Rust, Java, D, and many more via meson modules. | ✔️ |
| Caching | Only caches locally in project's subprojects directory. | ❌ |
| Build tools | No support. | ❌ |
| Other remarks | It's a great and convenient build tool. Has some limitations but for small projects it's flexible enough and a pleasure to work with. | ✔️ |
conda
conda can simply be described as a “package manager” - a package manager just like apt, pacman or any other standard system package manager. conda creates environments into which you then install your project's dependencies. In that sense it's very similar to spack, with the main difference being that the dependencies are pre-compiled binaries. That's why I like to think of it in a similar vein as your system's package manager. You pull binaries into an environment and then use pkg-config (with meson) or CMake modules to discover them.
conda uses the notion of a channel. A channel is simply a repository that contains a pre-built set of packages. You can rely on many channels at once. All of them can be configured via the .condarc file, along with priorities to manage collisions. The most popular one (aside from the default) is the community-driven conda-forge.
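A minimal .condarc along those lines could look as follows (the channel order defines the priority):

```yaml
channels:
  - conda-forge
  - defaults
channel_priority: strict
```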
What about packages that aren't available in any channel? conda allows for the creation of custom channels and for building the packages published within them. It's possible to install a local, unpublished package as well.
Testing
There's a special environment called base. Activate it and install conda-build into it:
conda activate base
conda install conda-build
conda install conda-verify
For good measure:
conda update conda
conda update conda-build
pantor/inja is available via conda-forge but as an exercise I'm gonna try to create a recipe for it myself. To build it, I'm gonna follow the instructions in the documentation. The documentation provides a link to sample recipes as an extra reference, which is very helpful.
mkdir pantor_inja
touch meta.yaml build.sh
Here's the bare minimum required in the meta.yaml:
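Roughly something like this - the version, tag and build requirements below are illustrative:

```yaml
package:
  name: inja
  version: "3.4.0"            # illustrative - pin whatever tag you actually build

source:
  git_url: https://github.com/pantor/inja.git
  git_rev: v3.4.0

requirements:
  build:
    - cmake
    - make
    - {{ compiler('cxx') }}
```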
build.sh contains:
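In essence it's a plain CMake install driven by the $PREFIX variable that conda-build provides; a sketch:

```bash
#!/usr/bin/env bash
set -euo pipefail

# conda-build exports $PREFIX - everything installed there ends up in the package.
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX="$PREFIX"
cmake --build build --target install
```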
With the base environment active, the package can be built by simply issuing the conda build command:
conda build inja
This is now available for installation within other environments:
conda create -n conda_test
conda activate conda_test
conda install --use-local inja
The package is now installed within the conda_test environment and can be discovered by the project using CMake:
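Something along these lines - the project name is mine, and it assumes the package installed inja's CMake config files and that CMAKE_PREFIX_PATH points at the active environment (e.g. $CONDA_PREFIX):

```cmake
cmake_minimum_required(VERSION 3.20)
project(conda_test_proj CXX)

# Configure with e.g.: cmake -B build -DCMAKE_PREFIX_PATH=$CONDA_PREFIX
find_package(inja REQUIRED)

add_executable(main main.cpp)
target_link_libraries(main PRIVATE pantor::inja)
```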
Having built the package, I wanted to check how difficult it'd be to create a simple local channel. Turns out it's easy. With the base environment activated, the built package can be copied out of the conda-build output directory:
mkdir conda_twdev_index
cd conda_twdev_index
cp -r ~/miniconda3/conda-build/{noarch,osx-64} .
conda index
The channel is now searchable and can be used for package distribution:
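For example (the path to the local channel is machine-specific):

```bash
conda search -c file:///absolute/path/to/conda_twdev_index inja
conda install -c file:///absolute/path/to/conda_twdev_index inja
```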
The environment can be exported and stored within the project repository using:
conda env export -n conda_test > env.yaml
Summary
It feels like a robust solution which I'm definitely gonna try out more at some point. Playing around with the package creation I had the impression that spack (have a look at the first post about package managers) solves this problem much better by automating the process of package creation. Maybe there's a corner case which conda addresses better but at the moment I'm not sure I can see it.
| Feature | Support | My verdict |
|---|---|---|
| Declarative dependencies | The environment comprises the dependency set. The environment can be exported. | ✔️ |
| Build reproducibility | The environment defines the dependency graph and version set. | ✔️ |
| Inter-dependency mgmt | Supported. Package metadata defines the dependency graph. | ✔️ |
| Handling non-native packages | Packages unavailable via any upstream channel can be built locally. Private channels are first class citizens. | ✔️ |
| Project build systems supported | Build system agnostic. As long as the package itself exports build system specific files, the dependency will be discoverable. | ✔️ |
| Dependencies build systems supported | Build system agnostic. You define your own build instructions in build.sh. | ✔️ |
| Caching | Supported. | ✔️ |
| Build tools | meta.yaml defines build requirements for packages. Project-wise, there's no distinction between package types. | ❌ |
| Other remarks | Seems like spack automates some of the manual steps when creating packages. Regardless, conda feels like a mature, reliable solution. | ✔️ |
CMake FetchContent
FetchContent is part of CMake and it allows you to declare and pull external dependencies directly in the project's CMake file. Usage of FetchContent is described in detail in CMake's documentation - which I encourage you to read. In short, first you declare the dependencies using FetchContent_Declare and once you've got everything you need you call FetchContent_MakeAvailable.
Testing
Here's an example of a complete CMakeLists.txt using FetchContent:
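A minimal sketch, again using inja as the dependency (the tag is illustrative; depending on the inja version the namespaced pantor::inja alias may be available as well):

```cmake
cmake_minimum_required(VERSION 3.20)
project(fetch_content_test CXX)

include(FetchContent)

# Declare the dependency: origin and exact version are pinned right here.
FetchContent_Declare(
  inja
  GIT_REPOSITORY https://github.com/pantor/inja.git
  GIT_TAG        v3.4.0
)
FetchContent_MakeAvailable(inja)

add_executable(main main.cpp)
target_link_libraries(main PRIVATE inja)
```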
The advantages? It's CMake native, no additional tools are required. It's much better than e.g. Hunter as, unlike with Hunter, you can download whatever version you want from whichever source and you're not restricted by HunterGate or anything like that.
The dependencies are declared in one place in a simple manner along with their versions which is very good.
Unfortunately, transitive dependencies are not handled at all. If the project you depend on doesn't arrange to obtain its own dependencies (and most of them don't), e.g. by using FetchContent as well, then you'll have to do it yourself in the main project.
Summary
| Feature | Support | My verdict |
|---|---|---|
| Declarative dependencies | Declared directly within the CMakeLists.txt along with origins and versions. | ✔️ |
| Build reproducibility | Kind of. Since you're forced to declare all dependencies up front along with their origins, you might say that this guarantees build reproducibility to some extent. | ✔️ |
| Inter-dependency mgmt | Not supported. | ❌ |
| Handling non-native packages | Not supported as far as I know. The dependency has to provide a CMake toolchain. | ❌ |
| Project build systems supported | CMake only. | ❌ |
| Dependencies build systems supported | CMake only. | ❌ |
| Caching | Only caches locally in the project's build directory. | ❌ |
| Build tools | No support. | ❌ |
| Other remarks | Simple and convenient to use for small scale projects. Might be a good option to get going quickly. | ✔️ |
Bazel Central Registry
Bazel is a topic on its own. Anything I write here barely scratches the surface. It comes with its own version manager (bazelisk) and a programming language (Starlark, based on Python). It focuses on build isolation, parallelism, remote execution and a myriad of other things I'm barely aware of. Suffice to say - it's huge and best suited for adequately large, complex code bases. The barrier to entry is quite high as well. In order to do anything non-standard you'll have to familiarise yourself with Starlark and a lot of terms describing the basic notions behind Bazel's design. That includes:
- Actions - given a set of input files, actions generate output files using toolchains or shell scripts/commands,
- Rules - in short, functions which create actions and return providers,
- Providers - the format for information exchanged between rules and toolchains (e.g. compiler and system library paths). Custom providers are just dictionaries with a fixed set of keys,
- Toolchains - define a platform-specific set of tools used by rules and actions to produce build artefacts,
- Macros - simply put, these are functions producing rules as a result,
- Aspects - by definition, aspects are similar to rules. I like to think of them as 'side rules' - extra rules augmenting existing rules with additional functionality.
The most convenient way to install bazel is by using bazelisk - bazel's version manager and execution wrapper. Bazelisk itself may be installed in a variety of ways - I won't go into details as it is not that important. Once you have it, it's preferable to just use bazelisk instead of bazel. I'd even suggest:
alias bazel=bazelisk
The set of bazelisk's commands is the same but you're benefiting from transparent bazel version management, i.e. bazelisk will download, install and use an appropriate version of bazel.
bazelisk is able to determine which version of bazel is needed either using the environment (the USE_BAZEL_VERSION variable) or by inspecting the contents of the .bazeliskrc file in your repo.
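For example, a .bazeliskrc pinning the version (the version number is just an example) contains a single line:

```
USE_BAZEL_VERSION=7.4.0
```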
You should also install the bazel build tools which, amongst many others, contain buildifier - an auto-formatter for BUILD.bazel files.
Workspaces vs Bazel modules
Originally, every bazel package was marked with a WORKSPACE file and a BUILD file (both of these files can optionally be suffixed with the .bazel extension). WORKSPACE defines a bazel package. Within the package itself you can have more bazel packages containing their own WORKSPACE files - these will become sub-packages of the root package.
Usually, you'd put the code handling your external dependencies - using rules like http_archive - within the WORKSPACE file. BUILD contains only the rules to build your code.
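A typical WORKSPACE entry looks roughly like this (the dependency name, URL and hash are placeholders):

```starlark
load("@bazel_tools//tools/build_defs/repo:http.bzl", "http_archive")

http_archive(
    name = "some_dep",
    urls = ["https://example.com/some_dep-1.0.0.tar.gz"],
    strip_prefix = "some_dep-1.0.0",
    sha256 = "0000000000000000000000000000000000000000000000000000000000000000",  # placeholder
)
```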
There was a bit of a problem with workspaces related to the handling of transitive dependencies and, as a result, workspaces are being deprecated in favour of bazel modules (sometimes referred to as bzlmod).
During the transition period, you're still allowed to have the WORKSPACE file but it is being replaced with the MODULE file. Additionally, you can have WORKSPACE.bzlmod - this has the same syntax as WORKSPACE but takes precedence over WORKSPACE if bzlmod builds are enabled. More details about the migration can be found here.
To explicitly enable bzlmod support, you can put
common --enable_bzlmod
in .bazelrc in your repo.
Testing
Bazel can handle a dependency graph for a collection of bazel projects within a repo just fine. It does that much better than CMake by performing a three-stage build process. After parsing all BUILD files, the dependency graph is transformed into an action graph so, unlike with CMake, it's possible to rely on projects in your repo which are not yet built (CMake projects are only discoverable by other projects once they are built and installed - the installation step produces and installs the needed pkg-config files or CMake module export files). The analysis stage just generates the action graph, which itself will produce all artefacts in the correct order. This is a major improvement over CMake. The first build tutorial exemplifies the whole process very well so I'm not gonna bother with my own example here.
What about integration of non-bazel projects? This is where the fun starts! Officially, it's not supported out of the box. There's an unofficial effort called rules_foreign_cc which allows for integration of CMake, autotools and meson projects. Other than that - you'll have to get your hands dirty and write your own Starlark code.
Integrating CMake project manually
As an exercise in learning a bit more about Starlark and bazel itself, I've decided to attempt integrating a simple test project into a bazel repo myself. My goal is just to explore the problem and not necessarily come up with a production ready solution.
I’m gonna start with a trivial repo with a single executable:
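The layout is as simple as it gets (the file names are mine):

```
.
├── MODULE.bazel
├── BUILD.bazel
└── main.cpp
```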
BUILD.bazel contains only:
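Assuming the source file is called main.cpp, that's a single cc_binary target:

```starlark
cc_binary(
    name = "main",
    srcs = ["main.cpp"],
)
```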
This can be built with just:
bazel build //:main
This works - no surprises at all. Next, I'm gonna add a simple CMake project (“cmake_proj”):
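Something like this - the internal layout of cmake_proj is illustrative:

```
.
├── MODULE.bazel
├── BUILD.bazel
├── main.cpp
└── cmake_proj
    ├── CMakeLists.txt
    ├── include
    │   └── cmake_proj.hpp
    └── src
        └── cmake_proj.cpp
```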
cmake_proj provides a simple shared library that I wish to link with my executable produced by bazel. How to build it though? After reading the bazel documentation about toolchains I decided that I probably need to declare a separate toolchain for CMake. I've created the bazel/private/toolchain/cmake package containing the toolchain_type definition, the cmake_toolchain rule and two instantiations of the toolchain for linux and osx. Here's the resulting tree:
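Roughly (the exact file names are mine):

```
bazel
└── private
    └── toolchain
        └── cmake
            ├── BUILD.bazel
            └── cmake_toolchain.bzl
```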
… and the toolchain instances in cmake_toolchain.bzl:
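A sketch of how that can look: the provider, the cmake_toolchain rule and a small macro instantiating the toolchain for linux and osx with hard-coded cmake paths. All names and paths are mine, and the toolchain_type itself is declared in the package's BUILD.bazel, which simply calls the macro:

```starlark
# bazel/private/toolchain/cmake/cmake_toolchain.bzl (hypothetical sketch)

CMakeInfo = provider(
    doc = "Carries the location of the cmake binary.",
    fields = ["cmake_path"],
)

def _cmake_toolchain_impl(ctx):
    return [platform_common.ToolchainInfo(
        cmakeinfo = CMakeInfo(cmake_path = ctx.attr.cmake_path),
    )]

cmake_toolchain = rule(
    implementation = _cmake_toolchain_impl,
    attrs = {
        # Hard-coded path to the cmake binary for a given platform.
        "cmake_path": attr.string(mandatory = True),
    },
)

def declare_cmake_toolchains():
    """Instantiates the cmake toolchain for linux and osx."""
    cmake_toolchain(name = "cmake_linux_impl", cmake_path = "/usr/bin/cmake")
    native.toolchain(
        name = "cmake_linux",
        exec_compatible_with = ["@platforms//os:linux"],
        toolchain = ":cmake_linux_impl",
        toolchain_type = ":toolchain_type",
    )

    cmake_toolchain(name = "cmake_osx_impl", cmake_path = "/opt/homebrew/bin/cmake")
    native.toolchain(
        name = "cmake_osx",
        exec_compatible_with = ["@platforms//os:osx"],
        toolchain = ":cmake_osx_impl",
        toolchain_type = ":toolchain_type",
    )
```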
I’ve even defined a simple provider but I’m not sure if I’m gonna use it. To make things simple, I just assumed and hard-coded the paths to cmake itself.
The toolchains are registered in MODULE.bazel:
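Roughly like this (the module name and the platforms dependency version are mine):

```starlark
module(name = "bazel_cmake_playground", version = "0.0.1")

# Needed for the @platforms constraints referenced by the toolchain definitions.
bazel_dep(name = "platforms", version = "0.0.10")

register_toolchains(
    "//bazel/private/toolchain/cmake:cmake_linux",
    "//bazel/private/toolchain/cmake:cmake_osx",
)
```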
The next step is to write a rule that will build the cmake projects using the cmake toolchain. After a bit of trial and error, I’ve come up with the following code:
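What follows is a condensed, hypothetical sketch rather than the exact code from the repo - it glosses over plenty of details (macOS library suffixes, the layout of the CMake build tree, error handling) but shows the two halves described below: the generated shell script driving cmake, and the cc_common/CcInfo plumbing:

```starlark
# bazel/private/toolchain/cmake/cmake_build.bzl (hypothetical sketch)

load("@bazel_tools//tools/cpp:toolchain_utils.bzl", "find_cpp_toolchain")

def _cmake_build_impl(ctx):
    cmake = ctx.toolchains["//bazel/private/toolchain/cmake:toolchain_type"].cmakeinfo.cmake_path
    out_lib = ctx.actions.declare_file("lib{}.so".format(ctx.attr.lib_name))

    # A single script handles both the configure and the build step, so the
    # dependency between the two stages doesn't have to be modelled in bazel.
    # It assumes the library ends up at the top of the CMake build tree.
    script = ctx.actions.declare_file(ctx.label.name + "_build.sh")
    ctx.actions.write(script, """#!/bin/bash
set -e
{cmake} -S {src_dir} -B {build_dir} -DCMAKE_BUILD_TYPE=Release
{cmake} --build {build_dir}
cp {build_dir}/lib{name}.so {out}
""".format(
        cmake = cmake,
        src_dir = ctx.file.cmakelists.dirname,
        build_dir = out_lib.dirname + "/_cmake_build",
        name = ctx.attr.lib_name,
        out = out_lib.path,
    ), is_executable = True)

    ctx.actions.run(
        executable = script,
        inputs = ctx.files.srcs + ctx.files.hdrs + [ctx.file.cmakelists],
        outputs = [out_lib],
        mnemonic = "CMakeBuild",
    )

    # Plumbing: wrap the produced shared library into CcInfo so that ordinary
    # cc_binary/cc_library targets can depend on this rule.
    cc_toolchain = find_cpp_toolchain(ctx)
    feature_configuration = cc_common.configure_features(
        ctx = ctx,
        cc_toolchain = cc_toolchain,
        requested_features = ctx.features,
        unsupported_features = ctx.disabled_features,
    )
    library = cc_common.create_library_to_link(
        actions = ctx.actions,
        feature_configuration = feature_configuration,
        cc_toolchain = cc_toolchain,
        dynamic_library = out_lib,
    )
    linker_input = cc_common.create_linker_input(
        owner = ctx.label,
        libraries = depset([library]),
    )
    cc_info = CcInfo(
        # Assumes the public headers live under <project>/include.
        compilation_context = cc_common.create_compilation_context(
            headers = depset(ctx.files.hdrs),
            includes = depset([ctx.file.cmakelists.dirname + "/include"]),
        ),
        linking_context = cc_common.create_linking_context(
            linker_inputs = depset([linker_input]),
        ),
    )
    return [DefaultInfo(files = depset([out_lib])), cc_info]

cmake_build = rule(
    implementation = _cmake_build_impl,
    attrs = {
        "cmakelists": attr.label(allow_single_file = True, mandatory = True),
        "srcs": attr.label_list(allow_files = True),
        "hdrs": attr.label_list(allow_files = True),
        "lib_name": attr.string(mandatory = True),
        "_cc_toolchain": attr.label(default = "@bazel_tools//tools/cpp:current_cc_toolchain"),
    },
    fragments = ["cpp"],
    toolchains = [
        "//bazel/private/toolchain/cmake:toolchain_type",
        "@bazel_tools//tools/cpp:toolchain_type",
    ],
)
```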
In short, the implementation of the rule creates a shell script that first prepares a cmake build directory and then builds it. I wasn't sure how to express the dependency between the configuration and the build stage, hence the single script. The rest (especially everything that's using cc_common and CcInfo) is just plumbing that exports the library information so other rules can link against it. The rule can be used the following way in BUILD.bazel:
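The attribute names below match the sketch above and are specific to my hypothetical rule:

```starlark
load("//bazel/private/toolchain/cmake:cmake_build.bzl", "cmake_build")

cmake_build(
    name = "cmake_proj",
    cmakelists = "cmake_proj/CMakeLists.txt",
    srcs = glob(["cmake_proj/src/**"]),
    hdrs = glob(["cmake_proj/include/**"]),
    lib_name = "cmake_proj",
)
```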
With the above rules in place, it's possible to integrate the cmake project into the bazel build:
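That is, the executable from earlier simply lists the CMake-built library as an ordinary dependency:

```starlark
cc_binary(
    name = "main",
    srcs = ["main.cpp"],
    deps = [":cmake_proj"],
)
```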
This is of course highly hacky and experimental. The point I’m trying to make is that integration of foreign toolchains is non-trivial. The repo used in this example is available here.
Integrating CMake project using rules_foreign_cc
As much as writing your own rules to integrate CMake into bazel can broaden your understanding of bazel's mechanics, it's not recommended to pursue that seriously as it requires quite a lot of effort to do right. rules_foreign_cc provides such an integration and is a much more stable and reliable alternative that saves a lot of work.
Unfortunately, being a community effort, in some areas it's a bit rough around the edges as well. I've tried to follow the documentation and the example provided for cmake and as much as it worked for me without any bigger issues for non-bzlmod builds, the documentation is missing the details about how to use rules_foreign_cc with bzlmod enabled.
Registries
bazel provides a notion of a registry, with the Bazel Central Registry being the official package registry. Any project from the registry can be integrated by adding entries to MODULE.bazel. An example for asio:
bazel_dep(name = "asio", version = "1.31.0")
Additionally, it’s possible to create local registries. This might be useful if you’re patching an upstream project or require some custom changes that won’t ever be published upstream.
The requirements for the registry are well described in bazel’s documentation but there’s an easier way. I usually just clone the central registry:
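Something along these lines - a shallow clone is enough, and stripping the .git directory keeps the copy plain:

```bash
git clone --depth 1 https://github.com/bazelbuild/bazel-central-registry.git registry
rm -rf registry/.git
```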
Now I have a complete copy in my repo under registry. It's then just a matter of overriding the default registry by adding the following to .bazelrc:
common --registry=file:///%workspace%/registry
Having updated .bazelrc, it's good to make sure that bazel reads it again:
bazel shutdown
bazel build --lockfile_mode=off
That’s it!
Summary
| Feature | Support | My verdict |
|---|---|---|
| Declarative dependencies | Supported either by using WORKSPACE or MODULE files. | ✔️ |
| Build reproducibility | Supported. | ✔️ |
| Inter-dependency mgmt | Supported with bzlmod. | ✔️ |
| Handling non-native packages | The library of rules for foreign toolchains is growing but you might find yourself having to write custom Starlark integration code - which is not trivial. | ❌ |
| Project build systems supported | Bazel only. | ❌ |
| Dependencies build systems supported | Bazel only - custom rules needed for foreign build systems. | ❌ |
| Caching | Supported. | ✔️ |
| Build tools | Supported via toolchains. Custom tools might need custom toolchain definitions. | ✔️ |
| Other remarks | For simple projects, I'd recommend sticking with simpler solutions as it might be frustrating and a huge effort pit to integrate tools and libraries that have no Bazel toolchains defined. | ❌ |
Conclusion
The state of package management and build system fragmentation in C++ can be best summarised by the classic xkcd strip about competing standards.
I'm not even sure if the introduction of an official C++ package manager would change anything at all at this stage. Just as in the xkcd strip, the result would probably be number_of_package_managers += 1.
Is this a bad thing? On one hand, there's nothing official like Cargo for Rust but at the same time, C++ projects require diversity as, due to the language's legacy, it would be incredibly difficult to support all use cases in an elegant, uniform manner. Forcing everyone to transition to a certain “official” paradigm of dependency management would probably be impossible as well.
The sad reality is that most respected C++ projects now have to provide support for multiple build systems and package managers in order to maintain their momentum. Having a look at e.g. gtest, it comes with both Bazel and CMake build files. Similarly catch2 - it has files for Bazel, meson and CMake. This puts extra work on the maintainers and is an easy source of bugs and incompatibilities.
It feels like we've gone from one far end of the spectrum (no package managers at all) to the other (a proliferation of build systems and package managers).
Time will tell what the next step will be. Until then, I guess we all need to stay on top of the game to be able to maintain our code bases with as little dependency-management effort as possible.