I wrote my own argument parser in C++20

Why?

Well, it kind of happened by accident, if I’m honest. While working on a different project, where I’m building a set of utilities (assembler, disassembler, debugger, simulator) for a custom programming language, I needed a parser to create some basic command line interfaces. I wanted to limit the number of dependencies and thus didn’t want to reach for Boost’s program_options. Another reason is that I don’t really like it that much. The syntax of program_options is… weird, confusing and a bit cumbersome. It’s not something I can use without spending some time with the documentation.

At the other end of the spectrum there’s the venerable getopt family of APIs, which does the job… but feels dated, and I was hoping for something more along the C++20 lines. Therefore, I decided to get my hands dirty and quickly cobble something together that would satisfy my requirements.

The requirements

Initially, with ease of use as the priority, I wanted something along the lines of Python’s argparse.ArgumentParser. One thing that doesn’t translate well from Python to C++ is the way parse_args returns the parsing results. In Python’s case it’s a dictionary-like object. To do something similar in C++, I’d probably need some sort of wrapper types, std::any, or something along those lines. I’ve decided to do what program_options does (ironically) and bind variables to CLI argument definitions. That way it’s easy to retrieve the values and provide the defaults at the same time.
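To make the trade-off a bit more concrete, here is a rough sketch of the two styles side by side. Everything in it (the Results alias, parseToDictionary, parseIntoBound) is hypothetical and only stands in for a real parser; none of it is ArgParser’s API.

```cpp
#include <any>
#include <cassert>
#include <map>
#include <string>

// Hypothetical "dictionary-like" result, loosely mimicking Python's argparse.
using Results = std::map<std::string, std::any>;

// Stand-in for a parser that returns everything in a type-erased dictionary.
inline Results parseToDictionary() {
    Results r;
    r["count"] = 3;                      // stored as int
    r["name"] = std::string{"example"};  // stored as std::string
    return r;
}

// Stand-in for a parser that writes straight into caller-owned variables.
inline void parseIntoBound(int& count, std::string& name) {
    count = 3;
    name = "example";
}

inline void demo() {
    // Dictionary style: the caller repeats the type at every lookup, and a
    // mismatch only shows up at run time as std::bad_any_cast.
    const Results r = parseToDictionary();
    assert(std::any_cast<int>(r.at("count")) == 3);

    // Binding style: the defaults live in the variables themselves and no
    // cast is needed afterwards.
    int count = 1;
    std::string name = "default";
    parseIntoBound(count, name);
    assert(count == 3 && name == "example");
}
```

With binding, the type is stated once, at the variable declaration, which is what makes automatic conversion possible later on.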

Enter: ArgParser

So, it happened. I’ve spent maybe 12h in total, spread across a couple of days, writing version v0.1.1. So far, I’m quite happy with the project. The initial goals have been fulfilled completely. ArgParser is something I can just pick up and integrate in any project in a matter of minutes. The supported feature set is maybe insufficient for the moment to compete with Goliaths such as program_options but that is fine - it was never the intention.

What can it do?

The feature I’m most satisfied with is automatic conversion to the given type. If the argument is bound to an int, the parser will convert the CLI value to an int or raise ArgConversionEx if it’s impossible. The same applies to any other supported data type. It even detects narrowing conversion problems. For example, given:

uint8_t n = 0;
try {
    ap.addPositional("N")
        .set(n)
        ;
}
catch (const std::exception& e) {
    std::cerr << e.what() << std::endl;
}

The following invocation will produce ArgConversionEx, like so:

$ myprogram 256
'256' overflows

At the moment the parser supports the most basic types:

  • unsigned/signed integral types
  • std::string
  • bool
  • float/double
  • std::vector of any of the above

std::vector is special. If an argument is bound to a std::vector, its semantics change: it becomes cumulative and may occur more than once on the command line. All the values from the command line will be collected in the bound vector variable (of course, conversion to the destination element type will be performed as well). For example, given:

std::vector<std::string> paths;
try {
    ap.addOption("-p", "--path")
        .set(paths)
        ;
}
catch (const std::exception& e) {
    std::cerr << e.what() << std::endl;
}

Invocation like:

myprogram -p /bin --path /usr/bin --path /usr/local/bin

Will result in the paths vector being populated with all the collected command line option values. As mentioned, conversion is performed for vector element types as well:

std::vector<int> nums;
try {
    ap.addPositional("N")
        .set(nums)
        ;
}
catch (const std::exception& e) {
    std::cerr << e.what() << std::endl;
}

Invocation like:

myprogram 1 2 3 4

Will produce a vector of integers: 1,2,3,4.

Which I think is pretty cool and quite convenient at the same time.

How does it work?

The majority of the code is pretty straightforward. The only bit that is slightly more complex is related to variable binding.

Converting arguments from strings to (almost) any types

The conversion required is selected based on the type provided to the Argument::set() API. Let’s first consider how type-dependent conversion can be implemented. For that, I’m gonna declare a Converter type:

    template <typename ValueT>
        class Converter;

I think it’s becoming pretty apparent now that I’m gonna use template specialisation for that. Let’s have a look at strings first:

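    template <typename AssignableT>
        requires std::constructible_from<AssignableT, std::string>
        class Converter<AssignableT> {
        public:
            static AssignableT convert(std::string_view s) {
                // Construct explicitly: constructible_from also admits
                // explicit constructors, which a plain
                // `return std::string{s};` would not be allowed to use.
                return AssignableT(std::string{s});
            }
        };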

This specialisation will be selected if the given type is constructible from a std::string. For any other type, the Converter class remains undefined.

Let’s have a look at another specialisation, this time for unsigned integrals:

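    template <std::unsigned_integral ValueT>
        class Converter<ValueT> {
        public:
            static ValueT convert(std::string_view s) {
                // std::string_view is not guaranteed to be null-terminated,
                // so copy it before handing it to std::strtoull.
                const std::string str{s};
                char* end = nullptr;

                errno = 0;
                const auto value = std::strtoull(str.c_str(), &end, 0);

                // No digits consumed, or trailing characters after the number.
                if (end == str.c_str() || *end != '\0') {
                    throw ArgConversionEx{str,
                        "is not convertible to unsigned integral type"};
                }

                // ERANGE covers input that doesn't even fit in
                // unsigned long long.
                if (errno == ERANGE ||
                        value > std::numeric_limits<ValueT>::max()) {
                    throw ArgConversionEx{str, "overflows"};
                }

                return static_cast<ValueT>(value);
            }
        };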

Having these two specialisations, it’s now possible to use them like so:

auto strType =
    Converter<std::string>::convert("this will be converted to std::string");

...
auto unsignedType = Converter<unsigned>::convert("1234");
...
// auto doubleType = Converter<double>::convert("3.14");  // ERROR: no specialisation for `double` defined so far

Binding variables

Storing the reference to the bound variable is the other missing piece of the puzzle. To achieve that, I’m doing something very similar to what std::any does. The reason why I don’t use std::any in the first place is that I need to combine it with the type converter.

I need a base type to perform type erasure. For that, I’m declaring something similar to the following interface:

class ValueWrapper {
public:
    virtual ~ValueWrapper() = default;

    // Consumes command line argument; given as `std::string_view` and
    // assigns it to bound reference to variable.
    virtual void setValue(std::string_view s) = 0;

    // Returns true if value has been already assigned
    virtual bool isAssigned() const = 0;
};

Now, a templated child class is needed to actually store the references to variables:

template <typename ValueT>
    class ValueWrapperForT : public ValueWrapper {
    public:
        ValueWrapperForT(ValueT& v) :
            v{v},
            isArgAlreadyAssigned{false}
        {
        }

        bool isAssigned() const override {
            return isArgAlreadyAssigned;
        }

        void setValue(std::string_view s) override {
            this->v = Converter<ValueT>::convert(s);
            this->isArgAlreadyAssigned = true;
        }

    protected:
        ValueT& v;
        bool isArgAlreadyAssigned;
    };

Now that I’ve got the “type independent” base type, I can just store it in a container, e.g.:

int intValue = 0;
std::string strValue;

std::vector<std::unique_ptr<ValueWrapper>> values;

values.push_back(std::make_unique<ValueWrapperForT<int>>(intValue));
values.push_back(std::make_unique<ValueWrapperForT<std::string>>(strValue));

… and thanks to polymorphism:

// will be automatically converted to `int` and assigned to `intValue`
values[0]->setValue("123");

// will be automatically converted to `std::string` and assigned to `strValue`
values[1]->setValue("string value");

In ArgParser’s case, there are some other elements involved, but they are irrelevant here. The principle remains the same.

ArgParser on GitLab

Feel free to check out the project on GitLab. Maybe you’ll find it useful for yourself. I’m open to pull requests as well, so if you find a fundamental problem or would like to contribute, feel free to do so.