WiParser

WiParser is a parser combinator library written from scratch in plain C++17. It aims to provide a simple API that is easy to extend.

Why?

I am planning to write a C compiler for a custom architecture (with a custom instruction set), so I need a parser.

A general-purpose parser builder can be reused in other projects, so I preferred it to a one-off, hand-written parser. I chose C++ because I am very familiar with it and because speed was a factor I took into account.

Short example

The following example parses and evaluates a small Lisp (Racket-style) arithmetic expression (API 1.0.2b):

void example_lisp() {
    using namespace wi;

    parser_state_t init_parser_state("[% (* 2 (- [+ 8 2] (pow 2 2))) 5]");

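    // Forward reference: a value can itself be a nested function call, so the
    // function parser is declared lazily here and bound with set_parser() below.
    lazy_parser_t *p_lazy_function = new lazy_parser_t();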

    parser_t *p_value = new choice_of_parser_t({
        new digits_parser_t(),
        p_lazy_function
    });

    parser_t *p_function = new between_parser_t(
        new sequence_of_parser_t({
            new char_parser_t(std::regex(R"([\(\[])")),
            new maybe_whitespaces_parser_t()
        }),
        new sequence_of_parser_t({
            new maybe_whitespaces_parser_t(),
            new char_parser_t(std::regex(R"([\)\]])"))
        }),
        new sequence_of_parser_t({
            new choice_of_string_parser_t({"+", "-", "*", "/", "%", "pow"}),
            new between_parser_t(
                new whitespaces_parser_t(),
                new whitespaces_parser_t(),
                p_value
            ),
            p_value
        })
    );

    p_lazy_function->set_parser(p_function);
    parser_state_t ps = p_function->run(init_parser_state);

    // Recursively evaluate the parse result: a vector is {operator, left, right},
    // anything else is a run of digits.
    std::function<int(std::any)> f = [&](std::any a) {
        if (a.type() == typeid(std::vector<std::any>)) {
            std::vector<std::any> v = std::any_cast< std::vector<std::any> >(a);
            std::string op = smart_string_any_cast(v[0]);
            int left = f(v[1]), right = f(v[2]);
            if (op == "+") return left + right;
            if (op == "-") return left - right;
            if (op == "*") return left * right;
            // Note: "/" and "%" do not guard against a zero right operand.
            if (op == "/") return left / right;
            if (op == "%") return left % right;
            if (op == "pow") return static_cast<int>(std::pow(left, right));
            return 0; // unknown operator
        } else {
            return std::stoi(smart_string_any_cast(a));
        }
    };

    ps = ps.map_result(f);

    std::cout << ps.to_string() << std::endl;
}

If we were to evaluate [% (* 2 (- [+ 8 2] (pow 2 2))) 5] by hand, we would get:

\[\bigl(2\cdot((8 + 2) - 2^2)\bigr) \bmod 5 = 12 \bmod 5 = 2\]

Here is the output we get when running the snippet above:

{ target_string: "[% (* 2 (- [+ 8 2] (pow 2 2))) 5]",
  index: 33,
  result: 2 }
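The example binds p_lazy_function only after p_function has been built, presumably because the grammar is recursive: a value can contain a function call, and a function call contains values. The sketch below illustrates that general idea in standalone C++17 for a much smaller grammar. It does not use WiParser, and every name in it (result_t, parser_fn, make_value_parser) is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <memory>
#include <optional>
#include <string>

// Hypothetical names throughout; none of this is WiParser's API.
struct result_t {
    int value;         // evaluated result
    std::size_t next;  // index of the first unconsumed character
};
using parser_fn =
    std::function<std::optional<result_t>(const std::string &, std::size_t)>;

// Parses one or more decimal digits (no overflow check, for brevity).
std::optional<result_t> parse_digits(const std::string &s, std::size_t i) {
    std::size_t j = i;
    int v = 0;
    while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) {
        v = v * 10 + (s[j] - '0');
        ++j;
    }
    if (j == i) return std::nullopt;
    return result_t{v, j};
}

// Grammar: value := digits | "(+ " value " " value ")"
parser_fn make_value_parser() {
    // The slot starts empty and is filled in once "sum" exists.
    auto slot = std::make_shared<parser_fn>();
    // A weak pointer avoids an ownership cycle between slot and sum.
    std::weak_ptr<parser_fn> back = slot;

    // Lazy reference: forwards to whatever parser ends up in the slot.
    parser_fn value_ref = [back](const std::string &s,
                                 std::size_t i) -> std::optional<result_t> {
        auto p = back.lock();
        if (!p || !*p) return std::nullopt;
        return (*p)(s, i);
    };

    // "sum" can mention "value" before "value" is defined.
    parser_fn sum = [value_ref](const std::string &s,
                                std::size_t i) -> std::optional<result_t> {
        if (i > s.size() || s.compare(i, 3, "(+ ") != 0) return std::nullopt;
        auto a = value_ref(s, i + 3);
        if (!a || a->next >= s.size() || s[a->next] != ' ') return std::nullopt;
        auto b = value_ref(s, a->next + 1);
        if (!b || b->next >= s.size() || s[b->next] != ')') return std::nullopt;
        return result_t{a->value + b->value, b->next + 1};
    };

    // Bind the slot: a value is either digits or a nested sum.
    *slot = [sum](const std::string &s,
                  std::size_t i) -> std::optional<result_t> {
        if (auto d = parse_digits(s, i)) return d;
        return sum(s, i);
    };

    // The returned parser owns the slot and keeps the grammar alive.
    return [slot](const std::string &s, std::size_t i) { return (*slot)(s, i); };
}

// Small self-check showing how the parser is called.
void self_check_lazy() {
    parser_fn value = make_value_parser();
    auto r = value("(+ 8 (+ 2 5))", 0);
    assert(r && r->value == 15);
}
```

The indirection through the slot plays the role that the lazy parser appears to play in the example above: the recursive reference is created first and resolved later.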

Features

My design philosophy for a parser combinator is the following:

  • It should be simple to work with;

  • It should be quick;

  • It should be well documented;

  • The API should remain stable between versions;

  • It should be easily expandable;

  • It should support a functional style, with helpers such as map, chain and flatten for transforming and combining parse results.

All these features (and more!) are provided by WiParser.
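To make the map and chain helpers from the list above more concrete, here is a minimal standalone C++17 sketch of what such combinators typically do. It is not WiParser's implementation or API: parsed_t, parser_of, map_parser, chain_parser, digits and colon_then are all hypothetical names chosen for illustration. Here map transforms the value of a successful parse, while chain uses that value to pick the parser that runs next.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Hypothetical sketch; these names do not come from WiParser.
template <typename T>
struct parsed_t {
    T value;           // result of the parse
    std::size_t next;  // index of the first unconsumed character
};

template <typename T>
using parser_of =
    std::function<std::optional<parsed_t<T>>(const std::string &, std::size_t)>;

// map: run p, then transform its value with f; the position is unchanged.
template <typename U, typename T, typename F>
parser_of<U> map_parser(parser_of<T> p, F f) {
    return [p = std::move(p), f = std::move(f)](
               const std::string &s, std::size_t i) -> std::optional<parsed_t<U>> {
        auto r = p(s, i);
        if (!r) return std::nullopt;
        return parsed_t<U>{f(r->value), r->next};
    };
}

// chain: run p, then let its value decide which parser continues.
template <typename U, typename T, typename F>
parser_of<U> chain_parser(parser_of<T> p, F next) {
    return [p = std::move(p), next = std::move(next)](
               const std::string &s, std::size_t i) -> std::optional<parsed_t<U>> {
        auto r = p(s, i);
        if (!r) return std::nullopt;
        parser_of<U> q = next(r->value);
        return q(s, r->next);
    };
}

// Parses one or more decimal digits and yields them as text.
parser_of<std::string> digits() {
    return [](const std::string &s,
              std::size_t i) -> std::optional<parsed_t<std::string>> {
        std::size_t j = i;
        while (j < s.size() && std::isdigit(static_cast<unsigned char>(s[j]))) ++j;
        if (j == i) return std::nullopt;
        return parsed_t<std::string>{s.substr(i, j - i), j};
    };
}

// Parses a ':' followed by exactly n characters.
parser_of<std::string> colon_then(std::size_t n) {
    return [n](const std::string &s,
               std::size_t i) -> std::optional<parsed_t<std::string>> {
        if (i >= s.size() || s[i] != ':' || s.size() - (i + 1) < n)
            return std::nullopt;
        return parsed_t<std::string>{s.substr(i + 1, n), i + 1 + n};
    };
}

// Parses a length-prefixed field such as "3:abc".
std::optional<parsed_t<std::string>> parse_length_prefixed(const std::string &input) {
    // map: digit text -> int (std::stoi throws on values that do not fit an int).
    parser_of<int> number = map_parser<int>(
        digits(), [](const std::string &d) { return std::stoi(d); });
    // chain: the parsed length selects how many characters to read next.
    parser_of<std::string> field = chain_parser<std::string>(
        number, [](int n) { return colon_then(static_cast<std::size_t>(n)); });
    return field(input, 0);
}

// Small self-check showing how the combinators are used.
void self_check_combinators() {
    auto r = parse_length_prefixed("3:abcdef");
    assert(r && r->value == "abc");
}
```

The length-prefixed field is a case that map alone cannot express, because the second parser depends on the value produced by the first; that dependency is what chain adds.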

Source code

This project is provided under the MIT License. For more info about the author, check out my personal website or my other projects.

Documentation

The rich API provided by the current version of the project is explained in the repo's README file - take a quick look!