
[/
 / Distributed under the Boost Software License, Version 1.0. (See accompanying
 / file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
 /]

[section Tutorial]

[section Terminology]

First, let's cover some terminology that we'll be using throughout the docs:

A /semantic action/ is an arbitrary bit of logic associated with a parser,
that is only executed when the parser succeeds.

Simpler parsers can be combined to form more complex parsers.  Given some
combining operation `C`, and parsers `P0`, `P1`, ... `PN`, `C(P0, P1, ... PN)`
creates a new parser `Q`.  This creates a /parse tree/.  `Q` is the parent of
`P1`, `P2` is the child of `Q`, etc.  The parsers are applied in the top-down
fashion implied by this.  When you use `Q` to parse a string, it will use
`P0`, `P1`, etc. to do the actual work.  If `P3` is being used to parse the
input, that means that `Q` is as well, since the way `Q` parses is by
dispatching to its children to do some or all of the work.  At any point in
the parse, there will be exactly one parser without children that is being
used to parse the input; all other parsers being used are its ancestors in the
parse tree.

A /subparser/ is a parser that is the child of another parser.

The /top-level parser/ is the root of the tree of parsers.

The /current parser/ or /innermost parser/ is the parser with no children that
is currently being used to parse the input.

A /rule/ is a kind of parser that makes building large, complex parsers
easier.  A /subrule/ is a rule that is the child of some other rule.  The
/current rule/ or /innermost rule/ is the one rule currently being used to
parse the input that has no subrules.  Note that while there is always exactly
one current parser, there may or may not be a current rule _emdash_ rules are
one kind of parser, and you may or may not be using them in your top-level
parser.

The /top-level parse/ is the parse operation being performed by the top-level
parser.  This term is necessary, because though most parse failures are local
to a particular parser, some parse failures cause the call to _p_ to indicate
failure of the entire parse.  For these cases, we say that such a local
failure "causes the top-level parse to fail".

There are a couple of special kinds of parsers that come up often in this
documentation.

One is a /sequence parser/; you will see it created using `operator>>()`, as
in `p1 >> p2 >> p3`.  A sequence parser tries to match all of its subparsers
to the input, one at a time, in order.  It matches the input iff all its
subparsers do.

The other is an /alternative parser/; you will see it created using
`operator|()`, as in `p1 | p2 | p3`.  An alternative parser tries to match its
subparsers to the input, one at a time, in order; it stops at the first
subparser that matches.  It matches the input iff one of its subparsers does.
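
To make the behavior of these two combinators concrete, here is a minimal
sketch that uses only the standard library.  It is not how _Parser_ implements
them, and the names `toy_parser`, `toy_char`, `toy_seq`, and `toy_alt` are
hypothetical.  The sketch only shows that a sequence needs every subparser to
match in turn, and that an alternative stops at the first subparser that
matches.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>

// Hypothetical toy model, not Boost.Parser: a parser is any function that,
// given the remaining input, returns the number of characters it matched, or
// std::nullopt if it did not match.
using toy_parser =
    std::function<std::optional<std::size_t>(std::string_view)>;

// Matches exactly the character c.
inline toy_parser toy_char(char c)
{
    return [c](std::string_view in) -> std::optional<std::size_t> {
        if (!in.empty() && in.front() == c)
            return std::size_t{1};
        return std::nullopt;
    };
}

// Sequence: both subparsers must match, one after the other.
inline toy_parser toy_seq(toy_parser p1, toy_parser p2)
{
    return [=](std::string_view in) -> std::optional<std::size_t> {
        auto const n1 = p1(in);
        if (!n1)
            return std::nullopt;
        auto const n2 = p2(in.substr(*n1));
        if (!n2)
            return std::nullopt;
        return *n1 + *n2;
    };
}

// Alternative: the subparsers are tried in order, stopping at the first one
// that matches.
inline toy_parser toy_alt(toy_parser p1, toy_parser p2)
{
    return [=](std::string_view in) -> std::optional<std::size_t> {
        if (auto const n1 = p1(in))
            return n1;
        return p2(in);
    };
}
```

With these definitions, `toy_seq(toy_char('a'), toy_char('b'))` matches the
first two characters of `"abc"` and fails on `"ac"`, while
`toy_alt(toy_char('a'), toy_char('b'))` matches either `"a"` or `"b"`.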

_Parser_ parsers each have an attribute associated with them, or explicitly
have no attribute.  An attribute is a value that the parser generates when it
matches the input.  For instance, the parser _d_ generates a `double` when it
matches the input.  Since it is not possible to write a type trait that
returns the attribute type of a parser, we need notation for concisely
communicating that relationship.  _ATTR_ is a notional macro that expands to
the attribute type of the parser passed to it; `_ATTR_np_(_d_)` is `double`.

Next, we'll look at some simple programs that parse using _Parser_.  We'll
start small and build up from there.

[endsect]

[section Hello, Whomever]

This is just about the most minimal example of using _Parser_ that one could
write.  We take a string from the command line, or `"World"` if none is given,
and then we parse it:

[hello_example]

The expression `*bp::char_` is a parser-expression.  It uses one of the many
parsers that _Parser_ provides, _ch_.  Like all _Parser_ parsers, it has
certain operations defined on it.  In this case, `*bp::char_` is using an
overloaded `operator*()` as the C++ version of a _kl_ operator.  Since C++ has
no postfix unary `*` operator, we have to use the one we have, so it is used
as a prefix.

So, `*bp::char_` means "any number of characters".  In other words, it really
cannot fail.  Even an empty string will match it.
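
If it helps to see why a _kl_ cannot fail, here is a small standard-library
sketch of the idea.  The names `toy_kleene` and `any_char` are hypothetical,
and this is not _Parser_'s implementation; it just shows that "zero matches"
is a valid outcome, so there is no failure case to report.

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical toy model, not Boost.Parser.  Applies pred to successive
// characters and returns how many matched in a row.  Zero is a perfectly good
// answer, so there is no way for this function to report failure.
template<typename Pred>
std::size_t toy_kleene(std::string_view input, Pred pred)
{
    std::size_t matched = 0;
    while (matched < input.size() && pred(input[matched]))
        ++matched;
    return matched;
}

// Stands in for a parser that accepts any character.
inline bool any_char(char) { return true; }
```

Calling `toy_kleene("World", any_char)` consumes all five characters, and
calling it on an empty string consumes none; both calls count as a match.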

The parse operation is performed by calling the _p_ function, passing the
parser as one of the arguments:

    bp::parse(input, *bp::char_, result);

The arguments here are: `input`, the string to parse; `*bp::char_`, the parser
used to do the parse; and `result`, an out-parameter into which to put the
result of the parse.  Don't get too caught up on this method of getting the
parse result out of _p_; there are multiple ways of doing so, and we'll cover
all of them in subsequent examples.

Also, just ignore for now the fact that _Parser_ somehow figured out that the
result type of the `*bp::char_` parser is a `std::string`.  There are clear
rules for this that we'll cover later.

The effect of this call to _p_ is not very interesting _emdash_ since the
parser we gave it cannot ever fail, and because we're placing the output in
the same type as the input, it just copies the contents of `input` to
`result`.

[endsect]

[section A Trivial Example]

Let's look at a slightly more complicated example, even if it is still
trivial.  Instead of taking any old `char`s we're given, let's require some
structure.  Let's parse one or more `double`s, separated by commas.

The _Parser_ parser for `double` is _d_.  So, to parse a single `double`, we'd
use _d_.  If we wanted to parse two `double`s in a row, we'd use:

    boost::parser::double_ >> boost::parser::double_

`operator>>()` in this expression is the sequence-operator; read it as
"followed by".  If we combine the sequence-operator with _kl_, we can get the
parser we want by writing:

    boost::parser::double_ >> *(',' >> boost::parser::double_)

This is a parser that matches at least one `double` _emdash_ because of the
first _d_ in the expression above _emdash_ followed by zero or more instances
of a-comma-followed-by-a-`double`.  Notice that we can use `','` directly.
Though it is not a parser, `operator>>()` and the other operators defined on
_Parser_ parsers have overloads that accept character/parser pairs of
arguments; these operator overloads will create the right parser to recognize
`','`.

[trivial_example]

The first example filled in an out-parameter to deliver the result of the
parse.  This call to _p_ returns a result instead.  As you can see, the result
is contextually convertible to `bool`, and `*result` is some sort of range.
In fact, the return type of this call to _p_ is
`std::optional<std::vector<double>>`.  Naturally, if the parse fails,
`std::nullopt` is returned.  We'll look at how _Parser_ maps the type of the
parser to the return type, or the filled in out-parameter's type, a bit later.

If I run it in a shell, this is the result:

[pre
$ example/trivial
Enter a list of doubles, separated by commas.  No pressure. 5.6,8.9
Great! It looks like you entered:
5.6
8.9
$ example/trivial
Enter a list of doubles, separated by commas.  No pressure. 5.6, 8.9
Good job!  Please proceed to the recovery annex for cake.
]

It does not recognize `"5.6, 8.9"`.  This is because it expects a comma
followed /immediately/ by a `double`, but I inserted a space after the comma.
The same failure to parse would occur if I put a space before the comma, or
before or after the list of `double`s.

[endsect]

[section A Trivial Example That Gracefully Handles Whitespace]

Let's modify the trivial parser we just saw to ignore any spaces that might
exist among the `double`s and commas.  To skip whitespace wherever we find it,
we can pass a /skip parser/ to our call to _p_ (we don't need to touch the
parser passed to _p_).  Here, we use `ascii::space`, which matches any ASCII
character `c` for which `std::isspace(c)` is true.

[trivial_skipper_example]

The skip parser, or /skipper/, is run between the subparsers within the
parser passed to _p_.  In this case, the skipper is run before the first
`double` is parsed, before any subsequent comma or `double` is parsed, and at
the end.  So, the strings `"3.6,5.9"` and `" 3.6 , \t 5.9 "` are parsed the
same by this program.
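
To see where the skipper runs, here is a hand-written, standard-library-only
sketch of the same comma-separated list parse.  It is an illustration under
stated assumptions, not _Parser_'s implementation: the names `toy_skip`,
`toy_double`, and `toy_parse_doubles` are hypothetical, and `std::strtod()` is
used as a stand-in for the real `double` parser, so the accepted number formats
may differ.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: the "skipper".  Advances pos past any whitespace, but
// only when skipping is enabled.
inline void toy_skip(std::string const & in, std::size_t & pos, bool skip_ws)
{
    while (skip_ws && pos < in.size() &&
           std::isspace(static_cast<unsigned char>(in[pos])))
        ++pos;
}

// Hypothetical helper: parses one double starting exactly at pos.  Leading
// whitespace is rejected here, because std::strtod() would otherwise skip it.
inline bool toy_double(std::string const & in, std::size_t & pos, double & out)
{
    if (pos >= in.size() || std::isspace(static_cast<unsigned char>(in[pos])))
        return false;
    char const * const first = in.c_str() + pos;
    char * last = nullptr;
    out = std::strtod(first, &last);
    if (last == first)
        return false;
    pos += static_cast<std::size_t>(last - first);
    return true;
}

// Mimics double >> *(',' >> double).  The skipper runs before each double and
// each comma, and once more at the end.  The whole input must be consumed.
inline std::optional<std::vector<double>>
toy_parse_doubles(std::string const & in, bool skip_ws)
{
    std::vector<double> result;
    std::size_t pos = 0;
    double d = 0.0;
    toy_skip(in, pos, skip_ws);
    if (!toy_double(in, pos, d))
        return std::nullopt;
    result.push_back(d);
    for (;;) {
        toy_skip(in, pos, skip_ws);
        if (pos >= in.size() || in[pos] != ',')
            break;
        ++pos;
        toy_skip(in, pos, skip_ws);
        if (!toy_double(in, pos, d))
            return std::nullopt;
        result.push_back(d);
    }
    if (pos != in.size())
        return std::nullopt;
    return result;
}
```

With `skip_ws` set to `false`, `"1.5, 2.25"` is rejected because of the space
after the comma; with `skip_ws` set to `true`, both `"1.5,2.25"` and
`" 1.5 , \t 2.25 "` produce the same two values.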

Skipping is an important concept in _Parser_.  You can skip anything, not just
ASCII whitespace; there are lots of other things you might want to skip.  The
skipper you pass to _p_ can be an arbitrary parser.  For example, if you write
a parser for a scripting language, you can write a skipper to skip whitespace,
inline comments, and end-of-line comments.

We'll be using skip parsers almost exclusively in the rest of the
documentation.  The ability to ignore the parts of your input that you don't
care about is so convenient that parsing without skipping is a rarity in
practice.

[endsect]

[section Semantic Actions]

Like all parsing systems (lex & yacc, _Spirit_, etc.), _Parser_ has a
mechanism for associating semantic actions with different parts of the parse.
Here is nearly the same program as we saw in the previous example, except that
it is implemented in terms of a semantic action that appends each parsed
`double` to a result, instead of automatically building and returning the
result:

[semantic_action_example]

Run in a shell, it looks like this:

[pre
$ example/semantic_actions
Enter a list of doubles, separated by commas. 4,3
Got one!
Got one!
You entered:
4
3
]

In _Parser_, semantic actions are implemented in terms of invocable objects
that take a single parameter, a reference to a parse-context object.  In the
example we used this lambda as our invocable:

[semantic_action_example_lambda]

We're both printing a message to `std::cout` and recording a parsed result in
the lambda.  It could do both, either, or neither of these things if you like.
The way we get the parsed `double` in the lambda is by asking the parse
context for it. `_attr(ctx)` is how you ask the parse context for the
attribute produced by the parser to which the semantic action is attached.
There are lots of functions like `_attr()` that can be used to access the
state in the parse context.  We'll cover more of them later on.  The next
section defines what exactly the parse context is and how it works.

TODO: Briefly introduce rules here.

[endsect]

[section The Parse Context]

Now would be a good time to describe the parse context in some detail.  Any
semantic action that you write will need to use the state in the parse
context, so you need to know what's available.

The parse context is a `hana::map` from tag types to elements.  Elements are
added to or removed from it at different times during the parse.  For
instance, when a parser with a semantic action succeeds, it adds the attribute
it produces to the parse context, then calls the invocable semantic action.
This is efficient to do, because the `hana::map` remains fairly small, usually
around ten elements, and each element is stored as a pointer.  Copying the
entire map when mutating the context is therefore fast.

[note All these functions that take the parse context as their first parameter
will be found by Argument-Dependent Lookup.  You will probably never need
to qualify them with `boost::parser::`.]

[heading Accessors for data that are always available]

By convention, the names of all _Parser_ functions that take a parse context,
and are therefore intended for use inside semantic actions, contain a leading
underscore.

[heading _pass_]

_pass_ returns a reference to a `bool` indicating the success or failure of
the current parse.  This can be used to force the current parse to pass or
fail:

    [](auto & ctx) {
        // If the attribute meets this predicate, fail the parse.
        if (some_condition(_attr(ctx)))
            _pass(ctx) = false;
    }

Note that for a semantic action to be executed, its associated parser must
already have succeeded.  So unless you previously wrote `_pass(ctx) = false`
somewhere, `_pass(ctx) = true` does nothing; it's redundant.

[heading _begin_, _end_ and _where_]

_begin_ and _end_ return the beginning and end of the range that you passed to
_p_, respectively.  _where_ returns a _v_ indicating the bounds of the input
matched by the current parse.  _where_ can be useful if you just want to parse
some text and return a result consisting of where certain elements are
located, without producing any other attributes.

[heading _error_handler_]

_error_handler_ returns a reference to the error handler associated with the
parser passed to _p_.  Any error handler must have the following member
functions:

[error_handler_api_1]

[error_handler_api_2]

If you call the second one, the one without the iterator parameter, it will
call the first with `_where(ctx).begin()` as the iterator parameter.  The
one without the iterator is the one you will use most often.  The one with the
explicit iterator parameter can be useful in situations where you have
messages that are related to each other, associated with multiple locations.
For instance, if you are parsing XML, you may want to report that a close-tag
does not match its associated open-tag by showing the line where the open-tag
was found.  That may of course not be located anywhere near
`_where(ctx).begin()`.  (A description of _globals_ is below.)

    [](auto & ctx) {
        // Assume we have a std::vector of open tags, and another
        // std::vector of iterators to where the open tags were parsed, in our
        // globals.
        if (_attr(ctx) != _globals(ctx).open_tags.back()) {
            std::string open_tag_msg =
                "Previous open-tag \"" + _globals(ctx).open_tags.back() + "\" here:";
            _error_handler(ctx).diagnose(
                boost::parser::diagnostic_kind::error,
                open_tag_msg,
                ctx,
                _globals(ctx).open_tag_positions.back());
            std::string close_tag_msg =
                "does not match close-tag \"" + _attr(ctx) + "\" here:";
            _error_handler(ctx).diagnose(
                boost::parser::diagnostic_kind::error,
                close_tag_msg,
                ctx);

            // Explicitly fail the parse.  Diagnostics do not affect parse success.
            _pass(ctx) = false;
        }
    }

[heading _report_error_ and _report_warning_]

There are also some convenience functions that make the above code a little
less verbose, _report_error_ and _report_warning_:

    [](auto & ctx) {
        // Assume we have a std::vector of open tags, and another
        // std::vector of iterators to where the open tags were parsed, in our
        // globals.
        if (_attr(ctx) != _globals(ctx).open_tags.back()) {
            std::string open_tag_msg =
                "Previous open-tag \"" + _globals(ctx).open_tags.back() + "\" here:";
            _report_error(ctx, open_tag_msg, _globals(ctx).open_tag_positions.back());
            std::string close_tag_msg =
                "does not match close-tag \"" + _attr(ctx) + "\" here:";
            _report_error(ctx, close_tag_msg);

            // Explicitly fail the parse.  Diagnostics do not affect parse success.
            _pass(ctx) = false;
        }
    }

You should use these less verbose functions almost all the time.  The only
time you would want to use _error_handler_ is when you are using a custom
error handler, and you want access to some part of its interface besides
`diagnose()`.

[heading Accessors for data that are only sometimes available]

[heading _attr_]

_attr_ returns a reference to the value of the current parser's attribute.  It
is available only when the current parser's parse is successful.  If the
parser has no semantic action, no attribute gets added to the parse context.
It can be used to read and write the current parser's attribute:

    [](auto & ctx) { _attr(ctx) = 3; }

If the current parser has no attribute, a _n_ is returned.

[heading _val_]

_val_ returns a reference to the value of the attribute of the current rule
being used to parse (if any), and is available even before the rule's parse is
successful.  It can be used to set the current rule's attribute, even from a
parser that is a subparser inside the rule.  Let's say we're writing a parser
with a semantic action that is within a rule.  If we want to set the current
rule's value to whatever this subparser parses, we would write this semantic
action:

    [](auto & ctx) { _val(ctx) = _attr(ctx); }

If there is no current rule, or the current rule has no attribute, a _n_ is
returned.

[heading _globals_]

_globals_ returns a reference to a user-supplied struct that contains whatever
data you want to use during the parse.  We'll get into this more later, but
for now, here's how you might use it:

    [](auto & ctx) {
        // black_list is some set of proscribed values that are not allowed.
        if (_globals(ctx).black_list.contains(_attr(ctx)))
            _pass(ctx) = false;
    }

[heading _locals_]

_locals_ returns a reference to one or more values that are local to the
current rule being parsed, if any.  If there are two or more local values,
_locals_ returns a reference to a `hana::tuple`.  Rules are something we
haven't gotten to yet, but here is how you use _locals_:

    [](auto & ctx) {
        auto & local = _locals(ctx);
        // Use local here.  If it is a hana::tuple, access its members like this:
        using namespace hana::literals;
        auto & first_element = local[0_c];
        auto & second_element = local[1_c];
    }

If there is no current rule, or the current rule has no locals, a _n_ is
returned.

[heading _params_]

_params_, like _locals_, applies to the current rule being used to parse, if
any.  It also returns a reference to a single value, if the current rule has
only one parameter, or a `hana::tuple` to multiple values if the current rule
has multiple parameters.

If there is no current rule, or the current rule has no parameters, a _n_ is
returned.

[note _n_ is a type that is used as a return value in _Parser_ for parse
context accessors.  _n_ is convertible to anything that has a default
constructor, convertible from anything, assignable from anything, and has
templated overloads for all the overloadable operators.  The intention is that
a misuse of _val_, _globals_, etc. should compile, and produce an assertion at
runtime.  Experience has shown that using a debugger for investigating the
stack that leads to your mistake is a far better user experience than sifting
through compiler diagnostics.  See the _rationale_ section for a more detailed
explanation.]

[endsect]

[section Symbol Tables]

When writing a parser, it often comes up that there is a set of strings that,
when parsed, are associated with a set of values 1-to-1.  It is tedious to
write parsers that recognize all the possible input strings when you have to
associate each one with an attribute via a semantic action.  Instead, we can
use a symbol table.

Say we want to parse Roman numerals, one of the most common work-related
parsing problems.  We want to recognize numbers that start with any number of
"M"s, representing thousands, followed by the hundreds, the tens, and the
ones.  Any of these may be absent from the input, but not all.  Here are three
_Parser_ symbol tables that we can use to recognize ones, tens, and hundreds
values, respectively:

[roman_numeral_symbol_tables]

A _symbols_ maps strings of `char` to their associated attributes.  The type
of the attribute must be specified as a template parameter to _symbols_
_emdash_ `int` in this case.

Any "M"s we encounter should add 1000 to the result, and all other values come
from the symbol tables.  Here are the semantic actions we'll need to do that:

[roman_numeral_actions]

`add_1000` just adds `1000` to `result`.  `add` adds whatever attribute is
produced by its parser to `result`.

Now we just need to put the pieces together to make a parser:

[roman_numeral_parser]

We've got a few new bits in play here, so let's break it down.  `'M'_l` is a
/literal parser/.  That is, it is a parser that parses a literal `char`, code
point, or string.  In this case, a `char` "M" is being parsed.  The `_l` bit
at the end is a _udl_ suffix that you can put after any `char`, `char32_t`, or
`char const *` to form a literal parser.  You can also make a literal parser
by writing _lit_ for some `x` of one of the previously mentioned types.

Why do we need any of this, considering that we just used a literal `','` in
our previous example?  The reason is that `'M'` is not used in an expression
with another _Parser_ parser.  It is used within `*'M'_l[add_1000]`.  If we'd
written `*'M'[add_1000]`, clearly that would be ill-formed; `char` has no
`operator*()`, nor an `operator[]()`, associated with it.

[tip Any time you want to use a `char`, `char32_t`, or string literal in a
_Parser_ parser, write it as-is if it is combined with a preexisting _Parser_
subparser `p`, as in `'x' >> p`.  Otherwise, you need to wrap it in a call to
_lit_, or use the `_l` _udl_ suffix.]

On to the next bit: `-hundreds[add]`.  By now, the use of the index operator
should be pretty familiar; it associates the semantic action `add` with the
parser `hundreds`.  The `operator-()` at the beginning is new.  It means that
the parser it is applied to is optional.  You can read it as "zero or one".
So, if `hundreds` is not successfully parsed after `*'M'_l[add_1000]`, nothing
happens, because `hundreds` is allowed to be missing _emdash_ it's optional.
If `hundreds` is parsed successfully, say by matching `"CC"`, the resulting
attribute, `200`, is added to `result` inside `add`.
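
For readers who want to see the whole flow in one place, here is a
standard-library-only sketch of the same idea: any number of "M"s, followed by
an optional hundreds, tens, and ones lookup.  This is not _Parser_ code.  The
names `toy_symbols`, `toy_lookup`, and `toy_roman` are hypothetical, and the
sketch assumes that a symbol table lookup prefers the longest matching symbol.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical toy symbol table: pairs of (symbol, value).
using toy_symbols = std::vector<std::pair<std::string_view, int>>;

// Returns the entry whose symbol is the longest prefix of in, if there is one.
inline std::optional<std::pair<std::string_view, int>>
toy_lookup(toy_symbols const & table, std::string_view in)
{
    std::optional<std::pair<std::string_view, int>> best;
    for (auto const & entry : table) {
        bool const is_prefix =
            in.substr(0, entry.first.size()) == entry.first;
        if (is_prefix && (!best || best->first.size() < entry.first.size()))
            best = entry;
    }
    return best;
}

// Returns the value of a Roman numeral, or std::nullopt if in is empty or is
// not fully consumed.
inline std::optional<int> toy_roman(std::string_view in)
{
    static toy_symbols const hundreds = {
        {"C", 100}, {"CC", 200}, {"CCC", 300}, {"CD", 400}, {"D", 500},
        {"DC", 600}, {"DCC", 700}, {"DCCC", 800}, {"CM", 900}};
    static toy_symbols const tens = {
        {"X", 10}, {"XX", 20}, {"XXX", 30}, {"XL", 40}, {"L", 50},
        {"LX", 60}, {"LXX", 70}, {"LXXX", 80}, {"XC", 90}};
    static toy_symbols const ones = {
        {"I", 1}, {"II", 2}, {"III", 3}, {"IV", 4}, {"V", 5},
        {"VI", 6}, {"VII", 7}, {"VIII", 8}, {"IX", 9}};

    if (in.empty())
        return std::nullopt;

    int result = 0;
    // Zero or more 'M's, each worth 1000.
    while (!in.empty() && in.front() == 'M') {
        result += 1000;
        in.remove_prefix(1);
    }
    // Each table is optional: a failed lookup simply contributes nothing.
    for (toy_symbols const * table : {&hundreds, &tens, &ones}) {
        if (auto const match = toy_lookup(*table, in)) {
            result += match->second;
            in.remove_prefix(match->first.size());
        }
    }
    if (!in.empty())
        return std::nullopt;
    return result;
}
```

In this sketch, `"MCMXCIV"` is consumed as `M`, `CM`, `XC`, and `IV`, while
`"IIII"` is rejected because one `I` is left over after the ones lookup.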

Here is the full listing of the program.  Notice that it would have been
inappropriate to use a whitespace skipper here, since the entire parse is a
single number, so it was removed.

[roman_numeral_example]

[endsect]

[section Mutable Symbol Tables]

The previous example showed how to use a symbol table as a fixed lookup table.
What if we want to add things to the table during the parse?  We can do that,
but we need to do so within a semantic action.  First, here is our symbol
table, already with a single value in it:

[self_filling_symbol_table_table]

No surprise that it works to use the symbol table as a parser to parse the one
string in the symbol table.  Now, here's our parser:

[self_filling_symbol_table_parser]

Here, we've attached the semantic action not to a simple parser like _d_, but
to the sequence parser `(bp::char_ >> bp::int_)`.  This sequence parser
contains two parsers, each with its own attribute, so it produces two
attributes as a `hana::tuple`.

[self_filling_symbol_table_action]

Inside the semantic action, we can get the first element of the attribute
tuple using _udls_ provided by Boost.Hana, and `hana::tuple::operator[]()`.
The first attribute, from the _ch_, is `_attr(ctx)[0_c]`, and the second, from
the _i_, is `_attr(ctx)[1_c]`.  To add the symbol to the symbol table, we call
`insert()`.

[self_filling_symbol_table_parser]

During the parse, `("X", 9)` is parsed and added to the symbol table.  Then,
the second `'X'` is recognized by the symbol table parser.  However:

[self_filling_symbol_table_after_parse]

If we parse again, we find that `"X"` did not stay in the symbol table.  The
fact that `symbols` was declared const might have given you a hint that this
would happen.  Also, notice that the call to `insert()` in the semantic action
uses the parse context; that's where all the symbol table changes are stored
during the parse.

The full program:

[self_filling_symbol_table_example]

[note It is possible to add symbols to a _symbols_ permanently.  To do so, you
have to use a mutable _symbols_ object `s`, and add the symbols by calling
`s.add()`, instead of `s.insert()`.]

[endsect]

[section Alternative Parsers]

Frequently, you need to parse something that might have one of several forms.
`operator|()` is overloaded to form alternative parsers.  For example:

    namespace bp = boost::parser;
    auto const parser_1 = bp::int_ | bp::eps;

`parser_1` matches an integer, or if that fails, it matches /epsilon/, the
empty string.  This is equivalent to writing:

    namespace bp = boost::parser;
    auto const parser_2 = -bp::int_;

However, neither `parser_1` nor `parser_2` is equivalent to writing this:

    namespace bp = boost::parser;
    auto const parser_3 = bp::eps | bp::int_;

The reason is that alternative parsers try each of their subparsers, one at a
time, and stop on the first one that matches.  /Epsilon/ matches anything,
since it is zero length and consumes no input.  It even matches the end of
input.  This means that `parser_3` is equivalent to _e_ by itself.
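
The effect of ordering can be sketched with a few standard-library functions.
These are hypothetical stand-ins, not _Parser_ code: `toy_int` only accepts
decimal digits, and each function returns the number of characters it
consumed, or `std::nullopt` on failure.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for an integer parser: one or more decimal digits.
inline std::optional<std::size_t> toy_int(std::string_view in)
{
    std::size_t n = 0;
    while (n < in.size() && std::isdigit(static_cast<unsigned char>(in[n])))
        ++n;
    if (n == 0)
        return std::nullopt;
    return n;
}

// Hypothetical stand-in for epsilon: always matches, and consumes nothing.
inline std::optional<std::size_t> toy_eps(std::string_view)
{
    return std::size_t{0};
}

// Tries the integer first, then falls back to epsilon.
inline std::optional<std::size_t> int_or_eps(std::string_view in)
{
    if (auto const n = toy_int(in))
        return n;
    return toy_eps(in);
}

// Tries epsilon first.  Epsilon always matches, so toy_int is never reached.
inline std::optional<std::size_t> eps_or_int(std::string_view in)
{
    if (auto const n = toy_eps(in))
        return n;
    return toy_int(in);
}
```

Given the input `"42"`, `int_or_eps` consumes both digits, but `eps_or_int`
consumes nothing, because its first alternative has already matched.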

[endsect]

[section The Parsers And Their Uses]

_Parser_ comes with all the parsers most parsing tasks will ever need.  (You
can also write your own; we'll cover that later.)  Each one is a `constexpr`
object, or a `constexpr` function.  Some of the non-functions are also
callable, such as _ch_, which may be used directly, or with arguments, as in
_ch_`('a', 'z')`.  Any parser that can be called, whether a function or
callable object, will be called a /callable parser/ from now on.  Note that
there are no nullary callable parsers; they each take one or more arguments.

Each callable parser takes one or more /parse arguments/.  A parse argument
may be a value or an invocable object that accepts a reference to the parse
context.  The reference parameter may be mutable or constant.  For example:

    struct get_attribute
    {
        template<typename Context>
        auto operator()(Context & ctx)
        {
            return _attr(ctx);
        }
    };

This can also be a lambda.  For example:

    [](auto const & ctx) { return _attr(ctx); }

The operation that produces a value from a parse argument, which may be a
value or a callable taking a parse context argument, is referred to as
/resolving/ the parse argument.
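
One way to picture resolution is the following sketch, which requires C++17.
It is only an assumption about the general shape of the operation, not
_Parser_'s actual implementation; `toy_context` and `toy_resolve` are
hypothetical names.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for the parse context.
struct toy_context
{
    char last_char;
};

// Resolves a parse argument.  If arg can be invoked with the context, the
// result of that call is the resolved value; otherwise, arg is already a
// value, and is returned as-is.
template<typename Context, typename Arg>
auto toy_resolve(Context & ctx, Arg const & arg)
{
    if constexpr (std::is_invocable_v<Arg const &, Context &>)
        return arg(ctx);
    else
        return arg;
}
```

With a `toy_context` whose `last_char` is `'q'`, resolving the plain value
`'a'` yields `'a'`, and resolving a lambda that returns `ctx.last_char` yields
`'q'`.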

Some callable parsers take a /parse predicate/.  A parse predicate is not
quite the same as a parse argument, because it must be a callable object, and
cannot be a value.  A parse predicate's return type must be contextually
convertible to `bool`.  For example:

    struct equals_three
    {
        template<typename Context>
        bool operator()(Context const & ctx)
        {
            return _attr(ctx) == 3;
        }
    };

This may of course be a lambda:

    [](auto & ctx) { return _attr(ctx) == 3; }

An example of how parse arguments are used:

    namespace bp = boost::parser;
    // This parser matches one code point that is at least 'a', and at most
    // the value of last_char, which comes from the globals.
    auto last_char = [](auto & ctx) { return _globals(ctx).last_char; };
    auto subparser = bp::char_('a', last_char);

Don't worry for now about what the globals are; the take-away is that
you can make any argument you pass to a parser depend on the current state of
the parse, by using the parse context:

    namespace bp = boost::parser;
    // This parser parses two code points.  For the parse to succeed, the
    // second one must be >= 'a' and <= the first one.
    auto set_last_char = [](auto & ctx) { _globals(ctx).last_char = _attr(ctx); };
    auto parser = bp::char_[set_last_char] >> subparser;

Each callable parser returns a new parser, parameterized using the arguments
given in the invocation.

TODO: This is way too long for a tutorial.  Put this after the examples, in a
reference section separate from the headers-reference (consider moving other
long tables, too).  Instead, just cover a few exemplars, like _i_, _f_, char_,
string.

This table lists all the _Parser_ parsers.  For the callable parsers, a
separate entry exists for each possible arity of arguments.  For a parser `p`,
if there is no entry for `p` without arguments, `p` is a function, and cannot
itself be used as a parser; it must be called.  In the table below:

* each entry is a global object usable directly in your parsers, unless
  otherwise noted;

* "code point" is used to refer to the elements of the input range, which
  assumes that the parse is being done in the Unicode-aware code path (if the
  parse is being done in the non-Unicode code path, read "code point" as
  "`char`");

* _RES_ is a notional macro that expands to the resolution of a parse argument
  or evaluation of a parse predicate;

* "`_RES_np_(pred) == true`" is a shorthand notation for "`_RES_np_(pred)` is
  contextually convertible to `true`", and likewise for `false`;

* `c` is a character of type `char`, `char8_t`, or `char32_t`;

* `str` is a string literal of type `char const[]`, `char8_t const []`, or
  `char32_t const []`;

* `pred` is a parse predicate;

* `arg0`, `arg1`, `arg2`, ... are parse arguments;

* `a` is a semantic action;

* `r` is an object whose type models `parsable_range_like`; and

* `p`, `p1`, `p2`, ... are parsers.

[note The definition of `parsable_range_like` is:

[parsable_range_like_concept]

It is intended to be a range-like thing; a null-terminated sequence of
characters is considered range-like, given that a pointer `T *` to a
null-terminated string is isomorphic with `view<T *,
boost::text::null_sentinel>`.]

[note Some of the parsers in this table consume no input.  All parsers consume
the input they match unless otherwise stated in the table below.]

[table Parsers and Their Semantics
    [[Parser] [Semantics] [Attribute Type] [Notes]]

    [[ _e_ ]
      [ Matches /epsilon/, the empty string.  Always matches, and consumes no input. ]
      [ None. ]
      []]

    [[ `_e_(pred)` ]
     [ Fails to match the input if `_RES_np_(pred) == false`.  Otherwise, the semantics are those of _e_. ]
     [ None. ]
     []]

    [[ _ws_ ]
     [ Matches a single whitespace code point (see note), according to the Unicode White_Space property. ]
     [ None. ]
     [ For more info, see the [@https://www.unicode.org/Public/UCD/latest/ucd/PropList.txt Unicode properties].  _ws_ may consume one code point or two.  It only consumes two code points when it matches `"\r\n"`. ]]

    [[ _eol_ ]
     [ Matches a single newline (see note), following the "hard" line breaks in the Unicode line breaking algorithm. ]
     [ None. ]
     [ For more info, see the [@https://unicode.org/reports/tr14 Unicode Line Breaking Algorithm].  _eol_ may consume one code point or two.  It only consumes two code points when it matches `"\r\n"`. ]]

    [[ _eoi_ ]
     [ Matches only at the end of input, and consumes no input. ]
     [ None. ]
     []]

    [[ _attr_np_`(arg0)` ]
     [ Always matches, and consumes no input.  Generates the attribute `_RES_np_(arg0)`. ]
     [ `decltype(_RES_np_(arg0))`. ]
     []]

    [[ _ch_ ]
     [ Matches any single code point. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See _attr_gen_. ]
     []]

    [[ `_ch_(arg0)` ]
     [ Matches exactly the code point `_RES_np_(arg0)`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See _attr_gen_. ]
     []]

    [[ `_ch_(arg0, arg1)` ]
     [ Matches the next code point `n` in the input, if `_RES_np_(arg0) <= n && n <= _RES_np_(arg1)`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See _attr_gen_. ]
     []]

    [[ `_ch_(r)` ]
     [ Matches the next code point `n` in the input, if `n` is one of the code points in `r`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See _attr_gen_. ]
     [ `r` is taken to be in a UTF encoding.  The exact UTF used depends on the size of `r`'s element type.  If you do not pass UTF encoded ranges for `r`, the behavior of _ch_ is undefined.  Note that ASCII is a subset of UTF-8, so ASCII is fine.  EBCDIC may not be.  `r` is not copied; a reference to it is taken.  The lifetime of `_ch_(r)` must be within the lifetime of `r`.  This overload of _ch_ does *not* take parse arguments. ]]

    [[ _cp_ ]
     [ Matches a single code point. ]
     [ `uint32_t` ]
     [ Similar to _ch_, but with a fixed `uint32_t` attribute type; _cp_ has all the same call operator overloads as _ch_, though they are not repeated here, for brevity. ]]

    [[ _cu_ ]
     [ Matches a single code point. ]
     [ `char` ]
     [ Similar to _ch_, but with a fixed `char` attribute type; _cu_ has all the same call operator overloads as _ch_, though they are not repeated here, for brevity.  Even though the name "`cu`" suggests that this parser matches at the code unit level, it does not.  The name refers to the attribute type generated, much like the names _i_ versus _ui_. ]]

    [[ `_alnum_` ]
     [ Matches a single code point for which `std::isalnum()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_alpha_` ]
     [ Matches a single code point for which `std::isalpha()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_blank_` ]
     [ Matches a single code point for which `std::isblank()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_cntrl_` ]
     [ Matches a single code point for which `std::iscntrl()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_digit_` ]
     [ Matches a single code point for which `std::isdigit()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_graph_` ]
     [ Matches a single code point for which `std::isgraph()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_print_` ]
     [ Matches a single code point for which `std::isprint()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_punct_` ]
     [ Matches a single code point for which `std::ispunct()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_space_` ]
     [ Matches a single code point for which `std::isspace()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_xdigit_` ]
     [ Matches a single code point for which `std::isxdigit()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_lower_` ]
     [ Matches a single code point for which `std::islower()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ `_upper_` ]
     [ Matches a single code point for which `std::isupper()` is `true`. ]
     [ The code point type in Unicode parsing, or `char` in non-Unicode parsing.  See the entry for _ch_. ]
     [ Intended for parsing of ASCII only.  The results will be wrong for many, many cases if used for Unicode parsing. ]]

    [[ _lit_np_`(c)`]
     [ Matches exactly the given code point `c`. ]
     [ None. ]
     [_lit_ does *not* take parse arguments. ]]

    [[ `c_l` ]
     [ Matches exactly the given code point `c`. ]
     [ None. ]
     [ This is a _udl_ that represents `_lit_np_(c)`, for example `'F'_l`. ]]

    [[ _lit_np_`(r)`]
     [ Matches exactly the given string `r`. ]
     [ None. ]
     [ _lit_ does *not* take parse arguments. ]]

    [[ `str_l` ]
     [ Matches exactly the given string `str`. ]
     [ None. ]
     [ This is a _udl_ that represents `_lit_np_(str)`, for example `"a string"_l`. ]]

    [[ `_str_np_(r)`]
     [ Matches exactly `r`, and generates the match as an attribute. ]
     [ `std::string` ]
     [ _str_ does *not* take parse arguments. ]]

    [[ `str_p`]
     [ Matches exactly `str`, and generates the match as an attribute. ]
     [ `std::string` ]
     [ This is a _udl_ that represents `_str_np_(str)`, for example `"a string"_p`. ]]

    [[ _b_ ]
     [ Matches `"true"` or `"false"`. ]
     [ `bool` ]
     []]

    [[ _bin_ ]
     [ Matches a binary unsigned integral value. ]
     [ `unsigned int` ]
     [ For example, _bin_ would match `"101"`, and generate an attribute of `5u`. ]]

    [[ `_bin_(arg0)` ]
     [ Matches exactly the binary unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned int` ]
     []]

    [[ _oct_ ]
     [ Matches an octal unsigned integral value. ]
     [ `unsigned int` ]
     [ For example, _oct_ would match `"31"`, and generate an attribute of `25u`. ]]

    [[ `_oct_(arg0)` ]
     [ Matches exactly the octal unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned int` ]
     []]

    [[ _hex_ ]
     [ Matches a hexadecimal unsigned integral value. ]
     [ `unsigned int` ]
     [ For example, _hex_ would match `"ff"`, and generate an attribute of `255u`. ]]

    [[ `_hex_(arg0)` ]
     [ Matches exactly the hexadecimal unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned int` ]
     []]

    [[ _us_ ]
     [ Matches an unsigned integral value. ]
     [ `unsigned short` ]
     []]

    [[ `_us_(arg0)` ]
     [ Matches exactly the unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned short` ]
     []]

    [[ _ui_ ]
     [ Matches an unsigned integral value. ]
     [ `unsigned int` ]
     []]

    [[ `_ui_(arg0)` ]
     [ Matches exactly the unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned int` ]
     []]

    [[ _ul_ ]
     [ Matches an unsigned integral value. ]
     [ `unsigned long` ]
     []]

    [[ `_ul_(arg0)` ]
     [ Matches exactly the unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned long` ]
     []]

    [[ _ull_ ]
     [ Matches an unsigned integral value. ]
     [ `unsigned long long` ]
     []]

    [[ `_ull_(arg0)` ]
     [ Matches exactly the unsigned integral value `_RES_np_(arg0)`. ]
     [ `unsigned long long` ]
     []]

    [[ _s_ ]
     [ Matches a signed integral value. ]
     [ `short` ]
     []]

    [[ `_s_(arg0)` ]
     [ Matches exactly the signed integral value `_RES_np_(arg0)`. ]
     [ `short` ]
     []]

    [[ _i_ ]
     [ Matches a signed integral value. ]
     [ `int` ]
     []]

    [[ `_i_(arg0)` ]
     [ Matches exactly the signed integral value `_RES_np_(arg0)`. ]
     [ `int` ]
     []]

    [[ _l_ ]
     [ Matches a signed integral value. ]
     [ `long` ]
     []]

    [[ `_l_(arg0)` ]
     [ Matches exactly the signed integral value `_RES_np_(arg0)`. ]
     [ `long` ]
     []]

    [[ _ll_ ]
     [ Matches a signed integral value. ]
     [ `long long` ]
     []]

    [[ `_ll_(arg0)` ]
     [ Matches exactly the signed integral value `_RES_np_(arg0)`. ]
     [ `long long` ]
     []]

    [[ _f_ ]
     [ Matches a floating-point number.  _f_ uses parsing implementation details from _Spirit_.  The specifics of what formats are accepted can be found in their _spirit_reals_.  Note that only the default `RealPolicies` is supported by _f_. ]
     [ `float` ]
     []]

    [[ _d_ ]
     [ Matches a floating-point number.  _d_ uses parsing implementation details from _Spirit_.  The specifics of what formats are accepted can be found in their _spirit_reals_.  Note that only the default `RealPolicies` is supported by _d_. ]
     [ `double` ]
     []]

    [[ `_rpt_np_(arg0)[p]` ]
     [ Matches iff `p` matches exactly `_RES_np_(arg0)` times. ]
     [ `std::vector<_ATTR_np_(p)>` ]
     [ The special value _inf_ may be used; it indicates unlimited repetition.  `decltype(_RES_np_(arg0))` must be implicitly convertible to `int64_t`. ]]

    [[ `_rpt_np_(arg0, arg1)[p]` ]
     [ Matches iff `p` matches between `_RES_np_(arg0)` and `_RES_np_(arg1)` times, inclusively. ]
     [ `std::vector<_ATTR_np_(p)>` ]
     [ The special value _inf_ may be used for the upper bound; it indicates unlimited repetition.  `decltype(_RES_np_(arg0))` and `decltype(_RES_np_(arg1))` each must be implicitly convertible to `int64_t`. ]]

    [[ `_if_np_(pred)[p]` ]
     [ Equivalent to `_e_(pred) >> p`. ]
     [ `std::optional<_ATTR_np_(p)>` ]
     [ It is an error to write `_if_np_(pred)`.  That is, it is an error to omit the conditionally matched parser `p`. ]]

    [[ `_sw_np_(arg0)(arg1, p1)(arg2, p2) ...` ]
     [ Equivalent to `p1` when `_RES_np_(arg0) == _RES_np_(arg1)`, `p2` when `_RES_np_(arg0) == _RES_np_(arg2)`, etc.  If there is no such `argN`, the behavior of _sw_ is undefined. ]
     [ `std::variant<_ATTR_np_(p1), _ATTR_np_(p2), ...>` ]
     [ It is an error to write `_sw_np_(arg0)`.  That is, it is an error to omit the conditionally matched parsers `p1`, `p2`, .... ]]

    [[ _symbols_t_ ]
     [ _symbols_ is an associative container of key, value pairs.  Each key is a `std::string` and each value has type `T`.  In the Unicode parsing path, the strings are considered to be UTF-8 encoded; in the non-Unicode path, no encoding is assumed.  _symbols_ matches the longest prefix `pre` of the input that is equal to one of the keys `k`.  If the length `len` of `pre` is zero, and there is no zero-length key, it does not match the input.  If `len` is positive, the generated attribute is the value associated with `k`. ]
     [ `T` ]
     [ Unlike the other entries in this table, _symbols_ is a type, not an object. ]]
]

[note A slightly more complete description of the attributes generated by
these parsers is in the next section.  The attributes are repeated here so you
can see all the properties of the parsers in one place.]
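
The longest-prefix matching described for _symbols_ in the table above can be
modeled with the standard library alone.  The following is a minimal sketch,
not the library's implementation; `longest_prefix` and `example_table` are
hypothetical names introduced only for illustration, and keys are compared
byte by byte, with no encoding assumed.

```cpp
#include <array>
#include <cstddef>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical helper, not part of the library.  Returns the value and the
// length of the longest key that is a prefix of `input`, or std::nullopt if
// no key is a prefix of `input`.
template <typename T, std::size_t N>
constexpr std::optional<std::pair<T, std::size_t>> longest_prefix(
    std::array<std::pair<std::string_view, T>, N> const & table,
    std::string_view input)
{
    std::size_t best = N; // N means "no match yet"
    for (std::size_t i = 0; i < N; ++i) {
        std::string_view const key = table[i].first;
        // substr() clamps the count, so a key longer than `input` never
        // compares equal.
        bool const is_prefix = input.substr(0, key.size()) == key;
        if (is_prefix && (best == N || table[best].first.size() < key.size()))
            best = i;
    }
    if (best == N)
        return std::nullopt;
    return std::pair<T, std::size_t>(
        table[best].second, table[best].first.size());
}

// Example table: three keys that share a common prefix.
inline constexpr std::array<std::pair<std::string_view, int>, 3> example_table{
    {{"a", 1}, {"ab", 2}, {"abc", 3}}};
```

With this table, the input `"abz"` selects the key `"ab"` rather than `"a"`,
because `"ab"` is the longest key that is a prefix of the input; an input
such as `"xyz"` does not match at all.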

TODO: int<>, uint<>

[endsect]

[section Directives]

A directive is an element of your parser that doesn't have any meaning by
itself.  Some are second-order parsers that need a first-order parser to do
the actual parsing.  Others influence the parse in some way.  Lexically, you
can spot a directive by its use of `[]`.  Non-directives never use `[]`, and
directives always do.

The directives that are second-order parsers are technically directives, but
since they are also used to create parsers, it is more useful just to focus on
that.  The directives _rpt_ and _if_ were already described in the section on
parsers; we won't say more about them here.

That leaves the directives that affect aspects of the parse:

[heading _omit_]

`_omit_np_[p]` disables attribute generation for the parser `p`.  Not only
does `_omit_np_[p]` have no attribute, but any attribute generation work that
normally happens within `p` is skipped.

This directive can be useful in cases like this: say you have some fairly
complicated parser `p` that generates a large and expensive-to-construct
attribute.  Now say that you want to write a function that just counts how
many times `p` can match a string (where the matches are non-overlapping).
Instead of using `p` directly, and building all those attributes, or rewriting
`p` without the attribute generation, use _omit_.

[heading _raw_]

`_raw_np_[p]` changes the attribute from `_ATTR_np_(p)` to a view that
delimits the subrange of the input that was matched by `p`.  The type of the
view is `_v_<I>`, where `I` is the type of the iterator used within the parse.
Note that this may not be the same as the iterator type passed to _p_.  For
instance, when parsing UTF-8, the iterator passed to _p_ may be `char8_t const
*`, but within the parse it will be a UTF-8 to UTF-32 transcoding (converting)
iterator.  Just like _omit_, _raw_ causes all attribute-generation work within
`p` to be skipped.

Similar to the re-use scenario for _omit_ above, _raw_ could be used to find
the *locations* of all non-overlapping matches of `p` in a string.

[heading _lexeme_]

`_lexeme_np_[p]` disables use of the skipper, if a skipper is being used,
within the parse of `p`.  This is useful, for instance, if you want to enable
skipping in most parts of your parser, but disable it only in one section
where it doesn't belong.  If you are skipping whitespace in most of your
parser, but want to parse strings that may contain spaces, you should use
_lexeme_:

    namespace bp = boost::parser;
    auto const string_parser = bp::lexeme['"' >> *(bp::char_ - '"') >> '"'];

Without _lexeme_, our string parser would correctly match `"foo bar"`, but the
generated attribute would be `"foobar"`.
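
The effect of skipping inside the quotes can be modeled without _Parser_.
The sketch below is a hand-written matcher for a quoted string; it is only an
illustration under stated assumptions.  `quoted_result` and `parse_quoted`
are hypothetical names, the skipper is assumed to skip only spaces, and the
result is stored in a fixed-capacity buffer so that the sketch can be
evaluated at compile time.

```cpp
#include <array>
#include <cstddef>
#include <string_view>

// Hypothetical result type: a match flag plus the characters collected
// between the quotes.
struct quoted_result
{
    bool matched = false;
    std::array<char, 64> chars{};
    std::size_t size = 0;

    constexpr std::string_view text() const { return {chars.data(), size}; }
};

// Models a quote, then any characters other than a quote, then a quote.
// When `skip_inside` is true, spaces are skipped before each character, as a
// space skipper would do; when it is false, every character is kept, which
// models the lexeme[] form.
constexpr quoted_result parse_quoted(std::string_view in, bool skip_inside)
{
    quoted_result r;
    std::size_t i = 0;
    if (i == in.size() || in[i] != '"')
        return r; // no opening quote
    ++i;
    for (;;) {
        while (skip_inside && i < in.size() && in[i] == ' ')
            ++i; // pre-skip, as applied before each subparser
        if (i == in.size())
            return r; // no closing quote
        if (in[i] == '"')
            break;
        if (r.size == r.chars.size())
            return r; // out of capacity in this sketch
        r.chars[r.size++] = in[i++];
    }
    r.matched = true;
    return r;
}
```

Both calls match the input `"foo bar"` (including its quotes), but only the
call with `skip_inside == false` keeps the space in the collected text.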

[heading _skip_]

_skip_ is like the inverse of _lexeme_.  It enables skipping in the parse,
even if it was not enabled before.  For example, within a call to _p_ that
uses a skipper, let's say we have these parsers in use:

    namespace bp = boost::parser;
    auto const one_or_more = +bp::char_;
    auto const skip_or_skip_not_there_is_no_try = bp::lexeme[bp::skip[one_or_more] >> one_or_more];

The use of _lexeme_ disables skipping, but then the use of _skip_ turns it
back on.  The net result is that the first occurrence of `one_or_more` will
use the skipper passed to _p_; the second will not.

_skip_ has another use.  You can parameterize skip with a different parser to
change the skipper just within the scope of the directive.  Let's say we
passed _space_ to _p_, and we're using these parsers somewhere within that
call:

    namespace bp = boost::parser;
    auto const zero_or_more = *bp::char_;
    auto const skip_both_ways = zero_or_more >> bp::skip(bp::ws)[zero_or_more];

The first occurrence of `zero_or_more` will use the skipper passed to _p_,
_space_; the second will use _ws_ as its skipper.

[endsect]

[section Combining Operations]

Certain overloaded operators are defined for all parsers in _Parser_.  We've
already seen some of them used in this tutorial, especially `operator>>()` and
`operator|()`, which are used to form sequence parsers and alternative
parsers, respectively.

Here are all the operators overloaded for parsers.  In the tables below:

* `c` is a character of type `char` or `char32_t`;

* `a` is a semantic action;

* `r` is an object whose type models `parsable_range_like` (see _concepts_);
  and

* `p`, `p1`, `p2`, ... are parsers.

[note Some of the expressions in this table consume no input.  All parsers
consume the input they match unless otherwise stated in the table below.]

[table Combining Operations and Their Semantics
    [[Expression] [Semantics] [Attribute Type] [Notes]]

    [[`!p`] [ Matches iff `p` does not match; consumes no input. ] [None.] []]
    [[`&p`] [ Matches iff `p` matches; consumes no input. ] [None.] []]
    [[`*p`] [ Parses using `p` repeatedly until `p` no longer matches; always matches. ] [`std::vector<_ATTR_np_(p)>`] []]
    [[`+p`] [ Parses using `p` repeatedly until `p` no longer matches; matches iff `p` matches at least once. ] [`std::vector<_ATTR_np_(p)>`] []]
    [[`-p`] [ Equivalent to `p | _e_`. ] [`std::optional<_ATTR_np_(p)>`] []]
    [[`p1 >> p2`] [ Matches iff `p1` matches, and then `p2` matches. ] [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2)>` (See note.)] [ `>>` is associative; `p1 >> p2 >> p3`, `(p1 >> p2) >> p3`, and `p1 >> (p2 >> p3)` are all equivalent.  This attribute type only applies to the case where `p1` and `p2` both generate attributes; see _attr_gen_ for the full rules. ]]
    [[`p >> c`] [ Equivalent to `p >> lit(c)`. ] [`_ATTR_np_(p)`] []]
    [[`p >> r`] [ Equivalent to `p >> lit(r)`. ] [`_ATTR_np_(p)`] []]
    [[`p1 > p2`] [ Matches iff `p1` matches, and then `p2` matches.  No back-tracking is allowed after `p1` matches; if `p1` matches but then `p2` does not, the top-level parse fails. ] [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2)>` (See note.)] [ `>` is associative; `p1 > p2 > p3`, `(p1 > p2) > p3`, and `p1 > (p2 > p3)` are all equivalent.  This attribute type only applies to the case where `p1` and `p2` both generate attributes; see _attr_gen_ for the full rules. ]]
    [[`p > c`] [ Equivalent to `p > lit(c)`. ] [`_ATTR_np_(p)`] []]
    [[`p > r`] [ Equivalent to `p > lit(r)`. ] [`_ATTR_np_(p)`] []]
    [[`p1 | p2`] [ Matches iff either `p1` matches or `p2` matches. ] [`std::variant<_ATTR_np_(p1), _ATTR_np_(p2)>` (See note.)] [ `|` is associative; `p1 | p2 | p3`, `(p1 | p2) | p3`, and `p1 | (p2 | p3)` are all equivalent.  This attribute type only applies to the case where `p1` and `p2` both generate attributes; see _attr_gen_ for the full rules. ]]
    [[`p | c`] [ Equivalent to `p | lit(c)`. ] [`_ATTR_np_(p)`] []]
    [[`p | r`] [ Equivalent to `p | lit(r)`. ] [`_ATTR_np_(p)`] []]
    [[`p1 - p2`] [ Equivalent to `!p2 >> p1`. ] [`_ATTR_np_(p1)`] []]
    [[`p - c`] [ Equivalent to `p - lit(c)`. ] [`_ATTR_np_(p)`] []]
    [[`p - r`] [ Equivalent to `p - lit(r)`. ] [`_ATTR_np_(p)`] []]
    [[`p1 % p2`] [ Equivalent to `p1 >> *(p2 >> p1)`. ] [`std::vector<_ATTR_np_(p1)>`] []]
    [[`p % c`] [ Equivalent to `p % lit(c)`. ] [`std::vector<_ATTR_np_(p)>`] []]
    [[`p % r`] [ Equivalent to `p % lit(r)`. ] [`std::vector<_ATTR_np_(p)>`] []]
    [[`p[a]`] [ Matches iff `p` matches.  If `p` matches, the semantic action `a` is executed. ] [None.] []]
]

There are a couple of special rules not captured in the table above:

First, the zero-or-more and one-or-more repetitions (`operator*()` and
`operator+()`, respectively) may collapse when combined.  For any parser `p`,
`++p` collapses to `+p`; `**p`, `*+p`, and `+*p` each collapse to just `*p`.

Second, using _e_ in an alternative parser as any alternative *except* the
last one is a common source of errors; _Parser_ disallows it.  This
is because, for any parser `p`, `_e_ | p` is equivalent to _e_, since _e_ always
matches.  This is not true for _e_ parameterized with a condition.  For any
condition `cond`, `_e_(cond)` is allowed to appear anywhere within an
alternative parser.

[endsect]

[section Attribute Generation]

So far, we've seen several different types of attributes that come from
different parsers, `int` for _i_, `hana::tuple<char, int>` for
`boost::parser::char_ >> boost::parser::int_`, etc.  Let's get into how this
works with a bit more rigor.

[note Some parsers have no attribute at all.  In the tables below, the type of
the attribute is listed as "None."  There is a non-`void` type that is
returned from each parser that lacks an attribute.  This keeps the logic
simple; having to handle the two cases _emdash_ `void` or non-`void` _emdash_
would make the library significantly more complicated.  The type of this
non-`void` attribute associated with these parsers is an implementation
detail.  The type comes from the `boost::parser::detail` namespace and is
pretty useless.  You should never see this type in practice.  Within semantic
actions, asking for the attribute of a non-attribute-producing parser (using
`_attr(ctx)`) will yield a value of the special type `boost::parser::none`.
When you call _p_ in a form that returns the parsed attribute and there is no
attribute, _p_ simply returns `bool`; this indicates the success or failure of
the parse.]

[heading Parser attributes]

This table summarizes the attributes generated for all _Parser_ parsers.  In
the table, _RES_ is a notional macro that expands to the resolution of a parse
argument or the evaluation of a parse predicate; and `x` and `y` represent
arbitrary objects.

[table Parsers and Their Attributes
    [[Parser]              [Attribute Type]              [Notes]]

    [[ _e_ ]               [ None. ]                     []]
    [[ _eol_ ]             [ None. ]                     []]
    [[ _eoi_ ]             [ None. ]                     []]
    [[ `_attr_np_(x)` ]    [ `decltype(_RES_np_(x))` ][]]
    [[ _ch_ ]              [ The code point type in Unicode parsing, or `char` in non-Unicode parsing; see below. ]
     [Includes all the `_p` _udls_ that take a single character, and all parsers in the `boost::parser::ascii` namespace.]]
    [[ _cp_ ]              [ `uint32_t` ]                []]
    [[ _cu_ ]              [ `char` ]                    []]
    [[ `_lit_np_(x)`]      [ None. ]
     [Includes all the `_l` _udls_.]]
    [[ `_str_np_(x)`]      [ `std::string` ]
     [Includes all the `_p` _udls_ that take a string.]]
    [[ _b_ ]               [ `bool` ]                    []]

    [[ _bin_ ]             [ `unsigned int` ]            []]
    [[ _oct_ ]             [ `unsigned int` ]            []]
    [[ _hex_ ]             [ `unsigned int` ]            []]
    [[ _us_ ]              [ `unsigned short` ]          []]
    [[ _ui_ ]              [ `unsigned int` ]            []]
    [[ _ul_ ]              [ `unsigned long` ]           []]
    [[ _ull_ ]             [ `unsigned long long` ]      []]

    [[ _s_ ]               [ `short` ]                   []]
    [[ _i_ ]               [ `int` ]                     []]
    [[ _l_ ]               [ `long` ]                    []]
    [[ _ll_ ]              [ `long long` ]               []]
    [[ _f_ ]               [ `float` ]                   []]
    [[ _d_ ]               [ `double` ]                  []]

    [[ _symbols_t_ ]       [ `T` ]                       []]
]

_ch_ is a bit odd, since its attribute type is polymorphic.  When you use _ch_
to parse text in the non-Unicode code path (i.e. a string of `char`), the
attribute is `char`.  When you use the exact same _ch_ to parse in the
Unicode-aware code path, all matching is code point based, and so the
attribute type is the type used to represent code points.  For typical uses,
that type is `uint32_t`.  All parsing of UTF-8 falls under this typical case.
The only time the code point type will be something different is if you call
_p_ with a code point sequence whose element type is something besides
`uint32_t`.  For example, when you parse plain `char`s, meaning that the
parsing is in the non-Unicode code path, the attribute of _ch_ is `char`:

    auto result = parse("some text", boost::parser::char_);
    static_assert(std::is_same_v<decltype(result), std::optional<char>>);

When you parse UTF-8, the matching is done on a code point basis, and the code
point type is `uint32_t`, so the attribute type is `uint32_t`:

    auto result = parse(boost::text::as_utf8("some text"), boost::parser::char_);
    static_assert(std::is_same_v<decltype(result), std::optional<uint32_t>>);

When you parse code points by explicitly giving a code point range to _p_, the
attribute type is whatever the input range's element type is:

    auto result = parse(U"some text", boost::parser::char_);
    static_assert(std::is_same_v<decltype(result), std::optional<char32_t>>);

[tip If you know or suspect that you will want to use the same parser in
Unicode and non-Unicode parsing modes, you can use _cp_ and/or _cu_ to enforce
a nonpolymorphic attribute type.]


[heading Combining operation attributes]

Combining operations of course affect the generation of attributes.  In the
tables below: `m` and `n` are parse arguments that resolve to integral values;
`pred` is a parse predicate; `arg0`, `arg1`, `arg2`, ... are parse arguments;
`a` is a semantic action; and `p`, `p1`, `p2`, ... are parsers that generate
attributes.

[table Combining Operations and Their Attributes
    [[Parser]                           [Attribute Type]]

    [[`!p`]                             [None.]]
    [[`&p`]                             [None.]]

    [[`*p`]                             [`std::vector<_ATTR_np_(p)>`]]
    [[`+p`]                             [`std::vector<_ATTR_np_(p)>`]]
    [[`+*p`]                            [`std::vector<_ATTR_np_(p)>`]]
    [[`*+p`]                            [`std::vector<_ATTR_np_(p)>`]]
    [[`-p`]                             [`std::optional<_ATTR_np_(p)>`]]

    [[`p1 >> p2`]                       [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2)>`]]
    [[`p1 > p2`]                        [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2)>`]]
    [[`p1 >> p2 >> p3`]                 [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2), _ATTR_np_(p3)>`]]
    [[`p1 > p2 >> p3`]                  [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2), _ATTR_np_(p3)>`]]
    [[`p1 >> p2 > p3`]                  [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2), _ATTR_np_(p3)>`]]
    [[`p1 > p2 > p3`]                   [`hana::tuple<_ATTR_np_(p1), _ATTR_np_(p2), _ATTR_np_(p3)>`]]

    [[`p1 | p2`]                        [`std::variant<_ATTR_np_(p1), _ATTR_np_(p2)>`]]
    [[`p1 | p2 | p3`]                   [`std::variant<_ATTR_np_(p1), _ATTR_np_(p2), _ATTR_np_(p3)>`]]

    [[`p1 % p2`]                        [`std::vector<_ATTR_np_(p1)>`]]

    [[`p[a]`]                           [None.]]

    [[`_rpt_np_(arg0)[p]`]              [`std::vector<_ATTR_np_(p)>`]]
    [[`_rpt_np_(arg0, arg1)[p]`]        [`std::vector<_ATTR_np_(p)>`]]
    [[`_if_np_(pred)[p]`]               [`std::optional<_ATTR_np_(p)>`]]
    [[`_sw_np_(arg0)(arg1, p1)(arg2, p2)...`]
     [`std::variant<_ATTR_np_(p1), _ATTR_np_(p2), ...>`]]
]

There are a relatively small number of rules that define how the attributes of
sequence parsers and alternative parsers are generated.  (Don't worry, there
are examples below.)

[heading Sequence parser attribute rules]

The attribute generation behavior of sequence parsers is conceptually pretty
simple:

* the attributes of subparsers form a tuple of values;

* subparsers that do not generate attributes do not contribute to the
  sequence's attribute;

* subparsers that do generate attributes usually contribute an individual
  element to the tuple result; except

* when containers of the same element type are next to each other, or
  individual elements are next to containers of their type, the two adjacent
  attributes collapse into one attribute; and

* if the result of all that is a degenerate tuple `hana::tuple<T>` (even if
  `T` is a type that means "no attribute"), the attribute becomes `T`.

More formally, the attribute generation algorithm works like this.  For a
sequence parser `p`, let the list of attribute types for the subparsers of `p`
be `{a0, a1, a2, ..., an}`.

We get the attribute of `p` by evaluating a compile-time left fold operation,
`left-fold({a1, a2, ..., an}, a0, OP)`.  `OP` is the combining operation that
takes the current attribute type (initially `a0`) and the next attribute type,
and returns the new current attribute type.  The current attribute type at the
end is the attribute type for `p`.

`OP` attempts to apply a series of rules, one at a time.  The rules are noted
as `A >> B -> C`, where `A` is the current attribute type, `B` is the next
attribute type, and `C` is the new current attribute type.  In these rules,
`C<T>` is a container of `T`; `none` is a special type
that indicates that there is no attribute; `T` is a type; and `Ts...` is a
parameter pack of one or more types.  Note that `T` may be the special type
`none`.

* `T >> none -> T`
* `C<T> >> C<T> -> C<T>`
* `T >> T -> vector<T>`
* `C<T> >> T -> C<T>`
* `C<T> >> optional<T> -> C<T>`
* `T >> C<T> -> C<T>`
* `optional<T> >> C<T> -> C<T>`
* `hana::tuple<none> >> T -> hana::tuple<T>`
* `hana::tuple<Ts...> >> T -> hana::tuple<Ts..., T>`

Again, if the result is that the attribute is `hana::tuple<T>`, the attribute
becomes `T`.

[note What constitutes a container in the rules above is determined by the
`container` concept:
    [container_concept]
]
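
The fold described above can be sketched as a small compile-time
metaprogram.  This is a model under stated assumptions, not the library's
implementation: `std::tuple` stands in for `hana::tuple`; `none`, `op`,
`step`, `fold`, and `sequence_attribute_t` are hypothetical names; only
`std::vector` and `std::basic_string` are treated as containers; and the
rules are assumed to apply to the last element of the current tuple.

```cpp
#include <optional>
#include <string>
#include <tuple>
#include <type_traits>
#include <vector>

struct none {}; // stand-in for "no attribute"
template <typename T> struct type_c { using type = T; };

// Assumption: only std::vector and std::basic_string count as containers.
struct no_element {};
template <typename C> struct element_of { using type = no_element; };
template <typename T, typename A>
struct element_of<std::vector<T, A>> { using type = T; };
template <typename Ch, typename Tr, typename A>
struct element_of<std::basic_string<Ch, Tr, A>> { using type = Ch; };

struct no_value {};
template <typename O> struct optional_value { using type = no_value; };
template <typename T>
struct optional_value<std::optional<T>> { using type = T; };

template <typename C, typename T>
inline constexpr bool is_container_of =
    std::is_same_v<typename element_of<C>::type, T>;

// OP for the last element A of the current tuple and the next type B.  The
// result names the tuple of types that replaces A.
template <typename A, typename B>
constexpr auto op()
{
    using opt_a = typename optional_value<A>::type;
    using opt_b = typename optional_value<B>::type;
    constexpr bool a_is_container =
        !std::is_same_v<typename element_of<A>::type, no_element>;
    if constexpr (std::is_same_v<B, none>)
        return type_c<std::tuple<A>>{};              // T >> none -> T
    else if constexpr (std::is_same_v<A, none>)
        return type_c<std::tuple<B>>{};              // tuple<none> >> T
    else if constexpr (std::is_same_v<A, B> && a_is_container)
        return type_c<std::tuple<A>>{};              // C<T> >> C<T> -> C<T>
    else if constexpr (std::is_same_v<A, B>)
        return type_c<std::tuple<std::vector<A>>>{}; // T >> T -> vector<T>
    else if constexpr (is_container_of<A, B> || is_container_of<A, opt_b>)
        return type_c<std::tuple<A>>{};              // C<T> >> T, optional<T>
    else if constexpr (is_container_of<B, A> || is_container_of<B, opt_a>)
        return type_c<std::tuple<B>>{};              // T, optional<T> >> C<T>
    else
        return type_c<std::tuple<A, B>>{};           // otherwise, append B
}

template <typename T, typename Tuple> struct prepend;
template <typename T, typename... Ts>
struct prepend<T, std::tuple<Ts...>> { using type = std::tuple<T, Ts...>; };

// One fold step: apply op() to the last element of State and to Next.
template <typename State, typename Next> struct step;
template <typename A, typename Next>
struct step<std::tuple<A>, Next>
{
    using type = typename decltype(op<A, Next>())::type;
};
template <typename A0, typename A1, typename... Rest, typename Next>
struct step<std::tuple<A0, A1, Rest...>, Next>
{
    using type = typename prepend<
        A0, typename step<std::tuple<A1, Rest...>, Next>::type>::type;
};

// Left fold over the remaining attribute types.
template <typename State, typename... Next>
struct fold { using type = State; };
template <typename State, typename N0, typename... Ns>
struct fold<State, N0, Ns...> : fold<typename step<State, N0>::type, Ns...> {};

// A one-element tuple collapses to its element.
template <typename T> struct unwrap { using type = T; };
template <typename T> struct unwrap<std::tuple<T>> { using type = T; };

template <typename A0, typename... As>
using sequence_attribute_t =
    typename unwrap<typename fold<std::tuple<A0>, As...>::type>::type;
```

For instance, `sequence_attribute_t<char, std::string>` is `std::string`, and
`sequence_attribute_t<std::vector<char>, std::string>` is
`std::tuple<std::vector<char>, std::string>`, because the two containers have
different types and so do not collapse.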

[heading Alternative parser attribute rules]

The rules for alternative parsers are much simpler.  For an alternative parser
`p`, let the list of attribute types for the subparsers of `p` be `{a0, a1,
a2, ..., an}`.  The attribute of `p` is `std::variant<a0, a1, a2, ..., an>`,
with these exceptions:

* all the `none` attributes are left out, but if any were taken out, the
  attribute becomes a `std::optional`;

* if the result is `std::variant<T>`, the result becomes `T` instead; and

* if the result is `std::variant<>`, the result becomes `none` instead.
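
These exceptions can be sketched as a compile-time computation.  Again, this
is a model rather than the library's implementation: `none` and
`alternative_attribute_t` are hypothetical names.  Two points are assumptions
inferred from the examples table later in this section, not from the rules
above: repeated attribute types are merged (the `p | p` row), and a result of
`none` is not wrapped in `std::optional` (the `!p1 | p2[a]` row).

```cpp
#include <optional>
#include <type_traits>
#include <variant>

struct none {}; // stand-in for "no attribute"

// Appends T to the variant's type list unless T is none or already present.
template <typename Variant, typename T> struct append_unique;
template <typename... Ts, typename T>
struct append_unique<std::variant<Ts...>, T>
{
    using type = std::conditional_t<
        std::is_same_v<T, none> || (std::is_same_v<Ts, T> || ...),
        std::variant<Ts...>,
        std::variant<Ts..., T>>;
};

// Folds append_unique over all of the subparser attribute types.
template <typename Acc, typename... As>
struct collect { using type = Acc; };
template <typename Acc, typename A0, typename... As>
struct collect<Acc, A0, As...>
    : collect<typename append_unique<Acc, A0>::type, As...>
{};

// variant<> becomes none, and variant<T> becomes T.
template <typename V> struct collapse { using type = V; };
template <> struct collapse<std::variant<>> { using type = none; };
template <typename T> struct collapse<std::variant<T>> { using type = T; };

template <typename... As>
struct alternative_attribute
{
    using collapsed = typename collapse<
        typename collect<std::variant<>, As...>::type>::type;
    static constexpr bool any_none = (std::is_same_v<As, none> || ...);
    // Wrap in optional only if a none was dropped and something remains.
    using type = std::conditional_t<
        any_none && !std::is_same_v<collapsed, none>,
        std::optional<collapsed>,
        collapsed>;
};

template <typename... As>
using alternative_attribute_t = typename alternative_attribute<As...>::type;
```

For example, `alternative_attribute_t<int, double, none>` is
`std::optional<std::variant<int, double>>`, and
`alternative_attribute_t<none, none>` is `none`.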

[heading Formation of containers in attributes]

There are no special rules for forming containers from non-containers.  For
instance, one of the rules above for sequence parsers is `T >> T ->
vector<T>`.  So, you get a vector if you have multiple values in sequence.
Another rule is that the attribute of `*p` is `std::vector<_ATTR_np_(p)>`.  The
point is, _Parser_ will generate your favorite container out of sequences and
repetitions, as long as your favorite container is `std::vector`.

Another rule for sequence parsers is that a value `x` and a container `c`
containing elements of `x`'s type will form a single container.  However,
`x`'s type must be exactly the same as the elements in `c`.  So, the attribute
of `char_ >> string("str")` is odd.  In the non-Unicode code path, `char_`'s
attribute type is guaranteed to be `char`, so `_ATTR_np_(char_ >> string("str"))`
is `std::string`.  If you are parsing UTF-8 in the Unicode code path,
`char_`'s attribute type is `uint32_t`, and `_ATTR_np_(char_ >> string("str"))` is
therefore `hana::tuple<uint32_t, std::string>`.

Again, there are no special rules here.

[heading Examples of attributes generated by sequence and alternative parsers]

In the table: `a` is a semantic action; and `p`, `p1`, `p2`, ... are parsers
that generate attributes.  Note that only `>>` is used here.  `>` has the
exact same attribute generation rules.

[table Sequence and Alternative Combining Operations and Their Attributes
    [[Expression]                    [Attribute Type]]

    [[`_e_ >> _e_`]                  [None.]]
    [[`p >> _e_`]                    [`_ATTR_np_(p)`]]
    [[`_e_ >> p`]                    [`_ATTR_np_(p)`]]

    [[`_cu_ >> _str_np_("str")`]     [`std::string`]]
    [[_str_np_`("str") >> `_cu_]     [`std::string`]]
    [[`*_cu_ >> _str_np_("str")`]    [`hana::tuple<std::vector<char>, std::string>`]]
    [[`_str_np_("str") >> *_cu_`]    [`hana::tuple<std::string, std::vector<char>>`]]

    [[`p >> p`]                      [`std::vector<_ATTR_np_(p)>`]]
    [[`*p >> p`]                     [`std::vector<_ATTR_np_(p)>`]]
    [[`p >> *p`]                     [`std::vector<_ATTR_np_(p)>`]]
    [[`*p >> -p`]                    [`std::vector<_ATTR_np_(p)>`]]
    [[`-p >> *p`]                    [`std::vector<_ATTR_np_(p)>`]]

    [[`_str_np_("str") >> _cu_`]     [`std::string`]]
    [[`_cu_ >> _str_np_("str")`]     [`std::string`]]
    [[`_str_np_("str") >> -_cu_`]    [`std::string`]]
    [[`-_cu_ >> _str_np_("str")`]    [`std::string`]]

    [[`!p1 | p2[a]`]                 [None.]]
    [[`p | p`]                       [`_ATTR_np_(p)`]]
    [[`p1 | p2`]                     [`std::variant<_ATTR_np_(p1), _ATTR_np_(p2)>`]]
    [[`p | `_e_]                     [`std::optional<_ATTR_np_(p)>`]]
    [[`p1 | p2 | _e_`]               [`std::optional<std::variant<_ATTR_np_(p1), _ATTR_np_(p2)>>`]]
    [[`p1 | p2[a] | p3`]             [`std::optional<std::variant<_ATTR_np_(p1), _ATTR_np_(p3)>>`]]
]


[heading Directives that affect attribute generation]

`_omit_np_[p]` disables attribute generation for the parser `p`.
`_raw_np_[p]` changes the attribute from `_ATTR_np_(p)` to a view that
indicates the subrange of the input that was matched by `p`.  See _directives_
for details.

[endsect]

[section The `parse()` API]

There are multiple overloads of _p_.  These overloads have some things in
common:

* They each return a value contextually convertible to `bool`.

* They each take at least a range to parse and a parser.  The "range to parse"
  may be an iterator/sentinel pair or a single range-like object.

* They each require forward iterability of the input.

* They each accept any input range with an integral element type.  This means
  that they can each parse ranges of `char`, `char8_t`, `uint16_t`, `int`,
  etc.

* When you call any of the iterator/sentinel pair overloads of _p_, for
  example `_p_np_(first, last, p, _ws_)`, it parses the range `[first, last)`,
  advancing `first` as it goes.  If the parse succeeds, the entire input may or
  may not have been matched.  The value of `first` will indicate the last
  location within the input that `p` matched.  The *whole* input was matched if
  and only if `first == last`.

* When you call any of the range-like overloads of _p_, for example `_p_np_(r,
  p, _ws_)`, _p_ only indicates success if *all* of `r` was matched by `p`.

[heading The overloads]

There are eight overloads of _p_, because there are three either/or options in
how you call it.

[heading Iterator/sentinel versus range-like]

You can call _p_ with an iterator and sentinel that delimit a range of
integral values.  For example:

    namespace bp = boost::parser;
    auto const p = /* some parser ... */;

    char const * str_1 = /* ... */;
    // Using null_sentinel, str_1 can point to three billion characters, and
    // we can call parse() without having to find the end of the string first.
    auto result_1 = bp::parse(str_1, boost::text::null_sentinel, p, bp::ws);

    char str_2[] = /* ... */;
    auto result_2 = bp::parse(std::begin(str_2), std::end(str_2), p, bp::ws);

The iterator/sentinel overloads can parse successfully without matching the
entire input.  You can tell if the entire input was matched by checking if
`first == last` is true after _p_ returns.
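
To make the prefix-matching behavior concrete, here is a minimal sketch that
uses only the standard library.  `parse_digits()` is a hypothetical stand-in
for a parser, not part of _Parser_; it only mimics how the iterator/sentinel
overloads advance `first` and report success even when input remains.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical stand-in for a parser: consumes one or more ASCII digits.
// Like the iterator/sentinel overloads, it advances `first` past whatever it
// matched, and reports success even if some input is left over.
template<typename I, typename S>
bool parse_digits(I & first, S last)
{
    I const start = first;
    while (first != last && std::isdigit(static_cast<unsigned char>(*first)))
        ++first;
    return first != start;
}

inline void prefix_match_example()
{
    std::string const input = "123abc";
    auto first = input.begin();
    bool const success = parse_digits(first, input.end());
    assert(success);              // "123" was matched ...
    assert(first != input.end()); // ... but "abc" was not consumed.
}
```

The caller, not the parse function, decides whether leftover input is an
error, by comparing `first` to `last` afterwards.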

You can also call _p_ with a range of integral values.  When the range is a
reference to an array of characters, any terminating `0` is ignored; this
allows calls like `_p_np_("str", p)` to work naturally.

    namespace bp = boost::parser;
    auto const p = /* some parser ... */;

    std::u8string str_1 = u8"str";
    auto result_1 = bp::parse(str_1, p, bp::ws);

    // The null terminator is ignored.  This call parses s-t-r, not s-t-r-0.
    auto result_2 = bp::parse(U"str", p, bp::ws);

    char const * str_3 = "str";
    auto result_3 = bp::parse(boost::text::as_utf16(str_3), p, bp::ws);

You can also call _p_ with a pointer to a null-terminated string of integral
values.  _p_ considers pointers to null-terminated strings to be ranges,
since, for any pointer `T *` to a null-terminated string, `T *` is isomorphic
with `view<T *, boost::text::null_sentinel>`.

    namespace bp = boost::parser;
    auto const p = /* some parser ... */;

    char const * str_1 = /* ... */ ;
    auto result_1 = bp::parse(str_1, p, bp::ws);
    char8_t const * str_2 = /* ... */ ;
    auto result_2 = bp::parse(str_2, p, bp::ws);
    char16_t const * str_3 = /* ... */ ;
    auto result_3 = bp::parse(str_3, p, bp::ws);
    char32_t const * str_4 = /* ... */ ;
    auto result_4 = bp::parse(str_4, p, bp::ws);

    int const array[] = { 's', 't', 'r', 0 };
    int const * array_ptr = array;
    auto result_5 = bp::parse(array_ptr, p, bp::ws);

Since there is no way to indicate that `p` matches the input, but only a
prefix of the input was matched, the range-like (non-iterator/sentinel)
overloads of _p_ indicate failure if the entire input is not matched.
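
The following sketch illustrates both ideas from this section: a pointer to a
null-terminated string treated as a range, and a range-like entry point that
reports failure on a partial match.  `null_sentinel_t`, `match_as()`, and
`parse_all_as()` are hypothetical names used only for illustration; the real
sentinel is `boost::text::null_sentinel`, whose definition may differ.

```cpp
#include <cassert>

// Hypothetical sentinel: a pointer "reaches" it when it points at a zero
// element, so the end of the string never has to be found up front.
struct null_sentinel_t
{
    template<typename T>
    friend bool operator==(T const * it, null_sentinel_t) { return *it == T(0); }
    template<typename T>
    friend bool operator!=(T const * it, null_sentinel_t) { return *it != T(0); }
};

// Prefix-matching step: consumes leading 'a' elements, advancing `first`.
template<typename T>
void match_as(T const *& first, null_sentinel_t last)
{
    while (first != last && *first == T('a'))
        ++first;
}

// Range-like entry point: the only result is a bool, so a match that stops
// before the end of the input has to be reported as failure.
template<typename T>
bool parse_all_as(T const * str)
{
    T const * first = str;
    match_as(first, null_sentinel_t{});
    return first != str && first == null_sentinel_t{};
}

inline void null_terminated_example()
{
    int const array[] = {'a', 'a', 0};
    assert(parse_all_as(array));  // Any integral element type works.
    assert(!parse_all_as("aab")); // Only the prefix "aa" matches.
}
```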

[heading With or without an attribute out-parameter]

    namespace bp = boost::parser;
    auto const p = '"' >> *(bp::char_ - '"') >> '"';
    char const * str = "\"two words\"" ;

    std::string result_1;
    bool const success = bp::parse(str, p, result_1);   // success is true; result_1 is "two words"
    auto result_2 = bp::parse(str, p);                  // !!result_2 is true; *result_2 is "two words"

When you call _p_ *with* an attribute out-parameter and parser `p`, the
expected type is *something like* `_ATTR_np_(p)`.  It doesn't have to be
exactly that; I'll explain in a bit.  The return type is `bool`.

When you call _p_ *without* an attribute out-parameter and parser `p`, the
return type is `std::optional<_ATTR_np_(p)>`.  Note that when `_ATTR_np_(p)`
is itself an `optional`, the return type is
`std::optional<std::optional<...>>`.  Each of those optionals tells you
something different.  The outer one tells you whether the parse succeeded.  If
so, the parser was successful, but it still generates an attribute that is an
`optional` _emdash_ that's the inner one.
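
The two levels of `optional` can be sketched without _Parser_ at all.  In
this standard-library-only example, `parse_optional_sign()` is a hypothetical
function that matches an optional sign followed by at least one more
character; the outer `optional` reports whether the parse succeeded, and the
inner one is the attribute.

```cpp
#include <cassert>
#include <optional>
#include <string_view>
#include <utility>

// Hypothetical stand-in for a parse whose attribute is itself an optional.
inline std::optional<std::optional<char>>
parse_optional_sign(std::string_view in)
{
    std::optional<char> sign; // The attribute; empty if no sign was present.
    if (!in.empty() && (in.front() == '+' || in.front() == '-')) {
        sign = in.front();
        in.remove_prefix(1);
    }
    if (in.empty())
        return std::nullopt; // Parse failure: outer optional is empty.
    // Parse success: outer optional is engaged, whether or not `sign` is.
    return std::optional<std::optional<char>>(std::in_place, sign);
}

inline void nested_optional_example()
{
    auto const with_sign = parse_optional_sign("-7");
    assert(with_sign && *with_sign && **with_sign == '-');

    auto const without_sign = parse_optional_sign("7");
    assert(without_sign && !*without_sign); // Succeeded, but no sign matched.

    assert(!parse_optional_sign("-"));      // Failed outright.
}
```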

[heading With or without a skipper]

    namespace bp = boost::parser;
    auto const p = '"' >> *(bp::char_ - '"') >> '"';
    char const * str = "\"two words\"" ;

    auto result_1 = bp::parse(str, p);         // !!result_1 is true; *result_1 is "two words"
    auto result_2 = bp::parse(str, p, bp::ws); // !!result_2 is true; *result_2 is "twowords"
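
The second result loses its space because the skipper runs before each
character-level match, so the whitespace inside the quotes is skipped rather
than matched.  Here is a rough standard-library-only sketch of that effect.
`parse_quoted()` is hypothetical, and the exact points at which _Parser_
applies the skipper are an assumption here.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical stand-in for a quoted-string parser.  When `skip_ws` is true,
// whitespace is skipped before every character-level match.
inline std::optional<std::string>
parse_quoted(std::string_view in, bool skip_ws)
{
    std::size_t i = 0;
    auto skip = [&] {
        if (!skip_ws)
            return;
        while (i < in.size() &&
               std::isspace(static_cast<unsigned char>(in[i])))
            ++i;
    };

    skip();
    if (i == in.size() || in[i] != '"')
        return std::nullopt; // No opening quote.
    ++i;

    std::string out;
    for (;;) {
        skip();
        if (i == in.size())
            return std::nullopt; // No closing quote.
        if (in[i] == '"')
            break;
        out.push_back(in[i++]);
    }
    ++i; // Consume the closing quote.
    if (i != in.size())
        return std::nullopt; // Range-like semantics: all input must match.
    return out;
}

inline void skipper_example()
{
    assert(parse_quoted("\"two words\"", false) == "two words");
    assert(parse_quoted("\"two words\"", true) == "twowords");
}
```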

[heading Compatibility of attribute out-parameters]

For any call to _p_ that takes an attribute out-parameter, like `_p_np_("str",
p, bp::ws, out)`, the call is well-formed for a number of possible types of
`out`; `decltype(out)` does not need to be exactly `_ATTR_np_(p)`.

For instance, this is valid code that does not abort (remember that the
attribute type of _str_ is `std::string`):

    namespace bp = boost::parser;
    auto const p = bp::string("foo");

    std::vector<char> result;
    bool const success = bp::parse("foo", p, result);
    assert(success && result == std::vector<char>({'f', 'o', 'o'}));

Even though `p` generates a `std::string` attribute, when it actually takes
the data it generates and writes it into an attribute, it only assumes that
the attribute is a `container` (see _concepts_), not that it is some
particular container type.  It will happily `insert()` into a `std::string` or
a `std::vector<char>` all the same.  `std::string` and `std::vector<char>` are
both containers of `char`, but it will also insert into a container with a
different element type.  `p` just needs to be able to insert the elements it
produces into the attribute-container.  As long as an implicit conversion
allows that to work, everything is fine:

    namespace bp = boost::parser;
    auto const p = bp::string("foo");

    std::vector<int> result;
    bool const success = bp::parse("foo", p, result);
    assert(success && result == std::vector<int>({'f', 'o', 'o'}));
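
The element-by-element insertion described above can be sketched with a small
generic helper.  `write_attribute()` is a hypothetical name, not a _Parser_
function; it only shows why any container whose element type accepts an
implicit conversion from the generated elements will do.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: writes a generated sequence into whatever container
// the caller supplied, one element at a time.  It only requires that each
// generated element be implicitly convertible to the container's value_type.
template<typename Container, typename Generated>
void write_attribute(Container & out, Generated const & generated)
{
    for (auto const & element : generated)
        out.insert(out.end(), element);
}

inline void container_swap_example()
{
    std::string const generated = "foo"; // What a string parser would produce.

    std::vector<char> chars;
    write_attribute(chars, generated);   // Same element type.
    assert(chars == std::vector<char>({'f', 'o', 'o'}));

    std::vector<int> ints;
    write_attribute(ints, generated);    // char converts implicitly to int.
    assert(ints == std::vector<int>({'f', 'o', 'o'}));
}
```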

This works, too, even though it requires inserting elements from a generated
sequence of `uint32_t` into a container of `char` (remember that the attribute
type of `+_cp_` is `std::vector<uint32_t>`):

    namespace bp = boost::parser;
    auto const p = +bp::cp;

    std::string result;
    bool const success = bp::parse("foo", p, result);
    assert(success && result == "foo");

This next example works as well, even though the change to a container is not
at the top level.  It is an element of the result tuple:

    namespace bp = boost::parser;
    // p matches one or more non-spaces, followed by a single space, followed by one or more repetitions of "foo".
    auto const p = +(bp::cp - ' ') >> ' ' >> +bp::string("foo");

    // attr_type is the attribute type generated by p.
    using attr_type = decltype(bp::parse(u8"", p));
    static_assert(
        std::is_same_v<
            attr_type,
            std::optional<
                boost::hana::tuple<std::vector<uint32_t>, std::string>>>);

    // This is similar to attr_type, with the std::vector<uint32_t> changed to a std::string.
    boost::hana::tuple<std::string, std::string> result;
    bool const success = bp::parse(u8"rôle foofoo", p, result);
    using namespace boost::hana::literals;

    assert(success);                              // p matches.
    assert(result[0_c].size() == 5u);             // The 4 code points "rôle" get transcoded to 5 UTF-8 code units to fit in the std::string.
    assert(result[0_c] == (char const *)u8"rôle");
    assert(result[1_c] == "foofoo");

As indicated in the inline comments, there are a couple of things to take away
from this example:

* If you change a container (such as `std::string` to `std::vector<int>`, or
  `std::vector<uint32_t>` to `std::deque<int>`), the call to _p_ will often
  still be well-formed.

* When changing out a container type, if both containers contain integral
  values, and the removed container's element type is 4 bytes in size, and the
  new container's element type is 1 byte in size, _Parser_ assumes that this
  is a UTF-32-to-UTF-8 conversion, and silently transcodes the data when
  inserting into the new container.

[caution The detection of the need to transcode from UTF-32 to UTF-8 applies to *all* integral values.  If you call _p_ with this parser:

    auto const p = +boost::parser::uint_;

using a `std::string` as an out-parameter, it will happily transcode your
unsigned ints to UTF-8.  This is almost certainly not what you want.  Don't
worry, though; this kind of case comes up pretty rarely, but wanting to parse
in Unicode mode and catch results in UTF-8 strings comes up all the time.]
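
To illustrate what the silent transcoding does, here is a minimal UTF-32 to
UTF-8 encoder that uses only the standard library.  It is a sketch of the
general technique, not the code _Parser_ uses; `append_utf8()` and
`transcode()` are hypothetical names, and invalid code points are not handled.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Appends the UTF-8 encoding of one code point to `out`.  Assumes `cp` is a
// valid Unicode scalar value; there is no error handling, for brevity.
inline void append_utf8(std::string & out, std::uint32_t cp)
{
    if (cp < 0x80) {
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Treats every 4-byte value as a code point, whether or not it really is one.
inline std::string transcode(std::vector<std::uint32_t> const & values)
{
    std::string out;
    for (auto const value : values)
        append_utf8(out, value);
    return out;
}

inline void transcoding_example()
{
    // The code points of "rôle": 4 code points become 5 UTF-8 code units.
    assert(transcode({'r', 0xF4, 'l', 'e'}) == "r\xC3\xB4le");

    // Two parsed unsigned ints, 65 and 300, are treated the same way: the
    // number 300 turns into the two bytes 0xC4 0xAC.
    assert(transcode({65, 300}) == "A\xC4\xAC");
}
```

The second assertion is the situation the caution describes: the numbers are
not text, but they are encoded as if they were.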

Let's look at a case where another simple-seeming type replacement does *not* work:

    namespace bp = boost::parser;
    auto const p = +(bp::int_ >> +bp::cp);

    using attr_type = decltype(bp::parse(u8"", p));
    static_assert(std::is_same_v<
                  attr_type,
                  std::optional<std::vector<
                      boost::hana::tuple<int, std::vector<uint32_t>>>>>);

    std::vector<boost::hana::tuple<int, std::string>> result;
    #if 0
    bool const success = bp::parse(u8"42 rôle", p, bp::ws, result); // ill-formed!
    #endif

In this case, removing a `std::vector<uint32_t>` and putting a `std::string`
in its place makes the code ill-formed, even though we saw a similar
replacement earlier.  The reason this one does not work is that the replaced
container is part of the element type of yet another container.  At some point
in the code, `p` would try to insert a `boost::hana::tuple<int,
std::vector<uint32_t>>` _emdash_ the element type of the attribute type it
normally generates _emdash_ into a vector of `boost::hana::tuple<int,
std::string>`s.  There's no implicit conversion there, so the code is
ill-formed.
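
The missing conversion can be checked directly with type traits.  This sketch
uses `std::tuple` as a stand-in for `boost::hana::tuple`, which is an
assumption made to keep the example standard-library-only; the alias names are
hypothetical.

```cpp
#include <cstdint>
#include <string>
#include <tuple>
#include <type_traits>
#include <vector>

// Element type that the parser generates, and the one the caller asked for.
using generated_element = std::tuple<int, std::vector<std::uint32_t>>;
using requested_element = std::tuple<int, std::string>;

// Inserting element by element works, because the elements convert ...
static_assert(std::is_convertible_v<std::uint32_t, char>);

// ... but there is no conversion from one whole container to the other,
static_assert(
    !std::is_convertible_v<std::vector<std::uint32_t>, std::string>);

// so a tuple holding one cannot be converted to a tuple holding the other.
static_assert(!std::is_convertible_v<generated_element, requested_element>);
```

Inserting into the outer vector needs the last conversion, which is why the
swap fails once the container is nested inside another container's element
type.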

The take-away for this last example is that the ability to arbitrarily swap
out data types within the type of the attribute you pass to _p_ is very
flexible, but is also limited to structurally simple cases.  When we discuss
rules in the next section, we'll see how this flexibility in the types of
attributes can help when writing complicated parsers.

[note Those were all examples of swapping out one container type for another.
They make good examples because that is more likely to be surprising, and so
it's getting lots of coverage here.  You can also do much simpler things, like
parsing with a _ui_ and writing its attribute into a `double`.  In general,
you can swap any type `T` out of the attribute, as long as `T` is not part of
the element type for some container within the attribute. ]

[heading Unicode versus non-Unicode parsing]

A call to _p_ either considers the entire input to be in a UTF format (UTF-8,
UTF-16, or UTF-32), or it considers the entire input to be in some unknown
encoding.  Here is how it deduces which case the call falls under:

* If the input range is a sequence of `char8_t`, or if the input is a
  `boost::text::utf8_view`, the input is UTF-8.

* Otherwise, if the input is a sequence of 1-byte integral values, the input
  is in an unknown encoding.

* Otherwise, the input is in a UTF encoding.
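
The rules above can be approximated as a compile-time function of the input's
element type.  This is only a sketch: `encoding` and `deduce_encoding()` are
hypothetical names, and the `boost::text::utf8_view` special case is not
modeled.

```cpp
#include <cassert>
#include <type_traits>

enum class encoding { utf, unknown };

// Sketch of the deduction rules, based on the element type alone.
template<typename T>
constexpr encoding deduce_encoding()
{
    static_assert(std::is_integral_v<T>, "input elements must be integral");
#if defined(__cpp_char8_t)
    if constexpr (std::is_same_v<T, char8_t>)
        return encoding::utf;     // char8_t means UTF-8.
    else
#endif
    if constexpr (sizeof(T) == 1)
        return encoding::unknown; // Any other 1-byte type: unknown encoding.
    else
        return encoding::utf;     // Wider element types: a UTF encoding.
}

static_assert(deduce_encoding<char>() == encoding::unknown);
static_assert(deduce_encoding<char16_t>() == encoding::utf);
static_assert(deduce_encoding<char32_t>() == encoding::utf);
```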

[tip If you want to parse in ASCII-only mode, or in some unknown encoding, use
only sequences of `char`, like `std::string` or `char const *`.]

[tip If you want to ensure all input is parsed as Unicode, pass the input
range `r` as `boost::text::as_utf32(r)` _emdash_ that's the first thing that
happens to it inside _p_ in the Unicode parsing path anyway.]

[note Since passing `boost::text::utf8_view` is a special case, and since a
sequence of `char` is otherwise considered an unknown encoding,
`boost::parser::parse(boost::text::as_utf8(r), p)` treats `r` as UTF-8, whereas
`boost::parser::parse(r.begin(), r.end(), p)` does not.]

[heading The `trace_mode` parameter to _p_]

Debugging parsers is notoriously difficult once they reach a certain size.  To
get a verbose trace of your parse, pass `boost::parser::trace::on` as the final
parameter to _p_.  It will show you the current parser being matched, the
front of the input, and any attributes generated.  If an attribute appears
which it cannot print using stream insertion, it prints
`"<<unprintable-value>>"`.

TODO: `with_globals()`, `with_error_handler()`

[endsect]

[section Rules]

TODO

Getting at one of a rule's arguments and passing it as an argument to another
parser can be very verbose.  __p_ is a variable template that allows you to
refer to the `n`th argument to the current rule, so that you can, in turn,
pass it to one of the rule's subparsers:

    auto const indent_n_def = boost::parser::repeat(boost::parser::_p<0>)[' '_l];

Using __p_ can prevent you from having to write a bunch of lambdas that each
get an argument out of the parse context using `_params_np_(ctx)[0_c]` or
similar.

[endsect]

[section Unicode Support]

TODO

TODO: Unicode in symbol tables

[endsect]

[section Callback Parsing]

TODO

[endsect]

[section Best Practices]

TODO: Parse Unicode from the start.

TODO: Write rules, and test them in isolation.

TODO: Compile separately when you know the type of your input will not change.

[endsect]

[section Writing Your Own Parser]

TODO

[endsect]

[endsect]