A directive is an element of your parser that doesn't have any meaning by
itself. Some are second-order parsers that need a first-order parser to do
the actual parsing. Others influence the parse in some way. Lexically, you
can spot a directive by its use of []. Non-directives never use [], and
directives always do.
The directives that are second-order parsers are technically directives,
but since they are primarily used to create parsers, it is more useful to
treat them as parsers. The directives repeat() and if_() were already
described in the section on parsers; we won't say more about them here.
That leaves the directives that affect aspects of the parse:
omit[p]
disables attribute generation for the parser p.
Not only does omit[p]
have no attribute, but any attribute generation work that normally happens
within p is skipped.
This directive can be useful in cases like this: say you have some fairly
complicated parser p that
generates a large and expensive-to-construct attribute. Now say that you
want to write a function that just counts how many times p
can match within a string (where the matches are non-overlapping). Instead
of using p directly and building all those attributes, or rewriting p
without the attribute generation, use omit[p].
raw[p]
changes the attribute from ATTR(p)
to a view that delimits the subrange of the input that was matched by
p. The type of the view is
subrange<I>,
where I is the type of the
iterator used within the parse. Note that this may not be the same as the
iterator type passed to parse().
For instance, when parsing UTF-8, the iterator passed to parse()
may be char8_t const *, but within the parse it will be
a UTF-8 to UTF-32 transcoding (converting) iterator. Just like omit[], raw[]
causes all attribute-generation work within p
to be skipped.
Similar to the re-use scenario for omit[]
above, raw[] could be used to find the
locations of all non-overlapping matches
of p in a string.
string_view[p]
is very similar to raw[p], except
that it changes the attribute of p
to std::basic_string_view<C>,
where C is the character
type of the underlying sequence being parsed. string_view[]
requires that the underlying range being parsed is contiguous. Since this
can only be detected in C++20 and later, string_view[]
is not available in C++17 mode.
Similar to the re-use scenario for omit[]
above, string_view[] could be used to find the
locations of all non-overlapping matches
of p in a string. Whether
raw[] or string_view[]
is more natural to use to report the locations depends on your use case,
but they are essentially the same.
lexeme[p]
disables use of the skipper, if a skipper is being used, within the parse
of p. This is useful, for
instance, if you want to enable skipping in most parts of your parser, but
disable it only in one section where it doesn't belong. If you are skipping
whitespace in most of your parser, but want to parse strings that may contain
spaces, you should use lexeme[]:
    namespace bp = boost::parser;
    auto const string_parser = bp::lexeme['"' >> *(bp::char_ - '"') >> '"'];
Without lexeme[], our string parser would still
match "foo bar", but the skipper would discard the space, so
the generated attribute would be "foobar".
skip[] is like the inverse of lexeme[]. It enables skipping in the
parse, even if it was not enabled before. For example, within a call to
parse() that uses a skipper, let's
say we have these parsers in use:
    namespace bp = boost::parser;
    auto const one_or_more = +bp::char_;
    auto const skip_or_skip_not_there_is_no_try = bp::lexeme[bp::skip[one_or_more] >> one_or_more];
The use of lexeme[] disables skipping, but then
the use of skip[] turns it back on. The net
result is that the first occurrence of one_or_more
will use the skipper passed to parse();
the second will not.
skip[] has another use. You can pass a different parser to skip, as in
skip(s)[p], to change the skipper just within the scope
of the directive. Let's say we passed ascii::space to parse(),
and we're using these parsers somewhere within that parse()
call:
    namespace bp = boost::parser;
    auto const zero_or_more = *bp::char_;
    auto const skip_both_ways = zero_or_more >> bp::skip(bp::ws)[zero_or_more];
The first occurrence of zero_or_more
will use the skipper passed to parse(),
which is ascii::space;
the second will use ws
as its skipper.