PrevUpHomeNext

Directives

A directive is an element of your parser that doesn't have any meaning by itself. Some are second-order parsers that need a first-order parser to do the actual parsing. Others influence the parse in some way. Lexically, you can spot a directive by its use of []. Non-directives never use [], and directives always do.

The directives that are second order parsers are technically directives, but since they are also used to create parsers, it is more useful just to focus on that. The directives repeat() and if_() were already described in the section on parsers; we won't say more about them here.

That leaves the directives that affect aspects of the parse:

omit[]

omit[p] disables attribute generation for the parser p. Not only does omit[p] have no attribute, but any attribute generation work that normally happens within p is skipped.

This directive can be useful in cases like this: say you have some fairly complicated parser p that generates a large and expensive-to-construct attribute. Now say that you want to write a function that just counts how many times p can match a string (where the matches are non-overlapping). Instead of using p directly, and building all those attributes, or rewriting p without the attribute generation, use omit[].

raw[]

raw[p] changes the attribute from ATTR(p) to to a view that delimits the subrange of the input that was matched by p. The type of the view is subrange<I>, where I is the type of the iterator used within the parse. Note that this may not be the same as the iterator type passed to parse(). For instance, when parsing UTF-8, the iterator passed to parse() may be char8_t const *, but within the parse it will be a UTF-8 to UTF-32 transcoding (converting) iterator. Just like omit[], raw[] causes all attribute-generation work within p to be skipped.

Similar to the re-use scenario for omit[] above, raw[] could be used to find the locations of all non-overlapping matches of p in a string.

string_view[]

string_view[p] is very similar to raw[p], except that it changes the attribute of p to std::basic_string_view<C>, where C is the character type of the underlying sequence being parsed. string_view[] requires that the underlying range being parsed is contiguous. Since this can only be detected in C++20 and later, string_view[] is not available in C++17 mode.

Similar to the re-use scenario for omit[] above, string_view[] could be used to find the locations of all non-overlapping matches of p in a string. Whether raw[] or string_view[] is more natural to use to report the locations depends on your use case, but they are essentially the same.

lexeme[]

lexeme[p] disables use of the skipper, if a skipper is being used, within the parse of p. This is useful, for instance, if you want to enable skipping in most parts of your parser, but disable it only in one section where it doesn't belong. If you are skipping whitespace in most of your parser, but want to parse strings that may contain spaces, you should use lexeme[]:

namespace bp = boost::parser;
auto const string_parser = bp::lexeme['"' >> *(bp::char_ - '"') >> '"'];

Without lexeme[], our string parser would correctly match "foo bar", but the generated attribute would be "foobar".

skip[]

skip[] is like the inverse of lexeme[]. It enables skipping in the parse, even if it was not enabled before. For example, within a call to parse() that uses a skipper, let's say we have these parsers in use:

namespace bp = boost::parser;
auto const one_or_more = +bp::char_;
auto const skip_or_skip_not_there_is_no_try = bp::lexeme[bp::skip[one_or_more] >> one_or_more];

The use of lexeme[] disables skipping, but then the use of skip[] turns it back on. The net result is that the first occurrence of one_or_more will use the skipper passed to parse(); the second will not.

skip[] has another use. You can parameterize skip with a different parser to change the skipper just within the scope of the directive. Let's say we passed ascii::space to parse(), and we're using these parsers somewhere within that parse() call:

namespace bp = boost::parser;
auto const zero_or_more = *bp::char_;
auto const skip_both_ways = zero_or_more >> bp::skip(bp::ws)[zero_or_more];

The first occurrence of zero_or_more will use the skipper passed to parse(), which is ascii::space; the second will use ws as its skipper.


PrevUpHomeNext