[/
/ Distributed under the Boost Software License, Version 1.0. (See accompanying
/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
/]

[section Tutorial]

To get started, let's look at some simple programs that parse using _Parser_.
We'll start small and build up from there.

[section Hello, Whomever]

This is just about the most minimal example of using _Parser_ that one could
write. We take a string from the command line, or `"World"` if none is given,
and then we parse it:

[hello_example]

The expression `*bp::char_` is a parser-expression. It uses one of the many
parsers that _Parser_ provides, _ch_. Like all _Parser_ parsers, it has
certain operations defined on it. In this case, `*bp::char_` is using a C++
version of the _kl_ operator _emdash_ `operator*()`. Since C++ has no postfix
unary `*` operator, we have to use the one we have, so it is used as a prefix.

So, `*bp::char_` means "any number of characters". In other words, it really
cannot fail. Even an empty string will match it.

The parse operation is performed by calling the _p_ function, passing the
parser as one of the arguments:

    bp::parse(input, *bp::char_, result);

The arguments here are: `input`, the string to parse; `*bp::char_`, the parser
used to do the parse; and `result`, an out-parameter into which to put the
result of the parse. Don't get too caught up on this method of getting the
parse result out of _p_; there are multiple ways of doing so, and we'll cover
all of them in subsequent examples.

The effect of this call to _p_ is not very interesting _emdash_ since the
parser we gave it cannot ever fail, and because we're placing the output in
the same type as the input, it just copies the contents of `input` to
`result`.

[endsect]

[section A Trivial Example]

Let's look at a slightly more complicated example, even if it is still
trivial. Instead of taking any old `char`s we're given, let's require some
structure. Let's parse one or more `double`s, separated by commas.

The _Parser_ parser for `double` is _d_. So, to parse a single `double`, we'd
use _d_. If we wanted to parse two `double`s in a row, we'd use:

    boost::parser::double_ >> boost::parser::double_

`operator>>()` in this expression is the sequence-operator; read it as
"followed by". If we combine the sequence-operator with _kl_, we can get the
parser we want by writing:

    boost::parser::double_ >> *(',' >> boost::parser::double_)

This is a parser that matches at least one `double` _emdash_ because of the
first _d_ in the expression above _emdash_ followed by zero or more instances
of a-comma-followed-by-a-`double`. Notice that we can use `','` directly.
Though it is not a parser, `operator>>()` and the other operators defined on
_Parser_ parsers have overloads that accept character/parser pairs of
arguments; these operator overloads will create the right parser to recognize
`','`.

[trivial_example]

The first example filled in an out-parameter to deliver the result of the
parse. This call to _p_ returns a result instead. As you can see, the result
is contextually convertible to `bool`, and `*result` is some sort of range.
In fact, the return type of this call to _p_ is
`std::optional<std::vector<double>>`. Naturally, if the parse fails,
`std::nullopt` is returned. We'll look at how _Parser_ maps the type of the
parser to the return type, or to the type of the filled-in out-parameter, a
bit later.

If I run it in a shell, this is the result:

[pre
$ example/trivial
Enter a list of doubles, separated by commas. No pressure. 5.6,8.9
Great! It looks like you entered:
5.6
8.9
$ example/trivial
Enter a list of doubles, separated by commas. No pressure. 5.6, 8.9
Good job! Please proceed to the recovery annex for cake.
]

It does not recognize `"5.6, 8.9"`. This is because it expects a comma
followed /immediately/ by a `double`, but I inserted a space after the comma.
The same failure to parse would occur if I put a space before or after the
list of `double`s.

[endsect]

[section A Trivial Example That Gracefully Handles Whitespace]

Let's modify the trivial parse we just did to ignore any spaces that might
exist among the `double`s and commas. To skip whitespace wherever we find it,
we can pass a /skip parser/ to our call to _p_ (we don't need to touch the
parser passed to _p_). Here, we use `ascii::space`, which matches any ASCII
character `c` for which `std::isspace(c)` is true.

[trivial_skipper_example]

The skip parser, or /skipper/, is run between the sub-parsers within the
parser passed to _p_. In this case, the skipper is run before the first
`double` is parsed, before any subsequent comma or `double` is parsed, and at
the end. So, the strings `"3.6,5.9"` and `" 3.6 , \t 5.9 "` are parsed the
same way by this program.

Skipping is an important concept in _Parser_. You can skip anything, not just
ASCII whitespace; there are lots of other things you might want to skip. The
skipper you pass to _p_ can be an arbitrary parser. For example, if you write
a parser for a scripting language, you can write a skipper to skip comments.

We'll be using skip parsers almost exclusively in the rest of the
documentation. The ability to ignore the parts of your input that you don't
care about is so convenient that parsing without skipping is a rarity in
practice.

[endsect]

[section Semantic Actions]

Like other parsing systems (lex & yacc, _Spirit_, etc.), _Parser_ has a
mechanism for associating semantic actions with different parts of the parse.

[semantic_action_example]

[endsect]

[endsect]