[/
/ Distributed under the Boost Software License, Version 1.0. (See accompanying
/ file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)
/]
[section Tutorial]
To get started, let's look at some simple programs that parse using _Parser_.
We'll start small and build up from there.
[section Hello, Whomever]
This is just about the most minimal example of using _Parser_ that one could
write. We take a string from the command line, or `"World"` if none is given,
and then we parse it:
[hello_example]
The expression `*bp::char_` is a parser-expression. It uses one of the many
parsers that _Parser_ provides, _ch_. Like all _Parser_ parsers, it has
certain operations defined on it. In this case, `*bp::char_` is using a C++
version of the _kl_ operator _emdash_ `operator*()`. Since C++ has no postfix
unary `*` operator, we have to use the one we have, so it is used as a prefix.
So, `*bp::char_` means "any number of characters". In other words, it really
cannot fail. Even an empty string will match it.
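To make the "cannot fail" claim concrete, here is a minimal hand-written model of zero-or-more repetition in plain C++. It is only a sketch of the semantics, not how _Parser_ implements `operator*()`; the names `match_zero_or_more` and `any_char` are hypothetical and exist only for this illustration.

```cpp
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper, not part of the library: models "zero or more"
// repetition of a single-character matcher. It consumes the longest prefix
// of `input` whose characters all satisfy `pred`, and returns that prefix.
template <typename Pred>
std::string match_zero_or_more(std::string_view input, Pred pred)
{
    std::size_t n = 0;
    while (n < input.size() && pred(input[n]))
        ++n;
    // n == 0 is still a successful match; there is no failure path at all.
    return std::string(input.substr(0, n));
}

// Hypothetical stand-in for a match-any-character parser.
inline bool any_char(char) { return true; }
```

Calling `match_zero_or_more("World", any_char)` consumes the whole string, and calling it on `""` consumes nothing but still succeeds, which is the behavior described above.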
The parse operation is performed by calling the _p_ function, passing the
parser as one of the arguments:
    bp::parse(input, *bp::char_, result);
The arguments here are: `input`, the string to parse; `*bp::char_`, the parser
used to do the parse; and `result`, an out-parameter into which to put the
result of the parse. Don't get too caught up on this method of getting the
parse result out of _p_; there are multiple ways of doing so, and we'll cover
all of them in subsequent examples.
The effect of this call to _p_ is not very interesting _emdash_ since the
parser we gave it cannot ever fail, and because we're placing the output in
the same type as the input, it just copies the contents of `input` to
`result`.
[endsect]
[section A Trivial Example]
Let's look at a slightly more complicated example, even if it is still
trivial. Instead of taking any old `char`s we're given, let's require some
structure. Let's parse one or more `double`s, separated by commas.
The _Parser_ parser for `double` is _d_. So, to parse a single `double`, we'd
use _d_. If we wanted to parse two `double`s in a row, we'd use:
    boost::parser::double_ >> boost::parser::double_
`operator>>()` in this expression is the sequence-operator; read it as
"followed by". If we combine the sequence-operator with _kl_, we can get the
parser we want by writing:
    boost::parser::double_ >> *(',' >> boost::parser::double_)
This is a parser that matches at least one `double` _emdash_ because of the
first _d_ in the expression above _emdash_ followed by zero or more instances
of a-comma-followed-by-a-`double`. Notice that we can use `','` directly.
Though it is not a parser, `operator>>()` and the other operators defined on
_Parser_ parsers have overloads that accept character/parser pairs of
arguments; these operator overloads will create the right parser to recognize
`','`.
[trivial_example]
The first example filled in an out-parameter to deliver the result of the
parse. This call to _p_ returns a result instead. As you can see, the result
is contextually convertible to `bool`, and `*result` is some sort of range.
In fact, the return type of this call to _p_ is
`std::optional<std::vector<double>>`. Naturally, if the parse fails,
`std::nullopt` is returned. We'll look at how _Parser_ maps the type of the
parser to the return type, or the filled in out-parameter's type, a bit later.
If I run it in a shell, this is the result:
[pre
$ example/trivial
Enter a list of doubles, separated by commas. No pressure. 5.6,8.9
Great! It looks like you entered:
5.6
8.9
$ example/trivial
Enter a list of doubles, separated by commas. No pressure. 5.6, 8.9
Good job! Please proceed to the recovery annex for cake.
]
It does not recognize `"5.6, 8.9"`. This is because it expects a comma
followed /immediately/ by a `double`, but I inserted a space after the comma.
The same failure to parse would occur if I put a space before or after the
list of `double`s.
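If it helps to see why the space breaks the parse, here is a hand-written model of what `double_ >> *(',' >> double_)` accepts, in plain C++. This is a sketch under stated assumptions, not the library's implementation: `read_double` and `parse_double_list` are hypothetical names, and `std::strtod` is used as a stand-in number reader, so it may accept some spellings (such as `inf` or hexadecimal floats) that _d_ treats differently.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: tries to read one double starting at s[pos], advancing
// pos on success. std::strtod would silently skip leading whitespace, so we
// reject whitespace up front to model parsing without a skipper.
inline bool read_double(std::string const & s, std::size_t & pos, double & out)
{
    if (pos >= s.size() || std::isspace(static_cast<unsigned char>(s[pos])))
        return false;
    char const * first = s.c_str() + pos;
    char * last = nullptr;
    out = std::strtod(first, &last);
    if (last == first)
        return false; // no characters consumed: not a number
    pos += static_cast<std::size_t>(last - first);
    return true;
}

// Models  double_ >> *(',' >> double_)  where the whole input must match.
inline std::optional<std::vector<double>> parse_double_list(std::string const & s)
{
    std::vector<double> result;
    std::size_t pos = 0;
    double d = 0.0;
    if (!read_double(s, pos, d))
        return std::nullopt; // the leading double_ is mandatory
    result.push_back(d);
    while (pos < s.size() && s[pos] == ',') {
        std::size_t next = pos + 1;
        if (!read_double(s, next, d))
            break; // this (',' >> double_) repetition did not match; stop
        result.push_back(d);
        pos = next;
    }
    if (pos != s.size())
        return std::nullopt; // leftover input counts as a failed parse
    return result;
}
```

With this model, `parse_double_list("5.6,8.9")` yields two values, while `"5.6, 8.9"`, `" 5.6,8.9"`, and `"5.6,8.9 "` all yield `std::nullopt`, mirroring the shell session above.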
[endsect]
[section A Trivial Example That Gracefully Handles Whitespace]
Let's modify the trivial parser we just wrote to ignore any spaces that
might exist among the `double`s and commas. To skip whitespace wherever we
find it, we can pass a /skip parser/ to our call to _p_ (we don't need to
touch the parser passed to _p_). Here, we use `ascii::space`, which matches
any ASCII character `c` for which `std::isspace(c)` is true.
[trivial_skipper_example]
The skip parser, or /skipper/, is run between the sub-parsers within the
parser passed to _p_. In this case, the skipper is run before the first
`double` is parsed, before any subsequent comma or `double` is parsed, and at
the end. So, the strings `"3.6,5.9"` and `" 3.6 , \t 5.9 "` are parsed the
same by this program.
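The points at which the skipper runs can be sketched by hand in plain C++. As before, this is only a model of the behavior described above, not the library's implementation; `skip_space` and `parse_double_list_skipping` are hypothetical names, and `std::strtod` stands in for the real number parser.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical skipper: advances pos past any run of whitespace.
inline void skip_space(std::string const & s, std::size_t & pos)
{
    while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
        ++pos;
}

// Models  double_ >> *(',' >> double_)  with the skipper run before each
// double, before each comma, and once more at the end of the input.
inline std::optional<std::vector<double>>
parse_double_list_skipping(std::string const & s)
{
    std::vector<double> result;
    std::size_t pos = 0;
    for (;;) {
        skip_space(s, pos); // skipper runs before each double
        char const * first = s.c_str() + pos;
        char * last = nullptr;
        double const d = std::strtod(first, &last);
        if (last == first)
            return std::nullopt; // expected a double here
        result.push_back(d);
        pos += static_cast<std::size_t>(last - first);
        skip_space(s, pos); // skipper runs before a comma, or at the end
        if (pos == s.size())
            return result; // all input consumed
        if (s[pos] != ',')
            return std::nullopt; // leftover input that is not a separator
        ++pos; // consume the comma and look for the next double
    }
}
```

In this sketch, `"3.6,5.9"` and `" 3.6 , \t 5.9 "` produce the same two-element result, because whitespace is discarded at every point where a sub-parser is about to run.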
Skipping is an important concept in _Parser_. You can skip anything, not just
ASCII whitespace; there are lots of other things you might want to skip. The
skipper you pass to _p_ can be an arbitrary parser. For example, if you write
a parser for a scripting language, you can write a skipper to skip comments.
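As a rough illustration of the comment-skipping idea, here is a hand-written skipper in plain C++ for an imaginary scripting language in which `#` starts a comment that runs to the end of the line. Both the comment syntax and the name `skip_space_and_comments` are assumptions made up for this sketch; in a real program you would express the same thing as a _Parser_ parser and pass it to _p_ as the skipper.

```cpp
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical skipper: returns the index of the first character at or after
// pos that is neither whitespace nor part of a '#' end-of-line comment.
inline std::size_t skip_space_and_comments(std::string_view s, std::size_t pos)
{
    for (;;) {
        while (pos < s.size() && std::isspace(static_cast<unsigned char>(s[pos])))
            ++pos;
        if (pos >= s.size() || s[pos] != '#')
            return pos; // reached real input, or the end of the string
        while (pos < s.size() && s[pos] != '\n')
            ++pos; // discard the comment body; the newline is skipped as whitespace
    }
}
```

The outer loop matters: a comment may be followed by more whitespace and then another comment, and all of it should be skipped before the next sub-parser runs.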
We'll be using skip parsers almost exclusively in the rest of the
documentation. The ability to ignore the parts of your input that you don't
care about is so convenient that parsing without skipping is a rarity in
practice.
[endsect]
[section Semantic Actions]
Like most parsing systems (lex & yacc, _Spirit_, etc.), _Parser_ has a
mechanism for associating semantic actions with different parts of the parse.
[semantic_action_example]
[endsect]
[endsect]