<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Alternative Parsers</title>
<link rel="stylesheet" href="../../boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Chapter 1. Boost.Parser">
<link rel="up" href="../tutorial.html" title="Tutorial">
<link rel="prev" href="parsing_into__struct_s_and__class_es.html" title="Parsing into structs and classes">
<link rel="next" href="parsing_quoted_strings.html" title="Parsing Quoted Strings">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="spirit-nav">
<a accesskey="p" href="parsing_into__struct_s_and__class_es.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorial.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="parsing_quoted_strings.html"><img src="../../images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_parser.tutorial.alternative_parsers"></a><a class="link" href="alternative_parsers.html" title="Alternative Parsers">Alternative
Parsers</a>
</h3></div></div></div>
<p>
Frequently, you need to parse something that might have one of several forms.
<code class="computeroutput"><span class="keyword">operator</span><span class="special">|</span></code>
is overloaded to form alternative parsers. For example:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">parser_1</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span> <span class="special">|</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">eps</span><span class="special">;</span>
</pre>
<p>
<code class="computeroutput"><span class="identifier">parser_1</span></code> matches an integer,
or if that fails, it matches <span class="emphasis"><em>epsilon</em></span>, the empty string.
This is equivalent to writing:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">parser_2</span> <span class="special">=</span> <span class="special">-</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span><span class="special">;</span>
</pre>
<p>
However, neither <code class="computeroutput"><span class="identifier">parser_1</span></code>
nor <code class="computeroutput"><span class="identifier">parser_2</span></code> is equivalent
to writing this:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">parser_3</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">eps</span> <span class="special">|</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span><span class="special">;</span> <span class="comment">// Does not do what you think.</span>
</pre>
<p>
The reason is that alternative parsers try each of their subparsers, one
at a time, and stop on the first one that matches. <span class="emphasis"><em>Epsilon</em></span>
matches anything, since it is zero length and consumes no input. It even
matches the end of input. This means that <code class="computeroutput"><span class="identifier">parser_3</span></code>
is equivalent to <code class="computeroutput"><a class="link" href="../../boost/parser/eps.html" title="Global eps">eps</a></code>
by itself.
</p>
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
For this reason, writing <code class="computeroutput"><a class="link" href="../../boost/parser/eps.html" title="Global eps">eps</a> <span class="special">|</span>
<span class="identifier">p</span></code> for any parser <code class="computeroutput"><span class="identifier">p</span></code> is considered
a bug. Debug builds will assert when <code class="computeroutput"><a class="link" href="../../boost/parser/eps.html" title="Global eps">eps</a> <span class="special">|</span>
<span class="identifier">p</span></code> is encountered.
</p></td></tr>
</table></div>
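<p>
The first-match behavior described above can be sketched with a small
hand-rolled model. This is not Boost.Parser's implementation: the names
<code class="computeroutput">toy_parser</code>, <code class="computeroutput">toy_int</code>,
<code class="computeroutput">toy_eps</code>, and <code class="computeroutput">toy_alt</code>
are hypothetical, and a parser is assumed to be a function that returns the
position just past its match, or nothing on failure.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <functional>
#include <optional>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical toy model, not Boost.Parser. A parser takes the input and a
// start position, and returns the position after the match, or nullopt.
using toy_parser =
    std::function<std::optional<std::size_t>(std::string_view, std::size_t)>;

// Matches one or more decimal digits (a rough stand-in for bp::int_).
inline std::optional<std::size_t> toy_int(std::string_view in, std::size_t pos)
{
    std::size_t end = pos;
    while (end < in.size() &&
           std::isdigit(static_cast<unsigned char>(in[end])))
        ++end;
    if (end == pos)
        return std::nullopt; // no digits at pos: failure
    return end;
}

// Always matches and consumes nothing (a rough stand-in for bp::eps).
inline std::optional<std::size_t> toy_eps(std::string_view, std::size_t pos)
{
    return pos;
}

// Ordered choice: try each alternative in order and stop at the first match.
inline toy_parser toy_alt(std::vector<toy_parser> alternatives)
{
    assert(!alternatives.empty()); // an alternative needs a subparser
    return [alts = std::move(alternatives)](
               std::string_view in,
               std::size_t pos) -> std::optional<std::size_t> {
        for (auto const & p : alts) {
            if (auto end = p(in, pos))
                return end; // later alternatives are never consulted
        }
        return std::nullopt;
    };
}
```

<p>
In this model, <code class="computeroutput">toy_alt({toy_int, toy_eps})</code>
consumes the digits of <code class="computeroutput">"42"</code>, while
<code class="computeroutput">toy_alt({toy_eps, toy_int})</code> always stops at
the starting position, because the epsilon alternative matches first and the
integer alternative is never reached.
</p>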
<div class="warning"><table border="0" summary="Warning">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Warning]" src="../../images/warning.png"></td>
<th align="left">Warning</th>
</tr>
<tr><td align="left" valign="top"><p>
This kind of error is very common when <code class="computeroutput"><a class="link" href="../../boost/parser/eps.html" title="Global eps">eps</a></code> is involved, and also
very easy to detect. However, it is possible to write <code class="computeroutput"><span class="identifier">P1</span>
<span class="special">|</span> <span class="identifier">P2</span></code>,
where <code class="computeroutput"><span class="identifier">P1</span></code> is a prefix of
<code class="computeroutput"><span class="identifier">P2</span></code>, such as <code class="computeroutput"><span class="identifier">int_</span> <span class="special">|</span> <span class="identifier">int_</span> <span class="special">&gt;&gt;</span> <span class="identifier">int_</span></code>, or <code class="computeroutput"><span class="identifier">repeat</span><span class="special">(</span><span class="number">4</span><span class="special">)[</span><span class="identifier">hex_digit</span><span class="special">]</span>
<span class="special">|</span> <span class="identifier">repeat</span><span class="special">(</span><span class="number">8</span><span class="special">)[</span><span class="identifier">hex_digit</span><span class="special">]</span></code>.
Because <code class="computeroutput"><span class="identifier">P1</span></code> matches whenever
<code class="computeroutput"><span class="identifier">P2</span></code> would, <code class="computeroutput"><span class="identifier">P2</span></code> is never tried.
This is almost certainly an error, but is impossible to detect in the general
case. Remember that <code class="computeroutput"><a class="link" href="../../boost/parser/rule.html" title="Struct template rule">rules</a></code> can be separately compiled,
and consider a pair of rules whose associated <code class="computeroutput"><span class="identifier">_def</span></code>
parsers are <code class="computeroutput"><span class="identifier">int_</span></code> and <code class="computeroutput"><span class="identifier">int_</span> <span class="special">&gt;&gt;</span>
<span class="identifier">int_</span></code>, respectively.
</p></td></tr>
</table></div>
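<p>
The prefix problem in the warning above can be illustrated with another small
model. Again, this is a sketch under stated assumptions rather than
Boost.Parser code: <code class="computeroutput">toy_hex_n</code>,
<code class="computeroutput">toy_hex_alt</code>, and
<code class="computeroutput">toy_full_parse</code> are hypothetical names, with
<code class="computeroutput">toy_hex_n</code> standing in for
<code class="computeroutput">repeat(n)[hex_digit]</code>, and a full parse is
assumed to succeed only when all of the input is consumed.
</p>

```cpp
#include <cctype>
#include <cstddef>
#include <optional>
#include <string_view>

// Hypothetical stand-in for repeat(n)[hex_digit]: matches exactly n hex
// digits starting at pos, returning the position after them.
inline std::optional<std::size_t>
toy_hex_n(std::string_view in, std::size_t pos, std::size_t n)
{
    if (pos > in.size() || in.size() - pos < n)
        return std::nullopt; // not enough input left
    for (std::size_t i = pos; i < pos + n; ++i) {
        if (!std::isxdigit(static_cast<unsigned char>(in[i])))
            return std::nullopt;
    }
    return pos + n;
}

// Ordered choice of two fixed-width hex parsers: first_n digits, or if that
// fails, second_n digits. The first match wins.
inline std::optional<std::size_t>
toy_hex_alt(std::string_view in, std::size_t first_n, std::size_t second_n)
{
    if (auto end = toy_hex_n(in, 0, first_n))
        return end;
    return toy_hex_n(in, 0, second_n);
}

// A full parse succeeds only if the chosen alternative consumed everything.
inline bool
toy_full_parse(std::string_view in, std::size_t first_n, std::size_t second_n)
{
    auto const end = toy_hex_alt(in, first_n, second_n);
    return end && *end == in.size();
}
```

<p>
With the shorter alternative first, <code class="computeroutput">toy_full_parse("deadbeef", 4, 8)</code>
fails in this model: the four-digit alternative matches <code class="computeroutput">dead</code>,
the eight-digit alternative is never tried, and <code class="computeroutput">beef</code>
is left over. Listing the longer alternative first, as in
<code class="computeroutput">toy_full_parse("deadbeef", 8, 4)</code>, lets both
four-digit and eight-digit inputs parse.
</p>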
</div>
<div class="copyright-footer">Copyright © 2020 T. Zachary Laine<p>
Distributed under the Boost Software License, Version 1.0. (See accompanying
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
</p>
</div>
<hr>
<div class="spirit-nav">
<a accesskey="p" href="parsing_into__struct_s_and__class_es.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorial.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="parsing_quoted_strings.html"><img src="../../images/next.png" alt="Next"></a>
</div>
</body>
</html>