mirror of
https://github.com/boostorg/parser.git
synced 2026-01-23 17:52:15 +00:00
606 lines
80 KiB
HTML
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>The parse() API</title>
<link rel="stylesheet" href="../../boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../../index.html" title="Chapter 1. Boost.Parser">
<link rel="up" href="../tutorial.html" title="Tutorial">
<link rel="prev" href="attribute_generation.html" title="Attribute Generation">
<link rel="next" href="more_about_rules.html" title="More About Rules">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="spirit-nav">
<a accesskey="p" href="attribute_generation.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorial.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="more_about_rules.html"><img src="../../images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h3 class="title">
<a name="boost_parser.tutorial.the__parse____api"></a><a class="link" href="the__parse____api.html" title="The parse() API">The <code class="computeroutput"><span class="identifier">parse</span><span class="special">()</span></code> API</a>
</h3></div></div></div>
<p>
There are multiple top-level parse functions. They have some things in common:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
They each return a value contextually convertible to <code class="computeroutput"><span class="keyword">bool</span></code>.
</li>
<li class="listitem">
They each take at least a range to parse and a parser. The "range
to parse" may be an iterator/sentinel pair or a single range object.
</li>
<li class="listitem">
They each require forward iterability of the range to parse.
</li>
<li class="listitem">
They each accept any range with a character element type. This means
that they can each parse ranges of <code class="computeroutput"><span class="keyword">char</span></code>,
<code class="computeroutput"><span class="keyword">wchar_t</span></code>, <code class="computeroutput"><span class="identifier">char8_t</span></code>,
<code class="computeroutput"><span class="keyword">char16_t</span></code>, or <code class="computeroutput"><span class="keyword">char32_t</span></code>.
</li>
<li class="listitem">
The overloads with <code class="computeroutput"><span class="identifier">prefix_</span></code>
in their name take an iterator/sentinel pair. For example, <code class="computeroutput"><a class="link" href="../../boost/parser/prefix_parse_id15.html" title="Function template prefix_parse">prefix_parse</a><span class="special">(</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <a class="link" href="../../boost/parser/ws.html" title="Global ws">ws</a><span class="special">)</span></code>
parses the range <code class="computeroutput"><span class="special">[</span><span class="identifier">first</span><span class="special">,</span> <span class="identifier">last</span><span class="special">)</span></code>,
advancing <code class="computeroutput"><span class="identifier">first</span></code> as it
goes. If the parse succeeds, the entire input may or may not have been
matched. The value of <code class="computeroutput"><span class="identifier">first</span></code>
will indicate where in the input <code class="computeroutput"><span class="identifier">p</span></code>
stopped matching. The <span class="bold"><strong>whole</strong></span> input was matched
if and only if <code class="computeroutput"><span class="identifier">first</span> <span class="special">==</span> <span class="identifier">last</span></code>
after the call to <code class="computeroutput"><a class="link" href="../../boost/parser/prefix_parse_id15.html" title="Function template prefix_parse">prefix_parse()</a></code>.
</li>
<li class="listitem">
When you call any of the range overloads of <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>,
for example <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse</a><span class="special">(</span><span class="identifier">r</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <a class="link" href="../../boost/parser/ws.html" title="Global ws">ws</a><span class="special">)</span></code>, <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
only indicates success if <span class="bold"><strong>all</strong></span> of <code class="computeroutput"><span class="identifier">r</span></code> was matched by <code class="computeroutput"><span class="identifier">p</span></code>.
</li>
</ul></div>
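<p>
The difference between the last two bullets can be sketched without the library at all. The following is a toy model that uses only the standard library; <code class="computeroutput">toy_prefix_parse</code> and <code class="computeroutput">toy_parse</code> are hypothetical names invented for this illustration, and the "parser" is hard-wired to match one or more ASCII digits. It is not how Boost.Parser is implemented.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <string_view>

// Toy stand-in for a parser: matches one or more ASCII digits.
// Like the prefix_ overloads, it advances `first` past whatever it matched
// and reports success even if unmatched input remains.
template<typename I, typename S>
bool toy_prefix_parse(I & first, S last)
{
    I const start = first;
    while (first != last && std::isdigit(static_cast<unsigned char>(*first)))
        ++first;
    return first != start;
}

// Like the range overloads: success only if the whole range was matched.
inline bool toy_parse(std::string_view r)
{
    auto first = r.begin();
    return toy_prefix_parse(first, r.end()) && first == r.end();
}
```

<p>
With the input <code class="computeroutput">"123abc"</code>, the iterator/sentinel version succeeds and leaves <code class="computeroutput">first</code> pointing at <code class="computeroutput">'a'</code>, while the range version reports failure because <code class="computeroutput">"abc"</code> was left unmatched.
</p>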
<div class="note"><table border="0" summary="Note">
<tr>
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../images/note.png"></td>
<th align="left">Note</th>
</tr>
<tr><td align="left" valign="top"><p>
<code class="computeroutput"><span class="keyword">wchar_t</span></code> is an accepted value
type for the input. Please note that this is interpreted as UTF-16 on MSVC,
and UTF-32 everywhere else.
</p></td></tr>
</table></div>
<h5>
<a name="boost_parser.tutorial.the__parse____api.h0"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.the_overloads"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.the_overloads">The overloads</a>
</h5>
<p>
There are eight overloads of <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
and <code class="computeroutput"><a class="link" href="../../boost/parser/prefix_parse_id15.html" title="Function template prefix_parse">prefix_parse()</a></code> combined, because there
are three either/or options in how you call them.
</p>
<h5>
<a name="boost_parser.tutorial.the__parse____api.h1"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.iterator_sentinel_versus_range"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.iterator_sentinel_versus_range">Iterator/sentinel
versus range</a>
</h5>
<p>
You can call <code class="computeroutput"><a class="link" href="../../boost/parser/prefix_parse_id15.html" title="Function template prefix_parse">prefix_parse()</a></code>
with an iterator and sentinel that delimit a range of character values. For
example:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="comment">/* some parser ... */</span><span class="special">;</span>

<span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span> <span class="identifier">str_1</span> <span class="special">=</span> <span class="comment">/* ... */</span><span class="special">;</span>
<span class="comment">// Using null_sentinel, str_1 can point to three billion characters, and</span>
<span class="comment">// we can call prefix_parse() without having to find the end of the string first.</span>
<span class="keyword">auto</span> <span class="identifier">result_1</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">prefix_parse</span><span class="special">(</span><span class="identifier">str_1</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">null_sentinel</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span>

<span class="keyword">char</span> <span class="identifier">str_2</span><span class="special">[]</span> <span class="special">=</span> <span class="comment">/* ... */</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">result_2</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">prefix_parse</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">begin</span><span class="special">(</span><span class="identifier">str_2</span><span class="special">),</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">end</span><span class="special">(</span><span class="identifier">str_2</span><span class="special">),</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span>
</pre>
<p>
The iterator/sentinel overloads can parse successfully without matching the
entire input. You can tell if the entire input was matched by checking if
<code class="computeroutput"><span class="identifier">first</span> <span class="special">==</span>
<span class="identifier">last</span></code> is true after <code class="computeroutput"><a class="link" href="../../boost/parser/prefix_parse_id15.html" title="Function template prefix_parse">prefix_parse()</a></code> returns.
</p>
<p>
By contrast, you call <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
with a range of character values. When the range is a reference to an array
of characters, any terminating <code class="computeroutput"><span class="number">0</span></code>
is ignored; this allows calls like <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse</a><span class="special">(</span><span class="string">"str"</span><span class="special">,</span>
<span class="identifier">p</span><span class="special">)</span></code>
to work naturally.
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="comment">/* some parser ... */</span><span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">u8string</span> <span class="identifier">str_1</span> <span class="special">=</span> <span class="identifier">u8</span><span class="string">"str"</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">result_1</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str_1</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span>

<span class="comment">// The null terminator is ignored. This call parses s-t-r, not s-t-r-0.</span>
<span class="keyword">auto</span> <span class="identifier">result_2</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">U</span><span class="string">"str"</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span>

<span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span> <span class="identifier">str_3</span> <span class="special">=</span> <span class="string">"str"</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">result_3</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">null_term</span><span class="special">(</span><span class="identifier">str_3</span><span class="special">)</span> <span class="special">|</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">as_utf16</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span>
</pre>
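<p>
To see why a reference to a character array can be treated specially, here is a minimal standard-library sketch. The helper <code class="computeroutput">drop_null_terminator</code> is a hypothetical name made up for this illustration, and this is only one way such behavior could be written; it is not taken from Boost.Parser's implementation. Because the array extent is part of the type, the function can see the final element and leave a trailing <code class="computeroutput">0</code> out of the range.
</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical helper (not Boost.Parser's implementation): view a character
// array as a range, dropping one trailing null terminator if present.
template<typename CharT, std::size_t N>
constexpr std::basic_string_view<CharT> drop_null_terminator(CharT const (&arr)[N])
{
    // N is the full array extent, including the terminator of a literal.
    std::size_t const size = (arr[N - 1] == CharT(0)) ? N - 1 : N;
    return std::basic_string_view<CharT>(arr, size);
}
```

<p>
Called with the literal <code class="computeroutput">"str"</code>, whose type is an array of four <code class="computeroutput">char</code>, the sketch yields a three-character view. An array without a terminator is returned whole.
</p>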
<p>
The range (non-iterator/sentinel) overloads of <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
have no way to report that <code class="computeroutput"><span class="identifier">p</span></code>
matched only a prefix of the input, so they
indicate failure if the entire input is not matched.
</p>
<h5>
<a name="boost_parser.tutorial.the__parse____api.h2"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.with_or_without_an_attribute_out_parameter"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.with_or_without_an_attribute_out_parameter">With
or without an attribute out-parameter</a>
</h5>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="char">'"'</span> <span class="special">&gt;&gt;</span> <span class="special">*(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span> <span class="special">-</span> <span class="char">'"'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'"'</span><span class="special">;</span>
<span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span> <span class="identifier">str</span> <span class="special">=</span> <span class="string">"\"two words\""</span> <span class="special">;</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">result_1</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result_1</span><span class="special">);</span> <span class="comment">// success is true; result_1 is "two words"</span>
<span class="keyword">auto</span> <span class="identifier">result_2</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">p</span><span class="special">);</span> <span class="comment">// !!result_2 is true; *result_2 is "two words"</span>
</pre>
<p>
When you call <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> <span class="bold"><strong>with</strong></span>
an attribute out-parameter and parser <code class="computeroutput"><span class="identifier">p</span></code>,
the expected type is <span class="bold"><strong>something like</strong></span> <code class="computeroutput"><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">p</span><span class="special">)</span></code>.
It doesn't have to be exactly that; I'll explain in a bit. The return type
is <code class="computeroutput"><span class="keyword">bool</span></code>.
</p>
<p>
When you call <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> <span class="bold"><strong>without</strong></span>
an attribute out-parameter and parser <code class="computeroutput"><span class="identifier">p</span></code>,
the return type is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special">&lt;</span><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">p</span><span class="special">)&gt;</span></code>.
Note that when <code class="computeroutput"><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">p</span><span class="special">)</span></code>
is itself an <code class="computeroutput"><span class="identifier">optional</span></code>, the
return type is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special">&lt;</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special">&lt;...&gt;&gt;</span></code>. Each of those optionals tells
you something different. The outer one tells you whether the parse succeeded.
If it did, the inner one is the attribute that <code class="computeroutput"><span class="identifier">p</span></code>
generated, which is itself an <code class="computeroutput"><span class="identifier">optional</span></code>
and so may or may not hold a value.
</p>
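<p>
The two levels of <code class="computeroutput">optional</code> can be told apart by testing them in order. The sketch below uses only the standard library: <code class="computeroutput">nested_result</code> is a hypothetical alias standing in for the return type of a <code class="computeroutput">parse()</code> call whose parser has an <code class="computeroutput">std::optional&lt;int&gt;</code> attribute, and <code class="computeroutput">describe</code> is an illustrative helper, not part of the library.
</p>

```cpp
#include <cassert>
#include <optional>
#include <string>

// Stand-in for the result of parse(r, p) when ATTR(p) is std::optional<int>.
using nested_result = std::optional<std::optional<int>>;

inline std::string describe(nested_result const & r)
{
    if (!r)
        return "parse failed";              // outer optional is empty
    if (!*r)
        return "parse succeeded; no value"; // inner optional is empty
    return "parse succeeded; value " + std::to_string(**r);
}
```

<p>
A default-constructed <code class="computeroutput">nested_result</code> models a failed parse, one initialized from an empty <code class="computeroutput">std::optional&lt;int&gt;</code> models a successful parse whose attribute is empty, and one initialized from <code class="computeroutput">std::optional&lt;int&gt;(42)</code> models a successful parse that produced 42.
</p>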
<h5>
<a name="boost_parser.tutorial.the__parse____api.h3"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.with_or_without_a_skipper"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.with_or_without_a_skipper">With
or without a skipper</a>
</h5>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="char">'"'</span> <span class="special">&gt;&gt;</span> <span class="special">*(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span> <span class="special">-</span> <span class="char">'"'</span><span class="special">)</span> <span class="special">&gt;&gt;</span> <span class="char">'"'</span><span class="special">;</span>
<span class="keyword">char</span> <span class="keyword">const</span> <span class="special">*</span> <span class="identifier">str</span> <span class="special">=</span> <span class="string">"\"two words\""</span> <span class="special">;</span>

<span class="keyword">auto</span> <span class="identifier">result_1</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">p</span><span class="special">);</span> <span class="comment">// !!result_1 is true; *result_1 is "two words"</span>
<span class="keyword">auto</span> <span class="identifier">result_2</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">);</span> <span class="comment">// !!result_2 is true; *result_2 is "twowords"</span>
</pre>
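<p>
The reason the skipper version yields <code class="computeroutput">"twowords"</code> is that the skipper gets a chance to consume whitespace before each element is matched, including each character inside the quotes. The following toy model, written with the standard library only, imitates that behavior for this one grammar. <code class="computeroutput">toy_quoted</code> and its <code class="computeroutput">skip_ws</code> flag are hypothetical, and the real library's skipping rules are only approximated here.
</p>

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Toy model of '"' >> *(char_ - '"') >> '"'. When `skip_ws` is true,
// whitespace is skipped before each element is matched, which is roughly
// what passing a whitespace skipper does.
inline std::optional<std::string> toy_quoted(std::string_view in, bool skip_ws)
{
    std::size_t i = 0;
    auto skip = [&] {
        while (skip_ws && i < in.size() &&
               std::isspace(static_cast<unsigned char>(in[i])))
            ++i;
    };

    skip();
    if (i == in.size() || in[i] != '"')
        return std::nullopt; // no opening quote
    ++i;

    std::string out;
    for (;;) {
        skip();
        if (i == in.size())
            return std::nullopt; // no closing quote
        if (in[i] == '"')
            break;
        out += in[i++];
    }
    ++i; // consume the closing quote

    skip();
    if (i != in.size())
        return std::nullopt; // trailing input: the whole range must match
    return out;
}
```

<p>
Without skipping, the space between the words is just another character matched inside the quotes, so it ends up in the result. With skipping, it is consumed before the next character is matched and never reaches the attribute.
</p>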
<h5>
<a name="boost_parser.tutorial.the__parse____api.h4"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.compatibility_of_attribute_out_parameters"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.compatibility_of_attribute_out_parameters">Compatibility
of attribute out-parameters</a>
</h5>
<p>
For any call to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> that takes an attribute
out-parameter, like <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse</a><span class="special">(</span><span class="string">"str"</span><span class="special">,</span>
<span class="identifier">p</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">,</span> <span class="identifier">out</span><span class="special">)</span></code>,
the call is well-formed for a number of possible types of <code class="computeroutput"><span class="identifier">out</span></code>;
<code class="computeroutput"><span class="keyword">decltype</span><span class="special">(</span><span class="identifier">out</span><span class="special">)</span></code> does
not need to be exactly <code class="computeroutput"><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">p</span><span class="special">)</span></code>.
</p>
<p>
For instance, this is well-formed code that does not abort (remember that
the attribute type of <code class="computeroutput"><a class="link" href="../../boost/parser/string.html" title="Function template string">string()</a></code>
is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>):
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"foo"</span><span class="special">);</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"foo"</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">success</span> <span class="special">&amp;&amp;</span> <span class="identifier">result</span> <span class="special">==</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;({</span><span class="char">'f'</span><span class="special">,</span> <span class="char">'o'</span><span class="special">,</span> <span class="char">'o'</span><span class="special">}));</span>
</pre>
<p>
Even though <code class="computeroutput"><span class="identifier">p</span></code> generates a
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> attribute, when it actually takes
the data it generates and writes it into an attribute, it only assumes that
the attribute is a <code class="computeroutput"><span class="identifier">container</span></code>
(see <a class="link" href="../concepts.html" title="Concepts">Concepts</a>), not that it
is some particular container type. It will happily <code class="computeroutput"><span class="identifier">insert</span><span class="special">()</span></code> into a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
or a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span></code> all
the same. <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> and <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">char</span><span class="special">&gt;</span></code>
are both containers of <code class="computeroutput"><span class="keyword">char</span></code>,
but it will also insert into a container with a different element type.
<code class="computeroutput"><span class="identifier">p</span></code> just needs to be able to
insert the elements it produces into the attribute-container. As long as
an implicit conversion allows that to work, everything is fine:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"foo"</span><span class="special">);</span>

<span class="identifier">std</span><span class="special">::</span><span class="identifier">deque</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;</span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"foo"</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">success</span> <span class="special">&amp;&amp;</span> <span class="identifier">result</span> <span class="special">==</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">deque</span><span class="special">&lt;</span><span class="keyword">int</span><span class="special">&gt;({</span><span class="char">'f'</span><span class="special">,</span> <span class="char">'o'</span><span class="special">,</span> <span class="char">'o'</span><span class="special">}));</span>
</pre>
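<p>
The "it only assumes a container" idea from the examples so far can be imitated with a small generic function. This is a sketch under stated assumptions, not the library's code: <code class="computeroutput">append_generated</code> is a hypothetical helper that plays the role of the parser writing its output, and it relies only on <code class="computeroutput">insert()</code> at <code class="computeroutput">end()</code> plus an implicit conversion from <code class="computeroutput">char</code> to the container's element type.
</p>

```cpp
#include <cassert>
#include <deque>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical helper: writes each generated char into any container whose
// value_type a char implicitly converts to. Nothing here names a specific
// container type.
template<typename Container>
void append_generated(Container & out, std::string_view generated)
{
    for (char const c : generated)
        out.insert(out.end(), c);
}
```

<p>
The same function body compiles for <code class="computeroutput">std::string</code>, <code class="computeroutput">std::vector&lt;char&gt;</code>, and <code class="computeroutput">std::deque&lt;int&gt;</code>; in the last case each <code class="computeroutput">char</code> is converted to <code class="computeroutput">int</code> as it is inserted.
</p>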
<p>
This works, too, even though it requires inserting elements from a generated
sequence of <code class="computeroutput"><span class="keyword">char32_t</span></code> into a
container of <code class="computeroutput"><span class="keyword">char</span></code> (remember
that the attribute type of <code class="computeroutput"><span class="special">+</span><a class="link" href="../../boost/parser/cp.html" title="Global cp">cp</a></code>
is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special">&lt;</span><span class="keyword">char32_t</span><span class="special">&gt;</span></code>):
</p>
|
||
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
|
||
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="special">+</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">cp</span><span class="special">;</span>
|
||
|
||
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">result</span><span class="special">;</span>
|
||
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"foo"</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">success</span> <span class="special">&&</span> <span class="identifier">result</span> <span class="special">==</span> <span class="string">"foo"</span><span class="special">);</span>
</pre>
<p>
This next example works as well, even though the change to a container is
not at the top level. It is an element of the result tuple:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">p</span> <span class="special">=</span> <span class="special">+(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">cp</span> <span class="special">-</span> <span class="char">' '</span><span class="special">)</span> <span class="special">>></span> <span class="char">' '</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">string</span><span class="special">(</span><span class="string">"foo"</span><span class="special">);</span>
<span class="keyword">using</span> <span class="identifier">attr_type</span> <span class="special">=</span> <span class="keyword">decltype</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">u8</span><span class="string">""</span><span class="special">,</span> <span class="identifier">p</span><span class="special">));</span>
<span class="keyword">static_assert</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">is_same_v</span><span class="special"><</span>
<span class="identifier">attr_type</span><span class="special">,</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special"><</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">>>>);</span>
<span class="keyword">using</span> <span class="keyword">namespace</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">literals</span><span class="special">;</span>
<span class="special">{</span>
<span class="comment">// This is similar to attr_type, with the first std::string changed to a std::vector<int>.</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">>,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">u8</span><span class="string">"rôle foo"</span> <span class="special">|</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">as_utf8</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">success</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">get</span><span class="special">(</span><span class="identifier">result</span><span class="special">,</span> <span class="number">0</span><span class="identifier">_c</span><span class="special">)</span> <span class="special">==</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">>({</span><span class="char">'r'</span><span class="special">,</span> <span class="identifier">U</span><span class="char">'ô'</span><span class="special">,</span> <span class="char">'l'</span><span class="special">,</span> <span class="char">'e'</span><span class="special">}));</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">get</span><span class="special">(</span><span class="identifier">result</span><span class="special">,</span> <span class="number">1</span><span class="identifier">_c</span><span class="special">)</span> <span class="special">==</span> <span class="string">"foo"</span><span class="special">);</span>
<span class="special">}</span>
<span class="special">{</span>
<span class="comment">// This time, we have a std::vector<char> instead of a std::vector<int>.</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">char</span><span class="special">>,</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="keyword">const</span> <span class="identifier">success</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">u8</span><span class="string">"rôle foo"</span> <span class="special">|</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">as_utf8</span><span class="special">,</span> <span class="identifier">p</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">success</span><span class="special">);</span>
<span class="comment">// The 4 code points "rôle" get transcoded to 5 UTF-8 code units to fit in the std::vector<char>.</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">get</span><span class="special">(</span><span class="identifier">result</span><span class="special">,</span> <span class="number">0</span><span class="identifier">_c</span><span class="special">)</span> <span class="special">==</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">char</span><span class="special">>({</span><span class="char">'r'</span><span class="special">,</span> <span class="special">(</span><span class="keyword">char</span><span class="special">)</span><span class="number">0xc3</span><span class="special">,</span> <span class="special">(</span><span class="keyword">char</span><span class="special">)</span><span class="number">0xb4</span><span class="special">,</span> <span class="char">'l'</span><span class="special">,</span> <span class="char">'e'</span><span class="special">}));</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">get</span><span class="special">(</span><span class="identifier">result</span><span class="special">,</span> <span class="number">1</span><span class="identifier">_c</span><span class="special">)</span> <span class="special">==</span> <span class="string">"foo"</span><span class="special">);</span>
<span class="special">}</span>
</pre>
<p>
As indicated in the inline comments, there are a couple of things to take
away from this example:
</p>
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
<li class="listitem">
If you change an attribute out-param (such as <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>
to <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span></code>,
or <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">char32_t</span><span class="special">></span></code>
to <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">deque</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span></code>),
the call to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> will often still be
well-formed.
</li>
<li class="listitem">
When changing out a container type, if both containers contain character
values, the removed container's element type is <code class="computeroutput"><span class="keyword">char32_t</span></code>
(or <code class="computeroutput"><span class="keyword">wchar_t</span></code> for non-MSVC
builds), and the new container's element type is <code class="computeroutput"><span class="keyword">char</span></code>
or <code class="computeroutput"><span class="identifier">char8_t</span></code>, Boost.Parser
assumes that this is a UTF-32-to-UTF-8 conversion, and silently transcodes
the data when inserting into the new container.
</li>
</ul></div>
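<p>
To make the second point concrete, here is a minimal sketch of a UTF-32-to-UTF-8
transcoding step written with only the standard library. It is not Boost.Parser's
implementation; the helper names <code class="computeroutput">append_utf8</code> and
<code class="computeroutput">to_utf8</code> are hypothetical, and the sketch assumes
every input value is a valid Unicode scalar value (no validation is done).
</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper (not part of Boost.Parser): appends the UTF-8
// encoding of one code point to out. Assumes cp is a valid Unicode
// scalar value; surrogates and out-of-range values are not checked.
inline void append_utf8(char32_t cp, std::vector<char> & out)
{
    if (cp < 0x80) {
        // 1 code unit: 0xxxxxxx
        out.push_back(static_cast<char>(cp));
    } else if (cp < 0x800) {
        // 2 code units: 110xxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else if (cp < 0x10000) {
        // 3 code units: 1110xxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    } else {
        // 4 code units: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out.push_back(static_cast<char>(0xF0 | (cp >> 18)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 12) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
        out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
    }
}

// Transcodes a whole sequence of code points into UTF-8 code units.
inline std::vector<char> to_utf8(std::u32string const & code_points)
{
    std::vector<char> out;
    for (char32_t cp : code_points)
        append_utf8(cp, out);
    return out;
}
```

<p>
Under these assumptions, the four code points of "rôle" become five
<code class="computeroutput">char</code> values, because U+00F4 needs two UTF-8 code
units. This matches the contents of the <code class="computeroutput">std::vector&lt;char&gt;</code>
in the example above.
</p>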
<p>
Let's look at a case where another simple-seeming type replacement does
<span class="bold"><strong>not</strong></span> work. First, the case that works:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">parser</span> <span class="special">=</span> <span class="special">-(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span> <span class="special">%</span> <span class="char">','</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"a, b"</span><span class="special">,</span> <span class="identifier">parser</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
</pre>
<p>
<code class="computeroutput"><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">parser</span><span class="special">)</span></code>
is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span></code>. Even though we pass a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span></code>,
everything is fine. However, if we modify this case only slightly, so that
the <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span></code> is nested within the attribute, the code
becomes ill-formed.
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">S</span>
<span class="special">{</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">int</span><span class="special">></span> <span class="identifier">chars</span><span class="special">;</span>
<span class="keyword">int</span> <span class="identifier">i</span><span class="special">;</span>
<span class="special">};</span>
<span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">parser</span> <span class="special">=</span> <span class="special">-(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span> <span class="special">%</span> <span class="char">','</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span><span class="special">;</span>
<span class="identifier">S</span> <span class="identifier">result</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"a, b 42"</span><span class="special">,</span> <span class="identifier">parser</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">ws</span><span class="special">,</span> <span class="identifier">result</span><span class="special">);</span>
</pre>
<p>
If we change <code class="computeroutput"><span class="identifier">chars</span></code> to a
<code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">vector</span><span class="special"><</span><span class="keyword">char</span><span class="special">></span></code>,
the code is still ill-formed. Same if we change <code class="computeroutput"><span class="identifier">chars</span></code>
to a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>. We must actually use <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span></code> exactly to make the code well-formed
again.
</p>
<p>
The reason the same looseness from the top-level parser does not apply to
a nested parser is that, at some point in the code, the parser <code class="computeroutput"><span class="special">-(</span><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span> <span class="special">%</span> <span class="char">','</span><span class="special">)</span></code> would try
to assign a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">optional</span><span class="special"><</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span><span class="special">></span></code>, the attribute type it normally generates,
to the <code class="computeroutput"><span class="identifier">chars</span></code> member.
If there's no implicit conversion there, the code is ill-formed.
</p>
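<p>
The failing step described above can be reproduced without Boost.Parser by
asking the compiler whether the plain assignment is well-formed. This is only
a sketch of the type relationship the text describes, not the library's
internal logic; the alias <code class="computeroutput">generated_attr</code> is a
name introduced here for illustration.
</p>

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <type_traits>
#include <vector>

// The attribute that -(bp::char_ % ',') generates, per the text above.
// (generated_attr is an illustrative alias, not a Boost.Parser name.)
using generated_attr = std::optional<std::string>;

// None of the replacement member types tried above can be assigned from
// the generated attribute, so the nested case is ill-formed...
static_assert(!std::is_assignable_v<std::vector<int> &, generated_attr>);
static_assert(!std::is_assignable_v<std::vector<char> &, generated_attr>);
static_assert(!std::is_assignable_v<std::string &, generated_attr>);

// ...while the exact type is assignable, which is why using
// std::optional<std::string> for the member makes the code compile again.
static_assert(std::is_assignable_v<std::optional<std::string> &, generated_attr>);
```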
<p>
The take-away from this last example is that you have a lot of freedom to swap
out data types within the type of the attribute you pass to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>, but that freedom is
limited to structurally simple cases. When we discuss <code class="computeroutput"><a class="link" href="../../boost/parser/rule.html" title="Struct template rule">rules</a></code> in the next section,
we'll see how this flexibility in the types of attributes can help when writing
complicated parsers.
</p>
<p>
Those were examples of swapping out one container type for another. Container
swaps get a lot of coverage here because they are the most likely to be
surprising. You can also do much simpler things, like parsing
with a <code class="computeroutput"><a class="link" href="../../boost/parser/uint_.html" title="Global uint_">uint_</a></code>
and writing its attribute into a <code class="computeroutput"><span class="keyword">double</span></code>.
In general, you can swap any type <code class="computeroutput"><span class="identifier">T</span></code>
out of the attribute, as long as the swap would not result in some ill-formed
assignment within the parse.
</p>
<p>
Here is another example that also produces surprising results, for a different
reason.
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="keyword">constexpr</span> <span class="keyword">auto</span> <span class="identifier">parser</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'b'</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'c'</span><span class="special">)</span> <span class="special">|</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'x'</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'y'</span><span class="special">)</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'z'</span><span class="special">);</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">str</span> <span class="special">=</span> <span class="string">"abc"</span><span class="special">;</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="keyword">char</span><span class="special">,</span> <span class="keyword">char</span><span class="special">,</span> <span class="keyword">char</span><span class="special">></span> <span class="identifier">chars</span><span class="special">;</span>
<span class="keyword">bool</span> <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">str</span><span class="special">,</span> <span class="identifier">parser</span><span class="special">,</span> <span class="identifier">chars</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">b</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">chars</span> <span class="special">==</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special">(</span><span class="char">'c'</span><span class="special">,</span> <span class="char">'\0'</span><span class="special">,</span> <span class="char">'\0'</span><span class="special">));</span>
</pre>
<p>
This looks wrong, but is expected behavior. At every stage of the parse that
produces an attribute, Boost.Parser tries to assign that attribute to some
part of the out-param attribute provided to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>,
if there is one. Note that <code class="computeroutput"><span class="emphasis"><em><code class="literal">ATTR</code></em></span><span class="special">(</span><span class="identifier">parser</span><span class="special">)</span></code> is <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>,
because each sequence parser is three <code class="computeroutput"><span class="identifier">char_</span></code>
parsers in a row, which forms a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>;
there are two such alternatives, so the overall attribute is also <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code>.
During the parse, when the first parser <code class="computeroutput"><span class="identifier">bp</span><span class="special">::</span><span class="identifier">char_</span><span class="special">(</span><span class="char">'a'</span><span class="special">)</span></code>
matches the input, it produces the attribute <code class="computeroutput"><span class="char">'a'</span></code>
and needs to assign it to its destination. Some logic inside the sequence
parser indicates that this <code class="computeroutput"><span class="char">'a'</span></code>
contributes to the value in the <code class="computeroutput"><span class="number">0</span></code>th
position in the result tuple, if the result is being written into a tuple.
Here, we passed a <code class="computeroutput"><span class="identifier">bp</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="keyword">char</span><span class="special">,</span> <span class="keyword">char</span><span class="special">,</span> <span class="keyword">char</span><span class="special">></span></code>,
so it writes <code class="computeroutput"><span class="char">'a'</span></code> into the first
element. Each subsequent <code class="computeroutput"><span class="identifier">char_</span></code>
parser does the same thing, and writes over the first element. If we had
passed a <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> as the out-param instead, the logic
would have seen that the out-param attribute is a string, and would have
appended <code class="computeroutput"><span class="char">'a'</span></code> to it. Then each subsequent
parser would have appended to the string.
</p>
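<p>
The difference between the two out-param types can be imitated with a small
standard-library model. This is a hypothetical simplification, not
Boost.Parser's code: <code class="computeroutput">std::tuple</code> stands in for
<code class="computeroutput">bp::tuple</code>, and the names
<code class="computeroutput">is_tuple</code>, <code class="computeroutput">store_char</code>
and <code class="computeroutput">parse_abc</code> are invented for the sketch. It
only models the rule stated above: each character goes to position 0 of a
tuple, or is appended to a string.
</p>

```cpp
#include <cassert>
#include <initializer_list>
#include <string>
#include <tuple>
#include <type_traits>

// Illustrative trait: detects std::tuple, our stand-in for bp::tuple.
template<typename T>
struct is_tuple : std::false_type {};
template<typename... Ts>
struct is_tuple<std::tuple<Ts...>> : std::true_type {};

// Hypothetical model of what one char_ parser does with its attribute.
template<typename Out>
void store_char(Out & out, char c)
{
    if constexpr (is_tuple<Out>::value) {
        // Each char_ contributes to position 0 of the result tuple, so
        // every call overwrites the value written by the previous one.
        std::get<0>(out) = c;
    } else {
        // A string out-param is appended to instead.
        out.push_back(c);
    }
}

// Models three char_ parsers in a row matching the input "abc".
template<typename Out>
Out parse_abc()
{
    Out out{};  // value-initialized: tuple elements start as '\0'
    for (char c : {'a', 'b', 'c'})
        store_char(out, c);
    return out;
}
```

<p>
With a three-element tuple of <code class="computeroutput">char</code>, this model
leaves <code class="computeroutput">'c'</code> in the first element and never
touches the other two; with a <code class="computeroutput">std::string</code>, it
yields the full matched text.
</p>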
<p>
Boost.Parser never looks at the arity of the tuple passed to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> to see if there are too
many or too few elements in it, compared to the expected attribute for the
parser. In this case, there are two extra elements that are never touched.
If there had been too few elements in the tuple, you would have seen a compilation
error. The reason that Boost.Parser never does this kind of type-checking
up front is that the loose assignment logic is spread out among the individual
parsers; the top-level parse can determine what the expected attribute is,
but not whether a passed attribute of another type is a suitable stand-in.
</p>
<h5>
<a name="boost_parser.tutorial.the__parse____api.h5"></a>
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.compatibility_of__code__phrase_role__identifier__variant__phrase___code__attribute_out_parameters"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.compatibility_of__code__phrase_role__identifier__variant__phrase___code__attribute_out_parameters">Compatibility
of <code class="computeroutput"><span class="identifier">variant</span></code> attribute out-parameters</a>
</h5>
<p>
The use of a variant in an out-param is compatible if the default attribute
can be assigned to the <code class="computeroutput"><span class="identifier">variant</span></code>.
No other work is done to make the assignment compatible. For instance, this
will work as you'd expect:
</p>
<pre class="programlisting"><span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">variant</span><span class="special"><</span><span class="keyword">int</span><span class="special">,</span> <span class="keyword">double</span><span class="special">></span> <span class="identifier">v</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">b</span> <span class="special">=</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"42"</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span><span class="special">,</span> <span class="identifier">v</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">b</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">v</span><span class="special">.</span><span class="identifier">index</span><span class="special">()</span> <span class="special">==</span> <span class="number">0</span><span class="special">);</span>
<span class="identifier">assert</span><span class="special">(</span><span class="identifier">std</span><span class="special">::</span><span class="identifier">get</span><span class="special"><</span><span class="number">0</span><span class="special">>(</span><span class="identifier">v</span><span class="special">)</span> <span class="special">==</span> <span class="number">42</span><span class="special">);</span>
</pre>
<p>
Again, this works because <code class="computeroutput"><span class="identifier">v</span> <span class="special">=</span> <span class="number">42</span></code> is well-formed.
However, other kinds of substitutions will not work. In particular, the
<code class="computeroutput"><a class="link" href="../../boost/parser/tuple.html" title="Type definition tuple">boost::parser::tuple</a></code>
to aggregate or aggregate to <code class="computeroutput"><a class="link" href="../../boost/parser/tuple.html" title="Type definition tuple">boost::parser::tuple</a></code> transformations will
not work. Here's an example.
</p>
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">key_value</span>
<span class="special">{</span>
<span class="keyword">int</span> <span class="identifier">key</span><span class="special">;</span>
<span class="keyword">double</span> <span class="identifier">value</span><span class="special">;</span>
<span class="special">};</span>
<span class="keyword">namespace</span> <span class="identifier">bp</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">;</span>
<span class="identifier">std</span><span class="special">::</span><span class="identifier">variant</span><span class="special"><</span><span class="identifier">key_value</span><span class="special">,</span> <span class="keyword">double</span><span class="special">></span> <span class="identifier">kv_or_d</span><span class="special">;</span>
<span class="identifier">key_value</span> <span class="identifier">kv</span><span class="special">;</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"42 13.0"</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">double_</span><span class="special">,</span> <span class="identifier">kv</span><span class="special">);</span> <span class="comment">// Ok.</span>
<span class="identifier">bp</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"42 13.0"</span><span class="special">,</span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">int_</span> <span class="special">>></span> <span class="identifier">bp</span><span class="special">::</span><span class="identifier">double_</span><span class="special">,</span> <span class="identifier">kv_or_d</span><span class="special">);</span> <span class="comment">// Error: ill-formed!</span>
|
||
</pre>
|
||
<p>
|
||
In this case, it would be easy for Boost.Parser to look at the alternative
|
||
types covered by the variant, and do a conversion. However, there are many
|
||
cases in which there is no obviously correct variant alternative type, or
|
||
in which the user might expect one variant alternative type and get another.
|
||
Consider a couple of cases.
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">i_d</span> <span class="special">{</span> <span class="keyword">int</span> <span class="identifier">i</span><span class="special">;</span> <span class="keyword">double</span> <span class="identifier">d</span><span class="special">;</span> <span class="special">};</span>
<span class="keyword">struct</span> <span class="identifier">d_i</span> <span class="special">{</span> <span class="keyword">double</span> <span class="identifier">d</span><span class="special">;</span> <span class="keyword">int</span> <span class="identifier">i</span><span class="special">;</span> <span class="special">};</span>
<span class="keyword">using</span> <span class="identifier">v1</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">variant</span><span class="special"><</span><span class="identifier">i_d</span><span class="special">,</span> <span class="identifier">d_i</span><span class="special">>;</span>

<span class="keyword">struct</span> <span class="identifier">i_s</span> <span class="special">{</span> <span class="keyword">int</span> <span class="identifier">i</span><span class="special">;</span> <span class="keyword">short</span> <span class="identifier">s</span><span class="special">;</span> <span class="special">};</span>
<span class="keyword">struct</span> <span class="identifier">d_d</span> <span class="special">{</span> <span class="keyword">double</span> <span class="identifier">d1</span><span class="special">;</span> <span class="keyword">double</span> <span class="identifier">d2</span><span class="special">;</span> <span class="special">};</span>
<span class="keyword">using</span> <span class="identifier">v2</span> <span class="special">=</span> <span class="identifier">std</span><span class="special">::</span><span class="identifier">variant</span><span class="special"><</span><span class="identifier">i_s</span><span class="special">,</span> <span class="identifier">d_d</span><span class="special">>;</span>

<span class="keyword">using</span> <span class="identifier">tup_t</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">tuple</span><span class="special"><</span><span class="keyword">short</span><span class="special">,</span> <span class="keyword">short</span><span class="special">>;</span>
</pre>
|
||
<p>
|
||
If we have a parser that produces a <code class="computeroutput"><span class="identifier">tup_t</span></code>,
|
||
and we have a <code class="computeroutput"><span class="identifier">v1</span></code> attribute
|
||
out-param, no correct variant alternative type exists. The case is
ambiguous, and neither variant alternative is a better match than the
other. Assigning a <code class="computeroutput"><span class="identifier">tup_t</span></code>
to <code class="computeroutput"><span class="identifier">v2</span></code> is even worse. The
same ambiguity exists, but to the user, <code class="computeroutput"><span class="identifier">i_s</span></code>
is clearly "closer" than <code class="computeroutput"><span class="identifier">d_d</span></code>.
|
||
</p>
|
||
<p>
|
||
So, Boost.Parser only does assignment. If some parser <code class="computeroutput"><span class="identifier">P</span></code>
|
||
generates a default attribute that is not assignable to a variant alternative
|
||
that you want to assign it to, you can just create a <code class="computeroutput"><a class="link" href="../../boost/parser/rule.html" title="Struct template rule">rule</a></code> that creates either an
|
||
exact variant alternative type, or the variant itself, and use <code class="computeroutput"><span class="identifier">P</span></code> as your rule's parser.
|
||
</p>
|
||
<h5>
|
||
<a name="boost_parser.tutorial.the__parse____api.h6"></a>
|
||
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.unicode_versus_non_unicode_parsing"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.unicode_versus_non_unicode_parsing">Unicode
|
||
versus non-Unicode parsing</a>
|
||
</h5>
|
||
<p>
|
||
A call to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> either considers the entire
|
||
input to be in a UTF format (UTF-8, UTF-16, or UTF-32), or it considers the
|
||
entire input to be in some unknown encoding. Here is how it deduces which
|
||
case the call falls under:
|
||
</p>
|
||
<div class="itemizedlist"><ul class="itemizedlist" style="list-style-type: disc; ">
|
||
<li class="listitem">
|
||
If the range is a sequence of <code class="computeroutput"><span class="identifier">char8_t</span></code>,
|
||
or if the input is a <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">utf8_view</span></code>,
|
||
the input is UTF-8.
|
||
</li>
|
||
<li class="listitem">
|
||
Otherwise, if the value type of the range is <code class="computeroutput"><span class="keyword">char</span></code>,
|
||
the input is in an unknown encoding.
|
||
</li>
|
||
<li class="listitem">
|
||
Otherwise, the input is in a UTF encoding.
|
||
</li>
|
||
</ul></div>
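<p>
The three rules above can be modeled as a small compile-time function. This is only a sketch of the decision order, not Boost.Parser's actual implementation: <code class="computeroutput">input_encoding</code>, <code class="computeroutput">fake_utf8_view</code>, and <code class="computeroutput">deduce_encoding</code> are hypothetical names, and <code class="computeroutput">fake_utf8_view</code> merely stands in for <code class="computeroutput">boost::parser::utf8_view</code>.
</p>

```cpp
#include <cassert>
#include <iterator>
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical names throughout; this models the rules in the list above.
enum class input_encoding { utf8, unknown, utf };

// Stand-in for boost::parser::utf8_view, used only to model the special case.
template<typename Range>
struct fake_utf8_view
{
    Range base;
};

template<typename T>
struct is_fake_utf8_view : std::false_type {};
template<typename Range>
struct is_fake_utf8_view<fake_utf8_view<Range>> : std::true_type {};

// True only for char8_t, which exists only when the compiler supports it.
template<typename T>
constexpr bool is_char8 =
#if defined(__cpp_char8_t)
    std::is_same_v<T, char8_t>;
#else
    false;
#endif

template<typename Range>
constexpr input_encoding deduce_encoding()
{
    if constexpr (is_fake_utf8_view<Range>::value) {
        return input_encoding::utf8; // Rule 1: an explicit UTF-8 view.
    } else {
        using value_type = std::remove_cv_t<std::remove_reference_t<
            decltype(*std::begin(std::declval<Range &>()))>>;
        if constexpr (is_char8<value_type>)
            return input_encoding::utf8; // Rule 1: a sequence of char8_t.
        else if constexpr (std::is_same_v<value_type, char>)
            return input_encoding::unknown; // Rule 2: plain char.
        else
            return input_encoding::utf; // Rule 3: everything else.
    }
}
```

<p>
Under this model, <code class="computeroutput">std::string</code> maps to the unknown encoding, <code class="computeroutput">std::u16string</code> and <code class="computeroutput">std::u32string</code> map to a UTF encoding, and a UTF-8 view over a <code class="computeroutput">std::string</code> maps to UTF-8.
</p>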
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
If you want to parse in ASCII-only mode, or in some other non-Unicode
|
||
encoding, use only sequences of <code class="computeroutput"><span class="keyword">char</span></code>,
|
||
like <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span></code> or <code class="computeroutput"><span class="keyword">char</span>
|
||
<span class="keyword">const</span> <span class="special">*</span></code>.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
If you want to ensure all input is parsed as Unicode, pass the input range
|
||
<code class="computeroutput"><span class="identifier">r</span></code> as <code class="computeroutput"><span class="identifier">r</span>
|
||
<span class="special">|</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">as_utf32</span></code>
|
||
— that's the first thing that happens to it inside <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> in the Unicode parsing
|
||
path anyway.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<div class="note"><table border="0" summary="Note">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Note]" src="../../images/note.png"></td>
|
||
<th align="left">Note</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Since passing <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">utfN_view</span></code> is a special case, and since
|
||
a sequence of <code class="computeroutput"><span class="keyword">char</span></code>s <code class="computeroutput"><span class="identifier">r</span></code> is otherwise considered an unknown
|
||
encoding, <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">r</span> <span class="special">|</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">as_utf8</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span></code> treats
|
||
<code class="computeroutput"><span class="identifier">r</span></code> as UTF-8, whereas <code class="computeroutput"><span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="identifier">r</span><span class="special">,</span> <span class="identifier">p</span><span class="special">)</span></code> does not.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<h5>
|
||
<a name="boost_parser.tutorial.the__parse____api.h7"></a>
|
||
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.the__code__phrase_role__identifier__trace_mode__phrase___code__parameter_to__functionname_alt__boost__parser__parse___code__phrase_role__identifier__parse__phrase__phrase_role__special______phrase___code___functionname_"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.the__code__phrase_role__identifier__trace_mode__phrase___code__parameter_to__functionname_alt__boost__parser__parse___code__phrase_role__identifier__parse__phrase__phrase_role__special______phrase___code___functionname_">The
|
||
<code class="computeroutput"><span class="identifier">trace_mode</span></code> parameter to
|
||
parse()</a>
|
||
</h5>
|
||
<p>
|
||
Debugging parsers is notoriously difficult once they reach a certain size.
|
||
To get a verbose trace of your parse, pass <code class="computeroutput"><a class="link" href="../../boost/parser/trace.html" title="Type trace">boost::parser::trace</a><span class="special">::</span><span class="identifier">on</span></code> as the final parameter to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>. It will show you the current
|
||
parser being matched, the next few characters to be parsed, and any attributes
|
||
generated. See the <a class="link" href="error_handling_and_debugging.html" title="Error Handling and Debugging">Error
|
||
Handling and Debugging</a> section of the tutorial for details.
|
||
</p>
|
||
<h5>
|
||
<a name="boost_parser.tutorial.the__parse____api.h8"></a>
|
||
<span class="phrase"><a name="boost_parser.tutorial.the__parse____api.globals_and_error_handlers"></a></span><a class="link" href="the__parse____api.html#boost_parser.tutorial.the__parse____api.globals_and_error_handlers">Globals
|
||
and error handlers</a>
|
||
</h5>
|
||
<p>
|
||
Each call to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code> can optionally have a globals
|
||
object associated with it. To use a particular globals object with your parser,
|
||
you call <code class="computeroutput"><a class="link" href="../../boost/parser/with_globals.html" title="Function template with_globals">with_globals()</a></code> to create a new parser with
|
||
the globals object in it:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">struct</span> <span class="identifier">globals_t</span>
<span class="special">{</span>
    <span class="keyword">int</span> <span class="identifier">foo</span><span class="special">;</span>
    <span class="identifier">std</span><span class="special">::</span><span class="identifier">string</span> <span class="identifier">bar</span><span class="special">;</span>
<span class="special">};</span>
<span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">parser</span> <span class="special">=</span> <span class="comment">/* ... */</span><span class="special">;</span>
<span class="identifier">globals_t</span> <span class="identifier">globals</span><span class="special">{</span><span class="number">42</span><span class="special">,</span> <span class="string">"yay"</span><span class="special">};</span>
<span class="keyword">auto</span> <span class="identifier">result</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"str"</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">with_globals</span><span class="special">(</span><span class="identifier">parser</span><span class="special">,</span> <span class="identifier">globals</span><span class="special">));</span>
</pre>
|
||
<p>
|
||
Every semantic action within that call to <code class="computeroutput"><a class="link" href="../../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
|
||
can access the same <code class="computeroutput"><span class="identifier">globals_t</span></code>
|
||
object using <code class="computeroutput"><span class="identifier">_globals</span><span class="special">(</span><span class="identifier">ctx</span><span class="special">)</span></code>.
|
||
</p>
|
||
<p>
|
||
The default error handler is great for most needs, but if you want to change
|
||
it, you can do so by creating a new parser with a call to <code class="computeroutput"><a class="link" href="../../boost/parser/with_error_handler.html" title="Function template with_error_handler">with_error_handler()</a></code>:
|
||
</p>
|
||
<pre class="programlisting"><span class="keyword">auto</span> <span class="keyword">const</span> <span class="identifier">parser</span> <span class="special">=</span> <span class="comment">/* ... */</span><span class="special">;</span>
<span class="identifier">my_error_handler</span> <span class="identifier">error_handler</span><span class="special">;</span>
<span class="keyword">auto</span> <span class="identifier">result</span> <span class="special">=</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">parse</span><span class="special">(</span><span class="string">"str"</span><span class="special">,</span> <span class="identifier">boost</span><span class="special">::</span><span class="identifier">parser</span><span class="special">::</span><span class="identifier">with_error_handler</span><span class="special">(</span><span class="identifier">parser</span><span class="special">,</span> <span class="identifier">error_handler</span><span class="special">));</span>
</pre>
|
||
<div class="tip"><table border="0" summary="Tip">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Tip]" src="../../images/tip.png"></td>
|
||
<th align="left">Tip</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
If your parsing environment does not allow you to report errors to a terminal,
|
||
you may want to use <code class="computeroutput"><a class="link" href="../../boost/parser/callback_error_handler.html" title="Struct callback_error_handler">callback_error_handler</a></code> instead
|
||
of the default error handler.
|
||
</p></td></tr>
|
||
</table></div>
|
||
<div class="important"><table border="0" summary="Important">
|
||
<tr>
|
||
<td rowspan="2" align="center" valign="top" width="25"><img alt="[Important]" src="../../images/important.png"></td>
|
||
<th align="left">Important</th>
|
||
</tr>
|
||
<tr><td align="left" valign="top"><p>
|
||
Globals and the error handler are ignored, if present, on any parser except
|
||
the top-level parser.
|
||
</p></td></tr>
|
||
</table></div>
|
||
</div>
|
||
<div class="copyright-footer">Copyright © 2020 T. Zachary Laine<p>
|
||
Distributed under the Boost Software License, Version 1.0. (See accompanying
|
||
file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
|
||
</p>
|
||
</div>
|
||
<hr>
|
||
<div class="spirit-nav">
|
||
<a accesskey="p" href="attribute_generation.html"><img src="../../images/prev.png" alt="Prev"></a><a accesskey="u" href="../tutorial.html"><img src="../../images/up.png" alt="Up"></a><a accesskey="h" href="../../index.html"><img src="../../images/home.png" alt="Home"></a><a accesskey="n" href="more_about_rules.html"><img src="../../images/next.png" alt="Next"></a>
|
||
</div>
|
||
</body>
|
||
</html>
|