
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
<title>Introduction</title>
<link rel="stylesheet" href="../boostbook.css" type="text/css">
<meta name="generator" content="DocBook XSL Stylesheets V1.79.1">
<link rel="home" href="../index.html" title="Chapter 1. Boost.Parser">
<link rel="up" href="../index.html" title="Chapter 1. Boost.Parser">
<link rel="prev" href="../index.html" title="Chapter 1. Boost.Parser">
<link rel="next" href="configuration_and_optional_features.html" title="Configuration and Optional Features">
<meta name="viewport" content="width=device-width, initial-scale=1">
</head>
<body bgcolor="white" text="black" link="#0000FF" vlink="#840084" alink="#0000FF">
<div class="spirit-nav">
<a accesskey="p" href="../index.html"><img src="../images/prev.png" alt="Prev"></a><a accesskey="u" href="../index.html"><img src="../images/up.png" alt="Up"></a><a accesskey="h" href="../index.html"><img src="../images/home.png" alt="Home"></a><a accesskey="n" href="configuration_and_optional_features.html"><img src="../images/next.png" alt="Next"></a>
</div>
<div class="section">
<div class="titlepage"><div><div><h2 class="title" style="clear: both">
<a name="boost_parser.introduction"></a><a class="link" href="introduction.html" title="Introduction">Introduction</a>
</h2></div></div></div>
<p>
      Boost.Parser is a <a href="https://en.wikipedia.org/wiki/Parser_combinator" target="_top">parser
      combinator</a> library. That is, it consists of a set of low-level primitive
      parsers, and operations that can be used to combine those parsers into more
      complicated parsers.
    </p>
<p>
      There are primitive parsers that parse <span class="emphasis"><em>epsilon</em></span> (the empty
      string), <code class="computeroutput"><span class="keyword">char</span></code>s, <code class="computeroutput"><span class="keyword">int</span></code>s, <code class="computeroutput"><span class="keyword">float</span></code>s,
      etc.
    </p>
<p>
      There are operations which combine parsers to create new parsers. For instance,
      the <a href="https://en.wikipedia.org/wiki/Kleene_star" target="_top">Kleene star</a>
      operation takes an existing parser <code class="computeroutput"><span class="identifier">p</span></code>
      and creates a new parser that matches zero or more occurrences of whatever
      <code class="computeroutput"><span class="identifier">p</span></code> matches. The combining operations
      are provided as both callable objects and operator overloads. For example,
      <code class="computeroutput"><span class="keyword">operator</span><span class="special">*()</span></code>
      is used for <a href="https://en.wikipedia.org/wiki/Kleene_star" target="_top">Kleene star</a>,
      and you can also write <code class="computeroutput"><span class="identifier">repeat</span><span class="special">(</span><span class="identifier">n</span><span class="special">)[</span><span class="identifier">p</span><span class="special">]</span></code> to create
      a parser for exactly <code class="computeroutput"><span class="identifier">n</span></code> repetitions
      of <code class="computeroutput"><span class="identifier">p</span></code>.
    </p>
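<p>
      To make the idea of combining parsers concrete, here is a toy sketch in plain
      C++17. It is not Boost.Parser's implementation or API: the names
      <code class="computeroutput">match</code>, <code class="computeroutput">parser</code>,
      <code class="computeroutput">lit</code>, <code class="computeroutput">star</code>, and
      <code class="computeroutput">repeat</code> are hypothetical, invented for this illustration.
      It only shows how a Kleene star and an exact-count repetition can be built from
      an existing parser.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <optional>
#include <string>
#include <string_view>

// Toy model (hypothetical, not Boost.Parser): a parser takes the remaining
// input and, on success, returns the matched text and how much it consumed.
struct match {
    std::string text;
    std::size_t consumed;
};
using parser = std::function<std::optional<match>(std::string_view)>;

// Primitive parser: matches exactly one occurrence of the character c.
parser lit(char c) {
    return [c](std::string_view in) -> std::optional<match> {
        if (!in.empty() && in.front() == c)
            return match{std::string(1, c), 1};
        return std::nullopt;
    };
}

// Kleene star: zero or more occurrences of p. This parser never fails.
parser star(parser p) {
    return [p](std::string_view in) -> std::optional<match> {
        match out{"", 0};
        while (auto m = p(in.substr(out.consumed))) {
            if (m->consumed == 0) break; // avoid looping forever on empty matches
            out.text += m->text;
            out.consumed += m->consumed;
        }
        return out;
    };
}

// Exactly n occurrences of p; fails if fewer than n are available.
parser repeat(std::size_t n, parser p) {
    return [n, p](std::string_view in) -> std::optional<match> {
        match out{"", 0};
        for (std::size_t i = 0; i < n; ++i) {
            auto m = p(in.substr(out.consumed));
            if (!m) return std::nullopt;
            out.text += m->text;
            out.consumed += m->consumed;
        }
        return out;
    };
}

// Small usage demonstration of the toy combinators above.
void demo() {
    parser a = lit('a');
    assert(star(a)("aaab")->consumed == 3);
    assert(star(a)("bbb")->consumed == 0);      // zero occurrences still succeeds
    assert(repeat(2, a)("aaab")->text == "aa");
    assert(!repeat(4, a)("aaab"));              // only three 'a's are available
}
```

<p>
      In Boost.Parser itself, the corresponding spellings are the ones named above:
      <code class="computeroutput">*p</code> for the Kleene star and
      <code class="computeroutput">repeat(n)[p]</code> for exactly
      <code class="computeroutput">n</code> repetitions.
    </p>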
<p>
      Boost.Parser also accommodates the different ways that people want to get
      a parse result out of their parsing code. Some parses are best expressed by
      returning an object that represents the result of the parse. Others are best
      expressed by filling in a preexisting data structure. Still others work best
      by parsing small sections of a large document and reporting the results of
      subparsers as they finish, via callbacks. Boost.Parser supports all of these
      ways of working, and lets you switch between callback-based and non-callback-based
      parsing without rewriting any code other than the top-level call, which changes
      from <code class="computeroutput"><a class="link" href="../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>
      to <code class="computeroutput"><a class="link" href="../boost/parser/callback_parse_id6.html" title="Function template callback_parse">callback_parse()</a></code>.
    </p>
<p>
      All of Boost.Parser's public interfaces are sentinel- and range-friendly, just
      like the interfaces in <code class="computeroutput"><span class="identifier">std</span><span class="special">::</span><span class="identifier">ranges</span></code>.
    </p>
<p>
      Boost.Parser is Unicode-aware through and through. When you parse ranges of
      <code class="computeroutput"><span class="keyword">char</span></code>, Boost.Parser does not assume
      any particular encoding, Unicode or otherwise. Parsing
      of inputs <span class="bold"><strong>other than</strong></span> plain <code class="computeroutput"><span class="keyword">char</span></code>s
      assumes that the input is Unicode. In the Unicode-aware code paths, all parsing
      is done by matching code points. This means that you can feed UTF-8 strings
      into Boost.Parser, both as input and within your parser, and the right sort
      of matching occurs. For instance, if your parser is trying to match repetitions
      of the <code class="computeroutput"><span class="keyword">char</span></code> <code class="computeroutput"><span class="char">'\xcc'</span></code>
      (which is a lead byte from a UTF-8 sequence, and so is malformed UTF-8 if not
      followed by an appropriate UTF-8 code unit), it will <span class="bold"><strong>not</strong></span>
      match the start of <code class="computeroutput"><span class="string">"\xcc\x80"</span></code>
      (UTF-8 for the code point U+0300). Boost.Parser knows that the matching must
      be whole-code-point, and so it interprets the <code class="computeroutput"><span class="keyword">char</span></code>
      <code class="computeroutput"><span class="char">'\xcc'</span></code> as the code point U+00CC.
    </p>
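<p>
      The whole-code-point behavior described above can be checked by hand with a
      small, self-contained sketch. This is not Boost.Parser code: the functions
      <code class="computeroutput">decode_utf8</code> and
      <code class="computeroutput">count_leading</code> are hypothetical helpers written
      for this illustration, and the decoder is deliberately minimal (it substitutes
      U+FFFD for malformed sequences and does not reject overlong forms or surrogates).
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: decode UTF-8 into code points. Minimal sketch only.
std::vector<char32_t> decode_utf8(std::string_view s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        const unsigned char lead = static_cast<unsigned char>(s[i]);
        std::size_t len = 0;
        char32_t cp = 0;
        if (lead < 0x80)              { len = 1; cp = lead; }
        else if ((lead >> 5) == 0x06) { len = 2; cp = lead & 0x1F; }
        else if ((lead >> 4) == 0x0E) { len = 3; cp = lead & 0x0F; }
        else if ((lead >> 3) == 0x1E) { len = 4; cp = lead & 0x07; }

        bool ok = len != 0 && i + len <= s.size();
        for (std::size_t k = 1; ok && k < len; ++k) {
            const unsigned char c = static_cast<unsigned char>(s[i + k]);
            ok = (c & 0xC0) == 0x80;          // must be a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        if (ok) { out.push_back(cp); i += len; }
        else    { out.push_back(0xFFFD); ++i; } // replacement character
    }
    return out;
}

// Hypothetical helper: count how many leading code points equal `target`.
std::size_t count_leading(std::string_view input, char32_t target) {
    std::size_t n = 0;
    for (char32_t cp : decode_utf8(input)) {
        if (cp != target) break;
        ++n;
    }
    return n;
}

void demo() {
    // The char '\xcc' is interpreted as the code point U+00CC.
    const char32_t target = static_cast<unsigned char>('\xcc');
    assert(target == 0x00CC);
    // "\xcc\x80" is UTF-8 for U+0300, so it does not start with U+00CC.
    assert(count_leading("\xcc\x80", target) == 0);
    // "\xc3\x8c" is UTF-8 for U+00CC, so this input starts with two matches.
    assert(count_leading("\xc3\x8c\xc3\x8c!", target) == 2);
}
```

<p>
      Comparing decoded code points rather than raw bytes is what prevents the byte
      <code class="computeroutput">0xCC</code> from matching the first half of a
      two-byte sequence.
    </p>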
<p>
      Error reporting is important to get right, and it is important to make errors
      easy to understand, especially for end-users. Boost.Parser produces runtime
      parse error messages that are very similar to the diagnostics that you get
      when compiling with GCC and Clang (it even supports warnings that don't fail
      the parse). The exact token associated with a diagnostic can be reported to
      the user, with the containing line quoted, and with a marker pointing right
      at the token. Boost.Parser takes care of this for you; your parser does not
      need to include any special code to make this happen. Of course, you can also
      replace the error handler entirely, if it doesn't fit your needs.
    </p>
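<p>
      As a rough picture of what a compiler-style diagnostic involves, the sketch
      below quotes the line containing an error and places a marker under the offending
      position. It is only an illustration: <code class="computeroutput">format_diagnostic</code>
      is a hypothetical function, and the "line:column: error: message" layout is
      an assumption modeled loosely on GCC and Clang, not Boost.Parser's exact output.
    </p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>

// Hypothetical helper: build a diagnostic for the character at `offset`.
// Assumes one column per char, so tabs or multi-byte text would misalign the marker.
std::string format_diagnostic(std::string_view input,
                              std::size_t offset,
                              std::string_view message) {
    offset = std::min(offset, input.size());

    // Find the 1-based line number and the start of the line holding `offset`.
    std::size_t line = 1, line_start = 0;
    for (std::size_t i = 0; i < offset; ++i) {
        if (input[i] == '\n') { ++line; line_start = i + 1; }
    }
    std::size_t line_end = input.find('\n', line_start);
    if (line_end == std::string_view::npos) line_end = input.size();
    const std::size_t column = offset - line_start; // 0-based

    std::string out = std::to_string(line) + ":" + std::to_string(column + 1) + ": error: ";
    out += message;
    out += '\n';
    out += input.substr(line_start, line_end - line_start); // quoted source line
    out += '\n';
    out.append(column, ' ');                                // indent up to the error
    out += "^\n";                                           // marker under the error
    return out;
}

void demo() {
    // The '?' at offset 10 sits on line 2, column 5.
    const std::string d = format_diagnostic("x = 1\ny = ?\n", 10, "expected integer");
    assert(d == "2:5: error: expected integer\ny = ?\n    ^\n");
}
```

<p>
      With Boost.Parser you do not write anything like this yourself; the default
      error handler produces the quoted line and marker for you.
    </p>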
<p>
      Debugging complex parsers can be a real nightmare. Boost.Parser makes it trivial
      to get a trace of your entire parse, with easy-to-read (and very verbose) indications
      of where each part of the trace is within the parse, the state of values produced
      by the parse, etc. Again, you don't need to write any code to make this happen;
      you just pass a parameter to <code class="computeroutput"><a class="link" href="../boost/parser/parse_id2.html" title="Function template parse">parse()</a></code>.
    </p>
<p>
      Dependencies are still a nightmare in C++, so Boost.Parser can be used as a
      purely standalone library, independent of Boost.
    </p>
</div>
<div class="copyright-footer">Copyright © 2020 T. Zachary Laine<p>
        Distributed under the Boost Software License, Version 1.0. (See accompanying
        file LICENSE_1_0.txt or copy at <a href="http://www.boost.org/LICENSE_1_0.txt" target="_top">http://www.boost.org/LICENSE_1_0.txt</a>)
      </p>
</div>
</body>
</html>