mirror of
https://github.com/boostorg/url.git
synced 2026-02-15 01:22:11 +00:00
111 lines
4.0 KiB
Plaintext
[/
    Copyright (c) 2019 Vinnie Falco (vinnie.falco@gmail.com)

    Distributed under the Boost Software License, Version 1.0. (See accompanying
    file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

    Official repository: https://github.com/CPPAlliance/url
]

A URL string can be parsed using one of the parsing functions.
Each function parses according to a particular grammar specified
in __rfc3986__:

[table Parsing Functions [
    [Function]
    [Grammar]
][
    [[link url.ref.boost__urls__parse_absolute_uri `parse_absolute_uri`]]
    [[@https://datatracker.ietf.org/doc/html/rfc3986#section-4.3 ['absolute-URI]]]
][
    [[link url.ref.boost__urls__parse_relative_ref `parse_relative_ref`]]
    [[@https://datatracker.ietf.org/doc/html/rfc3986#section-4.2 ['relative-ref]]]
][
    [[link url.ref.boost__urls__parse_uri `parse_uri`]]
    [[@https://datatracker.ietf.org/doc/html/rfc3986#section-3 ['URI]]]
][
    [[link url.ref.boost__urls__parse_uri_reference `parse_uri_reference`]]
    [[@https://datatracker.ietf.org/doc/html/rfc3986#section-4.1 ['URI-reference]]]
]]

The grammars parsed by these functions are given below. For the syntax
of the ABNF notation, and for the definitions of the remaining elements
such as "scheme", please refer to __rfc3986__:

```
absolute-URI  = scheme ":" hier-part [ "?" query ]

relative-ref  = relative-part [ "?" query ] [ "#" fragment ]

URI           = scheme ":" hier-part [ "?" query ] [ "#" fragment ]

URI-reference = URI / relative-ref

hier-part     = "//" authority path-abempty
              / path-absolute
              / path-rootless
              / path-empty

relative-part = "//" authority path-abempty
              / path-absolute
              / path-noscheme
              / path-empty
```

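As a rough illustration of how the top-level rules differ, the sketch below
names the narrowest of the three rules (`absolute-URI`, `URI`, `relative-ref`)
that a string could match. It looks only for a leading `scheme ":"` and for
a `#`. This is not how the library parses, and nothing else is validated.
The helper names `scheme_prefix_length` and `narrowest_rule` are hypothetical,
only the standard library is used, and the scheme character set is taken from
section 3.1 of __rfc3986__.

```cpp
#include <cstddef>
#include <string_view>

// ASCII-only character tests (hypothetical helpers)
constexpr bool is_alpha( char c ) noexcept
{
    return ( c >= 'a' && c <= 'z' ) || ( c >= 'A' && c <= 'Z' );
}

constexpr bool is_digit( char c ) noexcept
{
    return c >= '0' && c <= '9';
}

// Returns the length of a leading `scheme ":"`, or 0 if there is none.
// scheme = ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )
constexpr std::size_t scheme_prefix_length( std::string_view s ) noexcept
{
    if( s.empty() || ! is_alpha( s[0] ) )
        return 0;
    for( std::size_t i = 1; i < s.size(); ++i )
    {
        char const c = s[i];
        if( c == ':' )
            return i + 1;
        if( ! is_alpha( c ) && ! is_digit( c ) &&
            c != '+' && c != '-' && c != '.' )
            return 0;
    }
    return 0; // ran out of characters before finding ':'
}

// Names the narrowest rule the string could match, assuming it is
// already a valid URI-reference. Nothing else is validated.
constexpr std::string_view narrowest_rule( std::string_view s ) noexcept
{
    if( scheme_prefix_length( s ) == 0 )
        return "relative-ref";
    if( s.find( '#' ) != std::string_view::npos )
        return "URI";            // scheme present, has a fragment
    return "absolute-URI";       // scheme present, no fragment
}
```

For instance, `narrowest_rule( "/path/to/file.txt" )` names `relative-ref`,
which corresponds to `parse_relative_ref` in the table above. A string that
matches `absolute-URI` also matches `URI` and `URI-reference`.
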
Each of these functions accepts a __string_view__ and returns a __url_view__
wrapped in a __result__ type. The following example parses a string literal
containing a URI:

```
result< url_view > r = parse_uri( "https://www.example.com/path/to/file.txt" );

if( r.has_value() )                         // parsing was successful
{
    url_view u = r.value();                 // extract the url_view

    std::cout << u;                         // format the URL to cout
}
else
{
    std::cout << r.error().message();       // parsing failure; print error
}
```

The function does not throw. It returns a variant-like container that holds
a __url_view__ on success, or an __error_code__ if parsing failed. Note that,
like a string view, the URL view does not own the underlying character
buffer. Instead, it references the string passed to the parsing function.
The caller must ensure that the string outlives the view.
A URL view containing a non-empty string cannot be constructed directly;
instead, it must be created using a parsing function. This guarantees that
any constructed view contains a syntactically valid URL already in its
serialized form.

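The two properties described above, a value-or-error return and a view that
refers to the caller's buffer, can be sketched with standard library types.
This is not the library's __result__ or __url_view__. The names
`scheme_result` and `find_scheme` are hypothetical, and the helper only
splits at the first `:` without validating anything.

```cpp
#include <string_view>
#include <system_error>
#include <variant>

// Hypothetical stand-in for a result type: holds either a value
// or an error code.
using scheme_result = std::variant< std::string_view, std::error_code >;

// Hypothetical helper: returns the text before the first ':'.
// It reports failure through the return value instead of throwing.
inline scheme_result find_scheme( std::string_view s ) noexcept
{
    auto const pos = s.find( ':' );
    if( pos == std::string_view::npos || pos == 0 )
        return std::make_error_code( std::errc::invalid_argument );

    // A view into the caller's buffer; no characters are copied,
    // so the caller's string must outlive the returned view.
    return s.substr( 0, pos );
}
```

A caller would check `std::holds_alternative< std::string_view >( r )` before
reading the value, much as the earlier example checks `has_value()`.
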
The function
[link url.ref.boost__urls__url_view.collect `url_view::collect`]
copies the underlying character buffer and returns a new view that owns
the copy. The new view is wrapped in a shared pointer. The following code
calls `collect` to create a read-only copy:

```
// This will hold our copy
std::shared_ptr<url_view const> sp;
{
    // result::value() will throw an exception if an error occurs
    url_view u = parse_relative_ref( "/path/to/file.txt" ).value();

    // create a copy with ownership and string lifetime extension
    sp = u.collect();

    // `u` goes out of scope here. A string literal outlives it, but a
    // local or temporary string passed to the parser might not.
}

// `*sp` remains valid either way since it has its own copy
std::cout << *sp;
```

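One way to picture an owning copy is a heap-allocated object that stores
its own characters next to a view of them. The sketch below uses only the
standard library. The names `owned_view` and `make_owned` are hypothetical,
and this is not a description of how `collect` is implemented.

```cpp
#include <memory>
#include <string>
#include <string_view>

// Hypothetical type, not part of the library: a view bundled with
// the buffer it refers to.
struct owned_view
{
    owned_view() = default;

    // Copying would leave `view` pointing into the source object's
    // buffer, so copies are disabled in this sketch.
    owned_view( owned_view const& ) = delete;
    owned_view& operator=( owned_view const& ) = delete;

    std::string buffer;     // private copy of the characters
    std::string_view view;  // refers to `buffer`, not to the input
};

// Copies the characters of `s` and returns a shared, read-only handle.
// The view stays valid for as long as any shared_ptr copy exists.
inline std::shared_ptr< owned_view const > make_owned( std::string_view s )
{
    auto p = std::make_shared< owned_view >();
    p->buffer.assign( s.data(), s.size() );
    p->view = p->buffer;    // point the view at the copy
    return p;
}
```

With this shape, the original string may be destroyed after `make_owned`
returns, and `sp->view` still refers to live characters.
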
The interface of __url_view__ decomposes the URL into its individual parts.
It allows each part to be inspected and also returns metadata about the URL
itself. These non-modifying observer operations are described in the
sections that follow.