Merge pull request #145 from cppalliance/docs

Collected documentation updates from review comments
Matt Borland
2024-01-30 08:55:44 +01:00
committed by GitHub
12 changed files with 211 additions and 76 deletions

@@ -19,13 +19,16 @@ Matt Borland
include::charconv/overview.adoc[]
include::charconv/build.adoc[]
include::charconv/basic_usage.adoc[]
include::charconv/api_reference.adoc[]
include::charconv/from_chars.adoc[]
include::charconv/to_chars.adoc[]
include::charconv/chars_format.adoc[]
include::charconv/limits.adoc[]
include::charconv/reference.adoc[]
#include::charconv/reference.adoc[]
include::charconv/benchmarks.adoc[]
include::charconv/sources.adoc[]
include::charconv/acknowledgments.adoc[]
include::charconv/copyright.adoc[]
:leveloffset: -1

@@ -0,0 +1,16 @@
////
Copyright 2024 Matt Borland
Distributed under the Boost Software License, Version 1.0.
https://www.boost.org/LICENSE_1_0.txt
////
[#acknowledgments]
= Acknowledgments
:idprefix: ack_
Special thanks to the following people (non-exhaustive list):
- Peter Dimov for providing technical guidance and contributing to the library throughout development
- Chris Kormanyos for serving as the library review manager
- Stephan T. Lavavej for providing the basis for the benchmarks
- Everyone who reviewed the library and provided feedback to make it better

@@ -0,0 +1,34 @@
////
Copyright 2023 Matt Borland
Distributed under the Boost Software License, Version 1.0.
https://www.boost.org/LICENSE_1_0.txt
////
[#api_reference]
= API Reference
:idprefix: api_ref_
== Functions
- <<from_chars_definitions_, `boost::charconv::from_chars`>>
- <<from_chars_definitions_, `boost::charconv::from_chars_erange`>>
- <<to_chars_definitions_, `boost::charconv::to_chars`>>
== Structures
- <<from_chars_definitions_, `boost::charconv::from_chars_result`>>
- <<to_chars_definitions_, `boost::charconv::to_chars_result`>>
== Enums
- <<chars_format_defintion_,`boost::charconv::chars_format`>>
== Constants
- <<limits_definitions_, `boost::charconv::limits::digits`>>
- <<limits_definitions_, `boost::charconv::limits::digits10`>>
== Macros
- <<integral_usage_notes_, `BOOST_CHARCONV_CONSTEXPR`>>
- <<run_benchmarks_, `BOOST_CHARCONV_RUN_BENCHMARKS`>>

@@ -0,0 +1,28 @@
////
Copyright 2024 Matt Borland
Distributed under the Boost Software License, Version 1.0.
https://www.boost.org/LICENSE_1_0.txt
////
[#basic_usage]
= Basic Usage Examples
:idprefix: basic_usage_
== Usage Examples
[source, c++]
----
#include <boost/charconv.hpp>
#include <cassert>
#include <cstring>
#include <system_error>

// Parse a decimal string into an int
const char* buffer = "42";
int v = 0;
boost::charconv::from_chars_result r = boost::charconv::from_chars(buffer, buffer + std::strlen(buffer), v);
assert(r.ec == std::errc());
assert(v == 42);

// Write an int into a character buffer (shown as a separate snippet)
char buffer[64];
int v = 123456;
boost::charconv::to_chars_result r = boost::charconv::to_chars(buffer, buffer + sizeof(buffer), v);
assert(r.ec == std::errc());
assert(!std::strncmp(buffer, "123456", 6)); // strncmp returns 0 on match
----

@@ -7,10 +7,14 @@ https://www.boost.org/LICENSE_1_0.txt
= Benchmarks
:idprefix: benchmarks
This section describes a range of performance benchmarks that have been run comparing this library with the standard library, and how to run your own benchmarks if required.
The values are relative to the performance of `std::printf` and `std::strtoX`.
Larger numbers are more performant (e.g. 2.00 means twice as fast, and 0.50 means it takes twice as long).
`std::printf` and `std::strtoX` are always listed first as they will be the reference value.
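As a worked reading of these ratios, the sketch below assumes each figure is the reference time divided by the time of the function being compared. The helper name `relative_speed` and its parameters are hypothetical and are not taken from the benchmark sources.

```cpp
#include <cassert>

// Hypothetical helper illustrating how the relative figures can be read.
// reference_seconds: time taken by the baseline (e.g. std::printf or std::strtoX)
// candidate_seconds: time taken by the function being compared
// Assumption: the ratio is reference / candidate, so a larger value means faster.
constexpr double relative_speed(double reference_seconds, double candidate_seconds)
{
    return reference_seconds / candidate_seconds;
}
```

Under that assumption, a candidate that needs half the reference time scores 2.00, and one that needs twice the reference time scores 0.50.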
== How to run the Benchmarks
[#run_benchmarks_]
To run the benchmarks yourself, navigate to the test folder and define `BOOST_CHARCONV_RUN_BENCHMARKS` when running the tests.
An example on Linux with b2: `../../../b2 cxxstd=20 toolset=gcc-13 define=BOOST_CHARCONV_RUN_BENCHMARKS STL_benchmark linkflags="-lfmt" -a release`.
@@ -23,11 +27,14 @@ Additionally, you will need the following:
* https://github.com/google/double-conversion[libdouble-conversion]
* https://github.com/fmtlib/fmt[{fmt}]
== x86_64 Linux
== Results
[#benchmark_results_]
=== x86_64 Linux
Data in tables 1 - 4 were run on Ubuntu 23.04 with x86_64 architecture using GCC 13.1.0 with libstdc++.
=== Floating Point
==== Floating Point
.to_chars floating point with the shortest representation
|===
@@ -67,7 +74,7 @@ Data in tables 1 - 4 were run on Ubuntu 23.04 with x86_64 architecture using GCC
|1.16 / 1.30
|===
=== Integral
==== Integral
.to_chars base 10 integers
|===
@@ -103,11 +110,11 @@ Data in tables 1 - 4 were run on Ubuntu 23.04 with x86_64 architecture using GCC
|2.54 / 1.78
|===
== x86_64 Windows
=== x86_64 Windows
Data in tables 5 - 8 were run on Windows 11 with x86_64 architecture using MSVC 14.3 (V17.7.0).
=== Floating Point
==== Floating Point
.to_chars floating point with the shortest representation
|===
@@ -141,7 +148,7 @@ Data in tables 5 - 8 were run on Windows 11 with x86_64 architecture using MSVC
|2.06 / 5.21
|===
=== Integral
==== Integral
.to_chars base 10 integers
|===
@@ -175,11 +182,11 @@ Data in tables 5 - 8 were run on Windows 11 with x86_64 architecture using MSVC
|2.68 / 2.27
|===
== ARM MacOS
=== ARM MacOS
Data in tables 9-12 were run on MacOS Ventura 13.5.2 with M1 Pro architecture using Homebrew GCC 13.2.0 with libstdc++.
=== Floating Point
==== Floating Point
.to_chars floating point with the shortest representation
|===
@@ -220,7 +227,7 @@ Data in tables 9-12 were run on MacOS Ventura 13.5.2 with M1 Pro architecture us
|===
=== Integral
==== Integral
.to_chars base 10 integers
|===
@@ -255,6 +262,3 @@ Data in tables 9-12 were run on MacOS Ventura 13.5.2 with M1 Pro architecture us
|Boost.Charconv.from_chars
|2.27 / 1.65
|===
Special thanks to Stephan T. Lavavej for providing the basis for the benchmarks.

@@ -4,7 +4,7 @@ Distributed under the Boost Software License, Version 1.0.
https://www.boost.org/LICENSE_1_0.txt
////
= Building the Library
= Getting Started
:idprefix: build_
== B2
@@ -29,6 +29,8 @@ To install the development environment, run:
sudo ./b2 install cxxstd=11
----
The value of cxxstd must be at least 11. https://www.boost.org/doc/libs/1_84_0/tools/build/doc/html/index.html[See the b2 documentation] under `cxxstd` for all valid values.
== vcpkg
Run the following commands to clone the latest version of Charconv and install it using vcpkg:
@@ -39,7 +41,7 @@ cd charconv
vcpkg install charconv --overlay-ports=ports/charconv
----
Any required Boost packages that do not already exist will be installed automatically.
Any required Boost packages not currently installed in your development environment will be installed automatically.
== Conan
@@ -67,3 +69,7 @@ For example, using a `conanfile.txt`:
[requires]
boost_charconv/1.0.0
----
== Dependencies
This library depends on: Boost.Assert, Boost.Config, Boost.Core, and https://gcc.gnu.org/onlinedocs/libquadmath/[libquadmath] on supported platforms (e.g. Linux with x86, x86_64, PPC64, and IA64).

@@ -8,6 +8,12 @@ https://www.boost.org/LICENSE_1_0.txt
:idprefix: chars_format_
== chars_format overview
`boost::charconv::chars_format` is an `enum class` used to define the format of floating point types with `from_chars` and `to_chars`.
== Definition
[#chars_format_defintion_]
[source, c++]
----
namespace boost { namespace charconv {
@@ -22,7 +28,8 @@ enum class chars_format : unsigned
}} // Namespace boost::charconv
----
`boost::charconv::chars_format` is used to specify the format of floating point types with `from_chars` and `to_chars`.
== Formats
=== Scientific Format
Scientific format will be of the form `1.3e+03`.

@@ -1,5 +1,5 @@
////
Copyright 2023 Matt Borland
Copyright 2023 - 2024 Matt Borland
Distributed under the Boost Software License, Version 1.0.
https://www.boost.org/LICENSE_1_0.txt
////
@@ -10,7 +10,11 @@ https://www.boost.org/LICENSE_1_0.txt
== from_chars overview
`from_chars` is a set of functions that parse a string from `[first, last)` in an attempt to convert the string into `value` according to the `chars_format` specified (if applicable).
The result of `from_chars` is `from_chars_result` which on success returns `ptr == last` and `ec == std::errc()`, and on failure returns `ptr` equal to the last valid character parsed or `last` on underflow/overflow, and `ec == std::errc::invalid_argument` or `std::errc::result_out_of_range` respectively.
The parsing of numbers is locale-independent (i.e. equivalent to the "C" locale).
The result of `from_chars` is `from_chars_result` which on success returns `ptr == last` and `ec == std::errc()`, and on failure returns `ptr` equal to the last valid character parsed or `last` on underflow/overflow, and `ec == std::errc::invalid_argument` or `std::errc::result_out_of_range` respectively. `from_chars` does not require the character sequence to be null terminated.
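Because the range is given by two pointers, a number can be parsed out of a buffer that has no terminating `'\0'`. The sketch below illustrates this with the standard library's `std::from_chars` (which requires C++17) as a stand-in; it is assumed, not verified here, that swapping in `boost::charconv::from_chars` behaves the same for integers. The helper name `parse_exact` is hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <system_error>

// Hypothetical helper: parses an int from exactly `length` characters at `data`.
// No terminating '\0' is needed because the range is [data, data + length).
// Returns true only if the whole range was consumed without error.
inline bool parse_exact(const char* data, std::size_t length, int& out)
{
    const char* last = data + length;
    const std::from_chars_result r = std::from_chars(data, last, out);
    return r.ec == std::errc() && r.ptr == last;
}
```

Given the three characters `4`, `2`, `x` with no terminator, `parse_exact(raw, 2, v)` succeeds and stores 42, while `parse_exact(raw, 3, v)` reports failure because parsing stops at `x` before reaching `last`.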
== Definitions
[#from_chars_definitions_]
[source, c++]
----
@@ -33,51 +37,67 @@ BOOST_CXX14_CONSTEXPR from_chars_result from_chars<bool>(const char* first, cons
template <typename Real>
from_chars_result from_chars(const char* first, const char* last, Real& value, chars_format fmt = chars_format::general) noexcept;
// See note below in from_chars for floating point types
// See the usage notes for from_chars for floating point types below
template <typename Real>
from_chars_result from_chars_strict(const char* first, const char* last, Real& value, chars_format fmt = chars_format::general) noexcept;
from_chars_result from_chars_erange(const char* first, const char* last, Real& value, chars_format fmt = chars_format::general) noexcept;
}} // Namespace boost::charconv
----
== from_chars_result
* `ptr` - On return from `from_chars` it is a pointer to the first character not matching the pattern, or pointer to `last` if all characters are successfully parsed.
* `ec` - the error code. Values returned by `from_chars` are:
** `std::errc()` - successful parsing
** `std::errc::invalid_argument` - invalid argument (e.g. parsing a negative number into an unsigned type)
** `std::errc::result_out_of_range` - result out of range (e.g. overflow)
* `operator==` - compares the values of ptr and ec for equality
== from_chars
* `first`, `last` - valid range to parse
== from_chars parameters
* `first`, `last` - pointers to a valid range to parse
* `value` - where the output is stored upon successful parsing
* `base` (integer only) - the integer base to use. Must be between 2 and 36 inclusive
* `fmt` (floating point only) - The format of the buffer. See <<chars_format overview>> for description.
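The `base` parameter can be illustrated with a short sketch. It uses `std::from_chars` from the standard library (C++17) as a stand-in, on the assumption that the Boost.Charconv integer overload takes the same arguments; the helper name `parse_unsigned` is hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: parses `text` as an unsigned integer in the given base
// (2 to 36 inclusive). Returns true only if all of `text` was consumed.
inline bool parse_unsigned(std::string_view text, int base, unsigned& out)
{
    const char* first = text.data();
    const char* last = text.data() + text.size();
    const auto r = std::from_chars(first, last, out, base);
    return r.ec == std::errc() && r.ptr == last;
}
```

Note that a `0x` prefix is not part of the accepted pattern in the standard version, so `"0x1f"` in base 16 is only partially consumed and the helper reports failure.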
=== from_chars for integral types
== from_chars_result
* `ptr` - On return from `from_chars` it is a pointer to the first character not matching the pattern, or pointer to `last` if all characters are successfully parsed.
* `ec` - https://en.cppreference.com/w/cpp/error/errc[the error code]. Values returned by `from_chars` are:
|===
|Return Value | Description
| `std::errc()` | Successful Parsing
| `std::errc::invalid_argument` | 1) Parsing a negative into an unsigned type
2) Leading `+` sign
3) Leading space
4) Incompatible formatting (e.g. exponent on `chars_format::fixed`, or p as exponent on value that is not `chars_format::hex`) See <<chars_format overview>>
| `std::errc::result_out_of_range` | 1) Overflow
2) Underflow
|===
* `operator==` - compares the values of ptr and ec for equality
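The error codes listed above can be checked with a small sketch. It again uses `std::from_chars` (C++17) as a stand-in and assumes Boost.Charconv reports the same codes for these integer inputs, as the table describes; the helper name `parse_status` is hypothetical.

```cpp
#include <cassert>
#include <charconv>
#include <string_view>
#include <system_error>

// Hypothetical helper: returns the error code produced when parsing `text`
// into the integer type of `out`.
template <typename Integer>
std::errc parse_status(std::string_view text, Integer& out)
{
    return std::from_chars(text.data(), text.data() + text.size(), out).ec;
}
```

With this helper, `"-123"` parsed into an `unsigned`, `"+5"`, and `" 5"` all yield `std::errc::invalid_argument`, while a value too large for `int` yields `std::errc::result_out_of_range`.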
== Usage Notes
=== Usage notes for from_chars for integral types
* All built-in integral types are allowed except bool which is deleted
* These functions have been tested to support `\__int128` and `unsigned __int128`
* from_chars for integral types is constexpr when compiled using `-std=c++14` or newer
** One known exception is GCC 5 which does not support constexpr comparison of `const char*`.
* A valid string must only contain the characters for numbers. Leading spaces are not ignored, and will return `std::errc::invalid_argument`.
=== from_chars for floating point types
=== Usage notes for from_chars for floating point types
* On `std::errc::result_out_of_range` we return ±0 for small values (e.g. 1.0e-99999) or ±HUGE_VAL for large values (e.g. 1.0e+99999) to match the handling of `std::strtod`.
This is a divergence from the standard which states we should return the `value` argument unmodified.
** The rationale for this divergence is an open issue with LWG here: https://cplusplus.github.io/LWG/lwg-active.html#3081.
** `from_chars` has an open issue with LWG here: https://cplusplus.github.io/LWG/lwg-active.html#3081.
The standard for <charconv> does not distinguish between underflow and overflow like strtod does.
Let's say you are writing a JSON library, and you replace `std::strtod` with `boost::charconv::from_chars` for performance reasons.
Charconv returns std::errc::result_out_of_range on some conversion.
You would then have to parse the string again yourself to figure out for which of the four possible reasons you got `std::errc::result_out_of_range`.
Charconv already had this information but could not give it to you.
Charconv can give you that information by using `boost::charconv::from_chars_erange` instead of `boost::charconv::from_chars` throughout the code base.
By implementing the resolution to the LWG issue that matches the established strtod behavior I think we are providing the correct behavior without waiting on the committee's decision.
** If you prefer the handling required by the standard (e.g. value is returned unmodified on `std::errc::result_out_of_range`) use `boost::charconv::from_chars_strict`.
The handling of `std::errc::result_out_of_range` is the only difference between `from_chars` and `from_chars_strict`.
* These functions have been tested to support all built-in floating-point types and those from C++23's `<stdfloat>`
** Long doubles can be 64, 80, or 128-bit, but must be IEEE 754 compliant. An example of a non-compliant, and therefore unsupported, format is `__ibm128`.
** Use of `__float128` or `std::float128_t` requires compiling with `-std=gnu++xx` and linking GCC's `libquadmath`.
This is done automatically when building with CMake.
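The `std::strtod` handling that the out-of-range note above refers to can be sketched as follows. This uses only the C standard library, not Boost.Charconv; the names `range_error` and `classify_strtod` are hypothetical. Whether `errno` is set on underflow is implementation-defined, so the underflow branch assumes a C library (such as glibc) that sets `ERANGE` in that case.

```cpp
#include <cassert>
#include <cerrno>
#include <cfloat>
#include <cmath>
#include <cstdlib>

// Hypothetical classification of what std::strtod reported.
enum class range_error { none, overflow, underflow };

// Classifies the result of std::strtod using errno and the returned value.
inline range_error classify_strtod(const char* text)
{
    errno = 0;
    const double v = std::strtod(text, nullptr);
    if (errno != ERANGE)
    {
        return range_error::none;
    }
    // On overflow strtod returns +/-HUGE_VAL; on underflow it returns a value
    // whose magnitude is no greater than the smallest normalized double.
    return std::fabs(v) == HUGE_VAL ? range_error::overflow : range_error::underflow;
}
```

The sign of the returned value separates the positive and negative cases, which gives the four possible reasons mentioned above.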
== Examples
@@ -126,6 +146,9 @@ assert(v == 8.0427e-18);
----
=== std::errc::invalid_argument
The below is invalid because a negative value is being parsed into an unsigned integer.
[source, c++]
----
const char* buffer = "-123";
@@ -134,6 +157,9 @@ auto r = boost::charconv::from_chars(buffer, buffer + std::strlen(buffer), v);
assert(r.ec == std::errc::invalid_argument);
assert(!r); // Same as above but less verbose. Added in C++26.
----
The below is invalid because a fixed format floating-point value cannot have an exponent.
[source, c++]
----
const char* buffer = "-1.573e-3";
@@ -142,7 +168,7 @@ auto r = boost::charconv::from_chars(buffer, buffer + std::strlen(buffer), v, bo
assert(r.ec == std::errc::invalid_argument);
assert(!r); // Same as above but less verbose. Added in C++26.
----
Note: In the event of std::errc::invalid_argument, v is not modified by `from_chars`
Note: In the event of `std::errc::invalid_argument`, v is not modified by `from_chars`
=== std::errc::result_out_of_range
[source, c++]

@@ -11,6 +11,9 @@ https://www.boost.org/LICENSE_1_0.txt
The contents of `<boost/charconv/limits.hpp>` are designed to help the user optimize the size of the buffer required for `to_chars`.
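To illustrate the idea of sizing a buffer at compile time, the sketch below derives a worst-case base 10 length from `std::numeric_limits`. It is a hypothetical stand-in and not the library's definition of `limits`; the names `decimal_buffer_size` and `min_int32_fits` are invented for this example, and `std::to_chars` (C++17) is used in place of the Boost function.

```cpp
#include <cassert>
#include <charconv>
#include <cstdint>
#include <limits>
#include <system_error>

// Hypothetical stand-in (not the library's definition): worst-case number of
// characters needed to print an integer of type T in base 10.
// digits10 counts the digits that always fit, so one more digit may be needed,
// plus one character for a '-' sign on signed types.
template <typename T>
constexpr int decimal_buffer_size()
{
    return std::numeric_limits<T>::digits10 + 1 + (std::numeric_limits<T>::is_signed ? 1 : 0);
}

// Writes the most negative int32_t into a buffer of exactly that size and
// checks that the conversion succeeds and fills the buffer completely.
inline bool min_int32_fits()
{
    char buffer[decimal_buffer_size<std::int32_t>()];
    const auto r = std::to_chars(buffer, buffer + sizeof(buffer),
                                 std::numeric_limits<std::int32_t>::min());
    return r.ec == std::errc() && r.ptr == buffer + sizeof(buffer);
}
```

No room is reserved for a terminating `'\0'`, since the conversion does not write one.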
== Definitions
[#limits_definitions_]
[source, c++]
----
namespace boost { namespace charconv {
@@ -35,6 +38,8 @@ The minimum size of the buffer that needs to be passed to `to_chars` to guarant
== Examples
The following two examples are for `max_chars10` to optimize the buffer size with `to_chars` for an integral type and a floating-point type respectively.
[source, c++]
----
char buffer[boost::charconv::limits<std::int32_t>::max_chars10];
@@ -55,6 +60,8 @@ assert(r); // Same as above but less verbose. Added in C++26.
assert(!strcmp(buffer, "3.40282347e+38")); // strcmp returns 0 on match
----
The following example uses `max_chars` to serialize an integer in binary (base = 2).
[source, c++]
----
char buffer[boost::charconv::limits<std::uint16_t>::max_chars];

@@ -11,39 +11,25 @@ https://www.boost.org/LICENSE_1_0.txt
== Description
Charconv is a collection of parsing functions that are locale-independent, non-allocating, and non-throwing.
This library requires a minimum of C++11.
Boost.Charconv converts character buffers to numbers, and numbers to character buffers.
It is a small library of two overloaded functions to do the heavy lifting, plus several supporting enums, structures, templates, and constants, with a particular focus on performance and consistency
across the supported development environments.
== Usage Examples
[source, c++]
----
#include <boost/charconv.hpp>
Why should I be interested in this library? Charconv is locale-independent, non-allocating^1^, and non-throwing, and it requires only C++11.
It provides functionality similar to that found in `std::printf` or `std::strtod` with <<benchmark_results_, substantial performance increases>>.
This library can also be used in place of the standard library `<charconv>` if unavailable with your toolchain.
Currently only https://en.cppreference.com/w/cpp/compiler_support/17.html[GCC 11+ and MSVC 19.24+] support both integer and floating-point conversions in their implementation of `<charconv>`. +
If you are using either of those compilers, Boost.Charconv is at least as performant as `<charconv>`, and can be up to several times faster.
See: <<Benchmarks>>
const char* buffer = "42";
int v = 0;
boost::charconv::from_chars_result r = boost::charconv::from_chars(buffer, buffer + std::strlen(buffer), v);
assert(r.ec == std::errc());
assert(v == 42);
^1^ The one edge case where allocation may occur is when you are parsing a string to an 80 or 128-bit `long double` or `__float128`, and the string is over 1024 bytes long.
char buffer[64];
int v = 123456;
boost::charconv::to_chars_result r = boost::charconv::to_chars(buffer, buffer + sizeof(buffer) - 1, v);
assert(r.ec == std::errc());
assert(!strncmp(buffer, "123456", 6)); // Strncmp returns 0 on match
== Supported Compilers / OS
----
== Supported Compilers
Boost.Charconv is tested on Ubuntu, macOS, and Windows with the following compilers:
* GCC 5 or later
* Clang 3.8 or later
* Visual Studio 2015 (14.0) or later
Tested on https://github.com/cppalliance/charconv/actions[Github Actions] and https://drone.cpp.al/cppalliance/charconv[Drone].
== Why use Boost.Charconv over <charconv>?
Currently only https://en.cppreference.com/w/cpp/compiler_support/17[GCC 11+ and MSVC 19.24+] support both integer and floating-point conversions in their implementation of `<charconv>`. +
If you are using either of those compilers, Boost.Charconv is at least as performant as `<charconv>`, and can be up to several times faster.
See: <<Benchmarks>>
Tested on https://github.com/cppalliance/charconv/actions[GitHub Actions] and https://drone.cpp.al/cppalliance/charconv[Drone].

@@ -6,6 +6,9 @@ https://www.boost.org/LICENSE_1_0.txt
[#sources]
= Sources
The following papers and blog posts serve as the basis for the algorithms used in the library:
:idprefix:
:linkattrs:

@@ -9,7 +9,12 @@ https://www.boost.org/LICENSE_1_0.txt
== to_chars overview
`to_chars` is a set of functions that attempts to convert `value` into a character buffer specified by `[first, last)`. The result of `to_chars` is `to_chars_result` which on success returns `ptr` equal to one-past-the-end of the characters written and `ec == std::errc()` and on failure returns `std::errc::result_out_of_range` and `ptr == last`.
`to_chars` is a set of functions that attempts to convert `value` into a character buffer specified by `[first, last)`.
The result of `to_chars` is `to_chars_result` which on success returns `ptr` equal to one-past-the-end of the characters written and `ec == std::errc()` and on failure returns `std::errc::result_out_of_range` and `ptr == last`.
`to_chars` does not null-terminate the returned characters.
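Since the output is not null-terminated, the written characters are the range from `first` up to the returned `ptr`. A minimal sketch, using `std::to_chars` (C++17) as a stand-in and assuming the Boost function is used the same way for integers; the helper name `int_to_string` is hypothetical:

```cpp
#include <cassert>
#include <charconv>
#include <string>
#include <system_error>

// Hypothetical helper: converts `value` to text.
// Returns an empty string if the conversion reports an error.
inline std::string int_to_string(int value)
{
    char buffer[16]; // Assumed large enough for a 32-bit int in base 10
    const std::to_chars_result r = std::to_chars(buffer, buffer + sizeof(buffer), value);
    if (r.ec != std::errc())
    {
        return {};
    }
    // The output is not null-terminated, so build the string from [buffer, r.ptr).
    return std::string(buffer, r.ptr);
}
```

Passing the buffer to a function that expects a C string would first require writing `'\0'` at `r.ptr`, which in turn needs one spare character in the buffer.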
== Definitions
[#to_chars_definitions_]
[source, c++]
----
@@ -36,14 +41,7 @@ to_chars_result to_chars(char* first, char* last, Real value, chars_format fmt =
}} // Namespace boost::charconv
----
== to_chars_result
* `ptr` - On return from `to_chars` points to one-past-the-end of the characters written on success or `last` on failure
* `ec` - the error code from `to_chars`. Returned values are:
** `std::errc()` - successful parsing
** `std::errc::result_out_of_range` - result out of range (e.g. overflow)
* `operator==` - compares the value of ptr and ec for equality
== to_chars
== to_chars parameters
* `first, last` - pointers to the beginning and end of the character buffer
* `value` - the value to be converted into the buffer
* `base` (integer only) - the integer base to use. Must be between 2 and 36 inclusive
@@ -51,14 +49,30 @@ to_chars_result to_chars(char* first, char* last, Real value, chars_format fmt =
See <<chars_format overview>> for description.
* `precision` (float only) - the number of decimal places required
=== to_chars for integral types
== to_chars_result
* `ptr` - On return from `to_chars` points to one-past-the-end of the characters written on success or `last` on failure
* `ec` - https://en.cppreference.com/w/cpp/error/errc[the error code]. Values returned by `to_chars` are:
|===
|Return Value | Description
|`std::errc()` | Successful Conversion
| `std::errc::result_out_of_range` | 1) Overflow
2) Underflow
|===
* `operator==` - compares the value of ptr and ec for equality
== Usage Notes
=== Usage notes for to_chars for integral types
[#integral_usage_notes_]
* All built-in integral types are allowed except bool which is deleted
* to_chars for integral types is constexpr (BOOST_CHARCONV_CONSTEXPR is defined) when:
** compiled using `-std=c++14` or newer
** using a compiler with `\__builtin_is_constant_evaluated`
* These functions have been tested to support `\__int128` and `unsigned __int128`
=== to_chars for floating point types
=== Usage notes for to_chars for floating point types
* The following will be returned when handling different values of `NaN`
** `qNaN` returns "nan"
** `-qNaN` returns "-nan(ind)"
@@ -67,6 +81,7 @@ See <<chars_format overview>> for description.
* These functions have been tested to support all built-in floating-point types and those from C++23's `<stdfloat>`
** Long doubles can be 64, 80, or 128-bit, but must be IEEE 754 compliant. An example of a non-compliant, and therefore unsupported, format is `ibm128`.
** Use of `__float128` or `std::float128_t` requires compiling with `-std=gnu++xx` and linking GCC's `libquadmath`.
This is done automatically when building with CMake.
== Examples