mirror of https://github.com/boostorg/atomic.git synced 2026-02-02 20:32:09 +00:00
Commit Graph

462 Commits

Author SHA1 Message Date
Andrey Semashev
23aa6d98df Partly revert eb50aea437.
ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition allows the use of
ldrex and other load-exclusive instructions without a matching strex in
Section A3.4.5 "Load-Exclusive and Store-Exclusive usage restrictions". And in
Section A3.5.3 "Atomicity in the ARM architecture" it states that ldrexd
atomically loads the 64-bit value from suitably aligned memory. This makes
the strexd added in eb50aea437 unnecessary.

ARM Architecture Reference Manual Armv8, for Armv8-A architecture profile
does not state explicitly that ldrexd can be used without a matching strexd,
but does not prohibit this either in Section E2.10.5 "Load-Exclusive and
Store-Exclusive instruction usage restrictions".
2020-06-14 15:38:42 +03:00
Andrey Semashev
7a4795c161 Added header/footer headers to centrally disable some useless warnings. 2020-06-14 01:34:37 +03:00
Andrey Semashev
b36797be8d Nonessential. 2020-06-12 22:59:46 +03:00
Andrey Semashev
eb50aea437 Added strexd to the 64-bit load asm backend on ARM.
Although we don't need to store anything after the load, we need to issue
strexd to clear the exclusive monitor for the storage address. So we
immediately store the loaded value back.

The technique to use ldrexd+strexd is described in ARM Architecture
Reference Manual ARMv8, Section B2.2.1. Although it is described for ARMv8,
the technique should be valid for previous versions as well.
2020-06-12 22:11:24 +03:00
Andrey Semashev
ec72e215b7 Use more fine-grained capability macro includes and remove unneeded includes. 2020-06-12 19:35:05 +03:00
Andrey Semashev
5bc6d0389d Fixed compilation of asm-based backend for ARM.
Also, improved register allocation slightly for ARM32 and Thumb 2 modes.
2020-06-12 19:13:38 +03:00
Andrey Semashev
c205c7185b Adjusted ARM asm blocks formatting. 2020-06-12 15:27:39 +03:00
Andrey Semashev
3929919495 Implement a special test_clock for Windows.
The implementation uses GetTickCount/GetTickCount64 internally,
which is a steady and sufficiently low precision time source.

We need the clock to have relatively low precision so that wait
tests don't fail spuriously because the blocked threads wake up
too soon, according to more precise clocks.

boost::chrono::system_clock currently has an acceptably low precision,
but it is not a steady clock.
2020-06-12 13:32:32 +03:00
Andrey Semashev
72c87ca51b Use a lower resolution clock on Windows to reduce spurious test failures. 2020-06-12 03:24:29 +03:00
Andrey Semashev
629953ffe0 Removed forced inline markup from emulated APIs that don't use memory order.
Forced inline is mostly used to ensure the compiler is able to treat memory
order arguments as constants. It is also useful for constant propagation
on other arguments. This is not very useful for the emulated backend, so
we might as well allow the compiler to not inline the functions.
2020-06-12 03:14:33 +03:00
Andrey Semashev
69c150e178 Added a workaround for broken codegen in MSVC-8 affecting emulated wait.
When the emulated wait function is inlined, the compiler sometimes generates
code that acts as if a wrong value is returned from the wait function. The
compiler simply "forgets" to save the atomic value into an object on the
stack, which makes it later use a bogus value as the "returned" value.

Preventing inlining seems to work around the problem.

Discovered by wait_api notify_one/notify_all test failures for struct_3_bytes.
Oddly enough, the same test for uint32_t did not fail.
2020-06-12 02:51:37 +03:00
Andrey Semashev
1b8ec1700b Reworked IPC atomic tests to check for the is_always_lockfree property.
Checking for the capability macros is not good enough because ipc_atomic_ref
may not be lock-free even when the macro (and ipc_atomic) indicates lock-freedom.

We now check the is_always_lockfree property to decide whether to run or skip
tests for a given IPC atomic type.

Also, made struct_3_bytes output more informative.
2020-06-12 01:58:12 +03:00
Andrey Semashev
58b618d299 Added a basic compile test for fences. 2020-06-12 00:57:59 +03:00
Andrey Semashev
65ada4d229 Change to a shorter instruction for seq_cst fences on x86.
Also, use explicitly sized integers for dummy arguments to the fence
instructions.
2020-06-12 00:55:31 +03:00
Andrey Semashev
559eba81af Use dummy atomic instruction instead of mfence for seq_cst fences on x86.
mfence is more expensive on most recent CPUs than a lock-prefixed instruction
on a dummy location, while the latter is sufficient to implement sequential
consistency on x86. Some performance test results are available here:

https://shipilev.net/blog/2014/on-the-fence-with-dependencies/

Also, for seq_cst stores in gcc_atomic backend, use an xchg instead of
mov+mfence, which are generated by gcc versions older than 10.1.

The machinery to detect mfence presence is left intact in case we need
to use this instruction in the future.

Closes https://github.com/boostorg/atomic/issues/36.
2020-06-11 22:32:01 +03:00
Andrey Semashev
36561406c2 Use api-ms-win-core-synch-l1-2-0.dll to query WaitOnAddress and friends.
It was suggested in a comment[1] that the correct dll to use to resolve
WaitOnAddress, WakeByAddressSingle and WakeByAddressAll is
api-ms-win-core-synch-l1-2-0.dll instead of KernelBase.dll. On some systems
KernelBase.dll may not be available and the WaitOnAddress API may be implemented
in a different library.

Tests have shown that GetModuleHandleW(L"api-ms-win-core-synch-l1-2-0.dll")
returns a handle for KernelBase.dll anyway. Also, there exists a presumably
older version of this library: api-ms-win-core-synch-l1-1-0.dll. The older
version is also "loaded" into the process and also resolves to KernelBase.dll,
which suggests that hopefully api-ms-win-core-synch-l1-2-0.dll will stay
available and working in the foreseeable future.

[1]: https://github.com/microsoft/STL/pull/593#issuecomment-641019129
2020-06-11 13:07:46 +03:00
Andrey Semashev
ea70d79920 Fixed capability macros for 80-bit x87 long double types.
Capability macros for 80-bit long double would indicate no lock-free
support even if 128-bit atomic operations were available.
2020-06-11 13:07:46 +03:00
Andrey Semashev
53978fca3d Added a link to the article about Linux ARM atomic functions. 2020-06-11 13:07:45 +03:00
Andrey Semashev
e5e96fbc9a Added atomic_unsigned/signed_lock_free typedefs introduced in C++20.
The typedefs indicate the atomic object type for an unsigned/signed
integer that is lock-free and preferably has native support for waiting
and notifying operations.
2020-06-11 13:07:45 +03:00
Andrey Semashev
80cfbfd0de Added implementation of inter-process atomics.
The inter-process atomics have ipc_ prefixes: ipc_atomic, ipc_atomic_ref
and ipc_atomic_flag. These types are similar to their unprefixed counterparts
with the following distinctions:

- The operations are provided with an added precondition that is_lock_free()
  returns true.
- All operations, including waiting/notifying operations, are address-free,
  so the types are suitable for inter-process communication.
- The new has_native_wait_notify() operation and always_has_native_wait_notify
  static constant allow testing whether the target platform has native support
  for address-free waiting/notifying operations. If it does not, a generic
  implementation based on a busy wait is used.
- A new set of capability macros was added. The macros are named
  BOOST_ATOMIC_HAS_NATIVE_<T>_IPC_WAIT_NOTIFY and indicate whether address-free
  waiting/notifying operations are supported natively for a given type.

Additionally, to unify interface and implementation of different components,
the has_native_wait_notify() operation and always_has_native_wait_notify
static constant were added to non-IPC atomic types as well. Added
BOOST_ATOMIC_HAS_NATIVE_<T>_WAIT_NOTIFY capability macros to indicate
native support for inter-thread waiting/notifying operations.

Also, added is_lock_free() and is_always_lock_free to atomic_flag.

This commit adds implementation, docs and tests.
2020-06-11 13:07:16 +03:00
Andrey Semashev
e4f8770665 Reorganized atomic, atomic_ref and atomic_flag implementation.
Moved public classes definitions to the public headers and renamed
the internal implementation headers. This will allow reusing the
implementation headers for inter-process atomics later.
2020-06-09 21:56:03 +03:00
Andrey Semashev
352a954ac1 Corrected BOOST_ATOMIC_FLAG_LOCK_FREE definition. 2020-06-09 21:55:38 +03:00
Andrey Semashev
4b6884d9c9 Added a note explaining the incompatibility between atomic and atomic_ref. 2020-06-08 00:06:21 +03:00
Andrey Semashev
1cd7ba9bc5 Documented value() operation, clarified the limitation of no padding bits.
The value() operation is useful with futexes, but should generally not be
used for anything else.

The lack of support for types with padding bits is documented more prominently.
The docs do mention that `long double` on x86 is supported though.

Also, added description of the new tests added recently.

Related to https://github.com/boostorg/atomic/issues/34.
2020-06-07 20:28:09 +03:00
Andrey Semashev
32c396f4f1 Corrected syntax for integer constants in Alpha asm blocks. 2020-06-07 00:17:38 +03:00
Andrey Semashev
f2a67c0424 Merge pull request #33 from boostorg/pr/remove-cmake-install
Remove boost_install call from CMakeLists.txt
2020-06-04 15:17:38 +03:00
Peter Dimov
58f1676cca Remove boost_install call from CMakeLists.txt 2020-06-04 15:09:26 +03:00
Andrey Semashev
58ea8f7837 Added an AppVeyor CI job to test CMake build. 2020-06-03 01:48:48 +03:00
Andrey Semashev
d5dc8f185a Added support for build-time configuration of the lock pool size.
The user may define BOOST_ATOMIC_LOCK_POOL_SIZE_LOG2 macro to specify
binary logarithm of the size of the internal lock pool. The macro
only has effect when building Boost.Atomic.
2020-06-03 01:48:48 +03:00
Andrey Semashev
c849b6d877 Use 32-bit storage to implement atomic_flag.
Most platforms that support futexes or similar mechanisms support them
for 32-bit integers, which makes 32-bit storage the preferred choice for
an efficient atomic_flag implementation. Most architectures also support
32-bit atomic operations natively.

Also, reduced code duplication in instantiating operation backends.
2020-06-03 01:48:48 +03:00
Andrey Semashev
b737d8357a Removed unused variables. 2020-06-03 01:48:48 +03:00
Andrey Semashev
8650e24f65 Enable FreeBSD job in Travis CI. 2020-06-03 01:48:48 +03:00
Andrey Semashev
b9fadc852a Added Windows backend for waiting/notifying operations.
The backend uses runtime detection of availability of Windows API
for futex-like operations (only available since Windows 8).
2020-06-03 01:48:48 +03:00
Andrey Semashev
e72ccb02e4 Added support for NetBSD futex variant. 2020-06-03 01:48:48 +03:00
Andrey Semashev
214169b86e Added DragonFly BSD umtx backend for waiting/notifying operations. 2020-06-03 01:48:48 +03:00
Andrey Semashev
b5988af279 Added FreeBSD _umtx_op backend for waiting/notifying operations. 2020-06-03 01:48:48 +03:00
Andrey Semashev
bf182818f4 Added futex-based implementation for waiting/notifying operations. 2020-06-03 01:48:37 +03:00
Andrey Semashev
76e25f36a3 Added generic implementation of C++20 waiting/notifying operations.
The generic implementation is based on the lock pool. A list of condition
variables (or waiting futexes) is added per lock. Basically, the lock
pool serves as a global hash table, where each lock represents
a bucket and each wait state is an element. Every wait operation
allocates a wait state keyed on the pointer to the atomic object. Notify
operations look up the wait state by the atomic pointer and notify
the condition variable/futex. The corresponding lock needs to be acquired
to protect the wait state list during all wait/notify operations.

Backends not involving the lock pool are going to be added later.

The implementation of wait operation extends the C++20 definition in that
it returns the newly loaded value instead of void. This allows the caller
to avoid loading the value again.

The waiting/notifying operations are not address-free. Address-free variants
will be added later.

Added tests for the new operations and refactored existing tests for atomic
operations. Added docs for the new operations.
2020-06-03 01:39:20 +03:00
Andrey Semashev
8472012d9f Cast pointers to uintptr_t.
This silences bogus MSVC-8 warnings about possible pointer truncation.
2020-05-31 23:33:33 +03:00
Andrey Semashev
7497d41fa7 Added support for yield ARMv8-A instruction.
Also, added "memory" clobber for the pause instruction to prevent the compiler
from reordering memory loads and stores across pause().
2020-05-28 20:26:51 +03:00
Andrey Semashev
8e387475a5 Replaced integral_truncate with bitwise_cast in atomic_ref<integral>.
bitwise_cast is more lightweight in terms of compile times and is equivalent
to integral_truncate in the case of atomic_ref, since its storage type is always
of the same size as the value type.
2020-05-23 23:44:05 +03:00
Andrey Semashev
90568e8e1c Ensure that the atomic_ref storage size matches the value size.
Also, removed the unnecessary uintptr_storage_type from atomic_ref
specialization for pointers.
2020-05-23 23:32:38 +03:00
Andrey Semashev
e3dce0e226 Added missing const qualifiers in atomic_ref ops and updated tests to verify this. 2020-05-23 23:19:14 +03:00
Andrey Semashev
ddb9cbaadd Slightly cleaner way of aligning the storage pointer. 2020-05-23 23:07:16 +03:00
Andrey Semashev
0b94a7d655 Fixed incorrect size of buffer_storage on C++03 compilers.
Due to BOOST_ATOMIC_DETAIL_ALIGNED_VAR_TPL macro expansion, the aligner
data member was made an array, which increased the size of the resulting
buffer_storage. This caused memory corruption with atomic_ref, which
requires the storage type to be of the same size as the value.

To protect against such mistakes in the future, changed
BOOST_ATOMIC_DETAIL_ALIGNED_VAR_TPL and BOOST_ATOMIC_DETAIL_ALIGNED_VAR
definitions to prohibit their direct use with arrays.
2020-05-23 23:05:43 +03:00
Andrey Semashev
75a6423a37 Fixed name clash between macro param and type_with_alignment::type member. 2020-05-22 22:46:19 +03:00
Andrey Semashev
5cfa550311 Moved aligned variable declaration workaround to a separate header. 2020-05-22 18:29:08 +03:00
Andrey Semashev
977e80e0d5 Added gcc 10 build jobs to Travis CI. 2020-05-22 18:28:44 +03:00
Andrey Semashev
0082be23f2 Added clang-10 jobs to Travis CI. 2020-05-05 23:11:01 +03:00
Andrey Semashev
9cdf02b612 Merge pull request #23 from bazald/develop
Change example documentation for the multi-producer queue to indicate lock-freedom
2020-04-29 18:55:16 +03:00