atomic

mirror of https://github.com/boostorg/atomic.git synced 2026-02-02 08:22:08 +00:00

Author	SHA1	Message	Date
Andrey Semashev	d42a407bb3	Use intrinsics in gcc sync backend to load and store large objects. There is no guarantee of atomicity of plain loads and stores of anything larger than a byte on an arbitrary hardware architecture. However, all modern architectures seem to guarantee atomicity of loads and stores of suitably aligned objects at least up to a pointer size, so we use that as the threshold. For larger objects we have to use intrinsics to guarantee atomicity.	2020-06-21 19:07:28 +03:00
Andrey Semashev	b02b59fd3a	Separated arch-specific core and fence operations to new ops structures. The old operations template is replaced with core_operations, which falls back to core_arch_operations, which falls back to core_operations_emulated. The core_operations layer is intended for more or less architecture-neutral backends, like the one based on gcc __atomic* intrinsics. It may fall back to core_arch_operations where it is not supported by the compiler or where the latter is more optimal. For example, where gcc does not implement 128-bit atomic operations via __atomic* intrinsics, we support them in the core_arch_operations backend, which uses inline assembler blocks. The old emulated_operations template is largely unchanged and was renamed to core_operations_emulated for naming consistency. All other operation templates were also renamed for consistency (e.g. generic_wait_operations -> wait_operations_generic). Fence operations have been extracted to a separate set of structures: fence_operations, fence_arch_operations and fence_operations_emulated. These are similar to the core operations described above. This structuring also allows falling back from fence_operations to fence_arch_operations when the latter is more optimal. The net result of these changes is that 128-bit and 64-bit atomic operations should now be consistently supported on all architectures that support them. Previously, only x86 was supported via local hacks for gcc and clang.	2020-06-21 19:07:20 +03:00
Andrey Semashev	93e6b3a3f6	Fixed gcc asm-based PPC backend when 8 and 16-bit insns are unavailable. We need to explicitly qualify base_type to call fence functions since the base class is dependent on a template parameter now.	2020-06-20 00:21:36 +03:00
Andrey Semashev	18d5e470ba	Initialize the dummy variable in x86 atomic_thread_fence(seq_cst). The initialization is not needed for the code, but it is needed to make tools like valgrind happy. Otherwise, the tools would mark the instructions as accessing uninitialized data. Also, changed the dummy variable to a byte. This may allow for a more lax alignment.	2020-06-19 01:25:30 +03:00
Andrey Semashev	8b7a92a374	Corrected include guards naming.	2020-06-18 23:54:48 +03:00
Andrey Semashev	651dfd4afb	Added gcc asm-based backend for AArch64. The backend implements core and extra atomic operations using gcc asm blocks. The implementation supports extensions added in ARMv8.1 and ARMv8.3. It supports both little and big endian targets. Currently, the code has not been tested on real hardware. It has been tested on a QEMU VM.	2020-06-18 12:46:03 +00:00
Andrey Semashev	23aa6d98df	Partly revert `eb50aea437`. ARM Architecture Reference Manual ARMv7-A and ARMv7-R edition allows using ldrex and other load-exclusive instructions without a matching strex in Section A3.4.5 "Load-Exclusive and Store-Exclusive usage restrictions". And in Section A3.5.3 "Atomicity in the ARM architecture" it states that ldrexd atomically loads the 64-bit value from suitably aligned memory. This makes the strexd added in `eb50aea437` unnecessary. ARM Architecture Reference Manual Armv8, for Armv8-A architecture profile does not state explicitly that ldrexd can be used without a matching strexd, but does not prohibit this either in Section E2.10.5 "Load-Exclusive and Store-Exclusive instruction usage restrictions".	2020-06-14 15:38:42 +03:00
Andrey Semashev	7a4795c161	Added header/footer headers to centrally disable some useless warnings.	2020-06-14 01:34:37 +03:00
Andrey Semashev	b36797be8d	Nonessential.	2020-06-12 22:59:46 +03:00
Andrey Semashev	eb50aea437	Added strexd to the 64-bit load asm backend on ARM. Although we don't need to store anything after the load, we need to issue strexd to reset the exclusive access mark on the storage address. So we immediately store the loaded value back. The technique to use ldrexd+strexd is described in ARM Architecture Reference Manual ARMv8, Section B2.2.1. Although it is described for ARMv8, the technique should be valid for previous versions as well.	2020-06-12 22:11:24 +03:00
Andrey Semashev	ec72e215b7	Use more fine grained capability macros includes and remove unneeded includes.	2020-06-12 19:35:05 +03:00
Andrey Semashev	5bc6d0389d	Fixed compilation of asm-based backend for ARM. Also, improve register allocation slightly for ARM32 and Thumb 2 modes.	2020-06-12 19:13:38 +03:00
Andrey Semashev	c205c7185b	Adjusted ARM asm blocks formatting.	2020-06-12 15:27:39 +03:00
Andrey Semashev	629953ffe0	Removed forced inline markup from emulated APIs that don't use memory order. Forced inline is mostly used to ensure the compiler is able to treat memory order arguments as constants. It is also useful for constant propagation on other arguments. This is not very useful for the emulated backend, so we might as well allow the compiler to not inline the functions.	2020-06-12 03:14:33 +03:00
Andrey Semashev	69c150e178	Added a workaround for broken codegen in MSVC-8 affecting emulated wait. When the emulated wait function is inlined, the compiler sometimes generates code that acts as if a wrong value is returned from the wait function. The compiler simply "forgets" to save the atomic value into an object on the stack, which makes it later use a bogus value as the "returned" value. Preventing inlining seems to work around the problem. Discovered by wait_api notify_one/notify_all test failures for struct_3_bytes. Oddly enough, the same test for uint32_t did not fail.	2020-06-12 02:51:37 +03:00
Andrey Semashev	65ada4d229	Change to a shorter instruction for seq_cst fences on x86. Also, use explicitly sized integers for dummy arguments to the fence instructions.	2020-06-12 00:55:31 +03:00
Andrey Semashev	559eba81af	Use dummy atomic instruction instead of mfence for seq_cst fences on x86. mfence is more expensive on most recent CPUs than a lock-prefixed instruction on a dummy location, while the latter is sufficient to implement sequential consistency on x86. Some performance test results are available here: https://shipilev.net/blog/2014/on-the-fence-with-dependencies/ Also, for seq_cst stores in gcc_atomic backend, use an xchg instead of mov+mfence, which are generated by gcc versions older than 10.1. The machinery to detect mfence presence is still left intact in case we need to use this instruction in the future. Closes https://github.com/boostorg/atomic/issues/36.	2020-06-11 22:32:01 +03:00
Andrey Semashev	ea70d79920	Fixed capability macros for 80-bit x87 long double types. Capability macros for 80-bit long double would indicate no lock-free support even if 128-bit atomic operations were available.	2020-06-11 13:07:46 +03:00
Andrey Semashev	53978fca3d	Added a link to the article about Linux ARM atomic functions.	2020-06-11 13:07:45 +03:00
Andrey Semashev	e5e96fbc9a	Added atomic_unsigned/signed_lock_free typedefs introduced in C++20. The typedefs indicate the atomic object type for an unsigned/signed integer that is lock-free and preferably has native support for waiting and notifying operations.	2020-06-11 13:07:45 +03:00
Andrey Semashev	80cfbfd0de	Added implementation of inter-process atomics. The inter-process atomics have ipc_ prefixes: ipc_atomic, ipc_atomic_ref and ipc_atomic_flag. These types are similar to their unprefixed counterparts with the following distinctions: - The operations are provided with an added precondition that is_lock_free() returns true. - All operations, including waiting/notifying operations, are address-free, so the types are suitable for inter-process communication. - The new has_native_wait_notify() operation and always_has_native_wait_notify static constant allow testing whether the target platform has native support for address-free waiting/notifying operations. If it does not, a generic implementation based on a busy wait is used. - A new set of capability macros was added. The macros are named BOOST_ATOMIC_HAS_NATIVE_<T>_IPC_WAIT_NOTIFY and indicate whether address-free waiting/notifying operations are supported natively for a given type. Additionally, to unify interface and implementation of different components, the has_native_wait_notify() operation and always_has_native_wait_notify static constant were added to non-IPC atomic types as well. Added BOOST_ATOMIC_HAS_NATIVE_<T>_WAIT_NOTIFY capability macros to indicate native support for inter-thread waiting/notifying operations. Also, added is_lock_free() and is_always_lock_free to atomic_flag. This commit adds implementation, docs and tests.	2020-06-11 13:07:16 +03:00
Andrey Semashev	e4f8770665	Reorganized atomic, atomic_ref and atomic_flag implementation. Moved public class definitions to the public headers and renamed the internal implementation headers. This will allow reusing the implementation headers for inter-process atomics later.	2020-06-09 21:56:03 +03:00
Andrey Semashev	352a954ac1	Corrected BOOST_ATOMIC_FLAG_LOCK_FREE definition.	2020-06-09 21:55:38 +03:00
Andrey Semashev	32c396f4f1	Corrected syntax for integer constants in Alpha asm blocks.	2020-06-07 00:17:38 +03:00
Andrey Semashev	c849b6d877	Use 32-bit storage to implement atomic_flag. Most platforms that support futexes or similar mechanisms support them for 32-bit integers, which makes 32-bit storage the preferred choice for implementing atomic_flag efficiently. Most architectures also support 32-bit atomic operations natively. Also, reduced code duplication in instantiating operation backends.	2020-06-03 01:48:48 +03:00
Andrey Semashev	b9fadc852a	Added Windows backend for waiting/notifying operations. The backend uses runtime detection of availability of Windows API for futex-like operations (only available since Windows 8).	2020-06-03 01:48:48 +03:00
Andrey Semashev	e72ccb02e4	Added support for NetBSD futex variant.	2020-06-03 01:48:48 +03:00
Andrey Semashev	214169b86e	Added DragonFly BSD umtx backend for waiting/notifying operations.	2020-06-03 01:48:48 +03:00
Andrey Semashev	b5988af279	Added FreeBSD _umtx_op backend for waiting/notifying operations.	2020-06-03 01:48:48 +03:00
Andrey Semashev	bf182818f4	Added futex-based implementation for waiting/notifying operations.	2020-06-03 01:48:37 +03:00
Andrey Semashev	76e25f36a3	Added generic implementation of C++20 waiting/notifying operations. The generic implementation is based on the lock pool. A list of condition variables (or waiting futexes) is added per lock. Basically, the lock pool serves as a global hash table, where each lock represents a bucket and each wait state is an element. Every wait operation allocates a wait state keyed on the pointer to the atomic object. Notify operations look up the wait state by the atomic pointer and notify the condition variable/futex. The corresponding lock needs to be acquired to protect the wait state list during all wait/notify operations. Backends not involving the lock pool are going to be added later. The implementation of wait operation extends the C++20 definition in that it returns the newly loaded value instead of void. This allows the caller to avoid loading the value separately. The waiting/notifying operations are not address-free. Address-free variants will be added later. Added tests for the new operations and refactored existing tests for atomic operations. Added docs for the new operations.	2020-06-03 01:39:20 +03:00
Andrey Semashev	8472012d9f	Cast pointers to uintptr_t. This silences bogus MSVC-8 warnings about possible pointer truncation.	2020-05-31 23:33:33 +03:00
Andrey Semashev	7497d41fa7	Added support for yield ARMv8-A instruction. Also, added "memory" clobber for pause instruction to prevent the compiler from reordering memory loads and stores across pause().	2020-05-28 20:26:51 +03:00
Andrey Semashev	8e387475a5	Replaced integral_truncate with bitwise_cast in atomic_ref<integral>. bitwise_cast is more lightweight in terms of compile times and is equivalent to integral_truncate in case of atomic_ref as its storage type is always of the same size as the value type.	2020-05-23 23:44:05 +03:00
Andrey Semashev	90568e8e1c	Ensure that the atomic_ref storage size matches the value size. Also, removed the unnecessary uintptr_storage_type from atomic_ref specialization for pointers.	2020-05-23 23:32:38 +03:00
Andrey Semashev	e3dce0e226	Added missing const qualifiers in atomic_ref ops and updated tests to verify this.	2020-05-23 23:19:14 +03:00
Andrey Semashev	0b94a7d655	Fixed incorrect size of buffer_storage on C++03 compilers. Due to BOOST_ATOMIC_DETAIL_ALIGNED_VAR_TPL macro expansion, the aligner data member was made an array, which increased the size of the resulting buffer_storage. This caused memory corruption with atomic_ref, which requires the storage type to be of the same size as the value. To protect against such mistakes in the future, changed BOOST_ATOMIC_DETAIL_ALIGNED_VAR_TPL and BOOST_ATOMIC_DETAIL_ALIGNED_VAR definitions to prohibit their direct use with arrays.	2020-05-23 23:05:43 +03:00
Andrey Semashev	75a6423a37	Fixed name clash between macro param and type_with_alignment::type member.	2020-05-22 22:46:19 +03:00
Andrey Semashev	5cfa550311	Moved aligned variable declaration workaround to a separate header.	2020-05-22 18:29:08 +03:00
Andrey Semashev	0e5e52efad	Improve lock pool implementation. Increased lock pool size to 64 entries and improved pool efficiency: - Shift off lower pointer bits that are zero due to object alignment. - Mix higher pointer bits to account for alignment typically imposed by malloc/new implementations. - Use bit masking to select a lock from the pool, given that the pool size is a power of 2 now. Also, extracted (u)intptr_t definition to a common header to avoid code duplication.	2020-03-09 20:01:49 +03:00
Andrey Semashev	c4e60c3a65	Added static asserts verifying that the user's type is acceptable for atomics. Require that user's type is complete and trivially copyable.	2020-03-04 00:11:39 +03:00
Andrey Semashev	f32165562d	Apply missing max_align_t workaround to gcc 4.8 as well.	2020-03-01 22:15:16 +03:00
Andrey Semashev	f91557eeef	Added a workaround for missing max_align_t in gcc 4.7.	2020-03-01 20:29:54 +03:00
Andrey Semashev	5e88a03da1	Fixed compilation due to missing type_with_alignment definition.	2020-03-01 20:20:55 +03:00
Andrey Semashev	28c79c0147	Updated copyright years.	2020-03-01 19:23:32 +03:00
Andrey Semashev	6d82535d88	Renamed integral_extend.hpp to integral_conversions.hpp.	2020-03-01 19:21:56 +03:00
Andrey Semashev	e65d952cdf	Use static_casts to convert value_type to storage_type for integral atomics. This simplifies the code slightly without changing semantics. static_cast was already used in atomic constructor in order to make it constexpr, and this commit makes the rest of the code consistent.	2020-03-01 19:14:38 +03:00
Andrey Semashev	e96f56aed1	Added a workaround for MSVC 14.0 alignas issues in 32-bit mode. The compiler allows applying alignas but later fails to pass arguments of the aligned types to functions with error C2719. At the same time, std::max_align_t has alignment of 8 and the error doesn't show up when the type is aligned using the union trick. Thus we disable alignas for MSVC 14.0 in 32-bit mode. Also, use std::max_align_t on MSVC, when possible.	2020-03-01 18:41:14 +03:00
Andrey Semashev	c22fc7d416	Added a workaround for gcc 4.7 not supporting constexpr ctors with unions. gcc 4.7 does not support constexpr constructors that initialize one member of an anonymous union data member of the class. atomic and atomic_flag no longer have constexpr constructors on this compiler.	2020-03-01 18:05:01 +03:00
Andrey Semashev	91b7d325e0	Added a workaround for older gcc and clang with broken std::alignment_of. gcc older than 8.1 and clang older than 8.0 produce incorrect results of std::alignment_of for 64-bit types on 32-bit x86. Use boost::alignment_of, which contains workarounds for these compilers.	2020-03-01 16:53:36 +03:00
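
The lock selection steps listed in commit 0e5e52efad (shift off alignment bits, mix in higher bits, mask by a power-of-2 pool size) can be illustrated with a minimal sketch. This is not the library's actual code: the names lock_pool_size, alignment_shift, mix_shift, lock_index_for_address and lock_index_for are hypothetical, and the shift amounts are assumptions chosen for illustration. Only the pool size of 64 comes from the commit message.

```cpp
#include <cstddef>
#include <cstdint>

// Pool size stated in the commit message; it must be a power of 2
// so that a bit mask can replace the modulo operation.
constexpr std::size_t lock_pool_size = 64;
static_assert((lock_pool_size & (lock_pool_size - 1)) == 0,
              "lock pool size must be a power of 2");

// Assumed values, for illustration only (not taken from the library).
constexpr unsigned alignment_shift = 2; // assumes objects are aligned to at least 4 bytes
constexpr unsigned mix_shift = 6;       // log2(lock_pool_size)

// Maps an object address to an index in the lock pool.
inline std::size_t lock_index_for_address(std::uintptr_t address) noexcept
{
    // 1. Shift off the low bits that are zero due to object alignment.
    std::uintptr_t h = address >> alignment_shift;
    // 2. Mix higher bits into the low bits, so that addresses differing only
    //    in higher bits (e.g. allocator-aligned blocks) spread over the pool.
    h ^= h >> mix_shift;
    // 3. Select a lock by masking; valid because the pool size is a power of 2.
    return static_cast<std::size_t>(h & (lock_pool_size - 1));
}

// Convenience overload taking a pointer to the (hypothetical) atomic object.
inline std::size_t lock_index_for(const volatile void* p) noexcept
{
    return lock_index_for_address(reinterpret_cast<std::uintptr_t>(p));
}
```

With these assumed shifts, the addresses 0x1000 and 0x2000 map to different locks even though their low 12 bits are identical. Without the mixing step, both would select lock 0.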


285 Commits