
Memory ordering


, in which any reordering is permitted (even across statements) if no effect on the visible program semantics results. Under this rule, the order of operations in the translated code can vary wildly from the specified program order. If the compiler is permitted to make optimistic assumptions about distinct pointer expressions having no alias overlap in a case where such aliasing actually exists (this would normally be classified as an ill-formed program exhibiting undefined behavior
, or during debugging when using a hardware debugging aid with access to the machine state (some support for this is often built directly into the CPU or microcontroller as functionally independent circuitry apart from the execution core which continues to operate even when the core itself is halted for static inspection of its execution state). Compile-time memory order concerns itself with the former, and does not concern itself with these other views.
A complete grasp of memory order semantics is considered to be an arcane specialization even among the subpopulation of professional systems programmers who are typically best informed in this subject area. Most programmers settle for an adequate working grasp of these issues within the normal domain
on the floating point data type available in most programming languages is not associative in rounding effects, making effects of the order of expression visible in small differences of the computed result (small initial differences may however cascade into arbitrarily large differences over a longer computation).
where reads and writes to memory trigger I/O operations, or changes to the processor's operational mode, which are highly visible side effects. For the above example, assume for now that the pointers are pointing to regular program memory, without these side-effects. The compiler is free to reorder
Dependent loads can be reordered (this is unique for Alpha). If the processor first fetches a pointer to some data and then the data, it might not fetch the data itself but use stale data which it has already cached and not yet invalidated. Allowing this relaxation makes cache hardware simpler and
In many programming languages different types of barriers can be combined with other operations (like load, store, atomic increment, atomic compare and swap), so no extra memory barrier is needed before or after it (or both). Depending on a CPU architecture being targeted these language constructs
By far the largest class of side effects in a modern procedural language involves memory write operations, so rules around memory ordering are a dominant component in the definition of program order semantics. The reordering of the function calls above might appear to be a different consideration,
As a result, many high-level compiled languages, such as C/C++, have evolved to have intricate and sophisticated semantic specifications about where the compiler is permitted to make optimistic assumptions in code reordering in pursuit of the highest possible performance, and where the compiler is
Because of possible aliasing effects, pointer expressions are difficult to rearrange without risking visible program effects. In the common case, there might not be any aliasing in effect, so the code appears to run normally as before. But in the edge case where aliasing is present, severe program
In general, compiled languages are not detailed enough in their specification for the compiler to determine formally at compile time which pointers are potentially aliased and which are not. The safest course of action is for the compiler to assume that all pointers are potentially aliased at all
At the machine level, few machines can add three numbers together in a single instruction, and so the compiler will have to translate this expression into two addition operations. If the semantics of the program language restrict the compiler into translating the expression in left-to-right order
systems) cache line invalidations sent to other processors are processed in lazy fashion by default, unless requested explicitly to be processed between dependent loads. The Alpha architecture specification also allows other forms of dependent loads reordering, for example using speculative data
Other high-level languages tilt toward such a declaration attribute amounting to a strong guarantee with no loop-holes to violate this guarantee provided within the language itself; all bets are off on this language guarantee if your application links a library written in a different programming
, forcing all effects of one statement to be complete before the next statement is executed. This will force the compiler to generate code corresponding to the statement order expressed. Statements are, however, often more complicated, and may contain internal function calls.
to the declaration of its pointer argument, rendering the expression well defined. Thus the modern culture of C/C++ has become somewhat obsessive about supplying const qualifiers to function argument declarations in all viable cases.
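
The const-qualifier point above can be sketched in C. This is a minimal illustration, not code from the source: the bodies of `f` and `g` and the wrapper name `sum_with_escaped_pointer` are hypothetical, chosen only to make the expression `f(&a) + g(a)` concrete.

```c
/* Hypothetical f and g: only the const-qualified signature of f matters here. */

/* f promises, via const, not to write through its pointer argument. */
static int f(const int *p) {
    return *p + 1;
}

static int g(int v) {
    return v * 2;
}

/* Because f does not modify a through p (it would need a cast that removes
 * const to do so), the result is the same whichever call is evaluated first. */
int sum_with_escaped_pointer(void) {
    int a = 5;
    return f(&a) + g(a);
}
```

With these assumed bodies, `f(&a)` yields 6 and `g(a)` yields 10 in either evaluation order, so the sum is 16.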
as it finds convenient, resulting in large-scale changes of program memory order. In a pure functional programming language, function calls are forbidden from having side effects on the visible program state (other than its return value
Most programming languages have some notion of a thread of execution which executes statements in a defined order. Traditional compilers translate high-level expressions to a sequence of low-level instructions relative to a program counter
), the adverse results of an aggressive code-optimization transformation are impossible to guess prior to code execution or direct code inspection. The realm of undefined behavior has nearly limitless manifestations.
errors can result. Even if these edge cases are entirely absent in normal execution, it opens the door for a malicious adversary to contrive an input where aliasing exists, potentially leading to a computer security exploit.
) and the difference in machine memory order due to function call ordering will be inconsequential to program semantics. In procedural languages, the functions called might have side-effects, such as performing an I/O operation
It is the responsibility of the programmer to consult the language specification to avoid writing ill-formed programs where the semantics are potentially changed as a result of any legal compiler optimization.
Some high-level languages eliminate pointer constructions altogether, as this level of alertness and attention to detail is considered too high to reliably maintain even among professional programmers.
This column indicates the behaviour of the vast majority of x86 processors. Some rare specialised x86 processors (IDT WinChip manufactured around 1998) may have weaker 'oostore' memory ordering.
Creates a barrier across which the compiler will not schedule any data access instruction. The compiler may allocate local data in registers across a memory barrier, but not global data.
but this usually devolves into concerns about memory effects internal to the called functions interacting with memory operations in the expression which generates the function call.
will translate to either special instructions, to multiple instructions (i.e. barrier and load), or to a normal instruction, depending on hardware memory ordering guarantees.
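
As a hedged sketch of a barrier combined with an ordinary operation, the C11 `<stdatomic.h>` interface lets a store carry release ordering and a load carry acquire ordering, so no separate fence statement is written. The names `payload`, `ready`, `publish` and `try_consume` are assumptions made for this illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

static int payload;        /* ordinary data, written before the flag is set */
static atomic_bool ready;  /* zero-initialized, i.e. false */

/* Writer: the release store keeps the payload write ahead of the flag write. */
void publish(int value) {
    payload = value;
    atomic_store_explicit(&ready, true, memory_order_release);
}

/* Reader: the acquire load keeps the payload read behind the flag read. */
bool try_consume(int *out) {
    if (atomic_load_explicit(&ready, memory_order_acquire)) {
        *out = payload;
        return true;
    }
    return false;
}
```

How this is lowered depends on the target: on a strongly ordered machine both operations may compile to plain loads and stores, while a weakly ordered machine may need special instructions or an added barrier.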
Suppose, however, that the programmer is concerned about the visible semantics of integer overflow and breaks the statement apart at the program level as follows:
Execution effects are visible at two levels: within the program code at a high level, and at the machine level as viewed by other threads or processing elements in concurrent programming
If the programmer is concerned about integer overflow or rounding effects in floating point, the same program may be coded at the original high level as follows:
processors have the strongest memory order, but may still defer memory store instructions until after memory load instructions. On the other end of the spectrum, DEC Alpha
During compilation, hardware instructions are often generated at a finer granularity than specified in the high-level code. The primary observable effect in a procedural programming
for the function call, which involves many reads and writes to machine memory. In most compiled languages, the compiler is free to order the function calls f, g, and h
times. This level of conservative pessimism tends to produce dreadful performance as compared to the optimistic assumption that no aliasing exists, ever.
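
One way a language can let the programmer lift the pessimistic assumption is an explicit no-alias promise. The sketch below uses the C99 `restrict` qualifier as an example of such a promise; the function name `add3_no_alias` is hypothetical, and the claim about what the compiler may do is a general statement about `restrict`, not something taken from the source.

```c
/* Assumption for illustration: the caller guarantees that sum does not
 * alias a, b or c. With restrict on sum, the compiler is allowed to keep
 * *a, *b and *c in registers across the first write to *sum. */
void add3_no_alias(int *restrict sum, const int *a, const int *b, const int *c) {
    *sum = *a + *b;
    *sum = *sum + *c;
}
```

Calling this function with `sum` pointing at the same object as `a`, `b` or `c` breaks the promise and makes the behavior undefined, which is exactly the burden on the programmer that the surrounding text describes.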
This would not be viewed as efficient in most instances, and pointer writes have potential side-effects on visible machine state. Since the compiler is not
are free from program-visible side-effects, all three choices will produce a program with the same visible program effects. If the implementation of f or g
when either the order of operations cannot change or when such changes have no visible effect on any thread. Conversely, the memory order is called weak or relaxed
The print statement follows the statement which assigns to the variable sum, and thus when the print statement references the computed variable
These barriers prevent a compiler from reordering instructions during compile time; they do not prevent reordering by the CPU at runtime.
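
The distinction between a compile-time barrier and a runtime barrier can be sketched with the two C11 fence functions. This is an illustration under stated assumptions: the variable and function names are hypothetical, `atomic_signal_fence` is used as the compiler-only barrier, and `atomic_thread_fence` as the one that may also emit a hardware barrier instruction.

```c
#include <signal.h>
#include <stdatomic.h>

static int payload;                         /* plain data */
static volatile sig_atomic_t handler_flag;  /* read by a signal handler on the same thread */
static atomic_int thread_flag;              /* read by another thread */

/* Compiler-only barrier: constrains the compiler, emits no fence instruction. */
void publish_to_signal_handler(int value) {
    payload = value;
    atomic_signal_fence(memory_order_release);
    handler_flag = 1;
}

/* Compiler and hardware barrier: also orders the stores as seen by other CPUs. */
void publish_to_other_thread(int value) {
    payload = value;
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(&thread_flag, 1, memory_order_relaxed);
}
```

The first function is only adequate when the observer runs on the same thread, such as a signal handler; a reader on another CPU needs the second form paired with an acquire on its side.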
does this in a way that can break the expression above, it should not be declaring the pointer argument type as const in the first place.
it references this result as an observable effect of the prior execution sequence. As defined by the rules of program sequence, when the
Note that the integer data type in most programming languages only follows the algebra for the mathematical integers in the absence of integer overflow and that floating-point arithmetic
(for example), then the generated code will look as if the programmer had written the following statements in the original program:
of their programming expertise. At the extreme end of specialization in memory order semantics are the programmers who author software frameworks in support of concurrent computing models.
Note that local variables cannot be assumed to be free of aliasing if a pointer to such a variable escapes into the wild:
faster but leads to the requirement of memory barriers for readers and writers. On Alpha hardware (like multiprocessor Alpha 21264
When reading from standard program storage, there are no side-effects due to the order of memory read operations. In embedded system
Again, a programmer concerned with these effects can become more pedantic in expressing the original source program:
Now consider the same summation expressed with pointer indirection, in a language such as C or C++ which supports pointers:
, or updating a variable in global program scope, both of which produce visible effects with the program model.
This guarantees the order of the two addition operations, but potentially introduces a new problem of address aliasing
// as directly authored by the programmer
// with aliasing concerns
*sum = *a + *b;
*sum = *sum + *c;
, few compilers or CPU architectures ensure perfectly strong ordering. Among the commonly used architectures, x86-64
x86, x86-64:
lfence (asm), void _mm_lfence(void)
sfence (asm), void _mm_sfence(void)
mfence (asm), void _mm_mfence(void)
878: 98: 1550: 1448:
Many architectures with SMP support have special hardware instructions for flushing reads and writes during runtime.
908: 1794: 595:. The second statement encodes two memory reads (in either order) which must precede the second update of 365: 348: 274: 495:
In programming languages where the statement boundary is defined as a sequence point, the function calls f, g, and h
when one thread cannot predict the order of operations arising from another thread. Many naïvely written parallel algorithms
1449: 262: 1879: 304:
fail when compiled or executed with a weak memory order. The problem is most often solved by inserting memory barrier
The first statement encodes two memory reads, which must precede (in either order) the first write to *sum
1843: 794: 600: 535: 415: 405: 44: 2082: 1951: 1917: 984: 778: 301: 1815: 1673: 1525: 790: 766: 571:// as rewritten by the compiler // generally forbidden *sum = *a + *b; *sum = *sum + *c; 555:
these reads in program order as it sees fit, and there will be no program-visible side effects.
Here the language definition is unlikely to allow the compiler to break this apart as follows:
2048: 1647: 639:// what the program becomes with *c and *sum aliased *sum = *a + *b; *sum = *sum + *sum; 2151: 1935: 1862: 1618: 1480: 551: 270: 538:" a pointer and involves reading from memory at a location specified by the current value of 123: 977: 646:
is overwritten before its first access, and instead we obtain the algebraic equivalent of:
425: 312: 2072:
Data Memory Barrier, Data Synchronization Barrier, and Instruction Synchronization Barrier.
Fortran traditionally places a high burden on the programmer to be aware of these issues, with the systems programming
allowed this particular splitting transformation, the only write to the memory location of sum
1886: 1850: 951:
Loads can be reordered after loads (for better working of cache coherency, better scaling)
This C11/C++11 function forbids the compiler from reordering read and write commands around it:
Microsoft Visual C++ compiler (MSVC) supports some intrinsics only for x86/x64 (all of these are deprecated):
required to make pessimistic assumptions in code reordering to avoid semantic hazards.
, it must necessarily ensure that the reordering does not change the output of ordinary single-threaded code.
Weak consistency (reads and writes are arbitrarily reordered, limited only by explicit memory barriers)
868: 671: 614:
are aliased to the same memory location, and rewrite both versions of the program with *sum
448: 305: 2185: 2094: 452: 649:// algebraic equivalent of the aliased case above *sum = (*a + *b) + (*a + *b); 483: 479: 258: 885:
asm volatile("" ::: "memory");
__asm__ __volatile__ ("" ::: "memory");
Memory ordering depends on both the order of the instructions generated by the compiler
from being executed without special instruction cache flush/reload instructions.
462: 320: 65: 723:
contain the side-effect of any pointer write subject to aliasing with pointers a or b
2163: 1702: 762: 624:
There are no problems here. The original value of what we originally wrote as *c
: any of these pointers could potentially refer to the same memory location.
, the three choices are liable to produce different visible program effects.
but this was overwritten in the first place and it's of no special concern.
328: 316: 542:. The effects of reading from a pointer are determined by architecture's 254: 582:
must logically follow the three pointer reads in the value expression.
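
The aliasing hazard discussed for the pointer version of the summation can be made concrete with a small C sketch. The three function names are hypothetical; the bodies mirror the single-expression form, the split form, and the temporary-variable form that appear in the text.

```c
/* Single expression: all three reads happen before the one write to *sum. */
void add3_single(int *sum, const int *a, const int *b, const int *c) {
    *sum = *a + *b + *c;
}

/* Split form: the first write to *sum clobbers *c when c aliases sum. */
void add3_split(int *sum, const int *a, const int *b, const int *c) {
    *sum = *a + *b;
    *sum = *sum + *c;
}

/* Split form with a temporary: no write to *sum until every read is done. */
void add3_temp(int *sum, const int *a, const int *b, const int *c) {
    int temp = *a + *b;
    *sum = temp + *c;
}
```

With `a` pointing at 1, `b` at 2, and both `sum` and `c` pointing at the same variable holding 10, the single-expression and temporary forms store 13, while the split form stores 6, the `(*a + *b) + (*a + *b)` result described in the text.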
At the machine level, calling a function usually involves setting up a stack frame
must be that of the most recently executed assignment to the variable sum
, including leaving a copy around in global state which the function g
language (though this is considered to be egregiously bad design).
There can be an incoherent instruction cache pipeline, which prevents self-modifying code
Modern compilers sometimes take this a step further by means of an as-if rule
// declare a temporary local variable 'temp' of suitable type
temp = *a + *b;
*sum = temp + *c;
Finally consider the indirect case with added function calls:
1456: 945:
Sequential consistency (all reads and all writes are in-order)
processors make practically no guarantees about memory order.
250: 161: 59: 18: 835:
f can be conspicuously prevented from doing this by applying a const qualifier
, making this expression ill-defined in order of execution.
before either function call, it may defer the evaluation of *b
language is assignment of a new value to a named variable.
reads ahead of knowing the real pointer to be dereferenced.
Relaxed consistency (some types of reordering are allowed)
TSO: Total store order (only supported with the Ztso extension)
the constness attribute away as a dangerous expedient. If f
A safe reordering of the previous program is as follows:
In symmetric multiprocessing (SMP) microprocessor systems
Any of these GNU inline assembler statements forbids the GCC
1912: 1910: 265:. However, memory order is of little concern outside of 1965:"Memory Barriers: a Hardware View for Software Hackers" 1641: 1639: 2095:"36793 – x86-64 does not get __sync_synchronize right" 881:
compiler from reordering read and write commands around it:
sum = f(a);
sum = sum + g(b);
sum = sum + h(c);
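
The effect of splitting the summation into separate statements can be observed with functions that have a visible side effect. In this sketch the side effect is an append to a small log; the log, the function bodies, and the wrapper name `sum_in_statement_order` are assumptions made for illustration.

```c
#include <stddef.h>

static char call_log[4];   /* records the order in which f, g and h ran */
static size_t call_count;

static int f(int x) { call_log[call_count++] = 'f'; return x; }
static int g(int x) { call_log[call_count++] = 'g'; return x; }
static int h(int x) { call_log[call_count++] = 'h'; return x; }

/* Each statement boundary is a sequence point, so the calls run as f, g, h. */
int sum_in_statement_order(int a, int b, int c) {
    call_count = 0;
    int sum = f(a);
    sum = sum + g(b);
    sum = sum + h(c);
    return sum;
}
```

Written instead as the single expression `f(a) + g(b) + h(c)`, C leaves the order of the three calls unspecified, so the log contents could differ between compilers even though the returned sum would not.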
(in this case the immediately previous statement).
RMO: Relaxed-memory order (not supported on recent CPUs)
_ReadBarrier()
_WriteBarrier()
_ReadWriteBarrier()
Program-order effects involving pointer expressions
Order of accesses to computer memory by a CPU

Memory ordering is the order of accesses to computer memory by a CPU
at compile time and the execution order of the CPU at runtime
, because if the compiler or CPU changes the order of any operations
The memory order is said to be strong or sequentially consistent
instructions into the program.
In order to fully utilize the bandwidth of different types of memory such as caches and memory banks

Compile-time memory ordering
at the underlying machine level.

General issues of program order

Program-order effects of expression evaluation
sum = a + b + c;
print(sum);
print function call references sum
sum = a + b;
sum = sum + c;
If the compiler is permitted to exploit the associative property of addition, it might instead generate:
sum = b + c;
sum = a + sum;
If the compiler is also permitted to exploit the commutative property of addition, it might instead generate:
sum = a + c;
sum = sum + b;
sum = a + b;
sum = sum + c;

Program-order effects involving function calls
Many languages treat the statement boundary as a sequence point
sum = f(a) + g(b) + h(c);
must now execute in that precise order.

Specific issues of memory order
sum = *a + *b + *c;
programming, it is very common to have memory-mapped I/O
What if assigned value is also pointer indirected?
*sum = *a + *b + *c;
For example, let's assume in this example that *c and *sum
standing in for both.
*sum = *a + *b + *sum;
is lost upon assignment to *sum, and so is the original value of *sum
Here the original value of *sum
which assigns an entirely different value into *sum due to the statement rearrangement.
*sum = f(*a) + g(*b);
The compiler may choose to evaluate *a and *b
until after the function call f or it may defer the evaluation of *a until after the function call g. If the functions f and g

Memory order in language specification

Additional difficulties and complications

Optimization under as-if
languages C and C++ not far behind.

Aliasing of local variables
sum = f(&a) + g(a);
There is no telling what the function f might have done with the supplied pointer to a
later accesses. In the simplest case, f writes a new value to the variable a
C and C++ permit the internals of f to type cast

Compile-time memory barrier implementation
atomic_signal_fence(memory_order_acq_rel);
Intel C++ Compiler (ICC/ICL) uses "full compiler fence" intrinsics:
__memory_barrier()

Combined barriers

Runtime memory ordering
There are several memory-consistency models for SMP systems:
Loads can be reordered after stores
Stores can be reordered after stores
Stores can be reordered after loads
On some CPUs
Atomic operations can be reordered with loads and stores.

Memory ordering in some architectures
Type | Alpha | ARMv7 | MIPS | RISC-V WMO | RISC-V TSO | PA-RISC | POWER | SPARC RMO | SPARC PSO | SPARC TSO | x86 | AMD64 | IA-64 | z/Architecture
Loads can be reordered after loads | Y | Y | depend on implementation | Y | | Y | Y | Y | | | | | Y |
Loads can be reordered after stores | Y | Y | depend on implementation | Y | | Y | Y | Y | | | | | Y |
Stores can be reordered after stores | Y | Y | depend on implementation | Y | | Y | Y | Y | Y | | | | Y |
Stores can be reordered after loads | Y | Y | depend on implementation | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y | Y
Atomic can be reordered with loads | Y | Y | depend on implementation | Y | | | Y | Y | | | | | Y |
Atomic can be reordered with stores | Y | Y | depend on implementation | Y | | | Y | Y | Y | | | | Y |
Dependent loads can be reordered | Y | | depend on implementation | | | | | | | | | | |
Incoherent instruction cache/pipeline | Y | Y | depend on implementation | Y | Y | | Y | Y | Y | Y | Y | | Y |

RISC-V memory ordering models
WMO: Weak memory order (default)

SPARC memory ordering modes
TSO: Total store order (default)
PSO: Partial store order (not supported on recent CPUs)

Hardware memory barrier implementation
PowerPC: sync (asm)
MIPS: sync (asm)
Itanium: mf (asm)
POWER: dcs (asm)
ARMv7: dmb (asm), dsb (asm), isb (asm)

Compiler support for hardware memory barriers
Some compilers support builtins that emit hardware memory barrier instructions:
GCC, version 4.4.0 and later, has __sync_synchronize.
Since C11 and C++11 an atomic_thread_fence() command was added.
The Microsoft Visual C++ compiler has MemoryBarrier() macro in the Windows API header (deprecated).
Sun Studio Compiler Suite has __machine_r_barrier, __machine_w_barrier and __machine_rw_barrier.

See also
Memory model (programming)


Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.