TXPAUSE : polite waiting for hardware transactional memory
Posted
by Dave
on Oracle Blogs
See other posts from Oracle Blogs
or by Dave
Published on Thu, 25 Oct 2012 04:25:48 +0000
Indexed on
2012/10/25
11:09 UTC
Read the original article
Hit count: 250
/General
Classic locks are an appropriate tool to prevent potentially conflicting operations A and B, invoked by different threads, from running at the same time. In a sense the locks cause either A to run before B or vice-versa. Similarly, we can replace the locks with hardware transactional memory, or use transactional lock elision to leverage potential disjoint access parallelism between A and B. But often we want A to wait until B has run. In a Pthreads environment we'd usually use locks in conjunction with condition variables to implement our "wait until" constraint. MONITOR-MWAIT is another way to wait for a memory location to change, but it only allows us to track one cache line and it's only available on x86. There's no similar "wait until" construct for hardware transactions. At the instruction-set level a simple way to express "wait until" in transactions would be to add a new TXPAUSE instruction that could be used within an active hardware transaction. TXPAUSE would politely stall the invoking thread, possibly surrendering or yielding compute resources, while at the same time continuing to track the transaction's address-set. Once a transaction has executed TXPAUSE it can only abort. Ideally that'd happen when some other thread modifies a variable that's in the transaction's read-set or write-set. And since we're aborting all writes would be discarded. In a sense this gives us multi-location MWAIT but with much more flexibility. We could also augment the TXPAUSE with a cycle-count bound to cap the time spent stalled.
I should note that we can already enter a tight spin loop in a transaction to wait for updates to address-set to cause an abort. Assuming that the implementation monitors the address-set via cache-coherence probes, by waiting in this fashion we actually communicate via the probes, and not via memory values. That is the updating thread signals the waiter via probes instead of by traditional memory values. But TXPAUSE gives us a polite way to spin.
© Oracle Blogs or respective owner