C++ - Boost vs. std atomic sequential consistency semantics


I'd like to write a C++ lock-free object where there are many logger threads logging to a large global (non-atomic) ring buffer, with an occasional reader thread that wants to read as much data in the buffer as possible. I ended up having a global atomic counter from which loggers get the locations to write to, and each logger increments the counter atomically before writing. The reader tries to read the buffer and a per-logger local (atomic) variable to know whether particular buffer entries are busy being written by some logger, so as to avoid using them.
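As a rough sketch of the slot-claiming step only (the names `kSlots`, `next_slot`, `claim_slot` and `run_loggers` are made up for illustration, and the capacity is an assumption), the counter can be advanced with `fetch_add`, which hands every logger a distinct ticket. The helper records tickets in per-thread vectors so that the sketch itself contains no data race.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

// Hypothetical names and capacity, for illustration only.
constexpr std::size_t kSlots = 1024;     // ring buffer capacity (assumed)
std::atomic<std::size_t> next_slot{0};   // global counter shared by all loggers

// A logger claims a position before writing to it. fetch_add is an atomic
// read-modify-write, so no two callers ever receive the same ticket.
std::size_t claim_slot() {
    std::size_t ticket = next_slot.fetch_add(1, std::memory_order_relaxed);
    return ticket % kSlots;
}

// Runs `threads` loggers that each claim `per_thread` slots and returns,
// for every slot index, how many times it was handed out.
std::vector<int> run_loggers(int threads, int per_thread) {
    assert(threads > 0 && per_thread > 0);
    std::vector<std::vector<std::size_t>> claimed(threads);
    std::vector<std::thread> pool;
    for (int t = 0; t < threads; ++t) {
        pool.emplace_back([&claimed, t, per_thread] {
            // Each thread appends only to its own vector.
            for (int i = 0; i < per_thread; ++i) {
                claimed[t].push_back(claim_slot());
            }
        });
    }
    for (auto& th : pool) th.join();

    std::vector<int> hits(kSlots, 0);
    for (const auto& per_logger : claimed) {
        for (std::size_t slot : per_logger) ++hits[slot];
    }
    return hits;
}
```

Relaxed ordering is enough here because only the uniqueness of the ticket matters; it says nothing about when the data written into the claimed slot becomes visible to the reader.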

So I have to do synchronization between a pure reader thread and many writer threads. I sense that the problem can be solved without using locks, and that I can rely on the "happens after" relation to determine whether my program is correct.

I've tried relaxed atomic operations, but that won't work: atomic variable stores are releases and loads are acquires, and the guarantee is that some acquire (and its subsequent work) always "happens after" some release (and its preceding work). That means there is no way for the reader thread (which does no store at all) to guarantee that something "happens after" the time it reads the buffer, which means I don't know whether some logger has overwritten part of the buffer while the reader is reading it.
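To make the direction of that guarantee concrete, here is a minimal release/acquire sketch (the function name `publish_and_consume` and the payload value are illustrative assumptions). The thread that loads with acquire is ordered after the thread that stored with release, never the other way round, which is why a reader that performs no store cannot order a writer after itself this way.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Returns the payload value the consumer observed after seeing the flag.
int publish_and_consume() {
    int payload = 0;                  // plain, non-atomic data
    std::atomic<bool> ready{false};
    int seen = -1;

    std::thread producer([&] {
        payload = 42;                                   // preceding work
        ready.store(true, std::memory_order_release);   // release
    });
    std::thread consumer([&] {
        while (!ready.load(std::memory_order_acquire)) {  // acquire
            std::this_thread::yield();
        }
        seen = payload;  // ordered after the producer's write to payload
    });

    producer.join();
    consumer.join();
    assert(seen != -1);  // the consumer ran and recorded a value
    return seen;
}
```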

So I turned to sequential consistency. For me, "atomic" means Boost.Atomic, whose notion of sequential consistency has a "pattern" documented:

The third pattern for coordinating threads via Boost.Atomic uses seq_cst for coordination: If ...

  1. thread1 performs an operation A,
  2. thread1 subsequently performs any operation with seq_cst,
  3. thread1 subsequently performs an operation B,
  4. thread2 performs an operation C,
  5. thread2 subsequently performs any operation with seq_cst,
  6. thread2 subsequently performs an operation D,

then either "A happens-before D" or "C happens-before B" holds.
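One observable consequence of this pattern can be sketched with `std::atomic` (this uses a `seq_cst` fence as the "any operation with seq_cst"; that choice, and the names `run_once` and `count_both_zero`, are my assumptions, not something taken from the Boost documentation). If A and C are stores and B and D are loads of the other variable, at least one thread must see the other's store.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

struct Outcome { int r1; int r2; };

// One run of the pattern: A/C are stores, B/D are loads of the other variable.
Outcome run_once() {
    std::atomic<int> x{0}, y{0};
    int r1 = -1, r2 = -1;

    std::thread t1([&] {
        x.store(1, std::memory_order_relaxed);                // operation A
        std::atomic_thread_fence(std::memory_order_seq_cst);  // seq_cst step
        r1 = y.load(std::memory_order_relaxed);               // operation B
    });
    std::thread t2([&] {
        y.store(1, std::memory_order_relaxed);                // operation C
        std::atomic_thread_fence(std::memory_order_seq_cst);  // seq_cst step
        r2 = x.load(std::memory_order_relaxed);               // operation D
    });
    t1.join();
    t2.join();
    return {r1, r2};
}

// Counts runs in which neither thread saw the other's store.
// With the two seq_cst fences in place this should never happen.
int count_both_zero(int iterations) {
    assert(iterations > 0);
    int both_zero = 0;
    for (int i = 0; i < iterations; ++i) {
        Outcome o = run_once();
        if (o.r1 == 0 && o.r2 == 0) ++both_zero;
    }
    return both_zero;
}
```

Replacing both fences with `acq_rel` fences would permit the outcome where both loads return 0, which is exactly the difference between the two orderings that the rest of this question is about.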

Note that the second and fifth lines say "any operation", without saying whether it modifies anything or what it operates on. This provides the guarantee that I wanted.

All was well until I watched the talk by Herb Sutter titled "atomic<> Weapons". What he implies is that seq_cst is just acq_rel with the additional guarantee of a consistent ordering of atomic stores. I turned to cppreference.com, which has a similar description.

So my questions:

  1. Do C++11 and Boost.Atomic implement the same memory model?
  2. If (1) is "yes", does it mean the "pattern" described by Boost is somehow implied by the C++11 memory model? How? Or does it mean the documentation of either Boost or C++11 on cppreference is wrong?
  3. If (1) is "no", or (2) is "yes, but the Boost documentation is incorrect", is there any way to achieve the effect I want in C++11, namely to have a guarantee that (the work subsequent to) some atomic store happens after (the work preceding) some atomic load?

I saw no answer here, so I asked again on the Boost users mailing list. I saw no answer there either (apart from a suggestion to look into Boost.Lockfree), so I planned to ask Herb Sutter (expecting no answer anyway). But before doing that, I googled "C++ memory model" a little more deeply. After reading a page by Hans Boehm (http://www.hboehm.info/c++mm/), I could answer most of my own question. I googled a bit more, this time for "C++ data race", and landed at a page by Bartosz Milewski (http://bartoszmilewski.com/2014/10/25/dealing-with-benign-data-races-the-c-way/). Then I could answer even more of my own question. Unluckily, I still don't know how to do what I want given that knowledge. Perhaps what I want to do is actually unachievable in standard C++.

The first part of my question: "Do C++11 and Boost.Atomic implement the same memory model?" The answer is, mostly, "yes". The second part of my question: "If (1) is 'yes', does it mean the 'pattern' described by Boost is somehow implied by the C++11 memory model?" The answer is again yes. "How?" is answered by a proof found here (http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2007/n2392.html). Essentially, for data-race-free programs, the little bit added to acq_rel is sufficient to guarantee the behavior required by seq_cst. So both pieces of documentation, although perhaps confusing, are correct.

Now the real problem: although both (1) and (2) get "yes" answers, my original program is wrong! I neglected (actually, I was unaware of) an important rule of C++: a program with a data race has undefined behavior (rather than "unspecified" or "implementation-defined" behavior). That is, the compiler guarantees the behavior of my program only if the program has absolutely no data race. Without a lock, my program contains a data race: the pure reader thread can read at any time, even while a logger thread is busy writing. This is "undefined behavior", and the rule says the computer can do anything (the "catch fire" rule). To fix it, one has to use the ideas found in the page by Bartosz Milewski mentioned earlier, i.e., change the ring buffer to contain atomic content, so that the compiler knows its ordering is important and that accesses to it must not be reordered with the operations marked as requiring sequential consistency. If overhead minimization is desired, one can write to it using relaxed atomic operations.

Unluckily, this applies to the reader thread too. I can no longer just "memcpy" the whole memory buffer. Instead I must also use relaxed atomic operations to read the buffer, one word after another. This kills performance, but I actually have no choice. Luckily for me, the dumper's performance is not important to me at all: it rarely gets run anyway. But if I wanted the performance of "memcpy", I would get the answer "no solution": C++ provides no semantics for "I know there is a data race; you can return anything to me here, but don't screw up my program". Either you ensure there is no data race and pay the cost to get everything defined, or you have a data race and the compiler is allowed to put you in jail.
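A minimal sketch of that fix, under my own assumptions (64-bit words, a 256-word buffer, and the hypothetical names `ring`, `write_pos`, `log_word` and `snapshot`; the per-logger "busy" flags are left out): the buffer holds atomic words, loggers store with relaxed ordering, and the reader copies word by word with relaxed loads instead of calling memcpy.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <thread>
#include <vector>

// Hypothetical names and sizes, for illustration only.
constexpr std::size_t kWords = 256;                  // capacity in words (assumed)
std::array<std::atomic<std::uint64_t>, kWords> ring; // static storage: starts zeroed
std::atomic<std::size_t> write_pos{0};

// Logger side: claim a slot, then write one word with a relaxed atomic store.
void log_word(std::uint64_t value) {
    std::size_t slot =
        write_pos.fetch_add(1, std::memory_order_relaxed) % kWords;
    assert(slot < kWords);  // the modulo keeps the index in range
    ring[slot].store(value, std::memory_order_relaxed);
}

// Reader side: the memcpy replacement. Every word is read with a relaxed
// atomic load, so a concurrent logger can make the copy stale or mixed-age,
// but it can no longer cause a data race.
std::vector<std::uint64_t> snapshot() {
    std::vector<std::uint64_t> out(kWords);
    for (std::size_t i = 0; i < kWords; ++i) {
        out[i] = ring[i].load(std::memory_order_relaxed);
    }
    return out;
}
```

A snapshot taken while loggers are running may mix old and new words; deciding which entries are trustworthy still needs something like the per-logger flags described in the question.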

