Hacker Newsnew | past | comments | ask | show | jobs | submit | sbahra's commentslogin


Thanks for sharing!

For single-writer multi-reader scenario this requires no atomic fences or operations on TSO: http://concurrencykit.org/presentations/lpc2015.pdf

Works on any open-addressed scheme (there is also a robin hood implementation).


overthewire.org


Give us a spin! Would absolutely love more feedback, it has had a big impact for us already. It’s free for open-source projects.

The tool for ASAN/etc... is just a few weeks old but we have been doing crash / error reporting and custom debuggers (built-in static analysis) for the last 5 years.

Welcome to also shoot me a mail at sbahra@backtrace.io.


Whoops, good catch! I will fix, thanks.


Great write-up! A few notes worth mentioning (IMO) below:

Garbage collection: This is only true in absence of garbage collection. If you're paying garbage collection cost to begin with, this is not an issue. Also, note things such as http://web.mit.edu/willtor/www/res/threadscan-spaa-15.pdf

This only applies to dynamic lock-free data structures: Or, data structures requiring memory allocation. If you're using bounded buffers and don't require memory allocation, this isn't an issue.

Taxonomy: Not all passive schemes are quiescent-state-based. In QSBR, quiescent points are a first-class entity while that is not the case in EBR (you have only 3 distinct states). Absent extensions you are unable to differentiate one quiescent point from another, which has real implications on being able to implement things like deferred / non-blocking reclamation efficiently. There are some advantages to this, a while ago Paul Khuong contributed a reference counting scheme to epoch reclamation allowing for overlapping protected sections (bounded epoch makes reference counting practical, something you can't do with unbounded quiescent points). It's pretty great and note, it doesn't require a strict notion of quiescence! You'll find this in Concurrency Kit.

It is also worth noting the HP, etc... (pointer-based schemes) can be used to implement QSBR.

For these reasons, I prefer to classify these techniques into two primary families (based on the semantics of the interface rather than the implementation): "passive schemes" and "active schemes". Passive schemes do not require cooperation from the algorithms requiring safe memory reclamation (EBR, QSBR, etc...) while "active schemes" do (HP, PTB, etc... in their textbook form require modification to the algorithm).

On blocking for passive schemes: It is worth noting that QSBR, EBR and other "passive schemes" do have "non-blocking" (informally) interfaces (rcu_call, text-book EBR utilizes limbo lists, etc...) such that writers do not have to block on the fast path, but of course, there is the cost of memory accumulating until a grace period is detected (so, not non-blocking in the formal sense but if you've the memory to spare and sufficient forward progress, becomes a non-issue). In my implementations, I typically use a dedicated thread for forward progress / epoch detection, ensuring the writer has forward progress.

You have much higher granularity and the lock-free progress guarantee in hazard pointers, because it is pointer-based (tied to the forward progress guarantees of the underlying algorithm).

Under-appreciated recent development: An interesting thing to note here is there are schemes such as https://www2.seas.gwu.edu/~seotest/publications/eurosys16ps.... which do not necessitate the same heavy-weight barriers in the presence of invariant TSC+TSO and with a sufficiently high granularity TSC, can provide the same forward progress guarantees as hazard pointers.

On hazard pointers being slow: One interesting thing to note is hazard pointers can also be used to implement "passive" schemes such as proxy collection, to give similar performance properties as EBR and QSBR.

On real world implementations: It is worth mentioning http://concurrencykit.org :-P


I'm having trouble understanding your point about GC. The post points out that GC solves some of these problems, at the expense of creating waits. Do you think there's something wrong with the way it describes the issue?


Not GP, but I think what he's getting at is that the GC cost is already paid and doesn't get paid again for reclaiming memory with lock-free algorithms.


This is quiescent-state-based reclamation, other implementations exist that are cheap. I have seen the technique you're talking about used in K42 (after I had also thought I invented something novel, when I used the same approach as you :-P).


Doesn't the strict definition of QBSR require a thread to keep no references to shared data when it is in quiescent state, whereas epoch-based reclamation allows threads to simply retain no references to older versions of shared data when it advances its epoch?


I'm referring to the FIFO implementation. On the definition of QSBR though, frankly, I think the term gets overloaded in some of the more recent literature in this area. Both EBR and QSBR require that no hazardous references get carried over across "protected sections" (EBR is explicit read-side protected sections and in QSBR, across a quiescent point).


I think it does fit that description, but for some reason none of the implementations I found were completely without atomic ops. They had at least barriers which aren't free on ARM.


wouldn't you need acquire/release barriers around your load and stores of your token (or alternatively acq/rel load/stores)? Otherwise your other operations won't be ordered with regard the epoch marked by the token.


Or use backtrace.io to deal with it. It'll transfer the core dump or a derivative to a centralized server.

Transparency: co-founder.


Can confirm.


It reverts to a bus lock in this split transaction case, not cache line locking (thus stalling whole system).


Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: