I ran into this problem last year: ordinary mutexes were not working out, for various reasons.
Since I do embedded programming, I looked through the ARM manuals and documentation, to see what they have on offer.
I ran into the lovely LDREX and STREX instructions, which are just awesome.
It was pretty trivial to rewrite e.g. a custom allocator that used to grab a mutex, poke its internal freelists, and release the mutex so that it instead does the poking with LDREX/STREX, thus obtaining thread-safety while doing the actual work it wanted to do. Lock-free programming can be beautiful like that.
Code that works correctly on that particular programmer's x86 machine, you mean. "Works great on my PC at home and at work" does not necessarily imply "works great on the 64-CPU production machine in the datacenter".
Shouldn't atomics behave similarly across all platforms? I am thinking of the GCC builtin atomics [1].
What sort of code are you thinking about? Even lower-level, non-abstracted atomics and memory fences specific to a particular architecture? I actually did run into code like this for a MIPS-based CPU, and we converted it to the GCC builtins for portability. The funny thing is that the next hardware architecture's toolchain wasn't GCC-based, so there was no benefit: we had to write wrappers for the atomics anyway, and hunt down some absurd bugs related to L2 cache writebacks and memory fences. It was an exercise in how long I could stick with a problem; some serious hunting and staring at debug output and hex dumps.
Does side-effect-free programming with persistent data structures on a share-nothing, actor-based architecture (in other words, Erlang) qualify as lock-free?
While other responses have addressed the deadlock part of this, and someone else addressed the side-effect-free part, I wanted to make sure you saw why actors necessarily involve side effects.
%Here is a function that rather idiotically takes a process
%id, a message, and just sends the message to the process,
%in Erlang, and then returns the atom 'ok'.
send_message(Pid, Message) ->
    Pid ! Message,
    ok.
If this were side effect free, that first line of the function, Pid ! Message, could safely be removed by the compiler: either it returns nothing, or we are doing nothing with the return value; either way, if it has no side effects it is not doing anything. Much like if I were to replace it with something that obviously causes no side effects, such as 5 + 7; we're not doing anything with the 12 that it returns, so why bother calculating it?
But obviously that is not the case; we are expecting the process that Pid corresponds to to receive that message, and to do something with it.
The very act of sending a message is a side effect.
Come on, when people talk about code without side effects they mean so-called pure functions, which always produce the same output for the same input. Message passing is a means of communication, which is all about causing effects, so let's not take this to the absurd.
On the other hand, non-networking procedures in Erlang are pure functions, so they can safely be evaluated in parallel. Its spawned processes are isolated; they share nothing, not even a common stack, so they can fail and be restarted without affecting any other code or data.
It is much better to read Armstrong's thesis, where he gives an outline of the design of Erlang, the hows and the whys. The language may have a somewhat ugly syntax, but the semantics are fine.
My post wasn't a criticism, just that saying Erlang is side effect free (per the OP) is misplaced. Erlang is an insanely pragmatic language that I use daily, but it very intentionally does not bend over backwards to avoid side effects, instead allowing them in cases where they make sense.
No. Lock-free applies to shared, mutable data structures. Erlang (and Haskell in most cases) operate at a much higher level. Remember that side-effect-free programming is a high-level abstraction over most-certainly-not-side-effect-free code. Erlang's process mailboxes might make use of a lock-free queue, though. Architecting your program with actors is most certainly not lock-free as deadlocks may occur.
But being lock-free doesn't necessarily mean better or even faster.
Even working with persistent data structures as in Clojure is not usually considered lock-free programming (even though there are no locks involved, since no mutation is taking place). But Clojure does support lock-free programming, for example with atoms.
So: no sharing, no problem; no mutation, no need to synchronize access?
According to the classic maxim, the way you structure your data influences the shape of your procedures; that is why list procedures are naturally recursive.
Similarly, if you have no destructive operations, you can partition your data any way you wish and process the parts in parallel; this is, for example, how make -j4 works. As long as your object files share nothing, they can be built in any order, and in parallel.
The idea is that parallelism begins from how you structure your data, and hardware primitives are of much less importance. 10 threads updating one global variable is a disaster no matter what.
> The idea is that parallelism begins from how you structure your data
Sure, but again, you're talking high-level. At a low-enough level, every operation is destructive, because that's how our machines work. If you're using immutable data structures, then you're allocating memory, so you are bumping (destructively) some heap pointer. Even if that pointer is thread-local, sooner or later some garbage collection will take place, which requires some form of synchronization.
The idea is that somewhere in your software stack, be it the OS or the JVM or whatever, there will be shared mutable state, and that's where lock-free data structures are handy. Some high-level languages abstract that away from you so you don't have to deal with this stuff.
Also, parallelism is only part of the story. Parallelism, or data parallelism, divides the data so that it can be processed cooperatively in parallel, but that's not always what you have. Lock-free data structures are useful when doing concurrency rather than parallelism, i.e. lots of different computations (rather than one), competing for (rather than cooperatively sharing) limited resources.
I'm not sure what you mean by side-effect-free programming on an actor-based architecture. The whole point of actors is to contain side effects, not eliminate them.
Erlang is certainly not lock-free. I'll use a Scala/Akka example since I know it better than Erlang:
class DiningPhilosopher(leftForkHolder: Actor, rightForkHolder: Actor) extends Actor {
  leftForkHolder ! ForkRequest

  def receive = {
    case ForkFromLeft => { rightForkHolder ! ForkRequest; ... }
  }
}
That'll deadlock for sure.
I think stream-based programming without loops (i.e., where the stream processes form an acyclic digraph) will generally be lock-free.
See http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.... for the instruction reference on LDREX/STREX.