Java Concurrency – Understanding the Basics of Threads (turkogluc.com)
142 points by turkogluc 4 months ago | hide | past | favorite | 82 comments



With Futures and Executors/ExecutorServices I find that I rarely ever need to use raw Threads these days. Most of the thread-safety issues commonly encountered are eliminated with this approach as well.
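A minimal sketch of that approach (class name and task are arbitrary): submit work to an ExecutorService, get a Future back, never touch a raw Thread.

```java
import java.util.concurrent.*;

public class ExecutorExample {
    static int compute() throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            // Submit work and get a Future back; no raw Thread handling needed.
            Future<Integer> result = pool.submit(() -> 21 + 21);
            return result.get(); // blocks until the task completes
        } finally {
            pool.shutdown(); // stop accepting new tasks; queued ones still run
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(compute()); // 42
    }
}
```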


Likewise, with futures and executors I haven't had to touch threads directly for some time.

They give you the tools to just say "go away and do these things", which after years of dealing directly with pthreads in C was a breath of fresh air!


But compare this very recent submission: https://news.ycombinator.com/item?id=24921657

which presents threads as the solution to the pain of using futures.


STM is not a solution to the pain of futures, STM is a "solution" to the pain of properly using locks, in particular when the concurrency is very fine-grained.

A future internally is just a value or a set of waiters. Whether they are "colored" (ie. can you wait on a future in a non-future context), or based on threads or cooperative concurrency (ie. async) etc. is entirely an implementation detail, with significantly differing trade-offs.

In that sense, you can really only compare future implementations with other concurrency systems.


yeah, calling Thread.interrupt(), join() or similar methods is often a code smell for a bad "programming 101" teacher.

Executors are the way to go for almost all finite-time concurrency.

New threads should normally only be used for stuff that keeps running until the process quits.


Pretty much. I can't even recall the last time I've had to touch a low level threading class in Java since the executors cover so many of the common use cases.


I don't know why everyone is so tense in here :D I am planning to write a series of posts about concurrency (not because I am an expert but because I want to improve myself this way), including the executor framework, completable futures, locks and synchronisation. Obviously it is not possible to go into the details of all of them in a single post. For someone who knows the operating-system-level details and is new to Java, I think this is where you start: simply with the Thread API. Even if you are using high-level abstractions, these are the basics to know. It is quite popular in job interviews. And it is also quite common for people to use the Executor framework daily but not know how to properly stop a thread, or what non-daemon threads are, etc.
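The "properly stop a thread" point can be sketched like this (hypothetical example; the busy-wait stands in for real work that polls the interrupt flag):

```java
import java.util.concurrent.*;

public class ShutdownExample {
    static boolean stopCleanly() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        Future<?> task = pool.submit(() -> {
            // Cooperative cancellation: the task must check the interrupt
            // flag itself; there is no safe way to kill it from outside.
            while (!Thread.currentThread().isInterrupted()) { }
        });
        task.cancel(true);   // delivers an interrupt to the worker thread
        pool.shutdown();     // no new tasks; lets running ones wind down
        return pool.awaitTermination(5, TimeUnit.SECONDS);
    }

    public static void main(String[] args) throws Exception {
        System.out.println(stopCleanly()); // true
    }
}
```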


I have published a new article about Executor Framework: https://turkogluc.com/java-concurrency-executor-services/


I was going to comment that as well. Futures, managed executors, tasks, and runnables are higher-level structures that are better suited for general use. Those constructs are often implemented using threads, though, so it's worth knowing what's happening one layer below the abstraction.


Absolutely, executors offer much better (safer, more consistent) lifecycle management of a Runnable vs. a homegrown solution in my experience. The last time I extended Thread I think it was just to pull off a custom name format.
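That custom-name use case no longer needs a Thread subclass either; a ThreadFactory handed to the executor covers it (a sketch, names arbitrary):

```java
import java.util.concurrent.*;
import java.util.concurrent.atomic.AtomicInteger;

public class NamedThreads {
    static String firstWorkerName() throws Exception {
        AtomicInteger n = new AtomicInteger();
        // A ThreadFactory sets the name format; no need to extend Thread.
        ThreadFactory named = r -> new Thread(r, "worker-" + n.getAndIncrement());
        ExecutorService pool = Executors.newFixedThreadPool(1, named);
        try {
            return pool.submit(() -> Thread.currentThread().getName()).get();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(firstWorkerName()); // worker-0
    }
}
```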


Except for that pesky swallowing of exceptions, I agree
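That swallowing is worth seeing once: with submit(), a thrown exception is captured inside the Future and silently lost unless somebody calls get() and inspects the cause (a minimal sketch):

```java
import java.util.concurrent.*;

public class SwallowedException {
    static String surfacedMessage() throws Exception {
        ExecutorService pool = Executors.newSingleThreadExecutor();
        // The exception does not reach any uncaught-exception handler;
        // it sits in the Future until get() rethrows it wrapped.
        Future<?> f = pool.submit((Callable<Void>) () -> {
            throw new RuntimeException("boom");
        });
        try {
            f.get();
            return "no exception";
        } catch (ExecutionException e) {
            return e.getCause().getMessage();
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(surfacedMessage()); // boom
    }
}
```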


Years ago, I tried learning how to use threads by following tutorials similar to this one, which teach how to use threads from {Python, Java, C++}. However, it wasn't until I studied operating systems (when I returned to graduate school for computer science) that I was able to wrap my mind around threads — from a language-agnostic viewpoint: how and what lightweight processes are, how to implement locks and synchronization barriers — and how they help facilitate concurrency.


Seconded. It’s silly to learn threads “from the outside in” — thinking of them as an opaque abstraction and trying to understand the API they present. There’s no coherent abstraction there; you’ll only learn to cargo-cult the API, without gaining an intuition for what threads “are” or when and where you’d want to use those APIs.

The key thing to know, is that threads aren’t a first-class kernel object. In OS kernels, there are only OS processes and memory regions.

To learn about threads, you should just learn about OS processes; and then learn that distinct OS processes can share memory regions between them, often via subprocess-spawn-time inheritance. Learn what fork(2) does on POSIX, and how it manages to be fast.

Starting with that intuition, it’s simple to then absorb what “threads” actually are: a usage pattern for spawning and managing OS processes that share memory; and a set of convenience APIs (that may be in-kernel, as in Windows; or purely in userland, as in Linux) for setting up this usage pattern. Everything these “threading” APIs can do, you can do yourself directly using the process-management and memory-mapping APIs. And those same calls are all that e.g. libpthread is doing.


This strikes me as rather focused on Linux kernel implementation details, since in Windows processes and threads are actually distinct concepts (as opposed to the Linux kernel, which really only knows about tasks), where every live process has n ≥ 1 threads and the address space of a thread is afaik strictly defined through the process it is part of.


I just visualise it as different instruction pointers with their own stack and shared heap. But I’m coming from Java so that might be an oversimplification!


I know about unix/linux processes, ipc and the relevant system calls (exec*, fork, clone, ...) but where do I continue from there?

Studying C, I haven't really come across threads other than trying out the things in `pthread.h`.

Would you recommend just reading the source code of that header for a better understanding?


What books are used in your program?


Idk about GP, but one book I highly recommend is Java Concurrency in Practice - https://www.amazon.com/Java-Concurrency-Practice-Brian-Goetz...

It's old, but the material holds up well since it covers a lot of fundamentals


We were an all-Java shop and we were considering how to make our application a SAAS cloud application. Our senior engineers read this book. They all agreed that it was very educational, but the conclusion was that Java concurrency in practice has too many footguns, and so we ended up adopting Clojure.

I think modern Java has better support for it, but if you've got mutable state spread throughout your application you're going to have a hard time no matter what.


> senior engineers read this book

How does one become a senior engineer if you don't understand concurrency?

Mutable state is most easily solved by having copies of everything, but then that's a tradeoff between performance and infrastructure/resource costs. I guess that if you're in an all-Java shop that isn't much of an issue.


Reading a book on the basics doesn't mean you don't have a grasp of them. I'd argue that refreshing your knowledge on things is a mark of a good engineer, regardless of seniority.


Rich Hickey is supposed to have said that he created Clojure because he was tired of telling people to read that book.

Best I can find as a source for now is https://www.youtube.com/watch?v=2y5Pv4yN0b0 -- I thought there was a link somewhere to Hickey himself saying this, but can't find it.


I have the book. I've tried reading it several times. I just can't get into it. My major complaint is that it gives a ton of code examples and then "don't do this" is written under it. So one has to be extremely alert at all times while reading through the book. It dives right into the subject as if one has already been writing threaded code. Perhaps that's the target audience.


I also recommend it. It's a little bit old because it doesn't cover new features, but the fundamentals are strong; Brian Goetz really did a great job.


Project Loom[0] is going to be coming out at some point. That brings direct JVM support for delimited continuations and fibers.

That's really going to change and simplify JVM concurrency and I think many other things. Delimited continuations can be used to implement algebraic effects, which is exciting for functional programming.

[0] https://openjdk.java.net/projects/loom/


> Delimited continuations can be used to implement algebraic effects, which is exciting for functional programming.

How so? I thought the type system was the precluding factor for algebraic effects, not the threading model.


You won't be able to type them in Java, at least not now; however it would allow new JVM languages to do it. You could also do fibers on the JVM before, but they didn't have access to the actual JVM stack. That kind of limits their usefulness and performance.


I honestly don’t think it will, not for a long time. It’s going to make things more complicated, because now we have to figure out how to move the entire ecosystem to it incrementally, while operating and maintaining systems during the long transition period, and making technical decisions at every step about how best to do concurrency with all those extra constraints.


Their compatibility story is alright though, at least they are trying to make "Thread" forward-compatible with the new runtime. And they are working on getting Netty to work, which will immediately get things compiling (not working) for a lot of projects.

The biggest challenge will be reconciling Futures, Netty and ThreadLocal (FiberLocal) patterns. Think defining a SQL transaction lifetime, a distributed lock, or an OpenTracing span. For Spring Framework people, no big deal. For everyone else, lots of complex decisions to make.


I don't see it used much in legacy systems, but the benefits are large enough that I could see new Akka-ish libraries built on it.


I'm not a Java developer, but isn't RxJava the current best practice around managing concurrency in Java? I thought the consensus was that manually dealing with thread creation is too error-prone and unmanageable.


RxJava has some nice tools for async buffering, debounce, etc., but it pays to understand CountDownLatch, Semaphore, Mutex, ExecutorService, etc., and I would definitely not consider RxJava a substitute for them.

Do avoid anything related to the hoary old Java "Future" class, though. CompletionStage or get out!
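A minimal sketch of the CompletionStage style (values arbitrary): stages compose with callbacks instead of the blocking get() of the old Future.

```java
import java.util.concurrent.CompletableFuture;

public class StageExample {
    static int total() {
        // CompletableFuture implements CompletionStage: the continuation
        // is attached to the result rather than polled for.
        CompletableFuture<Integer> price = CompletableFuture.supplyAsync(() -> 40);
        return price.thenApply(p -> p + 2).join();
    }

    public static void main(String[] args) {
        System.out.println(total()); // 42
    }
}
```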


No, ExecutorServices are the current best practice around managing concurrency in Java. RxJava is only something you should use if you have specific performance requirements (with data to back it up), and you need the Observable pattern.


We are using the Akka framework so as not to have to deal with threads directly. Message passing and immutable objects simplify a lot, at the cost of one more abstraction layer.


Not quite. There were plenty of simpler abstractions for easily managing concurrency before, without even touching Thread. As someone has mentioned: Executors, ExecutorService, thread pools, ...

RxJava feels like an unnatural port from single-threaded mindsets.


RxJava got hyped in Android before Google went Kotlin-first; now they are into coroutines, and depending on the Jetpack project, Java developers might still be able to use it, or be forced into Kotlin.


What a strange coincidence! I have started learning Java concurrency from the book[1]. I am on the synchronization chapter, and it looks like managing threads via Runnable directly is going to be painful. I am hoping I get a good intro to Executors somewhere down the road. Adding the OP's tutorial to go through once I have a better hold on writing concurrent programs.

1. https://www.amazon.in/Java-Threads-Concurrency-Utilities-Fri...


I wouldn't accept anyone's PR with explicit Thread usage in them. You should either use some high-level construct like CompletableFuture or a concurrent data structure instead.


The font colors in the code examples are quite hard to read on my machine. `public class Main {` is so dark that I have to highlight it to read it.


The problem is not the threads, it is the mutation of variables, which boosts the complexity of the code. So a tutorial on the creation of threads is actually an invitation to hell. Nothing is cool about it. The cool thing is achieving concurrency without threads/race conditions/shared memory.


A computer is a mutation machine, you cannot escape mutation by hand-waving it away. If you are writing programs in which you can achieve concurrency without threads and shared memory, it’s because you’re building on the shoulders of all the engineers who didn’t hand-wave it away. Many of us, due to product requirements, don’t have the luxury of using higher level abstractions like that.


And yet we're comfortable hand-waving GOTO away - that is, not calling computers GOTO machines.


There are thousands of engineers (at least) who use it or its equivalent every day. Just because they’ve built abstractions that allow you to ignore it, doesn’t mean nobody has to deal with it anymore.


But that's identical to what he's saying; he's not saying no one has to deal with mutable memory. Just that most developers who need concurrency shouldn't have to. Same as registers, GOTO, etc.


That's not what they said. They said "nothing is cool about a tutorial on creation of threads", and that the cool thing is "achieving concurrency without threads/race conditions/shared memory" which ironically is only enabled by all the engineers who spend their time working on and maintaining those "uncool" things.

If they don't need to use threads, then good for them. But to dismiss threads and learning material about threads as "uncool" is just silly. The thing that enables that misunderstanding is all the work that's done on them in the first place.


There's a lot cool about threads, and you can learn to implement them well.

Threads handled well do not need to have race conditions, and race conditions/deadlocks are also very possible in distributed, message-passing systems.


Can databases be efficiently implemented without shared access?

Can message passing accomplish this at the same level of performance?

Although I agree that simplifying resource access should probably be considered before fully shared state.


> Can databases be efficiently implemented without shared access?

Let me ask a different question: Why did databases take off in the way they did? Sure they persist stuff to disk, but so do files. What they offer is a concurrency model so good that you almost never think about it. Beginner programmers can competently write large, concurrent systems by writing single-threaded programs which are backed by a central DB, without even knowing the term "race condition".

If beginner database articles told users how to make database Threads, Thread groups, and how to signal and catch interruptions, I don't think databases would have enjoyed nearly as much popularity.

While Threads are fundamental to Java concurrency, I kinda agree with yetkin's point. It introduces the Thread footgun without even paying lip service to the problems of shared, mutable state.


Do you use other paradigm / languages ? (clojure comes to mind, but maybe others)


I'll add Pony to the list. This language uses the actor model like Akka and Erlang, but allows for the safe sharing of data between actors, enforced by an ingenious use of the type system. The result is an actor programming model with better performance than Erlang, because mutable data can be safely shared.

I have been a long time Java developer, and I have worked a lot with highly concurrent code. Pony really opened my eyes to what was possible.

Unfortunately, the language, standard library and runtime are still pretty immature. It does however have very good C interop, so for some problems it would be a very good fit.


For Java at least, the Java Concurrency API is preferred.


Akka actor framework comes to mind. I am in the process of learning it and it is definitely simpler to wrap your head around it.


Actor model with Elixir/Erlang and the BEAM VM.


All the cool people are in Hell. Only go to Heaven for the climate.


The concepts of threads and concurrent data access are simple enough for any decent programmer to comprehend. There is no hell here. Sure, there are some complex cases, but complex cases arise in many situations when programming.

And achieving concurrency without shared memory is impossible in the general case. Sure, it is possible to isolate such access to a separate layer and make it transparent to the rest of the program, but someone still has to program that layer.


The problem for novices is that a program that behaves correctly looks a lot like a correct program. Until one day it doesn’t.

And because you’re in production and getting random spurious failures, the panicked (but common) reaction is to wrap every shared resource in a synchronized block. Which makes an incorrect implementation worse but possibly correct.


If the resource is shared, accessed from many threads, and both written to and read from, then it is correct behavior to lock it with the proper type of lock at access time. Depending on the resource, it might be possible to split it into a few resources with more granular access.

As for novices: they are called that for a reason, and are supposed to be under supervision rather than allowed to run wild.
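A minimal sketch of picking the "proper type of lock" for a read-mostly resource (hypothetical RwCounter class): a read-write lock lets many readers proceed concurrently while writers get exclusive access.

```java
import java.util.concurrent.locks.ReentrantReadWriteLock;

public class RwCounter {
    private final ReentrantReadWriteLock lock = new ReentrantReadWriteLock();
    private int value;

    // Many readers may hold the read lock at once; the write lock is exclusive.
    int get() {
        lock.readLock().lock();
        try { return value; } finally { lock.readLock().unlock(); }
    }

    void increment() {
        lock.writeLock().lock();
        try { value++; } finally { lock.writeLock().unlock(); }
    }

    public static void main(String[] args) {
        RwCounter c = new RwCounter();
        c.increment();
        System.out.println(c.get()); // 1
    }
}
```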


Why is this being downvoted? It's the truth.

HN needs to only allow downvotes that have an accompanying explanation comment.


HN uses downvotes mostly to boo people with opinions deviating from the common party line. As for a reasonable explanation, you're asking too much. Programming, like many other things, is often treated as a religion. No arguments; it just is.


As with most other compromised social sites, no badthink allowed here and how dare you.


Novices don't build working concurrent systems of any kind with any toolkit, period. Concurrency is hard and thinking all the "concurrency problems" go away with some message passing is both ludicrous and dangerous. Fearless concurrency can only be attained through understanding, not by thinking all your problems went away because you're using a "cool approach".


Surprisingly, this is what the Akka framework promises: message passing and immutability of objects.


Software usually has state (unless that state is completely kept and managed externally, in a database for example). And the state mutates. A simple example is a big array that has to be processed in place.


It's pretty easy to make the leap from individual SQL statements to SQL statements which are wrapped in a transaction.


Excellent example for making my point, since "just wrap it in a transaction" usually leads to concurrency bugs like the beloved lost update.


If you're talking about database transactions, it "usually" leads to concurrency bugs only if the isolation level is not strictly serializable. It does not hurt to know things before labeling them.


This is not something I'm familiar with. What's the beloved lost update and what transactions are you using that suffer from it?


Transactions give varying degrees of "isolation" between them, depending on the database (and its version + configuration). For example, in what SQL would call READ COMMITTED, where transactions will only read data that has been committed, read-modify-write updates are generally bugs. The classic example:

    - Intent: both transactions deduct 50 money
    - transaction 1: SELECT balance FROM account; // = 100
    - transaction 2: SELECT balance FROM account: // = 100
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    - transaction 2: UPDATE account SET balance = 50
    - transaction 2: COMMIT
    - Result: balance is 50, but should be 0
With serializable transactions (not all databases have this, particularly if you look beyond SQL):

    - Intent: both transactions deduct 50 money
    - transaction 1: SELECT balance FROM account; // = 100
    - transaction 2: SELECT balance FROM account: // = 100
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    - transaction 2: UPDATE account SET balance = 50
    - transaction 2: COMMIT -> Fails, needs to retry
    - transaction 2b: SELECT balance FROM account: // = 50
    - transaction 2b: UPDATE account SET balance = 0
    - transaction 2b: COMMIT -> Ok!
    - Result: balance is 0
Because this is needed so frequently, databases have calculated updates, basically atomic operations:

    - transaction 1: UPDATE account SET balance = balance - 50; // values indeterminate
    - transaction 2: UPDATE account SET balance = balance - 50; // values indeterminate
    - transactions 1,2: COMMIT
    - Result: balance is 0
Or, one could lock the rows, like so:

    - transaction 1: SELECT FOR UPDATE balance FROM account; // = 100
    - transaction 2: SELECT FOR UPDATE balance FROM account: // = transaction 2 is stalled until transaction 1 commits or rollbacks
    - transaction 1: UPDATE account SET balance = 50
    - transaction 1: COMMIT
    // transaction 2 can now continue and gets balance = 50
    - transaction 2: UPDATE account SET balance = 0
    - transaction 2: COMMIT
    - Result: balance is 0
And this is just one simple example of the problems you can have concurrently accessing one table, even while using transactions. Not to speak of the issues you can run into when interacting with systems outside a single database, which don't interact with the transaction semantics of the DB.

Concurrency is just very non-trivial regardless the abstraction.


Well, I guess I'll just keep digging myself further into a hole!

I want to focus entirely on your first example.

Let me ask you: What is it about your first example that makes you call it transactional? If it behaves as badly as you say, shouldn't it be called a 'method' or a 'procedure'? Because my "fix" for it is to actually use transactions. I suspect your fix would be the same.

Why did you choose to interleave its steps like that, when "Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially."

If you're telling it like it is, then I cannot argue with facts. I guess I'll stop using DBs, at least until they figure this stuff out in 1973.


> Let me ask you: What is it about your first example that makes you call it transactional? If it behaves as badly as you say, shouldn't it be called a 'method' or a 'procedure'? Because my "fix" for it is to actually use transactions. I suspect your fix would be the same.

We have two concurrent tasks both doing exactly the same thing in order to deduct 50 money:

    BEGIN TRANSACTION;
    SELECT balance FROM account; // = 100
    UPDATE account SET balance = 50; // calculated by application as 100-50
    COMMIT;
Perhaps I misunderstand you, or you misunderstood the way I presented the example (possibly because I presented it poorly). But in my mind there is hardly a way to describe this code as "not transactional".

I merely showed one possible way how these concurrent tasks may execute in practice leading to bugs. Of course, for casual testing this will actually look and work correctly. As one commenter far up the thread said (as an attempt to refute understanding of concurrency as necessary)

> The problem for novices is that a program that behaves correctly looks a lot like a correct program. Until one day it doesn’t.

> And because you’re in production and getting random spurious failures, the panicked (but common) reaction is to wrap every shared resource in a synchronized block. Which makes an incorrect implementation worse but possibly correct.

Then,

> Why did you choose to interleave its steps like that, when "Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially."

That is only one of the possible ways for transactions to work. Note that IIRC the only database that interprets the SQL standard like this is postgres, while MySQL and Oracle still have (more subtle) serialization issues even on the SERIALIZABLE isolation level (example: https://stackoverflow.com/a/49425872).

Note that you can end up with deadlocks and transaction failures on any level stricter than READ COMMITTED, so the application needs to be able to deal with both of these.


> The problem for novices is that a program that behaves correctly looks a lot like a correct program. Until one day it doesn’t.

> And because you’re in production and getting random spurious failures, the panicked (but common) reaction is to wrap every shared resource in a synchronized block.

Yep yep - that's the Java + Threads model. It's (relatively) harder to take single-threaded logic and make it behave in a multi-threaded setting. Compared to the SQL model, where it's (relatively) easier to take single-threaded logic, wrap it in BEGIN/END TRANSACTION, and have it perform exactly as expected.

OK I get you now. In saying that SQL concurrency was easy and Java concurrency was hard I didn't think about what would happen if you tried to write a mixed Java/SQL transaction; I didn't realise there was a bunch of Java running between your SQL statements. So what would my fix be? Get rid of the Java and replace it with SQL.

> Note that you can end up with deadlocks and transaction failures on any level stricter than READ COMMITTED, so the application needs to be able to deal with both of these.

That's cool - transactions proceed completely or not at all.

About the "not transactional" thing, I was applying (a => b) => (^b => ^a). That is, since transactions are isolated, and you demonstrated code that wasn't isolated, I can conclude that it wasn't a transaction. Maybe I need to adjust my thinking a bit:

    assumption i) Atomicity says "The series of operations cannot be separated with only some of them being executed".

    assumption ii) Isolation says "Isolation ensures that concurrent execution of transactions leaves the database in the same state that would have been obtained if the transactions were executed sequentially."

    assumption iii) I use transactions because they're atomic and isolated.

    A *SELECT balance* was run, passing its value out to the real world before the commit succeeded.  This breaks assumptions i and iii.

    "That is only one of the possible ways for transactions to work" breaks assumption ii and iii.

    So, I can only conclude I should not use transactions.


What's a better alternative to synchronizing access to shared resources?


Treat it like GC and don't leave it up to the programmer.


And make a programmer unable to achieve the highest performance when needed? We live in a supposedly free world. If you want to be "protected", be my guest and use languages with GC. Plenty of those. For somebody who needs the opposite and uses "unprotected" tools: leave them alone. You have no right to decide how other people do their work unless they're under your direct control.


Does that involve using concurrency primitives that basically don't allow access to any shared mutable state?


I'd say provide concurrency primitives to disallow direct access to shared mutable state. You still do the reads and the writes, but you let the system take and release locks for you.

Let's say you wanted to turn a list into a bounded list of 4 elements.

Race-condition insert:

    if (sz < 4) {
      list.insert(x);
      sz++;
    }
Safe insert:

    atomically {
      if (sz < 4) {
        list.insert(x);
        sz++;
      }
    }
So atomically organises the locking/unlocking/rollback for you such that a fifth element will not be inserted.


Aren't the semantics of this exactly the same as Java's synchronized? What error does it protect you against compared to synchronizing on list? What happens if .insert() also uses atomically{} somehow?


Glad you asked!

synchronized locks code, atomically locks data.

It maps to my thinking better, because what I'm interested in is manipulating two or more resources at the same time. I'm not interested in making sure two threads aren't in the same region of code at the same time.

> What error does it protect you against compared to synchronising on list?

Synchronized is perfectly fine for my example. I should have picked a better example of two resources (instead of list and size of list), because synchronized gets trickier once you start combining different pieces.

For example, I have two bounded lists, and I want to transfer an element from one to the other. I've already written synchronized versions of remove and insert, so let's try with them:

    transfer(otherlist) {
        x = otherlist.synchronizedRemove();
        this.synchronizedInsert(x);
    }
There will be a moment in time where the outside world can't see x in either list. Maybe I crash and x is gone for good. Or maybe the destination list becomes full and I'm left holding the x with nowhere to put it. So what is to be done? I could synchronize transfer but that still wouldn't fix the vanishing x, or the destination filling up. So I paid the performance cost of taking two/three locks and I've still ended up buggy.

I think the fix here is to lock each list, then no-one else can access them and it should fix the vanishing x:

    transfer(otherlist) {
        synchronized(this) {
            synchronized(otherlist) {
                x = otherlist.synchronizedRemove();
                this.synchronizedInsert(x);
            }
        }
    }
I think that's correct? But now I have taken too many locks - I only needed remove and insert not synchronizedRemove and synchronizedInsert. And now I've introduced deadlock possibility - if two transfers are attempted in opposite directions.

I can fix the too many locks problem, by exposing non-synchronized remove and insert and have transfer call them instead. But then callers and I will accidentally call the wrong one. I break any pretence of encapsulation by exposing unsafe and safe versions of a method. The deadlock is harder to fix. I'd need to synchronize the lists in some agreed order (and have everyone else obey that ordering in their other methods too!).
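That agreed-order trick can be sketched like this (a hypothetical withBothLocks helper; identity-hash ties would need an extra tie-breaking lock, omitted for brevity):

```java
public class LockOrdering {
    // Acquire both monitors in a globally consistent order, so two
    // transfers running in opposite directions can never deadlock.
    static void withBothLocks(Object a, Object b, Runnable action) {
        boolean aFirst = System.identityHashCode(a) <= System.identityHashCode(b);
        Object first = aFirst ? a : b;
        Object second = aFirst ? b : a;
        synchronized (first) {
            synchronized (second) {
                action.run();
            }
        }
    }

    public static void main(String[] args) {
        StringBuilder log = new StringBuilder();
        Object listA = new Object(), listB = new Object();
        withBothLocks(listA, listB, () -> log.append("transferred"));
        System.out.println(log); // transferred
    }
}
```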

Instead, I want my implementation to look something like:

    BoundedList {

        transactional Int sz;
        transactional List list;

        transactional insert(x) {
            if (sz < 4) {
                list.insert(x);
                sz++;
            }
        }

        transactional remove() {...}

        transactional transfer(otherlist) {
            x = otherlist.remove();
            this.insert(x);
        }
    }
> What happens if .insert() also uses atomically{} somehow?

A good implementation would throw a compile-time error. A bad implementation could throw a runtime error.

In order to do this, transactional actions would need to be marked as such - to prevent mixing them up. atomically by definition is a non-transactional action (because it's the thing that commits all the steps to the outside world) so if you find an atomically inside a transaction, it's a type error.

You've already used a system like this if you've worked with any decent SQL implementation:

    BEGIN TRANSACTION

    UPDATE accounts
    SET money = money - 50
    WHERE accountId = 'src'

    UPDATE accounts
    SET money = money + 50
    WHERE accountId = 'dest'

    COMMIT TRANSACTION
I didn't take any 'locks'. I just wrapped two perfectly good individual actions and said 'run them both or not at all'. Though to be fair I am getting a lot of grief in another thread by suggesting that even novices could wrap up their SQL like that without getting it wrong.


Thanks for the detailed reply.

So, atomically{} is basically like a SQL transaction and would repeat or signify failure if it cannot commit the changes you make inside the code block, similar to a CAS lock-free algorithm. This seems quite limited though, you are basically constrained to writing code within the atomic block that deals with value types only, and with no side-effects. Otherwise how would the compiler or runtime know how to roll it back?

That sounds useful but doesn't seem to cover all the use cases of thread synchronization by a long shot. Isn't it also the case that even knowing how to implement interesting algorithms in a lock-free manner is an area of significant ongoing research? For example, I think only recently someone worked out how to implement a lock-free ring buffer (https://ferrous-systems.com/blog/lock-free-ring-buffer/)


> The concepts of threads and concurrent data access is simple enough for any decent programmer to comprehend. There is no hell here.

It's notoriously difficult to reason about concurrent programs using intuition. Much more difficult than reasoning about non-concurrent imperative code. This is why there are articles like [0], and why a bug in a Wikipedia article on a fundamental concurrency algorithm went unnoticed until an analysis tool detected the issue, [1] and why lock-free algorithms in particular are so tricky to get right.

[0] https://www.cs.umd.edu/~pugh/java/memoryModel/DoubleCheckedL...

[1] [PDF] https://llvm.org/pubs/2008-08-SPIN-Pancam.pdf
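The double-checked locking idiom mentioned above is a good illustration: without volatile it is broken under the Java memory model, because a thread may observe a partially constructed instance (a sketch with a hypothetical LazySingleton):

```java
public class LazySingleton {
    // volatile is what makes double-checked locking legal post-JSR-133:
    // it forbids publishing the reference before the constructor finishes.
    private static volatile LazySingleton instance;

    static LazySingleton getInstance() {
        LazySingleton local = instance;   // single volatile read on the fast path
        if (local == null) {
            synchronized (LazySingleton.class) {
                local = instance;
                if (local == null) {
                    instance = local = new LazySingleton();
                }
            }
        }
        return local;
    }

    public static void main(String[] args) {
        System.out.println(getInstance() == getInstance()); // true
    }
}
```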


The threading concept is simple; the real world is not.

If I had no idea about the C10K problem, the success of Nginx and Redis, or the other concurrency success stories built on actors, CSP, and concurrency via messages over shared memory, I would say threads are OK when you can use them. But indeed they are so simple and tempting that people design shitty software with them. Software is a very welcoming medium for it. It is hell.


And understand its limitations: https://news.ycombinator.com/item?id=24955376 ("Java's insecure parallelism")



