
Javascript devs, for all their flaws, understand async programming far, far better than the average C programmer.

Indeed, your assertion that we need "locking primitives" to counteract "race conditions" is evidence of that. Yes, JavaScript can have race conditions, but not multi-threaded race conditions that cause resource contention [0].

So what good would locking primitives be?

And as a solution to single-threaded race conditions, it's becoming more and more common to use pure functions and immutable data structures in Javascript. In the ReactJS world, it's practically standard to use immutable data structures.

Furthermore, JS devs know how to structure their code, either using nested callbacks or promises, or now, async/await, to avoid async data races. Anyone who programs primarily asynchronously understands these things.

So in summary, it's pretty arrogant to assume that JS devs are going to have to "re-learn" the same things that C programmers learned in the 1990s (also ignorant, because 1990s C was decidedly synchronous). The only devs I know that struggle with race conditions in Javascript are those who are coming from another language or paradigm and who fundamentally fail to understand asynchronous programming.

0. The only exception to this would be those NodeJS devs who use multiple processes, or browser devs using web workers along with the ultra-new and not very well-supported shared (memory) resources, both of which are rare in the JS world because there's not much need. You can achieve adequate performance off of one thread for practically any IO-bound application unless you're operating at Facebook scale. And in those cases when people are using multiple processes for CPU-bound algorithms, they're almost always using them with async message queues anyway, which obviates the need for any locking primitives.

Edit: https://news.ycombinator.com/item?id=21065831 shows a case where async/await can introduce a problem. But using a reference to an immutable data structure would almost always eliminate this issue.

Also, in practice you're probably using a database whose library has transactions, so you'd "lock" the transaction in this way.

But OK, if you use async/await or generators along with mutability, then a locking primitive could be useful, I'll concede. Although in a single-threaded program, a boolean is just as good.




Here's an example of how you might get a race condition in JS:

    async function deduct(amt) {
        var balance = await getBalance();           // another deduct() can run while we're suspended here
        if (balance >= amt)
            return await setBalance(balance - amt); // ...so this write may be based on a stale balance
    }
One way to resolve this would be with a mutex to protect the balance during the critical section (which is async). What would you suggest instead?


I mean, the answer here is that async getBalance and setBalance functions are the incorrect way to do this, and a mutex won't solve that.

If balance is a JavaScript variable, access shouldn't be provided by async functions. In the case that the balance is a remote resource of some sort (file or network resource), a process-local lock won't solve this for you either.

It seems to me that the lack of locks forces software engineers to consider the nature of their data instead of just grabbing for an inappropriate OS lock.


More subtly, here's an example of a hidden race condition in JS:

  async function totalSize(fol) {
    const files = await fol.getFiles();
    let totalSize = 0;
    await Promise.all(files.map(async file => {
      totalSize += await file.getSize();
    }));
    // totalSize is now way too small
    return totalSize;
  }


I would have written it in a more functional way:

  async function totalSize(fol) {
    const files = await fol.getFiles();
    const sizes = await Promise.all(files.map(file => file.getSize()));
    return sizes.reduce((acc, size) => acc + size, 0);
  }


This is an interesting example because it demonstrates a way to confuse programmers that wasn't previously possible. Thanks for sharing it; it was a gem on an article otherwise full of confused comments.

After sharing it at work, someone pointed out that it isn't technically a race condition. The problem isn't caused by operations happening in an unpredictable order. It's unlikely that any of the promises are already resolved, so the lhs is always evaluated first. It's just that the programmer is surprised by the order of operations.

The takeaway is unsurprising: that `await` should be a signal to make one think carefully about state change; having `await` right after `+=` should be a big signal.

But the fact that an unwary programmer can actually be tripped up is interesting, despite some other comments on this article claiming such trip-ups are inevitable.


Because all the promises are waiting on `fileSize`, right?

But do you mean that JS 1. will read `totalSize` then 2. do the asynchronous call, then 3. add and set? Seems like it's ambiguous and JS could just as easily read `totalSize` after the call, and all would be OK.

Or is the ordering specified?

Thanks for this clever example!


The ordering is specified left to right.

  totalSize += await getSize()
becomes

  totalSize = totalSize + await getSize()
So all the map callbacks run one by one, read totalSize as 0, and then suspend waiting for getSize(). Each one then resolves and assigns totalSize to be 0 + the size.

The race is what order the getSize() calls return in since only the last one will control the return value. Otherwise the issue isn't a race but just a logical ordering bug.

(This isn't super different than doing array[two()] = one() since two will actually run first, so ex. array[i += 1] = i will modify i before assigning the value.)

Correct would be to change the body of the map to:

  const fileSize = await file.getSize();
  totalSize += fileSize;
or the whole function to:

  async function totalSize(fol) {
    const files = await fol.getFiles();
    const sizes = await Promise.all(files.map(file => file.getSize()));
    return sizes.reduce((totalSize, size) => totalSize + size, 0);
  }


Wow, why would you define it this way. I had to check the spec because I didn't believe you. They got #2/#3 backwards.

https://es5.github.io/#x11.13.2


What if you change it to

  totalSize = await file.getSize() + totalSize; 
Would it be fine then?


I mean, if something like this is behind a promise or an async call, that probably means that there's I/O involved (no point using async to access local in-memory data), which means that a local locking primitive won't be incredibly useful; you're going to have to lock it via whatever mechanism the I/O channel (or some API using the I/O channel) provides, such as file locking or some API call or something.


Yep, that's the benefit of having to mark each function explicitly as async. I've heard a lot of people complaining that it's painful, especially during refactorings since it bubbles up, but that's precisely the point: it allows you to know (and choose) whether a function can or cannot be executed interleaved with something else.

That said, if new developers just blindly use async/await without understanding what's going on ... well that's another problem.

The field is filled with foot-guns, for some definition of guns (and for some definition of foot).

The problem with C, or rather the problem with pre-emptive multitasking in general, is that even relatively knowledgeable developers were constantly hitting subtle issues with the memory model. Consider the good old trap of efficiently lazy-initializing a Singleton in Java: https://en.m.wikipedia.org/wiki/Double-checked_locking#Usage...


If the balance is handled on a remote server or in some other remote location, then you should have a remote deduct method which handles the operation in an atomic, idempotent manner.

If you are handling data locally, e.g. in a file, then you would ideally rely on an OS write lock and only store the deltas, never changing state. You could then calculate the balance from the log of deductions and insertions.

FYI, this problem has nothing to do with async or JS.
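
A rough sketch of that store-the-deltas idea (an in-memory array stands in for the file or log here, so this is illustrative only):

  // Never overwrite the balance; append signed deltas and derive it on demand.
  const ledger = [];

  function recordDelta(delta) {
    ledger.push({ delta, at: Date.now() });
  }

  function currentBalance() {
    return ledger.reduce((sum, entry) => sum + entry.delta, 0);
  }

  recordDelta(100);              // deposit
  recordDelta(-25);              // deduction
  console.log(currentBalance()); // 75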


Don't have separate get / set functions?

    async function changeBalance(account, amt) {
        /* 
         * BEGIN;
         *
         * UPDATE accounts
         * SET balance = balance - ${amt}
         * WHERE id = ${account};
         *
         * COMMIT;
         */
    }
Of course, this is just a mutex in database transaction clothing :)


>> What would you suggest instead?

On the front-end, I'd do a call to the server. On the server, I'd use a database transaction for that.

But I get your point. The way I deal with race conditions in Redux is to have a mutex. I.e. while something is being fetched, any call to this command is either ignored or queued up.

I use redux-saga for the "race-condition" and general flow, and call the various async functions from within sagas.
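
Roughly, the ignore-while-in-flight variant looks like this with redux-saga: takeLeading starts a saga for the first matching action and ignores further ones until it finishes (fetchBalanceApi and the action types here are made-up names):

  import { takeLeading, call, put } from "redux-saga/effects";

  function* fetchBalanceSaga() {
    const balance = yield call(fetchBalanceApi); // hypothetical API call
    yield put({ type: "BALANCE_LOADED", balance });
  }

  function* rootSaga() {
    // Ignore FETCH_BALANCE actions while one is already being handled.
    yield takeLeading("FETCH_BALANCE", fetchBalanceSaga);
  }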


How about an optimistic lock?

    async function deduct(amt) {
        var balance = await getBalance();
        if (balance >= amt)
           await setBalance(balance - amt);

        var newbalance = await getBalance();
        if (newbalance != (balance - amt)) {
            await setBalance(balance + amt);
            // tell the user the transaction failed...
        }
    }


This is getting off topic, but what would probably be best here is a test and set operation: setBalance(expectedBalance, newBalance)
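
A sketch of how that might look, assuming a hypothetical compareAndSetBalance(expected, next) that atomically writes only if the stored balance still equals expected, returning true on success:

  async function deduct(amt) {
    for (let attempt = 0; attempt < 3; attempt++) {
      const balance = await getBalance();
      if (balance < amt) throw new Error("insufficient funds");
      // Only write if nobody changed the balance since we read it.
      if (await compareAndSetBalance(balance, balance - amt)) return;
      // Someone else got there first; re-read and retry.
    }
    throw new Error("too much contention, giving up");
  }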


And they say the average C developer is less experienced in async. The only correct ‘update balance’ operation is:

  INSERT INTO CashFlow
    (date, income, expense)
  VALUES
    (:date, :income, 0)
Get balance:

  SELECT sum(income)-sum(expense) AS balance
  FROM CashFlow
What modern JS still has to reinvent is in-client synchronizing storage which naturally resolves who waits on what (if at all). This is partially simulated by ReactJS now, by trading away the developer's comfort at a near-zero price.

All this “await/promise has clear semantics and we think async” is pointless hope, because code should never race with itself. Data should, and the storage must be there to kill the associated problems once and for all.


Operational transformations are not the "only correct" way to update an integer, and they come with a number of performance and memory/storage consumption implications.

>What modern js still has to reinvent is in-client synchronizing storage which naturally resolves who waits on what (if at all)

I don't understand what this means but it makes me curious. What would be an example of this in one of the languages that have already "reinvented" it?


SQL.

>it comes with a number of performance and memory/storage consumption implications.

Instead of selecting sum(), CREATE TRIGGER and update a singleton row on insert. You can even create a view with the N most recent transactions and catch inserts into it to maintain that N, if you don't want a full history.


How about an `updateBalance(updateFn)` that is protected by a mutex:

    async function deduct(amt) {
      await updateBalance(balance => {
        if (balance >= amt) {
          return balance - amt;
        } else {
          throw new Error("not enough balance");
        }
      });
    }
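
For what it's worth, updateBalance itself could be little more than a promise chain acting as the mutex. A rough sketch, assuming getBalance/setBalance are the async accessors from the example above:

  let queue = Promise.resolve();

  function updateBalance(updateFn) {
    const run = queue.then(async () => {
      // Each read-modify-write runs only after the previous one has finished.
      const balance = await getBalance();
      const next = updateFn(balance);
      await setBalance(next);
      return next;
    });
    queue = run.catch(() => {}); // keep the chain alive even if an update throws
    return run;
  }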


Mutex specifically refers to OS/processor level constructs that protect critical sections using algorithms such as the bakery algorithm or special CPU instructions. None of these are required to protect the critical section in the example above.

You can just use a busy flag. I am not familiar with JS, so this is approximate syntax.

while (busy) { await busyIsFalse } busy = true

Critical Section

busy = false

notify busyIsFalse

Simple boolean flags will work
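
In actual JS, that might look roughly like this (just a sketch; the "notify" becomes a promise that waiters await and that gets replaced on each acquire, and it makes no fairness guarantees):

  let busy = false;
  let release;
  let busyIsFalse = Promise.resolve();

  async function withBusyFlag(criticalSection) {
    while (busy) await busyIsFalse;  // wait until the flag clears
    busy = true;
    busyIsFalse = new Promise(resolve => { release = resolve; });
    try {
      return await criticalSection();
    } finally {
      busy = false;
      release();                     // notify busyIsFalse
    }
  }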


Use the datastore... use atomic updates and transactions: update the balance where balance >= amount, set balance = x.


Can anyone explain this to me, how is it a race condition?


Two concurrent calls to that code. Both of them get the same balance (100), pass the test and deduct the amount (75), getting to a -50 balance.


I correct myself. It leads to a 25 balance, but possibly a double spend: 150 dollars go out in two transactions while the balance only drops by 75.

The standard pattern is to use database row locking or to call a stored procedure that performs the locking inside. Backend developers typically don't like the second solution, but it provides a kind of API. Not to be overlooked if there are multiple services accessing it, especially in a polyglot environment.


Writing code in an asynchronous style doesn't prevent correctness errors that would not exist with locking. For instance, using everyone's favorite example, a bank:

In no particular order:

    1. Pizza company debits my balance by $5
    2. I withdraw $5
Assuming both those operations can yield due to e.g. async requests and that they can be resumed at any point, I don't know which order they will yield or resume in.

Consider this interleaving:

    1.1. Get balance, I have $5
    2.1. Get balance, I have $5
    1.2. Set balance to $0
    2.2. Set balance to $0 (but there are total debits of $10!)
With locking:

    1.0. Acquire account lock
    1.1. Get balance, I have $5
    2.1. Get balance: wait on lock
    1.2. Set balance to $0
    1.3. Release account lock
    2.1. Get balance, I have $0
    2.2. Can't set balance, will overdraw!
With regards to resource contention, anything that causes a waiter graph cycle can cause deadlocking, regardless of single- or multi-threading. I can't think of a compelling example here, but nobody expects to have deadlocks yet they still happen :)
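
Deadlocks aside, the lost-update interleaving above is easy to reproduce. A self-contained sketch with an in-memory balance and fake async delays (all names are illustrative):

  let balance = 5;
  const delay = ms => new Promise(resolve => setTimeout(resolve, ms));
  const getBalance = async () => { await delay(10); return balance; };
  const setBalance = async v => { await delay(10); balance = v; };

  async function debit(amt) {
    const b = await getBalance();            // both callers read $5 here
    if (b >= amt) await setBalance(b - amt);
  }

  Promise.all([debit(5), debit(5)]).then(() => {
    console.log(balance);                    // 0, yet $10 of debits "succeeded"
  });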


> With regards to resource contention, anything that causes a waiter graph cycle can cause deadlocking

Maybe I'll learn something here, but can you explain how an async runtime with no locking primitives like JS could cause a waiter graph cycle?


You don't need locking primitives for a deadlock. Await is sufficient.
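
A contrived but self-contained sketch: two tasks each await a promise that only the other task would resolve, so both hang forever with no lock in sight.

  let resolveA, resolveB;
  const a = new Promise(resolve => { resolveA = resolve; });
  const b = new Promise(resolve => { resolveB = resolve; });

  async function taskA() {
    await b;      // waits for taskB to resolve b...
    resolveA();   // ...so this line never runs
  }

  async function taskB() {
    await a;      // waits for taskA to resolve a...
    resolveB();   // ...so this line never runs
  }

  taskA();
  taskB();        // both promises stay pending: a deadlock without any mutex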


An example of a deadlock that continues the bank balance analogy:

A transfer funds method that does:

    1. Lock account A
    2. Get balance and check
    3. Lock account B
    4. Debit account A
    5. Credit account B

If the two accounts transfer to each other at the same time, you can get:

    1. A->B: Lock account A
    2. A->B: Get balance and check
    3. B->A: Lock account B
    4. B->A: Get balance and check
    5. A->B: Lock account B -- deadlock

One solution is to always acquire locks in the same order and release all of them when you fail to obtain one. You can do this in the transfer example by sorting by account ID (so always lock A before B).
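
A sketch of that ordering fix, assuming a hypothetical promise-based Mutex per account whose acquire() resolves to a release function:

  async function transfer(accountA, accountB, amount) {
    // Always lock the lower account id first to avoid a waiter-graph cycle.
    const [first, second] =
      accountA.id < accountB.id ? [accountA, accountB] : [accountB, accountA];

    const releaseFirst = await first.lock.acquire();
    try {
      const releaseSecond = await second.lock.acquire();
      try {
        const balance = await accountA.getBalance();
        if (balance < amount) throw new Error("insufficient funds");
        await accountA.setBalance(balance - amount);
        await accountB.setBalance(await accountB.getBalance() + amount);
      } finally {
        releaseSecond();
      }
    } finally {
      releaseFirst();
    }
  }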


Sure enough, but in JavaScript, if your balance operations can yield due to async requests then you already have a bigger problem than a shared in-memory variable, which is really the only case a standard OS lock can solve.

Let's say that balance is in a database or behind a REST api. An os-lock won't solve the problems of other processes trying to update that same resource.


You can imagine a function being unintentionally promoted into async-world for some kind of cross-cutting concern even if the data is stored in memory, e.g. if logging requires an async call, or if reporting metrics requires an async call.

While I agree that general practice in JavaScript programming stops you from this kind of footgun, you can still be caught off-guard if you end up thinking that this programming style is immune to this kind of problem: locking is a big hammer with its own problems, but it will never cause data incorrectness.


> Just because code is written in an asynchronous style doesn't prevent correctness errors that would not exist with locking.

Yes, that's why I referenced passing around a reference to an immutable data structure, which is fairly common in JS.


How will you update the bank account with an immutable data structure? You need to mutate the balance somewhere.


Well, in practice you're presumably using a database, which has transactions in the API you're using.

But to answer your question, in functional programming (also frequently ReactJS, if you use libraries like Redux), you don't mutate the data structure. You create a new data structure based off the old one.

Am I misunderstanding your question? Just because you use an immutable data structure doesn't mean it can't be updated.
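
With plain objects it looks something like this (libraries like Immutable.js or Immer give you the same idea with stronger guarantees):

  const account = { id: 42, balance: 100 };

  // Instead of account.balance -= 25, derive a new object from the old one:
  const updated = { ...account, balance: account.balance - 25 };

  console.log(account.balance); // 100, the original is untouched
  console.log(updated.balance); // 75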


Database written in not(Javascript) of course, since it has to work and handle concurrency properly.


Yes, of course. Why would you build a database in JS?

And does gmail not work well for you?

Most of us are only using JavaScript because that's the only way to build browser applications (or compile-to-JS languages, which still require an understanding of JS or you'll run into problems).

In any case, I don't particularly like JS, so your dig fell short. But JS is necessary for many of us.


> "Most of us are only using JavaScript because that's the only way to build browser applications (or compile-to-JS languages, which still require an understanding of JS or you'll run into problems)."

The JavaScript apocalypse is inevitable; we're just yet to reach the tipping point where this hard requirement to use JS for web/browser applications is no longer there.


> also ignorant, because 1990s C was decidedly synchronous

There was definitely threading in C in the 1990's, and threads are asynchronous by nature. There's a whole class of synchronization bugs that C and C++ developers had to learn to deal with, and while JavaScript developers get to avoid some by the nature of there being a single thread, that doesn't necessarily exempt them from all of them.


Threading implementation was async, not the programming model. The async programming style/model (as far as I'm aware) refers to the use of callbacks or coroutines, neither of which was common at all in C in the 1990s.

Indeed, nginx caused big waves due to its superior IO performance as the first popular async http server released in 2004.

But correct me if you have a different understanding of the term.


Your post manages to say nothing correct. JavaScript did not invent callbacks, asynchronous programming, or event-driven programming. Node.js popularized it in the 2010s, but you are claiming credit for something that has been in common use since the 70s. I have no clue how anyone can speak with such authority while clearly not having taken even a cursory glance at the programming APIs available in the 80s and 90s.

Lighttpd was released before nginx and was wildly popular for a good while. Not to mention other servers like AOLserver in the 90s.

The use of non-blocking sockets was nothing new; it was used heavily in C code throughout the 90s and is the basis of asynchronous processing (refer to Stevens' Unix Network Programming). You should also read documents by John Ousterhout written in the 90s about this topic.

Now just about every major GUI library was written in C or Pascal and used an event-driven callback model. You can refer to the Windows API, the modern Mac Cocoa API (which traces back to NeXT in the 80s), and the older Mac Toolbox SDK. The Windows SDK allows event-driven programming for UI and IO. Tcl/Tk. Just about anything on top of X (e.g. Motif, GTK). The list goes on and on.

Do some research before making bold claims about C programmers not understanding asynchronous programming and callbacks.

Also I think the other reply is correct in asserting that your definition of async is making distinctions without differences.

E.g., from 1995: https://web.stanford.edu/~ouster/cgi-bin/papers/threads.pdf Note that event-driven programming was nothing new in 1995; it was the basis for the Windows API designed years prior, after all, but this was around the time that multithreading was being pushed for everything.

Heck, fibers (https://docs.microsoft.com/en-us/windows/win32/procthread/fi...) have existed in the Windows API since at least '95.


I'm referring to OS-level threads, but functionally it's no different if we talk about forking with shared memory. Given that the vast majority of computing 15 years ago was single-processor and used multiple processes or threads for asynchronous behavior (or use of select), it's all functionally equivalent, and many of the same problems discovered decades ago for multi-process programs apply.

If I have a main program, and I fork a child to handle a task, and use shared memory to communicate, how is that any different than JavaScript executing an async call that sets or returns a value? The JavaScript runtime has the same behavior as the OS scheduler in this, and if there's a single processor, it necessarily will only execute one instruction at a time. There's still plenty of pitfalls to worry about, and that's why locks were (and are) useful, and why they are included with OS thread implementations (which are really just a nicer API over forks and shared memory, often handled by the OS for additional benefit).


The most common need I've found for some kind of locking mechanism in JavaScript is the very basic example of asynchronous functions triggered by UI. For example, button -> http call -> navigate. If the user presses the button twice, the call is made twice before navigating, which has unexpected results in some cases.

The lock in this case could be as simple as disabling the button as soon as the user clicks on it, but I've also found Promise-based locking functions to be very useful in solving these kinds of problems.

I wouldn't call these "primitives" though - you don't need a language construct for it. It's easy enough to build the locks as functions working with Promises.
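
A minimal sketch of that kind of Promise-based lock (submitForm and navigate are placeholder names): while a request is in flight, repeat clicks just reuse the pending promise instead of firing a second call.

  let inFlight = null;

  function onSubmitClick() {
    if (inFlight) return inFlight;   // ignore re-clicks while pending
    inFlight = (async () => {
      try {
        await submitForm();          // e.g. the http call
        navigate("/done");
      } finally {
        inFlight = null;             // release the "lock"
      }
    })();
    return inFlight;
  }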


> (also ignorant, because 1990s C was decidedly synchronous)

unless you were programming in... any GUI toolkit ever except the most toy ones? Even Win 3.1 GUI primitives were async.


Yeah, fair point. Technically, Windows was C++, but close enough; plus, as you say, other GUI toolkits also used async. And I should have remembered that, since GUI libraries used async for the same reason as browser JS: it's not good to block the UI thread.

I was thinking about C network programming, which, despite what people are saying on here, was not event-driven in the 1990s (I was there).


> 1990s C was decidedly synchronous

You know that JavaScript is a C(++) program? That when you use TCP, the protocol is in C? The ethernet driver is written in C? The OS scheduler is written in C? That you're programming in a little sandbox, and all the concurrency around you is managed in C?

There has never been anything synchronous about C.


> You know that Javascript is a C(++) program?

JavaScript is not implementation-defined. There's a spec, and there are interpreters in a number of different languages. Sure, the most common JS engines, JIT and otherwise, are in C++, but that says nothing about whether or not they use an event-driven style underneath to implement the interpreter.

And regarding C being synchronous, I'm talking about programming models, not the underlying architecture of the computer or OS. If everything is programmed async as you seem to suggest here, then why do we bother distinguishing the two? Why do most books on network programming have a separate chapter devoted to async or event-driven programming? Why do socket APIs have a way to opt into non-blocking mode (O_NONBLOCK, or `socket.setblocking(0)` in Python)?

But I suspect you know damn well what I'm talking about.

I've got to get off of social media.


No matter what language you are using, concurrency happens in instructions, interrupts, cores, caches, devices, virtual memory mechanisms, etc., not even getting into GPU architecture. In C you have direct control over these things; you can make the system as concurrent as you want.

In Javascript you have a little window into this through whatever the layer below provided you. So Javascript by definition has a (small) subset of the concurrency you can get in a systems language.

And no I'm not entirely sure what you're talking about. It sounds like you learned concurrency in Javascript, and define concurrency in terms of Javascript primitives. But that's merely a guess on my part.


> "Javascript devs, for all their flaws, understand async programming far, far better than the average C programmer."

Not the junior ones, they don't. I understand the desire to discuss this advanced topic as if only good, careful developers existed, but that is almost never the case in concrete scenarios. Developers forget, developers share code with other developers and the consequent spaghetti mess is hard to reason about 100%, developers make mistakes, developers are sometimes yet to learn something, developers miss small bugs, new code deals with library code that might have an async bug, etc.

Sequential flow is much easier to reason about and control for, and if you ask for my opinion, we should only use async features if the benefits of their usage far outweigh the potential complexity and headache that they introduce if you don't "use them the right way".



