If your vector for leverage is that subtle it’s going to be completely useless as a negotiation tool. Unionizing is about outcomes for workers, not retribution.
Nice. I've got a whole lot of magical things that I need built for my day job. Want to connect so I can hand the work over to you? I'll still collect the paychecks, but you can have the joy. :)
I thought it was a bit odd that the author claims there are no mutexes in sight; the TVar is effectively a mutex guard, unless I'm misunderstanding this? (I've written exactly 0 lines of Haskell.) Or is the claim that the lack of ceremony and accidental complexity around threading is the real win for concurrency here?
No, a TVar is not a mutex guard. A TVar is a software transactional memory (STM) variable. STM works just like a database: you batch together a sequence of operations into a transaction and then execute them. During execution of a transaction, all changes made to the contents of the TVars are stored in a transaction log. If another transaction commits a conflicting change while the first is executing, the whole transaction is aborted and re-run.
This can take any ordinary Haskell data structure and give you a lock-free concurrent data structure with easy-to-use transactional semantics. How it performs is another matter! That depends on the amount of contention and the cost of re-playing transactions.
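To make that concrete, here's a small self-contained sketch (the `transfer` function and the numbers are mine, not from the article). The `check` call shows the retry behavior: if the balance is too low, the transaction blocks and re-runs when the TVar changes.

```haskell
import Control.Concurrent.STM

-- A sketch of the transactional semantics described above: a transfer
-- between two TVars either happens entirely or not at all.
transfer :: TVar Int -> TVar Int -> Int -> STM ()
transfer from to amount = do
  balance <- readTVar from
  check (balance >= amount)     -- retries the transaction if funds are too low
  writeTVar from (balance - amount)
  modifyTVar' to (+ amount)

main :: IO ()
main = do
  a <- newTVarIO 100
  b <- newTVarIO 0
  atomically (transfer a b 30)  -- commits both writes together
  pair <- atomically ((,) <$> readTVar a <*> readTVar b)
  print pair  -- (70,30)
```

No locks are visible to the caller; `atomically` is the only concurrency primitive you touch.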
This library is full of STM-oriented data structures. They perform better than a simple `TVar (Map k v)`.
It's kind of a fun trick actually. The stock Map is just a tree. The STM Map is also a tree [1] but with TVars at each node. So this helps a lot with contention - you only contend along a "spine" of O(log n) nodes instead of across the whole tree.
[1] Technically a HAMT a la unordered-containers - trie, tree, you get the idea :)
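Here's a toy illustration of the TVar-per-node idea (a plain binary search tree, not the real HAMT that stm-containers uses; all names are mine). A transaction only reads the TVars along one root-to-leaf path, so writers to different subtrees don't conflict.

```haskell
import Control.Concurrent.STM

-- Toy sketch: a binary search tree with a TVar at every node, so a
-- transaction only touches the TVars along one root-to-leaf "spine".
data Node k v = Leaf | Node k v (TVar (Node k v)) (TVar (Node k v))

insert :: Ord k => TVar (Node k v) -> k -> v -> STM ()
insert ref k v = do
  node <- readTVar ref
  case node of
    Leaf -> do
      l <- newTVar Leaf
      r <- newTVar Leaf
      writeTVar ref (Node k v l r)
    Node k' _ l r
      | k < k'    -> insert l k v   -- only TVars on this path are read
      | k > k'    -> insert r k v
      | otherwise -> writeTVar ref (Node k v l r)  -- overwrite existing key

lookupSTM :: Ord k => TVar (Node k v) -> k -> STM (Maybe v)
lookupSTM ref k = do
  node <- readTVar ref
  case node of
    Leaf -> pure Nothing
    Node k' v l r
      | k < k'    -> lookupSTM l k
      | k > k'    -> lookupSTM r k
      | otherwise -> pure (Just v)

main :: IO ()
main = do
  root <- newTVarIO Leaf
  atomically (mapM_ (\k -> insert root k (k * 10)) [3, 1, 2 :: Int])
  r <- atomically (lookupSTM root 2)
  print r  -- Just 20
```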
I know you say it depends on how much contention one sees but I'm interested in the performance hit. Also, is STM the "standard" (or accepted) way to do async in Haskell?
You are correct, Haskell has quite a few mutex-like types. MVar is one of them.
However, if memory serves me right, TVar is a building block for the transactional memory subsystem. The guard on TVar with, say, modifyTVar is not really stopping execution at entrance but simply indicating that the block modifies the variable. In my mental model, some magic happens in an STM block that checks if two concurrent STM blocks acted upon the same data at the same time, and if so, it reverts the computations of one of the blocks and repeats them with new data.
To my knowledge, Haskell is the only programming language (+runtime) that has a working transactional memory subsystem. It has been in the language for about 20 years, and in that time many have tried (and failed) to also implement STM.
Clojure's STM never really took off because, for various reasons, it's not as easy to compose as Haskell's (where you can build up a big library of STM blocks and piece them together at the very edges of your program). As a result, Clojure's STM doesn't have a great reputation within the Clojure ecosystem and is rarely used in production codebases, whereas in Haskell STM is often one of the first tools reached for in any production codebase with concurrency.
Basically it's the difference between focusing only on transactional variables without having a good way of marking what is and isn't part of a larger transaction and having a higher-order abstraction of an `STM` action that clearly delineates what things are transactions and what aren't.
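The `orElse` combinator is a good example of that composability (the account names here are mine). Two small STM actions combine into one bigger transaction, which still commits or retries as a single unit:

```haskell
import Control.Concurrent.STM

-- Composition sketch: try withdrawing from one account, falling back
-- to another, all as a single atomic transaction built from pieces.
withdraw :: TVar Int -> Int -> STM ()
withdraw acct n = do
  bal <- readTVar acct
  check (bal >= n)              -- retry this branch if funds are too low
  writeTVar acct (bal - n)

withdrawEither :: TVar Int -> TVar Int -> Int -> STM ()
withdrawEither a b n = withdraw a n `orElse` withdraw b n

main :: IO ()
main = do
  a <- newTVarIO 5
  b <- newTVarIO 100
  atomically (withdrawEither a b 50)  -- a is too small, so b is debited
  print =<< readTVarIO b  -- 50
```

`withdrawEither` is itself an ordinary `STM` action, so it can be composed further; that's the "higher-order abstraction" being described.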
My impression, at least watching chatter over the last several years, isn't that it has a bad reputation but rather that people haven't found a need for it; atoms are good enough for the vast bulk of shared mutable state. Heck, even Datomic, an actual bona fide database, doesn't need STM: it's apparently all just an atom.
But I’ve never heard someone say it messed up in any way, that it was buggy or hard to use or failed to deliver on its promises.
Clojure atoms don't use STM, though; they're built on compare-and-swap. I've been writing Clojure for almost a decade now, and it's not that STM isn't great, it's just that immutable data will carry you a very long way - you just don't need coordinated mutation except in very narrow circumstances. In those circumstances STM is great! I have no complaints. But it just doesn't come up very often.
> Taking on the design and implementation of an STM was a lot to add atop designing a programming language. In practice, the STM is rarely needed or used. It is quite common for Clojure programs to use only atoms for state, and even then only one or a handful of atoms in an entire program. But when a program needs coordinated state it really needs it, and without the STM I did not think Clojure would be fully practical.

(Rich Hickey, "A History of Clojure")
Haha, I read The Joy of Clojure way back in 2013 and conflated the different reference types with STM. So thanks for mentioning that, I always thought it weird that you'd need STM for vars and atoms too.
That said, I have never used a ref, nor seen one in use outside of a demo blogpost.
I would say to the contrary it would come up all the time if the right idioms were in place.
For example, when it comes to concurrent access to a map, the Clojure community generally forces a dichotomy: either stick a standard Clojure map in an atom and get fully atomic semantics at the expense of serial write performance, or use a Java ConcurrentMap at the expense of inter-key atomicity (or do a gnarlier atom-around-a-map-of-atoms, which gets quite messy quite fast).
Such a stark tradeoff doesn't need to exist! In theory STM gives you exactly the granularity you need where you can access the keys that you need atomicity for and only those keys together while allowing concurrent writes to anything else that doesn't touch those keys (this is exactly how e.g. the stm-containers library for Haskell works that's linked elsewhere).
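You can sketch the key-granular idea even without stm-containers by keeping a TVar per value (all names here are mine; this is the concept, not that library's implementation). Updates to existing keys only touch that key's TVar, so transactions over disjoint keys don't conflict, while a transaction spanning two keys is still atomic:

```haskell
import Control.Concurrent.STM
import qualified Data.Map.Strict as Map

-- Key-granular atomicity sketch: a Map whose values live in TVars.
-- (Inserting a brand-new key still writes the outer TVar, so only
-- updates to existing keys are fully contention-free here.)
type STMMap k v = TVar (Map.Map k (TVar v))

-- Hypothetical helper: adjust a key's value, treating absent keys as 0.
adjust :: Ord k => STMMap k Int -> k -> (Int -> Int) -> STM ()
adjust mref k f = do
  m <- readTVar mref
  case Map.lookup k m of
    Just cell -> modifyTVar' cell f
    Nothing   -> do
      cell <- newTVar (f 0)
      writeTVar mref (Map.insert k cell m)

main :: IO ()
main = do
  mref <- newTVarIO Map.empty
  -- Move 10 from "a" to "b" atomically, as one transaction.
  atomically (adjust mref "a" (subtract 10) >> adjust mref "b" (+ 10))
  m    <- readTVarIO mref
  vals <- mapM readTVarIO m
  print (Map.toList vals)  -- [("a",-10),("b",10)]
```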
You missed a very important detail, the language runtime.
While Haskell's runtime is designed for Haskell's needs, Clojure has to make do with whatever the JVM designers considered relevant for Java the language, and the same goes on the other platforms Clojure targets.
This is yet another example of the difference between a platform designed for a language and being a guest language on someone else's platform.
I don't think this is a limitation of the JVM. When I've used Clojure's STM implementation it's been perfectly serviceable (barring the composability issues I mentioned). Likewise when I've used the various STM libraries in Scala. Eta (basically a Haskell implementation on the JVM that unfortunately stalled in development) also had a fine STM implementation.
It's more of a combination of API and language decisions rather than the underlying JVM.
> Implication of Using STM
Running I/O inside STM: there is a strict boundary between the STM world and the ZIO world. This boundary propagates even deeper because we are not allowed to execute arbitrary effects in the STM universe. Performing side effects and I/O operations inside a transaction is problematic: in STM, the only effect that exists is STM itself. We cannot print something or launch a missile inside a transaction, as it would nondeterministically happen on every retry of that transaction.
Does Zio actually offer any protection here, or is it just telling the reader that they're on their own and should be wary of footguns?
STM happens inside the STM monad while regular effects happen in the ZIO monad. If you try to do ZIO effects inside an STM transaction you'll get a type error.
Scala doesn't enforce purity like Haskell, though, so it won't stop you if you call some normal Scala or Java code with side effects. In practice it's not a problem because you're wrapping any effectful outside APIs before introducing them into your code.
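For comparison, here's how Haskell enforces the same boundary (a small sketch; the `increment` name is mine). `atomically` has type `STM a -> IO a`, so smuggling arbitrary I/O into a transaction is a compile-time type error rather than a convention:

```haskell
import Control.Concurrent.STM

-- 'increment' lives in STM, not IO, so it can only do transactional things.
increment :: TVar Int -> STM ()
increment t = do
  modifyTVar' t (+ 1)
  -- putStrLn "inside!"  -- uncommenting this is a type error: IO (), not STM ()
  pure ()

main :: IO ()
main = do
  t <- newTVarIO 0
  atomically (increment t)
  print =<< readTVarIO t  -- 1
```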
If you lock a section of code (to protect data), there's no guarantee against mutations of that data from other sections of code.
If you lock the data itself, you can freely pass it around and anyone can operate on it concurrently (and reason about it as if it were single-threaded).
It's the same approach as a transactional database, where you share one gigantic bucket of mutable state with many callers, yet no-one has to put acquire/release/synchronise into their SQL statements.
No, a TVar isn't a mutex guard. As a sibling comment points out, it gives you transactional semantics similar to most relational databases.
Here's an example in perhaps more familiar pseudocode.
    var x = "y is greater than 0"
    var y = 1

    forkAndRun {() =>
      y = y - 1
      if (y <= 0) {
        x = "y is less than or equal to 0"
      }
    }

    forkAndRun {() =>
      y = y + 1
      if (y > 0) {
        x = "y is greater than 0"
      }
    }
In the above example, it's perfectly possible, depending on how the forked code blocks interact with each other, to end up with
    x = "y is less than or equal to 0"
    y = 1
because we have no guarantee of atomicity/transactionality in what runs within the `forkAndRun` blocks.
The equivalent of what that Haskell code is doing is replacing `var` with a new keyword `transactional_var` and introducing another keyword `atomically` such that we can do
    transactional_var x = "y is greater than 0"
    transactional_var y = 1

    forkAndRun {
      atomically {() =>
        y = y - 1
        if (y <= 0) {
          x = "y is less than or equal to 0"
        }
      }
    }

    forkAndRun {
      atomically {() =>
        y = y + 1
        if (y > 0) {
          x = "y is greater than 0"
        }
      }
    }
and never end up with a scenario where `x` and `y` disagree with each other, because all their actions are done atomically together and `x` and `y` are specifically marked so that in an atomic block all changes to the variables either happen together or are all rolled back together (and tried again), just like in a database.
`transactional_var` is the equivalent of a `TVar` and `atomically` is just Haskell's `atomically`.
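In actual Haskell the sketch above looks roughly like this (the `done` counter is my addition, used as an STM-style way to wait for both forks). Because each block runs atomically, both interleavings of the two transactions end in the same consistent final state:

```haskell
import Control.Concurrent (forkIO)
import Control.Concurrent.STM
import Control.Monad (when)

-- Rough Haskell rendering of the pseudocode above: x and y can never
-- be observed disagreeing, in either order of the two transactions.
main :: IO ()
main = do
  x    <- newTVarIO "y is greater than 0"
  y    <- newTVarIO (1 :: Int)
  done <- newTVarIO (0 :: Int)
  _ <- forkIO $ atomically $ do
         modifyTVar' y (subtract 1)
         n <- readTVar y
         when (n <= 0) $ writeTVar x "y is less than or equal to 0"
         modifyTVar' done (+ 1)
  _ <- forkIO $ atomically $ do
         modifyTVar' y (+ 1)
         n <- readTVar y
         when (n > 0) $ writeTVar x "y is greater than 0"
         modifyTVar' done (+ 1)
  -- Retry-based wait: block until both transactions have committed.
  atomically (readTVar done >>= check . (== 2))
  state <- atomically ((,) <$> readTVar x <*> readTVar y)
  print state  -- ("y is greater than 0",1) in either interleaving
```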
As siblings note, TVar is a transactional variable. However, it's not just protective against concurrent writes but also against concurrent reads of altered variables, so it offers true atomicity across any accessed state in a transaction.
So if you have a thread altering `foo` and checking that `foo+bar` isn't greater than 5 and a thread altering `bar` and checking the same, then it's guaranteed that `foo+bar` does not exceed 5. Whereas if only write conflicts were detected (as is default with most databases) then `foo+bar` could end up greater than 5 through parallel changes.
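A single-threaded sketch of that invariant in Haskell (names and numbers mine): because each transaction reads both variables, full read validation, not just write-conflict detection, is what lets the `foo + bar <= 5` check hold under concurrency.

```haskell
import Control.Concurrent.STM
import Control.Monad (when)

-- Bump 'this' by one only if the invariant this + other <= 5 still holds.
-- Reading 'other' inside the transaction is what STM validates on commit.
bump :: TVar Int -> TVar Int -> STM ()
bump this other = do
  a <- readTVar this
  b <- readTVar other
  when (a + 1 + b <= 5) $ writeTVar this (a + 1)

main :: IO ()
main = do
  foo <- newTVarIO 2
  bar <- newTVarIO 2
  atomically (bump foo bar)  -- 2+1+2 <= 5, so foo becomes 3
  atomically (bump bar foo)  -- 2+1+3 > 5, so bar is left unchanged
  print =<< atomically ((,) <$> readTVar foo <*> readTVar bar)  -- (3,2)
```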
I have a Windows machine for work, a Mac as a laptop, and Linux on my desktop workstation. Windows is by far the worst. I don't think Linux is vastly superior to the Mac; they both have some things they do better than the other. My main issue is ARM vs x86.
A couple of my M1 machines were purchased on the day of launch at the Apple Store so they’re the 8GB RAM models. They’re still very good running Apple software (even Final Cut Pro) but I find myself running out of RAM with a few web browsers running at the same time. The base model of the M4 mini now has 16GB by default which I’m sure would be just fine for me.
Mine has 64GB. I usually max out the RAM and get about double the SSD I think I need, knowing that at any given time I'll be using my latest laptop plus 2-3 older models around the house and in my quick-carry bag for convenience.
Beyond convenience, the old laptops are continually synced, as multiple onsite backups. So I get great long term value from consistently choosing higher end RAM/SSD specs.
But in this case, good specs for the M1 have saved me money via an unprecedentedly long upgrade schedule.
I feel more pressure to update two old x86 laptops to M1 than any pressure to upgrade my M1. I've never had this upside-down problem before. Apple did just a great job with the M1.
Mostly because of computational efficiency, IIRC. The choice of non-linearity doesn't seem to have much impact, so picking one that's fast to compute is a more efficient use of limited computational resources.
There used to be competing centers of power. But then they stacked the judiciary and used manipulative propaganda to turn the House and Senate into a rubber stamp. The only check on power was having the interests of those institutions not aligned with each other, so that they had power they were able to exercise independently.
I think the breathless hype train of Twitter is probably the worst place to get an actually grounded take on what the real-world implications of the technology are.
Seeing the 100th example of an llm generating some toy code for which there are a vast number of examples of approximately similar things in the training corpus doesn’t give you a clearer view of what is or isn’t possible.
I think that most of the developers who advocate for AI coding have never worked all by themselves on projects with over 500/1000 files. Because if they had they would not advocate for AI coding.
I posted this earlier, but I wanted a Java port of sed for ... Reasons, and despite the existence of man pages and source code it couldn't do anything but the most basic flags.
IMO this should be low-hanging fruit. Porting a small set of 3-4 non-trivial core code files that are already debugged and have a specified interface should be exactly what an LLM excels at.
I tried this with Microsoft's Copilot + the Think Deeper button. That allegedly uses the new o1 model. It goes into a lot of fancy talk about...pretty much what you said older models did. Then it said "here's some other stuff you could extend this with!" and a list of all the important sed functionality.
It's possible it could do it if prompted to finish the code with those things, but I don't know the secret number of fancy o1 uses I get and I don't want to burn them on something that's not for me.
You should be able to access it here if you have a Microsoft account and want to try the button: https://copilot.microsoft.com/