
> By far the most important aspect is that we need to keep the new codebase as compatible as possible, both in terms of semantics and in terms of code structure. We expect to maintain both codebases for quite some time going forward. Languages that allow for a structurally similar codebase offer a significant boon for anyone making code changes because we can easily port changes between the two codebases. In contrast, languages that require fundamental rethinking of memory management, mutation, data structuring, polymorphism, laziness, etc., might be a better fit for a ground-up rewrite, but we're undertaking this more as a port that maintains the existing behavior and critical optimizations we've built into the language. Idiomatic Go strongly resembles the existing coding patterns of the TypeScript codebase, which makes this porting effort much more tractable.

--https://github.com/microsoft/typescript-go/discussions/411

I haven't looked at the tsc codebase. I do currently use Golang at my job and have used TypeScript at a previous job several years ago.

I'm surprised to hear that idiomatic Golang resembles the existing coding patterns of the tsc codebase. I've never felt that idiomatic code in Golang resembled idiomatic code in TypeScript. Notably, sum types are commonly called out as especially useful for writing compilers, and when I've wanted them in Golang I've struggled to approximate them.
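
For concreteness, the closest I usually get in Go is a sealed interface plus a type switch. A minimal sketch (the types are invented for illustration; note the compiler won't check exhaustiveness the way a TypeScript discriminated union would):

    package main

    import "fmt"

    // Expr emulates a sum type via a "sealed" interface: only types in this
    // package can implement the unexported marker method.
    type Expr interface{ isExpr() }

    type NumLit struct{ Value float64 }
    type Ident struct{ Name string }

    func (NumLit) isExpr() {}
    func (Ident) isExpr()  {}

    func describe(e Expr) string {
        switch v := e.(type) {
        case NumLit:
            return fmt.Sprintf("number %v", v.Value)
        case Ident:
            return fmt.Sprintf("identifier %q", v.Name)
        default:
            return "unknown expression" // nothing forces this case to be unreachable
        }
    }

    func main() {
        fmt.Println(describe(NumLit{Value: 42}))
        fmt.Println(describe(Ident{Name: "x"}))
    }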

Is there something special about the existing tsc codebase, or is the statement about idiomatic Golang resembling the existing codebase something you could say about most TypeScript codebases?


> I'm surprised to hear that idiomatic Golang resembles the existing coding patterns of the tsc codebase. I've never felt that idiomatic code in Golang resembled idiomatic code in TypeScript.

To be fair, they didn't actually say that. What they said was that idiomatic Go resembles their existing patterns. I'd imagine what they mean is that a port from their existing patterns to Go is much closer to a mechanical 1:1 process than a port to Rust or C#. Rust is the obvious choice for a fully greenfield implementation, but reorganizing around idiomatic Rust patterns would be much harder for most programs that are not already written in a compatible style. E.g., for Rust programs, the precise ownership and transfer of memory need to be modelled, whereas Go and JS are both GC'd and don't require this.

For a codebase that relies heavily on exception handling, I can imagine a 1:1 port would require more thought, but compilers generally need pretty good error recovery, so I wouldn't be surprised if tsc has bespoke error handling patterns that defer error handling and pass errors around as values a lot; that would map pretty well to Go.
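
A hand-wavy Go sketch of what I mean by deferring errors and passing them around as values (the names are invented, not tsc's actual API):

    package main

    import "fmt"

    // Diagnostic records a problem as a plain value instead of throwing.
    type Diagnostic struct {
        Pos     int
        Message string
    }

    // checkNames keeps going after each problem so one bad entry doesn't
    // abort the whole pass; the caller decides later what to do with diags.
    func checkNames(names []string) (diags []Diagnostic) {
        for i, n := range names {
            if n == "" {
                diags = append(diags, Diagnostic{Pos: i, Message: "empty name"})
                continue // recover and keep checking
            }
        }
        return diags
    }

    func main() {
        for _, d := range checkNames([]string{"a", "", "c"}) {
            fmt.Printf("error at %d: %s\n", d.Pos, d.Message)
        }
    }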

Most TypeScript projects are very far away from compiler code, so it isn't too surprising that this wouldn't resemble typical TypeScript. Compilers written in Go don't tend to resemble typical Go either, in fairness.


I'm not involved in this rewrite, but I made some minor contributions a few years ago.

TSC doesn't use many union types; it's mostly OOP-ish down-casting or chains of if-statements.

One reason for this, I think, is performance: most objects are tagged with bitsets in order to pack more info about the object without needing additional allocations. But TypeScript can't really (ergonomically) represent this in the type system, so you don't get any really useful unions.
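
Roughly the kind of tagging I mean, sketched in Go since that's where the port is headed (the flag names are made up, not tsc's actual ones):

    package main

    import "fmt"

    // NodeFlags packs several booleans about a node into one word, avoiding
    // extra allocations or wrapper objects.
    type NodeFlags uint32

    const (
        FlagOptional NodeFlags = 1 << iota
        FlagExported
        FlagSynthesized
    )

    type Node struct {
        Kind  int
        Flags NodeFlags
    }

    func main() {
        n := Node{Kind: 1, Flags: FlagOptional | FlagExported}
        // Checking a flag is a single mask, but nothing in the type system
        // narrows Node based on it the way a discriminated union would.
        fmt.Println(n.Flags&FlagExported != 0) // true
    }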

A lot of the objects are also secretly mutable (for caching/performance), which can make precise union types not very useful, since they can easily be invalidated by those mutations.


In the embedded video they show some of the code side by side and it is just a ton of if statements.

https://youtu.be/pNlq-EVld70?si=UaFDVwhwyQZqkZrW&t=323


To be fair, there aren't many ways to implement a token matcher.

Though looking at that flood of loose ifs + returns, I kinda wish they'd used Rust :)
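
For what it's worth, in Go it usually ends up as one big switch rather than loose ifs; a toy sketch:

    package main

    import "fmt"

    type Token int

    const (
        TokenPlus Token = iota
        TokenMinus
        TokenLParen
        TokenUnknown
    )

    // matchToken maps a single character to a token; real scanners add
    // lookahead for multi-character operators, but the shape is the same.
    func matchToken(ch byte) Token {
        switch ch {
        case '+':
            return TokenPlus
        case '-':
            return TokenMinus
        case '(':
            return TokenLParen
        default:
            return TokenUnknown
        }
    }

    func main() {
        fmt.Println(matchToken('+'), matchToken('('))
    }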


I’d guess Rust compile times weren’t worth it if they weren’t going to be taking advantage of the type system in interesting ways.

I haven't read "Changing Minds".

> As he points out: it would be terrible! We’d lose the intuitive understanding of how to use the gears to solve any situation we encounter. Which mode do you use for gravel + downhill?

I have no idea, actually, what gears I should use for gravel + downhill, which is surprising since about 6 years ago I biked from Pittsburgh to DC, a route with a fairly long downhill section. I remember mostly coasting on that section. I might've stopped a few times to make sure my brakes could cool down. I don't think there is a wrong gear on a bicycle when coasting down on gravel?

Perhaps I am the user who does need a nightmare bicycle with a gravel + downhill button.


The timeline says that the initial report was 6/16 and the initial patches were 7/8 and 7/18.

It's not clear to me what was exploitable when.


> In exchange for such favorable terms (i.e., small carrying cost, matures on death), the bank will receive a share of the collateral’s appreciation (essentially amounting to “stock appreciation rights"), and this obligation will be settled upon the borrower’s death.

It's a loan in name only.

Regarding Bezos's selling of stocks - perhaps he has offsetting capital gains. See https://old.reddit.com/r/BuyBorrowDieExplained/comments/1f26...


> Blocking within one, with those locks acquired, and thunking back down to userspace, would mean that 1. a CPU core, and 2. the filesystem itself, would both be tied up indefinitely until that callback thunk returns. If it ever does!

In this theoretical design, you could just block all other administrative modifications. I don't think you need to tie up an entire CPU core, and I'm fairly sure that these zfs operations don't block regular reads and writes.

I think you had it right in your initial comment. There's no good way to express branching with an implementation that incrementally submits operations to be committed as a batch. You'd have to take an admin lock on an entire zpool.

EDIT: talked to a ZFS dev, who said this would take the txg sync lock.


While there are ways to deschedule both userspace and kernel threads, there is no mechanism to deschedule a userspace thread while it's executing in the middle of kernel mode because of a blocking syscall.

Think of it like trying to deschedule a userspace thread in the middle of it having jumped to kernelspace to handle an interrupt. It just wouldn't work; that's not a pre-emptible state, not one that can be cleanly represented during a context switch with a PUSHA, not one where pre-emption would leave the kernel in a known state, etc.

So the CPU core is tied up because the original thread can't be descheduled, and instead would still be "stuck" in the middle of the system call, doing a busy-wait on the result of the callback. To make the callback actually happen in this hypothetical design, the execution of the callback would need to be scheduled onto another CPU core, using some system-global callback-scheduler like Apple's libdispatch.

Note that this is also why, in Linux, processes stuck in the D state are unkillable. They're stuck "inside" a blocking system call, and so cannot be descheduled, even by the process manager trying to hard-kill them (which, in the end, requires the system call to at least return to the kernel so that the kernel resources involved can reach a known postcondition state.)

And this is why innovations like io_uring make so much sense in Linux — they allow a userspace process to 1. make a long-running blocking syscall, while also 2. spawning a worker subprocess to communicate asynchronously with the logic inside the running syscall, by queuing messages back and forth through the kernel rings. (Picture, say, sendfile(2) messaging your worker to let you observe the progress of the operation, and/or to signal it on a channel to gracefully cancel the operation-in-progress.)


I'm not following what you're saying. Why do we need a callback?

In this imaginary design, the syscalls you make would look something like:

- BeginChannelTx -> return ChannelTxID

- ReadZFSProperties(ChannelTxID, params) -> return data

- DestroySomeDatasets(ChannelTxID, params) -> ok

- CommitChannelTx(ChannelTxID)

Notably, DestroySomeDatasets doesn't actually do any work. It merely records which datasets you want to destroy. There are no callbacks as far as I can see: there's no kernel thread waiting on a user thread to do something. This way also lets you express branching.
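
To make the shape concrete, here's a userspace mock of that hypothetical interface, written as plain Go functions standing in for syscalls that don't actually exist:

    package main

    import "fmt"

    // ChannelTxID identifies one pending administrative transaction.
    type ChannelTxID int

    // channelTx only records intent; nothing is applied until commit.
    type channelTx struct {
        destroys []string
    }

    var pending = map[ChannelTxID]*channelTx{}

    func BeginChannelTx() ChannelTxID {
        id := ChannelTxID(len(pending) + 1)
        pending[id] = &channelTx{}
        return id
    }

    func DestroySomeDatasets(id ChannelTxID, datasets ...string) {
        pending[id].destroys = append(pending[id].destroys, datasets...)
    }

    // CommitChannelTx is where the kernel would apply the whole batch
    // atomically (under the txg sync lock, per the ZFS dev I talked to).
    func CommitChannelTx(id ChannelTxID) {
        fmt.Println("committing destroys:", pending[id].destroys)
        delete(pending, id)
    }

    func main() {
        id := BeginChannelTx()
        // Userspace can branch freely between calls, e.g. based on
        // properties it read earlier in the same transaction.
        stale := []string{"pool/old-snapshot"}
        if len(stale) > 0 {
            DestroySomeDatasets(id, stale...)
        }
        CommitChannelTx(id)
    }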

The drawback of this approach is that you need a lock on all mutating administrative commands when you call BeginChannelTx. I talked to a ZFS dev, and he said that with ZFS's design, that's actually the txg sync lock. This means that while reads will proceed, writes will only proceed for a short period of time, and nothing will make it to disk. The overhead of making all these syscalls was also judged to be problematic.


I was really really excited when I saw the title because I've been having a lot of difficulties with other Go SQL libraries, but the caveats section gives me pause.

Needing to use arrays for the IN use case (see https://github.com/kyleconroy/sqlc/issues/216) and for the bulk insert case feels like a large divergence from what "idiomatic SQL" looks like. It means that you have to adjust how you write your queries, and that can be intimidating for new developers.

The conditional insert case also just doesn't look particularly elegant and the SQL query is pretty large.

sqlc also just doesn't look like it could help with the very dynamic queries I need to generate - I work on a team that owns a little domain-specific search engine. The conditional approach could in theory work here, but it's not good for the query planner: https://use-the-index-luke.com/sql/where-clause/obfuscation/...
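
What I end up doing instead for that kind of search endpoint is building the WHERE clause dynamically, so the planner only ever sees the predicates that are actually in play. A simplified Go sketch (table and column names invented):

    package main

    import (
        "fmt"
        "strings"
    )

    // buildSearchQuery appends only the filters the caller actually supplied,
    // instead of the "(col = $1 OR $1 IS NULL)" pattern that defeats the planner.
    func buildSearchQuery(name string, minAge int) (string, []interface{}) {
        var (
            conds []string
            args  []interface{}
        )
        if name != "" {
            args = append(args, name)
            conds = append(conds, fmt.Sprintf("name = $%d", len(args)))
        }
        if minAge > 0 {
            args = append(args, minAge)
            conds = append(conds, fmt.Sprintf("age >= $%d", len(args)))
        }
        q := "SELECT id FROM people"
        if len(conds) > 0 {
            q += " WHERE " + strings.Join(conds, " AND ")
        }
        return q, args
    }

    func main() {
        q, args := buildSearchQuery("alice", 0)
        fmt.Println(q, args)
    }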


Arrays are nicer for the IN case because Postgres does not understand an empty list, i.e. "WHERE foo IN ()" will error. Using "WHERE foo = ANY(array)" works as expected with empty arrays.


Works as expected? Wouldn't that WHERE clause filter out all of the rows? Is that frequently desired behavior?


I could imagine that you're building up the array in Go code and want the empty set to be handled as expected.
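
A rough Go sketch of that pattern using database/sql and lib/pq (hand-written rather than sqlc output; the table and connection string are placeholders), where an empty slice simply matches zero rows instead of producing invalid SQL:

    package main

    import (
        "database/sql"
        "log"

        "github.com/lib/pq" // registers the "postgres" driver and provides pq.Array
    )

    func fetchUsers(db *sql.DB, ids []int64) (*sql.Rows, error) {
        // "= ANY($1)" accepts an empty array and matches no rows, whereas
        // "IN ()" would be a syntax error in Postgres.
        return db.Query(`SELECT id, name FROM users WHERE id = ANY($1)`, pq.Array(ids))
    }

    func main() {
        db, err := sql.Open("postgres", "postgres://localhost/example?sslmode=disable")
        if err != nil {
            log.Fatal(err)
        }
        defer db.Close()

        rows, err := fetchUsers(db, []int64{}) // an empty slice is fine here
        if err != nil {
            log.Fatal(err)
        }
        defer rows.Close()
    }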


Can someone explain why Blastdoor has been unsuccessful? Is it too hard a problem to restrict what iMessage can do?


Can you point to a source that defines Levenshtein distance as only referring to bitstreams?

A translation of the original article [1] that introduced the concept notes in a footnote that "the definitions given below are also meaningful if the code is taken to mean an arbitrary set of words (possibly of different lengths) in some alphabet containing r letters (r >= 2)".

And if you wish to strictly stick to how it was originally defined, you'd need to only use strings of the same length.

More recent sources [2] say instead "over some alphabet", and even in the first footnote, describe results for "arbitrarily large alphabets"!

[1] https://nymity.ch/sybilhunting/pdf/Levenshtein1966a.pdf

[2] https://arxiv.org/pdf/1005.4033.pdf
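
For illustration, the standard dynamic-programming recurrence is agnostic to the alphabet; computed over Go runes it works for Unicode just as well as for bits:

    package main

    import "fmt"

    // levenshtein computes edit distance (insert/delete/substitute, unit costs)
    // between two strings over an arbitrary alphabet of runes.
    func levenshtein(a, b string) int {
        ra, rb := []rune(a), []rune(b)
        prev := make([]int, len(rb)+1)
        curr := make([]int, len(rb)+1)
        for j := range prev {
            prev[j] = j
        }
        for i := 1; i <= len(ra); i++ {
            curr[0] = i
            for j := 1; j <= len(rb); j++ {
                cost := 1
                if ra[i-1] == rb[j-1] {
                    cost = 0
                }
                curr[j] = minInt(minInt(curr[j-1]+1, prev[j]+1), prev[j-1]+cost)
            }
            prev, curr = curr, prev
        }
        return prev[len(rb)]
    }

    func minInt(x, y int) int {
        if x < y {
            return x
        }
        return y
    }

    func main() {
        fmt.Println(levenshtein("kitten", "sitting")) // 3
        fmt.Println(levenshtein("héllo", "hello"))    // 1
    }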


And Unicode is the biggest alphabet haha.


Where? CTRL-F Randen doesn't show anything, and the Randen repo claims it's faster than ChaCha8.


It is not directly in the article, but in a link to a tweet by djb, the creator of ChaCha8. He believes that the cycles-per-byte (cpb) figure listed in the Randen comparison is off:

https://twitter.com/hashbreaker/status/1023965175219728386

He suggests that perhaps the ChaCha8 implementation used for the benchmark was hand-written and unoptimized. And from what I saw, it's true that a lot of ChaCha8 benchmarks are implemented with none of the tweaks that make it fast.

In this instance, it looks like the Randen author didn’t reimplement it from scratch, but they used an SSE implementation, not an AVX2 one, which would have been faster: https://github.com/google/randen/blob/1365a91bafc04ba491ce79...


DevOps engineer link appears to 404.

