Criticizing Hare language approach for generic data structures (ayende.com)
151 points by ayende on May 4, 2022 | hide | past | favorite | 265 comments



"Bring your own data structures" is likely one of the C carry-overs for Hare. If you think about many "classic" C applications, a lot of them basically only exist around/for one or a few data structures (e.g. most servers, many utilities, etc.). In that context BYODS is somewhat defensible: if you don't care to implement the data structure, why does your application exist in the first place?

But that's not "modern" applications, which often deal with a ton of data structures. C applications that do so usually have a few "template headers" lying around (indeed, certain operating systems ship with them), though one of the most frequently used data structures in C is of course the intrusive linked list. Which is not a good data structure to use in most cases. Why is it used so much in C? Because it is easiest to implement. Why does that matter? Because C makes implementing generic data structures tedious. Is it a good idea to replicate that model?

qed.


Rust and Go seemed to initially want to target C programmers [1], but ended up capturing users from higher level languages – even Python and such – who wanted better performance and more control, because they had so many affordances.

Hare clearly isn't like that: it's _actually_ aimed at (FOSS) C programmers, almost stubbornly so, and isn't going to appeal to many others. But for those people I can see C-with-tagged-unions being an improvement, and there's value in finding the minimal diff that makes a language better.

[1] Remember when Go called itself a "systems programming language"? Neither community seems to use this phrase any more, though.


Rust still considers itself a "systems programming language". They are moving Rust into the Linux kernel and Android drivers. It doesn't get any more "systems programming" than that. But you are correct about Go: it doesn't give the programmer enough control to be used at that level.


> But you are correct about Go: it doesn't give the programmer enough control to be used at that level.

What I like about Go is that I can go full unsafe.Pointer if I want to, and do whatever I want.

The other thing I really like is that they did such a good job discouraging it that I frequently hear people complain about Go having pointers but not even letting you have fun with them.

The problem isn't that go doesn't give you enough control, the problem is that it's garbage collected. And you'd need to work around the GC. It's doable and people have done it though. Good idea? Maybe not.

Still waiting for someone to make "go but with ownership and borrow checker"


I think that being garbage collected is exactly the control problem in question.

I'm not a systems programmer unless you count hobby microcontroller projects, but my understanding is that, when a systems programmer is talking about control, they tend to be talking first and foremost about keeping tight limits on memory usage. Pointers and suchlike just happen to be key tools that help you do that.



> Still waiting for someone to make "go but with ownership and borrow checker"

That would have interested me more a few years ago, but I'm deeply invested in Rust these days.

Is that all you'd change about Go? One of the reasons I moved was the error handling and lack of generics. Go has fixed the latter, but I can't stand not having Result<> and Option<> these days. The '?' is awesome too.


> Is that all you'd change about Go?

No, Go has plenty of warts, like every language (ever formatted a date in Go?). I just don't like kitchen-sink languages.


You can't even call a non-Go library without switching to cgo, which, contradictorily, basically everyone says not to use.

Leaving aside the GC, until Go has a real FFI it's hard to imagine it being a really great fit for systems programming when the systems are all not written in Go.


Go is not a traditional systems programming language and calling it one is misleading at best and maybe even borderline dishonest. Two big problems are the lack of control of memory layout and the lack of some form that compiles down to a computed goto. It’s just not possible to write performant systems code without those.

I’m not bashing Go though. I appreciate the focus on readability among other things. It’s a fine language for the use cases it was actually designed for, like pushing petabytes of ad data to serving clusters all around the world with acceptable latency and reliability.


https://www.withsecure.com/en/solutions/innovative-security-...

I guess the Genera, Xerox PARC, ETHZ, Microsoft folks are borderline dishonest.


I would say that they evidently felt the need to write a new compiler and runtime, which appears to be a species of agreement with my thesis.


Apparently writing compilers isn't systems programming....


In my prior comments above and in all the following I use "language" in the typical way, which is to say referring to not just the syntax but the semantics of the standard toolchain and runtime as well. I wanted to clarify that since perhaps there is some confusion there. So when I talk about Go I'm talking about what I get here[1] as is everyone else who isn't explicitly specifying some other implementation.

Writing compilers is not systems programming in the sense that it requires a systems programming language, no. One could easily write a C compiler in Ruby, but we don't consider Ruby to be a systems programming language.

Thus, obviously, Go, despite not being a systems programming language, could be used to write the compiler for a systems programming language. I guess that is what Tamago is? I'm not going to read through the source to find out and the web page you linked is boring marketing copy.

[1] https://go.dev/dl/


The Go compiler toolchain is written in Go.

TamaGo is a bare metal runtime for unikernels, whose main use is a commercial product for secure USB keys, sold by F-Secure.

Back in my youth, writing compilers was considered systems programming, on par with kernel drivers. How things change.

I guess writing userspace drivers in hybrid kernels is also not considered systems programming.


Yes I agree it's a nice language. But the FFI situation is mildly infuriating.


You can use the same FFI as K&R C, write a couple of Assembly helpers.


Any links to help me understand what you mean about the K&R C FFI? (Not easy to google for)

Possibly related, building against foreign objects and manually setting up the FFI call: https://words.filippo.io/rustgo/.


Easy: K&R C was rather limited, and inline assembly wasn't yet a common extension. Instead you would use an external assembler and link both object files together.

What was good for C while it was gaining adoption, surely is good enough for Go.


Gotcha, that is along the same lines as the link I shared.



> “go but with ownership and borrow checker”

Wouldn’t this be just a lesser version of rust then? If someone makes a new language and wants it to catch on, it needs new ideas. Not just nice simple syntax mixed with something someone else is already doing.


> Wouldn’t this be just a lesser version of rust then?

That depends if you consider C++ an upgrade over C :)

Many flamewars have been had on this topic and I think it's just a matter of personal taste.

I'm firmly in the less is more camp.


Hilarious that you think that you could just add a borrowing mechanism + ownership and still be in the so-called less is more camp. That language would be looked down upon by the so-called less is more folks because of the “straightjacket” it would force on the programmer.


[flagged]


I have no fists in this fight, but wanted some clarification.

Are you saying the reason Rust is now being integrated into the Kernel instead of Golang is because of the success of evangelists, not because of the merit of the language itself?

I'm not involved in kernel development myself, but if I was, I'd see your statement as a big slap in the face, saying that we don't know what we're doing, if they (kernel developers) are being sold on Rust because of marketing, not because of the value of the language.


I must admit, I am still surprised to see Rust gaining acceptance when the reasons Linus gave in 2010 for rejecting C++ in the kernel appear to apply to Rust equally well. I can't think of any of the pragmatic points which he raised which doesn't apply to Rust.

For example, the arguments against namespaces and function overloading, the high context dependency, the ease of understanding, and the case for C's simplicity in general.

My guess (it's just a guess) is that his position has shifted over time to consider a higher-abstraction, higher-complexity language, as development methods and the complexity and professionalising of the kernel have changed since then.


It's not existing kernel devs that suddenly switched to Rust. It is the incoming cohort.


The main push for and patches to include Rust in Linux have come from an established and longtime kernel contributor. What you're saying is categorically false.


For someone to start using a new language in the Kernel, some of the existing "guard" has to approve it, meaning they've investigated if it's fit or not. Or can anyone willy-nilly contribute to the Kernel without anyone approving changes going into the main branch?


Miguel Ojeda is doing the lion’s share of the work here, and he is a kernel maintainer: https://ojeda.dev/

The person you’re replying to is either deeply misinformed or trolling


The existing guard doesn't have to prove anything; they're on the way out. C is a tough language and not a lot of people have the patience for it, and even fewer people are being forced to learn it like the rest of us were in school. As this segment of the community ages out, we aren't going to see more code written in C, so we have the new group with Rust, recreating all of the same issues in effigy of C, but with better string handling. It's not like it is terrible code, but as Nikola Tesla used to say, "it is what it is, and it ain't what it ain't".

You see how parties completely unrelated to the fight got Linus ousted? That was not based on technical merit; it was due to the changing tide regarding outcomes when people are offended/mistreated. Rust has significant overlap with the community that gave way to this, so don't think it is 100% open arms and welcomes, because that is not the case.

Every day we stray further from God.


Damn right I don't have the patience for C. Writing anything in C properly takes a long time, and even then bugs are usually found after.

Compare and contrast Rust, where the code I write will usually just work. I don't need to be extra careful with it because the compiler is strict. You can say that this is indicative of some flaw in my character or whatever, but, uh, I don't care.


Cool, so everyone can just ignore you then, since this is entirely incorrect.


My understanding is that Go has several features that make it unsuitable as a Linux kernel language, e.g. automatic garbage collection, its concurrency model, and possibly the nature of its runtime dependency (which is related to the previous two items.)


Garbage collection is not mandatory, and the concurrency model is often hailed as one of Go's best features.


A best feature it may well be, but it requires infrastructure, and the OS kernel provides that infrastructure; it doesn't depend on it.

Rust for Linux relies on the fact that Rust is deliberately structured in layers so that the bottom layer doesn't need that infrastructure. This was needed to make embedded Rust practical, but it's also important for Linux. Actually, Linux needs even more than Rust had initially, but that's driven further improvements to Rust itself.

You could add a layer to do all this lifting (that's what e.g. a Java OS does), but that's not going to fly in Linux, which is one reason why Go for Linux isn't a thing whereas Rust for Linux is.


> Garbage collection is not mandatory

What does that mean? Isn't GC a fundamental part of Go?


I'd never use Go to write a kernel in, but you can disable the GC using `GOGC=off`.


I like Go and I use it often. I also work on kernels. Go is not suitable for writing kernels.


I'm choosing to imagine that you typed this with one hand behind your back and some fingers crossed, Drew.


> there's value in finding the minimal diff that makes a language better.

OTOH, if the differences are minimal, then the benefits are likely to be correspondingly minimal. And either way, the compiler/interpreter for a relatively new language will by definition be much less battle-tested.

Practically speaking, I think if a new language is going to compete with C, it needs to offer more than a few incremental improvements.


Mostly because writing compilers, linkers, GC implementations, GPU debuggers, OS emulation layers, distributed orchestration systems, unikernels, isn't considered systems programming by the crowd that is allowed to judge it.


No. Rust always wanted to appeal to C++ programmers moreso than C programmers.


"Because C makes implementing generic data structures tedious. Is it a good idea to replicate that model?"

A lot of so-called "modern" languages make a choice for you. Here is a hash map, over there is a class, have a tuple, set, etc. And they all have a specific way to be used.

In implementing all that, many languages lose on flexibility while gaining general utility. Hare seems to go in the opposite direction, one that makes programmers think hard about the solutions to their problems.

And that is, IMHO, a good thing.


I think the main thing is that e.g. Python allows you to reimplement all of its most basic data structures in Python if you so desire. There's no real way to reimplement e.g. an array in C. The result is that Python would still be useful even if it didn't come with a bunch of data structures built in.


BYOD combined with "we don't need a package manager" sounds like a particularly bad combination. I understand Rust's "small stdlib but great package ecosystem", but what's the appeal of "anemic stdlib plus no package ecosystem"?

Maybe I would come up with something like this if I only ever worked on Linux software using C, but outside those niches these choices seem weird.


> but whats the appeal of "anemic stdlib plus no package ecosystem"?

You can declare your language feature-complete sooner.


Correct me if I'm wrong, but Hare doesn't say "we need no package manager" but instead "we don't need a new package manager, use the one your OS provides", at least from how I understand it.


That strikes me as another C-ism that didn't deserve to be carried forward into a 21st century language.

There's a reason we call it "dependency hell" and not "happy fun fussing-with-dependency-version-conflicts-among-projects-that-are-completely-unrelated-except-that-you-happen-to-be-using-the-same-computer-to-work-on-them time."


The state of the art in package management could be considered Nix and Guix, and those are both operating-system-level package managers. That's why they are better than npm or PyPI or Cargo: the language-specific package managers _don't_ handle dependency hell, because native dependencies exist. They all have the same "works on my computer" problem, and they also encourage an ecosystem of thousands of micropackages, which in a thread so focused on security seems a little ironic, given there is no way to guarantee all of those dependencies are made by good actors without a lot of vetting that just is not happening.


I've never had a "works on my machine" bug when dependencies are managed per project. The only time I have ever had them is with global/os-level dependencies. And I've used cargo/npm a lot in the last few years.


I do agree on the micropackages.

What I keep daydreaming of is a package manager that resolves and download packages, but doesn't automatically grab transitive dependencies. I don't even want it saying, "Hey, we need to grab these 15 other ones, too, is that OK?" I don't want to hide the pain of huge dependency graphs like that; I want to feel it acutely. Give me an error message saying, "Oops I couldn't add FancyPackage because it depends on X, Y and Z transitive dependencies," and send me on a fetch quest. And I want the whole community around the language I'm working in to feel it acutely like that. That way we're all in the same boat, and collectively incentivized not to create the problem in the first place.


That is on the library developers to settle out. Keeping your library API stable requires the mind of a genius and the discipline of a monk.

Now, keeping your language design stable and backward-compatible is a no brainer. Given that, it's possible to handle "dependency hells" simply by building from source.

Hare is designed to be ultra-conservative, so you don't have to worry about dependency hells. Features are in code, not built into the language.

It is a win-win situation.


That's a great way to make a program work only on specific distros.


Seems it fits Hare perfectly then, as one of the explicit goals of Hare is to only support non-proprietary OSes.


I'd guess it does the opposite?

If I'm writing for a proprietary OS, that's probably Windows or OS X. Both of which give me a more stable target to aim at when it comes to what dependencies I can expect.

If I'm writing for an open source OS, that's probably some flavor of Linux. But which flavor? And which package manager does it use? And do they support all the dependencies I need, in the versions I need? Without pulling non-standard package repositories into the mix?

When open source software has difficulty working on proprietary OSes, I think it's usually not because of the package manager. More often it's because of something like a glibc dependency. At which point the domain of support isn't really "open source OSes", it's usually something more like "glibc-based linux distributions."


How is being free related to having a specific package in the package manager?


Why? Proprietary OS have package managers.


How so?


Because you can't guarantee that every distro's package manager will have the packages you need.


Build them from source, then.

Hare is a systems programming language. What kind of a systems programmer doesn't know how to build from source?


So the solution is just vendoring? Vendoring is very clunky—that's why we have package managers in the first place.


Language designers have every right to avoid complex features if such a feature impedes the language in any way.

Hare does the smart thing: it decouples the library distribution model from the language.

After all, it's the programmer's job to make sure that distribution and packaging of his/her library is as simple as possible.


Every programmer should not be re-solving the problem of library distribution just to distribute their library. This is a solved problem; a go-style git repository model is the bare minimum that a language should have.


Having to download several dependencies from random websites and install them is not the most fun way to spend an afternoon. Especially if half of them have cryptic build errors.


So GitHub is a random website now?

I think that Hare's workflow and usecase encourages having very few or no dependencies.

That means, in most cases, you should just be able to download a project's source, build and use it right away.


Package dependency resolution is fundamentally impossible to do efficiently and often broken (for legacy reasons) in specific implementations. Programmers are terrible at semantic versioning. Community package repositories are security nightmares. Under those conditions not including a package manager seems like a reasonable decision.


I question the assertion that Hare's standard library is "anemic"; it includes a pretty decent set of batteries for many use-cases:

https://docs.harelang.org

Many Hare programs will not need to have dependencies at all. We encourage a more conservative approach to dependencies than is common in many modern languages such as Cargo, PyPI, npm, Go, etc. Even dependency-heavy Hare projects will not have hundreds of dependencies, but maybe dozens at most. Think more C, not Node.

https://harelang.org/distributions/


That does look like a decent collection of functionality, but a HashMap as mentioned in the submission seems like a pretty glaring omission. I can't remember the last time I wrote a program that didn't use a Map. Even dynamic languages that tend to be quite light on types almost universally include a map type.

I can definitely see the problems with the Node "explosion of modules" approach, but if we're looking at fixing issues with C then IMO "everyone implementing their own data structures and not always with the care and attention they deserve" is right up there with things like lack of proper arrays/strings and poor null handling.

It seems like it ought to be possible to find a middle ground where dependencies are easy to find, install, update and integrate with a standardised build system, but where there is a cultural norm of being conscious of dependency size and not using micro-dependencies.


Take a look at this comment where I explained why a first-class hash map is not a good fit for Hare:

https://news.ycombinator.com/item?id=31258759


Most applications' bottlenecks are not their linked lists, and optimizing for anything other than your bottleneck is not the wisest use of your time. Performance is a budget, and I feel comfortable spending some of that budget on simplicity.


Linked lists have O(N) search time, which is atrocious.

A really common optimization is to replace a linear scan on a list with a hash table. In most languages, that is a trivial step to do, which means that it gets done.

There isn't any overhead.

With Hare, you just blew your complexity budget on this thing.


It's not atrocious unless you need the performance. A linked list with a hundred items which is scanned every 90 seconds or something does not really demand attention for optimization, but it will be simpler, easier to write, easier to understand, and easier to debug, and those wins are not to be sniffed at.

And for the record, Hare does have built-in growable slices, so linked lists are pretty rare in Hare. They exist in a few niche situations -- I only ever wrote a Hare program with linked lists once.


For sure there's plenty of circumstances where you don't need more performance, but you can't seriously argue re-implementing another linked list is simpler, easier to write, easier to understand or easier to debug than `std::vector`/`Vec`. Multiple allocations and setting up pointers is certainly more complex than allocation/copy. It's most definitely easier to write because you're not writing it, someone else already has. It's easier to read because there's a whole lot less of it. And a single contiguous allocation is way easier to debug.


My impression, although it was difficult to be sure, is that Hare's "slices" (or maybe arrays?) are actually vectors, in at least some cases.

It seems you can grow them by appending, so they aren't like Rust's slice which is non-owning or a typical array which can't grow.

I suspect a Linked List is actually almost never the right shape for a data structure in Hare, because you should just use these vectors to do that.

The two actual rationales for linked lists in the 21st century (other than "I was writing C and I don't know what I'm doing") are: 1. Concurrency: lock-free and sometimes even wait-free algorithms are practical for linked lists, and the other costs pale in comparison on highly contended data structures; and 2. I'm doing some serious acrobatics with huge lists, I constantly perform split/merge operations, and so those being cheap is crucial. Hare doesn't have concurrency, and nobody should attempt such acrobatics in a language this dangerous.


Do you remember what the reason that you went with a linked list in that case? I’ve been learning hare and was curious about where to use them.

Also, did you use nullable pointers to represent the next node element in the linked list? Or a tagged union of (ptr | void)? Or something else?


Yep. I used it in a kernel to create a linked list of statically allocated data structures in a context where dynamic allocation was off the table. You can see the code here:

https://git.sr.ht/~yerinalexey/carrot/tree/master/item/conso...

Went with nullable pointer types, which is also a rarely used feature in Hare.


Cool, thanks for the link and info


But maps are simpler to use (if appropriately provided by the language).

Maps/sets are also fundamental building blocks for a lot of algorithms and structures; not having them makes it WAY harder to implement those. Actually, it's potentially impossible, as performance can easily degrade to a point where it's unusable when you use O(n) (or worse) lookups instead of O(1) ones.

This is not a case of premature optimization but of missing fundamental building blocks.

And if you don't want generics, OK, do it like Go initially did and make the map a special language thingy.


Linked lists have generally awful cache locality. If linked lists are not the bottleneck, it's because they already aren't being used for anything where performance matters (which is generally as it should be).


Why do people spend so much energy shooting down a language nobody uses? If you don’t like it you don’t have to use it. I would hate to be the creator of Hare and get so much crap for simply daring to exist.

He is making a language he likes and wants to use. There are probably some people who like it and find it useful. Good for them! A lot of us have dreams of building our own language. He is following his passions and people are just lining up to be Debbie Downers.

Now I almost feel like I need to learn more Hare to give the guy some encouragement and positive feedback on things done right. Be the glass half full guy.

Well I won’t. I think Zig will be the last C-like language I invest time in. It is so easy to get sucked into something new that will never go mainstream.


> Why do people spend so much energy shooting down a language nobody uses? If you don’t like it you don’t have to use it.

Because this language isn't made by a nobody. It's made by a popular person with a following, so the language _will_ get used by a non-insignificant number of people. And those people _will_ (re)implement all the things that you want a language to have (database connections, networking libraries, (de)serialization libraries, etc.). And these things, which can be performance/security-critical, will be impacted by the issues people are pointing out.

Once we reach this plausible scenario, it's likely that some software that you want to use on, say, your server, will depend on those libraries. And _then_, like it or not, _you_ will be impacted by the choices of the language and be subject to whichever bugs all those people made when reimplementing (in this case) hash tables.

We could assume that everyone writing libraries in this language will be an expert and not make mistakes, but that is just a non-starter. In a sufficiently-large ecosystem, the people writing code will come from all sorts of backgrounds and have all levels of expertise, so you _will_ get buggy code in important libraries, and those bugs are the stuff most CVEs are made of nowadays.

I get what you are saying, that people are free to develop a language as they choose, and I agree with you there---I'm building my own and it's _not_ a modern language---but there can be consequences in the larger industry, and that's the "scary" part.


> And those people _will_ (re)implement all the things that you want a language to have (database connections, networking libraries, (de)serialization libraries, etc.). And these things, which can be performance/security-critical, will be impacted by the issues people are pointing out.

This language is only available on platforms where you are free to choose whatever software you want. If you think a Hare program is insecure, don't use it.


What happens when your financial institution or your doctor, whose IT departments run Linux on their servers to handle your personal data, install such insecure software?


The same thing that happens when they use any other insecure software.


Usually we like for replies here to respond to what the person above wrote.


I don't agree with this. 'Important' software, like something you would want/need for your server, is not 'important' enough if it has a considerable amount of bugs, and therefore _will not_ be used by a significant number of people (and will not pose a real threat to the community at large). At least not significantly enough to explain all the hate authors and contributors are getting.

Let's say it's 2030, and I want to host my own Matrix server. I would first compile a list of all Matrix server implementations, filter out those with confirmed bugs, and only then choose one (at random or by some other criterion). The ones with bugs introduced by some library written in Hare/C/Rust/C++/X language will never be popular enough to even end up on the starting list, because people don't want to use buggy software (at least if there are options, and in a sufficiently-large ecosystem, there are options). And you might say that those bugs will go unnoticed - yes, but no more unnoticed than any other bug introduced by faulty logic but written in, say, Rust.

Also:

> so the language _will_ get used by a non-insignificant number of people

Hare is fairly opinionated (e.g. it mainly targets FOSS operating systems), so saying that it _will_ be used _just_ as a result of being related to ddevault (a person with a following) is a _wild_ guess. It does not matter how big of a following the author has: if the language itself is bad, it will not get used, or at least it will get forgotten soon enough.

> In a sufficiently-large ecosystem...

In a sufficiently-large ecosystem, performance/security-critical software will be created using a language that is used not because of the author's following… Again: if the language itself is bad, it will not get used, or at least, it will get forgotten soon enough.

You can write buggy code in _any_ language.


I call this 'bitcreep' (https://www.gwern.net/Holy-wars): the mere existence of a language or library will make coordination harder and gradually impose costs on the rest of the world and form a little bit of a gravity well sucking everyone towards it, for better or worse. If there are network effects and gains from networks, then there are likely anti-network effects and losses from splitting networks...


But why be the one to shoot it down?

If the industry makes mistakes, that's not news.


Imagine if we’d had the opportunity to complain about PHP or JS back in their early days, and had the opportunity to fix some of their awful decisions _before_ they became practically set in stone?


There is a distinct difference between PHP, JS and Hare. Hare is being developed and designed with a particular point of view by a developer and early adopters who are passionately supportive of that POV. PHP and JS were both thrown together with minimal direction and to achieve specific goals in as fast a time as possible (if I am remembering both languages' beginnings correctly). It could be possible that another language in the position of JS and PHP at its inception might be amenable to deep and fundamental changes; however, I do not think Hare fits this scenario.

If Drew and Co want to make Hare the way it is, awesome! If you don't want to use Hare and prefer something else, great!


Simply, it is a bad point of view, and deserves discouragement in exact proportion to the amount of promotion it gets. He wants it both ways, and does not deserve that.


What particularly are you objecting to about Devault’s POV? I mean most of this thread and the article are based on the lack of generic data structures or complaints that Hare is not Rust. The complaint about lack of built-in data structures seems to be focused mostly on developer comfort, with a sprinkling of issues concerning the possibility of bugs in implementations of specific structures. With regard to the continual comparisons to Rust, à la ‘memory safety’, if any model of software development outside of Rust’s limited paradigm is ‘bad’ then nearly all code in use today is ‘bad’. That seems to be an unsupportable position in the extreme.

I understand some people will not want or value any given piece of software or infrastructure, but the blanket pessimism is not proportional to the alleged detriments proposed by Hare’s mere existence.

Maybe I am just being overly sensitive to what I see as overreaction to someone’s programming language preference, but this is the reason I have never, and probably will never, release the language I made and use day to day to the public.


You cite a list of trivial cavils, which I have no truck with.

Hare's problems are much deeper. They stem directly from Drew's fundamental misunderstanding of the nature of software development. Drew is correct that managing complexity is central to the problem of software development. He is 100% wrong on how a programming language can contribute to solving that central problem.

There has been a great deal of progress, in the past five decades since C burst on the scene, on managing complexity in programming tasks, with programming language design taking a big bite. Hare adopts exactly none of that success, setting users straight back to the 1970s again. That might have been fine if demands on programmers were no greater than in the 1970s (although the bugginess of software coded then argues otherwise), but anyway we do not live in that world anymore. The programs we want to run don't fit in 64K of RAM and run single-threaded with no network connections, anymore. Where a 1970s language is used for programs today, we all suffer from the failings the language almost unavoidably invites into them.

The fundamental insight that Drew has wholly missed is that essential complexity demands more attention to get right than is typically available for one use case. We have learned that pulling complexity into a library component, or (better) a Standard Library component, or failing that a programming language feature, enables attention to be paid to it amortized across all the programs and libraries that depend on it. All the meaningful progress in programming languages has since been in finding ways to bring more of what must be expressed into one place so it may be more widely usable, and able to command correspondingly more attention to its performance and correctness. Standard library components in our modern languages successfully remove their complexity from what programmers must manage.

The ways we have discovered to help with this process include powerful type systems that have been put to work doing heavy lifting far beyond mere "type checking". We have generics to work with types the way types work with values. Certain languages provide extra tools such as destructors/Drop traits that enable automating resource management. Many languages assert direct responsibility for pieces of that task, imposing borrow checking or GC. There is much left to do.

Hare provides exactly none of the tools that have been discovered to encapsulate complexity into carefully correct and generally useful components, or to enable using such a component in all the different places and ways it would be useful. This failing is painfully evident in the library components it does attempt.

In the 1970s, Hare might have competed with C and Pascal to express the then newly valued "structured programming". Today it is a toy. Evaluated as a toy, there is nothing wrong with it. But it is not presented as a toy. As a tool for programming in the modern world, it is sorely lacking everywhere that C is sorely lacking, failings that make C directly responsible for the massive suffering documented in the litany of CVEs, botnets, database exposures, and ransom events.


"Hare provides exactly none of the tools that have been discovered to encapsulate complexity into carefully correct and generally useful components, or to enable using such a component in all the different places and ways it would be useful."

I would be very cautious when calling anything that encapsulates complexity "carefully correct" or "generally useful". Hiding such complexity behind language features is how we ended up with new generations of programmers who have no idea how to code anything themselves.

If generations and generations of Stack Overflow copy-pasters are the future of programming, then I don't want to be a part of it.

"In the 1970s, Hare might have competed with C and Pascal to express the then newly valued "structured programming". Today it is a toy."

Excuse me? The most useful, the most enduring, the most sane and the most understood coding paradigm is a toy to you? You know, some of us really like to think of our programs as many procedures being called one after the other.

That's the most logical way to think about them, because it's the exact way the CPU sees them.


Yes, a toy: inadequate for professional work. You can cut grass with scissors, but nobody who understands the task will hire you to do it.

CPUs do not, in fact, see procedures. They execute machine instructions. Your "procedures", exactly, 'encapsulate complexity behind a language feature'. A rudimentary feature, true, but one identically the same as is found in modern languages, among their more powerful features.

I am, indeed, always cautious about calling things "carefully correct and generally useful". Still, I am aware of many language features and standard library components in modern languages that fully satisfy those criteria.

Who pastes code copied from stack overflow? I assume that was a ham-handed attempt at an insult. You are welcome to continue fooling with toy languages, old and new. Pretending they are adequate tools for serious work says more about you than about them.


"Yes, a toy: inadequate for professional work. You can cut grass with scissors, but nobody who understands the task will hire you to do it."

Can you explain what you mean here? In the context of structured programming.

"CPUs do not, in fact, see procedures. They execute machine instructions. Your "procedures", exactly, 'encapsulate complexity behind a language feature'."

That's right, CPUs execute instructions. Those instructions are executed one after another, with jumps when a certain operation should be repeated or skipped. This is the definition of structured programming. Structured programming is a simple abstraction over how CPUs execute machine instructions.

I don't see any objects with methods there. Nor do I understand how a CPU would understand a lambda function or care if something is immutable or not.

"I assume that was a ham-handed attempt at an insult."

No insult there. It is just a sad reality how many developers rely on Stack Overflow instead of using their own head a little.


It is not, in fact, the definition of structured programming.

Structured programming was defined by Edsger Dijkstra, and consisted originally of a model of programming with restrictions on where branches may go. Later programming language constructs if/then/else and while implemented this model. It does not address subroutines.

I describe C and similarly primitive languages as toys because they do not provide the expressive power needed to enable the levels of productivity and reliability demanded of serious work.


I think the critique is more valuable for other readers, like the non-Hare users in these comment threads and in particular language designers, than for the Hare developers. If it can inform Hare development to make a better language, that's great too. If they choose their path ignoring it, that's their prerogative, but at least it was expressed and given a chance for adapting.


> But why be the one to shoot it down?

It's a very young language. One might consider that early mistakes can be addressed, but only if somebody speaks out about them.

I question framing this question as "shooting it down." The project will clearly survive this criticism. The article is by somebody who took the time to learn the language, and will probably continue to invest their time into the language. Criticism of this kind is a gift.


If the author of Hare didn't want feedback on his language, or didn't want other people to use his language, he shouldn't have advertised it in blog posts and on sites like https://harelang.org/, which is clearly an advertisement for the language. As it happens, he has.

You might think "nobody will use the language", but if nobody pushes back and says why people shouldn't use the language, perhaps people will start to. It's really important that people do write feedback pieces like this, because it helps other people be informed about their choices.


Also early on in the language seems a great time to respond to feedback. 3 years in when it has traction is terrible, heh.


So what you're saying is basically: "he's asking for it", right?


That counter doesn’t have much force to it when we’re talking about mere argumentation and not violence.


That's the point I make in the first paragraph, essentially. But the second paragraph is another reason.


Do you think that this type of argument has more value when talking about Drew than about a victim of sexual abuse? I think the rhetoric of "asking for it" works equally poorly when applied to all aspects.

And I will assume that your next comment will be about how the two don't compare. And I am in full agreement with that, except for the fact that trivializing abuse is another element that people use in this type of debate, and I believe that abuse is abuse. It might not have the same emotional impact on the victim, but the abuser psychology is similar in all cases, and the slippery slope of losing sensitivity to it begins with accepting discussion board bullying as normal.

PS. However this discussion is at least an order of magnitude removed from TFA, which is not abusive, merely presumptuous and premature.


> There are probably some people who like it and find it useful.

I've really been enjoying Hare. As a primarily Go/Rust developer, its syntax and limitations feel like what I've been looking for, and its youth presents plenty of opportunity to work on new projects. When I was working on an ANSI colors lib [0] last night I spent a lot of time in the community IRC. Your post is basically what everyone's feeling. This is a language designed for people who like its style. It doesn't need to have this or that feature. And if the community changes their mind, what better candidate than an open-source, tight-knit young language?

[0] https://github.com/tristanisham/color


We don’t need more memory-unsafe languages with no generics and no functional programming. We just don’t. If it were a toy project for learning and pedagogical purposes that’s one thing but for code that’s supposed to be used in production it’s useless.


You sure speak surely about "us" like there is just one group of programmers out there. Some people love JavaScript, some people love C. Some people only use programming languages, some create new ones.

Live and let live, don't tell others what they need.


Let me quote the part of the comment that you didn't read:

> but for code that’s supposed to be used in production it’s useless

This isn't about doing what you love - it's about engineering a product, a product that needs to have certain safety, security, and stability guarantees. That's the context of this conversation, and for that purpose, Hare isn't suitable.


Honestly memory unsafe languages shouldn't be used with exception of very few corner cases. It's the same reason why we no longer allow cars without seatbelts. You don't go around saying that seatbelts shouldn't be mandatory because you don't need them.


What's with this obsession about what other people do? You don't have to use a memory unsafe language, but others might.


Because software is generally written so that people use it. Memory unsafe languages have proven impossible to write secure software in, even for the best of the best developers. That's why I made the comparison with seatbelts. If you want to drive your car on public roads (deliver production software to users) you are not allowed to use memory unsafe languages (you must wear a seatbelt). If you just make software for yourself that you don't release to the public, no one will care if you write it in C or whatever.

It has nothing to do with anything you do. It has everything to do with the impact it has on everyone but yourself.


It’s also massively unsafe to drive in general, so we should probably ban the usage of cars wherever possible.


Driving also brings great value. Using C over another language "Because you can't tell me what to do" does not. The more apt comparison would be if we had fully self driving cars that are proven to make almost no mistakes. And then still wanting to drive a car yourself on public roads, endangering yourself and others.


C and C++ still see overwhelming usage in many industries because of the business value they provide in terms of performance and how quickly you can produce software in it. That does not seem likely to change any time soon- game studios are not widely adopting rust, for example, and I don't expect that to change, even though they impact a MASSIVE amount of people.

Hare's value proposition is simplicity- which allows you to better reason about the software you write, for one thing. We are each entitled to our own opinion about how valuable that is.


C and C++ are at diametrically opposite ends of the scale of value proposition. Listing them side by side eliminates any credibility your argument might have had.

Hare's "value proposition" of simplicity is, simply, a shuck. Any software effort has essential complexity that your language may help you manage, or chuck you in the deep water to sink or swim. Hare does the latter, and is even self-righteous about it.


I mentioned performance and game development specifically. I'm sure you're aware that it's common to write C++ that's very close to C while taking advantage of few features of C++. I don't think it's fair to call that the "diametric opposite of C".

Your opinion of Hare's view of simplicity is your opinion. Any software effort has some amount of essential complexity; in my experience many languages and tools add quite a bit of incidental complexity that could be avoided.


You can write bad code in any language. (We used to say "you can write FORTRAN code in any language", back when.) C code compiled with a C++ compiler is C, and bad. With the C compiler, you had no choice. You do not get that excuse when you have a C++ compiler.

Avoiding unnecessary complexity is everybody's responsibility. Dumping every last bit of unavoidable complexity onto the programmer where it has been long demonstrated that tooling can take care of much of it is simply irresponsible, and inexcusable.

I do not excuse it.


We license people to drive and have a legal system for dealing with irresponsible drivers.


Companies have policies for software security. Industries have regulations.


The most common policy will be for SOC2, ISO27001. There's virtually nothing about software security in there, other than scanning for vulnerabilities. Everything else is about controls on infrastructure, threat modeling, that sort of thing.

It's not really relevant.

Besides, none of those have to do with software engineers. Compliance doesn't imply any individual responsibility.


The most common policy for what? I've worked in defense and there are stringent security requirements for applications and servers, which programs written in manual memory management languages would likely fail. "Some existing policies are bad" is not a convincing argument that hare shouldn't exist.


The most common for companies to have/ be required to have.

> I've worked in defense and there are stringent security requirements for applications and servers.

Like? FEDRAMP? The vast majority of compliance is about threat modeling and access controls.


> Like? FEDRAMP?

No, STIGs.

> The vast majority of compliance is about threat modeling and access controls.

If compliance doesn't address security, that sounds like a bigger issue than a new hobby language existing.


The issue is that no one is taking responsibility for writing safe software. Compliance / regulation isn't, developers aren't either.


I can assure you, the automotive industry uses more C/C++ than you've ever seen. C is used precisely because it's a mature and verifiable language.


I'm not sure what your point is. I didn't claim memory unsafe languages aren't used; they obviously are. C is in fact not an easy language to verify. Trillions of dollars have gone into that and it's still not solved. You can't verify C. You might be able to verify a subset of C. But if you are in any position to choose any other language you should run, run, run quickly and far away from C.


My point is that your metaphor about "not [being] allowed to use memory unsafe languages" to build a car is completely backwards -- safety-critical embedded systems use C to avoid the abstractions introduced by higher-level languages.


It's not backwards. C is used out of necessity not because it's such a great language. Safety-critical embedded systems use C because they have to. Not because it's nice. Claiming they do because C is such a great language to write safety-critical software in is a special kind of Stockholm syndrome.


There is no such language as C/C++. Your statement is meaningless, as stated.


I’m allowed to have opinions about what tools should be used for what. I think it’s fair to say you shouldn’t write a production HTTP service in assembly, for example.


The problem with this is that you're now the Ministry Of Truth of programming tools.


You're allowed to disagree with me. I don't claim to be the ultimate authority and I have no power to enforce my opinions. If I were your tech lead and you wanted to use Hare for something I would probably say no but we could have a discussion about it and you might convince me otherwise. But that's not the context here. This is a programming forum, so I will speak my mind.


Doesn't even make sense. The Ministry of Truth is tasked with rewriting the past to match the present.


We all have opinions. The 1984 reference is unhinged.


> You don't have to use a memory unsafe language, but others might.

And we're telling them they shouldn't.


A person's actions sometimes have effects on other people. Such as not wearing a mask, for example. Or writing some piece of obscure but critical infrastructure with a memory vulnerability that causes all someone's precious apes to be stolen.


> Such as not wearing a mask, for example.

That's pretty rich


How so? I'm sure you take my point: that by indulging selfish wants, we can harm others. Private situations, do what you like; shared or social contexts, not so much.


Masks don't even work for many of the things people want them to work for.


Oh, hey! That makes them an even more perfect analogy. They help, massively, protecting others by removing many forms of transmission; but they don't prevent all transmission. Likewise, memory safety helps, massively, protecting others by removing many forms of security vulnerability; but it doesn't prevent all security vulnerability. And, crucially, not to get too far distracted from the original point: both are things we try to make use of where we can, not for ourselves, but for others and for the health of our overall communities. To really spell it out in words of one syllable: that's why people care about whether other people use memory safe languages.


All the while I have to use broken, unsafe software on a day to day basis I reserve the right to complain about things that make programs broken and unsafe


The vast majority of software that I use on a daily basis is broken because of logic errors and/or performance problems, not crashes due to memory unsafety. The former has nothing to do with memory safety, and the latter isn't solvable at all by programming language design.

I'm not saying that memory unsafety isn't a problem, but if you really want to make software better, you should address the biggest problems first.


I personally don't, yet they land on hardware devices and servers I am not asked about, and have an impact on my digital life.


> for code that’s supposed to be used in production it’s useless

Then ... don't use it. Others can use it and see what they learn from it, or even whether they find it productive to build stuff with. Even Brainfuck and Malbolge have devoted users who enjoy the challenges they pose, so I can promise you there's room enough in the world for this language. Let's just have some camomile tea and calm down a little.


Except maybe one day I'll be using some software written in Hare which will mess up and have negative implications for me.

Furthermore, it could be argued that our time could be spent more productively than on a language like Hare.


Then don't use that software. If you're right, software made with it will be worse - if even producible - and it won't take off. I'm sorry, but the ship has sailed: it is already possible to create bad software. The Hare language is not opening some hitherto-unopened Pandora's box. The same forces carrying good software to the top, and bad software to the bottom - and with them the tools used to make both the former and the latter - will continue to function.

As for whether our lives could be used better, well, let's leave that to each individual. I heartily encourage you to live your life as I do, worrying about what I myself am spending time on (currently: this inane conversation), rather than whether other people might be spending their own time in some supposedly-slightly-suboptimal manner.


For Christ's sake, you already use messed up software.

It is literally everywhere. Your kernel, your networking stack, your interpreters, your office suite.

Almost everything you use has some roots in C, a memory unsafe language that "messed up" and still became the most widely used language ever.

Software will always mess up, regardless of language it's written in.


What, in your opinion, does functional programming bring to the table for writing things Hare is designed for?


Clearly, it brings a warped perspective on how a CPU functions.

Do I really have to guess the order that my procedures are called? FP zealots say that I do.


Same things it does for Rust.


Why do you feel the need to mention functional programming in a thread about a clearly structured, systems programming language?

What makes generics and functional programming style mandatory for any given language?

These things just add complexity and philosophy to a field that values pragmatic thinking.

Functional programming is not pragmatic. And neither are the thousand ways to handle generics.


Not sure what you mean. Rust has no functional programming either ...


Functional programming patterns are very common in Rust code. It doesn’t have automatic currying or higher-kinded types but it has generic associated types, anonymous functions, closures, and a rich set of collections with functional processing methods in the standard library.


A lot of the standard library and an extensive part of the community ecosystem lives and breathes mutation at both the edges and in the internals, something people normally would avoid when trying to stick with functional programming. Sure, some mutation is always needed, but Rust wouldn't be the first language I'd pick if I want to focus on functional programming for sure.


No, but it’s widely acknowledged that functional programming features are useful even in imperative languages. Almost every imperative language that isn’t C, Go, or Hare has them. Java, C#, JavaScript, Python, Ruby. You should take a look at all the functions available on iterators in Rust if you haven’t. The Rust community tends to prefer using iterators to express things like looping when possible, which is a more functional style.
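To illustrate the iterator style mentioned above, here is a trivial example of my own (not code from the thread): the same computation written as an imperative loop and with iterator adapters.

```rust
// Imperative loop: sum the squares of the even numbers.
fn sum_even_squares_loop(xs: &[i32]) -> i32 {
    let mut total = 0;
    for &x in xs {
        if x % 2 == 0 {
            total += x * x;
        }
    }
    total
}

// The same computation with iterator adapters, the style the Rust
// community tends to prefer over explicit looping.
fn sum_even_squares_iter(xs: &[i32]) -> i32 {
    xs.iter().filter(|&&x| x % 2 == 0).map(|&x| x * x).sum()
}

fn main() {
    let xs = [1, 2, 3, 4];
    assert_eq!(sum_even_squares_loop(&xs), 20); // 2*2 + 4*4
    assert_eq!(sum_even_squares_iter(&xs), 20);
    println!("ok");
}
```

Both compile to similar machine code; the iterator version states the intent (filter, transform, reduce) rather than the mechanism.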


Mutation within a single program flow is a well-known functional programming pattern, e.g. as supported by the ST feature in Haskell. Shared mutability is much harder to reason about, so is generally excluded and Rust implements it separately.


Rust is a functional programming language, that's exactly what's implemented in the basic borrow checking rules. Interior mutability is added to that featureset, so as to support procedural programming with shared mutable state.


I can not agree with Rust being a functional programming language. I don’t think it has NO functional style idioms, but the borrow checking does not make it functional. That would imply that the only characteristic of functional programming is some sort of ‘variable’ mutability constraint. I may not be up to defining functional programming by exclusion, I think any definition of functional programming includes things Rust does not have.


It borrowed some FP concepts, it's influenced by FP but it's definitely not an FP language. It's very much a language about imperative, explicit control. And that's a good thing for the use-cases that Rust targets.

It doesn't have in-built immutable data structures; most basic data structures and types are mutable and have mutate-in-place methods as their primary way of interaction. Struct fields being private by default is a dead giveaway (unnecessary in an FP language). Closures and functions are not the same thing and you cannot do much with them except call them and pass them around.

I would rather describe it as an imperative language with lightweight OO (via traits) and some FP sprinkled on top. Note I'm not saying you cannot write FP code in it or that Rust doesn't have great unique benefits.


> It doesn't have in-built immutable data structures

Immutable data structures by default would involve avoidable overhead, so that's why they aren't a built in. There are well-known crates that do provide persistent data structures in Rust.

> Struct fields being private by default is a dead giveaway (unnecessary in an FP language).

Data hiding is just as useful in functional programming; in particular, it enables you to expose an interface to data that stays valid under any isomorphism, while preserving desired invariants. Structs with generally accessible and updatable fields are a special case.


You can't do proper functional programming without garbage collected closures. There is a reason why Haskell has a garbage collector ... and it's not "the borrow checker wasn't invented yet".


Haskell uses a garbage collector to enable closures to extend the lifetimes of values they capture within their context. You can do the exact same thing in Rust by using Rc or Arc, it just isn't required in most cases because the borrow checker can verify that the value is not outlived by the closure.
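A minimal sketch of that point (my own example; `make_closures` is a hypothetical name): two closures co-own a captured vector through `Rc` clones, so the data outlives the scope that created it, which is the job a garbage collector does for Haskell closures.

```rust
use std::rc::Rc;

// Two closures share ownership of one vector through Rc clones; the vector
// stays alive until the last owner is dropped.
fn make_closures() -> (Box<dyn Fn() -> i32>, Box<dyn Fn() -> usize>) {
    let shared = Rc::new(vec![1, 2, 3]);
    let a = Rc::clone(&shared);
    let b = Rc::clone(&shared);
    // `shared` itself is dropped when this function returns; the clones
    // captured by the closures keep the allocation alive.
    (Box::new(move || a.iter().sum()), Box::new(move || b.len()))
}

fn main() {
    let (sum, len) = make_closures();
    assert_eq!(sum(), 6);
    assert_eq!(len(), 3);
    println!("ok");
}
```

Reference counting is pay-as-you-go here: most closures capture by move or borrow and need no `Rc` at all, which is what the borrow checker verifies.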


This point has been made so many times that it gets tiresome. Perhaps that's ok, but it doesn't really tell us anything new.


> I would hate to be the creator of Hare and get so much crap for simply daring to exist.

Certainly.

Since the announce of Hare I've seen __too__ much criticism IMHO.

I think the authors have stated clearly the intentions and the features of the language: They just want a slightly better C. Of course you may not like it, but no need to be that harsh...


A hash table that compares hashes instead of keys, doesn't support deletion, doesn't distribute keys evenly etc. is not even wrong; criticizing it is definitely not "crap for daring to exist", especially in the polite tone of the linked article.

If Hare is intended to be a deliberately Spartan C replacement, so be it; but pretending that writing a hash table is easy looks like the immature and arrogant opinion of someone completely unaware of his ignorance.


> A hash table that compares hashes instead of keys, doesn't support deletion, doesn't distribute keys evenly etc.

I agree, that implementation is definitely flawed. Probably the author does not consider a proper hashtable that important.

> looks like the immature and arrogant opinion of someone completely unaware of his ignorance.

This is what I'm referring to when I say "harsh" criticism. Maybe he is wrong, but I think there are better ways of saying it: "Hey, maybe a proper hashtable is not that easy... have you considered [insert improvement proposal here]?"


Everybody can have a bad day and write terrible code, but blogging about it as a positive example and as evidence that a practically useful programming language doesn't need high quality generic data structures suggests a deeper problem at understanding the needs of programming language users.

Wrong questions (focusing on the narrow case of unique, overspecialized application-specific data structures), not only forgivable wrong answers (that hashtable from the Hare compiler, after all, could be close to good enough for its specific usage).

Personally, I think abstract data types based on some kind of macros or templates would be a valuable feature for Hare, even if their use in the standard library is limited, because someone will want to write reusable libraries.

Even within the confines of an application, there is a practical need to deal with different uses of a data structure in the same way and make changes in one place (e.g. "the generic hashtable in our ORM" ) rather than in possibly numerous quasi-duplicates (e.g. 250 different struct to struct hashtables representing database query results in a large business application).

C++ templates, a good example of what could be realistically offered by Hare, allow you to upgrade a definition from "do X with one type" to "do X with any suitable type", which is a useful abstraction even without reuse (what's actually expected or not of the type parameters becomes explicit), a technique to write general libraries "for free" (e.g. you can put whatever you want in our hashtable), and a technique to define general language foundations (e.g. you can have smart pointers to anything).


People will bash anything new that might spoil their favorite language's success.

So far, not many languages have measured up to C. Hare just might, with its right amount of simplicity and C's footguns gone.

So it's small surprise to see the Rust Evangelist Strike Force in action on these threads about Hare.


Unfortunately, the only footguns I see talked about in C are memory safety. Well, tbh I do see the occasional discussion of undefined behavior also. I think these are the main areas people are getting bothered over with Hare too.

A huge portion of the criticism at this point towards Hare has been summed up in three points: no ‘generics’, no memory safety, and a small stdlib combined with no package management.

But, I can imagine that if Hare had generics using template metaprogramming then people would complain about how language design has shown that templates are not good enough, so it needs to be not just a way to have compile time type specialization of data structure, but enable ad hoc polymorphism as well and of course that needs traits or typeclasses.

Further, if Hare used testing and fuzzing and modeling, then claimed to be within the bounds of some error percentage ‘safe’ from leaks, use-after and double free, out of memory conditions; the counter claim would be that you can not be safe, for ANY usage of the word unless you have ownership and borrow checking.

And finally, without a centralized, language committee backed and certified web based way to store and access libraries Hare will never be able to make usable and fit for purpose software, so a package manager is a must have to be worth even considering.

All that said, I don’t think it is exclusively users/promoters of Rust that are complaining about/disparaging Hare’s very existence. But it certainly seems like many, many people are convinced that their use case/product goals/preferences are the only considerations for any individual or group making engineering decisions.


You summed it up so well.

And those are the very same reasons Hare should remain a simple language.

You can never please the crowd, but you can please yourself and a couple of your friends.


The author of TFA isn't criticizing the language; he's criticizing a particular example, and he's claiming that hash tables are generally really useful.

In the example, maybe the static size of the table is fine, and maybe not being able to remove items from the table is fine, but unless maybe it's some sort of cache, it's almost never ok to assume that there will never be any hash collisions.
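For illustration, here's roughly what handling collisions correctly looks like in C: a fixed-size, linear-probing table that stores the key alongside the value and compares the actual key on every probe, rather than trusting the hash alone. This is a minimal sketch with invented names, not production code (no resizing, no deletion, keys must outlive the table):

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CAP 64

struct entry { const char *key; int value; int used; };
static struct entry table[CAP];

/* FNV-1a, a common non-cryptographic string hash */
static uint32_t fnv32(const char *s) {
    uint32_t h = 2166136261u;
    for (; *s; s++) { h ^= (uint8_t)*s; h *= 16777619u; }
    return h;
}

static void put(const char *key, int value) {
    uint32_t i = fnv32(key) % CAP;
    /* linear probing: skip occupied slots that hold a DIFFERENT key */
    while (table[i].used && strcmp(table[i].key, key) != 0)
        i = (i + 1) % CAP;
    table[i].key = key;
    table[i].value = value;
    table[i].used = 1;
}

static int get(const char *key, int *out) {
    uint32_t i = fnv32(key) % CAP;
    while (table[i].used) {
        /* compare the key itself -- two keys with equal hashes
         * must not silently alias each other */
        if (strcmp(table[i].key, key) == 0) { *out = table[i].value; return 1; }
        i = (i + 1) % CAP;
    }
    return 0; /* not found */
}
```

The key comparison in the probe loop is exactly what's missing when a table assumes "equal hash implies equal key".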

I mean, if you don't want to include hash tables in your language or library, fine, but using a bad example to attempt to justify it after the fact is... not good.


The primary reason I prefer working in C++ over C is that, for all of C++'s faults, it ships with a solid collection of containers and algorithms in the standard library. unordered_map may suck compared to other implementations, but it's a vast improvement over linked lists or home-grown hashmaps.

Anyone can write a hash map, in the same way that anyone can write an HTTP server over a weekend. Doesn't mean it will be production quality.

Glibc's obstack doesn't count, it's non-standard and the only type it supports is (of course) void*, which is a bad interface.


> it's a vast improvement over linked lists or home-grown hashmaps.

One of the easiest things to do is write home-grown hashmaps that perform better than std::unordered_map. I suppose you have to worry more about technical debt / bugs, but yes home-grown hashmaps are feasible and I don't think std::unordered_map is a "vast improvement".


Easy to write, sure. But every time a developer has to look at it for the first time, they'll need to learn this hashmap's interface.

Do I need to manage memory? Does it manage memory? How do I iterate? How do I add a new key space without thrashing the original implementation and making it twice as complicated?

That's the biggest thing a standard interface gives you. Even if it's less performant and not well-designed, I'll take it.

The C++ containers are actually well-designed and performant so it's a no-brainer if you get to choose.

Almost forgot the type checking you get with C++. That's also a major factor.


> The C++ containers are actually well-designed and performant so it's a no-brainer if you get to choose.

The STL doesn't allow the containers to be "well-designed and performant" other than std::vector which is trivial. People do what they can, but there just isn't much to work with because random coincidences about the data structures that were in mind at the time are enshrined in the C++ standard as API features.

std::unordered_map is obliged to be a bucketed map. This is a poor choice, but it's not optional because it's enshrined in the API even though you probably don't (and shouldn't) actually rely on this. Lots of the good options are drop-in replacements for std::unordered_map... unless you really depend on it being a bucketed map.

The other important thing to know if you've been relying on the standard library is that the powerful need for backward compatibility means it's not going to give you a proper hash, which is important for a proper hash map. std::unordered_map doesn't care that the standard "hash" provided isn't, but if you use a hash map written by people who explicitly told you to use an actual hash it's going to exhibit jaw-droppingly bad performance until you do so.

std::unordered_map isn't worse than a reasonably competent person's "my first hash table" but it's disappointing how little better it is than that after decades.

As with all performance, if you aren't measuring then changes are just wanking, but if you are measuring you will almost certainly find that swapping out std::unordered_map is worth doing.


To be fair, if you ever rely on the default hash it's already safe to assume you don't care about performance. There's no good 'default' for all input.


I agree with all your points except "are actually well-designed and performant". I mean they perform well enough, but when I can write simple hashmaps that anecdotally perform better than libstdc++'s (yes, with the same hash), I don't have a lot of confidence in that.


Keep in mind that

- those hash maps are optimized for large sets

- they work around several edge cases

- they implement way more features than an in-house implementation

- they need to support many types of hardware

A decent engineer can beat the std collections in a day on performance. But to do that in production over a decade with shifting requirements is a different ball game.


The one I cooked up was used to test some kind of game solver, so a 'large set' and it happily beat std::unordered_map, was a tiny bit slower than a Google hash map implementation.

But I do believe in edge cases, I'm sure there are some pathological edge cases (that don't usually matter, but deserve consideration).

I'm interested to know what 'way more features' are for hash maps/sets, because AFAIK C++ just supports the basic features you'd expect of the abstract data structure of a "map" with the algorithmic guarantees you'd expect from a hashmap.

And nothing I did was aimed at 'specific hardware' other than it vaguely benefitted from memory locality / avoiding thrashing the cache.


"Features" was probably the wrong word for me to use. What I was referring to is interface + customizability.

The STL map allows you to:

- specify the key/value types

- specify the equality predicate

- specify which hash function to use

- iterate over values

along with a host of other things that most users don't initially need but they might need in the future.

You could implement something that has feature parity in C++ or ideally find a battle-tested library that does.

Using C, however, you'd have to sacrifice type safety and possibly readability. Yes it can be done, and yes it can be done correctly. But it's tricky and difficult to ramp up on for the new developer. This is the one that you should be careful about adopting.

My philosophy is to use the STL containers because they're familiar and they're more than good enough for 99% of the use cases out there. Developer time has a premium, and this saves developer time.

If you need performance because you're constrained by CPU cycles or if you're running at scale (Google for example), by all means reach for something better.


It’s unfortunate that people are trying to revive a better version of the past, when the world has moved on and the past ultimately really wasn’t that great.

C is a terrible language for building things in the modern world. Not including the progress over the last 4 decades in your new language is a mistake.


Sorry, but I can't agree with the take that Hare doesn't include the progress over the last 4 decades. Sure, it doesn't include every new development, but that's impossible anyway unless you're fine with being a kitchen sink language.


It is great that Hare has tagged unions and non-nullable types, but polymorphism has been a huge part of language advancement.


Agreed, I don't know how writing anything in C today would be considered a productive use of time besides low level / embedded work.


Even in the embedded world, it’s becoming less justifiable to use C when your platform supports rust.


rust doesn't offer much value for embedded (mcu, bare metal, tiny stuff) imo.

No allocation = no leaks, no use after free, no dangling pointers.

No multi core = no atomics or concurrency to worry about.

Ring buffers solve 99% of my interrupt data sharing = no need to worry about synchronization.

Bounds/safety checks? Need to run and test release builds most of the time because of space/memory constraints.
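For context, the ring-buffer approach mentioned above usually looks something like this: a single-producer/single-consumer buffer where the ISR only writes `head` and the main loop only writes `tail`, so no locking is needed on a single core. A minimal sketch assuming a power-of-two capacity; real MCU code may additionally need compiler/memory barriers depending on the target:

```c
#include <assert.h>
#include <stdint.h>

#define RB_SIZE 16u              /* must be a power of two */

struct ring {
    volatile uint32_t head, tail;  /* free-running; indices wrap via mask */
    uint8_t buf[RB_SIZE];
};

/* called from the interrupt handler (producer) */
static int rb_push(struct ring *r, uint8_t byte) {
    uint32_t head = r->head;
    if (head - r->tail == RB_SIZE) return 0;   /* full: drop or count overrun */
    r->buf[head & (RB_SIZE - 1)] = byte;
    r->head = head + 1;                        /* publish after the data write */
    return 1;
}

/* called from the main loop (consumer) */
static int rb_pop(struct ring *r, uint8_t *out) {
    uint32_t tail = r->tail;
    if (tail == r->head) return 0;             /* empty */
    *out = r->buf[tail & (RB_SIZE - 1)];
    r->tail = tail + 1;
    return 1;
}
```

Because the counters are free-running unsigned integers, `head - tail` gives the fill level even across wraparound.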


> No allocation = no leaks, no use after free, no dangling pointers.

Often you don't allocate because it's not a necessity and a PITA due to C, but with rust I see no point in not using a tiny allocator if one has a few 100 KiB of RAM - totally depends on the project and target platform.

> No multi core = no atomics or concurrency to worry about.

There are interrupts though, and those need the same mechanisms as multicore and can be a PITA to manage in C. Besides that, there are quite a few cheap multicore embedded CPUs available now.

You can encode safety checks like pull-down/up clashes and the like nicely via the rust type system, that alone makes it a major benefit over C.

For example, from the rust embedded book:

    One can also statically check that operations, like setting a pin low,
    can only be performed on correctly configured peripherals.
    For example, trying to change the output state of a pin configured in
    floating input mode would raise a compile error.
-- https://docs.rust-embedded.org/book/static-guarantees/index....


It's a bit early, so there are currently lots of downsides, but some potential benefits:

- The package management is nice. Install a generic ring buffer or a board support package for the hardware you are using.

- Async-await can be really nice for dealing with hardware peripherals.

- Don't interrupts constitute concurrency?

- Reduced UB and footguns compared to C


Rust absolutely benefits embedded software. A big reason is that the type system allows libraries to let you enforce invariants at compile time.

For example, your i2c controller only supports two specific pins and they have to be in a specific mode? That can be a compile error if you make a mistake.


Configuring pins and peripherals with code is a chore, I use graphical tools for that.

But C has no problems abstracting away peripherals, just hide the register bits in a private TU and expose i2c_init(), i2c_write(), etc.
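A minimal sketch of that pattern, with made-up register names standing in for the real memory-mapped hardware (in a real driver, the prototypes would live in i2c.h and everything `static` would stay private to i2c.c):

```c
#include <stdint.h>

/* Stand-ins for memory-mapped registers; a real port would use the
 * vendor's register map, e.g. pointers to fixed addresses. */
static volatile uint32_t I2C_CR = 0;
static volatile uint32_t I2C_DR = 0;

enum { I2C_CR_ENABLE = 1u << 0 };

/* Public interface: callers never see the register bits. */
void i2c_init(void) {
    I2C_CR |= I2C_CR_ENABLE;               /* enable the peripheral */
}

int i2c_write(uint8_t byte) {
    if (!(I2C_CR & I2C_CR_ENABLE)) return -1;  /* not initialized */
    I2C_DR = byte;      /* a real driver would poll/wait on status flags */
    return 0;
}
```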


> just hide the register bits in a private TU and expose i2c_init(), i2c_write(), etc.

That's not a zero cost abstraction, unless these "functions" are actually provided in an include file. Embedded programming is often performance sensitive, so this can matter.


They certainly are when LTO and PGO are part of the picture.


It’s a chore because the alternative in C is to do things perfectly without the machine double-checking your work. That assumption does not hold with tools other than C.


Any platform that supports C supports C++. Rust might be a hard sell, but C++ offers all of its plausible benefits.


> besides low level / embedded work

And this is where Hare is meant to be used, is it not?


That reminds me that I had to reimplement basic hashmaps in C at least three times in my career. What a waste of time.


How much time did you waste?


I don't recall exactly, but at least a day each I would say? They were all performance critical, so had to put some effort. Two of them were for internal nginx patches, but were storing data a bit differently so code couldn't be quite reused. In languages with generics hashmaps are trivial to reuse with zero performance cost.


Is there one with enough parameterisation to choose between different hash collision / resizing / data layout choices? Or one where the various tree/trie/vector alternatives can be swapped in? Seems possible to write such, but std::unordered_map doesn't have it. Maybe rust?


Yes, Rust with const generics could probably do that, but I am not aware of an existing implementation.


One of the rare HN-appropriate occasions for me to say: user-name checks out.


Was there a good reason not to use Judy arrays?


Why would writing a Judy array three times have been better? Aren't they infamously difficult to implement?


There's a mature open source library for them packaged for Arch and other distros. Using that would be less work than trying to implement it from scratch.


Are you suggesting there aren't mature implementations of hash maps in C you can easily vendor?


What does “vendor” mean in this context? Is it just a synonym for “distribute”?


It means including a library in your codebase via manually copying the files in (but generally in their own subdirectory and without modifying them so that it is easy-ish to update to a new version of the library by simply copying in the files from the new version).

I believe the name comes from a tradition of putting such code in a directory named vendor/name-of-library or vendor/name-of-vendor/name-of-library, which allowed for distinguishing between the src directory, which was first-party code, and the vendor directory, which contained code from 3rd party vendors (often such code was proprietary and had to be bought/licensed).

Nowadays the term is used to differentiate between a manual approach and the use of a package manager.


It's basically the professionally sanctioned version of copy and paste. With all the good and bad that that entails.

Based on my long build times and erratic network behavior in CI I'm beginning to think every system should just vendor.


I was only suggesting what I thought was the easiest option. If others are even easier, then I'm even more curious about what might rule them out. I hope to learn something.


This reads like a rant of a person caught too much in their own bubble, a bubble in which hash maps seem to be of extreme importance. I've coded a lot of C in the past few years, and I can count the number of times when I would have actually needed a proper high performance hash map with all bells and whistles on one hand. A piece of code where hash map lookups are so frequent that it becomes a performance bottleneck is either badly designed, or a special case which needs a very careful implementation.

Builtin hash maps are convenient for "dictionary languages" like Python or Javascript, but beyond that they are highly specialized and tailored to specific data sets and hardware properties.


I knew a guy who only used PHP (and a little JS) and he literally did not know the difference between hash maps, linked lists, and arrays. The SOB is a millionaire now.


The author writes database engines.


A TreeMap fallback for buckets and/or some kind of random salt for hashes had to be introduced due to security issues (DoS) - I doubt anybody would remember about such stuff when making their own hashmap implementation.


Any links to learn more?


The average performance of hashtables generally depends on collisions being rare, but if the hashed index is trivially derivable from the key, then an attacker can supply data where everything hashes to the same index, regressing performance to the point that it's a DoS attack.

This is why cryptographically secure short-output hash functions like SipHash were developed.
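As a toy illustration (this is not any real library's hash), a trivially invertible hash - here just the sum of the key's bytes - lets an attacker mint arbitrarily many keys that land in one bucket, collapsing a chained table into a single long list:

```c
#include <stddef.h>

/* Deliberately weak hash: any permutation of the same bytes collides. */
static size_t weak_hash(const char *s) {
    size_t h = 0;
    while (*s) h += (unsigned char)*s++;
    return h;
}

#define NBUCKETS 8

/* Length of the longest chain if these keys were inserted into a
 * bucketed table with NBUCKETS buckets -- the quantity that degrades
 * lookups from O(1) to O(n) under flooding. */
static size_t max_chain(const char **keys, size_t n) {
    size_t counts[NBUCKETS] = {0}, worst = 0;
    for (size_t i = 0; i < n; i++) {
        size_t b = weak_hash(keys[i]) % NBUCKETS;
        if (++counts[b] > worst) worst = counts[b];
    }
    return worst;
}
```

Every permutation of "abc" sums to the same value, so all six land in one bucket; with a keyed hash like SipHash the attacker can't predict the bucket, which is the whole point.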


https://en.wikipedia.org/wiki/Collision_attack#Hash_flooding

Note that this is only a problem when an attacker can influence your hash table usage, and can be mitigated with hashing functions like siphash. This is not necessary for the majority of hash tables in the wild.


I don't get this guy. He's just raging.

I agree a hashmap is not simple and Hare's post is a bit naive, but I don't get what's his problem with the language not providing a default implementation. It's part of the language's design. It's targeting people that most likely won't need a hash table. It's not aiming to be a high-level batteries-included language like Java.


The problem is less that it doesn't provide an implementation, but more that it doesn't allow anyone else to provide an implementation either (aside via code generation).


I've been doing some work in C for the first time in a long time.

I'm constantly going back to Java to try things out, just because of the absence of convenient data structures. It's a pain to keep recreating them unless it's necessary.


I think that even the mere fact that there are a billion implementations of a hashmap kind of implies that a hash table _is_ a simple data structure - one that you may complicate in a bazillion ways in order to cater to some specific use case / requirement. But at its core, a simple data structure.


Why assume that? The specific points in the post appear to be common to all implementations: implementing hashing for example.


Fair point. I agree with you on that.


Maybe it's just the problems I work on and the code I write, but I use hash maps all over the place. Having to write my own sounds like a total pain in the ass, it'll almost certainly not be as performant and/or have some subtle bugs to figure out.

I guess I'm not the target audience for the language, but the lack of basic data structures like this adds a lot of friction for me.


I'm so bored with all this reactionary tech. I don't understand the urge to reinvent everything from first principles while ignoring the last 30 years of advances in computing. I was alive back then. Things weren't actually better.


I think it's simple: People don't want something like Python or Java, which improves on old stuff with a big runtime, and they don't want something like Rust, which improves on old stuff with lots of complicated static checking. You can't write an OS in Python, and Rust isn't simple, which these people really want. So throw away the runtime, throw away the fancy compiler checks, and you just have a slightly nicer C.


We seem to be very worked up over a virtually unknown new C-replacement language that's more likely than not to fizzle and become yet another 0.1% also-ran.


I feel like people are less getting worked up over Hare itself, and more what it represents. It is very much a reactionary language, defined less by any grand new ideas or visions and more by rejection of things that many people see as valuable progress over the last few decades. It doesn't help that it is being pushed by an already highly notable and controversial figure. The language just happens to be a convenient vehicle for arguing about these engineering (and personal) values.


Yesterday, while reading about Hare, I discovered the problem this blog post alludes to (Hare's own module code just assumes if the hash for two modules was identical, they're the same module) and I reported it to Hare's developer list.

https://lists.sr.ht/~sircmpwn/hare-dev/%3C20220503143516.19e...

I was sort of expecting Drew to... fix it? I mean, what you'd hope for is an updated blog post with a mea culpa in it, but I'm not setting my expectations so high. Fixing the bug seems like a pretty basic place to start though.


Thanks for the email - yours is one of 46 emails in my Hare queue, which is why it has not been addressed yet. I agree with your comments and have filed a ticket to address them, however, and I will answer your email in time. Thanks for your patience!


It can't be overstated how insanely hard it is to get maps right, and not getting them right (e.g. not having them, or problematic implementations) is prone to produce subtle security vulnerabilities in large projects - most of the time a DoS attack vector not caught by normal (D)DoS protection (especially with microservices they can be nasty).


What's the problem? People will just use libraries. Why does it need to be part of the base language? Seems like that made C++ quite a mess in fact, while NPM-like ecosystems have done well.


The problem is that you can't properly implement data structures for a statically typed language in libraries without some kind of generics. You can try with explicit casting to void* or whatever Hare uses, but that comes at the cost of type safety.


The entire post is a reply to [1], which mentions the exact code the OP criticizes. Drew said "Hare doesn’t provide us with a generic hash map, but we were able to build one ourselves in just a few lines of code", but the OP demonstrates that hash map is incorrect.

[1] https://harelang.org/blog/2021-03-26-high-level-data-structu...


So... people will just use libraries that have a correct implementation?

Lots of C++ compilers shipped STL implementations with atrocious performance (and outright bugs) for like a decade. It's not like putting something in the standard library guarantees correctness or performance.


> So... people will just use libraries that have a correct implementation?

The entire issue is that the language does not allow for that: like Go (pre 1.18) it does not have userland generics. And unlike Go it doesn't even have a builtin hashmap, only arrays and slices.


C doesn't have userland generics or built-in hashmaps, and there are plenty of hash map implementations in libraries out there (glib, apr to name just a few). Yes, they obviously don't have the type safety that generics enable. But the fact that an implementation of a hash map data structure was buggy doesn't mean that all other implementations will be. And, as those of us who used C++ back in the early '00s very painfully remember, generics support and an authoritative implementation don't guarantee you get good generic data structures, either.

Edit: plus, realistically now, the article's criticism revolves around shortcomings of a particular implementation of a hash table. If Hare had generics, and a hash map implementation in the standard library, and that implementation used the same hashing algorithm and made the same assumptions (e.g. no key collisions), all that criticism would still hold. You can find poor implementations of any data structure in any language.


> And the design of the Hare language doesn’t even allow me to provide that as a library. I have to fall down to code generation at best.


Here we go implementing Hare++!


How is it possible that the design of a language does not allow for external code/libraries?


The issue is _generic_ data structures, not external code. If I have to re-create a hash table for each scenario, I'll not invest the proper time to do so.

So if I need a hash table to map inodes to strings and another to map strings to IP addresses, I'll need to create that anew each time.

That means either writing a lot of code twice, or relying on manual code generation (which sucks for many reasons).
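A sketch of what that code generation looks like in practice: a C macro "template" that stamps out a typed map per key/value pair, in the spirit of BSD's sys/queue.h / sys/tree.h headers. All names here are invented, and the body is a linear association list for brevity (a real version would hash), but it shows both the technique and why people find it tedious:

```c
#include <stdint.h>
#include <string.h>

/* One definition, instantiated once per (key, value, equality) combination. */
#define DEFINE_MAP(name, K, V, CAP, EQ)                          \
    struct name { K keys[CAP]; V vals[CAP]; int len; };          \
    static int name##_put(struct name *m, K k, V v) {            \
        for (int i = 0; i < m->len; i++)                         \
            if (EQ(m->keys[i], k)) { m->vals[i] = v; return 1; } \
        if (m->len == CAP) return 0;   /* full */                \
        m->keys[m->len] = k; m->vals[m->len] = v; m->len++;      \
        return 1;                                                \
    }                                                            \
    static V *name##_get(struct name *m, K k) {                  \
        for (int i = 0; i < m->len; i++)                         \
            if (EQ(m->keys[i], k)) return &m->vals[i];           \
        return 0;   /* not found */                              \
    }

#define STREQ(a, b) (strcmp((a), (b)) == 0)
#define INTEQ(a, b) ((a) == (b))

/* The two hypothetical maps from the comment above, typed and distinct: */
DEFINE_MAP(str_ip, const char *, uint32_t, 64, STREQ)
DEFINE_MAP(ino_str, uint64_t, const char *, 64, INTEQ)
```

You get type safety per instantiation, but every improvement to the "template" must be re-expanded everywhere, and debugging inside a multi-line macro is painful - which is roughly the complaint.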


Right, I take my question back


No support for generics or custom allocators, combined with no desire to implement a package manager. The language author accepts that (1) Hare is just as limited as C and (2) C has no way of implementing a good hash map.

> Recall that Hare is designed to be similar to C in terms of scope and goals. C also provides no general-purpose hash map, and little by way of other data structures (though some attempts exist, none of them good).

So you're stuck using a crappy general-purpose hash map that you have to convince your linux distribution package managers to maintain.


Thanks for writing this up, though I feel that it may be a bit premature since at this point hardly anyone has any real experience writing Hare code. Regardless, I understand that this is a contentious design decision of Hare, so I would be happy to explain it in more detail.

We have discussed adding first-class maps to the language many times. We recognize the value in this feature and have tried to come up with a good way of doing it that fits within the design constraints of the language, but it has several design issues. There are three key problems: finding a good way of hashing arbitrary data structures (or arbitrarily limiting the kinds of keys the map can store), finding a good way of determining equality of arbitrary data structures, and dealing with memory allocation semantics. Hare does not have generics, and the alternative is a hands-free approach to hashing and equality, which has some troubling limitations, some of which are very unintuitive and non-obvious (which conflicts with our values for explicitness). The allocation problem is also troublesome: Hare uses manual memory management, and any hash map solution which involves magic behind-the-scenes allocations has serious design conflicts with Hare's principles.

This article criticizes a sample of a hash map implemented for the build driver. This hash map is used to store a mapping of module details keyed on the module's namespace. It is true that it is a fixed size map, and that collisions can easily be found for the fnv32 hash. These constraints limit the ability for this hash map design to generalize. However, this is not supposed to generalize. It's designed to be a special-purpose hash map for this specific use-case, and takes the simplest approach which is sufficient to this specific problem. As the author notes, there are many approaches to hash maps and there is no one-size-fits-all solution. So far as hash collisions are concerned, these are very unlikely in this use-case. This is not a hash map where hash flooding is a concern, and accidental collisions are so unlikely as to be a negligible risk. If this were not the case, it is easily fixed by just comparing each bucket entry by the namespace rather than by the hash. For use-cases where these things do matter, I would be interested in seeing something like siphash end up in the Hare standard library.

Recall that Hare is designed to be similar to C in terms of scope and goals. C also provides no general-purpose hash map, and little by way of other data structures (though some attempts exist, none of them good). Each of these approaches and concerns comes with different needs and trade-offs, and Hare places responsibility for evaluating these needs and trade-offs into the capable hands of the programmer. This is a reflection of Hare's values, which are distinct from the values of some other languages mentioned in the OP - Rust, Go, Zig, C++, etc.

Thanks for providing me the opportunity to clarify this here, though perhaps this merited a blog post rather than an overlong HN comment. I understand that this is a design decision which appears especially confusing from the outside, but know that it was made through careful deliberation and discussion.

Oh-- and a map-like thing for net::uri would be nice to have as a convenience function in the future. We need to implement the basic approach using the most fundamental design and then build the convenience on top of it, in order to accommodate all use-cases and leave the trade-offs for the user to make.


> We recognize the value in this feature and have tried to come up with a good way of doing it that fits within the design constraints of the language, but it has several design issues.

> Recall that Hare is designed to be similar to C in terms of scope and goals.

This sounds like the language inherits all the limitations of C and isn't able to provide sufficient expressivity to solve problems that you yourself consider valuable (which aren't that many considering the niche scope). People move away from C because of these limitations, how do you expect they would choose Hare as a replacement?


It does inherit some of the limitations of C, but in exchange, it also inherits much of the power. The trade-offs are similar to C, which is desirable in many cases. But, you may be working under different constraints, in which case you may be better off with another language - which is totally fine!


> So far as hash collisions are concerned, these are very unlikely in this use-case. This is not a hash map where hash flooding is a concern, and accidental collisions are so unlikely as to be a negligible risk.

The current compiler will silently ignore colliding hashes and (I believe) produce a very confusing error. The probability of this happening is not very large, but it's still large enough that someone will probably hit this error this year. At the very least you need to report hash collisions so that you can somehow rename your modules and so on.


Checking the bucket entries based on their module identifier instead of (or in addition to) their key is a simple and obvious improvement to this code.

https://todo.sr.ht/~sircmpwn/hare/679


> There are three key problems: finding a good way of hashing arbitrary data structures (or arbitrarily limiting the kinds of keys the map can store), finding a good way of determining equality of arbitrary data structures, and dealing with memory allocation semantics.

It's not unheard of or unreasonable to have collections that take an allocator as an argument. And a comparator and a hash function as arguments.

I think adding generic types and/or a notion of an interface might increase the complexity of the language, but it will also become far more expressive. And yes, these are tools that are sharp and require caution, especially when designing a standard library - you want to think hard about the constraints you'd impose on the consumers of your standard library when picking which interfaces require what methods etc, but I still think that there's not much to gain from a language without these features when compared to C.


Aye, we would have probably gone with a design similar to what you suggest... if we had generics, but we do not. Generics simply don't fit with the core values of the language. I still believe that there is room for languages without them.


Something like this:

    struct cmp
    {
       hash: *fn(_:*void) u64,
       eq  : *fn(_:*void, _:*void) bool,
       free: *fn(_:*void) void 
    }
This, maybe with some default options, should be enough for at least the basics. Note that the indirection overhead would be significant, but without generics or something similar, you don't really have a choice.
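For the curious, here is a C rendering of the struct above (names invented, fixed capacity, linear probing for brevity): the map takes the hash/eq/free trio at creation time and stores void* keys and values, which is exactly where the type safety leaks out:

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Caller-supplied operations, mirroring the struct cmp sketch. */
struct cmp {
    uint64_t (*hash)(const void *);
    int      (*eq)(const void *, const void *);
    void     (*free_key)(void *);   /* may be NULL for borrowed keys */
};

#define MAP_CAP 32

struct map {
    struct cmp ops;
    void *keys[MAP_CAP];
    void *vals[MAP_CAP];
};

static void map_init(struct map *m, struct cmp ops) {
    memset(m, 0, sizeof *m);
    m->ops = ops;
}

static int map_put(struct map *m, void *key, void *val) {
    uint64_t i = m->ops.hash(key) % MAP_CAP;
    for (size_t p = 0; p < MAP_CAP; p++, i = (i + 1) % MAP_CAP) {
        if (!m->keys[i]) { m->keys[i] = key; m->vals[i] = val; return 1; }
        if (m->ops.eq(m->keys[i], key)) {       /* replace: free the old key */
            if (m->ops.free_key) m->ops.free_key(m->keys[i]);
            m->keys[i] = key; m->vals[i] = val; return 1;
        }
    }
    return 0; /* full */
}

static void *map_get(struct map *m, const void *key) {
    uint64_t i = m->ops.hash(key) % MAP_CAP;
    for (size_t p = 0; p < MAP_CAP; p++, i = (i + 1) % MAP_CAP) {
        if (!m->keys[i]) return NULL;
        if (m->ops.eq(m->keys[i], key)) return m->vals[i];
    }
    return NULL;
}

/* Example string operations (FNV-1a hash + strcmp equality) */
static uint64_t str_hash(const void *p) {
    uint64_t h = 1469598103934665603ull;
    for (const char *s = p; *s; s++) { h ^= (uint8_t)*s; h *= 1099511628211ull; }
    return h;
}
static int str_eq(const void *a, const void *b) { return strcmp(a, b) == 0; }
```

Every caller must remember which concrete types are behind the void pointers; the compiler won't help if two maps get mixed up.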


The problem here is *void, which is not great. You lose type safety with this approach, and the allocation problems are still present. Rolling your own can be both more type safe and better suited to your problem area, and it's not particularly difficult.


What about a use case where you want to use a hashmap to store 3 different types of keys independently. Would you implement the hashmap once and just store a union of the 3 types and then assert the correct type has been stored at each call site? Would you implement the hash map 3 different times? What happens if later on, some years later, the code needs to change and another type needs to be stored in a map? Generalizing a map via void pointers is the most pragmatic solution.


It depends on your specific needs and is difficult to characterize in hypothetical. I imagine that it would probably involve Hare's native tagged unions, though.


> I still believe that there is room for languages without them.

In the same sense that there's room for languages without, say, for-loops.


Why not support limited parametric polymorphism for things like equality and hashing? These are well studied problems, seems a bit convoluted to reject existing solutions.


I think that there is a problem here with regards to what you are trying to do.

You are conflating the _implementation_ of a hash table with the _existence_ of a hash table.

You list a few problems:

Computing the hash for an arbitrary structure & checking equality is something that you can do by adding a func in the creation of the map.

That will likely add overhead (since you can't inline it, and most of them are trivial), but at least you'll have something.

I'm not sure how memory allocation for a hash table is any different than the usual. Memory ownership semantics are likely relevant, so you'll likely need to add a way to provide a free() routine here, as well.

A common example may be a `map<string, file>` - where you need to deallocate the string and `close()` the file.

This gets interesting when you do a set over an already existing value, by the way. And certainly a challenge you'll need to deal with.

My issue isn't so much with the complexity of the solution. Design choices such as not having generics will have impact on performance. But having _a_ solution is still a baseline requirement in my eyes. The original post said that you can just roll your own, except that you really can't. Certainly not over & over again.

With regards to the hash table I criticized, that is exactly the point. You were able to "get away" with implementing a pretty bare bone thing, but you called it out as something that should be viable in general.

Hashtables are in _common_ use. They aren't something that you'll use once in a blue moon. Go's decision to provide dedicated syntax for map wasn't an accident.

And for certain scenarios, you may need to use a different implementation to get the best results. But for the vast majority of scenarios, you just want _something_. And whatever you have in the box should suffice.

That is because a generic data structure has a lot of work done on it to optimize it for a wide range of scenarios. Conversely, a one off implementation is likely to be far worse.

There are also security considerations to take into account. You are not likely to properly implement protections against things like hash poisoning attacks, etc.

And sure, it's "easy to fix", but you have to remember to _do_ that, each and every time you write this code. Because your approach hands that responsibility to the user.

And your language design means that you _cannot_ actually solve that properly. That is not a good place to be.

C was designed in the early 70s, and it is very much a legacy of that timeframe. Today, there is virtually no system beyond hello world that you can build without hash tables galore.

The web is full of those (response and request headers, query string parameters), data formats (JSON is basically a hashtable plus some fancy pieces), caches (that are just hash tables), etc.


>You were able to "get away" with implementing a pretty bare bone thing, but you called it out as something that should be viable in general.

I can see how you had this impression from the language of the blog post, but what I meant to express is that the approach generalizes, even if this specific code does not.

>The web is full of those (response and request headers, query string parameters), data formats (JSON is basically a hashtable plus some fancy pieces) , caches (that are just hash tables), etc.

The web generally falls well outside of the target use-cases for Hare.


My future perfect "practical" programming language will have type inference, strong typing, and intrinsics for misc data structures. In other words, looks kinda like JavaScript, behaves kinda like Java.

This:

  foo = { "apples" : 100, "bananas" : 20 }
  bar = [ 100, 200, "oops" ] // error
Instead of this:

  foo = new hashmap<string,int>
  foo.put( "apples", 100 )
  foo.put( "bananas", 20 )
  bar = new array<int>
  bar.add( 100 )
  bar.add( 200 )
  bar.add( "oops" ) // error

I believe, but cannot yet prove, this would eliminate 98% of the (popular) pressure (desire for concision) for adding generics.


Is there any field or subject where people don't routinely take a dump on each other's work? Or am I just being oversensitive, too easily vexed?


It's 2022


Is it too late to rename the language to Ostrich?


Sounds a bit... hare-brained. :^)


It's easy to flame young language projects; I don't find that compelling. I don't find the original description compelling either, but that's not to say there won't be some kind of support for generic programming or correct hash tables in their standard library (with all the bells and whistles that you need, like randomized insertion order, reasonably fast performance, perhaps optimization for ordinal keys like a btree).

What does throw me off about some language projects is the rejection of features or compiler techniques as "complex" — despite decades of research and experience in other languages showing that some of those features are actually super useful (and showing ways to get them wrong or right!). I'm not sure if this is where Hare has landed on generic programming, but it's an ethos I see in a lot of "C but not C" languages that I don't think is a sound approach to language design.



