The Architecture of Open Source Applications (aosabook.org)
423 points by pcr910303 57 days ago | 37 comments


- SQLAlchemy by Michael Bayer: https://www.aosabook.org/en/sqlalchemy.html (nice observation of core language on top of SQL, then the separate declarative/orm system)

- LLVM by Chris Lattner: https://www.aosabook.org/en/llvm.html

- Audacity by James Crook: https://www.aosabook.org/en/audacity.html (interesting mentions of wxWidgets)

Note that pretty much everything in volumes 1 and 2 is interesting, though!

What would be nice to see, non-exhaustive:

- Blender: Especially 2.80, an analysis of its GUI/widgets, integrated python scripting, and everything else (wiring into graphics cards, rendering, shapes). There's a lot of stuff that could be covered

- Gimp / Inkscape / Krita

- Saltstack / Ansible / Fabric or PyInvoke : These have nice abstractions on top of system functionality / programs and delivery magic that'd be fun to go into


- Tiling window managers, e.g. i3, swaywm, awesome

- Wayland, Xorg

- Kdevelop, VSCode, Atom, Electron

- React, Vue, Angular (it'd be nice to see more js)

- Android SDK

- Unity3D SDK, Unreal SDK

- .NET Core

- black (python formatter), clang-format, prettier

- docutils, sphinx, pandoc

- systemd, mesa, cairo, pulseaudio

- implementations of python (cpython), typescript, lua, luajit, haskell ghc, swift

Loved the bash one. Loved the time aspect to all the stories - early decisions and then the way they played out over time. It's hard to simulate various architectural decisions, so it's a super informative read for both good and bad outcomes.

Really curious why you'd want to study these unless you already had strong fundamentals in application architecture and are not afraid to expose yourself to antipatterns. GIMP/GTK in particular are a cornucopia of terrible ideas stemming from ignorance and prejudice.

This wasn't necessarily the most constructive thing to say, but it unintentionally invites a silver lining: studying failure is the most effective route to success. This is as true in software as it is in operations, economics, finance or pretty much any other discipline.

Studying failure is better than not studying, but it can’t hold a candle to studying success.

As an extreme example, seeing ten spaghetti codebases does not teach you much more than the first. But expanding your mind by studying new patterns and paradigms, skillfully and elegantly woven, will build upon everything you’ve already mastered.

Absolutely not true. I've learnt more from my failures than my successes. I'm more interested in other people's failures than their successes. Understanding why a conceptual error was ever allowed to be realised is very instructive and a very cheap lesson (if it's something you weren't involved in!).

Seeing a clusterfuck of microservices blooming darkly in the midst of a project, watching time being wasted sorting it out every bloody time more code was checked in, sucking precious time from the rest of the project - you can learn a lot immediately.

Lesson: don't use them without good reason.

Or seeing a function that takes a geographical area and returns a list of progressively larger areas that enclose it, e.g. given Buenos Aires, it returns Argentina, South America, America, the world. But note that the function does not return its input (Buenos Aires), which caused every bloody call of that function to have extra code to add it back in because it was needed.

Lesson 1: the output of a function like this is always likely to need its input for our app, so provide the input with the output.

Lesson 2: If it's evident a function isn't quite adequate, provide a wrapper function that does what's needed so the users don't have to write workaround code every time.
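
The two lessons above can be sketched in a few lines of Python. This is only an illustration: the PARENT table and the names enclosing_areas and area_with_ancestors are hypothetical, not taken from the code base being described.

```python
# Hypothetical containment data: each area maps to the area directly enclosing it.
PARENT = {
    "Buenos Aires": "Argentina",
    "Argentina": "South America",
    "South America": "America",
    "America": "The World",
}


def enclosing_areas(area):
    """The function as described: progressively larger areas, input excluded."""
    result = []
    while area in PARENT:
        area = PARENT[area]
        result.append(area)
    return result


def area_with_ancestors(area):
    """Wrapper (lesson 2) that puts the input back in front (lesson 1)."""
    return [area] + enclosing_areas(area)


print(enclosing_areas("Buenos Aires"))
print(area_with_ancestors("Buenos Aires"))
```

Callers that need the input use the wrapper; nobody has to re-add it by hand at every call site.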

> skillfully and elegantly woven

(giggle). Not on my planet!

Could you elaborate a bit? I don't know much of their code base (but I've used it as a user) and I'm curious as to what is done wrong.

GTK and GIMP both suffer from the original decision by people who don't understand C++ to reinvent it in bare C except very poorly. It's what you get if you heard a rumor that vtables are bad somehow but don't realize that an array of function pointers is worse in every way.

Having worked with GObject extensively, including through PyGObject, I can say the experience is infinitely better than trying to make C++ interoperate with e.g. Python.

(This coming from someone writing C++ since 2002)

That makes sense since the implementation of things in GObject is basically the same as the internal architecture of Python. Since it's all just named properties and function pointers there's no chance that the compiler will rearrange it and break introspection. On the other hand there's also no chance that the compiler can optimize a GObject program, so you've traded good performance for an easier-to-use FFI for Python which might not be the trade I would have made. They've also traded away core developer productivity in order to make foreign language developers more productive, again a tradeoff I might not have made.
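
For readers unfamiliar with the idea, here is a rough Python analogy of an object model built from "named properties and function pointers". It is a toy and deliberately not GObject's API: ToyType, ToyObject, lookup and call are made-up names, and the real system has far more machinery (signals, interfaces, ref-counting).

```python
# Toy object model: a type is a table of named callables ("function pointers"),
# an instance is a bag of named properties. Analogy only, not GObject's API.

class ToyType:
    def __init__(self, name, methods, parent=None):
        self.name = name
        self.methods = dict(methods)  # method name -> callable
        self.parent = parent

    def lookup(self, method_name):
        """Resolve a method by name at run time, walking up the parent chain."""
        t = self
        while t is not None:
            if method_name in t.methods:
                return t.methods[method_name]
            t = t.parent
        raise AttributeError(method_name)


class ToyObject:
    def __init__(self, toy_type, **properties):
        self.toy_type = toy_type
        self.properties = dict(properties)  # named properties, plain data

    def call(self, method_name, *args):
        # Every call is a by-name lookup, so nothing can be resolved ahead of time.
        return self.toy_type.lookup(method_name)(self, *args)


def describe(obj):
    return f"{obj.toy_type.name}({obj.properties})"


def area(obj):
    return obj.properties["width"] * obj.properties["height"]


widget = ToyType("Widget", {"describe": describe})
box = ToyType("Box", {"area": area}, parent=widget)

b = ToyObject(box, width=3, height=4)
print(b.call("area"))
print(b.call("describe"))
print(sorted(b.properties))  # introspection: property names are ordinary data
```

Because the layout is just tables of names, a binding for another language can enumerate and call everything at run time. The flip side is the one mentioned above: each call goes through a lookup the compiler cannot see through.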

So is it a cornucopia of terrible ideas, or just a platform that has different goals and therefore tradeoffs?

> Having worked with GObject extensively, including through PyGObject, I can say the experience is infinitely better than trying to make C++ interoperate with e.g. Python.

Have you tried SWIG?


My current favourite is pybind11. I wouldn't wish SWIG on my greatest enemy :)

> It's what you get if you heard a rumor that vtables are bad somehow but don't realize that an array of function pointers is worse in every way.

No, it's what you get when you need an object system that works across FFI boundaries.

GObject is a ref-counted object system that is designed to make interop with all sorts of languages possible across FFI boundaries, including those with runtimes that include tracing GCs; such a thing is not built into C++.

If the GIMP authors had used C++ in 1995, they would be stuck with a legacy codebase now that would be ridiculed because it's not "modern C++" - a term that has designated about 3 different programming styles during that time.

Even the Qt developers found that 1995 C++ was insufficient for a UI toolkit and augmented it with a custom "moc" pre-compiler and build tools.

What's wrong with gimp? Been using it for a decade with no trouble.

Poor/questionable [initial] architectural decisions at a codebase level don't mean that the end product doesn't work (WordPress IMO being a very good example of this -- famously crufty codebase but the product definitely does the job). Just means it's not _necessarily_ a great subject for an architectural breakdown.

WordPress is a fork of b2.

There are many open source projects that are terrible codebases but have great support.

Example for me... lwIP. One of the worst APIs I’ve ever used, but it works damn near perfectly once you hack through it.

Maybe that says something. It certainly doesn’t make the code or behavior any easier to understand, though.

I guess that’s why many companies are going with the “code is meant to be read, not written” thing

Until rather recently it took a PhD in computer science to draw a circle in gimp... I still love it though.

This book is such a gem. I really liked the one about async by Guido

- PostgreSQL

It's not the same format, but here's a few slide decks from Bruce Momjian's site, describing PostgreSQL's internals:


Redis would also be nice

missing SQLite

The SQLite guys have unsurprisingly already done a really great job on that topic.


This is interesting, however I'd say most of the software on this list is not relevant for 99% of software architects / engineers (i.e. all of us working on web / server side applications for BigCorp).

It's not the fault of the authors of course, in my experience there's not many organisations interested in explaining stuff like this out in the open.

It would be great to see the architecture of:

- a large scale distributed system

- a typical web app with UI, web tier, data tier, maybe with some eventing thrown in for good measure

- maybe anything else that needs to talk across a network all the time

Nick Craver from Stack Overflow does a great job of explaining how SO does it; any other good examples would be great.

I have found StackShare to be excellent for getting a bird's-eye view of the stacks of all the big companies. They also write on their blog in more detail, usually in coordination with the companies.





I would particularly like examples of architectures that are not 1000 reads per 1 write. A lot of business software is interaction heavy, so designs that support loads that can't be waved away with caching would be good to observe.
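
A back-of-the-envelope way to see why the read/write ratio matters so much, under a deliberately simplified model: only reads can be served from a cache, every write reaches the data tier, and invalidation cost is ignored. The function name and the model are my own assumptions, not something from the book.

```python
def max_cacheable_fraction(reads_per_write, hit_rate=1.0):
    """Upper bound on the share of all requests a read cache can absorb.

    Simplified model: each write always hits the data tier; only reads
    can be cache hits, at the given hit rate.
    """
    reads = reads_per_write
    writes = 1
    return hit_rate * reads / (reads + writes)


for ratio in (1000, 10, 1):
    share = max_cacheable_fraction(ratio)
    print(f"{ratio}:1 reads per write -> at most {share:.1%} of requests cacheable")
```

At 1000:1 almost everything can be absorbed by a cache; at 1:1 at least half of the traffic lands on the data tier no matter how good the cache is.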

Implementing QuantLib is also an interesting read on how the open source C++ library was built.

Although it is a library for a small niche, it is interesting how Luigi Ballabio explains the process, with some comments on why some decisions were taken and on some corrections/modifications that happened in the library.


Great article on Continuous Integration: http://www.aosabook.org/en/integration.html

I’m confused. Is this an ad?

IMHO a bit. It's a CC BY licensed book, but part of the revenue is for the OSS projects who participated. You can get a download at no cost.

It's a book.

You can click on the links and view the content. If you want the PDF or the printed book, then buy it. This is a really interesting find.

Even if it's an ad, what's the issue?
