The Architecture of Open Source Applications (aosabook.org)
395 points by letientai299 on Aug 31, 2020 | 67 comments

I've had the idea for a while to start a club to read through open source projects, compile them, extend them, note architectures and design decisions, look at documentation and CI/testing organization, etc.

Do something like one project per month with one topic among the above per week. Haven't found the right group of folks to experiment on with this yet.

Edit: Send me an email if you'd like and I'll think about hosting this virtually.

Do it yourself on youtube. Have a patreon page. Go over the code yourself to get your impressions and then interview the main contributors about the project.

Done right I think that may go somewhere.

I would pay for this. Especially if it’s done for small to medium-sized Go projects.

I would definitely subscribe




On a similar note, I've thought about putting together a presentation on debugging and understanding large codebases by fixing a bug on a large project whose source code I've literally never seen before. I figure that by picking a codebase I have 0 experience with, there's less risk of skipping over steps.

The biggest hurdle for me has been actually finding a project I'm interested in where I haven't yet gone poking around the source code for some reason or another.

Yeah, I've also been interested in writing more about orienting yourself in novel codebases. I don't think it's explored very much.

It's a pain to write about because you have to find some contrived error (real or not) in a contrived app. I've got a draft of a post but without the examples filled in.

Would like to read (and share) what you've got.

It's not explored very much because most new developers try to push for greenfield development and they're trying to split everything up even when it doesn't need it.

Now we have multiple codebases to comb through.

So true. Also with respect to documentation, which you have to put in a blog rather than contributing it to the project where it can be maintained. It's bizarre to be literally the last port of call for work you've actually done, when people would rather refer to obsolete blogs on things you addressed years ago.

At a previous job of mine modifying large and unfamiliar codebases was a major aspect of our team's role. I always thought it would be a good way to interview candidates, but never got to try it out.

Yes, teaching that sort of thing is sorely needed in my area of research computing support specifically, and something I've wanted to do. Unfortunately the collaboration I'm involved with only seems concerned with regurgitating Software Carpentry etc., so users don't get taught about programming and maintenance. There's a lot of talk about "software sustainability" that's often at odds with what I've learnt over decades practising it, and dogma which would have killed the most successful long-term collaborative project I've been involved with.

I'm not sure it needs de novo examples, just general advice and techniques from long experience.

I haven't found much about this topic and I agree there's a need for more info on that.

I remember finding this talk useful and general good advice: https://www.youtube.com/watch?v=wN4ZuGruiNw but would love to learn more about how others do it.

ardour.org via https://github.com/Ardour/ardour is here for you, day and night :))

Yes please

Me too. I actually did this with the discourse.org codebase. I can make a YouTube video about it if anyone is interested. I spent over a year digging into it.

Ooh, can you do ours? It’s a fair bit more complex and general-purpose than Discourse, though.

Would be great to see your impressions as you dig into it and maybe record yourself on youtube. Any way I can be of assistance, let me know.


Qbix looks interesting, though this is the first time I’ve seen it mentioned. Is it getting any traction?

It looks really nice as a platform. Regarding traction it could maybe benefit from an integration with the Fediverse.

I'm definitely interested!


A friend and I started having those, and we've been thinking about streaming them, but they always end up being quite chaotic. It'd be very interesting to figure out how to make them fluid enough and yet easy to follow for more watchers.

We both do some pre-reading on the topic, then "pair program" through it, changing some bits and running the code to see if the current mental model still holds. The person who drives has to be very verbose in explaining what he's doing ("Let's go to this function definition, ok, we see this struct here, let's leave a mark for later and visit that struct"), because otherwise the one watching is like "wtf are we looking at now".

We're currently going over emacs' native compiler branch and it's been lots of fun so far. The conversational interaction of Lisps makes those explorations very lively.

I'd be interested in joining, of course :)

Understanding large codebases is my biggest weakness. I would definitely be interested.

I don't find them interesting just because they're large. I've worked on some large projects and sometimes, it was just a lot of code because there were loads of business exceptions to this and that.

Are you looking for a book-club-like group to hop around with, or for some projects eager to have you?

That would be really cool. Maybe live streaming the content?

At this point, the field of programming is broad enough that we really should have a moral equivalent of Comparative Literature and we don't.

Me too. I've not found the trigger to keep doing it on my own, so a group might work.

Very much interested. Email sent.

Sounds like fun! If you end up putting this together, I'd definitely be interested

Very interested, would love to be a part of this.

Email sent. Thanks! Sounds interesting

This sounds really interesting!

I’m interested; email sent!

Great idea. Sent email

I’m interested in this

Sure, I'm also interested

Please submit a "Show HN" once this launches. I'd joyfully watch any recordings.

FWIW: I started a study group in the late 90s, which continues today.

Our original topic (theme) was Design Patterns. Took three attempts to become self-sustaining. Mostly for an in-group to congeal. We had to experiment with the format, work out the details. Stuff like meeting frequency, best way to do the food, delegating tasks.

After each book, the group gets together to choose the next book. Over the last 20+ years, we tackled most vogue programming languages, Prolog, SICP, concurrency, ML & big data, k8s, methodologies, algos. (One notable omission is the consensus algos.)

So here's my unsolicited advice:

#1 As soon as you can, delegate the discussion leader task, then step back and focus on coordination (hosting). Once you have figured out a mostly working format and shown everyone how it's done, hand it off. Ideally, do round robin with the participants. After a while it becomes clear who's good at discussions.

The first reason is that coordinating is a full-time job in itself. Preparing to lead a discussion is a lot of work and a different headspace: reading the source material, doing background research (reading other reviews of same), distilling 3-5 essential open-ended questions about the chapter, etc.

Second reason is emotional engagement, investment. Everyone has to pull their weight. Just like with a church, there's no commitment without sacrifice. You're hosting a discussion group, not a seminar.

#2 Experiment, have a plan, make decisions democratically, do post mortems after each series. Ask each participant to write what worked, what didn't, what they'd change (aka The Good, The Bad, and The Ugly). Send all write-ups to the group before the meeting. Then discuss.

As coordinator, my job was to facilitate, carry out the group's will, keep everything on track. (Well, mostly the job was to show up.) There have now been a handful of coordinators; so apparently our culture and format were successful.

So if I was going to tackle AOSA today with a new crew, I'd draft a plan, invite everyone to also propose plans, present plans to the group, have a list of open questions to be decided, keep a list of new questions raised, keep minutes, send out a recap. I'd probably host everything on git, to ease collaboration.

My opening position would be to suggest a first "season" 4-6 months long, meet alternate weeks, use a survey to divine which chapters to cover (based on group interest), another survey to divine meeting schedule.

Ok. Enough of my brain dump.

Again, whatever you end up doing, please do a Show HN.

What you're doing is very exciting. Happy hunting.

I'd like to ask you some more questions about your experience if you wouldn't mind. Email is in my profile (yours isn't listed).

I’d love to join as well. Email sent.

Email sent as well

email sent!

FWIW the Oil project was partly (negatively) inspired by the very helpful description of bash's architecture:


The maintainer expresses some regret about the parser architecture. I referenced this in several 2016-2017 blog posts:

OSH Parses Shell Scripts Up Front in a Single Pass. Other Shells Don't - http://www.oilshell.org/blog/2016/10/13.html

OSH Can Be Parsed With Two Tokens of Lookahead - http://www.oilshell.org/blog/2016/11/17.html

The Thinner Waist of the Interpreter - http://www.oilshell.org/blog/2017/01/26.html

Word Evaluation - http://www.oilshell.org/blog/2017/03/09.html


This principled architecture ended up paying off in a couple unexpected ways, which I wrote about earlier this year:

Oil Uses Its Parser For History And Completion - http://www.oilshell.org/blog/2020/01/history-and-completion....

Oil's Parser Doesn't Worry About Aliases and Prompts - http://www.oilshell.org/blog/2020/01/alias-and-prompt.html

Summary: in addition to the regrets expressed by the maintainer in the article, bash actually has multiple ad hoc, incorrect parsers for its own language! They are incorrect to the point that a separate project (bash-completion) has code to paper over the errors.

Oil uses the same parser for all use cases. Moreover, its parser isn't littered with orthogonal concerns like the interactive prompt and expanding aliases.


However I still need help finishing the project -- see what I've cut here: http://www.oilshell.org/blog/2020/08/risks.html

And feel free to contact me (address on home page and in profile)

I was inspired by the AOSA many years ago, and I had studied (and held up as an example) the Apache architecture about 18-20 years ago. It is indeed a pity that so many younger software engineers really have no idea of how heavily used real-world systems get built. Nor do they understand all the various tradeoffs and corner cases that need to get handled to make products work well in real life.

Building Apache from source (including a few key addon modules) and configuring it to work for a small set of tests was step 1 at my organization then. Those successful at this would then get into the source itself. Fun times!

> It is indeed a pity that so many younger software engineers really have no idea of how heavily used real-world systems get built. Nor do they understand all the various tradeoffs and corner cases that need to get handled to make products work well in real life.

Isn't that a hallmark of a junior engineer :) ? If you're trying to achieve greater technical depth (which takes time), it helps to be a part of an organization that rewards said technical depth.

> it helps to be a part of an organization that rewards said technical depth.

I agree. My previous job wanted employees to finish work ASAP. There were no incentives for quality. Performance was judged on the basis of how fast one could push things out, irrespective of quality. As a result there was a lack of engineering.

Not only that, but those types of organizations are also the ones that are likely to tell experienced engineers that they are "overqualified" for the job because they want more juniors that won't push back against harmful practices.

Only for a while. After a number of failed ‘initiatives’ someone at the top is bound to start asking questions.

Unfortunately, technical depth is seen as unwarranted and is unwelcome in many organizations. In fact you repeatedly get told that "the customer doesn't care about your code". At the end of the day, adding yet another 'if' condition to patch things up is what is usually appreciated.

> I once worked at a place where we had an endpoint on an application taking a long time because of a weird DB query, like 12+ seconds to query a few hundred rows. Was pretty much told if we didn't figure it out today it doesn't matter, we'll ship it like that.

Even worse, writing code fast instead of well usually leads to catastrophic events. Solving those problems (usually late in the evening or on the weekend) will be seen as heroic and rewarded. But it’s just stupidity.

I've been working through the first volume in a reading group with a few friends. Some chapters are definitely better than others, and it often requires awareness that it captures a snapshot of thinking from ~10 years ago, but it's been a good experience.

Could you please elaborate on the chapters that you feel are the best to study?


So far we've read the chapters on Asterisk, Audacity, Bash, BerkeleyDB, CMake, Continuous Integration, Eclipse, and Graphite.

Of these, I liked the chapters on Bash, BerkeleyDB, and Graphite better than those on Asterisk, Audacity, CMake, and CI. I had trouble getting through the chapter on Eclipse, although there were some interesting bits.

The chapter on Bash suffered a little, in my personal experience, because about 70% of it was a rehash (with a slightly, but not sufficiently, different perspective) of information present in the Bash man page, which I had previously read in a reading group with some of these same people.

But really, the chapter that's best to study is probably the one you're currently most interested in, even if it's "not as good" as another considered in the abstract. Just remember not to assume that the entire book is of the same quality, whatever your experience.

Telepathy is one of the coolest, most useful, most modular pieces of software I've ever used: a modular, pluggable framework for chat. I am always so glad this book has a chapter talking about it.

It's a pity it covers "open source" rather than free software, and Emacs in particular, which has been rather successful both in itself and as a long-term model.

I wonder how some of these architectures have really helped with maintenance. For instance, I'm familiar with Open MPI and its rather sorry maintenance history with continual breakage and at least partially unnecessary lack of compatibility. Then it still seems the best bet for general purpose MPI...

Very good book, but I would love a version working with modern tooling and software products. I remember going through these maybe 5-7 years ago and thinking at the time that they were using some really old paradigms and software. Would love to see some covering software with modern languages as well, like Rust.

I think the most important thing you can take away from it is that a good architecture takes into account the problem that the project is trying to solve. Too often I hear calls for some kind of architectural holy grail that solves all problems.

Want to double star this resource. It covers many interesting (and advanced?) topics like distributed systems, building your own database, metaprogramming, and template engines.

This is a brilliant idea! It would be great if they started with something everyone uses, such as editors (e.g. Vi, Emacs, Code) or browsers such as Mozilla.

Is all the content on that website representative of what is in the book when you buy it?

This should be part of any serious computer science program.

I'd like to see KiCad, FreeCAD and Inkscape covered.

Can't speak about the other projects listed but FreeCAD has a great landing page on their wiki.

https://wiki.freecadweb.org/Developer_hub (I especially liked the forum post from the contributor who detailed the geometric constraint solver - very neat)

Great book! I think I will read the whole thing.
