
Ask HN: Examples of bad open-source code to learn what to avoid? - theSage
What are some of the bad examples of code that you have seen? Something you would want to avoid?

I'm looking for examples which fall along the lines of "fail to see the forest for the trees".
======
neya
I feel like this could lead to very opinionated, non-constructive
comments/flame, but if I were to give you an example, I'd suggest taking a look
at the WordPress ecosystem. While WordPress core's codebase has improved
significantly over the years, some of the plugins haven't.

The top pick goes to WooCommerce. Although it is an open source e-commerce
solution on top of WordPress, it has some terrible decisions under the hood.

The worst of them is mixing presentational logic with business logic. For
example, to render a table, instead of exposing an array of objects to allow
the developer to loop through it as he/she sees fit, WooCommerce will force
you to use a PHP function that renders a table for you and there's actually no
way to modify the presentation logic if you wanted to.
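
A minimal sketch of the separation described above, in Python rather than PHP. It is not WooCommerce's API: `OrderLine`, `get_order_lines`, and `render_table` are hypothetical names and the data is made up. The data function returns plain objects, and the table renderer is just one optional consumer of them.

```python
from dataclasses import dataclass
from html import escape
from typing import List


@dataclass(frozen=True)
class OrderLine:
    """Hypothetical order line: plain data, no presentation concerns."""
    product: str
    quantity: int
    unit_price_cents: int

    @property
    def total_cents(self) -> int:
        return self.quantity * self.unit_price_cents


def get_order_lines() -> List[OrderLine]:
    """Business logic: return the data (hard-coded here for illustration)."""
    return [
        OrderLine("Widget", 2, 499),
        OrderLine("Gadget <XL>", 1, 1250),
    ]


def render_table(lines: List[OrderLine]) -> str:
    """One possible presentation; callers may ignore it and loop themselves."""
    rows = "".join(
        f"<tr><td>{escape(line.product)}</td>"
        f"<td>{line.total_cents / 100:.2f}</td></tr>"
        for line in lines
    )
    return f"<table>{rows}</table>"
```

A caller who wants a different layout can iterate over `get_order_lines()` directly instead of calling `render_table`.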

It's a really fundamental programming paradigm that even top open source
companies fail to adhere to.

Again, I'm not saying this to attack them or the maintainers behind the code,
just my opinion of why I think it's bad quality code while respecting the fact
that developers still do take time and effort for us to enjoy something with
freedom and zero cost.

~~~
superasn
An indirect thing you can learn from this example is how little code
quality matters when it comes to product popularity or revenue.

WordPress and its plugins are most often cited as examples of bad code and to
top it off it is written in PHP - a programming language hated by a lot of
programmers.

Yet when it comes down to it, WP powers 33.6% of all websites on the internet.
Just think for a second how big that number is!

So if the software gets the job done and the end-user can easily understand
it, it really doesn't matter what language you write it in or what code
patterns you use.

~~~
regularjack
> to top it off it is written in PHP - a programming language hated by a lot
> of programmers.

You could say that about any language. PHP is loved by many programmers too.

~~~
BoorishBears
Not to the degree that PHP is hated.

~~~
ezekg
I’d argue PHP is so hated _because of_ WordPress and the terrible
code/practices associated with its ecosystem.

~~~
gremlinsinc
Amen to that, I live/breathe for laravel projects, I contemplate slitting my
wrists when working on wordpress.

~~~
rocketpastsix
Laravel projects can be just as bad of a codebase as WordPress.

~~~
gremlinsinc
yeah, I'm working on one now (picking up where Indian devs left off) ... Code
is laravel 5.2, every controller injects session, requests, etc into the
constructor, then injects a 'manager' which is basically a loose repository
structure. Somewhere from laravel 5.3 onward, using requests/sessions in the
constructor generally means wrapping that in middleware...

So upgrading is a huge nightmare. This is one reason I'm not a fan of
repository pattern, more classes to inject everywhere even when they're not
fully needed. When just making a fatter model would suffice.

Also lots of bad php practices...everywhere else. Bad devs can work on any
code base, but laravel core code is pretty beautiful, and laravel's community
encourages better code, if some bad actors write shitty code the rest of us
have to clean up that's on them, but woocommerce is owned by Automattic and
has bad code, you'd think they'd fix it or something being a large company.

There's also tons of great packages out there that are written way more OO and
with testing and best practices than there are for wordpress. Laravel is also
easier to optimize, and the data structure for wordpress can get out of hand
as well.

tl;dr: Yes, lots of bad laravel code -- but that's on the individual dev, core
laravel and lots of laravel packages use php best practices and encourage good
coding. Easier to write better code in laravel than it is a wordpress plugin.

------
ben_bai
Always a crowd-pleaser: OpenBSD true.c vs. GNU's true.c

[http://cvsweb.openbsd.org/cgi-
bin/cvsweb/~checkout~/src/usr....](http://cvsweb.openbsd.org/cgi-
bin/cvsweb/~checkout~/src/usr.bin/true/true.c?rev=1.1&content-type=text/plain)

[https://github.com/coreutils/coreutils/blob/master/src/true....](https://github.com/coreutils/coreutils/blob/master/src/true.c)

~~~
k4ch0w
Simple is better imo, openbsd wins.

------
codr7
It doesn't work, energy follows thought and it makes no sense to focus it on
what you want to avoid. Take or leave.

That being said, the worst I ever saw was the in-house business nonsense I was
paid to deal with as a Java consultant. The worst code isn't open source from
my experience, subjecting it to public scrutiny would mean suicide for the
companies involved.

~~~
SmellyGeekBoy
This matches my experience. Open source projects tend to get the worst parts
fixed. It's in-house applications, usually written in VB / Delphi / Java, that
have been supported and added to over the past 20 years where the true horrors
lie.

As you say, I doubt many businesses would be willing to put this kind of code
out there.

------
jasode
An example of bad code that always stuck with me was the flawed CDDB disc id
hash algorithm.[0]

I was reminded of that short-sighted decision every time I ripped a bunch of
CDs and saw how importing song titles was _not automatic_ because a dozen
different discs had the same hash ids which resulted in collisions[1]. It
ended up creating needless friction for millions that depended on that discid.

What's sad is I'm not even sure if one can extract any useful "lessons
learned" from it! The programmer that wrote it was not an amateur script
kiddie; he had a computer science degree from Uni California. Apparently, he
didn't realize he was writing a flawed hash algorithm as he wrote it.

One could say that hash algorithms should be "peer reviewed". Well, he got
_unsolicited_ peer review that pointed out how his homegrown hashing
computation was flawed but he _ignored_ the suggestion to improve it.

[0] _> Ti Kan wanted to use a hash. He could have chosen something like CRC32,
which would have given him a 32 bit number, yielding 4 billion unique IDs.
Instead he wrote his own hash. [...] Ti Kan was made aware (not by me) of this
problem back in 1994, and given a script to convert this format into a
CRC32-based format, but he rejected it because the deployed base was too big.
At that point it was probably in the high dozens._ \-- excerpt from
[http://quimby.gnus.org/circus/notes/cddb.html](http://quimby.gnus.org/circus/notes/cddb.html)

[1] [https://forums.macrumors.com/attachments/multiple-matches-
jp...](https://forums.macrumors.com/attachments/multiple-matches-jpg.17857/)

[2] wiki:
[https://en.wikipedia.org/wiki/CDDB#How_CDDB_works](https://en.wikipedia.org/wiki/CDDB#How_CDDB_works)

~~~
the8472
That advice is wrong too, a 32bit number would have been insufficient due to
the birthday problem.

~~~
planteen
Exactly. Assuming 10 songs per CD, you should see your first collision after
around 6500 CDs. If he did CRC64, it would be after 400 million CDs.
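
For reference, the standard birthday-bound approximation can be sketched as below. It assumes a uniformly distributed hash, which a real CRC over similar inputs only approximates, and `items_before_collision` and `rough_estimate` are names made up for this sketch. The figures quoted above look like the rougher sqrt(2^bits) estimate divided by 10 items per disc; that is my reading, not something the comment states.

```python
import math


def items_before_collision(bits: int, probability: float = 0.5) -> float:
    """Approximate number of random `bits`-bit hashes drawn before the
    chance of at least one collision reaches `probability`."""
    space = 2.0 ** bits
    return math.sqrt(2.0 * space * math.log(1.0 / (1.0 - probability)))


def rough_estimate(bits: int) -> int:
    """Back-of-the-envelope version: just the square root of the hash space."""
    return math.isqrt(2 ** bits)


for bits in (32, 64):
    # Columns: hash width, 50% birthday bound, rough estimate at 10 items per disc
    print(bits, round(items_before_collision(bits)), rough_estimate(bits) // 10)
```

Either way the conclusion is the same: a 32-bit ID starts colliding after tens of thousands of items, not billions.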

------
AnaniasAnanas
Here you go
[https://github.com/progwml6/Natura/blob/1.7.10/src/main/java...](https://github.com/progwml6/Natura/blob/1.7.10/src/main/java/mods/natura/common/NContent.java)

Tip: If you ever end up in a situation where you have to copy-paste code with
minor changes then there is something that you are doing wrong. In this case
using arrays and loops would be a much better solution.
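
A minimal sketch of the "arrays and loops" idea, in Python rather than the Java of the linked file. Nothing here is taken from that mod: `Block`, `register_blocks`, the wood names, and the block kinds are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Iterable


@dataclass(frozen=True)
class Block:
    """Hypothetical stand-in for whatever gets registered."""
    name: str
    hardness: float


# The variations live in data, not in copy-pasted statements.
KINDS = ("button", "pressure_plate", "fence_gate")


def register_blocks(wood_types: Iterable[str], hardness: float = 2.0) -> Dict[str, Block]:
    """Create one block per (wood, kind) pair instead of one statement each."""
    registry: Dict[str, Block] = {}
    for wood in wood_types:
        for kind in KINDS:
            name = f"{wood}_{kind}"
            registry[name] = Block(name=name, hardness=hardness)
    return registry
```

Adding a new wood type then means adding one string to the input list rather than another copy of the block.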

~~~
pjc50
That looks nasty, but surprisingly hard to fix with loops because everything
is of different types. If I were fixing it I'd look to some kind of code
generation solution, even if just a hacky python script parsing a CSV.

(addShapedRecipe is just _begging_ to have ASCII art as its canonical form)

~~~
AnaniasAnanas
Most of them are of the types button and item though.

------
spion
How about examples of open source code to learn what's really, really good,
together with why it was designed that way? Seems like that would be way more
useful.

Or examples of projects that did things one way, but later refactored, and why
they refactored.

~~~
convolvatron
i don't think you really get very deep into it by reading code without working
with it. sure, there are surface syntactic niceties one can bikeshed.

the real meat of the matter comes when you are trying to make a change. is the
structure robust? is there convenient tooling that helps you do what you need
to do? does the system require extensive boilerplate to do simple things? does
the system come crashing down in some unrelated area when you are trying to
make simple changes?

it may be surprising, but large old codebases usually have huge hunks that
serve no real purpose whatsoever except to glue together two pieces that would
be much happier talking directly to one another.

I really wish as a community we could abandon the 70s business notion that
software is a concrete artifact that one invests in and sells. it's a really
poor model. software is a process. code that is not being maintained is
largely just dead. as developers we should be evaluating software as a living
thing that responds to its environment...not as a shrink wrapped item we unbox
and review on youtube.

------
kissgyorgy
That's a really bad idea. You need to have a good counterexample, otherwise
at best it's just a waste of time. A lot of people learn from really bad
codebases and pick up the same style, which is terrible. You should look at
GOOD codebases instead!

~~~
theSage
I did some work on html2text when I started off and while the little parts did
make sense, the whole library was confusing for me. I couldn't change anything
without breaking tests.

On the other hand I've been tinkering with curio for a while now and it's a
breath of fresh air compared to that.

My trouble is that I still don't understand what makes the html2text thing
"bad". What particular thing there caused me to not like working with it? I'm
trying to understand that.

I've been book hunting + figuring out if it's something that I did not know
which would have made the code a lot easier to work with (stuff got a lot
easier after I cleaned up my set theory understanding)

\- [https://github.com/dabeaz/curio](https://github.com/dabeaz/curio) \-
[https://github.com/Alir3z4/html2text](https://github.com/Alir3z4/html2text)

------
otras
If you’re interested in a case of unnecessary optimization and effort, the
infamous left-pad npm library has been refactored to only add to the string
O(log(n)) times. It is short but not sweet.

[https://github.com/left-pad/left-pad#readme](https://github.com/left-
pad/left-pad#readme)
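
To make the O(log(n)) claim concrete, here is a minimal sketch of padding by repeated doubling. It is written in Python and is not the library's actual code; `repeat_by_doubling` and `left_pad` are names chosen for this sketch, and it assumes a single-character pad.

```python
def repeat_by_doubling(ch: str, count: int) -> str:
    """Build `ch` repeated `count` times with O(log(count)) concatenations."""
    result = ""
    piece = ch
    while count > 0:
        if count & 1:          # this bit of count is set: take the current piece
            result += piece
        count >>= 1
        if count:              # double the piece for the next, higher bit
            piece += piece
    return result


def left_pad(text: str, length: int, ch: str = " ") -> str:
    """Pad `text` on the left with `ch` until it is `length` characters long."""
    missing = length - len(text)
    if missing <= 0:
        return text
    return repeat_by_doubling(ch, missing) + text
```

In Python itself the built-in `str.rjust` already does this job, which is rather the point about unnecessary effort.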

~~~
ksaj
I think it needs more // comments. Hilarious.

------
Sir_Cmpwn
Here's some old code of mine:

[https://github.com/vatt849/LibMinecraft/blob/master/LibMinec...](https://github.com/vatt849/LibMinecraft/blob/master/LibMinecraft/Client/MultiplayerClient.cs)

The whole library is a trip if you want to read a bunch of bad C#. Highlights:

\- Generated documentation

\- Giant switch/case instead of a more organized dispatch map

\- Large swaths of commented code instead of using version control

\- try...catch statements that just eat the errors

\- Inconsistent code style

\- This thing:

[https://github.com/vatt849/LibMinecraft/blob/master/LibMinec...](https://github.com/vatt849/LibMinecraft/blob/master/LibMinecraft/Client/MultiplayerClient.cs#L603-L621)
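
A minimal sketch of two of the alternatives named in the list above: a dispatch map instead of a giant switch/case, and an error that is raised instead of eaten. It is in Python rather than C#, and the packet ids, handler names, and `UnknownPacketError` are all hypothetical, not taken from either library.

```python
from typing import Callable, Dict

Handler = Callable[[bytes], str]

# Dispatch map: packet id -> handler function.
HANDLERS: Dict[int, Handler] = {}


class UnknownPacketError(Exception):
    """Raised when no handler is registered for a packet id."""


def handles(packet_id: int) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one packet id."""
    def decorator(func: Handler) -> Handler:
        if packet_id in HANDLERS:
            raise ValueError(f"duplicate handler for packet id 0x{packet_id:02x}")
        HANDLERS[packet_id] = func
        return func
    return decorator


@handles(0x00)
def keep_alive(payload: bytes) -> str:
    return "keep-alive"


@handles(0x03)
def chat(payload: bytes) -> str:
    return "chat: " + payload.decode("utf-8")


def dispatch(packet_id: int, payload: bytes) -> str:
    """Look the handler up in the map; surface unknown ids instead of hiding them."""
    try:
        handler = HANDLERS[packet_id]
    except KeyError:
        raise UnknownPacketError(f"no handler for packet id 0x{packet_id:02x}") from None
    return handler(payload)
```

Each handler can live in its own module, and an unknown packet shows up as a visible failure rather than a silent no-op.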

I've written something similar from scratch since, which I'm still not
entirely satisfied with, but is much better for reference:

[https://github.com/ddevault/TrueCraft](https://github.com/ddevault/TrueCraft)

The client-side networking code lives here:

[https://github.com/ddevault/TrueCraft/blob/master/TrueCraft....](https://github.com/ddevault/TrueCraft/blob/master/TrueCraft.Core/Networking/PacketReader.cs)

[https://github.com/ddevault/TrueCraft/blob/master/TrueCraft....](https://github.com/ddevault/TrueCraft/blob/master/TrueCraft.Client/MultiplayerClient.cs)

[https://github.com/ddevault/TrueCraft/tree/master/TrueCraft....](https://github.com/ddevault/TrueCraft/tree/master/TrueCraft.Client/Handlers)

Notable improvements:

\- Handwritten docs only where necessary

\- Uses a stream implementation for decoding this particular wire format

\- Has a different and better abstraction for reading packets out

Still has bad error handling though.

------
slezyr
See Syobon Action: its source code is as bad as the game itself.

[https://github.com/angelXwind/OpenSyobonAction/blob/master/m...](https://github.com/angelXwind/OpenSyobonAction/blob/master/main.cpp)

------
jstarfish
The source for Terraria is notoriously terrible.

[https://github.com/TheVamp/Terraria-Source-
Code](https://github.com/TheVamp/Terraria-Source-Code)

It is another example of how even inelegant code full of hardcoded values can
be successful.

~~~
czr
(Note that this is _decompiled_ source – the original probably at least has
comments here and there)

------
twhitmore
Libraries can be useful despite imperfections, and poor design decisions can
occur in overall good libraries. So we can't judge too harshly.

Having said that:

1) iText PDF library used to have some fairly poor & duplicated code. Column
layout was a highlight. Also strange ideas overemphasizing subclasses, eg. for
paragraph styles. (Correct approach: use values rather than types.)

2) Tomcat webserver back around 2007 used to have some amazing 'clustering'
code to deploy your webapp across multiple servers. But it lacked proper
knowledge & hence control of what it was doing. IIRC there was no clear
master, and a server couldn't tell what had been started on it versus what had
been replicated since a peer was seen to be running it. Effect: replication
would be additive only, contexts would just replicate everywhere uncontrolled,
and there was no good way to stop/ undeploy an app across the cluster.
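
On the "values rather than types" point in (1), a minimal sketch in Python. iText is a Java library and this is not its API; `ParagraphStyle` and its fields are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ParagraphStyle:
    """One style type; each concrete style is a value of it, not a subclass."""
    font_size: float = 11.0
    bold: bool = False
    indent: float = 0.0


# New styles are derived by changing values, with no class hierarchy needed.
BODY = ParagraphStyle()
HEADING = replace(BODY, font_size=16.0, bold=True)
QUOTE = replace(BODY, indent=24.0)
```

A new style then costs one line of data, and styles can be compared, stored, or loaded from configuration like any other value.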

------
ddebernardy
OpenSSL (before HeartBleed) springs to mind.

[https://news.ycombinator.com/item?id=7640378](https://news.ycombinator.com/item?id=7640378)

~~~
blattimwind
OpenSSL code and docs are still a huge mess.

------
type0
"How to Make Mistakes in Python" \- [https://www.oreilly.com/library/view/how-
to-make/97814920482...](https://www.oreilly.com/library/view/how-to-
make/9781492048275/)

------
blattimwind
Drupal 6/7 would be an example for "widely used [at the time] but pretty bad".
I don't know how many of the issues across all layers were addressed in later
versions.

OpenSSL is still an excellent example for very messy code where even
maintainers / frequent contributors regularly get lost. Also a good example
for designing many bad APIs and poor docs. libsodium is a good counterexample,
although the internal structuring of the code base is a bit atypical, it is
logical and consistent. (It does have some API idiosyncrasies which cater
specifically to dynamic bindings, like providing a constant always as a
#define/macro but also as an exported function; and it has a bit of an issue
where you have both legacy APIs and newer APIs, but the docs are pretty clear
on which is which).

BorgBackup is an example of how you don't want to mix C and Python code, and
also contains various bits that only 1-2 people on the planet really bothered
to understand, besides demonstrating other issues of organically grown code
bases.

------
bradknowles
There are an infinite variety of things that you don’t want. If you focus all
your energy on those things, then you won’t have anything left to do the
positive things you do want.

You have to turn that equation around. Even if all you know right now is the
negative thing you don’t want, you have to figure out how to reframe that into
the positive thing you do want. Only then can you make positive progress
towards that thing you want — and by the way, you will naturally avoid the
things you don’t want by focusing on the things you do want.

Sure, examples of bad stuff can be instructive, but only so far as it helps
you further clarify the good stuff you’re actually trying to achieve.

------
superpermutat0r
I've always found Calibre to be a huge mess.

------
arthev
While usually focusing on smaller snippets, thedailywtf.com is a site about
(mostly) code wtfs. A fair number of management wtfs too, though.

------
pjc50
> fall along the lines of "fail to see the forest for the trees"

I think I'm going to need an example to understand this?

Having said that, one of the most informative programming books I've ever read
was _C Traps And Pitfalls_. Flags common easily-made errors and explains them,
which in turn fixes misconceptions about the language. I feel most languages
could do with one.

------
ksaj
Here is something atrocious I wrote in Lisp. You actually _can_ make Lisp
ugly! Who knew?

[https://github.com/ksaj/Capitalize.Lisp](https://github.com/ksaj/Capitalize.Lisp)

------
RickJWagner
Just a side note-- this is the beauty of Open Source. If there is some bad
code (we all write it), it can be improved with a little help from Open Source
"friends".

All of us are stronger than any one of us. Long live Open Source!

------
craftoman
Many JavaScript libraries like Fastify (Node.js) for example. You always get a
nice & clean API but if you look under the hood you would be amazed at how
much spaghetti code can be written in a project.

------
frostburg
Praat: [https://github.com/praat/praat](https://github.com/praat/praat) I'm
not sure that this is a good way to learn anything, however.

------
sam_lowry_
JGit is amazingly bad for a piece of software built on top of the well thought
out data structures of git. Some of its flaws could be attributed to Java IO
design, though.

------
peterwwillis
Anything related to OpenStack, but particularly jenkins-job-builder is rather
horrible.

------
jxub
Many, if not most OSS packages which are released by academics or
universities.

------
yamann
[https://github.com/mholt/caddy](https://github.com/mholt/caddy) not only
mediocre code, but the guy behind it received lots of money from Mozilla as an
innocent promising open source project author, then he turned it into a paid
product.

