
Last three times I've joined a new team (2004, 2015, 2017) I spent my first week trying to fix tiny bugs and generally just reading my way through the codebase till I understood what was going on. I highly recommend it.

I learned to do this while working on my master's. My advisor suggested a project, and when I asked him for help figuring out where to start, he told me to read through the code until I understood it, and to come to him if I spent more than an hour or two stuck on something. Thanks, Mikhail - your advice was fantastic.




That only works if the code was written in a semi-digestible format; I've met my fair share of monstrosities that were propped up by race conditions -- you're not going to briskly digest them in a week (especially if you're in the unlucky circumstance of only having a single breakpoint per run and the engineers who had any understanding of it haven't worked there in 20 years).


Test cases can be a good place to start because they are usually clustered around business-critical functionality. Or around problematic legacy code that needs to work :)


I wish that were an option for at least half the codebases I've worked on. Tangent: you'd be surprised how many life-threatening codebases have little to no tests at all.


Test cases? I still feel lucky to discover that a new (to me) project is even using source code control, let alone has some kind of test suite, let alone one that it currently passes.


I've twice had to introduce test infrastructure so that test cases can even be written in the first place.


Reading through a codebase without a specific purpose is a very slow way to get up to speed.

The best advice I've heard is "start from the data structures, they are at the heart of everything".

I like inventing purposes while reading through code if I don't have a real one. I call it "scavenger hunting" and often use it as a programming exercise when teaching. For example, here is a scavenger hunt for CPython and Postgres: which lines of code would you need to touch to achieve each of these purposes?

https://github.com/python/cpython

In json module: add a parameter to load() and loads() that, when True, only allows spaces as whitespace characters in the json (and not tabs or newlines)

In re module, python regex has syntax for a digit (\d) and for an alphanumeric or underscore character (\w). Add \l that means "any letter".

In re module, python regex has syntax for hex escapes (e.g. \x0a), octal escapes (e.g. \012), unicode escapes (e.g. \u000a). Add binary escapes (e.g. \q00001010)

In python C implementation: Replace the hash function used when inserting a string key to a dictionary

In python C implementation: Add an average() builtin function (like e.g. max(), sum())
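
To make the first json task concrete, here is a rough sketch of the behaviour the new parameter would need to produce, written as a wrapper rather than as the actual change inside the module. The name loads_spaces_only is made up for illustration and is not part of the json module. It leans on the fact that json.loads in its default strict mode already rejects raw control characters inside strings, so any literal tab, newline, or carriage return left in the input must be whitespace between tokens.

```python
import json

# Whitespace characters that JSON allows besides the plain space.
_FORBIDDEN_WHITESPACE = "\t\n\r"


def loads_spaces_only(text):
    """Parse JSON, rejecting tabs, newlines and carriage returns.

    Hypothetical helper for illustration; not part of the json module.
    """
    for ch in _FORBIDDEN_WHITESPACE:
        if ch in text:
            raise ValueError(f"whitespace character {ch!r} is not allowed")
    # Default strict mode: raw control characters inside strings are errors.
    return json.loads(text)
```

The scavenger hunt itself is to find where the decoder skips whitespace, so that a flag could be threaded down to that point instead of pre-scanning the input like this.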

https://github.com/postgres/postgres

Better string hashing algorithm for query plans that involve hashing a column

Better column combination hashing algorithm for query plans that involve hashing a column

Some queries will want a hash on (a, b), which involves calculating hash(a), hash(b), and then combining them into combination_hash(hash(a), hash(b))

Add a new kind of step to execution plans

e.g. a step that logs to a cloud logging service. No need to compile it into queries, just add the data structure for it and the code that runs it when relevant, so that it could be compiled

Add gzip compression as a way to avoid page splits

Today when a page is about to be split, bottom-up deletion and deduplication are attempted to keep the page from splitting. We want to add gzip compression as a third way to try to avoid expensive page splits.

Add a complex number column type

The column should hold a float for the real part and a float for the imaginary part

Add a configuration flag to never use the covered index optimization

Add a new type of index

e.g. an index that runs Principal Component Analysis to index high-dimensional data, but don't care about the details, just the integration points
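
For the column combination task, the shape hash(a), hash(b), then combination_hash(hash(a), hash(b)) can be sketched in a few lines. This is only an illustration of the idea, not what Postgres does: column_hash uses CRC32 over repr() as a deterministic stand-in, the mixing step is the Boost-style hash_combine formula, and all three function names are assumptions made up for the example.

```python
import zlib

MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def column_hash(value):
    """Stand-in per-column hash: CRC32 of the value's repr (an assumption)."""
    return zlib.crc32(repr(value).encode("utf-8")) & MASK32


def combination_hash(h_a, h_b):
    """Mix two 32-bit hashes, Boost hash_combine style (order-sensitive)."""
    mixed = h_b + 0x9E3779B9 + ((h_a << 6) & MASK32) + (h_a >> 2)
    return (h_a ^ mixed) & MASK32


def row_hash(a, b):
    """Hash for the column pair (a, b): hash each column, then combine."""
    return combination_hash(column_hash(a), column_hash(b))
```

The combiner is deliberately not symmetric, so (a, b) and (b, a) are mixed differently. Finding where the real code does this step, and what it would take to swap it out, is the exercise.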



