My question is pretty straightforward: how do you, Hacker News enthusiast, familiarize yourself with a new codebase? Obviously your answer is going to depend on the kind of work that you do.
Some background: I'm asking because I'm flirting with the idea of adding a couple of features to SlickGrid (https://github.com/mleibman/SlickGrid), Michael Leibman's phenomenal JavaScript grid widget. Unfortunately Leibman got busy and isn't actively supporting it anymore.
The codebase is something like 8k lines of JavaScript, so it's not ludicrously big, but I'm kind of intimidated thinking about trying to make sense of it. My first strategy is just to open up important-looking JavaScript files (slick.core.js, slick.grid.js) and read through them for comprehension. This seems like a pretty slow way to build a mental model of the code, though.
The features I want to implement are 1. an Ajax data source that doesn't require paging, and 2. frozen columns. Someone else has implemented a buggy version of frozen columns (and has since abandoned the project), and I might like to use it, but I can't tell if it's buggy because it's a hard problem, or because their implementation strategy was poor (or both!). So at the moment I can't evaluate whether I should implement my own or try to fix the issues with theirs.
Picking up other people's code seems to be one of the harder tasks developers face, as evidenced by how much code gets abandoned, so I wondered if the voices of experience on here could point me in the right direction, either by talking about this problem in particular, or more generally, how you build knowledge about a new codebase.
Thanks!
https://github.com/gilesbowkett/rewind
it's for assessing a project on day one, when you join, especially for "rescue mission" consulting. it's most useful for large projects.
the idea is, you need to know as much as possible right away. so you run these scripts and you get a map which immediately identifies which files are most significant. if a file is edited frequently, was edited yesterday, was edited on the day the project began, and is much bigger than any other file, that's obviously the file to look at first.
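a rough illustration of the "edited frequently" signal. this is not the rewind scripts themselves, just a minimal Python sketch that assumes the project is a git checkout. the function names (churn_from_log, rank_by_churn, read_git_log) are made up for the example; the only real command it leans on is "git log --name-only --pretty=format:", which prints the paths touched by each commit.

```python
import subprocess
from collections import Counter


def churn_from_log(log_text):
    """Count how many commits touched each path.

    Expects the output of: git log --name-only --pretty=format:
    (one path per line, blank lines between commits).
    """
    counts = Counter()
    for line in log_text.splitlines():
        path = line.strip()
        if path:
            counts[path] += 1
    return counts


def rank_by_churn(log_text, top=10):
    """Return the `top` most frequently edited paths, busiest first."""
    counts = churn_from_log(log_text)
    # sort by count descending, then by path, so ties come out in a stable order
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))[:top]


def read_git_log(repo_dir="."):
    """Run git in repo_dir and return the raw per-commit file lists."""
    result = subprocess.run(
        ["git", "log", "--name-only", "--pretty=format:"],
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

to try it on a real checkout, call rank_by_churn(read_git_log("path/to/repo")) and print the pairs. it only covers edit frequency; recency, first-day edits, and file size would each need their own pass.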
we tend to view files in a list, but in reality, some files are very central, some files are out on the periphery and only interact with a few other files. you could actually draw that map, by analyzing "require" and "import" statements, but I didn't go that far with this. those vary tremendously on a language-by-language basis and would require much cleverer code. this is just a good way to hit the ground running with a basic understanding which you will very probably revise, re-evaluate, or throw away completely once you have more context.
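for a sense of what the "require"/"import" analysis might look like, here is a minimal Python sketch. it assumes JavaScript sources and only recognizes the simplest require("x") and import ... from "x" forms with a regular expression; anything fancier would need a real parser, and projects that load files through script tags won't show up at all. the names (DEP_PATTERN, dependencies, inbound_counts, most_central) are hypothetical.

```python
import re
from collections import defaultdict

# matches require("x") / require('x') and import ... from "x" / import "x"
# (simple JavaScript forms only; an assumption, real projects need a parser)
DEP_PATTERN = re.compile(
    r"""require\(\s*['"]([^'"]+)['"]\s*\)"""
    r"""|import\s+(?:[^'"]*?\s+from\s+)?['"]([^'"]+)['"]"""
)


def dependencies(source):
    """Return the module names a single source file refers to."""
    found = []
    for req, imp in DEP_PATTERN.findall(source):
        # exactly one of the two groups is non-empty for each match
        found.append(req or imp)
    return found


def inbound_counts(files):
    """Map each referenced module to the set of files that depend on it.

    `files` is a dict of {filename: source text}.
    """
    inbound = defaultdict(set)
    for name, source in files.items():
        for dep in dependencies(source):
            inbound[dep].add(name)
    return inbound


def most_central(files):
    """Modules sorted by how many distinct files reference them."""
    inbound = inbound_counts(files)
    return sorted(
        ((dep, len(users)) for dep, users in inbound.items()),
        key=lambda kv: (-kv[1], kv[0]),
    )
```

modules near the top of most_central are the "central" ones in the sense above; modules that never appear are candidates for the periphery.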
but to answer your actual question, you do some analysis like this every time you go into an unfamiliar code base. you also need to get an idea of the basic paradigms involved, the coding style, etc. -- stuff which would be much harder to capture in a format as simple as bash scripts.
one of the best places to start is of course writing tests. Michael Feathers wrote a great book about this called "Working Effectively with Legacy Code." brudgers's comment on this is good too but I have some small disagreements with it.
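on the testing point, one technique from that book is the characterization test: you write assertions that record what the code does today, whether or not that behavior is right, so you notice when a change alters it. a minimal Python sketch follows; legacy_format_price is a hypothetical stand-in for inherited code, not anything from SlickGrid.

```python
import unittest


def legacy_format_price(cents):
    # hypothetical stand-in for code you inherited and don't yet understand
    dollars = cents // 100
    rest = cents % 100
    return "$" + str(dollars) + "." + str(rest)


class CharacterizeFormatPrice(unittest.TestCase):
    """Each assertion records what the code does today, right or wrong."""

    def test_whole_dollars(self):
        self.assertEqual(legacy_format_price(500), "$5.0")

    def test_single_digit_cents(self):
        # probably a bug (no zero padding), but pin it down before changing it
        self.assertEqual(legacy_format_price(105), "$1.5")

    def test_two_digit_cents(self):
        self.assertEqual(legacy_format_price(1999), "$19.99")
```

the expected values come from running the code and writing down what it returns, not from a spec. once the behavior is pinned, a surprising result like "$1.5" becomes a question to ask rather than something you break by accident.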