Hacker News

I read to, like, the first line under the first bold heading and immediately this person seemed like an alien. I'll go back and read the rest because it's silly to be put off a whole article by this kind of thing, but what in the actual fuck?

I was probably not alive the last time anyone would have learned that you should read existing code in some kind of linear order, let alone programming. Is that seriously what the author did as a junior, or is it a weirdly stilted way to make an analogy to sequential information being passed into an LLM... which also seems to misunderstand the mechanism of attention if I'm honest

I swear like 90% of people who write about "junior developers" have a mental model of them that just makes zero sense that they've constructed out of a need to dunk on a made up guy to make their point




To anyone who gets confused by the parent comment, note that the line they're referring to has been updated. It used to read:

> Remember your first day reading production code? You probably did what I did - start at line 1, read every file top to bottom, get lost in the details.

Now it reads:

> Remember your first day reading production code? Without any experience with handling mature codebases, you probably quickly get lost in the details.


The change makes me question the authenticity of the text. I mean, did the author actually read files from top to bottom, or did he just write that because it suited his narrative?

That’s a trivial change to make for a line that did not receive the feedback that the author wanted. If that’s the case, maybe the text was more about saying what people wanted to hear than honestly portraying how to make AI read code better.


I forced an analogy and took the metaphor too far. I promise you'll see better from me in the future!


> Remember your first day reading production code? You probably did what I did - start at line 1, read every file top to bottom, get lost in the details.

Top to bottom left to right is how we read text (unless you are using Arabic or Hebrew!), the analogy was fine IMO. Don’t let one HN comment shake your confidence, while people here may be well intentioned they are not always right.


Haha thank you for the kind words!

I've been a lurker on HN ever since I was a kid. I've seen over and over how HN is the most brusque & brutal online community.

But that's also why I love it. Taking every piece of feedback here to learn and improve in the future, and feeling grateful for the thousands of views my article is receiving!


Hebrew speakers also read top to bottom and left to right when they're reading code, because code is (almost always) written in English. :)


Don't take this feedback too personally—remember that most HN users read and don't vote or comment, a subset of them read and vote, and only a tiny loud fraction of us actually comment.

Your article has been very well received, and it wasn't because that one line deceived people into paying attention; it's because the content is good.


When I started out, I did read code top-to-bottom. I was mostly self-taught and didn't have a mental model yet of how code was structured, so I relied on this "brute force" method to familiarize myself.

I suppose it's not safe to assume that everyone started out like this. But advael is guilty of assuming that nobody started out like this. And on top of that, conveying it in a very negative and critical way. Don't get discouraged.


This discussion is about junior professionals, not zero experience programmers. If a junior professional programmer is still starting at the top of files instead of at the entry points to the program or the points of interest, then they had a very poor education.


Metaphor? What metaphor? What analogy?


Wow, people are being very uncharitable in this comment section


Welcome to LLM-related threads on HN.


Oh, I was confused, thanks a lot.

And, indeed, reading every file from top to bottom is very alien to me as a junior.

I would just try to find the file where I thought the change needed to be made and start with trial and error. Definitely not checking the core files, much less creating a mental model of the architecture (the very concept of architecture would have been alien to me then).

I would get lost in irrelevant details (because I thought they were relevant), while completely missing the details that did matter.


Oops, I should have marked my edit clearly. Added a footnote now.


Thanks! No worries, we all live and learn. :)


While that wasn’t my experience as a junior developer, this is something that I used to do with academic papers.

I would read it start to finish. Later on, I learned to read the abstract, then jump to either the conclusion or some specific part of the motivation or results that was interesting. To be fair, I’m still not great at reading these kinds of things, but from what I understand, reading it start to finish is usually not the best approach.

So, I think I agree that this is not really common with code, but maybe this can be generalized a bit.


> reading it start to finish is usually not the best approach.

It really, really depends on who you are and what your goal is. If it's your area, then you can probably skim the introduction, then forensically study the methods and results, and mostly ignore the conclusion.

However, if you're just starting in an area, the opposite parts are often more helpful, as they'll provide useful context about related work.


> this is something that I used to do with academic papers

Academic papers are designed to be read from start to finish. They have an abstract to set the stage, an introduction, a more detailed setup of the problem, some results, and a conclusion in order.

A structured, single-document academic paper is not analogous to a multi-file codebase.


No, they are designed to elucidate the author's thought process - not the reader's learning process. There's a subtle, but important difference.

Also: https://web.stanford.edu/class/ee384m/Handouts/HowtoReadPape...


> they are designed to elucidate the author's thought process - not the reader's learning process

No, it’s exactly the opposite: when I write papers I follow a rigid template of what a reader (reviewer) expects to see. Abstract, intro, prior/related work, main claim or result, experiments supporting the claim, conclusion, citations. There’s no room or expectation to explain any of the thought process that led to the claim or discovery.

Vast majority of papers follow this template.


The academic paper analogy is interesting, because code and papers are meant to do the exact same thing: communicate ideas to colleagues. Code written by a small group of competent programmers with a clear, shared vision is therefore a lot easier to read than code written by a large group of programmers who are just desperately trying to crush enough jira story points that they don't get noticed at the next performance review.

The difference is usually papers written that badly don't go into "production"--they don't pass review.

I usually read code top-to-bottom (at least on a first pass) in two ways--both root-to-leaf in the directory/package structure and top-to-bottom in each source file. Only then when I've developed some theory of what it's about do I "jump around" and follow e.g. xref-find-references. This is exactly analogous to how I approach academic papers.

I think the idea that you can't (or shouldn't?) approach code this way is a psychological adaptation to working on extremely badly wrought codebases day in and day out. Because the more you truly understand about them the more depressing it gets. Better just to crush those jira points and not think too much.


You're supposed to read academic papers from start to finish.


You're supposed to read the abstract, preferably the bottom half first to see if there are conclusions there, then proceed to the conclusions if the abstract is insufficient. Once you're through with that, you can skim the introduction and decide if the paper is worth your attention.

Reading start to finish is only worth it if you're interested in the gory details, I'm usually not.


I was taught to read the abstract, then the conclusion, then look at the figures, and maybe dig into other sections if there's something that drew my interest.

Given the variety of responses here, I wonder if some of this is domain specific.


It depends also on what you want to get from the article. Usually I focus on the methods section to really understand what the paper did (usually I read experimental papers in cognitive science/neuroscience). I may read parts of the results, but hopefully they have figures that summarize them so I do not have to read much. I rarely read the conclusion section, and in general I do not care much about how the authors interpret their results, because people can make up anything, and anyone who does not read the methods can get really misled by the authors' biases.


It’s interesting how many different opinions there are in this thread! Perhaps it really varies by field.

I was reading mostly neuroscience papers when I was taught this method as an undergrad (though the details are a bit fuzzy these days).

I’d bet it also varies quite a bit with expertise/familiarity with the material. A newcomer will have a hard time understanding the methodology of a niche paper in neuroscience, for example, but the concepts communicated in the abstract and other summary sections are quite valuable.


I learned very quickly reading math papers that you should not get stuck staring at the formulas; read the rest first and let it explain the formulas.

I would not say it should be read start to finish, I often had to read over parts multiple times to understand it.


I don't know. Your comment feels like alien. The first line under the first bold heading is:

"Remember your first day reading production code? Without any experience with handling mature codebases, you probably quickly get lost in the details".

Which looks pretty much accurate. And yes, this includes the (later) implied idea that many juniors would read a PR in some kind of linear order, or at least not read it in order of importance, or don't know how to properly order their PR code reading. And yes, some just click in the order GitHub shows the changed files.

Note that for 99% of the industry, "junior dev" is not the same as something like:

"just out of uni person with 12+ years of experience programming since age 10, who built a couple of toy compilers before they were 16, graduated Stanford, and was recently hired at my FAANG team"

It's usually something between that and the DailyWTF fare, often closer to the latter.


The article was updated, probably in response to the parent comment. It used to read this:

> Remember your first day reading production code? You probably did what I did - start at line 1, read every file top to bottom, get lost in the details.

I copied before refreshing, and sure enough that line was modified.


I have actually just printed out codebases and read them cover to cover before (sometimes referencing ahead for context), as a senior engineer. If you need to quickly understand what every line is doing on a small to medium sized body of code, it's a pretty good way to avoid distraction and ramp up quickly. I find that just reading every line goes pretty quickly and gives me a relatively good memory of what's going on.


Doing this requires a higher IQ. Believe it or not, a ton of people literally don't do this because they can't. This ability doesn't exist for them. Thousands of pages of code are impossible for them to understand line by line. This separation of ability is very, very real.


I don't read all the lines of code, but I open and scan a ton of files from the code base to get a feel for which concepts, abstractions, and tricks are used.


> I was probably not alive the last time anyone would have learned that you should read existing code in some kind of linear order, let alone programming.

If you want to dive all the way down that rabbit hole, can I recommend you check out the Wikipedia article for the book Literate Programming [1] by Donald Knuth [2].

[1]: https://en.wikipedia.org/wiki/Literate_programming [2]: https://en.wikipedia.org/wiki/Donald_Knuth


I was a junior so long ago that I've forgotten how I first read code, but I do remember I was very confused.

Edited the post to improve clarity. Thanks for the writing tip!


Yea sorry if I came off caustic there, dealing with really dismissive attitudes toward juniors I'm actively trying to foster has perhaps left a bad taste in my mouth


No worries. I took the metaphor too far and you rightfully called me out. I'm still learning how to write well, I promise you'll see better from me in the future.


Love to see someone genuinely trying to improve at something and I'm glad to have played a tiny part in it


> I was probably not alive the last time anyone would have learned that you should read existing code in some kind of linear order, let alone programming.

Some of us have been around since before the concept of a “Pull Request” even existed.

Early in my career we used to print out code (on paper, not diffs) and read / have round table reviews in person! This was only like 2 decades ago, too!


I think this article is indicative of the "vibe" I've been getting when reading any discussion around genAI programming.

The range of (areas of) competence is just so damn vast in our industry that any discussion about the quality of generated code (or code reviews in this case) is doomed. There just isn't a stable, shared baseline for what quality looks like.

I mean really - how on earth can Jonny Startup, who spends his days slinging JS/TS to get his business launched in < a month[1], and Terrence PhD the database engineer, who writes simulation tested C++ for FoundationDB, possibly have a grounded discussion about code quality? Rarely do I see people declaring their priors.

Furthermore, the article is so bereft of detail and gushes so profusely about the success and virtues of their newly minted "senior level" AI that I can't help but wonder if they're selling something...

/rant

[1] Please don't read this as a slight against Jonny Startup; his priorities are different


> Furthermore, the article is so bereft of detail and gushes so profusely about the success and virtues of their newly minted "senior level" AI that I can't help but wonder if they're selling something...

With all the money in the AI space these days, my prior probability for an article extolling the virtues of AI actually trying to sell something is rather high.

I just want a few good unbiased academic studies on the effects of various AI systems on things like delivery time (like are AI systems preventing IT projects from going overtime on a fat-tailed distribution? is it possible with AI to put end to the chapter of software engineering projects going disastrously overtime/overbudget?)


Is there a difference in quality? Johnny Startup is presumably trading quality in order to release sooner, but the lower quality accepted in that trade is recognizable.


If Jonny Startup has been building release-prioritised systems all his life/career, there's a decent chance he doesn't even know what more goes into systems with higher release & maintenance standards.

Conversely, if Terrence has only ever worked in high rigour environments, he's unlikely to understand Jonny's perspective when Jonny says that code generation tools are doing amazing "reliable" things.

Again, this isn't meant to be a value judgement against either Jonny or Terrence, more that they don't have shared context & understanding on what and how the other is building, and therefore are going to struggle to have a productive conversation about a magic blackbox that one thinks will take their job in 6 months.


Your leap from lack of exposure (maybe even lack of capability) when it comes to writing high-quality software to being unaware of varying degrees of software quality is curious, and frankly unrealistic. In reality, Johnny Startup knows full well what tradeoffs he is making. Top-quality software is not a priority concern of his, but he understands that it can be for other people. He is under no illusions that safety-critical software, for example, is written like a line-of-business MVP. And vice versa. Especially given the context of people participating in communities like HN, where software quality is a regular topic. We are not talking about people living under rocks, as they say. It is effectively impossible for one not to have that awareness.


Didn’t seem like dunking on juniors to me


Yeah, to me his description of how programmers think didn't really jibe with either senior or junior. I think when senior developers look at a code review, they're busy, so they're looking for really obvious smells. If there are no obvious smells and it's easy to understand what the code is intending to do, they usually let it pass. Most of the time if one of my PRs gets rejected it's something along the lines of "I don't know why, but doing X seems sketch" or "I need more comments to understand the intended flow" or "The variable/function names aren't great"


Erm. I've been a developer for... well, certainly longer than most people on HN, I've reviewed code for most of that time, and for most PRs/MRs, I read the code almost linearly. I take a few notes here and there, and sometimes return to amend my notes, but that's often it.

It's only when a PR reaches a fairly high complexity (typically a refactoring, rather than a new feature) that I take the effort to sort it any further.

So, yeah, I guess I'm pleading guilty to doing that? But also, in my decades of experience, it works for me. I'm sure that there are other manners of reviewing, of course.


I think that you missed the point and should have read until "That’s exactly how we feed codebases to AI"... ;-)

Actually, the article shows that feeding an AI "structured" source code files instead of just a "flat full set" of files allows the LLM to give better insights.


> I was probably not alive the last time anyone would have learned that you should read existing code in some kind of linear order

I think you're jumping ahead and missing a point that the article itself made: there are indeed bootcamp developers who were taught this way. I have spent quite a number of hours of my life trying to walk some prospective developers back from this mindset.

That said I think that you could write this entire article without dunking on junior developers and I don't consider it particularly well written, but that's a separate issue I guess.


I suppose such a bootcamp may exist but wow, that's crazy to me

But yea, having now read the whole thing I'm mostly taking issue with the writing style I guess. I find the method they tried interesting but it's worth noting that it's ultimately just another datapoint for the value of multi-scale analytic techniques when processing most complex data (Which is a great thing to have applied here, don't get me wrong)



