That's interesting. This is kind of what I do to understand and learn new codebases.
I have mostly been working on maintenance and rewrites from scratch, and in most places there's zero documentation and the business rules are the law. So I often gotta document and copy. Sometimes it's stuff that has been abandoned for years, often in languages/frameworks I'd never seen before.
This "decompression" step is exactly what I do when I'm lost or in a hurry. I just inline inline inline everything so I can really grasp what's going on.
I have a terrible time understanding deep Java-OOP style abstractions, or Ruby-style meta-programming, even though I've been oscillating between those two worlds for 20 years.
But somehow the terrible, abstraction-less inlined code is way easier for me to understand as a newcomer. I don't force it on others though, and this is the first time I've seen someone else talking about it.
Actually, I'm not being truthful. This one crazy idea from John Carmack also follows a similar principle: http://number-none.com/blow/john_carmack_on_inlined_code.htm... . But I still feel like the only other people who work this way are game developers like John Carmack, Casey Muratori, or Jon Blow. Is anyone else like this?
I think we will finally be able to demonstrate that the deep Java-OOP style abstractions were bullshit at most software shops. Unless you are leveraging that stuff to make your code more testable, maintainable, and performant, just leave it alone.
I think the issue is that middle-aged architects get that "flow" feeling while doing their design work. They should keep that shit in their home woodworking shop and leave it out of the code base!
In many cases, I believe distaste for “abstraction-less” code is an avoidance of discomfort. “I have never programmed like this, this feels uncomfortable, therefore I label it as unclean or bad quality code.”
But then the appropriate time is not spent designing a clean interface. Instead, the abstraction is built up front, so it fits the problem space as originally understood but not the actual problem space.
Then the abstraction, even though it is a class or some other form of layer, may end up tightly coupled to the implementation. At that point all bets are off.
But of course the programmer who made it goes “ah feels like home”.
After watching an inspiring video on terminal optimization [1], I started doing some napkin math on how fast my ex-employers' programs "should" run and how many resources they "should" take up.
With my very small sample size (N=5), those numbers definitely hold. My big untested assumption is that most startups have even worse codebases than what I've experienced in my short tenure haha. I'd be curious what an "average" startup could achieve in 6 weeks.
"Refactoring" is the phenomenon. Speaking in terms of compression and decompression is not quite exactly the thing, although it can be perfectly cromulent to compress the text or even the AST of a given chunk of code.
Factoring is math.
Consider:
45 * 7 + 60 * 8
We can factor out 15 (which in the case of multiplication is actually called a "factor"):
15 * (3 * 7 + 4 * 8)
And hopefully this expression has better properties (it's either more efficient or shorter or both). There are more involved examples in the WP entry:
https://en.wikipedia.org/wiki/Factorization
Anyway, the point is that when you consider programs as expressions in a symbolic language you can factor out operations (and still know mathematically that your expressions retain their meaning) to look for more mechanically desirable forms.
Typically this looks like "Don't Repeat Yourself" or "Semantic Compression", but it's important to understand that that's the outer form of the technique; its meat and bones are factoring in the mathematical sense.
(I think this point bugs me because DRY is itself a repetition of the concept of refactoring.)
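Here's the same move in code, as a toy Python sketch (names invented for illustration): the repeated sub-expression plays the role of the common factor, and pulling it out is exactly the 15-out-of-45-and-60 step above.

    # Two call sites repeat the same sub-expression, like 45*7 + 60*8.
    def report_total(orders):
        return sum(o["qty"] * o["unit_price"] * 1.08 for o in orders)

    def report_max(orders):
        return max(o["qty"] * o["unit_price"] * 1.08 for o in orders)

    # Factor the shared term out, the way 15 comes out of 45 and 60.
    def line_total(o):
        return o["qty"] * o["unit_price"] * 1.08

    def report_total_factored(orders):
        return sum(line_total(o) for o in orders)

    def report_max_factored(orders):
        return max(line_total(o) for o in orders)

As with the arithmetic, the factored form should mean exactly the same thing; it's just shorter and gives the shared term a single place to change.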
It would be well to adopt languages that make it easier to do this sort of thing (besides factoring, the other thing you'd want to make easy is applying and unapplying the Futamura projections). Forth, Factor, Joy, and the other "concatenative" languages are particularly good for this, although the Lisp family and ML family are good too.
Check out "Compiling to Categories" http://conal.net/papers/compiling-to-categories/ wherein Haskell is transformed to a Point-Free style and interesting things are done with and to the compiler.
- - - -
I couldn't help myself, I golfed this expression a little:
Any compiler that is beyond toy quality reduces that to 795.
So should the programmer; the only reason to leave it in a * b + c * d form would be if it actually looks that way due to symbolic constants being used. The compiler still gets us the 795 so there is no run-time calculation.
Then in that form, there is no common factor to remove: the expressions a and c don't have a common factor.
Of course. I'm just using the simple high school algebra as a kind of metaphor or analogy to the similar concepts and operations in e.g. Lambda Calculus-based PLs. They're formalized in Category Theory but I didn't want to get into all that.
I agree with this so much. At my last job, the way I tried to explain it to people was: we normalize our DB to prevent data duplication, so why not our code? Each team had its own StringUtils, DateUtils, etc. But what was worse was that each team would manage state differently and use different libs for scheduling background jobs.
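As a toy sketch of what I mean (Python here just for brevity; the names and paths are made up):

    # Before: each team carries its own copy of the same helper.
    # team_alpha/date_utils.py
    def format_date_alpha(d):
        return d.strftime("%Y-%m-%d")

    # team_beta/date_utils.py
    def format_date_beta(d):
        return d.strftime("%Y-%m-%d")

    # After: one shared module, the way a normalized schema keeps one copy of a fact.
    # common/date_utils.py
    from datetime import date

    def format_date(d: date) -> str:
        """The single canonical date format for every team."""
        return d.strftime("%Y-%m-%d")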
I think GPT and friends are going to be busy refactoring legacy code bases for a long time.
The ability to point GPT to a new code base and say "Explain this to me", "Are there any security vulnerabilities?", "What are the best refactorings in terms of cost/benefit?" is available now.