
Low Hanging Fruit of Programming Language Design - sidcool
http://250bpm.com/blog:112
======
titzer
You know, sometimes duplicating code is just better. I've been around long
enough to see many attempts at refactoring to _share code!_ crash and burn.
The hoops and complexity that some people go through to DRY (don't repeat
yourself) are worse sometimes than the cost of maintaining two copies.

~~~
naasking
> The hoops and complexity that some people go through to DRY (don't repeat
> yourself) are worse sometimes than the cost of maintaining two copies.

The number of bugs is proportional to the lines of code; this is undeniable
from empirical data. Ergo, fewer lines of code will tend to yield fewer bugs.
So if your code is literally the same, there's no reason not to extract it
into a function.

That said, the moment you have to start adding parameters in order to
successfully factor out common code, ie. to select the correct code path taken
depending on caller context, that's when you should seriously question whether
the code should actually be shared between these two callers. More than likely
in this case, only the common code path between two callers should be shared.
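
To make that concrete, here is a minimal sketch; all names are hypothetical and invented for illustration. `make_key_flagged` is the kind of shared function that needs a parameter to pick the caller's code path, while `url_slug` and `db_column` share only the genuinely common step, `normalize`.

```python
# Hypothetical example: two callers share one step but differ in the rest.

def normalize(name: str) -> str:
    """The genuinely common code path: trim whitespace and lowercase."""
    return name.strip().lower()


# Questionable: a shared function whose flag selects the caller's code path.
def make_key_flagged(name: str, for_url: bool) -> str:
    key = normalize(name)
    if for_url:
        return key.replace(" ", "-")
    return key.replace(" ", "_")


# Alternative: share only normalize(); each caller keeps its own specifics.
def url_slug(name: str) -> str:
    return normalize(name).replace(" ", "-")


def db_column(name: str) -> str:
    return normalize(name).replace(" ", "_")
```

With one flag the difference looks small, but each extra caller-specific parameter adds another branch to the shared function, whereas the second form only ever grows by one small caller.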

~~~
dllthomas
> So if your code is literally the same, there's no reason not to extract it
> into a function.

If the two pieces of code are likely to change in different ways, for
different reasons, that is a strong reason not to extract it into a function
even if they happen to be character-for-character identical for the moment.

Less code means fewer bugs, but that doesn't mean I should be working on the
gzipped representation.

~~~
naasking
> If the two pieces of code are likely to change in different ways, for
> different reasons, that is a strong reason not to extract it into a function
> even if they happen to be character-for-character identical for the moment.

The future is much more malleable than your immediate needs.

But even if your future turns out to be true, the code you need to refactor is
then already extracted to a function, so you can easily duplicate that
function, make your localized changes, and change the callers to the new
function. So this is still the best route.

> Less code means fewer bugs, but that doesn't mean I should be working on the
> gzipped representation.

Don't be absurd. gzipping doesn't preserve your program in human-readable
form. Extracting code into a reusable function makes your program more human
readable, not less.

~~~
falcolas
> Extracting code into a reusable function makes your program more human
> readable

When done properly, yes. When done to the point where a five line function is
created with ten inputs (yes, this is real), no. But DRY tells us that the
five lines of duplication is unconditionally _worse_.

Hell, I've even seen things like logging/write tuples (i.e. log the error,
write it to a socket) encapsulated, even though the only non-parameter code
ends up being the two function calls.

Anything taken to extremes is bad. The problem with DRY is that it encourages that
extremism.

~~~
naasking
> When done properly, yes. When done to the point where a five line function
> is created with ten inputs (yes, this is real), no. But DRY tells us that
> the five lines of duplication is unconditionally worse.

I'm not convinced by your example. There are plenty of mathematical
calculations taking numerous parameters that I think should be in a distinct
function.

Even for non-mathematical calculations, 5 lines of code that are used
repeatedly as some sort of standard pattern in your program should also get
factored out. Like your logging example, ie. you consistently log everything
in your program the same way, then sure, refactor that into a shared function.
Then if you suddenly find you need to log more or less information, you can
update it in one place.

Of course, I understand your meaning that sometimes factoring out doesn't make
sense, but if you find repetition more than twice as per DRY, refactoring
seems appropriate.

~~~
always_good
You DRY up code by writing indirection. That's the expense of all
abstractions. You can't believe that all indirection is worth it at all costs,
so I'm not sure what point you're belaboring.

~~~
naasking
> You can't believe that all indirection is worth it at all costs

I think I've been pretty clear about the costs and when this is worth it,
particularly in my first post in this subthread, which I'll quote here:

> That said, the moment you have to start adding parameters in order to
> successfully factor out common code, ie. to select the correct code path
> taken depending on caller context, that's when you should seriously question
> whether the code should actually be shared between these two callers. More
> than likely in this case, only the common code path between two callers
> should be shared.

Or if you want a more concise soundbite: refactor if your indirection is
actually a clear and coherent abstraction.

------
julian37
As "michael" points out in the comments on the article, this has already been
invented, it's called a macro and has been available for decades in various
LISPs. Wikipedia lists a number of other languages [1] that support macros.

[1]
[https://en.wikipedia.org/wiki/Macro_(computer_science)#Synta...](https://en.wikipedia.org/wiki/Macro_\(computer_science\)#Syntactic_macros)

~~~
simias
The problem with having powerful macros is always the same: as a project grows
and grows you end up inventing your own little dialect of the language which
is opaque to any 3rd party reading your code unless they take the time to
unravel your macros.

I think Rust did two things right in that regard: firstly the macro
invocations are always suffixed with '!' so you know it's not a regular
function call right away. Secondly Rust macros are so quirky, ugly and painful
to implement that you only ever use them as a last resort, so people tend not
to abuse them too much.

~~~
flavio81
> _as a project grows and grows you end up inventing your own little dialect
> of the language which is opaque to any 3rd party reading your code unless
> they take the time to unravel your macros._

This is bad use of macros, or an ugly macro system.

Macros, at least in Lisp, made code even _clearer_ to understand; because they
let you create the constructs that make the problem domain map more directly,
straightforwardly, easily, to the programming language.

So they reduce line count, they reduce need for kludges or workarounds. They
allow more straight code.

But this is within the Land of Lisp, where writing a macro isn't something
"advanced" nor "complex" nor "esoteric". In the Lisp world, writing a macro is
95% similar to writing a run-of-the-mill function.

~~~
jdmichal
No true Scotsman would ever write macros in such a confusing manner!

But really, this is a recognized problem of Lisp, and has been called the Lisp
Curse. [0] One is _never_ programming in "just Lisp", but rather in Lisp plus
some half-baked DSL haphazardly created by whoever wrote the program in the
first place.

Also, don't confuse readability with understanding. Yes, DSLs are typically
easier to read, but only _after_ you come to understand the primitives of the
language. When every program has its own DSL with its own primitives, even
programs that do similar things... That becomes quite a burden.

[0]
[http://www.winestockwebdesign.com/Essays/Lisp_Curse.html](http://www.winestockwebdesign.com/Essays/Lisp_Curse.html)

~~~
klibertp
You're replying to an, arguably, "no true Scotsman" argument with a straw man
of your own:

> but rather in Lisp plus some half-baked DSL

Why does it need to be "half-baked"? Why do you assume that writing a good DSL
is impossible for most Lisp users? Are you sure it's actually the case?

~~~
AnimalMuppet
If I understand "the Lisp curse" correctly, the claim is that Lisp often winds
up with a "half-baked DSL" because making DSLs in Lisp is so _easy_. You can
do it without putting very much thought into it, so it's easy for the original
author to just slap something together.

Note well: This is my understanding of the claim. I take no position on
whether it is true.

------
aaron-lebo
The author is absolutely right that there's a ton of data from languages/open
source projects that if analyzed could prove massively useful.

But I'm not sure that the metrics in the referenced article have much to do
with language design. Most of what they are measuring (their conclusion says
as much) are people copy pasting files and entire projects. JS by their stats
is the worst about this but JS is a language that reduces duplication more
than Java (also measured with less copying).

Seems to be a social or dev skill issue rather than language design. Honestly
most languages have an excellent tool for reuse - the function/method, which
isn't used enough to remove all duplication as is.

 _We presented an exhaustive investigation of code cloning in GitHub for four
of the most popular object-oriented languages: Java, C++, Python and
JavaScript. The amount of file-level duplication is staggering in the four
language ecosystems, with the extreme case of JavaScript, where only 6% of the
files are original, and the rest are copies of those. The Java ecosystem has
the least amount of duplication. These results stand even when ignoring very
small files. When delving deeper into the data we observed the presence of
files from popular libraries that were copy-included in a large number
projects. We also detected cases of reappropriation of entire projects, where
developers take over a project without changes. There seemed to be several
reasons for this, from abandoned projects, to slightly abusive uses of GitHub
in educational contexts. Finally, we studied the JavaScript ecosystem, which
turns out to be dominated by Node libraries that are committed to the
applications’ repositories._

Reading the conclusion again, they're capturing the existence of
node_modules/, which has nothing to do with language design.

~~~
ulucs
Further info about node_modules from the paper

 _Because the JavaScript sample was so heavily (78 out of 80) dominated by
Node packages, we have performed the same analysis again, this time excluding
the Node files. This uncovered jQuery in its various versions and parts
accounting for more than half of the sample (43), followed from a distance by
other popular frameworks such as Twitter Bootstrap (12), Angular (7), reveal
(4). Language tools such as modernizr, prettify, HTML5Shiv and others were
present. We attribute this greater diversity to the fact that to keep
connections small, many libraries are distributed as a single file. It is also
a testament to the popularity of jQuery which still managed to occupy half
of the list._

------
mattmanser
In Visual Studio in the late 2000s, code snippets were all the rage: a new
feature that let you write boilerplate quickly.

Since .NET 3.5 I barely use them; I really only use the built-in ones for
foreach and property accessors. The addition of lambdas and short-hand
property accessors vastly reduced the boilerplate code in C#. I imagine the
addition of async/await has also helped a lot with certain other types of
code.

My round-about point is that existing languages can improve to reduce code
duplication.

------
tluyben2
I find more and more (given enough time for it, that is) that if code is
duplicated or bad or both, I can rewrite it in some (nonexistent) DSL
that makes it elegant. Then, when I try to translate it back into the language at
hand, it becomes (a lot more) inelegant again. So is the low hanging fruit there an
Alan Kay strategy of writing DSLs for specific domains (which would put languages
like Lisp, OCaml or Haskell in a good place), or should language designers find
ways to make the language itself better for these kinds of problems (and is
that possible, for instance with advanced types or elegant macros)?

The other comment about Django (which would also work for Rails); a lot of
times, big enough libraries/frameworks written in a language, are actually
DSLs which would suggest we lean toward the 'Alan Kay' 'make everything a DSL'
solution. The mathematician/CS person in me wants the 'improve the language'
solution. But low hanging fruit for language design I don't see here.

~~~
Joeri
The thing about DSL’s (or really any programming language abstraction) is that
they make it easier to understand the intent of the code at the expense of
making it harder to understand the effect of the code. I find that the
cognitive load of maintaining a project with lots of DSL’s is high. The
advantage of e.g. java over scala is that while java code is often more
verbose and ugly, it holds fewer surprises. Like anything there is a trade-off
here, adding more DSL’s can be better or worse.

~~~
tluyben2
> find that the cognitive load of maintaining a project with lots of DSL’s is
> high

Yes, that is currently why I design that kind of scenario in a (possibly blue
sky) DSL 'on paper' (if there isn't one handy) and then translate that back to
libraries and structures in the language I'm working in. I just notice that
this becomes annoying sometimes if the existing solutions really do not match
the DSL I envisioned.

------
JoelSanchez
"In Java, 70% of duplicates were due to code generation. For C++ is was 18%,
for Python 85%, for JavaScript 70%."

In the case of Java and JavaScript, there's already a language that is built
on top of them and has macros. It's Clojure/ClojureScript.

~~~
virtualwhys
Two actually, you forgot Scala/Scala.js ;-)

disclaimer: I won't argue about the superiority of one macro system over
another, the point is that macros exist in both languages.

------
indescions_2017
A nice complement to this is Andrej Karpathy's Software 2.0 essay. Namely, that
it is getting much easier to collect the data than it ever will be to write
the complex programs which can anticipate every case.

Software 2.0

[https://medium.com/@karpathy/software-2-0-a64152b37c35](https://medium.com/@karpathy/software-2-0-a64152b37c35)

One Model To Learn Them All

[https://arxiv.org/abs/1706.05137](https://arxiv.org/abs/1706.05137)

------
bunderbunder
> In Java, 70% of duplicates were due to code generation

I don't think, in Java's case, that fixing this is low hanging fruit. Java's
more reliant on code generation than comparable languages for a reason. Most of
the codegen I've encountered is stuff that would be hard to implement in a
more elegant way due to limitations of the platform's runtime environment
(such as type erasure) that make it hard to handle a lot of the dumb verbose
mapping work and stuff like that at run time.

Someone else mentioned Clojure. I haven't tried Clojure yet, but I'm willing
to believe it - dynamic typing seems like it could easily be a secret weapon
on the JVM, since the platform is so halfhearted at static typing. But I think
that teams that are committed to using a static, infix, non-S-expression
language may be painted into a bit of a corner here.

~~~
maaaats
A lot of our duplication in our Java-projects is due to Java classes generated
from wsdls or xsds or similar. I quite like it, actually. Can generate it from
a known xsd and version it as its own module.

How could a language solve this? I know F# can compile some stuff using things
online, but this means the build is suddenly not reproducible. And we could
always only generate that code on the fly when compiling, but that always
wreaks havoc on some tooling or IDEs when the code you're referencing isn't
there until compile-time.

~~~
bunderbunder
Something like F#'s type providers would work on the JVM. You can make the
build reproducible by just making sure that the inputs the type provider uses
are consistent. Practically speaking, the biggest difference between that
approach and more traditional code gen is that you don't have to add a whole
bunch of build steps and you don't have to have team arguments over whether
generated code gets checked into source control or not. It's actually great
for the IDE, which does know all that type information at development time
because it can invoke the type provider as a service. F# is the only language
I've seen where you can write inline SQL queries and get a red squiggly if you
mistype a field's name. (IntelliJ comes close, but, as far as I've been able
to find, it only works if you stick all your SQL queries in resource files.)

The other approach is to just handle these sorts of things at run time, using
the richer run-time information that your code has access to. So, e.g., if
you're binding an XML file to a List<Record<Employee>>, in C# it's possible to
actually express something like List<Record<Employee>>.class. The mapping
function can just look at that and figure out how to map data dynamically.
You, the developer, then get to own your own domain objects instead of having
to rely on ones that are auto-generated by some code generation process. You
don't have to accumulate a bunch of XML files and XML schema files and
associated obnoxious-to-maintain clutter. And you can do it without resorting
to a bunch of custom weirdness like Jackson's TypeReference class.
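
A rough sketch of that run-time approach, written in Python rather than C# and using plain dicts as a stand-in for the parsed XML. `Employee` and `bind` are hypothetical names made up for this example. It relies on the fact that Python (3.9+) keeps the element type of `list[Employee]` available at run time, which plays the role of the richer type information described above.

```python
from dataclasses import dataclass, fields, is_dataclass
from typing import Any, get_args, get_origin, get_type_hints


@dataclass
class Employee:
    """A hand-written domain object; nothing here is generated."""
    name: str
    age: int


def bind(target: Any, data: Any) -> Any:
    """Map plain data onto `target` by inspecting the type at run time."""
    if get_origin(target) is list:
        # list[X]: bind every element to X.
        (item_type,) = get_args(target)
        return [bind(item_type, item) for item in data]
    if is_dataclass(target):
        # Dataclass: bind each declared field from the matching dict key.
        hints = get_type_hints(target)
        kwargs = {f.name: bind(hints[f.name], data[f.name]) for f in fields(target)}
        return target(**kwargs)
    # Leaf types such as int or str: convert directly.
    return target(data)


employees = bind(list[Employee], [{"name": "Ada", "age": "36"}])
```

The mapper never needs a schema file or a generated class: the type expression passed in is the whole description of the target shape.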

------
nerpderp83
Joe Armstrong talked about immutable globally available code. Only then would
we not need duplication. #lazyweb please find a link to it.

Why do we need modules at all?
[http://lambda-the-ultimate.org/node/5079](http://lambda-the-ultimate.org/node/5079)

[http://erlang.org/pipermail/erlang-questions/2011-May/058768...](http://erlang.org/pipermail/erlang-questions/2011-May/058768.html)

------
AnimalMuppet
Duplicating my comment from the other discussion of this article:

> In Java, 70% of duplicates were due to code generation. For C++ is was 18%,
> for Python 85%, for JavaScript 70%.

C++ is the real outlier there. That could be because C++ code is much harder
to generate, but I don't buy that it's that much harder than Java. Or it could
be because C++ templates aren't considered "code generation". Or it could be
because C++ doesn't get used for projects with that much boilerplate code.
Or...

~~~
to3m
A C++ project that features generated code will probably run the generator
during the build process - and possibly build it too, if it's written in C++
itself - and so the generated code might well not end up in the repo.

------
krylon
Duplicating code is acceptable, IMHO, if it is either clear the two files are
going to diverge quickly, or if it looks like they will not be changed at all
in the foreseeable future.

Trouble starts if there are two source files that do almost the same thing but
slightly differently, and then you need to change them; now you probably need to
change these duplicate files in lock step, and that is a rather error-prone
process. IIRC, this kind of problem was the reason for adding templates to
C++. And say what you will about C++, but templates are a very powerful tool.

I fail to see the connection to language design, though. Is the author saying
that one should add some feature to make duplication as unnecessary as
possible (like templates in C++)? Or that the tooling around the language
should be better suited to automatic code generation?

FWIW, I think code generation is a very powerful tool; in a way, code
generation is meta-programming. (Is there a distinction at all?) And I think
that there is a lot of potential in this area. Go has supported the "go
generate" command for a while now, and I have seen a few very interesting use
cases (e.g. ffjson, which generates code to serialize/parse Go data types to
and from JSON more efficiently than the builtin reflection-based mechanism).
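
For a feel of what such a generator does, here is a toy sketch in Python. It is not how ffjson works; `generate_serializer` and `Point` are hypothetical names for this example. The generator inspects a type once and emits straight-line source code for it, so the serializer itself does no reflection when it runs.

```python
from dataclasses import dataclass, fields


def generate_serializer(cls) -> str:
    """Return Python source for a function that turns a `cls` instance into a dict."""
    lines = [f"def {cls.__name__.lower()}_to_dict(obj):", "    return {"]
    for f in fields(cls):
        # One literal line per field; no reflection is left in the output.
        lines.append(f"        {f.name!r}: obj.{f.name},")
    lines.append("    }")
    return "\n".join(lines) + "\n"


@dataclass
class Point:
    x: int
    y: int


source = generate_serializer(Point)

# A real generator would write `source` to a file at build time.
# Here it is executed directly to keep the example self-contained.
namespace = {}
exec(source, namespace)
point_to_dict = namespace["point_to_dict"]
```

Whether `source` is written to disk and checked in, or regenerated on every build, is exactly the trade-off discussed elsewhere in this thread.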

Someone once said (I forgot who), "I'd rather write programs that write
programs than write programs." That sums it up pretty well, I think. ;-) Okay,
okay, so now I _do_ see the connection to language design. Sorry, my fingers
were faster than my mind this time.

------
fallingfrog
C# is really bad about this. They seem to be heavy into the Magical
Boilerplate coding style in which your "empty" project consists of thousands
of lines of duplicated/generated code which you then make slight edits to. My
opinion is that if some piece of code is so common that it needs to be
inserted into every project - then it should be a library call. Apparently
that's a minority opinion though..

~~~
victorNicollet
This isn't my experience with C# (an empty project starts with a `Class1.cs`
file if it's a library or `Program.cs` if it's an application), so you might
be using something on top of C# that needs all of that boilerplate.

~~~
arethuza
Sounds like project templates in Visual Studio - which can populate your
project with vast amounts of stuff. This can be good or bad depending on the
context.

~~~
fallingfrog
An empty ASP.NET core application contains ~35,000 lines of code.

~~~
dahauns
No. Just looked, and the only large file generated is the dependency cache
(~7000 lines of json) which you never touch and can always be regenerated with
dotnet restore.

The rest is ~100-150 lines code+configuration and a readme file (~180 lines).

~~~
fallingfrog
Look- I get that a lot of this stuff (jquery..) is libraries, and some of it
is the visual studio solutions file, but if it's libraries, then _why was it
copied into my project directory?_

    
    
       1031 ./.vs/config/applicationhost.config
         19 ./.vs/WebApplication1/v15/.suo
          3 ./WebApplication1/.bowerrc
         10 ./WebApplication1/appsettings.Development.json
          8 ./WebApplication1/appsettings.json
         10 ./WebApplication1/bower.json
         24 ./WebApplication1/bundleconfig.json
         35 ./WebApplication1/Controllers/HomeController.cs
          1 ./WebApplication1/obj/Debug/netcoreapp1.1/CoreCompileInputs.cache
          0 ./WebApplication1/obj/Debug/netcoreapp1.1/TemporaryGeneratedFile_036C0B5B-1481-4323-8D20-8F5ADCB23D92.cs
          0 ./WebApplication1/obj/Debug/netcoreapp1.1/TemporaryGeneratedFile_5937a670-0e60-4077-877b-f7221da3dda1.cs
          0 ./WebApplication1/obj/Debug/netcoreapp1.1/TemporaryGeneratedFile_E7A71F73-0F8D-4B9B-B56E-8E70B10BC5D3.cs
         24 ./WebApplication1/obj/Debug/netcoreapp1.1/WebApplication1.AssemblyInfo.cs
       9830 ./WebApplication1/obj/project.assets.json
         17 ./WebApplication1/obj/WebApplication1.csproj.nuget.g.props
          5 ./WebApplication1/obj/WebApplication1.csproj.nuget.g.targets
         25 ./WebApplication1/Program.cs
         27 ./WebApplication1/Properties/launchSettings.json
         60 ./WebApplication1/Startup.cs
          7 ./WebApplication1/Views/Home/About.cshtml
         17 ./WebApplication1/Views/Home/Contact.cshtml
        108 ./WebApplication1/Views/Home/Index.cshtml
         14 ./WebApplication1/Views/Shared/Error.cshtml
         73 ./WebApplication1/Views/Shared/_Layout.cshtml
         18 ./WebApplication1/Views/Shared/_ValidationScriptsPartial.cshtml
          2 ./WebApplication1/Views/_ViewImports.cshtml
          3 ./WebApplication1/Views/_ViewStart.cshtml
         22 ./WebApplication1/WebApplication1.csproj
         37 ./WebApplication1/wwwroot/css/site.css
          0 ./WebApplication1/wwwroot/css/site.min.css
          0 ./WebApplication1/wwwroot/favicon.ico
          1 ./WebApplication1/wwwroot/js/site.js
          0 ./WebApplication1/wwwroot/js/site.min.js
         44 ./WebApplication1/wwwroot/lib/bootstrap/.bower.json
        587 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap-theme.css
          0 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap-theme.css.map
          5 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap-theme.min.css
          0 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap-theme.min.css.map
       6757 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap.css
          0 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap.css.map
          5 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap.min.css
          0 ./WebApplication1/wwwroot/lib/bootstrap/dist/css/bootstrap.min.css.map
        105 ./WebApplication1/wwwroot/lib/bootstrap/dist/fonts/glyphicons-halflings-regular.eot
        287 ./WebApplication1/wwwroot/lib/bootstrap/dist/fonts/glyphicons-halflings-regular.svg
        771 ./WebApplication1/wwwroot/lib/bootstrap/dist/fonts/glyphicons-halflings-regular.ttf
         93 ./WebApplication1/wwwroot/lib/bootstrap/dist/fonts/glyphicons-halflings-regular.woff
         72 ./WebApplication1/wwwroot/lib/bootstrap/dist/fonts/glyphicons-halflings-regular.woff2
       2377 ./WebApplication1/wwwroot/lib/bootstrap/dist/js/bootstrap.js
          6 ./WebApplication1/wwwroot/lib/bootstrap/dist/js/bootstrap.min.js
         12 ./WebApplication1/wwwroot/lib/bootstrap/dist/js/npm.js
         21 ./WebApplication1/wwwroot/lib/bootstrap/LICENSE
         24 ./WebApplication1/wwwroot/lib/jquery/.bower.json
       9831 ./WebApplication1/wwwroot/lib/jquery/dist/jquery.js
          4 ./WebApplication1/wwwroot/lib/jquery/dist/jquery.min.js
          0 ./WebApplication1/wwwroot/lib/jquery/dist/jquery.min.map
         36 ./WebApplication1/wwwroot/lib/jquery/LICENSE.txt
         39 ./WebApplication1/wwwroot/lib/jquery-validation/.bower.json
        997 ./WebApplication1/wwwroot/lib/jquery-validation/dist/additional-methods.js
          3 ./WebApplication1/wwwroot/lib/jquery-validation/dist/additional-methods.min.js
       1397 ./WebApplication1/wwwroot/lib/jquery-validation/dist/jquery.validate.js
          3 ./WebApplication1/wwwroot/lib/jquery-validation/dist/jquery.validate.min.js
         22 ./WebApplication1/wwwroot/lib/jquery-validation/LICENSE.md
         43 ./WebApplication1/wwwroot/lib/jquery-validation-unobtrusive/.bower.json
        415 ./WebApplication1/wwwroot/lib/jquery-validation-unobtrusive/jquery.validate.unobtrusive.js
          4 ./WebApplication1/wwwroot/lib/jquery-validation-unobtrusive/jquery.validate.unobtrusive.min.js
         22 ./WebApplication1.sln
      35413 total

~~~
wvenable
That's not an empty .net core project; that's a sample project that you can
run and see a whole website which itself is documentation on creating .net
core site.

However, I'm not sure Visual Studio gives you any way to avoid this when you
use the GUI to start a new core project. It is however possible to create a
non-Core ASP.NET project with literally nothing in it.

~~~
dahauns
VS sure does offer you a way to avoid this. Simply choose the "Empty" Template
when creating a new ASP.NET Core project. (Duh. :) )

------
alangpierce
There's a lot more to a programming language than the ability to avoid
duplicated code, and making the language more expressive is not always better.
The flexibility of the C++ preprocessor makes tooling (editor support, static
analysis, incremental builds, etc) much more difficult to write, and as a
result, state-of-the-art C++ tooling has a lot of disadvantages compared with
(say) Java tooling.

------
nerdponx
I would argue that many of the large extensions, plug-ins, and packages are
already their own programming languages. For example you don't just "learn
Python", you can "learn Django".

------
NotableAlamode
There is an interesting related paper " _Copy and Paste Redeemed_ " [1] that
is based on the assumption that a lot of copy/paste happens (and discusses why
this is not necessarily a bad thing). [1] investigates how copy/paste can be
auto-detected, how similar pieces of code can automatically be merged
together, and how abstractions can be automatically created from copy/paste
code.
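
To give a flavour of the auto-detection part only, here is a deliberately naive sketch. It is not the paper's technique, and `find_duplicate_blocks` is a hypothetical name. It just looks for runs of identical lines after trimming whitespace and skipping blank lines.

```python
from collections import defaultdict


def find_duplicate_blocks(source: str, window: int = 3) -> dict:
    """Map each run of `window` non-blank lines that occurs more than once
    to the 1-based line numbers where the run starts."""
    # Keep original line numbers, drop blank lines, ignore indentation.
    lines = [
        (number, text.strip())
        for number, text in enumerate(source.splitlines(), start=1)
        if text.strip()
    ]
    seen = defaultdict(list)
    for i in range(len(lines) - window + 1):
        chunk = tuple(text for _, text in lines[i:i + window])
        seen[chunk].append(lines[i][0])
    return {chunk: starts for chunk, starts in seen.items() if len(starts) > 1}


code = "a = 1\nb = 2\nc = a + b\n\nprint(c)\na = 1\nb = 2\nc = a + b\n"
duplicates = find_duplicate_blocks(code)
```

A literal comparison like this misses copies whose variables were renamed, which is presumably why real tools work on something richer than raw lines.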

I have not used the tool myself, so I cannot comment on how well it works in
practise. But I found the idea intriguing.

[1] K. Narasimhan, C. Reichenbach, Copy and Paste Redeemed
[http://creichen.net/papers/cpr.pdf](http://creichen.net/papers/cpr.pdf)

~~~
agumonkey
good approach IMO

------
TorKlingberg
Normally you shouldn't be checking generated code into version control. This
only measures GitHub users who did that anyway.

~~~
AnimalMuppet
Normally, yes, but...

If you don't check in the generated code, that means that you have to run the
generator every time you build (or at least every time you get a clean copy of
the tree and build). For code that changes very rarely, that may not be a net
win.

Also, if you don't check in the generated code, and then you upgrade to the
latest version of the generator, surprising things can happen (or even if
different people have different versions of the generator installed).

So there can be cases where checking in generated code can be at least a
reasonable thing to do.

------
typetehcodez
The pain of implementing an ORM and the generated code between OO and
Relational Databases might be low hanging in its explanation, but writing a
language that does this is probably a Hard Problem. I wouldn't call that low
hanging, but I might be really happy to use such a language.
([https://blog.codinghorror.com/object-relational-mapping-is-t...](https://blog.codinghorror.com/object-relational-mapping-is-the-vietnam-of-computer-science/))

------
ndh2
The fact that this is labeled as "low hanging fruit" tells us that the author
has never designed or implemented a language.

------
carapace
I was trying to express an inchoate thought the other day, something to the
effect that, "total volume of code in the world should be shrinking about
now". I think this article could be interpreted as pointing to that.

I imagine a Grand Refactoring... The "great compression" of '23, or
something...

------
z3t4
> In other words, 70% of the code on GitHub consists of clones of previously
> created files

I interpret this as code _reuse_ which is almost the opposite of duplication.

~~~
dllthomas
Duplication will always be reuse - why are you duplicating if not to use?

Reuse is not always duplication, however, and reuse without duplication is
quite often better.

~~~
z3t4
Repetition (like in DRY) is duplication without reuse. When I think of
duplication in the negative sense, I think of repetition, not reusing a module
or library. Another negative thing with duplication is taking up more disk
space, but a good file system will take care of that! Then there is "DLL
hell", or "library hell" where two programs use the same library, but
different versions ... How do you reuse without duplication ?

------
GorgeRonde
I'm a lisper at night so I tend to make heavy use of macros as well as
rather abstract patterns you can use in any language. This works for me
because

a) I'm the only maintainer of my own side projects (most of the time I never
finish nor publish them ... including a cool wysiwyg mouse-driven Clojure POC
debugger)

b) I tend to perform multiple full rewrites of my projects (up to 4 times).

c) My code tends to get obscure very quickly since I enjoy giving it an
"ontological" twist, and eventually this leads to less code but the code gets
less understandable from the static perspective of the source file it sits in,
so to get a good grasp of what it does, one needs to run the code and observe
what it does when it is evaluated (at macro-expansion time or run-time): this
is why I semi-successfully attempted to write the debugger mentioned above.

Meanwhile at work, I have not performed a full rewrite yet (understandably so)
and the goal is to keep the code as flat and linear as possible so that anyone
in the team can grasp what it does in one glance. Obviously this leads to a
lot of repetition, but this is for the greater good.

Currently I'm working on improving my dev experience with Clojure from two
angles:

1) Saner macros : I've been tweaking Clojure's reader so that code generated
with the backquote reader macro gets printed in its original form when using
pprint. For instance `(a b c) expands to (my-ns/a my-ns/b my-ns/c) and becomes
`(a b c) again when printed with pprint. I'm also thinking about expanding
macros in temporary files in order to get sane stack traces for code generated
by macros, but this is a surgically more complex thing to do.

2) "Macros" that expand and persist in the very file they are written in. This
allows for in-file debugging and should address point c) from above. Example:
At first the content of your file looks like:

    
    
      (debug
        (+ 1 2))
    

When you evaluate the file, it turns into (i.e. the file gets rewritten as):

    
    
      (debug
        (+ 1 2)
        ;;= 3
        )
    

Since the debugging/inspecting of how the code behaves at runtime gets
persisted in the file along with the code itself, it should allow for a
broader and more direct understanding of what the code does. Since this can
also be used as a language/library level snippet system, I've also been
considering using these in-file persisting macros as a templating engine for
code. In particular, if it is augmented with a conflict management system à la
git, one should be able to have flat/linear yet automatically generated code
and still be able to overwrite what's in a code template expansion without
losing the benefits of automatic code generation.
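
As a rough analogue of point 2) outside Lisp, here is a small Python sketch that works on text rather than on real macros. `persist_results` and the `#=` marker are made up for this example. It only handles single-line `debug(...)` expressions and evaluates them with `eval`, so it is only suitable for code you trust.

```python
import re

# A line of the form `debug(<expr>)`, optionally followed by an earlier `#= ...` result.
DEBUG_LINE = re.compile(r"^(\s*debug\((?P<expr>.*?)\))\s*(#= .*)?$")


def persist_results(source: str) -> str:
    """Rewrite `source` so each debug(...) line carries its evaluated result."""
    out = []
    for line in source.splitlines():
        match = DEBUG_LINE.match(line)
        if match:
            value = eval(match.group("expr"), {})
            # Any previous annotation is dropped and replaced, so re-running is safe.
            line = f"{match.group(1)}  #= {value!r}"
        out.append(line)
    return "\n".join(out)


before = "x = 1\ndebug(1 + 2)"
after = persist_results(before)
```

Because the old annotation is replaced rather than appended to, running the rewrite repeatedly leaves the file unchanged until the expression's value changes.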

------
amirathi
I highly recommend you read about composability, various levels of abstraction
and how it all plays together. A good programming language should strive for
clean interfaces and expose relatively lower level methods that can be
composed in different ways to build more complex logic. E.g. Ruby and Python
expose the methods they expose now, gems/packages use them to provide
higher level constructs, and Django/Rails combine these gems/packages to expose
even higher level constructs.

So there is no point in including the most common auto-generated code in the
programming language itself, because the tool that generated it can keep
evolving independently of the language and actually uses the lower level
constructs exposed by the language. By including the logic within the
programming language we'll just bloat the language and lose out on the beauty
of composability.

