
Sharing Code Between Projects: Lessons Learned in the Trenches - peter_d_sherman
https://www.smashingmagazine.com/2018/04/sharing-code-between-projects/
======
peter_d_sherman
Excerpt:

"Finally, we decided to look deep into the open-source projects on GitHub,
checking for both duplications and re-implementations of a simple isString
function in the 10,000 most popular JavaScript GitHub projects.

Amazingly, we found this function was implemented in more than 100 different
ways and duplicated over 1,000 times in only 10,000 repositories. Later
studies claim that over 50% of the code on GitHub is actually duplicated. We
realized we were not the only ones facing this issue."

Discussion:

Intuitively, there's something really big here... What if we thought about
functions as patterns with specific intents, ranging in scope all the way
from simple logical and bitwise functions up to whole programs...

Statistically then, some will repeat more than others, and how often a
function repeats should be more or less inversely proportional to its
size/complexity. Well, if we could identify all of those automatically (or at
least those that repeat the most), perhaps with machine learning, then in
theory we could refactor a whole bunch of codebases to call the most common
extracted functions from a single code repository...
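
To make the "identify them automatically" part a bit more concrete, here is a
minimal sketch in Python (my choice of language; the article's example is
JavaScript). It uses no machine learning at all: it parses each function,
renames parameters and local variables to positional placeholders, and hashes
what is left, so two functions that differ only in naming get the same
fingerprint. The names fingerprint and find_duplicates are hypothetical, and
the approach only catches exact structural clones, not re-implementations
with the same intent.

```python
import ast
import copy
import hashlib
from collections import defaultdict


class _Normalizer(ast.NodeTransformer):
    """Rename the function, its parameters, and its assigned locals."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        # First name seen becomes v0, the second v1, and so on.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_FunctionDef(self, node):
        node.name = "f"
        self.generic_visit(node)
        return node

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        return node

    def visit_Name(self, node):
        # Only rename names we own (parameters, assigned locals); globals
        # and builtins such as isinstance keep their real names.
        if node.id in self.names or isinstance(node.ctx, ast.Store):
            node.id = self._alias(node.id)
        return node


def fingerprint(func_node):
    """Hash a function's structure, ignoring the names it introduces."""
    normalized = _Normalizer().visit(copy.deepcopy(func_node))
    dumped = ast.dump(normalized, annotate_fields=False)
    return hashlib.sha256(dumped.encode("utf-8")).hexdigest()


def find_duplicates(sources):
    """Map of label -> source text in, groups of structural clones out."""
    groups = defaultdict(list)
    for label, text in sources.items():
        for node in ast.walk(ast.parse(text)):
            if isinstance(node, ast.FunctionDef):
                groups[fingerprint(node)].append((label, node.name))
    return [members for members in groups.values() if len(members) > 1]
```

Feeding it a dict such as {"project_a": "<source text>", ...} returns the
groups of (label, function name) pairs that share a fingerprint. Docstrings,
comments turned into different statements, and reordered code all defeat it,
which is roughly where something smarter than hashing would have to take over.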

Of course there would be many problems and challenges with this. For string
functions, say, there might be many implementations of a given string
function, and those specific implementations might not be 100% compatible with
one another... But all of those challenges and problems aside, _there's
something there_... Future Software Engineers and would-be Software Engineers
should explore this... I claim it could be a very rich and rewarding (well,
intellectually! <g>) area of investigation...
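
As a rough illustration of the compatibility problem mentioned above, here
are three plausible ways to write an "is this a string?" check, transposed to
Python as an assumption (the article's isString is JavaScript). All function
names and the Label class are hypothetical. The three checks agree on
ordinary inputs and disagree on edge cases, so swapping one for another
during an automated refactor could quietly change behavior.

```python
def is_string_isinstance(value):
    # Accepts str and any subclass of str.
    return isinstance(value, str)


def is_string_exact(value):
    # Accepts only exact str instances; subclasses are rejected.
    return type(value) is str


def is_string_ducktyped(value):
    # Accepts anything with string-like methods, which includes bytes.
    return hasattr(value, "lower") and hasattr(value, "strip")


class Label(str):
    """Hypothetical str subclass, used only to expose the disagreement."""


CHECKS = (is_string_isinstance, is_string_exact, is_string_ducktyped)


def verdicts(value):
    """Return each check's answer for the same input, in CHECKS order."""
    return tuple(check(value) for check in CHECKS)


if __name__ == "__main__":
    for sample in ("abc", Label("abc"), b"abc", 42):
        print(repr(sample), verdicts(sample))
```

Any tool that merges duplicates into one shared function would need to decide
which of these behaviors is the "real" one, or keep them apart.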

