Hacker News
Making invisible glue code first-class (metaobject.com)
91 points by goranmoomin on June 12, 2021 | 29 comments


I recently watched a video [0] that contrasted all of the OO design patterns with functional programming equivalents. After reading about how this author wants to make "glue" code "first class", and how the Unix pipe operator is an example of single-character glue code, my thoughts immediately went to that presentation. Perhaps my "glue code" is simply good old "function composition"?

[0] Scott Wlaschin "Functional programming design patterns" https://vimeo.com/113588389

[Edit] Spoiler alert: each of the design patterns reviewed and contrasted in the video seemed to "disappear" in FP, since the equivalent is simply to use functions and/or function composition. Having recently transitioned from C# to F# for my .NET Core projects, I felt the "glue code" the author mentions disappear in a similar manner, with greatly reduced code bloat.


Yes, function composition is one kind of glue, and it's good that we have it. But it's only one kind of glue, and often not particularly useful (see the part of the post about algorithm DSLs).

As John Hughes put it in Why Functional Programming Matters, we need more kinds of glue.

See https://blog.metaobject.com/2019/02/why-architecture-oriente...

As to Design Patterns: the Smalltalk version of the Design Patterns book is a small booklet, most of the patterns just go away. As to the video: tried to watch, got to the "it's functions, functions functions and again functions" slide. For OO, it would be "it's objects, objects, objects and again, objects". Sadly, most FP-centered critiques of OO are at this level, or even worse. :-/

Overall, you're going to have more patterns in FP because its model of computation is less powerful. (See Concepts, Techniques, and Models of Computer Programming, https://www.info.ucl.ac.be/~pvr/book.html, for a definition of relative power.)


I think function composition is powerful glue, indeed. Attach some side effect to the arrow, and all of a sudden you can implement a state machine, a reader environment, promises, or even continuations.
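That reading can be made concrete with a minimal Python sketch (the `compose` helper and the `on_step` hook are names invented here, not from any library): the effect attached to "the arrow" runs between every pair of composed functions, which is enough to thread tracing, state, or logging through a pipeline.

```python
def compose(*fns, on_step=None):
    """Left-to-right function composition; on_step is the hypothetical
    'side effect attached to the arrow', run after every stage."""
    def composed(x):
        for fn in fns:
            x = fn(x)
            if on_step is not None:
                on_step(fn.__name__, x)  # the effect on the arrow
        return x
    return composed

trace = []
pipeline = compose(str.strip, str.upper,
                   on_step=lambda name, value: trace.append((name, value)))
result = pipeline("  hello  ")
print(result)  # HELLO
print(trace)   # [('strip', 'hello'), ('upper', 'HELLO')]
```

Swapping the `on_step` callback is how you'd get the different behaviors listed above: append to a log, thread state, resolve a promise, and so on.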


Scott is a great speaker who does an incredible job of explaining a ton of cool ideas in functional programming, and his F# slides make those ideas easy to understand and apply.

Absolutely a great video, and he has many others, in addition to his site fsharpforfunandprofit.com.


He's also got some other great videos on software design.


This is a really excellent topic to bring up -- the fact that glue code grows essentially quadratically.

Has it been a topic of exploration much in computer science? Any literature?

After all, sometimes it feels like most of the programming you wind up doing is glue code.

On the one hand, it feels like the kind of thing there should be better tools for: at the very least, write it declaratively and let the computer turn it into code.

But on the other hand, it feels like a lot of that already is taken care of. E.g. CSS is essentially declarative glue code for what would otherwise be a huge number of painting and layout lines of code.

And most of the glue code I myself end up writing is written specifically to handle various constraints around memory, disk space, etc. -- e.g. how to import a protobuf that changes every 30 seconds and convert it to 10,000 rows in a live database in a way that's performant -- and so isn't as amenable to something simple.

But I do still wonder if there's more opportunity here for more declarative coding when it comes to "glue".


> This is a really excellent topic to bring up -- the fact that glue code grows essentially quadratically.

I think this was the motivation driving Microsoft's Language Server Protocol.

https://microsoft.github.io/language-server-protocol/


LLVM (or, really, any IR/bytecode format) solves a similar problem for programming languages and compilers.


> After all, sometimes it feels like most of the programming you wind up doing is glue code.

This, except in spades.

> But on the other hand, it feels like a lot of that already is taken care of. E.g. CSS is essentially declarative glue code for what would otherwise be a huge number of painting and layout lines of code.

Every gui environment seems to have something like that. Whether it's as primitive as Win32 or as advanced as Android, you still program by putting widgets with properties in places. None of these really allow you to program the edges.

> And most of the glue code I myself wind up writing winds up being written specifically to handle various constraints around memory, disk space, etc -- e.g. how to import a protobuf that changes every 30 seconds and convert it to 10,000 rows in a live database in a way that's performant -- and so isn't as amenable to something simple.

If the glue that I wrote served that purpose, I'd be happy. If you're programming to meet a business requirement, it's not really glue: and performing fast enough is a business requirement.

I can handle the Win32 glue, and work around it, because I know it's there to make it easy to adapt code that was designed to work on computers with single digit megabytes of RAM.

If I write an Android program, the whole thing feels like writing glue, and the result is as chunky as an Android program always is. There's like one line of code to implement a requirement, surrounded by dozens of lines of code to fulfil the ever-changing requirements of the API. The best way of actually focusing on the requirements is by writing even more lines of Rx glue to turn the whole thing inside out, but the Rx glue isn't particularly concise or well documented either. The whole thing feels like I'm fighting against the people at Google, who give me an API that is neither efficient nor effective, nor some compromise of the two.

I do think these reactive style libraries and functional abstractions help to factor out some of the glue, if you don't have to write it all yourself. And when I can write a page in an FRP style, I can squint and imagine, with a bit more work, the business rules could be expressed in a declarative/markup language and even automatically drawn as a (state machine) diagram. In principle, with such a markup, you could write the glue for each platform and compile it to a native Swift iOS app or a Kotlin jvm Android app or JS app (using some library, perhaps Rx, as part of the target). (Naturally, you wouldn't have something that was truly write once. Platform specific logic and processes should still be captured in the business logic, when they aren't mere formalities. But it still gets the complexity way down.)

I think it's in that direction we should go. Just like we can declaratively specify a REST API in Swagger and compile it to a target language and library, we should declaratively specify the business logic of our program and compile it to a target language and library, instead of trying to specify our logic in terms of the APIs provided by some GUI service provider. The declarative language should be third party, so that it is genuinely adaptable to multiple platforms.


Treating shell scripts as first class has taken me on a bit of a journey over the last decade:

- Learned from shunit2 that it's possible to write solid shell scripts, and that writing test code in shell scripts is hard to learn but doable. Bonus: I'd rather write the test script in a more easily debuggable language, such as Python.

- Bash has many more surprises in store for you than any other mainstream language. How $IFS affects things, how looping over files is either completely trivial (`for file in *`) or damn near impossible (`while IFS= read -d '' -r -u 3; do …; done 3< <(find … -exec printf '%s\0')`) based on the pattern you want to match, exit codes of arithmetic assignments, all the tricks to deal with not having exceptions, how `ls` is unsuitable for scripting despite being ubiquitous, locales, variable flags, and so on.

- `shellcheck` is fantastic, especially for including URLs to more information about each warning.

- Greg's Wiki[0], despite being hard to find in search engines, is the best place by far for learning about Bash.

- Writing a shell script to do some fast, asynchronous processing is maybe 30-100 times faster than writing the same in Python, but getting the nitty-gritty details right is about 30-100 times harder as well.

[0] https://mywiki.wooledge.org/


One pattern I've seen to address this is to define a "plugin architecture" where disparate components are integrated into the main system via fixed "sockets". The sockets themselves are generic, as is the glue code that attaches plugin to socket.

This does create a lot of boilerplate, but that boilerplate is predictable and uninteresting, and a good candidate for code generation.
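As a rough illustration of the idea (all names here are hypothetical, not from any particular system), a socket can be as small as one fixed interface that every plugin is adapted to, so the system-to-plugin glue is written once rather than once per pair:

```python
from typing import Callable, Dict

# The generic "socket": every plugin is adapted to one fixed interface
# (dict in, dict out), so the framework never sees plugin-specific types.
Handler = Callable[[dict], dict]

class PluginSocket:
    def __init__(self) -> None:
        self._plugins: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._plugins[name] = handler

    def dispatch(self, name: str, payload: dict) -> dict:
        return self._plugins[name](payload)

socket = PluginSocket()
# The per-plugin adapter is the predictable boilerplate a generator could emit.
socket.register("upper", lambda p: {"text": p["text"].upper()})
print(socket.dispatch("upper", {"text": "glue"}))  # {'text': 'GLUE'}
```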


So it's kind of like OOP code with traits and interfaces?


Isomorphisms.

They let us think about things very easily, and express ideas very cleanly. When I think about a system, I treat isomorphic things as almost exactly the same.

But when actually writing code, I need to fully spell out the isomorphism. That is glue code. So the code becomes my clear ideas, with awkward glue in between.

My dream is a compiler that knows these isomorphisms, and automatically implements them with decent (perhaps guided by me) optimization.

This way, the idea in my head and the code in my editor start looking like each other a whole lot more.
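A tiny Python example of what "fully spelling out the isomorphism" looks like in practice (the representations chosen here are arbitrary): two interchangeable encodings of the same data, with the conversion functions as the glue.

```python
# Two isomorphic representations of the same facts: a dict keyed by name,
# and a sorted list of (name, value) pairs. Conceptually identical, but
# the isomorphism still has to be written out as conversion functions.
def dict_to_pairs(d):
    return sorted(d.items())

def pairs_to_dict(pairs):
    return dict(pairs)

d = {"b": 2, "a": 1}
round_tripped = pairs_to_dict(dict_to_pairs(d))
print(round_tripped == d)  # True: the glue is faithful in both directions
```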


I believe the activefacts project is within a year or two of having this ability (the hard research is done, but not the legwork to connect it to everything else).

Specifically, it should be able to automatically convert between any representation of any subset of the facts in a model.


I'm not understanding the importance of distinguishing glue code from non-glue code. It's still work that needs to be done, whether it's "first class" or not. That difference boils down to who writes it, not the fact that it is present in the code base (as measured by LOC, which the author measures and attributes largely to glue code).


I read this article but am having a hard time understanding the point in concrete terms. What does it mean to make invisible glue code first class? How would it reduce the LOC count, which is something the author mentions many times?


Glue code is almost always uninteresting, incidental boilerplate. But as the author points out, it makes up the large majority of a code base because it grows quadratically with the number of components that need to be glued together.

A powerful language to describe what needs to connect to what (similar to the Unix pipe) could perhaps make the glue code much more obvious and concise. Whether that's achievable is the question the author is asking, and apparently trying to answer.
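The quadratic-versus-linear distinction can be sketched numerically (the format names below are illustrative): direct pairwise glue needs an adapter per pair of components, while a shared intermediate representation, pipe-style, needs one adapter per component.

```python
from itertools import combinations

components = ["csv", "json", "xml", "protobuf", "yaml"]

# Direct glue: one adapter per unordered pair of components.
pairwise_adapters = list(combinations(components, 2))
print(len(pairwise_adapters))  # 10, i.e. n*(n-1)/2 for n = 5

# Hub-and-spoke glue through one common representation (what the Unix
# pipe's shared text stream buys you): one adapter per component.
hub_adapters = components
print(len(hub_adapters))  # 5, i.e. n
```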


It's worth researching the demoscene[0] and how demos that looked very complex were made with as few lines of code as possible. Every byte was accounted for and nothing went to waste.

This is a common thing in games; Super Mario even reused the cloud sprite for the bushes.[1]

Now we have to deal with gargantuan Electron apps that hog your PC's resources for housing what essentially is a lightweight webapp.

[0] https://en.wikipedia.org/wiki/Demoscene

[1] https://www.todayifoundout.com/index.php/2010/01/the-clouds-...


A lot of time and complexity went into inventing those simple things. If you've ever made something 'large' in a very constrained environment (like an MCU with 24 KB for everything, including the drivers and the OS), which is what old home computers and consoles had in common with demo programming (getting every ounce from a system with artificial or real limitations), you will have noticed that this is hardly comparable with what the author is saying. You mix glue and other code, and sometimes merge them into one thing (stuffing a hardware driver inside the business logic), and so on. In the modern demoscene there may be more architecture (depending on the allowed size of the end result), but in the other cases the end result bears far less resemblance to anything that could be considered an architecture, or anything you'd want to work with afterwards. The code fits in your head and you understand every byte; that does not scale much beyond these examples, and it's not something the author is targeting or referencing, aside from line count. And if we're really just counting lines, then K/J or code golfing would fit, with varying degrees of readability in the end result, just like your examples and most embedded code (not talking Raspberry Pis; talking about what runs in your dishwasher).


I posted this to Lobste.rs when this story hit there and I'll post it again here:

> The pipe is one-character glue: “|”.

I think that this is mistaken. Pipe makes it easy to connect the output of one program to the input of another, but that is not the same as “gluing” them - you have to do text processing to actually extract the fields of data out of the output from one command and convert it into the right format for input to another.

This is why you see tr, cut, sed, awk, perl, grep, and more throughout any non-trivial shell script (and even in many trivial ones) - because very, very few programs actually speak “plain text” (that is, fully unstructured text), and instead speak an ad-hoc, poorly-documented, implementation-specific, brittle semi-structured-text language which is usually different than the other programs you want them to talk to. Those text-processing programs are the Unix glue code.

The explicit naming of "glue code" is brilliant and important - but the Unix model, if anything, increases the amount of glue code, not decreases it. (And not only because of the foolish design decision to make everything "plain" text - the "do one thing" philosophy means that you have to use more commands/programs (which are equivalent to functions in a program) to implement a given set of features, which increases the amount of glue code you have to use - it gives you a larger n, which is really undesirable if your glue code scales as n^2.)
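To illustrate the kind of glue this comment means (the command output below is mocked up, not captured from a real system), here is the Python equivalent of the field extraction that tools like cut and awk perform between two programs:

```python
# Mocked-up ps-style output: a header line plus whitespace-separated columns.
ps_style_output = """PID TTY TIME CMD
101 pts/0 00:00:01 bash
202 pts/0 00:03:12 python"""

# The glue: skip the header, split on whitespace, keep field 1 -- i.e. the
# pipeline stage `... | awk 'NR > 1 { print $1 }'` written out by hand.
pids = [line.split()[0] for line in ps_style_output.splitlines()[1:]]
print(pids)  # ['101', '202']
```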


I don't think that's right. If you compare bash:

    ls | sort
with some pseudocode:

    stdin_ls <- empty_read_stream()
    stdout_ls <- init_write_stream()
    stdin_sort <- init_read_stream()
    stdout_sort <- init_write_stream()
 
    plug_stream(stdout_ls, stdin_sort)
    stream_sink(stdout_sort, function_that_processes_lines)

    handle_ls <- start_proc("ls", stdin_ls, stdout_ls)
    handle_sort <- start_proc("sort", stdin_sort, stdout_sort)

    wait_for_both(handle_ls, handle_sort)
it seems that "|" is definitely gluing things together.
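For comparison with the pseudocode, here is roughly the same wiring written against a real process API, Python's subprocess (the producer command is arbitrary, chosen so the example is deterministic):

```python
import subprocess

# Everything "|" does, spelled out: start both processes, connect the
# producer's stdout to the consumer's stdin, then wait for both.
producer = subprocess.Popen(["printf", "banana\napple\n"],
                            stdout=subprocess.PIPE)
consumer = subprocess.Popen(["sort"], stdin=producer.stdout,
                            stdout=subprocess.PIPE)
producer.stdout.close()  # drop our copy so sort sees EOF when printf exits
output, _ = consumer.communicate()
producer.wait()
print(output.decode())  # "apple\nbanana\n"
```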

It's true that it doesn't do everything you want. But it's truly glue. It's crazy to imply that there's only one level of glue.

You identify other glue: if you want to sort by something that isn't the whole line, you need to write a bunch of glue. But that doesn't mean this isn't glue.

> the foolish design decision to make everything “plain” text

I don't think you can call it foolish and be taken seriously. You can disagree about whether it was the right design decision, but it was neither the first nor the last system to transfer unstructured data and rely on external programs to interpret them. It seems to have paid off too - systems descended from this design are widespread. You might not like paying the tradeoff, but a lot of people find it productive.

Powershell has some improvements on top of the Unix model because it saw the benefits of the Unix model and the costs of it. In either case, the gui world has nothing like the productivity of either system.


> Powershell has some improvements on top of the Unix model because it saw the benefits of the Unix model and the costs of it.

yes, the pattern is to have some internal normalized generic-enough data structure on which pipes will operate. Then all you need is the ability to convert non-normalized input sources into it, and the same with the output formats.

BTW: the UNIX pipes forks the processes, but Windows' CMD.com was executing the stages in the pipeline sequentially, not sure about PowerShell.

> the gui world has nothing like the productivity of either system.

I think Tcl/Tk is a kind of GUI glue. It's very easy to build a GUI front-end for existing CLI commands / scripts.

EDIT: also COM/OLE, ActiveX, JavaBeans were promising code re-use for GUIs


cmd.exe most certainly does execute the pipeline in parallel, much as unixes do. The old MS-DOS command.com would not (since only single-process-at-a-time execution was even supported). I have no idea if the Windows 9X/ME version of command.com worked the same way as DOS or not, and I don't feel like firing up a VM to find out.

Now, cmd.exe has maintained a lot of stupidity, and even many obvious bugs from the old command.com executable, in order to ensure that positively ancient scripts still run as expected. (This includes horribleness like the cmd.exe code not being able to read ahead in the .bat file, in order to support old self-modifying .bat file tricks!)

But sequential-only pipelines are not part of the stupidity that was kept in place.


Your example is intentionally and extremely contrived so as to support your point, and not representative of the real world at all. For instance, if you suddenly run ls -l and then want to sort by size instead (yes, I'm aware of -S, and it's irrelevant), then suddenly you have to use far more than just a pipe character. Moreover, the pseudocode is absolutely ridiculous - a more accurate example would be a single function call to list items in a directory, and then another call to sort() with a lambda to pull out the correct field to sort on. In Common Lisp:

(sort (ls) #'< :key (lambda (entry) (size entry)))

Compare that to your pseudocode snippet.

> It's true that it doesn't do everything you want.

My point was that it doesn't do anything. It's like function composition in any modern programming language - it's so incredibly trivial that nobody mentions it, because it's both necessary for most operations but also completely insufficient for doing those operations on its own. It's not "glue" in the same way that function composition is not "glue" - it doesn't do any more connection than just enabling things to theoretically exchange data in some way.

> But it's truly glue.

Maybe according to your own definition, but certainly not in the way that the article (and I) use it, which is "the unglamorous, invisible code that connects two pieces of software, makes sure that data that's in location A reaches location B unscathed (from the database to the UI, from the UI to the model, from the model to the backend and so on...)" ...which clearly means that the data is actually transformed into a format that the receiving code can read - which pipe does not do.

> I don't think you can call it foolish and be taken seriously.

If someone doesn't take me seriously (after I make my point, of course), then that's an indication that their own mind has been blinded by a zealotry for the Unix style, not that my point is invalid.

> it was neither the first nor the last system to transfer unstructured data and rely on external programs to interpret them

...which doesn't matter, because it was still a design decision that the Unix architects made, and one of the things for which Unix is best known.

> It seems to have paid off too - systems descended from this design are widespread.

The idea that popularity implies a good system design is a pretty well-known fallacy. Look at PHP - nearly universally acknowledged to be awfully-designed, and yet extremely popular. Windows, JavaScript, Java, and more - all popular, all poorly-designed.

> You might not like paying the tradeoff, but a lot of people find it productive.

You can say the same about dynamically-typed programming languages, and weakly-typed programming languages, and even untyped programming languages (Tcl, where everything is a string) - an illusion of productiveness, where you spend much more time down the line debugging mistakes caused by the flaws in your methodology.


You are mistaken.

Connecting them is gluing them.

What you are describing is massaging the data so that the computation yields the result that you want.

I am not saying that Unix pipe got everything right, it did not. And the "everything is text" idea is a compromise that is both brilliant and awkward. It is brilliant because it means that the end-user packaging and compositional packaging of functionality is actually the same, which is generally hard to impossible to achieve, and something people generally get wrong. That is, they create end-user packaging for their components and then wonder why it doesn't compose. Or create compositional packaging and wonder why it's unusable.

But of course you are stuck with the awkwardness that often things don't express so well as text.

Another thing is that filter composition is only applicable to small subset of problems (though a larger one than is often thought), going outside that range quickly leads to "unnatural programs" (see Concepts, Techniques, and Models of Computer Programming, https://www.info.ucl.ac.be/~pvr/book.html).


However, to the post's point, the glue (the arguments to tr, sed, etc.) becomes explicit and easy to identify.


I don’t understand why “master” is problematic but “first class” is okay.


Perhaps because the passengers in steerage had some choice in the matter, and didn't generally get sold, separated from their family, and tortured on arrival.


"Best of breed" always makes me think of eugenics gone awry.



