Dr. Software: An unfinished journey starting from dirty code [pdf] (bitgloss.ro)
153 points by dragosslujeru 33 days ago | 46 comments

It's a nice primer on writing functional Java. However, due to some past experiences I feel a bit on-the-fence about it when it comes to actual real world advice:

1. I once happened to have a look at a Python codebase where some poor soul had tried to write Java in Python. Quite literally. It was not a good experience.

2. Many moons ago I had a task of implementing a repetitive something for many somethings in C++. Being the newbie I was I just set out typing out all the methods. Then a more senior colleague of mine, after having watched the ordeal for a few minutes, asked for my keyboard to type out a signature of a templated function to suggest that I could try 'something like that'. Noticing my blank stare he proceeded to delete the scribble with the words 'It's fine, you were well on your way to finish it!'. I stopped him, and saved the snippet in a file titled 'black_magic' to later grok it. Now I know. But I didn't then. I think he was wise in his approach to reality.

3. At some point I got accidentally involved in a C# project. To my eyes it was full of accidental complexity. But I was not expected to be a core contributing member later on. So I did my own thing in the parts of the code base allocated to me, changed all the structs to classes as required in code review, and made sure the whole thing fit together well with all the layers upon layers of abstraction. Because it wouldn't be mine to maintain, it had better be made the way the people who will maintain it like it.

So, the general advice 'When in Rome...' may apply here. I mean, what's the chance that after an 'enlightened' contractor changes a Java code base to a functional paradigm comes someone 'unenlightened' just to rewrite it back? Considerable, I think.

My take in the specific case of functional programming in Java is that the language will fight you every step of the way.

Checked exceptions are a decent example here. The set of checked exceptions a function might throw is part of its signature, and the built-in function types don't declare any checked exceptions.
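A minimal sketch of that friction, assuming a hypothetical `ThrowingFunction` interface and `unchecked` adapter (neither is part of the JDK; `load` is a stand-in for any method that declares a checked exception). Because `java.util.function.Function.apply` declares no checked exceptions, the method reference has to be wrapped before `Stream.map` will accept it:

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;
import java.util.stream.Collectors;

public class CheckedLambdaSketch {
    // Hypothetical interface: like Function, but allowed to throw IOException.
    @FunctionalInterface
    interface ThrowingFunction<T, R> {
        R apply(T input) throws IOException;
    }

    // Hypothetical adapter: rethrows the checked exception as an unchecked one.
    static <T, R> Function<T, R> unchecked(ThrowingFunction<T, R> f) {
        return input -> {
            try {
                return f.apply(input);
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        };
    }

    // Stand-in for any operation that declares a checked exception.
    static String load(String name) throws IOException {
        if (name.isEmpty()) {
            throw new IOException("empty name");
        }
        return "contents of " + name;
    }

    static List<String> loadAll(List<String> names) {
        // names.stream().map(CheckedLambdaSketch::load) would not compile,
        // because Function.apply declares no checked exceptions.
        return names.stream()
                .map(unchecked(CheckedLambdaSketch::load))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(loadAll(Arrays.asList("a.txt", "b.txt")));
    }
}
```

The wrapper works, but every checked exception type needs its own adapter (or a generic one), which is the kind of ceremony being described.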

But I've run into others. I find that Optional<T> and Java Streams have some design quirks that are mostly no big deal if you're using them in predominantly object-oriented or procedural code, but quickly become irritating if you're trying to follow a scrupulously functional idiom. And some of the design implications of how Java did generics at a language level have a way of sneak attacking you when you're trying to scrupulously follow a functional idiom. Not because of type erasure (really, Haskell is the poster child of type erasure; this definitely isn't about type erasure), but because some limitations in what kinds of generic types the language will actually allow you to express can quickly back you into mind-bending workarounds like the curiously recurring template pattern.


I'm really thinking here of functional programming in more of the ML/Haskell tradition, which is a peculiar way of doing things that does tend to lean pretty hard on the type system. If we're just talking about using higher-order functions, that's great. Higher order functions absolutely have a place in object-oriented programming. Java should have had them decades earlier than it did. The virtual machine that the JVM was largely based on was for a language that used them heavily, Strongtalk, and I'm not sure I understand why the Java team took them out. I'm sure there was some technical challenge, but I can't help but wonder if it's really because Java was born during the great "Lisp vs Everyone Else" war of the 80s and 90s, and the feature was purged for political reasons.

With HM type systems, the type system becomes an asset, rather than something you have to fight against or satisfy in peculiar ways. It's a breath of fresh air doing FP in a statically typed language that's not based on type erasure.

I think you might be misunderstanding what type erasure does. It's really just about whether type information is retained at run time, in order to support run time type checks. Hindley-Milner type checking, on the other hand, happens at compile time.

In a rigorously statically typed language like Haskell that does all the type checking at compile time, there is no need to retain types at run time, so they are erased. Java is a less statically typed language. Variables have types, but you're allowed to upcast all the way up to Object, and then do a dynamic type check and downcast at a later time. It's just that the language won't do this transparently for you the way it will in a more fully dynamically typed language. In order to support this, though, Java cannot erase types.

The controversy with Java is that, when generics were introduced, they decided that generic type parameters would be erased. This greatly undercuts the soundness of the type system. If you're not careful (or if you're trying to be clever), you can insert a String into an object that had been declared to be a List<Integer>, and you won't get any errors until you try to iterate over the list and a run-time type check suddenly blows up your program because some code finds itself interacting with a String even though the compiler's static type check had confirmed that the code should have only ever been able to operate on Integers.
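A minimal sketch of that scenario, with hypothetical class and method names. The raw-type assignment only produces a compiler warning (suppressed here), the `add` succeeds because the element type is not available at run time, and the failure appears later at the cast the compiler inserts when the element is read as an `Integer`:

```java
import java.util.ArrayList;
import java.util.List;

public class HeapPollutionSketch {
    @SuppressWarnings({"unchecked", "rawtypes"})
    static List<Integer> pollutedList() {
        List<Integer> numbers = new ArrayList<>();
        numbers.add(1);
        List raw = numbers;       // raw type: the compiler only warns
        raw.add("not a number");  // succeeds: no element type check at run time
        return numbers;
    }

    static String iterate() {
        try {
            int sum = 0;
            for (Integer n : pollutedList()) {  // compiler-inserted cast to Integer
                sum += n;
            }
            return "sum=" + sum;
        } catch (ClassCastException e) {
            return "ClassCastException";
        }
    }

    public static void main(String[] args) {
        System.out.println(pollutedList().size()); // prints 2
        System.out.println(iterate());             // prints ClassCastException
    }
}
```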

The thing to call attention to here, though, is that this is at least as much about Java's relatively weak static type system as it is about erasing type parameters. If Java (the language) had something closer to a Hindley-Milner type system with stronger static type checks, then it might have been able to catch shenanigans like that at compile time. It's a question of framing: the problem could have been handled by using type reification to enable stronger run-time type checks, but it might also have been solved by using stronger compile-time type checks to eliminate the need for those run-time checks in the first place.

That said, this is pretty deep out in speculative territory, since it's doubtful that either option could have been executed without major breaking changes, which Java definitely wasn't going to do.

> If you're not careful (or if you're trying to be clever), you can insert a String into an object that had been declared to be a List<Integer>

It's less about type erasure and more about missing covariance/contravariance in Java generics. Scala also has type erasure in generics (same JVM), but won't allow that (unless you explicitly upcast). HM doesn't recognize inheritance, so it's in a completely different category.

> The controversy with Java is that, when generics were introduced, they decided that generic type parameters would be erased. This greatly undercuts the soundness of the type system.

Not at all.

Like you said yourself, type erasure has consequences at runtime but zero impact on the type system.

What you probably mean is that the type system can be circumvented at runtime, but you don't need erasure for this, standard operations such as casting already allow that.

It has zero impact on a static type system, but it has huge impact on any run-time type checking.

So, for example, you can't really do dynamic typing and type erasure. More generally, you can't do any dynamic type checking, including casts when types have been erased. OCaml does a more complete version of type erasure, and also lacks a casting operation. Because you can't safely cast if the run-time can't dynamically verify that the cast is valid.

(Tangentially, and Java is a great example of this - static and dynamic typing are not a binary. There are actually quite a few different kinds of type disciplines, and, in practice, it's common for languages to do some things statically and some dynamically.)

The soundness problem for Java is that, with the way it did generics, they don't really fit cleanly into Java's original type discipline. There are some things that tradition would have dictated be handled dynamically, but can't be in the case of generics because generic types only have partial run-time type information. At the same time, they weren't handled by the compiler, either, presumably because, historically, they fell outside the type checker's bailiwick. So they ended up being nobody's responsibility, and just not checked.

Nowadays we do generally get warnings for the most common situations where this happens, but there's no real way to fix them. You just silence the warning with an annotation to tell the compiler, "Trust me, I know what I'm doing." Meaning that, yeah, there's even a little bit of weak typing in Java. At least as of Java 5.

> doing FP in a statically typed language that's not based on type erasure

What language are you talking about here?

Haskell, and OCaml/Reason

But those languages are type-erased.

OCaml is slightly lovely for being an object-oriented language that doesn't let you downcast.

Yep, following the idioms of a language is almost always better than trying to make it look like some other language, even if the emulated language is more fashionable.

> It's fine, you were well on your way to finish it!

Wholesome mentor is wholesome.

This isn't about refactoring OOP to FP, the author invents an example (strawman?) to refactor, and the first thing he does is complain about how it is a badly written CLI application. I stopped there.

> All projects that I got into had nasty, unnecessary issues. Almost all of them were started in a rush, to get the business off the ground and had created a parallel business of supporting the customers through all the defects they had. It’s almost like the businesses were caught in a startup limbo, for years. Sometimes decades! Working in such a company can take its toll on someone.

So what is more important to have: (1) a product with bugs that can be sold and raises money, or (2) a piece of software that has no bugs but ships x times as late, and thus teaches you your true requirements x times as late?

For someone bashing on ivory tower architects in the subsequent lines this sounds like pure irony. The mythical software engineer that doesn't produce bugs.

> Most defects in software come from poor engineering.

In my experience most defects come from not knowing the exact way your software will be used by the customer. Sometimes because of not asking thoroughly but more often because the customer doesn't know yet exactly. Also in my experience most bugs are actually change requests in disguise. Actions that you didn't imagine and never designed your software for but are executed by the customer and now cause trouble to your software.

> Oh and yes, design, architecture and other fancy words, are just engineering in the software industry, because it is not mature enough to have that level of abstraction, in which the architect draws the picture and the engineer implements it in such a way that it almost 100% of the time is right on the money. No. Far from it. What happens in reality is that the engineers try desperately to make the system work somehow, while maintaining the illusion of the story told by the architect.

I feel sorry for your loss. You must have worked in awful environments to think about it this way.

The next few sections I skimmed through read like the author is not proficient in software engineering at all, which makes me wonder whether this is experience from some coding (scripting) side projects rather than from full-time professional software engineering.

The trick between imperative/functional programming is that there is a balance involved. Some things make a lot of sense to put in a for loop and wrap with a try/catch. Interfacing with the real world is where I write the most of my dirty code. In software I am responsible for, the closer you get to the domain model, the more functional it becomes.

In my mind, the entire point of imperative programming is to construct a sort of "matrix" in which the functional programming can exist without having to be aware of the uncertain nature of the outside world.

If you want to build something complex really quickly, start with the domain model. And start with excel, not visual studio. Sit down with a spreadsheet in the conference room with all your business stakeholders. Iterate that document until every business person AND developer agree that it covers the needs on both sides (facts + normalization). Once you have that document completed, write the quickest & dirtiest MVP you can tolerate against it. Convene another meeting with the stakeholders. Check your gaps. Make another spreadsheet. Do this about 4-5 times, and then you can write "the one".

I skimmed the book a bit and caught a Java construct I never heard of:


I use the language very rarely. Is this a feature people like, use, know about?

The version of Java that features Records has only been out since March 2020, so I would suspect no.


Something like 2/3 or 3/4 of Java users are still on Java 8. So, yeah, no.

Sorry, where did you get this estimation from? (Genuine question).


Not OP, probably different numbers, but Snyk has recently published a JVM ecosystem report containing a similar question: https://snyk.io/jvm-ecosystem-report-2021/ (60% of people use JVM 8 in production, but 40% use multiple versions, so e.g. 62% use JVM 11)

It's just sort of in the air. There are several organizations that conduct and publish annual surveys asking Java people what version they're on. Jetbrains and Perforce do, and I think IBM has one, too. The numbers get tossed around in Java tech talks so often that I've long since stopped keeping track of which specific source everyone's quoting, but the figures generally fall in that range.

They may not be relevant based on how large the Java community is, but I think part of it is that the Java-targeting languages like Kotlin, Clojure, and Scala are still targeting Java 8.

Though, I would argue that many Kotlin and Scala users have a stronger incentive to migrate to Java 11 or later than Java users do. Java 8's JIT compiler and default garbage collector tend to punish you (performance-wise) for using Kotlin or Scala; Java 11 and later have made great strides to improve that.

Not sure about the other languages, but I've been using Clojure with Java 14 and GraalVM 11 for quite some time.

I don't know about Clojure, for example, but Kotlin targeted Java 7 for a long time. By which I mean that the standard library didn't use anything from Java 8 or later, and the compiler would emit Java 7 compatible code by default. You can still compile it to target later versions, though.

It's only fairly recently bumped up to Java 8 being its minimum (and default) Java version.

From a general survey point of view, there are confounding aspects to a single number like that. Since Java is an enterprise language by design, the difference between 'moving versions due to management declaration' and 'our major ecosystem component supports THIS' also factors into the overall picture.

As an outsider who does follow this for editorial reasons from time to time, most recent info (last year) was yes, projects surveyed are staying on (open) version 8.

Recently code reviewed some new Java code.

The author, who is brilliant in a number of ways, had written the code using the new functional style.

This meant a bunch of anonymous functions strung together in maps.

No comments of course since comments are generally bad these days.

If one does functional programming in Java, don't do that.

Unfortunately there's this idea that functional programming is just about using lambdas for everything. For me, proper functional style is just about decoupling business logic from state and side effects, and this can still be done with idiomatic OOP code.
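One way to read that, as a rough sketch with hypothetical names (`OrderServiceSketch`, `Ledger`, `totalCents`): the business rule is a pure method whose result depends only on its arguments, while the side effect sits behind an ordinary interface that an object calls. No lambdas are required for the separation itself:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class OrderServiceSketch {
    // Pure business rule: no fields read, nothing mutated, no I/O.
    static long totalCents(List<Long> itemCents, int discountPercent) {
        long sum = 0;
        for (long cents : itemCents) {
            sum += cents;
        }
        return sum - (sum * discountPercent / 100);
    }

    // The side effect lives behind a plain OOP interface (hypothetical).
    interface Ledger {
        void record(long cents);
    }

    private final Ledger ledger;

    OrderServiceSketch(Ledger ledger) {
        this.ledger = ledger;
    }

    // Thin imperative shell: compute with the pure rule, then perform the effect.
    void checkout(List<Long> itemCents, int discountPercent) {
        ledger.record(totalCents(itemCents, discountPercent));
    }

    public static void main(String[] args) {
        List<Long> recorded = new ArrayList<>();
        OrderServiceSketch service = new OrderServiceSketch(recorded::add);
        service.checkout(Arrays.asList(1000L, 2500L), 10);
        System.out.println(recorded); // prints [3150]
    }
}
```

The pure method can be tested without constructing the service or a ledger at all.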

What a strange title.

Shouldn't it be "Refactoring from imperative to FP"?

You can still keep the OOP attributes of your code while embracing FP, there is nothing mutually exclusive about these two paradigms.

Submitted title was "Dr. Software – Refactoring from OOP to FP [pdf]". We've changed that now, in keeping with the site guidelines: "Please use the original title, unless it is misleading or linkbait; don't editorialize."



So junior level programmers are not allowed to upvote stuff on the front page, or?

No, it just means that promoting garbage will result in people treating that garbage as good practice, particularly when it comes from HN.

Careful. Functional is often harder to debug. The very "state" functional likes to avoid is often an excellent x-ray point(s) for debugging. It's one of the reasons functional has failed to take off in the mainstream despite being around for 60-odd years.

Service-ability often trumps parsimony in team code. I'm just the observant messenger. There are exceptions, but they tend to be transient.

> Careful. Functional is often harder to debug. The very "state" functional likes to avoid is often an excellent x-ray point(s) for debugging.

Wat. Yes, if your bug relates to a certain state, then you'll have to debug the state, but (a function or a program) being stateless means automagically avoiding all bugs relating to state; thus you don't need those "x-ray points". Also, you don't need to know whether a given function mutates its argument or returns the result (or both), which is a pain in JavaScript, and even with Common Lisp's sort "function". If your language is immutable by default you just never have to watch out for accidental mutations. No "x-ray" needed, because functional programming (done properly) shouldn't be a leaky abstraction.

> but (a function or a program) being stateless means automagically avoiding all bugs relating to state; thus you don't need those "x-ray points".

That's not entirely true, at least the way I understand "state". You still can, and do, model state in stateless systems; the programming environment just doesn't give you mutable, named cells for free. State is modeled by, for instance, observing how the arguments change in each self-call of a recursive function, or by the value observed between stages of a functional pipeline. These are what I would consider the analogous "x-ray points".
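A small sketch of what that can look like, using hypothetical names. The "state" here is the pair of arguments `remaining` and `total`, which change on each self-call; the `trace` list is only a debugging aid added for illustration, not part of the computation:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class RecursiveStateSketch {
    // No mutable cell holds the running total: it is passed explicitly as an
    // argument, so each self-call is a point where it can be observed.
    static int sum(List<Integer> remaining, int total, List<String> trace) {
        trace.add("remaining=" + remaining + " total=" + total);
        if (remaining.isEmpty()) {
            return total;
        }
        return sum(remaining.subList(1, remaining.size()),
                   total + remaining.get(0),
                   trace);
    }

    public static void main(String[] args) {
        List<String> trace = new ArrayList<>();
        System.out.println(sum(Arrays.asList(1, 2, 3), 0, trace)); // prints 6
        trace.forEach(System.out::println);
    }
}
```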

I don't disagree with you, but I find the benefits are more about lifting state out of the steps and making it visible, rather than eliminating the concept of state.

Relatedly, "Immutability is not enough", from a couple days ago: https://news.ycombinator.com/item?id=27642263

> You still can, and do, model state in stateless systems;

Yea, in functional programming it's easier to focus on the essential state inherent to the problem, and avoid introducing accidental state.

> Relatedly, "Immutability is not enough", from a couple days ago: https://news.ycombinator.com/item?id=27642263

Okay, interesting article, but I don't agree with this bit:

> It turns out that our purely functional rendering code is sensitive to ordering in non-obvious ways. The first time I encountered this kind of bug, it felt strangely familiar – it’s something that often occurs in imperative programs. This is exactly the kind of problem that functional programming was supposed to help us avoid!

I don't think functional programming is supposed to help you if you misunderstand the essential logic of the problem. I'd argue that declarative languages (like Prolog and SQL) can help avoid this type of ordering bug though by making variable assignment explicit.

I agree with you on the linked article ^_^ but it does at least do a good job of showing how the concerns of state still exist in the absence of mutable data.

You can still step through functional code and examine local variables, you just have the assurance that the values of any local state are purely dependent on the inputs to the function and not any other external state. So it's ultimately easier to debug - any values that affect the operation of a function are explicitly set when calling the function, and you don't have to worry about figuring out ways to ensure that runtime-generated state matches whatever it was when the error originally occurred.

In theory you wouldn't have many "local variables" if you followed functional practices.

Re: "you just have the assurance that the values of any local state are purely dependent on the inputs to the function and not any other external state."

I don't see that principle helping much in practice.

Let me show you the interactive debugger for Clojure in Emacs:


     // procedural style
     func foo01(a) {
       b = x(a);
       c = y(b);
       r = z(b,c);
       return r;
     }

     // functional style
     func foo02(a) {
       return z(x(a), y(x(a)));
     }

With foo01, one can readily study the values of "b", "c", and "r" in a debugger and/or a Write() statement. With foo02 it's harder to do the equivalent. A good debugger will let you explore the output of a given function call in foo02, but you still have to rework the code to put those into a Write() statement. A direct debugger is sometimes not sufficient, especially in big loops.
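For comparison, a sketch of how intermediate values in a chained Java pipeline can be logged without unrolling it into named locals, using `Stream.peek`. The class name, the two stand-in steps, and the `log` list are hypothetical; whether this counts as less rework than adding a Write() call is a judgment call:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

public class PeekSketch {
    static List<Integer> pipeline(List<Integer> input, List<String> log) {
        return input.stream()
                .map(n -> n + 1)                      // stand-in for x
                .peek(n -> log.add("after x: " + n))  // observe without changing the value
                .map(n -> n * 2)                      // stand-in for y
                .peek(n -> log.add("after y: " + n))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        System.out.println(pipeline(Arrays.asList(1, 2), log)); // prints [4, 6]
        log.forEach(System.out::println);
    }
}
```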

It's also easier to read the first style. It's more like a bulleted list:

    - Step 1
    - Step 2
    - Step 3
The functional style is often an awkward horizontal string of chained steps. Sure, the formatting can be adjusted, but it takes more effort, training, and/or discipline to format functional code to make it legible to ordinary code maintainers. Procedural just handles riff-raff better; functional requires ideal conditions/staff.

PS, do I really deserve a -4 score above?

Both are functional styles.

    (defn foo01 [a]
      (let [b (x a)
            c (y b)
            r (z b c)]
        r))

    (defn foo02 [a]
      (z (x a)
         (y (x a))))

are both valid.

I prefer the first one where I can see the intermediate steps clearly.

this is not what procedural vs. functional is. Both funcs are equivalent.

What definition of "functional" are you using?

It sounds to me as if you don’t have much real world experience developing software in a functional language. Am I right?
