
Lessons learned porting 50k loc from Java to Go - hu3
https://blog.kowalczyk.info/article/19f2fe97f06a47c3b1f118fd06851fad/lessons-learned-porting-50k-loc-from-java-to-go.html
======
mykowebhn
"Both Java and Go have interfaces but they are different things, like apples
and salami."

------
sathishvj
One of the most interesting parts for me was the time taken - 601 hours for
about 50k loc. That could be a starting point estimate for other projects.

~~~
archagon
I’d say that’s not nearly enough time to write 50k loc from scratch! (Unless
I’m horribly slow and no one is telling me.) Porting is a very different kind
of work from feature development, since you already know the inputs and
outputs of your system.

------
hamdouni
Thanks for sharing. I understood the contract was about replicating, as much as
possible, the structure and workflow of the original Java code. But if you
had to develop the same client in Go from scratch, what would be the main
differences in your approach, and how would the Go code benefit?

~~~
kjksf
I would split it into multiple packages.

I would survey existing libraries for other JSON document databases (MongoDB,
Google's Firestore etc.) and steal all their best ideas.

But honestly, good API design is something that needs a lot of time to reach
the polished state.

Whatever I can come up with today, I'm sure a year from now I would find ways to
improve that.

That has been my experience with much smaller libraries I wrote.

The more time you get to spend thinking about the problem, the better solution
you can come up with.

------
rurban
> Today if I was setting this up from scratch, I would stick with just
> AppVeyor, as it can now do both Linux and Windows testing and the future of
> TravisCI is murky, after it was acquired by private equity firm and
> reportedly fired the original dev team.

I would rather go with CirrusCI, which has Windows, macOS, Linux and FreeBSD
support, is much faster and is easier to work with. AppVeyor is the
slowest, and Travis has the worst features.

~~~
kjksf
Thanks for the tip, CirrusCI looks interesting.

------
michaelcampbell
Some of these examples were obviously cherry-picked for one side or the other.
Comparing against getters/setters, which are written once, generally
automatically by the IDE, and never looked at again, while omitting the
ubiquitous "if err, _ = f()" rote, seems a bit disingenuous.

~~~
farmdawgnation
They also neglect to mention that there are active conversations in the Java
community around how to effectively implement Data Classes. For example:
[https://cr.openjdk.java.net/~briangoetz/amber/datum.html](https://cr.openjdk.java.net/~briangoetz/amber/datum.html)

------
andonisus
I personally have ported about 20K LoC from a java application to go; the
majority of the code was from an underlying library used to model the data
structures being used. It was mostly a class-by-class, function-by-function
process of porting the base data structure and then all of its
implementations, and culminated with the port of the services which use all
of it.

Regarding the author's frustration of moving POJOs over (and their variable
declarations), I used sublime text to select all variables based on a shard
token, then cut them and moved them over by word. You can then lower case the
first letters of every word, and then find-and-replace by type using a shared
token. I found this method very quick and effective.

~~~
grogenaut
shard token top of 2nd paragraph threw me for a loop, is this some sublime
feature I don't know about? Means Shared token.

~~~
andonisus
Yeah, that was a typo: I meant shared. Sorry!

------
eclipseo76
> Java allows circular imports between packages.
>
> Go does not.

Huh? As a package maintainer for a lot of Go stuff, I had to deal with tricky
cyclic dependencies several times, especially in Google's own Go packages like
golang.org/x/build and google cloud.

~~~
rat9988
What do you mean by "deal with"?

~~~
thanatos_dem
They had to take it into an alley and kill it, mafia style.

More seriously, they probably mean finding ways to restructure packages and
extract shared dependencies into subpackages to avoid cyclic dependencies.

~~~
rat9988
Which in turn means that go doesn't allow it, hence the original point.

------
dxxvi
If I'm paid to port it from Java to Scala, I'll write a blog like this after
I'm done (maybe I'll never be done :-) ).

~~~
masklinn
Why would you be paid to port a java library to scala? Scala developers can
use the Java client library directly.

~~~
rsynnott
There can be benefits to having a ‘native’ Scala library, which uses scala
idioms, but realistically it would probably just wrap the Java one.

------
networkimprov
Go has code coverage analysis in its testing facilities... why not use those?

~~~
kjksf
I do use builtin coverage.

It's just it's nice to be able to track code coverage over time, which is what
codecov.io provides.

Also, running all tests (to get full code coverage) takes 20 minutes and makes
the fan on my laptop unhappy.

The way it works is that on every checkin the CI job runs all the tests with
code coverage enabled and uploads the results to codecov.

Codecov can then plot coverage over time.

My gripe is with inaccurate accounting of empty lines (like comments or struct
definitions) by codecov. Go's tool to visualize this counts them properly. I
don't know if it's codecov or maybe I'm not sending the data properly.

~~~
jimsmart
In the above linked article you say:

> Codecov is barely adequate. For Go, they count non-code lines (comments
> etc.) as not executed. It's impossible to get 100% code coverage as reported
> by the tool.

I have an open source Go project that has some comments in methods, but still
achieves 100% coverage using Codecov.io — I'm not sure what I do differently
to yourself? (Perhaps I'm not using any inline struct definitions?)

Here's a link, in case there's anything useful to you in my .travis.yml ?
[https://github.com/jimsmart/store4](https://github.com/jimsmart/store4)

HTH

~~~
kjksf
It looks like we're both doing the same thing.

I don't know, maybe I'm not counting things right but for example
[https://codecov.io/gh/ravendb/ravendb-go-client/src/master/d...](https://codecov.io/gh/ravendb/ravendb-go-client/src/master/delete_attachment_command_data.go)
shows less than 80% coverage and there are only 2 lines not executed out of at
least 18, which should be at most 10% counted as not covered.

~~~
jimsmart
In the example you link, there are 18 coloured lines, 4 of which are not
green: 4/18 = ~0.22 = ~22%. This tallies-up as expected with the 77.78%
coverage shown at the top of the file.

Codecov* doesn't count an 'if' statement as having full coverage unless one
tests both outcomes: so the yellow lines here have been executed, but do not
count towards your coverage score.

Granted, one could argue that that's not very generous! But on the other hand:
those yellow lines have not been fully tested, despite being executed, so I
can understand their decision.

In the linked code, just implement a couple of simple tests to test for the
expected error conditions: it's easy (here at least) and ensures the code
behaves as expected. (Obviously not all partial/no coverage lines will be so
easy to hit with tests, it might not always be possible to easily get 100%
coverage, but hey: start with the low hanging fruit!)

* I say Codecov here, but I highly suspect that they may simply be using Go's coverage reports under the hood?

------
MrBuddyCasino
Can anyone explain what’s happening here? I can’t make sense of the syntax:

func (q *Query) GroupBy(field string) *Query {...}

~~~
notafrog
I like the Go language for all of its efficiency and easy concurrency, but
part of me can't help but think that they were dying to make some new syntax
just for the sake of it.

~~~
pjmlp
That syntax was based on Oberon-2 (1991), given Robert Griesemer's presence on
the team.

    
    
        PROCEDURE (q : Query) GroupBy*(field:  String): Query;
        BEGIN
         (* .... *)
        END GroupBy;
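
For comparison, here is a minimal Go sketch of the same shape: a method declared with a pointer receiver that returns the receiver so calls can be chained. The `Query` type and its `groupBy` field are hypothetical stand-ins, not the actual RavenDB client code.

```go
package main

import "fmt"

// Query is a hypothetical stand-in for the client's query builder.
type Query struct {
	groupBy []string
}

// GroupBy has a pointer receiver (q *Query), so it mutates the caller's
// Query, and it returns that same pointer so calls can be chained.
func (q *Query) GroupBy(field string) *Query {
	q.groupBy = append(q.groupBy, field)
	return q
}

func main() {
	q := &Query{}
	q.GroupBy("country").GroupBy("city")
	fmt.Println(q.groupBy) // prints [country city]
}
```

The parenthesized part before the method name plays the role of `this` in Java, and the type after the parameter list is the return type.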

------
lukeh
Also, null is not the empty string...

~~~
coldtea
He knows it, that's why he mentions lack of null (nil) in Strings in Go as an
issue when porting the code.

But depending on how the Java client uses null, "" can do just as well. It's
not like you have many other options (except to add your own composite struct
on top of String, or to use a guard value that's still a string).

~~~
fileeditview
In most cases this would probably be bad but he could also use a pointer to a
string.
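
A minimal sketch of the pointer-to-string option, assuming a hypothetical `Document` type: a nil pointer stands in for Java's null, while a pointer to "" stays distinguishable from it.

```go
package main

import "fmt"

// Document is a hypothetical record; Name is a *string so that nil can
// stand in for Java's null while "" remains a valid, distinct value.
type Document struct {
	Name *string
}

// describe reports which of the three states Name is in.
func describe(d Document) string {
	switch {
	case d.Name == nil:
		return "null"
	case *d.Name == "":
		return "empty"
	default:
		return "value: " + *d.Name
	}
}

func main() {
	empty := ""
	fmt.Println(describe(Document{}))             // prints null
	fmt.Println(describe(Document{Name: &empty})) // prints empty
}
```

The cost is that every read needs a nil check before dereferencing, which is probably why this is a poor default.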

------
arendtio
I wonder if it would have been easier (in terms of complexity, not in hours
spent coding) to rebuild the project in Go (in 100% idiomatic Go) and port it
to Java afterward. Sounds weird but since Go has such a strong focus on
simplicity and Java probably implements most things Go uses, a port from Go to
Java is probably a lot easier.

~~~
kjksf
It's true that if those code bases were evolving together, it would be
possible to make them more in sync by sticking closer to what Go provides.

But RavenDB is 10 years old. The Java client already exists and has 50 thousand
lines of code.

If I was trying to rewrite that from scratch, I'm afraid Joel Spolsky would
find me and spank me.

~~~
grey-area
Did you at any point dream of writing a translation tool which would port java
to go automatically? (crazy idea I know)

~~~
kjksf
Yes I did and I don't think it's crazy.

I'm not sure a tool could do 100% translation but I'm sure it could do a lot.

A surprisingly large amount of time was just moving the order of variable
declaration from Java's "type name" to Go's "name type" and renaming, say,
"String" to "string".

If a tool did that for me, it would save a ton of time.

Unfortunately, the upfront time investment to learn enough to write even the
simplest translator would probably be greater than time saved on one project.
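
To give a sense of scale, here is a rough Go sketch of the mechanical part described above: swapping "type name" for "name type" and mapping a few type names. It is not a translator. `translateField`, `fieldRe` and `goTypes` are hypothetical names, and the regular expression only understands the simplest field declarations.

```go
package main

import (
	"fmt"
	"regexp"
)

// fieldRe matches only the simplest Java field declarations, such as
// "private String name;". Generics, arrays, initializers and annotations
// are deliberately out of scope.
var fieldRe = regexp.MustCompile(`^\s*(?:(?:private|public|protected)\s+)?(\w+)\s+(\w+)\s*;\s*$`)

// goTypes is a small, incomplete mapping chosen for illustration.
var goTypes = map[string]string{
	"String":  "string",
	"boolean": "bool",
	"int":     "int",
	"long":    "int64",
	"double":  "float64",
}

// translateField rewrites "type name;" as "name type". The second result
// is false when the line does not match fieldRe.
func translateField(javaLine string) (string, bool) {
	m := fieldRe.FindStringSubmatch(javaLine)
	if m == nil {
		return "", false
	}
	typ, ok := goTypes[m[1]]
	if !ok {
		typ = m[1] // unknown types are passed through unchanged
	}
	return m[2] + " " + typ, true
}

func main() {
	out, _ := translateField("private String name;")
	fmt.Println(out) // prints name string
}
```

Because it works on text rather than a parsed syntax tree, it cannot tell a declaration from a statement such as `return x;`, which hints at why a real tool needs the larger upfront investment.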

------
tannhaeuser
I'm wondering if the goal of this porting project would've been achieved by
Java's newly gained native/AOT compilation feature. Probably not, as a huge
Java .so brings the JVM's infrastructure for gc etc., and too much overhead.
Still, I hope we can get rid of the language-centric ecosystems we have in
favour of established OS/POSIX ways for shared libs, or an evolution thereof.

~~~
kjksf
There are an estimated 1 million Go programmers.

The goal of the project was to enable those Go programmers to be able to use
RavenDB database in projects written in Go.

The company also maintains Python client library and Node.js client library.

It's not about Go vs. Java as a technology but enabling as many programmers as
possible to use RavenDB.

~~~
tannhaeuser
I understand the goal, and my comment wasn't so much arguing in favour of one
language vs another, but rather that developing the same thing for multiple
languages over and over seems wasteful.

~~~
Matthias247
In those client things the amount of the code that can be moved into a cross-
language shared library might be lower than expected. E.g. the wire protocol
and socket handling might be an option - but it's often not that big.

The bigger part might be transformation of the programming language's types
into something useful for the client (e.g. through serialization), which has
to be redone anyway for each language. And after that the question comes up
whether sharing the remaining things yields enough benefits to justify the
hassle of having a dependency which is less portable, requires another build
system, etc.

That's my general experience with those kinds of projects - I don't know
enough about RavenDB in particular to tell if it's the same here.

------
raphael_kimmig
It's interesting that the resulting go code has 43k lines of code, while the
python client for raven only has 6k lines. I don't know whether they have
equivalent feature sets - but I kind of wonder how it would have turned out if
the go port would have been based on the python version.

~~~
sheeshkebab
go codebases are not shorter (by much) than java's.

Performance is also usually on par - unless java's equivalent is built around
a lot of reflection (orm and what not).

However, go compiled binaries are usually smaller, and code IMO is much more
readable.

~~~
dallbee
Like with all languages, the implementer matters. There's a remarkable amount
of variance that I've encountered in the terseness of both Go and Java code.

For example, Go's err != nil pattern is often cited as being ugly, but good go
code will often remove errors by design.

There's a good post by Dave Cheney about this:
[https://dave.cheney.net/2019/01/27/eliminate-error-handling-...](https://dave.cheney.net/2019/01/27/eliminate-error-handling-by-eliminating-errors).

I think this is equally true of Java. Most Java code I've seen disgusts me,
but I've also seen some beautifully written pieces.

~~~
fpoling
This pattern is known in numerical computing as NaN. Its drawback is that
when the computation produces NaN, one has no idea what triggered it. But that
can be mitigated if the program prints a stack trace on the first NaN. In the case
of Go, that corresponds to logging the error when it happens.

Another thing is that in many cases there is no good sentinel value to return
that naturally leads to exit from loops or complex logic to check for error at
the end of a function.

------
indogooner
The article is more about the process and not the motivation for this port.
What problem with Java/JVM caused the organization to commission this
expensive porting exercise? And what are the benefits they have achieved after
the port.

~~~
kjksf
RavenDB is a database server, like PostgreSQL or MongoDB.

It requires database drivers (client libraries) for as many languages as
possible.

More client libraries, more programmers can use RavenDB, more licenses for
RavenDB sold.

They already have C#/Java/Python/Node.js libraries.

I ported Java client library to Go, so that people who program in Go can
access RavenDB database.

~~~
aidos
Bonus questions: do the other libraries stem from the Java root client too
or did they evolve separately (even out of house)? Why did you choose Java to
port from? Are there interesting lessons in the commonality / differences
between the language implementations?

~~~
kjksf
RavenDB is written in C#. At first, it was Windows only but it is now a
cross-platform database engine (Windows / Linux / Mac OS).

As a result, the original and most featureful client is for C# / .NET.

Java client is a port of C# client, done in house by Hibernating Rhinos (the
creators of RavenDB).

As far as I can tell, other clients (Python / Node.js and the Go client that I
wrote) were contracted to outside people.

The company suggested starting from Java code base. It makes sense because C#
client heavily uses LINQ, which is unique to C# (neither Java nor Go has LINQ-
like capabilities).

I didn't dig much into non-Java clients so can't speak much to that.

Overall, I was surprised how similar I was able to make Go code to Java code.

Changing from exceptions to errors was pervasive but a simple, mechanical
transformation.

Porting functions using generics was the biggest hurdle.

Porting functions that use overloading was easy but annoying.

That being said, a Java code base that heavily used virtual functions and deep
inheritance hierarchy would be more challenging to port to Go. Lucky for me,
this code wasn't.
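
A minimal sketch of two of the transformations mentioned above, with hypothetical names (`Doc`, `Load`, `LoadMulti`) rather than the real client API: an exception becomes an `error` return value, and an overloaded method becomes two differently named functions.

```go
package main

import (
	"errors"
	"fmt"
)

// Java (hypothetical):
//   Doc load(String id) throws IllegalArgumentException
//   Doc[] load(String[] ids)
// Go has no exceptions and no overloading, so each variant gets its own
// name and reports failure through an error return value.

type Doc struct{ ID string }

func Load(id string) (*Doc, error) {
	if id == "" {
		return nil, errors.New("id cannot be empty")
	}
	return &Doc{ID: id}, nil
}

func LoadMulti(ids []string) ([]*Doc, error) {
	docs := make([]*Doc, 0, len(ids))
	for _, id := range ids {
		d, err := Load(id)
		if err != nil {
			return nil, err // where Java would let the exception propagate
		}
		docs = append(docs, d)
	}
	return docs, nil
}

func main() {
	_, err := LoadMulti([]string{"users/1", ""})
	fmt.Println(err) // prints id cannot be empty
}
```

The explicit `if err != nil { return nil, err }` at each call site is the pervasive but mechanical part of such a port.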

~~~
pjmlp
> neither Java nor Go has LINQ-like capabilities).

Not true since Java 8, with the introduction of streams and functional
interfaces, which keep being improved with each release.

What Java doesn't have are expression trees, which are convenient but not a
requirement for LINQ like features.

~~~
svick
Expression trees are a requirement _if you're writing a database client
library_.

------
edem
Why didn't you port it to Kotlin instead? Much better language than Go in
every aspect and also comes with an auto-conversion tool.

~~~
hactually
Single executables yet without JVM?

Static typing?

Simplified toolchain without make/maven files?

My big one after swearing off of Java was never needing an IDE to do even the
simplest things - is Kotlin useable without an IDE?

~~~
eikenberry
> Single executables yet without JVM?

Yes. The JVM is Java's biggest downside, so why would you want to just move to
a different language on the same overly complex (to put it mildly) runtime.

~~~
mustardo
I'd argue it's one of Java's strengths: it's fast, and other than increasing max
heap size, tuning is only necessary for the hardcore (FAANG etc.). What's complex
about installing openjdk-<version>jdk from your package manager? Your app
normally bundles library dependencies, so there's no virtual environment fuckery like
Python, Ruby, et al.

~~~
rsynnott
I can assure you that you don’t need to be a FAANG for tuning to be
necessary...

------
ivan_gammel
It is very interesting to see in such projects how they expose the weakness of
certain Java idioms. The mentioned JavaBeans getters and setters are an obsolete
pattern, for which in most cases there’s no good reason to keep it in the
code. The Java ecosystem is probably the richest one in design patterns, some
of which receive an „anti“ prefix over the years (e.g. self-contained singleton).
At the same time new languages explore and verify ideas and coding styles
that might be worth borrowing and making mainstream. A porting exercise can be
such a source of new best practices, pushing us to rethink what is worth keeping
and what is pointless overengineering.

~~~
jaabe
I was struck by the lack of inheritance in go, that to me is brilliant.

I’ve worked with C# for a decade, and I’ve yet to see a use of inheritance
that wouldn’t have been easier to maintain in the long run without using
inheritance. We’ve limited our own usage to overriding methods in the standard
library, but even then it’s often used to implement things that are really
just terrible practices. Like adding search functions for AD extension fields
or increasing the timeout in one of the older web clients.

~~~
tybit
Embedding does seem like one of the decisions Go got very right.

Many languages (thinking C#, Java, C++) say they promote composition over
inheritance, yet make inheritance so much easier to use.
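
A minimal sketch of what embedding looks like in Go, using hypothetical `Logger` and `Service` types: the embedded type's methods are promoted, so composition needs no hand-written delegating methods.

```go
package main

import "fmt"

// Logger is a hypothetical component to be reused.
type Logger struct{ prefix string }

func (l Logger) Log(msg string) string { return l.prefix + msg }

// Service embeds Logger. Log is promoted to Service, so no hand-written
// delegating method is needed, yet Service is not a subtype of Logger.
type Service struct {
	Logger
	name string
}

func main() {
	s := Service{Logger: Logger{prefix: "[svc] "}, name: "orders"}
	fmt.Println(s.Log("started")) // prints [svc] started
}
```

A `Service` value cannot be passed where a `Logger` is expected, which is the main difference from inheritance: the relationship stays has-a even though the call syntax looks like is-a.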

~~~
nathanaldensr
Indeed! This has _always_ mystified me. Why is it so easy to implement what is
now considered to be an anti-pattern (inheritance) when it's so boilerplate
and annoying to implement composition? Why does C# not have language support
for delegating members? Why do I have to buy and use ReSharper just to
generate all that boilerplate? It's a constant battle talking to "just get it
done" developers about why inheritance is bad.

~~~
pjmlp
Inheritance is only considered an anti-pattern by some.

Since the 90's, any good CS book about OOP paradigms had discussions about
is-a and has-a and how to make the best use of each, depending on the desired
application architecture.

The thing is, such books usually aren't taught on bootcamps.

~~~
jaabe
Inheritance isn’t an anti-pattern in academia. I do part-time work as an
examiner for CS students, and I see their car/animal examples everywhere in
introductory courses.

When we have interns, they’ll sometimes build things with inheritance. So it’s
certainly still a thing.

I’ve yet to see a real world use of it, where you wouldn’t have been better off
not using it though. My real world is the relatively boring world of
enterprise public sector software, however, and maybe I’m simply oblivious to
where inheritance might be worthwhile.

~~~
pjmlp
That is the thing, the architect should have a proper understanding of is-a
and has-a relations and apply them appropriately.

Initially, VBX only allowed for composition as well, COM introduced interface
inheritance with delegation, when one wanted to override a couple of methods,
but not the remaining several ones.

And now UWP offers mechanisms to do implementation inheritance in COM, because
everyone got tired of writing delegating code for is-a relations.

Inheritance and composition are both tools, it is up to each one to learn how
to use them appropriately.

