

Experience porting 4k lines of C code to go - ukdm
http://blog.kowalczyk.info/article/af1h/Experience-porting-4k-lines-of-C-code-to-go.html

======
zokier
On a semi-related note: I find it a bit strange that C, a language that has
been widely used for over 30 years, has no de facto standard container
library. I mean, it's a bit ridiculous that a 4kloc project uses 1kloc to
create something as basic as a dynamic array.

------
Groxx
> _The C code implements parsing by partying on char* pointers._

That's an excellent quote. And it describes a _lot_ of C code I've seen (good
and bad).

~~~
lylejohnson
So what word do we think he meant to go there? Partitioning?

~~~
billforsternz
He meant partying. It's called using the English language in a lively and
interesting way.

------
Legion
The name "Go" is easy enough to skim past even when authors _do_ properly
capitalize it in their post titles. Not capitalizing it is just being mean.

~~~
jemeshsu
Agreed, "Go" is not particularly search-friendly. I tend to use Golang instead
when I post or blog. FYI, #golang is the Twitter tag used by Rob Pike.

------
markokocic
It's nice that Go offers some improvements over C, but is it enough?

Porting a program written in one of the "lowest level" languages still in use
today to one of the newest "modern" languages, which claims high-level
features, and getting only a 20% code size reduction doesn't seem like a big
win to me, especially if all the savings are attributable to only one feature
(better arrays).

Maybe it is just because it was a port, and not a reimplementation in
idiomatic Go style.

I would certainly like to see other solutions to the same problem written in
idiomatic Go, Clojure, Haskell, C++ and Scala, just for the comparison.

~~~
jlouis
Enough? It is rather the observation that if you use the same parsing
approach, more or less, in a language which looks like C, then the possible
improvement is rather small.

There are two things at work here: how small is the parser? And second, how
fast is the parser? I know you can write some extremely fast highly idiomatic
parsers in Haskell though. But I doubt they will be faster than hand coded C.

------
veyron
How many lines would the equivalent Python program take?

~~~
kkowalczyk
(I'm the author of the article and also a heavy Python programmer).

That's not an easy question to answer.

If you were to do a faithful port, i.e. using the same techniques as the C/Go
code, it would be very close. upskirt uses a traditional lexing/top-down
parsing approach. There's nothing in Python syntax that makes writing such
code more compact than in Go. It's a bit tedious to write, but the benefit is
that (in C/Go) it gives the best speed because it minimizes the number of
times each source character is looked at.
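
To make the "look at each character as few times as possible" idea concrete,
here is a minimal sketch in Go. It is not code from upskirt; `renderEmphasis`
is a made-up helper that handles only `*emphasis*` and assumes the markers are
balanced. It walks the input once and decides what to emit for each byte as it
goes.

```go
package main

import (
	"fmt"
	"strings"
)

// renderEmphasis is a hypothetical helper for illustration only.
// It makes a single pass over src and turns *text* into <em>text</em>.
// It assumes balanced '*' markers and ignores escapes and nesting.
func renderEmphasis(src string) string {
	var out strings.Builder
	open := false // true while we are inside an emphasis span
	for i := 0; i < len(src); i++ {
		c := src[i]
		if c == '*' {
			if open {
				out.WriteString("</em>")
			} else {
				out.WriteString("<em>")
			}
			open = !open
			continue
		}
		out.WriteByte(c) // ordinary byte: copy it through unchanged
	}
	return out.String()
}

func main() {
	fmt.Println(renderEmphasis("a *b* c"))
}
```

A real parser has many more cases per character, which is where the tedium
comes from, but the shape of the loop stays the same.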

If you were to use a different approach, e.g. brute-forcing your way through
the text several times with regexpes, which is the most popular way of doing
that in dynamic languages, the code clocks in at ~2 thousand lines (like the
implementation I use for my home-grown blog system
<https://github.com/kjk/web-blog/blob/master/markdown2.py>).

If you were to use this technique in Go, the code would probably end up
smaller than the manual lexing/parsing approach used in upskirt, but it would
be significantly slower.
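
For comparison, a rough sketch of what the multi-pass regexp style might look
like in Go, using the standard `regexp` package. The patterns and the
`renderInline` name are assumptions made up for this example, and they cover
only a tiny slice of Markdown. Each `ReplaceAllString` call is another full
pass over the text.

```go
package main

import (
	"fmt"
	"regexp"
)

// Illustrative patterns only; real Markdown needs far more care.
var (
	codeRe   = regexp.MustCompile("`(.+?)`")
	strongRe = regexp.MustCompile(`\*\*(.+?)\*\*`)
	emRe     = regexp.MustCompile(`\*(.+?)\*`)
)

// renderInline is a hypothetical helper that rewrites the text once per
// pattern. Order matters: ** must be handled before the single * pattern.
func renderInline(src string) string {
	out := codeRe.ReplaceAllString(src, "<code>$1</code>")
	out = strongRe.ReplaceAllString(out, "<strong>$1</strong>")
	out = emRe.ReplaceAllString(out, "<em>$1</em>")
	return out
}

func main() {
	fmt.Println(renderInline("**a** and *b*"))
}
```

The code is short, but the text is scanned three times here, and a fuller
implementation would add many more passes.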

Interestingly, the upskirt approach would probably be slower in Python than
the regexp approach, because regexpes are heavily optimized C code and looking
at individual characters of the string in Python code isn't particularly fast
due to Python interpretation overhead.

~~~
knome
> because regexpes are heavily optimized C code

AFAIK, the python re module does not have a C language component. It is a pure
python regular expression implementation.

~~~
enneff
Most dynamic languages use a C Regular Expression library. That's why they can
do so well on Regular Expression benchmarks.

V8 is a notable exception. They use their code generation pipeline to JIT
compile regexps to machine code, making it the fastest regexp engine around
bar none (I think).

~~~
azakai
AFAIK all modern JS engines JIT regexps: V8, SpiderMonkey and Nitro.

------
doosra
I was hoping to see a speed comparison of the two implementations.

------
homebru
So, why didn't you write a parser to convert your C code to Go?

~~~
kkowalczyk
(the author of the article here).

One, this was an opportunity to learn Go.

Two, writing a converter is way above my head and quite likely theoretically
(and practically) impossible. I've definitely had to make some decisions that
I don't think even the cleverest compiler could (like noticing that
array.[c|h] and buffer.[c|h] could be replaced with native Go arrays/slices
easily so I didn't have to port that at all, just change the callers to Go
equivalents).
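
As a rough illustration of the kind of replacement I mean (the function names
below are invented for this example, not taken from the port): a growable
array becomes a slice with `append`, and a growable byte buffer becomes a
`bytes.Buffer` from the standard library.

```go
package main

import (
	"bytes"
	"fmt"
)

// lineStarts is a hypothetical example of a growable array: the slice
// grows via append, with no hand-written reallocation code.
func lineStarts(src []byte) []int {
	starts := []int{0}
	for i, c := range src {
		if c == '\n' && i+1 < len(src) {
			starts = append(starts, i+1)
		}
	}
	return starts
}

// escapeHTML is a hypothetical example of a growable output buffer:
// bytes.Buffer handles capacity management internally.
func escapeHTML(src []byte) string {
	var buf bytes.Buffer
	for _, c := range src {
		switch c {
		case '<':
			buf.WriteString("&lt;")
		case '>':
			buf.WriteString("&gt;")
		case '&':
			buf.WriteString("&amp;")
		default:
			buf.WriteByte(c)
		}
	}
	return buf.String()
}

func main() {
	fmt.Println(lineStarts([]byte("a\nb\nc")))
	fmt.Println(escapeHTML([]byte("a<b")))
}
```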

~~~
nvictor
I am impressed with the AppEngine version of our beloved (now dead) JoS forum.
Great job.

~~~
kristianp
I was curious, so I looked it up:
<http://blog.kowalczyk.info/software/fofou/manifesto.html>

------
dm_mongodb
Seems to me a comparison to C++ might be more interesting. I wonder how that
would be, given there are 'growable arrays' and such. There's better (not
perfect) safety given scoped_ptr, string classes, etc.

