
Hobbes – A language and an embedded JIT compiler - ah-
http://lambda-the-ultimate.org/node/5452
======
zoom6628
This should be of great interest to anybody working with high volume sensor
data in the IoT field. I will be checking it out very soon (on a few deadlines
now so can't afford the diversion). Anything that is fast and simple for massive
volumes of small record types should always be of great interest to IoT/WoT
folks.

I agree with the comments about KDB/Q - tried to look at it but could never
understand it. Maybe because I'm not a full-time dev, nor in that field, but KDB
just always becomes too hard. At least hobbes looks like I can work it into C++.
On first 'parsing' it looks to be a case of the right tool for the job of working
on high volume discrete data, which one would expect from an investment bank.

Kudos to the bank for releasing this part of their secret-sauce.

~~~
kthielen
Thanks, I hope to hear about your experience if/when you look into this (and
I'm happy to help in any way I can). :)

------
kthielen
I started the hobbes project at MS and am happy to answer any questions that
folks have about it!

~~~
ah-
What motivated you to start building your own language? Did you use much kdb/q
before?

How did you manage to get this open sourced?

~~~
throwaway7645
I'd like to know this as well. I know most array languages coupled with a DB
aren't as fast as a language with a JIT, but the code is quite expressive. If
Kdb+ can ever easily be run on an FPGA or GPU cluster...that would be
something.

~~~
stuntprogrammer
Not kdb+, but a proprietary (internal-only) language that it heavily
influenced was designed around execution on GPU clusters.

The FPGAs were used mainly for feedhandlers and there was a different DSL for
that (compiling to verilog).

It was indeed rather something to see :)

~~~
throwaway7645
Was it received favorably? A secret sauce, or a failed project?

~~~
stuntprogrammer
Secret sauce, at least for the team involved, and well-received. Nice
combination of extreme perf and decent productivity for them.

I've moved into non-finance stuff for quite a while now though, so not sure
what's become of it. Given business challenges in that particular sub-field,
who knows..

------
nurettin
All embeddable languages should come with their own header parser/code
generator to save us from the hassle of generating a bunch of class wrappers
and boilerplate registration code. I think this is high up in the list of
concerns in a professional setting.

------
willtim
This looks like a typed take on KDB/Q, something that is long overdue! The key
to making this work is structural typing of rows (row polymorphism /
extensible records). I'd be interested to see more details on how they tackled
this.

~~~
amiramir
Morgan Stanley has a history of languages in this vein. MS was a big APL shop
in the 80s. Arthur Whitney[1] worked at MS and developed the languages A, and
A+. He later wrote J, and K.

[1]
[https://en.wikipedia.org/wiki/Arthur_Whitney_(computer_scien...](https://en.wikipedia.org/wiki/Arthur_Whitney_\(computer_scientist\))

~~~
throwaway7645
I think J was mainly written by Roger Hui (now @ Dyalog APL) after Ken Iverson
designed it himself. I think Roger was inspired by Whitney though...or was
Whitney actually involved?

------
zokier
I'd love to see more about this, it seems like a very interesting language.
Something to explain how/where this is/could be used, maybe a few more complete
examples to show the language etc. Also some notes about performance would be
great, I guess it is reasonably fast by the looks of it, but that is pretty vague.

~~~
kthielen
Yes the whole project started because we needed dynamically-determined logic
(in various places) that could fit in very tight latency budgets. And the
output of this compiler is put in a critical trading path that sees a very
large portion of all daily US equity trades.

So there is a lot to say about performance.

------
jitl
I wish there was a clearer language tutorial, instead of mixing the embedding
tutorial with the language itself.

~~~
kthielen
Yes that's a good point, I do hope to build out more targeted documentation
and example programs over the coming weeks.

------
nnq
Is this a _"pure"_ language like Haskell? Or is it more like OCaml?

And, even if it's not pure, is there any syntactic sugar for monads?

Anyway, this looks _awesome,_ especially with easy C++ interfacing that seems
to be there: a system where you can have machine learning (anything from
Bayesian to deep NNs) code in C++ and business logic in something Haskell-like
is the stuff of wet dreams...

~~~
kthielen
It's not pure currently, so definitely more like OCaml.

We make frequent use of overloaded array comprehensions, which is one form
that monad syntax can take. The "do" notation introduced by Haskell isn't
supported right now, but maybe soon. I'd like to refactor the parser code a
bit anyway.

~~~
nnq
Imho non-pure is probably ok. Haskell's purity is probably what destroyed any
chance of it growing popular enough to have a healthy ecosystem of libraries
and community.

Anyway, since this has no memory management, neither Rust-like nor GC, this is
_not in the niche I care about now,_ so moving on...

But again, great work. "Strict Haskell" with good C++ interop and decent
records _is_ awesome.

------
gjem97
Can someone explain variants vs sums vs tuples vs records? This sentence from
the README confuses me: "We can combine types with variants or sums (the
"nameless" form of variants, as tuples are to records)."

~~~
kthielen
Sure, I guess "record type" makes sense? Basically a record type is a "struct"
or a series of values with particular names (the "field names"). A tuple is
the same thing, but by convention the "names" are by position (so you have the
"0th field", the "1th field", etc).

A variant can be _one of_ a series of values (where a record is _all of_ a
series of values), and each of the things can have a particular name (the
"constructor names"). As with tuples to records, a sum type is just a variant
where you don't care to name the constructors (so you have the "0th
constructor", the "1th constructor", etc).

A common example of a variant is an "enum" type, where here the payload types
are trivial (just the unit type) and only the constructor names matter. In
hobbes we might write the type "|red:(),green:(),blue:()|" or the shorter
syntax "|red,green,blue|" for an enum that we might write in C as "enum { red,
green, blue }". You could also represent this same type as the sum "()+()+()"
but that looks a little weird/profane and maybe not clear to other
programmers.
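
For readers coming from C++, the four shapes described above can be sketched
roughly as follows. This is only an illustration using the C++17 standard
library, not hobbes syntax and not how hobbes lays these types out; the names
Trade, TradeTuple, Color, UnitSum and colorName are made up for the example.

```cpp
#include <cassert>
#include <string>
#include <tuple>
#include <variant>

// Record: ALL of a series of values, addressed by field name.
struct Trade {
    std::string symbol;
    double price;
};

// Tuple: the same shape, but fields are addressed by position (0th, 1st, ...).
using TradeTuple = std::tuple<std::string, double>;

// Variant: exactly ONE of a series of alternatives, each with a name.
// The empty structs stand in for unit payloads, so this behaves like an enum.
struct Red {};
struct Green {};
struct Blue {};
using Color = std::variant<Red, Green, Blue>;

// Sum: the same idea with unnamed alternatives, identified only by index.
// This mirrors the "()+()+()" spelling, with std::monostate as the unit type.
using UnitSum = std::variant<std::monostate, std::monostate, std::monostate>;

// Inspect which constructor a Color value was built with.
const char* colorName(const Color& c) {
    switch (c.index()) {
        case 0: return "red";
        case 1: return "green";
        default: return "blue";
    }
}
```

With these definitions, `colorName(Color{Green{}})` returns "green", and a
`UnitSum` has to be built by position, e.g. `UnitSum{std::in_place_index<2>}`,
because its alternatives have no names to tell them apart.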

Tuples are sometimes called "product types" and that connects well with sum
types and the general logical/combinatorial interpretation of types. If you
see the type "bool * bool" (or perhaps "2 * 2"?) then it means "a bool AND a
bool" and there are four such values. Similarly if you see the type "bool +
bool" then it means "a bool OR a bool" and it's either the first or the second
one.
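
The counting reading of "bool * bool" and "bool + bool" can be checked by
enumerating the values. The sketch below is an assumption-laden C++17
illustration (BoolProduct, BoolSum, allProducts and allSums are hypothetical
names, and std::pair / std::variant merely stand in for the product and sum
types), not anything taken from hobbes itself.

```cpp
#include <cassert>
#include <initializer_list>
#include <utility>
#include <variant>
#include <vector>

using BoolProduct = std::pair<bool, bool>;  // "bool * bool": a bool AND a bool
using BoolSum = std::variant<bool, bool>;   // "bool + bool": a bool OR a bool

// Every product value pairs a choice for the first bool with one for the second.
std::vector<BoolProduct> allProducts() {
    std::vector<BoolProduct> out;
    for (bool a : {false, true})
        for (bool b : {false, true})
            out.emplace_back(a, b);
    return out;
}

// Every sum value is a bool tagged as "the first one" or "the second one".
std::vector<BoolSum> allSums() {
    std::vector<BoolSum> out;
    for (bool v : {false, true}) {
        out.emplace_back(std::in_place_index<0>, v);
        out.emplace_back(std::in_place_index<1>, v);
    }
    return out;
}
```

Both functions return four values: 2 * 2 for the product and 2 + 2 for the
sum. The counts only coincide because each side has two values; with a
three-valued type on one side the product would have six values and the sum
five.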

If you're interested in the general area of type theory, you might like the
book "Types and Programming Languages" by Ben Pierce.

------
keenerd
> _you will need LLVM 3.3 or later_

Though it doesn't build with 4.0. (And is now in the AUR.)

Nice to see first-class parsing in a language.

~~~
kthielen
OK, there's now a PR in the queue to allow hobbes to build with LLVM 4.0+.
Pending review, it will likely be merged to master soon.

------
lostmsu
Examples do not look very readable.

~~~
willtim
They look readable to me, certainly much more so than K or Q.

~~~
zokier
> much more so than K or Q.

For most people that is not a very high bar.

~~~
kthielen
> > much more so than K or Q.

> For most people that is not very high bar.

hehehe

------
ehudla
A useful summary:
[http://lambda-the-ultimate.org/node/5452](http://lambda-the-ultimate.org/node/5452)

