
Efficient Aggregates in Julia - luu
http://julialang.org/blog/2013/03/efficient-aggregates/
======
TuringTest
Julia is looking more and more like a language that could replace C++ either
for systems programming, scientific libraries or plain old business
applications requiring high performance.

~~~
masklinn
> could replace C++ either for systems programming, scientific libraries

AFAIK Julia is GC'd and is only embeddable, so meh.

~~~
pjmlp
System programming languages with GC were already used in various Xerox PARC
systems.

UNIX took us down the wrong path.

~~~
sixbrx
I think Unix did right. Good GC's are enormously hard to implement if the
language is complex; they touch nearly every aspect of implementation.

That's why a lot of interesting languages don't have GC's comparable to those
of the JVM or CLR, which are the work of large dedicated teams on decade-scale
projects. And which are still barely to not-at-all usable themselves in some
domains, such as numeric programming (as with Julia) or OS development, where
even teams within MS found CLR wanting.

~~~
pjmlp
That is why UNIX took us down the wrong path. If those languages had already
been widespread back then, they would have enjoyed the roughly 30 years of
compiler development support from the industry that C has enjoyed.

~~~
sixbrx
I don't know whether it would have been a good thing (really don't know).

It seems that GC's developed so far are basically inseparable from individual
language implementations, or rather closely related families of languages,
which have similar approaches to programming (e.g. rooted in single virtual
dispatch on heap referenced objects). These systems require complete buy in to
a particular approach to computing in order to be usable from new languages,
forcing them to use the same object layouts, calling convention, stack usage
conventions, unwinding support, view of concurrency, etc, which can all be
very limiting. [Edit for clarity: the high performance GC and access to
existing libraries is the honey, the forced view of computing is the resulting
bee sting.]

I think if such a siloed GC VM system would have become dominant, we would not
have such a huge set of reusable routines to call from really innovative new
languages. That's something for which we can be grateful to C I think, despite
all of its flaws.

So I think even the high performance VM's delivered by the JVM and CLR are too
limiting, really. I'm a lot more excited by the prospect of LLVM making
precise GC easier for language implementers. From what I understand, the API
is being standardized now, so maybe in a few years we can start enjoying some
common work on high performance GC's usable by a really _wide_ variety of
languages which don't all have the same narrow approach to computing.

~~~
pjmlp
> I think if such a siloed GC VM system would have become dominant, we would
> not have such a huge set of reusable routines to call from really innovative
> new languages.

No, it would just mean you would have GC as another OS service, and use
whatever OS ABI was in place.

> So I think even the high performance VM's delivered by the JVM and CLR are
> too limiting, really

They are already good enough to replace C++ systems in high performance
trading systems.

I only use C++ in hobby projects nowadays. At work, when I get to see C++ code
in enterprise systems, it is usually as part of migration projects to replace
them with JVM/.NET based ones.

~~~
joe_the_user
Cheers to the discussion between pjmlp and sixbrx - classic hn. I feel like
computing is such a large beast that one can find two very informed people
disagreeing while each being right over a wide area.

I don't think you can even call memory allocation "just another OS service",
(much less garbage collection). Memory allocation is going to be ubiquitous in
any application and if its performance degrades, so will the application's
performance. Further, if one knows nothing about "what's behind the curtain",
performance will degrade for any _medium performance or higher_ application.

Java is perfect for enterprise code because most enterprise software is more
focused on data-safety and generally playing well with the rest of the data
zoo. Enterprise software isn't high performance and thus the costs of C++
outweigh the benefits. Or even more

C++ is needed in system programming and shrink-wrap software where performance
is crucial.

On the other, other hand, if a dominant OS forty years ago had imposed a GC
model that became ubiquitous, perhaps the sweet-zone where the programmer
doesn't have to worry about the allocation process would have been larger and
the area where you did have to worry about the allocation process might not
have been harder to deal with. Especially, a single ubiquitous program layout
model would force hardware makers to adapt and what I understand is that a lot
of the need for multiple memory/program-layout models comes from innovative
hardware getting the last ounce of performance through constant novelty in
low-level memory access.

~~~
pjmlp
+1 for the comment.

Just some info on my side.

Like many developers I do have my issues with C++, but I do like it, since my
Turbo C++ days.

However, my experience with GC enabled system programming languages dates back
to being an Oberon user in the mid-90's.

So I experienced first hand that such systems are possible and have been ever
since dismayed by the lack of attention from mainstream OS vendors.

------
collyw
Does anyone know how aggregates in Julia compare to aggregates in SQL in terms
of speed? I like the power of SQL, and know it well. I wonder if Julia is
worth investigating.

~~~
Bootvis
This question isn't that bad, so why the downvotes? I'd like to see a
comparison between Julia's performance on an in memory dataframe against the
same data in an SQL table. Julia should have the advantage but how fast is it
really?

~~~
tfigment
I wasn't voting but to me they are orthogonal questions unless you frame it in
a specific way. Also to me that is not what the article was about. It was about
defining immutable structures.

Julia is a client language and data must be local/in-memory to manipulate,
while SQL generally has more going on and is usually being interacted with
remotely. Julia might be faster than SQLite3 but I sort of doubt it for an
in-process manipulation where SQL can do the work. Julia has an advantage that
you can probably do more operations than what SQL typically allows for.

I frequently hand code SQL to do complex aggregates in it so that the server
can do the calculation fast and only send the result and I don't have to pull
the whole data set to work on it. When Julia is client/server and I can write
remote functions to aggregate data on the server and then return the result
then this question can be asked again I think.
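
To make the "aggregate on the server, only send the result" point concrete,
here is a rough sketch in Python using the standard-library sqlite3 module. The
`sales` table, its columns, and the sample rows are made up for illustration,
and an in-memory SQLite database only stands in for a remote server; it says
nothing about how Julia or any particular database actually performs.

```python
import sqlite3

# Hypothetical table and rows, used only to illustrate the two approaches.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 10.0), ("south", 5.0), ("north", 2.5)],
)

# Approach 1: let the database aggregate, so only one row per group comes back.
db_totals = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
)

# Approach 2: pull every row to the client and aggregate there.
client_totals = {}
for region, amount in conn.execute("SELECT region, amount FROM sales"):
    client_totals[region] = client_totals.get(region, 0.0) + amount

print(db_totals)
print(client_totals)
```

Both approaches give the same totals; the difference is how many rows cross
the connection, which is what matters when the database is remote.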

------
KenoFischer
Note that this is a year old. Not that that makes it any less accurate.

------
RivieraKid
Does Julia have an equivalent to interfaces?

~~~
3JPLW
Kinda sorta not really. Abstract supertypes currently behave somewhat like an
interface. But because multiple inheritance is still an open issue [1], they
don't really fulfill the requirements of an interface.

But it looks like it's definitely on the roadmap [2].

[1] https://github.com/JuliaLang/julia/issues/5#issuecomment-33031468

[2] https://github.com/JuliaLang/julia/issues/4935#issuecomment-31874539

[See also:] https://github.com/JuliaLang/julia/issues/2248

~~~
RivieraKid
Great, thanks for the links.

------
oakaz
The website is down.

