
Show HN: Vector - A High-Level Programming Language for GPU Computing - zhemao
http://zhehaomao.com/project/2013/12/20/vector.html
======
sitkack
Please think about renaming it. This is the most generic, hard-to-find name
possible. Consider that `vecmao` has only 13.5k hits on Google.

~~~
theophrastus
hear, hear! some of my least favorite things to search and research are
matters related to "R", "C", "dock", "boost", "Go"... i never had any such
problems with "erlang", "numpy", or "gromacs".

Please do not underestimate the critical nature and endless annoyance
associated with naming things after common concepts (or worse yet, single
characters)

~~~
yeukhon
There seems to be a trend of naming things with the language as a suffix,
like xx.js, xx.py, or xx.go

~~~
theophrastus
there's a language named "js" or "py"? i'm only aware of languages named
"javascript" or "python" ;)

~~~
gcr
You knew what OP meant. ;)

The suffix is just meant to be unambiguous. Those familiar with the target
language will understand what language the library is written for.

------
lelf
High-level???

 _This_ is high level —
[http://hackage.haskell.org/package/accelerate](http://hackage.haskell.org/package/accelerate)

~~~
asmman1
I don't get you. Is not C high-level?

~~~
stass
No, C is about as low level as you can get. It maps directly onto the
instructions of the register machine it runs on.

~~~
sillysaurus2
I invite anyone who thinks C is low level to try their hand at CUDA or OpenCL.

You think C is bad? Those are far worse.

This is at least a great step in the right direction. It's not "low level" as
you describe just because it's C.

~~~
rbonvall
You know what's worse than CUDA or OpenCL? General purpose algorithms
implemented in a graphics API.

I read a lot of GPGPU papers at the university, and I could never understand
the older ones, that described algorithms by mapping everything to graphics
elements, and computed the solutions as a side-effect of rendering something.

Next to that, understanding an algorithm implemented in CUDA is a breeze.

------
pavanky
While it makes CUDA more readable, I feel like the time taken to write code
in this language will be very close to that of writing actual CUDA code, for
someone who is experienced with it.

~~~
zhemao
Maybe not faster to write, but it'd be less repetitive. I've written a bit of
CUDA code, and having to put in a bunch of cudaMemcpy calls everywhere got
pretty old. Also, reduce is pretty annoying to implement properly, and I'd
rather not have to do it again for every possible reducing function.

~~~
pavanky
That is what libraries are for.

------
14113
Very interesting! I'm also implementing a programming language for my
undergrad dissertation (but specifically for agent based simulations).

The thing that struck me most about vector was the radically different for
loops (compared to C). I'm assuming you're purposefully crippling them to make
parallelisation easier? Or is there another reason?

EDIT: One other thing - the website fails to scroll nicely on a mac (in
chrome). I had to manually use the scroll bars instead of being able to 2
finger swipe...

~~~
zhemao
Yes, the special for loop syntax is to make it consistent with the "pfor"
syntax. The "pfor" syntax is that way so that it can be parallelized.

Also, I can't believe I forgot to mention this in the post, but both for and
pfor can sweep multiple iterators, so

    
    
    for (i in 0:10, j in 0:5) {
    }

is equivalent to

    for (i = 0; i < 10; i++) {
        for (j = 0; j < 5; j++) {
        }
    }

~~~
melonakos
Hey, @zhemao, wasn't kidding about wanting to talk about bringing you on board
here. Seriously takes a lot of talent to do what you've done :)

~~~
melonakos
Ah, OK. Well, it's never too late to say no to the BigCo and join a startup :)

* [http://notonlyluck.com/2013/07/23/reasons-to-join-a-startup-...](http://notonlyluck.com/2013/07/23/reasons-to-join-a-startup-over-a-bigco/)

* [http://notonlyluck.com/2013/07/24/more-thoughts-on-reasons-t...](http://notonlyluck.com/2013/07/24/more-thoughts-on-reasons-to-join-a-startup-over-a-bigco/)

~~~
goldenkey
Startup culture is a cancer. Quit trying to sway him from true greatness. All
hail Emperor Bozos.

------
pflanze
I'm wondering about the timings on page 36 in the vector.pdf; those can't be
seconds or it would be way too slow. (I've written a program[1] to calculate
the mandelbrot set on the CPU with SIMD optimizations, and SMT support, on my
ageing laptop with a Core 2 duo it calculates the start set in about 0.07
seconds.) It would be interesting if you provided the pure C program that was
used for the timings as then I could get a real grasp of the performance of
the GPU variant.

[1]
[https://github.com/pflanze/mandelbrot.git](https://github.com/pflanze/mandelbrot.git)

(BTW, also in the PDF, page 35, you write "computes the number of iterations
til convergence for that point", that should be "divergence", right?)

PS. I'm quite impressed by what you achieved in the given time frame.

~~~
zhemao
You can find the benchmarks in the "bench" directory of the git repo. The CPU
code we generate for the benchmark is not particularly optimized and is
completely single-threaded (so not really a fair comparison).

~~~
pflanze
I'm getting the following when running "vagrant up"; this is on Debian.

    
    
      $ vagrant up
      /home/chrishaskell/src/vector/Vagrantfile:7:in `<top (required)>': undefined method `configure' for Vagrant:Module (NoMethodError)
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:115:in `load'
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:115:in `block in procs_for_source'
              from /usr/lib/ruby/vendor_ruby/vagrant/config.rb:41:in `block in capture_configures'
              from <internal:prelude>:10:in `synchronize'
              from /usr/lib/ruby/vendor_ruby/vagrant/config.rb:36:in `capture_configures'
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:114:in `procs_for_source'
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:51:in `block in set'
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:45:in `each'
              from /usr/lib/ruby/vendor_ruby/vagrant/config/loader.rb:45:in `set'
              from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:377:in `block in load_config!'
              from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:392:in `call'
              from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:392:in `load_config!'
              from /usr/lib/ruby/vendor_ruby/vagrant/environment.rb:327:in `load!'
              from /usr/bin/vagrant:40:in `<main>'
    

If you post the generated C code then I'll give the timings and try to compare
what it's doing differently.

The CPU I'm using (Intel(R) Core(TM)2 Duo CPU T9300 @ 2.50GHz) was released in
July 2006 [1]. The GPU you're using was released on 15 June 2007 [2]. My CPU
code calculates the 1246x998 pixel image of the zoomed out view (real=-2..2,
imag=-1.6..1.6, maxdepth=200) in 0.07 seconds, if your GPU code does about the
same in 0.61 sec, then that's about 8 times slower than the slightly older CPU
can do with hand optimized C code. That wouldn't be such a pretty result yet
:)

[1]
[http://en.wikipedia.org/wiki/Intel_Core_2](http://en.wikipedia.org/wiki/Intel_Core_2)
[2]
[http://en.wikipedia.org/wiki/GeForce_8_Series](http://en.wikipedia.org/wiki/GeForce_8_Series)

------
elwell
Nothing against this particular language, but... I feel like there is a new
language at least every day. It would seem that this does more harm than good
to the developer community's progress. Of course, languages need to be
iterated on in addition to the programs they compose. But, there is now such a
large spread of similar languages that it necessarily slows the development of
the most productive ones by blurring/resetting the focus constantly. Many
technical problems can be solved with existing languages, rather than
eliciting the distraction of a brand new language. Though, in this case, there
is perhaps a clear purpose for the specialization of the language. There is
certainly a benefit to new languages that offer _truly_ new concepts or
optimizations.

~~~
zhemao
I agree. This was just a class project, and I don't plan on continuing
development. These features would be a lot more useful rolled into existing
programming languages.

~~~
kylelutz
Would you be interested in trying to adapt some of your approaches into a C++
GPGPU library
([https://github.com/kylelutz/compute](https://github.com/kylelutz/compute))?

~~~
zhemao
Hey, that's pretty cool, and would probably make OpenCL usable by mere mortals.
One improvement that I see you could borrow from vector is getting rid of this
explicit copying business. Take a look at the array implementation in our
runtime library.

[https://github.com/vectorlang/vector/blob/master/rtlib/vecto...](https://github.com/vectorlang/vector/blob/master/rtlib/vector_array.hpp)

Basically, the VectorArray class contains both the host array pointer and the
device array pointer, plus two boolean flags, h_dirty and d_dirty. When you
modify array elements on the host, h_dirty is set. Then, when you run a
kernel, the data is copied to the device if h_dirty is set, h_dirty is
cleared, and d_dirty is set. When you try to read an array element again on
the CPU, the data is copied from device to host if d_dirty is set, and d_dirty
is then cleared.

------
melonakos
Copycatting is the sincerest form of flattery :p

[http://arrayfire.com](http://arrayfire.com)

~~~
muyuu
How similar is it?

~~~
melonakos
Reminds me of an early version of ArrayFire from 2009 or so. The project
highlights 3 aspects:

* Automatic memory management - Been in ArrayFire since 2008

* Their pfor statement - See ArrayFire's GFOR, [http://www.accelereyes.com/arrayfire/c/page_gfor.htm](http://www.accelereyes.com/arrayfire/c/page_gfor.htm)

* High-order functions - Been in ArrayFire since 2009

It's always interesting to watch other people reinvent the wheel. It takes a
lot of talent though. If the people behind this want an awesome opportunity to
join with our team (where we live this stuff every day and have developed a
great culture and customer focus), give me a holler. Find me at
[http://notonlyluck.com](http://notonlyluck.com)

~~~
goldenkey
It's interesting how much startups tend to talk about how great the culture
is. Can you elaborate on this 'developed culture'? I'm really curious and
hoping for a real response, not fluff.

~~~
melonakos
I've written dozens of posts about it. Maybe peruse some of the posts here:
[http://notonlyluck.com/category/culture/](http://notonlyluck.com/category/culture/)

~~~
goldenkey
I see, thanks.

------
Chromozon
This was an undergraduate project? Props.

