
Show HN: AWS Infrastructure for Stanford Course in Parallel Programming - eslaught
https://elliottslaughter.com/2014/07/cs149-setup-tool/
======
yeukhon
Great work.

When I was an undergrad (not too long ago) there were two things I always
wanted to teach people: server provisioning and security. By provisioning I
mean showing people how to run an OpenStack environment, use Vagrant, use
Docker, and deploy using Ansible/Chef/Puppet/Salt/Fabric/shell scripts, and
showing them the complexity of deploying production and development
environments.

Many schools today use very traditional VMs, mostly powered by VMware /
VirtualBox. But most of the system admins are too caught up helping other
faculty and departments. Most of the time IT is underfunded. Everyone wants a
new computer and a new lab; admins don't have time to sit in their office and
learn new technology unless they are told to do so.

I encourage students to volunteer to help their system admins, and encourage
them to look at the possibility of running an environment similar to
OpenStack. I am sure by this time the system admins' lab is full of old
computers ready to be recycled or destroyed. Those machines can be used to
boot up lightweight VMs (think Docker containers).

------
ilaksh
Couldn't you do all of that, except for CUDA, using Docker on a single
machine/VM?

Also, why not use something based on OpenCL, which would be compatible with
many students' own computers rather than requiring NVIDIA hardware, so they
could do that part of the exercise in PyOpenCL or maybe even node-webcl or
something?

------
jmta
Why not use a local virtualized solution like
[http://informatica.uv.es/~carlos/docencia/netinvm/](http://informatica.uv.es/~carlos/docencia/netinvm/)
? You can create a model of a network with preinstalled and preconfigured
packages.

~~~
eslaught
Parallelism, as opposed to concurrency, is purely a performance consideration.
With concurrency, you can actually write programs you couldn't write before.
That is never the case with parallelism. The only reason to write parallel
code is to make your code run faster. Given how much of a burden parallel
programming is, if you're not seeing measurable performance benefit, you
should just drop it and go back to your original serial code.
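
A minimal sketch of that check, assuming Python and hypothetical helper names
(none of this comes from the CS149 setup tool): time the serial and parallel
versions, compute the speedup, and keep the parallel version only if it clears
a threshold. The 1.2x cutoff is an arbitrary assumption for illustration.

```python
import time
from typing import Callable


def best_time(fn: Callable[[], object], repeats: int = 3) -> float:
    """Run fn several times and return the fastest wall-clock time in seconds."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)


def speedup(serial_seconds: float, parallel_seconds: float) -> float:
    """Speedup of the parallel version relative to the serial baseline."""
    if serial_seconds <= 0 or parallel_seconds <= 0:
        raise ValueError("timings must be positive")
    return serial_seconds / parallel_seconds


def worth_keeping(serial_seconds: float, parallel_seconds: float,
                  min_speedup: float = 1.2) -> bool:
    """True if the parallel version is faster by at least min_speedup.

    min_speedup=1.2 is an assumed cutoff, not a standard value.
    """
    return speedup(serial_seconds, parallel_seconds) >= min_speedup
```

In use, you would pass `best_time(run_serial)` and `best_time(run_parallel)`
(both hypothetical callables) to `worth_keeping`, measured on the hardware you
actually care about.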

As a result, CS149 is a class in performance programming. Sure, if you only
wanted to teach how to program in each programming model, then you could
virtualize everything and run locally. But if you want meaningful, predictable
performance, you had better provide hardware where it is possible to achieve
the performance you're asking for.

Even the larger AWS node sizes can be suboptimal for our purposes. The
networks on AWS, for example, would be too high-latency, too low-bandwidth,
and too unpredictable for use with typical MPI applications. Even their
dedicated cluster instances still go through 10 Gbps Ethernet, with
performance that continually lags behind the specialized networks used in
modern supercomputers. So if we gave students an MPI assignment, we wouldn't
be teaching them the actual state of the art in the field today.

------
nomnombunty
Hi Elliott, I remember you TAing the class when I took it. I really enjoyed
working on the assignments, thanks for the great work!

~~~
eslaught
Thanks, glad you enjoyed it!

