Show HN: Virtual Machines in the Browser (grunseid.com)
281 points by grun on Aug 15, 2013 | 148 comments

I was hoping it was a finished version of http://bellard.org/jslinux/

I was hoping it used http://www.paulgraham.com/arc.html

Same expectation here. Deploying a customized VM to the browser (with networking included) will be very interesting.

I am not sure about the real use cases. For learning development it will be useful, since you don't need a more complex interaction between the browser and your server.

Unfortunately, full networking capability as it exists in a typical virtual machine (like VirtualBox) will not be possible in pure JavaScript. Security features of the browser, like the same-origin policy, prevent it. Even if you were to use some of the exceptions to the same-origin policy, you would be limited to sending HTTP requests. There is no way to send a UDP packet from JavaScript, for example. Of course you can create browser extensions which expose these utilities, but then you're taking the easy way out :)

Would be neat for a project like this to go open source and accept community submissions. Not sure why the author of jsLinux didn't go that route.

Fabrice Bellard is the author of qemu and ffmpeg (among other things), so it is likely he is aware of the benefits of open source and it is a conscious decision (maybe there is interest from someone with relatively deep pockets, or maybe he considers it too hacky?) Don't want to speak for him here though.

I do wonder how qemu compiled via emscripten would compare though...

Fabrice Bellard is the Chuck Norris of all hackers..

It's a waste of time trying to figure out what his superior mind could be thinking :)

Unbelievable. I tried, echo, ls, find, grep, top, ping, whoami, ps, wget, vi, emacs. Everything is there. All in my browser in Javascript? This is fucking cool. I would love a complete vim in javascript that I can use to replace textareas. This would make it possible to simply use the vim.

The real wtf is installing tcc and running hello-world with a shebang...


I am not a vim user, but I guess this gets close:


It's the best attempt I've seen so far, but it's still very far from _the real Vim_.

See this: http://haldean.org/docstore/?vim-problems .

Interesting! Thanks, will look into it!

Just give me one more month, I'm almost there ...

anybody just typed:

$ ls

$ gcc hello.c -o hello

Reading the comments here... wow, HN must be a depressing place for people who create things. I feel sorry for the author.

As someone who gave depressing feedback, I'd like to defend my actions:

I've learned that attention is more important than appreciation. When you introduce a project, it's better to get people thinking about it, nitpicking and even questioning the point of your project, than to get a few replies of "Nice!" and be ignored in the long term.

I've noticed another thing: most things of which I'm initially slightly dismissive fail to reach any success, but the things I'm very dismissive of succeed. I've seen products get very harsh feedback and, a few months later, the products were successful, had better user experiences, and their users loved them.

But the visceral reaction to his project may have to do with the kind of expectations we have from web use vs. installed software.

I know some people install anything they happen by, and will try all one-line command line installs and random .pkgs they find, but most people who care about security are wary of installing closed source software from a not-very-known third party, or run arbitrary code in a virtual machine.

When you browse the web, there's an expectation of not worrying about security and malicious code execution. Javascript blurs that line, but it's mostly kept away from the innards of the host. Installed software is more intimate; in some ways, it's like letting a stranger into your kitchen, but not into your bedroom. Of course, they can do harm in the kitchen too, but the bedroom has implications that go beyond physical access.

HN can be very negative sometimes, and seeing angry or snide comments makes me sad too. You can give concrete, actionable advice and feedback without being an asshole, and no, that's not pretending to be nice, it's simply not going out of your way to be rude. But if you take everything into account, it's better to get discouraging comments from HN and filter them for real information than to be ignored.

I think you may be right that lots of negative feedback can be good in a way. Interesting observation.

I think one of the main reasons why projects like Arc or NaCl started is that the web is not a very good platform. It was created as a platform for documents, not for apps. I wish there was a platform having the advantages of both web and native platforms (Windows, Linux, Android, etc.).

I currently live in Thailand. This country may be the most polite in the world. In some situations Thais will never give you honest feedback [1]. For instance, they may give you compliments on how well you speak their language - regardless how well you actually speak it. Just to make you feel good. And (to some degree) they expect you to do the same.

In this culture you never quite know what people really think. Hacker News is the opposite. You will get (mostly) constructive feedback from (mostly) smart people for free, that you could not get elsewhere if you were to pay money for it. Don't come here to stroke your ego. That said, if you get positive feedback you know it is for real and not just out of politeness.

Sometimes however, people could package their criticism a little bit better or self-censor if it's not constructive.

[1] I don't claim to understand Thai culture. In some cases it may be the extreme opposite, where people sometimes say things to each other that would be considered outright insulting to strangers in the West.

That's a very good point. I recently realized that this bothers me too, especially in business - for example the appointment can be very polite and positive and they then just send a brief message saying that they've chosen someone else for the project without giving any reasons. And I don't even live in an especially polite country (Czech Republic) :)


* I don't think total honesty is always the best amount of honesty; in some situations it's better to twist the truth slightly. I see it as a tradeoff between honesty and politeness.

* You can be honest in many different ways. I agree that some of the comments here could be rewritten to be less depressing for the author while staying honest.

I've seen him demo the product, and text just doesn't do it justice. It's awe-inspiring and I'm glad he finally decided it was ready for its admiring public (got worried there). I'd pay no mind to the naysayers; that internet thing was pretty dangerously insecure at first too.

A 3-minute video demo (with no editing) is all Arthur needs to get the point across that this is the next big thing. Show the public what I've seen.

I noticed, some two years ago, the tide turning. "Show HN" posts became painful to read; instead of mostly constructive criticism, most or all comments were angry and hypercritical. It's been sad to watch.

Unfortunately, HN is a monopoly in its niche, I don't see a good alternative.

The only one I know of is lobste.rs, which is all fine and good if you know someone and can get an invite — without one, you're stuck in read-only mode. No idea why they'd make a decision like that.

I like HN for its news, but am not so sure about its moderation policies, much less the sometimes-hypercritical attitudes.

If we have learned one thing from people like Elon Musk, it is to continue to do what you believe in despite all the naysayers.

I hope the author continues working on the project.

(P)NaCl already solves this problem, without such hacks as transparently spinning up a virtual machine.

I don't trust Virtualbox to be especially resilient to attacks from malicious VMs. Chrome's sandbox is well-audited and (overall) is sound. A virtual machine host has a much larger attack surface, and generally doesn't assume malicious guests.

Native Client

  1) Is Chrome only.
  2) Can't spawn processes or subprocesses.
  3) Can't open raw UDP or TCP sockets.
  4) Requires apps be ported.

While I like the idea of running a VM in a browser (not sure if I'm convinced it makes sense, I still like the idea):

Argument #1 only makes sense if you support a lot of platforms. Right now you only support Mac OS X (according to another comment).

The number of people who use Chrome globally is larger than the number of people who use Mac OS X. Ergo, if you used Native Client and chrome, you'd be more ubiquitous.

Regarding sockets: chrome supports UDP and TCP: http://developer.chrome.com/apps/socket.html

If there's one platform to build on that is going to cover a large number of people and give close to native performance, it's Chrome+PNaCl. I wouldn't tell everybody to drop what they're doing and adopt that target (it's not ready yet). VMs in the browser are pretty nascent, too.

Why do you think that is? Is it because Google has a large team of engineers who have developed a proper security programme for the project and realised that doing 2 and 3 is bad, and that, as a result of that and other problems with doing this, 4 is necessary? There are reasons for these limitations.

I trust virtualization as a security boundary, particularly on modern CPUs with stuff like VT-d/VT-x, more than process separation on Unix/Windows. I still trust hardware separation a lot more, particularly because it is so much easier to audit and put multilayer controls on, particularly vs. administrative users.

Virtualbox in specific might be a vulnerability compared to other virtualization systems, though.

Um.. transparently spinning up a VM is not a hack. It's the actual design!

>Chrome's sandbox is well-audited and (overall) is sound.

So why does it fail during the annual Pwn2Own competition? You and I have different definitions of 'sound'.

Also, coming back to NaCl: its code verifier has never undergone large-scale deployment, testing, or real-world use by millions of users and apps.

VM hosts have assumed malicious guests ever since people started renting out VMs. It's a VPS host's nightmare to have a VM root exploit be escalatable to expose all the other VMs on that VM's host.

This is really cool.

It's completely backwards from my personal usage of the Internet; I get away with browsing online using a Penryn-era Pentium, an ARM-based Chromebook, and/or my Galaxy Nexus precisely because the majority of processing costs are offloaded on "traditional" websites rather than on the browser doing the rendering.

But if this means that application developers won't have to recompile every application under the sun (like, say, ffmpeg or Audacity) to run under asm.js or PNaCl, then I think it could mean that we could skip a decade or two of having to reinvent the wheel.

On the other hand, this feels suspiciously reminiscent of ActiveX, so I suspect you're going to have a hard time convincing people to adopt it if the security diehards warn you of running arbitrary code on your machine (even if it is in a sandbox).

"the majority of processing costs are offloaded on "traditional" websites rather than on the browser doing the rendering"

I have not found this to be the case. I find that most websites take a LOT of processing power to display - loaded with flash, scripts, video, etc.

A lot of sites are not really guilty as it is the ad network content inline with the site that pulls all of that computing power, but other sites (boingboing, for instance) generate a lot of CPU use just on their own.

And it gets worse all the time. I suspect that whatever gains we make with efficiency of HTML5, etc., will be immediately consumed by things like the OP is building.

I have a 5 year old macbook air that absolutely does not need to be replaced. Except that I can't have more than 10-12 browser windows open before it's pinwheel city...

With the devices I've used, I find that flash content loads and runs just fine provided that there's at most one running on each page and it doesn't crash.

Granted, I find myself enabling Adblock by default on most sites because most ads nowadays are annoying Flash pop-overs. Back in the day, the Linux implementation of Flash didn't support making the Flash embed transparent, so I had to go into Firebug/Web Inspector and delete the embed/object tags entirely just to read the page. While I think that particular bug has been fixed, even today, most of the Chrome tab crashes I run into are still caused by Flash crashing.

It really bothers me how much stuff depends on Flash still, whether it's putting something in the clipboard from the browser, or just Google Hangouts or Facebook deciding to play a "ping!" when a notification goes off.

The only thing "HTML5" means to me is that my ARM devices can offload H.264 video decoding to the GPU, rather than trying to run a cross-compiled Sorenson decoder on the (relatively underpowered) CPU. Everything else under the "HTML5" banner seems to be just increasingly complicated browser-specific extensions to JavaScript and/or CSS.

I've also noticed that the Chromebook and Chrome for Android will deallocate tabs that I haven't used recently and reload them when I switch to them, which lets me have dozens of tabs "open" on devices which are otherwise only capable of handling three or four tabs at once. Of course, this is only reasonable because of "high speed internet"; I remember being on dial-up in the late 90s running Internet Explorer 4 and opening ten windows in the background so that the pages would pre-load in the background while I read the current page.

Title is wrong. Virtual Machines are not in the browser, they are in your computer bridged to the browser.

Security-wise, this seems like an awful idea: unless the host is firewalled from the guest, if Arc-style VMs become popular, then you'll have malicious websites starting VMs to scan your host network for unpatched vulnerabilities and abuse them.

I try to keep my network secure, but e.g. Cisco/Linksys E3000 hasn't received a firmware update in a long time, and it has known exploitable bugs - right now, the fact that it is only accessible from inside the NAT, and that webpages can't do arbitrary accesses is what keeps all those E3000s from being exploited.

(My E3000 has been running dd-wrt, so it's not vulnerable to those problems; but I had to manually upgrade the dropbear ssh because of vulnerabilities - latest official dd-wrt for it is still vulnerable)

Security-wise it is no different than installing a multitude of software packages onto your computer, which many developers are already comfortable with (whether they should be or not).

Perhaps - But Dripy and Peggo are not targeted at developers.

The people who use Dripy & Peggo are also the people who already download spyware-infected "Youtube Downloader" apps.

I don't see a lot of difference - for good or for bad.

Arc apps will require explicit permission to communicate with a local network. This can be enforced at the hardware layer by the virtual NIC.

How do you define local?

Is it, and I guess that would cover 99% of local networks.
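The thread does not spell out a definition of "local", so the following is only one possible reading, offered as a sketch: treat an address as local if Python's standard `ipaddress` module classifies it as private, loopback, or link-local. The function name `is_local` is made up for illustration and is not part of Arc.

```python
import ipaddress


def is_local(address: str) -> bool:
    """Return True if the address falls in a private, loopback, or link-local range.

    This is an assumed definition of "local", not one taken from Arc.
    """
    ip = ipaddress.ip_address(address)
    return ip.is_private or ip.is_loopback or ip.is_link_local
```

A permission prompt could then be triggered whenever `is_local(destination)` is true, for example for `192.168.1.1` or `10.1.2.3`, while a public address such as `8.8.8.8` would pass without one.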

I wasn't aware virtualbox has firewalling at the virtual NIC level - my 4.1 doesn't; It's either host-only, bridged, or nat - of which bridged is unlimited, host-only is useless, and nat cannot (as far as I can tell) be firewalled at the virtual NIC level. So how do you do it?

Even better would be to require explicit permission to communicate with any site that isn't the app origin.

A VM can be put on its own VLAN with its traffic routed through a secure firewall. I don't think Arc is as doomed to be insecure as many are claiming.

VirtualBox, at least the version I run, cannot do that on its own. You would need to set up the firewall rules on the host. Which is, of course, possible - but not in a cross platform way (linux uses netfilter/iptables, bsd uses pf, windows uses ... I'm not sure what these days, but many users have a 3rd party firewall as well)

It isn't doomed to be insecure, but its security, portability and convenience/usability have a nontrivial tradeoff which is ignored by the original description. If it's portable and convenient, it is likely going to be lacking on the security front.

Which Dropbear vulnerability? The last one I can see is from 2012.

That's the one. The latest stable dd-wrt is from 2009 - if you're running dd-wrt and haven't updated dropbear, you're vulnerable.

The original Cisco firmware for the E3000 also has vulnerabilities, which were never fixed.

From an end-user POV, what will using an Arc app entail? Will it be like Flash Player and Java; ie, you download and install Arc once, and then all Arc apps will just work and be super awesome?

I had an idea like this, but instead of a VM, using a client-hosted server. The big concern I couldn't solve was security. If you have, say, Peggo.. What is to prevent other websites from being malicious and connecting to your locally-installed Peggo VM and trashing it or otherwise exploiting it?

> From an end-user POV, what will using an Arc app entail?

If Arc is installed, you're good to go. Everything just works. If Arc isn't installed:

  1) arc.js transparently falls back to the cloud and runs the Arc app on a
     server. The user doesn't know the difference.

  2) Upsell the user to install Arc.

I haven't built the transparent cloud fallback yet.

> What is to prevent other websites from being malicious and connecting to your locally-installed Peggo VM and trashing it or otherwise exploiting it?

The web server running in the Arc app can check the Referer header to verify the request came from a permissible domain.
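As a rough sketch of what such a check might look like (this is not Arc's actual code), the server could parse the `Referer` header and compare its host against an allowlist. The `ALLOWED_HOSTS` set and the function name are hypothetical; `peggo.co` is used only because it is the demo site mentioned in this thread. A check like this only constrains requests issued by an unmodified browser.

```python
from typing import Mapping, Set
from urllib.parse import urlsplit

# Hypothetical allowlist, for illustration only.
ALLOWED_HOSTS = {"peggo.co"}


def referer_allowed(headers: Mapping[str, str],
                    allowed_hosts: Set[str] = ALLOWED_HOSTS) -> bool:
    """Return True only if the Referer header names an allowed host exactly."""
    referer = headers.get("Referer", "")
    try:
        host = urlsplit(referer).hostname  # None if missing or not a URL
    except ValueError:
        return False  # malformed URL, e.g. an unterminated IPv6 literal
    return host is not None and host in allowed_hosts
```

Comparing the parsed hostname exactly, rather than using a substring match on the raw header, avoids accepting lookalikes such as `peggo.co.evil.example`.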

Can the Referrer header not be spoofed?

Spoofing it in a client's browser is not possible, but it's trivial to spoof referrer headers (or anything else) from a standalone program. Beyond checking for referrer headers, the server should give the client a signed token (returned by the client to the server) to verify the request is valid. Otherwise, if the client is arbitrarily sending requests to the server to "install X, run Y, ...", it'd be very easy to hijack the server for other processing.

As usual this goes back to one of the standard rules of server security: Don't trust anything that comes from the client.
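The signed-token idea above can be sketched with Python's standard `hmac` module. This is a minimal illustration under assumptions, not a description of how Arc works: the token layout (`session:expiry:signature`), the five-minute lifetime, and the names `issue_token` and `verify_token` are all invented for the example.

```python
import hashlib
import hmac
import secrets
import time
from typing import Optional

# Hypothetical server-side secret; it never leaves the server.
SECRET_KEY = secrets.token_bytes(32)


def _sign(payload: str) -> str:
    return hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()


def issue_token(session_id: str, ttl_seconds: int = 300,
                now: Optional[float] = None) -> str:
    """Create a token of the form session_id:expiry:signature."""
    current = time.time() if now is None else now
    payload = f"{session_id}:{int(current + ttl_seconds)}"
    return f"{payload}:{_sign(payload)}"


def verify_token(token: str, now: Optional[float] = None) -> bool:
    """Accept the token only if the signature matches and it has not expired."""
    try:
        session_id, expires, signature = token.rsplit(":", 2)
        expires_at = int(expires)
    except ValueError:
        return False  # wrong number of fields or non-numeric expiry
    expected = _sign(f"{session_id}:{expires}")
    # Constant-time comparison on bytes avoids leaking how much matched.
    if not hmac.compare_digest(signature.encode(), expected.encode()):
        return False
    current = time.time() if now is None else now
    return current < expires_at
```

The server would hand out `issue_token(...)` with the page and reject any later request whose token fails `verify_token(...)`, so a forged header alone is not enough to drive it.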

Spoofing with a client is easily done. In Firefox you can use TamperData.

You might want to change the name from Arc to something else. Arc is the name of a lisp dialect written by Paul Graham, and is the language Hacker News is written in. http://en.wikipedia.org/wiki/Arc_%28programming_language%29

Arc the language hasn't been officially updated in 4 years. All names have been used at some point - it's not realistic to expect them to be truly unique.

But "Arc" as a name is widely known and used, at least among the HN sort of community, and although it's a niche language it's actively used within that niche. I think you're overstating things; the name "Arc" was not ready for garbage collection.

Isn't this a problem that the market will correct? If Arc the language has a significant enough following, then "Arc" in common use will refer to the language. I'm not saying people should go around picking names others have used, but it seems a little heavy-handed to assume that because a few thousand people have an attachment to a name it is now off limits for everyone else. Moreover, at what point does it become ok? There's also Ark the YC company: http://ark.com/ is that fine because it uses a k?

Taking a look at it from a different angle, even in the case of the legal framework for names - trademark - having Arc the language and Arc the VM thing would be fine.

The truth is, if you don't want people to collide with your name, picking a 3 letter common word probably isn't the way to do it.

It would be lucky for me if you were right, because there's a name I've fallen in love with and want to use for an open-source project that is roughly as semantically distant from an existing project as "Arc the language" is from "Arc the VM thing". But the market isn't the only factor here. There are also cultural norms, which deserve respect because they encode how people ought to treat each other (or in this case, each other's work). So I worry about that.

Yeah, gauging your audience is a different aspect of this problem entirely. If Arc's ultimate audience is intended to be HN that's probably not a very good strategy, but if it isn't, then it doesn't really matter. You're always going to piss someone off if you pick something that isn't very obscure (or made up).

Names are hard. :/

Plus Arc is just one syllable and sounds good. That makes it really tempting.

'"Arc" as a name is widely known and used'

I just found out about this language thanks to the link.

More importantly, according to the TIOBE index, Arc programming language apparently isn't one of the 50 most popular languages.


> All names have been used at some point

I'm pretty sure the heat death of the universe will occur before all names are used.

Perhaps I should phrase it as "all good names"...

I believe this is technically accurate.

Arc is an indomitable name. It's short, memorable, recognizable, and pronounceable. It's also a strong prefix: Arc app, Arc box, Arc sync, Arc OS, etc.

Arc passes the highway test with flying colors. Imagine driving down the highway when an 18-wheeler thunders past. If you remember the name stamped on the side of the corrugated shipping crate, it's a good name. 'Arc' in a Neo-grotesque typeface next to an iconic logo on such a shipping crate is as good as burning Arc into your retina with a megawatt laser.

You've evaded the GP's comment, which is about the name "Arc"'s rather obvious existing binding. Why? I'm curious to hear by what process you've decided to ignore that.

At this point the name is so overloaded, I suppose we could criticize most post-'80s things named "Arc" on those grounds. I haven't heard pg/rtm defend why the Arc programming language chose to clobber the existing extension .arc, which is widely used by the ARC archive format. Or the binary 'arc', which has been used for ~30 years as the name for the command-line interface to the industry-standard ArcGIS. There's also a programming language in ArcGIS named the Arc Macro Language.

Those are good points. I suppose the counterargument is that it depends on how much the contexts intersect. It seems to me that in this case the contexts intersect rather a lot. But if I'm wrong, and overloaded project names don't matter, so much the better.

Not to forget Noah's floating palace!

Edit: just realized that was Ark with a K. My bad.

let the living eat the dead

Desktop client disk space is large enough to have a VM for many sites. In the future, if it increases enough to have a VM for every web application _and_ work offline, you have the selling point that the app can work during infrastructure failures. People will buy that. Sandbox the networking so it can only call your site and you have solved some of the security problems mentioned. Trouble is, a large amount of $$$ is behind tablets and their ilk. So you have to work on this for another 10 years until desktops come back. That will happen when Moore's law returns into existence, when light-based computing or analog-fusion-digital computing comes on the scene. What you have is a homomorphism to just a desktop application. The browser has grown to be a glorified dumb terminal that can display rich interfaces, and so the lines are blurred enough to confuse anyone trying to handle the client/server distinction. Best bet is to turn it into a platform that enterprises can use to solve their existing problems. Niche play. Great work, BTW.

Neat concept, but I don't see a VirtualBox-based implementation ever becoming mainstream... not that it needs to be mainstream to be useful.

It seems like full-blown x86 PC hardware emulation is overkill for what you're doing. As others mentioned, NaCl isn't really the right abstraction layer either.

Perhaps a stripped-down version of VirtualBox could be turned into a "standard" browser plug-in and paired with something like Docker, so you're just running a minimal container image.

I like the Docker idea, especially if Arc were providing the base linux VM via read-only mount and the website's Arc App was only a difference image on top of that ... nice and small.

On Chrome platforms, this could make use of Native Client to run the VM directly in the browser, rather than requiring the installation of an un-sandboxed browser plugin.

It's certainly a great project, but I can't help thinking: so now we're not satisfied anymore with running virtual machines to execute some code, we need the whole OS on top, along with the shell, and the preinstalled programs...

That story reminds me of the last time I tried to compile a BlackBerry app on my Mac a few years ago: a Java VM running my code within a BlackBerry simulator running inside of a Windows XP VM on top of my Mac OS.

Where will it end?

People prefer this incremental series of hacks to the existential despair of being faced with a cohesive system, say a lisp machine. :-)

And what happens to the VM when the user closes their arc app? It seems like you could build really powerful tools that leverage saving the state of the VM between sessions. Suppose I build an arc text editor/image manipulator, something where my workflow is boot program --> load file X --> edit file X --> save file X. In theory I could save states of the entire app between sessions, so the whole workflow becomes boot arc app --> edit file... And I could do this for any app I build in arc... Am I getting this right? Because that would be absolutely incredible for building session persistence without a home-brew backend.
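Nothing in this thread says how (or whether) Arc exposes this, but since Arc is described as VirtualBox-based, the underlying mechanism can be sketched with VirtualBox's own `VBoxManage` command-line tool, which can save a running VM's state to disk and resume it later. The VM name `arc-app` and the helper names below are hypothetical.

```python
import subprocess
from typing import List


def savestate_cmd(vm_name: str) -> List[str]:
    """Command that suspends a running VM and writes its state to disk."""
    return ["VBoxManage", "controlvm", vm_name, "savestate"]


def resume_cmd(vm_name: str) -> List[str]:
    """Command that starts a VM without a GUI; a saved state is restored if present."""
    return ["VBoxManage", "startvm", vm_name, "--type", "headless"]


def run(cmd: List[str]) -> None:
    """Execute the command; requires VirtualBox to be installed on the host."""
    subprocess.run(cmd, check=True)
```

Calling `run(savestate_cmd("arc-app"))` when the user closes the app and `run(resume_cmd("arc-app"))` on the next visit would give the "boot app, keep editing" workflow described above, at the cost of storing a full memory image per app.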

To me, this is the most exciting part. I'm thinking Dropbox for application state.

This would be very powerful.

Should have named it red pill. :)

Seriously though, I was more excited when I thought it was about running Linux inside the browser. Personally, I have no interest in installing an extra VM on my system just to make your development easier.

Hasn't this already been done with Java applets? Would the difference here be that instead of running Java, you could run almost any language?

> Would the difference here be that instead of running Java, you could run almost any language?

More than just any language - any Linux software.

And what are the dependencies? What is the lifetime of the VM? It says that Arc uses VirtualBox, does that mean I would need a full install of VirtualBox to use this?

Yes, I just tried it. The first screen on the downloaded installer says it will download & install Virtualbox, and run Virtualbox + Arc upon system startup.

VirtualBox is included in the Arc installer. It doesn't have to be downloaded or installed separately.

Congratulations! This is a great idea. If I needed to port a network- and/or performance-critical Linux app to the web, Arc offers some unique advantages. However, there is some serious competition from Java apps, jslinux, Emscripten w/ asm.js, and (P)NaCl.

I don't have a Mac, so I haven't tried out your demo yet, but ideally you should make this work like genymotion, using the existing VirtualBox install in headless mode. This would also make a much quicker install option for existing vbox users. Hopefully you can avoid the mistake YouWave made of interfering with existing vbox installs.

Where did you find the file to try it? I only saw the blogpost.

There was an example site that was linked in the blog post: http://peggo.co/

It will only work with OS X 10.7+ though.

Oh ok, that's why it worked for me then. I thought he was making the beta of Arc available already. It looks like it might solve a problem with an in-house project I've been working on at work recently.

A new, better, richer, open-source Flash or Silverlight? Well, it could have some uses.

But in my opinion the browser will eventually simply become the OS: you will compile everything to Javascript (or at least asm.js) and use all the interesting APIs from there. Sad but very probable.

cool, looks like the natural extension of client-side programming. why should we be limited to running things in the browser while our local machines remain underutilized? just spin up a sandboxed VM via Arc - brilliant! and - as so often - a seemingly obvious idea in hindsight...

should also be great for downloading torrents in a sandboxed environment.

Interesting, but dependencies are a long-term killer. You should check this project:

It creates a VM in javascript and can boot Linux.

How does one try out this beta? Is this going to be open source?

> How does one try out this beta?

Documentation for Arc and arc.js will be available shortly.

> Is this going to be open source?

Arc will not be.

Perhaps Peggo once Arc has cooled.

>> Is this going to be open source?

>Arc will not be

The project is certainly interesting and I respect your right to license it as you will, maybe for commercial reasons, but a closed source black box with relatively low-level access (like Java) makes me uncomfortable.

I may be alone in my aversion to browser plugins, but most of what Arc could be good for would be better solved with actual native software. See Peggo: A GUI app for OSX with just an input box for the YouTube URL and a choice of where to download the MP3 file would be a better solution for that problem. On the devices where Peggo would be useful (mobile phones, for instance), Arc can't be used.

Some things just shouldn't be web apps—I say this as someone who's never made a native app.

>> Is this going to be open source?

> Arc will not be.

Are you using a commercial license for VirtualBox then (and not the default GPL)?

This needs to be answered.

I am surprised by this. Will you expand on why you're not making it open source?

The author probably wants to make money out of it; why else do people lock stuff down? Funny how the VirtualBox licensing issue got overlooked, because we are all used to downloading stuff from GitHub with the assumption that any public repository is public domain.

I might be off base here but because the processing of data happens on client side is it possible that this could be used to increase privacy through zero knowledge apps?

If I have a VM in the browser can I run.. vim in the browser? My customized instance and all that? Or maybe this wouldn't be too feasible for io-heavy uses?

> If I have a VM in the browser can I run.. vim in the browser? My customized instance and all that? Or maybe this wouldn't be too feasible for io-heavy uses?

vim, emacs, anything that runs on Linux.

Try koding.com which already did it. http://d.pr/i/8Y1u

This is pretty cool. It has always seemed sort of silly to me that I have so many powerful computing devices in my life and yet most of the important applications in my life are inherently centralized and running on someone else's servers in a faraway location. Obviously the power of a remote datacenter is necessary for many applications, but for others it seems unnecessary or even like a hindrance.

Browsers are essentially virtual machines themselves, so this is highly redundant (Java already does this). Why have VMs in the browser, which is already interpreting code on its own? The next guy is going to come along and make a VM inside of a VM inside of a VM, and let's see how slow we can make the browser when it's 5 levels deep in abstraction.

This is an actual linux virtual machine... Not a language runtime.

It's too bad that "Virtual Machine" means so many things. Maybe someone has more info on this, but I understand that it was Java who first called their language runtime a "Virtual Machine". Partly motivated because they wanted to create an actual physical machine that ran Java (kind of like Lisp machines). Nowadays with type 1, type 2 hypervisors, and jails/containers/zones, and every programming language on the planet calling their runtime a VM, I'm not surprised that people are getting confused.

This sounds a lot like reinventing the wheel (JVM). JVMs lack the power of a full Linux core, but do you really need it? You just wanted hardware access + native threads (the JVM has these and more).

Also this sounds good for just 1 App, but what happens when you try to run more VMs than you have physical cores? And/Or memory, this would directly affect the host.

> what happens when you try to run more VMs than you have physical cores?

The VMs are just processes of the host OS. They're multiplexed over available cores, same as ordinary processes.
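To illustrate the point about multiplexing, here is a minimal sketch that starts more CPU-bound processes than the machine may have cores and lets the host scheduler share the cores among them. The child workload and the `run_oversubscribed` helper are made up for illustration; real VM processes are much heavier, but they are scheduled the same way.

```python
import os
import subprocess
import sys

# CPU-bound stand-in for a guest VM's workload (purely illustrative).
CHILD_CODE = "print(sum(i * i for i in range(200_000)))"


def run_oversubscribed(count: int) -> list[int]:
    """Start `count` child processes at once and return their exit codes."""
    children = [
        subprocess.Popen([sys.executable, "-c", CHILD_CODE],
                         stdout=subprocess.DEVNULL)
        for _ in range(count)
    ]
    # The host OS time-slices all children over the available cores.
    return [child.wait() for child in children]


if __name__ == "__main__":
    cores = os.cpu_count() or 1
    codes = run_oversubscribed(cores * 2)
    print(f"{len(codes)} processes finished on {cores} cores")
```

Everything still completes when there are more processes than cores; each one just gets a smaller share of CPU time.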

I like the idea here, but really wish you'd rename it to something which didn't collide with pg's Arc (which runs Hacker News).

There is a lot of awesome stuff you can do with "local, short-lived VMs". I've thought about how to do this securely (using hardware) -- an awesome end state would be letting a data owner with local data, and a code author in the cloud, both mutually distrustful, allow data owner's data to be processed by code owner in a safe way mutually agreed by each. You could do this with trustworthy computing, or a dedicated trusted third party environment on the net, or maybe with MPC someday.

Not sure if Linux or x86 vm is the right level of abstraction; maybe the "peggo" level where you have a higher level might be better for users. Maybe even something like Docker (but on the client)?

This project is going to die unsuccessful. The sooner you stop working on it, the less you will lose unless you really enjoy building this for no other reason than doing it and having something you can put on your resume.

People who need a solution like the one peggo needed can use a Java applet, which a portion of their users can already run, or they could use asm.js and convert their binaries into JS that runs in the browser. Months ago, I saw a post here where I ran a Qt desktop GUI in my browser that was nothing but JS.

I don't see any real projects / sites that would use Arc, because it would be a lot of friction / poor UX for their users, and it is simply unnecessary because there are already usable solutions to client-side computing.

This is a terrible idea. The author needs to stop what they're doing right now, from a security context this is really quite dangerous.

Arc uses a desktop virtualisation tool to run arbitrary code on your system. The manifest provides a set of packages to download and install and a series of commands to execute inside the downloaded Arc VM image. A malicious server could use this to run an app that attacks your network, the host, acts as a bot, anything.

I'm assuming that in order to run native code like this, there's no sandboxing. I've seen no mention of it. There's a reason you can't run native code in the browser without restrictions, and this bypasses all of that.

An attacker would have three primary avenues of attack:

  1) Escape the virtual machine sandbox.
  2) Denial of service of host resources.
  3) Attack network services.

For 1), virtual machines provide strong, hardware supported isolation of the host. They can't access the filesystem, resources, or hardware of the host, only their own.

For 2), virtual machines also provide strong protection from the denial of service of host resources. Virtual machines can be restricted in memory, CPU, and device use while running. This occurs routinely with virtual machines provisioned on servers.
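As a rough sketch of what such restrictions look like with VirtualBox, the helper below builds a `VBoxManage modifyvm` command line that caps memory, virtual CPU count, and CPU execution share. The function name and the VM name are hypothetical, and the flag spellings (`--memory`, `--cpus`, `--cpuexecutioncap`) are the ones documented for VirtualBox 4.x; check them against your installed version. How Arc itself configures its guests is not stated in this thread.

```python
def vbox_limit_command(vm_name: str, memory_mb: int, cpus: int,
                       cpu_cap_percent: int) -> list[str]:
    """Build (but do not run) a VBoxManage command that caps a VM's resources."""
    if memory_mb <= 0 or cpus <= 0:
        raise ValueError("memory_mb and cpus must be positive")
    if not 1 <= cpu_cap_percent <= 100:
        raise ValueError("cpu_cap_percent must be between 1 and 100")
    return [
        "VBoxManage", "modifyvm", vm_name,
        "--memory", str(memory_mb),                # guest RAM in MB
        "--cpus", str(cpus),                       # number of virtual CPUs
        "--cpuexecutioncap", str(cpu_cap_percent), # max share of a host CPU
    ]


if __name__ == "__main__":
    # "arc-guest" is a placeholder VM name, not one taken from Arc.
    print(" ".join(vbox_limit_command("arc-guest", 512, 1, 50)))
```

The command is only assembled here; passing the list to `subprocess.run` would apply it to a powered-off VM.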

Network services, 3), are the largest attack vector not protected by encapsulation within a virtual machine. One weapon virtual machines do possess, however, is complete control over the virtual NIC. Every packet sent or received can be inspected, modified, or discarded.

Arc's network policy is not carved in stone. There is nothing preventing Arc from adopting a same-origin policy, like browsers. Perhaps it will.
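A same-origin rule of the kind mentioned here could be sketched roughly as follows: an outbound connection is allowed only when its scheme, host, and port match those of the page that launched the app. This is a hypothetical illustration, not Arc's actual policy; the function names and the default-port table are assumptions.

```python
from urllib.parse import urlsplit

# Default ports used when a URL does not name one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple[str, str, int]:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme, -1)
    return scheme, host, port


def allow_outbound(app_url: str, destination_url: str) -> bool:
    """Allow a connection only if it targets the origin that served the app."""
    return origin(app_url) == origin(destination_url)


if __name__ == "__main__":
    app = "https://example.com/app"
    print(allow_outbound(app, "https://example.com:443/api"))  # same origin
    print(allow_outbound(app, "https://192.168.1.1/admin"))    # LAN host
```

A filter like this on the virtual NIC would block a guest from probing other machines on the local network, at the cost of also blocking legitimate third-party requests.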

Hey, I'm the guy being negative in the other comment. I'm sorry about that. Honest question: Do you really think this could be deployed on a large scale? Without reenacting the security nightmare that is Java Applets? You better be very sure about that...

Java applets aren't isolated from the host OS by the hardware. This is.

I don't see the fundamental difference: both run on a virtual machine, and in both cases it is likely that the isolation will be broken by exploits. And just because it's hardware doesn't mean it's less likely to contain bugs...

Another attack vector is to attack the Arc application itself, perhaps with a malformed manifest or something similar. The entire Arc application presents a huge surface area.

> An attacker would have three primary avenues of attack:

That's not true. You've identified 3 avenues of attack, there are far more. That's what threat modelling is for. It does not appear that this project has properly considered the threats associated with the attack surface area, but let's look at these three.

> For 1), virtual machines provide strong, hardware supported isolation of the host. They can't access the filesystem, resources, or hardware of the host, only their own.

Not true. The isolation isn't as strong as you think. Most virtualization software provides avenues to share information between guest and host. Code is still executed on the same hardware. Virtualization isolation is extremely hard to do correctly, and even the pros[1][2], have got it wrong in the past. Even Java's had sandbox issues, as has Google Chrome most recently. It's believed that VirtualBox may have been affected by this[3] bug, which is in Intel's 64-bit CPU architecture, not the product. I say may because it's really hard to confirm without sitting down and writing an actual exploit for the bug.

> For 2), virtual machines also provide strong protection from the denial of service of host resources.

That's not true. Yesterday I was testing a simple windows fork-bomb type attack (putting %0|%0 in a bat file to be precise) and the VMs were all suffering, as was the host. You can limit available resources in a VM, but that's neither here nor there - what happens if you open multiple VMs? How will the virtualisation react?

> Network services, 3), are the largest attack vector not protected by encapsulation within a virtual machine.

They're not the largest attack vector by a long way (arbitrary code execution by virtualisation escape is the most significant attack vector), but let's talk about network services. Just because packets could be inspected, modified or discarded doesn't mean they are. If I deploy a malicious VM to your computer and you start it, that means I'm executing the code of my choice in a virtualised environment on your computer on your network. Nothing will change that. You can sandbox parts of it to reduce what can be done to other components and it'll be down to the integrity of the isolation and sandboxing, but unless you start removing instructions, arbitrary code is exactly what my VM executes on your system.

Architecturally, it's a bad idea. If you were meant to execute native code on a system from a browser, it'd be supported across multiple browsers and OSes as part of a spec. Even Native Client (NaCl)[4] has its detractors and has made mistakes (possibly addressed in Pepper), yet seems to most likely do enough to support what this project proposes to do.

[1] - http://www.cupfighter.net/index.php/2009/07/blackhat-cloudbu...

[2] - http://seclists.org/fulldisclosure/2010/Jan/341

[3] - http://www.kb.cert.org/vuls/id/MAPG-8TVPQL

[4] - https://en.wikipedia.org/wiki/Google_Native_Client

I can understand the dream and the goal but it doesn't stack up. It's dangerous stuff. Stop it.

A lot of comments below recognise this. Most people here probably know... It's doomed to go wrong. This isn't supposed to be anything more than a neat tool. But still quite an interesting experiment, no?

As experiments go, it's as interesting as changing genes in foetuses to produce blonde kids. It's stuff that might seem intellectually interesting but falls into the category of things people shouldn't do because of the consequences.

There is sandboxing. The native code is not run in the browser or by the host OS. The native code is executed inside a linux guest running on virtual box. The browser itself does not run any native code. Others have pointed out that there are still security concerns having an arbitrary code execution device suddenly appear inside your network.

> There is sandboxing.

I think I missed that. Can you point me to the bit in the source code where the sandboxing is so I can have a look and assuming it's good retract any claims about a lack of sandboxing I may have made?

If this truly has the malicious potential you claim, what would stop someone with bad intents just making it themselves?

Users have to download this software in order for an attack to be made. Ideally it would only gain acceptance if it was very secure but Java applets are a great example of that not being the case.

I think this is awesome! I had the thought the other day that, perhaps using a VM or one of the lightweight container technologies (like Jails or Docker), we should be able to do this. Glad to see someone is already working on it!

> I want to build apps in Python and C and ship them in the browser. I can't.

Did you try the existing solutions for running those languages in the browser? (pyjamas, emscripten, etc.)

Were there specific limitations that prevented you from using them?

Yes, I really wonder if emscripten was explored here... I assume he was doing this all on the server-side (since there's a mention of $25k monthly server bills).

From the link:

> This was painful, expensive, and inefficient. Clients are more than capable of transcoding video; the problem is browsers aren't.

> Browsers can't run Linux software like ffmpeg, can't run Python, can't reach native performance, and can't make cross-domain requests.

Depending on exactly what ffmpeg is being used for, the legality of distributing codecs is going to be an issue. Virtualbox VM overhead is non-zero compared to "native performance" anyway so I'd like to see numbers on this versus running in a browser. Cross-domain requests can be an issue, but CORS headers can work (unless he does not control the site in question, or they don't want people making these sorts of requests).
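For reference, the CORS mechanism mentioned above comes down to the server echoing an approved origin back in a response header. A minimal sketch, assuming a hypothetical allowlist and helper name; it only works when you control the server being requested, which is exactly the caveat noted above.

```python
def cors_headers(request_origin: str, allowed_origins: set[str]) -> dict[str, str]:
    """Return the response headers that grant cross-origin read access.

    An empty dict means the browser will block the page from reading
    the response.
    """
    if request_origin in allowed_origins:
        return {
            "Access-Control-Allow-Origin": request_origin,
            # Tell caches that the response depends on the Origin header.
            "Vary": "Origin",
        }
    return {}


if __name__ == "__main__":
    allowed = {"https://app.example.com"}  # placeholder origin
    print(cors_headers("https://app.example.com", allowed))
    print(cors_headers("https://evil.example.net", allowed))
```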

From the post it sounds like this was a YouTube scraper, which basically means that there's no way he could legally be distributing ffmpeg + codecs needed in any case, and also this is against YouTube's ToS of course.

> Yes, I really wonder if emscripten was explored here... I assume he was doing this all on the server-side (since there's a mention of $25k monthly server bills).

It is on the client, I believe. Literally runs VirtualBox on the client side.

> Browsers can't run Linux software like ffmpeg, can't run Python, can't reach native performance, and can't make cross-domain requests.

ffmpeg definitely can be run in browsers, as can python. How close to native performance needs to be measured in each case, of course. I'm curious if they compared the performance and found it lacking, or just didn't try the browser options at all.

> It is on the client, I believe. Literally runs VirtualBox on the client side.

Right, sorry I meant the situation that Arc was intended to remedy (from the site):

> I want to build apps in Python and C and ship them in the browser. I can't.

> I ran full steam into this problem when I built Dirpy, a web app that records MP3s from YouTube. I could have built a native app in Python and C, but to test, distribute, and maintain builds for every OS is a total nightmare. So I built Dirpy as a web app, did all the work on servers, and paid $25,000 every month in hosting costs.

>> Browsers can't run Linux software like ffmpeg, can't run Python, can't reach native performance, and can't make cross-domain requests.

>> ffmpeg definitely can be run in browsers, as can python. How close to native performance needs to be measured in each case, of course. I'm curious if they compared the performance and found it lacking, or just didn't try the browser options at all.

Totally agreed (I was quoting the site there, to be clear) - I am a huge fan of emscripten's approach, and I think it would work really well for this use case. VirtualBox is overkill here, and getting people to install something like this is a huge barrier.

However I still think that it would not be legal to distribute the codecs one would need to do this, and it'd also be against YouTube's terms of service.

This is really cool. Thanks for releasing it to the world!

Of course it is only as secure as the VirtualBox sandbox is, but some variation of this statement would also be true for Flash, for Java and a host of other technologies that, despite whatever flaws they had, did allow a lot of interesting things to be built.

I am here musing, trying to come up with something nice I could build once the Linux-packaged version is made available...

Very nice! Games such as Runescape, which were built into the browser, stored data on users' computers to track botting. So, many of the first gold farming companies developed technologies like this to build into botting clients (which were very sophisticated web browsers).

There was a very interesting tech scene that many don't know about around MMO cheating, especially Runescape.

"$25,000 every month in hosting costs". Wow, is that right?

As I recall, Dirpy had hundreds of thousands of users. That's a lot of video transcoding.

Dirpy had three million monthly users.

Ah I see, my apologies. I thought it was a beta site; that wasn't clear to me when I read it.

This seems to me like a very good chance to work together with the Vagrant team and win. http://www.vagrantup.com/

$25,000 a month in hosting costs...on a youtube ripper? :0

Dirpy had three million monthly users.

Aren't browser plugins like Java and Flash bad?

The installation failed.

The Installer encountered an error that caused the installation to fail. Contact the software manufacturer for assistance.


Please shoot me an email so I can destroy the bug.

So we use ArcVM (aka VBox--controlled by Oracle) instead of JavaVM(also controlled by Oracle)?

Qemu would seem to be the better choice.

Virtualbox is open source - GPLv2


So is the entire JDK.

I think it's a nice idea, but you should consider adding support for other backend providers such as Docker.

Sort of maybe at an intersection with what NaCl hopes to achieve? Mini sort-of-virtualization of x86?

it's a 171.5MB app to convert youtube videos to mp3s

I didn't click ok

Maybe 40MBs is ok… :D

Awesome. Glad to see that you got it out the door, Arthur!

I'd like to see this come to fruition.


From the post:

"Arc is only packaged for OS X right now. It will be packaged and available for Ubuntu and Windows soon."

Something has to go first.
