
Mitchell: First, allow me to apologize for the tone of my comment. Re-reading it now, I realize that I glossed over a key point: I used Vagrant every day for a team of 5 people. In general, it worked very well, rarely broke in any way, and saved us a whole bunch of time getting up and running. So thank you for your small, but real, contribution to my startup.

> middleware was so _painfully_ the RIGHT way to do things

I did see how you were composing the operations and I'm familiar with the middleware pattern; thanks for sharing your talk. However, I think this is one of those cases where OOP has a name for something that FP just takes for granted: function composition. Regardless, Clojure's Rack equivalent does provide middleware as functions of request -> response maps. It's much nicer to work with, but that might be a result of Clojure's design vs Ruby's and unrelated to Vagrant: https://github.com/ring-clojure/ring/wiki/Concepts
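To make "middleware as functions" concrete, here is a minimal Ruby sketch of the idea. Every name in it (`HANDLER`, `WITH_USER`, `WITH_HEADER`, `compose`) is made up for illustration; it is not Ring's or Rack's actual API, only the general shape of a handler that maps a request hash to a response hash and a middleware that maps one handler to another.

```ruby
# Hypothetical names throughout; this is neither Ring's nor Rack's actual API.

# A handler is a function from a request hash to a response hash.
HANDLER = ->(request) { { status: 200, body: "hello #{request[:user]}" } }

# A middleware is a function from a handler to a new handler.
# This one adds a key to the request before passing it on.
WITH_USER = lambda do |next_handler|
  ->(request) { next_handler.call(request.merge(user: "dev")) }
end

# This one adds a key to the response on the way back out.
WITH_HEADER = lambda do |next_handler|
  lambda do |request|
    response = next_handler.call(request)
    response.merge(headers: { "X-Wrapped" => "true" })
  end
end

# Building the stack is plain function application: the first middleware
# listed ends up outermost.
def compose(handler, *middlewares)
  middlewares.reverse.reduce(handler) { |inner, mw| mw.call(inner) }
end

APP = compose(HANDLER, WITH_HEADER, WITH_USER)
```

Calling `APP.call({ path: "/" })` runs the request through `WITH_HEADER`, then `WITH_USER`, then `HANDLER`, and the response flows back out in the opposite order.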

> Write a shell script that does the following

I don't have any of those requirements. I only need to support OSX as the host OS. The team's MacBook Airs don't have enough resources to run more than one VM, so I don't ever worry about port collisions. I need to configure my network adapters anyway for my production deployment, so I just need an if statement or two in that config file template. Hell, I don't even need the ability to turn off a VM gracefully, since kill -9 seemed like a decent way to test crash recovery.

Even though I don't have those requirements, I used Vagrant. It saved me some time up front.

But that complexity has a cost. Running `vagrant ssh`, for example, took 3ish seconds on my machine. Again, this is probably due to RubyGems, which is painfully slow. I had to explain to the team to run `vagrant ssh-config >> ~/.ssh/config`, then to go tweak that a bit, so that they could write `ssh dev` and have it be instantaneous.

Worth noting however, is that I already needed a way to generate local ssh config for each production box to be able to talk to specific other production boxes, and for client machines to talk to our cluster. Vagrant's ssh-config command was just one more extra moving part to concern myself with, so I manually put the correct config into a cat script along with the production needs, and added it to the build process.

There were one or two more of these little things that escape my memory now.

> make all of that REUSABLE and EXTENSIBLE

I think my main complaint about Vagrant is one which you seem to be fixing, starting with the very commit that this HN thread is attached to: You're pulling out the REUSABLE bit of abstracting away various virtualization platforms across a variety of host and guest operating systems.

However, I'm a big believer that EXTENSIBLE does not belong in library code. Extensibility belongs at user layers. Furthermore, extensibility is really fucking hard. 99% of the time, I'd rather the core be more reusable and allow me to bypass extensibility.

I think that Git is a perfect example of this design. They call it plumbing and porcelain. The plumbing commands offer zero extensibility. They don't have any creature comforts, as their output is tuned for parsing. The goal of these commands is to manipulate the abstractions of the underlying system. The porcelain, by contrast, does offer extensibility via git-foo.sh scripts on your path, aliases and configuration, commit hooks, etc. I'd see a lot of value in something like this for virtual machine management.

Another example would be Chef & Puppet's facts systems. I forgot what they are called. Ohai? Something like that. They provide OS-independent interfaces for gathering facts about machines. Those components should be hoisted to first class project status!

Unfortunately, Vagrant does not, yet, provide a plumbing vs porcelain split. In many ways, I view Vagrant like SVN: It's a damn solid piece of software. Lots of people get a ton of mileage out of it. However, its underlying concepts are slightly muddled together and there are huge gains to be reaped from deep thought and careful modularization. When those big gains do come, they will only work on a single platform, maybe one or two guest operating systems, people will bitch about its low ease of use, despite its inherent simplicity, but the people who deeply understand machine provisioning and management will adopt it in droves, until eventually Windows support doesn't suck too bad, all the various use cases are more or less covered, and people are wondering "How can you possibly still use that old thing?"

I think you've got a good start towards that goal in Vagrant, but my complexity sense is tingling.

> I'd say Vagrant's 3000 line of code count is pretty spot on

I just ran cloc on the master branch locally; I see 12355 lines of Ruby ;-)

Thanks for your response. Lots of points in here so I'll try my best to respond to each in turn. I apologize if I miss anything (just point it out, I'm not trying to avoid anything).

1. Function composition middleware

Actually, I simply use classes as a way to do function composition. The reason I chose classes over functions I think doesn't really matter in the grand scheme of things, but in my talk I mention that what I am doing is function composition, and I spent a solid 5 to 10 minutes explaining how function composition solves the problem. :) We're on the same page here.

2. The fact you don't have the requirements for the situation I posted.

I understand this, and that is why Vagrant works for you. But it also works for people who do have requirements other than yours. There is a _huge_ number of Windows users out there who love Vagrant because it now makes development with Ruby on Rails and the like work well! If I catered Vagrant to only _my_ requirements (Mac OS X host, Ubuntu guest), then Vagrant would indeed be a lot simpler, but Vagrant as a tool gets a lot of its power and usefulness out of its ability to work in heterogeneous environments.

3. `vagrant ssh` is slow. This must be caused by complexity.

This has nothing to do with complexity. This is caused by poorly optimized code paths and how things work up to that point. I just want to note that nowadays `vagrant` commands across the board are a LOT faster. But there is still a lot of work to be done. Specifically the `vagrant ssh` case will get a massive speedup due to cached SSH info.

I specifically have issue with "complexity causes slowness." Sure, it happens, but it doesn't HAVE to, and I'll show that to be true.

4. Plumbing and porcelain: Git vs Vagrant

Perhaps there is an option here to use unix-type piping to provide an API via plumbing commands. I've decided to take a different approach. Will this work well? Time will tell, but I don't think it's fair to criticize Vagrant before this approach has been proven or disproven.

5. 12000 lines of Ruby.

~3500 is test code.

~5500 is "plugins" but there is a lot of boilerplate here. The boilerplate is on purpose, I didn't want any "magic," hence it being somewhat Java-like, I suppose, but this is a lot more friendly to non-Ruby developers, and a vast majority of Vagrant users are not Ruby developers. Note that ~3000 of this is ONLY the VirtualBox plugin, which is a pretty complex piece, so that makes sense. The other ~2500 is _every other_ feature of Vagrant.

~3300 is core code. This is mostly a glorified plugin runner that handles plumbing for you, such as hooking up the right host/guest combination, choosing the right communication (SSH, WinRM, etc.) for the machine, error handling, etc.

Hi Mitchell, it's really great to have you explain all of these things here. Thank you both for this and for writing Vagrant, there is no doubt that it is a valuable piece of software. I have a few more questions if you don't mind:

1. When you first started with Vagrant, did you consider using Python at all? Did you consider using libvirt? Because VirtualBox offers a Python API out of the box (or it does now) and libvirt comes with direct support for Python (http://libvirt.org/bindings.html). I like Ruby and generally I prefer it to Python but I am interested in your view of whether it was the right tool for the job in this case or whether the language choice didn't matter much.

2. In your view, what is the overlap in functionality between libvirt and Vagrant and what are the differences? As far as I can see, libvirt does let you spin up boxes, talk to them and tear them down. The notable difference is that there is no Puppet or Chef provisioning. Is there anything else beyond that?

3. veewee in version 0.3.0 (currently in beta) seems to offer the same functionality as Vagrant plus the ability to create boxes using templates, minus the ability to run provisioners. If they add that say in 0.4.0 then it will be 1:1 in terms of features with Vagrant. Any thoughts on this?

Thank you again for your time and for answering in such detail :-)

No problem, I'm glad you're getting value out of the responses. In response to your questions:

1. When I first started Vagrant, I was a full time Ruby developer, so Ruby was really the obvious choice for me. I don't really see language as a barrier for using a tool, my opinion is generally use the best tool for the job. And as a Ruby developer that at the time had 4 years of Ruby experience, that was the best tool for the job since I'd get the most productivity out of it. To date, I don't regret this decision. There are some things Ruby is bad at, some things it is good at, but I think I can overcome the bad with time.

2. I think the difference is that Vagrant is a tool focused on user experience and workflow, i.e. it'd be possible to build Vagrant on top of libvirt. I think libvirt is technically stronger than Vagrant, but Vagrant provides a better overall experience. The overlap is small, and we both have a lot to gain from working with each other.

3. A handful of people keep saying that Vagrant is just a "VM setup + provisioning" tool. Vagrant does quite a lot more. The main example I always use is networking because it is usually the most complex. Vagrant makes networks work across Linux, Mac OS X, and Windows, and sets up the hosts and guests properly. VeeWee doesn't do this and won't do this. VeeWee at its core was built to be a VM image creation tool. VM lifecycle control was bolted on later. VeeWee filled a major gap in the Vagrant ecosystem for a long time.

I'm good friends with the creator of VeeWee and we talked all the way back in October 2011 about merging the functionality. He was all for it. It's been almost a year, but this work is finally going to get started.

I don't want to talk too much about it until I really start coding, but I can say that no one in this thread has really seen the true scope of what I'm trying to build here. They will. :)

Forgive me for derailing the conversation, but I've been following this thread about The Cathedral & The Bazaar: http://news.ycombinator.com/item?id=4407188

In particular, I've found the discussion of autoconf to be fascinating. I think what fundamentally bothers me about Vagrant is that it feels like autoconf. By that, I mean it's a system which 1) hides a ton of hard work that users don't need to do to target multiple backends 2) accomplishes that hiding through complexity 3) further hides that complexity through an "easy" interface 4) generally just fucking works most of the time 5) is completely maddening to deal with when it does break 6) ideally shouldn't need to exist.

Anyway, I realize that ideals != reality. I also realize that I've got a learned behavior of reinventing the minimal viable tool from fundamentals to avoid dealing with the complexity of problems that I don't have. And furthermore, I realize that my learned behavior is of limited applicability outside of narrow environments, such as my own startup where I own the full stack and make all the rules.

"However, I think this is one of those cases where OOP has a name for something that FP just takes for granted: function composition. Regardless, Clojure's Rack equivalent does provide middleware as functions of request -> response maps."

I'd like to address some of the finer points in the comparison of middleware vs function composition that have been overlooked.

First class functions - While Ruby does indeed have lambdas and blocks, it does not have first class functions or partial application. This makes state and argument management much more difficult. That is, Vagrant would have to cram _everything_ into `env`...

State - The middleware(s) operate on a common state container `env` and this is where the initial composition comparison arises. The common argument is consistently the `env`. The problem with cramming everything into `env` is that you discard data encapsulation. All the things are known to all the people, which is unnecessary and can cause implicit dependencies between middleware(s). By instantiating a class with values specific to its operation we avoid this issue altogether. To achieve a similar result with pure functions you would need something like partial application which Ruby simply doesn't support. It's also worth noting that with pure functions you could adopt a monadic pattern for composing and managing the state which is, IMO, a more accurate analog [1].
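The encapsulation point can be sketched in a few lines of Ruby. The classes below are hypothetical and are not Vagrant's actual middleware; they only borrow the Rack-style convention of taking the next step in `initialize` and responding to `call(env)`. The thing to notice is that `Provision` keeps its recipe in an instance variable, so the other steps never see it in `env`.

```ruby
# Hypothetical classes for illustration; not Vagrant's actual middleware.
class Boot
  def initialize(app)
    @app = app
  end

  def call(env)
    env[:log] << "boot"
    @app.call(env)
  end
end

class Provision
  def initialize(app, recipe)
    @app = app
    @recipe = recipe # private to this step, never written into env
  end

  def call(env)
    env[:log] << "provision #{@recipe}"
    @app.call(env)
  end
end

# The innermost "app" simply returns the env it was given.
TERMINAL = ->(env) { env }

# Nesting the constructors composes the steps: Boot runs first and
# hands control to Provision, which hands control to TERMINAL.
def build_stack
  Boot.new(Provision.new(TERMINAL, "nginx"))
end
```

Running `build_stack.call({ log: [] })` leaves `["boot", "provision nginx"]` in the log, while the `env` hash itself holds no `:recipe` key.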

Method dependencies - With regular functions you have two choices: you can simply use a global reference to a helper function to maintain discrete testable operations, or inline all the different bits of functionality into the function itself, thereby making it difficult to test. Generally speaking I'm a fan of keeping those helpers namespaced and easily sharing the small bits of state specific to the instance's operation between them via context. Keep in mind that small classes with sparse state access through the implicit self can actually be more readable, and the ability of the reader to reason about execution isn't hindered too horribly by the implicit state.

Warden - Implementing a recovery pattern is hard using just pure functions. In Vagrant a watcher middleware (Warden) is inserted between each middleware to catch exceptions. When an exception is thrown, because the parent is the Warden it can then recognize its place in the stack and work backwards calling a `recover` method on all the preceding middleware in the stack. This means that each middleware is effectively two methods. One for the initial operation and a second for recovering in the case of failure. You can implement something _similar_ using Either, but in reality you need a custom record data type that carries both functions in which case you're not composing a simple function any more and the difference is really just that Vagrant is written in Ruby. I've written a blog post on the inspiration for Vagrant's Warden implementation if you're interested [2].
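As a rough illustration of the recovery idea only, here is a flattened Ruby sketch. The names `Step` and `run_with_recovery` are hypothetical, and this is not how Vagrant's Warden is actually written: the real thing wraps nested middleware, while this version walks a plain list. It also assumes that only steps which finished successfully get recovered; whether the failing step itself should be recovered too is a design choice the sketch does not settle.

```ruby
# A minimal sketch with hypothetical names; Vagrant's real Warden differs.
class Step
  attr_reader :name

  def initialize(name, broken: false)
    @name = name
    @broken = broken
  end

  # The forward operation.
  def call(env)
    raise "#{name} failed" if @broken
    env[:log] << "run #{name}"
  end

  # The second method each step carries: undo whatever call did.
  def recover(env)
    env[:log] << "recover #{name}"
  end
end

# Runs steps in order. If one raises, calls recover on the steps that
# already completed, newest first, then re-raises the original error.
def run_with_recovery(steps, env)
  completed = []
  steps.each do |step|
    step.call(env)
    completed << step
  end
  env
rescue StandardError
  completed.reverse_each { |step| step.recover(env) }
  raise
end
```

With three steps named import, network and boot, where boot is constructed with `broken: true`, the log ends up as `run import`, `run network`, `recover network`, `recover import`, and the original exception still reaches the caller.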

Vagrant's Middleware != Rack - PEP 333 makes no assertions about state cleanup because in an HTTP request there really isn't much state (the prescription is to return a helpful error message [3]). Vagrant on the other hand is writing hundreds of MB to the disk and needs to clean up when something goes wrong, so I'm not sure that drawing a comparison with a Rack implementation makes sense.

1. http://johnbender.us/2010/07/22/middleware-composition-and-m...

2. http://johnbender.us/2010/10/18/haskell-and-vagrants-middlew...

3. http://www.python.org/dev/peps/pep-0333/#error-handling
