The same app 4 times: PHP vs Python vs Ruby vs Clojure (adambard.com)
74 points by adambard on Mar 25, 2013 | 92 comments

Is it just me or is PHP easily the best choice for quickly implementing something like this? The Ruby example required the framework Sinatra, Python isn't a language most servers support out-of-the-box (it is beautiful though) and Clojure is the ugliest language I've ever seen (at least in this example). I look forward to the day PHP is standardised (it will happen), because it cops a lot of flak when it gets the job done without needing usually anything installed on the server to run it. Upload your scripts and run, BAM!

PHP just has a different deployment model in most cases using just Apache + mod_php out of the box and PHP is embedded inside the Apache process.

This model is memory intensive and does not scale well.

A more accurate comparison would be something like Nginx + PHP-FPM or even HipHop VM for PHP.

If your only measuring stick is "Upload your scripts and run, BAM!" then it's no real argument. But if you're at the scale of having professional systems engineers and scaling to millions of users, then deployment isn't an issue, as you have your processes running smoothly.

If you can deploy the same set up 10,000 times easily, it doesn't matter that the initial set up took you 2 hours vs. 1 hour.

Software is infinitely reproducible. The effective deployment cost then becomes 0.

Essentially, at scale, you are no longer running Apache + mod_php.

> If you can deploy the same set up 10,000 times easily

If you had 10,000 web servers, are you telling me it's difficult to create a script that will automatically pull files from a centralised network store?

In the above case, that's how I would set it up, and it's trivial.

Yes, I am saying it is trivial.

To be fair, the beast we call the PHP standard library is gargantuan and tailored to web development. Much larger than Ruby + Sinatra or Python + Flask, and arguably worse at general purpose programming. So, different strokes for different folks.

> I look forward to the day PHP is standardised (it will happen), because it cops a lot of flak when it gets the job done without needing usually anything installed on the server to run it. Upload your scripts and run, BAM!

What do you mean by standardized? PHP was quite popular in the past, and serves about 35% of web traffic (Wikipedia and Facebook alone would constitute a significant amount).

I don't know about you, but my requirements are to get the job done reasonably, not "get the job done by copying a script on a cheap-ass shared hosting and you can't install any extra library. lol".

Naming conventions, as pointed out, are a big issue. "strtolower" and "str_replace" are both string-manipulating functions, yet they follow different naming conventions. Don't get me started on the function arguments either: sometimes the string comes first, sometimes the needle comes first, etc. Once PHP has its naming conventions down it'll be much better.

You can do the Ruby bit without Sinatra - just rack - without too much hassle, really. https://gist.github.com/cheald/5242227

That's because PHP was made for this.

Part of the reason the Clojure code is ugly is because he's basically doing imperative code in a functional language. Here's my (highly annotated) take:

    (ns nurblizer.core
      (:use compojure.core)
      (:require
        [clojure.string :as str]
        [clostache.parser :as clostache]
        [ring.adapter.jetty :refer [run-jetty]]
        [compojure.handler :refer [site]]))

    ;; main nurble stuff
    (def nouns
      (->> (-> (slurp (clojure.java.io/resource "nouns.txt")) ; read in nouns.txt
               (str/split #"\n"))                             ; split by line
           (map (comp str/trim str/upper-case))               ; feed the lines through upper-case and trim
           set))                                              ; transform into a set

    (def nurble-replacement-text "<span class=\"nurble\">nurble</span>")

    (defn nurble-word [word]
      (get nouns (str/upper-case word) nurble-replacement-text)) ; return word if word in set else nurble

    (defn nurble [text]
      (str/replace text #"\n|\w+" #(case %               ; using anon func literal, switch on argument
                                      "\n" "<br>"        ; when arg is newline replace with br
                                      (nurble-word %)))) ; otherwise nurble the argument (a word)

    ;; webserver stuff
    (defn read-template [template-file]
      (slurp (clojure.java.io/resource (str "templates/" template-file ".mustache"))))

    (defn render
      ([template-file params]
       (clostache/render (read-template template-file) params
                         {:_header (read-template "_header")
                          :_footer (read-template "_footer") })))

    (defroutes main-routes
      (GET "/"        []     (render "index" {}))
      (POST "/nurble" [text] (render "nurble" {:text (nurble text)})))

    (defn -main []
      (run-jetty (site main-routes) {:port 9000}))

The result is the same but the approach is different: build a set of uppercase nouns, define how each word in the text is handled, and do a single string replacement pass over the text to produce the output. Doing things like tokenizing the text and processing it as a seq or keeping the nouns lower case and producing upper case output are easy but seem like unnecessary complications so I left them out.

Note: In the grand tradition of the exercise, the webserver stuff is untested. I did repl-check the nurble part but was too lazy to set up a project.clj and have leiningen pull in the webserver deps.
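
For readers who don't read Clojure, the same single-pass idea can be sketched in Python. This is a rough translation, not the commenter's code: the three-word NOUNS set is a made-up stand-in for nouns.txt, and the helper names are only illustrative. Like the Clojure version, it returns the upper-cased form of any word it keeps.

```python
import re

# Hypothetical stand-in for the contents of nouns.txt, stored upper-case.
NOUNS = {"APPLE", "BOOK", "HOUSE"}

NURBLE_HTML = '<span class="nurble">nurble</span>'


def nurble_word(word):
    """Keep nouns (upper-cased); replace everything else with the nurble span."""
    upper = word.upper()
    return upper if upper in NOUNS else NURBLE_HTML


def nurble(text):
    """Walk the text once, handling each newline or word as it is matched."""
    def replace(match):
        token = match.group(0)
        return "<br>" if token == "\n" else nurble_word(token)

    return re.sub(r"\n|\w+", replace, text)
```

Because the callback sees one token at a time, there is no need to split the text into a word list first and then loop over it.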

Yeah - just you. (jk :-)

Problem with PHP (for me) is that you gotta install Apache to make it work.

Actually as of 5.4 it has a built in web server you can use at least for development http://php.net/manual/en/features.commandline.webserver.php

Problem with PHP is he loaded the nouns list outside of the function. Not sure whether he tested the code at all ;-)

Not the PHP code, no. I can only guarantee that it almost works.

And I can guarantee that it's a lot of these

   PHP Notice:  Undefined variable: nouns in nurble.php on line 12
   PHP Notice:  Undefined variable: nouns in nurble.php on line 12
   PHP Notice:  Undefined variable: nouns in nurble.php on line 12
   PHP Notice:  Undefined variable: nouns in nurble.php on line 12
   PHP Notice:  Undefined variable: nouns in nurble.php on line 12

(Also: check out explode() instead of preg_split() and with a simple pattern like this, str_ireplace might be faster, as might be searching for the word in the whole text file instead of converting it into an array.)

nginx is pretty great w/ PHP. both as a general server and a reverse proxy.

is apache not cool anymore? i've done some pretty beast installations and apache handled them just fine.

Interesting article, except that the examples don't line up exactly. For example, in PHP it's trivial to store the file contents in a var and refer to that over and over (as OP does in Ruby), but instead he chooses to read from the file over and over again and then calls that a shortcoming of PHP vs Ruby when that's purely an implementation decision.

Ruby is clean (but I dislike the stack), and Python looks good, but ye gads is Clojure ugly... Uglier than PHP, even.

My favorite part about PHP is how easy it is to get a simple Nurble-like app up and running... but that's just me.

OK, I'll bite: how would you store the array of words in memory across requests in PHP? As far as I'm aware, the runtime deliberately doesn't provide a way to do that.

The closest you could get is to store it in $_SESSION, but that's per-user rather than global, and it still has to read the data from your session handler. Which by default just uses flat files, so we're no better off than we started.

The APC opcode cache extension also has a user cache which has an API similar to memcache but a bit faster since it's using shared memory and not going over the network. It's shared between all visitors across all requests.


APC. It comes as part of the official package as of 5.2.


... And that is something exclusive to PHP?

Please do, I'll update and edit as appropriate. I wasn't aware a trivial way existed of storing a persistent variable in PHP between requests. I bet I'll feel dumb when I find out what it is, though.

You can do it with APC, but beyond that, no, your solution is the PHP way to do it.

I think his point was just that you could hard-code your list of words into the code. It would be ugly, but it would work.

Isn't that the same as loading them from file on each request?

You would also want to install an opcode cache like APC, XCache, etc. That way the code is only compiled on the first page load.


No, because you don't have additional IO overhead.

I think that overhead would be negligible, especially if the other examples are all based on micro-frameworks (i.e. loads of files).

Premature optimisation is the root of all evil, right?

Disk IO is a massive problem at scale. This obviously isn't "at scale", so it works fine here, but avoiding hitting the disk (edit: any slow IO, really) is Performance 101.
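
The trade-off being argued here is between re-reading the word list on every request and reading it once per process. A minimal Python sketch of the second option, assuming a long-lived server process (which is what the stock Apache + mod_php model discussed above does not give you). `load_nouns` is an illustrative name, not something from the article.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def load_nouns(path):
    """Read the noun file once per process; later calls reuse the cached set."""
    with open(path, encoding="utf-8") as f:
        # frozenset so callers cannot mutate the shared cached value
        return frozenset(line.strip().lower() for line in f if line.strip())
```

Every request handler can then call `load_nouns("nouns.txt")` freely; only the first call in each process touches the disk.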

"full disclosure: I didn’t want to install Apache+PHP on my dev machine, so I haven’t tested this one"

Something rubbed me wrong with this statement. If you are going to make a remark like '[Clojure's] performance should be the highest of the above,' shouldn't you at least try them first?

Yeah, that was kind of odd. Apache and PHP are probably the easiest stack to install out of these four, so why bother including it in your article?

Full quote: "Ok, that seems to work (full disclosure: I didn’t want to install Apache+PHP on my dev machine, so I haven’t tested this one). Deployment in PHP works like this: Make sure your server is running apache and mod_php Put the files where the server expects"

So which one is it, did you not install Apache+PHP or do you just check out and put files where the server expects?

More proofreading would be nice.

I laughed at that too. Let's take the hard way out and test these admittedly-ridiculous-to-deploy web stacks instead of spending a few minutes installing a LAMP stack.

PHP can scale too: look at Facebook as an example or any hugely popular website running Wordpress/Drupal. Where do you want to spend your time: scaling your PHP production environment or configuring your JVM/rvm/Python/Passenger?

Not to mention NGINX is what people should opt for, not Apache when it comes to performance.

I've never used PHP with NGINX. Can you recommend a good tutorial for configuring the pair??

To be honest, I never used one. It came installed on my stack and all I did was disable Apache and enable NGINX and PHP-FPM (cgi compiled php, quite a bit faster).

But if there would be a tutorial, perhaps this? http://www.howtoforge.com/installing-php-5.3-nginx-and-php-f...

When using PHP and Nginx, you're going to become familiar with PHP-FPM (FastCGI Process Manager). The briefest tutorial to get to working is at http://www.howtoforge.com/installing-nginx-with-php5-and-php....

Look at php-fpm (FastCGI process manager) which has been bundled with PHP 5.3+. On the nginx side, you would configure it to use fastcgi_pass. An example configuration is available in the Silex documentation:


The original SMBC specifies that everything except the nouns should be nurbled; your implementations replace the nouns themselves.

I'd like to refactor the Ruby example, but I'd like to know what the proper implementation should be!

You are correct. I've updated the gists and text accordingly.

The Ruby example was already correct.

The inevitable ruby style niggling, feel free to ignore: personally I like something a little more concise and inlined, some parts seem to have used more verbose idioms, (like Regexp.new instead of a literal), and also I think you need to escape the backreferences or use single quotes, something like:

  (words - @@nouns).each do |w|
      text.gsub!(/(\b)#{w}(\b)/i, '\1<span class="nurble">nurble</span>\2')
  end

EDIT: oops, backslash. Also, in the example, line 19, where you have 'sub', don't you mean 'pattern'?

I submitted this article expecting to learn a few things about my style. In this case, I didn't actually know you could interpolate into regexes like that, so thanks!

And thanks for the typo catch, too.

No problem, in ruby style is always pretty subjective since there's a thousand ways to do just about everything.

And, just FYI you do need to escape the backrefs or use single quotes, like:

  "\\1<span class=\"nurble\">nurble</span>\\2"

I take it back - I had issues with escaping on irb but not in the app itself. Something new to learn...

A trivial speed optimisation for the Python example would be to use sets instead of lists, e.g. change line 8 to:

    NOUNS = {l.strip().lower() for l in f}
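
A small sketch of why this helps: membership tests on a set are hash lookups, while on a list they scan element by element. The in-memory `io.StringIO` below is an assumed stand-in for the article's open noun file `f`, so the snippet runs without any file on disk.

```python
import io

# Stand-in for the open nouns file `f`; the real code reads from disk.
f = io.StringIO("Apple\nBook \nHouse\napple\n")

NOUNS_LIST = [l.strip().lower() for l in f]   # list: `in` scans every element
f.seek(0)
NOUNS = {l.strip().lower() for l in f}        # set: `in` is a hash lookup

print("book" in NOUNS_LIST, "book" in NOUNS)  # same answer either way
print(len(NOUNS_LIST), len(NOUNS))            # the set also drops the duplicate
```

For a few dozen words the difference is invisible; it starts to matter when the list has thousands of entries and every word of the input is checked against it.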

I think it actually might be a little easier to read in Node.js/CoffeeScript.

    fs = require 'fs'
    express = require 'express'
    app = express()

    app.use express.bodyParser()
    app.use app.router
    app.use express.static(__dirname + '/public')
    app.set 'views', __dirname + '/views'
    app.set 'view engine', 'jade'

    nouns = fs.readFileSync('nouns.txt', 'utf8').split('\n')

    nurble = (text) ->
      text = text.toUpperCase()
      words = text.toLowerCase().replace(/[^a-z ]/g, '').split(' ')

      for word in words
        if word not in nouns
          re = new RegExp "(\\b)#{word}(\\b)","i"
          replacement = '$1<span class="nurble">nurble</span>$2'
          text = text.replace re, replacement

      text.replace  /\n/g, '<br>'

    app.get '/', (req, res) ->
      res.render 'index'

    app.post '/nurble', (req, res) ->
      res.render 'nurble', {nurble: nurble req.body.text}

    app.listen 3005

Cool stuff. The danger of this, though, is that you risk doing unidiomatic examples a disservice.

I thought you were just going for line-for-line parity until I came across the Clojure example. I think you could've written that same `for` loop in Clojure and either yield the original word or yield "nurble" if it intersected a noun. But this is put up or shut up territory and I'll take another look.

Also, could you clarify why you make multiple assertions that Clojure is "too much" for this task?

I'm a fulltime Ruby developer learning Clojure on the side and Clojure's Ring+Compojure feels like Rack+Sinatra to me. In fact, the choice between Ruby and Clojure to me is as trivial as using Python or PHP. Here's ClojureDoc's tutorial on making a DB-backed webapp in Clojure: http://clojure-doc.org/articles/tutorials/basic_web_developm.... It's just Sinatra/Flask in another flavor all over again.

The only downside of Clojure so far for me is that I might already be drunk with that special Lisp koolaid already.

It's hard to argue that the Clojure code is as easy to read as the Ruby or Python, and if you're working with a team that matters. Not to mention the relative dearth of Clojure-fluent developers. Neither of those hurdles is impossible to overcome, but if you're choosing between Ruby and Clojure, you have to justify that decision somehow.

Of course, if I'm the only developer I choose Clojure every time. That's probably the Lisp koolaid doing its thing.

Here's what I meant:

    (require '[clojure.set :refer [intersection]]
             '[clojure.string :refer [split]])

    (def nouns #{"apple" "book" "house"})
    (def text "My apple is red.")

    (defn noun?
      "intersect word with nouns and see if we've got anything."
      [word]
      (not-empty (intersection (set [word]) nouns)))

    (defn nurblize [text]
      (let [words (split text #" ")]
        (map #(if (noun? %) "nurble" %) words)))

    (nurblize text) ;=> ("My" "nurble" "is" "red.")

Yeah, that's much better than mine.

I think people like to exaggerate the difficulty of learning Clojure. If you have one person on the team who is fluent in it, the rest of the team can pick up enough of it to be productive in days.

The place I work at used to be exclusively a Java shop; we introduced Clojure for a project a year ago and everybody loved it. So far we've had co-ops, contractors, and the whole team learn it. I'm just not sure who these people who can't learn Clojure are exactly.

It might be more difficult to read initially if you've only worked with one family of languages, but that doesn't make it more difficult to read inherently. In fact, there are many benefits to readability once you're familiar with the syntax. It's a lot more regular, there are fewer special cases than in most languages, and you can see the relations in your code visually, since you're seeing the same AST that the compiler sees.

Rich Hickey sums it up best: "I can't read German therefore German is unreadable."

Weird that performance was the first reason he gave for picking Clojure. Ease of development is usually at the top of my list. Then ease of hiring developers to maintain the system. Clojure is hard to dive into and hard to hire for right now. Two big minuses in my book.

I sent you a pull request. I think the Clojure version becomes more performant still if you slurp the nouns into a set so you don't have to do a linear traversal.

Also, I think the regex usage makes the code a little harder to digest and comes at the cost of further unnecessary cycles.

I think this code gets even easier if you don't have to preserve the whitespace and filter non-alpha characters. (see danneu's suggestions)

Finally, what are your thoughts on passing along the wordlist (nouns) as an arg to nurble? This would offer referential transparency and also make the test easier.
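
In Python terms, the suggestion amounts to the sketch below: `nurble` takes the noun collection as a parameter instead of reading a global, so a test can hand it a tiny set. The function body is an assumed simplification for illustration, not the article's code.

```python
import re

NURBLE_HTML = '<span class="nurble">nurble</span>'


def nurble(text, nouns):
    """Replace every word not in `nouns`; the result depends only on the arguments."""
    def replace(match):
        word = match.group(0)
        return word if word.lower() in nouns else NURBLE_HTML

    return re.sub(r"\w+", replace, text)


# A test needs no nouns.txt and no global state:
assert nurble("red Apple", {"apple"}) == NURBLE_HTML + " Apple"
```

The web handler would still load the real list once and pass it in, so nothing is lost at the call site.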

Btw, I think your update broke your gist (where you changed (some (partial = w) nouns) to (not (some (partial = w) nouns))). Check that out again.

Cool post, thanks for sharing.

And for anyone who's interested - here it is in old-skool Java JSP style

  <%@ page import="java.nio.charset.Charset" %>
  <%@ page import="java.nio.file.*" %>
  <%@ page import="java.util.List" %>
  <%@ page contentType="text/html;charset=UTF-8" language="java" %>
  <%!
    public String nurble(String text) throws Exception {
      text = text.toUpperCase();
      // Java regexes take no /.../ delimiters
      String[] words = text.toLowerCase().replaceAll("[^a-z ]", "").split(" ");
      List<String> nouns = Files.readAllLines(Paths.get("nouns.txt"), Charset.defaultCharset());

      for (String word : words) {
        if (!nouns.contains(word)) {
          String pattern = "(?i)(\\b)" + word + "(\\b)";
          String replacement = "$1<span class='nurble'>nurble</span>$2";
          text = text.replaceAll(pattern, replacement);
        }
      }

      return text.replace("\n", "<br>");
    }
  %>

  <h1>Your Nurbled Text</h1>
    <a href="/">&lt;&lt; Back</a>

To my mind, besides all the import guff, the core of that code is more readable than many of the other language implementations and just as concise.

What are you considering the core? The nurble method?


  configure do
    @@nouns = Set.new File.open('nouns.txt').map {|noun| noun.strip.downcase }
  end

  def nurble(text)
    words = Set.new text.downcase.split
    # Replace words which are not nouns with nurble.
    (words - @@nouns).each do |word|
      text.gsub!(/(\b)#{word}(\b)/i, '\1<span class="nurble">nurble</span>\2')
    end
    text.gsub(/\n/, '<br />')
  end

Python is similar.

Personally, I find it far more readable than the JSP sample.

Yeah - the nurble method. It's really the Clojure that looks the worst.

Taking just the boolean query to figure out whether the word is in the noun list, I'd place them in this order from simplest to understand through to most obfuscated

  Python: if word not in NOUNS
  Java: if (!nouns.contains(word))
  PHP: if(!in_array($word, $nouns))
  Ruby: if not @@nouns.include? word
  Clojure: if (not (some (partial = word) nouns))

The Python is pretty much just English

The Java is almost English except for the !

PHP is starting to get more obtuse - ! and $ and you have to know the order of the parameters

Ruby (from the article) has @@ and ? but isn't too bad - might go above the php

Clojure is just weird - looking at the others, I reckon a person could easily modify and write a similar program without knowing much of the language. With Clojure I think it requires deeper understanding.

I haven't included your Ruby in that list because it's doing a different (much neater :) ) thing.

Perhaps I am missing something, but at least the PHP example is not the most efficient. Seems like you could easily build a nice regex with all your nouns and replacing them in the string, rather than splitting up the text and looping through each word. The code would be considerably shorter and likely quite a bit more efficient.

As for your comment about having to load the nouns file on every request: assuming you have PHP 5.2 or later, the file will stay in memory when using APC and is reused. You can also opt to extract the content and store that in memory.

Lastly, for performance, the code could be improved as I mentioned above, but Apache is definitely not the way to go for performance. NGINX is the way to go, both are very similar, but when it comes to performance NGINX is super lightweight, fast and handles a heck of a lot more connections.
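
The single-regex idea mentioned above could look roughly like the following. It is sketched in Python only because that is easy to run; it is not the article's PHP. Since the corrected spec nurbles everything except the nouns, the sketch uses a negative lookahead to skip them. The noun list is a made-up stand-in for nouns.txt.

```python
import re

# Hypothetical stand-in for nouns.txt (the pattern below assumes it is non-empty).
NOUNS = ["apple", "book", "house"]

# One alternation of all nouns, escaped in case an entry contains regex metacharacters.
noun_alternation = "|".join(re.escape(noun) for noun in NOUNS)

# Match any whole word that is NOT exactly one of the nouns.
NON_NOUN = re.compile(r"\b(?!(?:%s)\b)\w+" % noun_alternation, re.IGNORECASE)


def nurble(text):
    """Replace all non-noun words in a single pass over the text."""
    return NON_NOUN.sub('<span class="nurble">nurble</span>', text)
```

Whether this beats a set lookup per word depends on the regex engine and the size of the noun list, so it is worth measuring rather than assuming.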

weird. he uses a micro framework for the others but doesn't do so for php.

overall I think I prefer the python example.

PHP is the only one of them that doesn't need it.

Because PHP is a framework disguised as a language.

heh. It does if he wants to later complain about html and php code all mixed up.

The only reason that is true is that he chose not to use a micro framework for php.

That's fair, I took that particular whinge out.

You could have investigated a micro PHP framework like Silex tbh (silex.sensiolabs.org), with which you could have used Twig templates (twig.sensiolabs.org). A bit of overkill for such a trivial task, though.

Will Passenger just serve up ERB files just like PHP? Now I'm curious.

I don't think it can (I did a quick search to confirm). But with Apache, you can install a Ruby script as CGI which renders the ERBs and returns them.

You don't need Passenger. Just write the sinatra app, then 'ruby <yourapp>.rb'.

He isn't talking about the sinatra example. He is talking about apache/passenger serving .erb files directly as apache/mod_php does for .php files.

If anyone's interested here's my go approximation https://github.com/minikomi/go-nurble running at http://poyo.co:9000 for now :)

Heh. Our solutions came out very similar. I clearly did something wrong, as I thought backreferences didn't work in Go's Regexp package. Oh well. I'll remember for next time.


Just like others say, PHP is probably the winner here:

- You're comparing raw PHP with frameworks from other languages. Try not using frameworks in other languages.

- You're saying that PHP has no dependencies manager - http://www.getcomposer.org

- Using anything more than PHP for such a task is overkill.

To be clear, I'm not saying that PHP is the best language of them all. Like all languages it has its ups and downs and there's a lot of subjectivity on what you like and dislike. I'm just saying that such comparison isn't really fair.

> - You're comparing raw PHP with frameworks from other languages. Try not using frameworks in other languages.

Then try using Perl instead of PHP, maybe, so you can have fun importing CGI libs? Because PHP is a web dev framework. Don't argue with me, argue with PHP's own history page:

"... the very first incarnation of PHP was a simple set of Common Gateway Interface (CGI) binaries written in the C programming language... Rasmus rewrote PHP Tools [so the] new model was capable of database interaction and more, providing a framework upon which users could develop simple dynamic web applications..." -- http://php.net/manual/en/history.php.php

> I'm just saying that such comparison isn't really fair.

As presented, it's absolutely fair, otherwise you're comparing a web app framework (PHP) to languages that don't have, for example, HTTP POST handling unless you import those libs.

Here is my submission (written in standard Go): http://nurble.nickpresta.ca/

Source is available here: https://github.com/NickPresta/nurblizer

There are some minor differences (due to Go lacking backreferences with its Regexp package, as far as I can tell) that I don't care to fix. Still, in ~74 lines of code, you get something pretty decent (and no external web server required, either).

Just a quick note: PHP didn't use a microframework here (because it didn't need to), but you could have used one of the many that already exist. Someone even wrote a clone of Sinatra for PHP called Frank[1], although the code is old now and there are many newer, maintained microframeworks to choose from.

[1] https://github.com/brucespang/Frank.php

What would be interesting is if the project would have randomly picked one of the codebases to load so you could end up taking a spin on each.

Nice article. I discovered a lot thanks to you.

Note you are very nice with the PHP implementation by using a function instead of doing the foreach inside the HTML ;)

May I suggest improving the article by including: (1) an HTML template for each framework (since each seems to have its own syntax), (2) JS and Go examples (since they're the trend on YC), and (3) the framework name instead of the language name.

"The bit about dependencies is most interesting to me. In PHP, the problem of dependencies is offloaded to Apache, and by extension the server host. PEAR helps a lot, but even that’s not guaranteed to be available everywhere."

Most modern PHP projects (PHP 5.3+) are using namespaces, autoloaders, and Composer to handle dependencies.


I'm not sure if you tested your `PHP` code, but your regex is wrong (it returns empty values).


You can also improve the php function by using nl2br and isset instead of in_array

Here is a revised code : http://codepad.viper-7.com/4OtoTD

Tiny error: in Ruby global variables start with $, class variables start with @@.

Here's a Mojolicious (Perl) one I did up since I was bored: https://gist.github.com/RogerDodger/5242390

It's only about work with strings. Not very interesting - far from real web-applications.

i'm just running the php nurble() function from command line, but it seems you should reference global $nouns within the function and use either "/\W/" (upper case "W") or "/\s/" in the preg_split. am i missing something?

It surprised me to see Clojure had the longest example and PHP had the shortest.

I wasn't surprised to see PHP do well as having the shortest path to a responsive site (in the most minimal sense) is PHP's greatest feature.

I was working forwards from the Ruby example for all of these, so I may have implemented the Clojure one in a weird way.

PHP cheats a bit to be the shortest, since it just looks up files to execute and renders stdout as the result, removing the need for routing.
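
To make the routing point concrete: outside PHP, something has to map a method and path to a handler, which is the job Sinatra, Flask and Compojure do in the article. A bare-bones sketch using only Python's standard-library WSGI conventions follows. The handler names and response bodies are invented for illustration, and the POST handler just echoes the submitted text rather than nurbling it.

```python
import html
from urllib.parse import parse_qs


def index(environ):
    return "<form method='post' action='/nurble'><textarea name='text'></textarea></form>"


def nurble_page(environ):
    # Read and decode the form-encoded POST body by hand.
    length = int(environ.get("CONTENT_LENGTH") or 0)
    body = environ["wsgi.input"].read(length).decode("utf-8")
    text = parse_qs(body).get("text", [""])[0]
    return "<h1>Your Nurbled Text</h1>" + html.escape(text)


# The explicit routing table that PHP's one-file-per-URL model makes unnecessary.
ROUTES = {
    ("GET", "/"): index,
    ("POST", "/nurble"): nurble_page,
}


def app(environ, start_response):
    """WSGI entry point: look up (method, path) and call the matching handler."""
    handler = ROUTES.get((environ["REQUEST_METHOD"], environ.get("PATH_INFO", "/")))
    if handler is None:
        start_response("404 Not Found", [("Content-Type", "text/plain; charset=utf-8")])
        return [b"not found"]
    start_response("200 OK", [("Content-Type", "text/html; charset=utf-8")])
    return [handler(environ).encode("utf-8")]
```

For local experiments it can be served with `wsgiref.simple_server.make_server("", 8000, app).serve_forever()`; the port is arbitrary.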

Well, that's one of the few "advantages" of PHP.

Though PHP allows this to happen easily, it's a fairly terrible idea for production code - the shortest isn't always the best.

In reality you'd use a macroframework like Symfony or a micro like Ham https://github.com/radiosilence/Ham

There is a microframework made from Symfony components called Silex: http://silex.sensiolabs.org.

Edit: I mis-read; reworded for clarity.

i'd be interested in seeing a go version as well just for comparison.

Any "language/framework comparison app" that does not include user authentication and permissions, sessions, user specific persistent data, https communication, queueing of long lived operations, unit tests, and scaling across multiple instances is worse than useless.

I'd write it if you would buy the book.

Build three versions of an app with these characteristics using Ruby on Rails, Python, and Node and I pledge I would buy the resulting book for thirty five dollars.

Try KickStarter. I funded this book http://s831.us/105pQB8 and I am glad I did. I would pay at least five dollars up front to fund the writing of such a book.
