Suppose I wanted to write a C++ based web application...
33 points by jgalvez on Oct 5, 2008 | 86 comments
What should I use? Ideally I would either have a Python or Ruby application with all its bottlenecks coded as external libraries in C. But what if I just wanted to write plain C++? How can I run it without having to resort to CGI or running it on top of a webserver library (eg. the app is the server)? Are there any event-driven servers that allow for easy plugging in of C++ applications?



It's not totally crazy. I think Meebo's mostly written in C++, for instance, since they're always looking for C++ and JavaScript people but they never mention scripting in their job ads: http://www.meebo.com/jobs/openings/server/ I think they mention in one of these blog posts that they wrote a lighttpd module.

I've written a C++ Apache module for a simple web app before. It's definitely harder than using a scripting framework though, not so much because of the language, but more that frameworks have functions that help you do just about anything (e.g. sanitize input, construct safe SQL). In C++ land you're more on your own since most people aren't crazy enough to build a web app in C++.
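As an illustration of the kind of helper a scripting framework would normally hand you, here is a minimal sketch of input sanitizing for HTML output in plain C++. The function name `html_escape` is hypothetical and not taken from any library mentioned in this thread; it only covers escaping text for HTML and does nothing for SQL.

```cpp
#include <string>

// Hypothetical helper (an assumption for illustration, not a library API):
// escapes the five characters that are significant in HTML text and
// attribute values, so user input cannot inject markup.
std::string html_escape(const std::string& input) {
    std::string out;
    out.reserve(input.size());
    for (char c : input) {
        switch (c) {
            case '&':  out += "&amp;";  break;
            case '<':  out += "&lt;";   break;
            case '>':  out += "&gt;";   break;
            case '"':  out += "&quot;"; break;
            case '\'': out += "&#39;";  break;
            default:   out += c;        break;
        }
    }
    return out;
}
```

Every value that originates from the browser would pass through a function like this before being written into a page. For SQL, the usual equivalent is to rely on the database client library's parameterized queries rather than hand-written escaping.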

Also, if you go down this route and you're particular about C vs. C++, check out mod_cpp for building Apache modules: http://sourceforge.net/projects/modcpp/

I echo the advice that if you have something performance-sensitive, build it in C/C++, but proxy it through something built with a better-known framework. Thrift is great for this: http://incubator.apache.org/thrift/


Thanks a lot, Thrift looks very cool. That's exactly the kind of thing I would see myself doing a lot. Offload stuff out of Python altogether and communicate with a C++ daemon.

I heard about Thrift when Protocol Buffers came along but didn't bother to look into it. Is it any better than PB? Does anyone have any experience on this to share?


Here's the big Thrift/PB comparison table: http://stuartsierra.com/2008/07/10/thrift-vs-protocol-buffer...

Personally, I like Thrift better. More languages and greater community participation.


Meebo's in Java. Or at least so said the founder when on a recent TechCrunch meetup panel. Specifically, he was making cracks at Ian (from Songkick) for Songkick being in Ruby. ;-)

http://uk.techcrunch.com/2008/09/21/techcrunchtalk-video-sta...


I think you're confusing Bebo and Meebo. The co-founder of Bebo was on that panel and Bebo is written in Java.


Dammit. I swear I'm too old for Web 2.0. ;-)


It's ok, you just had a "senior" moment. Just relax, and don't forget the blood pressure pills ;)


I doubt the front-end stuff, e.g., templating, is written in C++.

Rather, they use C++ because the library that powers their IM client, libpurple, is written in C. Writing a C/C++ bridge to whatever (Java, Ruby, Python, PHP) isn't hard and there are lots of generic solutions out there already, e.g., Thrift.


bloglines is entirely C++. I remember seeing some interview where the guy said it was critical to beating their competitors. They were all busy trying to scale while bloglines already worked.


Interesting. That seems to be the exact opposite of pg's view that Lisp was one of Viaweb's main competitive advantages, because it allowed them to rapidly add new features.

http://www.paulgraham.com/avg.html


I think Lisp is reasonably fast, possibly as fast as C++.


and where are they now


Well, I think the only major competitor to Bloglines these days is Google Reader which is in... C++ as well. Well, I would guess also probably a few bits of Python in between. Any googlers here? :)


How do you know it's C++ and not Java?


Please lay down for a while and it will pass.


I haven't used this personally, but I remember reading about the web server behind okcupid.com: http://www.okws.org/

It uses c++: http://www.okws.org/doku.php?id=okws:tutorial


Disclosure: I'm one of KLone developers.

Use KLone http://koanlogic.com/klone/ .

It lets you write web pages with embedded C/C++ code (php-style escaping).

It's VERY fast and provides all common functions/features you need to parse incoming HTTP variables, sessions, file uploads, etc.

If you need more features you can link it against any C/C++ library and call library functions from within web pages.


That's exactly what I was going to suggest.

If you're doing something computationally intensive (like, a dating or quiz site, for example, with the impressive statistical analysis that OkCupid and HelloQuizzy deliver up very quickly), then it would pay to look at an approach that's working well right now.


Thanks for linking it - that was the first thing I was thinking of as well when I read the guy's post.


You could write an Apache module. Or use FastCGI. Or libasync: http://pdos.csail.mit.edu/6.824-2004/async/6.html

Let me know what you end up doing.


+1 for an Apache module.

Build on a mature best-of-breed product that provides a quality programming model and all required HTTP and logging services. Also easily take advantage of other apache mods such as memcached or wackamole.

Good luck.


I have taken extensive PHP applications and re-coded them as C/C++ extension modules for PHP. It worked out successfully. A simple loop in PHP that just does some math and prints out a value can be 1,000 times faster in C; if a PHP object is created and deleted in the loop, and you do the same in C++, it can be 100,000 times faster. Your speedup will vary according to what the code does, of course -- if it is mostly waiting for a database, you won't see much speedup at all.

Contact me if you are looking for someone to do that kind of work.

If I were starting a new web app from scratch today, I would code it in C using Thomas Boutell's cgic library. Of course, it seems one rarely starts anything from scratch these days.


Well, we're constantly looking for talented folks at Côdeazur Brasil (http://codeazur.com.br), even for remote work. We actually have an open position for a senior Rails developer at the moment, but we haven't managed to find anyone yet, or haven't looked hard enough.

Please contact me at jonas@codeazur.com.br with IM info, I'd love to introduce you to some of the stuff we want to do (so maybe you can help).


http://www.webtoolkit.eu/wt

That said, even as a hard-core C++ guy, I can't see why you'd really want to write a web application in C++. If you want C++-like syntax, you'd be better off with Java for web stuff.


(Can't help it)

And start saving money for memory and large amount of servers...


the time you save is definitely worth $3k


Hmm, I'm not sure if $3k is all you'll ever need for Java hardware. If the application is big enough you're likely to have to keep on spending on vertical scaling. But then again, that's what big enterprises like to do: spend money on big boxes.


Which leads me to point out that Java is far from suitable for a start-up. I'm beginning to think that Python plus C/C++ might be the golden bullet for startups. Ruby's C integration is easier than Python's, but there's something to be said about the overall language model and performance. I tend to have a main RoR app for the website and a few Python daemons in the background for the really heavy stuff, and now it's finally time to think about offloading some of that load even closer to the machine, in C/C++.

I'd really love to come up with a standardized way of deploying small, context-specific RESTful API servers in either C or C++. I'm starting to think that a possible C++/Python bridge via Thrift is likely the best option. But then again, if I just managed to keep it all in a C++ application without needing to run the Python interpreter, it could totally rock (for an application that is supposed to serve 100k+ users on cheap hardware).

It's cool to see cloud computing going mainstream and now applications written in any language can be reasonably fast with the help of EC2, a load balancer and memcached. But I still think there's great value to coming up with truly lightweight solutions that really use nearly all computing power a single node can possibly offer.


"Which leads me to point that out that Java is far from suitable for a start-up"

You can't generalize like that. It's a language. It works. It scales, and is solid.

Sure, if you're just putting up something that needs quick prototyping then java is probably overkill, but if you're writing something quite complex, java can do a great job.

I think the "golden bullet" for startups is having a good programmer with a passion for something. Language isn't so relevant IMHO.

(Mibbit currently handles around 3.5million visits a month on a single VPS in java which I'm pretty happy with).


What do you use for your templating engine? We use Struts2 as our framework and found that OGNL and Struts2 tags are two orders of magnitude slower than straight JSP.

I'm thinking of ditching JSP and having PHP talk to our Java services via Thrift.


3.5 million visits is small in current internet terms. And now having a look, it is a small text message system. No media, no video, not many images. Sorry, but it's not a good example at all.


At peak it handles 500+ lines of chat a second. That's quite a bit more than twitter. It also has around twice the number of peak users than Meebo rooms. So although it's certainly not a massive site, it has taken some thought to scale.

Having big images/videos wouldn't really make any difference to which language you choose. That's just static media files which are irrelevant.

I really don't see how having more images on the page would make it harder to scale :/ if only the site were as simple as serving up some images...


>I really don't see how having more images on the page would make it harder to scale :/ if only the site were as simple as serving up some images...

Apparently that is all too hard for some web 2.0 startups.


Where is your site that receives 3.5M visits per month? It doesn't matter what language or platform you use. Handling 3.5M visits takes some thought. In fact, where can we find anything to support your language-bashing in your last few posts? I see you knocking down other people's suggestions without putting forth anything of substance yourself.


First, please take a deep breath. This is a technical discussion and it doesn't necessarily need to have "a winner."

Second, please look at my other posts from 4 and 6 minutes before the one you just made.

On a C++ thread some people above say the thread creator should use Java. I refuted that argument with my opinion and reasoning. It is not bashing:

bash: To engage in harsh, accusatory, threatening criticism.

bashing: When you deliberately attack a person using offensive and/or inappropriate language.

See on top of this page: "Suppose I wanted to write a C++ based web application... [...] But what if I just wanted to write plain C++?"

Can't Java people play along? Can't you guys even play pretend? There are very good comments in this thread (both pro and con.) And if you think C++ Web Applications are a bad idea, fine, say so and state your case. There was no need to bring up Java. Of course you are free to do so. But that allows others (moi) to counter-argue with you.

Thank you.


For the record, I am a C++ person, not a Java person. C++ is far and away my strongest language. Nevertheless, I believe that for general web app development, if you want something in the C++ family of languages, Java is more appropriate. At my startup we use a handful of languages for different tasks, C++ and Java among them.

I don't think anyone would have really jumped on the memory comment originally. What made the thread asbestos-o-riffic was the assertion that Java is inappropriate for a startup, which is, well, just plain silly.


But I never said that. I don't even think that. It depends completely on the startup's goal.

In a comment 15 hours ago I stated it's probably not worth it due to current cheap hosting prices.


You didn't, jgalvez did, and that's where most of this discussion forked off from.


I'm certainly not a "java" person, but I take objection to bashing a language like java with no reasons to back it up.

Web apps can work in any language (I wrote a couple in assembly once)...

Choosing the language is not really that important.


This really doesn't make a lot of sense. Java is also easy to bind to C / C++ and its performance is better than Ruby or Python. The logic behind that seems to be ... that scaling in memory usage is important but not CPU usage? I'm not really a Java apologist, but it just seems to be another one of those languages that people bash because it's cool to bash.


In sensible hands, you can exchange Java for C++ and get very similar performance for a lot less hassle, and (increasingly) lean on scripting the same way you would with python/ruby. Of course that means ignoring the way "enterprises" mostly use java (which is just silly), and using it more the way google does.


Please note I mentioned memory. Java is OK on bare language algorithm implementation speed, but if you have a look at all the benchmarks in the language shootout, its memory needs are up to 108 times higher. A conservative estimate would be 10 times higher for the average algorithm.

That is compounded with the typical architecture of Java frameworks. Also finding issues sometimes feels like occult magic on modern JVMs. And then, running with compatibility issues on JVM versions and all that.

Java failed in the browser and now it will probably have a crash at the server. There is no more budget for multimillion dollar datacentres everywhere. But thanks for all the jobs! :)


Why do you seem to think that java requires more servers? It's a language. Maybe specific crappy jsp frameworks require more servers, but that's not java.


Forgot the link of the Language Shootout:

http://shootout.alioth.debian.org/u64/benchmark.php?test=all...

It's no proof, but real life shows something even worse than that against Java. For example the one thread per connection model everywhere or the XML web services throwing away objects faster than the GC can handle.


>For example the one thread per connection model everywhere or the XML web services throwing away objects

Hence my statement: > ignoring the way "enterprises" mostly use java (which is just silly)

Of course the thread-per-connection model is long gone (at least from what I have heard, but software tends to not die) and the XML beast is dying... but that is more about how people abuse it.

But if it's memory you want, you can't do better than using the OS to its full power via C/C++, which is what you are doing when you are using C/C++. Using anything that is a full VM (like the JVM, or if you think that is opaque - give Erlang a try!) means you kind of tell the OS to "go away" and the VM becomes an OS etc...


Try to rewrite something simple written in Python in C++ first and measure the differences in speed/dev time first.

Chances are you may end up with a story like this: http://groups.google.com/group/comp.lang.python/browse_threa...

Although micro-benchmarks may indicate that Python is 100-200 times slower than C++, they may not be applicable to the real world. Python can be very efficient. And when combined with libraries like NumPy, beating its performance with hand-crafted C++ is difficult.

Emperor has no clothes.


Interestingly, I tried rewriting one of my C++ codes in Python one day, using numpy. It was mostly manipulation of largish (~1000 x 1000) matrices.

On the upside, it was 100 lines instead of several thousand.

On the downside, it was several hundred times slower.

I didn't bother trying to write serious simulation codes in Python after that.


Of course C++ will beat Python in speed if you invest enough time in it. And assembly will beat C/C++ if you invest even more time.

My point is that some people assume that if they would just sit and write something in C++ without putting enough time and effort and skill in it, it would be somehow magically faster and better than high-level language, just because it's C++. They are mistaken.

The Python standard library is extremely fast. For example, 'list.sort()' in Python could run 3 times faster than C++ 'sort(vector<string>)'.

Here are a few examples where simple Python code runs faster than the same code in C++: http://groups.google.com/group/comp.lang.python/browse_threa...
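For anyone who wants to check the C++ side of a claim like this rather than take it on faith, a small timing harness is enough. The sketch below is an assumption about how one might measure it, not the benchmark from the linked thread; `timed_sort` and `SortTiming` are hypothetical names, and the numbers will depend heavily on compiler flags, string lengths and input order.

```cpp
#include <algorithm>
#include <chrono>
#include <string>
#include <utility>
#include <vector>

// Hypothetical result type: the sorted data plus the time spent sorting.
struct SortTiming {
    std::vector<std::string> sorted;
    long long microseconds;
};

// Sorts a copy of `words` and reports how long std::sort itself took.
// Only the sort call is inside the timed region; the copy is not.
SortTiming timed_sort(std::vector<std::string> words) {
    const auto start = std::chrono::steady_clock::now();
    std::sort(words.begin(), words.end());
    const auto stop = std::chrono::steady_clock::now();
    const auto elapsed =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start);
    return SortTiming{std::move(words),
                      static_cast<long long>(elapsed.count())};
}
```

Feeding the same word list to this function (built with optimizations on) and to Python's `list.sort()` gives a like-for-like comparison of the two sort routines, which is a narrower question than whether a whole application would be faster.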


"For example 'list.sort()' in Pytohn could run 3 times faster than C++ 'sort(vector<string>)'"

Again comparing apples to oranges.


Ever tried NumPy? Guess that's what you need for the matrices stuff...


Indeed. And let's remember that all of the built-ins in Perl, PHP, Python etc. are written in C and compiled and optimized and debugged.


Python's PyPy is written in RPython (it is Python but with fewer features), and it is already getting close to the main interpreter in C.

I can't wait for them to finally get the on the fly compiler working. Good times coming in the language competition.


C++ will drown you in complexity.

There are many ways to look at complexity as an abstract concept. You could look at code length. Or you can count how many 'degrees of freedom' your code allows you to have, and be sure that you need to control each of those. C++ affords too much freedom for the developer.

The trend meanwhile is towards reducing complexity at system/core level and focusing on application unique added value. Today, in large web applications, complex behavior is built using simple components. Those components are normally done in C, and then wrapped by some scripting language. C is not too flexible. The components are rock-solid as a result; they lack unexpected behaviour. Their code's 'degrees of freedom' are easier to control.

On top of that, there is a trend towards frameworks in scripting languages, which reduces 'degrees of freedom' at core/foundation level even further. Finally, there is this REST thing that wants to standardize the API of a typical web app, offering us even larger blocks of code to work with. Again, all these trends have one objective: allowing us to focus on unique application functionality, on interaction with other systems etc., and hiding underlying complexity from the developer.

By using C++ you are likely to get complex behaviour in your code very early, at a very fundamental level. As you add functionality, the complexity/behaviour quirks will grow exponentially and you will fail.

I am convinced that Microsoft's failure with Vista is mostly due to them using C++. They were simply defeated by mounting complexity, not by lack of engineering talent. Meanwhile, Unix's success (Apple OS X included) is determined by the relative 'dumbness' of the underlying C language, the Unix tradition of small single-purpose tools etc.


Interesting, thanks for sharing. So do you think I should avoid C++ altogether and stick to C? What about Google? Google uses C++ a lot and they seem pretty fine/stable to me? Maybe C++ can be good for web architectural stuff (not as complex as an operating system)?


treat the c++ standard as an assembly language of sorts, from which higher level langs are built, like boost. if you look at it this way, you can carve out a sane section and deal with it. if you try to hold the entirety of it in your head and code, it's hell


Can anyone recommend a book that deals with the web/server-side aspect of C++ coding? I'm nearly half way through Thinking in C++ by Bruce Eckel (wonderfully well written!) but it doesn't look like it's getting near there.


Isn't (and I haven't read it, I stopped at C and never made it to the ++) "Effective C++" an important book?


Yes, there is exactly what you are looking for: Yield.

http://yield.sf.net/

It is an event-driven, staged (threads communicating via message passing) application server, written in C++. It has stages for HTTP server and client, disk IO, gzip, logging (basically all CPU-intensive or IO tasks). Messages between stages are defined in IDL. You can send those over the network via Sun-RPC or JSON-RPC.

Not directly related to your request, but still handy: it also has a stage that wraps the Python VM, so you can write parts of your app in Python.


What is wrong with "the app being the server?"

There are plenty of C/C++ HTTP libraries and frameworks out there. Just proxy your homebrew app server behind squid, lighttpd, apache, nginx, or $YOUR_FAVORITE_PROXY and you are set.

In fact, it makes it really easy to develop and test your app that way. There is no web server to configure and you can easily stick in printfs all you want. I use CherryPy (standalone HTTP framework for Python) that way exactly for these reasons. I'd do the same with C++.
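To make "the app being the server" concrete, here is a minimal sketch of the request-handling core such a homebrew app server might have. It is an illustration under stated assumptions, not a complete server: `handle_request` and `render_page` are hypothetical names, only the request line is parsed, and the socket accept/read/write loop that would call it is left out.

```cpp
#include <sstream>
#include <string>

// Hypothetical application logic: maps a request path to a response body.
// An empty string means "no such page" in this sketch.
std::string render_page(const std::string& path) {
    if (path == "/") return "hello from C++\n";
    return "";
}

// Takes the raw text of one HTTP request and returns the full response.
// Only the request line ("GET /path HTTP/1.1") is inspected; headers and
// request bodies are ignored here.
std::string handle_request(const std::string& raw_request) {
    std::istringstream in(raw_request);
    std::string method, path, version;
    in >> method >> path >> version;

    std::string status = "200 OK";
    std::string body;
    if (method != "GET") {
        status = "501 Not Implemented";
        body = "method not implemented\n";
    } else {
        body = render_page(path);
        if (body.empty()) {
            status = "404 Not Found";
            body = "not found\n";
        }
    }

    std::ostringstream out;
    out << "HTTP/1.1 " << status << "\r\n"
        << "Content-Type: text/plain\r\n"
        << "Content-Length: " << body.size() << "\r\n"
        << "Connection: close\r\n"
        << "\r\n"
        << body;
    return out.str();
}
```

Because the handler is a pure function from request text to response text, it can be exercised from a unit test or a debugger with no web server configured at all, which is the development convenience described above. In production the front-end proxy would deal with slow clients, TLS and static files.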


I've been thinking about this too. The main problem would be making it easy to extend.

The traditional HTTP side of it is trivial (just use nginx or something like that.) For a template engine you already have Clearsilver, and it's the fastest that I know of by an order of magnitude.

The key is solving how the application's developers could extend the web application. You can't expect them to code in C++ and compile every little change to their site. That's the beauty of CMF systems like Zope, you can put plain Python code in there with pages and components.

Doing a CMF in any lower level language and having to switch to a higher level language VM for every snippet would kill your original performance savings.

Find an element to fix that and you'll have a strong new contender in the CMF arena. Stuff like OpenMP and all that comes bundled in the new C++ libraries, transparently.

(BTW IMHO don't even think of starting without memorizing the Essential C++ books and all their need-to-know patterns.)


Oh, and with decent VPS hosting at 30 bucks a month the need to have a high performance CMF is diminished.


You may want to check http://www.kegel.com/c10k.html


There are two approaches:

1 - build your app as an extension of an existing http server. Start with lighttpd or nginx source and build out from there. nginx has a nice module structure. Perhaps your app can be just a new module. Both of these code bases are C and use event driven sockets. You might be able to write your extensions in C++.

2 - Use a known solid http C++ library to roll your own server. Depending on the complexity of your app, you may end up spending a lot more energy going this route.

My relay server for http://shellshadow.com is C++. I had my C++ programmer look at the structure and methods used in one of the two http servers named above. We ended up not re-using their source (our relay isn't http protocol) but did get some useful insight for developing our server daemon.


I have done something like this. I interfaced with Apache however rather than also writing the web server by hand. I also leveraged libraries when I could.

You may want to investigate the following libraries:

fcgi++: Interface with the web server using fast cgi; keep the program resident in memory between requests.

botan: For generating cryptographic hashes.

cgicc: For handling input from the web browser.

xerces-c: For implementing a templating system and generating the output which gets sent to the web browser.

libpqxx: Interfacing with Postgres.

cppunit: Unit-testing critical or difficult-to-test bits.

It's not as hard as everyone likes to make it sound (assuming you are on the right side of the C++ learning curve). The real question is whether you want to spend your time writing a framework rather than actual applications.
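To give a feel for what the input-handling libraries in that list take off your hands, here is a minimal sketch of query-string parsing written against the standard library only. It is not cgicc's API; `url_decode` and `parse_query_string` are hypothetical names, and real libraries also handle POST bodies, file uploads and repeated keys.

```cpp
#include <cctype>
#include <cstddef>
#include <map>
#include <string>

// Decodes %XX escapes and '+' (meaning space) in a URL-encoded component.
// Malformed escapes are copied through unchanged.
std::string url_decode(const std::string& in) {
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        if (in[i] == '+') {
            out += ' ';
        } else if (in[i] == '%' && i + 2 < in.size() &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 1])) &&
                   std::isxdigit(static_cast<unsigned char>(in[i + 2]))) {
            out += static_cast<char>(std::stoi(in.substr(i + 1, 2), nullptr, 16));
            i += 2;  // skip the two hex digits just consumed
        } else {
            out += in[i];
        }
    }
    return out;
}

// Splits "a=1&b=two" into a key/value map. A key without '=' maps to "".
// If a key repeats, the last value wins in this sketch.
std::map<std::string, std::string> parse_query_string(const std::string& query) {
    std::map<std::string, std::string> params;
    std::size_t start = 0;
    while (start <= query.size()) {
        std::size_t end = query.find('&', start);
        if (end == std::string::npos) end = query.size();
        const std::string pair = query.substr(start, end - start);
        if (!pair.empty()) {
            const std::size_t eq = pair.find('=');
            const std::string key = url_decode(pair.substr(0, eq));
            const std::string value = (eq == std::string::npos)
                ? std::string()
                : url_decode(pair.substr(eq + 1));
            params[key] = value;
        }
        start = end + 1;
    }
    return params;
}
```

Under CGI the input string would typically come from the `QUERY_STRING` environment variable (for example via `std::getenv`); under FastCGI the same variable arrives in the per-request environment instead.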


I don't mean to go about writing my web applications in C++. I do plan to write at least a couple for my job that are really performance sensitive. I'd like to use this opportunity to learn and educate myself as I believe a good web software engineer needs to have this kind of knowledge. What if I ever need to hack some code in or out of my web server (nginx, for example)? What if I need to rewrite some really slow code in a C++-based background daemon that communicates with my main application through protocol buffers or something? That's the kind of thing I'm thinking about.


Premature Optimization FTL.


Tell that to nginx. "10-fold increase in the last twelve months."

http://survey.netcraft.com/Reports/200806/


C makes perfect sense for nginx, as it's a web server, not a web app.


Wt (http://www.webtoolkit.eu/) makes developing C++ web applications dead easy. It is similar to Qt: you work with object oriented widgets and events are delivered through a signal/slot mechanism. You get Ajax, javascript, cross-browser portability and all the nifty stuff for free.
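For readers who have not met the signal/slot idea before, here is a minimal sketch of the mechanism in standard C++. It is emphatically not Wt's or Qt's API: `Signal`, `connect`, `fire` and `Button` are hypothetical names chosen for illustration, and real toolkits add disconnection, object lifetime tracking and (in Wt's case) the plumbing that carries browser events to the server.

```cpp
#include <functional>
#include <utility>
#include <vector>

// Minimal signal: stores callbacks ("slots") and invokes each one, in
// connection order, whenever the signal is fired.
template <typename... Args>
class Signal {
public:
    // Registers a slot; any callable matching void(Args...) is accepted.
    void connect(std::function<void(Args...)> slot) {
        slots_.push_back(std::move(slot));
    }

    // Delivers the event to every connected slot.
    void fire(Args... args) const {
        for (const auto& slot : slots_) slot(args...);
    }

private:
    std::vector<std::function<void(Args...)>> slots_;
};

// Hypothetical widget that exposes a "clicked" signal with no arguments.
struct Button {
    Signal<> clicked;
};
```

Application code would connect a lambda or member function to `clicked` and never poll the widget. The toolkit decides when `fire` runs, which is what lets the same handler work whether the event arrived from a desktop event loop or an HTTP request.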


Back in the late 90s, before .NET was released, I used to know a number of groups that would use C++ to create "ASP Generators", since ASP was such a beast. An OO code generator is much better than scripted spaghetti code.

Now, the concept is moot...a good OO web platform is so much more robust.


http://google-styleguide.googlecode.com/svn/trunk/cppguide.x...

This seems like a good guide on how not to fuck things up with C++.


If you're running windows have a look at Delphi / C++ builder. There are a number of ways to do this, intraweb components are one, or you can roll your own.

You can write your events in C++ or Delphi or both. (flame on)


You say you're from Brazil? How 'bout giving Lua a chance? If speed is your concern Lua might be your ticket. You can also check out the Kepler Project.


Again I see nobody is mentioning http://www.tntnet.org/


Talk to the boys at 37Signals ;) Quite a bit of Basecamp is C++, not Rails.


I can't seem to find any mention of it in their blog. Any links?


I've always wondered what the speed of Rails would be if ActiveRecord and friends were in C (or ASM for the insane).


Wonder no more ;).

If you want to see how much better an ORM in ruby/C can be, look at datamapper http://datamapper.org. This is mainly used with merb. The new datamapper uses data_objects which has been completely rewritten with C underpinnings to be much more efficient than previous datamapper. I haven't seen any recent benchmarks against ActiveRecord. But I do believe the datamapper team met their goals of being faster.


DataMapper is about 60% faster overall than ActiveRecord, and as much as 20x faster in a few key benchmarks at the moment: http://is.gd/3xwR

In fact we, the core DM contributors, have spent very little time optimizing the library. Up until now we've been mostly focused on creating a clean API, and keeping the internals simplified. I would expect the gap to widen much further once the API stabilizes, and the spec coverage gets nearer 100% (it was in the low 90's last I checked).


Do you attribute the performance to the implementation language or the different design pattern / approach to ORM?


All I have to say is: make sure you know what a buffer overflow is.
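For anyone unsure what that means in practice: the classic bug is copying request data into a fixed-size buffer with something like `strcpy(buf, input)`, which keeps writing past the end of `buf` when the input is longer than expected. Below is a minimal sketch of a bounded alternative; `copy_bounded` is a hypothetical helper, and in C++ code using `std::string` for request data avoids the fixed buffer altogether.

```cpp
#include <cstddef>
#include <cstring>

// Copies `src` into `dst` without ever writing more than `dst_size` bytes.
// Returns true if the whole string fit, false if it had to be truncated.
// `dst` is always NUL-terminated when dst_size > 0.
bool copy_bounded(char* dst, std::size_t dst_size, const char* src) {
    if (dst_size == 0) return false;
    const std::size_t src_len = std::strlen(src);
    // Leave room for the terminating NUL when the source is too long.
    const std::size_t n = (src_len < dst_size) ? src_len : dst_size - 1;
    std::memcpy(dst, src, n);
    dst[n] = '\0';
    return src_len < dst_size;
}
```

The return value matters as much as the bound: silently truncating a path, username or SQL fragment can be a bug of its own, so callers would normally reject the request when the function reports truncation.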


You may want to Google ISAPI, NSAPI


libmicrohttpd


I do everything in C. I have 3 e-commerce customers and their sites are written completely in C. I have no issues whatsoever. One reason I started doing it this way is because I was concerned with speed. Having done so in C, they are blazingly fast.


I'd have to tip my hat to this...

I've written a lot of C++, it was 'the one true way'.. but then I did see a lot of people using python, ruby.. I learned a bit of lisp, and kind of de-conditioned myself.

If you're not gonna use Lisp/arc/scheme/python/ruby... then, actually, I would go with C and not C++. You are just above the metal and you see exactly what's going on, and you don't spend time making cleverly convoluted constructs of infinite generality...

In C, you just get shit done. With a bit of discipline you can get a lot of the benefits of OO and even some of the benefits of FP in your C code. Heresy, I know.

I do love C++ still, it's nice to see good RNG and regex APIs going into the language.. but it's a bit verbose on balance.

gord.


why do you want to do this??? there are plenty of c/c++ web apps written out there, i have written some myself. it's pain central. you do it only when you cannot spare cycles or memory per session and cut the safety nets. just buy another server. in man-hours you will be out ahead.



