I'm responsible for the Tengine project. We're very excited to see news of Tengine appear on ycombinator.
Just a few clarifications:
Q: Why is Tengine 'forking' Nginx instead of committing the patches to the official Nginx?
A: First, we are developing our own Nginx version because we have strong requirements to enhance it. Our website is very busy (ranked #14 on Alexa's top sites list) and many features we need can't be done by writing modules.
We would love to contribute our work to the official Nginx. We consider it a great honor to share our achievements with the community. That's why we have open sourced it. We are also trying our best to contribute back to Nginx. Actually, we contacted the core members of the Nginx team last December, including Andrew Alexeev, the person in charge of their Business Development, and their COO/GM, Maxim Konovalov. We asked how to collaborate with them. They replied as follows:
"It's interesting what you guys do and let's keep in touch. I'm not really quite sure right away in regards to what can be imported to the main branch, but hopefully we'll find things to collaborate on. We're a bit busy towards the end of the year, so probably a good idea to catch up in January."
More than two months have passed. We are still waiting for their requests. We are very confused because we don't know which features and bug fixes they think should be merged into Nginx. Some features, such as syslog and pipe support, they have explicitly refused to implement. I wanted to send the Nginx developers a 'bug fix' for the error_page directive, but they thought that behavior was OK, though many users think it's a bug... Frankly, we are a little bit frustrated. It's very sad that we haven't done much together yet. But the Tengine team is open to hearing ideas from the Nginx guys. And we're going to knock on their door again.
Q: Do "input filters" in Tengine have anything to do with Chinese requirements for censoring?
A: No. It's just a mechanism to help implement something similar to Apache's mod_security. E.g. I have written a module to demonstrate how to fight the hash collision DoS attack:
BTW, please don't connect everything to censoring so rashly. The idea that 'people in China are willing to do censoring' is also stupid.
Never heard of this until now, so was very interested to see the list of enhancements that they have made to nginx. Some of them look quite useful e.g. the logging enhancements and Input body filter support.
This feature caught my eye:
- Combines multiple CSS or JavaScript requests into one request to reduce the downloading time
I wonder how they achieve this at the webserver level? Normally something like this is done as part of the deployment/compile process. Anybody familiar with Tengine care to comment?
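One plausible way to do this at the server level, sketched purely as an assumption and not taken from Tengine's source: the page requests a single URL that lists several files, and the server reads each file and returns them joined together. The `??a.css,b.css` URL form and the names `parse_concat_url` and `concat_files` are hypothetical, chosen only for illustration.

```python
from pathlib import Path
from typing import List


def parse_concat_url(url_path: str) -> List[str]:
    """Split '/static/??a.css,b.css' into ['/static/a.css', '/static/b.css'].

    The '??' marker is an assumed convention for this sketch.
    """
    prefix, sep, names = url_path.partition("??")
    if not sep:
        # No marker: treat it as an ordinary single-file request.
        return [url_path]
    return [prefix + name for name in names.split(",") if name]


def concat_files(root: Path, url_path: str) -> bytes:
    """Read every requested file under `root` and join them into one body."""
    root = root.resolve()
    parts = []
    for rel in parse_concat_url(url_path):
        target = (root / rel.lstrip("/")).resolve()
        # Refuse paths that escape the document root (e.g. '../../etc/passwd').
        if root not in target.parents:
            raise PermissionError("path escapes document root: " + rel)
        parts.append(target.read_bytes())
    # A newline between files keeps a trailing '//' comment in one
    # script from swallowing the first line of the next.
    return b"\n".join(parts)
```

With something like this, a page could load `/static/??a.css,b.css` in one round trip instead of two, with no build step involved. A real server would also need to check that all files share a content type and set caching headers sensibly.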
It's a commercial project for www.taobao.com, which ranks 13th globally according to Alexa (higher than eBay, lower than Amazon).
So when the company needs a custom feature, it can't wait for the original project to accept and release it.
Tengine is going open source because we got tons of help from the community, not only the Nginx project. And it's an honor to contribute.
Other open sourced projects include TFS (file system), Tair (distributed cluster), and Webx (web framework), all currently running on taobao.com.
You can check them out at http://code.taobao.org/ . Most docs are in Chinese tho...
Currently the feature list of Tengine might be limited, but who knows what will happen... Google forked WebKit into Chrome, and that had nothing to do with the feature list in the beginning.
There's often a serious communication barrier between Asian open source developers and Western communities; partially a language barrier, but also other issues. The end result is that a lot of projects end up being de-facto forked by Asian open source developers -- this is where outreach can be very important, to help integrate the changes back in as soon as possible and start that necessary contact.
If anyone cares I can talk a bit more about this (as this was an issue with x264). There was also a session (with notes online) on the topic at last year's GSOC Mentor Summit.
Japan has long had a community of x264 developers and users, but for a long time they largely remained insular and didn't make much active effort to push their patches upstream. This wasn't out of maliciousness; there were many reasons.
The language barrier was a big problem -- less so that they didn't know English, but more so that they didn't feel confident with it. In reality, we found the Japanese contributors to have way better English than they thought they did. People are often embarrassed or scared to use a language they're not confident with, and they worry about making mistakes and looking bad. Make them comfortable.
Note that this is especially an issue with Japan. In my experience, Japanese tend to be (typically) much less confident in their English despite equal or greater skill. This might be in part because the school system typically doesn't emphasize much 'real-time' conversational English.
This problem is more general than just language; open source IRC channels and mailing lists can be intimidating, and people (often rightly) suspect that they'll be mocked for making mistakes, so they just don't bother.
As is common in many non-English-speaking countries, the Japanese have many of their own tools and methods of communication that are unique to Japan. In this case, they had an ongoing x264 thread on 2ch. A friend of mine from the x264 user community offered to help. He was multilingual, able to speak near-perfect English and Japanese, along with many other languages. Using his translations, I answered questions for a few weeks on the 2ch thread.
Eventually it became somewhat obvious that one of the people posting in the thread was a Western developer; the Japanese jokingly called it the 'Black Boat incident', a reference to the arrival of Commodore Matthew Perry's fleet at Japan in 1853 (see http://en.wikipedia.org/wiki/Black_Ships). We convinced a few to drop by our IRC channel; we invited them, noting that difficulty with English was not a problem, and we had a Japanese speaker who they could converse with directly anyways.
A few came, some contributed patches, and a few stuck around; we now even have a small community of Japanese users who hang out in the main channels. One of them noted later that his (written) conversational English had improved vastly just by being on IRC for a few months and was now much more fluent.
Put simply, cultural and language barriers reduce people's confidence in communicating. They can also result in misunderstandings; open source developers, for example, tend to have a very blunt style of communication. In the best cases, this can mean they will ignore their own ego and debate decisions on technical merits, without pleasantries. In the worst cases, this can mean rudeness, intolerance, and general dickishness. Especially coming from a culture more heavily based around politeness, this can be daunting.
If you want to welcome a foreign community of developers, you should try to (list certainly not complete):
1. Have someone who speaks their language, so they can feel confident that they have someone to speak to even if they aren't confident in their English skills.
2. Contact them first; don't rely on them to come to you.
3. Be friendly and welcoming. If you have to, mute That Guy who insists on being rude and obnoxious. Mocking grammar errors or being needlessly blunt are quick ways to make people feel incredibly unwelcome. They probably speak your language much better than you speak theirs!
4. Give them extra help -- don't treat them like Just Another Patch Contributor. Your goal here is not just to integrate their changes, but to gain a connection to their developer community. "Patch rejected, I don't like it" is not a way to gain friends.
5. I really shouldn't have to say this, but apparently (from experience) I have to: seriously -- don't be a racist asshole. Particularly if you're inviting people to an IRC or high-traffic mailing list, there are often people (including devs!) who will make all sorts of insensitive comments. This needs to not happen. Yes, this means no stupid jokes about "roneriness" or Indian tech support.
Yes, this also means no stereotyping. Just because they're Japanese doesn't mean they want to talk about Naruto, and just because they're Chinese doesn't mean they really like General Tso's Chicken. I know you really love Korean culture, but just because they're Korean doesn't mean they want to talk about Girls' Generation and Starcraft. Don't treat someone from a different country as if they're some specimen under a microscope either. In short, avoid othering. Making people feel as if they are different and not wanted is a quick way to make them not want to come back.
Thank you, thank you!
This aspect is pretty important to a lot of us coming from Asia - even more so in India, where English is the politically incorrect, de facto national language. Despite that, there is a large gap in adoption of open source software in one of the largest markets in the world, where it is often a choice between being able to afford a Linux desktop vs not being able to afford a Windows computer at all.
It is a cultural gap that prevents most open source software from being architected to solve problems specific to Asia (from support for complex Indic fonts to assumptions about expensive system hardware). This gap not only prevents developers from working together; the even more dangerous symptom is that certain problems are not identified as important enough.
I was not aware that this was (rightfully) considered important enough for a session.
As explained in the FAQ (which is in Chinese, I'll give you that), they created the fork for a few reasons:
- Patches have historically been slow to be accepted in NGINX,
- Some of their patches have been specifically rejected, including the syslog/pipe one (and it's probably a crucial feature for them),
- They needed a place to share their pool of patches and enhancements.
Overall, I don't see anything wrong with that; isn't the hacker culture nowadays prone to forking (GitHub, anyone?)?
Here we have a big player in the Chinese Web space (Taobao is massive in China, we use it on a daily basis at the office as do hundreds of millions of other users) committing resources to release and maintain code in the pure spirit of OSS, in a country where that very specific behavior (sharing and openness) is still in its infancy.
I applaud the guys at Taobao and hope this will help foster the OSS movement in China.
My guess is that the Chinese open source community is still at an early stage of development, where forking and learning from popular, successful open source projects is still common. When the community grows bigger, sharing and contributing will certainly be more common. It's certainly a good suggestion for them though.
If you can read Chinese, you can check out the link below. They are working on a list of projects (some of them are already open sourced) that handle "big data". In case you don't know, Taobao is the eBay of China, with an Alexa ranking of 13.
Feature #1: "Input body filter support. It is quite handy to write Web Application Firewalls by using this mechanism."
Correct me if I'm wrong, but "Web Application Firewalls" sounds like an API for censoring content to me. As much as I loathe the idea of internet censoring, I would love to get a peek inside the technical workings of the famed "Great Firewall of China".
Applying the same logic, firewalls used to protect computers and servers are only there to censor content.
From the description, it sounds like it has nothing to do with content filtering and everything to do with app security. Web apps in particular have very different attack vectors (think XSS, SQL injection, etc.) that your standard install of iptables/shorewall isn't going to protect against.
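To make the app-security reading concrete, here is a minimal sketch of the kind of check a request-body filter could run before the application ever parses the body. It rejects form posts carrying an unreasonable number of parameters, which is one common mitigation for hash collision DoS. The function name `body_allowed` and the limit of 1000 are assumptions for illustration, not anything from Tengine.

```python
MAX_PARAMS = 1000  # hypothetical limit; a real deployment would tune this


def body_allowed(body: bytes, max_params: int = MAX_PARAMS) -> bool:
    """Return False for form bodies with more than `max_params` parameters.

    Parameters in a urlencoded body are separated by '&', so counting
    separators gives an upper bound on the parameter count without
    parsing the body or inserting any keys into a hash table.
    """
    param_count = body.count(b"&") + 1
    return param_count <= max_params
```

A check like this looks only at the shape of the request, not at what the user is saying, which is the difference between a packet firewall, an application firewall, and a content filter.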
I only raise the issue because I happen to know that China is involved in large-scale automated internet filtering and that in order to operate, companies are required to implement those filters.
And the fact of the matter is, many many firewalls ARE used for content filtering. For example, many workplaces block common time-waster sites. Home wifi routers can be easily configured to filter adult content. Universities block file-sharing. ISPs do bandwidth shaping. etc. etc.
kennywinker, every single technology can be used for "good" and "bad" purposes. Taobao's Tengine is a fork of Nginx, a great one at that, I might add. It has nothing to do with the Great Firewall. It's open source and the code is available for everybody to see.
The innuendo, knee-jerk, hypocritical, "guilty by association with big/bad Chinese gov" reaction toward a solid open source web server, shared for all to play with and use, is quite disgusting, frankly.