YouPorn: Symfony2, Redis, Varnish, HA Proxy... (Keynote at ConFoo 2012) (joind.in)
67 points by Sujan 2032 days ago | 49 comments

I couldn't open the presentation. I don't have anything that can open .pptx, so I uploaded it to Google Docs.

Link here: http://tinyurl.com/7bckqm8

After going through the 32 slides, I think this is one of those presentations you really needed to be there for. The architecture stack is by and large quite standard for high-scale applications. The previous HN submission from YouPorn last month contained more descriptive information.

The video should be on Youtube next week. It contains a lot more information. The slides are just bullet points.

> The video should be on Youtube next week,

Youtube or...?

Could you link to that presentation? I can't seem to find it.

"Slides" link on joind.in: http://www.manwin.com/confoo/eric2012.pptx

OpenOffice/LibreOffice Impress handle it nicely, FWIW.

Thanks for the upload.

It doesn't say anything about the number of servers, which is important for judging how efficient the software stack is.

True, Eric provided some info in the Redis google group some time ago:

"I just want to let the devs know that Youporn.com relaunched two weeks ago with Redis as our primary database. With @100 million page views per day, our cluster of Redis slaves are handling over 300k queries per second.

After the switchover we had to add some additional Redis nodes but not because Redis was overworked but because the network cards couldn't keep up with Redis."

So at least the Redis part is apparently I/O bound, which suggests a high degree of efficiency for this use case.
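
For a rough sense of scale, the two figures in the quote can be combined in a back-of-the-envelope sketch. It assumes traffic is spread evenly across the day and that the 300k queries per second is an average rather than a peak. Neither is stated in the quote, so treat the result as an order-of-magnitude estimate only.

```python
# Back-of-the-envelope estimate from the two figures quoted above.
# Assumption (not in the quote): traffic is spread evenly over the day and
# 300k queries/second is an average rather than a peak.

SECONDS_PER_DAY = 24 * 60 * 60

page_views_per_day = 100_000_000     # "@100 million page views per day"
redis_queries_per_second = 300_000   # "over 300k queries per second"

page_views_per_second = page_views_per_day / SECONDS_PER_DAY
queries_per_page_view = redis_queries_per_second / page_views_per_second

print(f"{page_views_per_second:.0f} page views/second on average")
print(f"{queries_per_page_view:.0f} Redis queries per page view")
```

Under those assumptions that works out to roughly 1,157 page views per second and about 259 Redis queries per page view. Real traffic is peaky, so the true per-page figure could differ a lot.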

So, a pretty massive rewrite and then it's 10% faster? Wonder how much of the reason for it in the first place is that a bunch of PHP/standard component programmers are probably easier to get (and replace) than competent Perl hackers…

I love Perl (worked with it for 10 years) but we decided early on that finding enough good Perl devs would be too hard. There are still a few Perl parts of the rewrite, you need to use the right tool for the job.

Sure, and I certainly agree with that choice. In my (limited) experience, a Perl->PHP rewrite either indicates a desire for more developers or a seriously weird or outdated Perl architecture. The latter was quite common in the early days of the web, where basically everyone's first web page was a mess of Perl CGI (or mod_perl a bit later), but as YP isn't that old and performed well enough before, I wouldn't have thought that this was the case here. Thanks for confirming my suspicion, wasn't intended as criticism.

YouPorn-before-the-sale outed themselves as a Catalyst stack user some time back; memory says they offered to donate a machine to act as an irc.perl.org node to help the community out but then discovered that their ISP at the time wouldn't allow IRC ...

"Finding enough good Perl devs" is actually very doable, it just tends to involve engaging them through the CPAN community. We've helped customers out with it a few times, with good results overall.

On the other hand, when trying to de-convolute an architecture a change of primary language often helps.

Does "we were substantially more confident of our ability to recruit sufficient PHP talent than Perl" seem like a reasonable rephrasing?

Funny. When I was running my music startup with a couple million users, it was originally coded in Perl by my Perl-genius cofounder. When my cofounder dropped out, it was next to impossible to find solid Perl developers (there are many who tinker with it). I remember recoding it in PHP and life getting so much easier.

10% faster is quite an achievement for a site that was fast before.

Additionally, I don't think speed was the primary driver behind the rewrite. I'm still waiting for the video/audio of the presentation, but he mentions the 'very complex architecture' of the Perl backend. So switching to a 'standard' architecture that fits on one PowerPoint slide seems like a very good idea.

Maybe they were focusing more on scalability than individual page load speeds?

I wonder if they considered HipHop instead of php-fpm.

So, if one were so inclined how would you get a job working in the porn industry (on the technical side)?

I've always wanted to tell people I work in porn but I somehow doubt that any of their job adverts directly say "This is porn".

These guys are probably who you're looking for: http://wwww.manwin.com

I think there's an extra 'w' in the above link. This one works: http://www.manwin.com

Oops, thanks.

working on porn, in any respect, supposedly locks you into that industry. it makes no sense to me.

any thoughts on why?

Not true at all. I know a few current and former Googlers who worked on porn. The skillset is fairly transferrable to any high-traffic website.

At least in Silicon Valley, people don't really care what you've worked on as long as you have the technical chops. Hell, look at all the folks working for defense contractors - their biggest problem is that it tends to lock them into a proprietary software stack, not that they've written software used to blow people up.

That may be true of actors and directors. But for software? I can't imagine.

Look at the presenter's experience:

> Over a six month period, I lead the project to rewrite a top 100 website using a new software stack. Doing so, we used HAProxy, Varnish, Nginx, PHP-FPM, Symfony2, Syslog-ng, Redis and MySQL to create a platform that handles 100 million page views per day and has room to grow.

There are tons of companies out there eager to hire someone with that experience.

Heh, this is true. However, everyone always misunderstands why. I can speak to this from two perspectives. I worked in porn early in my career and made the transition. I also actively try to recruit people with this background in a city where lots of great engineers work in this area (Montreal).

The reason why people say that they get locked into porn (at least on the software-producing side of the fence, mind you) is very similar to the reason why people get locked into working for the finance industry. There are nuances but it essentially comes down to two things: money and attitude.

If you are good at engineering and willing to work for this industry (a small minority), you can make crazy money. The industry is pretty meritocratic, and was long before that became a buzzword in Silicon Valley. If you generate millions of dollars in value for your company, you can take millions of dollars home. Very few industries allow you to make this kind of money, and since the people who work there are self-selected to not mind, it's tough for them to leave. Some social factors sometimes compel them to move to other industries anyway (such as starting a family), at which point the second problem comes into play.

The second problem is your mindset. The porn industry (sometimes similar to finance) has the opposite relationship with its customers from other industries. You treat the customers as bags of money that ought to be exploited. "After all, they are all wankers" is what you hear a lot. This may sound basic, but if that's the reality you have spent a decade in, you will find it tough to adjust to startup life.

Startups especially are usually very good at finding a business model where their own success depends on the success of their customers (Square, Stripe, my own Shopify, etc.). Life is much happier in a setup like this, but it's totally alien to people from the Industry.

These things are huge contributors to keeping people from the industry in the industry.

> You treat the customers as bags of money that ought to be exploited.

I honestly don't see a difference to most other industries - give or take a few idealistic startups.

Unless you meant to say the difference is that you're upfront and honest about it...

It's possible to make this comparison, but it's not binary, it's a scale. Porn is so much further down this lane than anyone else.

It's a funny world where selling porn is considered less honorable than selling your users' information to advertisers.

Why do people keep saying that? As far as I know no respected company is doing that.

It doesn't happen directly, it's more like with Facebook and its targeted advertising, where it allows advertisers to target you and others like you very specifically.

Am I alone in not thinking that this is a bad thing? When I watch TV, I see non-targeted advertising: tampons, Viagra, depression meds, and many other things that I have no need for or interest in. If I have to see ads, I'd much prefer to see targeted ones.

I agree to a point. I do find it somewhat creepy that after I was involved in an email thread related to a friend's wedding all of a sudden all of my gmail ads are for wedding gift websites.

On the other hand, if I am browsing the ASP.Net section of stackoverflow and see Windows Azure ads rather than weight-loss ones, then I see that as a good thing.

>I agree to a point. I do find it somewhat creepy that after I was involved in an email thread related to a friend's wedding all of a sudden all of my gmail ads are for wedding gift websites.

Ironically, it sounds like you don't have a problem with targeted advertising. You have a problem with its quality and want it improved, which is quite a different argument from "you shouldn't use my data, give me ads at random". It's not an exact science, but it undeniably improves the signal-to-noise ratio.

I think it's more that email is something I generally consider reasonably private. Of course I know that in reality all that is happening is that it is being parsed for certain keywords etc.

Whereas targeted advertising based on being on a particular public section of a public website feels much less intrusive.

I suppose it's like getting adverts on TV for items that are somewhat related to the show that you just watched vs getting letters through the mail advertising debt consolidation services because on the same day you also got mail from your bank telling you that your overdraft is maxed out and your credit card is overdue.

This is purely conjecture, but the industry might still face prejudice due to issues of "morality" based on people's religions or personal values. A person working in the industry is perhaps establishing a link between themselves and something "dirty" or unscrupulous, and so they sometimes appear less than desirable to potential employers for that reason?

It seems likely this is the case to me, but again it's all conjecture. Personally I don't think it matters either. Experience is experience.

(And I'm not sure your supposition is at all true to begin with. The above assumes your supposition has some truth value.)

This actually isn't true. I know several people who worked on the Perl version of YP and none of them are locked into the industry.

Are you kidding?

I worked for a gay porn company for a number of years and I've never had issues finding work outside of that industry. At least not for companies I'd be interested in working with in the first place.

If anything it helps, considering there are a lot of big scaling issues that need to be solved when the viewer base gets big enough.

It does not, in any way, lock you into the industry. And, in my mind, working in porn helps you find employment with good companies in a way no other company or industry can.

Why should it lock you in?

Because you've been a bad, bad boy and now you're going to be punished... :p

>working on porn, in any respect, supposedly locks you into that industry.

"Supposedly" by who? I don't think it's real at all.

I would think it would help you. Many porn sites get so much traffic, the engineers rival those of Facebook and Google.

As much as I might like to think so, they are several orders of magnitude beyond us.

It's not just about traffic, you know. Facebook is different than most sites, because it generates different content for almost every pageview. This means far fewer possibilities for caching. And Google, well...

> because it generates different content for almost every pageview. This means far fewer possibilities for caching.

That implies they are caching very little. I'm comfortable in saying that I'd be shocked if your homepage on Facebook wasn't 90%+ in cache, if not 100%. A site like Facebook wouldn't survive without caching.

I am not sure how you define those percentages, but Facebook's homepage is regenerated on every pageview. Storing the entire homepage html in cache for every user would not work, because every time someone posts something you'd need to regenerate the homepage for all their friends (# of times someone posts something * avg # of friends > # pageviews homepage).
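
The inequality in the parentheses can be sketched as a small comparison. The function names and all numbers below are hypothetical placeholders chosen for illustration, not Facebook figures, and the model ignores everything except the two counts being compared.

```python
# Compare regenerating cached homepages on every post with rendering on every view.
# All names and numbers here are hypothetical, for illustration only.

def regenerations_on_write(posts_per_day: int, avg_friends: int) -> int:
    """Full-page cache: each post invalidates the homepage of every friend."""
    return posts_per_day * avg_friends


def renders_on_read(homepage_views_per_day: int) -> int:
    """No full-page cache: the homepage is built once per view."""
    return homepage_views_per_day


def full_page_cache_pays_off(posts_per_day: int, avg_friends: int,
                             homepage_views_per_day: int) -> bool:
    """True only if caching whole pages causes fewer renders than it saves."""
    writes = regenerations_on_write(posts_per_day, avg_friends)
    return writes < renders_on_read(homepage_views_per_day)


# Hypothetical workload: 1M posts/day, 150 friends on average, 50M homepage views/day.
print(full_page_cache_pays_off(1_000_000, 150, 50_000_000))
```

With these made-up numbers the write side costs 150 million regenerations against 50 million views, so caching the whole page would do more work than rendering it on demand.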

Sorry, I was referring to the data, not the HTML output. Even still, you could cache components of that page. Your feed might update often, but there are many things that don't change on every page load, and they can be cached for a long time.
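
A minimal sketch of that idea, caching page components separately with their own lifetimes. The `FragmentCache` class, the fragment names, and the TTL values are all made up for illustration. Nothing here reflects how Facebook actually does it.

```python
import time
from typing import Callable, Dict, Tuple


class FragmentCache:
    """Tiny in-memory cache for rendered page fragments, each with its own TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}  # key -> (expires_at, html)

    def get_or_render(self, key: str, ttl_seconds: float,
                      render: Callable[[], str]) -> str:
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[0] > now:
            return entry[1]              # still fresh: reuse the cached fragment
        html = render()                  # missing or expired: render it again
        self._entries[key] = (now + ttl_seconds, html)
        return html


def render_homepage(cache: FragmentCache, user_id: int) -> str:
    # Hypothetical fragments: navigation rarely changes, the feed changes often.
    nav = cache.get_or_render("nav", 3600, lambda: "<nav>...</nav>")
    profile = cache.get_or_render(f"profile:{user_id}", 300,
                                  lambda: f"<div>user {user_id}</div>")
    feed = f"<ul>feed for user {user_id}</ul>"   # built on every page view
    return nav + profile + feed
```

Only the feed is rebuilt on each view. The slower-changing pieces are served from the cache until their TTL runs out, which is the trade-off the comment above describes.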

Does anyone know what YP is using for their fulltext search?
