Hacker News
How YouPorn Uses Redis: SFW Edition (togo.io)
154 points by benarent on July 31, 2013 | 94 comments

The thing I find interesting about this article is the brief discussion about the death of PERL. In particular the comment about the inability to find senior developers in PERL. This is a sad development for those of us who really love programming in PERL, but I think eventually one must recognize that a language is fading, however slowly, into obsolescence. Apart from Crowdtilt I don't know any other successful startups or major companies that rely on PERL. And if one wants all the benefits of a network effect of developers, libraries and technical support, it seems today a backend developer really needs to shift his or her focus to Python, Ruby or (bleagh) PHP

It's interesting that someone who professes loving to program in Perl refers to it as "PERL". People who refer to it in all-caps instantly give themselves away as outsiders to the language. It's like someone claiming to be a senior JAVA programmer, it's not an acronym.

Anyway we use Perl pretty much exclusively at Booking.com, and I'm pretty sure we're at least a couple of orders of magnitude bigger than the likes of Crowdtilt. There's a bunch of other interesting companies using Perl as well, a lot of them just don't make as much of a fuss about it as the companies using say Ruby.

Last I knew e.g. Morgan Stanley used it for most everything, including new developments, and it alone probably employs more programmers using it than the combined hobbyist contributor base of some newer emerging languages.

It's pretty hard to claim that Perl is "dead" and square that with the ongoing activity on the CPAN (https://metacpan.org/recent). This is not the kind of activity a dead language gets.

I think what a lot of people mix up is that just because something isn't as big as it was before doesn't mean it's dead.

C isn't as big relatively as it was in the 80s, Perl isn't relatively as ubiquitous for scripting and web development as it was in the 90s, but neither of those languages are dead.

If you look at the total amount of contributions / software written for any of the languages that aren't currently in the spotlight you'll most likely find that we're at the high water mark of the amount of software written in them, and the number of people that have been employed to write software in them, just because the industry as a whole is getting bigger every day.

Maybe they're not bigger relatively compared to some other languages. But that has little bearing on how good they are when it comes to using them in your toolchain. If anything they're better than they've ever been before.

I'm sorry that is the most important thing you got from my comment, Avar. My spelling of Perl is certainly inconsistent. I've spelt it Perl, PERL and perl at various times. I was on my phone and being somewhat lazy and clumsy with my spelling. In that sense, I'm definitely not an insider. But I have been programming in Perl for 15ish years. I've built some fairly sizable projects in perl and I love using it. I'm also sad that it's in decline, and I think it's fairly obvious that it is.

The comments you've made about its usage do not address the comment from the article that I highlighted. It's becoming increasingly difficult to find senior engineers with a lot of Perl experience. I think it's unquestionable that there is a stronger desire in the market for Python and Ruby. Just consider Ycombinator. In my batch of 66, there were only 2 companies (mine included) which used Perl in any way. The vast majority used Python or Ruby. Does this not strike you as an important trend? This is not a rhetorical question, I mean it sincerely, since you also seem to be familiar with the language and love it too.

Don't worry, some people simply try to attack others, intentionally or unintentionally. It's not getting better here at HN.

It's interesting that you and vijayboyapati focus on this one paragraph of a much longer reply, and that you perceive correcting somebody as attacking them. Indeed, that's something that hasn't gotten better over the years, anywhere.

>you perceive correcting somebody as attacking them.

Correcting someone on a pedantic issue outside their main point IS considered if not an attack, then surely quite rude, in regular conversation.

Even more so when you add: "People who refer to it in all-caps instantly give themselves away as outsiders to the language" trying to expose them as some kind of wannabe poser.

What the fuck does this guy know about the parent's history and expertise in Perl?

It doesn't get better if you engage in it too.

I thought that the rest of avar's comment was a reasoned response to the claim that perl is "fading, however slowly, into obsolescence".

> People who refer to it in all-caps instantly give themselves away as outsiders to the language.

It's actually laid out in perlfaq1 ($ perldoc perlfaq1)

   What's the difference between "perl" and "Perl"?
       One bit. Oh, you weren't talking ASCII? :-) Larry now uses "Perl" to
       signify the language proper and "perl" the implementation of it, i.e.
       the current interpreter. Hence Tom's quip that "Nothing but perl can
       parse Perl."

       Before the first edition of Programming perl, people commonly referred
       to the language as "perl", and its name appeared that way in the title
       because it referred to the interpreter. In the book, Randal Schwartz
       capitalised the language's name to make it stand out better when
       typeset. This convention was adopted by the community, and the second
       edition became Programming Perl, using the capitalized version of the
       name to refer to the language.

       You may or may not choose to follow this usage. For example,
       parallelism means "awk and perl" and "Python and Perl" look good, while
       "awk and Perl" and "Python and perl" do not. But never write "PERL",
       because perl is not an acronym, apocryphal folklore and post-facto
       expansions notwithstanding.

> I think what a lot of people mix up is that just because something isn't as big as it was before doesn't mean it's dead.

I wonder if the Perl6 chasm tainted outsiders' perspectives on perl and the community

> C isn't as big relatively as it was in the 80s, Perl isn't relatively as ubiquitous for scripting and web development as it was in the 90s, but neither of those languages are dead.

I think this is important, and worth spelling out. The entire pool of programmers is much larger, and while C may not have as much of a share of the total market, the total number of C developers is still probably larger than in the 80s. In that respect, it's hard to say C itself is in decline, just its position of dominance.

I'm not sure the same can be said of Perl, as there has been a fairly common theme of former perlers moving on to Ruby or Python, but then again I've heard stories about people returning to Perl after a while due to the Perl renaissance (which is why you'll get a much different opinion from someone active in the community than outside; inside we see a lot of cool stuff happening).

> It's interesting that someone who professes loving to program in Perl refers to it as "PERL". People who refer to it in all-caps instantly give themselves away as outsiders to the language

Alternatively, they might just be a refugee from FORTRAN, the language which predates the invention of capital letters :)

PERL -- Practical Extraction and Report Language http://www.cs.cmu.edu/htbin/perl-man

That is an unofficial manpage for perl 4, a release that has been obsolete since 1994…

I always preferred Pathologically Eclectic Rubbish Lister. Please note I say this lovingly, as a former Perl programmer.

Click on the BUGS link to get to http://www.cs.cmu.edu/afs/cs.cmu.edu/Web/People/rgs/pl-bugs.... which ends with the alternate definition.

Larry Wall has told me that the fact that he had two acronyms that he liked for it was one of the reasons that he chose the name Perl.

No, that is a backronym.


Yes, we know how to google here. We also know how to read more than a few sentences in a row, and understand context of statements.

> I think what a lot of people mix up is that just because something isn't as big as it was before doesn't mean it's dead.

Something that may cause confusion is "acceleration" vs "velocity". Languages like Ruby are still accelerating, Go is accelerating a lot, even Javascript is accelerating from an already massive user base. Languages like Perl and Tcl are still widely used, but their adoption has peaked and is not accelerating.

From where I'm standing Perl is declining quite rapidly in systems administration, which probably is their largest "userbase" nowadays.

"Morgan Stanley used it for most everything"

This is entirely incorrect. They've been running with a plethora of different languages for a while now: Scala in particular, but there's a lot of work in C for HFT stuff.

...and you have exposed yourself to be culturally ignorant. (East) Indian people often spell words in ALL CAPS to emphasize importance. I know this from past experience dealing with (East) Indian people in a professional software development environment for over 10 years. So I believe that was what the original author intended.

p.s. He is also a former Google Engineer who was part of the small team that built GOOGLE NEWS. http://dealupa.com/about

>It's interesting that someone who professes loving to program in Perl refers to it as "PERL". People who refer to it in all-caps instantly give themselves away as outsiders to the language. It's like someone claiming to be a senior JAVA programmer, it's not an acronym.

Or some people just don't fucking care about "standard spellings" and prefer their own.

Exactly. I worked for a magazine that insisted that their company name be stylised in all caps (it was not an acronym).

If you look at an old JAVA/Java logo it actually is stylised as JAVA.

In fact my first JAVA textbook stylised it in all caps also so ever since I just copied that.

PS: I don't actually give a shit about JAVA or what anyone thinks it says about my skills because of the way I write it.

Perl isn't dead, I use it every day for important stuff. But it is true, people younger than say 32 are not very likely to know Perl well, most of them prefer python

>Perl isn't dead, I use it every day for important stuff. But it is true, people younger than say 32 are not very likely to know Perl well, most of them prefer python

That's what "dead" in casual discussion means though.

Because in the stricter sense that "someone, somewhere is using it", even SNOBOL is not dead.

Something is "dead" when the youth no longer care about it? Get off my lawn.

Yes, joking aside, it means its target demographic is on the go (to the great hackathon in the sky).

The target demographic of a computer language is people who program computers. Python is as old as Perl. Ruby, at a spry 20, is only five years younger than Perl. Java is pushing 20 and JavaScript is not far behind.

Are you suggesting that when Matz created Ruby, he was thinking to himself, "I can't wait for all the first graders to grow up and use my language?"

I rather think that the creators of computer languages think in terms of present, rather than future computer programmers, and not in terms of "My language will make all the Beliebers want to become software engineers."

Alas, I've experienced a lot of this. A current project I'm working on, I've done in Mojolicious. Why? Because I had no interest in learning another toolkit that provided me no real benefits other than the language it was based on was more popular. And, the time spent learning another toolkit was time not spent developing.

The first question the rest of the board asked me was "Well, are we going to have trouble hiring?" To which, I truthfully replied "No." The interesting part about non-popular languages, is that you're going to have fewer unqualified candidates. (And yes, in my local market, there are still a lot of perl programmers, maybe not as many as Ruby programmers, but the average perl programmer I come into contact with has more years of experience than the average Ruby programmer I come into contact with.)

I think the argument "we can't hire people that know X" is nearly always the wrong one. Several years ago, when we looked to re-build core services for a product (replacing perl with Erlang, natch!) - we couldn't hire a person that knew Erlang to save our lives. So, we hired good developers in other languages and trained them. If you've got the right technology and the right tools, you can always find the right people.

This is not always true, especially if there are business-imposed constraints on hiring (or technology).

(cynical meta-comment)

Curiously, anyone who's been involved in the Perl community for any length of time will see "PERL" there and dismiss the poster out-of-hand. The decade-or-two old catchphrase goes something like:

"Perl" is the name of the language. "perl" is the implementation/interpreter. There's no such thing as "PERL".

Lua gets this, as well, except in Lua's case, it never even had an acronym associated with it.

People who type LUA insta-brand themselves as outsiders.

My preferred definition of LUA is "Lua Uppercase Accident." ;-)

that's recursive. :)

I find that I dismiss pretty nearly anyone who refers to a programming language in all caps. Outside of FORTRAN, COBOL, and C.*, it just looks ridiculous -- SCHEME, ERLANG, LISP, JAVA...

Rule: you're allowed to use all caps to name any language that originally had to be programmed on terminals without lowercase letters.

So LISP? :)

Also add a special case "SmallTalk" for instant dismissal. It's Smalltalk.

Heh, of _all_ languages – you'd have thought Perl would end up with CamelCase. "PeRl" – nah…


Maybe they're just yelling the name of the language.

Same thing with TCL vs Tcl. It is an acronym: Tool Command Language, but no one writes TCL anymore.

> In particular the comment about the inability to find senior developers in PERL.

One thing to know is that Manwin (youporn, pornhub, etc parent company) is located in Montreal. For a reason I can't really explain, you mostly find PHP and C# developers here, so most Montreal startups are PHP based.

I also interviewed a couple of Manwin alumni, and they told me that there were a couple hundred developers and that there is a huge turnover (you often see 6 months / 1 year of experience at Manwin when you read a PHP dev resume).

So it's easy to understand that they prefer to use the local dominant language. It does not mean that Perl is globally dying, just that they have specific constraints. And to be fair Perl is not the only small community in MTL, Ruby and Python also have a strangely small community here.

> For a reason I can't really explain, you mostly find PHP and C# developers here

Same here in Vancouver; I always figured the C# part was just our proximity to Redmond (and then PHP because it's the only language that doesn't impinge upon C#'s niche) but I guess it's more of a Canada-wide thing. Weird.

In the Phoenix area C# is king, followed by Java. Most development tends to be around business' internal applications... I've seen a small bit of PHP, Ruby and Python though... In the past two years NodeJS has probably passed them all, aside from C# for new development around here.

Maybe it has to do with what the local college CS department standardizes on.

Finding the same thing here in Toronto, to the point where I've seen two perl developer positions advertised here since I came back in early '12. Meanwhile there's tons of (usually poorly paying) PHP gigs, along with Java, and to a slightly lesser extent, C, C# and mobile. That being said, I found the two Perl places to be very accepting of those that didn't have experience in Perl, but in other languages (and wanted to escape to something better). Can't say the same for other language shops.

tl;dr Perl devs seem to be in short supply country wide, along with jobs that would need their skillset.

other Montrealer here - I also wondered why that is. Especially considering the amount of gaming companies around here, you'd think a lot of C++ would be in order. I have met quite a few rubyists, but python seems rarely used. PHP coders were always available, wherever I lived though :)

I've found in my job searching over the last few years that if you have a resume with good ruby chops, Montreal is the place to go. Seems to have the largest share of jobs in that area, at least from what I've seen.

I believe DuckDuckGo is using Perl, or were as of a few years ago.

Source: http://www.gabrielweinberg.com/blog/2009/03/duck-duck-go-arc...

Blekko is another, which is very impressive considering they do their own indexing and serving of results.

This plugin infrastructure uses Perl, so they definitely haven't moved away from it entirely http://duckduckhack.com/

Talking to a friend of mine who works there, they haven't moved away from it at all really.

They've been optimizing the heck out of it, and as much as my kneejerk reaction to their infrastructure is "a collage of rubberbands and paperclips waiting to break in unexpected and unpredictable ways", I've gotta hand it to them for not giving in to the temptation to prematurely optimize. Clearly it's been working really well for them, and so I guess it's worth not "fixing" it until it starts breaking.


high-traffic websites that use Perl extensively include Amazon.com, bbc.co.uk, Priceline.com, Craigslist, IMDb, LiveJournal, Slashdot and Ticketmaster.

I have so much to say about this, but I'm not sure what would be covered under NDA, so I'll just say that learning Perl is not a bad idea for your IT career, especially if you're starting out. Perl is great in that it keeps concepts of low level data structures and linux constructs in the syntax so you can go either up or down the stack with it easily. It's very similar to a compiled language in performance and there is a rainbow of forks and derivative technologies that are so tuned and mature, they make SQL look like a social experiment. A lot of these technologies are just another unremarkable file in CPAN.

I'd argue that CPAN accounts for quite a bit of "network effect", never mind that there's still a considerable amount of googleable "I have this error, let's see if someone else had it before me" type of discussions out there. Certainly more than some hip, yet newer languages (e.g. Haskell, Go). So you're not really scraping the bottom of the barrel here.

I think the YP people weren't just going for some senior developers, they wanted a huge pool, probably with easy enough replacement if someone is going. Considering that not everyone wants to have a porn site on their CV, or might want to quit once they start a family and would have to explain where daddy/mommy works, that seems like a good idea anyway. So you need the largest pool possible, which generally boils down to either PHP or Java, with the latter being a bit more expensive.

Perhaps surprisingly, while it's technically true that Haskell is a newer language than Perl, the difference is only three years - Perl dates from 1987 and Haskell dates from 1990. I wouldn't put it in the same bucket as Go, which dates from 2009.

Don't feel too bad about the other posters complaining about your choice of capitalization; at my shop we spell it "ruby".

Most of the BBC's public sites are built with Perl, including BBC News. I believe they use other tools for internal sites and other stuff though.

I've been at the beeb for the best part of a decade and can say that this used to be the case, but no new sites have been built using Perl for a very long time.

Nowadays the vast majority of sites are built using Java services and a PHP frontend. There is still some legacy Perl about running some fairly critical systems, but there are very few people in the place who could regard themselves as full time perl devs.

> Apart from Crowdtilt I don't know any other successful startups or major companies that rely on PERL.

At the very least, I know EMC and Qualcomm both rely heavily on Perl internally. At EMC there has been some shift to Python, but Perl is still the clear winner. Startups are more likely to use the new shiny, but Perl is still kicking in the enterprise.

See http://www.indeed.com/jobtrends?q=Perl&l=San+Francisco%2C+CA...

When I worked in online education in the .com 1.0 days, a lot of the big tools were built on Perl (both Blackboard and WebCT I think). I'm sure some of them still use it.

Thanks everyone. I came into the comments to learn what everyone thinks of their usage of Redis as a database and how their stack is set up. Now I know what's the correct way of typing Perl...

I wondered if he was referring to the fact that they didn't have the staff in house and would have to find someone, as opposed to a general lack of talent in the marketplace.

I find how porn sites run things far, far more interesting than how generic start up X runs things and I'm not sure why.

I think in general it's because my friends who work at start-ups never really stop gushing about how MongoDB or Go or Redis is amazing and fixed all their problems, whereas porn sites/porn industry in general is talked about far less.

But this article is not about the porn side. The tech tradeoffs are basically the same as in any industry - this article could have been written pretty much word-for-word about many non-porn sites.

Yes and no. There are 2 things unique about the adult industry:

1) Extremely high levels of traffic. Furthermore, when it's a video site like YP, you've got people engaged and watching videos for a longer time.

2) Common cloud infrastructure is normally off limits. Running adult content from AWS or Linode is forbidden by the TOS, so you have to roll your own servers.

The guy in the interview, Eric Pickup, has a very good talk about his software stack and the amount of traffic that YP gets. https://www.youtube.com/watch?v=RlkCdM_f3p4

> 2) Common cloud infrastructure is normally off limits. Running adult content from AWS or Linode is forbidden by the TOS, so you have to roll your own servers.

Where do you get this? I've worked in the porn industry for 12 years and have run porn sites on both Linode AND AWS... They allow it.

That being said, we do things very differently in porn than in the startup world I have come to notice. Most startups scoff PHP, whereas 99% of the porn industry is based on PHP. Not to mention we handle HUGE traffic loads on single servers, and the only time we ever have more than one server (outside of the sites sub 1,000 Alexa) is just for media such as pictures and videos, which are only offloaded to handle as static content while the rest is dynamic.

> Most startups scoff PHP, whereas 99% of the porn industry is based on PHP.

Only the vocal ones. I find that PHP is used a lot more than what the chatter on HN might lead one to think.

That's why I rarely take anything these people say seriously. I mean, they talk trash about PHP and how it's horrible but look at the porn industry? We push millions of visitors a day over a typical LAMP stack... But hey, it's shit, right?

It's important to differentiate between complaints about the language and its use, and complaints about how effective it is. Almost nobody criticizes PHP performance (relative to other scripting languages), whether or not it is technically faster or slower than another language. Then, even a good portion of the criticisms of the language come with the caveat that it does let you get simple stuff done quick.

My main problem with PHP is that I'm primarily a Perl dev (but have done some heavy PHP work in the past), and its syntax is just close enough to make me always aware of how unintuitive it is in comparison, and all the extra steps I have to take to accomplish something simple. It's maddening.

Oh, I could be wrong. I worked for an Adult Friend Finder type site and we had a very tough time finding a place to host the site.

Did you guys use any of the crazy file systems, like MogileFS?

That's weird, as there's tons of very good adult specific hosts out there. Crazy file systems? No. Just standard LAMP stacks.

It would be much more fun if this was hosted on YP.

I'm assuming that's where the NSFW version of this article is. It probably has more risque photographs than one of a cat, keyboard, mouse, and right hand.

Of course, the photo might have some subtle implications. For instance, where's the guy's left hand????

When I read the title my first thought was "How can you write an NSFW article about using redis?". After doing a quick google search and looking through YouPorn's blog, I figured the title is just misleading and no NSFW version exists.

From the stories on HN one would think that everyone who used memcached is replacing it with redis. Is that really the case? What are the differences and similarities?

Here's a pretty good comparison of the two - http://stackoverflow.com/questions/10558465/memcache-vs-redi....

Disclaimer: I wrote the above article and work for RedisToGo. But that article is unbiased :)

My experience has been that many people are using redis instead of memcached. I haven't switched myself, but it seems some frameworks of various types (ServiceStack, Resque, others) build in a Redis-based caching mechanism, and that's driving most of the adoption for Redis-as-a-cache. Perhaps some of the motivation is that it's easier to kinda-sorta run Redis on Windows than it is to run memcached on Windows.

I suspect that most of it is that you can kill two birds with one stone by using redis as your task queue. That's what I do.
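A minimal Python sketch of the "Redis as a task queue" idea mentioned above. It does not talk to a real server: `ListQueue` is a hypothetical in-memory stand-in for a Redis list, with `push` playing the role of LPUSH and `pop` the role of RPOP, so tasks come out in the order they went in. The task names are made up for illustration.

```python
from collections import deque


class ListQueue:
    """Hypothetical in-memory stand-in for a Redis list used as a FIFO queue."""

    def __init__(self):
        self._items = deque()

    def push(self, task):
        # Producer side: analogous to LPUSH (add at the head of the list).
        self._items.appendleft(task)

    def pop(self):
        # Worker side: analogous to RPOP (take from the tail), so the oldest
        # task comes out first. Returns None when the queue is empty.
        return self._items.pop() if self._items else None


queue = ListQueue()
for name in ("resize:1", "resize:2", "email:7"):  # illustrative task names
    queue.push(name)

# A worker drains the queue in arrival order.
processed = []
task = queue.pop()
while task is not None:
    processed.append(task)
    task = queue.pop()
```

Against a real server a worker would more likely block on the list (BRPOP) than poll it, but the head-in, tail-out shape is the same.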

Memcached has been running on Windows for a fair bit longer than Redis, and it's in mainline Memcached, though not mainline Redis, now. Of course, neither are necessarily that production-tested, but most people wouldn't be looking to use Windows servers anyway.

1 thing I appreciate about memcached is that it is always fast. Redis has a wider variety of features and capabilities, which can come back to bite you if/when the whole app is slow due to some poorly thought out Redis query -- especially likely if you're using Redis to do multiple things, such as (1) store temp cache data and also (2) store user account data.

What do you mean by query? Are you talking about using the 'keys' command to figure out what data you need? Because that's completely discouraged for everything except maintenance tasks, and for those I think it makes sense to keep an extra slave instance that is not used by any apps that connect to redis (i.e., keep it out of the pool they can use).

If on the other hand by 'query' you mean, for example, using operations like diff, intersect, and union on sets that store keys (for the purpose of knowing what data to pull) -- I actually haven't run into performance problems yet. If what you're saying is "it's important to spend time to think about HOW you're going to be determining what data to access", then yeah, I'll completely agree that this is crucial. However, I will also point out that this will be true even if you're using a relational database -- though those are expected to be "queried" so perhaps designing a feasible solution is a bit less demanding.
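To make the "sets that store keys" idea concrete, here is a rough Python sketch. It models the three Redis set commands (SINTER, SUNION, SDIFF) with plain Python sets rather than a live connection. The `index` layout, the tag names and the item IDs are all assumptions for illustration, not anything from the article.

```python
# Hypothetical index sets: each name maps to the IDs of the items that carry
# that tag. In Redis these would be sets filled with SADD.
index = {
    "tag:hd": {"v1", "v2", "v3"},
    "tag:new": {"v2", "v3", "v4"},
    "tag:flagged": {"v3"},
}


def sinter(*names):
    """IDs present in every named set (what SINTER computes)."""
    sets = [index.get(name, set()) for name in names]
    return set.intersection(*sets) if sets else set()


def sunion(*names):
    """IDs present in at least one named set (what SUNION computes)."""
    return set().union(*(index.get(name, set()) for name in names))


def sdiff(first, *others):
    """IDs in the first set that appear in none of the others (SDIFF)."""
    result = set(index.get(first, set()))
    for name in others:
        result -= index.get(name, set())
    return result


# Decide which items to fetch without ever scanning the whole keyspace.
hd_and_new = sinter("tag:hd", "tag:new")
hd_not_flagged = sdiff("tag:hd", "tag:flagged")
```

The point of the pattern is that the cost depends on the size of the sets involved, not on the total number of keys in the database, which is what makes it a different animal from `keys`.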

I ditched memcached for redis a while ago, just because I was already using redis for my queueing, and it made sense to eliminate redundant dependencies.

I'm now using Redis clusters to distribute cache reads so that each webserver has its own local copy of the cache which it reads from, but there's a single master that accepts writes. This does occasionally mean a double-write, but since we're talking about cache data and redis is atomic, this isn't a problem, and it splits the load of my reads, so that I don't end up bottlenecking if the master is being slow.
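A toy Python model of the layout described above, purely as a sketch: one master takes all writes, each webserver reads from its own local copy, and a read that lands before replication catches up simply misses. `ReplicatedCache`, the server names and the explicit `replicate()` step are hypothetical; real replication is asynchronous and handled by the server.

```python
class ReplicatedCache:
    """Hypothetical model: a single write master plus per-webserver replicas."""

    def __init__(self, replica_names):
        self.master = {}
        self.replicas = {name: {} for name in replica_names}

    def write(self, key, value):
        # All writes go to the single master.
        self.master[key] = value

    def replicate(self):
        # Stand-in for asynchronous replication: copy master state outward.
        for data in self.replicas.values():
            data.clear()
            data.update(self.master)

    def read(self, replica_name, key):
        # Each webserver only ever reads its own local replica.
        return self.replicas[replica_name].get(key)


cache = ReplicatedCache(["web1", "web2"])
cache.write("page:/home", "rendered-v1")

# Before replication the local copy misses, so web2 would recompute the value
# and write it again: the harmless "double-write" mentioned above.
miss = cache.read("web2", "page:/home")
cache.write("page:/home", "rendered-v1")

cache.replicate()
hit = cache.read("web2", "page:/home")
```

Because the second write stores the same cache value, the only cost of the race is a little redundant work.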

Memcached has some rough edges (e.g. slab size management) that aren't there in Redis. Redis also gives you a couple extra useful primitives.

I'm not an expert with either redis or Memcached, but in my personal usage, Memcached is functionally at least equal to redis for the things that it does well like LRU caching.

I find myself continuing to use it for those things just to keep my redis config cleaner and optimized for the things redis is (seemingly) better for like persistent storage and message queues.
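For readers unfamiliar with the term, this is roughly what "LRU caching" means, sketched in Python with an `OrderedDict`. It is an illustration of the eviction policy only; `LRUCache` is a hypothetical class, and neither Memcached nor Redis is claimed to implement eviction exactly this way.

```python
from collections import OrderedDict


class LRUCache:
    """Hypothetical fixed-size cache that evicts the least recently used key."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return self._data[key]

    def set(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key)
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # drop the least recently used key


lru = LRUCache(capacity=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")     # touching "a" makes "b" the eviction candidate
lru.set("c", 3)  # over capacity, so "b" is evicted
```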

Redis can be run without persistence, at which point it can be treated as Memcached with extra features (optimistic locking, data structures, etc). Without persistence, performance is often comparable (even with persistence, performance can be comparable, but you're more at the mercy of the IO subsystem).
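The "optimistic locking" mentioned above is exposed in Redis through WATCH / MULTI / EXEC. The Python sketch below shows the general idea with a version counter instead of a live connection: read, compute, and commit only if nobody changed the key in between, retrying otherwise. `VersionedStore` and `increment` are hypothetical names, and the version number is an assumption of this sketch, not how Redis tracks watched keys.

```python
class VersionedStore:
    """Hypothetical store where every key carries a version number."""

    def __init__(self):
        self._data = {}  # key -> (version, value)

    def read(self, key):
        # Missing keys read as version 0 with no value.
        return self._data.get(key, (0, None))

    def compare_and_set(self, key, expected_version, value):
        # Commit only if the key is unchanged since it was read.
        version, _ = self.read(key)
        if version != expected_version:
            return False
        self._data[key] = (version + 1, value)
        return True


def increment(store, key, max_retries=5):
    """Read-modify-write that retries when another writer got there first."""
    for _ in range(max_retries):
        version, value = store.read(key)
        if store.compare_and_set(key, version, (value or 0) + 1):
            return True
    return False


store = VersionedStore()
increment(store, "hits")
increment(store, "hits")

# Two clients read the same version; only the first commit succeeds.
seen_version, _ = store.read("hits")
first_ok = store.compare_and_set("hits", seen_version, 10)
second_ok = store.compare_and_set("hits", seen_version, 99)
```

No lock is held while the client computes the new value, which is why this style suits a cache: conflicts are rare and a retry is cheap.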

> Previously we’d been experimenting with Redis, Varnish and a few other technologies.

I read this paragraph and was a bit baffled. It sounds like they had merely been toying around with Varnish. I would think that heavy edge-level caching was essential to keep that kind of site up. Considering most content probably isn't personalised, it's basically a matter of serving a lot of static content fast.

They might be using Squid, which was the cache server of choice in the mid-2000s in the porn industry and was heavily used.

Yeah, that's probably it. I read it as if they only recently looked into http caching, but it may just be that they recently looked into that specific tool. Makes more sense then.

I really doubt there could be a less interesting or valuable use of the interviewer's, the interviewee's or any potential reader's time.

In Summary: we took a perl site, changed it to php and implemented Redis because it is good at lists. I don't really have any specific tips on Redis or any details of clever implementation approaches.

I know it's slightly offtopic, but can somebody please explain how YouPorn and similar sites make money?

mostly advertising.

I read as far as "we decided to use PHP" before I decided there was no way anything they had to say would be relevant to anything I care about.
