Duck Duck Go written in Perl (duckduckgo.com)
77 points by fogus on July 9, 2010 | hide | past | favorite | 36 comments



How can he use Bing as a backend? I read their API's terms of service, and it forbids you from changing search results in any way.

“(c) modify, filter, obscure, or replace the text, images, or other content of Bing results, including by changing the order in which Bing results appear (but this limitation will not apply to Bing results of type "Web"), intermixing Bing results with search results from other sources, or intermixing with Bing results any other content so that the other content appears to be part of Bing results;”

I was quite interested in using Bing API myself but that does seem too restrictive. I hope you can prove me wrong!


> (but this limitation will not apply to Bing results of type "Web")

Isn't this what he is using?


That parenthetical only applies to the clause about result ordering. If you read the whole TOS, it's quite clear that Bing doesn't want you to do what DDG appears to be doing. In particular, mixing results from different services is a no-no in most search APIs.


And here are the HN comments on that Duck Duck Go architecture post: http://news.ycombinator.com/item?id=525048


The perl code isn't a full search engine though. He uses Yahoo! Build Your Own Search Service, so that perl code is just taking the results from there and throwing results out, reordering results, getting the zero-click info, etc. Yahoo does all the crawling.
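To make the "throwing results out, reordering results" step concrete, here is a minimal sketch. It is in Python rather than Perl and is not DDG's actual code; the `Result` type, `post_process`, and the `blocked_domains` / `boosted_domains` sets are all hypothetical names invented for the example.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class Result:
    url: str
    title: str
    rank: int  # position assigned by the upstream search API (0 = best)


def post_process(results, blocked_domains, boosted_domains):
    """Drop results from blocked domains, then move boosted domains to the top.

    Upstream rank order is preserved within each group.
    """
    kept = [r for r in results if urlparse(r.url).hostname not in blocked_domains]
    # False sorts before True, so boosted hosts come first; rank breaks ties.
    return sorted(
        kept,
        key=lambda r: (urlparse(r.url).hostname not in boosted_domains, r.rank),
    )


# Made-up upstream results, standing in for whatever the backend API returns.
upstream = [
    Result("http://spam.example/a", "Spam", 0),
    Result("http://blog.example/b", "Blog post", 1),
    Result("http://docs.example/c", "Official docs", 2),
]
cleaned = post_process(upstream, {"spam.example"}, {"docs.example"})
print([r.title for r in cleaned])
```

The point is only that this layer is cheap compared to crawling: it works on a few dozen results per query, not on the whole index.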


I do my own crawling & indexing in addition to using third-party APIs. It's a hybrid approach where I focus on the places I think I add the most value.


Searching is not the same as indexing. DDG will give different results from Bing search, even though they use the same index.

Edit: DDG does some crawling in addition to using the Bing index.


I wonder if DDG could benefit from http://www.80legs.com/ ?


From some early discussion (and their current job listing), search engine startup Blekko also has a significant Perl component:

http://blekko.com/jobs.html


Yes, blekko has a full web crawler, indexer, and some of the query processing code in perl. Some of the innards and query execution are in C++ for speed.

Generally we tried to delay the conversion of a component to C++ as late as possible, since it makes it so much harder to change and iterate on. We call it "pouring cement on the code". For some things it's absolutely necessary to drop to C for speed, others it doesn't make a lot of difference since you're bound by other constraints like I/O.
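As a rough illustration of why a rewrite makes little difference when a component is I/O-bound, Amdahl's law gives the overall speedup when only the CPU-bound fraction of the runtime gets faster. The sketch below is in Python, and the fractions and the 10x factor are hypothetical numbers chosen for the example, not measurements from blekko.

```python
def overall_speedup(cpu_fraction, cpu_speedup):
    """Amdahl's law: only the CPU-bound fraction of the runtime gets faster."""
    return 1.0 / ((1.0 - cpu_fraction) + cpu_fraction / cpu_speedup)


# Hypothetical numbers, for illustration only.
io_bound = overall_speedup(cpu_fraction=0.10, cpu_speedup=10)   # mostly waiting on I/O
cpu_bound = overall_speedup(cpu_fraction=0.90, cpu_speedup=10)  # mostly computing

print(f"I/O-bound component: {io_bound:.2f}x overall")
print(f"CPU-bound component: {cpu_bound:.2f}x overall")
```

With these assumed numbers, a 10x faster inner loop yields about 1.10x overall for the I/O-bound component and about 5.26x for the CPU-bound one, which is the case where "pouring cement" can pay off.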


Probably the highest-profile public Perl (and Catalyst) app of the last few years is the BBC iPlayer: http://www.bbc.co.uk/blogs/bbcinternet/2008/12/iplayer_day_p...


Actually, YouPorn is powered by Perl and Catalyst and regularly makes the top 25 domains. Not sure if it beats the BBC main site though.



That's cool. I've used DDG a few times in the past and generally like it so far. I had no idea it was done in Perl, though. It's nice to see new perl apps getting some press.

I wonder if they use any of the newer CPAN libs like Moose and family in the architecture?


As for Moose, he (DDG author) answered this here

http://www.reddit.com/r/IAmA/comments/bbqw7/i_am_the_founder...

Having written an indexer in Perl for my current startup product, I really can't see the need for using something like Moose. I'm biased though, as I'm not a big fan of OO programming. I gave it a try in Perl years ago but it really sucked (OO programming, that is). I guess Moose was designed to fix this, but it's a little too late for me. I was taught C in school and I've learned to live without OO.


I can't say I disagree with him on the dislike for OO. However, if you are going to do OO in Perl, Moose is the way to do it.


If you agree with the design decisions. It's rather expressive, but not exactly a minimalistic approach to OO.

Did they ever improve the rather steep performance penalty? (esp. startup)


Runtime is fast, startup is not as fast.

I am not sure why anyone cares about startup time. I start my apps about once a month. 1 second instead of .1 second doesn't really matter. It's like saying, "C++ has an unacceptable performance penalty" because you have to compile your code before you use it. Yeah, you do. You can pay a million times at runtime or you can pay once at compile time.

For desktop apps, start them when you log in and connect to them via App::Persistent.


It's certainly worth thinking about if the app is a command line app. No one wants "ls" to have a 5 second startup time.
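A quick back-of-the-envelope sketch of the two cases, in Python. The 1 second startup, 0.05 seconds of work per run, and the run counts are assumed numbers picked for illustration, not Moose benchmarks.

```python
def startup_share(startup_s, work_s_per_run, runs_per_start):
    """Fraction of total wall time that goes to startup."""
    total_work = work_s_per_run * runs_per_start
    return startup_s / (startup_s + total_work)


# Long-running server: started once, then handles many requests.
server = startup_share(startup_s=1.0, work_s_per_run=0.05, runs_per_start=1_000_000)
# Command-line tool: pays the startup cost on every single invocation.
cli = startup_share(startup_s=1.0, work_s_per_run=0.05, runs_per_start=1)

print(f"server: {server:.4%} of wall time is startup")
print(f"cli:    {cli:.1%} of wall time is startup")
```

Under these assumptions the same one-second startup is noise for the server (roughly 0.002% of its time) but about 95% of what the command-line user waits for.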


I agree. Please contribute your fix to Moose.


I believe he was responding to your "I am not sure why anyone cares about startup time." While you may have been referring specifically to Moose, it's possible to interpret that as a blanket statement for all programs and languages, even in context.


Especially when you're learning Moose, writing some more involved Unix scripts would be an option. It's not that bad for those circumstances, though. Personally, I just have problems wrapping my Perl style around more involved types of OO, so for Perl I'm happy enough with normal packages and procedural programming.

And as a side note, the compilation times for C++ made a lot of people switch to another language, although Moose is a few orders of magnitude away from those kinds of problems.


I don't think he ever said he disliked OO. The comments after the reddit link are my own feelings on OO.


Moose is nice, but making your own objects can be pretty cool too. Try blessing a function ;)


I really can't see the need for using something like Moose

Well, Moose is certainly getting traction in some big companies like BBC, Cisco, Hearst, Symantec and Yahoo.

ref: http://moose.perl.org/about.html


Like all things, I think it depends on the problem that you are trying to solve. Some things lend themselves well to having a nice abstraction layer while other times, this abstraction layer just adds unnecessary complexity.

I don't want this to turn into a C vs C++ debate but I do agree with Linus in that it's nice being able to just grep for something.


I agree, that's why I like using a language like Perl: it doesn't force me into any rigid pattern(s).

But when it comes to OO I (nearly) always use Moose.


It's nice to see new perl apps getting some press

Some Perl press has appeared on HN before, e.g. http://news.ycombinator.com/item?id=565152 (though sometimes the Perl community has done its best to keep its head low on this one :)


I think that's a natural combination. Both p-words showed me that there's more than one way to do things.


Yes Perl is definitely the Kama Sutra of programming languages :)


Thanks. Good post. This is a great example that good application and complex systems can be written in Perl as well.

On a side note, I saw the previous post on DDG, where the discussion was about how DDG does not store any private information. That makes sense in context: since he leverages other search engines, his running costs are low, so he can afford to ignore the user information that others use for commercial purposes.


>This is great example that good application and complex systems can be written in perl as well.

You could write "good application and complex systems" in machine language if you work hard enough. What are you trying to say?


I had pictured something elegant like Lisp. But no, it was just hacked together in Perl? Somebody needs to link to the famous XKCD strip. :)


You mean this one? :)

http://xkcd.com/224/


Interestingly, a search for "ostensibly xkcd" shows the comic in the first link:

https://duckduckgo.com/?q=ostensibly+xkcd&v=


On a similar topic, A God's Lament http://xkcd.com/312/



