Hacker News new | comments | show | ask | jobs | submit login
Cuil Goes Down for Good (techcrunch.com)
101 points by cristinacordova on Sept 18, 2010 | hide | past | web | favorite | 90 comments



The only lasting legacy of Cuil may be its use in expressing orders of abstraction away from normalcy:

http://cuiltheory.wikidot.com/what-is-cuil-theory


Cuil theory reminds me of the pataphor: http://en.wikipedia.org/wiki/Pataphor#.27Pataphor


Intriguing concept! This is an apt description of the illogic inherent in dreams. Come to think of it, perhaps 'Inception' missed a golden opportunity to depict each level as a pataphor of its parent...

(Apologies for the off-topic comment; new ideas are exciting!)


I think the massive hype they ran prelaunch is a good anti-lesson. They had to do all of their learning in full view.


Speaking of legacy, I wonder if there would be interest in Cuil as an Apache project... who knows what might arise from its ashes! :)


Strange how an unfunded startup with a single founder (DuckDuckGo) is getting better results than a $30 million-backed one made up of a bunch of ex-Googlers.


That's because Cuil was building a search engine from scratch rather than relying on Bing's/Yahoo's index to do the underlying scoring. Maybe they shouldn't have started from scratch, but they were certainly tackling a much harder problem (maybe not the right one).

I've heard through the grapevine that they were able to index and serve 100 billion documents on 100 machines, which is a pretty impressive technical accomplishment if true. I'm surprised they weren't acquired for that. It's unfortunate that their search quality wasn't up to snuff yet.


Those numbers alone really don't mean much.

How many queries per second can they handle on those nodes, and with what latency? What kind of relevancy calculations were they able to do at query-time in their system with 1B documents per node? Were they able to support query-time aggregation of structured fields in their documents? Was the index stale or did they support continuous feeding and indexing of new documents? If the latter, how well did they meet their SLA QPS and latency when indexing new documents?

I can set up a single search node and fill it with God knows how many documents any day, but the difference between supporting 10 QPS with ~500ms latency and 3000 QPS with the 99th percentile below 40ms is really more interesting than exactly how many documents I have per node.
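The point about percentiles is worth making concrete: a mean can look tolerable while the tail is terrible, and it's the tail an SLA has to pin down. A minimal sketch (the latency samples below are made up for illustration):

```python
# Hypothetical latency samples in milliseconds from a load test; in
# practice these would come from timing real queries against the node.
latencies_ms = sorted([12, 18, 25, 31, 38, 40, 45, 120, 480, 510] * 100)

def percentile(sorted_samples, p):
    """Nearest-rank percentile over an ascending-sorted list."""
    k = max(0, int(round(p / 100.0 * len(sorted_samples))) - 1)
    return sorted_samples[k]

mean = sum(latencies_ms) / len(latencies_ms)  # ~132 ms: looks fine
p99 = percentile(latencies_ms, 99)            # 510 ms: the real story
```

With these samples the mean is about 132 ms while the 99th percentile is 510 ms - the slow tail, not the per-node document count, is what determines whether the system meets its SLA.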


The harder problem IMHO is customer acquisition. Many companies have built (and still have) link graphs of reasonable quality, but they all struggle to gain real customers.

I started out before those APIs existed, and did all my own crawling & indexing. When they came out, I decided to focus on my value-adds because I thought that was a quicker path to customer acquisition.

Furthermore, I don't use Yahoo/Bing straight up, e.g. I re-rank, omit, etc. I also mix them with my own index/negative spam index from my own crawling efforts.


I don't use Yahoo/Bing straight up, e.g. I re-rank, omit, etc.

All the re-ranking you can do is very limited with these services because they are black boxes: you don't see what factors went into ranking a page the way it was ranked, you can't tweak the weight of different factors, and you can't add new factors. All you can have are hardcoded rules like "if there is a Wikipedia page in the first 20 results, bring it on top" that don't really add much value, because if that Wikipedia page is any good it would be on top already. Spam results are similar: you can provide impressive customer service by blacklisting spam on user request, but the major search engines are already good enough at down-ranking spam that you rarely see it when there are any other meaningful results.
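For illustration, the kind of hardcoded rule described above is only a few lines - which is exactly why it adds so little value over the upstream ranking (a toy sketch; the function name and example URLs are invented):

```python
def promote_wikipedia(results):
    """Toy re-ranking rule of the kind described above: if a Wikipedia
    page appears in the first 20 results, move it to the top.
    `results` is a list of URL strings from the upstream (black-box) engine."""
    for i, url in enumerate(results[:20]):
        if "wikipedia.org" in url:
            return [url] + results[:i] + results[i + 1:]
    return results

# The Wikipedia result moves from second place to first.
reranked = promote_wikipedia([
    "http://example.com/a",
    "http://en.wikipedia.org/wiki/Search_engine",
    "http://example.com/b",
])
```

Because the rule only sees the final ordered URLs, not the signals that produced them, this is about as deep as black-box re-ranking can go.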

Your marketing here on HN has been brilliant, you have some very interesting UI decisions and possibilities that Google doesn't have, but your added value is definitely not in improved ranking of results.


Thx. I can't reveal too much here, but I do a lot more in the reranking area such that the top 20 will look very different for many queries when compared. I think these improvements go a long way to improving search UX, but ranking is subtle and so doesn't get noticed much (except when failing miserably).


That's almost certainly the right strategy for you. The people who founded Cuil had a solution in search of a problem, in that they came from the search infrastructure teams at Google and wanted to use those skills. It turns out that running a search engine on very few machines isn't much of a competitive advantage, and neither is having a very large index.

They also got a lot of marketing out of their "ex-Googlers take on Google" narrative, which probably wouldn't have worked as well if they had been using something like your strategy.


"That's because cuil was building a search engine from scratch rather than relying on bing's/yahoo's index to do the underlying scoring."

I've heard this in other discussions of DuckDuckGo here, and I don't understand why bing/yahoo allow a potential competitor free access to data that is so important to their search businesses. What's in it for Yahoo/Microsoft? Or is DDG paying for the privilege?


It's probably the same reason google allows free app engine accounts. If someone builds something cool, it's easy to integrate upon acquisition.


At the moment DDG is effectively a customer, not a competitor. If DDG ever became large enough to show up on Bing's radar (Bing currently has 600x as much traffic), you can bet that the terms would change.


As far as I know, DDG isn't paying for access. But I might be wrong about that. Maybe they are paying for Bing, but not Yahoo?


Yahoo have recently announced that they will soon be charging for Boss.

http://developer.yahoo.net/blog/archives/2010/08/api_updates...

"We are exploring a potential fee-based structure as well as ad-revenue models that will enable BOSS developers to monetize their offerings. When we roll out these changes, BOSS will no longer be a free service to developers."


Yahoo's search api is about to go premium. It's been free to date, but they have always said they'll start charging for it at some point.


They can be the long tail of search engines: an army of Google-beaters used by people who would never consider using Yahoo search.

I don't think the ordering of the results is Google's competitive advantage anymore; it's branding and habit.


100 billion documents on 100 machines

I think Cuil should sell their index as a service. Over which businesses can implement PageRank and http://en.wikipedia.org/wiki/Pagerank#See_also type of algorithms.
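For reference, the basic algorithm behind that link is small enough to sketch. This is the textbook power-iteration form of PageRank over a toy three-page graph - not anything from Cuil's or Google's actual stack:

```python
def pagerank(links, damping=0.85, iters=50):
    """Power-iteration PageRank over an adjacency dict {page: [outlinks]}."""
    pages = list(links)
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iters):
        # Every page gets the random-jump share, then inherits rank
        # from the pages linking to it.
        new = {p: (1.0 - damping) / n for p in pages}
        for p, outs in links.items():
            if not outs:                      # dangling node: spread evenly
                for q in pages:
                    new[q] += damping * rank[p] / n
            else:
                for q in outs:
                    new[q] += damping * rank[p] / len(outs)
        rank = new
    return rank

# Toy graph: "a" is linked to by both other pages, so it ranks highest.
ranks = pagerank({"a": ["b"], "b": ["a", "c"], "c": ["a"]})
```

Running a score like this over an index of 100 billion documents is, of course, where the real engineering lives.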


or just make accessible their crawl data


Ok, so the problem they were solving was more difficult... but they are going after similar markets (or at least segments of the same market).

Technical achievements are great, but Gabriel is much better placed. He is self-funded, he is building on existing tools (always good advice), he leveraged us - the hacker crowd, who can be very loyal - and he clearly listens to his customers, etc.

Cuil, on the other hand, produced some very confusing (if technically interesting) things and then ranted about those who criticised them. They had a lot of big bucks VC money (always a warning sign) and didn't appear to be leveraging loyalty from any user base.

Even if the problems these two startups are facing are different, there is a lesson here. One is how not to build a product, and one is :)


I voted you up but I don't find it strange. yegg only answers to himself and his users while operating with tight constraints. Cuil, however, had a bumper budget and a cadre of smart people with a plethora of ideas and directions in mind.

To me, Cuil looked like a prime example of design by committee whereas DDG is clearly opinionated but thrives because of it.


DuckDuckGo is Cuiler. http://cuiler.com


IMHO DDG's success stems from the fact that yegg listens to the users, is small yet big, and addresses a niche market - the one that demands quality results and settles for nothing less.

By being small, DDG can address issues that others will not even think of or bother thinking about, e.g. enhanced privacy controls, Tor utilization, etc.

I believe that Cuil was a dream that went south. Unfortunately that dream had a hefty bill ($33m).

Strangely enough, I visited that site once before and could not put a name to the site until I saw a screenshot of it.


"IMHO DDG's success stems from the fact that yegg listens to the users"

I once contacted Cuil about some worthless search results, and got a standard reply asking me to be patient since they were a small company. But there wasn't any hint in the email that they would actually address the issue, so I drastically reduced my use of them and never bothered to contact them again. Listening to users would probably have helped, if my experience indicates a pattern.


Alexa says cuil and ddg are about the same. (I know alexa sucks etc etc)


$33m sounds like a lot to burn through to get to the gut wrenching point of being down to last pennies.

But for those saying "no one can take on Google" and "search isn't an interesting space any more", remember Google got told the same things about the incumbents and "the portals" when it started.

There is always a better way out there. Someone just hasn't found it yet. Chances are low that Google will still dominate search in 10 years, or that most people will search the way they do now.


Yes, people can, and should, take on Google.

At the same time, I happened to be on eBay today and noticed that I've been an eBay member for just over 10 years. Why is it that we haven't found a better way to run auctions online? Most people still list and participate in online auctions almost exactly the same way they did 10 years ago.

I do agree that the pace of innovation is increasing, so maybe 10 years starting from 2010 is the same as, say, 20 years starting from the year 2000.

But I think chances are high that Google will dominate search in 10 years. If someone discovered a better way (and there are many, they just haven't caught on yet) they would just buy 'em out, right? ;) Or maybe Bing will win. But then it's Microsoft winning, and Microsoft is older than me.


I think an auction's network effect is harder to beat. Search doesn't have the network effect, so Google was able to win just by being the best; to beat Google at search now, you're going to have to be an order of magnitude "better" and have non-copyable tech. Facebook was able to beat Myspace, even though Myspace had the network, because for personal identity the tone and "coolness" are very important. But an auction is impersonal: the only thing that matters is the size of the network, so there's not a lot of opening for a competitor. The only way I can imagine to beat eBay would be with a massive amount of money to just pay people to come over.


Hard to beat, but not impossible. As far as auction sites go, NYSE/ARCA was (about 15 years ago) pretty much the only site providing double auctions for stocks. Now there are lots of them - BATS, BSX, assorted darkpools.


To beat eBay you'd start by establishing a base in one market (either one geographical region or one type of product), and then expand from there. And even if you don't manage to beat eBay, you'd have a decent chance of establishing a moderately sized business along the way.


I thought about (and set up a prototype for) starting an eBay competitor a few years ago now.

The main thing it was going to do differently was that instead of having a fixed ending time for an auction, it would be conducted like a regular auction: in real time, via AJAX, with the auction ending once no one else bids.

Seemed like that realtime element, along with no 'sniping', fairer bidding, a cool UI etc would be a good reason for people to switch.
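The mechanism I had in mind - no fixed end time, with the auction closing after a quiet period with no new bids - fits in a few lines of server-side logic (a toy sketch; the class and method names are invented):

```python
import time

class SoftCloseAuction:
    """Toy model of a 'soft close' auction: no fixed end time; the
    auction closes once `quiet_period` seconds pass with no new bid."""

    def __init__(self, quiet_period=30.0):
        self.quiet_period = quiet_period
        self.high_bid = 0.0
        self.last_bid_at = time.monotonic()  # clock starts at listing

    def bid(self, amount):
        """Accept a bid only if it beats the current high bid; any
        accepted bid resets the quiet-period clock."""
        if amount <= self.high_bid:
            return False
        self.high_bid = amount
        self.last_bid_at = time.monotonic()
        return True

    def is_closed(self):
        return time.monotonic() - self.last_bid_at >= self.quiet_period
```

Because a late bid always reopens the quiet period, last-second sniping buys nothing - the auction simply keeps going until bidders are genuinely done.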

But the network effect is massive. To be successful as an ebay competitor, I think you'd need a pretty large marketing budget, or several people working on promotion full time.

It's ripe for the picking though. And massive profits.


Does eBay run auctions the exact same way as they did 10 years ago? In some sense, auctions are the same as they were however many thousands of years ago when the idea was invented. Today, though, eBay has branded it and built in some levels of assurance and trust; it's still an auction, but they'll help you dispute and resolve issues.

I expect if there is a different way to slice search, google will explore it. Fundamentally, if you're going to beat them then you have to index more, better, faster, and somehow produce better results, then provide a better experience. I'm not saying people shouldn't challenge them but they're good and I think it's an insanely difficult challenge to beat them.

In 10 years, we'll still know of and probably use Google, I don't know if that's true for bing or ddg, or any of the others. Unless we stop searching.


eBay's not the winner everywhere. Here in New Zealand they got comprehensively beaten by Trademe, so nobody here uses eBay at all. Now who's going to make the Trademe killer!


I think a lot of that was helped by the shipping cost from anywhere to New Zealand.

All things aside Trade Me has been a great web success story for the local industry.


For the same reason that Craigslist is still king despite its tragic UI. Solving the chicken-and-egg problems of a large, active community and brand-name recognition is orders of magnitude harder than just making a new auction website.


Craigslist's UI is fantastic.

It's not pretty, but that's about the least important element of web-design.

It's very easy for people to understand how things are organized. Cities are grouped together, categories are grouped together. All of the pages are text-based, so pages load fast. Various functions on the page are explained with helpful text when you try to post or respond to a post. Etc.

Their UI can be improved with a little more padding in places and a little more focus on quality typography, but it's remarkable that such a large site has maintained such a quality design in the face of enormous pressure to change.


There are certain cases, Amazon being another example, where a user interface violates what is traditionally considered 'good' design and is enhanced because of it.


Craigslist is minimal but reasonably well-designed. Amazon, on the other hand, I would go so far as to call awful. The #1 thing I'm looking for when I open a product page is the details about the product, and they're below the fold. But still, they know far more than I do about what sells, so I bow to their judgment.


I have a pretty good idea for how to combine both concepts, running auctions using Craigslist, but keep running into the limitations of their API. It's on the shelf right now, but I am thinking I might take another look at it.


I have a feeling that Google is more worried about a new search paradigm — like social search for instance — than its current competitors in the "index+rank search model".

Hunch + Quora + Facebook questions could well be the biggest threat to Google's quasi-monopoly.


Hard to overcome having your company name be synonymous for abstraction from reality.


Does this leave Blekko as the last standing independent Google-killer? (DDG doesn't count to me because yegg isn't (yet) indexing the web himself.)


Yandex has its own index and shows good results in English (but no photo/video search). Exalead has its own index too but it lacks a good junk filter à la PageRank.


In China, Baidu for sure!


What on earth is 'Blekko'? I must be so out of the loop.


Facebook?


I remember the days when so much hype was around it and now when I read the title I thought to myself, hmm, I remember that name, but who are they exactly?

It feels like they were one of those X Factor winners or Big Brother winners who have a lot of media attention to start with and then become complacent and stop playing the game of keeping the public's attention. Marketing is not just five minutes of fame.

I also think that what the above teaches is that there is a lot of hype in social media. Sure, there are real, serious, and enlightening conversations, but much of it is hype and trends and chatter. Maybe it is best to retreat from mass communication like the internet and focus only on your projects. Better to be an actor in my own show than a spectator in others'.


User acquisition strategy is as important for a search engine as for any other consumer web company. Many who read HN are ahead of the curve when it comes to new offerings such as DDG; 99% of America just uses what they have until it breaks. So Cuil's lack of high-quality results may not have been the only reason for their demise, but with $33m in funding I hope a solid chunk was devoted to acquiring main street America. Yes, Google initially won by having higher-quality results, but any new player with higher-quality results will only be marginally better than Google.


I hope Cuil is open sourced!!

Would love to learn from Cuil's cuil code :)


Wonder why this was downvoted.. Is it bad to hope that Cuil is open sourced??


What do you want to learn - how to make a not-very-good search indexer?


No. How to make a highly scalable search engine. Examples of this sort of thing are not out there, other than Nutch, and that hasn't been proven to scale out to the size of Cuil.

Cuil's problem was with their ranking, which may be related to what they were indexing, but not to their ability to scale out.


It's highly scalable in which sense?

Their index was very large, but their traffic was not. I think I've run websites that handled as much traffic as Cuil did on a good day.


Sad. I liked the fact that I could search and not have my searches saved for data-mining purposes. I think privacy is an important thing, and giving one company (Google) all that power is a very dangerous thing. I hope people wake up and start supporting alternative search engines, using Google only as a last resort.


I think I called that right:

http://news.ycombinator.com/item?id=1693053

"Cuil is dead".

I don't think this comes as a surprise to anybody.


I'm sorry to say it, but they never stood a chance. I'm not sure any company will ever be able to take on Google in search directly.


Anybody that filters out the content generators like ehow, ask.com, demandstudios, etc will get an immediate nod from me. Wait.. doesn't DDG Do this? (honestly, I don't know)


According to Gabriel, he filters out spam aggressively. So the answer is probably "yes".


Forgot where I read it (here?) but he actively removes (dis)content mills.


yeah, there was a post about it here a bit ago: http://news.ycombinator.com/item?id=1549690


I find EHow results useful from time to time. Isn't ask.com another search engine? I don't know about demandstudios because I don't know what their domain names are.


> Isn't ask.com another search engine

In a way they still are, but they also do a lot of eHow-type SEO.


There's something bigger that stands out to me about this whole episode. It's not just that they never stood a chance. It's that if they were smart enough to build an even half-way functional engine, they should have been able to see long before launch how things were going to work out. I'm very interested in whatever organizational or psychological dysfunction led them to release the product at all.


I'm willing to bet that they simply didn't eat their own dog food and use their search on a day-to-day basis. They couldn't have. The results were so bad as to be unusable.



The lore is: they had serious technical problems the day before launch and decided to pull through anyway. The load on the already-faulty system caused additional failures, bringing Cuil's quality to its lowest point, and that's when the most people saw it.


Bing seems to be doing a pretty good job.

With Google's privacy snafus and other mishaps, I could see them getting upset within 5 years.


Bing is the default search in IE, it had a significant chunk of the market with Live Search previously, and it started off with a $100 million advertising blitz. Add to that the number and talent of the employees working on it, and you have something no startup could hope to replicate.


The grandparent wasn't talking about startups, though:

i'm not sure if a company will ever be able to take on Google in search directly


Why not? Search has a very low barrier to customer switching. If I want to switch to a new search engine that provides better answers than Google, I only need to change one setting in my browser - it's easier than picking a new desktop color.


Once again, you are not the customer, you are the product. Search engine customers (advertisers) gravitate to the largest player because it provides a larger inventory of users/searches to select from. Search advertising is a winner-takes-all market, and it would take a huge shift in inventory to change this momentum. Once it starts shifting it is hard to stop, and the changing economics make it easier for the insurgent to capitalize on whatever missteps the incumbent makes, but there is a lot of inertia to overcome to get this ball rolling.

A far more likely scenario for a major change in this sector would be for social search to eclipse the crawl it and rank it model that is currently dominant.


As an advertiser, I don't care how many people are using one search engine over another, as long as it is non-zero. I care about the ROI I get and spend accordingly.


Not quite true (well, for most advertisers; I obviously can't speak for you directly). Total return is also important. Especially when you consider that there are non-zero time costs to running advertising on yet another platform.

Would you rather make $1 from a $0.10 investment or $10k from a $2k investment (assuming the former isn't scalable because there are no more users to advertise to)? If there is a non-trivial time cost in setting up management, billing, etc., you might not even bother with the former.
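The trade-off in the two hypothetical campaigns above is easy to put in numbers (the figures are the hypothetical ones from the comment, not real campaign data):

```python
# Two hypothetical ad campaigns: tiny-but-efficient vs large-but-less-efficient.
small = {"cost": 0.10, "revenue": 1.00}       # the $1-from-$0.10 campaign
large = {"cost": 2000.0, "revenue": 10000.0}  # the $10k-from-$2k campaign

def roi(c):
    """Return on investment as a multiple of cost."""
    return (c["revenue"] - c["cost"]) / c["cost"]

# The small campaign's ROI (~9x) beats the large one's (4x), but its
# absolute profit ($0.90) is dwarfed by the large campaign's ($8,000) -
# and by any fixed setup cost of running on another platform.
```

Which is the point: a pure ROI comparison picks the small campaign, while any fixed overhead per platform makes it not worth setting up at all.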


You make a valid point, but would you not factor that into your ROI calculation?


Yes, in theory. In practice it can be very hard to include switching costs, complexity costs etc into the equation. Most business people I know take a much more pragmatic view of calculating ROI on direct costs and putting a finger in the air about whether it's worthwhile.

Also, if total return is small and capped, it may not even be worth the time to calculate ROI.


Hm, I wonder whether the way to take on Google would be to remove that setting. Build an engine and offer the phone companies a share of the advertising revenue if they hard-code that engine on their phones.

The problems there would be a) beating Microsoft and Google at that game, and b) that it would turn search into a low-margin commodity. Both make it hard to turn a profit.


Especially not something funded to this degree; the barrier to what counts as success is way too high with Google Search as a direct competitor. Something like the one-man startup DDG can redefine success as a much more manageable target, allowing a lot longer to compete without going under.


Am I the only one who's happy about that? They were a nuisance to website operators....


But ... but ... if it goes down, how will it ever kill Google? =:-o


That was totally worth the downvote.


I am certain that if I were given 10% of Cuil's $33M, I would have ended up with something 10 times more useful and profitable.


But would you be targeting a sector as attractive to the investors as Google's? The investors weren't worried about making "just anything" useful or profitable. They wanted Google profits and Google users.


If you're wondering whether I claim to be able to set unrealistic targets and reach them, the answer is no.

Besides, how do you know what the investors were willing to accept? AFAIK, investors are normally willing to invest one hundred now to get one thousand several years later. In other words, investing $33M to get $330M 5 or 10 years later.


AFAIK, investors are normally willing to invest one hundred now and get one thousand several years later.

Not quite. A VC plans on 9 out of 10 deals not working out, so the remaining 1 in 10 has to make enough to pay for the rest. That means a 10-fold return just breaks even. But it gets worse: for an investment with a 10-year horizon they need to beat alternative investments. If you peg those at 10%/year (compounding annually), then you need roughly a 25-fold potential return on investment for the fund to have a chance of meeting its goals.
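That arithmetic checks out; making it explicit (the 9-in-10 failure rate and 10%/year hurdle are the assumptions stated above, not industry data):

```python
# 9 out of 10 portfolio companies return nothing, so the one winner
# must return the whole fund: a 10x multiple just breaks even.
failure_rate = 0.9
breakeven_multiple = 1.0 / (1.0 - failure_rate)   # 10x

# Over a 10-year horizon the fund must also beat a 10%/year alternative,
# compounded annually.
hurdle = 1.10 ** 10                               # ~2.59x
required_multiple = breakeven_multiple * hurdle   # ~25.9x: the ~25-fold figure
```

So the "25-fold" number is just the break-even multiple times the compounded opportunity cost of the capital.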


... Why was this modded down?

Is the reasoning wrong, or is there some troll going on that I don't know about? It seems quite realistic to me.


I wonder the same. Perhaps this is the new HN.


You're just seeing the actual outcome.

The investors paid for the full probabilistic model, where in 10% of the cases Cuil would have succeeded in beating Google and would have made tens of billions of dollars. That alone is worth 10% * tens of billions = billions. 33M for that doesn't seem much in that perspective.

Your statement about "I am certainly sure" and "10 times more" ignores the probability distribution of your potential success, and how big your success would be in each case.


As my accountant likes to repeat the cliché, "Even though figures don't lie, liars can figure." All I was saying is dead simple, and these are actually the values I was raised with: one should treat someone else's money 70 times more carefully than one's own.

I am sure not everyone would agree to subscribe to this rule, but I do.

By all means it should not take you $33M to recognize that you are headed in the wrong direction; the first $2-3M should do the job, unless you choose not to see what is happening in front of your eyes.

This is to be said to the founders and investors altogether.

Cuil was a joke, an expensive joke, period.



