What do you think of my startup idea?
19 points by steffon on Sept 25, 2007 | 43 comments
I studied a lot of sociology in college and noticed a way that one could organize communities around their shared content preferences so that people could discover new content with what should be a much higher degree of relevance than current recommendation systems.

I'm calling it a discovery engine, where the user can enter the name of a specific piece of content they have in mind, or something they are generally interested in, and receive recommendations of new content from like-minded people.

Just like Wikipedia issued a call to all people interested in making an encyclopedia, this would issue a call to all early adopters to be recognized as authorities and trend-setters. Think The Tipping Point by Malcolm Gladwell, but taking place online, efficiently, and transparently.

The project calls for combining social-bookmarking and user-generated media with an algorithm that both aggregates similar collections of content into networks and makes recommendations of content based on the evolving network structure. The ranks of "influence" and "in the know" are measured against networks of users with similar collections of content. These two rankings incentivize users to continually post relevant content because they want to remain "influential" and "in the know" in front of the people that are genuinely interested in the same content. The majority of users coming to Topiat for recommendations receive relevant recommendations fueled by the work of those who are genuinely influential and in the know.

But unlike current recommendation systems, which are domain specific and treat an individual as the sum of all their preferences (e.g. Netflix), the discovery engine would allow users to create networks based on all types of content (any combination of music, products, images, videos, URLs etc.) and enable users to explore their different interests by creating multiple groups of content on their profile. Each group of content becomes aligned with similar groups of content, from which recommendations are generated and delivered to the user (e.g. my oldies music compared with users with similar tastes in oldies music, my surfing group compared with other users that think of surfing the same way I do).

I'm putting together a Y Combinator funding proposal based on this basic idea and am looking for feedback before I send it in. If there are any developers that like the idea and want to know more, let me know. Additionally, if you are good with machine learning techniques (e.g. neural networks) and are interested, let me know.



what's a one or two sentence explanation that would get my little sister to use this? what value would she get within the first 2 minutes (assume early on, before you have a ton of people using it?)

ideas like this are tough because they're only really useful once an incredible amount of content has been submitted. what tricks can you do to make the site be sticky to the first 100 users, when nothing yet is submitted? what are the first concrete things the user sees, or does, when they arrive at the site?

i think that's why these recommendation sites are only successful in niches (e.g. travel (tripadvisor), food/entertainment (yelp), movies (netflix)).


I've got a few ideas, such as front-loading the site with content and preference data by using different APIs, such as Flickr's, YouTube's, del.icio.us's, and Last.fm's. This way, when you show up, lots of your preferences and rating "work" come along with you. Additionally, the preference dataset is jumpstarted.

What you would tell your little sister about the site is "find more things you like, and if you're good at finding new content, be recognized for it."


Here's an example of how I would use something like this: I want to watch a movie that's similar to Jurassic Park. I can go to the site and type that in. The web application picks apart the constituent pieces of "jurassic park," like "action movie" and "dinosaurs" and "made a lot of money," then looks for other movies that match the criteria based on information from IMDB, Google, and MPAA.

Then, it looks for people who have voted/commented on the movie, and extrapolates a probability of you liking other movies that they have rated highly or poorly. If your rating history is similar to theirs, it rates their other movies with similar pieces higher. If your history is very different, or if they voted Jurassic Park very poorly, then the other items they voted highly are placed low on your recommendations.

I think that's awesome.
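
The rating-extrapolation step described above resembles user-based collaborative filtering. Here is a minimal sketch under stated assumptions: the ratings, user names, and the choice of Pearson correlation as the similarity measure are all invented for illustration, and the "constituent pieces" lookup against IMDB and other sources is left out. It is not the poster's actual algorithm.

```python
from math import sqrt

# Hypothetical ratings on a 1-5 scale; users and scores are made up.
ratings = {
    "me":    {"Jurassic Park": 5, "Jaws": 4, "Notting Hill": 1},
    "alice": {"Jurassic Park": 5, "Jaws": 5, "Notting Hill": 2, "King Kong": 5},
    "bob":   {"Jurassic Park": 1, "Jaws": 2, "Notting Hill": 5, "King Kong": 1},
}

def similarity(a, b):
    """Pearson correlation over the items both users rated (0.0 if undefined)."""
    shared = [item for item in a if item in b]
    if len(shared) < 2:
        return 0.0
    mean_a = sum(a[i] for i in shared) / len(shared)
    mean_b = sum(b[i] for i in shared) / len(shared)
    num = sum((a[i] - mean_a) * (b[i] - mean_b) for i in shared)
    den = sqrt(sum((a[i] - mean_a) ** 2 for i in shared)) * sqrt(
        sum((b[i] - mean_b) ** 2 for i in shared)
    )
    return num / den if den else 0.0

def predict(user, item, ratings):
    """Predict a rating as the user's mean plus similarity-weighted
    deviations of other users from their own means."""
    mine = ratings[user]
    my_mean = sum(mine.values()) / len(mine)
    num = den = 0.0
    for other, theirs in ratings.items():
        if other == user or item not in theirs:
            continue
        sim = similarity(mine, theirs)
        their_mean = sum(theirs.values()) / len(theirs)
        # A negative sim flips the sign: what a dissimilar user loved counts against the item.
        num += sim * (theirs[item] - their_mean)
        den += abs(sim)
    return my_mean + num / den if den else my_mean

print(predict("me", "King Kong", ratings))
```

In this toy data, "alice" rates like "me" and loved King Kong, while "bob" rates the opposite way and disliked it, so both push the prediction above my average rating.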


Description is too long, doesn't make sense. However it also sounds like the plan of no fewer than 10 already existing startups with tons of venture money that are already doomed to failure ... Except for Ning.


It made sense to me. It's a great idea but implementation is going to be incredibly challenging. It's only doomed if the founders don't have the skills to actually implement a better recommendation engine for everything.

I think an uber-short pitch here is "A better recommendation engine for everything" which of course only makes sense if you know what a recommendation engine is.


Even if you built the most perfect possible recommendation engine, it still wouldn't matter unless you were able to build a large enough user base who was actually using it.

Re: ability to get recommendation engines working better, see the Netflix $1M competition as an example of just how hard it is.


This is kind of what I tried to do many months ago. It was also based on the premise of transparent authority, many-to-many content and user connections, categorization on multiple dimensions, and such. I also approached it from a sociolinguistics angle. The problem with sociology and the like is that the metrics themselves are very broad and hard to quantify.

The application's complexity spiraled out of my control. I still believe in the core concepts, but execution of this thing in particular is very tricky. Theoretically speaking it could be the end-all recommendation engine, and that is, from my own experience, almost as lofty as building a Turing Machine. Granted that you have user input, it probably only needs to be half as strong, but that is already very strong, and very difficult (at least for me).

Now logic is one thing. On the same level of difficulty is how to make it user friendly; for this app, regardless of how hard the AI is, the UI is probably half the battle.

I'm not sure what kind of app you have in mind specifically. dcurtis's mockup is real slick and to the point. My implementation was a lot more passive, but I suspect internally the structures share commonalities with your proposal. If you are interested we can discuss. I won't be applying with my implementation though, because this is the kind of site where without 100k users, a VC won't even bother. And no matter how good it is, it is much harder to get traction today, hence higher uncertainty, higher burn.


Sounds like IRC. Once you find a chan you like and kind of settle in, you realize that the actual convo rarely relates to the title of the chan, but is still of interest to the people who regularly go there.

IRC is pretty popular, I don't see why your idea wouldn't be among the less BitchX-inclined internet crowd. ;-P

You'd have to be brutal with the 'influence' system, though, to keep the trolls out of the 'ponies and barbie dolls' group.

The killer feature, to me, of an idea like this would be something like musicovery.com, where overlapping things lead you down new paths.


Here are two large differences from IRC:

1.) One of the main problems I am trying to solve is in your second sentence: "Once you find a chan you like..." It's time consuming to find new content you like, especially in a setting as nebulous (for most people) as IRC. With the discovery engine, all a user needs to do is rate recommendations based on what they are interested in, and they are immediately connected with groups of like-minded people and what they know. Fast and easy.

2.) In addition to connecting to groups of like-minded people, the user gets recommendations from those who are most influential and most in the know. All content is not equally desirable, and users of the discovery engine will only be recommended content that has been empirically shown to be wanted by people who think like them (relevant).


Of course. I only mentioned IRC because there are some similarities that seem to indicate your idea could work.

>only get recommended that content which has empirically shown is wanted by people that think like them

Seems like an OK idea, but I think the site might suffer if this is taken too far and the groups just become echo chambers, walled off from each other entirely. I just think a site could be more popular if it focused on finding common links in disparate groups instead of isolating them based on common differences. Think of it like radio: in my music directory, my biggest artists are The Crystal Method (41 songs), NIN (112 songs), and Sarah McLachlan (85 songs). No single radio station, AM, FM, or XM, that I am aware of covers a range like that. Wouldn't it be cool to find a place that let you keep close to your core group, but at the same time gave you tips on what other groups thought was cool, just in case you agreed?


"... I'm putting together a Y combinator funding proposal based on this basic idea and am looking for feedback ..."

demonstrate the idea by showing ... build an app... Show me the demo


That was my thought too, so I have a demo under development. Unfortunately, it won't be ready by Oct. 11, and of course, it's all about the right team anyway, which is what I'm really interested in. The main reason for posting this idea is to find like-minded entrepreneurs interested in the idea so we can build a team and, as you said, "demonstrate the idea by showing...build an app".


"... The main reason for posting this idea is to find like-minded entrepreneurs interested in the idea ..."

Now I see. Even a mocked-up static demo is better than nothing. The Oct 11 date leaves plenty of time. Now I get what the "you're going to have to work real hard" comment means, especially when you come up against fixed dates.


Here's a very rough mockup that was put together with some brainstorming, just to give an idea of what the first user introduction might be like.

http://quarkfactor.com/topiat_example.png


"... Here's a very rough mockup that was put together with some brainstorming, just to give an idea of what the first user introduction might be like ..."

Better. Now make some more that demonstrate your key points and that will also help you start building. Wish you'd put this up first as it starts to put flesh to words (idea description).


Dude, just want to say you have a great sense of design.

Are you applying for YC this round?


Thanks. And yes.

Send me an AIM IM: quarkfactor


You are too modest. That mockup was great. I couldn't make out from the thread if you and steffon are working together?


Yeah, steffon and I are working together. Between the two of us, we can make a pretty great interface that is very focused and has defined boundaries. The problem of boundaries tends to be what most potential investors are worried about. We've definitely got that concern covered.

What we need is someone who is amazingly good at neural networks, algorithm design, and advanced mathematical engineering with programming languages. These people don't grow on trees; they're extremely few and far between. But the main asset for this idea isn't the idea itself, or even-- to some extent-- the user interface, but the way the math works behind the scenes. That's our innovative secret, and it's the reason I'm excited.


The idea seems to be, basically, make a better social filtering application. I definitely think there's space for something really cool here, but no one's done anything close to useful yet, short of Amazon.

If I were you, I would, at least for the time being, pare the idea down into something much smaller and more manageable. Once you have that done as a proof of concept, if it's still too small you can expand to other areas. Along the way, you will've learned a lot about what you'll need to execute on the bigger idea. Why not start with the movie suggester you mentioned later on in the thread?


What determines "groups of content" (people? algorithms? either way, the value is in figuring out how)

It isn't clear why "domain-specific" recommendation systems don't work within a group.

Note that it isn't obviously true that separating by group actually produces better recommendations. In fact, the (inadequate) evidence that I'm aware of indicates the exact opposite.

BTW - If you don't have the algorithms mentioned, how much do you think that you actually have? (I can imagine lots of wonderful things that would come from a personal transportation device that got 100mpg, but if I don't know how to make one....)


"groups of content" are collections of content that users create themselves (the social bookmarking aspect of the site). Users that want to build a reputation for being in the know and influential are incentivized to make these collections relevant. The groups also serve as lists where users can bookmark content they find around the web and save it in one spot.

Based on what you have put into your collection of content, and collections others have made, the algorithm aggregates those similar people into the same network. People in the same network get content recommended to each other from their like-minded peers that they have not discovered on their own.

With regard to the feasibility of such an algorithm, I've talked about it with many mathematicians and machine learning programmers, and the wheel does not have to be reinvented for this application. The tools already exist, and just have to be customized and tweaked for this application.
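
As a rough illustration of grouping similar collections into a network and recommending what peers have that you lack, here is a minimal sketch. Everything in it is an assumption made for the example: the sample collections, the use of Jaccard overlap as the similarity measure, the 0.4 threshold, and the function names. The actual algorithm is not described in this thread.

```python
from itertools import combinations

# Hypothetical collections: (user, group name) -> set of content items.
collections = {
    ("ann", "oldies"):  {"Elvis", "Beatles", "Beach Boys"},
    ("ben", "60s"):     {"Beatles", "Beach Boys", "Kinks"},
    ("cat", "surfing"): {"Endless Summer", "longboard", "Beach Boys"},
    ("dan", "waves"):   {"Endless Summer", "longboard", "wetsuit"},
}

def jaccard(a, b):
    """Overlap of two sets: size of intersection over size of union."""
    return len(a & b) / len(a | b)

def build_networks(collections, threshold=0.4):
    """Link collections whose overlap meets the threshold and
    return the resulting connected components (union-find)."""
    parent = {key: key for key in collections}

    def find(k):
        while parent[k] != k:
            parent[k] = parent[parent[k]]  # path halving
            k = parent[k]
        return k

    for x, y in combinations(collections, 2):
        if jaccard(collections[x], collections[y]) >= threshold:
            parent[find(x)] = find(y)

    networks = {}
    for key in collections:
        networks.setdefault(find(key), set()).add(key)
    return list(networks.values())

def recommend(key, collections, networks):
    """Items held by peers in the same network that this collection lacks."""
    network = next(n for n in networks if key in n)
    peer_items = set().union(*(collections[k] for k in network))
    return peer_items - collections[key]

networks = build_networks(collections)
print(recommend(("ann", "oldies"), collections, networks))
```

Note that the same user could appear in several networks through different groups, which is the point of per-group rather than per-user matching.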


In other words, I make a "group of content" that I call "movies" and it gets compared with other peoples' "group of content" that they called "movies".

Why isn't the netflix recommendation system useful for generating recommendations from "groups of content" labelled "movies"?

If it works for movies, why can't it combine "groups of content" labelled "science fiction"?


The point of having groups of content is to combine traditionally dissimilar types of content-- movies, music, housewares, etc...

For example I can make a group of content called "new living room" and add all of the things that go into my new living room. This includes the music and movies that I have stored there, the type of TV I bought, the type of couch, stereo receiver, speakers, or even the paint on the wall. When someone searches for something within that collection, the system knows that someone else, somewhere, has combined that "thing" with the other "things" in the collection, so they get rated higher as being compatible.


Why do you think that the netflix system won't do the right thing if we both put some music in our "groups of content" labelled "movies"?


This page explains many of netflix's limitations well: http://harry.hchen1.com/2006/10/03/391. But more importantly, look at these limitations in light of how the discovery engine is organizing its preference data and how it's collecting preference data.

The critical difference with the discovery engine is the idea of a group of content that users fill themselves with content based on criteria they see as relevant. Yes, a user's aggregate preference composition is important, but what is more important is their set of preferences regarding a specific collection of content. This way, a user can be really into classical music, horror movies, and modern furniture, and get relevant recommendations for each interest, connecting with people who are most in the know regarding each interest.


I'm not sure what you're asking here-- Netflix doesn't care about anything but movies, and it probably wouldn't be able to recommend movies any better if it knew your musical tastes, or even how your tastes compare to mine.

The idea is that if you search for "sony SSK70ED," on the "discovery engine," it will show you what other people have paired with those speakers, such as receivers, furniture, and televisions. In a way those things are "similar" to the speakers because they complement them. Of course, the system shows you similar speakers first, but the complementing items are interesting results to have when you're searching for a specific item.
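
The "what other people have paired with it" lookup can be sketched as a simple co-occurrence count over user-made collections. This is only an illustration: the collections and every item name other than the speakers are invented, `paired_with` is a hypothetical helper, and the sketch covers the complementary-item results only, not the "similar speakers first" ranking.

```python
from collections import Counter

# Hypothetical user-made collections of mixed content types.
collections = [
    {"sony SSK70ED", "receiver A", "plasma TV", "leather couch"},
    {"sony SSK70ED", "receiver A", "LCD TV"},
    {"sony SSK70ED", "receiver B", "plasma TV"},
    {"bookshelf speakers", "receiver B"},
]

def paired_with(item, collections):
    """Count how often each other item shares a collection with `item`."""
    counts = Counter()
    for group in collections:
        if item in group:
            counts.update(group - {item})
    return counts

for other, n in paired_with("sony SSK70ED", collections).most_common():
    print(n, other)
```

Items that appear together more often rank higher, regardless of whether they are receivers, televisions, or furniture.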


> I'm not sure what you're asking here-- Netflix doesn't care about anything but movies, and it probably wouldn't be able to recommend movies any better if it knew your musical tastes, or even how your tastes compare to mine.

That's wrong. The only thing that is movie-specific about the netflix recommendation system is the preference data that it runs over. It doesn't know movies from eyebrows.

If netflix (the company) also collected preference information about music, the recommendation engine would predict music preferences. And, since it would have both music and movie info, it would use music prefs to recommend movies and the reverse, just as it uses movie prefs to predict movie prefs today.

Amazon's "users who bought {something} also bought" is an example of "doesn't know anything about the domain". (They have to tone it down to keep it from recommending "strange" things that are way out of category.)

Disclosure: I know the guy who implemented Netflix's first recommendation system and have written a collaborative filter myself. I know what I can do with the fact that we both like the Pogues and Chunky Monkey. I still don't see what I can do with how we group those preferences.


"I still don't see what I can do with how we group those preferences. "

The way in which this site will allow users to group their preferences seems like a slight organizational difference when compared with other recommendation sites that use collaborative filters, but it has huge implications. This post is meant to give people a taste of what I'm starting and to find others who are interested. I'd love to talk about any and all specifics and their implications, especially if you are a programmer. If you want to chat, my AIM is rocksld3.


Delicious allows users to put a permissive license on their RSS feeds, so (IANAL and within reason) you can probably get those bookmarks and, importantly, the tags. So how closely would making recommendations per tag match your idea?

Tags are a little messy, and perhaps overly specific, so you may need to cluster them and then do recommendations on that.

I'm not sure I understand how you plan on making 'influence' work. If someone bookmarks a link recommended to them, do you simply increment the score of all those who have bookmarked it before?
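
To make the question concrete, the naive scheme could be sketched like this. It is purely hypothetical: the log format and the function name are invented, it credits every later bookmark rather than only bookmarks that came from a recommendation, and nothing in the thread says the site would actually score influence this way.

```python
from collections import defaultdict

def influence_scores(bookmark_log):
    """bookmark_log: (user, link) pairs in chronological order.

    Each new bookmark of a link adds one point to every user
    who bookmarked that link earlier.
    """
    earlier = defaultdict(list)  # link -> users who already bookmarked it
    scores = defaultdict(int)
    for user, link in bookmark_log:
        if user in earlier[link]:
            continue  # ignore repeat bookmarks by the same user
        for predecessor in earlier[link]:
            scores[predecessor] += 1
        earlier[link].append(user)
    return dict(scores)

# Hypothetical log: ann finds link "x" first, ben finds link "y" first.
log = [("ann", "x"), ("ben", "x"), ("cat", "x"), ("ben", "y"), ("ann", "y")]
print(influence_scores(log))
```

One consequence of such a scheme is that it rewards whoever bookmarks earliest, so it would likely need some guard against users bookmarking everything indiscriminately.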

Please don't use the word incentivize, it's terrible :)


utter crap. Go back to the drawing board. Make something more tangible, enough with the pie in the sky.

These are concrete ideas:

Facebook, just like your silly school book, but online, and you can add friends

Google - search engine

Yelp - reviews of restaurants and businesses

last.fm - music recommendations from your friends

pandora - discover music according to your tastes

reddit/digg/news.yc -- recommending news and stories

Yours is not a tangible idea. Narrow the main idea down to one sentence, plus just one paragraph on how it will work, and then you have something more tangible.


I don't think you described those sites very well.

Facebook: Digitize your real relationships

Google: Find websites based on keywords

Yelp: Find restaurants, share experiences about restaurants

Last.fm: Discover new music based on popularity

Pandora: Discover new music based on constituent pieces

reddit/Digg: Discover news based on popularity

This idea: Discover new things based on constituent pieces, popularity, and relationships.

Seems like a natural progression to me, even if it seems complicated in the description.


Benefits

Facebook: Better contact with more people

Google: Answers to questions. Solutions to problems.

Last.fm: Good music

Pandora: Good music

reddit/Digg: Inspiration, quick and easy learning about tech, internet, politics

This idea: ?

For example, what's the benefit of getting movies that are similar to Jurassic Park? If you say "comfort and delight" I'd like to know how it's more comfortable than looking the movie up on IMDB and checking the recommendations. It has to be a bit more comfortable than the way people find movies today. Why would they check your service instead of just walking down to the video store and looking at the selection?


This idea: find similar people, good music, good movies, and interesting things.

How are those not benefits?

The benefit of finding movies that are similar to Jurassic Park is that you find new movies that you might enjoy. This is very obvious. What is the benefit of finding new music on iTunes/Pandora? It's the same thing! Of course it's for comfort and delight. That's what you're supposed to be pursuing in life.

IMDB does not recommend movies at all. If it does, it does it horrifically badly because it has never helped me before. IMDB is not a recommendation engine, it's simply a database. And they don't seem to want to innovate beyond that.

Video stores are enormous and movies are ordered alphabetically by genre. There is no way to know which movie is good, or which one you will like, because all you have is a promotional paragraph on the back and a pretty picture on the front. This is also very obvious to me.


You have a good idea, but I don't think you're good at communicating it. The above is a good example - I don't know anyone that sets out to 'digitize (their) real relationships'. They might do that, but that's a means to an end, not the end itself.

Users are focused on end goals, not means. The end goal for Facebook is to keep in contact with friends, meet new people, organise events, and show off their photos.

Nobody wants to digitize anything. If FB initially told its users they've made a service to digitize their real relationships, they wouldn't be very popular.


Those are the exact words from Mr. Zuckerberg himself. He said he didn't create Facebook to help you find "friends;" he created it to digitize real relationships and allow you to keep in contact.


Regardless of who said the words, have they ever been used to market Facebook to users?

The jargon used makes it sound like a conversation with contemporaries, colleagues or investors.


I'm not sure if it's possible to get enough traction for a general purpose recommendation system. Could this be a service that you offer to providers of niche sites? In terms of machine learning, I think you need to look at algorithms that work well with sparse data. This is particularly important in the beginning, but even later you might not have much data in every single field of interest. Good luck.


It's called digg.com, and about a dozen other sites I've seen. You could even build most of this in Ning for free, right now.

My point being that I don't see anything in there that really evolves the social content sharing platform in any significant way.


sounds like it can be taken care of by mahalo. machine learning < human sorting. i'm also not seeing a business model.

or... i might have said that because i want to discourage you from using this great idea and save it for myself... hmm...


Mahalo is trying to reinvent search by letting people choose the results, and that's not going to work. Using Google, I can manipulate the "machine" to give me precise results.

This idea is about finding new things based on things you already know. It's not possible to manipulate the Google "machine" to discover things that are similar to what you know you like.


Calacanis may be a dick but he has proven himself (to the tune of $30M or whatever).

I don't use Mahalo but I have no doubt he'll be able to monetize that kind of business and eventually sell out for a decent payday.


i'm also not seeing a business model.

Could say the same about Mahalo?



