

Show HN: Condensr.com, Yelp Snippet Summarization - csauper
http://condensr.com

======
timr
One of our ongoing projects on the Yelp search team is improving the quality
of our own review snippet algorithms. The paper behind this went around the
team a few weeks ago, and we all agreed that it's pretty cool.

If this kind of stuff interests you, send me your resume -- my HN user name at
yelp.

------
torme
Very cool, I like this a lot. I've seen a lot of aggregate tools with Google
Maps like this that seem to have a better sense of how I actually search for
things.

Minor criticisms and suggestions:

1\. It'd be awesome if this could expand to all aspects of Yelp and not just
the restaurants. I tried searching for laundromat and got no results, but
this search gives results on Yelp.

2\. It'd be nice to be able to filter by a few of the snippets. The main thing
I'd like to be able to do is show only negative reviews. Oftentimes I want to
hear what people didn't like about a place more than what they did.

3\. It's not really clear what expanding the snippets does. I _think_ it's
showing similar snippets, but I'm not 100% sure of that. One weird example:

\+ The staff's not 100 % attentive

        -The dining room is cozy and intimate.
        -The deco is colorful

I'm not sure how these are related.

But as a whole, very cool.

------
aidscholar
Link to the paper summary for the sentiment analysis used in this app:
<http://groups.csail.mit.edu/rbg/code/content_structure/index.html>

------
callmeed
I really dig this.

If I wanted to learn to do something similar, where would I start? I looked at
the research paper but I fear I might need some prerequisite steps.

~~~
riffer
These papers are good: <http://larifari.org/writing/>

One of the nice things about them is that they come at the problem from many
different angles. Part of the reason summarization has not been particularly
productized is that for a long time the standard approach has involved
focusing on a narrow domain, training a model, etc. That gets the best results
for the local problem, and focus is great for a startup, but that approach is
prone to over-fitting, and it is neither scalable nor extensible. Ultimately, it
has held the whole category back. The solution is probably to take a bunch of
concepts from related fields and combine them within the constraint of a
scalable framework.

That's why I recommend these papers (beyond the fact that they are relatively
approachable): you can almost sense that he's feeling different surfaces of
the problem, trying to map texture, and find the right formula for a great
general solution.

------
brendano
Thanks for putting this as its own domain, folks. I wanted to go to it after
Aria's EMNLP talk but couldn't remember the URL; it went by too fast.

