

Show HN: gi.st – the gist of the web - stkbach
http://thegi.st

======
inaseer
The web is a big place. What's the strategy around seeding it with useful
gists so it can gain critical mass? Have you thought about focusing on a niche
(say, TED talks?) or possibly automatically summarizing pages? The latter
could be tricky, as incorrectly or poorly summarized pages would drive users
away from the site.

~~~
stkbach
It does automatically summarize any text-heavy articles.

For example:

bbc.com/news thewashingtonpost.com huffingtonpost.com reuters.com cnn.com
theglobeandmail.com theguardian.com en.wikipedia.org

Go to any of those as a start, submit one of the articles you find, and you
will see that everything suggested by @gist was generated by an algorithm.

Our hope is that this will help get the ball rolling. We are certainly very subject
to network effects, so focusing on a niche and providing tremendous value to
that one vertical is a good strategy.

~~~
inaseer
Ah, your landing page should make that abundantly clear. That's a very cool
feature.

I tried a couple of articles on Washington Post and the algorithm did a decent
job. I then read the gist first before reading an article, and while some
sentences were a little hard to understand w/o the surrounding context, I felt
I still got a decent summary. I can see myself skimming the article through
this service instead of relying solely on my eyeballs when scanning through.

Couple of suggestions:

\- Linking the extracted sentences back to the original page if I want the
surrounding context

\- A tool/browser plug-in which can allow me to select a representative
sentence from the story and submit to gist

Seems like a useful standalone service as it is; having people vote on and
submit gists themselves would be the cherry on top.

~~~
stkbach
Thanks! There is a browser plugin planned that will show you the gists of any
url you navigate to, and additionally could show the gists of links on a page
before you click on them...

------
bullfrog8055
Hey, seems neat, as a student it makes me wonder if it could help me as a
search tool ... if it summarizes an article for me and 'the gist' is exactly
what I need for a paper, can gist help me find other articles with similar
'gists' so I can use these in my paper?

~~~
stkbach
The real purpose of gist is to help you decide what not to read. It's a better
form of skimming.

------
granolagirl
A gist 'filter' or something would be easier, just on all the time, instead of
cutting and pasting the URL. It would be easy if it were a pop-up that kicks in
automatically every time you go to a new website or hover over a link. Could it
fade out after 5 seconds if not hovered over?

~~~
johnny_r
Agreed granolagirl. Just a masking of the site should Gist be turned on, with
a quick link or click away feature to the source.

First tests are pretty damn impressive though. Gist'd (is that even the right
verb?) a couple of wiki pages and found the summary quite useful, and
surprisingly, grammatically correct. Will stay tuned for sure.

~~~
stkbach
:) It's grammatically correct because @gist uses an extraction-based strategy
(i.e., it picks an existing representative sentence). If you want an
abstractive summary (i.e., a new representative sentence), best to ask a human
for now.
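
To illustrate what "extraction-based" means in general, here is a minimal sketch. It is not @gist's actual algorithm, which isn't described in this thread. It assumes a naive sentence splitter, a tiny made-up stop-word list, and a simple word-frequency score; the function names are hypothetical. Every result is a sentence copied verbatim from the input, which is why the output stays grammatical.

```python
import re
from collections import Counter

# Hypothetical stop-word list, kept tiny for illustration only.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "in", "is", "it",
              "that", "for", "on", "was"}


def split_sentences(text):
    """Naive splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased words with the stop words removed."""
    return [w for w in re.findall(r"[a-z']+", text.lower())
            if w not in STOP_WORDS]


def extract_gists(text, count=1):
    """Return `count` existing sentences, ranked by average word frequency."""
    sentences = split_sentences(text)
    freq = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        return sum(freq[t] for t in tokens) / len(tokens) if tokens else 0.0

    # Pick the highest-scoring sentences, then restore document order.
    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    return [sentences[i] for i in sorted(ranked[:count])]
```

Calling `extract_gists(article_text, count=3)` returns three sentences exactly as they appear in `article_text`; nothing is rewritten or generated.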

------
mikerocket
This will save me many hours trying to skim the goodness out of long reads.
The chrome extension will be key for adoption.

I wonder what the feedback will be from big publications. Maybe they will
take the hint and write more concise content...

------
cachichas
This is awesome! No more TL;DR. The plugin or extension idea is also great: it
would be much easier to use and would avoid copying and pasting URLs. Congrats!!!

------
tamateboxgold
I like the concept and it would be great if we could personalize this app a
bit more. For instance, I want a “History” tab which shows all my gists.

------
DaveJorg
Great work. Very cool. A progress indicator when the content is being parsed
would be nice.

------
anagramy
Be nice if there was a "Latest" tab or something that showed what had recently
been gisted.

~~~
stkbach
+1 on the list!

------
telephoto
Do you have plans for giving it the capability of analyzing images as well as
text?

~~~
stkbach
In the distant future, but as of right now we're focused more on text and
getting humans to start participating. The machine learning approaches we're
using are getting better every day, and the latest deep learning techniques
are showing promise with pictures, so that is certainly a possibility down the
road.

------
studio364
It tots works! Super cool, I could see myself getting addicted to this tool.

------
granolagirl
It tots works! Super cool, I could see myself getting addicted to this tool.

------
norednoproblem
I wish it didn't offer so many 'gists' - three is enough

~~~
stkbach
Ya, it returns about 1/6 of the sentences in the documents as gists. For an
item with 12 sentences, 2 is too few, but for something with 100 sentences, 17
is too many.

We'll probably change the ratio to scale in a smarter way rather than a
constant fraction.
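
As a rough sketch of the difference, the snippet below compares the constant 1/6 ratio with one possible alternative. The square-root rule and the `minimum`/`maximum` bounds are assumptions chosen for illustration, not the scaling gist will actually adopt.

```python
import math


def gist_count_constant(num_sentences, ratio=1 / 6):
    """Current behavior as described: a fixed fraction of the sentences."""
    return round(num_sentences * ratio)


def gist_count_scaled(num_sentences, minimum=3, maximum=10):
    """Hypothetical alternative: grow with the square root, then clamp.

    `minimum` and `maximum` are made-up bounds; the result never exceeds
    the number of sentences actually available.
    """
    if num_sentences <= 0:
        return 0
    target = round(math.sqrt(num_sentences))
    return min(num_sentences, max(minimum, min(maximum, target)))


for n in (12, 100):
    print(n, gist_count_constant(n), gist_count_scaled(n))
```

With these assumed bounds, a 12-sentence item gets 3 gists instead of 2, and a 100-sentence item gets 10 instead of 17.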

------
telephoto
Also, it seems to just be reiterating the title of an article. Can it delve
into an article to summarize the content, regardless of what the author has
titled it?

~~~
stkbach
@telephoto...

Go to one of the following sites: www.bbc.com/news, washingtonpost.com,
huffingtonpost.com, reuters.com, theglobeandmail.com

On those sites, find any text article of medium length or longer, and
submit the article (not the home page of the news site) to gist.

You should see that it's not just reiterating the title.

On many sites it's quite buggy, and what you're seeing is likely the result of
gist either a) attempting to gist something when it shouldn't (like, say, the
BBC News home page), or b) attempting to gist when the scrape/parse failed and
there were too few or poorly parsed sentences.

We do extract the html title and stick it in at the top of the page as a
reference, but that title, like each gist, is editable by the community, and
serves only to act as a jumping off point.

------
timothystvs
Very nice. I'd be interested in playing around with the api.

~~~
stkbach
Ya, the website shown here doesn't have privileged access to the API. So,
apart from adding some basic rate limiting, we can release the same API the
website uses to the public as a start.
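
For anyone curious what "basic rate limiting" could look like, here is a minimal token-bucket sketch. This is one common technique, assumed here for illustration; the thread doesn't say which approach gist would use, and the class name and parameters are hypothetical.

```python
import time


class TokenBucket:
    """Allow short bursts up to `capacity`, refilling at `rate` tokens/second."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable so the limiter can be tested
        self.tokens = float(capacity)
        self.last = clock()

    def allow(self):
        """Return True and spend one token if the request may proceed."""
        now = self.clock()
        elapsed = now - self.last
        self.last = now
        # Refill for the time that has passed, never beyond capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

A server would keep one bucket per API key or client address and reject a request whenever `allow()` returns `False`.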

------
mirror2mirror
good start! :)

