
Show HN: Summarized Finance / Tech Newsletter Using Natural Language Processing - chidog12
http://getthecrypt.com/how-it-works
======
kakashi141
Amazing work chidog12. I've tried using TextRank for text summarization before,
but the results are mostly hit or miss. I've found that page elements like
"Follow this reader" and "Share on Twitter/Facebook" frequently occur in
these articles and, due to the vote-based ranking of TextRank, they get
picked up as high-ranked sentences. Which of the algorithms mentioned on your
site were most effective at extracting useful summaries?
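
For anyone hitting the same problem, here is a minimal sketch of one possible workaround: drop widget-like sentences before ranking, so they never collect votes. This is not the site's pipeline. The pattern list, the `min_words` cutoff, and all function names are assumptions made up for illustration, and the ranking is a simplified TextRank-style scorer using word overlap.

```python
import math
import re

# Hypothetical patterns for page-widget text; tune these for your own sources.
BOILERPLATE = re.compile(
    r"follow this|share on|sign up|subscribe|twitter|facebook", re.IGNORECASE
)

def split_sentences(text):
    # Naive splitter: break after '.', '!' or '?' followed by whitespace.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def drop_boilerplate(sentences, min_words=4):
    # Remove very short or widget-like sentences before ranking.
    return [
        s for s in sentences
        if len(s.split()) >= min_words and not BOILERPLATE.search(s)
    ]

def words(sentence):
    return set(re.findall(r"\w+", sentence.lower()))

def similarity(a, b):
    # Word overlap normalized by log lengths (TextRank-style, on unique words).
    wa, wb = words(a), words(b)
    if len(wa) < 2 or len(wb) < 2:
        return 0.0
    return len(wa & wb) / (math.log(len(wa)) + math.log(len(wb)))

def textrank(sentences, damping=0.85, iterations=30):
    # Weighted PageRank over the sentence similarity graph.
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0 for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

def summarize(text, top_n=2):
    # Filter first, rank second, then return the top sentences in original order.
    sentences = drop_boilerplate(split_sentences(text))
    scores = textrank(sentences)
    ranked = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)
    return [sentences[i] for i in sorted(ranked[:top_n])]
```

Calling `summarize(article_text, top_n=3)` returns up to three sentences. A regex list like this is crude, and stripping navigation and share widgets at the HTML extraction stage would likely work better where that is an option.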

~~~
chidog12
Sorry for the late reply,

Honestly, we still need more testing.

However, our top 3 tend to be Luhn's Heuristic Method, Latent
Semantic Analysis, and TextRank.

On the other hand, we have yet to use a single summary produced solely using
SumBasic... I'll need to read into that one more.

------
forgottenentry
I'm curious about your sentiment analysis process. Do you employ a particular
threshold for neutrality, or does the team provide input?

No doubt neutral reporting is advantageous; personally I would find it helpful
to also see both a highly positive and negative take. Coming at a topic from
both high and low may, imho, improve accuracy of comprehension.

~~~
chidog12
Hmm, I really like that last point you've made... both highly positive and
negative takes, I'll look into that.

For the sentiment analysis process, we are using Microsoft's API, which gives
scores from 0 to 1. For the most part, it is pretty good based on our views on
neutrality (which of course have some inherent biases).

0.5 is considered pretty indifferent; however, we allow a range of
0.35 - 0.75 as neutral, but we specify its lean on our end.

On Reddit, we've tested our summaries in the comments as a tl;dr, and we
actually indicate the article's lean, for example "This article is neutral with
a negative lean on the topic".
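
To make the thresholds concrete, here is a rough sketch of how a 0-1 score could be mapped to that kind of label. It is only an illustration. The function name, the inclusive boundaries, and the wording for scores outside the neutral band are assumptions; only the 0.35 - 0.75 band, the 0.5 midpoint, and the "neutral with a negative lean" phrasing come from the description above.

```python
NEUTRAL_LOW = 0.35   # lower edge of the neutral band (from the comment above)
NEUTRAL_HIGH = 0.75  # upper edge of the neutral band (from the comment above)
MIDPOINT = 0.5       # score treated as indifferent

def describe_lean(score):
    """Map a sentiment score in [0, 1] to a label (hypothetical helper)."""
    if not 0.0 <= score <= 1.0:
        raise ValueError("score must be between 0 and 1")
    # Outside the neutral band: wording here is an assumption.
    if score < NEUTRAL_LOW:
        return "This article is negative on the topic"
    if score > NEUTRAL_HIGH:
        return "This article is positive on the topic"
    # Inside the band: neutral, with the lean set by the side of the midpoint.
    if score < MIDPOINT:
        return "This article is neutral with a negative lean on the topic"
    if score > MIDPOINT:
        return "This article is neutral with a positive lean on the topic"
    return "This article is neutral on the topic"
```

For example, `describe_lean(0.4)` falls inside the band but below 0.5, so it yields the "neutral with a negative lean" label.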

In Finance, top publications do a great job of maintaining neutrality.
However, if we ever get into politics... we'll need to take this a little more
seriously!

Thanks for the comment.

------
theblackcat1002
I wonder how you choose which news or stories to deliver to users. From your
recent link[1] it seems like the news is a bit on the short side.

[1] http://getthecrypt.com/recent/

~~~
chidog12
lol yea, we are still working on our selection process. Right now it's really
just our preferences vs the criteria we've set up using NewsApi.

