

Show HN: See what words are trending on HN - hpvic03
http://hackernewstrends.herokuapp.com/

======
hpvic03
This is a little something I did for fun. The trending hash tags on twitter
are an interesting metric for what people are talking about, so I thought it
would be cool to make something like that for Hacker News. But we don't have
hash tags, so the app just uses words.

Note that you can click on a word to get some of the posts it's mentioned in.

Edit: Also, a heads up -- it's running on a Heroku free dyno, and it's already
feeling a little slow.

~~~
kyyd
Good stuff! This would be awesome as a word cloud.

~~~
hpvic03
That's a good idea. I could do that.

------
tomasien
I posted this on Twitter this morning, but watch
[http://www.google.com/trends/explore?q=bitcoin#q=bitcoin&cmp...](http://www.google.com/trends/explore?q=bitcoin#q=bitcoin&cmpt=q)
and the chart in this post carefully if you bought Bitcoin speculatively (i.e.,
not because you actually care about Bitcoin or want to be long on it). When
the Google chart drops and this goes off the top 5 on HN but the price hasn't
plummeted, you're at a mid-term peak and you should sell.

The price of Bitcoin is being driven by media and hype right now - deserved or
undeserved, in the near term there WILL be a dip in that and the price will
drop for no "good" reason. Keep an eye on that if you're looking for a peak at
which to sell, or (maybe more importantly) a bottom at which to buy.

------
gabriel34
Couple of suggestions:

Exclude commonly trending words such as Google. It isn't trending if it's
already huge.

I've got '[' and ']' for the last week, exclude these too.

EDIT: As a matter of fact, something seems to be wrong if Google is classified
as trending. There hasn't been a big surge in posts about Google in the
recent past. Either something is wrong, I am wrong, or the algorithm is still
training (e.g., learning the normal rate of appearance of certain words).

~~~
hpvic03
I've added [ and ] to the filter list, though I'll have to remove some stuff
from the db to get rid of the past results.

When I was developing it I saw Google trending all the time. I think Google
just comes up in conversation on HN very frequently.

I suppose I could remove Google if it doesn't have meaning. But maybe it
does. This is interesting: if Google stops trending, perhaps that's a canary
signal that it's not relevant anymore.

~~~
alanctgardner2
I think what the parent means is that the derivative of the appearances is a
lot more interesting than the number of appearances (and possibly as a
percentage of the total number of appearances). Google moving from 100 to 105
isn't very interesting, but a new language going from 0 to 10 mentions might
be very significant. (edit: and Google owning x% and staying there isn't very
interesting either)

In other words, frequency of occurrence is interesting, but statistically
unlikely occurrences (more or less frequent than expected) are even more
interesting.
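
A minimal Python sketch of that idea, not the app's actual scoring: compare
each word's count in the current period against its average count in earlier
periods, and scale the difference by a Poisson-style spread. The function
name, the +1 smoothing term, and the sample counts are assumptions made for
illustration.

```python
from collections import Counter
from math import sqrt


def trend_scores(current, baseline, baseline_periods=1):
    """Score each word by how far its current count departs from expectation.

    current: Counter of mentions in the latest period.
    baseline: Counter of mentions summed over `baseline_periods` earlier periods.
    """
    scores = {}
    for word in set(current) | set(baseline):
        expected = baseline[word] / baseline_periods
        observed = current[word]
        # Poisson-style deviation; the +1 (an assumed smoothing term) avoids
        # dividing by zero for words that never appeared before.
        scores[word] = (observed - expected) / sqrt(expected + 1)
    return scores


# Sample counts taken from the comment above: 100 -> 105 versus 0 -> 10.
current = Counter({"google": 105, "newlang": 10})
baseline = Counter({"google": 100, "newlang": 0})

scores = trend_scores(current, baseline)
ranked = sorted(scores, key=scores.get, reverse=True)
```

With these numbers "newlang" ranks above "google", even though Google has far
more raw mentions. A negative score would flag a word appearing less often
than expected.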

------
dmunoz
Fun to browse, and I could certainly see this enabling me to notice some
interesting things I had missed on HN.

Two minor things I noticed:

Currently, 7 is listed as trending now with seven mentions. Not that numbers
trending could never be interesting, e.g. if 600 were trending due to the
discussion about the lowering of the prime gap [0], but 7 seems to be trending
just because of submission titles.

Both [ and ] are trending in the past week. This seems to be due to submission
titles tagged with e.g. [video], [pdf], [<year>].

[0]
[https://news.ycombinator.com/item?id=6784383](https://news.ycombinator.com/item?id=6784383)

~~~
hpvic03
I've added [ and ] to the filter list, so they won't be included in the
future.

It's true that there is a bit of noise. I think it's pretty good overall
though. There is a filter list of 866 words that greatly improved the results
after I implemented it.

------
nmc
Wow! The top 5 has "bitcoin", "income", "bitcoins", and "bank".

Is HN all about the money?

EDIT: also, very neat work! However, you should consider excluding "]" and "["
from the list.

~~~
hpvic03
Thanks. Haha, it seems so, at least this week.

I just added [ and ] to the filter list.

~~~
nakovet
Filter out words that are less than 3 characters maybe?
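
A rough Python sketch of how a filter list and a minimum length could be
combined when extracting words from a title. The app's real tokenizer and its
866-word list are not shown in this thread, so `FILTER_WORDS`, `MIN_LENGTH`,
and `extract_words` are hypothetical stand-ins.

```python
import re

# Hypothetical stand-in for the app's filter list, which is not shown here.
FILTER_WORDS = {"the", "and", "for", "with", "video", "pdf"}

# Assumed threshold, following the suggestion to drop very short words.
MIN_LENGTH = 3


def extract_words(title):
    """Lowercase a title, split it into alphanumeric tokens, and drop noise."""
    # Matching only letters and digits discards punctuation such as [ and ].
    tokens = re.findall(r"[a-z0-9]+", title.lower())
    return [t for t in tokens if len(t) >= MIN_LENGTH and t not in FILTER_WORDS]


words = extract_words("Bitcoin and the 7 banks [video]")
```

Here `words` keeps only "bitcoin" and "banks". One trade-off: a length cutoff
of 3 also drops legitimate short terms such as "Go" or "C", so those would
need an allow list.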

------
Dru89
Right now "ockhams" is trending with 17 mentions and "razor" is trending with
15 mentions.

Does that mean that someone just said "ockhams" twice?

~~~
route66
You just did. And it shows in the stats I happen to see right now.

------
sylvainkalache
Hnwatcher.com provides a graph that shows the number of times a specific word
was mentioned on HN over time.

------
antonius
Nice work. Before clicking, I knew bitcoin would hold the top spot. Kind of
hope this craze dies down a bit so more non-bitcoin stories can appear.

------
pluralVision
Are you using Lucene or Elastic search for indexing words?

It would be great to read more about your project.

~~~
hpvic03
No, it's nothing fancy. It's just your standard Rails app. It saves a record
for each word found along with time data, then runs queries grouping by word
count and filtering between now and whatever time you choose. I'm sure that's
a naive implementation and the app could certainly be optimized.

I could open source it if you really want to take a look.
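
For readers curious what that approach might look like, here is a small
sketch using Python's built-in sqlite3 module rather than Rails. It is not
the app's code: the table name `word_mentions`, its columns, the `top_words`
helper, and the sample rows are all assumptions made for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect(":memory:")
# One row per word occurrence, stamped with the time it was seen.
conn.execute("CREATE TABLE word_mentions (word TEXT NOT NULL, seen_at TEXT NOT NULL)")

# Arbitrary fixed "now" so the example is reproducible.
now = datetime(2013, 11, 25, 12, 0, 0)
rows = [
    ("bitcoin", now - timedelta(hours=1)),
    ("bitcoin", now - timedelta(hours=5)),
    ("google", now - timedelta(hours=2)),
    ("bitcoin", now - timedelta(days=10)),  # falls outside a one-week window
]
conn.executemany(
    "INSERT INTO word_mentions VALUES (?, ?)",
    [(word, seen.isoformat()) for word, seen in rows],
)


def top_words(conn, since, until, limit=5):
    """Count mentions per word between two timestamps, most mentioned first."""
    query = """
        SELECT word, COUNT(*) AS mentions
        FROM word_mentions
        WHERE seen_at BETWEEN ? AND ?
        GROUP BY word
        ORDER BY mentions DESC, word ASC
        LIMIT ?
    """
    # ISO 8601 strings in one consistent format sort chronologically as text.
    return conn.execute(query, (since.isoformat(), until.isoformat(), limit)).fetchall()


week = top_words(conn, now - timedelta(days=7), now)
```

With one row per occurrence the table grows quickly, so an index on
`seen_at` or pre-aggregated hourly counts would be the obvious first
optimizations.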

