
Ask HN: Visualizing how some HN submissions lead to other related submissions? - asadjb
Here's something on the front page right now as I'm writing this question:

[What's been wrought using the Piece Table? (2014)] https://news.ycombinator.com/item?id=15387672

It came up a few hours after this other article about text editor data structures:

[Text Editor: Data Structures] https://news.ycombinator.com/item?id=15381886

Say I wanted to analyze how some HN submissions naturally lead to other related ones, and find some statistics for these submissions, like how long it takes for one "seed submission" to result in multiple other submissions of a similar type.

Has that already been done? If not, how would one approach this? I'm looking for algorithms and analysis techniques that could potentially answer these questions. Given that my knowledge of data science and statistical analysis is ZERO, I'd love some pointers.
======
nerdponx
You might want to define a "similarity metric" between posts, or locate posts
along some continuum. Then you can look for autocorrelation or temporal
clustering in post contents.

You could, for instance, build a topic model or word vector model on post
replies, then locate post titles in the space you fitted.
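
As a rough illustration of the "similarity metric" idea, here is a minimal sketch that scores pairs of titles by bag-of-words cosine similarity and keeps pairs submitted within a time window. Everything in it is an assumption for illustration: the function names, the `(id, hours, title)` tuple layout, the thresholds, and the sample posts are all made up. Plain word overlap is much cruder than a topic model or word vectors, and it would miss pairs whose titles share no words.

```python
import math
import re
from collections import Counter


def tokenize(title):
    """Lowercase a title and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", title.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def related_followups(posts, window_hours=48.0, min_sim=0.3):
    """Find (earlier_id, later_id, delay_hours, similarity) pairs.

    posts: list of (post_id, hours_since_some_origin, title) tuples.
    A pair is kept when the later post appears within window_hours of
    the earlier one and their titles have similarity >= min_sim.
    """
    vectors = {pid: Counter(tokenize(title)) for pid, _, title in posts}
    ordered = sorted(posts, key=lambda p: p[1])
    pairs = []
    for i, (seed_id, seed_time, _) in enumerate(ordered):
        for later_id, later_time, _ in ordered[i + 1:]:
            delay = later_time - seed_time
            if delay > window_hours:
                break  # posts are time-sorted, so the rest are later still
            sim = cosine(vectors[seed_id], vectors[later_id])
            if sim >= min_sim:
                pairs.append((seed_id, later_id, delay, sim))
    return pairs


# Hypothetical sample data, not real HN submissions.
sample = [
    ("a", 0.0, "Text editor data structures"),
    ("b", 5.0, "Piece table data structures for a text editor"),
    ("c", 6.0, "Show HN: A weather app in Rust"),
]
for seed_id, later_id, delay, sim in related_followups(sample):
    print(seed_id, later_id, delay, round(sim, 3))
```

The delays collected this way are the raw material for the temporal-clustering question: if related posts really do cluster in time, short delays should be over-represented compared with randomly paired posts.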

------
noncoml
Note that it’s not only the submissions but the comments as well that lead
to new submissions.

But I guess that would be even more difficult.

~~~
asadjb
I agree. It's people interacting over the comments sections that probably
leads to those new submissions. But from what I've seen (anecdotal evidence)
the submissions are at least somewhat related.

What I'd like to see is the relationship between the different posts, not
necessarily how they came to be submitted in the first place.

Sort of like segmenting posts together based on subject. And then figuring out
if one of these was the reason why the other came up.
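
A minimal sketch of that grouping, under loud assumptions: posts are `(id, hours, title)` tuples, "same subject" is approximated by Jaccard overlap of title words, and the earliest post in a group is treated as its seed. The names `group_by_seed` and `delay_summary`, the thresholds, and the sample posts are all hypothetical. A later post landing in a seed's group only shows that it is similar and came afterwards, not that the seed caused it.

```python
import re
from statistics import median


def words(title):
    """Set of lowercase alphanumeric tokens in a title."""
    return set(re.findall(r"[a-z0-9]+", title.lower()))


def jaccard(a, b):
    """Jaccard overlap between two word sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def group_by_seed(posts, min_sim=0.3, window_hours=72.0):
    """Greedy single pass: attach each post to the earliest similar seed.

    Returns {seed_id: [(follower_id, delay_hours), ...]}.
    """
    seeds = []   # (seed_id, seed_hours, seed_words), in time order
    groups = {}
    for pid, hours, title in sorted(posts, key=lambda p: p[1]):
        w = words(title)
        for seed_id, seed_hours, seed_words in seeds:
            close_in_time = hours - seed_hours <= window_hours
            if close_in_time and jaccard(w, seed_words) >= min_sim:
                groups[seed_id].append((pid, hours - seed_hours))
                break
        else:
            # No matching seed found: this post starts its own group.
            seeds.append((pid, hours, w))
            groups[pid] = []
    return groups


def delay_summary(groups):
    """Count follow-ups and report the median seed-to-follow-up delay."""
    delays = [d for followers in groups.values() for _, d in followers]
    return {
        "followups": len(delays),
        "median_delay_hours": median(delays) if delays else None,
    }


# Hypothetical sample data, not real HN submissions.
sample = [
    ("s1", 0.0, "Text editor data structures"),
    ("p2", 4.0, "Text editor data structures: the piece table"),
    ("x", 12.0, "Ask HN: best mechanical keyboard"),
]
print(delay_summary(group_by_seed(sample)))
```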

------
PaulHoule
Definitely there is a "me too" effect where people pile on with what they
think is going to be popular.

