

Ask HN: how to visualize a mailing list as a threaded conversation? - shurane

I liked the threaded conversation style of Hacker News and Reddit, especially the online interfaces for them. Google Groups, however, isn't as nice. I would like to not migrate off of Google Groups if possible. I'd rather consume a website or service that visualizes Google Groups. The emails are there, anyway. Any suggestions?

Reference mailing list: https://groups.google.com/forum/#!topic/coderdojonyc

I have threaded conversation disabled for the time being -- figured it would be a better way to track conversations, but it hasn't been really.
======
malandrew
The algorithm you end up with for a particular message format is going to be
very dependent on the metadata you have attached to each message. With that
"It depends..." disclaimer out of the way, I would start here:

[http://www.jwz.org/doc/threading.html](http://www.jwz.org/doc/threading.html)
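
As a rough illustration of the header-based approach that write-up describes, here is a much-simplified sketch. It only maps each message to a parent using the standard `Message-ID`, `References` and `In-Reply-To` headers; the full algorithm at the link does considerably more (missing messages, subject grouping, and so on). The function name `parent_ids` is made up for this example, and it assumes you can get at the raw message text at all.

```python
from email import message_from_string


def parent_ids(raw_messages):
    """Map each Message-ID to its parent's Message-ID (None for a root)."""
    parents = {}
    for raw in raw_messages:
        msg = message_from_string(raw)
        msg_id = (msg["Message-ID"] or "").strip()
        if not msg_id:
            continue  # cannot place a message that has no id
        refs = (msg["References"] or "").split()
        in_reply_to = (msg["In-Reply-To"] or "").split()
        if refs:
            # References lists ancestors oldest first, so the last is the parent.
            parents[msg_id] = refs[-1]
        elif in_reply_to:
            parents[msg_id] = in_reply_to[0]
        else:
            parents[msg_id] = None
    return parents
```

Feeding it three raw messages, where the second carries `In-Reply-To: <1@x>` and the third carries `References: <1@x> <2@x>`, yields a chain from the third to the second to the first.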

With respect to GG in particular, you can't really "rethread" an unthreaded
medium via message metadata, but you may be able to do so via "replied to
message data" in the form of quoting.

I would parse each message to extract its quoted text, then use that quoted
text to identify the "parent" message being replied to. 98% of the time there
will be only one parent, but you need to make sure that your data format
allows more than one parent, in case someone has replied to two messages at
once via cut-and-paste quoting.
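
A minimal sketch of that idea, under some loud assumptions: quoted lines are marked with a leading `>`, only first-level quotes are considered, and a quote "matches" a candidate if it appears as a substring of the candidate's own (unquoted) text after whitespace and case are normalized. The `Message` class and the helper names are hypothetical, and very short quotes will produce false positives.

```python
from dataclasses import dataclass


@dataclass
class Message:
    msg_id: str
    author: str
    body: str


def normalize(text):
    """Lowercase and collapse whitespace so re-wrapped quotes still match."""
    return " ".join(text.lower().split())


def quoted_lines(body):
    """Normalized text of first-level quoted lines (those starting with '>')."""
    out = []
    for line in body.splitlines():
        stripped = line.strip()
        if stripped.startswith(">"):
            rest = stripped[1:].strip()
            # Skip deeper quote levels; they belong to older ancestors.
            if rest and not rest.startswith(">"):
                out.append(normalize(rest))
    return out


def own_text(body):
    """Normalized text the author wrote, i.e. everything that is not quoted."""
    lines = [l for l in body.splitlines() if not l.strip().startswith(">")]
    return normalize(" ".join(lines))


def find_parents(reply, earlier):
    """Ids of earlier messages quoted by the reply; may contain several."""
    parents = []
    for quote in quoted_lines(reply.body):
        for candidate in earlier:
            if candidate.msg_id in parents:
                continue
            if quote in own_text(candidate.body):
                parents.append(candidate.msg_id)
    return parents
```

Returning a list rather than a single id is what keeps the multi-parent case representable.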

In such a case, working out which messages were quoted is going to be far
more challenging and computationally intensive. You'll have to figure out a
few heuristics, such as limiting the search space to "only search messages
posted between the last message the user posted under a particular original
ancestor message and the post time of the current reply."
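
One way that window heuristic might look, as a sketch only. It assumes every post carries a comparable timestamp and that `thread` holds all posts under one ancestor; `Post` and `candidate_window` are hypothetical names. If the replier has not posted in the thread before, the window falls back to everything earlier than the reply.

```python
from datetime import datetime
from typing import NamedTuple


class Post(NamedTuple):
    msg_id: str
    author: str
    posted_at: datetime


def candidate_window(reply, thread):
    """Posts made after the replier's previous post and before the reply."""
    previous = [
        p.posted_at
        for p in thread
        if p.author == reply.author and p.posted_at < reply.posted_at
    ]
    lower = max(previous, default=None)  # None: replier is new to the thread
    return [
        p
        for p in thread
        if p.posted_at < reply.posted_at
        and (lower is None or p.posted_at > lower)
    ]
```

Only the posts in this window then need to be compared against the reply's quoted text. The trade-off is that a reply quoting something older than the user's previous post will be missed.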

You may also try to work out the replying style of each user to predict what
text in their messages is likely to be quoted, i.e. are they a top replier, a
bottom replier or an inline replier? Labeling each user according to their
reply style could speed up the identification of the parent message(s).
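
A rough sketch of how the labeling could work, again assuming `>`-prefixed quoting. Each message is reduced to a sequence of "own" and "quote" runs, and a user gets the style seen most often across their messages. The function names are invented for this example, and attribution lines such as "On Tuesday, X wrote:" count as the author's own text here, so a real version would want to strip those first.

```python
from collections import Counter
from itertools import groupby


def reply_style(body):
    """Classify one message as 'top', 'bottom', 'inline' or 'none'."""
    kinds = [
        "quote" if line.lstrip().startswith(">") else "own"
        for line in body.splitlines()
        if line.strip()  # ignore blank lines
    ]
    runs = [kind for kind, _ in groupby(kinds)]  # collapse consecutive repeats
    if "quote" not in runs or "own" not in runs:
        return "none"
    if runs == ["own", "quote"]:
        return "top"
    if runs == ["quote", "own"]:
        return "bottom"
    return "inline"


def user_style(bodies):
    """Most common style across a user's messages, ignoring 'none'."""
    counts = Counter(s for s in map(reply_style, bodies) if s != "none")
    return counts.most_common(1)[0][0] if counts else "none"
```

Knowing that a user is, say, a top replier means only the trailing quoted block needs to be searched for a parent match.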

