

Slaves of the feed - This is not the realtime we've been looking for - ThomPete
http://000fff.org/slaves-of-the-feed-this-is-not-the-realtime-weve-been-looking-for/

======
fragmede
So how can we solve this problem of drinking from a firehose?

I think a system like Google Reader might have some traction here, wherein
friends can recommend items to each other. With enough friends, you could make
a metafeed out of those items. Make it a new service: you get paid for
drinking from the firehose, and you pay to get human-filtered feeds. Throw
some 'liked/hated this firehose drinker' ratings and a pile of people at it,
and you get interest spaces.

~~~
metajack
There are decades of research in information retrieval that can easily be
applied to this problem. Before Google showed up with PageRank, many
algorithms were based on properties of a given document and the historical
properties of the document set (for example, TF-IDF).

You don't need to drink from a firehose. The firehose will be throttled and
filtered to give you the useful information that you want. This is where the
future of real-time search lies.

Disclosure: I'm CTO of Collecta, one of these real-time search companies.
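As a rough sketch of the kind of pre-PageRank scoring mentioned above (the example documents are invented for illustration), TF-IDF weights a term by how often it appears in a document, discounted by how many documents in the set contain it:

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return a {term: score} dict per document (docs = lists of tokens)."""
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n / df[term])
            for term, count in tf.items()
        })
    return scores

docs = [
    "the feed is noisy".split(),
    "the search is slow".split(),
    "the filter works".split(),
]
scores = tf_idf(docs)
# "the" occurs in every document, so its IDF (and hence its score) is zero;
# rarer terms like "feed" score higher.
```

Applied to a firehose, scores like these let you rank or threshold items per term instead of reading everything.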

~~~
evlapix
Search clearly isn't the answer. Can you create a search term that encompasses
all of your interests? I hope not.

Surely filtering the "fire hose" with the query "programming news" will not
yield satisfactory results.

------
tdoggette
He correctly identifies a problem, but no amount of interdevice communication
is going to tell me which 5 of boingboing's posts are most likely to be of
personal interest.

Here's what'd help me with the bottleneck: a unified inbox with a wicked smart
relevance algorithm. VR? Probably not.

~~~
JulianMorrison
We've had plain old naive-Bayesian filtering for ages - I'm surprised more
feed-readers don't use it.

------
sunkencity
The feed items that truly contain _new information_ are not what one
traditionally trains for with Bayes and the like, because new information
implies something that has not been seen before, rather than just being
similar to something already seen. Searching for meaningfully _different_
stuff rather than _similar_ stuff is a whole different problem.

I'd like to have a treemap view of Google Reader rather than a list, where I
could visually filter the feeds in real time by adding negative and positive
keywords/tags, and click around the various levels of the treemap.

Click on "technology" => make a new treemap of significant tags in that area,
click on the next tag to dive into that space etc, and eventually see all the
feed items somehow. Perhaps present them in all levels by some sidebar or
hoverthing. Color code items for popularity / activity / freshness.

I'm thinking something like dabble.db but with a treemap interface built in
Cappuccino JS.
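A minimal sketch of one drill-down level of such a treemap (the item structure and tag names here are invented for illustration): keep the items matching every tag clicked so far and none of the negative keywords, then size the next level's tiles by how many visible items carry each remaining tag:

```python
from collections import Counter

def drill_down(items, path=(), exclude=()):
    """One level of a hypothetical treemap feed view.

    path:    tags clicked so far, e.g. ("technology",)
    exclude: negative keywords to filter out of titles
    Returns the visible items and a tag -> item-count map for the tiles.
    """
    visible = [
        it for it in items
        if set(path) <= set(it["tags"])
        and not any(k in it["title"].lower() for k in exclude)
    ]
    tiles = Counter(t for it in visible for t in it["tags"] if t not in path)
    return visible, tiles

items = [
    {"title": "New kernel scheduler", "tags": ["technology", "linux"]},
    {"title": "Rails 3 beta", "tags": ["technology", "programming"]},
    {"title": "Celebrity gossip roundup", "tags": ["entertainment"]},
]
# Clicking "technology" leaves two items, with "linux" and "programming"
# as the next level of tiles.
visible, tiles = drill_down(items, path=("technology",))
```

Each click just re-runs the same function with a longer path, so the whole interface is one pure filter-and-count step per level.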

~~~
evlapix
That sounds like a great interface. But it doesn't stop you from having to do
all of the filtering/categorizing. Maybe if after having done all of that work
on your end, there would be a nice way to share that with followers.

~~~
tdoggette
Nah, the only categorization that really needs to be done is by topic. It's
easy to tell if something is, say, a tech article, or tech/programming, or
tech/linux, based on source and keywords. If your algorithm did that once per
item that's in anyone's feed, everything else is refinement. Ideally, a human
won't need to touch it at all. Even if the machine gets it wrong, it could be
corrected with a "this is in the wrong place" button. In any case, an article
shouldn't rise too high in any category that it's not relevant for.

~~~
derefr
I'd like to see the categories automatically created via SVD. I wonder how
people would react to the knowledge that their favorite posts are written by
30-something males on Saturdays, and include the word "has" twice as often as
the word "my"?
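A toy sketch of the SVD idea above, assuming a term-by-post count matrix (the terms and counts are invented): the singular vectors act as latent "concept" axes, and terms that co-occur load on the same axis with no category labels supplied anywhere:

```python
import numpy as np

terms = ["linux", "kernel", "recipe", "baking"]
# Rows = terms, columns = posts (toy occurrence counts).
counts = np.array([
    [2.0, 3.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 2.0, 2.0],
])

# Factor the matrix; columns of u are latent "concept" directions.
u, s, vt = np.linalg.svd(counts, full_matrices=False)

# Which of the top two concepts each term loads on most strongly:
dominant = np.abs(u[:, :2]).argmax(axis=1)
# "linux"/"kernel" share one concept and "recipe"/"baking" the other,
# so the categories emerge from co-occurrence alone.
```

On real data the axes are messier (and may indeed turn out to encode things like author demographics or function-word ratios), but the mechanism is the same.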

------
NathanKP
I don't think computers can help anyone find truly interesting things. An
algorithm would have to be really complicated to discover the fact that just
because so many people are interested in New Moon, or Tiger Woods, that
doesn't mean I am.

When "relevancy" algorithms are attempted they don't do nearly as good a job
at serving niche groups as "community" moderated algorithms such as HN.

HN is an algorithm in which each person's brain is an equation that estimates
the "worth" of a piece of information. As a result, the final product is the
output of an extremely complicated system of "equations", as it were, that is
good at finding relevant information.

I don't expect computers to be good at doing that for another ten years or so.

~~~
jgrahamc
_An algorithm would have to be really complicated to discover the fact that
just because so many people are interested in New Moon, or Tiger Woods, that
doesn't mean I am._

A Naive Bayesian text classifier could easily learn that. If you fed it news
stories that you are interested in it would quickly discover that you never
click on stories about Tiger Woods.

It would then classify incoming news for you.

In fact, my old email project POPFile has been adapted to support things like
RSS and NNTP filtering with ease.
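A minimal naive-Bayes classifier in the spirit described above (a sketch, not POPFile itself; the labels and training headlines are invented): train it on headlines you clicked versus skipped, and it scores new ones by word likelihoods with add-one smoothing:

```python
import math
from collections import Counter

class FeedClassifier:
    """Tiny naive-Bayes text classifier for feed items."""

    def __init__(self):
        self.words = {"interesting": Counter(), "skip": Counter()}
        self.docs = Counter()

    def train(self, label, headline):
        self.words[label].update(headline.lower().split())
        self.docs[label] += 1

    def classify(self, headline):
        vocab = len(set(self.words["interesting"]) | set(self.words["skip"]))
        total = sum(self.docs.values())

        def score(label):
            # Log prior plus log likelihood with add-one smoothing.
            s = math.log(self.docs[label] / total)
            n = sum(self.words[label].values())
            for w in headline.lower().split():
                s += math.log((self.words[label][w] + 1) / (n + vocab))
            return s

        return max(self.words, key=score)

clf = FeedClassifier()
clf.train("skip", "tiger woods apologizes again")
clf.train("skip", "new moon breaks box office records")
clf.train("interesting", "haskell compiler gets new optimizer")
clf.train("interesting", "real time search engine launches")
```

After a handful of examples it already routes Tiger Woods headlines to "skip", which is exactly the learning-from-clicks behavior claimed above.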

~~~
NathanKP
That would work to eliminate "noise", but such algorithms still do a poor job
of helping you find new content and subjects you have never expressed interest
in before. I still trust groups of real people to help me in this regard more
than I would trust an aggregator.

~~~
derefr
Theoretically, you could have a (lazily-evaluated) subscription to the entire
universe of discourse. It would start out completely hidden, but would
gradually become visible in your reader as your filters trim down the stream
to a tractable level. Thus, you wouldn't so much be saying "tell me something
I don't know," but rather "tell me everything, and leave out the 99.9999% I
don't care about."
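The lazily-evaluated subscription above maps naturally onto generator pipelines; here is a minimal Python sketch (the firehose and its topics are invented): the "entire universe" is never enumerated, items are only pulled through the filters as the reader demands them:

```python
import itertools

def lazy_stream(feed, predicates):
    """Yield only the items that survive every filter; the rest of the
    stream never materializes in the reader at all."""
    for item in feed:
        if all(p(item) for p in predicates):
            yield item

def firehose():
    """Stand-in for the infinite universe of discourse: an endless
    generator, evaluated only as the consumer pulls on it."""
    i = 0
    while True:
        yield {"id": i, "topic": "tech" if i % 3 == 0 else "celebrity"}
        i += 1

wanted = lazy_stream(firehose(), [lambda item: item["topic"] == "tech"])
first_five = list(itertools.islice(wanted, 5))   # pulls just enough items
```

Tightening the predicates tightens the stream; "tell me everything, minus the 99.9999%" is just a longer predicate list over the same lazy source.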

------
ThomPete
The idea is that interdevice communication creates the knowledge of context by
logging everything you do but currently don't have the ability to do. I know
it needs more thinking, but there is something there, imho.

------
ThomPete
Lots of great comments, thanks so much.

