Show HN: Essays curated for you with machine learning (findka.com)
53 points by jacobobryant 78 days ago | 7 comments

Hi. I pivoted to this recently from a cross-domain recommender system (see previous Show HN[1]--it could, for example, recommend movies based on which books you like). Long story short, people thought the idea was interesting (landing page converted at ~7%, and I got hundreds of signups), but the recommendations weren't accurate enough to get people to stay (retention was nonexistent; I also did some user surveys)--and I wasn't confident that I'd be able to make the recommendations much better with the resources I have. I was originally hoping that Findka would be good enough at least to sustain growth so that I could improve the recommendation quality gradually over time, but that didn't pan out.

Findka Essays is a simplified version of the old app. It recommends only essays (instead of any type of content), and you get recommendations through email only, not through a web app. I record which links you click on, and links in future emails get picked based on your click history.

Since the domain (essays) is much more restricted, a more-or-less randomly selected item should still have a decent chance of being a good recommendation (especially since right now I manually curate most of the essays). So I'm hoping that the system will be worth using even before we have lots of data to feed to the algorithm, unlike the old Findka.

Also, I genuinely love essays (reading and writing). I think they are the most valuable form of content, and I think helping good essays spread further would have a big impact on the world. So I'm pretty excited about this pivot.

For those interested, I've written a detailed description of the architecture here.[2] The web app is written in Clojure, using Biff[3] (a web framework I made), and the recommendation algorithm is a ~100-line Python file that uses off-the-shelf k-NN, with some of my own additions for handling exploration and countering popularity bias. See [4] for details.
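For readers who want a feel for the general technique, here is a minimal sketch of item-based k-NN over click data with a popularity penalty and epsilon-greedy exploration. It is not the actual Findka code: it uses only the standard library rather than an off-the-shelf k-NN package, and the names `item_similarities`, `recommend`, `epsilon`, and `penalty` are hypothetical, chosen for illustration.

```python
import math
import random
from collections import defaultdict


def item_similarities(clicks):
    """clicks maps user -> set of clicked essay ids.

    Returns (sims, users_by_item), where sims[a][b] is the cosine
    similarity between the sets of users who clicked essays a and b.
    """
    users_by_item = defaultdict(set)
    for user, items in clicks.items():
        for item in items:
            users_by_item[item].add(user)

    sims = defaultdict(dict)
    items = list(users_by_item)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            overlap = len(users_by_item[a] & users_by_item[b])
            if overlap:
                s = overlap / math.sqrt(len(users_by_item[a]) * len(users_by_item[b]))
                sims[a][b] = s
                sims[b][a] = s
    return sims, users_by_item


def recommend(user, clicks, catalog, n=3, k=5, epsilon=0.2, penalty=0.5, rng=random):
    """Pick n unseen essays for one email (assumed parameters, not Findka's)."""
    sims, users_by_item = item_similarities(clicks)
    seen = clicks.get(user, set())

    # Score unseen essays by similarity to the k nearest neighbors of each
    # clicked essay, dividing by popularity**penalty to counter popularity bias.
    scores = defaultdict(float)
    for item in seen:
        neighbors = sorted(sims[item].items(), key=lambda kv: kv[1], reverse=True)[:k]
        for other, s in neighbors:
            if other not in seen:
                scores[other] += s / (len(users_by_item[other]) ** penalty)
    ranked = sorted(scores, key=scores.get, reverse=True)

    # Epsilon-greedy: with probability epsilon (or when nothing is scored),
    # explore by sampling uniformly from the unseen part of the catalog.
    pool = [e for e in catalog if e not in seen]
    picks = []
    while pool and len(picks) < n:
        exploit = [e for e in ranked if e in pool]
        if exploit and rng.random() >= epsilon:
            choice = exploit[0]
        else:
            choice = rng.choice(pool)
        picks.append(choice)
        pool.remove(choice)
    return picks
```

A user with no click history has no scored candidates, so every pick falls through to the uniform branch, which matches the "more-or-less random" behavior described above for the early stage.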

[1] https://news.ycombinator.com/item?id=23541840

[2] https://findka.com/blog/essays-implementation/

[3] https://findka.com/biff

[4] https://findka.com/blog/essays-implementation/#recommendatio...

I really like the idea, but I'd be put off if most of the essays ended up being blog posts from the last few years.

I'd be more interested in being sent a selection of recommended classic essays, e.g. by people like Orwell or Montaigne. You could populate it with essays found on Project Gutenberg, or from high-quality magazines like the London Review of Books.

Also the examples are very tech focused. Are you planning to include other areas?

Most of the essays currently are indeed blog posts from the last few years. For now I'm planning to rely on user submissions for the catalog--if you have some favorite classic/non-tech essays, you could submit them (there's a URL input box after you sign in).

Before today there were about 60 submitted essays; we're up to 117 now. So the recommendations going forward probably won't be dominated by my own submissions, but they will likely still be tech-focused, given the (initial) HN/programmer user base. Eventually (after we have enough users/submissions/click data), Findka should adapt to your preferences quickly: even if most of the catalog is tech-focused, the algorithm will look at what essays you've submitted and select recommendations from the appropriate niches. But at this early stage, the recommendations will be sampled more-or-less uniformly from the entire catalog.

I'm excited to try this out. I'm a fan of essays as well and am always looking for high-quality content.

Awesome, let me know how it goes.

Love the idea. I think that a pure collaborative filtering algorithm has some limitations (cold start, etc.). Maybe a hybrid CF + content-based approach would be better suited to essays?
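To make the suggestion concrete, one way a hybrid could work is to blend a collaborative-filtering score with a content-similarity score, leaning on content while an essay has few clicks. The sketch below is an illustration under assumptions, not anything Findka does: the bag-of-words similarity, the `pivot` parameter, and all function names are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two word-count Counters."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def content_score(candidate_text, liked_texts):
    """Highest bag-of-words similarity between a candidate and any liked essay."""
    if not liked_texts:
        return 0.0
    cand = Counter(candidate_text.lower().split())
    return max(cosine(cand, Counter(t.lower().split())) for t in liked_texts)


def hybrid_score(cf_score, content, n_clicks, pivot=10):
    """Blend CF and content scores.

    The weight on the CF score grows from 0 toward 1 as the essay gathers
    clicks; at n_clicks == pivot the two scores are weighted equally.
    """
    w = n_clicks / (n_clicks + pivot)
    return w * cf_score + (1 - w) * content
```

With this weighting a brand-new essay is ranked purely on content, which is the cold-start case where pure collaborative filtering has nothing to go on.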

I'm skeptical that using content-based filtering will make much of a difference at this stage. If I were ingesting a lot of articles from some external source(s), content-based filtering would definitely be necessary. But I think we'll get better results from manual curation. As of now, about 90% of the essays in the database were submitted by me--so cold start shouldn't be a huge issue, at least for people who share my tastes.

But at some point, I'll definitely do some A/B tests with content-based filtering.
