
Fascinating. What did you do for scraping? I built something similar, but to do sentiment analysis on news (it is shamefully slow: https://maudlin.standingwater.io/). I used Scrapy to fetch the articles and write them to a Postgres database that a Flask app reads from.

I think I'm going to take some design cues from you and simplify my system. Word clouds are cool but too intensive for the $5 server I pay for.




The whole thing is written in Go on my end. Ingesting new headlines is handled in a goroutine that spawns within the process every 30 mins using a combo of the wonderful gofeed (https://github.com/mmcdole/gofeed) and colly (https://github.com/gocolly/colly) libraries.

When you load the front page, you get an HTML page that is cached for 1 minute. It is built from headlines already in my PostgreSQL database, which the ingestion goroutine put there.

I like the idea of word clouds actually, I think you're on to something there. For speed, I think you just need to pre-generate them rather than building them ad hoc on each request (if that's what you're doing here). Additionally, perhaps consider using sentiment to organize stories by whether they are positive or negative. Right now I am not seeing how I, as a visitor, can act on the sentiment analysis as presented.

It would be neat to see a collection of uplifting stories grouped together through the sentiment analysis.

Anyway, food for thought. I hope you keep hacking away on it as it's just good fun to build things.



