Was anyone else shocked to be reminded that a few of these top stories happened in 2023 and not years earlier?
I thought the Silicon Valley Bank failure was 2022. It feels like so long ago. Crazy how long 2023 felt (at least for me).
Also interesting is that the number of Twitter posts was cut almost in half this year. Despite what a certain CEO claims, all indicators seem to affirm that the popularity of that platform has decreased significantly in 2023.
Creator here. I had forgotten all about the bank collapse/contagion week until I put this together. I remember a lot of discussion about whether the actions were necessary, or whether they amounted to a bailout. By one measure, it's a great thing that SVB seems like a distant memory and not an ongoing concern.
When I first saw the twitter.com stories drop off like that, I thought it might be because of links going to nitter or similar, but that didn't seem to be the case. No telling what the Twitter diaspora will result in.
One of the other interesting domain dropoffs happened between 2019 and 2020 (not shown on that view) when nytimes.com seems to have received some type of penalty.
It says the number is a measure of users who posted their first comment or submission in 2023. So that, at least, is the correct way to read the graph.
That is interesting! Especially considering the true number of signups would be higher, since it would include users who signed up but have never posted anything. Although we don't know how many are spam accounts that signed up, posted, and got banned.
Creator here. I'd love to know the number of accounts created, but that is not in the official data and I couldn't figure out anywhere to get it. It should be possible to get this data by crawling profile pages, but there's no comprehensive list or way to find them. AFAIK there is also no exposed incrementing user id that might hint at how many users were created between two dates.
> It says the number is a measure of users who posted their first comment or submission in 2023.
This number should exclude users that had their first posts or comments flagged or removed, i.e. bots and spammers. I'd guess that there'd be a lot of these.
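For anyone who wants to reproduce that kind of count, here is a minimal sketch of the idea, not the creator's actual method. It assumes each item is a dict with `by` and `time` (Unix seconds) fields, plus optional `dead` / `deleted` flags for flagged or removed items; the function name and the sample data are made up for illustration.

```python
from datetime import datetime, timezone


def first_time_posters(items, year):
    """Return the set of users whose earliest live item falls in `year`."""
    first_seen = {}  # user -> earliest timestamp seen so far
    for item in items:
        # Assumption: flagged/removed items carry a truthy "dead" or "deleted" flag.
        if item.get("dead") or item.get("deleted"):
            continue
        user = item.get("by")
        ts = item.get("time")
        if user is None or ts is None:
            continue
        if user not in first_seen or ts < first_seen[user]:
            first_seen[user] = ts
    return {
        user
        for user, ts in first_seen.items()
        if datetime.fromtimestamp(ts, tz=timezone.utc).year == year
    }


def _ts(y, m, d):
    """Helper for the hypothetical sample data: UTC date -> Unix seconds."""
    return int(datetime(y, m, d, tzinfo=timezone.utc).timestamp())


sample = [
    {"by": "alice", "time": _ts(2022, 5, 1)},
    {"by": "alice", "time": _ts(2023, 2, 1)},
    {"by": "bob", "time": _ts(2023, 3, 10)},
    {"by": "spam", "time": _ts(2023, 4, 1), "dead": True},
    {"by": "carol", "time": _ts(2023, 12, 31)},
]

new_in_2023 = first_time_posters(sample, 2023)
```

One caveat with this approach: a user whose first item was removed but who later posted something that survived would be counted by the later item, so it only approximates "excluding spammers".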
And I thought I needed to cut down on my time and engagement on Hacker News. A lot of people seem to be spending way too much time here. COVID-19 got me sucked in here and I like this realm.
I actually made many more than that because I screwed up a few times. Might clean up the compiled data and put it on Kaggle (GitHub? Torrent?), or anywhere else if people have suggestions.
this is amazing -- thanks. side note: when you want to get more data you can just get the new data, and save it as hacker-news-part-2.parquet - then when you use duckdb / malloy you can read them as a glob (hacker-news*.parquet) - and it is as if you were reading one file.
Great tip. I used this little project as an exercise to learn some DuckDB and Parquet. Still thinking about the best tradeoff for file format. An extra complicating factor is that items can change after they are posted, so to get the most recent data, it's a good idea to re-request at least some range of historical items.
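The "items can change" part could be handled with a simple merge by id, where the re-requested copy replaces the stored one. A rough sketch in plain Python, assuming each item is a dict with a unique integer `id`; the function name, the `score` field, and the sample values are only illustrative.

```python
def merge_refreshed(existing, refreshed):
    """Combine two lists of item dicts; a refreshed copy wins over the old one."""
    by_id = {item["id"]: item for item in existing}
    for item in refreshed:
        by_id[item["id"]] = item  # overwrite stale copy, or add a new item
    # Return items in id order so the output is stable between runs.
    return [by_id[item_id] for item_id in sorted(by_id)]


# Hypothetical data: item 2 gained points since the first crawl, item 3 is new.
old_items = [{"id": 1, "score": 10}, {"id": 2, "score": 3}]
new_items = [{"id": 2, "score": 42}, {"id": 3, "score": 1}]

merged = merge_refreshed(old_items, new_items)
```

The same "latest copy wins" rule would need to be applied on the query side if the refreshed range is saved as an extra Parquet part, since a glob read would otherwise return both versions of an item.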