Hacker News
Show HN: A dataset of all HN submission texts (2006-2024) in Markdown
(huggingface.co)
1 point by shutty 7 months ago
At nixiesearch.ai we're building yet another search over HN, but we found no public datasets of the actual submission texts, so we scraped one!
TL;DR: 2.1M texts; around 55% of all stories are still available online.