I extract the title, headings (h1,h2,h3), and some meta data from the page conte...

PaulHoule · on May 12, 2023

So that is how you deal with the length limit? Did you just make up a list of tags?

kirubakaran · on May 12, 2023

Yes, I played around with sending the first n chars of the web page text etc., but found that sending the headings is enough to pick the tags.
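To make the idea concrete, here is a minimal sketch of building a short input from the title and headings instead of the full page text. The function name `buildTaggingInput` and the `maxChars` cap are hypothetical, not taken from the actual extension, and the meta data mentioned above is left out.

```javascript
// Hypothetical helper: collect the page title plus h1-h3 text as a compact
// payload, so the request stays well under any length limit.
function buildTaggingInput(doc, maxChars = 2000) {
  const headings = Array.from(doc.querySelectorAll("h1, h2, h3"))
    // Collapse runs of whitespace and drop headings that are empty.
    .map((el) => el.textContent.trim().replace(/\s+/g, " "))
    .filter((text) => text.length > 0);

  // Title first, then the headings in document order, one per line.
  const lines = [doc.title.trim(), ...headings];

  // maxChars is an assumed safety cap, not a documented limit.
  return lines.join("\n").slice(0, maxChars);
}
```

In a browser this would be called as `buildTaggingInput(document)`; anything that exposes `title` and `querySelectorAll` works the same way.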

I used the list at https://lobste.rs/tags as the starting point. I spend a lot of time on HN haha, so I was able to expand on that list, and I think the current list is pretty comprehensive. I can share the full list if you're interested.

suvasco · on May 13, 2023

I'd love to see the full list. Also, is there a way to filter posts using the Chrome extension to exclude some tags?

kirubakaran · on May 13, 2023

Excluding via browser extension is doable. We'd need to:

1. either add a UI element to each tag to let the user exclude it, or create a text input, perhaps at the bottom of the page, where the user can enter the tags they want excluded

2. save the above in local storage

3. after the tags for the page are fetched from the backend (see https://gitlab.com/histre/hn-tags/-/blob/main/tags.js#L60), loop over the stories and hide the ones that have any of the excluded tags
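A rough sketch of those three steps, in case it helps anyone pick this up. Everything here is an assumption rather than the extension's real code: the storage key, the function names, and especially the shape of the backend response, which is taken to be an object mapping story id to an array of tags.

```javascript
// Hypothetical storage key; the real extension may use something else.
const STORAGE_KEY = "excludedTags";

// Step 1: turn the text input ("crypto, bitcoin , AI") into a clean tag list.
function parseExcludedTags(input) {
  return input
    .split(",")
    .map((tag) => tag.trim().toLowerCase())
    .filter((tag) => tag.length > 0);
}

// Step 2: persist and restore the list. `storage` is anything with the
// Web Storage interface, e.g. window.localStorage.
function saveExcludedTags(storage, tags) {
  storage.setItem(STORAGE_KEY, JSON.stringify(tags));
}

function loadExcludedTags(storage) {
  const raw = storage.getItem(STORAGE_KEY);
  return raw ? JSON.parse(raw) : [];
}

// Step 3: given the fetched tags (assumed shape: { storyId: [tag, ...] }),
// return the ids of stories carrying at least one excluded tag.
function storiesToHide(tagsByStory, excludedTags) {
  const blocked = new Set(excludedTags);
  return Object.keys(tagsByStory).filter((id) =>
    tagsByStory[id].some((tag) => blocked.has(tag))
  );
}
```

In the page, the returned ids could then be hidden with something like `document.getElementById(id).style.display = "none"`, assuming each story row carries its story id as the element id.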

PRs welcome :-)

Here is the full list of tags as of now:

  a11y, acquisition, ai, algorithm, android, announce, api, apl, art, assembly, audio, auth, bitcoin, book, browser, c, c++, clojure, cogsci, compiler, compression, compsci, cryptocurrency, cryptography, css, culture, database, debugging, design, devops, distributed, dotnet, drugs, economy, editor, education, elixir, elm, emacs, email, energy, environment, erlang, ethereum, event, exploit, finance, fortran, freebsd, games, geography, golang, graphics, hardware, haskell, health, hiring, historical, interview, investment, ios, ipv6, java, javascript, job, julia, knowledge, kotlin, language, law, layoff, legal, linux, lisp, lua, mac, math, medical, ml, mobile, music, netbsd, networking, news, nix, nodejs, nuclear, openbsd, opensource, osdev, parallel, pdf, performance, perl, person, philosophy, php, physics, plt, politics, practices, privacy, productivity, programming, prolog, psychology, python, release, research, reversing, rss, ruby, rust, scala, scaling, science, security, shell, show, slides, space, startup, swift, systemd, tesla, testing, transcript, twitter, unix, video, vim, virtualization, visualization, wasm, web, webapp, windows, zig