> This by far is my biggest pet peeve of reading online: an article can't even be shown in full. It has to be broken up into disembodied sections, with ads, newsletter signups, or a block of links appearing every few paragraphs.
Absolutely. I'm actually a little worried that with the growing popularity of utility-first CSS and build tools that replace semantic information on HTML elements with randomly generated values, it's going to be increasingly difficult to have filter lists remove these.
Currently the selectors in Fanboy's Annoyance List target semantic class names and attributes, e.g. "newsletter-widget", but already there are a lot of sites with no semantic info around the sections that aren't part of the content. The BBC website is one example: on many articles it inserts a list of related links between paragraphs of text, e.g. https://www.bbc.com/news/uk-57464097. This is the HTML markup:
<div data-component="unordered-list-block" class="ssrcss-uf6wea-RichTextComponentWrapper e1xue1i84">
  <div class="ssrcss-18snukc-RichTextContainer e5tfeyi1">
    <div class="ssrcss-1pzprxn-BulletListContainer e5tfeyi0">
      <ul role="list">
        <li>[related link 1]</li>
        <li>[related link 2]</li>
        <li>[related link 3]</li>
      </ul>
    </div>
  </div>
</div>
These are just related links inserted at arbitrary points in the content, but there is no class name or attribute value that clearly marks them as such, which makes automatic removal difficult.
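For comparison, here's roughly what element-hiding rules look like in uBlock Origin / Adblock Plus syntax (illustrative examples, not actual entries from Fanboy's list):

! Easy case: a semantic class name makes the element trivial to hide
##[class*="newsletter-widget"]

! Best available handle for the BBC block above: the generic data-component attribute
bbc.com##div[data-component="unordered-list-block"]

The catch with the second rule is that if the BBC uses the same data-component value for ordinary bullet lists that are part of the article body, it would hide those too, which is why a list maintainer couldn't ship it without getting breakage reports.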
> If that changes, I still have the option of going to FF's "reading view" and Print to PDF from there.
FF's reading view does well on a lot of articles. It handles the BBC case above quite well, but struggles with others (see, for example, the screenshot of the newsletter signup in our article).
I'm not a web developer (more of a hobbyist), so I'd never used tools like Webpack to run 'builds'. Someone recently explained this phenomenon to me when I asked about the nonsensical class names I was seeing on some websites.
I now understand that those machine-generated classes are designed to prevent styling conflicts on modern websites where multiple developers' work may appear on a single page. But with adblockers currently set up to detect cruft via class names, I'm thinking that the clock is ticking on how long these lists remain accurate.
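For anyone else who hadn't run into this: those class names are generated at build time, typically from a hash of the styles or the file they live in, so a name the developer chose meaningfully ends up opaque in the shipped HTML. A rough sketch, where the source-side name is made up and the output side is copied from the BBC markup above:

What the developer writes (hypothetical name):
<div class="related-links">…</div>

What the build emits, and what the filter list has to match:
<div class="ssrcss-1pzprxn-BulletListContainer e5tfeyi0">…</div>

Because the hash changes whenever the underlying styles change, even a rule that targeted "ssrcss-1pzprxn-…" today could silently stop matching after the site's next deploy.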