Hacker News
Show HN: A userscript that adds archive URLs below the paywalled HN submissions (github.com/mostlyemre)
63 points by MostlyEmre on Nov 29, 2022 | 46 comments
This userscript adds archive URLs to the metadata section of HN submissions without breaking the immersion. Here are 2 screenshots: https://imgur.com/a/PdUu6oG

GreasyFork: https://greasyfork.org/en/scripts/452024-hacker-news-anti-pa...

Source code: https://github.com/MostlyEmre/hn-anti-paywall

Now let me overexplain.

-Why?-

I never liked paywalled articles. I understand where they come from, but I don't like where we cross our paths.

This is why I don't use major news aggregators anymore. Instead, I spend my "catching-up-with-the-world-time" on Hacker News. However, Hacker News (HN) also has its fair share of paywalled articles. (Around 11.6% according to my short-lived, half-assed attempt at measuring it. See my super old data at https://hpa.emre.ca/; I tell the story below.)

-First try-

Around a year ago, when I ran the above experiment, my goal wasn't to run that experiment. During my self-teaching & career-changing process, I decided to build a React HN clone. To make it stand out from the bunch, I added a paywall feature. It would detect paywalled articles and add an archive URL to the metadata.

The issue with archiving is that unless someone has already archived the link on the {archiving-project}, the link is most likely not archived. So me sending people to those projects meant nothing. It kinda meant something for me from an ideological standpoint but I assume you are not me.

This rubbed me the wrong way. I decided to build a backend (see https://github.com/MostlyEmre/HN-Paywall-Archiver) that would scan the links, automatically detect paywalls in close to real time, and submit paywalled ones to archive.is for archival. I used Node.js, Firebase, and React. I was -still am- really proud because I believed it was doing public good in terms of digital preservation. Only 1 person needed to run this script to benefit everyone. As an extra, I was curious about how many paywalled articles were being shared, by whom, and at what time. So I also created some analytics functionality to gather the data, and later created a UI to present it.

HN-Paywall-Archiver was great but I stopped running the backend at some point, because I couldn't find a way to run my backend code continuously on some platform for cheap, or didn't try hard enough.

P.S. Recently I've been thinking of remaking this version with Cloudflare Workers.

-Hacker News Paywall Archiver Userscript-

After almost a year, I got into userscripts. Super great super awesome concept. People seem to hate JavaScript unless it is presented as a userscript. So I decided to get my hands dirty and create a simple solution that solves the paywall issue on HN without breaking any hearts.

My solution is not perfect as it had to be simple. But here's the rundown.

Pros:

- Does not beg for attention.

- Simple code, simple concept.

- Unintentionally indicates which submissions are paywalled without you interacting with anything.

- Not-yet-archived archive links can make you feel like you are contributing to society after you click on the "archive this URL" button on the project page.

- Uses HN HTML defaults, so I hope it plays well with the HN skins/plugins/userscripts you use.

Cons:

- It doesn't automatically archive the links.

- It uses a clone of a static list of paywalled websites sourced from a popular Chrome extension. (https://github.com/iamadamdev/bypass-paywalls-chrome/blob/ma...) So changing the paywall list is slow and manual.

- No guarantees of archived links actually having the archive readily available for reading. Though there are currently 3 projects added, so it should be enough for most links.
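
To illustrate the static-list approach from the cons above, here is a minimal sketch of checking a submission's hostname against a list of paywalled domains. The list contents and the name `isPaywalled` are made up for this example; the actual script's list and function names may differ.

```javascript
// Hypothetical, shortened list for illustration only.
// The real script clones a much longer list from the Chrome extension.
const PAYWALLED_DOMAINS = ["wsj.com", "ft.com", "nytimes.com"];

// Returns true when the URL's hostname is a listed domain or a subdomain of one.
function isPaywalled(submissionUrl, domains = PAYWALLED_DOMAINS) {
  let host;
  try {
    host = new URL(submissionUrl).hostname.toLowerCase();
  } catch {
    // Not a parseable absolute URL (e.g. a relative "item?id=..." link).
    return false;
  }
  // Match the exact domain or a dot-separated subdomain, so that
  // "notwsj.com" does not match "wsj.com".
  return domains.some((d) => host === d || host.endsWith("." + d));
}
```

Matching on the parsed hostname rather than searching the whole URL string avoids false positives when a listed domain only appears in a path or query parameter.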

So, there you go. I hope you enjoy it. It can break occasionally due to changes in the news.ycombinator code. If you let me know on Twitter, I can fix it ASAP. Otherwise you have to wait until I notice that the script is broken, which can take quite a while as I browse HN on mobile.




Heads up: this is vulnerable to cross-site scripting [1]. If someone submits a link like:

    https://example.com"><script>alert(1)</script>

Then simply viewing the Hacker News index page with this extension installed will let the submitter execute whatever JavaScript they want in your logged-in Hacker News context - no user interaction necessary.

[1]: https://github.com/MostlyEmre/hn-anti-paywall/blob/main/scri...


I can't thank you enough for pointing that out. Appreciated. I looked into it and pushed an update that should hopefully fix it. I no longer use innerHTML; instead I now generate links properly via createElement, appendChild, all that jazz.

GreasyFork build is also updated. I recommend that anyone who installed the userscript (thanks!) update.
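
For anyone following along, here is a minimal sketch of the difference between the two approaches. It is not the script's actual code: `makeArchiveLink` and its parameters are hypothetical names, and the document is passed in as an argument only to keep the helper easy to test.

```javascript
// Hypothetical helper: builds the link through DOM APIs instead of an HTML string.
function makeArchiveLink(doc, href, label) {
  const link = doc.createElement("a");
  link.href = href;          // set as a property, so it is never parsed as markup
  link.textContent = label;  // inserted as plain text, not as HTML
  return link;
}

// The unsafe pattern this replaces: concatenating the URL into an HTML string
// lets a crafted URL close the attribute and inject its own tag.
//   container.innerHTML += '<a href="' + url + '">archive</a>';

// In a userscript, usage might look like this (subtextNode is an assumed name
// for the element holding the submission's metadata):
//   subtextNode.appendChild(makeArchiveLink(document, archiveUrl, "archive"));
```

Because the URL is only ever assigned to a property of an element that already exists, a quote or angle bracket in it stays part of the URL value and cannot start a new element.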


Does HN allow links like that?


it does... but it's urlencoded so unsure if vulnerable with this user script: https://news.ycombinator.com/item?id=33796527

still a good idea to patch though
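
A minimal sketch of why percent-encoding would defuse the example payload. Whether HN applies exactly this encoding is an assumption; `encodeURI` is used here only as a stand-in, and `canBreakOutOfAttribute` is a hypothetical helper name.

```javascript
// The payload from the comment above.
const submitted = 'https://example.com"><script>alert(1)</script>';

// encodeURI percent-encodes the double quote and angle brackets.
const encoded = encodeURI(submitted);
console.log(encoded);
// https://example.com%22%3E%3Cscript%3Ealert(1)%3C/script%3E

// Hypothetical check: can the string close an href="..." attribute or open a tag?
function canBreakOutOfAttribute(s) {
  return /["<>]/.test(s);
}

console.log(canBreakOutOfAttribute(submitted)); // true
console.log(canBreakOutOfAttribute(encoded));   // false
```

This only holds if the script uses the encoded value as-is; decoding it before building an HTML string would bring the problem back, which is why patching is still worthwhile.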


a) I'd never run an extension like this.

b) for the auth, yeah, probably a good idea to patch

c) for HN, probably a good idea to sanitize those inputs!


Does anyone remember "BugMeNot" ?


It still exists and the logins often work, at least for me


I thought it became less useful when it no longer supported pay-walled sites, then it disappeared and stopped working altogether... :-/

It probably got heavily abused and the sites ended up disabling the working accounts that were useful.


There's a “metadata section” on HN submissions?


I didn't know what to call the section where HN shows comment count, submission time, author, domain name, etc. So by "metadata section" I meant the section that holds all the info regarding the post.


Imo these types of things, while probably appreciated by many, lead to cat-and-mouse games and probably ultimately to hard paywalls being more widely adopted (I see them a lot already in fact). So what happens then? The utility of these paywall bypass options diminishes until there’s little of value left behind soft paywalls.


should be the default


OK, I need to inquire about `passTheButter()`. A hat tip to "GreasyFork"? Or perhaps this is about "sliding" past the paywalls?


It is supposed to be a Rick and Morty reference. [1] But I must say "GreasyFork" makes a lot of sense haha.

[1] https://www.youtube.com/watch?v=3ht-ZyJOV2k


That's cool and all, but I'm probably going to make something to hide it. I also hate paywalls, but (almost) all of them are so easy to circumvent that I usually do it in the inspector.


I'm confused, you'd make something to hide the results of a userscript you sought out and installed in your own browser?


Oh sorry, I skipped a step - I was referring to the emerging trend of people adding an archive link as the first comment on HN for many stories, which this is going to accelerate.


Those comments are pinned to the top by the forum with replies disabled. It appears to be a manual process, not automated by user, link, etc based on the comments being pinned.


Indeed, and this is what I would like to hide as it's a distraction for me. I think it would be smarter to put a link in the header instead of messing up the comments.


This feels very ethically icky to me. Folks are working hard to write these articles, and need to get paid.

If you don’t like paywalled articles just don’t read them, I don’t think it’s ethically sound to do this.

Just my $0.02


Maybe they should sort out their pricing model. I'm not going to pay 20 different $10/month subscriptions to different papers so I can read HN and participate in the discussion.

Maybe I would pay something like a Spotify equivalent of 10-15 dollars a month to get access to all of these news websites, maybe even a limited number of articles as I don't browse them without being sent to them from another website, but nothing like that exists. So, they get zero dollars from me instead of some finite amount.

There are also some papers that I particularly enjoy and would like to support them but are just priced too prohibitively.

One of these is the Financial Times, I enjoy their writing a lot, but the digital only plan is $40/month, that's just way too much for someone who doesn't work at hedge funds or SV tech companies.


> Maybe I would pay something like a Spotify equivalent of 10-15 dollars a month to get access to all of these news websites, maybe even a limited number of articles as I don't browse them without being sent to them from another website, but nothing like that exists.

It does, or did… Apple News for instance, previously Texture from Next Issue Media which rocked till acquired and even for a while after.

Unfortunately, publishers keep withdrawing from those things after they're done using the software's popularity to gain addicts. So now you have to "bring your own" subscription to the Apple News app for those publishers that withdrew. It's bizarre.


Maybe Apple should start hiring journalists.


All of your points are completely fine. I think it boils down to "you are too expensive so I'm not going to read your articles". This is completely fine.

I think the problem is "You are too expensive so I'll just take what I want for free". That doesn't feel like a morally defensible position. They are allowed to charge what they want. We are allowed to choose to patronize a publication or not.

I personally use Apple News for this, but of course it doesn't have everything I would like.


Practically speaking, somewhere near zero people were going to subscribe to the Wall Street Journal, but now won't because an archive link gets auto-added to a link aggregator and they don't need to go to archive.ph themselves and search for the URL, or wait two minutes for the top-voted comment to be an archive link added by a human.


I guess I fail to see how this makes it better?

"I'm not going to pay for it, so I should get it for free". I don't particularly see how any of that makes it more ethical.


If you're gonna get all ethical, surely you have also heard how hard it is to get out of those subscriptions?

How about when an article gets changed (or removed)? Without archiving sites, how would you know?

At the end of the day, if it's "actually" valuable to someone (e.g. their livelihood depends on it or something), they will probably pay for it.


Yup! I'm very careful about my subscriptions. But if I don't agree with the business model, I just don't consume the product. I don't see it as my place to arbitrarily dictate how others should be paid for the fruits of their labor.

Maybe better put. Someone else being unethical doesn’t absolve me of ethics, imho.


"Someone else being unethical doesn’t absolve me of ethics" - Fair point!


Flaws with this idea:

- I don't want 100 different subscriptions. Some newspapers want you to subscribe when you visit them for the first time ever, but beyond an individual news story I have no reason to subscribe to the Tinytown Commercial Appeal & Gazette and am unlikely to ever consult it again.

- An amazing number of paywalled articles are just reprints of wire service stories or rewrites of press releases/court documents. I am not gonna pay you for what is already public.

- What about ads? Yes, I am willing to disable my ad blocker...if you have a reasonable ad policy. If you have more advertising than content, and it's animated or insists on getting in the way of the news outlet's user interface, you are destroying your own product. Don't come at me with 'everyone else does it so we have to as well.' Everyone else does not do it, and if you cared about what you were doing you wouldn't either.

I am a poorly paid journalist who gets by on donations, and I don't abuse my readers with ads. Would I like more money, sure. Am I willing to sell my readers' time and attention for that, no. I loathe advertising as a consumer and I do not believe that the only viable business models are subscriber lock-in or cognitive pollution.


That's fantastic! Are you saying we should force that model on every organization? Or we should not support them until they are good with that model? Or we should steal their content until they adopt your model?


No, I'm telling you why I don't lose any sleep over circumventing paywalls. Many news outlet owners think they run an advertising company with news as the bait to bring people to the billboard. I consider this terrible for society and humanity in general, so I withhold my money. I do pay for products I consider of good quality, but much internet news is not.


You withhold your money and still consume the product?


Ditch the lawyer act. You understood my plain language just fine.


I agree and did not in the past. It's easy to see and judge things from just one convenient/aligned perspective...


I agree that it's a grey area, but how is this different than using an ad blocker?


I get to control what code my computer runs.

They get to decide if the site loads at all if I muck with that.


or heaven forbid with disabled JS or that other heretic method of using reader mode! gasp!


A whitelist maybe, that is locally stored? Would probably be a good idea. But what you say is correct: just like the software devs in here who want to get paid, they too want to get paid. It's strange that we often see here, "allow us to pay money and don't show us ads", yet people complain about paywalls too. But sharing a paywalled link is probably not beneficial for knowledge sharing? Other than all this, it helps with deadlines at least.


Paywalled articles should be banned on HN in the first place.

The fact that they even have the slightest bit of exposure on this site is the real icky thing.


I disagree, but IMHO, that's a totally morally sound stance.


I don't think the average HN reader, upon encountering a paywall, is the type to reach for their wallet.

The alternate case here is that the article simply doesn't get read/discussed as much, not that the author actually receives more pay.


Needs to show a project license. Otherwise, pretty cool!


There’s something rather amusing about this comment on a project that’s about stealing content.

(yes, I get that there’s a whole realm of nuance and differing opinions and moral grey areas if you squint enough. I just find it amusing is all)


That’s stretching the word stealing pretty far I feel.


Ah, I thought adding a badge or mentioning the license in the Readme was enough. Thanks for giving a heads up! Just created a LICENSE file.



