Bypass Paywalls Clean for Firefox (github.com/magnolia1234)
98 points by joker765 on March 4, 2020 | 81 comments



I am not sure if I want to use this but I am happy to see an extension like this.

Of course the sites in question are allowed to try and raise money from me, and I feel I am allowed to try to dodge this.

More importantly, however, I feel that these serious news sites, with their current behavior, are contributing to the proliferation of fake news: people keep on reading news, but since they cannot read serious news sites they are forced to read the gossipy, manipulative and spammy propaganda of Facebook and other crappy sources.

I contribute money to The Guardian because of this: they vowed to keep their serious news site open, and I reward them with a contribution for it. I hope more people will do so for the same reason.


I have mixed feelings about this. My reaction when faced with a paywall is to simply click "back" - if you don't want me to read your content, that's fine by me. You don't expect to read the full text of (say) Harry Potter from an HN link, and I'm good with that, and I paid for my own copy. But a page that looks like content and then won't show it is bait-and-switch, so no thanks.


I approve this observation.


What I really do not like about this fork is that it requests permission for all sites while the original version from iamadamdev only requests permission for sites the extension actually affects.

They briefly switched to all access (like the linked "bypass Paywalls Clean" above) but came around pretty quick to go back to their less invasive model.

See this issue thread on the original repo: https://github.com/iamadamdev/bypass-paywalls-firefox/issues...
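The permissions difference described above can be made concrete with a minimal Python sketch of how a host-permission list limits which pages an extension may touch. The matcher is a simplified stand-in for browser match patterns, and the domain names and the two permission lists are hypothetical placeholders, not either extension's actual manifest.

```python
from urllib.parse import urlsplit


def host_allowed(url: str, patterns: list[str]) -> bool:
    """Return True if the URL's host is covered by any pattern.

    Simplified rules (an assumption, not the full browser spec):
    "<all_urls>" covers everything, "*.example.com" covers example.com
    and its subdomains, and any other pattern must equal the host.
    """
    host = urlsplit(url).hostname or ""
    for pattern in patterns:
        if pattern == "<all_urls>":
            return True
        if pattern.startswith("*."):
            base = pattern[2:]
            if host == base or host.endswith("." + base):
                return True
        elif host == pattern:
            return True
    return False


# Hypothetical permission sets, for illustration only.
LIMITED = ["*.news-site.example", "*.paywall-script.example"]
BROAD = ["<all_urls>"]
```

With the limited list, a page on an unrelated host is out of reach; with the broad list, every page is in scope, which is what makes features like user-defined custom sites possible and also what makes the permission prompt more invasive.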


Permission for all sites is needed for adding custom sites. But I also saw a request for a release with limited permissions (losing the custom sites feature).


A second, limited-permissions release has been added. Custom sites do not work in it, but now users have a choice.


I see General Paywall Bypass entry in the addons setting, wouldn't that require access to all domains in order to work?


You're confusing this with the old repo (its General Paywall Bypass was crudely copied from the new repo, though). The Clean one has the option to block 4 general paywall scripts (TinyPass, Piano, Poool.fr and Outbrain). This only requires permission for those 4 sites (also on the old repo).


Yeah sorry, I should have specified which repo. I'll stick with the original version for now.

Hopefully this quarrel between the two devs can be settled without requiring a permanent hard-fork.


Well, good luck ;-) The new repo supports a lot more European sites though, and custom sites is very useful to me.


Is this based on the plug-in by iamadamdev?


Yes, major committer switched to clean version. Iamadamdev is just copying his fixes lately ...


Thanks for making this! :D


what's "clean" about this?


Apparently the original extension has Google Analytics embedded in it: https://news.ycombinator.com/item?id=22482333


I've just searched that and it appears that the Chrome version does have a GA embed, the Firefox version does not.

Here is the issue someone filed for it on the original repo: https://github.com/iamadamdev/bypass-paywalls-chrome/issues/...


Adam's Firefox add-on supports fewer sites (even compared to his Chrome extension) and has lots of bugs (even more after reduction of permissions) ...


I don't know. Advertising a fork as "clean"ed from something the original project (the ff extension) never had feels a bit disingenuous, marketing-wise.

I'll stick with the old one as it requires fewer permissions. It seems to work as expected.

You seem to be very eager to point out the superiority of the fork. Are you involved in the project?


Well, good luck sticking to an 'inferior' repo. It's a free world after all ;-) Btw, the Clean postfix comes from the Chrome extension. I'm just informing users of the new repo, which also has a limited-permissions release (custom sites not working then).


...so you are part of this project, right?

That's not a big deal, of course. Usually quite the opposite. But usually people here don't try to deflect the question either, so..

It sounds a bit bitter as well. I'd love to see those projects ending up competing in a productive way!


I'm a user and close observer of this repo. But you're done now 'judging' my intentions?


I questioned your intentions, I didn't judge them - as I can't know them. I did judge your responses though. That's a good thing and something we owe to each other in a meaningful conversation.


Sure, disingenuous and bitter isn't anything like judging ;-) But good luck with Adam's repo, you clearly don't know what you're talking about.


> Sure, disingenuous and bitter isn't anything like judging

This was about your responses, not your intentions. That's also why I said I judged your response, not your intentions.

Quote:

> I did judge your responses though.

How would I know your intentions anyway?

If you search for "good luck" in this HN thread you'll see what I'm referring to.


If you're 'afraid' to move to a new and better add-on then don't blame the messenger for it though.


Informing users of a better repo is key. Let the code speak for itself.


Refactored extension (no google analytics)/add-on with lots of new sites, bug-fixes, add custom sites and update-notification.

https://github.com/magnolia1234/bypass-paywalls-firefox-clea...

https://github.com/magnolia1234/bypass-paywalls-chrome-clean


Did the original extension have Google Analytics embedded in it somehow?


The original Chrome extension does indeed.


Any timeline for hitting addons.mozilla.org?


It never will, I guess (removed). But signed by Mozilla anyway.


Very useful extension indeed. Quick fixes and updates. Adding custom sites is a very interesting feature also.


For as much as I'd like to receive the utility of such a thing, I don't see how this is anything different than outright theft and therefore it should not be something promoted by any means.


Clearly, if the websites didn't want you to read the article, they'd not send the content to your computer.

I usually just F12 and remove the offending elements or their styling, so that I can read the article. There are only two reasons that's possible: Either the developer hated that they block information, and left a way through for people, or they're incompetent. I like to assume the best in people and believe the former to be the case.


It’s probably closer to price discrimination: they are obviously happy for people to read what they publish even without paying, as long as it’s only people who would not pay otherwise.

Publishers aim for a barrier that is just high enough so that people with less time and more money would rather pay than circumvent them.

It’s the exact same mechanism as supermarkets using coupons to grant people with lower financial means the opportunity to shop with them, knowing that their regular customers would die of embarrassment if their neighbors saw them clipping coupons.


Clearly, if the websites didn’t want you to download their customer db, they’d not let you inject SQL queries!


Is modifying the data that the server sends to my local machine as part of the server’s expected operation, and then me not retransmitting that data a problem? That’s how a user uses any accessibility feature (dark mode, screen reader mode, magnification/zoom, etc).


>Is modifying the data that the server sends to my local machine as part of the server’s expected operation, and then me not retransmitting that data a problem?

Why not? It’s pretty clear that you’re knowingly doing this in order to exceed authorized access. Why is this different from exploiting a vulnerability?

Accessibility features are not intended to defeat access controls.


If that’s the case, then browsers allowing people to view and manipulate the source of a web page (both initial download and XHR requests) are aiding in that process. If the ask is to walk back decades of being able to do exactly this sort of local manipulation of received data, then this form of the Internet would be like cable or satellite television with permitted forms of interactivity.

Why bring content to the Internet if there is desire to change the rules and norms of how the Internet works? Why not roll a separate version of the Internet with those desired rules? Yes, laws exist to protect copyright, prevent theft, and forbid illegal access, but those laws are insufficiently aware of how the Internet operates with respect to things like client-server communication. In fact, the design of the Internet predates many (if not all) of these laws, and the specifications of how the Internet works are a de facto law expressing how clients and servers interact with one another. Should lawmakers give requirements to the W3C to specify desired and undesired operations on the Internet?

I genuinely ask these questions in good faith: they are simply follow-on thoughts of mine from taking your point as a premise. If my ideas seem extreme or exaggerated, then that’s simply where my mind goes first as opposed to necessarily being criticism.


Except, they're sending all the content to me, I show only a subset of it (that is, the content that they sent me, instead of the nag-box they intended to go on top of it).

I actually thought of making a plugin to do this, but decided against it, since I'd rather just keep reading my free articles than make it so easy for everyone to do that they'd be forced to remove that feature, which is a likely consequence of this repository.


You’re intentionally defeating (weak) access controls intended to keep you from reading the article without paying.

Would it be okay for you to exploit a SQLi bug on the website to access the articles just because it’s easy?


That's an improper use of the term, so I think the analogy falls apart.

You don't "inject" queries, you submit queries as normal. You're "injecting" what you know to be code that will execute, where no code is even designed to execute.


> Clearly, if the websites didn't want you to read the article, they'd not send the content to your computer.

If you use a news website (say the NYT) without plugins like the one linked, it’s blatantly obvious that this statement is not true.

So, if you go out of your way to conceal who you are, or to masquerade as someone/something else (e.g. Googlebot), in order to trick a web server into sending you bytes that you know it wouldn’t otherwise send you, don’t be surprised if some people accuse you of stealing.


You aren't "tricking" a web server. This is how the web works by design. Cookies were the kludge, and no, telling me I should maintain state to make your attempts at financialization easier does not give you a moral high ground, especially when you're assuming my hardware is there for your use. That was never, and never should be, the case, and I will fight to the pain against anyone that asserts otherwise. My computer is not your developer's playground. I already suffer through enduring ads when I don't feel like spreading my personal info to the 4 winds. One should be happy to get that much.

Those living in glass houses are best advised to refrain from casting stones and all.


The description says "referer set to Google", meaning it's just disguising you as someone who clicked on a Google search result.

This begs the question if altering your browser's User-Agent is theft since some sites actually alter their prices depending on which browser or OS you use it on. Like, if a Mac/Safari user is expected to pay $99, but your Windows/Chrome User-Agent gets you a $89 deal, are you stealing $10?
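To illustrate the point about Referer and User-Agent, here is a minimal Python sketch of server-side logic that makes decisions from request headers. Both function names and the exact matching rules are assumptions for illustration; only the $99 and $89 figures come from the comment above. The key property is that every value inspected here is supplied by the client, so the server has no way to verify it.

```python
def shows_full_article(headers: dict[str, str]) -> bool:
    """Hypothetical check: grant access to search visitors and crawlers."""
    referer = headers.get("Referer", "")
    user_agent = headers.get("User-Agent", "")
    came_from_search = referer.startswith("https://www.google.")
    looks_like_crawler = "Googlebot" in user_agent
    return came_from_search or looks_like_crawler


def quoted_price(headers: dict[str, str]) -> int:
    """Hypothetical price discrimination keyed on the User-Agent string."""
    user_agent = headers.get("User-Agent", "")
    is_mac_safari = "Macintosh" in user_agent and "Safari" in user_agent
    return 99 if is_mac_safari else 89
```

Whether sending a different header value counts as deception is exactly what the thread disagrees about; the sketch only shows that the server's decision rests on self-reported data.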


The website is sending out text for searching on though, making it look as if it is available for reading freely, and then withholding it once you start reading it.

In some cases they let you start reading it then swipe it away once you are invested in the article, which seems dishonest.

I don't know about the legal aspect, but morally I am fine with bypassing restrictions under those circumstances.

I don't have a problem with paywalls provided the companies involved are completely up-front about it (e.g. putting their entire site behind a paywall, or offering some articles for free searching/reading and others only behind the paywall). That seems much better ethically, and also impossible to bypass.

I'm quite happy paying a subscription to the Guardian for news despite (or because of) the fact that none of their news is behind a paywall.


Copyright infringement is simply not the same as theft, since with theft the original owner loses their copy (one can look up both terms on Wikipedia for details). Claiming with a straight face that they are the very same is dishonest at best.


In common English usage, terms such as "theft" and "steal" and "rob" have more expansive meanings than they do specifically in a legal context.

For example the commonly accepted and used term "identity theft", which does not actually involve depriving the victim of their identity. Similarly, we commonly talk about credit card numbers or passwords or email addresses being "stolen" in a data breach, even though you still have them after they are "stolen".

Actually, even in legal contexts "theft" does not always require depriving someone of the thing stolen. Example: "theft of trade secrets". The person whose trade secret is "stolen" still has the information.


The fact common language is inaccurate does not excuse it from being wrong. Example: the phrase "begging the question" is often misused when "raising the question" is meant.

Identities and data can be copied, not stolen. All examples you mentioned are "theft <X>" or "theft of <Y>". At the very least the term "theft of copyright" is somewhat descriptive, and does not equate it with theft, but that was not the descriptive term which was used.


It's not copyright infringement either. They are basically looking at your user agent features, country or whatever to sell you a subscription; it's like upselling or even freemium. It's weird that people even entertain the idea of calling theft or copyright infringement the act of not going through someone's silly selling technique.


Let me rephrase: it is at worst (or allegedly) copyright infringement.


I don't like this claim. It is very semantic. With your train of thought we can go very far. How about breaking patents? How about stealing your password? None of them should be illegal because you don't lose anything physical.

Our law system sees theft as something that needs adjusting to technology. So I can say with a straight face that in 2020, theft does include reading an article that you didn't pay for.


I never said copyright infringement should be legal or endorsed (though it should be noted HN specifically allows posting workarounds for paywalls, but not complaints about them); just that it is something different than theft. Calling it theft is dishonest.

> How about breaking patents?

We don't call it theft; we call that patent infringement. We have patent laws for that (in at least US and EU and probably [almost] everywhere in the world).

> How about stealing your password?

We don't call that theft, we call it "breaking into computers" (I only know the technical term for it in my native language, not English). If you use a password to log into an account you don't have legal access to, this is breaking into computers. We have laws for that (in at least US and EU and probably [almost] everywhere in the world).

> Our law system sees theft as something that need adjusting to technology.

No, it does not, it has different laws for all the three examples in this discussion, two of which you brought up.

You clearly do not know or understand how your law works. The respective Wikipedia articles explain the differences.


I don't understand you. If you wanted to specify that each case (theft; copyright) has its own law then you should say that the comparison is "wrong/mistaken". But you said "dishonest".

I took it as you trying to justify something. Like saying that one is not as bad as the other. So I answered that they all can be sued for damages and are the same morally (given law as a base).


In my country (I can't speak for USA), copyright falls under civil law, and theft under criminal law.

The laws and punishments are different precisely because they are not the same morally.

You don't have to compare copyright infringement to theft to make your point. Copyright infringement can be proven to cause damage on its own merit, with its own law.

Articulating these are different does not equal saying the law is insignificant, that's a straw man.

Besides that, we're talking about alleged copyright infringement.

You need to start using the appropriate terminology if you want your point to be taken seriously.

If it does not interest you to be correct when you're speaking about law, please refrain from making statements regarding law.


OP isn't saying copyright infringement shouldn't be illegal, just that it's not theft.

Conflating depriving someone of their property with copying someone's work is bad -- it is not the same thing.


The problem is that the publishers want to have their cake and eat it too.

They’ll happily let Google index the pay walled articles but won’t let you read them for free, thus polluting search results.

The expectation is that anything you see on Google (aka what the Googlebot was able to access without authentication) should be accessible for free without login or signup. These websites break that expectation, making search result pages unusable.

Publishers want this behaviour, so they implement their paywalls in such a way that they trigger on browsers but not for search engine crawlers (usually because they are JS-based and simply obscure the content below it). This extension exploits that fact. Publishers are free to implement a true paywall where the server wouldn’t serve any content without auth (thus defeating this extension) but this also means restricting search engines, something they definitely don’t want.
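The contrast drawn above between a JS-based overlay and a "true" server-side paywall can be sketched in a few lines of Python. This is a toy model under stated assumptions: the response shape, the function names and the teaser length are all hypothetical, and real sites are considerably more involved.

```python
ARTICLE = "Full article text. " * 3
TEASER_CHARS = 20  # hypothetical teaser length


def overlay_paywall_response(is_subscriber: bool) -> dict:
    """Client-side paywall: the full body is always sent; only a flag differs."""
    return {"body": ARTICLE, "show_overlay": not is_subscriber}


def server_side_paywall_response(is_subscriber: bool) -> dict:
    """Server-side paywall: non-subscribers only ever receive a teaser."""
    body = ARTICLE if is_subscriber else ARTICLE[:TEASER_CHARS]
    return {"body": body, "show_overlay": False}
```

In the first variant a crawler, and anyone else, receives the whole article and the restriction exists only in how the page is displayed. In the second variant nothing beyond the teaser leaves the server, which also means a crawler without credentials can index only the teaser.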

Finally, the “ethical” option of paying for the content is also out of the question for many due to privacy concerns. Many publishers have Facebook, Google Analytics and similar spyware on all their pages regardless of whether you’re a subscriber, so they will still track you except now you need to pay and provide validated billing details.


I'm sure in the past search engines checked for this type of behaviour and punished it. If you give a special 'optimised' page to a bot, but give something completely different to regular people, then that's a failing of the search engine.


Like Quora?


It’s quite a feat of logical contortion to somehow claim a privacy motive to legitimize your unwillingness to pay. Unless, that is, you never buy anything online. Because Netflix and airlines and food delivery and Amazon all run tracking on their sites, and a single vendor with Google Analytics would presumably be enough to let them tie your browser to your identity.

In that case you’d still be excessively paranoid, but at least consistent about it.


I disagree. There is a difference between first-party tracking (e.g. Netflix) and third-party tracking. I’m well aware of first-party tracking and how it can be helpful. However, both third-party tracking and the sale of first-party tracking information are unconscionable in my opinion: therefore, rewarding that behaviour with currency is equally unconscionable to me.


I agree with you. However, if I was going to argue the counterpoint here, I’d note that I’m pretty happy with the effect that rampant piracy had in forcing adoption of sensible pricing in the music industry. Enough of this kind of stuff and maybe it’ll force content publishers onto micropayment platforms.


Are you talking in terms of album prices being reduced (have they??) or the ability to buy individual songs and essentially only pay for the parts of a musician's work you like? That encourages publishers to use their influence over the content creators to focus on churning out the type of music they know works (that pays the most), the same thing everyone else is creating, and it leaves those other songs, those that the artist might actually want to make and the reason they do what they do, a financial dead end. I don't think this is a good long-term plan and I am not happy that this is happening. I'd be equally unhappy if it happened to news, although maybe it's too late.


Unlike the music industry, journalism has been decimated, especially on the local level. Even success stories with world-wide audiences often barely break even. And unlike music, professional journalism is a necessity for a functioning democracy.

So I don’t quite see the hook for your Schadenfreude here.


It has been decimated but not so much by pirates, but more so by the publishers themselves creating unbearable experiences, starting to give away everything creating a certain expectation (and now trying to reverse but running into the expectations they created), by the publishers being complacent and reactionary, by social media, by social media companies abusing their network effects, by "free" 24/7 news channels eating the newspaper's cake while at the same time discrediting the entire profession with their more-often-than-not sensationalist low quality reporting, by entrepreneurs backing news site "ventures" which then blasted and keep blasting everybody with low quality "free" clickbait, by publishers starting to imitate those low quality channels and sites (and their "free as in data hoarding" unsustainable models), predatory/hostile takeovers of local newspapers by "big media" and the associated "consolidating" of writers staff with adverse effects on quality, by "stakeholders" shifting from a "I am OK with only a so-so profit; this stuff is important" attitude to a full blown "we need to maximize profits at all costs", and the list goes on.

I used to read the German Spiegel weekly, first the issues my dad bought, then later as I grew older buying their magazine. I stopped buying it years back, as it stopped being worth it, as the quality of the articles and investigative work plummeted, and article research turned into op-ed pieces.


I think calling 'peteretep’s emotion Schadenfreude is possibly incorrect [0]. I think all that 'peteretep is wishing for is a similar sort of evolution/enlightenment in journalism to that of the music industry.

We can all agree that professional and ethical journalism is a necessary part of a well-functioning democracy as you point out. We can also agree that journalism has been decimated, especially at the local level, as you also pointed out. The observation that 'peteretep made about music piracy and its effect upon how music is monetized today is a fact: today, we have multiple streaming services (Spotify, Apple Music, SoundCloud, among others) and music stores (Apple Music, Beatport, Bandcamp(?), among others). The situation in the music industry remains imperfect, but 'peteretep’s claim that music piracy combined with new technology acted in concert as a forcing function to move the music marketplace closer to where producers and consumers want it to be seems sound.

I for one consume the vast majority of my US political news from Hasan Piker knowing full well that he is a leftist [1]. He’s a full-time political commentator that streams on Twitch. Ever since I learned about him a few years ago, other political streamers have started streaming on Twitch and other platforms like YouTube. I bemoan the death of the local newspaper, but only because of the lack of coverage. Why can’t people with cellphone cameras stream a townhall or a city-planning meeting and let viewers decide the value of the livestream and any associated commentary? The long-celebrated journalistic institutions today have all in some way, shape, or form been “captured” by the political forces of the day. Did your organization report too much controversial news concerning the political sacred cows? Well then, you lose access, which is a death knell for such burgeoning organizations with high headcount and fixed costs. One last question: who hands out these press passes anyway because it’s not like you need a license in journalism to report the news?

[0] https://en.wikipedia.org/wiki/Schadenfreude

[1] http://twitch.tv/hasanabi


Musicians would disagree.


The only way to conclusively prove or disprove your claim would be to have several musicians/singers/songwriters open up their books to show where their wealth came from. From what I have heard over the years, artists and bands make a greater percentage of their wealth from doing concert tours (ticket sales) and selling merchandise than they do from actually selling music via record labels (I have no proof for this claim). My point is somewhat supported by the fact that some successful artists later go on to start their own record labels: if the relationship with the label was otherwise good and the issue was not money, then why make a significant capital investment in creating your own label?


Most of these websites are thieves too: they steal your data while still displaying as many ads as they can, and they want you to pay to access their content as well. Fuck that.


I canceled an honest-to-god subscription to a newspaper online because they decided that even if I was logged in it was necessary to subject me to dozens of 3rd party trackers.

If all they wanted was information about what articles were doing well it'd be... okay, sure I guess though I wish they didn't decide what to publish based on that. But knowingly allowing their paying subscribers (and we're talking over a hundred dollars a year here for web-only) to be tracked by third parties is going too far.

What's the point in paying for a subscription if they're going to sell my private data for pennies even when I'm giving them decent money for access? It's insulting. Now I just delete their cookies and read for free (and without tracking). I'd rather give them money, but after two years of paying for a subscription I don't feel too bad about it.

I made clear to them why I was canceling my subscription -- if I'm paying you I'm not okay with you double dipping to sell my personal information. People viewing for free, okay, sure, I get it you want to make a little bit at least. Don't subject paying customers to this though.

... I say as I place orders on Amazon.

Oh well. Being half way self-consistent is better than not at all.


> I canceled an honest-to-god subscription to a newspaper online because they decided that even if I was logged in it was necessary to subject me to dozens of 3rd party trackers.

I also have issues with this on a news site I pay for. If I pay, please quit tracking me. It provides an incentive to subscribe, and shows honesty in the matter.


I kind of assume it's because news websites don't exactly have the best web designers working for them. Might just be technically hard to make it so that javascript doesn't get loaded when someone is logged in.

But if they can't spend the effort to do that, I guess their loss. Or gain, I doubt they lose many subscribers over this. Probably they just figure no one cares and they're probably mostly right.


I’m pretty sure they hate the ads even more than you do. But margins in publishing are thin, and often negative. Nobody is earning money in that business, yet people are inventing this scenario of exploitation to have something to point to when they need to justify their acts to others or themselves.

Google ran a trial to let you opt out of ads and pay the difference, exactly what people in these online discussions have always proclaimed to be waiting for.

They got around a hundred takers. Turns out, nobody is willing to pay what it actually costs to produce the content they want to read.

Local journalism is already dead in many regions, and there’s a study out there showing that corruption tends to increase in these cities, costing several times as much as the local paper ever did, not even including the non-monetary damages of the hollowing out of society.


Google Contributor still exists today: https://contributor.google.com/v/marketing


Everyone just clears cache to reset the article counters anyway, this just makes it more convenient. If newspapers really want to lock down their content they are free to do so. If you’ve ever seen or used a Bloomberg terminal that is a great example of how to do it. However they know that will never work on a public scale as long as there is a single free source of news anywhere, so they don’t do that.
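The "article counter" mentioned above is typically a metered paywall. A minimal Python sketch, assuming the count is kept in a client-held cookie (the cookie name, the limit of 3 and the function name are hypothetical), shows why clearing browser state resets it:

```python
FREE_ARTICLES = 3  # hypothetical free quota


def handle_article_view(cookies: dict[str, str]) -> tuple[bool, dict[str, str]]:
    """Return (allowed, updated_cookies) for one article view.

    The counter lives in a cookie the client stores and sends back,
    which is an assumption made for this illustration.
    """
    seen = int(cookies.get("articles_read", "0"))
    if seen >= FREE_ARTICLES:
        return False, cookies
    return True, {**cookies, "articles_read": str(seen + 1)}


# Usage: three views are allowed, the fourth is not.
cookies: dict[str, str] = {}
results = []
for _ in range(4):
    allowed, cookies = handle_article_view(cookies)
    results.append(allowed)
```

Because the only record of past views is held by the client, a request arriving with no cookies is indistinguishable from a first-time visitor. Keeping the count on the server would require identifying the reader, which in practice means a login.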


Unless you have a source for everyone, please don't talk for everyone.


Paywalls are a dishonest idea born from the necessity to game page-rank algorithms.

There is nothing stopping these sites from creating login-only paid portals.

They don't. Why? Ranking and the benefits.

It seems like a "have my cake and eat it too" strategy, and I'm not too keen on supporting it, personally. Even more so given that many of these sites are simply AP 'spin&regurgitate' sites.

I'm waiting on big G to hit paywall sites harder in rankings, but I don't really see it in the cards. Here's hoping.


This.

Serious websites like the Financial Times just block you from reading articles altogether, without letting you first read the lede or anything.

They still give archivers access because it's an effective form of price discrimination, but they have complete control over what they're doing.


From a developer point of view, wouldn't it be more useful, ethical and financially effective to reach out to these websites and teach them how to implement paywalls right?


They want their articles to be indexed, but they don't want the end-user to see the content. You cannot have both effectively.


but they don't want to, because they want to keep their sweet page rank



