Help seed Z-Library on IPFS (annas-blog.org)
275 points by allenleee on Nov 23, 2022 | 141 comments


Be warned that IPFS is not anonymous, and the IP addresses of the nodes serving the content are quite easily obtainable.

And since the FBI already got involved and arrested people, this is quite the risky endeavor.


Another warning: IPFS is extremely resource intensive (from CPU to bandwidth).

I’m pretty disappointed they didn’t mention this in the article. If this gets any traction it will eat as much CPU and bandwidth as you give it.

Running with Docker and limiting CPU is necessary along with a bandwidth limiter/policer/shaper.

Having run IPFS for a long while it also ends up in strange states and stops responding to requests. I’ve taken to using a monitoring system to GET a known CID and restart the IPFS container if it fails.

My bandwidth and hardware have 100% uptime but IPFS needs a restart at least a couple of times a day on average (looking at logs).
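The watchdog described above (GET a known CID, restart the container if it fails) could be sketched roughly as follows. This is a minimal illustration, not the commenter's actual setup: the gateway address `http://127.0.0.1:8080`, the container name `ipfs`, and the placeholder CID are all assumptions.

```python
import http.client
import subprocess
import urllib.request

# Assumptions for illustration only (none of these come from the thread):
GATEWAY = "http://127.0.0.1:8080"   # assumed local HTTP gateway address
CONTAINER = "ipfs"                  # assumed Docker container name
KNOWN_CID = "<known-cid>"           # placeholder: a CID the node should always serve


def gateway_responds(cid, gateway=GATEWAY, timeout=30):
    """Return True if the gateway answers a GET for the CID with HTTP 200."""
    url = f"{gateway}/ipfs/{cid}"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeout, HTTP error, malformed response, etc.
        return False


def restart_container(name=CONTAINER):
    """Restart the container via the Docker CLI."""
    subprocess.run(["docker", "restart", name], check=True)


def check_once(cid, probe=gateway_responds, restart=restart_container):
    """Probe once; restart on failure. Returns True if the node was healthy."""
    if probe(cid):
        return True
    restart()
    return False
```

Calling `check_once(KNOWN_CID)` from cron or a timer every few minutes would approximate the behaviour described; the probe and restart functions are passed in so either can be swapped for whatever the monitoring system actually uses.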

In short: this isn’t practical.


There's a new project to make IPFS very performant (within limitations of the protocol, ofc). They've got some really great goals, hope they succeed.

TBH i love IPFS, and i wish i could get behind it and help.. but ever since they started making Crypto i have moved away from it. I have no way of knowing if they're working on IPFS or if it's merely a means to a Coin.


I agree. IPFS saw the fork in the road and decided to go down the unsuccessful path - there's still time to fix the bugs that matter, but they've wasted a lot of time on infrastructure that will never materialize.


You can just renice/set Low priority to the process for the CPU part, no need for Docker. Bandwidth limiting functionality should be built-in, not sure if it actually is.


nice/renice isn’t what it used to be[0].

Bandwidth limiting is not built in[1].

[0] - https://stackoverflow.com/questions/10342470/process-nicenes...

[1] - https://github.com/ipfs/go-ipfs/issues/3065


This is why the guide suggests using a VPN.


Filecoin might be a good solution for this. Basically pay other people to host it for you


filecoin is built on ipfs.


If you run an IPFS server, you don't know what's on it, do you?


You do. IPFS isn't Freenet, you're not contributing to some global pool of amorphous storage.

It's just like running a webserver, but with content-based addressing, the possibility of having multiple machines provide the same content and much more convenient mirroring than over HTTP.

If you start an IPFS node, it'll remain empty until you upload something, or pin (copy) somebody else's content.

I think your node might also cache requests you make to it for some time, but that's also something that's a result of you asking for it.
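As a rough illustration of the model described above, here is a toy content-addressed store. It is a sketch only: `ToyNode`, `add`, and `pin_from` are hypothetical names, and a plain SHA-256 hex digest stands in for a real CID. The point it shows is that a node holds nothing until its operator adds or pins something, and that several nodes can provide the same content under the same address.

```python
import hashlib


class ToyNode:
    """Toy stand-in for a node: a dict keyed by the hash of the content."""

    def __init__(self):
        self.blocks = {}  # empty until the operator adds or pins something

    def add(self, data: bytes) -> str:
        """Store data under its own SHA-256 digest and return that address."""
        address = hashlib.sha256(data).hexdigest()
        self.blocks[address] = data
        return address

    def pin_from(self, other: "ToyNode", address: str) -> None:
        """Deliberately copy one piece of content from another node."""
        data = other.blocks[address]
        # The address is derived from the content, so the copy can be verified.
        if hashlib.sha256(data).hexdigest() != address:
            raise ValueError("content does not match its address")
        self.blocks[address] = data


uploader = ToyNode()
mirror = ToyNode()
address = uploader.add(b"some book")
mirror.pin_from(uploader, address)  # an explicit choice by the mirror's operator
```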


So, if the FBI comes to arrest me can I just get away with that? "Sorry guys, I do have an IPFS server but I had no idea it was distributing illegal content to the world. Can I go now?"


Some systems have tried that approach, yes!

In Freenet you contribute to a global pool of storage, and end up storing and transmitting whatever the network happens to need. Freenet is intentionally obscure about it. The archive is encrypted, there are no logs, there's no way to list what your node contains, and storage is probabilistic making it hard to determine if something is there because you asked for it, or it passed through your node.

Tor also tries something similar, and has premade explanations to give to authorities if they ask about your exit node.

But those are specifically built to try to work with that case. Stuff happens behind your back that you have zero involvement in besides providing a piece of infrastructure for the network.

In IPFS on the other hand you make a personal, intentional choice to mirror or upload something.


I want to slightly clarify the case of Tor: while many people historically thought that running an exit node on your own network (that you also actively used) might create ambiguity between your own activity and strangers' activity, this is not really considered correct or advisable. One reason is that a fairly naive local adversary monitoring your connection at only one location can still tell whether particular outbound clearnet connections do or don't correlate with inbound Tor traffic. That allows distinguishing activity by a local user from activity by a Tor user that's proxied through the exit node.

Tor nodes themselves also don't host any data.


Who wants to deal with that? In the best case they confiscate all tech stuff from my home (including my employer’s laptop).


> Be sure to use these exact hash and chunker values, otherwise you will get a different CID! It might be good to do a quick test run and make sure your CIDs match with ours (we also posted a CSV file with our CIDs in one of the torrents). This can take a long time — multiple weeks for everything, if you use a single IPFS instance!
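Why the hash and chunker settings matter can be shown with a toy example. This is not IPFS's real DAG or CID format, only an assumption-laden sketch: `toy_root` is a hypothetical helper that splits the data into fixed-size chunks, hashes each chunk, and then hashes the concatenated chunk digests. The same bytes produce a different root whenever the chunk size or the hash function changes.

```python
import hashlib


def toy_root(data: bytes, chunk_size: int, hash_name: str = "sha256") -> str:
    """Hash each fixed-size chunk, then hash the concatenated chunk digests."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    leaf_digests = [hashlib.new(hash_name, chunk).digest() for chunk in chunks]
    return hashlib.new(hash_name, b"".join(leaf_digests)).hexdigest()


data = bytes(range(256)) * 4096  # 1 MiB of sample data

root_256k = toy_root(data, 256 * 1024)               # four chunks, SHA-256
root_1m = toy_root(data, 1024 * 1024)                # one chunk, SHA-256
root_blake = toy_root(data, 256 * 1024, "blake2b")   # four chunks, BLAKE2b

print(root_256k == toy_root(data, 256 * 1024))  # same settings: identical root
print(root_256k == root_1m)                     # different chunk size: differs
print(root_256k == root_blake)                  # different hash: differs
```

Two seeders who add identical files with different settings therefore end up announcing different identifiers, which is why the article asks everyone to use the exact same values.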

IPFS devs, take heed. This is a very dedicated and determined user. You should make this very obvious hot path a lot easier for users who are not this determined.


Still no meaningful promises around the safety or effectiveness of this.

Seems like this will get many people a love letter from their ISP.

Not worth the hassle with no idea of if this will work in even the medium term.


> Seems like this will get many people a love letter from their ISP.

Many countries just don't care. A few EU ones and USA might, but once you go into slightly poorer territories, no one cares about foreign copyright.

So, treat it like you would treat seeding a pirate bittorrent.


In Canada the ISPs send email threats but do absolutely nothing. I get them all the time.


This is because the federal government had the foresight to make a law that 1. limits maximum damages to something like $5,000 so that it isn’t worth the copyright mafia’s time to pursue, and 2. has ISPs receive complaints from the copyright mafia and forward them to the user, while the ISP isn’t allowed to share the user’s details with the copyright mafia.

That’s why these emails always try and entice you to click a link and settle the crime for a few hundred dollars. They can’t do anything else as they don’t even know who you are but a few chumps will click and pay so it’s free money in a way.

Anyway, great piece of law and I’m glad lobbies didn’t get their way with that in Canada.


Law: does absolutely nothing useful...

>great piece of law

lol! I agree, of course.


The instructions specifically state to use a VPN, no different from bittorrent in that regard.

http://annas-blog.org/help-seed-zlibrary-on-ipfs.html


IPFS provides the same privacy guarantees as bittorrent (i.e. none).

Use a VPN.


A VPN does not protect you well from law enforcement.


I think in this case a wisely picked VPN likely offers more than "good enough" protection.


I just wonder why all that can't be uploaded into Library Genesis? They already have the distribution part sorted out (IPFS, Torrent, even ed2k!).


Z-Library was essentially a clone of LibGen with a more accessible UX and monetization. Though I believe they eventually solicited new material that was not fed back into LibGen.


Anna had a post about this. Basically libgen has stricter requirements on metadata and a fiction/non-fiction split, so it makes meeting them difficult.


Most of it probably is on libgen. Putting all the eggs into one basket doesn't seem like a good move when booksharing sites keep getting shut down.


Not according to the author:

We are happy to announce that we have gotten all books that were added to the Z-Library between our last mirror and August 2022. We have also gone back and scraped some books that we missed the first time around. All in all, this new collection is about 24TB, which is much bigger than the last one (7TB). Our mirror is now 31TB in total. Again, we deduplicated against Library Genesis, since there are already torrents available for that collection.

http://annas-blog.org/blog-3x-new-books.html


From what I've heard, about 5% of it isn't on Libgen. Assuming that's accurate, that's still several hundred thousand books and several million articles.


started using this and am VERY VERY excited. this comes from someone who "witnessed" takedown of megaupload, rise and fall of popcorn time, the rise of scihub and library genesis so i VERY MUCH hope this stays online.


There is no chance that I would risk this. I would gladly donate several Terabytes of seeding if this were possible in a secure and _anonymous_ network.


It is possible to run ipfs behind tor.

And I'd argue it is the right way to do so.


IPFS won't run via Tor, DHT is UDP based and Tor doesn't support UDP by design. Did my research last weekend, anybody more knowledgeable, please correct me if I am wrong. Possibly it should run fine via I2P or Yggdrasil.


Googled ipfs and tor just to make sure it's not hard to find.

The first result was a guide on how to do it.


I'm not sure pointing to Google is "correcting him". But yeah, a good guide on how to use IPFS as a hidden service would be pretty useful.

(edit:typo)


did you also TEST it? last time i did this, it never really connected to the IPFS network. also it was marked as experimental


Yggdrasil is not really anonymous, though.


Doesn't that saturate Tor? Bittorrent via Tor was always highly discouraged to not overload the network with all the p2p traffic and IPFS is a very similar technique.


> Doesn't that saturate Tor?

No, that's a stupid myth that needs to die. The tor network is nowhere near the advertised bandwidth capacity: https://metrics.torproject.org/bandwidth-flags.html


https://support.torproject.org/misc/misc-4/ links to TODO which says "We've been saying for years not to run Bittorrent over Tor, because the Tor network can't handle the load;" and links to a dead page "Why Tor is slow". I googled that and found https://www.whonix.org/wiki/Why_is_Tor_slow#Misuse_of_the_To... and https://tails.boum.org/doc/anonymous_internet/tor/slow/index... which sound like they were both sourced from that defunct page.

So, if you call it a stupid myth, please provide some credible proof. The Tor developers as well as the maintainers of two highly respected tools built upon Tor disagree with you.


> "We've been saying for years not to run Bittorrent over Tor, because the Tor network can't handle the load;"

12 year old blog post. Both the tor network and internet as a whole are completely different.

> two highly respected tools built upon Tor disagree with you

Both of those pages have a citation that says "it is bad.... just cause it is okay!"

The credible evidence is in my original comment, that shows hundreds of gigabits being underutilized on the network.


how? i am behind NAT for example, i tried doing it. it did not work.

Is there a way for people who have just 20 mbit/s up/downstream and do not have a stable IP address from home behind NAT?

I would run a server for long for seeding if it would be over TOR.


Running a hidden service on Tor should work without problems behind NAT


You mean, i can do a Hidden Service from behind NAT and connect that to IPFS via onion URL?

Let me see if i can figure this out :D


Dude come on, 20mb/s who will you serve with that, forget about it and move near a backbone.


That's the point of p2p with multiple seed though, every extra bps helps.


Well.. do you have Money for moving and paying subscription? :D


If he's got the rare file that you need, who cares if he's on 20mb/s or 2mb/s?

Pepperidge Farm remembers 300 baud.


If you have a way to run ipfs over tor, please share.


Any pointer on that?


Secure and anonymous network like which? Tor isn't meant for P2P seeding, so not sure which it would be.


I don't know. Tahoe-LAFS in I2P? Syncthing via Tor?


you could probably seed over clear in the first go and then actual long term seeding over tor to protect you


Trend I’m spotting: things like Mastodon / ActivityPub and IPFS are having a moment.


and the usability/UX is horrible.


I use Mastodon a lot now. It has greatly improved vs years ago but yes it still lags behind what closed silos can do. The main reason for this is lack of an economic model to fund polish.

Same is true for all the other FOSS decentralized things.


Maybe Mastodon/Fediverse instances need to start giving out blue checkmarks to accounts that pay them $8 per month to fund development/design work. Then if enough brands and celebrities move away from Twitter, their followers could bully them into paying this fee to support the ecosystem.


Recent and related:

Putting Z-Library on IPFS - https://news.ycombinator.com/item?id=33675224 - Nov 2022 (129 comments)


Even if sharing behind Tor, doesn't IPFS expose all your interfaces by default? This should also be done in network namespaces preferably so that you're only sending specific traffic. I believe having users subscribe to Mullvad would offer much better performance as I don't think IPFS does any sort of bandwidth optimization to find peers that have better speeds.


Does anyone know if this dataset happens to include any pre-1925 published works, that might not have made it to the Internet Archive yet? I have no idea whether the publication metadata that they have is even good enough to address such questions in the first place.


http://bookszlibb74ugqojhzhg2a63w5i2atv5bqarulgczawnbmsb6s6q...

Press on Search Options under the search box, and you can filter from 1800-1925. It didn't show an exact number of matches (500+), but there were a lot of pages...


What am I missing about IPFS?

It just seems like bittorrent, but worse.


Instructions are not easy. I want an out-of-the-box solution to help without having to deeply dig into the internals.

Something like archive team warriors: http://warrior.archiveteam.org/

For example,

> Bitswap vs DHT

I don't care. I want to help without having to care about these details.

> Launch one or multiple IPFS servers (see previous blog post; we currently use 4 servers in Docker).

what? how? where? link? no out of the box deployment?

> Alternatively, you can do what we did: add in offline mode first, add the files, then take the node online, peer with public gateways, and then finally run ipfs dht provide -r <root-cid>.

what? lol


From the article:

> If this is all too involved for you, or you only want to seed a small subset of the data, then it might be easier to pin a few directories: 1. Use a VPN. 2. Install an IPFS client. 3. Google the “Pirate Library Mirror”, go to “The Z-Library Collection”, and find a list of directory CIDs at the bottom of the page. 4. Pin one or more of these CIDs. It will automatically start downloading and seeding.


Any instruction that involves "Google X" (or even a general term "Install Y"), is not a good instruction. It is like a maze - easy to go once you know the path. Otherwise - dozens of subtle things that can go wrong, at each step (e.g. installing a different version of Y, or to a wrong path, etc).

The best instructions can have descriptions, but also contain a FULL list of command-line instructions, ideally without any placeholders.


It's presumably deliberate: by keeping annas-blog.org clean from copyrighted content (or links to copyrighted content), they will be able to retain a public channel despite the illegal nature of their work.


I usually tend to agree, but in this case the link would be a link to pirated content and would quickly get the site taken down. So "google XYZ" is the only viable way here


There is a "wine brick"-type instruction workaround (https://en.wikipedia.org/wiki/Vine-Glo):

"After dissolving the brick in a gallon of water, do not place the liquid in a jug away in the cupboard for twenty days, because then it would turn into wine."

I have seen this type of message a few times on growkits.


I share your frustration. Every time an article from annas-blog shows up — which has been often lately — I am left utterly confused and with no idea how to use this or help the project.



Why the hell is this site not https? Note the first blog post was in 2022. No https in 2022? Red flag. This looks like a honeypot to me

And for the record, z library gets its books from libgen? Correct me if I'm wrong, but it was just a frontend?


You should read the link, even though it is http. It would correct your understanding: z-library started with all the books from libgen, set up access controls and limits to incentivize uploading new books, garnered millions of new books, and didn't contribute them back.

Also, consider your risks more holistically:

* Honeypots can and do have valid https certificates, what matters is who controls the site (for example, when the FBI took over a kiddie porn site, it changed the site to deliver over https a zero-day exploit for Tor Browser that broke out of the browser sandbox and deanonymised the visitor)
* Eavesdroppers can tell you visited a specific HTTPS site by looking at your TLS1.2 SNI in cleartext. Even if you're using TLS1.3 (which fixes that leak, but is currently up to the browser/site to negotiate), eavesdroppers can still correlate site access with your DNS requests. Even if you use DNS-over-HTTPS/TLS, you don't know if your DNS provider is in cahoots with the eavesdropper. You have to trust someone

Where your actual risks lie in visiting an HTTP-only site:

* If the site has forms/cookies, an eavesdropper can see them. In this case, https buys you nothing; your concern is that someone can tell if you visited this site once, not that they can tell you're a repeat visitor
* Eavesdroppers can see any headers that your browser hands out (mainly user-agent) that could more granularly identify you vs just your IP address
* Active attackers with the ability to control your traffic can place anything they want on the website by spoofing its responses. With https they could only deny you access to the site

The benefits of HTTPS over HTTP are enormous, but you have to understand what its limits are. If you're concerned about web surveillance, you should be using something like Tor Browser to visit websites, and understand that even it has limitations.


> Even if you use DNS-over-HTTPS/TLS, you don't know if your DNS provider is in cahoots with the eavesdropper. You have to trust someone

but in this case, couldn't you just use an offshore dns provider that's in a jurisdiction that won't cooperate with judicial system? (assuming here adversary is getting sued by copyright holder, not govt agency)


If making requests offshore is a panacea, use a VPN that proxies _all_ your traffic there. Then you needn't care if the website is HTTPS or HTTP

In the case of _this_ site, it has nothing a copyright holder could complain about, so it doesn't matter if you visit it. It says "We don't link to it from here, but just Google for 'Pirate Library Mirror'". Yup, why not tell Google what you're doing? Once you've done that, and visited the (also HTTP only) "Pirate Library Mirror" site, you'll find it doesn't have any .torrent files. But it does have a link to a .onion site available only in Tor Browser. And the onion site _does_ have .torrent files you can download and add it to your Bittorrent client and begin infringing copyrights.

The two HTTP sites involved have nothing that could get you legally into trouble, even if an eavesdropper saw _all_ your traffic. It's certainly good practise to make all sites HTTPS, but for these specific sites, it isn't a downside that they're HTTP-only. The only situation I can think where HTTPS vs HTTP would make a difference is quite unlikely: if an active attacker, able to modify your traffic, substituted a link to a different .onion site, with different .torrent files specifically for you


http content can also be modified in flight by your ISP/VPN/Starbucks wifi hotspot spoofer, you could easily change the links on the page to ones where you would end up downloading malware/spyware versions of the products


> Why the hell is this site not https? This looks like a honeypot to me

How does TLS vs No TLS weigh in on your suspicion of whether something is a honeypot or not? It's as trivial for three-letter agencies/black hats to set up as for trusty individuals/corporations.


We're talking about seeding a library full of copyrighted content. The person who made this site couldn't take 2 seconds to set up HTTPS? They're either incompetent or part of a honeypot. In either case, it's best to steer clear

Keep in mind, the first post on this site was in 2022. 2022, and no https... is this some kind of joke?

And in this case the adversary isn't necessarily the three letters, it's the copyright holders and legal system. No https makes ISP logs even more useful


Again, how does no HTTPS make it more likely to be a honeypot than not? Everything on that website is about being a "pirate archivist", they don't even need to know what page you visited, just that you visited the domain, so the SNI and Host header already give it away, TLS or not.


> so the SNI and Host header already gives it away, TLS or not

Couldn't you just use ech/esni, ISP wouldn't see the domain only IP (and dns provider)?


We can't because nobody supports it. Only firefox has support for esni and it's behind an about:config


Z library extended the libgen collection (apparently without contributing back to it).


Also it requires a login to do anything with. Add another check mark on the Honeypot theory


Z-library is used for stealing authors' work. Does IPFS want to support it?

z-library is not equivalent to sci-hub.


IPFS is a protocol like http or bittorrent, rather than a platform like Twitter. It is a federated file storage, so although individual contributors can choose not to host z-lib or allow it through their gateway, there is no single authority that can ban content.

A better question is - is this a sensible technology for hosting legally dangerous material, and the answer has to be no. It is censorship resistant not anonymising. Like bittorrent but kind of worse. It is trivial for an enforcement agent to find all the servers in the world hosting a particular e.g. book and go after them.

I am sympathetic to your ethical question, but I'm just answering you on the technical side which is - your question doesn't really make sense but this is a bad idea.


Copying is not stealing. Stealing removes the original, copying leaves the original intact.


let's just be adults here and face it. unless using rhetoric, we still don't know how to deal with piracy in 2022


Let's be adults and call "piracy" by its real name: unauthorized copying.

Let copyright holders deal with unauthorized copying, why should any layman care about it?


well, that's not how I would address it. Remember here in HN you have many people interested in that subject (experts or not): app makers, artists, lawyers and what not. remember spotify was done based in a whole piracy movement/situation.


On one hand, authors need to pay their bills. On the other hand, free exchange of information benefits humanity immeasurably.

In terms of amount-of-benefit, the second vastly outweighs the first.

But if the author can't pay his bills then there is no author.

Does that cover it?

Would UBI fix this?


Honestly, merely limiting drastically the maximum copyright term would probably cover it. (Not just by making things public domain more rapidly, but also by presumably providing pressure on the business models of companies currently relying on the existing duration of copyright.)

Back in 2006, the UK commissioned a report recommending how to revise the copyright system (the Gowers Review of Intellectual Property), and it particularly recommended against any further increases in copyright terms - contra pressure from the music industry - and essentially only didn't argue for decreases because of international obligations. On pages 52 to 55 of the Review, however, there's quite a lot of evidence suggesting that most producers of creative works would not be meaningfully harmed in earning power if the term of copyright was as short as 10 to 20 years after production.

A 10 year copyright term, renewable for a further 10 on application, would do a lot to redress the balance you mention here.


Copying is privacy and piracy is still an issue for the host.


Piracy is a misnomer used to make people feel like unauthorized copying is "stealing", since "piracy" traditionally refers to people who perform armed robbery on high seas.

Copying can be unauthorized, but when you say it as it is, "unauthorized copying" sounds a lot less terrible than "stealing" and "piracy", and is more precise in meaning.


Everyone calls it piracy, including the people who are active in the 'unauthorized copying' community. I'm not sure what's your objective in this discussion.


My objective is to get people to argue precisely why copying is bad.

Calling it "stealing", "piracy" and other ugly words is a strawman argument - we all agree stealing (physical items) and piracy (on high seas) is bad, so that would imply that copying is bad, too, since "copying is stealing and piracy". It's not, and additional arguments are needed.


I suppose the pirate bay is part of this conspiracy?


*piracy sorry


ngl your point made more sense in the original sentence.

Using an untethered copy is indeed more privacy-friendly than using most garbage publisher portals and offerings. And yes, piracy can be a problem for the host if she has insufficient opsec.


So when you copy answers on a test, that is seen as valid by the teachers because the other person's answers are intact; correct?


Completely false equivalence.

Copying answers isn't bad because you are infringing on someone's "right" to have control over their answers. It is bad because you are committing academic fraud and seeking a qualification you have not earned. You are claiming another's achievement as your own.

Me reading a pirated book does not mean I claim to be the author.


I guess the elephant in the room is that you're not paying the authors for what they created; in this sense it's like "stealing". Someone spends time and resources making something and you use that something without compensating that someone. I agree it's not exactly like stealing (that's why we invented a different word for it), but it's still something that is unfair to the book author(s).


I think it's worthwhile to be precise when we argue. Saying it's not stealing isn't necessarily saying it's harmless or that it's not illegal. It's just not the same sort of action.

Eg, if you steal a book from a shelf, then it's gone. The shop can no longer sell it to anyone else, and needs to obtain a replacement. The customer that really needs it may not be able to get it now.

But if a 10 year old from a dirt poor family with $100 to their name downloads a whole library of technical literature worth $10M, it's a very different situation. I think it's very arguable that there was no scenario in which the authors would have gotten paid, and that no real harm has been done in this particular case.


> I think it's very arguable that there was no scenario in which the authors would have gotten paid

Just to be sure: do you think that if people can't afford something that's not physically tangible they should be entitled (or at least permitted) to have it free of charge, because they wouldn't buy it anyway? I wouldn't buy (and probably wouldn't afford) a 30-day stay at the most luxurious spa in my country, should I insist that they let me enter anyway?


> no cost is imposed on the producer of the software, because somebody torrenting an installer doesn't cost the company any money.

Lost profit is still lost money. If you run a bar and I falsely tell everyone that your beer is poisoned (and everyone believes me) I'm not costing you any money, but you're still bankrupt at the end of the year.

> When pirating stuff a kid might grab a university book on biology out of curiosity to see how it compares to their high school lessons, but pretty much nobody actually buys books for reasons like that.

You sure that's the only plausible case? Here's another scenario: you need a programming book for your career, but you don't like spending $30 for it and you just pirate it. You would probably have bought it if you couldn't pirate it, but obviously pirating it costs less at nearly no risk; why should you spend $30?

I think this scenario is much more plausible than poor 10-year old kids downloading biology books for fun. But even if it wasn't, piracy enables both scenarios without distinction. Even if it was 80-20, that 20% of not-bought books would have been bought if piracy didn't exist.

> The alternative reality is that it either doesn't happen or they check it out at a library instead, and again the publisher doesn't get any money.

I'm pretty sure public libraries pay for the books they have, directly or indirectly (with taxpayers' money)


> Lost profit is still lost money. If you run a bar and I falsely tell everyone that your beer is poisoned (and everyone believes me) I'm not costing you any money, but you're still bankrupt at the end of the year.

Yes, but you can't lose a profit you could never have had.

> I think this scenario is much more plausible than poor 10-year old kids downloading biology books for fun. But even if it wasn't, piracy enables both scenarios without distinction. Even if it was 80-20, that 20% of not-bought books would have been bought if piracy didn't exist.

I'm not arguing that piracy is completely harmless. I'm arguing that it works differently from theft. We can't consider every potential loss as a real one.

Eg, mass torrenting of stuff can get to the point where on paper, if all of that was legally paid for, it'd cost more than the country's entire GDP. That's obviously ridiculous.


> I wouldn't buy (and probably wouldn't afford) a 30-day stay at the most luxurious spa in my country, should I insist that they let me enter anyway?

That's tangible. You consume space, resources, people time, water, energy, etc. People have to clean after you.

For comparison, take the scenario of a 10 year old from a poor family pirating Solidworks, which costs $5000-ish a license. The family doesn't have $5000 in their bank account.

So there exist two possible outcomes of this situation:

A. Kid pirates Solidworks. Company makes $0.

B. Kid doesn't pirate Solidworks. Company makes $0, because it's impossible for them to buy it.

That's precisely why many such companies have huge educational discounts, and offer software for free to students sometimes, and sometimes ignore piracy in some areas. If you could eliminate piracy by non-engineering companies you wouldn't make much of a difference, because pretty much no hobbyist out there spends $5000 on software they might use just a bit. Rather than buying it, they'll make do with alternatives instead.


> Kid doesn't pirate Solidworks. Company makes $0, because it's impossible for them to buy it.

I have this impression that "poor kid" vs "incredibly expensive software" is used as a strawman here, since we're talking of $30 books that anyone who's not incredibly poor can buy just by saving for a couple of months and anyone who's that poor can probably access using a public library anyway, versus the enormous amount of people that could afford those books, but see no incentive paying since they can pirate them for free without even going out of their house.


Obviously I'm using an exaggerated and artificial example to illustrate my point. Which is that it doesn't really work like actual theft. First, no cost is imposed on the producer of the software, because somebody torrenting an installer doesn't cost the company any money. And second, there are plenty of situations where they were never going to make any money no matter what.

Eg, back when I was 12 I did pirate software, and I didn't have the money to buy it if I wanted to. There was just no scenario under which those companies could have gotten paid. The alternative would be I'd just get my hands on something else, or mess around with the stuff I already had.

This even goes for things like $30 books. I grabbed a whole bunch of stuff just to take a look at what it's like. When pirating stuff, a kid might grab a university book on biology out of curiosity to see how it compares to their high school lessons, but pretty much nobody actually buys books for reasons like that. The alternative reality is that it either doesn't happen or they check it out at a library instead, and again the publisher doesn't get any money.


There is an artist's gallery in my town. The artist sells lots of pieces (or tries to; no idea how successful). If I go and look at each piece, really take it in and absorb it fully and internalize it, but buy nothing, have I done the thing that's not exactly like stealing?

If I download a PDF and read it and then delete it as soon as I am done, have I done the thing that is not exactly like stealing?

If I go to a library and read that same book in its entirety without checking it out ... doesn't seem much different than reading and then deleting a PDF.

Stealing someone else's book definitely seems wrong, but reading it while you are visiting their house seems fine?

I am not claiming to know the right (ethical, moral, whatever) action here. I just have a super huge problem calling it "piracy" or "stealing" or whatever. Figuring out a way to support creators is hugely important, but criminalizing the mere viewing or hearing of art/music/words/etc seems extremely wrong.

Anyway, I've been thinking about all this since at least Napster and I still have no idea really.


My perspective: go with what the person who created it agreed to. They’ve set their life up around certain assumptions, and if I don’t like them I will forgo their work.

For example, that person whose works are in the gallery has built their business on a balance of exposure - letting anyone who walks in look at things - and the fact that people who buy art are willing to pay a fair amount to own a physical object for display. Looking without buying is expressly part of their business model.

Book publishing is different, with the author assuming they’ll get payments from readers: much smaller than that artist’s, but many of them. Since I don’t have any ownership rights over their work, I don’t attempt to change the terms.


There is an artist's gallery in your town. The artist charges $10 on entry to see the pieces; that's how he/she makes a living. You enter through a backdoor that someone left open to avoid paying, and you additionally help anyone who wants to enter for free by showing them the backdoor. You know that what you're doing is illegal, but you don't care because the building where the gallery is hosted doesn't have anyone to check the backdoor at night, so you're extremely unlikely to get caught. It's not exactly stealing, but it's still a) illegal, b) selfish, and c) damaging to the artist.


But you can’t really “steal” something digital, when they can be freely multiplied any number of times without degrading.

If you go to a supermarket and steal a TV that’s stealing. You taking a picture of it isn’t.


You cannot really "steal" military secrets by photographing them either, but it's still a threat to national safety. You cannot really "steal" trade secrets by photocopying documents, but it's still an incalculable damage to the company you target if you do. You cannot really "steal" someone's privacy, but if you look at their private correspondence or their electoral card you're still infringing on their right. Does it really matter if you can call an action "stealing" or not? An action could still be damaging even if it's not technically "stealing".


If you gave credit to the source of the answers... then most teachers would just not give you the marks for the bits you copied. The cheating here is faking attribution - claiming yourself as a source of something you didn't originate, not copying - a legally and morally distinct thing to copyright infringement. Indeed, part of the point of copyright infringement is that you do know (and reveal to your clients) who made the thing you're copying, which is why it has value.


As you said, you "copy" the answer, you don't "steal" it.


You steal the author's investment in time and effort to write the book and conceive its content.

How you steal (making a copy) doesn't change the fact that you prevent the author from getting his share or reward from it.


I invested the time and effort to write this comment. I want you to pay me. I'll be expecting a cheque in my mail soon.

If you don't pay me, you're stealing from me.


I'm obviously talking about ebooks downloaded from commercial platforms like KDP Amazon, and distributed for free on Z-library without permission from the author.

Comparing this to your comment is whataboutism. Comments on Hacker News are free to read. If you want to be paid for your writing, use a platform that supports this.

However, you do have an automatic copyright on everything you write and I would need your permission (eventually by paying you a fee) to republish it in a book or somewhere else. Fair use is an exception to that, but the amount of text we can reuse without permission is limited. These are rules.


Cheating on a test is not stealing either. It's cheating.


That is pure whataboutism.

What does this have to do with stealing books?


Taking all your fiat money technically doesn't "remove" it either, it just relocates it. Gets even better if we talk about virtual funds.


When all my fiat money is taken, I cannot use it anymore, therefore it's removed.

If you make an exact copy of my fiat money, yet I can still use it, nothing is stolen.


Let's go down the rabbit hole...

When governments make exact copies of existing fiat money (i.e. engage in money printing) they increase the supply of fiat money and as a result cause inflation, which decreases the value of the fiat money that existed before the money printer was turned on. So while such a copy did not completely eliminate your ability to use your money, it did in fact decrease its utility. And indeed it can be a valid case of "stealing".


> And indeed can be a valid case of "stealing".

It's not stealing, government didn't steal anyone's money, they just devalued it. In a similar sense, gold miners are increasing the supply of gold and lowering its price, but they still aren't stealing anyone's gold - they're making more. The fact that fiat money is completely fictional does make it more prone to systemic abuse - but even in that case, what the government is taking away from people is value and labor, not money itself, and those two are a little more abstract and harder to reason about.

The word "stealing" has a specific meaning related to physically taking something away from someone else, let's not use it in contexts where it doesn't apply. Especially when the use of the word is mostly perpetuated by copyright holders who want to persuade the public that not giving money to them is equal to stealing from them. That's bullshit, and it will always be bullshit. To me, it sounds horribly similar to narcissistic abuse: "How could you not give money to me? Do you have any idea how much that hurts me?".


> When all my fiat money is taken, I cannot use it anymore, therefore it's removed.

Ok, so if your intellectual property is stolen ... err, sorry, "copied" of course, how do you get monetized for those "copied" copies? By your initial statement it's not removed, now you tell me it is? Weird.


There's no such thing as "intellectual property". If you don't want other people learning the information your mind has come up with, don't release it into the public where it can be easily shared.

> but I want money! I don't care if everyone else loses their right to sharing information!

Not my problem.


I wasn't sure if you are serious or just a troll. I get it now.


It would be nice if we could read these books online rather than having to download a file and find some sort of reading app for it. Like, what am I supposed to do with an "azw3" file? And do I really want evidence of piracy sitting on my hard disk?


In all likelihood the book you are looking for is available as a .pdf

The "best" format for reading books is epub, as it is an open format which does not try to adapt a paper format. There are many free readers available for it.

>And do I really want evidence of piracy sitting on my hard disk?

If your house gets raided by law enforcement, charges for minor copyright infringement should be of absolutely zero concern to you.


> The "best" format for reading books is epub, as it is an open format which does not try to adapt a paper format. There are many free readers available for it.

I disagree. ePub assumes that the thing you are reading can easily be reformatted – for mathematics, technical documents, or similar, this is really not true. I've got a few technical books where the publisher released them as either a LaTeX'd pdf or an epub and on my iPad reading the ePub is _awful_ -- the layout is worse, it's very distracting, and the effort that the author put into ideally putting "this equation" and "this text referring to this equation" together has been lost.


Epub pushes the duty of layout onto the reader; it is very similar to a website. With epub you can freely change font, spacing, etc. (a great accessibility feature).

PDF is a representation of paper. LaTeX files look good as a PDF because LaTeX files are created to be printed or viewed as a PDF.

A digital book should not enforce layout. Yes, epub has flaws, especially if it is not seen as the main target for publication.
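To make the website comparison concrete: an EPUB file is a ZIP archive that holds XHTML and CSS files plus a `mimetype` entry. Here is a minimal sketch in Python, assuming only the standard library. The file names and helper functions are made up for illustration, and the archive it builds is not a complete, valid book (it omits the container and package files a real reader expects).

```python
import io
import zipfile

# Hypothetical chapter content: ordinary XHTML, just like a web page.
CHAPTER = (
    '<html xmlns="http://www.w3.org/1999/xhtml">'
    "<head><title>Chapter 1</title></head>"
    "<body><p>Hello, reader.</p></body></html>"
)


def build_minimal_epub():
    """Build a tiny EPUB-like ZIP archive in memory (illustration only)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The EPUB container format puts an uncompressed `mimetype` entry first.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER)
        zf.writestr("OEBPS/style.css", "p { margin: 0; }")
    return buf.getvalue()


def read_mimetype(epub_bytes):
    """Return the declared media type stored in the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return zf.read("mimetype").decode("ascii")


def list_content_documents(epub_bytes):
    """Return the names of the XHTML/HTML files inside the archive."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        return [n for n in zf.namelist() if n.endswith((".xhtml", ".html"))]
```

Since the text lives in XHTML and the styling in CSS, the reading app is free to override fonts and spacing, which is what makes reflow possible and fixed layouts hard.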


I haaaaate recipe books in ePub with a passion. They always look cheap, the images look unappetizing, and the formatting of the preparation steps never looks good.


I'm pretty sure it's not EPUB that's at fault. I've seen very attractive EPUB books.


Yes but even a PDF needs downloading on mobile browsers.

Whereas the Books section of the Internet Archive has a consistent format and displays them all in-browser, it's so much more convenient and user-friendly.

This archive should aim for the same.


> This archive should aim for the same.

This archive of a project where the principal actors are currently awaiting extradition to the US for copyright infringement charges should work on making the user experience better?

Think they have bigger fish to fry like making sure the archive of the project survives which, incidentally, is what this is all about.


> The "best" format for reading books is epub

Not really.

Practically, for many pirated ebooks that you can find in both ePub and PDF, you'd better choose PDF, because it looks and reads like the original, while the ePub version is probably close to garbage, with broken tables, code quotes, etc.

As a format FB2 is better than ePub because it separates the content from the view while ePub mixes them heavily. I usually convert ePub to FB2 before reading just to force my PocketBook reader to use its default typesetting instead of that hardcoded in the ePub file. FB2 also is much more convenient to process programmatically and it defines more meaningful metadata than ePub.
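As a rough illustration of the "process programmatically" point: an FB2 book is a single XML file, so the standard library is enough to pull out metadata. This is only a sketch. `SAMPLE_FB2` is a hand-written snippet rather than a real book, and `read_metadata` is a hypothetical helper.

```python
import xml.etree.ElementTree as ET

# Namespace used by FictionBook 2 documents.
FB2_NS = {"fb": "http://www.gribuser.ru/xml/fictionbook/2.0"}

# Hand-written sample for illustration; real files carry far more metadata.
SAMPLE_FB2 = """<FictionBook xmlns="http://www.gribuser.ru/xml/fictionbook/2.0">
  <description>
    <title-info>
      <book-title>Sample Book</book-title>
      <lang>en</lang>
    </title-info>
  </description>
  <body>
    <section>
      <p>First paragraph.</p>
      <p>Second paragraph.</p>
    </section>
  </body>
</FictionBook>"""


def read_metadata(fb2_text):
    """Extract the title, language, and paragraph count from FB2 XML text."""
    root = ET.fromstring(fb2_text)
    title_info = root.find("fb:description/fb:title-info", FB2_NS)
    return {
        "title": title_info.findtext("fb:book-title", namespaces=FB2_NS),
        "lang": title_info.findtext("fb:lang", namespaces=FB2_NS),
        # Count every <p> anywhere under <body>.
        "paragraphs": len(root.findall("fb:body//fb:p", FB2_NS)),
    }
```

Doing the same for an ePub means opening the ZIP, locating the package file, and then parsing it, so there is at least one extra layer to get through.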


> As a format FB2 is better than ePub because it separates the content from the view while ePub mixes them heavily.

Problem is that as soon as you need some typographical or layout feature beyond what is natively supported by FB2 you're out of luck.

With ePub you still have a chance of manually creating some custom effects before having to give up and fall back to a PDF.

Now granted, for your typical run-of-the-mill novel it's usually not a problem, because those rarely do anything non-standard with regard to layout or typography.

However, some books do play around with their layout and/or typography, sometimes quite heavily. House of Leaves, for example, would be impossible to properly reproduce in FB2, and since that particular book goes really overboard in playing around with the page layout, I think even as an ePub it'd be quite a feat to pull it off.


>Practically, for many pirated ebooks that you can find in both ePub and PDF, you'd better choose PDF, because it looks and reads like the original, while the ePub version is probably close to garbage, with broken tables, code quotes, etc.

The reverse can be true as well, where the pdf is some badly compiled epub.

And with the epub at least you can always read the actual text, without repeated swiping for each line.


OK, but without seeders the project will die. If you only want to leech, maybe you should stay away from appeals for participation?


Azw3 is the main Kindle format; the best viewer apps for it are Calibre or even the official Kindle app.

It does have previews for PDFs; better formats aren't easily displayed in a browser.



