Goodreads plans to retire API access, disables existing API keys (joealcorn.co.uk)
869 points by buttscicles 42 days ago | 422 comments



I recently discovered https://openlibrary.org/ by the Internet Archive. On the face of it, their "about" page[1] sounds appealing (not least because it resonates with my open source values):

One web page for every book ever published. It's a lofty but achievable goal.

To build Open Library, we need hundreds of millions of book records, a wiki interface, and lots of people who are willing to contribute their time and effort to building the site.

To date, we have gathered over 20 million records from a variety of large catalogs as well as single contributions, with more on the way.

Open Library is an open project: the software is open, the data are open, the documentation is open, and we welcome your contribution. Whether you fix a typo, add a book, or write a widget--it's all welcome. We have a small team of fantastic programmers who have accomplished a lot, but we can't do it alone!

---

They also seem to provide an API[2].

[1] https://openlibrary.org/about

[2] https://openlibrary.org/developers/api


For anyone who wishes Open Library was even better, please join one of our weekly community calls @ 11:30am Pacific.

For an invite, please send me an email at mek@archive.org or go to: https://openlibrary.org/volunteer

# APIs & Data Dumps

- https://openlibrary.org/developers/api

- https://openlibrary.org/dev/docs/api/books

- https://openlibrary.org/developers/dumps monthly data dumps for if you need bulk access and the APIs are not enough.

# Spread the word

Also, if you want to help raise awareness of this resource, please help us get the word out on twitter!

1. https://twitter.com/openlibrary/status/1338185940469051392

2. https://twitter.com/openlibrary/status/1338186553915367425

# Issues

Thank you all for helping us discover some issues with our Goodreads importer and search (recently migrated to Python 3), and thanks @cdrini et al. for these fast bug fixes! If you notice a problem, please help by opening an issue here: https://github.com/internetarchive/openlibrary/issues/new/ch...

# Learn More

- https://archive.org/details/openlibrary-tour-2020/openlibrar... if you want to learn more about Open Library, here's a short intro vid.

- https://github.com/internetarchive/openlibrary if you want to follow on github.


Hi, Mek. Awesome project!

How do I go about claiming my author page? The current book listed there has been officially "retired" for over a decade now, and I have plenty of other books that I'd be happy to add.

https://openlibrary.org/authors/OL2965893A/Rik_Roots


Howdy rikroots! Thanks for this kind message, we're really excited about designing a way to help authors claim their pages. It's not a feature our current author pages have yet.

As a project of the non-profit Internet Archive, having a trusted catalog is pretty paramount to what we're trying to accomplish and Aaron Swartz's original dream for an Open Library.

I've created an issue for this feature (as it's something we've discussed doing for a while). I'll also be adding this to our upcoming Tuesday 2021 roadmap discussion.

https://github.com/internetarchive/openlibrary/issues/4263

Thanks for raising this


Many thanks for the response. I look forward to seeing the outcome of your deliberations in the new year.


I noticed that two authors with the same name are conflated. But if I try to edit any of the editions involved, the interface won't let me modify or delete an existing author, or even add a new author using the author ID rather than the author's name. What should I do to sort this out?


Howdy bloak,

Many of our librarian processes & FAQs are detailed here: https://openlibrary.org/librarians

Most data on Open Library is publicly editable by members. In this case, author merging is a capability that only folks who have been added to our Librarians usergroup can access because the process of reverting an accidental merge is quite time consuming (and so we have some training in place).

If you tweet @ our lead community librarian (http://twitter.com/seabelis) she can likely help you make this fix! Also, if you're interested in helping us make changes like this, we can invite you to our slack channel :)


Thanks, but what I'm trying to do is split authors ... and I think I've worked out how to do it. The trick is to make sure you're looking at the "work" rather than the "edition", because when you're looking at the "edition" the author is immutable.


It just struck me how dubious the term “Pacific time” is - a timezone named after the largest ocean on Earth. I’m in Australia on the side which is also on the Pacific, but my time is 6 or more hours away from “Pacific time” :P


Using the three letter acronym usually removes the ambiguity. For instance my European friends and I both live in “Central time” - but when you say CET or CST everyone immediately knows which time zone you mean.


Openlibrary looks pretty awesome. Thank you for sharing!

Would anyone be interested in having an instant search experience for this books dataset, like the one I built for the 2M recipes database posted on HN earlier this week: https://news.ycombinator.com/item?id=25365397


Jabo, please also help us @ openlibrary improve our search. @cdrini is the lead on our solr efforts and we could really benefit from teaming with someone who is really passionate about search. If you have questions about using our data, please send over a message and I'm happy to help: mek@archive.org


Happy to! Sent you a DM on Twitter, but I'll email you shortly.


Alright! Just pushed the instant search app with the Open Library database live: https://news.ycombinator.com/item?id=25414389


That would be amazing!


Alright! A handful of upvotes in addition to a comment. I’m on it! I’ll whip something up and report back in a few hours.


Looks like it's going to be a while before I can download the dataset. Seems to be pretty slow even when downloading on EC2 with a 10Gbps connection.

  ol_dump_latest.txt.gz           0%[                                                  ]  28.73M   115KB/s    eta 21h 54m


Be great to see your search on there. I was admiring the speed and accuracy on the post you put on the other day...


Thank you! Working on it as we speak.


Aaron Swartz actually built the original version of the Open Library site.


The regime responsible for his prosecution is back. Maybe they will pick up where they left off too.


Sadly, the Goodreads importer appears broken - a fresh export just now of my Goodreads data (<100k) is failing to import with a generic "oops it failed" error almost immediately. :(

[1] https://www.goodreads.com/review/import (export)

[2] https://openlibrary.org/account/import/goodreads


Hello! I work on Open Library; sorry for the bug! We recently deployed a big Python 3 migration that stirred the pot a little. The import issue should now be fixed: https://github.com/internetarchive/openlibrary/pull/4259


Thank you so much for the work you do with OL! I'm a Goodreads librarian and would love to import my data/lists/etc over, but whenever I try to access https://openlibrary.org/account/import/goodreads, it gives me a 'method not allowed' error.

(granted, running on older versions of firefox, but still thought it might be a bug you would like to be aware of.)


<3 Also, Open Library has a librarian program.

If you use the https://openlibrary.org/volunteer page and click the Librarian link, this will send an email to Lisa, our head community librarian and we can help you access our slack channel and request access to our librarian features. We also have an optional weekly call for folks to raise issues and questions (e.g. "I want book series!")

More about librarianship @ Open Library: https://openlibrary.org/about/lib

If you hit any issues with this process, feel invited to send me an email to mek@archive.org.


Thanks kradeelav! The "method not allowed" is happening because this page requires login first.

Ideally, it should probably bring you to the login page and (once you login) bring you back to the import page.

We'll see about creating an issue for this, thanks!


Hi! Thanks for fixing that up - I had an 18% failure rate (368/444) and lost the list of failures when I clicked away, so given this I'm now trying to flush/reset my library. I'd recommend (a) logging the import task to a logfile for the user to review failures, and (b) providing editing at scale (list view with checkboxes -> action), as I'm sitting here clicking "remove" hundreds of times to reset book by book. :-/



The GR CSV export worked for me just now. Possibly an overload problem. Regardless, the writing is now on the wall for GR - get out while you still can.


Not just you, gives me the same error :(


Hi! I work on Open Library; sorry about that! We had a Python 3 migration that stirred the pot a little. We deployed a fix, so it should be working now!

https://github.com/internetarchive/openlibrary/pull/4259


For the past few months I have been searching for a Goodreads alternative, something that only keeps my books. I don't care about the social features that much. And I think this is it. I am going to donate a tiny bit right away!

Although, I just tried importing my Goodreads export into Open Library and I get the following "Internal Error":

> Hmm...

> Sorry. There seems to be a problem with what you were just looking at.

> We've noted the error xxxx-xx-xx/yyyyyy and will look into it as soon as possible. Head for home?

Anyone else facing this issue?


Hi! I work on Open Library; sorry for the issue! We recently had a big python 3 migration. Chris/Aaron just fixed + tested it, and I just deployed it to production, so it should be working now!

https://github.com/internetarchive/openlibrary/pull/4259


Appreciate it!

I can confirm it works with at least my import.

Edit: Alas, some errors regarding "Book not in collection" and "No ISBN". But I think I can probably see if there is a way to contribute those books to the repository.


Even opening an issue for this would be a massive help:

https://github.com/internetarchive/openlibrary/issues/new/ch...

The importer is a new project (as of August this year) and I think there's a lot of opportunity for improving it (including importing missing records on demand).

Other hat tips: I also want to call out the great work of https://inventaire.io, another great project in the space which uses wikidata, as they also have a Goodreads import which we're learning from. Another project I'm excited about (which has Goodreads import) is beta.thestorygraph.com. I'm not sure about the status of its source code, but I heard an interview with the founder, and she and her work so far seem great.


Since one can, for example, add Audible podcasts as books in GR, I can imagine this leading to importer errors as well, as they don't necessarily need to have an ISBN.


This is right.

Books pre ~1973 don't have ISBNs. And Goodreads covers far more than just modern material.

The current Goodreads importer makes a big/unfortunate tradeoff of trying to have something which works for the majority of material on people's goodreads lists.

It doesn't preclude further efforts -- Open Library knows about millions of books that don't have ISBN. So if anyone wants to help us improve our importer (or at least register their interest by creating an issue which calls out what other identifiers should be considered -- e.g. LOC, etc) that would be very helpful.

# To open an issue:

https://github.com/internetarchive/openlibrary/issues/new/ch...

# Here's the code for the importer

https://github.com/internetarchive/openlibrary/blob/892adebb...


Thanks a lot. I learned something today. I knew of the ISBN issue, but didn't think of it.

Having some free time on my hands I will take a look in the next few days. Looking for an alternative to GR and for something that tickles my interests.

Python, Search and literature. And open source.


You might specifically appreciate that every page on Open Library is an API (including search). Just add ".json"

e.g.

https://openlibrary.org/search.json?q=

https://openlibrary.org/books/OL28018730M.json

This is one of the best implementation decisions that Aaron Swartz, Anand Chitipothu, et al made when starting Open Library.
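As a rough illustration of the "just add .json" idea, here is a minimal Python sketch. The helper names (`json_url`, `search_url`, `fetch`) are made up for this example and are not part of any Open Library client; the only things taken from the comment above are the two URL shapes. The `fetch` helper needs network access and is not called here.

```python
import json
from urllib.parse import urlencode, urlsplit, urlunsplit
from urllib.request import urlopen

BASE = "https://openlibrary.org"


def json_url(page_url: str) -> str:
    """Turn a page URL into its JSON variant by appending '.json' to the path."""
    parts = urlsplit(page_url)
    path = parts.path.rstrip("/")
    if not path.endswith(".json"):
        path += ".json"
    # Keep the query string (e.g. ?q=...) exactly as it was.
    return urlunsplit((parts.scheme, parts.netloc, path, parts.query, parts.fragment))


def search_url(query: str) -> str:
    """Build a search.json URL with a properly encoded 'q' parameter."""
    return f"{BASE}/search.json?{urlencode({'q': query})}"


def fetch(url: str):
    """Hypothetical convenience wrapper: download a URL and decode it as JSON."""
    with urlopen(url, timeout=10) as resp:
        return json.load(resp)


print(json_url("https://openlibrary.org/books/OL28018730M"))
print(search_url("harry potter"))
```

If you do call `fetch` in a loop, the bulk dumps mentioned elsewhere in this thread are probably the kinder option.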


Have you tried Library Thing?

https://www.librarything.com/


Adding your data to yet another proprietary archive doesn’t sound like much of a solution..


You might wish to review their Privacy policy, which inspires more confidence than Goodreads:

https://www.librarything.com/privacy

The site's also been around since 2005, so, that speaks well to both its longevity and its integrity. Of course, that's not to say they couldn't be acquired tomorrow and paywalled, but I think it unlikely.


I don't think I am particularly worried about a proprietary solution. If I am putting my preferences in the public, pretty sure I am giving up on some dimension of privacy. With books at least, it doesn't seem so concerning as long as the recommendations don't suck too much (I don't particularly rely on recommendations that much though). But of course, you never know!


You do realize this entire post here is about GoodReads discontinuing their API? And as a reaction to that, you’re suggesting moving to another provider that doesn’t allow exporting your data either.


LibraryThing has several robust export options and an API. Since they also run small "actual" libraries, that includes some interesting old catalog formats as well.


Not sure if you've used the APIs recently, but they have been disabled without warning since October: "Our APIs are indeed disabled for now. They will be re-enabled at some point but we don't have an immediate timeline for it, so I wouldn't expect them back anytime soon." [0]

[0] https://www.librarything.com/topic/325722


Just tried this out. Quite a large community, so I'm assuming it is a well fed machinery. The interface is outdated. Nevertheless, the website does its job quite well. I am happy I found this. Thanks!

...And they even have an API!


I recently "kondoed" my books down to like ~2xx total volumes and once I had them all organized it only took me an hour or so to scan all of them into the LT iOS app. Very few wouldn't scan but those were mostly findable by manually searching. Having an up to date list of all my books is pretty satisfying and LT seems to have a decent reputation in the serious-about-books-online community. Even inspired me to schlep into the office to catalogue my work related collection.

Made me realize how many booksellers (especially used) put proprietary barcode stickers on top of the 'actual' barcode stickers.


LT is in the slow process of upgrading the interface to something newer/modern and especially with better mobile device support. The screenshots on their blog I've seen look like it is coming along nicely, and as a long time LT user I appreciate that they take their time on big changes like that.


It's nice that it's open source and backed by the Internet Archive, and the Controlled Digital Lending program is cool too, but how is it possible that a project 14 years in development is such a mess? Just try searching for some popular books and see for yourself: the most important feature, searching for books well, is not present. Basic features are missing, book data is often wrong, etc... Honestly, why would I join such a project instead of starting a new one?


Hi! I work on Open Library. The project is entirely open source, with an active community, so anyone can contribute fixes/features on GitHub: https://github.com/internetarchive/openlibrary

And yeah, searching needs some work! That's on my task list for this month. Just this Friday I spent most of my day working on updating our search engine, Solr, from 3.6 to 8.7 (wip!). But search is a _BIG_ pain point. We're a small team with a big long list of things to do, but we are making progress! This year we updated to Python 3, switched most of our production environments to docker-based for easier deploys and to give open source contributors more control of production infra, added reading history stats for users, added a new interface for exploring books, worked on a novel recommendation system, added text selection to the online BookReader for public domain books, added GoodReads importing, grew our community, added the ability to search by classification, and much, much more (you can see highlights from our year here: https://github.com/internetarchive/openlibrary/issues/3891 ).

There is still _definitely_ a lot to do, but I think the biggest reason worth using/contributing to Open Library is likely its open source community. Anyone can jump in and help make improvements to the system (as they very often do!). Personally, I think it's more likely that a system with a community will survive/flourish than one maintained by a single person (I also wondered whether I should just create my own before contributing to and now working on Open Library!). And there are also loads of different tasks associated with a site like OL, which would be impossible for me to do if I was going it alone.

If you would be interested, check out the GitHub repo: https://github.com/internetarchive/openlibrary . It's very active, and you can get an idea of how we work :)


This book is actually a candle, I'm pretty sure: https://openlibrary.org/books/OL28314296M/Harry_Potter

How do you even get an ISBN for a candle?


Goodreads also has something similar : https://www.goodreads.com/book/show/43056033-harry-potter

There is an explanation under "About the author" section :

This "author" was created to segregate those items which have ISBNs but are not actually books. For more information, see the manual and/or start a thread in the Librarians Group.

When an item which is not a book is imported via ISBN into Goodreads, it does no good to delete it: the item will only be re-imported as long as it remains on the feeder site. (Often these are book-related items which are assigned ISBNs by book publishers so that they can be tracked through their book systems.)


I wonder who the executive was who decided to start assigning International Standard Book Numbers to non-books.

“Why not? What’s the big deal?” said someone who has never worked with garbage data.


Not an executive, but a middle manager etc that said:

"paying $2-10k+ to add an International Standard Book Accessory Number field to their software will never get approved for a bookmark that works as a compass, or a promotional Harry Potter bookend set."

Tough to disagree.


Big booksellers whose whole systems are built around ISBNs as a primary key


When all you have is a hammer, everything looks like a nail.


"And hammer marks start to mysteriously appear everywhere."


"Not a bug, but a feature"


You buy ISBNs in blocks or individual numbers, and then submit whatever data you want.

There's very little control. E.g. I just published a novel, and Amazon did not in any way validate that the ISBN I gave them actually belonged to me. I had not yet registered the book data, so what I told them would not have matched anything they might have looked up.

In this case, based on the reviews on Amazon, it looks like someone changed the description of an existing product, and that the ISBN probably actually belongs to a book.


History section is clear:

ImportBot scrapes Amazon.com, matches product id with ISBN10 (which can be converted into ISBN13 without anything else IIRC), imports product as book.

Candle: https://www.amazon.com/gp/product/168298494X


> ISBN10 (which can be converted into ISBN13 without anything else IIRC)

Yes. There clearly only needs to be one way to identify things, so commonly larger systems just incorporate the smaller ones wholesale. The 13-digit system just incorporates the prior 10-digit system for International Standard Book Numbers with a 978 prefix called "bookland" (most prefixes in this system are geographic, but of course books aren't really from one single geographic region, so they're from "bookland") and adds a new set of possible codes. It also incorporates the entire 12-digit American "UPC" system.

Several other (less well known) systems were gobbled up the same way as "bookland", just allocating them imaginary geographic regions in the 13-digit system.

There's actually a fourteen digit system, but the lead digit tells you about how many of something are bundled e.g. so a distributor can distinguish a truck full of Pepsi cans from just one case or a single can in terms of things you can order. Lead digit 0 means "single" so if you know the 10 digit ISBN you can not only make a 13-digit EAN for that, you can make the 14-digit GTIN that means "just one of this book" which in most cases would be what you want.
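To make the conversion concrete, here is a small Python sketch of the scheme described above: drop the ISBN-10 check character, prefix "978", and recompute the check digit with the alternating 3/1 weighting used by the 13- and 14-digit codes. The function names are my own, and the sketch does not validate the incoming ISBN-10 check character. The example input is the candle's product id from the Amazon link upthread.

```python
def gs1_check_digit(body: str) -> str:
    """Check digit for EAN-13/GTIN-14: weights 3,1,3,... counted from the rightmost body digit."""
    total = sum(int(d) * (3 if i % 2 == 0 else 1) for i, d in enumerate(reversed(body)))
    return str((10 - total % 10) % 10)


def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert an ISBN-10 to ISBN-13 using the 978 ("bookland") prefix."""
    core = isbn10.replace("-", "").replace(" ", "")
    if len(core) != 10 or not core[:9].isdigit():
        raise ValueError(f"not an ISBN-10: {isbn10!r}")
    body = "978" + core[:9]  # the old check character (possibly 'X') is discarded
    return body + gs1_check_digit(body)


def isbn13_to_gtin14(isbn13: str, indicator: str = "0") -> str:
    """Prefix a packaging indicator digit ('0' = single item) and recompute the check digit."""
    body = indicator + isbn13[:12]
    return body + gs1_check_digit(body)


print(isbn10_to_isbn13("168298494X"))     # 9781682984949
print(isbn13_to_gtin14("9781682984949"))  # 09781682984949
```

With indicator 0 the check digit happens to stay the same, because a leading zero adds nothing to the weighted sum; any other indicator changes it.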


Upon visiting the Open Library, I'm greeted by a banner covering the top half of the screen, asking me for a donation to keep up with bandwidth costs. Isn't this platform, as well as the Internet Archive or Wikipedia, exactly of the kind that would benefit from being built on top of some kind of P2P network? Content is generated and maintained collectively; why isn't infrastructure treated the same way?


Hi Funes,

Great points here

1. The banner appears in the last month of the year (Wikipedia being the perfect analog). Yes, there are mixed feelings and it's not the world's best experience :P

2. Our entire data set is available to download in bulk at https://openlibrary.org/developers/dumps because we'd love to see a decentralized p2p version

3. https://github.com/mouse-reeve/bookwyrm Mouse who used to work @ Internet Archive has a decentralized version of Open Library (Bookwyrm) and it's worth checking out.

4. For the last 5 or so years the Internet Archive has been cultivating a dweb/dapp community and integrating with IIIF, Dat, IPFS, gun, BitTorrent, WebTorrent, and others, and hosting regular summits and meetups: https://blog.archive.org/2018/07/21/decentralized-web-faq/

5. The wayback machine is an interesting case study: it turns out, incentive structures (even things like FIL/filecoin) haven't been able to perfectly crack the nut on getting folks interested enough to preserve the whole wayback machine. There's petabytes of material and there's a powerlaw about what people care about today. Internet Archive realized what we care about today may not be the same as tomorrow, and so there's a cost eaten (the incentive comes from economies of scale generated by intrinsic desire rather than $). And in a way, this centralized solution (economies of scale) IS the solution a community came up with. It has flaws and advantages (tradeoffs), such as centralized points of failure, and I think the archive would be (and has been) ecstatic to explore improving these opportunities.


Sounds nightmarish. IPFS would work if the project took aim at becoming the default filesystem for linux. There's no storage issue if every new server, terminal and handset was an ipfs node by default and local storage volumes and proprietary cloud storage platforms were the fragile fringe use filesystems. Perhaps the Open Library and the Wayback Machine only work in a more open internet than we have today.


IPFS comes to mind.


Anyone know of a project like this for video games? I got a spreadsheet I use to organize games I’ve played/am going to play and always looking for an easier way to get metadata. I was also looking at building something like Goodreads for video games and similarly that data would have been great.


What GOG is attempting with Galaxy 2.0 [0] is interesting in this department. Galaxy 2.0 has plugins to import what it can of games/achievements/play time from as many other vendors as it can and tries to show something of a view of every game across all your major services/accounts. Many of those plugins are open source, written in Python, though of course Galaxy 2.0 itself is not open source and it is still the primary installer/launcher for GOG's own service, if you are concerned about vendor control. There's a third party export script in Python as well [1], though it is reading JSON and SQLite files directly and not using an official API.

[0] https://www.gog.com/galaxy

[1] https://github.com/AB1908/GOG-Galaxy-Export-Script



I’m surprised I’ve never heard of this. Thanks for sharing.


also rawg.io's api. I've used them in the past.


https://www.igdb.com/ is the one! I do the same thing. I've got a great process built on Airtable + IGDB.


Operated by Twitch, nice. Interesting they have a “Time To Beat” field but it’s not really populated. I also use Airtable and was thinking of writing a Zapier hook to scrape HowLongToBeat.com for this info. It’s one of my favorite stats to help prioritize what game to play next. I can’t be dumping 100 hours into JRPGs back to back lol.


Yeah, I get that. It's all wiki based, so I guess folks aren't filling in that data.

There's not an API directly, but https://www.howlongtobeatsteam.com looks promising. It's got an API that you can use to cache your (or all) games. Zapier might be able to do it as well w/ this scraper: https://www.npmjs.com/package/howlongtobeat (disclosure, I work at Zapier).

I used to have a CLI that would trigger a Zapier webhook that would use [this CLI app](https://github.com/xavdid/zapier-igdb) to populate data into Airtable, but I actually moved the whole process to [Airtable Apps](https://airtable.com/marketplace) since I could customize the UI a little more. I've been meaning to open source it, but it's pretty customized so I'm not sure if it'd be widely useful.

I'm curious to see where you end up with this though! It's something I've put a lot of thought into.


> Interesting they have a “Time To Beat” field but it’s not really populated.

Time to beat is a very difficult metric. Does a Visual Novel take 10 hours to beat, or 2 minutes? I'd argue the latter, but I know plenty of people would disagree with me.

"Alright," says the webadmin, "I'll just implement a voting system and take the average". But now they'll be claiming that it takes 5 hours and 1 minute to beat, which everyone disagrees with.

The best system I've seen is SteamHunters' "Shortest/Median/Average", which gives you a pretty reliable indicator of what's possible, what's typical, and how much of a difference there is between optimal and casual play, respectively. It's by far the most comprehensive dataset out there, but unfortunately only mono-platform so probably not too useful for sites like igdb.
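The averaging problem is easy to see with the numbers from the example above (10 hours vs. 2 minutes). This is only a sketch of a "shortest/median/average" style summary using Python's standard library; it is not how SteamHunters computes anything, and the second list of playtimes is invented.

```python
from statistics import mean, median


def summarize(minutes):
    """Summarize reported completion times (in minutes) three ways."""
    return {
        "shortest": min(minutes),    # what is possible
        "median": median(minutes),   # what is typical
        "average": mean(minutes),    # pulled around by outliers
    }


# Two voters: one says 10 hours (600 min), one says 2 minutes.
print(summarize([600, 2])["average"])  # 301 minutes, i.e. 5 hours and 1 minute

# Invented sample with one speed-clicker among ordinary readers.
print(summarize([2, 540, 600, 660]))
```

With more than a couple of reports the median stays near the typical playthrough while the average drifts toward whichever extreme is further away, which is why showing all three is more informative than any single number.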


Depends on your use case but yes: https://www.giantbomb.com/api/


Have a look at grouvee. I've been using it for quite a while now and I believe it is basically donationware.


Thank you buttscicles (hard saying that with a straight face) for OP'ing this thread and to Joe Alcorn for the amazing original article.

I haven't shared this yet -- it's more for the community, but I've tried to address various questions from the community and distill answers + resources for Open Library here:

https://blog.openlibrary.org/2020/12/13/importing-your-goodr...

Tried my best to include other players in the space (wikidata, inventaire, bookbrainz, worldcat, bookwyrm) who are doing great work, and to pay respects to readng, storygraph and other innovative services which are breaking onto the scene.


I just signed up for this and imported my GoodReads CSV export. The CSV has 90ish rows and I was only able to import 60ish rows.

I get that Open Library doesn't have as much data as GoodReads, but I wish it would show me the data it couldn't import so I could add it manually to Open Library's data store.
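Until the importer reports its misses, one way to get a rough list yourself is to check which rows of the export have no ISBN at all, since (per the discussion upthread) the importer currently leans on ISBNs. The sketch below is a guess at a helper script, not the Open Library importer: it assumes the Goodreads CSV has "Title", "ISBN" and "ISBN13" columns and that ISBN cells are wrapped like ="0306406152", so adjust it if your export looks different. Rows that do have an ISBN can still fail if Open Library has no matching record.

```python
import csv
import io


def clean_isbn(raw: str) -> str:
    """Strip the assumed ="..." spreadsheet wrapper from an ISBN cell."""
    return raw.strip().lstrip("=").strip('"')


def split_rows(csv_text: str):
    """Return (titles with an ISBN, titles without one) from export text."""
    with_isbn, without_isbn = [], []
    for row in csv.DictReader(io.StringIO(csv_text)):
        isbn = clean_isbn(row.get("ISBN13") or "") or clean_isbn(row.get("ISBN") or "")
        (with_isbn if isbn else without_isbn).append(row["Title"])
    return with_isbn, without_isbn


def report(path: str) -> None:
    """Print the titles that an ISBN-based import would have to skip."""
    with open(path, newline="", encoding="utf-8") as f:
        _, without_isbn = split_rows(f.read())
    for title in without_isbn:
        print("no ISBN:", title)
```

Calling `report()` with the path to your downloaded export prints the candidates to add by hand.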

Nevertheless, I love the idea and I'll be opening bug reports and maybe code contributions if something looks easy enough.


bbkane, this is a great idea (identifying which books didn't import). If you'd be so kind as to help, please open a feature request for this!

https://github.com/internetarchive/openlibrary/issues/new/ch...

If you tag @tabshaikh who helped implement the importer and me @mekarpeles we can make sure it gets triaged and tagged correctly this week :)



This is actually very cool.

Dissatisfied with how slow and clunky Goodreads is I actually thought about making my own (albeit much simpler) version of Goodreads to keep track of my reading habits. I often dig through Goodreads to find books or authors I can't remember the names of -- and Goodreads isn't great for that.

Open Library actually provides the missing piece. The fact that they offer bulk downloads also makes it easier to be a good internet citizen and not send tons of API traffic their way.

Looks like I'll have to set up a monthly donation. I'd really like to see openlibrary succeed.


Hi! I work on Open Library. Yep, Open Library has public APIs, and data dumps (updated monthly) of all our books/authors if anyone needs them.

https://openlibrary.org/developers/dumps

The project is also open source, and you can find the code (and contribute!) on GitHub: https://github.com/internetarchive/openlibrary


Doesn’t the Library of Congress in the US issue ISBNs for each book published? There must be a public listing of those.

After some looking, there are some private databases with millions of ISBNs but no official site, e.g.:

https://isbndb.com/isbn-database


No. ISBNs are issued by publishers, from delegated blocks, and there's no unified listing.

For books in the collections of large libraries (like the LoC) there will be a public catalogue entry with the ISBN attached, but they don't assign it.

There were also a lot of books published before ISBNs were created, and not every book has an ISBN attached even to this day.


That is all correct. Another consideration: occasionally an incompetent publisher puts the same ISBN on two unrelated editions. A bibliographer might just carefully record what's printed on the book, but many users of ISBNs won't be at all happy with that "solution".


I've seen a number of books, mostly children's books from scholastic, with different editions (different art, copyright, etc) that have the same ISBN number. I've since removed most duplicate books from my collection, but I had found a bunch when I last sorted my books; Maniac Magee is the only one I can remember by title. More annoying was the book with a CD attached to the front, that caused the finish to peel off when the CD was removed (the CD contained a non-working installer for a screensaver that, when extracted using other tools, also didn't work).


Correction: ISBNs are issued to publishers by an issuing agency or firm (R. R. Bowker LLC in the US, now owned by ProQuest).

https://en.wikipedia.org/wiki/International_Standard_Book_Nu...

https://en.wikipedia.org/wiki/R._R._Bowker

http://www.bowker.com/products/ISBN-US.html


Also: ISBNs often change when publishers are bought, split, sold, merged etc.


On amazon you can get an ASIN so you don't have to buy an ISBN. So there's no universal book identifier.


I built a private Goodreads "competitor" for friends (with groups and book clubs and whatnot) using the Open Library data dumps for book/author/publisher data (since GR APIs were too restrictive in how they could be used). They're great, easy to use, and the site behind them looks like it's run well and stable (edit: didn't realize they're under Internet Archive!). Would definitely recommend them as an alternative.


See also WorldCat:

> WorldCat is a union catalog that itemizes the collections of 17,900 libraries in 123 countries and territories[4] that participate in the OCLC global cooperative. It is operated by OCLC, Inc.[5] The subscribing member libraries collectively maintain WorldCat's database, the world's largest bibliographic database.[6]

* https://en.wikipedia.org/wiki/WorldCat

* https://www.worldcat.org


I'm looking for a site that supports not only books but short stories.

My primary interest is in recording my thoughts on books and stories I've read, so a review site is what I'm looking for. This is more challenging for short stories as they don't have an ISBN, may appear online, or in a fiction magazine or in an anthology.

So, I may put down my thoughts on short story X that I read in book Y, but if I look up book Z that also features that short story, my thoughts on the story would also appear there.

In short, I'm looking for a site that also records short stories like the ISFDB [1], but allows users to add reviews.

So far, I haven't found one. I'm now putting down my notes on short stories in a Zotero database.

[1] http://isfdb.org/
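
The story-level review idea above comes down to attaching reviews to the story record rather than to the book. A rough sketch of that data model follows; the `Story` and `Book` classes and all titles are hypothetical and not based on how ISFDB or any existing site stores things.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Story:
    """A short story that can appear in any number of books."""
    title: str
    reviews: List[str] = field(default_factory=list)


@dataclass
class Book:
    """An anthology, magazine issue or collection holding stories."""
    title: str
    stories: List[Story] = field(default_factory=list)

    def story_reviews(self) -> Dict[str, List[str]]:
        """Map each contained story title to the reviews attached to it."""
        return {story.title: list(story.reviews) for story in self.stories}


# Reviews hang off the story object, so every book containing it shares them
story_x = Story("Story X")
book_y = Book("Anthology Y", [story_x])
book_z = Book("Collection Z", [story_x])

story_x.reviews.append("Read in Anthology Y; loved the ending.")

if __name__ == "__main__":
    # The review written while reading Anthology Y also shows up here
    print(book_z.story_reviews())
```

In a real database this would likely be a many-to-many table between books and stories, with reviews keyed on the story.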


Do you know how to filter a search on openlibrary by books that are in the library? It's annoying to search and get hundreds of "Not in Library" results.


The "one page for every book" goal seems to position Open Library as a rival to OCLC (WorldCat), as well as the almost perfectly useless Hathi Trust.


Should this be a separate HN post?


This makes absolutely no sense and has no relation to any economic variables. Goodreads isn’t some struggling self-funded startup — it’s owned by Amazon.com. The acquisition was a deal that should have never been approved, if the Obama administration had been anything beyond completely impotent at protecting us from monopoly games:

https://www.theguardian.com/books/2013/apr/02/amazon-purchas...

I would like to understand the true strategic interest behind this. Is Amazon simply penny-pinching now that they’ve successfully obliterated the market for both new and used books online? There’s way more to this story than appears on the surface.


It's not mentioned in the article, but Amazon had disabled its affiliate link program for Goodreads ahead of this announcement, which cut off a major source of Goodreads' revenue and forced them to sell. They had no choice.

The strategic reason for Amazon is obvious. As someone else mentioned, Amazon doesn't want Goodreads data to be used to add value to their competitors' offerings.

Speaking as a developer who tried to build on top of Goodreads API, I also want to add that this was a long time coming. The API had been neglected for some time. And some of the most interesting datasets weren't even made available through the API.


> Amazon doesn't want Goodreads data to be used to add value to their competitors'

This is it exactly. Goodreads was/is the best/largest source of information on books available online.


See also WorldCat:

> WorldCat is a union catalog that itemizes the collections of 17,900 libraries in 123 countries and territories[4] that participate in the OCLC global cooperative. It is operated by OCLC, Inc.[5] The subscribing member libraries collectively maintain WorldCat's database, the world's largest bibliographic database.[6]

* https://en.wikipedia.org/wiki/WorldCat

* https://www.worldcat.org


that would be great, if it had an open API

It is useless without API


Open Library looks like it has a decent API: https://openlibrary.org/developers/api
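
As a rough illustration of what a lookup could look like, the sketch below builds a Books API request for one or more ISBNs. The endpoint and the `bibkeys`, `format` and `jscmd` parameters are written from memory of the linked docs and should be treated as assumptions to verify there; `books_api_url` and `fetch_books` are names invented for this example.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint; confirm against the developer docs linked above
BOOKS_API = "https://openlibrary.org/api/books"


def books_api_url(isbns):
    """Build a Books API URL for one or more ISBNs."""
    params = {
        # Several keys are joined with commas, each prefixed with "ISBN:"
        "bibkeys": ",".join(f"ISBN:{isbn}" for isbn in isbns),
        "format": "json",
        "jscmd": "data",
    }
    return f"{BOOKS_API}?{urlencode(params)}"


def fetch_books(isbns, timeout=10):
    """Fetch and decode the JSON response (performs a network request)."""
    with urlopen(books_api_url(isbns), timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # Only prints the URL; call fetch_books() to actually hit the service
    print(books_api_url(["0451526538"]))
```

Keeping URL construction separate from the network call makes it easy to check the request shape without touching the service.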


Also, user reviews, lists and tagging/bookshelves. Just massive data.


Hopefully the Justice department can use this as fodder for a potential breakup. It clearly demonstrates how monopolies kill competition, not by competing but by buying and extinguishing.


Looks like by-the-book anti-competitive behavior.


Isn’t that what all the big tech acquisitions have been in the last few decades?

Buy it so no one else can have it or buy it so you can shut it down.


It's not illegal though unless you can prove it harmed the consumers.


Apparently Amazon’s Kindle lost the ability to share progress on Goodreads in the last week as well.

I guess the writing's on the wall...


The end of an era. Sad to see. First IMDB and now GoodReads. So much for open data. Thanks for the bait and switch. Good thing we trusted them with our data.

Welp, time to start a better book catalog site with threaded discussions that eBook page turns can be synced with.


I think there's been a long-term trend away from open APIs toward ever-more-proprietary treatment of data. Data that wasn't created by the companies; they just happen to have control of it. Another example is the recent FB lawsuit threat against researchers. [1] Facebook will squawk about user privacy to justify this, but I have a hard time thinking Mark "privacy is dead" Zuckerberg is particularly worried about that.

What I think all of these large companies are doing is pulling up the open-web ladder after they've climbed it to dominant positions. The problem with anti-trust action is that it's reactive; we wait until a company has gotten too big, and then hope we can cut it down to size. I'd love to see moves toward proactive open-data and open-algorithm requirements, so that we guarantee a level playing field. That won't be easy, but neither is trying to rein in companies with annual profits in the tens of billions.

[1] https://www.cnn.com/2020/10/24/tech/facebook-nyu-political-a...


> Another example is the recent FB lawsuit threat against researchers. [1] Facebook will squawk about user privacy to justify this, but I have a hard time thinking Mark "privacy is dead" Zuckerberg is particularly worried about that.

I think you're flattening a lot of complexity here. Yes, I agree that Mark Zuckerberg is probably not terribly interested in our privacy. But one of the biggest outrages in FB privacy history, the Cambridge Analytica Scandal, was driven entirely by a researcher inappropriately accessing and sharing user data. While you might not trust Zuckerberg, and maybe you shouldn't, there is definitely some "there there" when it comes to how researchers handle our data too.


If you're saying Facebook shouldn't blindly trust anybody who claims to be a researcher, I agree, but that seems pretty far afield from what they're doing with the NYU researchers. Indeed, I think the fact that Facebook creates all sorts of real privacy concerns makes it especially bad that they're using that as a smokescreen to avoid accountability.


They did the same thing with IMDb a few years ago. Their data dumps now are absolutely useless and they don’t generate any of the information themselves.


And the real data is now behind AWS Data Exchange at $150k/yr.

Oh, and the old IMDB API you could pay for is gone (or, gone soon), and some of the features it provided are gone permanently.


It's a moat for Amazon. That simple.


I am sympathetic to what you are saying, but I think it does actually make sense: destroying Goodreads and turning it into just another sales funnel was presumably why Amazon acquired it.

A bunch of book reviews and book recommendations that can be used separately from Amazon doesn't help Amazon.


Amazon acquired Goodreads because it was forced to: when still independent, Goodreads made its money through affiliate links to various online bookshops. Soon Amazon's dominance in this sector was so entrenched that most purchases from GR were being made through Amazon.com and not the other sites like Barnes & Noble that GR linked to. It got to the point where Amazon was paying GR so much money in referral fees every month, that it struck Amazon as cheaper just to buy the site outright.


I would want to rephrase that. Amazon acquired Goodreads because it was cheaper to do so than to allow a good service to continue, and prevents those referrals from going to any site other than Amazon. To me, this sounds like textbook anti-competitive behavior, and grounds for Goodreads to be split off from Amazon.


Exactly. Amazon wasn't forced to do anything. Other companies like having a healthy vendor ecosystem. E.g., Toyota is famous for that.


The thing is, Goodreads always felt like a feature. They failed to build a business, let alone one that should take VC (and hence be required to generate venture-scale returns). And my guess is the only way they could have generated the right returns is to move down the sales funnel, which is a direct threat to Amazon's business. Making the subsequent behavior inevitable.

It feels like there are a couple very nice small (not necessarily lifestyle, but not vc) businesses in that space.

I'm aware of https://readng.co and https://bingebooks.com


I don't understand why you think advertising and affiliate fees wouldn't constitute a real business.


Taking VC put them on a collision course with Amazon because of growth requirements, and the limited ways to build the type of business that requires. Advertising and affiliate fees almost certainly don't create a large enough opportunity (interested if you have counterexamples!), and put them on that crash course with Amazon: the obvious route to the size of business that merits VC is to attempt to cannibalize Amazon's business, probably by being the recommendation engine that starts sales, then starting to sell books themselves instead of being leadgen.

Had they instead chosen not to take VC: they had 35 headcount at the time of acquisition. That load probably requires $7m to $10m/year of revenue, particularly when they have to start doing data deals for all that book data / cover photos. I suspect they couldn't make the business work w/o VC given they ran two years on that $750k angel round. And even if they could, having a near-monopsony relationship with their affiliate fees partner makes them not a real business because they are highly vulnerable to Amazon predation. The same as Mozilla -- when there's just one buyer, that buyer names the price.

edit: put more succinctly: a business that either ran and grew on earned dollars (ie no vc), or a VC-backed business that didn't rely on the company you were attempting to cannibalize to play nice with you.


I agree that VC wasn't a great choice. But I don't see much evidence that they were reliant on investor money; per Crunchbase they only took $2.8m over 6 years of operation. And I don't think Amazon was their only potential source of revenue. Book publishers clearly still spend money on marketing, and Goodreads would be able to do very precise targeting. I also suspect they could have sold well-targeted ads to others, as some book categories strongly indicate monetizable non-book interest.


People are weirdly imprecise with their language in this domain. It’s like when people say a corporation is forced to do something to appease its shareholders, a complete falsehood.


I’m afraid you are dead wrong there. Goodreads book listings still contain links to a wide variety of online bookshops other than Amazon: Barnes & Noble, Walmart, Better World Books, etc.


I thought at some scale Amazon has different affiliate agreements.

For example Duck Duck Go. I had thought there was no way DDG could possibly be paid at the normal affiliate pricing structure.

Anyone have insight into this


Oh, I hadn't heard that. Well then, perhaps destroying Goodreads is just the cherry on top of a deal that made sense to Amazon for other reasons.


It's Amazon's MO. They similarly killed IMDB by disabling message boards, my main source of interesting details about movies/actors. Why bother with IMDB now when I can read the same info on Wikipedia?


Amazon was forced to shut down IMDB boards, because the discussions on the most popular blockbusters had devolved into flamewars, people posting obscenities and threats, etc. Amazon ended up having to pay a lot of money for workers to police the forums. It does suck that cinephiles having reasonable discussions about less-popular movies got caught in the crossfire, but I totally understand why Amazon did it, and in fact many large companies have taken down fora on their websites for the same reason.


> The acquisition was a deal that should have never been approved

Regulators should step in when businesses/startups are being harmed by a monopoly via unfair practices such as bundling. But intervening otherwise will simply deny founders and employees a decent exit. And if the economics are not sound enough it'll be harmful for customers as well - the app will get loaded with too many ads or simply shut down.

So IMHO not a good case for regulation.


I think the 2010s model of “just build it to be acquired by the industry giant” was a mistake, and is officially dead. You can’t expect to sell to the giants anymore, because they can’t expect to be allowed to buy everything that could grow to be a threat. It’s time to come up with a new model (maybe even a return to the 1990s model of small IPOs and mini-consolidation into a crop of strong mid-tier players, rather than giants).


Founders and employees are not a priori entitled to "a decent exit", especially not at the expense of healthy market conditions. Or at least this should be the case, but American regulators have been asleep at the wheel for decades now.


[flagged]


Depending on how charitable you want to be, a lot of the specific claims in that thread range from "a bit misleading" to "complete bullshit". If I know he's making false claims about topics that I'm semi-knowledgeable about, I have a hard time trusting what he says in areas where I'm less informed.


> Goodreads isn’t some struggling self-funded startup — it’s owned by Amazon.com

Not that I like Amazon, but it is not a philanthropy. I assume Goodreads takes a non-trivial cost to run (operations, moderation and the site itself), and just because Amazon is doing well financially doesn't mean they have free money. This move makes it easier for Amazon to monetize Goodreads; it is as simple as that.


You're missing his larger point by focusing on the first few sentences.

Effectively the only reason it's not supporting itself is precisely because Amazon bought it.


The only reason it was supporting itself previously is that investors were pouring money into it in the hope that some big company would buy it because it could be valuable to them.


Nope. Per Crunchbase, Goodreads took only $2.8m in funding to cover the more than 6 years they were in operation.


Key para:

The web has to mature beyond advertising as a business model. For this to happen people are going to have to open their wallets, pay for the services they use, and support independent businesses. That’s how we build a web where indies can thrive - one that’s more village centre than financial centre. I think the shift is underway.

True/false?


I would happily pay for all of the services I use. The problem is that that isn't enough for them and it would still be a capitalistic-greed, anti-consumer shit show.

If I pay I want: No ads, no tracking, full access to my own data in sane export formats, schemas, no data mining, no data selling, no "sharing data with our partners", encryption options, no dumb hoops, no dark patterns, the ability to point a product at an API endpoint of my choosing, backup options that default to my infrastructure first and so on.

Actually let's add more: The data generated by my use of my data in the product. Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago. Prominent indication of where (geographically and legally) data is stored and used. If/how often you do backups. If/how often you practice disaster recovery.

So really what I want to pay for is sanity and no bullshit.

Yet if I do pay, many services and companies will still do all of this shit in the background until midnight the day it's finally made illegal, all the while gaslighting me about "how much they value me as a customer" and how they "respect" our "relationship".

It's literally obscene.


Yeah if you pay then they'll have strong identities linked to the accounts in addition to the tracking. Free services that can be accessed without a log in on the other hand have to correlate all the data. They are quite good at it, but not perfect.


That’s not different from pre-Internet era, though. Subscriptions were never anonymous.


Of course, the fact that you were subscribed was known to them, but which articles you read in your paper copy of the New York Times they didn't know, unless you specifically told them. Nowadays the Reddit redesign tracks your mouse movements, Netflix when you pause a video, and Amazon which page of the book you are reading. I don't want any of this tracking for myself. I have no problem with people opting in, but don't make this tracking inescapable.

A tracking-free format would involve you downloading the entire newspaper issue, then reading it in ways of your own choosing.


Without regulation that won't ever happen.

Because you, a paying customer, are worth the most to the advertisers.


I never buy this argument. If this were true, why aren't Tesla cars, Apple computers, Rolexes and other premium products plastered full of adverts?...


These companies still participate in advertising. All of these brands include the company logo and usually try to differentiate their design so that you, as the consumer, are a walking billboard for them. Beyond that, you can be assured that they collect any data on the consumer that they can. What they do with that data is out of the consumer's eyes.

It is also worth noting that there is plenty of advertising in Apple products. Heck, the last time I used one they had at least two digital storefronts built in. They may not be as crass as Microsoft is (e.g. with the use of the Start Menu), but it is data driven. Whether the advertising is plastered everywhere or not simply reflects the target market, rather than how they collect and use data.


I think including advertisements to have logo brands or linked purchase options is a bit of a disingenuous generalization, it's not what we're talking about with "advertising" in this context.

Yes, you are promoting Apple products if you carry around a MacBook with its logo, but you aren't being advertised to. The built-in stores might just barely meet the definition, but it is very different from a random ad during a TV show or on your desktop.


Yup. I was very disappointed to discover that with the purchase of my iPhone, the Settings app, the damn settings app, had ads plastered at the top of it. Two of them trying to sell me subscriptions to Apple Arcade and TV+. I’m very disappointed that, like every other garbage company, they’ve decided to subvert their own customers by following wherever the eyeballs go and sticking ads there.


Apple Arcade, TV+ etc. are basically optional upgrades for your devices.

This is VERY different to letting third-parties advertise to you.


They’re also not _settings_. They are unwelcome clutter that are not there to make my device serve me better while I go about my life. Just send me a marketing email so that I can mark it as spam.


and 1st party ads are still ads.


Tesla aggressively tracks your car’s movements and your driving style.

They have publicly discussed their plans to monetize that data.


Would you mind sharing the public discussion to monetize data? Would be interesting to read.


Apple is leaning into services and advertising heavily right this instant.


What? Since when do Hyundai cars, Windows computers or Seiko watches have ads? Those are not the categories of things where ads are tolerated.

That's not what the previous commenter was talking about at all.

Some platforms and services have been selling "no ads" as a feature. Say Apple or YouTube Premium or 100s of 1000s of applications on Android. This does not mean they'll never open up to ads in the future. Maybe a future "premium value" will just be "fewer ads". And when they do, they're more likely to generate sales because they're customers who are already paying for premium services.


Have you ever looked at a fresh Windows 10 home install? It's full of ads. [1]

The argument says "the more you spend the more you are worth to advertisers, thus eventually you will see ads". It doesn't add, "there are some categories of products for which ads aren't tolerated but for all the others this holds"

There are more than enough examples anyway where cheaper options have ads and more expensive ones don't, like airplanes or restaurants and bars.

[1] https://www.windowscentral.com/how-remove-advertising-window...


Good Lord. That wishlist.

If only...

Maybe one day these things will be standard. We have to convince the mainstream these are goals worth pursuing... As long as most people accept how shit the status quo is, it won't improve.


The game is rigged, though. Mainstream is not in a position to say anything against the kind of crap modern technology is full of - and those who speak up anyway get labeled as "whiny".

FWIW, I also share the GP's wishlist 100%. But we're a niche in our own industry these days. I'm not having hopes the market will deliver - on the contrary, all these points are things for unscrupulous vendors to control to extract more profit.


So one of the things that bugs the crap out of me is hearing programmers complain about features or bugs in open source software while doing nothing.

Like, I know it's a lot of work to jump into a codebase and make a change and that some projects are a pain to work with, etc etc, but still. This is what OSS is supposed to fix, and we're exactly the demographic who is supposed to be able to contribute.

Obviously, the above is even a bigger lift, but at the same time -- if a community of online hackers, many skilled, many experienced, many damn near independently wealthy from decades at FAAMGs, can't build an Internet 3.0 to fix the Internet 2.0 we all got rich building on top of Internet 1.0, who is supposed to?


> Maybe one day these things will be standard

Many of these things already are standard for EU data subjects.


There needs to be some sort of institution that can verify these claims that these companies can use.

The reason all of these things happen is because it's easy to slip into them in a tight financial spot and there's usually no instantaneous backlash.


> There needs to be some sort of institution that can verify these claims that these companies can use.

A government?


This is the least efficient answer. A more efficient answer is an independent entity that simply has the public trust and the capability to verify such claims. I can imagine a non-profit that pulls this off, if the will to execute were there.


Why is the idea of government regulation of fitness for purpose the "least efficient" answer. That's one of the primary purposes of governments, to ensure the quality and safety of what citizens consume and how it is delivered.

Otherwise, why are there regulators for advertising or for food or government standards for anything?

I'm not talking about censorship or somehow evaluating the quality of content, I'm talking about if a company is delivering a service and I am paying for it, then the conditions under which they deliver that service should be regulated to ensure a fair and competitive marketplace.

Apple shouldn't be getting plaudits for making privacy a unique selling proposition.

All the other companies should be getting told that their business is unfair and exploitative. Privacy should be a right, the control of my personal information should be mine and consent to have it should be able to be withdrawn at any time.


> Why is the idea of government regulation of fitness for purpose the "least efficient" answer.

Government requires thousands (literally) of people and political will in order to move any solution forward. In terms of cost and time, it is substantially less efficient than independent entities.

An independent entity could theoretically start immediately, with a small team, and focus explicitly on the stated goals of the team.

> Apple shouldn't be getting plaudits for making privacy a unique selling proposition.

Absolutely agree. The average citizen has no idea how hard they're being exploited.

> Privacy should be a right, the control of my personal information should be mine and consent to have it should be able to be withdrawn at any time.

Also agree, but the government has shown they have an interest in making that not happen. The ties between 3 letter agencies and large tech companies are just the tip of that iceberg.


EU government might be trustworthy enough to run an institution like this. US definitely isn't.

In the UK the soil association is a private charity which mostly decides which food is "organic" and which food is not and it does a pretty good job. In the US, the USDA decides whether you can call your food organic and from what I understand it's kind of meaningless.

I think a private charity would work better here.


The number of people that would need to coordinate their actions to achieve the outcome is almost certainly fewer.

The formation of something new, and independent, means it would also be less susceptible to corruption.

Given that this is one of the primary purposes of governments, what do you suppose has been keeping them from stepping in so far?


> Given that this is one of the primary purposes of governments, what do you suppose has been keeping them from stepping in so far?

Businesses are much faster at inventing ways to abuse people than governments are to regulate them away, even if we ignore lobbying and corruption.

Such a hypothetical organization we seek needs to be both swift and impactful. This automatically suggests some ties to government, as it's the best way to fulfill the criterion of impact. Not sure how to handle the "swift" part, though. Can't do it through regular market mechanisms, because ultimately price trumps all on mass market. The government reacts slowly for (among others) a good reason - while an unscrupulous businessman needs to invent just one weird trick, the government regulates away whole classes of tricks by default, and they need to take care to not outlaw perfectly legit new ideas.


> The formation of something new, and independent, means it would also be less susceptible to corruption.

"independent" is doing a lot of work here. Government regulators are, at least in theory, independent because they're paid from taxes collected, and have no incentive . What mechanism is supposed to keep the new 'independent' entity, actually independent?


I can think of a few mechanisms that work but they all go after the money. It can be direct (betray the public trust, don't get donations) or indirect (betray the public trust and lose the beneficiary status on the trust that the government dumps your budget into...)

How it's designed is important, but even under the most elaborate schemes it would be less costly and human resource intensive than having the government directly take charge of it.


Who ensures the auditors aren't fucking up then?

It's not that a private entity can't do it - I largely trust the TÜVs and UL and other NRTLs for example. It's that most auditors - especially once you leave the "ship a physical product with verifiable physical properties" sector - instead look like E&Y, which I trust less than most of the companies they have audited.

You eventually need to have someone who will be substantially at risk if the audit is insufficient - "skin in the game" to use the modern idiom - controlling the audit. In the case of UL that's insurance companies (which are highly regulated by... the government), for other NRTLs it's the government directly.


How are you defining "efficiency" here?


Capital to get started, how many people need to be involved, how many people need to be convinced to get started, how long will it take to start, etc


A search engine which only lists sites which adhere to a set of rules / quality standards is the only answer I see here. The question is would people use it? I don't think so. They won't even use DuckDuckGo and similar, which attempt to address the issue themselves but at the same time provide a complete index. Not enough care / awareness.


Audits can do that. Maybe a voluntary audit where they can then say they're compliant with NoBullshitForYourMoney.org


I have a product that does of what you ask, e.g., no ads, no tracking, no "Dear valued customer".

Unfortunately:

- competitors use revenue from ads, tracking, selling data to their advantage and undercut your value based on price.

- and free users love free stuff. I had a user ask me to include ads in exchange for paid features.

- and yet people still distrust you. Once I proudly posted a new feature to Show HN and the first response was "How does this post not qualify as 'spam'?". Some people automatically think you're a bad guy since you sell something.

- without advertising, how do you get the word out?

I rely on word-of-mouth, since it carries much trust, and news articles. This means it's a slow game.

It is very hard to find mutual respect (both ways) between user and maker. Most relationships start with distrust or "what is in it for me".


Randomly saw this comment & your profile and just wanted to say hello and well done on the app.

I recently started a historical time-series weather data api based on ERA5/GFS and your story really resonated with me. Similarly, I don't come from weather background and initially built it for my work with building energy analysis, with service live but still at an extremely early stage.

Happy holidays. :)


If I may add: stopping the payment should be as easy as it is to start it. Not "call us at this number in this time frame and jump through multiple hoops that we've set up to try to convince you to keep paying".


Or, if everyone (or at least more people) managed to have symmetrical high-speed Internet connections, self-host a lot more stuff.


You're going to leave security, data integrity and privacy compliance up to random unaccountable anonymous strangers on the Internet who self-host?

You're going to trust them to have proper backups, proper disaster recovery, proper resiliency and scalability?

What you're describing works for small-time stuff like blogs, personal projects or other inconsequential things, but anything at the scale of Goodreads, where users are trusting years of data to someone just can't be hosted by random people.

I'm not saying I have the answer but "people should self-host this kind of thing on their Internet connections" is not it.


I wouldn't necessarily trust others with my data, but if a person wants to trust themselves then that's up to them.

Symmetrical high-speed connections allow for the option of self-hosting one's own stuff and being able to serve it to the Internet (or connect to it remotely), but it is not mandatory for people to choose that option. They may not feel confident in their own skills and so choose the option of paying someone else to be responsible for it.

I'm simply talking about having more options.


Is the answer to patiently await the post-scarcity anarchist utopia?


My main criticism is unfettered and immoral capitalistic behaviour. That doesn't automatically mean I want to replace it with a different unworkable -ism.


Ah yes, but doing literally anything about a problem is utopianism!


Rather like those awful GDPR notices which use Orwellian doublespeak like "We value your privacy" while presenting you with a dark pattern dialogue box which you have to opt out of, rather than (as the GDPR stipulates) opt in.

Nearly every time I contact customer services these days I'm fobbed off with obnoxious PR speak instead of just telling me straight.


> If I pay I want: No ads, no tracking, full access to my own data in sane export formats, schemas, no data mining, no data selling, no "sharing data with our partners", encryption options, no dumb hoops, no dark patterns, the ability to point a product at an API endpoint of my choosing, backup options that default to my infrastructure first and so on.

GDPR's right to data portability provides much of the export functionality you're after. It must be structured, in a format that is commonly-used and machine-readable. The ICO's guidance suggests that CSV, XML and JSON best meet this requirement.
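As a rough illustration of what a structured, commonly-used, machine-readable export could look like, here is a minimal Python sketch that writes the same records as JSON and as CSV. The `shelf` records and their field names are made up for the example; they are not any real service's export schema.

```python
import csv
import io
import json

# Hypothetical reading-history records; field names are illustrative only.
shelf = [
    {"title": "Dune", "author": "Frank Herbert", "rating": 5, "date_read": "2020-03-14"},
    {"title": "Emma", "author": "Jane Austen", "rating": 4, "date_read": "2020-07-02"},
]


def export_json(records):
    """Serialize the records as a JSON array."""
    return json.dumps(records, indent=2)


def export_csv(records):
    """Serialize the records as CSV with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


if __name__ == "__main__":
    print(export_json(shelf))
    print(export_csv(shelf))
```

Either output can be re-imported with standard tooling, which is the practical point of the portability requirement.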

Tracking is something else that GDPR helps with. Tracking of personal information via e.g. cookies requires active consent. Silence is not consent.

"sharing data with our partners" requires a lawful basis when dealing with EU data subjects. This will normally be consent where data is sold to third-parties for e.g. marketing, so data subjects will be able to make an informed decision and opt out of this. Again, silence is not consent - and burying data sharing in an unreadable legal document is not informed consent.

> the ability to point a product at an API endpoint of my choosing

The right to data portability includes this:

> Individuals have the right to ask you to transmit their personal data directly to another controller without hindrance. If it is technically feasible, you should do this.

> Actually let's add more: The data generated by my use of my data in the product.

This is in scope for a Subject Access Request.

> Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago

This is difficult to solve with regulation but I think it's an entirely reasonable thing to expect for your money. GDPR does not help here.

Hopefully if there are multiple competitors in the space, customer support is something that providers can compete on.

> Prominent indication of where (geographically and legally) data is stored and used

Privacy information already must contain a transparent list of data processors:

> This includes anyone that processes the personal data on your behalf, as well as all other organisations.

What we really need is for other countries to start taking data protection regulation seriously.


> I would happily pay for all of the services I use.

Idea: Move all online businesses to a new digital-only currency. Let people earn that currency by donating the processing power/storage/bandwidth of their devices, like the @Home projects. Of course people could always convert existing currency to the new e-currency.

Let's say an hour of donating an average laptop = an hour of using Google, Facebook, etc.


You get almost all of these things from sourcehut (my product). It is entirely possible, and you're right to demand it.


>and so on.

Yours is a great list of stipulations. I would just add: support for open + interoperable protocols such as ActivityPub and RSS.



Pardon the pun but that is an awesome link suggestion. I was looking for something exactly like that! Thanks.


> If I pay I want

> Non-canned support responses that don't ask for information I literally put in the ticket three weeks ago.

No, sorry. I don't want to put too fine a point on it, but you get this if you pay what I want, not what you want.


What would you consider to be a reasonable price?


And who is going to be able to provide for all this shit? I'll tell you who - FAANG. Good luck starting your company when all of the above is a law! I've had enough clicks on GDPR's "OK" buttons. What did it change?

All this crying about tracking - how else can the owner of the place make the product better? If I own a store - I can see where people go, how they shop, how they walk around, which basket size they prefer, what they buy. If I don't collect data on my website - I can't THINK about how to make my service better. I can only GUESS. What about data collection for simple functionality - like, when you come back to the half-filled form and I remember the values you already submitted by matching your cookies. Are you against this as well?

Sorry, but if this is the alternative - I would rather have Google know everything I do online and hope that they honestly don't store data on my Incognito browsing. If they do - worse for them.


I always hate the framing of something being "capitalistic greed". The market has decided that they don't care about their privacy. You are complaining about the wrong people, you should be complaining to your fellow consumers/users. I've lost count of the number of discussions I've had with people where they said they don't care about their privacy.

As for legalities. It is a global world. There is no way to enforce this effectively globally. This is a pipe dream.

Also data in some cases must be shared with partners, those might be payment processor, ID checks etc.


The market is literally fueled by people full of capitalistic greed one-upping each other and forcing others to play by the same rules. It's a capitalistic greed optimization engine. For all benefits we yield from it, it's still fair to call it capitalistic greed and notice the ethical failures the market strongly encourages.


I cannot stand the gas-lighting here. Your perception of capitalism is one that is framed as a winner takes all mentality that has been sold to you by propagandists.

Capitalism is about free market trade. You provide something and people choose whether they want to buy it or use it. People add greed qualifier in there so they can frame it as something illicit going on.

The fact still remains that if people cared about their privacy (and there is no evidence they do), they wouldn't use these sites.

Before it was done online. Store cards used to track purchases and spending habits in store in the same way that sites do today (however at a much greater scale) and customers were given vouchers in return.

In much the same way, almost all the local stores have disappeared, replaced by large corps that can provide everything in superstores, and that too is the fault of the consumer for not supporting their local stores.

> For all benefits we yield from it, it's still fair to call it capitalistic greed and notice the ethical failures the market strongly encourages.

No the failing is on us and the users of the site for using these services when we were warned by many people that this would be the case. Pretending otherwise is passing the buck.


I agree, the consumer has responsibility here, and leverage.

But that does not absolve the producer. They are still using ethically questionable methods.

The market doesn’t get to decide the rights and wrongs. It just allocates resources. That we have to do collectively if we want to call ourselves democratic.


I don't care for democracy, so no I don't want to call myself democratic.

> But that does not absolve the producer. They are still using ethically questionable methods.

Most people don't care. How can it be an ethical issue if the vast majority of people are unconcerned by it?


But the free market literally cannot exist? The free market won't bring you back to life so you can "choose a different competitor" if it kills you.

That food? Bad. Dead. That tool? Dangerous. Dead. That work on your house? Dangerous. Dead. That car? Unroadworthy. Dead. Those aircraft parts? Counterfeit and not to spec. 300 people dead.

Every single thing humans do is already regulated in some way. Why? Because humans in the end, like all animals, try to achieve the best least effort : highest reward ratio they can.

In the modern world, these regulations need to be extended to automatically cover modern technologies and prevent inherent harm. They shouldn't be overbearing. They shouldn't be pointlessly excessive. But they are required for all things.

Many people think capitalism is greedy because they have literally spent a lifetime experiencing greed fueled capitalism first hand, not because they sit on YouTube watching propaganda videos.


> But the free market literally cannot exist? The free market won't bring you back to life so you can "choose a different competitor" if it kills you.

Life is full of risks. Lots of things were approved by both government and experts in the past that were bad for you.

> That food? Bad. Dead. That tool? Dangerous. Dead. That work on your house? Dangerous. Dead. That car? Unroadworthy. Dead. Those aircraft parts? Counterfeit and not to spec. 300 people dead.

The FDA has stopped people from getting medication in the US that are over the counter medicines because they haven't been approved for use. It is a double edged sword.

>Every single thing humans do is already regulated in some way. Why? Because humans in the end, like all animals, try to achieve the best least effort : highest reward ratio they can.

Unfortunately. What has the current light regulation on the web brought us? Cookie popups that are irritating and that people just click through, and GDPR warnings that don't actually solve the problem of collecting your data. I don't hold out much hope for future regulation, which btw will favour the big tech players that have been collecting our data thus far. BTW you don't know the names of many of them, because they are B2B players and provide services to the companies we do know the name of.

As for "best result for least effort". Well it depends how it manifests itself. It can either be laziness or efficiency. The latter is not a problem.

> In the modern world, these regulations need to be extended to automatically cover modern technologies and prevent inherent harm. They shouldn't be overbearing. They shouldn't be pointlessly excessive. But they are required for all things.

Inviting any sort of regulation will involve government. Government will try to justify itself by demanding more regulation. It will always be overbearing and that will cement these players in place.

At the moment, we have the best chance of these players being toppled. People are looking at alternatives to big tech and are going to smaller players, mainly due to censorship. The trickle has now become a stream, sooner or later it will be a flood. However because of regulation on the horizon (which doesn't address any of the issues we care about)

> Many people think capitalism is greedy because they have literally spent a lifetime experiencing greed fueled capitalism first hand, not because they sit on YouTube watching propaganda videos.

I suspect you are confusing corporatism (which is a form of fascism) with capitalism (which is a party of liberty).

As for propaganda. I never said anything about Youtube. Don't put words in my mouth. I am talking about how hollywood, novelists (since the 19th century), newspapers have framed it since forever. You are soo fermeted in it you don't even realise it is propaganda.


> What has the current light regulation on the web brought us? Cookie popups that are irritating and that people just click through, and GDPR warnings that don't actually solve the problem of collecting your data. I don't hold out much hope for future regulation,

GDPR is quite decent as laws go; the problems you mention happen because the regulation enforcement is too weak. Displaying a cookie popup was never anything but an admission that you're doing something you're not supposed to. GDPR notices again mostly give the same evidence. A lot of them aren't even compliant. I honestly wish DPAs of EU member states would start beating these companies down until this bullshit stops.

> I suspect you are confusing corporatism (which is a form of fascism) with capitalism (which is a party of liberty).

Potayto, potahto. Capitalism structurally favors something resembling corporatism, because capital compounds - the more you have of it, the easier it is to get even more. The market is a dynamic system - what matters is what it evolves over time into.


> GDPR is quite decent as laws go; the problems you mention happen because the regulation enforcement is too weak. Displaying a cookie popup was never anything but an admission that you're doing something you're not supposed to. GDPR notices again mostly give the same evidence. A lot of them aren't even compliant. I honestly wish DPAs of EU member states would start beating these companies down until this bullshit stops.

All sites will do is run a cost assessment of whether it is worth serving those in the EU and then just block the IP range, and people who want to use those services will just use VPNs anyway (which is what I do when I am banned by IP from a site because of the GDPR rules).

>Potayto, potahto. Capitalism structurally favors something resembling corporatism, because capital compounds - the more you have of it, the easier it is to get even more. The market is a dynamic system - what matters is what it evolves over time into.

No it doesn't. Corporatism is a collusion with government. If governments were smaller, buying influence wouldn't be effective. You don't even understand what you are arguing.

Yes, the market is a dynamic system; that's why, if you allow it to operate freely, those companies that are abusing their position will start to lose market share as competitors that don't become more attractive to consumers. However, once you involve regulation, that mechanism doesn't happen, because you just raised the bar higher for all the would-be smaller players.

Again you always want to frame it in the worst light.

Anyway. Fuck this site, dissenting opinion is frowned upon here. So much for the hacker part.


The major barriers to competing with Google/Amazon/Facebook etc are not regulatory hassles, by far. Even in highly regulated industries, like space launch vehicles, the additional cost of compliance over your own due diligence isn’t the biggest barrier.

Also, consumers on average do a bad job of managing anticompetitive behaviour and harmful externalities, even if they know about them. Convenience and habit are strong motivators. And we need regulation to disincentivize companies from outright lying in the first place.

Companies and people in a fully free market system won’t magically become rational automata that behave ideally. We’re only human. Our superpower is the ability to collectively leverage our individual specialization. Foresight for negative externalities is a specialization that needs regulation to be effective.


There's plenty of room for dissenting opinion here. But if you're going to offer up an argument for capitalist-libertarianism that isn't any more than a largely evidence-free rehash of the same positions that filled talk.politics.theory on usenet in the 1990s, then yeah, you're probably not going to see a lot of support. Offer up a cogent, evidence-filled position that genuinely causes people to say "huh! gonna have to think/read about this ..." and I'm fairly sure you'd see a different response.


I've given my rationale. You call it evidence-free when there are thousands of examples where it works, and almost every time there is government involvement it turns out to be a mess. So piss off with your patronising response.


Thanks for your reply, it was really interesting and gave me some things to think about.

I agree that government intervention is not ideal, as government is also often dumb, evil or incompetent. But that's more our failure to set up a political system where only the best, skilled, most ethical, least selfish people can rise to the top. We are nowhere close.

Propaganda: I wasn't putting words in your mouth intentionally, I see your point though. It was a description of how your words felt to someone making a criticism of capitalism, that for me to dare criticise I must be propaganda-ised.

I don't watch movies. I don't have a TV. I don't read newspapers. I read a lot but a broad spectrum of works from a variety of times.

I like your distinction between capitalism and corporatism, it's a great point. I wonder though: Corporations exist inside of capitalism, so isn't it a failure of capitalism to bring outrageous corporations to heel?

I dislike your "fermeted" comment, you literally followed an accusation of fallacy from me to you, with a whole bunch of actually intentioned fallacy of your own?

Anyway, have a great day!


> Capitalism is about free market trade.

A sort of freedom. Not free from taxes, free from interference. But that's really never the case.

Say you buy something from a store, and you expect their use of your data to follow the terms on the loyalty card. They've got a bunch of commercial-code laws that protect them. You can't pay with counterfeit money, or give yourself a 2-for-1 discount. If you use credit and don't pay, men from the state with the right to use violence come and collect for the store. They're totally legally protected. But how are you protected? Only at the end of a hugely expensive court case in the best outcome, but probably not at all.

The potential risk to the store from you is limited to the cost of goods, the risk to you is almost unbounded. You might lose healthcare coverage, or your boss might buy your data and use the store to link your id to your pornhub usage and fire you. Minimal, limited risk for them, huge unbounded and unprotected risk for you.

But then you discover after buying your gallon of milk, that they (knowingly) sold your data to someone who then sold it to, let's say an insurance company, in direct violation of the words on the back of the card. Now, how do you get made whole?

So, no. I don't actually feel that the free market meets that description for 99% of transactions. Between two citizens over a used lawnmower, yes. Between Warren Buffett and Bill Gates, yes. But between you and the supermarket - not even a little.


> Your perception of capitalism is one that is framed as a winner takes all mentality that been sold to you by propagandists.

Not really. My perception comes from thinking about the market as a dynamic system, and not a static picture described by pro- and anti-market propagandists.

> Capitalism is about free market trade. You provide something and people choose whether they want to buy it or use it.

That's a "motte and bailey" defense. Such a perfect free trade almost doesn't exist, and very few are privileged to partake in it. Market offerings aren't independent - they're in competition, which means a lot of possible products aren't provided, and those that are face competitive pressure to provide less value for more price.

The most important thing to recognize is that the market, as a dynamic system, optimizes for profitable companies. Not for maximizing value these companies deliver to their customers. Usually, providing value is the most straightforward way for profit. But there are other ways - ways to provide no value, or even negative value while still netting additional profit - and the market takes them as much as it can. Vendor lock-in and surveillance are just a few ways of making money by providing negative value-add.

> People add greed qualifier in there so they can frame it as something illicit going on.

Not illicit. Immoral. Because after all is said and done, the market is still entirely made of people and their decisions, which get to be evaluated through the lens of ethics.

> The fact still remains that if people cared about their privacy (and there is no evidence they do), they wouldn't use these sites.

They care and they will use them anyway, because the market doesn't provide any other option.

> Store cards used to track purchases and spending habits in store in the same way that sites do today (however at a much greater scale) and customers were given vouchers in return.

In the store. Not across stores. And they gave vouchers back, not shoved extra ads in your face. And that's without touching the qualitative difference between a human clerk doing the surveillance and automated systems doing the same.

> Pretending otherwise is passing the buck.

I hate to invoke the concept of "victim blaming", because it's usually invoked very unreasonably, but - you can't expect individuals to be able to rationally make market decisions while working their asses off trying to make ends meet, having their attention DDoSed, and facing against compounding improvements in manipulation techniques (courtesy of the market). I'm willing to cut regular folks some slack, and instead focus on the people running these companies, who had a clear choice, and chose to engage in abusive practices. You don't impulse-adopt business models, so you can't excuse it as a moment of weakness either.

(But then I'm willing to cut these business folks some slack too; in many cases, it's the market pressures that force them to choose the abusive option - which leads us back to my original point: the market is a capitalist greed optimization engine. It promotes business people who think that, much like your margin, your ethics are their opportunity.)


> Not really. My perception comes from thinking about the market as a dynamic system, and not a static picture described by pro- and anti-market propagandists.

Nonsense. Your framing is exactly the same. Don't gaslight me on this. I am not naive.

> That's a "motte and bailey" defense. Such a perfect free trade almost doesn't exist, and very few are privileged to partake in it.

Yes the free market doesn't exist because governments stick their noses in.

>Market offerings aren't independent - they're in competition, which means a lot of possible products aren't provided, and those that are face competitive pressure to provide less value for more price

What a load of nonsense. It is because of the free market we have niche products (Amiga accelerators would exist and that is pretty damn niche).

> The most important thing to recognize is that the market, as a dynamic system, optimizes for profitable companies. Not for maximizing value these companies deliver to their customers. Usually, providing value is the most straightforward way for profit. But there are other ways - ways to provide no value, or even negative value while still netting additional profit - and the market takes them as much as it can.

You are speaking out of both sides of your mouth.

> Vendor lock-in and surveillance are just a few ways of making money by providing negative value-add.

Is vendor lock-in a problem if the customer is happy with it? It is up to the individual customer to decide.

> Not illicit. Immoral. Because after all is said and done, the market is still entirely made of people and their decisions, which get to be evaluated through the lens of ethics.

That is splitting hairs. No, you want them to be evaluated through the lens of ethics because it benefits you to do so in this argument.

> They care and they will use them anyway, because the market doesn't provide any other option.

You don't need YouTube, you don't need Facebook, you don't need a lot of this nonsense.

> In the store. Not across stores. And they gave vouchers back, not shoved extra ads in your face. And that's without touching the qualitative difference between a human clerk doing the surveillance and automated systems doing the same.

You have no idea if that data wasn't sold to anyone else. The vouchers are in themselves ads.

These store cards proved two things. The first being that people will willingly give up their details for some trinkets, and the second that tracking customers and optimising via that works. It was a stepping stone.

> I hate to invoke the concept of "victim blaming", because it's usually invoked very unreasonably, but - you can't expect individuals to be able to rationally make market decisions while working their asses off trying to make ends meet, having their attention DDoSed, and facing against compounding improvements in manipulation techniques (courtesy of the market). I'm willing to cut regular folks some slack, and instead focus on the people running these companies, who had a clear choice, and chose to engage in abusive practices. You don't impulse-adopt business models, so you can't excuse it as a moment of weakness either.

Yes, I do expect individuals to be able to rationally make decisions. People have been told for years and years on end what these companies do and they don't care. So it is their fault.


>Is vendor lock-in a problem if the customer is happy with it? It is up to the individual customer to decide.

No customer is happy with vendor lock-in. It is negative value to the consumer, which was the parent's point.


Why do people buy Apple products on mass then?


Because there are many factors which go in to decisions and a particular negative feature might be outweighed by other concerns?

Also, pedantic point, I think you were looking for the phrase en masse, of French origin. See https://en.wiktionary.org/wiki/en_masse


Think how many more people would buy Apple if not for the lock-in.


So you think it's obscene, but you still use them? Doesn't that suggest that you value your privacy less highly than you claim?


Life is full of compromises; it may still be the best available service on balance.


Sure, but one available choice is "don't use x". The point is that he says it's obscene, but he acts like he prefers it to not having it.

One worry about the rising tide of anti-Facebook etc. sentiment is that it is driven by (a) a small number of vocal people who (b) say that x is horrible, but don't act like it. (I don't say whether this worry is always, or even mostly, correct. Just that it is worth considering.) The result can be problematic.

For example, like millions of people in the EU, before I use any new website, I now have to click "OK" on a GDPR button. I consider myself as caring about my privacy. But my behaviour tells me that I just click "OK" without even reading what they have to say. Obviously, my privacy is less important to me than the 10 seconds or 1 minute that it would take to read their policy and opt out from anything I dislike.

You can respond that this is a "dark pattern", and I agree. You could also say it would be better for me to be able to set my privacy once - say, via browser settings - and I indeed try to do that, having used various Firefox settings and plugins to manage my cookies. Nevertheless, the fact is that, as my behaviour reveals, I would rather get on with my life, than waste 10 minutes a day clicking on privacy policies. Given that, of course, I would also rather not be presented with the "GDPR OK button" at all. It's a fleabite of annoyance that I deal with many, many times, and every time I mutter to myself "screw the EU and screw the GDPR". I suspect I am not alone.

I also suspect I will be downvoted for this, by just the noble idealists about whom I am grumbling. Sigh.


"I would rather get on with my life, than waste 10 minutes a day clicking on privacy policies."

The price of privacy sounds cheap. 10 minutes a day for privacy. How much cheaper do you need it to be?


That may be an excellent argument for what I ought to want, but I am reporting what I actually do want.


Use what? The gist of your argument here feels like it's heading straight towards the "you are against foo, but you still bar, hmm interesting" meme.


A lot of blockchain services share that philosophy and have a decent revenue model with the tokens.

The only problem is that the consumers want to trade the tokens at a profit instead of as purchases.

But that isn’t really a problem for the service that sold them. It is revenue. But people have an uncomfortable relationship with other people making money when they can extrapolate how much and consider that a problem.

Many services now are completely client side and use the nearest node that you connect to as the backend. They store enough variables in their smart contracts on chain and do the rest of the calculations client side. So their web service isn’t tracking you. But if you reuse addresses other people are.


The model dictates the medium. Advertising as a model has forced a need for engagement, notifications, stickiness, attention grabbing.

I'd argue the term 'pay for the services [they use]' is too vague here to be meaningful - there are too many options that would drastically change the incentives.

Pay per API call? APIs start to need 5 calls to get all the info you'd need for one request. Subscription model? Consumers are going to have to juggle a different account for every provider they use. Subscription to aggregator who then pays content providers based on usage? We're back to the clickbait situation we were in in the early days of advertising and are arguably still in.

For me the insane thing is that there are no options. I can't universally buy any song I want DRM free from a range of providers. I can't pay per article for news from an RSS feed.

TV is an interesting one because the industry has convinced the user base to pay for content, but the subscription model is already showing some of the limitations shown above.

I feel like the biggest innovations in this space aren't so much new ideas or 'just convincing people to pay for things', it's a case of making the payments as easy and understandable as their previous counterparts.


> it's a case of making the payments as easy and understandable as their previous counterparts.

Blame AML/KYC.

Frictionless, permissionless micropayments are illegal. On purpose.

This is not a social or technological problem. It is a legal problem. AML/KYC does not scale.


Frictionless, permissionless large payments are illegal. The problem is that no one has worked out how to make micropayments costless, and that has nothing to do with AML/KYC and everything to do with the costs of running a payment system.

Bitcoin has turned out to be expensive to mine. Most other distributed ledger "currencies" have been launched to make money for the creators of the artificial scarcity.

Governments cannot give up control of the money supply but they are the only ones that can establish a form of electronic currency that works the same as cash does now.

Whether they're willing to make it as anonymous as cash or allow it to be exchanged for other government's electronic currency is part of the problem that needs to be solved.


Nanocurrency addresses these issues. There's still the cost of running the node for those that need it. Outside of that transactions are free and typically settle in about a second or less.

The KYC stuff is still there at the exchange level of course.


Could you say what AML/KYC is?


anti-money laundering and know-your-customer rules.


> For this to happen people are going to have to open their wallets, pay for the services they use, and support independent businesses.

Yes but to be frank if I had to pay for everything I use I would go broke.

We're not all on SV salaries here. I have about 100€ of disposable income per month. If I put it on online things, it means I would have to stay at home (I mean pre or post pandemic times).


The problem is, companies overcharge in dollars and undercharge in ads. Your visit to a website may make that site, say, $0.1 in ads. Would you pay $0.1 per day? Of course you would. But when they add a paid ad-free option? It's more like $1 a day. They're greedy.


Even $0.1 per day is too much for the person with $100/mo disposable income to continue browsing with the freedom to browse they have now.

Sure, for one site, that $3/mo will not break the bank.

But being limited to only $100/$3 = 33 sites a month would be quite constraining.

When I'm reading about a topic or searching for a purchase, I'll typically visit that many sites in one session. Over the course of a month it is normal to visit hundreds of sites (I have counted for auditing a job), sometimes over a thousand. That's without giving it any thought, just following links and reading linked content.

(Heck, when reading HN I sometimes read more than 33 HN stories in a single sitting, but to be fair I don't often follow the links to the articles themselves ;-)

If I had to monitor my usage to keep it down to 33 sites in a month, I could do that but it would be a very different reading experience than I currently have.

Like back in the days when we had to pay per minute of online access (outside the US, using audio modems). The change to a single monthly subscription with the freedom to read as much as you want, as long as you want, was liberating and transformative. It would be a little sad to go back to having to self-police reading in order to keep costs down.


On top of that, the parent poster doesn't want to spend 100% of their disposable income on online browsing. If they reserve $80/month for going out, and $20 for online, that's only about 6 sites per month.
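
The arithmetic in this subthread can be checked with a short sketch. It assumes a flat price of $0.10 per site per day, a 30-day month, and amounts in whole cents; the name `sites_affordable` is made up for illustration.

```python
def sites_affordable(budget_cents: int, price_cents_per_day: int = 10, days: int = 30) -> int:
    """Number of sites a monthly budget covers at a flat per-site daily price."""
    # Assumed pricing: 10 cents per day over a 30-day month = 300 cents per site.
    monthly_price_per_site = price_cents_per_day * days
    return budget_cents // monthly_price_per_site


print(sites_affordable(100_00))  # all $100 of disposable income -> 33
print(sites_affordable(20_00))   # only $20 reserved for online use -> 6
```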


In your scenario they need ten times the free users to get the same revenue as one paying customer yields. Presumably, for most services we’re talking about here the marginal cost of having one user versus ten is probably negligible. It doesn’t seem unlikely to me that if a service starts charging, 90% of the users drop off, and thus the revenue per user needs to increase commensurately in order to have the same profits.

Of course this is totally made up, so it’s impossible to argue about, but my point is that what you described didn’t necessarily seem out of line to me.
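
A rough sketch of that break-even reasoning, using the made-up figures from this subthread ($0.10 per user per day from ads, 90% of users leaving once a price is charged). The function name and the assumption that serving costs stay negligible are illustrative only.

```python
def break_even_price(ad_revenue_per_user: float, retained_fraction: float) -> float:
    """Price each remaining user must pay to match the old total ad revenue."""
    if not 0 < retained_fraction <= 1:
        raise ValueError("retained_fraction must be in (0, 1]")
    # Total revenue stays constant: ad_revenue * N == price * (retained_fraction * N)
    return ad_revenue_per_user / retained_fraction


# If only 10% of users stay, each one has to pay ten times the ad revenue.
print(break_even_price(0.10, 0.10))
```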


I wonder how much banks are the problem here. I am not sure if payments of a few cents can be realized online without losing money on fees to payment providers or banks.

AFAIK, this is the "micropayment" problem, that people were hoping bitcoin would solve.


Bitcoin is only economical for very large transactions - the energy cost of even a single btc transaction is huge. It doesn't solve the micropayments problem at all.


The banks are 100% of the problem.

Bitcoin solved it, and then AML/KYC made it illegal to not have bank-like overhead costs.


Bitcoin didn’t “solve” the micropayment problem.

Compared to Visa, Bitcoin solved micropayments only if micro refers to the daily transaction volume.


Visa settles at the end of financial days with revocation possible up to 80 days out. Bitcoin settles every 10 minutes with revocation possible up to 1 hour out.

To keep the comparison accurate, you can transact many times within an “accounting block”. Bitcoin has the shorter accounting block and tier 2 systems laid on top of it seem to scale Bitcoin well beyond any accounting block maximum transaction cap.

The GP is probably not referring to speed of settlement though. Bitcoin solves KYC and trust and allows frictionless micro-payment.


Miners don't do AML/KYC, they just charge higher transaction fees than Visa et al. on small payments.


AML/KYC is not the problem.

Companies can issue gift cards that are entirely anonymous, paid for in cash, redeemed anonymously for cash.


I think part of this is about an inability to market discriminate effectively. Some people will likely make the website much more money (whales), but if they can't capture that difference with their pricing they'll be making less.


Moreover, running a site with ads and tracking has higher hosting costs than a cleaner one that just serves contents to paying customers.


I don't think it's much of a difference, given most of that will be third party and content can be (but isn't always) heavy.


The part of the website that does payment processing and ensures only authorized users can view content isn't free either.


The ad revenue Google makes depends on many factors but also on the disposable income. If you only make €12k per year, you won't bring them the same revenue as someone making €120k.


It sometimes feels like ad-supported online offerings are a bit like cheap fossil fuels in a way. The negative consequences across the board are obvious. This goes for the short-term, immediate nuisances it creates, but people who think about the hard questions also see the long-term negative consequences way ahead.

Unfortunately, advertising has also created an unsustainably high "standard of living", so to speak - you get so many services and applications for free these days that would realistically not exist or cost much more than you were willing to pay for them had it not been for advertising.

Personally, I don't think there's a way out of it until someone comes up with an alternative that brings the benefits of advertising without all the downsides it has, because individual consumer incentives are just not aligned. I'll gladly pay a one-time fee for some productivity app, or a small subscription for something I use almost daily. But if e.g. goodreads wanted to charge a subscription, community size would probably dwindle, and personally I'd just start keeping a spreadsheet of my books again.


I like Affiliate Links for this. It’s above-board, acknowledged, and aligns user, publisher and advertiser value. There’s still some privacy costs, but they’re focused on users that actually take action to engage with the promotion.


The alignment is imperfect at best. How many “review” sites do you see out there with reviews that are shallow (to put it politely) and consist primarily of gigantic “check price on Amazon” buttons?

If you ban ads and monetize only via affiliate links, I think you’ll see a lot more poorly written, often algorithmically generated content shilling for affiliate clicks.


I don't know if it's true but I think it's a valid perspective and so there's likely some truth there.

People don't want to pay for content. If people pay for content and feel ripped off, they can ask for a refund. Then cheaters can pay to access the content then ask for their money back if they want to. This puts the content provider in a bad position.

If people pay for content, then they want to have that content themselves forever. In some sense, this is fair. But then they want to share that content with others. Then the other person/people don't need to pay. Now you have a problem where everyone can just get the content when one person has paid for it. This is a bad dilemma where content providers and consumers both seem to have a good case.

Since neither pay model seems to work, companies just show ads. Then people ignore ads, so the ad companies make them more attention-grabbing and intrusively targeted. So people use reader mode and ad blockers. Now no one is looking at the ads that pay for the content.

I wish I knew the answer.


I'd say piracy like you are describing is not even a blip on the radar. Steam, Netflix and Spotify have proven this.


Among younger people I know (under 35), piracy is extremely common, and Netflix/Spotify/Steam is also extremely common (although the Netflix accounts are often shared with a relative or friend). They aren't really mutually exclusive.


I think it depends on the genre. Sci-Hub is hugely popular [0] among academics and the academic publishers allege that they are missing out on a lot of money because of it (Can't find the source now).

[0] https://familyinequality.wordpress.com/2020/01/15/sci-hub-us...


That's a completely different game. Academic publishers add little value. They essentially steal and make money off the work of scientists while the scientists themselves have to pay them. This is not comparable to piracy in the consumer sense.


I'm not assigning any moral weight to them; just observing the basic facts as they relate to publishing.


It's not about anything moral. It's about a qualitative difference between the two. The second one should even be called piracy.


What's the qualitative difference?

> The second one should even be called piracy.

Sure, I agree. Piracy is a loaded term anyway.


You are getting content you made yourself for free VS you are getting content you have not made yourself.


I haven't made any of the studies I download from Sci-Hub, unless you're referring to the funding via tax dollars.


Then you are not the typical user. By and large people who use journals are scientists and students who publish themselves.


They aren’t downloading their own papers; rather those of their colleagues.


I choose the third value, “ridiculous premise”. People love to crow about how “you are the product” because they don’t pay for some service. Well, physical newspapers have been advertiser-funded for a long time now, even though people still pay for them. Chomsky writes in Necessary Illusions that newspapers sell the eyeballs of the readers to advertisers and that that in turn deeply affects what is printed. That book was published in 1989.

As far as the media is concerned, everyone here under the age of sixty has always been the product. And it’s deeply naïve and idealistic to think that “paying for their services”—how will they do that? Oh well—will change something which is now completely fundamental to media as we know it.


I think it's a nice idea, and I love the spirit of it. That's certainly how I try to live. I have a lot of small subscriptions, and my Patreon bill is over $100/month.

But I think individual action isn't sufficient to get us over the hump. There are just too many things we use on a daily basis, and often those things use things that use things. "Free" is an illusion, but it's an illusion with a very low cognitive load. Manually supporting each and every thing I appreciate at the right level is a complex and taxing process. In practice, I'm sure I miss a lot.

In the physical world, we have some solutions for this. I don't have to subscribe to each park I use. I don't have to kick in for each sidewalk tree I walk by. I live in a neighborhood with a lot of street and alley murals, all community supported in various ways. I think the next step forward for the web involves finding ways for collective action with low individual cognitive load. It wouldn't be perfect, but it could be better than what we have now.


Where we are now is the best it's going to get.

Switching away from ads will feel good until some do-nothing decides they'll hit their bonus targets by re-introducing ads in addition to the subscriptions.

This has happened many times. It's obviously going to happen again.

You do not want to normalize subscriptions for every old Web service, trust me.


This is way, way beyond the web. Cue in "the century of the self". Advertising and self deception is too embedded in our culture and way of life. Also cue in "Ways of seeing".

Iow, I believe advertising is symptomatic of our current state of consciousness as a collective, and so it is not a cause.


Thanks for the reference to Ways Of Seeing by John Berger, looks interesting.


There's already plenty of paid content available, but for now, the advertising model wins. Why? Because most paid content is simply too expensive. You can't charge $5-$10 or more for rarely used niche stuff when Netflix is $8.99 and when musicians get $0.01 per streamed view or less.


One wonders if there is even a business model on the web. Maybe it would be better to go back to passion projects, ones where the person making the content also pays for the hosting. That probably leads to a lot less content being around, but it also removes money.

Ofc, this could exist with actual paid services.


I think a lot of these services could also be run more efficiently.

I’ve not used GoodReads before but it appears to be a book recommendation service. I’m surprised they can’t run the main site & API on Amazon referral links for the recommended books.


Believe it or not, Goodreads is actually owned by Amazon.


Ah, I had no idea!

Makes me think there’s a different reason for closing the API, because Amazon could run the service for close to nothing on AWS.


Well, they got bought by Amazon so basically the whole site is a feature of the Kindle Store at this point.


Most people don't have money, and when I say most, I mean at least 90% of people. So free services powered by ads are the best thing for them, and those people are the large majority.


I'm not sure it's that clear. The fact that free services with ads manage to thrive means that they somehow manage to extract money from those 99% of people. I'm pretty sure it's not only the richest 1% supporting the whole Internet ad eco-system.

Ads convince people to spend money on some product they wouldn't buy otherwise. This in turn finances those ads, which finance the service. I'm not convinced this is better in the long run for many than paying services directly.


> The fact that free services with ads manage to thrive means that they somehow manage to extract money from those 99% of people

I think that doesn't hold logical water.

If only 1% of people buy things of significant value based on ads, and 99% of people never buy what's advertised, it is still worth advertising while advertising is cheap.

A related phenomenon is the way online games make their money.

The vast majority of players never pay for anything. A few pay a little, and a tiny minority pay so much more than everyone else that it's the tiny minority that the game-maker depends on for their business.


I am not sure why you are getting downvoted. If the vast majority of Internet users were broke, there would be no sense in tracking them and showing them ads for paid products/services.

Looking at the strain of our local delivery services right now, that cannot be just 10 per cent of people shopping.


I'm not sure what "don't have money" means here. Presumably 90% of users do have some money. I would think most of them pay money for internet access. The question is whether they can afford service X given the other things they would prefer to pay for.


A web where everything needs a subscription is a closed web. I would only subscribe to the things I deem fun enough or necessary enough. So that would be perhaps fb, WhatsApp, Gmail, Tinder, Netflix and Amazon. The latter three I already subscribe to.


True, but I don't think the shift is underway.

I think in order for this to happen, there's going to have to be better payment models than $4-9 a month subscriptions.


Agreed. There are so many services I hardly use but would happily pay $5-10/year for, but their paid services are targeted only at the power users who are happy to pay $10/month.


> For this to happen people are going to have to open their wallets, pay for the services they use, and support independent businesses.

Crowdfunding is probably the best solution, if it can be made to work - it would allow for some kind of monetization on a voluntary basis, while preserving free access for most users. But I don't know of any site that really uses it successfully, aside from Wikipedia.


I used Flattr, but for some reason they have now discontinued their browser extension... :(


>people are going to have to open their wallets, pay for the services they use, and support independent businesses.

Yes indeed. Right now news & opinion are largely limited to what is considered "advertiser friendly" - think short news cycle, and other problems.

The prospect of paying pennies for everything introduces a lot of friction. Luckily an alternative model is making headway, mostly in YouTube circles: pay a couple of your favorite creators, and watch the rest for free.

It's an extension of the old "freemium" model, but this time based around a stochastic balance between viewers who pay this or that creator. The paying subscribers often get access to additional material, geared more towards fans or deeply interested viewers, while the regular, free-to-watch viewers get the general content just fine.


Isn't Goodreads primarily a community? Are there many pay-to-enter community websites?


Lots - you just have to figure that the value you get is worth the price of entry. Usually it's an "I'm going to learn something, or learn from people enough that will cover the cost of community" or an "I'm going to satisfy a personal need" (be that meditating, new recipes, or getting tied up and whipped).


Advertising is just too lucrative for them - they'll still do it even if we end up paying subscriptions. For example, look at the New York Times; you're still served ads with a paying subscription.


> The web has to mature beyond advertising as a business model.

I've said it many times before but it's worth repeating...

The Internet survived just fine before all the ads and tracking and it will survive fine without it.


Back then the internet was created and run by enthusiasts who passionately wanted to share and build.

Today it’s run by people, standing on their shoulders, whose dominant motivation is making money or how to “capitalise” on something, something which they have no fundamental interests or passion for.

Obviously the outcome is going to be different.


False, one of the big players will either rip them off (Snapchat) or force them into a buyout by undercutting them (diapers.com)


This kind of service is pretty trivial, you can run it for cents per user.

If you allow the web to move to a model where you have to pay dollars for a service worth cents it turns into the same kind of market as the mobile phone operators market charging fees for sms messages: a total rip off.

Why does this need to be a business anyway? The only reason is because it can be.


I want to pay for stuff that I want to read/consume. The problem is that no one implemented micro payments and wants me to pay some amount every month for crap that I don't want (Netflix, spotify, Disney, newspapers). This needs to end first.


I used Flattr, but they have now discontinued their browser extension... :(


I think that the traditional card/banking companies who handle transactions are probably stopping any innovation in this area. They have a working model and no reason to innovate while it works. The newspapers and traditional banking are slowly bleeding/dying here in DK so perhaps they need to die off before something new can take their place.



Sadly I don't think the majority of the people are willing to pay. Getting things for "free" is so ingrained into how the web is, that it's too big of a paradigm shift now.


True. Web “Advertising” is a poison that is shredding society apart.


It makes me really curious to know how they plan to monetise readng too. There’s no pricing or anything yet that I can see, but I wonder if that’s where they plan to go post-beta.


> The web has to mature beyond advertising as a business model.

Is the Web a business model?

The Web really is machines serving HTML, JSON, XML,... over HTTP/HTTPS across physical connections. There are several ways of looking at this, but often enough the debate gets reduced into a dichotomy.

I'll put this into two simplified extremes.

The Web is a shared infrastructure seen as a "public commons". You can access that infrastructure, request/receive bits and bytes from other machines for free and if you want to host content yourself, you connect your own machine to that infrastructure and you share content via your own machine for free. You carry the costs of the usage of the infrastructure yourself, regardless of the direction of data traffic across the network.

The Web and its infrastructure are commodities. Storage, maintenance, bandwidth,... are expenses that should be offloaded. The main goal of hosting and making information available on the Web is to, either directly or indirectly, make a marginal profit. You pay for the privilege of accessing someone else's machine to download data, and you get paid by those who want to gain access to information you host on your own machine.

The problem with the statement above is that it implies that both extremes are mutually exclusive, and only the latter one is viable.

This is false.

The Web is ultimately a decentralized network which is built on top of the intentions and goals of humans. And those intentions and goals can differ wildly. There are parts of the Web that operate according to the former idea, and there are parts that operate according to the latter. Both exist, and there's a spectrum in between.

In the analogue world, the same notion translates into private businesses, non-profits, cooperatives, community initiatives, charities, public initiatives and so on.

Goodreads choosing to close down its public API is just one case of choosing to move towards one side of the spectrum. It's by no means an indication that the entirety of the Web and - more specifically - its denizens decide to move towards that side.

That spectrum does emerge based on laws of economics, though.

Goodreads has always been a private business. Public APIs of private businesses are never truly "public". They are either a courtesy or a business investment. And they will step away from such courtesy if the costs outstrip the benefits.

The Web isn't quite the same as public space though - parks, beaches, forests, streets, grasslands,... - because the vast amount of infrastructure is privately owned. In that regard, the notion of "The Web is a Commons" is only true to the extent that private people are willing to accept and support that idea, and are willing to carry a shared part of the costs.

It's that last part which makes all the difference. Operating a basic website with a limited number of visitors comes at a low cost, and so one could operate a small Goodreads like website with a niche of books. There are plenty of examples of people keeping freely accessible blogs and the like with their own book reviews.

Goodreads tries to turn that idea into a business model. The intention of generating a profit is very distinct in that regard. However, not only is it hard to sell the opinions of other people, it's even harder if costs generated by trying to cater to an audience of millions outstrip the revenue generated.

The Web isn't a financial centre, just as neither London nor New York is representative of the entirety of human society. Big businesses are - ultimately - only a part of the Web, just as much as they are only a part of society. And if their business models fail to keep them operational, the Web, and society, will, ultimately, churn on without them.


That said, there is a tilt towards the second idea in some areas where perhaps it would be prudent of us to invest more in the public space model, or in other ways shape things more to our liking.

As the infrastructure changes the dominating idea shapes it.

As an example there used to be (and still are, but the change is evident) vendors selling music recordings. Vinyl, tape, CD. The market traded in tangible artifacts that could change owner, could be copied (legally or not), could be put in a library and lent out.

In Sweden we even pay a special copy compensation tax when buying any device with storage capacity to tunnel some money towards “content creators” in support of this distribution form.

However as the technology has shifted, allowing for direct streaming, the trading of artifacts has disappeared. The laws and economic realities now promote a market with fewer vendors offering only a limited catalogue of recordings, and only in a form that can never leave their control effectively.

This is only an example. And I think this particular one is mostly about reviewing the legal landscape.

Another example might be how protocols like RSS, XMPP, SMTP were used for interoperability and allowed different vendors to offer compatible services. As things shift, this time perhaps more due to economic realities, the dominating tendency is still to erode interoperability, and dominating players shape the technology towards their more siloed reality. Perhaps we need more tax-funded players (public service?), simply competing and collaborating to tilt things back again.


True, it could work like pinboard or MetaFilter.

In general, I doubt that the network effect can be overcome for consumer platforms. People want to share their book reviews. Why should they limit themselves to people who pay? Paying only works for those who want social filtering.

To compete with Goodreads, the data would have to be free like Wikipedia, for other competitors to emerge so that it is not just a village but a country with many villages. But then, it's a rtf reader situation where it is difficult to survive as an app creating company.


The web was built to facilitate the exchange of knowledge. Ad-sponsored content, despite shortcomings, has so far kept textual content mostly free of charge (talking about content that wouldn’t exist by voluntary contribution here). I fear this push for paywalls is merely going to further disadvantage poor people and people in developing countries who can’t easily hook into whatever payment scheme that is required.

No, I don’t have a great solution other than the status quo.


We need to start taxing digital advertising heavily. All it creates are dark patterns.


I’m certainly sympathetic to this idea. The marginal cost to serve an HTTP request is vanishingly small, but the fixed costs to develop the application itself (mostly labor) are pretty high. That means your cost per user is pretty high at first, since all that work supports a handful of people. You seek more users to amortize costs further, and it works until you start accumulating enough HTTP requests and user data that costs start to climb again. And of course more users means more use cases and user agents, which require ongoing maintenance investments.
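
A minimal sketch of the amortization described above, assuming a simple model of one fixed development cost spread over all users plus a constant marginal cost per user. The function name and the dollar figures are hypothetical, and the model deliberately ignores the later stage where costs climb again at scale.

```python
def cost_per_user(fixed_cost: float, marginal_cost: float, users: int) -> float:
    """Average cost per user: fixed cost amortized over users, plus per-user cost."""
    if users <= 0:
        raise ValueError("users must be positive")
    return fixed_cost / users + marginal_cost


# Hypothetical numbers: $100,000 of development, $0.01 to serve each user.
for n in (10, 10_000, 1_000_000):
    print(n, round(cost_per_user(100_000, 0.01, n), 2))
```

With these assumed numbers the per-user cost falls from about $10,000 at ten users to about $0.11 at a million, which is the amortization effect the paragraph describes.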

The whole process doesn’t have to be expensive, but it’s certainly not free. You can build very cool stuff and give it away for free, but sustainability and scalability ultimately require revenue. The magic thing about a successful business is the ability to cover execution costs, support the development team, and still leave value on the table for users.

I think digital services have drastically different economics depending on (1) how adding a NEW user changes the value proposition for EXISTING users, and (2) something like the user’s start up/discovery cost relative to lifetime value

A direct payment model makes sense when your users’ value is independent from the size of the user base (assuming performance scales at least linearly). These services can tolerate moderate startup/discovery costs. For critical enterprise services, startup costs can be high because the lifetime value is still much larger.

If value scales with the user base, as in a social network like Goodreads, then startup/discovery cost must be pushed to zero to grow the user base as quickly as possible. A paywall slows user base growth and reduces value for those users that actually choose to pay.

So far, advertising is the only known way to monetize (and thus sustain) a digital service while maintaining near-zero startup/discovery costs for individual users. Micropayments, even with good UX and low fees, increase joining costs relative to advertising. Thus they will reduce the value of the service to paying users if value scales with user base size, but would benefit services where value is independent of user base size.

Federation is maybe the best way out of this dilemma IMHO. The value of the overall network grows with the user base, so adding new federation partners should be near-free. Each instance is small relative to the network as a whole, and thus can focus on individual user value rather than growing the network, which means it can charge users directly. This is why you’re willing to buy a great Twitter or Reddit client app, but would never pay for a Twitter or Reddit subscription. (Yes I know those are centralized services, but the model holds if you look at the business relationships).


Ideally, "book reviews" would be a semantic web category, which a search engine could fetch from personal websites. (Note that this already kind of works with current search engines.) But the incentives were never really aligned for that...

