Free and liberated e-books, carefully produced for the true book lover (standardebooks.org)
639 points by Pick-A-Hill2019 on Nov 18, 2020 | 106 comments



Love love love Standard Ebooks.

If by some chance a maintainer reads this thread a couple requests:

1) There are a lot of obscure books which is great. Also doesn’t hurt to add some of the GOATS such as the works of Plato or perhaps even the Harvard Classics

2) Sorting on site is frustrating

3) I would be willing to commission high quality epubs and I’m sure others would too for my own benefit and humanity. For example I can get copies of the works of Plato on Gutenberg but it merits a Standard Ebooks edition and I wouldn’t blink at donating $250 or some number toward that cause. Knowing kids across the globe would have access to high quality digital editions is worth it to me. Worth exploring setting up a program for this.


Project lead here. There's thousands of books we want to produce out there, and limited volunteer hours. If there's something you want to work on, it might be fun to get started as a producer for the project yourself. (Note that we ask that new producers start on a simpler project first, which might rule out some longer and more complicated stuff like Plato translations--at least to start with.)

Our wanted ebook list includes a list of works that we think are good starts for first-time producers: https://standardebooks.org/contribute/wanted-ebooks

I've been exploring the idea of crowdsourcing financing for ebook production. But I think we need some kind of not-for-profit framework set up first. If anyone is in that space and would like to talk about NFP infrastructure, or being a NFP fiscal sponsor, please contact me!


I looked around on the site a bit and perused the source and I couldn't find the answer to my question: what sort of things are in the "advanced" epub version that aren't in the standard?


The “advanced” version is what we produce by default, before it’s processed for consumption by older readers. So, to give a couple of examples, CSS pseudoclasses are processed into actual classes, and MathML is rendered as PNGs. The hypothetical perfect ereader could support this natively, but for most ereaders it’s going to be a less buggy experience to go for the version with compatibility shims.

(Incidentally, Kobo has an excellent MathML engine, so we don’t do the aforementioned PNG rendering for kepubs.)


Interesting. I've never heard of a kepub before and I have a Kobo. The kepub format isn't even mentioned in the spec list for the [Forma].

[Forma]: https://ca.kobobooks.com/products/kobo-forma


I assume it's just an epub that's been optimized to work with Kobo's tweaks.


> There are a lot of obscure books which is great.

This is my biggest issue: discoverability.

The site has an index which shows 12 books at a time, and it looks like 36 pages. So roughly 430 books.

This is not actually a gigantic number of books. It is small enough that it would be great to look through all of them, and yet I'm never going to do that if I have to page through 12 results at a time.

It would be great to get a simple list view of the collection.


I also wanted just a list of the collection, but it looks like the 'best' list is the list of (just HTML) repositories in their github org: https://github.com/standardebooks?q=&type=&language=html

It would be great to just index whatever metadata is available and provide search facets on those fields. It looks like there's quite a lot of metadata in just the few that I cherry-picked: https://github.com/standardebooks/p-g-wodehouse_right-ho-jee...

Looks like at least some of that already works, since you can sort by the "se:reading-ease.flesch" field on the web site.


This would be a great marketing opportunity for Algolia, or Duckduckgo or Qwant site search.


You can search by your interests; for example, "philosophy" would bring up ancient books on Stoicism, etc.


I strongly agree with the parent post that having a simple list of books somewhere on the site is a necessity to enable browsing. If I have to search by interest, it is quite unlikely I would search here no matter how much I like the work that you do.


I'd like just a .jsonl or .csv of the list.


We publish a full OPDS feed: https://standardebooks.org/opds/all
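
For anyone who wants the .csv mentioned above, here is a minimal sketch of turning that feed into "author,title" rows. It is an illustration, not an official tool: `opds_to_csv` is a hypothetical helper name, and the script assumes each entry's `<title>` and author `<name>` elements sit on their own lines inside an `<entry>` element. It does a line-based scan rather than real XML parsing, so XML entities such as `&amp;` are left undecoded.

```shell
# opds_to_csv (hypothetical helper): read an OPDS/Atom feed on stdin and
# write one quoted "author","title" CSV row per author <name> in an <entry>.
# Assumption: <title> and <name> each appear on their own line.
opds_to_csv() {
  awk '
    # Return the text between the first ">" and the following "<".
    function field(line,   a, b) {
      split(line, a, ">"); split(a[2], b, "<"); return b[1]
    }
    # Quote a CSV field, doubling any embedded double quotes.
    function csv(s) {
      gsub(/"/, "\"\"", s); return "\"" s "\""
    }
    /<entry[ >]/          { in_entry = 1; title = "" }   # skip feed-level metadata
    in_entry && /<title[ >]/ { title = field($0) }
    in_entry && /<name[ >]/  { print csv(field($0)) "," csv(title) }
    /<\/entry>/           { in_entry = 0 }
  '
}

# Usage (network access required):
#   curl -s https://standardebooks.org/opds/all | opds_to_csv | sort > books.csv
```

Quoting every field keeps titles that contain commas in a single column, and an entry with more than one `<name>` produces one row per name.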


+1 for the idea of some kind of donation framework


Yeah, just a "most downloaded" sort order would be a great improvement.


What are GOATS?


Greatest Of All Time


Not to blow my own trumpet too much, but I just finished producing a complete Poe short fiction collection for SE if you’re looking for something to read, and submitted 1,025 corrections back to Gutenberg’s collection in the process: https://standardebooks.org/ebooks/edgar-allan-poe/short-fict...


Thank you for your work on this, I was very happy when I saw Poe's writing pop up this morning among the releases [1]!

[1]: https://standardebooks.org/rss/new-releases


I was just thinking that it might be nice to form playlists of short stories from multiple books/authors. There's probably a bunch of lists (award winning, by genre, era etc). Seems a shame to enforce the bundling at a book level.


> “The Fall of the House of User,”

Sorry for bringing it up, but is that a typo in the description?


Classic case of user error if I ever saw one!


Yes, this has been fixed, thanks!


Project lead here. Glad to see that people like the project, and I'm happy to answer any questions.

We're always looking for producers to volunteer to work on new ebooks. The process is a blend of many different disciplines, so if you're interested in art, literature, programming, and the command line, trying your hand at an SE production might be fun!


First: thanks for this project. I'm currently halfway through Gulliver's Travels, and have re-read and rediscovered some great classics through your project.

What are the plans for other languages, if any?

I noticed that you have e.g. German or French originals translated to English. Is there place in the project for the original languages?

And what about translations to other languages? If I'm going to read a translation of Les trois mousquetaires, I'd prefer a translation to my native language instead of English.

Edit: saw the thread about this down below too late; was on mobile. Never mind my questions, they seem to be answered.


One of my pandemic projects was to take the books on Standardebooks.org, parse the ePub and write transformations into HTML with indexing. Was trying to build a platform for me to write tech books on, but I wound up just enjoying getting into the classics again. Will probably delete my book and just make it a site to read the classics online.

https://papiary.com/


This is very nice! A nice addition would be a widget to allow changing between a few different font faces and color palettes (like you'd see in typical ebook readers or "reader mode" browser tools).


That’s what I was considering for a pro version. Would you pay something like $10 a year for that?


They also have a very nice style manual that is useful reading for anyone who produces ebooks:

https://standardebooks.org/manual/1.1.1


A hack to show every book on one page:

https://standardebooks.org/ebooks?query=e

displays every book with an e in the title or author's name on one page. I guess that doesn't leave out many.


I think this prints out title and author of every book:

  curl https://standardebooks.org/opds/all | awk '/<title>/ {split($0,a,">");split(a[2],b,"<");printf "%s", b[1]} /<name>/{split($0,a,">");split(a[2],b,"<");print " - "b[1]}' | tail -r
The tail -r is to reverse the output, it's otherwise in approximately reverse alphabetical order of author's first name. Output is

  Hadji Murád - Leo Tolstoy
  Pierre and Jean - Guy de Maupassant
  The Red House Mystery - A. A. Milne
  The Moon Pool - A. Merritt
  Fables - Aesop
  Poirot Investigates - Agatha Christie 
  The Man in the Brown Suit - Agatha Christie ...


oh very nice. My machine doesn't have a `tail -r`, but it has `tac`, which I guess does the same thing.

I tweaked your output to put author first, to make sorting work better:

    curl -s https://standardebooks.org/opds/all | awk '/<title>/ {split($0,a,">");split(a[2],b,"<")} /<name>/{split($0,c,">");split(c[2],d,"<");print d[1]" - "b[1]}' | sort
gives something like:

    A. A. Milne - The Red House Mystery
    A. Merritt - The Moon Pool
    Aesop - Fables
    Agatha Christie - Poirot Investigates
    Agatha Christie - The Man in the Brown Suit
    Agatha Christie - The Murder on the Links
    Agatha Christie - The Mysterious Affair at Styles
    Agatha Christie - The Secret Adversary
    Aldous Huxley - Antic Hay
    ...


The "all" feed is sorted by most-recently-updated-first (this is part of the OPDS standard), so instead of tail -r you should pipe to sort.


Are these changes ever contributed back to the Gutenberg library as updates? I’d like to assume so, but I can’t find anything indicating that’s done.



Their RSS feed for new releases:

https://standardebooks.org/rss/new-releases


I was thrilled when I read the part about not ignoring dashes. So I was a bit disappointed when I saw the "Rich & detailed metadata" screenshot had the dreaded double hyphen. Even acabal's replies on this thread use a double hyphen.

I guess I shouldn't complain too much, even the New York Times misuses the double hyphen.


Maybe you're referring to the LoC categories? Those are separated by two dashes as the industry standard for that metadata. The actual text of the books features correct em dashes.


I’ve created a webapp for Gutenberg catalog if anyone’s interested - https://bookdark.com

Features:

- works in all platforms with pdf format

- faster navigation

- faster search


Doesn't work for me with Firefox 82.0.3 (64 bit) on Win10. Hitting the Read Book link just gives me a blank window with the Close Book link at the top.

Took me a few moments to realize that this is because my Firefox is set to download PDFs instead of displaying them in Firefox.


Thanks for letting me know. Hopefully you can at least access the pdf by downloading it. Just so you know, the pdf display embed only works on desktop screens; on mobile screens the pdf opens in a new tab in the mobile browser. This is because embedding the pdf within the app makes the text too small for some readers.


Hey, great idea.

Just one thing: the two-level horizontal scroll doesn't work consistently for me, and just seems a bit weird. I feel like the top level navigation should be left to the obvious top links. I don't really see the point of visualising all the sections as a continuous whole, whereas the individual lists make sense to be scrollable.


Hey jgtrosh, thanks for the valuable suggestion. I know exactly what you are referring to. The UI behavior is similar to a mobile app, where I implemented a kind of “force touch” in order to scroll right within that section, and regular scroll does page navigation. I admit it takes some time to get used to. Regarding top navigation, I’ll need to research what’s a better UI experience for mobile users.

I did experiment with list based home screen, but I thought the information density is low for mobile screens. However for tags and authors it’s list based as you suggested.

That being said, I’m planning an update in near future to include public domain audio Books where I can reform Home screen section behavior better for new users.


Standard Ebooks are awesome.

If maintainers are seeing this: any plans to publish non-English books? Is there room to collaborate on adding support for it?


I've been working on a "fork" for Spanish books under a new imprint that I will launch when I finish my 10th book (I'm about 60% along).

One fact of reality is that Standard Ebooks' tooling is English-based. Everything from the pages/xhtml it generates to its typography tools to its style guide expects English. For example, Spanish uses — and «» instead of English's “” and ‘’ for dialogue so you can imagine how SE's punctuation tooling heuristics are going to differ here.

You'd also have to come up with a new set of standards for another language. What sort of correction is a fair modernization and which would be unfair editorialization? SE itself already makes controversial decisions here for English like "to-day" -> "today".

It would be a large undertaking to parameterize some sort of LANG=EN setting and imo not worth it. I think the only route for that sort of thing to happen is if 2+ "forks" get to Standard Ebooks' quality and then decide to work together years down the road.

Also, legal clarity is sometimes an issue in other countries, even where an English translation of the same work, written in the USA, is clearly in the public domain. There are works where the original non-English content, despite having long been translated into English, is still owned by the estate, while the English translation is liberated.

Something I realized was just how many books exist only as scans. Transcription isn't very fun, but this makes it rewarding. For example, I have some early Spanish sci-fi books I've transcribed into epubs that you cannot find outside of dirty scans.


Sorry for the OT, but may I ask which tools you are using to process the books, and how/where you can verify the legal situation of a book? I'm from Spain and just curious about it.


I'm super interested in those early Spanish sci-fi books (mostly because I like sci-fi and I've been looking for more Spanish reading material to practice with since it's been so long since I've lived in Spain). Are they publicly available?


Not a maintainer, just an occasional contributor.

https://standardebooks.org/contribute/accepted-ebooks

> Types of ebooks we don’t accept

> Non-English-language books. Translations to English are, of course, OK.


Yeah :-/ I know, that's why I'm asking if this can be reconsidered. As a non-English native speaker, I think it's a shame. Rather than read the English translation of Les Misérables, I'd rather read the original French text, and having Standard Ebooks as a source for non-English books means I can:

1. Discover / browse / search.

2. Expect high quality text & typography.

3. Be part of a high-volume community that archives and refines texts year after year.

"Then build your own similar site with high-quality titles in French!" , I hear some say. Sure, but I see tons of excellent marketing & infrastructure already done by Standard Ebooks, which could be reused for non-English books!

Said differently, mid/long term I see the greater good being achieved by supporting non-English titles in Standard Ebooks itself, not in one satellite mimic site per language, of varying maintenance quality. To compare with the domain of online encyclopedias, these are the same reasons to have en.wikipedia.org and fr.wikipedia.org maintained under one (Wikipedia) umbrella sharing infrastructure, not wikipedia.org and frenchwikipedia.org maintained by entirely different teams.


It looks like the subject has come up a couple times on the mailing list [1] [2], but in my (admittedly casual) searching I didn't find anything since 2017, with the most relevant answer being "not at the moment". Since it has been a couple years, maybe it's worth bringing up again on the mailing list?

[1] https://groups.google.com/u/1/g/standardebooks/c/JdVpCm3ckGg...

[2] https://groups.google.com/g/standardebooks/c/osOEfs5HdLo/m/2...


Thanks for the ML digging :) .

Agreed, it's worth bringing up again on the mailing list; will do this week and post here a link to my message.


I was thinking that one of the problems with a lot of public domain translations is that the translations are extremely dated. So you're reading a text that is itself a dated translation, one that you almost need a translation of.

This isn't the fault of Standard Ebooks, but it's an elephant in the room of public domain texts in general. In many ways I'd rather see effort put into crowdsourced or volunteer modern translation than into improved copyediting.

It's especially problematic if you consider that the original text in the foreign language isn't available as well, as you're noting.


I guess that makes sense. They’re pretty rigorous about conforming to style and it’d be hard for the (presumably English-speaking) editors to uphold the same standards for a language they don’t share. Not to mention each language would probably require its own style guide. I know French literature often uses guillemets or a leading em dash where English uses quotes.


I agree there would be work to do and policies to establish, but I don't see this as fundamentally blocking.

Mistakes can and will be made, and fixed afterwards. It's the beauty of our online worlds.


I have a longer response that explains why parameterizing the project around language isn't so straightforward. Nor does it necessarily make sense because the thing that SE brings to the table is English expertise.

All of the work in polishing an epub is the chore of transcription and then nitty gritty details like correctly tagging things like roman numerals and embedded poems and following some sort of standardization guide.

The thing that Standard Ebooks does, aside from being decisive over its standards, is then require the epub to go through a review process by SE's creator, who is a domain expert in the craft. This expert bottleneck is a big reason why SE book quality is so reliable. Accepting other languages drastically changes and perhaps even relaxes this bottleneck and changes the whole organization.

In another comment, I think you suggest that it might be time to try to persuade him to accept non-English books:

> Agreed, it's worth bringing up again on the mailing list; will do this week and post here a link to my message.

But it's not really up to persuasion. Because it takes more than an idea to expand the accepted languages. It requires at least one reliable expert in that language who can stand up a completely new set of tools and standards and then steward that project to fruition. And that's such a big undertaking that it's really a whole new project, not just yet another egg under SE's wings, but a whole new chicken coop.

Suggesting that Standard Ebooks move to support other languages is 0.0000001% of the work towards that goal. People do that all the time, then claim "okay, I'll fork the project", and then fizzle out.

Another example is that whole swathes of code in https://github.com/standardebooks/tools, SE's core workflow, become useless once you're targeting something other than English. By browsing their style guide and that repo, you'll realize that SE's value really is its focus on English. It's more of a suite of English tools than it is an epub editing kit, as the latter is the easy part.


Thanks for the input.

I've been monkeying around open source long enough to know veeeery well that "Suggesting [...] is 0.0000001% of the work towards that goal", and I've been guilty of that myself :) .

Still, I like SE a lot and might be interested in doing the work. So, I'll make my point to the ML (expanding on A. points I brought here, and B. contradictions that you and other commenters wrote, thanks), asking if folks are convinced by the vision, and asking for technical advice to build the incremental path. Then if there's agreement, maybe I, or someone else, will commit to it.


Fair enough. But SE's creator is simply going to recommend that you fork the project. Not that I speak for them, I've just been following the project long enough to see the interaction many times.

It's also the only avenue that makes sense. As I said in a sibling comment, I've been working on a Spanish-language version of SE and I have some good ideas of how the tool chain could be parameterized for multi-language support, but it also feels like a pointless cherry on top (and a complication of an already non-trivial workflow) to bring everything under one umbrella. And it would require a loss of control for SE's creator.

I recommend epub'ing a few book scans yourself in your preferred language and starting a project around that. You'll find other people who have at least started a fork themselves that you might be able to round up under one org. I personally have aimed for 10 ~finished books and a domain name that hosts my own style guide + tutorial to prove that I'm serious about it (to myself) before I publish my efforts. And you need as much to appeal to any would-be contributors.

I think the only plausible place to be in 5 years is for a couple major sister projects to reach maturation and then form a sort of ring of "Check out lettres-libérées.org for a similar project for French works."


acabal (SE's maintainer) answered, and you were right :) : https://news.ycombinator.com/item?id=25145829 .

Thinking about it! Thanks for the advice.


As others mentioned, different languages have different typography rules. Our toolset is English-based and making it generic for any language would be extremely difficult and time consuming. Additionally, all of us read in (at least) English, and at least I can't credibly say I'm a grammar/typography expert in any other language.

In the past people have expressed interest in forking the toolset for other languages, which is totally fine. But I don't think I've seen any of those attempts come to fruition yet.


Hi! Thanks for chiming in :)

In another comment of this sub-thread ( https://news.ycombinator.com/item?id=25140605 ) I wrote:

> "Still, I like SE a lot and might be interested in doing the work. So, I'll make my point to the ML (expanding on A. points I brought here, and B. contradictions that commenters brought), asking if folks are convinced by the vision, and asking for technical advice to build the incremental path. Then if there's agreement, maybe I, or someone else, will commit to it."

Do you think it remains worthwhile that I bring the discussion, or is it 100% nailed among current maintainers that the "incremental path" I'm hoping for doesn't exist and "just fork / do your own thing for your language and register your domain" is already the consensus?


hombre_fatal's explanation further upthread is exactly on point. Forking is the only sensible option.


Alright then, considering building "lettres-libérées.org", as hombre_fatal phrased it :D . Thanks for the chat, and long live SE.


You can find some non English book PDFs here - https://bookdark.com/

(which are sourced from Gutenberg catalog, but with dynamic navigation and search)


This is fantastic, I had no idea this site existed, and thanks for that. Some weeks ago I had the idea to create a similar site to compete with Delphi, but you can't compete against free (in both senses), especially when the quality is very high. Kudos.


I just released what I think is a well crafted e-book for a favorite old science fiction book, with permission from the author: https://twirb.github.io


Hey! I own a copy of that, although I’m not sure where it is. A great fun read. I think I’ll download and read it again.


This looks familiar (and nice). Was it mentioned on HN before?

Yup: https://news.ycombinator.com/from?site=standardebooks.org


Meta comment:

Huh, that is weird that the dupe detector didn't pick that up. I was going through some old bookmarks and submitted dark.fail and it hit the dupe detector as a duplicate submission from October 2019 from user rakefire.

A few other links to different sites I tried submitting also hit the dupe detector too. Normally I do a hn.algolia.com search on the headline keywords (sorted by date) for anything current before submitting a link but after the third or fourth 'dupe' I dug out the Standard Ebooks link, gritted my teeth and mentally dared the dupe detector to find someone else on HN that had come across the site before. Turns out many had (lol).

Your comment made me chuckle (and also scratch my head a little about the dupe-detector algorithm) so thanks for pointing it out (and have a +1 from me as thanks).


August 2019 is over a year, which likely is the threshold for the detector.



What e-reader software for desktop computers do you recommend that doesn't look bad?

It seems like a waste to read a carefully crafted ebook using Calibre, which is pretty bad from a graphical/aesthetic perspective. Foliate is better, but not great either.

I'm thinking of running a Windows VM just to run Adobe Digital Editions and read epubs comfortably. Is there a better option?


Foliate looks fine, to my eyes, but I'm not sure what parts of it you dislike, so I'm limited in recommendations.

Bookworm, perhaps?

https://www.flathub.org/apps/details/com.github.babluboy.boo...


Foliate is very good as far as desktop ereaders go. It also includes the SE catalog by default.


I use nov.el in Emacs. I think it looks quite nice, but YMMV.


Send the file to your kindle address if you have one and read using kindle cloud reader?


Reminds me of mutopiaproject.org. Unfortunately, Mutopia Project’s site hasn’t been updated in a while.

This is very cool and I hope it stays around.


This is awesome, thank you!

edit: Aside from the books, the site is quite nice too.


I absolutely love this service and have downloaded the books and uploaded and synced them on my Google Books iOS app. Some books I'd like to recommend: Meditations, Dialogues, The Enchiridion, Discourses, and Siddhartha.


Siddhartha is there, not sure about the others.

https://standardebooks.org/ebooks/hermann-hesse/siddhartha/g...

Edit: I thought you meant "recommend to add" vs. "these are available and I recommend them", my mistake.


I would love if the browse page could sort by "popular" or "recommended". With such a carefully curated collection it seems a shame to not surface the best ebooks more easily.


I found this project about a year ago and it’s truly wonderful. Very high quality ebooks and an awesome collection that is growing by the day.


Well, by the week :)


I started reading The Arabian Nights a while back. The OCR errors in the edition that was in Apple's book store (which I believe was from Project Gutenberg) made the book damn near unreadable. I don't think I even finished the first volume (although I did eventually figure out that “cloth” was supposed to be “doth”).


That’s been on my list to produce for SE for a couple of years, but it’s dauntingly big.


That's very nice.

It'd be great if one could view a compact list of all titles available. Viewing 20 at a time is frustrating.


Great platform, but books should also be included in PDF format. Those of us with Windows PCs have no decent-looking option for an EPUB reader.

I have been converting the EPUBs to PDFs (through online converters like Zamzar) and they look great in both formats.


Have you tried the Calibre reader app for Windows, a browser-based EPUB reader extension, or a web-based viewer service like Google Books?


Is there a site similar to this for books not in the public domain? I wouldn't mind paying for ebooks if they were this nice and didn't lock me into using one application...


I think it would be hard for non-public-domain books for one single reason: licensing and DRM. Similar to how Netflix/Prime Video don’t let us download the movie file without their app.


There are non-public-domain ebooks out there without DRM. Kobo sells some, for example, though you have to check the format before purchasing since they aren't all DRM-free. Libreture has an extensive list of DRM-free ebook shops[1].

[1] https://www.libreture.com/bookshops/


Ah, excellent! Thank you!


I recommend the Jeeves and Wooster stories -

https://standardebooks.org/ebooks?query=wooster


I didn’t know about this and I love it. I’d been torrenting classic books and always felt a bit silly about it since they are long since out of copyright.

These look beautiful. Great work.


This looks great. Under 'compatible e-pub' format it says "all devices except kindles and kobos" -- I thought epub files worked on kobos?


You can transfer a regular epub to a Kobo, but it will trigger the ADE renderer, which is garbage. Think IE6-level rendering. Using a kepub file will trigger Kobo's better native renderer, which I strongly suspect is Webkit-based and can render lots of advanced CSS and even MathML.


Huh. I have two Kobos, a Kobo Clara HD and an older Kobo Glo HD from 2016, and I've never had any trouble sideloading epubs using Calibre.

Perhaps the quality is lower, but I've certainly never noticed, compared with the books I've bought directly from Kobo.


This is awesome!

Is there a way to download every book you have? I have an eReader that I'd love to stuff full of these kinds of books.


We haven’t traditionally had that, as the books are improved gradually over time as we update the core frameworks and they get more proofing. A download all button would give you a static snapshot of a point in time that might take many years to consume.


How could I submit my own book? I can't find a way.


As in to volunteer to produce a PD book? Try starting at https://standardebooks.org/contribute

If you mean submitting an ePub you’ve already worked on, then it would need to be redone to use our framework. By having a “Standard” imprint we can continuously upgrade and work on the whole corpus.


I wrote and published a book, currently under a friendly creative commons license.

I would like it to be added to the list of books you have.

The book is titled "Nonovvio", and it's in Italian. [0]

[0]: https://www.amazon.com/Nonovvio-Italian-Simone-Brunozzi-eboo...


Ah, OK. We’re focused purely on English works in the first place (see conversation elsewhere in the thread), but we don’t accept modern works dedicated to the public domain anyway. Congratulations on getting your novel finished though! I’m always impressed by people who can complete a book.


I wish I could browse more than 12 books at a time. Information density at an all time low my god


Agreed, I've been meaning to update the list view and add better sorting and filtering. The current list view was created when we just had a few pages of ebooks, and it made sense then. Now that we have hundreds of ebooks it doesn't make as much sense anymore. But the problem is finding the time!


You could try the OPDS page:

https://standardebooks.org/opds


A search for a tag name seems to return everything that matches on one page:

https://standardebooks.org/ebooks?query=science+fiction



