If by some chance a maintainer reads this thread, a couple of requests:
1) There are a lot of obscure books, which is great. It also wouldn’t hurt to add some of the GOATs, such as the works of Plato or perhaps even the Harvard Classics.
2) Sorting on the site is frustrating.
3) I would be willing to commission high-quality epubs, for my own benefit and humanity’s, and I’m sure others would too. For example, I can get copies of the works of Plato on Gutenberg, but they merit a Standard Ebooks edition, and I wouldn’t blink at donating $250 or some such number toward that cause. Knowing kids across the globe would have access to high-quality digital editions is worth it to me. It’s worth exploring setting up a program for this.
Project lead here. There are thousands of books out there that we want to produce, and limited volunteer hours. If there's something you want to work on, it might be fun to get started as a producer for the project yourself. (Note that we ask that new producers start on a simpler project first, which might rule out some longer and more complicated stuff like Plato translations--at least to start with.)
I've been exploring the idea of crowdsourcing financing for ebook production. But I think we need some kind of not-for-profit framework set up first. If anyone is in that space and would like to talk about NFP infrastructure, or being an NFP fiscal sponsor, please contact me!
I looked around on the site a bit and perused the source and I couldn't find the answer to my question: what sort of things are in the "advanced" epub version that aren't in the standard?
The “advanced” version is what we produce by default, before it’s processed for consumption by older readers. So, to give a couple of examples, CSS pseudoclasses are processed into actual classes, and MathML is rendered as PNGs. The hypothetical perfect ereader could support this natively, but for most ereaders it’s going to be a less buggy experience to go for the version with compatibility shims.
(Incidentally, Kobo has an excellent MathML engine, so we don’t do the aforementioned PNG rendering for kepubs.)
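To make the pseudoclass shim concrete, here is a rough sketch in Python. It is not the project's actual build code: the function names (`shim_first_child`, `shim_css`) and the `first-child` class name are made up for illustration, and it only handles the single `:first-child` case.

```python
import xml.etree.ElementTree as ET


def shim_first_child(xhtml: str, tag: str = "p", class_name: str = "first-child") -> str:
    """Give every <tag> that is the first element child of its parent a real class."""
    root = ET.fromstring(xhtml)
    for parent in root.iter():
        children = list(parent)  # element children only; text nodes are not included
        if children and children[0].tag == tag:
            first = children[0]
            classes = first.get("class", "").split()
            if class_name not in classes:
                classes.append(class_name)
            first.set("class", " ".join(classes))
    return ET.tostring(root, encoding="unicode")


def shim_css(css: str, tag: str = "p", class_name: str = "first-child") -> str:
    """Rewrite the matching selector so it targets the real class instead."""
    return css.replace(f"{tag}:first-child", f"{tag}.{class_name}")


if __name__ == "__main__":
    print(shim_first_child("<section><p>One</p><p>Two</p></section>"))
    print(shim_css("p:first-child { text-indent: 0; }"))
```

The idea is simply that a reader with weak selector support never has to evaluate `:first-child`; it only sees an ordinary class in both the markup and the stylesheet.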
> There are a lot of obscure books which is great.
This is my biggest issue: discoverability.
The site has an index which shows 12 books at a time, and it looks like 36 pages. So roughly 430 books.
This is not actually a gigantic number of books. It is small enough that it would be great to look through all of them, and yet I'm never going to do that if I have to page through 12 results at a time.
It would be great to get a simple list view of the collection.
It would be great to just index whatever metadata is available and provide search facets on those fields. It looks like there's quite a lot of metadata in just the few that I cherry-picked: https://github.com/standardebooks/p-g-wodehouse_right-ho-jee...
Looks like at least some of that already works, since you can sort by the "se:reading-ease.flesch" field on the web site.
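For what it's worth, indexing that metadata doesn't need much. Below is a minimal sketch, assuming each book's metadata is an OPF package file with `dc:title`, `dc:subject`, and a `meta` element whose `property` is `se:reading-ease.flesch`. The sample XML, the titles, and the helper names (`read_book`, `facet_by_subject`, `sort_by_reading_ease`) are all made up for illustration.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

NS = {
    "opf": "http://www.idpf.org/2007/opf",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Made-up sample metadata; the element layout is an assumption, not copied from a real file.
TEMPLATE = """<package xmlns="http://www.idpf.org/2007/opf"
         xmlns:dc="http://purl.org/dc/elements/1.1/">
  <metadata>
    <dc:title>{title}</dc:title>
    <dc:subject>{subject}</dc:subject>
    <meta property="se:reading-ease.flesch">{ease}</meta>
  </metadata>
</package>"""


def read_book(opf_xml):
    """Pull the title, subjects, and reading-ease score out of one metadata file."""
    root = ET.fromstring(opf_xml)
    ease = None
    for meta in root.iterfind(".//opf:meta", NS):
        if meta.get("property") == "se:reading-ease.flesch":
            ease = float(meta.text)
    return {
        "title": root.findtext(".//dc:title", default="", namespaces=NS),
        "subjects": [s.text for s in root.iterfind(".//dc:subject", NS)],
        "reading_ease": ease,
    }


def facet_by_subject(books):
    """Group titles under each subject so a UI could offer subjects as filters."""
    facets = defaultdict(list)
    for book in books:
        for subject in book["subjects"]:
            facets[subject].append(book["title"])
    return dict(facets)


def sort_by_reading_ease(books):
    """Easiest (highest score) first; books without a score sort last."""
    return sorted(books, key=lambda b: (b["reading_ease"] is None, -(b["reading_ease"] or 0.0)))


if __name__ == "__main__":
    samples = [("Book A", "Fiction", 80.5), ("Book B", "Fiction", 60.0), ("Book C", "Poetry", 90.1)]
    books = [read_book(TEMPLATE.format(title=t, subject=s, ease=e)) for t, s, e in samples]
    print([b["title"] for b in sort_by_reading_ease(books)])
    print(facet_by_subject(books))
```

With a few hundred books the whole index fits comfortably in memory, so a plain list view plus a couple of facets like this would not need a search engine behind it.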
I strongly agree with the parent post that having a simple list of books somewhere on the site is a necessity to enable browsing. If I have to search by interest, it is quite unlikely I would search here no matter how much I like the work that you do.
Not to blow my own trumpet too much, but I just finished producing a complete Poe short fiction collection for SE if you’re looking for something to read, and submitted 1,025 corrections back to Gutenberg’s collection in the process: https://standardebooks.org/ebooks/edgar-allan-poe/short-fict...
I was just thinking that it might be nice to form playlists of short stories from multiple books/authors. There's probably a bunch of lists (award winning, by genre, era etc). Seems a shame to enforce the bundling at a book level.
Project lead here. Glad to see that people like the project, and I'm happy to answer any questions.
We're always looking for producers to volunteer to work on new ebooks. The process is a blend of many different disciplines, so if you're interested in art, literature, programming, and the command line, trying your hand at an SE production might be fun!
First: thanks for this project. I'm currently halfway through Gulliver's Travels, and have re-read and rediscovered some great classics through your project.
What are the plans for other languages, if any?
I noticed that you have e.g. German or French originals translated to English. Is there a place in the project for the original languages?
And what about translations to other languages? If I'm going to read a translation of Les trois mousquetaires, I'd prefer a translation to my native language instead of English.
Edit: saw the thread about this down below too late; was on mobile. Never mind my questions, they seem to be answered.
One of my pandemic projects was to take the books on Standardebooks.org, parse the ePub and write transformations into HTML with indexing. I was trying to build a platform for me to write tech books on, but I wound up just enjoying getting into the classics again. I will probably delete my book and just make it a site to read the classics online.
This is very nice! A nice addition would be a widget to allow changing between a few different font faces and color palettes (like you'd see in typical ebook readers or "reader mode" browser tools).
The tail -r is there to reverse the output; otherwise it comes out in approximately reverse alphabetical order by author's first name. Output is:
Hadji Murád - Leo Tolstoy
Pierre and Jean - Guy de Maupassant
The Red House Mystery - A. A. Milne
The Moon Pool - A. Merritt
Fables - Aesop
Poirot Investigates - Agatha Christie
The Man in the Brown Suit - Agatha Christie ...
A. A. Milne - The Red House Mystery
A. Merritt - The Moon Pool
Aesop - Fables
Agatha Christie - Poirot Investigates
Agatha Christie - The Man in the Brown Suit
Agatha Christie - The Murder on the Links
Agatha Christie - The Mysterious Affair at Styles
Agatha Christie - The Secret Adversary
Aldous Huxley - Antic Hay
...
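Going from the first listing to the author-first one is a small transformation. This is not the shell one-liner referred to above, just a sketch in Python that assumes you already have "Title - Author" lines in hand; the function name `by_author` is made up.

```python
def by_author(lines):
    """Turn 'Title - Author' lines into 'Author - Title' lines sorted by author, then title."""
    pairs = []
    for line in lines:
        # Split on the last " - " so a title that itself contains " - " stays intact.
        title, author = line.rsplit(" - ", 1)
        pairs.append((author.strip(), title.strip()))
    pairs.sort()
    return [f"{author} - {title}" for author, title in pairs]


if __name__ == "__main__":
    listing = [
        "Hadji Murád - Leo Tolstoy",
        "Pierre and Jean - Guy de Maupassant",
        "The Red House Mystery - A. A. Milne",
        "The Moon Pool - A. Merritt",
        "Fables - Aesop",
        "Poirot Investigates - Agatha Christie",
    ]
    for entry in by_author(listing):
        print(entry)
```

Note that this sorts by the author string as written, so it is still first-name order, same as the listings above.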
I was thrilled when I read the part about not ignoring dashes. So I was a bit disappointed when I saw the "Rich & detailed metadata" screenshot had the dreaded double hyphen. Even acabal's replies on this thread use a double hyphen.
I guess I shouldn't complain too much, even the New York Times misuses the double hyphen.
Maybe you're referring to the LoC categories? Those are separated by two dashes as the industry standard for that metadata. The actual text of the books features correct em dashes.
Doesn't work for me with Firefox 82.0.3 (64 bit) on Win10. Hitting the Read Book link just gives me a blank window with the Close Book link at the top.
Took me a few moments to realize that this is because my Firefox is set to download PDFs instead of displaying them in Firefox.
Thanks for letting me know. Hopefully you can at least access the pdf by downloading.
Just so you know, the PDF display embed only works on desktop screens; on mobile screens the PDF opens in a new tab in the mobile browser. This is because an in-app PDF embed makes the text too small for some readers.
Just one thing: the two-level horizontal scroll doesn't work consistently for me, and just seems a bit weird. I feel like the top level navigation should be left to the obvious top links. I don't really see the point of visualising all the sections as a continuous whole, whereas the individual lists make sense to be scrollable.
Hey jgtrosh,
Thanks for the valuable suggestion. I know exactly what you are referring to. The UI behavior is similar to the mobile app, where I implemented a kind of “force touch” to scroll right within a section, while a regular scroll does page navigation. I admit it takes some time to get used to. Regarding top navigation, I’ll need to research what makes a better UI experience for mobile users.
I did experiment with a list-based home screen, but I thought the information density was too low for mobile screens. However, tags and authors are list-based, as you suggested.
That being said, I’m planning an update in the near future to include public domain audiobooks, where I can rework the home screen section behavior to be better for new users.
I've been working on a "fork" for Spanish books under a new imprint that I will launch when I finish my 10th book (I'm about 60% along).
One fact of reality is that Standard Ebooks' tooling is English-based. Everything from the pages/xhtml it generates to its typography tools to its style guide expects English. For example, Spanish uses — and «» instead of English's “” and ‘’ for dialogue so you can imagine how SE's punctuation tooling heuristics are going to differ here.
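To illustrate why this is more than a lookup table, here is a deliberately naive sketch. The `QUOTES` table and `requote` function are made up for this example and have nothing to do with SE's actual tooling; they only swap the quote characters.

```python
# Hypothetical per-language table of double-quote characters.
QUOTES = {
    "en": ("\u201c", "\u201d"),  # English curly double quotes
    "es": ("\u00ab", "\u00bb"),  # Spanish guillemets
}


def requote(text, source="en", target="es"):
    """Swap one language's double-quote characters for another's, and nothing else."""
    src_open, src_close = QUOTES[source]
    dst_open, dst_close = QUOTES[target]
    return text.replace(src_open, dst_open).replace(src_close, dst_close)


if __name__ == "__main__":
    print(requote("\u201cHola,\u201d dijo."))
```

The swap itself is trivial. What it ignores is the hard part: dialogue introduced with a dash has no closing mark to pair with, and where punctuation sits relative to the closing mark follows different rules, so the heuristics have to be rethought rather than re-keyed.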
You'd also have to come up with a new set of standards for another language. What sort of correction is a fair modernization and which would be unfair editorialization? SE itself already makes controversial decisions here for English like "to-day" -> "today".
It would be a large undertaking to parameterize some sort of LANG=EN setting and imo not worth it. I think the only route for that sort of thing to happen is if 2+ "forks" get to Standard Ebooks' quality and then decide to work together years down the road.
Also, legal clarity is sometimes an issue: a work's original text can still be under copyright in its home country even when an English translation of that same work, written in the USA, is clearly in the public domain. There are works where the original non-English text, despite having long been translated into English, is still owned by the estate, while the English translation is liberated.
Something I realized was just how many books exist only as scans. Transcription isn't very fun, but this makes it rewarding. For example, I have some early Spanish sci-fi books I've transcribed into epubs that you cannot find outside of dirty scans.
Sorry for the OT, but may I ask which tools you are using to process the books, and how/where you can verify the legal situation of a book? I'm from Spain and just curious about it.
I'm super interested in those early Spanish sci-fi books (mostly because I like sci-fi and I've been looking for more Spanish reading material to practice with, since it's been so long since I've lived in Spain). Are they publicly available?
Yeah :-/ I know, that's why I'm asking if this can be reconsidered. As a non-English native speaker, I think it's a shame. Rather than read the English translation of Les Misérables, I'd rather read the original French text, and having Standard Ebooks as a source for non-English books means I can:
1. Discover / browse / search.
2. Expect high quality text & typography.
3. Be part of a high-volume community that archives and refines texts year after year.
"Then build your own similar site with high-quality titles in French!", I hear some say. Sure, but I see tons of excellent marketing & infrastructure already done by Standard Ebooks, which could be reused for non-English books!
Said differently, mid/long term I see the greater good being achieved by supporting non-English titles in Standard Ebooks itself, not in one satellite mimic site per language, of varying maintenance quality. To compare with the domain of online encyclopedias, these are the same reasons to have en.wikipedia.org and fr.wikipedia.org maintained under one (Wikipedia) umbrella sharing infrastructure, not wikipedia.org and frenchwikipedia.org maintained by entirely different teams.
It looks like the subject has come up a couple times on the mailing list [1] [2], but in my (admittedly casual) searching I didn't find anything since 2017, with the most relevant answer being "not at the moment". Since it has been a couple years, maybe it's worth bringing up again on the mailing list?
I was thinking that one of the problems with a lot of public domain translations is that the translations are extremely dated. So you're reading a text that is itself a dated translation, one you almost need a translation of.
This isn't the fault of Standard Ebooks, but it's an elephant in the room of public domain texts in general. In many ways I'd rather see effort put into crowdsourced or volunteer modern translation than into improved copyediting.
It's especially problematic if you consider that the original text in the foreign language isn't available as well, as you're noting.
I guess that makes sense. They’re pretty rigorous about conforming to style and it’d be hard for the (presumably English-speaking) editors to uphold the same standards for a language they don’t share. Not to mention each language would probably require its own style guide. I know French literature often uses guillemets or a leading em dash where English uses quotes.
I have a longer response that explains why parameterizing the project around language isn't so straightforward. Nor does it necessarily make sense because the thing that SE brings to the table is English expertise.
All of the work in polishing an epub is the chore of transcription and then nitty gritty details like correctly tagging things like roman numerals and embedded poems and following some sort of standardization guide.
The thing that Standard Ebooks does, aside from being decisive over its standards, is then require the epub to go through a review process by its creator who is a domain expert in the craft. This expert bottleneck is a big reason why SE book quality is so reliable. Accepting other languages drastically changes and perhaps even relaxes this bottleneck and changes the whole organization.
In another comment, I think you suggest that it might be time to try to persuade him to accept non-English books:
> Agreed, it's worth bringing up again on the mailing list; will do this week and post here a link to my message.
But it's not really up to persuasion. Because it takes more than an idea to expand the accepted languages. It requires at least one reliable expert in that language who can stand up a completely new set of tools and standards and then steward that project to fruition. And that's such a big undertaking that it's really a whole new project, not just yet another egg under SE's wings, but a whole new chicken coop.
Suggesting that Standard Ebooks move to support other languages is 0.0000001% of the work towards that goal. People do that all the time, then claim "okay, I'll fork the project", and then fizzle out.
Another example is that whole swathes of code in https://github.com/standardebooks/tools, SE's core workflow, become useless once you're targeting something other than English. By browsing their style guide and that repo, you'll realize that SE's value really is its focus on English. It's more of a suite of English tools than it is an epub editing kit, as the latter is the easy part.
I've been monkeying around open source long enough to know veeeery well that "Suggesting [...] is 0.0000001% of the work towards that goal", and I've been guilty of that myself :).
Still, I like SE a lot and might be interested in doing the work. So, I'll make my point to the ML (expanding on A. points I brought here, and B. contradictions that you and other commenters wrote, thanks), asking if folks are convinced by the vision, and asking for technical advice to build the incremental path. Then if there's agreement, maybe I, or someone else, will commit to it.
Fair enough. But SE's creator is simply going to recommend that you fork the project. Not that I speak for them, I've just been following the project long enough to see the interaction many times.
It's also the only avenue that makes sense. As I said in a sibling comment, I've been working on a Spanish-language version of SE and I have some good ideas of how the tool chain could be parameterized for multi-language support, but it also feels like a pointless cherry on top (and a complication of an already non-trivial workflow) to bring everything under one umbrella. And it would require a loss of control for SE's creator.
I recommend epub'ing a few book scans yourself in your preferred language and starting a project around that. You'll find other people who have at least started a fork themselves that you might be able to round up under one org. I personally have aimed for 10 ~finished books and a domain name that hosts my own style guide + tutorial to prove that I'm serious about it (to myself) before I publish my efforts. And you need as much to appeal to any would-be contributors.
I think the only plausible place to be in 5 years is for a couple major sister projects to reach maturation and then form a sort of ring of "Check out lettres-libérées.org for a similar project for French works."
As others mentioned, different languages have different typography rules. Our toolset is English-based and making it generic for any language would be extremely difficult and time consuming. Additionally, all of us read in (at least) English, and at least I can't credibly say I'm a grammar/typography expert in any other language.
In the past people have expressed interest in forking the toolset for other languages, which is totally fine. But I don't think I've seen any of those attempts come to fruition yet.
> "Still, I like SE a lot and might be interested in doing the work. So, I'll make my point to the ML (expanding on A. points I brought here, and B. contradictions that commenters brought), asking if folks are convinced by the vision, and asking for technical advice to build the incremental path. Then if there's agreement, maybe I, or someone else, will commit to it."
Do you think it remains worthwhile that I bring the discussion, or is it 100% nailed among current maintainers that the "incremental path" I'm hoping for doesn't exist and "just fork / do your own thing for your language and register your domain" is already the consensus?
This is fantastic. I had no idea this site existed, so thanks for that. Some weeks ago I had the idea to create a similar site to compete with Delphi, but you can't compete against free (in both senses), especially when the quality is very high. Kudos.
I just released what I think is a well crafted e-book for a favorite old science fiction book, with permission from the author: https://twirb.github.io
Huh, that is weird that the dupe detector didn't pick that up. I was going through some old bookmarks and submitted dark.fail and it hit the dupe detector as a duplicate submission from October 2019 from user rakefire.
A few other links to different sites I tried submitting also hit the dupe detector. Normally I do a hn.algolia.com search on the headline keywords (sorted by date) for anything current before submitting a link, but after the third or fourth 'dupe' I dug out the Standard Ebooks link, gritted my teeth and mentally dared the dupe detector to find someone else on HN that had come across the site before. Turns out many had (lol).
Your comment made me chuckle (and also scratch my head a little about the dupe-detector algorithm) so thanks for pointing it out (and have a +1 from me as thanks).
What e-reader software for desktop computers do you recommend that doesn't look bad?
It seems like a waste to read a carefully crafted ebook using Calibre, which is pretty bad from a graphical/aesthetic perspective. Foliate is better, but not great either.
I'm thinking of running a Windows VM just to run Adobe Digital Editions and read epubs comfortably. Is there a better option?
I absolutely love this service and have downloaded the books and uploaded and synced them on my Google Books iOS app. Some books I'd like to recommend: Meditations, Dialogues, The Enchiridion, Discourses, and Siddhartha.
I would love if the browse page could sort by "popular" or "recommended". With such a carefully curated collection it seems a shame to not surface the best ebooks more easily.
I started reading The Arabian Nights a while back. The OCR errors in the edition that was in Apple's book store (which I believe was from Project Gutenberg) made the book damn near unreadable. I don't think I even finished the first volume (although I did eventually figure out that “cloth” was supposed to be “doth”).
Is there a site similar to this for books not in the public domain? I wouldn't mind paying for ebooks if they were this nice and didn't lock me into using one application...
I think it would be hard for non-public-domain books for one single reason: licensing and DRM.
Similar to how Netflix/Prime video don’t let us download the movie file without their app.
There are non-public-domain ebooks out there without DRM. Kobo sells some, for example, though you have to check the format before purchasing since they aren't all DRM-free. Libreture has an extensive list of DRM-free ebook shops[1].
I didn’t know about this and I love it. I’d been torrenting classic books and always felt a bit silly about it since they are long since out of copyright.
You can transfer a regular epub to a Kobo, but it will trigger the ADE renderer, which is garbage. Think IE6-level rendering. Using a kepub file will trigger Kobo's better native renderer, which I strongly suspect is Webkit-based and can render lots of advanced CSS and even MathML.
We haven’t traditionally had that, as the books are improved gradually over time as we update the core frameworks and they get more proofing. A download all button would give you a static snapshot of a point in time that might take many years to consume.
If you mean submitting an ePub you’ve already worked on, then it would need to be redone to use our framework. By having a “Standard” imprint we can continuously upgrade and work on the whole corpus.
Ah, OK. We’re focused purely on English works in the first place (see conversation elsewhere in the thread), but we don’t accept modern works dedicated to the public domain anyway. Congratulations on getting your novel finished though! I’m always impressed by people who can complete a book.
Agreed, I've been meaning to update the list view and add better sorting and filtering. The current list view was created when we just had a few pages of ebooks, and it made sense then. Now that we have hundreds of ebooks it doesn't make as much sense anymore. But the problem is finding the time!