
GITenberg is an open source community for publishing ebooks in the public domain - bilinualcom
https://www.gitenberg.org/
======
gravitas
The [https://standardebooks.org/](https://standardebooks.org/) project has
been rebuilding ebooks (open ePUB format) with an eye on quality and
readability on mobiles/tablets (fonts, copyedit, etc.), which might be of
interest. All books and the website's revisions are tracked in git.

~~~
sethish
I'm a big fan of standardebooks! They're doing fantastic work producing better
ebooks from Project Gutenberg sources.

~~~
bilinualcom
OP here. I found this website when I was looking for a way to get the updated
version (with corrections) of a PG (Project Gutenberg) book, along with all
changes/diffs since the point at which I scraped the book from the PG website,
for my language learning side project:
[https://www.bilinual.com](https://www.bilinual.com)

The bilinual project also rebuilds ebooks (modern HTML, PDF and open ePUB
format) with better quality and readability (although that is not its prime goal).
Take a look at one example here:

[https://www.bilinual.com/book/18043sven/sv/en#line=68&lpp=23](https://www.bilinual.com/book/18043sven/sv/en#line=68&lpp=23)
[https://www.bilinual.com/download/18043sven-sv-en.pdf](https://www.bilinual.com/download/18043sven-sv-en.pdf)
[https://www.bilinual.com/download/18043sven-sv-en.epub](https://www.bilinual.com/download/18043sven-sv-en.epub)

~~~
bscphil
I tried to pull up a couple of books on the front page to check out what you
had, and

* One of them 404ed: [https://www.bilinual.com/download/30117fren-fr-en.pdf](https://www.bilinual.com/download/30117fren-fr-en.pdf)

* The other was full of problems: [https://www.bilinual.com/download/16210fren-fr-en.pdf](https://www.bilinual.com/download/16210fren-fr-en.pdf)

For example, many words don't have translations at all, and those that do are
often incorrect. This feels like a _very_ rough machine translation? For
example:

> et c'est surtout dans les paroisses riveraines du Saint-Laurent

You translate this

> and Ce east primarily in the · · some saint Laurence

While Google Translate gives

> and it is especially in the parishes bordering the St.Lawrence

If you're using machine translation, why not use a Google API that might give
usable results at least? If that's not plausible, maybe you should try to get
together a team of volunteers to manually translate these ebooks for language
learners?

(I hope these suggestions are helpful, I'm not trying to be dismissive of your
project.)

~~~
bilinualcom
Hi, Thanks for checking the website.

1- 404 issue: I implemented the PDF generation recently and noticed that
WeasyPrint has issues with HTML files that have too many tags (our books have
around 2*number_of_words tags in them). This is not a big issue and it will be
fixed in the next iteration.

2- Using Google API: Google APIs and other translation tools are great for
translating sentences. However, the problem with using parallel texts for
language learning is our brain's laziness. After a few pages, the brain loses
its patience for solving the translation problems (critical thinking!?) and
actually learning the words and the structure of sentences. The focus
immediately goes to the translated sentences in your native language rather
than the original text.

Personally, I learn a word for life when I slow down, think about similar
words and its root, and finally look it up in a dictionary. The process is
valuable.

3- Team of volunteers: It is easier said than done. The functionality is
present, but I prefer to improve the suggestion engine as much as possible
before I involve volunteers. Are you interested in joining?
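
To illustrate the tag count in point 1, here is a minimal sketch, assuming each word is wrapped in its own element (the `<span>` markup and the helper names `wrap_words` and `count_tags` are hypothetical, not the actual bilinual code). One opening and one closing tag per word gives roughly 2*number_of_words tags:

```python
from html import escape
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening and closing tag seen while parsing."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def handle_starttag(self, tag, attrs):
        self.count += 1

    def handle_endtag(self, tag):
        self.count += 1


def wrap_words(text):
    # Hypothetical markup: one <span> per word, standing in for per-word hints.
    return " ".join(f"<span>{escape(word)}</span>" for word in text.split())


def count_tags(html):
    # Feed the markup through the parser and return the total tag count.
    counter = TagCounter()
    counter.feed(html)
    counter.close()
    return counter.count


sample = "et c'est surtout dans les paroisses riveraines du Saint-Laurent"
markup = wrap_words(sample)
print(len(sample.split()), "words,", count_tags(markup), "tags")
```

Running a count like this on a full book before handing it to the PDF step would show how large the document tree gets.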

------
thangalin
I wrote a technical comparison between various public domain eBook projects:

[https://dave.autonoma.ca/blog/2020/04/11/project-gutenberg-p...](https://dave.autonoma.ca/blog/2020/04/11/project-gutenberg-projects/)

------
koolba
They’re probably going to have to change their name to something that does not
have “git” in it: [https://public-inbox.org/git/20170202022655.2jwvudhvo4hmueaw...](https://public-inbox.org/git/20170202022655.2jwvudhvo4hmueaw@sigill.intra.peff.net/)

~~~
sethish
Gitenberg was one of the alternate spellings of Gutenberg in the 1400s.

~~~
agumonkey
I propose Gytenberg

~~~
artiszt
or, the widely accepted standard [in German and Slavic languages] way back then:
'v' instead of 'u', and vice versa

~~~
agumonkey
ah well, for some reason the visual aesthetics of gytenberg vs gvtenberg made me
choose the former (even though at first I typed gvtenberg)

------
maire
This looks like a subset of Project Gutenberg?

The Girl from Alsace in Gitenberg:
[https://www.gitenberg.org/book/35926](https://www.gitenberg.org/book/35926)

The Girl from Alsace in Gutenberg:
[http://www.gutenberg.org/ebooks/35926](http://www.gutenberg.org/ebooks/35926)

The numbers are even the same which seems suspicious. Hmmm.

~~~
sethish
Yes, absolutely. Apologies if that's not clear from the website. GITenberg
started as an experimental fork of PG, but due to the work of Eric Hellman at
the Free Ebook Foundation, much of the infrastructure such as metadata
formats, CI/CD for building books, and DVCS backend are being ported upstream
to the Project Gutenberg infrastructure.

~~~
maire
Thanks for the clarification!

~~~
bilinualcom
OP here, I doubt that the website is "a SUBSET of Project Gutenberg". This is
not mentioned anywhere on the website. As mentioned on the PG website, while you
can use the books freely, "The name 'Project Gutenberg' is a registered
trademark." I guess this is the reason they didn't mention PG.

~~~
sethish
Now I remember, that was the reason I wasn't clearer about the connection
to PG when I wrote the website [https://github.com/gitenberg-dev/giten_site/](https://github.com/gitenberg-dev/giten_site/). Since then my
co-founder Eric Hellman has been doing engineering work for Project Gutenberg,
as well as running the rest of the Free Ebook Foundation, which is the parent
org of GITenberg, free-programming-books, and Unglue.it.

I think that the GITenberg collection contains all of the books in PG. At this
point, the creation of new repos is automatically done when Distributed
Proofreaders creates a new book in PG. Originally, I didn't include around 400
PG books due to their creators claiming copyright, and didn't include Bruce
Sterling's book because he wouldn't let me re-license it under Creative Commons
rather than his pseudo-public-domain license.

Not much has been happening with GITenberg itself in the past few years. But
luckily, a lot of the concepts and code are getting upstreamed into PG. Which
in my opinion, is way way better.

~~~
bilinualcom
Thanks for the clarification.

------
hnarayanan
I don't know why, but I expected some amazing typography in their PDFs. :(

~~~
sethish
[Standard Ebooks](https://standardebooks.org/)
has accomplished a lot in producing better ebooks of public domain texts.

~~~
bilinualcom
OP here, it seems standardebooks doesn't provide the books in PDF format. My
side project, [https://www.bilinual.com](https://www.bilinual.com), rebuilds the
ebooks in PDF format with translation hints, if you don't mind learning
a new language while reading your favourite books ;)

~~~
sethish
There are about 400 GITenberg books that have CC-by licensed covers provided
by Recovering the Classics. If you're interested in using that art for your
PDFs I can find you the index!

~~~
bilinualcom
Thanks, that would be great. I am curious to know why they have a CC license
and are not public domain?

~~~
robin_reala
We have this problem with Standard Ebooks. The number of people that say
things are public domain without actually checking it is very high. A CC0
licence is an _explicit grant_ of public domain status by the licensor, and
hence the legal issues rest with them in the event of any problem.

Public domain obviously can be ascertained, but if CC0 hasn’t been granted we
rely on dated reproductions: basically a photograph of the artwork in question
in a book or journal with a copyright date of 1924 or earlier.

(that’s obviously specifically a US legal reading, but SE from a legal point
of view is a US project)

~~~
maxerickson
CC0 isn't an indemnity.

The difference between falsely claiming something is public domain and falsely
claiming to grant a license to it under CC0 is going to be pretty minimal (and
is likely to result in little more than "please stop", blood and turnips and so
on).

------
sethish
You might be more familiar with another project the Free Ebook Foundation
maintains: [https://github.com/EbookFoundation/free-programming-books/](https://github.com/EbookFoundation/free-programming-books/), which is
one of the top-10 repos on GitHub by number of stars.

------
anaphor
I like how they're able to accurately tag translators vs original authors,
e.g. [https://github.com/GITenberg/The-History-of-the-Peloponnesia...](https://github.com/GITenberg/The-History-of-the-Peloponnesian-War_7142/blob/master/metadata.yaml)

------
voldemort1968
What is improved by adding Git to PG?

~~~
bilinualcom
Well, for my side project,
[https://www.bilinual.com](https://www.bilinual.com), I needed version
control for the changes that are made to a book (fixed typos, ...) and I found
the GITenberg project. There are several interesting tools developed during the
project, accessible here: [https://github.com/gitenberg-dev](https://github.com/gitenberg-dev)
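
For a rough idea of what tracking such changes buys you, here is a minimal sketch using Python's standard `difflib`, assuming you hold two plain-text versions of the same book. The function name `book_diff`, the labels, and the sample excerpt are made up for illustration and are not part of GITenberg's tooling:

```python
import difflib


def book_diff(old_text, new_text, old_label="scraped", new_label="updated"):
    """Return a unified diff between two versions of a book's text."""
    old_lines = old_text.splitlines(keepends=True)
    new_lines = new_text.splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            old_lines, new_lines, fromfile=old_label, tofile=new_label
        )
    )


# Hypothetical excerpt: the newer version fixes one typo on the first line.
scraped = "It was the best of tmes,\nit was the worst of times,\n"
updated = "It was the best of times,\nit was the worst of times,\n"

print(book_diff(scraped, updated), end="")
```

With the books in git repos, `git diff` between two commits gives the same kind of output without any custom code, which is the main convenience here.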

