Digitized Books from 1923 Now Available at the Internet Archive (openculture.com)
324 points by ingve 76 days ago | 59 comments

The Internet Archive is amazing!

They accept donations: https://archive.org/donate/

Here's an odd thing: browsing the fiction books with date:1923 I saw a familiar title, "Seven Days in May" by Knebel and Bailey[1] -- a story definitely written well after that year, since it opens at the Pentagon, a building only opened in 1943.

In fact, per Wikipedia[2] it was published in 1964.

Yet if you open the IA copy[1] at the linked page you will see "First published 1923".


Heck, here's another: The Doomsday Men by J.B. Priestley[3], "First published 1923" but in fact[4] in 1937.

[1] https://archive.org/details/in.ernet.dli.2015.124792/page/n1...

[2] https://en.wikipedia.org/wiki/Seven_Days_in_May_(novel)

[3] https://archive.org/details/in.ernet.dli.2015.148207/page/n5

[4] https://en.wikipedia.org/wiki/J._B._Priestley#Novels

Since this is a sliding window it might be worth looking at the entire middle and late 20's. Cribbed from a top 50 list, I see works by F Scott Fitzgerald, Hemingway, Agatha Christie, Virginia Woolf, DH Lawrence, Dorothy Parker, Gertrude Stein, Franz Kafka, PG Wodehouse, Aldous Huxley, ee cummings, Pablo Neruda, AA Milne, Sinclair Lewis, Faulkner, Hesse...

Man the Roaring 20's had a shitload of authors...

Nice catch.

I cannot think of any other reason than a typo in the original prints, with the people responsible for digitizing the books having mechanically filed them under the date reported in the body as that of the first edition.

Maybe it is the equivalent of a modern-day copy/paste error: they got the sheets reporting the first edition mixed up with those of another book. Who knows.

There are still many, many 18th century American cookbooks that haven't been digitized yet. Some have probably been lost given their proximity to kitchens, which were separate from the house, more prone to weather and, perhaps, fire. I can only find references in old catalogues that they existed, but nothing much else.

I don't know how to ask this question, but how useful is a cookbook from the 18th century (or even the early 20th century)? The ingredients are likely called something else, or may be unavailable.


> There are ways the historian can read between the lines of the recipes, so to speak to answer questions that are not directly related to cooking or material culture but may deal with gender roles, issues of class, ethnicity and race. Even topics such as politics, religion and world view are revealed in the commentary found in cookbooks and sometimes embedded in what appears to be a simple recipe. The most valuable of cookbooks and related culinary texts also reveal what we might call complete food ideologies...

> Nor should any of these caveats prevent the historian from cooking historic recipes today. There is something palpably direct to be gained from tasting food from the past, in much the same way as one can learn from hearing a symphony on period instruments or viewing an old painting in a museum. The esthetic values that inform flavor preferences of the past are indeed very different from our own. Some ingredients people enjoyed would today be considered abhorrent; some flavors and textures bizarre if not disgusting. But this should not deter the intrepid investigator... What one learns from such an exercise is a feeling for the embodied experience of physically carrying out certain culinary tasks and a direct apprehension of what the palates of our forebears might have experienced.

As a cookbook? Not very. I find them interesting as historical documentation of cooking.

You'll recognize most of the ingredients well enough based on my reprint of the 1884 Boston Cooking School Cook Book. But a lot of meats are going to be different enough that you'd want to change the prep. And, to most modern tastes, the recipes are just not going to be very good. I still keep some of the cookbooks handed down from my mother (like old Gourmet cookbooks from the 1960s or so). And even those are just not that useful for how I cook today.

I have my mom's old cookbook. Stuff works as long as you realize that modern butter has about 2/3 the fat of 1950s-era butter. And store-bought eggs are about 2/3 the size of ones in the 1970s.

Try making your own butter. The way I do it is just shake heavy cream in a jar.

You're correct that there's often a lot of interpretation and historical research that has to be done when looking through such cookbooks. However, they can offer quite an interesting glimpse into another era. Specifically, I'd recommend watching some of the Townsends "18th Century Cooking" YouTube videos [0] which show how some unusual (or familiar!) recipes from colonial America may have been made.

[0] https://www.youtube.com/user/jastownsendandson

That is one of the best channels on YouTube, and I know a few teachers who used some of his videos in their lesson plans.

You may also like https://www.youtube.com/channel/UCxr2d4As312LulcajAkKJYw

Lots of interesting 18th century cooking and other period skills.

Also, "A Taste of History", seen locally on PBS stations.


One great example of utility: my grandmother had a cookbook from the early 1900s that was truly farm (or forest) to table: it included instructions for catching and cooking squirrels, as well as the complete process for taking a chicken from your yard and preparing it for dinner. It was quite impressive. And this was a brand-named cookbook, like Better Homes & Gardens or some such. My aunt now possesses it, much to my chagrin.

Get some Foxfire books on ebay.

History? Curiosity? Some of them might contain interesting recipes? Some cookbooks might've been written in an interesting style? Some books might be able to give clues about miscellaneous uncertainties about the everyday life of people around that time? Or if not their life, perhaps they might shine light on how certain tools were used in ways we don't know? Perhaps some words used in these books are useful for linguists? Maybe we'll uncover some hidden etymologies.

Also, reading stuff that's just out there can help shake up your mind in interesting ways.

There are tons of possibilities.

Actually surprisingly useful. Cooking has grown as an art, but it's not dramatically different. Words differ some, but not in a way that can't be resolved with a bit of experimentation.

Agree with this, and further add that sometimes you find interesting solutions/ingredients that were used in the past because of lack of modern convenience. You could buy most staples, but even things common during their harvest season were unavailable most of the year. One of the most surprising ingredients I've seen was turpentine! People used to eat that!

Actually there are still people that eat turpentine, it gets peddled by some of the "natural medicine" crowd.

Old cookbooks are little historical slices of culture in the commoner language/parlance of that time.

They are incredibly useful for historical insights that would otherwise be lost, like what the supply chain was even like [1].

[1] http://journal.media-culture.org.au/index.php/mcjournal/arti...

Kitchens definitely were separate from the house due to fire concerns.

While I love the Internet Archive, their search can be a bit of a mess. They often have several copies of the same work in varying quality with no easy way to find the best one, their tagging standards are somewhat inconsistent, and they frequently have copyright-encumbered works inadvertently available. As people mentioned below, Project Gutenberg is excellent for finding public domain books, but there seems to be no way to search for and watch most public domain films in a reasonable quality, let alone high definition. Is anyone trying to fix that?

I'm pretty sure I've downloaded (non-fiction) books newer than 1923 from the IA before --- in fact I just checked one and it's from 1953. Is it just because those are not actually public domain, and no one has bothered to issue a takedown notice for them (yet)?

A lot of books published before 1964 in the United States are out of copyright:


Works published before 1964 in the US are all in the public domain, excepting only those for which a renewal was registered with the US Copyright Office. Relatively few works from this era have had their copyrights renewed. A US Copyright Office study in 1961 found that fewer than 15% of registered copyrights had been renewed.

Google Books, last time I looked, applied a blanket year-based cutoff rule and did not allow full view access to these works that failed to renew their copyrights, unless publishers explicitly opted in. Other institutions like the Internet Archive and HathiTrust have put more work into determining the copyright status of individual pre-1964 works and opening the public domain works up for full view.
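
For what it's worth, the rule quoted above can be sketched in a few lines of Python. This is only a rough simplification, not legal advice: the function name, the constants and the return strings are made up for illustration, the 1923 cutoff assumes the situation as of 2019, and it ignores copyright notice requirements, foreign works and unpublished material.

```python
from typing import Optional

# Assumption: as of 2019, US works published in 1923 or earlier are public domain.
PUBLIC_DOMAIN_THROUGH = 1923
# Works published before this year needed a registered renewal to stay in copyright.
RENEWAL_RULE_BEFORE = 1964


def us_copyright_guess(year_published: int, renewal_registered: Optional[bool]) -> str:
    """Rough status guess for a work first published in the US (hypothetical helper)."""
    if year_published <= PUBLIC_DOMAIN_THROUGH:
        return "public domain"
    if year_published < RENEWAL_RULE_BEFORE:
        if renewal_registered is None:
            # Renewal status has to be looked up per work; no shortcut here.
            return "unknown: check renewal records"
        return "likely in copyright" if renewal_registered else "public domain"
    # 1964 and later: the renewal rule described above no longer applies.
    return "likely in copyright"


if __name__ == "__main__":
    print(us_copyright_guess(1953, renewal_registered=False))
```

The `None` case is the interesting one: a blanket year-based cutoff treats it as "in copyright", while checking renewal records per work is the extra effort the quote credits to the Internet Archive and HathiTrust.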


What's the book?

There are lots of reasons why a work after 1923 might be in the public domain, but as of this year, everything published through 1923 is in the public domain. I would guess FOSS/FLOSS software is probably the most likely thing people might recognize as "public domain-like" (due to the issue of taking something public domain, then copyrighting your original work on it e.g. Disney's erstwhile business model).

Copyright could have been abandoned, creator hasn't pursued their claim, it could have been specifically released by the creator, "copyleft", ...

This one, for example:


It's probably the first two of your guesses, or the pre-1964 reason mentioned in a sibling comment. (There's stuff newer than that on the IA too, which are likely to be there because of the reasons originally mentioned.)

Looks like there's a lot of McGraw-Hill Book Company (now McGraw-Hill Education, not sure if that has something to do with it) books on archive.org, particularly from the "Public Library of India." I'm wondering if these came through some digitization effort that slipped through copyright, or maybe the original digitizer had obtained specific permission for these volumes? Definitely not obvious to me.


I think The Prophet by Kahlil Gibran is considered one of the classics just entering the public domain. I also found The Prospects of Industrial Civilization by Dora and Bertrand Russell to be interesting. In looking at the plain text, though, I was annoyed at how error-ridden the OCR job was. I think it will be interesting to see how much of the 1923 work is added to a site like Project Gutenberg, as ABBYY FineReader 11.0 leaves a bit to be desired based upon the published results on archive.org.

If interested, you can contribute to the process: https://www.pgdp.net/c/

The bottleneck at PGDP these days is not actually proofreading texts or even marking them up, but "post-processing" them - going from a bunch of proofread and marked-up page texts (and the original page images) to a complete HTML edition, ready for posting at Project Gutenberg. Many projects run aground at this stage, because they have relatively few contributors for that sort of work. So, if you really want to help (and have relevant skills), you should ask about taking over some old projects that are stuck in that state.

All of PGDP is fun and satisfying volunteer work if you like reading, but the Post-Processing stage was, for me, the most fun of all. However, in order to qualify as a PPer you must first put in a rather lengthy apprenticeship working in the different levels of proofing.

PGDP FAQ: https://www.pgdp.net/wiki/DP_Official_Documentation:General/...

PGDP Post-Processing FAQ: https://www.pgdp.net/wiki/DP_Official_Documentation:PP_and_P...

> On its journey through multiple proofreading and formatting rounds, the text may have been worked on by hundreds of volunteers. Post-processors must standardize the formatting of the book and adjust it to comply with Project Gutenberg's requirements. They must also deal with any detectable mistakes or inconsistencies that have survived all proofreading and formatting rounds.

> The ultimate goal of post-processing is to create a consistently formatted etext, that contains as few errors as possible and that accurately reflects the intentions of the author.

I was wondering about PG as well. If I understand right, the whole project relies on volunteers to OCR and then manually correct the texts. I'm not entirely clear on how they're organized, though.

Edit: see sibling comment made at the same time. PG distributes pages to volunteers for proofreading.

I don’t know how I should feel about this but I don’t recognize many books published in 1923. I was looking to see what classics I could easily add to my Kindle but the list doesn’t seem as recognizable as 1922 or 1924. Some of the things I recognize from 1923 are parts of series (Poirot, Jeeves & Wooster) rather than stand-alone books.

Can anyone recommend something from 1923?

Three of my favorites:

The Prophet, by Kahlil Gibran https://www.gutenberg.org/ebooks/58585

Men Like Gods, by H. G. Wells http://gutenberg.net.au/ebooks02/0200221h.html

Leave It To Psmith, by P.G. Wodehouse https://archive.org/details/in.ernet.dli.2015.126397/page/n1

This is the argument I've heard from Fine Arts people on the subject of the written word:

You won't have heard of any of those books from '23. Your English teacher either picked books that they could get for next to nothing for a roomful of students (e.g., public domain "classics"), or sucked it up and fought for their absolute favorite book of all time.

So you're mostly going to hear of really old books in class, and really modern books from media and friends, and between that maybe the best 50 books from the last 50 years from proper literary mavens.

By moving the public domain forward in time you start opening up the opportunity for kids to hear about books from the 20's and 30's, things that might tie into History class (the Depression, the leadup to WWII) and thus into other aspects of life (civics, poli-sci, etc).

So your kids will know of books from the 20's, their kids books from the 40's (Tolkien), and so on. And your great great grandchild will encounter Harry Potter for the first time in the 7th grade.

I'm not sure. I admittedly didn't go to a public school but we definitely had in-copyright books from John Hersey, Pearl Buck, James Hilton, Evelyn Waugh, John Steinbeck, Ernest Hemingway. Maybe that qualifies as "maybe the best 50 books from the last 50 years from proper literary mavens" but, in general, I'm not sure that out-of-copyright was overrepresented for any reason but that they were Western canon. (And, in many cases, not very readable by high-schoolers.)

I'm all for shorter copyright terms of course.

There were certainly classes where we had paperbacks, but you don't spend an entire semester going over one book. I think other than electives and higher education, I rarely saw more than one paperback per semester, and a few of those were mandated by the state board of education (eg, Animal Farm).

In the earliest book-heavy class I can recall, probably senior year of high school, we read a bunch of Edgar Allan Poe out of the textbook (Poe is out of copyright). There were book reports, but that was one copy per student and I think more than a few of us checked books out of the library for this.

I think Gatsby was the only book we had to buy that year. That book is from 1925, and should have been out of copyright decades ago. So that everyone could suffer^W enjoy it for free.

Goodreads has "Popular by Date" functionality that will help a lot. Just a glance at the top of the list shows The Prophet by Kahlil Gibran, Bambi by Felix Salten, some Borges, some Lovecraft, the Discovery of the Tomb of Tutankhamen by Howard Carter, The Ego and the Id by Freud...

Overall I'd agree with you. Most of the specific titles I recognize from this list are dry, downbeat and not the strongest or most memorable work from those authors. The rest is either series material with characters or stuff that seems to have been forgotten.

Borges is probably no good in this context; the original Spanish might've been published in 1923, but the English translations will be more recent.

What? There are 400+ million native speakers of Spanish in the world. Second language included? 600+

I think what he's saying is that the translation is still under copyright. Only the original would be public domain.

There's a translation of Da Vinci's notebooks mentioned on the website. One would hope that the original had fallen out of copyright by now, so it's only the translation that is falling out of copyright.

Exactly. I produce ebooks for Standard Ebooks and there’s plenty of Russian literature that I’d like to do, were it not for the fact that the available translations are much later than 1923 even if the original books were published earlier.

And this is an English-language website predominantly read by users in the US about a change in status under US copyright law.

I think you underestimate HN's international readership.

I don't think I do, and I've looked at the estimated geoips of hundreds of thousands of HN hits to my website: https://www.dropbox.com/s/rtqafvuy18awda6/Analytics%20www.gw...

You see many Spanish-speaking countries listed...?

While I agree with your premise, it's not wrong to think of it the other way too (especially in terms of numbers as opposed to percentages)

Nevertheless, 13% of the US population are native Spanish speakers[1], and presumably more have some level of reading skills. Hard to judge overlap with HN readership, but it's probably fair to think that 4-5% of your site's readers can read Spanish.

[1] https://en.wikipedia.org/wiki/Spanish_language_in_the_United...

> Hard to judge overlap with HN readership,

Low. Very low. Think about the Hispanic population a little.

> but it's probably fair to think that 4-5% of your site's readers can read Spanish.

If they are, they aren't setting their OSes or browsers to use Spanish rather than English... An 'es' language code doesn't even show up in my GA language headers until #13 (consistent with the geoip), putting them at ~0.5% of all hits. I have more German, French, and Russian-preferring readers than Spanish (which is only slightly more popular than Portuguese). HN is cosmopolitan, but all the evidence I have is that it's not in a Spanish sort of way.

I appreciate that HN is more cosmopolitan than people think :-). People from different countries often bring great perspective to things that I wouldn't have come up with on my own. Another argument for the benefit of diversity, I'd say.

In drama, George Bernard Shaw’s Saint Joan. Though, of course, better seen performed. There is also Robert Frost poetry although much of that was already widely available.

> Robert Frost poetry.

The collection New Hampshire seems to be from '23.


Here it is on a Canadian website (where it was already PD)


Doctor Dolittle's Post Office

(also Richard Dawkins' favourite, curiously. https://www.independent.co.uk/arts-entertainment/the-book-th... )

Perhaps easiest to use Goodreads date search as a basic filter: https://www.goodreads.com/book/popular_by_date/1923/

I had found that list, but it doesn't really come with any kind of interpretation.

Reminiscences of a Stock Operator is from 1923 and is considered an investing classic. It is a good read (and still relevant) if you are remotely interested in trading.

Re: copyright, “We have shortchanged a generation,” - Brewster Kahle Internet Archive founder

A powerful statement! I love the Internet Archive. It gives me hope for humanity.

Looks like a non-GDPR-friendly site?


You don't have permission to access /2019/01/11000-digitized-books-from-1923-are-now-available-online-at-the-internet-archive.html on this server.

Same with the site's root.


You don't have permission to access / on this server.

Anyone have a mirror of it handy?

Coincidentally, it looks to be available via the Wayback Machine: https://web.archive.org/web/20190107210010/http://www.opencu...
