Ask HN: How do I easily catalog a couple thousand physical books?
177 points by deanebarker 32 days ago | 121 comments
I own a couple thousand books. I'd like to catalog them all. I have a child who is a broke college student, so I was thinking of paying them to do it over break.

What's the most efficient way to do this from an INTAKE standpoint? I need to get all the ISBNs into a database of some kind.

(The only other info I need is whether or not it's a hardcover or softcover -- that's something only the physical copy can tell me, everything else I should be able to get from the ISBN.)

I don't want my daughter to have to find and key all the ISBNs in. Can they be scanned in some way? Is the ISBN in the UPC code? Could I buy a cheap bar code scanner and just have her scan away?

I have 3,476 books cataloged at the moment on https://www.librarything.com/. I bought one of their barcode scanners (https://www.librarything.com/more/store/cuecat) to do my initial cataloging but you could also use the scan feature on their mobile app.

I prefer LibraryThing to Goodreads because LibraryThing focuses more on cataloging than social features. Their team also builds software for actual libraries. They source book data from almost 5,000 external sources so it's easy to map ISBN information with the correct edition and cover. You can also get your data out pretty easily, they offer exports in multiple formats.

EDIT: For most books, you can scan the barcode on the back to get the ISBN. Mass market paperbacks seem to usually have separate UPCs. The ISBN barcode is often located on the reverse side of the front cover, so you want to scan that one instead of the one on the back.
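On the "is the ISBN in the barcode" question: as far as I know, the barcode on most modern books is an EAN-13 whose 13 digits are the ISBN-13 itself, starting with 978 or 979, while a 12-digit read is a plain UPC-A like the ones on mass market paperbacks. A minimal Python sketch of telling the two apart; `classify_scan` is a made-up helper name, and it looks only at length and prefix, not at the check digit:

```python
def classify_scan(code: str) -> str:
    """Guess what kind of barcode a scanned string is, by length and prefix only."""
    digits = code.strip().replace("-", "")
    if not (digits.isascii() and digits.isdigit()):
        return "unknown"
    # Assumption: book barcodes are EAN-13 codes in the 978/979 range.
    if len(digits) == 13 and digits[:3] in ("978", "979"):
        return "isbn13"
    # A 12-digit read is treated as a UPC-A; the ISBN then has to come
    # from another barcode or from the copyright page.
    if len(digits) == 12:
        return "upc"
    return "unknown"


if __name__ == "__main__":
    for scan in ["9780306406157", "036000291452", "hello"]:
        print(scan, "->", classify_scan(scan))
```

Anything that comes back as "upc" or "unknown" is a candidate for flipping the book over and trying the other barcode.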

> one of their barcode scanners (https://www.librarything.com/more/store/cuecat)

That's a blast from the past! I remember getting a CueCat in the mail (I guess in 2000, based on the timeline in the Wikipedia article[0]). Neat to see that they've found another life (many others, by the sounds of it).

[0] https://en.wikipedia.org/wiki/CueCat

FWIW, the CueCat works but is fairly slow at scanning. If you have a large volume of books, it's worth spending a little more to get a faster scanner.

yah faster barcode readers are surprisingly cheap these days...

I recall trying to look into barcode scanners for personal use (figured I could catalogue everything in my house) but found it surprisingly difficult to find recommendations that seemed reliable, fast and integrated with software usefully.

My dream really is the Home Depot Scanner, because that thing is stupidly satisfying

I think I still have a couple of cuecats as souvenirs from the dotcom days

Another happy user of Librarything here. It provides a beautiful public-facing catalog[0] if you want in addition to the full[1] one.

Most importantly for me, Librarything lets you add books that do not have an ISBN, unlike every one of their competitors I've tried. Nearly 20% of my books are too old to have an ISBN, so that's pretty important to me.

[0] https://www.librarycat.org/lib/ddrucker [1] https://www.librarything.com/profile/ddrucker

I'm super curious -- do you mind sketching out what books you collect that generally don't have an isbn, if you have time?

Thanks for indulging my curiosity. Cheers!

You can look for yourself! Go to https://www.librarything.com/catalog/ddrucker/yourlibrary and select style E, then sort by ISBN so all the blanks are at the top!

oh cool ty!

ISBNs were only introduced in the 70s, and I have quite a lot of second hand books that predate its introduction. Even if a book does have an ISBN there are a significant number from before bar codes became ubiquitous so are slightly more annoying to scan.

Haha same, made me instantly think of 'The 9th Gate' which I always loved for the showcasing of the rare book world. I'd always wondered if that's how it really is out there. One man with a priceless collection on the floor, another with his in an air-tight high-tech vault in a high-rise.

I would like to collect some 19th century and early 20th century science and technology books, which wouldn't have an ISBN. Electricity, telegraphy and radio, industrial production, that sort of thing. History of technology is fascinating, and sometimes you can find those books very cheap or even get them for free from someone clearing out their grandpa's attic.

edit: I don't own any but regret not buying one good collection recently for cheap, but it would have taken a considerable amount of space on the bookshelf. There's definitely a difference in occasionally looking through a physical book, the digital ones don't offer the same joy :-)

I don't seem to see anything on their website, but does it have any features for understanding where a book is located? I have many different locations in the house (or at the office even) where it may be located, so having a "where" would be quite handy.

I use their "Collections" feature for this. A book can be in multiple collections, and I have a "Basement" collection for books in storage. They also have tags that you can use for a similar purpose.

Ok, that could work. But I was thinking about something with more of a hierarchy. Home > Bedroom 1 > Bookshelf 2 > Shelf 3, kind of thing. And then move it to Home > Bedroom 2 > Nightstand 2, or whatever it is.

You could name each tag with its full path, so that a single tag spells out something like Home > Bedroom 1 > Bookshelf 2 > Shelf 3.


I've seen this in some databases that store tree structures.
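A rough sketch of the path-as-tag idea in Python. The "/" separator, the sample titles and the `books_under` helper are all assumptions for illustration, not anything LibraryThing provides. Comparing whole path segments rather than raw string prefixes keeps "Bedroom 1" from also matching "Bedroom 10":

```python
def books_under(locations: dict[str, str], prefix: str) -> list[str]:
    """Return titles whose location path sits at or below the given prefix."""
    wanted = prefix.split("/")
    return sorted(
        title
        for title, path in locations.items()
        if path.split("/")[: len(wanted)] == wanted
    )


if __name__ == "__main__":
    # Hypothetical catalog: title -> location tag written as a path.
    shelves = {
        "Dune": "Home/Bedroom 1/Bookshelf 2/Shelf 3",
        "Emma": "Home/Bedroom 2/Nightstand 2",
        "Ulysses": "Home/Bedroom 10/Shelf 1",
    }
    print(books_under(shelves, "Home/Bedroom 1"))
```

Moving a book is then just replacing its one path tag.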

It must be amazing to have your own library. I wonder sometimes what various collections would have looked like now, had I ever been allowed to keep one.

I spent a lot of time in libraries over the years, especially as a kid, and being surrounded by books just does something.

Now libraries feel more like hangout places where kids vape and talk on the phone.

To be surrounded by books, in your own peaceful place, sounds like pure bliss.

Never too late to start. I have lots of books that I've collected over the years, but I've had several portions of the collection "dispersed involuntarily" for various reasons. :D

To be surrounded by books of your own choosing is nice. It doesn't have to be a lot though. Just having a stack of nice books near your bed that you like or believe you will like is usually enough.

I am a firm believer in tsundoku.

Visit a few estate sales and library book sales. It’s quite doable and starts to feel immersive once you have two shelves.

I'm going to give this a go, all much appreciated!

When you see a lot of books, ask the seller for a bulk discount. Check with recycling centers and ask really nicely. Third, local libraries get donations they then sell for money donations or they have to dispose of - especially old text books.

These are ways to acquire books, but maybe not quality or good condition. Another option is Half Price Books sells "books by the foot" like old encyclopedias or law books for aesthetic value.

Thanks for all of these, I'm making note of all the great tips.

I'd have never thought of any of these. I just feel like the older I get the more precious books and knowledge become. Like if there's anything at all worth collecting it's books.

I can't believe books are sold by the foot now! I used to save money from doing odd jobs to get encyclopedia Britannica volumes as a kid, those and the Wildlife Treasury Series. When I got a bit older those damn Scholastic catalogs got me too! XD

LT has a free iOS app - that's what I use to scan in my books.

Would other barcode scanners work as well? I know the cuecat has a cult following and a fun history, but you can get modern handheld scanners for about the same price (up through "far far more expensive" obviously). But I have no idea if librarything integrates with them as smoothly.

Almost certainly any barcode scanner would work. A barcode scanner is essentially a keyboard emulator. It types in the value it scans into whatever field the cursor is in. You can also program them using a set of special barcodes you can download. A nice feature is to make it hit 'enter' after scanning to activate a feature or 'tab' to jump to the next field!
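Since the scanner just "types" a code and (if configured) presses Enter, anything that reads lines of text can collect scans. A minimal sketch in Python, assuming one code per line; `collect_scans` is a hypothetical helper, and dropping repeats is only a guard against accidental double scans, so leave that out if you own duplicate copies on purpose:

```python
import io
from typing import Iterable, List


def collect_scans(lines: Iterable[str]) -> List[str]:
    """Return scanned codes in first-seen order, skipping blanks and repeats."""
    seen = set()
    codes = []
    for line in lines:
        code = line.strip()
        if not code or code in seen:
            continue
        seen.add(code)
        codes.append(code)
    return codes


if __name__ == "__main__":
    # Stand-in for a scanning session; passing sys.stdin instead would read
    # whatever the scanner types into the terminal, one line per scan.
    session = io.StringIO("9780306406157\n\n9780804429573\n9780306406157\n")
    for code in collect_scans(session):
        print(code)
```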

Oh wow they still have CueCats to sell?

From the sibling links it looks more like they continued making them - it now uses USB instead of PS/2.

I don't think they make them anymore, according to this website[1], DigitalConvergence had an inventory of 3 million PS/2 CueCats and 1/2 million USB CueCats when they folded. I'm pretty sure LibraryThing bought some of them, they first started selling them in 2006 [2].

[1] http://www.cexx.org/cuecat.htm#usb

[2] https://blog.librarything.com/2006/10/librarything-does-cuec...

I did the same thing, worked fine - although I had to add a certain number manually, which was still pretty quick.

I found it impossible to maintain though. From time to time I'd cull the physical books, but not update the library. Now it's just some books that I used to own.

Chiming in: Another happy LT user here.

Tried to create an account to try it out. Limiting passwords to 20 characters seems like a bad sign.

Barcode Scanner Phone app + Calibre

I've used this (Android) barcode scanner before: https://play.google.com/store/apps/details?id=com.sukronmoh....

On Calibre, simply go to "Add Books" > "Add books by ISBN", and paste a list of ISBNs. It will automatically download metadata and images for them.

You can also buy a bluetooth barcode scanner that works faster if this is going to be a regular thing.

I copied my LT & GoodReads catalogs into Calibre recently to reduce my dependency on third parties. I've been a Calibre user for years but hadn't pulled in all my physical books until recently.

I'm very happy with the results.

I sampled books from my personal collection and found many didn't have an ISBN. Officially ISBN started in 1970, and I just found one on a book that was printed in 1972, I'm not sure what the adoption curve was like.

ISBN is also not guaranteed to be a primary key. It's designed to serve the needs of new book sellers. If a book goes out of print the publisher can reissue the same ISBN to a new book. It's unusual, but South End Press notably was resentful about paying for new blocks of ISBN numbers and recycled ISBNs to "stick it to the man".

Some books have an ISBN barcode on them


but you don't want to waste time with a "cheap" barcode reader made in China and sold on EBay. I have played around with those and find that they read barcodes when they feel like reading barcodes and it is quicker to type the codes in.

There's a certain rare, out-of-print book that took me a couple of years to track down. It shares an ISBN with another common, inexpensive, in-print book whose only similarity to the rare one is that it came from the same speciality publisher. Every time I found a copy of the rare book on various used book websites, the common one showed up on my porch instead. Apparently there are a lot of people who treat ISBN as a primary key. A non-trivial number of sellers thought I was trying to pull a return scam on them because they could not be convinced that ISBNs are not unique. The fact that the book I tried to buy generally listed for around $300 and the book I was actually sent went for $15 used really made returns difficult.

My search finally ended when a copy showed up on eBay, complete with listing photos. I suspect the price I paid was much higher than usual for a book with similar desirability and rarity because the ISBN collision made it much harder to find.

Dumb question: If the book was harder for buyers to find, wouldn't that drive the price down, because not everyone who wanted to bid on it would be able to do so?

It was hard for buyers to find on used book marketplaces because sellers would scan a $15 book, their system would pull metadata for a $300 book because of the ISBN collision, and then you don't know which items are actually for sale until you order them and they arrive in the mail. It's worse when dropshippers get into the picture and there's effectively multiple listings per item.

On the other hand, eBay listings for books are usually made for a specific item, complete with pictures of the actual item. It was obvious that the listing was for the real thing.

Do you have a source for the non-uniqueness of ISBN? Google didn't provide anything tangible and I'd be very interested in a reputable place confirming your claim.

Actually, as far as I can see [0], ISBNs have to be unique. It seems like this "South End Press" is more of a case of a publisher going rogue than anything else.

[0] https://www.isbn-international.org/content/isbn-assignment (see the '“Out of print” editions' bullet)

Stephanie Marlowe's reply to


points out the non-uniqueness of ISBNs. I used to work in the IT department of a major university library and it was a problem that the librarians were aware of.

It takes exactly one "rouge" publisher to make a problem. (Funny I used to know the people at South End Press)

Fair enough, but my issue with your original comment is that it implied that _by design_, ISBN does not uniquely identify books, which is wrong, as far as I can tell. In any case, the issue of reuse of ISBN has practical consequences, so that the original intent might not be so important in the end.

Do you have any idea how many cases there are of reuse? Is it actually common?

rogue or communist publisher?

The ISBN might be unique, but that doesn't mean the barcode on the cover is.[0]


I'll second the recommendation for LibraryThing. I wanted to be able to shelve books by LoC call number and this has a pretty good (although not always complete) lookup for most call numbers (I have learned how to generate a call number for books that don't have them and also discovered that the University of Chicago Library doesn't use the same cutter numbers that the LoC and most other libraries use).

LibraryThing's mobile app will scan barcodes just fine.

The gotcha is that mass-market paperbacks from before sometime in the 80s (I probably have the date wrong) do not have the ISBN in them. These will need to be entered manually (not too bad with the mobile app, which has a dedicated ISBN keyboard). It can also look up by the LOC catalog number (which is not the call number but rather a consecutively assigned number which can be found on the copyright page of books published starting some time in the 1960s).

ISBN, by the way, will tell you the format of the book. Paperback and hardcover books have separate ISBNs.
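One wrinkle with hand-entered ISBNs from older books: they are usually 10-digit ISBNs, while a scanned barcode gives the 13-digit form, so a mixed catalog can end up with both. A small Python sketch of the standard conversion (prefix 978, then recompute the check digit); the function name is my own:

```python
def isbn10_to_isbn13(isbn10: str) -> str:
    """Convert a valid ISBN-10 (hyphens allowed) to its 978-prefixed ISBN-13."""
    text = isbn10.replace("-", "").replace(" ", "").upper()
    body, check = text[:9], text[9:]
    well_formed = (
        len(text) == 10
        and body.isascii()
        and body.isdigit()
        and (check == "X" or (check.isascii() and check.isdigit()))
    )
    if not well_formed:
        raise ValueError(f"not an ISBN-10: {isbn10!r}")
    # ISBN-10 check: weights 10 down to 1, 'X' stands for 10, sum divisible by 11.
    values = [int(d) for d in body] + [10 if check == "X" else int(check)]
    if sum(v * w for v, w in zip(values, range(10, 0, -1))) % 11 != 0:
        raise ValueError(f"bad ISBN-10 check digit: {isbn10!r}")
    # ISBN-13 check: weights alternate 1, 3 over the first 12 digits.
    first12 = "978" + body
    total = sum(int(d) * (1 if i % 2 == 0 else 3) for i, d in enumerate(first12))
    return first12 + str((10 - total % 10) % 10)


if __name__ == "__main__":
    print(isbn10_to_isbn13("0-306-40615-2"))
```

Normalizing everything to 13 digits before storing makes duplicate detection a plain string comparison.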

> ISBN, by the way, will tell you the format of the book. Paperback and hardcover books have separate ISBNs.

To add, ISBNs will also tell you the cover of the book. If you have 5 hardcover editions where each has a separate cover image, that's 5 ISBNs.

The ISBN corresponds to a specific format of a book, always. Only reprints share an ISBN. And ISBNs also tell you a lot about the book.

I used this android app to scan the ISBNs of my books - https://play.google.com/store/apps/details?id=com.eleybourn....

It's OSS: https://github.com/eleybourn/Book-Catalogue

From what I recall it also pulls additional info about the book from online.

I had a gig doing this once in grad school. Here's my method. It worked great:

First, close the library (or library section) until your work is complete. It's critical that the books not go wandering or get rearranged during this process.

Next, grab an SLR (or equivalent mirrorless) camera with video mode. Set it to video mode. In good lighting, play it over the shelves, one by one, from left to right. Slowly.

Make sure the spines are all legible. This is your set-of-books.

Set yourself or someone else up transcribing the titles from the recording, in the order shelved. Check it a couple of times. If you missed a book, or couldn't read the spine from the recording, add it here.

Once you are certain your list is accurate and complete, print (or put on your phone) the list of books. (Still in the order shelved.)

Now, again, working top-down, left-to-right, take books out in sets of eight. (I like eight because it's a nice round number, it's near miller's magic number, and it's also a number of books I can typically carry.)

For each 'byte' of eight books, take your SLR and, in photo mode, take a pic of the frontmatter page of that book -- the one containing the date of publication, and, most critically, the ISBN.

Put the eight books back on the shelf and take another eight. Repeat until complete. Be sure not to miss a book.

Now you have a list of books and a set of pics. Guess what? They are the same length and in the same order. So, book 1 on your list is the first pic on your SLR. And so on.

Now, you can OCR those pics for the ISBN. As backup/redundancy you can grab other info as well, e.g. publisher, etc. to sanity-check the results of your ISBN lookup.

Congratulations, you now have enough information -- a title and an ISBN -- for e.g. Google Books to pull up the rest of the info, which you can sanity check against the other deets you OCRed out of the frontmatter page.

Final tip: Calibre has a book information lookup thingy; it wasn't what I used back in the day, but AFAIK it should work great. It may be possible for you to simply populate the Calibre book list with titles and ISBNs, and have it just magically whisk other details -- date of publication etc. -- into the appropriate fields. Again, you can cross-check these (either exhaustively or spot-check) against the OCRed contents of the frontmatter pages, which (again) you associated with a book title in the initial step.

Happy librarianing!
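The "same length, same order" pairing plus the ISBN step could look roughly like this in Python. It assumes the OCR text for each frontmatter photo is already available as a string (the OCR itself is not shown), and the regex is a guess at common "ISBN 0-306-40615-2" style printing with hyphens or no separators, so expect misses on spaced or unusual layouts:

```python
import re
from typing import List, Optional, Tuple

# Assumed layout: the literal "ISBN", optional "-10"/"-13", then digits and hyphens.
ISBN_PATTERN = re.compile(r"ISBN(?:-1[03])?[:\s]*([0-9][0-9-]{8,15}[0-9Xx])")


def extract_isbn(page_text: str) -> Optional[str]:
    """Return the first 10- or 13-character ISBN found in OCR text, if any."""
    for match in ISBN_PATTERN.finditer(page_text):
        candidate = match.group(1).replace("-", "").upper()
        if len(candidate) in (10, 13):
            return candidate
    return None


def pair_titles_with_pages(
    titles: List[str], page_texts: List[str]
) -> List[Tuple[str, Optional[str]]]:
    """Zip the shelf-order title list with the shelf-order page texts."""
    if len(titles) != len(page_texts):
        # A length mismatch means a book or a photo was skipped somewhere.
        raise ValueError(f"{len(titles)} titles but {len(page_texts)} pages")
    return [(title, extract_isbn(text)) for title, text in zip(titles, page_texts)]


if __name__ == "__main__":
    titles = ["Book A", "Book B"]
    pages = ["Copyright 1986\nISBN 0-306-40615-2\nPrinted in USA", "No number here"]
    for title, isbn in pair_titles_with_pages(titles, pages):
        print(title, isbn or "needs manual entry")
```

Refusing to pair lists of different lengths catches the "missed a book" case before every later title gets the wrong ISBN.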

I like the idea of photographing the books on the shelves, I wonder if you could possibly OCR the title of the book from the spine. Although I guess fonts used on books are pretty varied.

That's a cool idea. I was doing this in ~2008, and IIRC OCR was barely grabbing "ISBN" in 12pt Times on a white page, let alone RANDOM_FONT on a book spine.

(Unrelatedly I also had a gig OCRing books for someone with a visual disability, which is where I got the idea.)

Today I work in the AI division of a major tech company and I can say with some confidence that you could absolutely OCR spines if your OCR platform is new enough :3

This is a wonderful comment that contrasts with every suggestion that requires buying a scanner. That said, I wonder if this works at the scale that OP's kid will be tasked with?

I processed a few thousand books this way. It scales nicely.

(also, why is this on-topic, strictly-procedural comment I spent 30 minutes writing downvoted? Hacker news, you are fickle)

Perhaps because it seems overly complicated, with a lot of manual parts prone to error? You made something that is N long (scan each book) into something over 3N (photo all might not be N, but transcribe titles, pull out, open, photo, then transcribe isbn/info is 3N or more.) I got the idea from a book (24 Hour Bookstore/Ajax Penumbra) but a panorama of a room to pick up the spines would be amazing.

I don't see it here, but even accepting the need to barcode/isbn all the books, I would like a way to pull digitized or high-quality spine images for a VR library that is fed from an ISBN database of what you want it stocked with, using a query for shelving/sorting. I fear that this is a whole new project, akin to the original good cover scans, but lacking even a starting point from publisher sites.

There was a great bit of free software that I had downloaded a bunch of years ago, LibraryDB, which allowed you to set up your own software. If memory serves, you could hook up a laser scanner and scan the barcode, which would make things pretty easy for you, but I can't seem to find the software online anymore.

But to be honest, I've always found cataloging and data entry to be a lot of fun, and there's something meditative about entering a book's title, date, author, ISBN, etc. into a system. I found that it helped me figure out what I have and think about why I'm keeping some books. It also led to some neat discoveries about certain books. I found a couple that had been signed that I'd never realized had been signed!

I use the free version of Libib.com [1]. Both the Android and iOS apps work just great. The app has an integrated barcode scanner and automatically looks up the book's info. You can even export the catalog as CSV.

[1] https://www.libib.com

I use Libib as well. It has a blazing fast barcode scanner, though manual entry is a bit cumbersome. They seem like a good company, though longevity might be a concern. There is an easy export option, anyways.

I've been a happy user of the (free) version of libib for ~8 months now. Android and website both work great for my purposes (800 books, manually added) but as parent states the service can be used with a barcode scanner for larger libraries.

Ditto, I used Libib for a much smaller library (500 or so) and scanned everything when I moved last month. It was handy because the phone app easily scanned most of my books and I could manually enter the ISBN for those that didn't.

Instead of spending money on hardware and software that will be flakey, time-consuming, and error-prone, put the money to good use.

Call up the nearest college or university with a library science program. Ask them for the names of students specializing in curation and cataloging. Contact the students and tell them how much money you have, how many books you want to catalog, and arrange for them to take care of it.

Seconding this. "How to catalog all my books" is a decidedly nontrivial problem that people literally make the concentration of their masters' degrees (MLIS). The advice will depend based on the type of catalog you are likely to have, and you'll probably get asked questions you didn't even know you should be answering.

This will also be legitimately valuable experience for someone who can't get an internship at a real library (not enough slots/demand) and needs something to put on their resume. (If you had a budget of $0 you could possibly get someone to do it for free even, but don't do that.)

I have around a thousand books myself. I once tried to arrange my shelves by Dewey Decimal myself. I didn't even want to make them strictly ordered, just within the classes 000-900.

Turns out it's a major pain to figure out what class some books belong in. Newer books from major publishers have LoC cataloging-in-publication data, which is handy, but the older the book the less likely it is to have that, and if it's from a small publishing house or independently published, someone has to decide what catalog.

Considering just the time involved, I wouldn't ask anyone to do it for free, and MLIS degree programs aren't cheap!

ETA: turns out it's pretty common for folks with the means to hire grad students as interns and freelancers. Some universities will have programs devoted to connecting students with opportunities, even, although at that point you'll be dealing with official contracts and pay rates, which may not work.

They’re going to want to know how you want to store the information and how you plan to use it in the future.

Yes, and an MLS student will know what questions to ask and how to assess a good solution.

Plus, if you have 2000+ books, you may have the makings of a collection with titles or editions a library would appreciate as a gift or bequest. If it's already cataloged according to ALA conventions, that makes the gift much more attractive.

So no different than this thread, then.


I have about 5,000 books. I just sat down with a spreadsheet and entered the particulars for every one like any librarian in existence over the last 2000 years would do. I did this over a few months with 1-8 hours of effort in a burst.

A lot of things still have no quick and automated solution.

When I'm doing accounting and financial analysis, it's still primarily a numbers grind of data entry, cross checking and analysis.

Librarians use bar code scanners today, not spreadsheets.

The two are not mutually exclusive.

For example, I use a very simple barcode scanning app[0] that exports to CSV.

[0] https://apps.apple.com/us/app/reading-list-book-tracker/id12...

I did that for about 70 books recently. I purchased a scanner for very little (about 200 dkk, no idea what it would cost in USD), put it in the USB on my laptop and opened a spreadsheet. The handheld scanner automatically "enters" a newline once it has written the ISBN number, so the only thing you have to do is move onto the next book. You can scan books faster than you can pull them out and put them back.

Actually getting the book info from the ISBN was very time consuming, I didn't know about Library Thing.

You can almost certainly use a smartphone, but that will be a lot slower, comes with the risk of dropping it and then you don't have a bar code scanner left over for other projects.

As I put some books into storage, the LibraryThing app on my phone could use the camera to scan the isbn bar codes for my books, and I could tag them with "box 6" and the like. Searching for books without isbns wasn't hard.

I don't have the code to hand at the moment, but when my partner and I catalogued all our fiction books (over 2,000) I discovered that you can search Amazon via ISBN, so the workflow was:

1. Scan book ISBN via a standard barcode reader (in keyboard emulation mode) [1]

2. The barcode reader 'types' in the ISBN into the GUI of my homebrew app along with a new-line character.

3. The app takes this ISBN and searches Amazon via its API, then parses the results. If there's more than 1 result it just picks the first one (which is 99% correct), if it's not correct there's an option to override or enter details manually.

4. The app then queries the amazon API again for the book details, and places them all into a record in a SQL DB.

5. For reasons I can't quite remember the app also goes to the actual product page of the book on Amazon and grabs the high resolution picture of the book cover (which only sometimes matches the edition of the book we actually own) and stores that locally.

6. There's a web app for searching the DB that we use when we can't remember which books we actually own (first world problems!)

7. Any new book that enters the house is catalogued BEFORE it is even allowed in the Library.

It's a basic set-up, but other than the Amazon API, which is abstracted via my own functions, it's not dependant on any other (closed or FOSS) book management software.
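Not the poster's code, but the shape of steps 2 to 6 can be sketched with nothing more than Python's built-in sqlite3. The `lookup` argument is a hypothetical stand-in for whatever metadata source gets used (the Amazon API calls are deliberately not shown), and the table layout is an assumption. The ISBN is not the primary key here, so a second copy of the same book is simply a second row:

```python
import sqlite3
from typing import Callable, List, Optional, Tuple


def open_catalog(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the catalog database with a single books table."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS books ("
        "id INTEGER PRIMARY KEY AUTOINCREMENT, isbn TEXT, title TEXT, author TEXT)"
    )
    return conn


def add_book(
    conn: sqlite3.Connection,
    isbn: str,
    lookup: Callable[[str], Optional[dict]],
) -> bool:
    """Look up one scanned ISBN and store it; False means 'enter it by hand'."""
    details = lookup(isbn)
    if details is None:
        return False
    conn.execute(
        "INSERT INTO books (isbn, title, author) VALUES (?, ?, ?)",
        (isbn, details["title"], details["author"]),
    )
    conn.commit()
    return True


def search(conn: sqlite3.Connection, term: str) -> List[Tuple[str, str, str]]:
    """Find books whose title contains the search term."""
    rows = conn.execute(
        "SELECT isbn, title, author FROM books WHERE title LIKE ? ORDER BY title, id",
        (f"%{term}%",),
    )
    return rows.fetchall()


if __name__ == "__main__":
    # Fake metadata source for the demo; a real one would query an online service.
    fake_source = {"9780306406157": {"title": "Example Title", "author": "A. Writer"}}
    catalog = open_catalog()
    print(add_book(catalog, "9780306406157", fake_source.get))
    print(search(catalog, "Example"))
```

Keeping the lookup behind one function is what makes it easy to swap the metadata source later without touching the database code.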


[1] A cheap one like this will do it: https://amzn.to/3CyqaAC

That sounds like a good approach. I guess you could just use a smartphone app to avoid buying a scanner e.g. https://play.google.com/store/apps/details?id=com.gamma.scan...

I guess the experience might be smoother with a dedicated scanner though.

Do you have a git page on this? This sounds impressive, above what I can do right now.

I have been searching around off and on. I found an HN thread from years ago too!

It seems that people have favourites but there is nothing leading the charge. I know someone that has a barcode reader and wrote ee's own code to parse it for a database, even printing personal barcodes to handle repeats (sentimental value, different format).

I think ISBNs sometimes are different soft vs hard, BUT ISBNs are new and sometimes get reused, plus the barcode might be for a category or "book/zine of series" and not a particular copy.

I know Amazon, and thus goodreads, can now do search by cover - it is amazon, but it does avoid a lot of issues. Goodreads might be best for you just for phone and cover scanning. It is, however, slow. You can use shelves/tags for locations but it is ... slow.

I think the best system would be a combo of cover scanning, cataloging your photo in the entry, and notes of signed/written notes. BUT for location tying it to a barcode reader that can scan the new shelf's barcode and the book, but that requires personally barcoding books (please use removable stickers!) and is a bit over involved.

Personally, the weird case I really want and am not sure how to piece together is a way to pull good spine photos/digitals. I know my books by spine so I'd love to see if a book catalog could have a spine database and be used to make a virtual library.

edit: Found the thread. Last I read it did not have anything particularly good, but a few workable options. https://news.ycombinator.com/item?id=19817219

I'm not sure of the value for me to catalog all books just for the titles. What I would like is whenever I search Google, it would also search my books and tell me which book to open. It feels like a lot of knowledge sits on my shelves that I'm not really using. Not sure if that exists, but obviously the hard part is getting access to full-text contents of books - scanning pages by myself would be extremely time consuming.

My family has used ReaderWare for years. It's got smart cataloging across multiple sources (including LoC for old books w/o an ISBN) and runs on a number of platforms. We use an old USB CueCat.


While I haven't used it (and it's a Mac-only app with a companion iOS app for scanning), I've heard many people rave about Delicious Library from Delicious Monster: http://www.delicious-monster.com

I'm a very early user of Delicious Library, and I really loved it back when I was using it. At some point, I lost my library (my fault, not the app's) and could never be bothered to start over.

With all of that said, the thing that I loved about Delicious is how rich the experience was, and the ability to catalog any type of media.

Looking at their site now though...it really looks like it's stuck in 2009, with HTTP-only site to boot. Has anyone used Delicious recently?

Release notes reference March 2020: http://www.delicious-monster.com/release-notes/

Downloads reference Big Sur (Nov 2020).

No price anywhere I can find on that site. It costs $39 in the U.S. Mac App Store though if you want to know before trying it.

Delicious Library is (was?) one of the things that could get me to buy a Mac, but hasn't yet. I'd love to hear if it's still as class-leading as it was back before the smartphone era.

I cobbled together something similar using a couple of javascript libraries. It is a really simple web-app that you can open on any phone, and it uses the phone camera to scan the barcode. It saves the results to a Google Sheets (what I wanted). The code is not public, mostly because I couldn't bring myself to clean it up. If you are interested, I could make it public. I wrote about it at https://nikhilism.com/post/2021/tracking-books-i-read-using-...

Honest question: what’s the benefit of cataloguing books? I’m surprised so many people do it.

I have 3 tall bookshelves of books but don’t really feel the need to catalogue them. I sort them by topic and sometimes physical size and it seems fine.

I can think of three or four good reasons:

1. To prevent buying duplicates. I own enough books (on the order of thousands) that I can't remember all the ones I own, and occasionally buy something only later to discover that I just bought a 2nd (or 3rd) copy.

2. Insurance purposes. If I had a fire or something I imagine I might need the catalog, photos, etc. as part of my claim process.

3. Related to (2), but if I had to replace a substantial portion of my library, it would help if I had the items in the library cataloged.

4. Locating books. I have enough books jammed into my apartment that I sometimes struggle to locate a particular one. Or even remember if I own a particular title at all or not (see (1) above). A catalog, with notes like "In living room, on bookcase A, shelf 2" or "in office, bookcase Q, shelf 4" would be a huge benefit at times. Even more so for the handful that would fall into the "Boxed up and moved to storage locker on 08/16/2019. Box labeled BR549" category or whatever.

> I have 3 tall bookshelves of books but don’t really feel the need to catalogue them

Well sure. If I only had 3 bookcases of books I don't know if I'd bother either. But when you get into the multiple thousands of books, it becomes more important.

My main reason is the same as mindcrime's and daggersandcars's: I found that every now and then I'd buy books I'd forgotten I already had, and it was annoying.

At one point I thought there would be value in recording where in the house each book was, but it soon became obvious that there wasn't so I stopped. If space considerations required some of them to be in boxes, it would become valuable again. (I do tag each book with what it's about, and shelve things by subject, so in some ambiguous cases the catalogue might be useful for finding books. Hardly ever has been, though.)

It's occasionally useful when I know there was a book with such-and-such in the title but can't remember the author or the exact title (though not very useful since usually in such cases it's also possible to find the relevant shelf and browse until I find it. But you can't grep a bookcase.)

And if I didn't have a list I'd wonder every now and then how many books are actually in the house, and waste time estimating.

As mindcrime says, this isn't something that has much value for a few bookcases of books, but the value increases as the number of books grows.

When I had ~1000 books, I catalogued them to avoid buying books I already had and to ease lending them to my friends.

Back in the pre-smart phone days, there was a Mac app that used a webcam to scan book barcodes. The author wrote a neat article on how it was better to scan barcodes really fast and throw away the noise than to try to scan the barcode perfectly the first time.

I have a backburner idea of a VR library room. It would be neat to have a database I could sort how I liked (for example, by fiction/non, then publication year -> but series use the AVERAGE single year for the series) That sorting is a mess on its own.
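The sort itself is less of a mess than it sounds if each series is treated as one unit placed at its average year. A rough sketch in Python, assuming a hypothetical `Book` record with `year`, `fiction` and optional `series` fields (none of these names come from an existing tool):

```python
from dataclasses import dataclass
from statistics import mean
from typing import Optional

@dataclass
class Book:
    title: str
    year: int
    fiction: bool
    series: Optional[str] = None

def shelf_order(books):
    """Fiction first, then by year; a series is placed at its average year."""
    years = {}
    for b in books:
        if b.series:
            years.setdefault(b.series, []).append(b.year)
    series_year = {name: mean(ys) for name, ys in years.items()}

    def key(b):
        group_year = series_year[b.series] if b.series else b.year
        # "not b.fiction" is False for fiction, so fiction sorts first.
        # The series name keeps a series contiguous when group years tie,
        # and the book's own year orders the volumes inside the series.
        return (not b.fiction, group_year, b.series or "", b.year, b.title)

    return sorted(books, key=key)
```

With a two-volume series published in 1980 and 1990, both volumes land together at 1985, between a standalone from 1984 and one from 1988.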

It also needs a way to pull spine images or have a computer go from bad spine pics to simplified colour-matching digital ones.

Other options - finding when you have a room with books from multiple homes (blended families, inherited), insurance, or for a "why is book 7 of the series here" massive sorting project with a good audiobook for company.

Agreed. It's largely a waste of time, unless you genuinely enjoy doing it.

I have a few thousand books myself so I wrote this. https://github.com/konsbn/xlibris this is almost exactly what you want

The "Handy Library" Android app includes an ISBN scanner and allows you to import and export collection data.

You could also check out some of the tooling and APIs around openlibrary.org. Unfortunately, I think it's basically a moribund project, but may have sufficient tools for your needs. I know they have a list feature, but I don't think ingestion is particularly easy; nor am I sure of the import/export functionality around their lists.

edit: I'd forgotten about LibraryThing (mentioned in another comment). They have better tooling than Open Library.

I’ve been using BookCrawler (iOS) for the last decade-plus to keep track of what I get (to avoid accidentally purchasing multiple copies), but I’ve discovered that at around 3,000 entries both the search functionality as well as all internal stats suddenly took a vicious and permanent dirt nap, and never recovered.

This threshold was reached a few years ago, so who knows how many books I have now. Everything else in that app works spectacularly fine, so… ¯\_(ツ)_/¯

As an interesting anecdote, the history of book digitization and its implications in fair use / copyright in regards to what you're trying to do is actually pretty storied (primarily litigated between Google and the Authors Guild over the course of the past decade).[0]

[0] - https://cdlib.org/services/pad/massdig/mass-digitization-his...

Another vote for LibraryThing.com and a barcode reader or smartphone. I scanned in ~1100 books, and only ran into 15-20 without ISBNs. All but 3-4 of those were easily input by title. But I like it because it's very easy to keep up, and isn't a moneygrubbing advertising/tracking site like Goodreads. The phone app is very handy when wandering in bookstores: it's easy to check if you own such and such a title.

Using the Goodreads app for the initial data loading might be the easiest way. You don't even have to specifically scan the barcode, it can often identify your book from the cover alone. I'll defer to what @Jtsummers said about getting your data out of their database though, as I have not tried that part myself (yet).


If you're logged in, at the top is an export option which will generate a CSV file. It includes the ISBN if present and a lot of other columns of data that may be useful.

I don't think that's how I did it when I pulled the data out years ago, but it works and involves no 3rd parties so that's nice.

If you're willing to use it, Goodreads lets you scan books to add them to your collection. I've not used it in a while, but I believe there are easy enough ways to get your data out once entered into it. I did that once a long time ago, I imagine it hasn't gotten too much worse since then.

https://MyBookList.club Scan the ISBN and it adds it to your library. Then you can also export to CSV if you want.

Disclaimer: I made the app. Happy to give free promo codes to anyone interested in trying and give feedback!

Can it go by covers with the camera, instead of barcodes? Is there a way to see the books shelved, such as you can see the spines?

Not yet but it's in the pipeline!

You could put together an OCR app for their phone that could scan the title and author from the spine of the book (or cover) and do a lookup against something like the Google Books API or Open Library to get the ISBN (or store the work in your account on that service).

Yep, that's in the pipeline for MyBookList!

Looks like there is at least one other person interested in such a thing, but no hits on it existing already: https://datascience.stackexchange.com/questions/43635/ocr-in...

This might have some ideas, but it looked like no particularly good solutions: https://news.ycombinator.com/item?id=19817219

'My Library' by Julien Keith, is great for Android users... Over 1m downloads and 4.6 star average... My own library is only about 600 odd books, but it was easy to use, interesting, and great now it's done!

I have a lovely $5 USB barcode scanner that was pulled from an old time clock system. It’s fast, simple, and acts as a keyboard. That plus Calibre would be good, but a spreadsheet would work just fine too.
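Because a scanner like this just types the digits and presses Enter, the capture side can be tiny. A sketch in Python, with one invented convention to cover the hardcover/softcover question: typing `h` or `p` on its own line sets the binding for every scan that follows (so you sort the books into two piles first). The function name, the output filename and that convention are all assumptions, not something any scanner provides.

```python
import csv
import sys

# Hypothetical shortcuts typed on the keyboard between scans.
BINDINGS = {"h": "hardcover", "p": "paperback"}

def parse_scans(lines):
    """Turn scanner/keyboard input lines into (code, binding) rows."""
    binding = "unknown"
    rows = []
    for line in lines:
        entry = line.strip()
        if not entry:
            continue  # stray Enter key presses
        if entry.lower() in BINDINGS:
            binding = BINDINGS[entry.lower()]
            continue
        rows.append((entry.replace("-", ""), binding))
    return rows

if __name__ == "__main__":
    # Scan into the terminal, then press Ctrl-D (Ctrl-Z on Windows) to finish.
    path = sys.argv[1] if len(sys.argv) > 1 else "scans.csv"
    rows = parse_scans(sys.stdin)
    with open(path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["code", "binding"])
        writer.writerows(rows)
    print(f"wrote {len(rows)} rows to {path}")
```

The resulting CSV opens directly in a spreadsheet, and the lookup of titles and authors can happen later as a separate step.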

Buy a cheap hand held battery operated wireless barcode scanner (cheap on AliExpress). These work really well for scanning stacks of books... pick the book up, zap, put the book down. You have to config the scanner to operate in "keyboard" mode or some such... basically what you scan gets typed as if from a keyboard.

I used a simple Excel macro for data capture and lookup. Basically, when a cell changed (a book was scanned), it would request the book data from outpan.com. If outpan didn't know the UPC, it would beep and return to the cell; otherwise it would decode the response (JSON) and populate the spreadsheet row.

Here's the Excel macro (why I used the B column instead of the A column is a longer story):

    Private Sub Worksheet_Change(ByVal Target As Range)
        ' Only react to a single-cell change in the scan column (B).
        If Target.Cells.Count <> 1 Then
            Exit Sub
        End If
        If Application.Intersect(Range("B2:B99999"), Range(Target.Address)) Is Nothing Then
            Exit Sub
        End If
        Dim Ean
        Ean = CStr(Target.value)
        Dim Url
        Url = "https://api.outpan.com/v2/products/" & Ean & "?apikey=[haha get your own key haha]"
        Dim HttpRequest
        Set HttpRequest = CreateObject("MSXML2.XMLHTTP")
        HttpRequest.Open "GET", Url, False
        HttpRequest.Send
        Set json = New VbsJson
        Set o = json.Decode(HttpRequest.ResponseText)
        If Not IsEmpty(o("error")) Then
            ' Lookup failed: beep and go back to the scanned cell.
            Beep
            ActiveCell.Offset(-1, 0).Select
        Else
            booktitle = o("name")
            If IsNull(booktitle) Then
                ' Code is known but has no name: treat it as a miss.
                Beep
                ActiveCell.Offset(-1, 0).Select
            Else
                If IsVarArrayEmpty(o("attributes")) Then
                    Author = ""
                    PublishedOn = ""
                Else
                    If IsEmpty(o("attributes")("Author(s)")) Then
                        Author = ""
                    Else
                        Author = o("attributes")("Author(s)")
                    End If
                    If IsEmpty(o("attributes")("Publication Date")) Then
                        PublishedOn = ""
                    Else
                        PublishedOn = o("attributes")("Publication Date")
                    End If
                End If
                ' Column A is copied down from the row above; C to E get the lookup result.
                Cells(Target.Row, Target.Column - 1).value = Cells(Target.Row - 1, Target.Column - 1).value
                Cells(Target.Row, Target.Column + 1).value = booktitle
                Cells(Target.Row, Target.Column + 2).value = Author
                Cells(Target.Row, Target.Column + 3).value = PublishedOn
            End If
        End If
    End Sub
    Function IsVarArrayEmpty(anArray As Variant)
        ' True for an empty or unallocated array; False for objects (e.g. a dictionary).
        Dim i As Integer
        If IsObject(anArray) Then
            IsVarArrayEmpty = False
        Else
            On Error Resume Next
            i = UBound(anArray, 1)
            If Err.Number = 0 Then
                If i < 0 Then
                    IsVarArrayEmpty = True
                Else
                    IsVarArrayEmpty = False
                End If
            Else
                IsVarArrayEmpty = True
            End If
        End If
    End Function

edit: you will need VbsJson from http://demon.tw/my-work/vbs-json.html (why that's a Chinese page I don't know; all I know is that it was a single-file JSON parser that was easy to work with for this).

edit2: I used this solution to scan and log 750 books in a couple of hours? Maybe 3? It went pretty quick.

I'd like to commend you for publishing your working code. This is a great help for somebody wishing to replicate your excellent work.

I am awed that you have a couple of thousand physical books.

1) How long did it take for you to collect all of them?

2) What are the books mostly centered on? Tech? Politics? Fiction?

3) Do you happen to know your read vs. to-be-read stats?

I'm not OP, but I also have thousands of books (a little over 4100 at last count), so I guess my answers might be of interest.

1. I'm about 50 years old. (The furthest back my records go is 2004, when I was about 35 and the count was about 2300.)

2. Of the 4100 books in my catalogue (it's a CSV file, very high-tech), 1233 are fiction. Rough counts for some non-fiction subject areas: science 520, mathematics 420, philosophy 250, computing 220. Others with substantial numbers: religion 270 (I'm not religious, but I used to be and my wife still is), humour 240, history 210, children's books 180 (may be wrong; we don't always bother to record these and sometimes we get rid of them since children grow), reference works 110 (dictionaries, encyclopaedias, etc.), music 100, puzzles 100, poetry 100, language 100, books of essays 100, cookery 100 (these are mostly actual cookery books, which maybe don't really belong in the same list), biography 90, literature 90 (meaning literary criticism, books about books, that sort of thing; actual works of literature are mostly under "fiction"), politics 80, autobiography 75 ("biography" earlier excludes these), games 65 (this is things like chess books, not books that are somehow also games), education 65.

3. I think about 10% of the books are unread at any given time.

I've also got a couple of thousand. I already had about a thousand by my early 20s. I was an avid reader as a kid. I'm in my 40s now. I still like to read, but there are so many more distractions these days compared to 20 years ago. So, the rate of book acquisition has slowed down. When I was a kid, I read a couple of books per week. (And not just YA, I've been reading adult-level books since I was 10.)

My library is about a 50/50 split between fiction and non-fiction. I like speculative fiction (scifi/horror/fantasy and their intermixings such as weird fiction), with a smaller percentage made up of thrillers, comedy, dramas, and classic literature such as you might cover in an undergrad English Lit degree.

The non-fiction covers just about everything: physics, biology, mathematics, philosophy, history, the arts, finance, computing, politics, religion and the occult, and roleplaying game rulebooks. (Well, maybe rpgs are fiction, but it feels like non-fiction because they are more for rules reference than because of the great prose inside.)

I've read almost all the fiction, except for about a dozen in a pile which I'm still working through (I did a bulk buy last year and it's taking time.) The percentage of unread non-fiction is a bit higher, mainly because I don't have as much patience/time as when I was younger. It's a lot easier to set aside an hour for fiction before going to sleep at night; reading non-fiction at that time would just give me insomnia.

You definitely want to pick up a handheld bar code scanner and dump all the data into a csv.

From there it would take a few hours of playing with the data to get it in whichever form you prefer

The bar code scanner idea (already mentioned by several people) is even better than it sounds. Almost magically good. Much, much better than manual entry.

ISBNs are different for paperbacks and hardcovers, even for the same title, so you should be able to get that info from the ISBN :)

You can write something with https://serratus.github.io/quaggaJS/. I used it in the past; with a few tweaks it is very accurate.

Then you can query Goodreads or Amazon to find the actual book.

Donate them to a library. Then come back later after the librarian has cataloged them.

Most books donated to a library go to the yearly book sale without being cataloged. What doesn't sell goes to Goodwill or similar thrift stores.

How about this:

1. Make a snapshot of a secondhand book store's online inventory.

2. Sell all the books to them.

3. Diff the inventory.

4. Buy them all back.


Similar question: is there a good way to do this for video games? I probably have a couple hundred physical games and would love to not have to manually build a spreadsheet of them.

isbn scanner and tellico, i did this. you can add databases to tellico to ask for your book and it will add the info to your local database.

Could be worth it to buy a USB scanner.

I'm interested in using RFID tags to help me locate my books. Does anyone do this?

Apps like QRBot [1] have the ability to scan ISBNs (and barcodes generally), and have a "history" feature that keeps track of what you've scanned and lets you export (to CSV, among others). The app is free on both iPhone and Android (there is a paid version, don't know what extras it has or if it's just ad-free), but may want to verify how much history gets stored before you go scan-crazy.

From a US perspective (may apply elsewhere), for books published relatively recently (within the last ~20 years or so), the ISBN is often part of the barcode on the back of the book. ISBN-13s (the updated standard) start with 978 or, for newer allocations, 979, so that prefix is a good clue that the barcode is an ISBN. For a period of time prior to that (and perhaps still applicable to mass market paperbacks), there is a barcode on the back that is NOT an ISBN, but there is an ISBN barcode on the inside front cover. I've not discovered any systematic way to pull an ISBN out of a non-ISBN barcode (though I haven't dug too far -- my collection hasn't reached 4 digits yet and I've been happy to type when scanning wasn't an option).
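That prefix clue, plus the standard check digit, is enough to flag scans that are not ISBNs so they can be set aside for manual entry. A small sketch in Python; the function names are mine, and the prefix test assumes the usual 978/979 "Bookland" ranges (979-0 is excluded because it is used for printed music rather than books):

```python
def _weighted_sum(digits):
    """EAN-13 weighting: positions alternate 1, 3, 1, 3, ... from the left."""
    return sum(int(d) * (3 if i % 2 else 1) for i, d in enumerate(digits))

def is_isbn13(code):
    """True if the scanned code looks like a valid ISBN-13."""
    digits = code.replace("-", "").replace(" ", "")
    if len(digits) != 13 or not digits.isdigit():
        return False
    if not digits.startswith(("978", "979")) or digits.startswith("9790"):
        return False
    return _weighted_sum(digits) % 10 == 0

def isbn10_to_isbn13(isbn10):
    """Convert an older 10-character ISBN to ISBN-13 by prefixing 978."""
    core = "978" + isbn10.replace("-", "").replace(" ", "")[:9]
    check = (10 - _weighted_sum(core) % 10) % 10
    return core + str(check)

if __name__ == "__main__":
    for scanned in ["9780131103627", "012345678905", "9780131103628"]:
        print(scanned, "ISBN" if is_isbn13(scanned) else "not an ISBN, type it in")
```

The conversion helper is for the cases where you end up typing a 10-digit ISBN from the copyright page and want everything stored in one format.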

Once you have the ISBNs, I like to query against the Open Library API [2], which is a part of the Internet Archive. The information in there is fairly robust, if inconsistent (the capitalization of titles is sometimes as printed on the title page, sometimes Library of Congress format, other minor things). They have a lot of data points available, such as cross-referenced IDs with Goodreads and LibraryThing, but again, this is community-supported data, so YMMV as to completeness or accuracy.
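For anyone who wants to try that lookup, here is a hedged sketch in Python using only the standard library. The endpoint, the `bibkeys`/`jscmd=data` parameters and the field names (`title`, `authors`, `publish_date`) reflect my reading of the Books API docs at [2]; treat them as assumptions and check the docs, since records are community-edited and fields are often missing. The helper names are made up.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API = "https://openlibrary.org/api/books"  # assumed endpoint, see [2]

def build_url(isbn):
    """Build the lookup URL for a single ISBN."""
    query = urlencode({"bibkeys": f"ISBN:{isbn}", "format": "json", "jscmd": "data"})
    return f"{API}?{query}"

def summarize(response, isbn):
    """Pick a few fields out of a decoded response; None if the ISBN is unknown."""
    record = response.get(f"ISBN:{isbn}")
    if record is None:
        return None
    return {
        "isbn": isbn,
        "title": record.get("title", ""),
        "authors": [a.get("name", "") for a in record.get("authors", [])],
        "publish_date": record.get("publish_date", ""),
    }

def lookup(isbn, timeout=10):
    """Fetch and summarize one ISBN (needs network access)."""
    with urlopen(build_url(isbn), timeout=timeout) as resp:
        return summarize(json.load(resp), isbn)

if __name__ == "__main__":
    print(lookup("9780131103627"))
```

Keeping the URL building and the parsing separate from the network call makes it easy to rerun the parsing on saved responses, which is kinder to a free service when you have thousands of books.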

Another note -- many books have separate ISBNs for hardcover editions, trade paperback editions, mass market editions, eBooks, etc (and sometimes don't have an ISBN at all for things like Book of the Month Club editions). I don't know if this is a requirement, or a luxury that big publishers have, but it is something I've noticed (you'll sometimes see multiple ISBNs listed on the copyright page, along with their formats -- also you may see related editions on Indiebound [3], along with their ISBNs). A cursory glance at Open Library doesn't seem to have a data point distinction for this (which is unfortunate), so you may still have to note this, but theoretically it may be possible to get this information from the ISBN directly at some point.

Source for ^^: I read a lot, have a lot of books, briefly ran a (failed) specialized online bookstore, and wrote a CLI tool [4] for myself to solve this very issue.

[1]: https://qrbot.net/locale/en/ [2]: https://openlibrary.org/dev/docs/api/books [3]: https://www.indiebound.org/ [4]: https://github.com/winsbe01/booki

You should ask Ancestry.com how they catalog hundreds of thousands of records…

Hint: it’s not manual and it’s completely automated.
