
Ask HN: Inferring book genre from its description - jmstfv
Given a description of a book, find its genre. For example:

Book: Python Cookbook

Description (taken from Amazon): If you need help writing programs in Python 3, or want to update older Python 2 code, this book is just the ticket. Packed with practical recipes written and tested with Python 3.3, this unique cookbook is for experienced Python programmers who want to focus on modern tools and idioms.
Inside, you’ll find complete recipes for more than a dozen topics, covering the core Python language as well as tasks common to a wide variety of application domains. Each recipe contains code samples you can use in your projects right away, along with a discussion about how and why the solution works.

Output: [Python]
======
tabtab
There are various AI techniques you could apply, but I've also used "keyword
weight tables" for similar auto-classification tasks. A simplified example:

(Word) | (Category) | (Weight)

COBOL | Programming | 7

Python | Programming | 3

Python | Animals | 3

Java | Programming | 3

Java | Drinks | 3

("COBOL" is generally unambiguous so we give it a high weight.)

You run every word of the description through the table and add each matching
row's weight to that row's category. The final "best guess" is the category
with the highest total weight. Book titles may be too open-ended for this, but
I was dealing with domain-specific lingo. For example, it could be used to
route customer email for specialized services or products. It requires human
inspection and tuning to get decent results.
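
Here is a rough Python sketch of the lookup-and-sum step, in case it helps. The
rows are the ones from the table above; the names (WEIGHTS, score_categories,
best_guess) and the simple lowercase tokenizer are my own assumptions for
illustration, not a fixed recipe.

```python
import re
from collections import defaultdict

# (word, category, weight) rows copied from the example table.
WEIGHTS = [
    ("cobol", "Programming", 7),
    ("python", "Programming", 3),
    ("python", "Animals", 3),
    ("java", "Programming", 3),
    ("java", "Drinks", 3),
]


def score_categories(text, rows=WEIGHTS):
    """Return {category: summed weight} for the table words found in text."""
    # Index the rows by word so each token is a single dictionary lookup.
    table = defaultdict(list)
    for word, category, weight in rows:
        table[word].append((category, weight))

    scores = defaultdict(int)
    # Assumed tokenizer: lowercase, then split on anything that is not a
    # letter, digit, '+' or '#'. Every occurrence of a word counts.
    for token in re.findall(r"[a-z0-9+#]+", text.lower()):
        for category, weight in table.get(token, ()):
            scores[category] += weight
    return dict(scores)


def best_guess(text, rows=WEIGHTS):
    """Return the highest-scoring category, or None if no word matched."""
    scores = score_categories(text, rows)
    if not scores:
        return None
    return max(scores, key=scores.get)
```

For "Migrating COBOL to Java" the scores come out as Programming 10 and
Drinks 3, so the guess is Programming. A description that only says "Python"
ties Programming and Animals at 3 each, which is the kind of case where the
hand tuning comes in.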

~~~
tabtab
An adjustment to the above (word, category, weight):

Java | Programming | 2

Java | Drinks | 2

Java | Islands | 2

