Hacker News
Talk to Books (books.google.com)
210 points by tristanho 10 months ago | 37 comments

This is cool! Not sure how practically useful it is (don't Google search results already include results from books?) but it seems like one of those things that could be super powerful once a good application is found for it.

To be honest, I'm just relieved Google seems to be doing something new with Google Books.

I've been fearing they'd take down Google Books (or severely hobble it) since it looks like it's been stagnant for years now (actually less than stagnant, since it seems like they took away Popular Passages? I really miss that!).

It was tied up in a court case about copyright.


That lawsuit has been over for 2 years now.

Yes, I've never understood the stagnation over the last 2 years. You still can't do simple things like share your entire library of saved books with friends.

In fact, they've removed the ability to download your own books that you uploaded, which is unfortunate since I uploaded a lot of books with the expectation that it would be a functional archive...

>I've been fearing they'd take down Google Books

Oh that would be terrible. Google Books is extremely useful for academic work: not every library has good text search and I often found myself using Google Books to get reference material I could then look up in the library for full access.

I relied on Google Books' digitizations of 17th century rare books constantly while in grad school for history; probably more than half of the texts I cite in my academic publications were accessed via Google Books. But lately I've noticed that many of the rare books I relied on are no longer available to read (they're clearly still living somewhere on Google's servers since you can search for snippets, though).

Anyone know why this happened? It shouldn't have anything to do with lawsuits related to publishers, since the books I'm talking about are centuries old and housed in places like the British Library, which one would think would be committed to making them publicly available. In some cases (like my weird little specialty, 17th century Portuguese apothecary and drug manuals) I'd say the number of available texts has dropped by something on the order of 80-90%.

Luckily HathiTrust largely makes up the difference, but otherwise this shift would be kind of a catastrophe for me as a researcher. There have been dozens of times when my choices were either "access this rare book via Google Books, or fly to Europe and look at the closest physical copy, which happens to be in some obscure library in Belgium/Lisbon/Rome/Budapest/etc."

I'm just an armchair historical linguist but the 19th century books and dictionaries written by British missionaries dispatched to China, that are freely available in abundance on Google Books, are just as fascinating as the cutting-edge monographs of the past decade. They lacked proper training but their observations are fascinating. For example, the pronunciation guide in a dictionary of Shanghainese notes that older people would say Beiging (I'm going to avoid linguistics jargon and true Shanghainese pronunciation for the sake of understandability) and younger people would say Beijing. So we know exactly when this sound change (palatalization) spread to the Shanghai area. The author didn't call it as such and seemed to have the attitude of "kids these days don't talk properly" which makes it really amusing to read.

That sounds interesting. Could you provide a couple of links to those books?

So, google, "what's fun about programming"

The first result says this

Similarly, programming a computer can prove fun because you might design a simple program that displays your boss’s ugly face on the computer


lol, didn't see that result. where was it from?

A related blog post is https://research.googleblog.com/2018/04/introducing-semantic... (posted by walterbell and lainon but didn't get attention).

Glad to see Kurzweil's group has been up to something interesting.

I tried tricking it a bit, and here's what I got:

Does P=NP: Something about rare earths, then a couple books that mention the problem. https://books.google.com/talktobooks/query?q=Does%20P%3DNP%3...

Who are you: Looks like Books is shy. https://books.google.com/talktobooks/query?q=Who%20are%20you...

Why are fire trucks red? Apparently, there are different answers, and none mention the Monty Python one. https://books.google.com/talktobooks/query?q=Why%20are%20fir...

What is 1+1? 1, apparently. The same query without a question mark gives the right answer. https://books.google.com/talktobooks/query?q=What%20is%201%2...

Which company used to have the motto "don't be evil"? Not Google, it seems. https://books.google.com/talktobooks/query?q=Which%20company...

I'm curious where this narrative that Google dropped "don't be evil" comes from.

At the time of the Alphabetization, there was some shuffling of the code of conduct document, but everything lives on.

The Google code of conduct[0] still begins with "Don't be evil."

The Alphabet code of conduct[1] opens with "Employees of Alphabet and its subsidiaries and controlled affiliates (“Alphabet”) should do the right thing."

And yet I repeatedly read the claim that Google has secretly abandoned "don't be evil"—presumably on the down low so that they can now do all the evil they want.

[0]: https://abc.xyz/investor/other/google-code-of-conduct.html [1]: https://abc.xyz/investor/other/code-of-conduct.html

I think it doesn't work well for questions with simple answers; it's meant for more open-ended questions. I got much better responses for some random questions I just thought up: "What will AI enable in the future?" "Why are so many depressed?" etc.

My first question (and the response):

Q: What is the sound of one hand clapping?

A: The sound of one hand clapping is the sound of that silent wave, the sound of an absence, the absence of the noise ordinarily made by the collision of two hands.

   -from The Secret Parts of Fortune: Three Decades of Intense Investigations and Edgy … by Ron Rosenbaum

This still feels like search and not "talking" to anything. To me, talking would require back-and-forth refinement.

I wonder why they are releasing this only against books? Does it not work against the internet at large, or is it a question of scale? Because they aren't even doing all books, it's only 100,000 of them.

I still think it's pretty cool. Some future natural language search will make keyword search seem primitive, as in: how did we survive with only that for so long?

I'm glad Kurzweil ended up at Google and is able to work on cool projects like this. He's 70; I hope his vitamins work and he can stay active in development a while longer.

This is the coolest thing I've seen in a minute! Instantly exciting and it really works. I'm using this to explore old books I've read and forgotten parts of. Really fun to relive those special moments from fantasy and sci fi I've read over the years. Also this is an alternative to my experience at bookstores where I thumb through interesting books.

> will the AI take over the world?

- The number of dumb things the AI will be able to get away with has a direct relationship to what sort of intelligence the AI is supposed to represent.

- The AI will not underestimate the opponent’s military might.

Well, the first thing I expected was for it to use the microphone and actually listen to me talking. Alas, the title was misleading, and it expects text input.

An interesting idea anyway, but they could title it better.

This is a cool tool. Though this tool reminds me of a dilemma I have with Google: what do you think about the fact that Google has effectively extracted value from the authors of these books without paying them, yet Google itself does not allow you to scrape them AFAIK (If I'm wrong I'd be curious to see where in their ToS they allow this)?

Though Google doesn't seem to be monetizing this directly, the fact that they're training their models on this data does likely have some nebulous value. Given that most books are for sale, surely there's a good way to reconcile the fact that the authors would like payment and organizations like Google would like to scrape their text.

Google scrapes your website, then allows people to search your site and read snippets from it.

Google scans the book, then allows people to search the book and read snippets from it.

This is fun, but I hope it doesn't replace the normal book search, since it doesn't seem to have any advanced search features like limiting by date.

Interesting. I wonder what the dataset used to train the machine learning model looks like. Books and conversation style query pairs?

I wouldn't be surprised if they were actually using some techniques from NLP. In particular, I wouldn't be surprised if they'd run sentence parsers over their entire Books collection and then manipulate the parse trees to find sentences in books that look like good answers because their sentence structure is similar.

Basically, yes. This area is called Question Answering (QA). One of the popular datasets is SQuAD https://rajpurkar.github.io/SQuAD-explorer/
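
For anyone wondering what "find the sentence that best matches the question" looks like in its most stripped-down form, here is a toy sketch. It uses plain bag-of-words cosine similarity, which is a much cruder stand-in for whatever learned model Google actually uses, and it is not the parse-tree approach guessed at above either. The function names and the candidate sentences are made up for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(count * b[token] for token, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_sentences(question, sentences):
    """Return (score, sentence) pairs sorted from most to least similar."""
    q_vec = Counter(tokenize(question))
    scored = [(cosine_similarity(q_vec, Counter(tokenize(s))), s) for s in sentences]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


# Hypothetical candidate sentences, invented for this example.
candidates = [
    "Programming can be fun because you get to build things.",
    "Fire trucks are often painted red for visibility.",
    "The library closes at nine in the evening.",
]

ranked = rank_sentences("Why is programming fun?", candidates)
for score, sentence in ranked:
    print(f"{score:.2f}  {sentence}")
```

Word overlap like this breaks down as soon as the question and the answer use different vocabulary, which is presumably why a semantic model is needed for the open-ended questions people report working well.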

Wouldn't semantic search be far more applicable as an extension of Google Scholar or Google Patents?

There's also recoll, a document indexer and search tool, which I've heard is pretty good.

Is Google still scanning orphan works?

This is great. Books answered my question:

"which countries supported Iraq chemical attacks during Iran-Iraq war?"

"Saddam Hussein received chemical weapons from many countries, including the USA, West Germany, the Netherlands, the UK, France and China (Lafayette, 2002). In 1980 Iraq attacked Iran and employed mustard gas and tabun with 5% of all Iranian casualties directly attributable to the use of these agents"

from Handbook of Toxicology of Chemical Warfare Agents by Ramesh C. Gupta

That’s funny, because I just put that question into TTB, copied and pasted that paragraph into Google, and found this page. Interesting indeed. I was looking more for proof that the chemical weapons used on Halabja are the same ones that Saddam had in his arsenal. Cross-referencing the studies that were done on the attack site with the sources might help seal the deal.


in response to the US-backed coup of a democratically elected leader (hehe, Americans don’t like democracy when they do it “wrong”, see assad, orban, hussein, and so many more)


Could you please stop posting flamebait to HN? I realize it started upthread, but you've done it repeatedly recently, and we're trying to avoid flamewars here.


Russia straight up invaded Ukraine and annexed a part of it, an independent democratic country

What is a Wojak?
