
I'm curious how the knowledge graph API performs disambiguation without any context. E.g., if you search for `Apple` will it return the company or the fruit?



The current service returns a ranked list of up to 200 entities, each with a score. You can specify a type in your query or filter the results to select types of interest (e.g., Person, Place or Organization). The top result for 'apple' is the Corporation 'Apple, Inc.' and #2 is Thing 'apple' (a fruit). The score is probably based on a graph popularity metric (e.g., number of inlinks), possibly augmented by PageRank. Interestingly, the knowledge graph ID is the same as the Freebase MID, and the results of the KG search for 'apple' appear to be a subset of a similar Freebase search, in the same order.
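To make the "filter the results" option concrete, here is a minimal sketch of client-side filtering over a ranked result list. The payload is hand-written and only assumed to resemble the response shape (`itemListElement`, `result`, `@type`, `resultScore`); the scores are made up for illustration and `filter_by_type` is a hypothetical helper, not part of any Google client library.

```python
# Hypothetical, hand-written payload assumed to resemble a KG search
# response; the scores are invented for illustration only.
SAMPLE = {
    "itemListElement": [
        {
            "result": {
                "name": "Apple Inc.",
                "@type": ["Corporation", "Organization", "Thing"],
            },
            "resultScore": 100.0,
        },
        {
            "result": {"name": "Apple", "@type": ["Thing"]},
            "resultScore": 50.0,
        },
    ]
}


def filter_by_type(response, wanted):
    """Return (name, score) pairs whose @type includes `wanted`, best score first."""
    hits = []
    for element in response.get("itemListElement", []):
        result = element.get("result", {})
        types = result.get("@type", [])
        # Tolerate a single type given as a bare string.
        if isinstance(types, str):
            types = [types]
        if wanted in types:
            hits.append((result.get("name"), element.get("resultScore", 0.0)))
    # Re-sort defensively rather than trusting the incoming order.
    return sorted(hits, key=lambda hit: hit[1], reverse=True)


if __name__ == "__main__":
    print(filter_by_type(SAMPLE, "Corporation"))
```

Filtering on "Corporation" keeps only the company, while "Thing" keeps both entries, so a broad type gives you no disambiguation at all.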


I don't see how it could. If you search Google for apple, or even ask a person to give you information about the term "Apple", how can they give you what you need without further context?


The Knowledge "Graph"[1] offers an optional disambiguation parameter you can query with[2]. DuckDuckGo (I swear I'm not a shill or associated with them!) offers a disambiguation API out of the box and integrates some of the RDF material I mentioned below. Here's your "Apple" example[3].

That said, consider how much data Google has on the average user, and that you have to sign up for an API key, which is presumably associated with your search history and Gmail history (any conversations sent from your Gmail account, or any mail sent to you from a Gmail account). With that, they could easily determine whether you meant Apple the fruit [you work for the USDA], Apple the company [you're an engineer in SF with a User-Agent history that's very heavily skewed towards Safari], or the etymological basis of Apple, the word [you're a linguist], and disambiguate based on aggregate information. I'd imagine it'd be pretty trivial to do with their existing advertising profile plus the visit history of any site that has either Google Analytics or a Doubleclick ad.

[1] Again, I struggle to call it a graph. Even if it's implemented as a GDB on Google's end, until the end user can traverse it, it's just a Knowledge API.

[2] https://developers.google.com/knowledge-graph/reference/rest... See: `types`.

[3] http://api.duckduckgo.com/?q=apple&format=json&pretty=1
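For anyone who wants to try both, here is a rough sketch of how the two request URLs might be built. The DuckDuckGo URL mirrors [3]; the Knowledge Graph endpoint and the `query`, `types`, `limit` and `key` parameter names are what I understand the search API to accept, so check them against the reference in [2]. `build_kg_url` and `build_ddg_url` are hypothetical helper names, and `YOUR_API_KEY` is a placeholder.

```python
from urllib.parse import urlencode

# Assumed endpoint for the Knowledge Graph Search API; verify against [2].
KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"
# Endpoint taken from the DuckDuckGo example in [3].
DDG_ENDPOINT = "http://api.duckduckgo.com/"


def build_kg_url(query, api_key, types=None, limit=10):
    """Build a KG search URL, optionally narrowed to one or more types."""
    params = [("query", query)]
    # Repeat the `types` parameter once per requested type.
    for entity_type in types or []:
        params.append(("types", entity_type))
    params.append(("limit", limit))
    params.append(("key", api_key))
    return KG_ENDPOINT + "?" + urlencode(params)


def build_ddg_url(query):
    """Build the DuckDuckGo query URL in the same form as [3]."""
    params = [("q", query), ("format", "json"), ("pretty", 1)]
    return DDG_ENDPOINT + "?" + urlencode(params)


if __name__ == "__main__":
    # Placeholder key; nothing is sent over the network here.
    print(build_kg_url("apple", "YOUR_API_KEY", types=["Corporation"]))
    print(build_ddg_url("apple"))
```

Passing `types=["Corporation"]` is the "I already know what I want" case; leaving it off gets you the mixed ranked list and you disambiguate yourself.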


Really cool, thanks! The `types` parameter seems to be a good way to add context if you know, a priori, the type you're looking for.

Google definitely fuses user data into their knowledge graph. This is seen in Freebase's `g.` identifier [1]. I'm curious if they'd influence their publicly facing API algorithms using that data.

[1] https://groups.google.com/forum/#!topic/freebase-discuss/_8x...


Completely agree! I'm curious whether the `query` parameter in the API performs well on long queries (with context) or whether it needs to be focused on a single entity's name.



