Knowledge Graphs (arxiv.org)
261 points by fxru on March 6, 2020 | 49 comments

We are working on an open-source Knowledge Graph called Cayley (https://cayley.io) including a browsable web interface (https://github.com/cayleygraph/web) and a new query language improving on the work done by Google Knowledge Graph (formerly Freebase) and the W3C's Semantic Web project.

That is very cool. You mention some RDF support, but no SPARQL query support if I understand your first linked web page.

For what my opinion is worth, I encourage you! I have been a fan of the semantic web since day zero, and I worked with their Knowledge Graph while working at Google. In order to promote the tech, I am working on an app (for iOS and macOS) that will walk new users through using SPARQL against endpoints like WikiData and DBPedia. My email is in my profile; contact me and we can set up a phone call sometime.
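For readers who have not sent SPARQL to a public endpoint before, here is a minimal sketch of what such a request looks like. It only builds the GET URL and does not touch the network. The endpoint URL is Wikidata's public query service; the example query (items that are an instance of house cat) and the helper name `build_sparql_url` are illustrative assumptions, not something from the app mentioned above.

```python
from urllib.parse import urlencode

# Public Wikidata SPARQL endpoint.
WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"

# Illustrative query: five items that are an instance of (P31) house cat (Q146).
EXAMPLE_QUERY = """
SELECT ?item ?itemLabel WHERE {
  ?item wdt:P31 wd:Q146 .
  SERVICE wikibase:label { bd:serviceParam wikibase:language "en" . }
}
LIMIT 5
"""


def build_sparql_url(endpoint: str, query: str) -> str:
    """Return a GET URL that asks the endpoint for JSON results.

    Hypothetical helper: it URL-encodes the query text and appends it
    to the endpoint as the `query` parameter.
    """
    params = {"query": query.strip(), "format": "json"}
    return endpoint + "?" + urlencode(params)


url = build_sparql_url(WIKIDATA_ENDPOINT, EXAMPLE_QUERY)
print(url)
```

Fetching that URL with any HTTP client (and a descriptive User-Agent header) should return the bindings as JSON, assuming the endpoint is reachable.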

Where can we find information about how Google is using KGs? All I can find is pretty old info (Freebase etc.). There are details of what they do with it (those boxes in searches), but no technical explanation.

What's the license? I can't find one in the repo...

The repo says Apache 2.

I think we're looking at different repos...


Exactly. That's a different repo. Is the UI source included there as well? If so, then it's covered, but if not I can't assume that it is.

Last time I checked Cayley, it didn't have Linked Data support. Is it something that you have added recently? And does it support SPARQL?

Two of the paper's authors wrote up a helpful history of knowledge graphs that would be interesting for those who enjoyed this submission - http://knowledgegraph.today/paper.html

Here's a less comprehensive and certainly much less academic, but nevertheless potentially interesting, related post: https://www.linkedin.com/pulse/knowledge-graphs-end-user-pro....

For anyone interested, there's a (virtual) knowledge graph conference going on in May (www.knowledgegraph.tech). Ignore the pricing, since it reflects the original in-person event. I'm organizing an investor pitch session for this conference and am looking for early (A, seed, or pre-seed) startups interested in participating - let me know if interested!

I'm interested in the online event. I don't have a KG-related business to pitch though. Is that an issue?

Sorry for the delay! You should definitely just attend the conference: https://www.knowledgegraph.tech/

Whoa, great resource. I just skimmed it, but I like that they define and categorize knowledge graphs (annex A.3) because this is something that wasn't properly done.

I'm so glad I read your comment because it motivated me to actually click into the PDF and read further. It truly is an incredible resource. Really a tour de force.

Can anyone point to a place where knowledge graphs are used effectively in industry/research/personal hobby?

Current consulting project involves strategy for patching 8000 vulnerabilities, across 300 OS images, hosting 500 applications, managed by 50 branches, in 20 divisions, where about 40 people are accountable at a business level for the applications.

I'm using neo4j to integrate this set with HR and financial allocations data to show the direct exposure of business applications to their budgets, and provide a rough value at risk on a per branch basis, and create big scary graph viz clouds of vulnerabilities that all point to the person responsible for them, mainly to get people to get off their asses and patch their shit.

I'm also using queries of the graph to auto-generate hierarchies of jira epics and stories and assign them to the business owners so we can track remediation across the whole enterprise in both jira and azure devops.

It has produced a navigable organization map and we can run queries like, show me the vulnerability exposure of this division based on the aggregate risk of the applications deployed by its branches, and show me who is responsible for fixing it. Show me the sr. manager who has the most vulnerabilities roll up to them by their jr. managers, etc.
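To make the "roll up" idea concrete, here is a minimal sketch in plain Python rather than Neo4j or Cypher. Every node name, relation name, and count is made up for illustration; the real project's data model is not described in enough detail to reproduce.

```python
from collections import defaultdict

# Hypothetical edges as (source, relation, target); all names are invented.
EDGES = [
    ("div-A", "HAS_BRANCH", "branch-1"),
    ("div-A", "HAS_BRANCH", "branch-2"),
    ("branch-1", "DEPLOYS", "app-billing"),
    ("branch-2", "DEPLOYS", "app-portal"),
    ("app-billing", "HAS_VULN", "vuln-1"),
    ("app-billing", "HAS_VULN", "vuln-2"),
    ("app-portal", "HAS_VULN", "vuln-3"),
    ("app-billing", "OWNED_BY", "alice"),
    ("app-portal", "OWNED_BY", "bob"),
]


def build_index(edges):
    """Index edges by (source, relation) so each hop is a dictionary lookup."""
    index = defaultdict(list)
    for src, rel, dst in edges:
        index[(src, rel)].append(dst)
    return index


def exposure_by_owner(index, division):
    """Walk division -> branch -> app -> vuln and count vulns per app owner."""
    counts = defaultdict(int)
    for branch in index[(division, "HAS_BRANCH")]:
        for app in index[(branch, "DEPLOYS")]:
            vulns = index[(app, "HAS_VULN")]
            for owner in index[(app, "OWNED_BY")]:
                counts[owner] += len(vulns)
    return dict(counts)


INDEX = build_index(EDGES)
print(exposure_by_owner(INDEX, "div-A"))
```

The point of the sketch is that the question is phrased as a path through the organization, which is roughly what a graph query language lets you write directly.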

Isn't that mostly a database problem? I have loved knowledge graphs for a while until I realized they were just a way to obfuscate complex problems, not solving them.

Here, what part of your problem wouldn't be solved by a DB with a table of vulnerabilities, a table of branches and divisions, and a table of people?

A query can fetch the images and apps used by a division, then use that to query the vuln database.

I mean, I understand the problem is not super simple, but I fail to see what knowledge graphs bring to it?
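A minimal sketch of the table-based approach, using Python's built-in `sqlite3` with an in-memory database. The table and column names and the sample rows are assumptions made up for illustration.

```python
import sqlite3


def build_db():
    """Create a tiny in-memory schema with made-up sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE branches (branch TEXT PRIMARY KEY, division TEXT);
        CREATE TABLE apps (app TEXT PRIMARY KEY, branch TEXT);
        CREATE TABLE vulns (vuln TEXT PRIMARY KEY, app TEXT);
        INSERT INTO branches VALUES ('branch-1', 'div-A'), ('branch-2', 'div-B');
        INSERT INTO apps VALUES ('app-billing', 'branch-1'), ('app-portal', 'branch-2');
        INSERT INTO vulns VALUES
            ('vuln-1', 'app-billing'),
            ('vuln-2', 'app-billing'),
            ('vuln-3', 'app-portal');
        """
    )
    return conn


def vulns_for_division(conn, division):
    """Join vulns -> apps -> branches and filter by division."""
    rows = conn.execute(
        """
        SELECT v.vuln
        FROM vulns v
        JOIN apps a ON a.app = v.app
        JOIN branches b ON b.branch = a.branch
        WHERE b.division = ?
        ORDER BY v.vuln
        """,
        (division,),
    ).fetchall()
    return [row[0] for row in rows]


conn = build_db()
print(vulns_for_division(conn, "div-A"))
```

Each extra level in the hierarchy (people, budgets, managers of managers) adds another join, which is where the two approaches start to feel different in practice.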

Using a graph simplifies the query parameters, reduces the need for join() operations, and allows you to reason more directly about the data model instead of rdbms concepts and SQL. Could you do it in SQL? Sure, but the bolt browser and cypher made it go a lot faster. I'd go so far as to argue that graphs are optimized for reasoning and analysis, where relational dbs are optimized for performance.

When I need a performance problem solved, I'll hire an engineer to reimplement it, but until then, I can analyze the data at the business level. Right now I just need a rope across a river to see what's on the other side and not a bridge. :)

Aren't knowledge graphs just a buzzword way of saying "graph database"?

So yeah, the types of problem a db system solves are db problems, even if it's using a semantic network model instead of a relational model.

Graph databases can hold totally bespoke data that makes sense only to the consuming application or they can hold data that has been factored and connected to outside terminologies and external datasets. One holds data, one holds knowledge.

As I understand it, the graph nature of Knowledge Graphs allows simpler queries, and also better modeling of the data.

Some even support techniques like inferring "new" wisdom that was not explicitly modeled.

In the end you could solve all that in other ways, like making the application smarter and using a "not so smart" db system. But I am not an expert in that topic...
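As a rough illustration of inferring facts that were never stored explicitly, here is a minimal forward-chaining sketch over triples. The two rules (subclass transitivity and type inheritance) and the relation names `is_a` and `subclass_of` are assumptions chosen for the example; real systems use standardized vocabularies and far more efficient reasoners.

```python
def infer(triples):
    """Apply two simple rules repeatedly until no new triples appear.

    Rule 1: (a subclass_of b) and (b subclass_of c) => (a subclass_of c)
    Rule 2: (x is_a b)        and (b subclass_of c) => (x is_a c)
    """
    facts = set(triples)
    while True:
        new = set()
        for (a, p, b) in facts:
            for (c, q, d) in facts:
                if b != c or q != "subclass_of":
                    continue
                if p == "subclass_of":
                    new.add((a, "subclass_of", d))
                elif p == "is_a":
                    new.add((a, "is_a", d))
        if new <= facts:
            # Fixed point reached: nothing new was derived in this pass.
            return facts
        facts |= new


stored = [
    ("rex", "is_a", "dog"),
    ("dog", "subclass_of", "mammal"),
    ("mammal", "subclass_of", "animal"),
]
facts = infer(stored)
print(("rex", "is_a", "animal") in facts)
```

Nothing in the stored triples says that rex is an animal; that fact only exists after the rules have run.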

Sounds like a good idea for a startup or SaaS project. The JIRA epics to business owner mapping seems quite useful.

Thanks, my actual product does that now (qtra.io, a collaboration tool for product-level TRAs; it exports to Jira for normal agile backlog grooming of threat/risk stories into iterations and development, and shrinks TRAs from weeks to hours). It generates a graph on the back end as well, aggregating risk data across all customers, without proprietary vuln data in it.

However, reality is on this project, all the input data exists in spreadsheets, which means you have to normalize it into your graph ontology data model. The SaaS here would be something like Graphene or hosted Neo4j.

If there were a graph model for something like powerBI for graphs, that could be a play. Else, it's more of a high dollar consulting solution.

Finding that SaaS in enterprise is limited because most enterprise people live in spreadsheets, which are both orders of magnitude more sophisticated, yet stunted compared to SaaS. But for all its limitations, Excel is Turing complete, and your CRUD application never will be, even if the users don't know what that means. Enterprise tech is about generating data to drive conversations; it isn't about solving problems, because managing that problem is somebody's job.

If a CIO came to me and said "fix my company," I would say, "sure, give me your data, then 6-12 months to herd all your cats" and I could do it. Nice lifestyle, but there's no exit in that.

Yes indeed. I work for a startup that offers an intelligence analysis platform built on a knowledge graph. It is open source.


SaaS is in the works.

To what extent does it scale? Are we limited to in-memory graphs?

We’re releasing a cluster version soon. It is a persistent graph (so not limited to RAM).

A single host can probably handle a billion nodes or so. Our largest in-production graph is about a third of that.

Pinterest wrote up a brief paper explaining their rollout of a knowledge graph aimed at describing their users' tastes - https://arxiv.org/pdf/1907.02106.pdf . This paper won an award under the industry category at ISWC 2019. You can read more about what the knowledge graph enabled at https://business.pinterest.com/en/blog/introducing-the-pinte...

Section 10 of the paper is titled "Knowledge Graphs in Practice" and gives concrete examples of usage of Knowledge Graphs in Enterprises and Open / Public scenarios.

For security+fraud+IT examples, you can see some of the talks @ https://www.graphtheplanet.com/ a couple weeks back

We primarily get pulled in as a component for projects that use as part of:

* Asset management, whether sec, IT, manufacturing specs, pharma, metadata stores, config systems...

* ML model unification (e.g., how Google translates between many systems)

* Knowledge bases/search for both people & for algorithms. Wikis, recommenders, ...

There are less common scenarios we see that I also expect to explode, like massively bootstrapping ML systems via weak supervision. GraphDBs & compute are also used for what I consider non-knowledge graph scenarios, likely scaling some compute tasks. Exciting times!

We use them at Forge.AI for high-precision entity disambiguation and to provide explainable automated dependency analysis [1] as part of a filtering system.

[1] https://hackernoon.com/knowledge-graphs-for-enhanced-machine...

Entity disambiguation, I am currently running a parallel graph for that very purpose to mend a "pivoted one too many times" database with lots of duplicate entities and misgrouped entities.

I built this service that lets you create knowledge graphs based on API results: https://alman.ax

I tried selling it but until now it has proven hard to explain what the benefits are, arguably it needs more work to be really useful.

In my lab "knowledge map" is one of the trendy internal buzzwords; we use it to model course content and concepts to be acquired by students. I personally use a graph-based lexical system which allows me to create dictionary apps super quickly (basically by just transforming the initial resource, sometimes by creating additional UI components).

In industry Google uses it (they coined the term); otherwise I don't know. In the open data world DBpedia is the reference and is the target of a lot of inbound links.

Use on a research & competition robot: robot has some predetermined knowledge of the environment, like which object types are related to which storage locations. Robot then goes around finding objects and putting them in the right place based on the object type, storage type and their locations. Probably a much smaller graph than sibling comments are working with though

The little panels and answers you see in Google Search are powered by the Google Knowledge Graph [1].

[1] https://developers.google.com/knowledge-graph

In the Google Knowledge Graph, to enhance Google's search results with relevant information from multiple sources like Wikipedia.

Google's Knowledge Graph seems to have given knowledge graphs their name in the first place.

Contact tracing and disease spread modeling are really useful knowledge graphs when you use them to capture “I think I went to these locations” and match it against other reports.

Google and Facebook use Knowledge Graphs. I have had consulting customers for this tech in the health (medical) industry.

The paper reads like an index into a number of related research fields. One running example, very concise explanations, maaaany references.

Very strange question, but can anyone identify the serif font used in the body? The two-story characters are very nice and the serifs are not distracting.

It would appear to be this: https://www.dafont.com/linux-libertine.font


If you have poppler (and maybe xpdf) installed, the "pdffonts" command will list the fonts referenced in a pdf file (and whether they're embedded or not).

I found causal loop diagrams much more effective.

The paper looks like it will be a nice resource to direct AI students towards; however, the Reviewer 2 in me is stirring, I'm sort of concerned with the lack of Betty's Brain citation in the paper, as that is one of the studies on knowledge graphs in education.


No. It's year=2020, Month=03.
