KGs are great for data integration because they are naturally composable. If you have two data sources represented as graphs, and their node identifiers are in the same range (i.e., they have been reconciled), and their edge labels are mapped to the same schema, then you can combine them just by taking the union of all their triples.
This works across different vertical domains -- for instance, you can create a joint film + music KG just by unioning all the triples from a film KG and a music KG, as long as you've mapped the node identifiers to a common namespace (e.g. Wikipedia). You can then do cross-vertical queries like soundtracks for Will Smith movies, etc.
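A minimal sketch of that union-is-integration idea in plain Python, modeling each KG as a set of (subject, predicate, object) triples. The `ex:` identifiers below are illustrative placeholders, not real reconciled IDs, and the query helper is my own toy function, not any standard API:

```python
# Two KGs as sets of triples. Node identifiers are assumed to be
# already reconciled to a shared namespace (here a fake "ex:" one).
film_kg = {
    ("ex:BadBoys", "ex:starring", "ex:WillSmith"),
    ("ex:BadBoys", "ex:type", "ex:Film"),
}
music_kg = {
    ("ex:BadBoys", "ex:soundtrackAlbum", "ex:BadBoysOST"),
    ("ex:BadBoysOST", "ex:type", "ex:Album"),
}

# Integration really is just set union.
joint_kg = film_kg | music_kg

def soundtracks_for(actor, kg):
    """Cross-vertical query: soundtrack albums of films starring `actor`."""
    movies = {s for (s, p, o) in kg if p == "ex:starring" and o == actor}
    return {o for (s, p, o) in kg if p == "ex:soundtrackAlbum" and s in movies}

print(soundtracks_for("ex:WillSmith", joint_kg))  # {'ex:BadBoysOST'}
```

The point is that neither source graph had the answer on its own; the query only works against the union, and the union required no schema migration, only reconciled identifiers.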
Hi! Can you say more about that please?
You read it, and it all seems very abstract, and it's hard to understand why any of it would matter much to a programmer.
But look at things the other way around: assume you're a late 20th or early 21st century developer who grew up with the relational model, and now you have to work with some 1960s-era data modeling technology such as COBOL. You'd probably find that certain things that were easy in a relational database are suddenly really, really hard with this other tech stack. And probably, when you try to explain this to colleagues who'd never worked in anything but COBOL, they would respond with something like, "It's not hard, all you have to do is [really fiddly, error-prone, and difficult-to-maintain design pattern]." It can be difficult to distinguish between difficulty that's inherent to the problem, and difficulty that's only due to limitations of the tools that you know.
This is not by way of trying to imply that knowledge graphs are like the relational model in that they're definitely better than previous tech. But, assuming for the sake of argument that they do do some things better, perhaps they are like the relational model in that you need to spend a certain amount of hands-on time with them before their value becomes readily apparent.
... except they cannot possibly be like the relational model in that way, because the relational model isn't like that. The value of relational databases becomes readily apparent when you spend five minutes fiddling with an appropriate small-but-nontrivial relational database; you do not need to spend a [nontrivial] amount of hands-on time with them before their value becomes readily apparent.
But SQL is absolutely amazing.
You have to understand the data and the relations, what's not in the data that you'd expect to be, etc.
This distinction does not exist. Any difficulty apparently inherent to the problem can be addressed by new tools. We do not do this all the time because some issues are one-off, and designing a tool (or any automation) for them would be time consuming.
We started building with OWL, the Web Ontology Language, to represent the shape of the graph. This made sense because OWL is a very rich language for describing graphs. However, it also has drawbacks. It is very hard - and alien to common experience - for developers to read OWL. It was not built to describe schemata but rather ontologies (to describe what could be represented, rather than what must be represented). It also had no concept of a document, and as we were trying to build a document-oriented knowledge graph, we had to graft one onto it, which became a source of confusion for our users.
Eventually - with much pain and time - we decided to simplify the interface, make the concept of the document more central, make the primary interaction method be through JSON documents, and create a schema language that looks like the JSON you hope to build (and feels more like one you might write in a programming language).
It is early days for the relaunched version (and we had to swallow the frustration of such a deep breaking change), but it certainly feels like regular programmers are now able to quickly build knowledge graphs. The combination of graph, schema, and document is powerful.
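To make "a schema language that looks like the JSON you hope to build" concrete, here is a hypothetical illustration (this is not TerminusDB's actual schema syntax, just a sketch of the idea), with a toy conformance check:

```python
# Hypothetical schema: it mirrors the shape of the documents you want
# to store, rather than describing an ontology of what *could* exist.
schema = {
    "@type": "Class",
    "@id": "Person",
    "name": "xsd:string",
    "friend": {"@type": "Set", "@class": "Person"},
}

# A document instance shaped like the schema above.
doc = {
    "@type": "Person",
    "name": "Ada",
    "friend": [],
}

def conforms(doc, schema):
    """Toy check: every non-'@' field of the document is declared
    in the schema. A real checker would also validate types."""
    return all(k in schema for k in doc if not k.startswith("@"))

print(conforms(doc, schema))  # True
```

The appeal for "regular programmers" is that the schema reads like the data itself, so the mental translation from ontology language to document shape disappears.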
For instance you might put a "record" into a knowledge graph that describes some topic like

    https://www.wikidata.org/wiki/Q108937326

and then later decide to delete it. The "record" consists of not just the triples

    ?s ?p ?o
    where ?s == https://www.wikidata.org/wiki/Q108937326
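A sketch of what that subject-match deletion looks like over a triple set. Note that, per the comment above, matching on the subject alone is "not just" enough in general (triples pointing *at* the topic, or nested nodes, may also belong to the record); this shows only the simplest pass, and the `ex:` names are made up:

```python
# The record's subject identifier, as in the example above.
TOPIC = "https://www.wikidata.org/wiki/Q108937326"

kg = {
    (TOPIC, "ex:label", "Some topic"),
    (TOPIC, "ex:seeAlso", "ex:Other"),
    ("ex:Other", "ex:label", "Another topic"),
}

# Delete the record: drop every triple whose subject is the topic.
# This is exactly the pattern  ?s ?p ?o  where  ?s == TOPIC.
kg = {(s, p, o) for (s, p, o) in kg if s != TOPIC}

print(kg)  # only the ex:Other triple remains
```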
You can reconstitute the relational database without "rows" (e.g. turn a relational database into a graph and do OWL inference on that graph, or run SQL queries against a database with a columnar organization), but the row concept, like the document concept in document-oriented databases, provides a boundary for updating records that (mostly) works even in the absence of transactional semantics.
Many of the older approaches to implementing transactions were row-centric, although newer MVCC approaches apply just fine to graph systems.
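One way to read "turn a relational database into a graph" is a direct mapping where each row becomes a set of triples keyed by a row identifier. A minimal sketch (the table, columns, and IDs are made up for illustration):

```python
# Direct-mapping sketch: each row of a relational table becomes
# triples of the form (row_id, column_name, value).
rows = [
    {"id": "person/1", "name": "Ada", "born": 1815},
    {"id": "person/2", "name": "Alan", "born": 1912},
]

def row_to_triples(row):
    rid = row["id"]
    return {(rid, col, val) for col, val in row.items() if col != "id"}

graph = set().union(*(row_to_triples(r) for r in rows))

def record(graph, rid):
    """The 'row' boundary survives as the set of triples sharing a
    subject -- which is what makes record-level updates well-defined
    even without transactional semantics."""
    return {(s, p, o) for (s, p, o) in graph if s == rid}

print(record(graph, "person/1"))
```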
You might like this white paper (but for reasons above you will have to overlook some of the OWL information): https://github.com/terminusdb/terminusdb/blob/dev/docs/white...
While the technology is built on the back of what programmers do, there is nothing inherent to knowledge graphs that implies that building them is a programmer's task. It's very possible that that task and responsibility falls to someone else, and programmers are left building the interface and access portal to a tool used by a different specialization.
Why do programmers want to do everything?
The whole point of this type of systems analysis is to be able to lift and shift the task from a group of people who can do it but don't want to, to a smaller group of people who have chosen to specialize in it.
One thing missing (from my perspective) is some sort of informed assessment of lessons learned about the relatively limited adoption of such methods in practice, and the role of blind spots or academic biases in this respect.