
AmpliGraph: A TensorFlow-Based Library for Knowledge Graph Embeddings - mulletboy
http://ampligraph.org
======
nl
Graph embeddings are one of my favorite underused things in ML.

I've used them to do things like characterise users based on follower/followee
patterns, but there are many more applications.

In the past I've had great success with Facebook Research's StarSpace.

~~~
evrydayhustling
Can you share more about that project? Sounds cool.

------
jamasb
I've been doing some work on link prediction in knowledge graphs recently,
with poor results on real-world data. These methods don't necessarily require
a huge amount of data, but they are very sensitive to noise and to the
'density' of the dataset. The benchmark datasets are, in essence, very easy to
get good performance on. It's a real shame that metrics for these methods'
tolerance of noise and sparsity are not reported, because both are going to be
present in almost any real-world dataset in far greater quantities than in
current benchmarks.
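One cheap sanity check along these lines is to measure how many edges the graph has per entity before training. A minimal sketch (the `graph_density` helper and the toy graphs are made up for illustration, not part of any library):

```python
import numpy as np

def graph_density(triples):
    """Edges per entity: a rough proxy for the redundancy KGE models need."""
    triples = np.asarray(triples)
    # Entities appear as subjects (column 0) and objects (column 2).
    entities = np.union1d(triples[:, 0], triples[:, 2])
    return len(triples) / len(entities)

# Dense toy graph: few entities, many edges connecting them.
dense = [["a", "r", "b"], ["b", "r", "c"], ["a", "r", "c"], ["c", "r", "a"]]
# Sparse toy graph: every triple introduces two previously unseen entities.
sparse = [["a", "r", "b"], ["c", "r", "d"], ["e", "r", "f"]]

print(graph_density(dense))   # 4 edges / 3 entities
print(graph_density(sparse))  # 3 edges / 6 entities
```

A real-world graph scoring far below benchmarks like FB15k on this kind of ratio is a warning sign that link prediction results won't transfer.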

~~~
mulletboy
Well, the landscape is still quite fluid (new models are proposed in the
literature at every major conference). Processing real-world graphs is
obviously more challenging, for a number of reasons (multi-modality, scale,
etc.), though benchmarks are catching up and becoming harder (see FB15k-237 or
WN18RR).

As a general rule of thumb, it is important that your graph has enough
redundancy in it: the more relations, the better. Also, bear in mind that
these models do not support multi-modality, i.e. literals such as numbers,
strings, geo coordinates, and timestamps are simply treated as entities. In
most cases it is probably better to filter literals out before generating the
embeddings.

------
pilooch
Looks very seriously made and documented, congrats! I was looking at it a bit
more closely the other day and put it on my list of future tools. There's been
another somewhat related library released by Facebook recently:
[https://ai.facebook.com/blog/open-sourcing-pytorch-
biggraph-...](https://ai.facebook.com/blog/open-sourcing-pytorch-biggraph-for-
faster-embeddings-of-extremely-large-graphs/)

~~~
mulletboy
Thanks! Feel free to play around with it - and of course any feedback is much
appreciated (GitHub, email or on our Slack channel
[https://join.slack.com/t/ampligraph/shared_invite/enQtNTc2NT...](https://join.slack.com/t/ampligraph/shared_invite/enQtNTc2NTI0MzUxMTM5LTAxM2ViYTc0ZTI2NzNhOGZiNjkzZjNkN2NkNDc3NWUyZmU2Njg0MDMxYWY5NGUwYWVmOTNkOWI5NmI0NDJjYWI).
I was not aware of pytorch-biggraph. Looks cool. It's good to see there's a
lot going on in graph representation learning!

------
bravura
Can you help me understand, what are possible inputs to ampligraph?

I think the main use-case is plugging in an existing knowledge graph, and it
filling in the gaps, correct?

Can I augment this with really high-quality embeddings for the nodes that were
learned over auxiliary unlabelled text?

What are other ways I can augment the data set?

Is this useful only when there are many edge-types, or is it also good when
there are very few?

It looks promising, I just couldn't immediately grok when I should look to
this library.

~~~
nl
I like the README.md for StarSpace[1] because it has lots of examples which
get you thinking.

I used graph embeddings as input to a classifier to classify people when
follower/followee information was easy to gather but text wasn't.

Basically anything that can be represented as a graph can be used. There is
some interesting work being done using code syntax trees as input which uses a
very similar approach. See code2vec[2]

I'm not aware of any way to transfer text embeddings into graph embeddings,
but you could concatenate them and use them together (I've done this before),
or maybe do some dimensionality reduction, or do a multi-task learning thing
and try to learn some combined representation.
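The concatenation idea above is a few lines of numpy. A sketch (the dimensions and the random stand-in embeddings are illustrative; real vectors would come from a KGE model and a text encoder):

```python
import numpy as np

rng = np.random.default_rng(0)

n_users = 100
graph_emb = rng.normal(size=(n_users, 64))   # e.g. from a follower-graph KGE
text_emb = rng.normal(size=(n_users, 300))   # e.g. averaged word vectors

# Simple late fusion: concatenate the per-user vectors row-wise into one
# feature matrix of shape (n_users, 64 + 300).
features = np.concatenate([graph_emb, text_emb], axis=1)
```

The fused `features` matrix can then feed any off-the-shelf classifier; the main caveat is that the two embedding spaces may need rescaling so one doesn't dominate the other.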

I'm not aware of the scalability limits of this particular library, but
Facebook Research's pytorch-biggraph[3] (released 2 days ago) scales to
trillions of edges and billions of nodes.

[1]
[https://github.com/facebookresearch/StarSpace](https://github.com/facebookresearch/StarSpace)

[2] [https://arxiv.org/abs/1803.09473](https://arxiv.org/abs/1803.09473)

[3] [https://ai.facebook.com/blog/open-sourcing-pytorch-
biggraph-...](https://ai.facebook.com/blog/open-sourcing-pytorch-biggraph-for-
faster-embeddings-of-extremely-large-graphs/)

------
quenstionsasked
Cool. KGE methods are becoming more and more useful as companies are trying to
find ways to interface some internal knowledge graph with machine learning
techniques. I expect this space to grow substantially!

~~~
mulletboy
Indeed. Here at Accenture Labs we use it in a wide range of application
scenarios. Besides, KG embeddings can be used for tasks beyond link prediction
(e.g. link-based clustering).

------
mulletboy
btw, we are hiring research engineers here in our Dublin Lab. Send me an email
if interested: luca.costabello@accenture.com [https://www.accenture.com/ie-
en/careers/jobdetails?src=&id=0...](https://www.accenture.com/ie-
en/careers/jobdetails?src=&id=00693824_en)

