
Not especially super-stealth, but it's not really ready for public consumption.

Basically I've been gluing lots of pieces of pre-existing software together, sticking web service front ends on them and making them work together. It's all Dockerized, so each piece can be run separately.

For us, we already do semantic concept extraction using Apache Stanbol, against content that flows into our enterprise social network product, and then store the associated triples in an RDF triplestore. We have a primitive search feature exposed, which lets you query using SPARQL, but realistically, we know "normals" will never, ever, ever, ever write SPARQL queries, so the big push is to do automated translation from natural language (even if it's a slightly restricted natural language) into SPARQL so users don't have to think about triples and what-not.
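To make the "restricted natural language into SPARQL" idea concrete, here is a minimal sketch of template-based translation in plain Python. It is not our implementation and not how Quepy works internally; the question templates, the `to_sparql` helper, and the choice of DBpedia-style properties (`dbo:capital`, `dbo:birthPlace`, `rdfs:label`) are assumptions made for illustration.

```python
import re

# Prefixes for the DBpedia-style vocabulary used below (an assumption;
# a real deployment would use whatever vocabulary its triplestore holds).
PREFIXES = (
    "PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>\n"
    "PREFIX dbo: <http://dbpedia.org/ontology/>\n"
)

# Hypothetical templates: each pairs a regex over a restricted English
# question with a SPARQL body. Doubled braces are literal braces for
# str.format; {name} is filled with the captured entity label.
TEMPLATES = [
    (re.compile(r"^what is the capital of (?P<name>.+?)\??$", re.I),
     "SELECT ?answer WHERE {{\n"
     "  ?entity rdfs:label \"{name}\"@en .\n"
     "  ?entity dbo:capital ?answer .\n"
     "}}"),
    (re.compile(r"^where was (?P<name>.+?) born\??$", re.I),
     "SELECT ?answer WHERE {{\n"
     "  ?entity rdfs:label \"{name}\"@en .\n"
     "  ?entity dbo:birthPlace ?answer .\n"
     "}}"),
]


def escape_literal(text):
    """Escape backslashes and double quotes for a SPARQL string literal."""
    return text.replace("\\", "\\\\").replace('"', '\\"')


def to_sparql(question):
    """Return a SPARQL query for a recognised question, else None."""
    cleaned = question.strip()
    for pattern, body in TEMPLATES:
        match = pattern.match(cleaned)
        if match:
            name = escape_literal(match.group("name").strip())
            return PREFIXES + body.format(name=name)
    return None


if __name__ == "__main__":
    print(to_sparql("What is the capital of France?"))
```

Questions that match no template return `None`, so the caller can fall back to ordinary keyword search. The obvious limitation is that every supported phrasing needs its own template, which is the gap a grammar-driven tool is meant to close.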

Very similar here.

I'm (currently) using DBpedia dumps, loaded into Jena. I'm also experimenting with content extraction from other sources (e.g. the CIA Factbook).

I have the Quepy->SPARQL mapping working, though.

I'll send you an email.




Nice, sounds like we're using a very similar stack. We use Jena as our triplestore too, but we don't touch the DBpedia triples directly; we rely on Stanbol to do the entity extraction for us. And we're also starting down the path of using Quepy.

Great minds think alike? :-)


Jena here too. 14G of data total. Total Triples: 124,294,115 (SELECT (COUNT(*) AS ?no) { ?s ?p ?o })
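For anyone who wants to run the same count against their own store, here is a rough sketch using only the Python standard library. It assumes the store exposes a standard SPARQL protocol endpoint over HTTP (for example via Fuseki) and returns results in the SPARQL JSON results format; the endpoint URL in the usage note and the helper names are hypothetical.

```python
import json
import urllib.parse
import urllib.request

# The same count query quoted above.
COUNT_QUERY = "SELECT (COUNT(*) AS ?no) { ?s ?p ?o }"


def build_query_url(endpoint, query=COUNT_QUERY):
    """Build a SPARQL protocol GET URL with the query URL-encoded."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def parse_count(results_json, var="no"):
    """Extract the integer bound to `var` from SPARQL JSON results."""
    doc = json.loads(results_json)
    binding = doc["results"]["bindings"][0]
    return int(binding[var]["value"])


def count_triples(endpoint):
    """Run the count query against `endpoint` and return the total."""
    request = urllib.request.Request(
        build_query_url(endpoint),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request) as response:
        return parse_count(response.read().decode("utf-8"))
```

Usage would look like `count_triples("http://localhost:3030/ds/sparql")`, where the dataset name `ds` is a placeholder. On a store this size a full count can take a while, since it has to touch every triple.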

I believe that's actually pretty big for a triplestore. Seems to work ok, but loading is pretty slow.

I'm contemplating switching to YAGO2[1] or Freebase, but I think I'd be better served doing entity extraction myself (DBPedia & YAGO tend to be out of date).

[1] http://www.mpi-inf.mpg.de/yago-naga/yago/



