Basically I've been gluing lots of pieces of pre-existing software together, sticking web-service front ends on them, and making them work together. It's all Dockerized so the pieces can be run separately.
For us: we already do semantic concept extraction with Apache Stanbol against content that flows into our enterprise social network product, then store the resulting triples in an RDF triplestore. We expose a primitive search feature that lets you query with SPARQL, but realistically we know "normals" will never, ever write SPARQL queries. So the big push is automated translation from natural language (even a slightly restricted natural language) into SPARQL, so users don't have to think about triples and what-not.
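For a sense of why, here's roughly what even a simple lookup against that kind of store looks like (the prefixes and schema here are hypothetical, just for illustration):

```sparql
# Hypothetical: find authors of documents tagged with the concept "machine learning"
PREFIX dc:   <http://purl.org/dc/terms/>
PREFIX foaf: <http://xmlns.com/foaf/0.1/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT DISTINCT ?name
WHERE {
  ?doc     dc:subject  ?concept .
  ?concept rdfs:label  "machine learning"@en .
  ?doc     dc:creator  ?person .
  ?person  foaf:name   ?name .
}
LIMIT 10
```

Reasonable for us, but not something a non-technical user will ever type.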
Very similar here.
I'm (currently) using DBPedia dumps, loaded into Jena, and I'm experimenting with content extraction from other sources (e.g. the CIA World Factbook).
I have the Quepy->SPARQL mapping working, though.
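Not Quepy itself, but a toy sketch of the same template idea: pair a question regex with a SPARQL skeleton and fill in the captured entity (the dbo:/rdfs: predicates here are just illustrative placeholders):

```python
import re
from typing import Optional

# Toy NL -> SPARQL translation: each template pairs a question pattern
# with a SPARQL skeleton. Quepy does this with refo patterns and an
# expression DSL; this is just the shape of the mapping.
TEMPLATES = [
    (re.compile(r"^who is (?P<name>.+?)\??$", re.I),
     'SELECT ?abstract WHERE {{ ?p rdfs:label "{name}"@en . '
     '?p dbo:abstract ?abstract . }}'),
    (re.compile(r"^what is the capital of (?P<country>.+?)\??$", re.I),
     'SELECT ?capital WHERE {{ ?c rdfs:label "{country}"@en . '
     '?c dbo:capital ?capital . }}'),
]

def to_sparql(question: str) -> Optional[str]:
    """Return a SPARQL query for the question, or None if no template matches."""
    for pattern, skeleton in TEMPLATES:
        m = pattern.match(question.strip())
        if m:
            groups = {k: v.strip() for k, v in m.groupdict().items()}
            return skeleton.format(**groups)
    return None

print(to_sparql("What is the capital of France?"))
```

The restricted-natural-language caveat shows up immediately: anything outside the template list just falls through to None.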
I'll send you an email.
Great minds think alike? :-)
I believe that's actually pretty big for a triplestore. It seems to work OK, but loading is pretty slow.
I'm contemplating switching to YAGO2 or Freebase, but I think I'd be better served doing entity extraction myself, since DBPedia and YAGO tend to be out of date.