
Open Semantic Search - based2
http://www.opensemanticsearch.org/
======
nl
There are a lot of claims on the front page, but digging into the docs I'm not
seeing much beyond Solr?

The textmining page[1] talks about how to search (?) and generate word clouds
(i.e., words with counts, which is core Solr faceted search functionality).
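To make that concrete: a word cloud is essentially a facet request over a text field. Below is a minimal sketch that only builds the request URL and does not contact a server. The core name `docs`, the field name `content_txt`, and the helper `word_cloud_query` are made up for illustration; the `facet.*` parameters are standard Solr faceting parameters.

```python
from urllib.parse import urlencode

def word_cloud_query(base_url, field, limit=50):
    """Build a Solr select URL that asks for term counts on `field`."""
    params = {
        "q": "*:*",           # match every document
        "rows": 0,            # return no documents, only facet counts
        "facet": "true",
        "facet.field": field,
        "facet.limit": limit,  # top-N terms, i.e. the "cloud"
        "facet.mincount": 1,
        "wt": "json",
    }
    return base_url + "/select?" + urlencode(params)

# Hypothetical core and field names, not taken from Open Semantic Search.
url = word_cloud_query("http://localhost:8983/solr/docs", "content_txt")
print(url)
```

The response to such a request would carry term/count pairs, which is all a word cloud needs.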

But it links to a page on Named Entities[2] which sounds promising! But in
that case it turns out you have to set up the named entities yourself (i.e., no
NLP tools to find them) and then you get... Solr facets.

I think there are some promising ideas here, but right now I think it makes
the easy things in Solr slightly more complex and doesn't help with the hard
things.

[1]
[http://www.opensemanticsearch.org/doc/analyze/textmining](http://www.opensemanticsearch.org/doc/analyze/textmining)

[2]
[http://www.opensemanticsearch.org/doc/analyze/textmining](http://www.opensemanticsearch.org/doc/analyze/textmining)

~~~
mark_l_watson
You might be right, I have only taken a quick look. Still useful if the
implementation is easy to hack. There are so many NER (named entity
recognition) libraries, including Apache OpenNLP, that I am surprised that NER
was not rolled in by default. I have my own NER as part of kbsportal.com

~~~
nl
Yes, it seemed like something of a no-brainer to build an NLP enrichment
pipeline before ingestion into Solr.

I assumed that was what it was and was interested because it is something I
suspect a lot of people have built a lot of times (I've built it at least
twice).
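For anyone who hasn't built one, the shape of such an enrichment step is roughly the sketch below. It is illustrative only: the gazetteer lookup stands in for a real NER library, and the names `GAZETTEER` and `enrich` and the `_ss` field suffix (a multivalued-string dynamic field in Solr's default configset) are assumptions, not anything Open Semantic Search ships.

```python
import json
import re

# Toy gazetteer standing in for a real NER model (assumption for illustration).
GAZETTEER = {
    "person": ["Ada Lovelace", "Alan Turing"],
    "organization": ["Apache Software Foundation"],
}

def enrich(doc):
    """Return a copy of `doc` with one multivalued entity field per type."""
    enriched = dict(doc)
    for entity_type, names in GAZETTEER.items():
        # Keep the names that occur as whole phrases in the text.
        found = [n for n in names
                 if re.search(r"\b" + re.escape(n) + r"\b", doc["text"])]
        if found:
            # e.g. "person_ss": field name is hypothetical
            enriched[entity_type + "_ss"] = found
    return enriched

doc = {"id": "1",
       "text": "Alan Turing never worked at the Apache Software Foundation."}
payload = json.dumps([enrich(doc)])  # JSON list of documents, ready to index
print(payload)
```

Once the entity fields are on the document, faceting on them is ordinary Solr; the hard part is whatever replaces the gazetteer.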

~~~
chatman
Lucidworks' Fusion is such a product. It has a lot of goodness built into it
to enrich the Solr indexing and querying experiences. For example, query
pipelines, connectors for importing from almost all imaginable datasources,
and indexing pipelines that enrich the documents, including NER/NLP etc.

------
matt4077
That website could use some structured data itself. I feel lost in a vaguely
text-mining associated word cloud.

------
fsiefken
This is great, now I can convert my OSX DevonThink archive with an open source
solution and run it on linux/bsd.

------
chatman
This is based on Apache Solr, which is awesome!

~~~
donretag
Elasticsearch is aligning itself with the logging/analytics crowd. Not much
focus on search itself. Many doing NLP are using Solr.

There really is no difference at the Lucene level.

