
Apache releases Lucene Core 4.0 and Solr 4.0 - andrevoget
http://lucene.apache.org/
======
shadowmint
If you've never heard of Lucene and are wondering why you should care, the IBM
quick start guide is a decent place to get to know what it is and what you can
do with it: [http://www.ibm.com/developerworks/java/library/os-apache-luc...](http://www.ibm.com/developerworks/java/library/os-apache-lucenesearch/)

Basically, it's a search engine; if you need to run one of those yourself, it's
really neat.

~~~
encoderer
Lucene + Solr is also a NoSQL database. Not the best choice for every
circumstance but then again, nothing really is.

~~~
mattdeboard
Indeed. But its capabilities don't really take it above what you'd get out of
a normal relational DB, or even a more traditional key/value store.

------
gfodor
Maybe it's just me, but over the years Solr and Lucene have become an
impenetrable project. When I first started playing with Lucene many years ago
it was very simple, transparent code backed by solid and understandable
algorithms and data structures. Now of course, it was limited.

But now Solr has absorbed Lucene and it seems to do everything including the
kitchen sink. In my experience, the abstractions in Solr are mind-bogglingly
complicated if you want to extend its capabilities, because it does _so_ many
things.

I have yet to look at 4.0 but I hope they have removed as much as they have
added. I think if the focus for the next release of Solr was to remove lots of
unused, poorly factored, or complicated features it would be a huge benefit to
the project.

~~~
gojomo
Yes, a bit like Java itself, I see it as trending towards being an "Aircraft
Supercarrier Battle Fleet" of functionality. (See also: the Hadoop ecosystem.)

But, it's free, powerful, and supported by a giant community... so once you
get the hang of picking and choosing the parts you need, ignoring the rest
until needed, you can get impressive results with minimal effort.

------
dermatthias
Last time I checked, Solr and Nutch (the web crawler) had really bad
documentation: just some wiki pages slapped together, mostly outdated.

Any improvements on this front?

~~~
iankp
Nope, you're spot-on. The Solr Wiki reads like it expects you to be the person
who wrote the entire system.

------
simonw
Anyone know what the most exciting features are since the last major stable
release?

~~~
donretag
Lucid Imagination has an annotated version of the “Release Highlights” for
Lucene/Solr 4.0:

[http://searchhub.org/dev/2012/10/12/apache-solr-and-lucene-4...](http://searchhub.org/dev/2012/10/12/apache-solr-and-lucene-4-0-0-released/)

Spellchecking built into the index instead of needing an ancillary one is a
big feature, IMHO.

Andrzej Białecki, Robert Muir, and Grant Ingersoll will be presenting a paper
on the Lucene 4 architecture: <http://opensearchlab.otago.ac.nz/paper_10.pdf>

~~~
snapbug
Will be? Already have, a couple of months ago.

------
awj
The spatial enhancements are pretty exciting. You can support a lot of
interesting things when searches can be polygon queries against polygon data.
WKT is a pretty fat format, though.

~~~
blutonium
Spatial is great! What's better than WKT though? WKB?

~~~
awj
Yeah, that's what I was thinking. Either way you (potentially) have a lot of
data to push around, but encoding things as text that could largely be
fixed-width binary floats really bloats them up.
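
For a rough sense of the difference, here is a minimal Python sketch that
encodes the same polygon both ways and compares sizes. The ring coordinates
are made up, and `to_wkt` / `to_wkb` are hypothetical helpers written for this
comparison (they are not part of Lucene, Solr, or any library). The binary
side assumes the usual little-endian WKB polygon layout with 64-bit doubles.

```python
import struct

# Made-up closed ring of (lon, lat) pairs; first and last point are the same.
RING = [
    (-122.4194155, 37.7749295),
    (-122.4182155, 37.7749295),
    (-122.4182155, 37.7761295),
    (-122.4194155, 37.7761295),
    (-122.4194155, 37.7749295),
]


def to_wkt(ring):
    """Encode a single-ring polygon as WKT text."""
    points = ", ".join(f"{x!r} {y!r}" for x, y in ring)
    return f"POLYGON(({points}))"


def to_wkb(ring):
    """Encode a single-ring polygon as little-endian WKB bytes."""
    # Header: byte order (1 = little-endian), geometry type (3 = polygon),
    # and the number of rings.
    out = struct.pack("<BII", 1, 3, 1)
    # Ring: point count, then each point as two 64-bit doubles.
    out += struct.pack("<I", len(ring))
    for x, y in ring:
        out += struct.pack("<dd", x, y)
    return out


if __name__ == "__main__":
    print(len(to_wkt(RING)), "bytes as WKT text")
    print(len(to_wkb(RING)), "bytes as WKB")
```

The binary form costs a fixed 16 bytes per point regardless of precision,
while the text form grows with the number of digits printed, so the actual
ratio depends on the data.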

------
ajtaylor
I'm really excited about all the new features in 4.0. The more I read, the
more I can't wait to try it out on my dev server. The new features that are
most interesting to me: spatial search improvements, atomic updates, custom
scoring options, pivot faceting, speed improvements and SolrCloud. All of
these have the potential to make searches on $work's website faster and much
more relevant.

------
dotborg
The API is cool and dandy, but I think the project should go in another
direction. Lucene's focus should be REAL support for languages; the lack of
proper dictionaries and the lame stemmers make it hard to use.

For me Lucene was always a NoSQL database, and the rest of the real search
stuff I had to implement myself (not to mention the Solr mess).

It's fun and cool to work on some Java code architecture, but the real need
lies somewhere else.

------
JackParsons
SolrCloud: automated peer-to-peer scaling. Very nicely done.

