

PlatformLayer adds Solr-as-a-service, 1 day after AWS CloudSearch launch - justinsb
http://blog.justinsb.com/blog/2012/04/14/platformlayer_solr_as_a_service/

======
gtani
While the CloudSearch config does look kind of clunky compared to Solr, it is
kind of FUDDY if you're implying that CloudSearch won't let you configure
these at all: tokenizers, stemmers, scoring, ISO Latin to ASCII mapping, stuff
like that.

[http://docs.amazonwebservices.com/cloudsearch/latest/develop...](http://docs.amazonwebservices.com/cloudsearch/latest/developerguide/stemmingopts.html)

~~~
justinsb
Thanks for pointing that out. What I've found with AWS is that it all looks
good until you want to do anything beyond the basics. Then you spend a lot of
time doing undifferentiated heavy lifting to get around the AWS limitations.
With Solr and Lucene, you can create whatever you want.

It reminds me of the whole "share-cropper" argument.

That said, the auto-sharding is pretty cool. We will probably have to wait for
the next version of Solr for comparable simplicity.

~~~
justinsb
I need to apologize: I hate it when people respond with assertions, rather
than facts; and I did that myself there. Sorry!

So, taking stemming as a concrete example:

* With Lucene/Solr, stemming is done with a pluggable class; you can write your rules as a series of exceptions, word by word, but really you'll want to write code for anything non-trivial. Here's an example for Norwegian: [http://e-mats.org/2009/05/modifying-a-lucene-snowball-stemme...](http://e-mats.org/2009/05/modifying-a-lucene-snowball-stemmer/)

AWS only allows word-by-word exceptions, and only allows 500KB. If you need
more than 500KB, it looks like it's more undifferentiated heavy lifting for you...

* With AWS, "some basic algorithmic stemming is always performed, such as removing plural suffixes". I think that means you can't turn it off. A lot more documentation is needed there, not least as to what these immutable rules are.

* Solr ships with a bunch of different languages built in; AWS appears only to support English (although I can't believe this is right!)
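
To make the difference between the two stemming styles concrete, here is a minimal Python sketch. It is only an illustration: it uses neither Solr's nor CloudSearch's actual API, and the names `make_exception_stemmer`, `suffix_stemmer` and `analyze`, as well as the toy suffix rules, are assumptions made up for this example.

```python
from typing import Callable, Dict, List

# Hypothetical illustration only; this is not Solr's or CloudSearch's API.

def make_exception_stemmer(exceptions: Dict[str, str]) -> Callable[[str], str]:
    """Word-by-word stemming: every mapping must be listed explicitly."""
    def stem(token: str) -> str:
        word = token.lower()
        # Words missing from the table pass through unchanged.
        return exceptions.get(word, word)
    return stem


def suffix_stemmer(token: str) -> str:
    """Stemming written as code: a few toy English suffix rules."""
    word = token.lower()
    # Longer suffixes are tried first, so "ies" wins over "s".
    for suffix, replacement in (("ies", "y"), ("ing", ""), ("s", "")):
        # Keep at least three characters of the original word.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: len(word) - len(suffix)] + replacement
    return word


def analyze(text: str, stem: Callable[[str], str]) -> List[str]:
    """Split on whitespace and run each token through the given stemmer."""
    return [stem(token) for token in text.split()]
```

With the table `{"running": "run", "mice": "mouse"}`, `analyze("Running mice jumping", ...)` gives `["run", "mouse", "jumping"]`: anything not listed is left alone, so the table grows with the vocabulary. The rule-based version handles unseen words such as "jumping" or "parties", but needs code to get irregular forms right.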

I'm sure AWS will fix some of these points eventually; but they might not
consider it worthwhile to do so, or may not do so on your schedule. With open
source, you always have an option.

And this is just stemming! With scoring functions, you're limited to the AWS
mini-language, rather than being able to write whatever you want, and so on.
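
As a rough sketch of what "write whatever you want" could mean for scoring, here is a ranking function expressed as ordinary Python code. Everything in it is assumed for illustration: the function name `custom_score`, the document fields `published_ts` and `votes`, and the 30-day decay constant. It is not taken from Solr or from the AWS expression language.

```python
import math
from typing import Mapping

# Hypothetical scoring function; field names and constants are assumptions.

def custom_score(text_relevance: float, doc: Mapping[str, float], now: float) -> float:
    """Combine text relevance with popularity and an exponential recency decay."""
    # Age of the document in days, clamped so future timestamps count as "now".
    age_days = max(0.0, (now - doc["published_ts"]) / 86400.0)
    # Recency falls off exponentially with a 30-day time constant.
    recency = math.exp(-age_days / 30.0)
    # log1p dampens the effect of very large vote counts.
    popularity = math.log1p(doc.get("votes", 0.0))
    return text_relevance * (1.0 + popularity) * recency
```

Because this is plain code, it can branch, call helpers or read external data, which a restricted expression language may or may not permit.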

------
amirnathoo
"7 bullets vs 3 bullets. PlatformLayer is more than twice as good!"

'nough said. I'm with PlatformLayer.

------
justinsb
Which service should PlatformLayer add next?

