
Real-Time Full-Text Search with Luwak and Samza - nehanarkhede
http://blog.confluent.io/2015/04/13/real-time-full-text-search-with-luwak-and-samza/
======
felipesabino
Samza's author really opened my mind to how useful stream processing is for
high-performance data processing [1], and I think Samza's only bummer (for
me, personally, today) is its lack of support for non-JVM languages [2]

[1]
[https://www.youtube.com/watch?v=fU9hR3kiOK0](https://www.youtube.com/watch?v=fU9hR3kiOK0)

[2]
[http://samza.apache.org/learn/documentation/0.7.0/comparison...](http://samza.apache.org/learn/documentation/0.7.0/comparisons/storm.html)

~~~
d3fmacro
Apache Storm supports non-JVM languages
[https://storm.apache.org/](https://storm.apache.org/)

~~~
felipesabino
Yep! Also, Spark added a Python API for Spark Streaming as of v1.2 [1]

[1] [https://spark.apache.org/docs/1.2.0/streaming-programming-gu...](https://spark.apache.org/docs/1.2.0/streaming-programming-guide.html#overview)

------
vosper
I think this is the first post I've seen about Samza from a non-LinkedIn team.
I'd love to hear any details about the Samza experience - it seems like it
should be the logical choice for Kafka users, but there's not much out there
about it.

~~~
rb2k_
I don't think Confluent really counts as a "non-LinkedIn team" :)

"Jay is co-founder and CEO at Confluent. Prior to Confluent, Jay Kreps was the
initial developer on several open source projects, including Apache Kafka,
Apache Samza, Voldemort. He was the lead architect for data infrastructure at
LinkedIn."

(Martin Kleppmann also has a LinkedIn background)

~~~
vosper
Ahh, thanks - I checked that the speakers weren't currently working for
LinkedIn, but I didn't look further into their backgrounds.

Oh, well. I still hope to one day read about someone else using Samza in
production.

~~~
martinkl
Here are a few production users:
[https://cwiki.apache.org/confluence/display/SAMZA/Powered+By](https://cwiki.apache.org/confluence/display/SAMZA/Powered+By)

The Metamarkets team wrote a nice post on their use of Samza a few days ago:
[https://metamarkets.com/2015/simplicity-stability-and-transp...](https://metamarkets.com/2015/simplicity-stability-and-transparency-how-samza-makes-data-integration-a-breeze/)

~~~
felipesabino
Interesting. I'd like to learn more about Metamarkets' transition to Samza,
since a year ago they were using Storm instead [1] [2]

Or maybe they did not transition and are actually using both, I don't know

[1]
[https://youtu.be/3Qb_2GGRz24?t=20m24s](https://youtu.be/3Qb_2GGRz24?t=20m24s)

[2] [https://storm.apache.org/documentation/Powered-By.html](https://storm.apache.org/documentation/Powered-By.html)

------
AznHisoka
This is long long long overdue in SOLR. The percolator feature is what has
made me stick with ElasticSearch for the past 3 years, and has contributed to
its increasing popularity over SOLR.

------
huskyr
A bit off-topic, but does anyone have suggestions for simple full-text search
engines? I basically want something for names, just a couple of thousand,
nothing fancy. Setting up something like ElasticSearch seems like overkill
(and is quite hungry for specs as well). I was thinking about simply hacking
something together with Redis and Python, but I suppose someone might have a
better solution.

~~~
ignoramous
Sorry if I come off as naive (as I don't really understand what a 'full text
search' is), but for a couple thousand names wouldn't grep with regex
wildcards suffice?
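
For what it's worth, the grep idea is only a few lines in Python too. This is
just a minimal sketch under assumptions: the `NAMES` list and the
`search_names` helper are made up for illustration, and it does a plain linear
regex scan with no index or ranking, so it is not real full-text search.

```python
import re

# Hypothetical sample data; the thread does not give any real names.
NAMES = ["Ada Lovelace", "Alan Turing", "Grace Hopper", "Linus Torvalds"]


def search_names(pattern, names):
    """Return the names matching a case-insensitive regex, roughly `grep -i -E`."""
    rx = re.compile(pattern, re.IGNORECASE)
    # Linear scan: every name is tested against the pattern on each query.
    return [name for name in names if rx.search(name)]


if __name__ == "__main__":
    print(search_names(r"^a", NAMES))      # names starting with "a" or "A"
    print(search_names(r"hop+er", NAMES))  # one or more "p" between "ho" and "er"
```

A scan like this should be fine for a couple of thousand entries; it only
starts to hurt once the list grows or you want fuzzy matching.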

~~~
huskyr
That doesn't sound like a bad suggestion at all. I think it might get bigger
over time, or I might have aliases for the names, and in the end, using just
regex, I'll probably hit a ceiling sooner rather than later.

------
gearhart
Great write-up.

If you're interested in this and you live in (or would like to live in)
London, we're hiring. Email's in my profile.

------
phpnode
Off-topic: I've seen a number of blog posts using this hand-drawn diagram
style recently. Does anyone happen to know how it's done? (Answers other than
"by hand" appreciated.)

~~~
spdustin
Looks like Paper, by FiftyThree:

[https://appsto.re/us/KfqkE.i](https://appsto.re/us/KfqkE.i)

~~~
martinkl
Correct. If you're going to try it, I recommend getting a stylus for your
iPad, since handwriting with your fingertip doesn't work very well.

~~~
spdustin
Pencil (by FiftyThree) works great, of course, and unlocks the additional
features in the Paper app. I'm actually quite impressed by how well it works.

[http://www.fiftythree.com/pencil](http://www.fiftythree.com/pencil)

