
Show HN: Minibatch – Python stream processing - miraculixx
https://github.com/omegaml/minibatch
======
miraculixx
Scenario: you really have a streaming use case, but you don't want to deal
with the complexity and overhead of some of the other streaming frameworks
like Apache Spark or Flink.

If so, minibatch is for you.

~~~
tudelo
Do you have a list of features? For example, which windowing types are
implemented? What about triggering?

~~~
miraculixx
Good feedback. I've updated the README.

------
woile
Interesting. I've started a project with a friend to build a simple toolkit for
stream processing:
[https://github.com/python-streaming/dam](https://github.com/python-streaming/dam)

I'm happy to see the streaming ecosystem getting bigger in Python; Java has
a lot of market share.

------
AndrewKemendo
kafka-python already does Python-native streaming. Why would I use this
instead?

------
ewhauser421
Why not Apache Beam? It doesn’t require Spark or Flink

~~~
miraculixx
Thanks for the question. Beam has a more complex execution model and, AFAIK,
also needs an executor environment like Spark to really parallelize
workloads. Given a MongoDB instance that all producers and consumers can
attach to, minibatch runs anywhere.

------
dennisy
Do you have any scale or performance metrics?

~~~
miraculixx
No benchmarks yet, unfortunately. While it should scale O(n) in the number of
streams, there will be limits to scaling a single stream because the stream
processing functions are executed synchronously by default (an enhancement is
pending).
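
To illustrate the single-stream limit, here is a minimal sketch. It is not
minibatch's API: `run_stream`, `size`, and `process` are hypothetical names,
and it assumes a simple count-based window. The processing function is called
synchronously, so no further events are consumed until it returns.

```python
from collections import deque
from typing import Callable, Iterable, List


def run_stream(events: Iterable[int], size: int,
               process: Callable[[List[int]], None]) -> int:
    """Hypothetical helper: emit fixed-size windows, process each in-line."""
    buffer: deque = deque()
    windows = 0
    for event in events:
        buffer.append(event)
        if len(buffer) == size:
            # Drain exactly one full window from the buffer.
            window = [buffer.popleft() for _ in range(size)]
            # Synchronous call: the loop blocks here until process() returns,
            # so a slow function delays every later window of this stream.
            process(window)
            windows += 1
    return windows


results: List[int] = []
n_windows = run_stream(range(10), size=3,
                       process=lambda w: results.append(sum(w)))
```

Separate streams could each run their own loop like this one, which is where
roughly linear scaling across streams would come from. Within one stream,
throughput is bounded by how fast `process` returns.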

------
kstrauser
It lost me at "for humans". I'm not even kidding: it's trite, demeaning to
every similar project (with the implication that everything else is obviously
for weirdos, not like you and me, amirite?), and shows a kind of naivete
(everything else is overly complicated, lol, and I have no idea why that would
be!).

It was momentarily cute the first time I saw it but that faded quickly. By the
100th "for humans" project, it had become a distinct code smell.

~~~
acrisci
I disagree. To me this term is a meaningful way to express the values of a
project. It means that the abstractions will be at the highest level possible
to provide the result. Whenever the library can make a decision for you, it
will attempt to do so. It is like you are dealing with a human intelligence
that is always guessing what you want and trying to give it to you. When I see
"for humans", I expect that the API will behave like a human. This is not
always desirable. Sometimes the highest level of abstraction is the wrong
level of abstraction. Sometimes you need to be making all those decisions
yourself, and you don't want the library to ever guess what your intention is.
The human-like APIs tend to work wonderfully until you need to optimize, have
security concerns, or figure out you're actually doing something novel. Then
you jump down a layer of abstraction and everything fits wonderfully again.
Sometimes you really are just dealing with a machine and telling the machine
what to do and whatever humanity is brought to the task is simply a
distraction.

~~~
kstrauser
You say you disagree, but then go on to spell out my position quite nicely.
"For humans" implies a certain level of abstraction that quite often tends to
be _too_ abstract in exactly the way you describe. "We don't bother you with
all the little details that the others make you handle!" frequently ignores
the fact that those others make you specify all that stuff for a reason.

For example: "Do you want this database to be AP or CP?" "Don't pester me with
the nitty gritty! I just want to store data." A "for humans" database that
quietly made the choice for you would be very bad news if it later turns out
to have chosen poorly for your own workload.

The first "for humans" thing I saw was Python's Requests library, and I think
it earned the title. Having just built a web scraper on top of raw httplib a
year earlier, I would have killed to have had Requests available. It's a great
example of a project with a decent track record of mostly setting good
defaults and letting devs concentrate on their parts of their projects. Since
then, I have seen very few "for humans" projects that weren't so abstract as
to become almost unusable. I mean, you could call MS Paint "Photoshop for
humans", but that doesn't make it so.

~~~
acrisci
Oh, OK. I thought you were making the point that "for humans" is a sort of
meaningless marketing term, but I see you understand the nuances of this.
Sometimes it is good to be "for humans", but you are saying that, in your
experience, most people who make the attempt fail.

The classic example, I think, is Microsoft Access (databases for humans),
which is great until it can't do the one thing you need, and then it doesn't
work anymore. And everyone needs a different one thing.

~~~
kstrauser
Ah, got it! I think that's a great example.

