
Cloudera and Hortonworks merge - moritzplassnig
https://www.cnbc.com/2018/10/03/cloudera-and-hortonworks-announce-all-stock-merger.html
======
alexnewman
I was an early employee at Cloudera, am a Hadoop contributor and think this
entire market is garbage. Basically the big data field went off the rails and
this is Cloudera's way of trying to remain relevant. It's hard to point to a
product that came after my tenure (I was on the team that originally made the
POS called Cloudera Manager) that's really used by anyone at scale. Cloud is
displacing all of these tools, and Cloudera never got the clouds to play ball.

~~~
colin_mccabe
> It's hard to point to a product that came after my tenure (I was on the team
> that originally made the POS called Cloudera Manager) that's really used by
> anyone at scale.

There are tons of them. Spark, Hue, Sentry, Kafka, the list goes on.

~~~
dswalter
I think the parent comment was saying that none of these were primarily made
by Cloudera.

~~~
colin_mccabe
Hue is a Cloudera project, not even an Apache project. Sentry is an Apache
project but primarily developed by Cloudera.

~~~
alexnewman
Hue was made before Cloudera Manager. I know because when I first wrote
Cloudera Manager, it was based on Hue. I think if Sentry and Hue are all
Cloudera has brought to the table...

------
capkutay
Good move for both companies. The surplus of 'enterprise hadoop' companies was
created by a mixture of hype and a peak in VC investment in open source.

The fact that Cloudera, Hortonworks, MapR were all founded and raised $100m+
around the same time was a bit superfluous for the whole market.

Hard to say where this leaves MapR now. They seem to be the odd man out in
terms of growth and adoption.

~~~
jamesblonde
We have a startup, Logical Clocks (www.logicalclocks.com), that just raised
money to sell our data-science next-generation Hadoop stack. It has a new
version of HDFS called HopsFS (NVMe storage, distributed metadata) and support
for GPUs in YARN. And distributed TensorFlow, Spark, Airflow, Flink.

So, what does that mean for us? I am shellshocked. I expect prices to increase
(good for us). What else should we expect?

~~~
agibsonccc
Both Horton and Cloudera require a huge number of partner resellers and
consultants. What horton and cloudera have isn't just software, but brand, an
install base, and the ability to back it up.

They merged partially because they were both being cannibalized by services
revenue they couldn't get rid of. Now they are struggling to move to the cloud
(see Atlas now)

I'm not sure yet another hadoop distro with a bunch of 1 off tooling that is
supposedly faster is the answer. Why is all that stuff even needed?

~~~
jamesblonde
Adam, what about Java for deep learning? Is that needed?

~~~
agibsonccc
According to my customers and users yes? Granted, we do a lot more than just
that though.

Dl4j itself has a decent sized user base. Ranging likely from your phone maker
to your bank and retail store.

We have our own software distro too which is why I'm commenting on this. We
don't try to boil the ocean with a bunch of tech though.

There's a whole new crop of companies focusing on solving bits of the ML
problem well rather than trying to do storage and god knows what else.

My point here about you guys is you're trying to compete in what is largely a
commodity market. People don't need all this stuff. Simplicity won here. It's
not about better tech.

You guys have the same pitch MapR does and largely the same problem: Better
tech is only part of the problem with adoption. You need customers, users, and
a clear business model when going to market.

Cloudera and Horton ran one playbook that at least somewhat worked (it got
them public) and now they can focus on competing with the cloud vendors, which
made the right decision and just made commonly used software easy to use.

~~~
jamesblonde
We don't. We are the only on-premise vendor with proper support for GPUs and
Python. We are a full data science stack, backed by Hadoop. We even have
Kubernetes for model serving. And Python in the cluster (with conda
environments). And we have customers and funding. Nothing like MapR. And none
of the legacy MapReduce crap.

~~~
agibsonccc
You bundle way more than they do and on top of that have your own file system
just like MapR does.

Your pitch is still about differentiated tech, not a large install base, a
differentiated business model, and something related to _people_ like a good
partner ecosystem.

Your pitch here requires tons of services.

People don't know how to use all of this stuff _especially_ on prem.

It takes more than just code to build a business.

I say this as someone who's been doing this since 2013. It's not easy.

~~~
jamesblonde
Ok, now you're changing your angle. Differentiated tech is what we are all
about - that is ok by me (for now).

If you want to train DNNs on a hundred GPUs today on-premise on TensorFlow,
come to us, we can do it. They can't.

~~~
agibsonccc
I don't think I changed my angle here? I'm still addressing the same point.
Tech doesn't matter. Simplicity does.

Even in our own product line, we only do a small subset of this. We don't even
require a cluster to run. We also work with tech that people use.

You are currently competing with horovod and kubeflow. eg: "competing with
free"

You need more than that to survive. Generally, that comes down to services.

~~~
ironchef
A slightly different wording is it comes to using the tech... which is either
through services or through adoption / operationalization (which... with
complexity... is very difficult)

~~~
agibsonccc
Exactly!

------
ameyamk
Good move! Would help consolidate Hadoop ecosystem. MapR would find it hard to
compete now.

Both of these companies are being challenged by a new crop of companies, the
Databricks and Confluents of the world.

------
francoisprunier
Interestingly, none of them mention Hadoop. It makes sense to merge from a
business point of view, and on a technical level it could finally mean more
work on polishing the gazillion tools they include, which is badly needed.

------
Tsarbomb
Can someone explain to me what the big draw was for Hortonworks or Cloudera?

Working as a lead in a small team that deals with a colossal amount of data
(human genomics), it was always easier for us to hand roll deployments with
terraform/ansible in either baremetal or OpenStack environments.

In the public clouds like AWS we are using the managed services like EMR.

The whole sales pitch I've always got from either hortonworks or Cloudera
always seemed more aimed at nontechnical stakeholders than the technical ones.
Am I wrong? Have I missed out on some cool stuff?

~~~
coredog64
You’re not wrong. It would be hard to sell CDH if you only engage with
technical stakeholders. I can imagine the pitch now...

“So what you get is really old versions of everything, plus some shaded jars
that will break if your classloader ever changes load order. As far as cloud
goes, we’ll give you this half-baked automation (Director) and we’ll also
constrain you from using any features of your cloud provider (availability
zones, load balancers, Amazon Linux 2). Finally, we require you to use
Oracle’s JDK but we won’t distribute it so that you’re indemnified.”

~~~
computronus
But you have _heard_ of Director :)

------
Thaxll
Hadoop was very "fancy" 5-6 years ago, what's the trend now? I guess managed
services from AWS / Google make Hadoop less useful?

~~~
threeseed
The managed data science services from AWS/Google are Hadoop/Spark.

~~~
medh2000
True. Someone in the cloud is managing these Hadoop stacks for you. You just
pay for usage of these services.

------
thinkmassive
> The two companies are committed to supporting existing offerings from the
> two companies for at least three years but will work on a "unity release" of
> software, drawing on technologies from both companies' portfolios, Reilly
> said.

I’m really intrigued to see whether Ambari or Cloudera Manager wins out in the
long run. This is unexpected but interesting.

(I worked at Hortonworks 2014-2016, no current affiliation)

~~~
jamesblonde
This is, of course, an anti open-source move. So, Cloudera Manager will win,
there. The stack will go largely proprietary, as this is a defensive move -
AWS are just killing them in the cloud. They need to prevent AWS just
repackaging the Hadoop core and reselling it - what better way than by making
your new platform AGPL-v3 or something, so you're still open-source, but AWS
can't just resell it as EMR.

------
xab9
That's interesting. I was at their Hungarian office for a job interview
(Hortonworks, a year or so ago); it was weird though. Haven't tried Cloudera,
but they too have an office in Budapest. I wonder how it will affect their
workforce in the region.

------
gnanesh
Looks like MapR has got something to say to people visiting their homepage,
"Two wrongs don't make a right... SEE WHY CLOUDERA AND HORTONWORKS CUSTOMERS
HAVE MOVED TO MAPR!"

------
docker_up
Lots of overlap between the two companies. I unfortunately expect a lot of
layoffs on the Horton side of things, which is disappointing because I know
quite a few people there.

~~~
batoure
I work for neither company but know people at both, and I think this is
highly inaccurate. Though there will be operational redundancy, both companies
make most of their revenue from support contracts and professional services,
since they provide a mostly FOSS product. Services and support are the two
most labor-intensive parts of the tech industry: every customer requires some
amount of labor. Additionally, the internal engineers paid to contribute to
the FOSS products are still needed, because both companies have been
benefiting from the contributions of both teams of engineers for years. Since
Horton has no closed-source products, it has very few redundant engineering
resources.

------
softliving
Hmmm, a merger in a slowing market: employees, beware of the trickery! This
happens, and then management thinks of redundancies to show the markets how
efficient they are... Is there a humane company which maybe does not pay much
but is secure enough and challenges us to rise together? Seems like a distant
dream!

------
SimplyUseless
This is not to eliminate threats from newcomers but to find an identity in
the new era of cloud.

------
nerpderp83
Yikes!

~~~
medh2000
I have used both the Cloudera stack and Hortonworks. I would say in my
experience that installation and management are much more straightforward with
the Hortonworks distribution. Both companies' stacks have similar Apache
projects and management software like Ambari and Cloudera Manager.

~~~
nerpderp83
It doesn't bode well for either of them. This is a pure contraction of the
markets they are trying to serve.

