

Thank you Microsoft, and so long - aliostad
http://byterot.blogspot.com/2013/12/thank-you-microsoft-and-so-long.html

======
endianswap
I feel like developers who haven't worked with medium-sized data don't know
where to draw the "big data" line. You see this with companies overspending
for complicated "big data" solutions when they only have something on the order
of billions of rows and nothing more. I deal with many billion row tables in
the products, and SQL Server still works wonderfully there. Throw 1TB of RAM
and a stack of Fusion IO drives on a pair of servers and I'm confident the
pair (for mirroring obviously) and a DBA can easily handle the needs of any
company that thinks they have "big data", save the current big data pioneers
(e.g. Google, Facebook) who are only spending time on these problems because
throwing more money (read: hardware) was hitting diminishing returns. In my
specific case, we have billions of rows stored in SQL tables whose schemas
(including PK and indices) and queries were built by server engineers without
any explicit DBA experience. If we can manage to store our data quite naively
in SQL Server by just putting it on a pair of superboxes, I find it hard to
believe this solution wouldn't apply to all but a handful of tech companies.

~~~
belluchan
Big data doesn't have to be "big". I'm not going to store statistical
information, information that has at most a value and a date and maybe a key
or two, and is updated multiple times per second per user in an SQL database.
It doesn't fit in the table row column paradigm.

For me at least, big data is more about a form of non-relational data that
updates very quickly, doesn't need joins, needs high availability and can be
processed later to derive value out of it.

~~~
endianswap
We do exactly this, with billions of rows each day of statistical data right
into MSSQL server instances. For data like this, the only difference is that
we partition the data into daily tables and provide views into the data as
necessary.

What experience or expertise can you share as to why this doesn't work?

~~~
jules
What's the reason for partitioning into daily tables? Why not add a day column
and index it on that column?

~~~
RyanZAG
The index would become bigger than RAM. Removing old data would be more difficult.
They likely only use today's data. This is probably a trading data system of
some kind.
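
To make the daily-table idea concrete, here is a minimal sketch. It uses SQLite from Python purely for illustration (the thread is about SQL Server), and the table names, columns and helper functions are all hypothetical. It only shows the shape of the approach: one table per day, a view spanning the days, and a cheap drop for old data.

```python
import sqlite3

def daily_table(day: str) -> str:
    # Hypothetical naming scheme: one table per calendar day, e.g. stats_20131201.
    # The day string is assumed to be trusted input in YYYY-MM-DD form.
    return "stats_" + day.replace("-", "")

def insert_stat(conn, day, user_id, value):
    table = daily_table(day)
    conn.execute(f"CREATE TABLE IF NOT EXISTS {table} (user_id INTEGER, value REAL)")
    conn.execute(f"INSERT INTO {table} VALUES (?, ?)", (user_id, value))

def rebuild_view(conn, days):
    # A view unions the daily tables so a query can span several days when needed.
    conn.execute("DROP VIEW IF EXISTS stats_all")
    selects = [
        f"SELECT '{d}' AS day, user_id, value FROM {daily_table(d)}" for d in days
    ]
    conn.execute("CREATE VIEW stats_all AS " + " UNION ALL ".join(selects))

def drop_day(conn, day):
    # Removing old data is a DROP TABLE instead of a large DELETE on one huge table.
    conn.execute(f"DROP TABLE IF EXISTS {daily_table(day)}")

conn = sqlite3.connect(":memory:")
insert_stat(conn, "2013-12-01", 1, 0.5)
insert_stat(conn, "2013-12-02", 1, 0.7)
insert_stat(conn, "2013-12-02", 2, 0.9)
rebuild_view(conn, ["2013-12-01", "2013-12-02"])
total_rows = conn.execute("SELECT COUNT(*) FROM stats_all").fetchone()[0]
```

Each day's table (and any index on it) stays small, which is the trade-off being described against a single table with an indexed day column.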

------
ripberge
Disagree with almost everything this guy wrote.

99% of startups and in house business apps will never need to scale at a level
that an RDBMS can't support. We should all be so lucky to have the success
that brings these horizontal scalability problems.

Also, I hope he lets us know what kind of incredible data he ends up mining
from his mouse movement logs. Surely every website and app on planet Earth has
completely exhausted basic solutions like A/B testing and usability testing to
make their apps better. The only solution now is to embark on a data mining
expedition so that artificial intelligence will tell us how to make better
software.

~~~
crazygringo
Came here to say the same thing.

A bunch of statements that have little or nothing to do with the requirements
for 99% of projects, then somehow links this to being the end of Microsoft?
Wait, did I miss a step? And then he concludes with "I will carry on writing
C#, ASP.NET Web API and read or write from SQL Server".

Totally bizarre.

------
mrpickles
I agree with what you said. I worked with .net since the 1.0 days and after
years of staunch support, I got a job last year working with python/javascript
on a unix stack and kissed microsoft technologies goodbye.

You're absolutely right about Microsoft being behind the curve. There's a
pattern of denial when it comes to new technologies, and a total ineptitude at
spotting new trends and adapting to them. I watched with envy for years as all
the rails kids played with their new toys until Microsoft got its shit
together and came up with an MVC solution.

The whole world is using flash? Let's come up with a knock-off 5 years too
late. Interactivity is a big deal? Let's try to convince them that webforms is
MVC until it's too late. Who needs javascript anyways? Distributed software you
say? Let's not jump on the REST bandwagon, let's make WSDL defined web services
and make a shitty API on top! Why have an ORM solution when putting all of
your business logic in SQL is SO EASY with microsoft? See, I can drag and drop
a database connection from Visual Studio!

I love C# as much as the next person. It's a great language. SQL Server is a
pretty good relational database. But if you choose to devote all of your
energy to Microsoft development and don't learn anything else, you have nobody
to blame except yourself when you find your skill-set behind the curve.

.Net developers would never agree about the big-data part, because to be
honest they haven't invested time into understanding what data science is.
There are so many IT shops developing on the Microsoft stack that make less
than say 50 million a year that have tons of data (billions of records they
say!). Maybe they make CRM software, or school software, or an inventory
system for a small grocery store. They have tons of test scores, purchase
records, waste numbers, demographic info, yet if you ask them to develop
something that gives you insight into what factors influence student
performance, or to visualize purchasing trends, or to predict sales numbers so
that the company doesn't over purchase perishables, you'll get a blank stare.
The idea that data can be transformed into useful information, and that this
isn't simply a matter of CRUD hasn't occurred to them. They probably don't
even replicate their db for their half-assed attempt at reporting, so they
wouldn't begin to understand what something like hadoop even does.

You simply CANNOT explain to a .net zealot what big data is. They just don't
get it. "It's stored in the database, and it's a few terabytes...how big does
it need to get? We aren't Google after all!"

~~~
stirno
Your statement is an incredible generalization of a large group of developers.
It may be your experience, but I have to counter it with my own.

While the Windows/.NET stacks may not provide as many pre-built solutions to
handle 'big data' (a term I'm growing to hate), I have worked with tons of
competent developers who have been able to build solutions to answer exactly
the kinds of questions you propose.

I've seen it done with SQL Server Reporting Services, Analysis Services and
even straight custom C# code. I've seen it done with Hadoop + streaming API,
StreamInsight (CEP rather than post processing) and custom Workflows in WF.

As for your other statements that are just built to jump on the 'hate
Microsoft' bandwagon, just because a single large company doesn't iterate fast
enough to keep up with the rest of the OSS landscape doesn't mean that .NET
didn't have options available to it. Nancy provides great REST capabilities
and has existed since 2010. Devs were building REST services with Microsoft's
ASP.NET MVC during the same period. Others were using WCF REST (not great, but
usable) at the same point as well. WCF Data Services was another option.

ORMs.. There's the ever-expanding Entity Framework (which I don't love),
NHibernate and many many others.

Let's not just jump in and bash MS because you've only worked with bad
developers using their stuff.

------
joshuaellinger
I'm in a similar boat except that I think I can keep the good (C# / .Net) and
leave the bad (Microsoft Analysis Server).

We are looking at scaled DB backend using column stores, multiple compute
instances fed through message queues running on Amazon or private cloud, and
a mostly Javascript / HTML5 front-end. It is all tied together with C# and so
far it doesn't look like it will be too bad.

Not surprisingly, the stickiest part of the puzzle is... Microsoft Excel,
specifically Pivot Tables. Weird but true.

~~~
michaelcullina
Have you looked into the free PowerPivot add-in for Excel from Microsoft?

------
smnrchrds
Sometimes when I talk to CS students at my university, I feel like we are
living in parallel universes. I live in one where Python, Ruby and JavaScript
are the most important languages, basic Unix skills are a must-have, etc. In
their world, the whole programming landscape is divided in two sides: .NET and
Java. Everything that doesn't fall in either of those categories is a toy.

I thought it was just the atmosphere of my university, or maybe realities of
the job market in my country[1]. But reading this made me wonder if there are
really two parallel universes in computer-land.

[1] Based on the name of OP, I am 90% sure we have the same nationality.

------
Aloha
This guy makes the assumption that all problems are big problems.

Most problems are NOT big problems.

~~~
gaius
It's a cargo cult thing - people think "if I do like Google, I will be
successful like Google!"

Data is getting bigger, sure. Movies in HD, umpteen-megapixel cameras, yadda
yadda in the consumer space, but hardware is getting bigger too, for the same
price. You can easily buy a terabyte drive now. My first HD was 20M and I
thought I'd never need anywhere near that much!

But in the corporate space - where is this data actually coming from? If you
have a company selling widgets, then by how many orders of magnitude do your
sales have to increase before you have to worry about "big data"? Most
companies could probably handle a tenfold increase in sales on their existing
systems, even if those systems haven't been updated in a few years. Bread and
butter software work is not going to change in the foreseeable future, it'll
just have a different set of buzzwords.

~~~
Aloha
I see cargo cults all over, it's my favorite way to explain the problem of
people conflating causation and outcome.

In business I see lots of folks who do this: "Well, if I do what XYZ does,
I'll be successful just like they are!" So they do what XYZ does without
understanding the WHY part of what XYZ does, and invariably either fail, or
succeed at making a very poor copy.

------
zequel
I agree there's an increasing demand for big data but I think the OP is
overreacting. I think there's going to be a big demand for internal LOB apps,
mobile apps etc for a long time. Not everything is big nor needs to scale.
Can't hurt to learn unix or jvm languages though.

~~~
aliostad
I am mainly talking about server technologies. So yes, client technologies do
not need to scale. But if the when even LOB apps need to work wit

As for "Can't hurt": not quite true. I am shifting focus, so I am spending
time that I could be spending on learning, mastering, blogging and speaking
about Microsoft technologies. If I am wrong, it will hurt me.

------
petepete
> The fight between Silverlight/XAML vs. Javascript took so many years.

There was a fight? I must've missed it.

~~~
_random_
Yes there was, people refused to use a crappy script language but now they
pretty much have to :(.

------
MichaelGG
So his point is that databases are going to go away (odd, considering even
Google went back to a nice ACIDy RDBMS for AdWords), and Microsoft is just
going to ignore any sort of scaling technology and disappear. That's certainly
an odd perspective.

MS has two problems with "big data" (and let's just pretend lots of people
have big data problems - I've seen people deploy 100-node Hadoop clusters to
deal with 30M rows/month of data because they think Hadoop is some magic
sauce.)

First, they want to squeeze as much money out of enterprise customers as
possible. Look at the new SQL Server pricing and limitations, where the SQL
team is doing things they promised they wouldn't and previously mocked Oracle
for doing (charging by core). Their in-memory solution (Hekaton) comes a bit
late, and only for Enterprise edition.

Second, they generally want to deliver solutions that the majority of their
customers can figure out. Again, look at Hekaton. From what I've read, it'll
have extremely high compatibility with T-SQL. Shipping a limited release that
only had a small subset of SQL, even though it'd be useful for certain apps,
probably was never a serious consideration. Hell, look at C# and how their
customers are begging them to please not innovate too much, since learning is
hard.

There's also the Azure push. SQL Azure has federations, making it easy to
shard a traditional SQL schema across many physical instances. They also
promote Hadoop on Azure. (As I understand, Hadoop's a dead-end; even inside
Google, MapReduce was toasted by Dremel, right?)

If Microsoft just added Hadoop-style functionality to one of their server
products, it would not be friendly. People capable of writing an OK SQL query
for a report can't necessarily format that same query in an efficient map-
reduce style.
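
As a rough illustration of that gap, here is a sketch of one simple report query rewritten by hand in map-reduce style. It is plain Python, not Hadoop's actual API; the sample rows and the function names (`map_phase`, `shuffle`, `reduce_phase`) are hypothetical.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, amount).
rows = [("east", 10), ("west", 5), ("east", 7), ("west", 1)]

# The report as SQL would be a one-liner:
#   SELECT region, SUM(amount) FROM sales GROUP BY region

def map_phase(row):
    # Emit one (key, value) pair per input row.
    region, amount = row
    yield region, amount

def shuffle(pairs):
    # Group all emitted values by key, as the framework would between phases.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    # Collapse each key's values into a single result.
    return key, sum(values)

pairs = [kv for row in rows for kv in map_phase(row)]
totals = dict(reduce_phase(k, vs) for k, vs in shuffle(pairs).items())
```

Even this trivial GROUP BY has to be split into three explicit steps, and joins or multi-stage aggregates need considerably more care.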

And really, since this need is far less than it's played out to be, MS is
probably just fine pushing and profiting off their traditional solutions.
Azure helps them secure a few leading-edge needs, and eventually they'll roll
out an easy-to-use commercial solution.

------
riyadparvez
I don't agree on the big data part. But I certainly agree about not innovating
and not having a solid plan. MS doesn't have any plan, their biggest problem is
they don't like to innovate and instead just love to jump on the bandwagon of
any trend. I mean what's the point of Silverlight? Adobe Flash already
dominated the market. Instead of focusing on HTML5, they just built another
thing like Flash.

MS needs innovation and a solid plan, not jumping on the bandwagon of the next
hot topic.

------
apapli
I'm a bit confused by this statement:

"Cannot say the same thing for middleware technologies, such as BizTalk or
NServiceBus. Databases? Out of question."

I'm curious about his view that middleware is not here to stay. Surely the
multiple vendor cloud trend that enterprises are following (Google +
salesforce + Microsoft + oracle) - because no one vendor does everything -
means middleware is going to be more prevalent in the future.

Thoughts?

~~~
aliostad
Author here. The point is that middleware not built from the ground up to scale
horizontally is bound to die. BizTalk uses SQL Server for its storage and as such
cannot be horizontally scaled. End of story. NServiceBus... well, let's not go
there :)

~~~
kstenson
NServiceBus is a very different technology than BizTalk, I wouldn't call it
middleware.

It is basically a message-based, event-driven transport layer. By its very nature
it's extremely easy to scale horizontally.

~~~
aliostad
It is about High Availability. Have you ever tried to use a central broker in
NSB? The best HA you get is clustering 2 machines. And that is not HA.

~~~
kstenson
That's because NServiceBus is a Service Bus and not a broker:
http://www.udidahan.com/2011/03/24/bus-and-broker-pubsub-differences/

For HA you simply run a distributor process for each logical event on a Windows
cluster and add as many worker nodes as you see fit.

Because NServiceBus uses the store-and-forward pattern, if an NServiceBus
process/machine hosting it goes down, you are still guaranteed the eventual
delivery of messages when the process/machine is resumed.
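
For anyone unfamiliar with store and forward, here is a minimal sketch of the general pattern. It is not NServiceBus code; the class names are hypothetical and the outbox lives in memory, whereas a real implementation would persist it durably so messages survive a restart.

```python
from collections import deque

class StoreAndForwardSender:
    """Stores each message locally first, then forwards it when the receiver is reachable."""

    def __init__(self, deliver):
        self.outbox = deque()   # in-memory here; assumed durable in a real system
        self.deliver = deliver  # callable that raises ConnectionError on failure

    def send(self, message):
        self.outbox.append(message)  # store first
        self.flush()                 # then try to forward

    def flush(self):
        while self.outbox:
            try:
                self.deliver(self.outbox[0])
            except ConnectionError:
                return  # receiver is down: keep the message and retry later
            self.outbox.popleft()  # remove only after a successful delivery

class FlakyReceiver:
    """Hypothetical receiver that can be switched between down and up."""

    def __init__(self):
        self.up = False
        self.received = []

    def __call__(self, message):
        if not self.up:
            raise ConnectionError("receiver down")
        self.received.append(message)

receiver = FlakyReceiver()
sender = StoreAndForwardSender(receiver)
sender.send("order-1")
sender.send("order-2")
pending_while_down = len(sender.outbox)  # both messages are held locally

receiver.up = True
sender.flush()  # messages are delivered in order once the receiver is back
```

The sender never blocks on the receiver being available, which is why the outage only delays delivery.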

------
jimbobimbo
It's like Microsoft can't catch a break: if they develop something that
competes with other solutions - it's NIH syndrome; if they don't - they're
behind the curve.

Fun fact is that DevDiv produces tons of things that allow .NET devs to use
existing OSS solutions for things that are not done by Microsoft themselves.

~~~
runjake
I think it comes down to adoption rates and general influence, and Microsoft
isn't meeting those goals in many people's minds. They're doing great stuff with
Javascript, Node.js, Python, Azure, and all that other jazz, but I view that
as "me too" stuff.

