
Readings in Database Systems, 5th Edition (2015) - kediz
http://www.redbook.io/
======
pixelmonkey
Michael Stonebraker has an interesting set of conclusions in his assessment of
the MapReduce vendor market in 2015 from the "Dataflow" chapter here:

"- Just because Google thinks something is a good idea does not mean you
should adopt it.

- Disbelieve all marketing spin, and figure out what benefit any given
product actually has. This should be especially applied to performance claims.

- The community of programmers has a love affair with “the next shiny
object”. This is likely to create “churn” in your organization, as the
“half-life” of shiny objects may be quite short."

~~~
jrumbut
I think it's a bit of a shame that the MapReduce concept got the shiny object
treatment since I thought it was a nice pragmatic approach to a useful set of
problems that are faced all the time and often addressed with ad-hoc programs
that make a mess.

People always looked down on those that used Hadoop or somesuch for <1GB of
data, but while it wasn't needed from a technology perspective it gave a
structure to the project.

Now many places are back in the world of one-off scripts, and I think
something of value was lost (even if it was a little ridiculous to fire up a
cluster for something Excel or SQLite could handle).

~~~
throwaway_pdp09
> People always looked down on those that used Hadoop or somesuch for <1GB of
> data, but while it wasn't needed from a technology perspective it gave a
> structure to the project.

What 'structure'? Why is it so important that it makes it worthwhile firing up
a large, complex framework? I'm beyond baffled.

~~~
cbcoutinho
The same 'structure' that makes it easy to onboard new co-workers because
they've seen the same project 'structure' before. In that sense,
the bottleneck in an organization is getting people productive as fast as
possible, even if that means using a cleaver instead of a scalpel.

~~~
throwaway_pdp09
If all they can use is a massive cleaver (big data tools), and have no
experience with scalpels (small, sharp, cheap and fast data tools), IMO your
company has a serious, fundamental and systemic problem (no, let's call it
_failure_) towards employee experience, training and knowledge. Edit: and
resource management.

~~~
_jal
Seems to be a sort of inverse of the massive spreadsheets that run supply
chains on accretions of spaghetti-macros.

But, a tree chipper can serve as a paper shredder, and I imagine a lot of
shops in certain markets saw it as a sort of prestige asset around 5-8 years
back, when a bunch of companies started hiring data scientists for no apparent
rational reason.

(Not bashing data scientists or data companies. Just remembering the fad that
went around Bay Area companies a while ago.)

------
i0exception
This needs a [2015] tag.

The thing that makes the redbook special in my opinion is that the editors
have been able to apply their research to solve actual problems for paying
customers! You don't get to see enough of that in academia.

~~~
chmaynard
> This needs a [2015] tag.

That's debatable. It's a book, not a blog post.

~~~
asah
Technically it's a book but the structure is actually more like a series of
blog posts...

------
omginternets
Relatedly: I’ve been trying to wrap my head around MVCC (I’d like to write my
own implementation). Any recommendations for a thorough overview of the
subject?

~~~
oftenwrong
[https://vladmihalcea.com/how-does-mvcc-multi-version-concurrency-control-work/](https://vladmihalcea.com/how-does-mvcc-multi-version-concurrency-control-work/)

[http://www.interdb.jp/pg/pgsql05.html](http://www.interdb.jp/pg/pgsql05.html)

~~~
omginternets
Thanks! The second link is really excellent (as is the first, but I've already
read it).

I also found this paper in the refs, which seems really good. [0]

[0] [https://drkp.net/papers/ssi-vldb12.pdf](https://drkp.net/papers/ssi-vldb12.pdf)

