
Pipelining – A Successful Data Processing Model - stuartaxelowen
http://blog.stuartowen.com/pipelining-a-successful-data-processing-model
======
thmcmahon
F# and R both have good ways of doing this. See
[http://cran.r-project.org/web/packages/magrittr/vignettes/ma...](http://cran.r-project.org/web/packages/magrittr/vignettes/magrittr.html)
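
For readers who have not used a pipe operator such as F#'s `|>` or magrittr's `%>%`, here is a rough Python sketch of the same idea. The `pipe` helper is a made-up name for illustration, not part of any of the libraries mentioned; it only assumes that each stage is a one-argument function.

```python
from functools import reduce

def pipe(value, *funcs):
    """Thread `value` through `funcs` left to right (hypothetical helper)."""
    return reduce(lambda acc, f: f(acc), funcs, value)

result = pipe(
    [3, 1, 2, 3],
    set,                              # drop duplicates
    sorted,                           # order ascending
    lambda xs: [x * 10 for x in xs],  # scale each element
)
print(result)  # [10, 20, 30]
```

Each stage receives the previous stage's output, so the data flow reads top to bottom instead of inside out.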

~~~
sebastianavina
R is one of those heavily misunderstood languages: its features are advanced,
yet so familiar to seasoned programmers that nobody bothers to explain them or
even write a blog post about them.

------
agibsonccc
Disclaimer: Committer on this library. We actually built canova[1] for
unstructured data. Docs aren't quite there yet but I'd love to float the
concept out there for people to get feedback on.

In ML we tend to call this vectorization rather than pipelining. Same idea
though.

Many libraries bind the ML to the feature vector transformations.
Our goal was to isolate this process into something composable.

General idea is file -> feature vector/matrix

[1]:
[https://github.com/deeplearning4j/Canova](https://github.com/deeplearning4j/Canova)
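
To make the "file -> feature vector/matrix" idea concrete, here is a minimal bag-of-words sketch in plain Python. It is not Canova's API; the function names `tokenize`, `build_vocabulary` and `vectorize` are invented for illustration, and in-memory strings stand in for file contents.

```python
from collections import Counter

def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()

def build_vocabulary(documents):
    """Map every distinct token to a column index, in sorted order."""
    vocab = sorted({tok for doc in documents for tok in tokenize(doc)})
    return {tok: i for i, tok in enumerate(vocab)}

def vectorize(text, vocabulary):
    """Turn one document into a vector of token counts."""
    counts = Counter(tokenize(text))
    vec = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:  # ignore tokens unseen at vocabulary time
            vec[vocabulary[tok]] = n
    return vec

# Strings stand in for the contents of two files.
docs = ["the cat sat", "the dog sat down"]
vocab = build_vocabulary(docs)
matrix = [vectorize(d, vocab) for d in docs]
print(matrix)  # [[1, 0, 0, 1, 1], [0, 1, 1, 1, 1]]
```

Keeping the vectorization step separate from any model means the same `matrix` could be handed to whichever learner comes next.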

------
misterdata
In fact, pipelining is supported in a lot of systems already. RethinkDB
([http://rethinkdb.com](http://rethinkdb.com)) is built around a query
language of chained data operations. Warp (OS X tool) provides a user
interface for pipelining and executes its queries on regular SQL databases
([https://pixelspark.nl/2015/warp-a-query-by-example-analysis-...](https://pixelspark.nl/2015/warp-a-query-by-example-analysis-tool-for-big-data)).

~~~
dougabug
Oracle has had pipelined table functions since version 9.0.

------
graffitici
Somebody rediscovered functional programming, or am I missing something?

------
visarga
You mean what we have already been doing for decades with head, tail, cat,
cut, grep, sort, uniq and awk (or inline perl)? Only problem is that it
doesn't validate types and has a weak error system.

After learning about functional programming, I was amazed at how advanced the
Unix command line is. Its composability is famous. To think that the common
Unix pipe is related to FP blows my mind ...

Another place where this pattern is widely used is in jQuery, where the pipe
is much smarter. Also, see Lazy.js for pipe-like lazy data processing.
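
As a rough illustration of pipe-like lazy processing, here is a Python generator sketch loosely modelled on `grep | cut | head`. The stage functions are hypothetical stand-ins named after the Unix tools, not real bindings to them, and the log lines are made-up sample data.

```python
from itertools import islice

def grep(pattern, lines):
    """Keep only lines containing `pattern` (substring match, not regex)."""
    return (line for line in lines if pattern in line)

def cut(field, lines, sep=" "):
    """Yield a single separator-delimited field from each line."""
    return (line.split(sep)[field] for line in lines)

def head(n, lines):
    """Yield at most the first `n` items."""
    return islice(lines, n)

log = ["GET /a 200", "POST /b 500", "GET /c 404", "GET /d 200"]
consumed = []  # records which source lines were actually read

def source():
    for line in log:
        consumed.append(line)
        yield line

paths = list(head(2, cut(1, grep("GET", source()))))
print(paths)  # ['/a', '/c']
```

Because every stage is a generator, nothing runs until `list` pulls on the end of the chain, and the source stops being read once `head` has its two items.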

------
aidenn0
Sounds a lot like
[http://en.wikipedia.org/wiki/Lucid_(programming_language)](http://en.wikipedia.org/wiki/Lucid_\(programming_language\))

