Map/Reduce for Mortals 75 points by herodotus 4 days ago | hide | past | web | favorite | 25 comments

 The conversation Bill's touching on -- the tradeoffs between different notations -- is really valuable but I think it misses something a lot of us desire in a syntax: compositionality. Take the provided "onion" notation, loops, and the "new" syntax. They all look something like this: ┌────────┬────────────────────┐ │myreduce│┌─────┬────────────┐│ │ ││mymap│┌────────┬─┐││ │ ││ ││myfilter│x│││ │ ││ │└────────┴─┘││ │ │└─────┴────────────┘│ └────────┴────────────────────┘  (The math one looks more like this:) ┌────────┬────────────┐ │myreduce│┌────────┬─┐│ │ ││mymap │x││ │ ││ │ ││ │ ││myfilter│ ││ │ │└────────┴─┘│ └────────┴────────────┘  But none of them provide the same kind of "putting pieces together" feeling as this: ┌────────┬─────┬────────┬─┐ │myreduce│mymap│myfilter│x│ └────────┴─────┴────────┴─┘  which we see in the wild as this: (myreduce ∘ mymap ∘ myfilter)(x)  or this: x | myfilter | mymap | myreduce  or this: myreduce mymap myfilter x
 I didn't spot Clojure here yet: (->> [1,2,3,4,5] (filter #(-> % (mod 3) (= 1))) (map #(* % %)) (reduce +))  #( ... ) is an anonymous function, arrows are threading macros: https://clojure.org/guides/threading_macrosIf you wanted more trickery you could play with transducers and compose them with "comp".
 List comperhension version without threading: (reduce + (for [n [1, 2, 3, 4, 5] :when (= 1 (mod n 3))] (* n n)))
 > reduce(lambda t,x t+x,map(square,filter(lambda x: x % 3 == 1,[1,2,3,4,5])))> Clear? As mud> the lambdas and the functions obscure what is being computed.Simply use J!g =: +/ @: : @ (] * =&1@(3&|))Or:f =: +/ @: *: @ (#~ (=&1@(3&|)))
 This already exists in Python in the form of comprehensions sum(x * x for x in [1,2,3,4,5] if x % 3 == 1)  I'm not familiar with Lucid but maybe evaluating the implementation of comprehensions could help solve some of the design challenges.
 Might work, worth thinking about
 Having pointfree functions and curried application makes the functional example more readable, as it removes the obscuring lambdas: (reduce (+) . map (^2) . filter ((==1).(%3))) [1,2,3,4,5]  The above is pseudo-Haskell. If we want to
 In actual Haskell I would write it as sum [ n^2 | n <- [1..5], n % 3 == 1 ]  which seems to express the idea fairly directly and I'm fairly sure has a close analog in python.
 python: sum([n * n for n in range(1,5,3)])
 haskell: sum [n^2 | n <- [1,3..5]]I just remembered haskell's "range" operator ".." can also take a step size.['a', 'd' .. 'z'] => "adgjmpsvy"
 yeah to me this just hollers for the kind of "pour it through the functions" approach we see in functional languages or even with Ramda.pipe in JS. Ezpz.
 Or in functional rust (1..5) .into_iter() .filter(|i| i % 3 == 1) .map(|i| i * i) .sum();  I don't know if we really need a new syntax for map reduce.
 That's also basically the same syntax as doing it in JS, aside from the lack of builtins: import { range } from 'ramda'; range(1, 6) .filter(i => i % 3 === 1) .map(i => i * i) .reduce((a, b) => a + b, 0);  Or, if you're using the pipeline operator (experimental, stage 1, nobody can quite decide on exactly the syntax but this gives the rough idea): import { range, sum } from 'ramda'; range(1, 6) |> x => x.filter(i => i % 3 === 1) |> x => x.map(i => i * i) |> sum
 We can get rid of these arrows using ramda's curried list functions, at the expense of being slower and harder to read but making us feel more clever: pipe( range, filter(pipe( modulo(__, 3), // x => x % 3 equals(1) // y => y === 1 )), map(flip(Math.pow)(2)), sum )(1,6)
 > "that’s as close as I can get in wordpress but you know what it looks like"Um, I still don't know what it looks like. I know mod and there is a range of numbers it seems and an i to the power of two, but I really have no idea what the other symbols mean or if their position/layout in that fashion is important. I can probably reverse engineer what the mathematician writes by reading the code but I'm still confused as to why many articles about software assume a complete mathematics syntax knowledge.
 Agreed, I think the syntax used is not that common. (This is a problem, I think, with math notation -- everyone's notation means different things, google "substitution notation history" for the worst.) They definitely meant to write:https://latex.codecogs.com/gif.latex?\sum_{\substack{i\in[1,..."sum of all i² where i is in [1,5] and i mod 3 = 1".
 https://en.wikipedia.org/wiki/Element_(mathematics)#Notation...I'm not sure what the third line is supposed to look like, but I guess it's a filter. So: Sum i^2 where i is in the set {1,2,3,4,5} and i % 3 = 1.
 What exactly is this pyLucid? I searched for it and found a CMS and a generative video ML library. Don't see how either one connects with this.
 The author of the post is the co-inventor of the Lucid programming language. In an earlier blog posts he describes a Python interpreter for a version of Lucid that he named pyLucid.
 Thanks. For anyone else reading, it may or may not be this https://code.google.com/archive/p/plucid/
 List comprehension in Haskell is probably the best approach so far, imho. However once you're using these tools then map, reduce, filter should be easy to understand, so the terseness and clarity of it doesn't come without a cost.let mod3 n = if n mod 3 == 1 then n else 0sum [ i^2 | i <- [ mod3 n | n <- [1..5] ] ]
 > sort of objects are ranges? In Python they would be iterators, and that works out fine. In Haskell they would be lists.> I think the best choice is to make them vectors (arrays)In Haskell they are lazy linked lists! Which in theory is more efficient than vectors (in terms of space).
 this seems to be a syntax problem. The for loop is not really 'equivalent' in complexity to the functional solution. In Elixir, I'd do this: list |> Enum.reduce(0, fn -> x, acc when mod(x, 3) == 0 -> acc + x * x _, acc -> acc end)  But the presented functional solution throws more functions at it, which might look like this, which I think is quite clear. list |> Enum.filter(&(mod(&1, 3) == 0)) |> Enum.map(&(&1 * &1)) |> Enum.reduce(acc, &Kernel.+/2) #better yet use Enum.sum
 You might even want to use the Stream module. Beautiful Elixir.
 Stream would technically better, however given the discussion is about Map/Reduce, the only thing Stream has in common with Map/Reduce is it's lazy. If you wanted something comparable (mapping is done in parallel, reducing as well just over partitions), then you'd want to use the Flow[1] library. As it does the same thing as Stream.map |> Enum.reduce just parallelized/partitioned, and what's great is the Flow module is more-or-less a drop in replacement for Enum/Stream (with a few caveats like calling Flow.partition before Flow.reduce). But, with just some quick a dirty benchmarks you can see Flow outperforms Stream on all but the smallest data set (range 1..100): with_stream = fn range -> range |> Stream.filter(&(rem(&1, 3) == 0)) |> Stream.map(&(&1 * &1)) |> Enum.reduce(0, &Kernel.+/2) end with_flow = fn range -> range |> Flow.from_enumerable() |> Flow.filter(&(rem(&1, 3) == 0)) |> Flow.map(&(&1 * &1)) |> Flow.partition() |> Flow.reduce(fn -> [0] end, fn val, [acc | _] -> [Kernel.+(val, acc)] end) |> Enum.sum() end iex(4)> Benchee.run( iex(4)> %{"stream" => with_stream, "flow" => with_flow}, iex(4)> inputs: %{"small" => 1..100, "medium" => 1..10_000, "large" => 1..10_000_000} iex(4)> ) Operating System: macOS CPU Information: Intel(R) Core(TM) i5-5257U CPU @ 2.70GHz Number of Available Cores: 4 Available memory: 8 GB Elixir 1.9.4 Erlang 22.2.1 Benchmark suite executing with the following configuration: warmup: 2 s time: 5 s memory time: 0 ns parallel: 1 inputs: large, medium, small Estimated total run time: 42 s Benchmarking flow with input large... Benchmarking flow with input medium... Benchmarking flow with input small... Benchmarking stream with input large... Benchmarking stream with input medium... Benchmarking stream with input small... ##### With input large ##### Name ips average deviation median 99th % flow 0.0994 10.06 s ±0.00% 10.06 s 10.06 s stream 0.0782 12.78 s ±0.00% 12.78 s 12.78 s Comparison: flow 0.0994 stream 0.0782 - 1.27x slower +2.72 s ##### With input medium ##### Name ips average deviation median 99th % flow 83.87 11.92 ms ±20.48% 11.30 ms 25.53 ms stream 74.88 13.35 ms ±32.02% 12.32 ms 30.22 ms Comparison: flow 83.87 stream 74.88 - 1.12x slower +1.43 ms ##### With input small ##### Name ips average deviation median 99th % stream 4.98 K 0.20 ms ±87.16% 0.169 ms 0.56 ms flow 0.70 K 1.42 ms ±21.58% 1.35 ms 2.52 ms Comparison: stream 4.98 K flow 0.70 K - 7.06x slower +1.22 ms  [1] https://hexdocs.pm/flow/Flow.html

Search: