PaSh: Light-Touch Data-Parallel Shell Processing (arxiv.org)
47 points by matt_d on Dec 28, 2020 | 6 comments



The GitHub URL provided in the paper gives a 404: github.com/andromeda/pash. Has anyone had luck finding an implementation?


The paper is still a pre-print, so they might be waiting for reviews to complete before releasing the code.

The joys of peer review.


So how can reviewers review the code then? The features do look excellent, a strong coreutils candidate.


Academic papers are often reviewed independently of code.

Reviewers focus on the academic contribution to the wider body of knowledge, since the paper, not the code, is what gets presented at the conference.

Code will likely be spaghetti. Academic research is often more about "finding new stuff" than "building robust stuff".

Doesn't always make sense IMO, but that's the way of the world.


I haven’t read the full paper, but I think there are a lot of opportunities in parallelizing various shell workloads.

Simple things like grep can be split and parallelized on a single system with an SSD. The pain comes from things like combining the results.
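To make the split-and-combine point concrete, here is a rough Python sketch of a grep-like filter run over chunks of input. It is only an illustration, not how PaSh does it: the function names (`split_lines`, `grep_chunk`, `parallel_grep`) are made up for this example, and it uses threads for simplicity, so in CPython it shows the structure rather than a real speedup.

```python
import re
from concurrent.futures import ThreadPoolExecutor


def split_lines(lines, n_chunks):
    """Split a list of lines into at most n_chunks contiguous chunks."""
    if not lines:
        return []
    size = -(-len(lines) // n_chunks)  # ceiling division
    return [lines[i:i + size] for i in range(0, len(lines), size)]


def grep_chunk(pattern, chunk):
    """Return the lines in one chunk that match the pattern, in order."""
    regex = re.compile(pattern)
    return [line for line in chunk if regex.search(line)]


def parallel_grep(pattern, lines, n_workers=4):
    """Filter chunks concurrently, then concatenate results in chunk order."""
    chunks = split_lines(lines, n_workers)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() yields results in submission order, so the combined output
        # keeps the same line order as a sequential pass over the input.
        results = pool.map(lambda chunk: grep_chunk(pattern, chunk), chunks)
        return [line for part in results for line in part]
```

The combining step here is plain concatenation in chunk order. A counting variant would have to sum the per-chunk counts instead, which is where the combining logic starts to depend on the options used.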

Additionally, commands have so many options that they can go from simple to parallelize to much more complicated.

I also wonder whether the right approach is to build a query-planning layer à la PaSh, or to address the parallelism in the command itself, e.g. their sort example.
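For sort, the combine step is a merge rather than a concatenation. A minimal Python sketch of that general pattern, assuming the input fits in memory (the `parallel_sort` name is hypothetical and this is not taken from the paper):

```python
import heapq
from concurrent.futures import ThreadPoolExecutor


def parallel_sort(lines, n_workers=4):
    """Sort contiguous chunks concurrently, then k-way merge the sorted chunks."""
    if not lines:
        return []
    size = -(-len(lines) // n_workers)  # ceiling division
    chunks = [lines[i:i + size] for i in range(0, len(lines), size)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Each chunk is sorted independently of the others.
        sorted_chunks = list(pool.map(sorted, chunks))
    # heapq.merge combines already-sorted inputs into one sorted stream.
    return list(heapq.merge(*sorted_chunks))
```

Whether this lives in a planning layer or inside the command, the shape is the same: sort the pieces independently, then merge.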


Very cool idea. Parallelizing bash scripts manually can be annoying, so it is not often done.



