
PAWK - A Python line processor (like AWK) - rachbelaid
https://github.com/alecthomas/pawk
======
babarock
I'm going to go on a small tangent here and warn strongly against parsing the
output of `ls` (cf. the examples given in the README).

Parsing `ls` is inherently fragile[1] and it _will_ break at some point.
Please don't do it.

[1]: <http://mywiki.wooledge.org/ParsingLs>
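
For illustration, a minimal sketch of getting the same information without going through `ls` at all, by asking the filesystem directly with Python's `os.scandir`. The helper name `list_files` and the sample filename are made up for this example, and the demonstration assumes a POSIX filesystem, where a newline is a legal filename character:

```python
import os
import tempfile
from pathlib import Path


def list_files(directory):
    """Return sorted (name, size_in_bytes) pairs for regular files in `directory`."""
    entries = []
    with os.scandir(directory) as it:
        for entry in it:
            # Names arrive as whole strings, so whitespace and newlines
            # inside a filename cannot be confused with field separators.
            if entry.is_file(follow_symlinks=False):
                size = entry.stat(follow_symlinks=False).st_size
                entries.append((entry.name, size))
    return sorted(entries)


# A filename that would break line-oriented parsing of `ls` output.
with tempfile.TemporaryDirectory() as tmp:
    awkward = Path(tmp) / "two\nlines and spaces.txt"
    awkward.write_bytes(b"hello")
    listing = list_files(tmp)
```

Each entry comes back as one intact name with its size, regardless of which characters the name contains.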

------
kamaal
I really find this amazing.

I've been watching Larry Wall's Perl 6 talks of late, where he talks about
designing the Perl languages (1, 2, 3, 4, 5 and 6). He goes back to 1987 and
describes how hard it was to do text processing sanely back then. The problem,
he says, was that for simple text processing work there were always awk/sed
and the other little Unix utilities, combined on the command line using pipes.
But for anything more than that you really had to write a program combining
them in a language like C (the dominant language of the day, circa 1987).

I see the situation hasn't changed at all.

Almost anybody who doesn't use Perl continues to bend <insert their favorite
language here> in a way that badly implements what Perl was invented to do in
1987: solve problems like these in the programming language most appropriately
designed for them.

Perl was designed to fit the niche between [c/python/java/<insert your
language>] and [shell/unix text processing utils].

When you bend your language to fit that niche, like this tool does (it makes a
good attempt at turning Python into a Perl-like language), you make it look
largely Perlish. This is the dilemma that languages trying to replace Perl
must face: every time you try to invent a language to replace Perl, you end up
inventing a tool/language that largely looks and works like Perl (read:
Ruby/Perl 6). People who learned Ruby after learning Perl know this; it really
feels like you are just learning some extra syntax on top of Perl.

Designing a powerful command-line language is very difficult, because by
definition you have to pack a powerful syntax into the tersest possible form.
Where does that leave us? Do this in a C-based syntax and what you get is Perl.

It will take a little while to understand that languages like Python and Ruby
are not _replacing_ Perl, for the same reason that Perl, even in the CGI days,
never came to replace PHP. It's just that these languages and tools like
Perl/awk/sed are designed to rule a very different niche.

When you begin to understand the design and the niche they occupy, you will
suddenly discover that the 100 custom scripts and jar files you spent days
writing do badly what was already solved in one consistent design paradigm
some 2-3 decades ago.

This is the reason why the grey beard unix guy smiles at you every time you
show him your world changing script.

~~~
don_draper
Have you ever maintained someone else's poorly written Perl code? I have.
Rational or not, I hate Perl.

And yes you can write crappy code in any language, but Perl is just asking for
a bad programmer to write bad code.

~~~
kamaal
I have maintained Perl code.

I don't know what you mean by the term 'poorly written Perl code'. I have
maintained more horrible code in languages like Java than in Perl.

>>but Perl is just asking for a bad programmer to write bad code.

And what is that wand that other programming languages have, that when cast
with a spell magically transform a bad programmer into a Dennis Ritchie?

~~~
js2
It's just that Perl's power/expressiveness and TMTOWTDI mindset tend to produce
code that is harder to _read_. You can write awful hard-to-understand code in any
language, but in those other languages the reading part is usually easier.
Except for maybe C++. :-)

------
dgulino
Yet another option: pyliner - A Python line processor (like Perl):
<https://gist.github.com/dgulino/4750088>

------
blahpro
This looks like a really useful tool, although the automatic importing of
anything that smells like a module makes me a little uncomfortable:

    
    
    modules = re.findall(r'([\w.]+)+(?=\.\w+)\b', all_text)

then later:

    
    
    for m in modules:
        try:
            key = m.split('.')[0]
            self[key] = __import__(m)
        except:
            pass
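
One way the discomfort could be addressed, sketched here purely as an assumption and not as anything pawk actually does: keep the quoted regex, but import only names whose top-level package is on an explicit allowlist, and catch `ImportError` rather than everything. The names `ALLOWED`, `MODULE_RE` and `import_candidates` are hypothetical.

```python
import importlib
import re

# The pattern quoted above from pawk's source, unchanged.
MODULE_RE = re.compile(r'([\w.]+)+(?=\.\w+)\b')

# Hypothetical allowlist of top-level modules; not a pawk feature.
ALLOWED = {"os", "re", "json", "math"}


def import_candidates(text, allowed=ALLOWED):
    """Import allowlisted module-looking names found in `text`.

    Returns (namespace, skipped): a dict mapping top-level names to modules,
    and a list of the candidate names that were not imported.
    """
    namespace = {}
    skipped = []
    for name in MODULE_RE.findall(text):
        top_level = name.split(".")[0]
        if top_level not in allowed:
            skipped.append(name)
            continue
        try:
            # Import the dotted name, then bind the top-level package.
            importlib.import_module(name)
            namespace[top_level] = importlib.import_module(top_level)
        except ImportError:
            skipped.append(name)
    return namespace, skipped


namespace, skipped = import_candidates("os.path.basename(l) + shutil.which(l)")
```

With this input the regex yields the candidates `os.path` and `shutil`; only the first is on the allowlist, so `shutil` ends up in `skipped` instead of being imported silently.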

------
rachbelaid
I love awk but personally cannot remember the syntax for complex stuff without
googling or digging through the man page. I didn't test the performance, but I
rarely need performance anyway ... I'm adding pawk to my toolbelt and I'm
glad somebody built it.

------
shenedu
You may find this useful, also in python:

<http://code.activestate.com/recipes/437932-pyline-a-grep-like-sed-like-command-line-tool/>

~~~
gvalkov
And if you find that useful, here's an extension of that same idea:
<https://github.com/gvalkov/python-oneliner> (I haven't worked on it in months
though)

------
cju
A similar recipe:
<http://code.activestate.com/recipes/577962-awk-like-module/>

------
willvarfar
Hmm, I feel uncomfortable if there's an eval going on in there; what if the
command line is tainted, etc.? It's something to keep in mind.

~~~
arethuza
Surely if your command line is tainted then you have bigger problems?

~~~
willvarfar
People often have websites that take some parameters and then pass them to
grep or whatever, in the exact same way that they pass parameters that came
from the user to an SQL engine.

As long as it's data, no problem...

Injection attacks are injection attacks.
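
A minimal sketch of the data-versus-code distinction, assuming a tool that really does pass its argument to `eval` (the thread only speculates that pawk does). The function names are hypothetical, and `l` stands in for the current input line:

```python
import ast
import os


def run_expression(expression, line):
    # Evaluates arbitrary Python with `l` bound to the line, the way an
    # eval-based line processor might. Whoever controls `expression`
    # controls the process.
    return eval(expression, {}, {"l": line})


def parse_literal(text):
    # ast.literal_eval accepts only literals (strings, numbers, lists, ...)
    # and raises ValueError for anything else, such as function calls.
    return ast.literal_eval(text)


# A "parameter" that is really code.
tainted = "__import__('os').getcwd()"

# Treated as code, it runs and reaches the operating system.
leaked = run_expression(tainted, "some line")

# Treated as data, it is rejected.
try:
    parse_literal(tainted)
    rejected = False
except ValueError:
    rejected = True
```

The same string is harmless when it is only ever handled as data and dangerous once it reaches an evaluator, which is the point being made about grep arguments and SQL parameters.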

------
jnazario
neat. not sure i'll use it, but neat nonetheless. thanks for sharing. i've
been a big awk user for well over a decade (and a python user for nearly as
long), so this was neat to see.

------
thomasjames
mawk is fast like nobody's business. I mainly use awk for speedy filetype
conversions for large data sets. I am not sure if I would reap any benefits
from switching my awk interpreter into a JIT language, but this is a really
cool project nonetheless.

