
Making single-purpose utilities example: filter URLs from input - textmode
    /* ---
    Made for use with http clients as described in https://news.ycombinator.com/item?id=17689165 and https://news.ycombinator.com/item?id=17689152
    Assuming code below is saved as "030.l", one might compile program "yy030" with something like:
     flex -8iCrfa 030.l
     cc -pipe lex.yy.c -static -o yy030
     --- */


     #define p(x) fprintf(stdout,x,yytext);
     #define jmp BEGIN
    %s xa xb xc
     int e,b,c;
    xa "http://"|"https://"|"ftp://"
    %%
     /* non-printable */
    \200|\201|\204|\223|\224|\230|\231|\234|\235

    {xa} p("%s");jmp xa;
    <xa>[^ \n\r<>"#'|)\]\}]* p("%s\n");jmp 0;

     /* http:\/\/[^ \n\r<>"#'|]*    fprintf(stdout,"%s\n",yytext); */
     /* https:\/\/[^ \n\r<>"#'|]*    fprintf(stdout,"%s\n",yytext); */
     /* ftp:\/\/[^ \n\r<>"#'|]*    fprintf(stdout,"%s\n",yytext); */
    .|\n
    %%
    int main(){ yylex(); return 0;}
    int yywrap()
    {
     return 1; /* tell the scanner there is no further input at EOF */
    }
======
theamk
Uh, I cannot imagine why one would prefer this to a single “grep -o”
invocation.

Not only will the “grep” command be simpler to understand later, it will also
be trivially customizable/extendable.
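
For illustration, a minimal sketch of what such a “grep -o” filter might look like. It assumes a grep that supports the `-o` option (GNU and BSD grep do); the function name `extract_urls` is made up for this example. The pattern only approximates the flex rule above: it stops at any whitespace character, not just space, CR and LF.

```shell
# Hypothetical helper: print each URL found on standard input, one per line.
# -E  extended regular expressions
# -i  case-insensitive scheme match, like flex -i
# -o  print only the matched part of each line
# In the bracket expression, "]" must come first to be taken literally.
extract_urls() {
    grep -Eio "(https?|ftp)://[^][:space:]<>\"#'|)}]*"
}
```

It would be used the same way as yy030, for example `extract_urls < page.html`, where `page.html` is a placeholder file name.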

~~~
textmode
Of course I use grep -o too. This is not a "correct" filter. It is not a
perfect regexp for 100% of urls.

However, for something as simple and essential (for the author) as filtering
urls, I do not want to always have to worry about potential differences in
shells, different versions of grep or the absence of grep as I move between
different computers, OSes or OS versions. I find this more predictable and
portable.

Neither customization nor extensibility is a goal. For that, a scripting
language is better suited.

~~~
theamk
Change that "grep" to "sed", and you will get a solution that works even on
ancient machines, like HP-UX from the 1990s. Grab msys, and you'd have your
solution for Windows-based systems as well.
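
A rough sketch of one way the sed variant could go, using only `tr` and basic sed features. The function name `extract_urls_sed` is invented here, and this is not the version either poster has in mind. Unlike the flex program it matches lowercase schemes only.

```shell
# Hypothetical helper: print each URL found on standard input, one per line.
# Step 1 (tr): turn every delimiter from the flex rule into a newline, so
#   each candidate token sits on its own line and contains no spaces.
# Step 2 (sed): mark each scheme with a leading space, delete lines that
#   have no mark, cut everything before the first mark, remove the marks.
extract_urls_sed() {
    tr ' \r<>"#|)]}'"'" '[\n*]' |
    sed -e 's|https://| &|g' \
        -e 's|http://| &|g' \
        -e 's|ftp://| &|g' \
        -e '/ /!d' \
        -e 's|^[^ ]* ||' \
        -e 's| ||g'
}
```

The space works as a marker only because `tr` has already removed every space from the tokens. A token holding two schemes, such as an archive link that embeds another URL, comes out as a single line starting at the first scheme.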

At the same time, installing "flex" and "cc" on a random machine would be much
harder. Old Solaris boxes, for example, come without any C compilers, not to
mention lexers.

And finally, what are you going to do with the results? It is very likely that
you'd want to pass them through sed/grep anyway. So you will have to worry
about differences in shells and versions anyway.

So sorry, I see no advantages of this, just disadvantages. Of course no one
cares if you run them yourself, but posting them for other people is just
evil.

~~~
textmode
Yeah, I am pretty good with sed. Probably better than you. I have sed versions
of all these programs.

I am not using any computers that cannot run flex and cc.

Results usually go to yy025, a program that makes http from urls.

