
> ... I have written and continue to write small programs to do this ...

Would you mind sharing some of that code?

Some of your recent comments on web browsers, text browsers and javascript [1 + its follow-up] are really interesting. Thanks for sharing.

1: https://news.ycombinator.com/item?id=32131901



Below is one for PDF. Compile the 052.l file with something like

     x=$1;
     flex -8iCrf $1;
     cc -O3 -std=c89 -W -Wall -pedantic -I$HOME -pipe lex.yy.c -static -o yy${x%.l};
     strip -s yy${x%.l};
     test -d yy||mkdir yy;
     export PATH=$PATH:$HOME/yy;
     exec mv yy${x%.l} yy;

"yy045" is a small program to remove chunked transfer encoding.
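
For anyone unfamiliar with chunked transfer encoding, here is a rough sketch of what removing it involves. This is not yy045 itself (that source is not shown here). The function name `dechunk` and the buffer-based interface are my own assumptions for illustration. Each chunk is a hexadecimal size line, then that many bytes of data, then CRLF, and a chunk of size 0 ends the body.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not the author's yy045.
 * Decodes a chunked body in `in` (length n) into `out`.
 * `out` needs room for n bytes; decoded data is never longer than the input.
 * Returns the decoded length, or (size_t)-1 if the input is malformed. */
size_t dechunk(const char *in, size_t n, char *out)
{
    size_t i = 0, o = 0;
    assert(in != NULL && out != NULL);
    for (;;) {
        size_t size = 0;
        int digits = 0;
        /* Parse the hexadecimal chunk size. */
        while (i < n) {
            int c = (unsigned char)in[i], v;
            if (c >= '0' && c <= '9') v = c - '0';
            else if (c >= 'a' && c <= 'f') v = c - 'a' + 10;
            else if (c >= 'A' && c <= 'F') v = c - 'A' + 10;
            else break;
            size = size * 16 + (size_t)v;
            digits++;
            i++;
        }
        if (digits == 0) return (size_t)-1;
        /* Skip any chunk extension up to the end of the size line. */
        while (i < n && in[i] != '\n') i++;
        if (i == n) return (size_t)-1;
        i++;                                 /* consume '\n' */
        if (size == 0) return o;             /* last chunk; trailers are ignored */
        if (size > n - i) return (size_t)-1; /* truncated chunk */
        memcpy(out + o, in + i, size);
        o += size;
        i += size;
        /* The chunk data is followed by CRLF. */
        if (i < n && in[i] == '\r') i++;
        if (i < n && in[i] == '\n') i++;
    }
}
```

A real filter would read from stdin and stream the output instead of holding everything in memory; the buffer version is only meant to show the framing.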

These programs are to be used in pipelines, something like

      echo https://www.bezem.de/pdf/ReservedWordsInC.pdf|yy025|nc -vv h1b 80|yy052 >1.pdf

"h1b" is a HOSTS file entry for a localhost TLS-enabled forward proxy.

"yy025" is a small program that generates HTTP.

Interestingly I think curl was modified in recent years to detect binary data on stdin. I just tested the following and it extracted the PDF automatically.

       curl https://www.bezem.de/pdf/ReservedWordsInC.pdf > 1.pdf

However, one thing that curl does _not_ do is HTTP/1.1 pipelining. I use pipelining on a daily basis. That is where these programs become useful for me.
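
To make "pipelining" concrete: the client writes several requests on one connection before reading any response, and the server answers them in the same order. Below is a minimal sketch of generating such a batch. It is not yy025; the names `put_request` and `build_pipeline` and the choice of headers are assumptions for illustration. It uses `snprintf`, so it needs C99 or later rather than the `-std=c89` used in the build above.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write one GET request into buf.
 * Returns the number of bytes written, or 0 if it does not fit in cap. */
size_t put_request(char *buf, size_t cap, const char *host,
                   const char *path, int last)
{
    int n;
    assert(buf != NULL && host != NULL && path != NULL);
    n = snprintf(buf, cap,
                 "GET %s HTTP/1.1\r\nHost: %s\r\nConnection: %s\r\n\r\n",
                 path, host, last ? "close" : "keep-alive");
    return (n < 0 || (size_t)n >= cap) ? 0 : (size_t)n;
}

/* Hypothetical helper: place all requests back to back in buf.
 * Only the final request asks the server to close the connection.
 * Returns the total length, or 0 if buf is too small. */
size_t build_pipeline(char *buf, size_t cap, const char *host,
                      const char **paths, size_t count)
{
    size_t used = 0, i;
    for (i = 0; i < count; i++) {
        size_t n = put_request(buf + used, cap - used, host,
                               paths[i], i + 1 == count);
        if (n == 0) return 0;
        used += n;
    }
    return used;
}
```

The resulting bytes would be sent in one go (for example through nc, as in the pipeline above), and the responses then arrive concatenated on the same connection, which is why filters that split or carve the stream are handy.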

       cat > 052.l

       /* PDF file carver */
       /* PDFs can contain newlines */
       /* yy045 removes them, so do not use yy045 with this */
   
    #define echo ECHO
    #define jmp BEGIN
    int fileno(FILE *);
   
   xa "%PDF-"
   xb "%%EOF" 
   
   %s xa 
   %option noyywrap nounput noinput
   %%
   
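   {xa} echo;jmp xa;        /* start marker: print it, enter state xa */
   <xa>{xb} echo;jmp 0;     /* end marker: print it, back to INITIAL */
   <xa>.|\n|\r echo;        /* inside a PDF: copy every byte */
   .|\n ;                   /* outside a PDF: discard */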
   
   %%
   int main(){ yylex();exit(0) ;}

   ^D
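
For readers who do not use flex, the same two-state idea can be sketched in plain C. This is only an illustration of the logic, not a replacement for the scanner above. The name `carve_pdf` and the in-memory interface are assumptions, and unlike the `flex -i` build it matches the markers case-sensitively.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return 1 if the bytes at in[i..] start with marker m. */
static int at_marker(const char *in, size_t n, size_t i, const char *m)
{
    size_t len = strlen(m);
    return len <= n - i && memcmp(in + i, m, len) == 0;
}

/* Hypothetical helper: copy everything from "%PDF-" through "%%EOF"
 * into out and drop the rest. out needs room for n bytes.
 * Returns the number of bytes copied. */
size_t carve_pdf(const char *in, size_t n, char *out)
{
    size_t i = 0, o = 0;
    int inside = 0;                 /* 0 = INITIAL, 1 = state xa */
    assert(in != NULL && out != NULL);
    while (i < n) {
        if (!inside && at_marker(in, n, i, "%PDF-")) {
            memcpy(out + o, in + i, 5);
            o += 5; i += 5;
            inside = 1;             /* like "jmp xa" */
        } else if (inside && at_marker(in, n, i, "%%EOF")) {
            memcpy(out + o, in + i, 5);
            o += 5; i += 5;
            inside = 0;             /* like "jmp 0" */
        } else {
            if (inside) out[o++] = in[i];   /* echo inside, discard outside */
            i++;
        }
    }
    return o;
}
```

One caveat that applies to both versions: a PDF saved with incremental updates carries more than one `%%EOF`, so stopping at the first one can truncate such a file.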



