
The Catch 22 of Base64: Attacker Dilemma from a Defender Point of View - WhiteSource1
https://www.imperva.com/blog/2018/04/the-catch-22-of-base64-attacker-dilemma-from-a-defender-point-of-view/
======
zrm
> based on the assumption that legitimate users have no practical need to do
> multiple encoding of the same text.

Things can legitimately be encoded multiple times because encapsulation is a
thing and multiple independent stages may each be configured to accept
possibly binary input and produce base64 output.

You also don't need to re-encode something an unreasonable number of times to
get "Vm0wd". All you have to do is start with "Vm0" and base64 encode it
twice. "Vm0" only has 24 bits of entropy, which means it will regularly occur
at random in legitimate data.

And then nobody can figure out why "Vm0-Edge-West" isn't working.
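
The double-encoding claim above is easy to check. A minimal sketch, assuming Python's standard `base64` module and the standard alphabet with padding; the helper name `b64` is just for illustration:

```python
import base64


def b64(text: str) -> str:
    """Base64-encode an ASCII string and return the result as a string."""
    return base64.b64encode(text.encode("ascii")).decode("ascii")


once = b64("Vm0")   # first encoding of the 3-byte seed
twice = b64(once)   # second encoding, applied to the first result

print(once)   # Vm0w
print(twice)  # Vm0wdw==
```

The second result already begins with "Vm0wd", so a filter keyed on that prefix would fire on data that was encoded only twice.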

------
benchaney
I'm not really sure what the threat model is here. If the attacker can control
what encoding scheme you use, surely you have much more serious problems than
the possibility of wasting space.

~~~
jessaustin
TFA is about _attackers_ using Base64 encoding.

------
loup-vaillant
> _While Base64 encoding is very useful to transfer binary data over the web_

This part I cannot fathom. The era of 7-bit bytes is over. What can possibly
justify the need for a "printable characters" encoding now? Something stupid
like putting data in a JSON string? What's the next step, base-64 encode the
JSON containing that string and put it in an XML tag?

~~~
dylz
I see crap like:

    
    
      <data><![CDATA[PD94bWw+PHdoeS8+PHRoaXNpc2F3ZnVsPjwvcm9vdD4=]]></data>
    

constantly in real life in 2018
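
For anyone curious what is inside that CDATA payload, here is a small sketch that decodes it, assuming Python's standard `base64` module and that the decoded bytes are UTF-8 text:

```python
import base64

# The base64 string copied from the CDATA section above.
payload = "PD94bWw+PHdoeS8+PHRoaXNpc2F3ZnVsPjwvcm9vdD4="

# Decode to bytes, then interpret those bytes as text.
decoded = base64.b64decode(payload).decode("utf-8")

print(decoded)  # <?xml><why/><thisisawful></root>
```

So it is markup, base64 encoded, wrapped in CDATA, inside more XML.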

~~~
loup-vaillant
Yes, but why? There's a reason for everything, so I guess this serves some
purpose, _somehow_? I guess the ultimate reason is stupidity or bad planning
or cut corners, but I'm still curious.

Bonus points if there's a situation where avoiding something like base64 is
actually _difficult_, even in 2018.

~~~
jessaustin
If one wanted to embed a random token in an email, what would one use instead
of base64?

~~~
loup-vaillant
Attachments?

~~~
jessaustin
ISTM that amounts to the same thing? MIME _is_ Base64.

~~~
loup-vaillant
<facepalm>

------
boneitis
To really crank up the pedant-O-meter, wouldn't it be accurate to instead
describe the output growth in point #1 as polynomial, even sub-quadratic?

~~~
boneitis
Perhaps I can blame the booze, as I am admittedly very drunk but still
unconvinced and/or confused.

I asked my question coming from the angle of big-O analysis, and all three
(as of the time of this reply) responses in the thread seem to describe a
growth by a constant factor of 1.3 repeating (that is, 4/3), otherwise known
as polynomial growth, yes?

Absolutely, the growth can be mathematically expressed with an exponent, but
we verbally characterize the actual growth to be polynomial if the scalar
stays constant (a.k.a. a literal number in the exponent's place) rather than
increasing as the amount of input increases (independent variable lying within
the actual exponent), no?
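
One way to look at the question empirically is to measure the output length after each round of encoding. The sketch below assumes Python's standard `base64` module; the function name `encoded_lengths` and the 300-byte input are arbitrary choices for illustration. Each round multiplies the length by roughly 4/3, so for a fixed number of rounds the output is linear in the input size, while for a fixed input it grows geometrically with the number of rounds.

```python
import base64


def encoded_lengths(data: bytes, rounds: int) -> list[int]:
    """Return the byte length before encoding and after each base64 round."""
    lengths = [len(data)]
    for _ in range(rounds):
        data = base64.b64encode(data)  # re-encode the previous output
        lengths.append(len(data))
    return lengths


lengths = encoded_lengths(b"x" * 300, 5)
print(lengths)  # [300, 400, 536, 716, 956, 1276]
```

Which variable counts as "the input" (the size of the text or the number of encodings) is what decides whether the growth gets called linear or exponential.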

~~~
boneitis
Man, I'm dyslexic.

