
Ask HN: Should we still Base64 encode? - mcqueenjordan
Base64-encoding incurs some overhead (storage and computation). For context, Base64-encoding is a mechanism to encode bytes (0-255) to a radix-64 alphabet (0-63), with '=' for padding. Base64 encoding is 4:3 output:input, so 33% overhead.

Assumption: Many systems (modulo email) are 8-bit clean nowadays.

As an example, many session tokens are base64 encoded, but if we know that these tokens will only interact with 8-bit clean systems, should we not avoid the overhead?
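
To make the 4:3 ratio concrete, here is a minimal Python sketch using only the standard library. The helper name `b64_len` and the sample sizes are illustrative assumptions. Padded Base64 output is always 4 * ceil(n / 3) characters, so the relative overhead is worst for very short inputs and settles near 33% for longer ones.

```python
import base64
import math
import os

def b64_len(n: int) -> int:
    """Length of padded Base64 output for n input bytes."""
    return 4 * math.ceil(n / 3)

for n in (1, 16, 32, 3000):
    # Random bytes stand in for a token or other binary payload.
    encoded = base64.b64encode(os.urandom(n))
    assert len(encoded) == b64_len(n)
    overhead = (len(encoded) - n) / n * 100
    print(f"{n:>5} bytes -> {len(encoded):>5} chars ({overhead:.0f}% overhead)")
```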
======
pwg
> Assumption: Many systems (modulo email) are 8-bit clean nowadays.

Your assumption is flawed. Most of the 'systems' that were not 8-bit clean
(and resulted in the various 'encodings' being required) remain just as 'not
8-bit clean' today as they were then.

~~~
mcqueenjordan
Follow-up: Is it worth investigating changing that? e.g. an RFC for 8-bit
clean HTTP cookies.

~~~
tony-allan
I'm not sure how you would get every HTTP/Cookie library to correctly follow a
new RFC. There is a lot of software out there that is old and not well
maintained that will still be around in another decade.

I haven't read the HTTP/2 RFCs, but with any luck they are 8-bit clean, which
would be progress.

~~~
wmf
HTTP/2 and /3 tried not to change semantics so that you can losslessly
translate between different versions.

------
A-AronBrown
I had a similar thought a few months ago when looking at ways to encode data
in HTTP cookies and came to the conclusion that base64 wasn't broken, so there
was no need to fix it.

For medium-large messages I would tend to use JSON/msgpack where appropriate,
so no need to further encode anything.

For small (binary) messages, other encodings (e.g. ascii85) often weren't much
smaller and didn't encode faster, so there was no measurable performance
benefit. And given the extra complexity and compatibility issues it would take
to use something else, it just wasn't worth it.
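
A rough way to check the size claim for small messages, as a sketch only: compare the output lengths of Base64 and Ascii85 from Python's standard `base64` module. The helper `encoded_sizes` and the fixed 16- and 32-byte stand-in tokens are assumptions for illustration, not anything measured in the original experiment, and this says nothing about encoding speed.

```python
import base64

def encoded_sizes(payload: bytes) -> dict:
    """Return the raw, Base64, and Ascii85 lengths of a payload."""
    return {
        "raw": len(payload),
        "base64": len(base64.b64encode(payload)),
        "ascii85": len(base64.a85encode(payload)),
    }

for n in (16, 32):
    # Fixed bytes 1..n as a stand-in for a small binary token.
    token = bytes(range(1, n + 1))
    print(encoded_sizes(token))
```

For a 16-byte token that is 24 characters versus 20, so the saving is only a few bytes per message.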

------
wmf
Probably not. Most places where base64 is used are no cleaner than they ever
were. XML is not perfectly 8-bit clean and neither are HTTP cookies. You could
probably replace base64 with more optimized yEnc-style encodings in many
situations but it's probably not worth the hassle.

~~~
mcqueenjordan
I missed that cookies aren't 8-bit clean. That's unfortunate.

------
zzo38computer
Such encoding is sometimes necessary, for example so that the data does not
interfere with the message framing. This comes up in many text-based protocols
such as NNTP (being text-based is good, because you can work with them without
specialized software). It may also be needed if you are going to type in the
data by hand, say from a printout.
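
A small sketch of the framing problem, assuming a toy protocol where each message ends with CRLF. This is not NNTP itself; the names `frame` and `unframe` and the sample payload are made up for illustration. A raw binary payload that happens to contain CRLF gets split into two messages, while the Base64-encoded form survives because its alphabet never contains CR or LF.

```python
import base64

DELIM = b"\r\n"  # assumed message terminator of the toy protocol

def frame(messages: list[bytes]) -> bytes:
    """Join messages into one stream, each terminated by DELIM."""
    return b"".join(m + DELIM for m in messages)

def unframe(stream: bytes) -> list[bytes]:
    """Split a stream back into messages on DELIM."""
    parts = stream.split(DELIM)
    return parts[:-1]  # drop the empty piece after the final DELIM

payload = b"\x00binary\r\ndata\xff"  # binary data that contains CRLF

# Sent raw, the payload is cut in two by the receiver.
raw_frames = unframe(frame([payload]))
print(len(raw_frames), raw_frames)

# Sent Base64-encoded, it arrives as one message and decodes cleanly.
safe_frames = [base64.b64decode(m) for m in unframe(frame([base64.b64encode(payload)]))]
print(len(safe_frames), safe_frames == [payload])
```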

