
And saying "base52" is misleading. It implies that there is a source data that's been encoded, which is not likely here. You can be specific without implying that.

A random string out of a specific character set is also subtly different from taking random bits and encoding them with that same character set, in that the first couple digits will have different distributions.
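
To make that concrete, here is a rough sketch that counts how many 32-bit values land on each leading character when written as fixed-width base 52. The 32-bit width and the alphabet ordering (a-z, then A-Z, with 'a' as zero) are my assumptions for illustration, not anything known about the strings in question.

```python
import string

# Assumed alphabet ordering: 'a' is digit 0, 'Z' is digit 51.
ALPHABET = string.ascii_lowercase + string.ascii_uppercase


def leading_digit_counts(bits: int) -> dict[str, int]:
    """Count how many `bits`-bit values map to each leading base-52 digit."""
    total = 1 << bits
    # Smallest fixed width that can hold every value below `total`.
    width = 1
    while 52 ** width < total:
        width += 1
    place = 52 ** (width - 1)  # weight of the leading digit
    counts = {}
    for digit, symbol in enumerate(ALPHABET):
        low = digit * place
        high = min((digit + 1) * place, total)
        if high > low:
            counts[symbol] = high - low
    return counts


counts = leading_digit_counts(32)
print(len(counts), "possible leading characters:", "".join(counts))
print("share of 'a':", counts["a"] / 2**32)
print("share of 'l':", counts["l"] / 2**32)
```

Under those assumptions only 12 of the 52 characters can ever appear first, and the last of them is rarer than the rest, whereas picking characters at random gives every position the same uniform distribution.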




> saying "base52" is misleading. It implies that there is a source data that's been encoded

No, it doesn't. 0xFF is a number I just made up, no source data at all, I promise. Also, it's base 16 :)

Anyhow, the source data was most definitely base 2 (as is your computer's memory, I assume) and later encoded into base52 to be represented as a string (unless someone at Microsoft wrote it in base52, which seems unlikely).


> 0xFF is a number I just made up, no source data at all, I promise. Also, it's base 16.

It's not base 16 encoded, which was his point. Encoding demands a source. This is just a base-16 number unless you encoded something to arrive at this. You could interpret "Romeo and Juliet" as a very large base65 number (65 unique chars in the random copy I grabbed) if you want, but it's not meaningful or accurate to call it a base65 encoding.
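
As a toy illustration of that reading, the sketch below treats any text as a number whose base is the count of distinct characters in it. The function name and the choice to order symbols by sorting are made up for the example.

```python
def text_as_number(text: str) -> tuple[int, int]:
    """Read `text` as one big number whose base is its count of distinct characters."""
    symbols = sorted(set(text))  # arbitrary but deterministic digit ordering
    base = len(symbols)
    value = 0
    for ch in text:
        value = value * base + symbols.index(ch)
    return base, value


base, value = text_as_number("abba")
print(base, value)
```

The arithmetic works for any text, which is the point: being readable as a base-N number says nothing about whether anything was encoded.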

> Even if that were true, the source data was most definitely encoded from base 2 (which is what our computers work with).

This is the kind of pedantry that people hate because it adds nothing to the conversation. It's a way to inject "I'm right" moments into the conversation so you can feel smart, while no one else really cares. It makes for unpleasant conversations.

<pedantry>

You're also not right. Your brain doesn't work in base-2, and you likely didn't enter this number into your computer in base2. You typed in the string "0xFF", and that string was encoded in base 2. The base2 that represents the string "0xFF" is very different from the base2 number that represents the logical (base16) number 0xFF.

</pedantry>
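
A quick sketch of the difference, assuming the string is stored as ASCII:

```python
text = "0xFF"

# Bits of the four characters '0', 'x', 'F', 'F' as ASCII bytes.
text_bits = " ".join(f"{byte:08b}" for byte in text.encode("ascii"))

# Bits of the number that the string denotes in base 16.
number = int(text, 16)
number_bits = f"{number:08b}"

print("string:", text_bits)
print("number:", number_bits)
```

The string takes four bytes, while the number it denotes fits in one.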


> It's not base 16 encoded, which was his point.

I didn't understand it that way.

Also, your whole <pedantry> block and the paragraph above is based on misunderstanding my comment (probably because I'm not a native speaker and you caught me in between edits).

I think you're the only one making an "unpleasant conversation".


> No, it wasn't (or I didn't understand it that way). He'd have said encoded somewhere.

The original comment did say "encoded". The discussion about the phrase "base52 encoding" was the base of this entire thread. The parent of your original response also used the term "encoded". The context is clear. I don't see how you could have missed it. (Edit: I see you're not a native speaker. That might be part of why we're not understanding each other. Plus I apparently keep replying in between your edits, which happened again.)

> Also, your whole <pedantry> block and the paragraph above is based on misunderstanding my comment

Well, you rewrote the comment after I replied. I assumed your "source data" was your logical 0xFF number. If you were referring to the "source data" for the strings in the update description, then in all likelihood, there was never a "source" number at all. These strings were almost certainly generated via random selection from a set of characters. You could generate a very long number and then base52-encode it to produce the same thing, but it would be more work and less obvious for future code maintainers. So the "source" was a sequence of characters (a-zA-Z), not a base2 number. You could argue that this is still somehow base-2 since it's in a computer, which I guess is fine (if pointless and pedantic), but it's still not accurate to say that these were "encoded" into base52.
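
For comparison, here is roughly what the two approaches look like. This is only a sketch: the function names, the alphabet ordering, and the 64-bit width are my own assumptions, not how the actual strings were produced.

```python
import secrets
import string

ALPHABET = string.ascii_letters  # a-z then A-Z, 52 characters


def random_id(length: int) -> str:
    """Approach 1: pick each character independently from the set."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def encode_base52(value: int) -> str:
    """Approach 2 helper: write a non-negative integer in base 52."""
    if value == 0:
        return ALPHABET[0]
    digits = []
    while value > 0:
        value, remainder = divmod(value, 52)
        digits.append(ALPHABET[remainder])
    return "".join(reversed(digits))


print(random_id(12))                          # direct random selection
print(encode_base52(secrets.randbits(64)))    # random number, then encoded
```

The first version is a one-liner with no intermediate number, which is why I'd guess it's what was actually done.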

I was unnecessarily snarky and rude. I'm sorry for that.


You're right, but the "base52" part isn't what you're right about, it's the 'encoded' part. "base52" is accurate, it is a series of base52 characters, however it doesn't appear to be the product of an encoding.


Base52 is still not really accurate. Base52 implies an encoding. Further, Base52 is not a standard, so it's not even meaningful to say that the characters are from the Base52 set. You could also represent Base52 by including 0-9 and excluding Q-Z. Any string that "looks like" Base52 (a-zA-Z) also looks like Base64 and any number of other encodings.
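
To show why the missing standard matters, here is a small sketch with two equally plausible 52-character alphabets. Both orderings are invented for the example; neither is any kind of official Base52.

```python
import string

# Two hypothetical 52-symbol alphabets; neither is a standard.
LETTERS_ONLY = string.ascii_lowercase + string.ascii_uppercase
DIGITS_FIRST = string.digits + string.ascii_lowercase + string.ascii_uppercase[:16]  # 0-9, a-z, A-P


def decode(text: str, alphabet: str) -> int:
    """Read `text` as a number using `alphabet` as the digit set."""
    value = 0
    for ch in text:
        value = value * len(alphabet) + alphabet.index(ch)
    return value


print(decode("ba", LETTERS_ONLY))
print(decode("ba", DIGITS_FIRST))
```

The same two characters decode to different numbers depending on which alphabet you assume, so the string alone doesn't tell you what, if anything, it encodes.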


>No, it doesn't. 0xFF is a number I just made up, no source data at all, I promise. Also, it's base 16 :)

But the full quote was "base52-encoded". (Though I would argue that base52 implies encoding, because nothing naturally works in base52. The only thing that's naturally 52 is "random letters with random case". Or something with cards.)

>Anyhow, the source data was most definitely base 2 (as is your computer's memory, I assume) and later encoded into base52 to be represented as a string (unless someone at Microsoft wrote it in base52, which seems unlikely).

That is an enormous assumption. It's easier to pick random letters than it is to take a specific binary number and convert it to letters. And they don't give you the same result. Bits stored in base 52 will never start with zzzzz.
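
A sketch of that last point, under assumptions I'm picking for the sake of the example: the source is a 64-bit value, it's written as 12 fixed-width characters, and the alphabet runs a-z then A-Z with 'a' as zero. Other widths or orderings would shift which prefixes are impossible.

```python
import string

ALPHABET = string.ascii_lowercase + string.ascii_uppercase  # assumed ordering


def decode_base52(text: str) -> int:
    """Read `text` as a base-52 number under the assumed alphabet."""
    value = 0
    for ch in text:
        value = value * 52 + ALPHABET.index(ch)
    return value


smallest_with_prefix = decode_base52("zzzzz" + "a" * 7)  # 12 characters
fits_in_64_bits = smallest_with_prefix < 2**64
print(fits_in_64_bits)
```

Even the smallest 12-character string starting with zzzzz is already too large to be a 64-bit value here, while random letter picks produce that prefix as often as any other.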



