
Validating is just the process of ensuring an input is admissible for the way you want to use it. That can mean checking which symbols appear in a string, whether a string is an e-mail address, or even whether a number is in a certain range. Escaping is just validation + fixup, which can be used in some cases. Anyway the only way to validate an e-mail in practice is to use it and confirm.



> Escaping is just validation + fixup

You're confused and annoyingly persistent.

Escaping (or quoting in general) is a simple translation from a literal representation of a string to a (lexical) syntax representation with the purpose of embedding the string in an external medium (e.g. source code written in that lexical syntax).

Escaping is a mechanical process that doesn't discriminate between "valid" and "invalid". It is completely ignorant of the higher-level meaning of the string that is translated (e.g. email address) and operates solely on the constituent characters.

That is in contrast to validation, which is a simple function that decides whether a given object is admissible or not (as you say yourself). "Admissible" here is with respect to a meaning that is higher-level than lexical syntax. It is semantic (is this a valid email address?), not syntactic.
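To make the contrast concrete, here is a minimal Python sketch. `html.escape` is the standard-library function; `escape_for_html` and `looks_like_email` are hypothetical names, and the email check is deliberately crude and only an assumption for illustration, not a real email validator.

```python
import html


def escape_for_html(text: str) -> str:
    # Escaping: a total, character-level translation.
    # Every input string has an escaped form; nothing is rejected.
    return html.escape(text, quote=True)


def looks_like_email(text: str) -> bool:
    # Validation: a predicate about the meaning of the whole string.
    # Hypothetical, deliberately simplistic rule: exactly one "@"
    # with something non-empty on both sides.
    local, sep, domain = text.partition("@")
    return bool(local) and sep == "@" and bool(domain) and "@" not in domain


samples = ["alice@example.com", "<b>not an email</b>", "a&b@example.com"]
for s in samples:
    # The escaper handles all three; the validator accepts only two.
    print(escape_for_html(s), looks_like_email(s))
```

The escaper never looks at whether the string is an email address, and the validator never changes the string.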

(There are sometimes certain technical restrictions on which values can be represented in a lexical syntax, for example hard limits on string lengths. So there is a small extent to which "validation" can serve a purpose in relation to lexical syntax, too - but that's not what we're discussing).

To make it even more confusing, email addresses conform to an (albeit poorly specified) lexical syntax, too. And you can certainly attempt to validate whether a given string is a valid email address. However, HTML syntax doesn't care about that. Email address syntax is not part of the HTML syntax. HTML specifies how to escape _strings_, not email addresses.

And HTML syntax is right not to care about email syntax, because doing so would be an unnecessary complication in practice.

Just as in the other examples I gave. E.g. looking up email addresses from an address book is not a task that in practice needs to be more specific than looking up a string from a list of strings.

> Anyway the only way to validate an e-mail in practice is to use it and confirm.

Which was my initial statement "Just use it" that you heavily disagreed with.


> Escaping is a mechanical process that doesn't discriminate between "valid" and "invalid". It is completely ignorant of the higher-level meaning of the string that is translated (e.g. email address) and operates solely on the constituent characters.

It does discriminate between "valid" and "invalid". This symbol is "valid" and we don't need to do anything. This symbol is "invalid" and we need to escape it. Validation occurs throughout the whole abstraction stack. Not only at the level of meaning of an entire string.

> Which was my initial statement "Just use it" that you heavily disagreed with.

In the case of e-mail I don't disagree with you. It is however balls to the wall insane to say "Just use it" in general. Which was my point. Notice how my reply specifically mentions vulnerabilities that were caused by the "just use it" mantra.


> This symbol is "valid" and we don't need to do anything. This symbol is "invalid" and we need to escape it.

It's quite a stretch to call symbols that need to be escaped “invalid”. And it's often possible to escape without discerning between “valid” and “invalid” characters. For example, in HTML you might just convert all characters into numeric entities.
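A minimal Python sketch of that numeric-entity approach, assuming a hypothetical helper named `to_numeric_entities`. It treats every character the same way, so there is no "valid"/"invalid" branch at all:

```python
import html


def to_numeric_entities(text: str) -> str:
    # Encode every character as a decimal numeric character reference,
    # without ever asking whether it "needs" escaping.
    return "".join(f"&#{ord(ch)};" for ch in text)


encoded = to_numeric_entities("a<b")
print(encoded)  # &#97;&#60;&#98;

# Decoding with the standard library recovers the original string.
print(html.unescape(encoded) == "a<b")
```

The output is much longer than with selective escaping, which is presumably why nobody does this in practice, but it decodes to the same string.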

> It is however balls to the wall insane to say "Just use it" in general.

Good thing the parent didn't say it in general.



