Hacker News new | comments | ask | show | jobs | submit login

Great list. A few questions:

* How could this be used to test 'corrupt' characters? Isn't the process of savign the file itself as UTF-8 un-corrupt...the file?

* Is there some recommended way to group these into "strings that should pass validation" versus "strings that should fail"... or is that too application-specific?

Guidelines | FAQ | Support | API | Security | Lists | Bookmarklet | Legal | Apply to YC | Contact