
Because technically it's not very sound. Examples of where "minifying" (removing whitespace) could go wrong if applied to every served HTML document...

1) pre tags - since pre tags preserve formatting, removing whitespace is going to alter how the content is rendered

2) any empty tags - it's becoming a thing of the past, but there are many instances where browsers will render a tag with a single space inside differently than a tag with nothing inside. In other words, a space within a tag may be intentional on the developer's part.

3) spaces inside attributes may matter - you could have an attribute on an HTML tag such as data-whatever="1 2\n 3", and collapsing those spaces could be bad, depending on what the developer intended
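
To make the three cases concrete, here is a minimal sketch of the kind of regex-only "minifier" that runs into all of them. The function name naive_minify and the sample markup are made up for illustration; this is not taken from any real tool.

```python
import re

def naive_minify(html: str) -> str:
    """Collapse every whitespace run to one space, then drop whitespace between tags."""
    collapsed = re.sub(r"\s+", " ", html)
    return re.sub(r">\s+<", "><", collapsed)

# Hypothetical page covering the three cases: a pre block, a tag holding
# a single space, and an attribute value that contains a newline.
page = (
    '<pre>line 1\n    line 2</pre>\n'
    '<span> </span>\n'
    '<div data-whatever="1 2\n 3"></div>'
)

result = naive_minify(page)
print(result)
```

In the output the pre block ends up on a single line, the span loses its space, and the newline inside data-whatever is gone. Whether any of that matters depends on the page, which is the point being made above.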

Additionally there are some other things to consider...

1) GZIP if used will make the impact of scrubbing out whitespace almost nonexistent

2) Most HTML served is dynamic, meaning that the HTML compression will need to be run on every HTML response - this could carry a performance cost. (If you're just compressing static HTML once, it should be fine.)
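
The GZIP point is easy to check for yourself. Below is a rough sketch that builds a synthetic, heavily indented list (invented for this example, with no pre tags so that stripping is safe) and compares the bytes saved by removing inter-tag whitespace before and after gzip. Real pages will give different numbers, so treat it as a way to measure, not as a result.

```python
import gzip
import re

# Synthetic document: 200 indented list items. Purely illustrative.
rows = "".join(
    f'    <li>\n        <a href="/item/{i}">Item {i}</a>\n    </li>\n'
    for i in range(200)
)
original = f"<ul>\n{rows}</ul>\n".encode()

# Remove whitespace that sits between tags.
stripped = re.sub(rb">\s+<", b"><", original).strip()

raw_saved = len(original) - len(stripped)
gz_saved = len(gzip.compress(original)) - len(gzip.compress(stripped))

print(f"bytes saved before gzip: {raw_saved}")
print(f"bytes saved after gzip:  {gz_saved}")
```

Repeated indentation is exactly the kind of redundancy gzip already removes, so the second number should come out much smaller than the first on a document like this.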




None of your first three points are really a problem. No, you won't be able to use a simple regex-based solution, but the rules for where whitespace matters in HTML are rigorously standardized. Obey them, and your minifier will work just fine.
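
As a rough illustration of what "obeying the rules" can look like, here is a small sketch built on Python's standard html.parser. The Minifier class, the PRESERVE set and the minify helper are names invented for this example. It only collapses whitespace runs in ordinary text to a single space, leaves pre, textarea, script and style content alone, and copies start tags verbatim so attribute values are never touched. It does not account for CSS such as white-space: pre on other elements, so it is a starting point and not a complete minifier.

```python
import re
from html.parser import HTMLParser

# Elements whose text content is copied through unchanged (assumed list).
PRESERVE = {"pre", "textarea", "script", "style"}

class Minifier(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.depth = 0  # how many PRESERVE elements we are currently inside

    def handle_starttag(self, tag, attrs):
        # Raw start tag text, so attribute values keep their exact whitespace.
        self.out.append(self.get_starttag_text())
        if tag in PRESERVE:
            self.depth += 1

    def handle_startendtag(self, tag, attrs):
        self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag in PRESERVE and self.depth:
            self.depth -= 1
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if self.depth == 0:
            data = re.sub(r"\s+", " ", data)  # collapse, never fully remove
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")

    # Comments and processing instructions are dropped, because their
    # handlers are not overridden and the defaults do nothing.

def minify(html: str) -> str:
    parser = Minifier()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

sample = (
    '<div data-whatever="1 2\n 3">\n'
    '    <p>Hello,\n       world</p>\n'
    '    <pre>line 1\n    line 2</pre>\n'
    '</div>'
)
print(minify(sample))
```

On the sample, the indentation between tags and the line break inside the paragraph collapse to single spaces, while the pre content and the data-whatever value come through untouched.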

>1) GZIP if used will make the impact of scrubbing out whitespace almost nonexistent

Maybe if you were just removing whitespace (although you will still see a difference). Removing comments and omitting optional closing tags will take you further. Minified JS compresses smaller than un-minified JS, so it's reasonable to think the same would be true of HTML.

>Most HTML served is dynamic, meaning that the HTML compression will need to be run on every HTML response - this could have some performance negatives. (If you're just compressing static HTML once it should be fine.)

For templated HTML, the minification should be done on the template itself, not on the final output. You really do have to weigh the pros and cons of GZIPping dynamically generated HTML, so pre-minified HTML templates could be a pretty big win.
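
A minimal sketch of that idea, using string.Template from the Python standard library as a stand-in for whatever template engine is actually in use. The names minify_template, PAGE and render are hypothetical, and the whitespace collapsing assumes the template contains no pre, textarea, script or style content.

```python
import re
from string import Template

def minify_template(source: str) -> str:
    """Collapse whitespace runs in the template source (assumes no pre-like elements)."""
    return re.sub(r"\s+", " ", source).strip()

TEMPLATE_SOURCE = """
<ul>
    <li>$name</li>
</ul>
"""

# Done once, at load time, not on every response.
PAGE = Template(minify_template(TEMPLATE_SOURCE))

def render(name: str) -> str:
    # Per-request work is only the substitution; no minification happens here.
    return PAGE.substitute(name=name)

print(render("a  b"))
```

Because the minifier only ever sees the template, values substituted at request time keep whatever whitespace they came with, and the per-request cost stays the same as rendering an unminified template.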


I don't know how the parent comment isn't rated higher, as I logged in just to say exactly that.

Gzip is basically exactly what the GP wants, and beyond that, whitespace in HTML is still, sadly, significant in places.



