
Using type inference to make web templates robust against XSS - revorad
https://js-quasis-libraries-and-repl.googlecode.com/svn/trunk/safetemplate.html
======
ufo
I don't get why textual templates are so popular for generating HTML. Not only
do they tend to have weird rules regarding whitespace and indentation, but they
also have no information about context when escaping. For example, just
escaping special characters will not protect you if you add user content to an
attribute and forget the quotes, since it's easy to break out of an unquoted
attribute with whitespace:

    
    
       data-name=${foo}
       data-name=Hello onmouseover=evilJavascript
    

Using a document-oriented templating language avoids most of these problems
without needing to do HTML parsing to check things (and HTML parsing is
complicated and full of corner cases; it's much worse than XML). (The article
says that this is not expressive enough, since you can't open a tag in one
subtemplate and close it in another, but I think you can see that as a
feature...)
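
As a rough illustration of the unquoted-attribute problem described above, here is a minimal Go sketch. The helper names `renderUnquoted` and `renderQuoted` and the surrounding `<div>` tag are assumptions made for the example; only the `data-name` attribute and the input string come from the comment. It uses the standard `html.EscapeString`, which has nothing to escape in this input, so the unquoted variant still ends up with a second attribute.

```go
package main

import (
	"fmt"
	"html"
)

// renderUnquoted mimics a naive textual template that escapes HTML special
// characters but leaves the attribute value unquoted.
// Hypothetical helper, for illustration only.
func renderUnquoted(name string) string {
	return "<div data-name=" + html.EscapeString(name) + ">"
}

// renderQuoted is the same hypothetical template with the value quoted.
func renderQuoted(name string) string {
	return `<div data-name="` + html.EscapeString(name) + `">`
}

func main() {
	input := "Hello onmouseover=evilJavascript"

	// The input contains no <, >, &, ' or ", so escaping leaves it unchanged
	// and the whitespace ends the unquoted attribute value.
	fmt.Println(renderUnquoted(input))

	// With quotes, the whole input stays inside the data-name value.
	fmt.Println(renderQuoted(input))
}
```

Quoting fixes this particular case, but it relies on the template author remembering to do it everywhere, which is the weakness the comment points at.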

~~~
sorbits
> _I don't get why textual templates are so popular […] you can't open a tag
> in a subtemplate and close it in another one but I think you can see that as
> a feature_

For many, it’s frustrating to work with a model where you can’t split your
template into multiple parts unless each part is a valid document on its
own.

E.g. a blog would have templates for the page header/footer and article
header/footer.

It’s trivial to output the page header template, then iterate over the
articles and output the article header/footer around the article text, which
may be piped through various filters, and finally the page footer template.

I think we can all imagine how such a system looks when the templates are “raw
text” (perhaps with some special tags for placeholders).

I am interested in seeing how it would look with a document-oriented template
language, as I certainly do like the idea of it, but I think that in practice
it adds overhead and restrictions that make the extra safety not worth it.

~~~
nightpool
Why not just use a sub-template model then? Where you have a "page" template
that includes the "header", "content" and "footer" template, possibly even
decided at runtime?
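
A minimal sketch of what such a sub-template model could look like, using Go's `html/template` purely as an example engine. The template names, the `Page` struct and its fields are assumptions invented for illustration; nothing here is taken from the article or from any commenter's actual setup. Note that every sub-template is a well-formed fragment, which is exactly the restriction being debated.

```go
package main

import (
	"bytes"
	"fmt"
	"html/template"
)

// Hypothetical templates: a "page" template that includes three sub-templates.
const pageTemplates = `
{{define "header"}}<header><h1>{{.Title}}</h1></header>{{end}}
{{define "content"}}{{range .Articles}}<article>{{.}}</article>{{end}}{{end}}
{{define "footer"}}<footer>{{.Footer}}</footer>{{end}}
{{define "page"}}<body>{{template "header" .}}{{template "content" .}}{{template "footer" .}}</body>{{end}}
`

// Page is the (assumed) data handed over by the controller.
type Page struct {
	Title    string
	Articles []string
	Footer   string
}

// renderPage parses the templates and executes the top-level "page" template.
func renderPage(p Page) (string, error) {
	t, err := template.New("site").Parse(pageTemplates)
	if err != nil {
		return "", err
	}
	var buf bytes.Buffer
	if err := t.ExecuteTemplate(&buf, "page", p); err != nil {
		return "", err
	}
	return buf.String(), nil
}

func main() {
	out, err := renderPage(Page{
		Title:    "Blog",
		Articles: []string{"one", "a<b"},
		Footer:   "bye",
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(out)
}
```

The loop over articles lives inside the "content" template here, so the iteration logic has to be expressed in the template language rather than in the host program.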

~~~
sorbits
Such a solution requires much of the page-building logic (loops, conditions) to
be moved into the template language, along with the state transfer that goes
with it.

This means trivial tasks require editing multiple files. Say we introduce
pagination: the master template now needs to be told by the controller whether
it should include the “page n of m” template.

I assume ufo avoids this problem (and the limited capabilities of template
languages wrt. logic) by instead using host language libraries to generate the
HTML, but IMO that has the downside of limited interoperability, as the
templates are then just code written for some library.

------
CGamesPlay
The example in the solution sketch leads me to believe the author is
misdirected. URLs have different attack vectors than HTML and have different
escaping rules. In the solution sketch given, the HTML might be safe, but I
can easily put any parameters I want into the generated URL.

When designing systems like this, each security boundary should be handled
separately. Create a URL class that supports escaping fields safely. Create an
HTML template engine that supports escaping contents safely. URL.toString will
give a guaranteed-safe URL, which can be inserted into a guaranteed-safe
attribute later on.
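
One possible reading of this "separate boundaries" idea, sketched in Go. The helpers `searchURL` and `searchLink`, the `/search` path and the `q` parameter are assumptions chosen for the example; the standard `net/url` package stands in for the "URL class", and `html.EscapeString` stands in for the HTML layer.

```go
package main

import (
	"fmt"
	"html"
	"net/url"
)

// searchURL handles the URL boundary: the user-supplied value is
// percent-encoded as a single query parameter, so it cannot add parameters.
// Hypothetical helper, for illustration only.
func searchURL(query string) string {
	u := url.URL{Path: "/search"}
	q := url.Values{}
	q.Set("q", query)
	u.RawQuery = q.Encode()
	return u.String()
}

// searchLink handles the HTML boundary: the finished URL is escaped for use
// inside a quoted attribute, independently of how it was built.
func searchLink(query string) string {
	return `<a href="` + html.EscapeString(searchURL(query)) + `">search</a>`
}

func main() {
	// The "&admin=1" part is encoded into the q value instead of becoming
	// a second parameter.
	fmt.Println(searchURL("cats&admin=1"))
	fmt.Println(searchLink("cats&admin=1"))
}
```

Each layer only needs to know its own escaping rules, which is the separation the comment argues for.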

~~~
ufo
My impression was that they were going for a more "middle ground" approach
that makes it easy to convert existing templates (written in less safe
dialects like jQuery templates). Since the legacy code allows inline URLs with
values interpolated into them, you can't force users to refactor everything to
generate all the URLs outside the templates.

------
cdoxsey
Go's HTML templates implement similar escaping based on context:
(<http://golang.org/pkg/html/template/>)

"This package understands HTML, CSS, JavaScript, and URIs. It adds sanitizing
functions to each simple action pipeline, so given the excerpt

<a href="/search?q={{.}}">{{.}}</a>

At parse time each {{.}} is overwritten to add escaping functions as
necessary. In this case it becomes

<a href="/search?q={{. | urlquery}}">{{. | html}}</a>"

~~~
ufo
Do you know what approach the Go template package actually uses to do the
escaping? The original article is pretty comprehensive about the alternatives
people tend to use and lists pros and cons for most of them. However, when I
searched the source code I found a link[1] back to the original article, so now
I'm just more confused!

[1] <http://golang.org/src/pkg/html/template/doc.go#L169>

~~~
carbocation
Here is escape.go from html/template :

[https://code.google.com/p/go/source/browse/src/pkg/html/temp...](https://code.google.com/p/go/source/browse/src/pkg/html/template/escape.go?name=release-branch.go1)

~~~
ufo
Wait a minute. If it's just adding escaping functions to the templates, who is
actually running the templates?

