

Fun with fonts. Rendering obfuscated HTML. - woodall
http://christopherwoodall.com/ceasar/

======
imurray
[EDIT: I assume the main point was to make documents readable, but hard to
copy. My comment is about breaking this trivial copy protection scheme. I see
that other comments read the purpose of the font differently.]

Given a long document, one could easily crack this Caesar substitution cipher.
Of course one could also do OCR on the characters to learn the mapping.
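
To illustrate the first attack, here is a minimal sketch. It assumes the font implements a plain Caesar shift over lowercase a-z; a general substitution alphabet would need a search over permutations instead. The frequency table is approximate and the function names are illustrative, not taken from the demo.

```python
import string

# Approximate relative frequencies of English letters (percent).
ENGLISH_FREQ = {
    "a": 8.2, "b": 1.5, "c": 2.8, "d": 4.3, "e": 12.7, "f": 2.2, "g": 2.0,
    "h": 6.1, "i": 7.0, "j": 0.15, "k": 0.77, "l": 4.0, "m": 2.4, "n": 6.7,
    "o": 7.5, "p": 1.9, "q": 0.095, "r": 6.0, "s": 6.3, "t": 9.1, "u": 2.8,
    "v": 0.98, "w": 2.4, "x": 0.15, "y": 2.0, "z": 0.074,
}


def shift_text(text: str, shift: int) -> str:
    """Shift lowercase letters by `shift` places; leave everything else alone."""
    out = []
    for ch in text:
        if ch in string.ascii_lowercase:
            out.append(chr((ord(ch) - ord("a") + shift) % 26 + ord("a")))
        else:
            out.append(ch)
    return "".join(out)


def english_score(text: str) -> float:
    """Higher means the letter mix looks more like English."""
    return sum(ENGLISH_FREQ.get(ch, 0.0) for ch in text)


def crack_caesar(ciphertext: str) -> tuple[int, str]:
    """Try all 26 shifts and keep the candidate that scores best."""
    candidates = [(shift, shift_text(ciphertext, -shift)) for shift in range(26)]
    return max(candidates, key=lambda c: english_score(c[1]))
```

The scoring only becomes reliable with enough text, which is why a long document makes the job easy.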

Tangentially, these thoughts remind me of a cool Master's thesis [1] where
they did basic OCR, by clustering blobs of ink and solving the Caesar
substitution cipher, rather than trying to recognize the shapes of the
letters! That approach can be used to adapt a real OCR system to the current
document.

[1] <http://www.cs.toronto.edu/~scottl/research/msc_thesis.pdf>

------
AdleyEskridge
This is a really neat demo! That said, wouldn't this be extraordinarily
inaccessible to users employing screen-reading software? They'd "hear" a
jumble of characters.

~~~
woodall
This would be horrible for anyone using screen readers. You could use
something like longdesc but then you've ruined your obfuscation. Really not
something one should apply in production.

The only saving grace for this might be in internal sites where you want to
display information to the user but do not want that user copy/pasting
sensitive information into emails/chat/etc.

------
jgrahamc
Could be made considerably stronger by using a homophonic substitution cipher.
Given that there are lots and lots of Unicode characters it would be pretty
easy to flatten the distribution to make letter frequency analysis hard.
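
A rough sketch of the homophonic idea, under stated assumptions: each letter gets a number of cipher symbols roughly proportional to its English frequency, and the symbols are taken from the Unicode Private Use Area starting at U+E000. The counts, the code point range, and the function names are all illustrative. The `decrypt` table stands in for the font, which would draw every symbol as its plain letter.

```python
import random

# Rough share of each letter in English text; a letter gets this many
# symbols, so each individual symbol shows up about equally often.
HOMOPHONE_COUNTS = {
    "a": 8, "b": 2, "c": 3, "d": 4, "e": 13, "f": 2, "g": 2, "h": 6, "i": 7,
    "j": 1, "k": 1, "l": 4, "m": 2, "n": 7, "o": 8, "p": 2, "q": 1, "r": 6,
    "s": 6, "t": 9, "u": 3, "v": 1, "w": 2, "x": 1, "y": 2, "z": 1,
}

PRIVATE_USE_START = 0xE000  # assumption: the font defines glyphs here


def build_homophone_table(rng):
    """Return {letter: [symbols]} with the symbol pool shuffled across letters."""
    total = sum(HOMOPHONE_COUNTS.values())
    pool = [chr(PRIVATE_USE_START + i) for i in range(total)]
    rng.shuffle(pool)
    table, start = {}, 0
    for letter, count in HOMOPHONE_COUNTS.items():
        table[letter] = pool[start:start + count]
        start += count
    return table


def encrypt(text, table, rng):
    """Replace each known letter with one of its symbols, chosen at random."""
    return "".join(rng.choice(table[ch]) if ch in table else ch for ch in text)


def decrypt(ciphertext, table):
    """What the font does: map every symbol back to its plain letter."""
    reverse = {sym: letter for letter, syms in table.items() for sym in syms}
    return "".join(reverse.get(ch, ch) for ch in ciphertext)
```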

~~~
pbhjpbhj
If you do things as a polyalphabetic substitution then, provided you limited
the length of the ciphertext by the available letter spaces in the font file,
couldn't you avoid frequency analysis completely? So for example each letter s
in the plaintext would be generated from a different ciphertext symbol.
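
One way to read this, sketched under assumptions: every character position gets its own code point, so no ciphertext symbol ever repeats. The Private Use Area range and the names `encode_unique` and `render` are hypothetical; `glyph_map` stands in for the font file.

```python
PRIVATE_USE_START = 0xE000  # assumption: glyph slots come from the Private Use Area
PRIVATE_USE_SIZE = 0xF8FF - 0xE000 + 1  # 6400 code points in that block


def encode_unique(plaintext):
    """Give every character position its own code point.

    Returns (ciphertext, glyph_map). glyph_map plays the role of the font:
    it records which plain character each code point should be drawn as.
    """
    if len(plaintext) > PRIVATE_USE_SIZE:
        raise ValueError("plaintext is longer than the available glyph slots")
    glyph_map = {}
    symbols = []
    for index, ch in enumerate(plaintext):
        symbol = chr(PRIVATE_USE_START + index)
        glyph_map[symbol] = ch
        symbols.append(symbol)
    return "".join(symbols), glyph_map


def render(ciphertext, glyph_map):
    """Draw each code point as the character the font assigns to it."""
    return "".join(glyph_map[symbol] for symbol in ciphertext)
```

Since each symbol occurs exactly once, counting symbols in the page tells an attacker nothing. The catch is that the font now carries the whole message, so this only helps if the font is harder to get than the page.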

------
Paul_S
Is this meant for copy protecting text on the website?

If I can see it I can copy it regardless of how fun you make the process.

You'd be better off just releasing it under a license of choice. Technical
means of protection are pointless.

------
pbhjpbhj
How about having a scrambled page and using a separate channel to distribute
the font. Yes it's still just a substitution cipher but various tricks could
be added to make it more interesting - you could add in steganography for
example (eg by a particular font characteristic) or use a sparse font file
generated for each para or have the font file as a one time pad (each
cyphertext letter is replaced by a word).

~~~
woodall
I've been playing with a few ways to abuse the browser. You /might/ be able to
hide the font in an image file[1] so that it's harder to see. I have a few
more tricks but can't show them at this moment.

[1] <http://jsfiddle.net/a2zK5/>

------
rblatz
The reverse of this is actually the interesting part. Give scrambled HTML to
the browser, use a font to convert it to human readable text.

Obviously this wouldn't stop anyone willing to put a bit of effort into
decoding it, but I bet lyric sites start using this soon.

Edit: Apparently I was confused like several others, but came to the right
conclusion anyways.

~~~
benholmen
Why would a lyric site want to obfuscate lyrics in the source code? It seems
like that'd be counterproductive since their traffic is largely driven by
search engines.

~~~
TazeTSchnitzel
Eh, they could render it differently for Googlebot.

~~~
nekitamo
Then they would get hit with a cloaking penalty when Googlebot figures out
it's being cloaked.

------
MindTwister
This would be decoded in about 5 seconds, since the first thing I'd do would
be copying the text into my editor of choice...

~~~
woodall
Try to copy and paste the "ciphered" text into a text editor and see what you
get. You can take it a step further and generate font files on the fly to make
it even harder to predict the sequence; you can even map the harder-to-see
Unicode characters. By no means is it secure, but I thought it cute.
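
A hedged sketch of the on-the-fly part: a fresh random letter mapping per page, with the matching table a generated font would need. It does not build a real font file, and `new_page_mapping`, `scramble`, and `font_table` are hypothetical names, not part of the demo.

```python
import random
import string


def new_page_mapping(rng):
    """Random permutation: plain letter -> character written into the HTML."""
    letters = list(string.ascii_lowercase)
    shuffled = letters[:]
    rng.shuffle(shuffled)
    return dict(zip(letters, shuffled))


def scramble(text, mapping):
    """Apply a mapping to lowercase letters; other characters pass through."""
    return "".join(mapping.get(ch, ch) for ch in text)


def font_table(mapping):
    """Inverse mapping: character in the HTML -> letter shape the font draws."""
    return {cipher: plain for plain, cipher in mapping.items()}
```

Rendering is just `scramble(html_text, font_table(mapping))`, since the font applies the inverse permutation.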

------
ecesena
Apart from the utility, it's really cool! Looking forward to seeing a Vigenère cipher
;)

------
themstheones
I'm missing the point. Who wouldn't view source / inspect element to see the
real text?

~~~
mistercow
I think you're thinking of it backwards. The source looks jumbled, but the
font that the browser renders the page with is readable because the font acts
as a substitution cipher.
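
A minimal sketch of that mechanism, assuming a shift of 3 (the thread does not state the demo's actual offset). The dictionary stands in for the font's character-to-glyph table; all names are illustrative.

```python
import string

SHIFT = 3  # assumption: the real offset used by the demo is not given


def scramble_for_source(text, shift=SHIFT):
    """What goes into the HTML: each lowercase letter moved `shift` places on."""
    letters = string.ascii_lowercase
    table = {letters[i]: letters[(i + shift) % 26] for i in range(26)}
    return "".join(table.get(ch, ch) for ch in text)


def build_glyph_table(shift=SHIFT):
    """Stand-in for the font: character in the source -> letter shape drawn."""
    letters = string.ascii_lowercase
    return {letters[(i + shift) % 26]: letters[i] for i in range(26)}


def render(source_text, glyph_table):
    """What the reader sees once the browser applies the font."""
    return "".join(glyph_table.get(ch, ch) for ch in source_text)


source = scramble_for_source("hello world")
print(source)                             # khoor zruog
print(render(source, build_glyph_table()))  # hello world
```

View source and copy/paste both yield the first line; only the rendered page shows the second.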

Still pretty pointless though, since substitution ciphers are child's play to
break.

~~~
pbhjpbhj
> _Still pretty pointless though, since substitution ciphers are child's play
> to break._ //

It's an interesting result that probably hasn't been considered by most
people.

Also, child's play is big business.

~~~
mistercow
> It's an interesting result that probably hasn't been considered by most
> people.

Is it? I mean it's cute to see it actually implemented, but this is something
I thought of the first time I read about substitution ciphers. I suppose it
is, if nothing else, an interesting way to introduce someone to some basic
cryptography concepts (assuming they already know about typefaces).

~~~
pbhjpbhj
The first time you saw substitution ciphers you thought "hey how about using
webfonts as a one time key"?

I guess because I learnt about such ciphers when our primary school only just
got its first computer, running at 4MHz, the conflation of webfonts and
subst. ciphers never struck me before.

Before today I've seen this done with javascript and, TBH, I think that would
have been the first method that sprang to mind but I've not really ever
bothered to think about it.

Quick search only found one other example of this technique:
<http://eligrey.com/blog/post/tag/rot13> (rot13 fonts, lol) as well as this
<http://jsfiddle.net/QQ9WQ/> from the current author. It's hard to search as
there is a font called "cipher font", which of course is available as a
webfont! Of course there are lots of pages using js, like
<http://rumkin.com/tools/cipher/substitution.php> which does some funky
substitutions.

~~~
mistercow
Don't be glib now; of course I didn't think "webfonts". I thought "fonts". The
addition of "web" to the concept is trivial and uninteresting.

~~~
pbhjpbhj
Trivial I'll give you with reservations but I found it interesting as a
concept to develop. I think it's got more possibilities than just the simple
font, particularly the idea of creating the webfont on the fly.

------
woodall
Sorry, from the comments it's apparent that this was a horrible demo.

~~~
pbhjpbhj
Perhaps you could show the ciphertext page and then have a script operate when
the password (or button-click) is entered that alters the style of the
ciphertext parts so that they use the key font instead. Extra marks if you use
jQuery to fade through several font 'keys' before presenting the plaintext.
Also preload the font files.

IMO that'd be cool.

