A fun way, though there are probably more productive ones, is to learn Scheme or another Lisp and, using an implementation that has a library for it, convert the HTML into one big s-expression. Then the document is in the same form as the language's own code and data, so you can take it apart, search it, and rebuild it with the ordinary list operations you use for everything else.
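To give a rough idea of what that looks like, here is a minimal sketch in Python rather than Scheme, using only the standard library's `html.parser`. It is not how any particular Lisp library does it: the names `SexpBuilder`, `html_to_sexp`, `to_lisp`, and `find_all` are made up for illustration, and the `*top*` root and `@` attribute marker are just loosely modelled on the SXML convention. It also assumes reasonably well-formed HTML.

```python
from html.parser import HTMLParser

# Tags that never take a closing tag, so they must not stay open on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class SexpBuilder(HTMLParser):
    """Builds nested lists shaped like [tag, ["@", [name, value], ...], child, ...]."""

    def __init__(self):
        super().__init__()
        self.root = ["*top*"]      # hypothetical root marker (assumption)
        self.stack = [self.root]   # currently open elements, innermost last

    def handle_starttag(self, tag, attrs):
        node = [tag]
        if attrs:
            # Valueless attributes (e.g. "disabled") become a one-element list.
            pairs = [[n] if v is None else [n, v] for n, v in attrs]
            node.append(["@"] + pairs)
        self.stack[-1].append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():           # drop whitespace-only text nodes
            self.stack[-1].append(data)


def html_to_sexp(html):
    """Parse an HTML string into the nested-list form described above."""
    builder = SexpBuilder()
    builder.feed(html)
    builder.close()
    return builder.root


def to_lisp(node):
    """Render the nested lists as s-expression text."""
    if isinstance(node, str):
        return '"' + node.replace("\\", "\\\\").replace('"', '\\"') + '"'
    rest = " ".join(to_lisp(child) for child in node[1:])
    return "(" + node[0] + (" " + rest if rest else "") + ")"


def find_all(node, tag):
    """Yield every sub-list whose head is `tag`.

    Attribute pairs are lists too, so an attribute named like a tag would
    also match; good enough for a sketch.
    """
    if isinstance(node, list):
        if node[0] == tag:
            yield node
        for child in node[1:]:
            yield from find_all(child, tag)


tree = html_to_sexp('<ul class="nav"><li><a href="/a">A</a></li><li>B<br></li></ul>')
print(to_lisp(tree))
# (*top* (ul (@ (class "nav")) (li (a (@ (href "/a")) "A")) (li "B" (br))))
print([to_lisp(li) for li in find_all(tree, "li")])
```

Once the page is just nested lists, pulling out every `li` or every link is a few lines of plain recursion, which is the whole appeal. In an actual Lisp the tree would be quoted data you could feed straight to `map`, pattern matching, or a macro.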