What layer can you realistically have below the DOM? If you get rid of it, how can you realistically avoid reimplementing everything? I don't think this makes sense.
I think UIKit (and possibly Android, which I haven't worked with) has the right idea: nested, texture-backed views that can render their own content.
You might argue that canvas and WebGL nodes in the DOM already serve this role, but I disagree. In their current form, these nodes exist inside the document-based world of the web rather than containing it, which inverts what I see as the proper hierarchy. This arrangement causes a number of problems when designing rich apps, including poor performance caused by content reflow. Some companies[1] are trying to fix this by doing all their rendering manually inside canvas and re-implementing HTML and CSS along the way. Unfortunately, this is a ton of work and results in a web experience that is non-standard in many ways, including for things like accessibility and text selection. The fact that it nonetheless significantly improves the user experience suggests that something needs to change drastically for the web to remain healthy, useful, and relevant.
My understanding is that the DOM is already implemented as a series of texture-backed views in many browsers. That would stay the same; nobody would have to reimplement this functionality. I just think it would be a great idea for us to be able to use those same texture-backed views for custom UI unrelated to HTML and CSS, and to separate the rendering concerns of text and document flow from the design of user interfaces.
In my experience, HTML-style layout is just horrible for app UI. It's really much better suited to text. On iOS, you can lay out app UI either with Auto Layout (constraint-based layout) or manually, in code.
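To make "manual layout in code" a bit more concrete, here's a rough sketch in TypeScript. The `Rect` and `View` types and the `layoutSplit` function are made up for illustration; they are not UIKit or any browser API. The point is only that the app computes each view's frame directly, with no document flow involved.

```typescript
// Hypothetical types for illustration only; not UIKit or a browser API.
interface Rect {
  x: number;
  y: number;
  width: number;
  height: number;
}

interface View {
  name: string;
  frame: Rect; // position and size in the parent's coordinate space
}

function makeView(name: string): View {
  return { name, frame: { x: 0, y: 0, width: 0, height: 0 } };
}

// Manual layout: a fixed-width sidebar on the left, content fills the rest.
// The sidebar is clamped so it never exceeds the available width.
function layoutSplit(
  sidebar: View,
  content: View,
  bounds: Rect,
  sidebarWidth: number
): void {
  const w = Math.min(sidebarWidth, bounds.width);
  sidebar.frame = { x: 0, y: 0, width: w, height: bounds.height };
  content.frame = { x: w, y: 0, width: bounds.width - w, height: bounds.height };
}

const sidebar = makeView("sidebar");
const content = makeView("content");

// Re-run this whenever the container is resized.
layoutSplit(sidebar, content, { x: 0, y: 0, width: 800, height: 600 }, 200);
```

Changing one view's frame here only affects what the layout function itself decides to touch, which is the property I'm after.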
I think it should be a subset of DOM / CSS designed for speed and flexibility. A normal browser could render the page correctly but an optimized browser for apps would be able to make some assumptions to render everything much faster. Something like asm.js but for DOM/CSS.