I have made a framework that generates a HTML "DOM" tree on the server, as a tree of python objects, and then spits it out as a string to be sent to the client. The way it does this is via a recursive depth-first traversal of the tree: for example a div would spit out the opening "div", spit out all it's children's html and then spit out the closing "/div".
This tree is broken down into conceptual components, as shown below:
graph http://lhy.mit.edu/media/Flow_Chart.png
This only shows the first two levels of hierarchy; the actual site has many more: for example each comment in the comment bar is a self contained component, each button on the menu bar is a self contained component. As you can see, the various components do not need to be on the same depth in the tree. What constitutes a "component" is decided by me.
What I want is the complete html string for each component (everything from the root node of that component downwards), as well as the partial HTML string for every component (The HTML of that component, minus the HTML of its children). The partial HTML of main section, for example, would be the html, head and two div tags only. The complete html of main section, on the other hand, would be every node on the page.
How would i do this? I could just find the complete HTML string of every component and sub-component, mark the boundaries of each sub-component with some string and do Regex-Removals in order to find the partial HTML string for every component, but that feels clunky and inefficient.
I could do an iterative-deepening DFS, halting at the boundary between a component and its sub-components until every node in that component has been explored. I would then have the partial HTML for every component but then i would need to do a similarly hacky Regex-Inserts to later build up the complete HTML for every component.
I could do both, but that would take two passes and would be expensive, though maybe not as expensive as the above Regex gymnastics.
I could do a priority-queue Dijkstra's, having each component be strictly higher priority than its children. It would traverse the tree in the correct order, finishing each component before moving on to its children, but i have no idea how i would get the final well-formed HTML string out of it.
The purpose of all this is so the server can intelligently and completely autonomously determine the minimal set of components on the client's page that need to change on a page-transition between two arbitrary pages.
If i create a new page on my site, I should need no more than Zero extra lines of code to have it ajax smoothly with any existing page.
But first i need to get my graph-traversing html-spewing algorithms in order. Any ideas?