
2nd USENIX Conference on Web Application Development ...


Figure 1: C3’s modular architecture. [Diagram not reproduced. Component labels: HtmlParser, CssParser, IHtmlParser, ICssParser, IDownloadManager, JS Engine, DOM implementation (Node, Element, ...), UI, Event loop, Default download manager, Browser executable, WinForms renderer, ILayoutListener, Layout, IRenderer. Legend: Assembly, Class, Interface, Communication, Implementation, Threads.]

browsers. Using a higher-level language better preserves abstractions and simplifies many implementation details. Code Contracts [4] are used throughout C3 to ensure implementation-level invariants and safety properties, something that is not feasible in existing browsers.

Below, we sketch C3’s module-level architecture, and elaborate on several core design choices and resulting customization opportunities. We also highlight features that enable the extension points examined in Section 3.

2.1 Pieces of an HTML platform

The primary task of any web platform is to parse, render, and display an HTML document. For interactivity, web applications additionally require the managing of events such as user input, network connections, and script evaluation. Many of these sub-tasks are independent; Figure 1 shows C3’s module-level decomposition of these tasks. The HTML parser converts a text stream into an object tree, while the CSS parser recognizes stylesheets. The JS engine dispatches and executes event handlers. The DOM implementation implements the API of DOM nodes, and implements bindings to expose these methods to JS scripts. The download manager handles actual network communication and interactions with any on-disk cache. The layout engine computes the visual structure and appearance of a DOM tree given current CSS styles. The renderer displays a computed layout. The browser’s UI displays the output of the renderer on the screen, and routes user input to the DOM.
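The division of labor above can be sketched as a chain of independent stages. The following Python sketch is illustrative only: every function name, the toy "one paragraph per line" HTML format, and the "tag: height" stylesheet format are assumptions made for this example, not C3's actual code or interfaces.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Node:
    """A toy DOM node: a tag name, optional text, and child nodes."""
    tag: str
    text: str = ""
    children: list = field(default_factory=list)


def parse_html(source: str) -> Node:
    """Stand-in HTML parser: wraps each non-empty line in a <p> node."""
    body = Node("body")
    for line in source.splitlines():
        if line.strip():
            body.children.append(Node("p", line.strip()))
    return body


def parse_css(source: str) -> dict:
    """Stand-in CSS parser: reads hypothetical 'tag: height' pairs, one per line."""
    styles = {}
    for line in source.splitlines():
        if ":" in line:
            tag, height = line.split(":", 1)
            styles[tag.strip()] = int(height)
    return styles


def layout(root: Node, styles: dict) -> list:
    """Stand-in layout engine: stacks children vertically as (y, text) boxes."""
    boxes, y = [], 0
    for child in root.children:
        boxes.append((y, child.text))
        y += styles.get(child.tag, 1)  # default height of 1 if no style applies
    return boxes


def render(boxes: list) -> str:
    """Stand-in renderer: turns the computed layout into display text."""
    return "\n".join(f"y={y}: {text}" for y, text in boxes)


dom = parse_html("Hello\n\nWorld")
styles = parse_css("p: 20")
output = render(layout(dom, styles))
print(output)
```

Each stage consumes only the output of the previous one, which is what makes the sub-tasks separable into modules.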

2.2 Modularity

Unlike many modern browsers, C3’s design embraces loose coupling between browser components. For example, it is trivial to replace the HTML parser, renderer frontend, or JS engine without modifying the DOM implementation or layout algorithm. To make such drop-in replacements feasible, C3 shares no data structures between modules when possible (i.e., each module is heap-disjoint). This design decision also simplifies threading disciplines, and is further discussed in Section 2.7.
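One way to picture heap-disjoint modules is to have a module hand out copies of its data rather than references to it. The sketch below assumes a deep-copy handoff; the class and method names are hypothetical, and the mechanism C3 actually uses to keep modules disjoint is not described in this section.

```python
import copy


class DomModule:
    """Owns the document data; no other module holds a reference to it."""

    def __init__(self):
        self._tree = {"tag": "body", "children": [{"tag": "p", "text": "hi"}]}

    def snapshot(self):
        # Hand out a deep copy so the receiver never aliases internal state.
        return copy.deepcopy(self._tree)

    def set_text(self, text):
        self._tree["children"][0]["text"] = text


class LayoutModule:
    """Works only on the copy it was given."""

    def __init__(self):
        self._tree = None

    def receive(self, tree):
        self._tree = tree

    def first_text(self):
        return self._tree["children"][0]["text"]


dom = DomModule()
layout = LayoutModule()
layout.receive(dom.snapshot())
dom.set_text("changed")       # mutate the DOM module's private state
print(layout.first_text())    # the layout module still sees its own copy
```

Because neither module can reach into the other's objects, each can run on its own thread without locking shared structures, at the cost of copying data across the boundary.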

Simple implementation-agnostic interfaces describe the operations of the DOM implementation, HTML parser, CSS parser, JS engine, layout engine, and front-end renderer modules. Each module is implemented as a separate .NET assembly, which prevents modules from breaking abstractions and makes swapping implementations simple. Parsers could be replaced with parallel [8] or speculative⁴ versions; layout might be replaced with a parallel [11] or incrementalizing version, and so on. The default module implementations are intended as straightforward, unoptimized reference implementations. This permits easy per-module evaluations of alternate implementation choices.
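The drop-in replacement idea can be illustrated with a small interface and two interchangeable implementations. In this sketch the interface name IHtmlParser comes from Figure 1, but its `parse` method, the two parser classes, and the `Browser` class are assumptions made for illustration; the chunked parser merely stands in for a parallel or speculative one.

```python
from __future__ import annotations

from typing import Protocol


class IHtmlParser(Protocol):
    """Implementation-agnostic parser interface (the method is hypothetical)."""

    def parse(self, source: str) -> list: ...


class SimpleParser:
    """Reference-style implementation: straightforward and unoptimized."""

    def parse(self, source: str) -> list:
        return [line.strip() for line in source.splitlines() if line.strip()]


class ChunkedParser:
    """Alternate implementation that processes the input in chunks of lines."""

    def __init__(self, chunk_size: int = 2) -> None:
        self.chunk_size = chunk_size

    def parse(self, source: str) -> list:
        lines = source.splitlines()
        result = []
        for start in range(0, len(lines), self.chunk_size):
            chunk = lines[start:start + self.chunk_size]
            result.extend(line.strip() for line in chunk if line.strip())
        return result


class Browser:
    """Depends only on the interface, so the parser can be swapped freely."""

    def __init__(self, parser: IHtmlParser) -> None:
        self.parser = parser

    def load(self, source: str) -> int:
        # Returns the number of parsed elements.
        return len(self.parser.parse(source))


page = "<p>a</p>\n\n<p>b</p>\n<p>c</p>"
print(Browser(SimpleParser()).load(page))
print(Browser(ChunkedParser()).load(page))
```

Since `Browser` never names a concrete parser class, comparing two implementations requires changing only the constructor argument.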

2.3 DOM implementation

The DOM API is a large set of interfaces, methods and properties for interacting with a document tree. We highlight two key design choices in our implementation: what the object graph for the tree looks like, and the bindings of these interfaces to C# classes. Our choices aim to minimize overhead and “boilerplate” coding burdens for extension authors.

Object trees: The DOM APIs are used throughout the browser: by the HTML parser (Section 2.4) to construct the document tree, by JS scripts to manipulate that tree’s structure and query its properties, and by the layout engine to traverse and render the tree efficiently. These clients use distinct but overlapping subsets of the APIs, which means they must be exposed both to JS and to C#, which in turn leads to the first design choice.

One natural choice is to maintain a tree of “implementation” objects in the C# heap separate from a set of “wrapper” objects in the JS heap⁵ containing pointers to their C# counterparts: the JS objects are a “view” of the underlying C# “model”. The JS objects contain stubs for all the DOM APIs, while the C# objects contain implementations and additional helper routines. This design incurs the overheads of extra pointer dereferences (from the JS APIs to the C# helpers) and of keeping the wrappers synchronized with the implementation tree. However, it permits specializing both representations for their respective usages, and the extra indirection enables

⁴ http://hsivonen.iki.fi/speculative-html5-parsing/

⁵ Expert readers will recognize that “objects in the JS heap” are implemented by C# “backing” objects; we are distinguishing these from C# objects that do not “back” any JS object.
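The wrapper-and-implementation design described in Section 2.3 can be sketched as two parallel classes. This is a minimal Python illustration of the general pattern, not C3's code: `ElementImpl`, `ElementWrapper`, and the helper methods are hypothetical names, while `tagName`, `appendChild`, and `childElementCount` mirror standard DOM members.

```python
class ElementImpl:
    """'Implementation' object: holds the real state and helper routines."""

    def __init__(self, tag):
        self.tag = tag
        self.children = []

    def append_child(self, child):
        self.children.append(child)

    def child_count(self):
        return len(self.children)


class ElementWrapper:
    """'Wrapper' object: a script-facing stub that forwards to its implementation."""

    def __init__(self, impl):
        self._impl = impl  # pointer to the implementation counterpart

    @property
    def tagName(self):
        # Every access pays one extra dereference through self._impl.
        return self._impl.tag.upper()

    def appendChild(self, child):
        # Unwrap the argument and update the implementation tree.
        self._impl.append_child(child._impl)
        return child

    @property
    def childElementCount(self):
        return self._impl.child_count()


body = ElementWrapper(ElementImpl("body"))
body.appendChild(ElementWrapper(ElementImpl("p")))
print(body.tagName, body.childElementCount)
```

The sketch makes the two overheads visible: each wrapper call forwards through `_impl`, and a fuller version would also need some bookkeeping so that the same implementation node always maps to the same wrapper.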

USENIX Association WebApps ’11: 2nd USENIX Conference on Web Application Development 63
