August 9, 2011
This article was contributed by Nathan Willis
Mozilla has released a Firefox extension named Tilt that renders web pages as 3D stacks of box-like elements. The 3D structure, so the thinking goes, will offer web developers an important visualization tool when debugging pages. In addition to its practical value, however, Tilt is also a live demonstration of Firefox's support for WebGL, an emerging API for displaying 3D content within the browser.
Preview videos of Tilt were made available in early June, but the first publicly installable version did not hit the web until the end of July. The code is hosted on developer Victor Porof's GitHub page; anyone interested in taking Tilt for a whirl can download the .xpi file from the bin/ directory there and manually install it from Firefox's Add-ons Manager. Firefox 4.0 or newer is required. The source is available there as well, of course.
Once they have installed it, users can activate Tilt from Firefox's
Tools menu (or with the key combination Control-Shift-M). This activates
the Tilt visualization for the current tab only. The page is rendered as a
3D "mesh" in WebGL; elements (including all text and images) are in
full color, which makes a head-on view look virtually identical to the
original page (albeit shrunk by about 25% to make it easier to
manipulate). However, the sides of the page elements' boxes are drawn in a
flat, opaque gray. You can see how many levels deep the stack is when
looking at a side or angled view, but you cannot tell which elements are
which.
The visualization is rendered within the normal Firefox content area.
Technically, it is drawn with WebGL on a canvas element that
is overlaid on the normal frame, so that the page persists without
interrupting any ongoing operations while it is hidden. As a result,
navigation via the location bar, bookmarks and forward/back buttons is
still possible. Any such navigation does switch the Tilt visualization
off, however — as does switching into a different tab and back
again.
Although the normal Gecko-rendered version of the page persists in the
background, the "Tilt view" of a given page is a read-only structure
generated like a snapshot. That is, you cannot interact with the page's
contents at all. Instead, your mouse and keyboard function as
spatial navigation controls within the 3D space. Mouse movement with the
left button held down twists and rotates the visualization in three
dimensions (in what Porof calls "virtual trackball" mode in
the GitHub documentation). Holding down the right button enables
left-right and up-down panning. The scroll wheel zooms in and out. The
arrow and W-A-S-D keys provide keyboard access to the same controls.
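For readers unfamiliar with the term, a "virtual trackball" generally works by projecting the mouse position onto an imaginary sphere and rotating the scene by the arc between the start and end of a drag. The following is a minimal Python sketch of that general technique, not Tilt's own JavaScript; the function names and the choice of a unit sphere filling the viewport are assumptions made for illustration.

```python
import math

def project_to_sphere(x, y, width, height):
    """Map a pixel position to a point on a unit sphere centred in the viewport."""
    # Normalise to [-1, 1] on both axes, flipping y so that "up" is positive.
    nx = (2.0 * x - width) / width
    ny = (height - 2.0 * y) / height
    d2 = nx * nx + ny * ny
    if d2 <= 1.0:
        # Inside the sphere's silhouette: lift the point onto the front surface.
        return (nx, ny, math.sqrt(1.0 - d2))
    # Outside the silhouette: clamp to the sphere's edge.
    d = math.sqrt(d2)
    return (nx / d, ny / d, 0.0)

def drag_rotation(start, end, width, height):
    """Return (axis, angle_in_radians) for a drag between two pixel positions."""
    a = project_to_sphere(start[0], start[1], width, height)
    b = project_to_sphere(end[0], end[1], width, height)
    # The rotation axis is perpendicular to both sphere points (cross product).
    axis = (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])
    # The angle between them comes from the dot product, clamped for safety.
    dot = max(-1.0, min(1.0, sum(p * q for p, q in zip(a, b))))
    return axis, math.acos(dot)
```

In an 800x600 viewport, dragging from the center to the middle of the right edge yields a quarter turn about the vertical axis, which matches the feel of rolling a physical trackball sideways.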
You can also double-click any page element in the visualization and bring up a sub-window containing the HTML code that corresponds to it. For elements on the "top" of the stack (which means the innermost-nested elements in the page), only the topmost contents are displayed. For elements lower in the stack, the pop-up window shows the HTML for the element you clicked on highlighted in blue, plus the child elements nested within it on an un-highlighted background. That can be helpful to trace through peculiar-looking stacks. The extension also renders a small "help" button (which displays the keyboard and mouse commands) and an "exit" button that returns you to normal browsing.
Tilting at elements
Obviously, finding the right 3D box to click on mid-way down the stack somewhere on a crowded page can involve a minute or two of zooming, panning, and manipulating the Tilt visualization, but that is precisely what the extension lets you do: separate out the layers of the document in a way that the normal 2D render does not offer. At the heart of Tilt's functionality is the tree-like structure of the document object model (DOM). Tilt takes the DOM elements in nested order, starting with body, and draws one layer for each. Every element within is rendered as its own layer stack on top of its parent: div, span, ul, img, etc.
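The mapping from nesting to layers can be illustrated outside the browser. The sketch below is not Tilt's code (which walks the live DOM in JavaScript); it uses Python's standard html.parser module to assign each element a layer number equal to its nesting depth below body. The LayerCounter class and layers_for() helper are hypothetical names, and the sketch assumes well-formed markup in which every non-void element is explicitly closed.

```python
from html.parser import HTMLParser

# Void elements never take a closing tag, so they must not deepen the stack.
VOID = {"area", "base", "br", "col", "embed", "hr", "img",
        "input", "link", "meta", "source", "track", "wbr"}

class LayerCounter(HTMLParser):
    """Record (tag, layer) pairs, where body is layer 0."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.in_body = False
        self.layers = []

    def handle_starttag(self, tag, attrs):
        if tag == "body":
            self.in_body = True
        if not self.in_body:
            return  # ignore everything outside body
        self.layers.append((tag, self.depth))
        if tag not in VOID:
            self.depth += 1  # children sit one layer higher

    def handle_endtag(self, tag):
        if self.in_body and tag not in VOID:
            self.depth -= 1
            if tag == "body":
                self.in_body = False

def layers_for(html):
    """Return the (tag, layer) list for a well-formed HTML string."""
    parser = LayerCounter()
    parser.feed(html)
    parser.close()
    return parser.layers
```

For a fragment such as a body holding a div, which in turn holds a ul with two li items and an img, the list items land on layer 3 while the image sits on layer 2 beside the list, which is the kind of stacking the visualization makes visible.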
The elements' dimensions and X,Y position are scraped from the already-rendered representation of the page (so that contents are not re-rendered to display or to update the 3D visualization). Thus nested elements stack naturally on top of one another. Special treatment is given to off-screen elements (such as iframes or divs that were not displayed when Tilt was switched on); they float by themselves above the top edge of the page's main body stack.
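To make the geometry concrete, here is a rough Python sketch of how an element's 2D rectangle and its nesting layer could be turned into the eight corners of a 3D box. It is an illustration under stated assumptions rather than Tilt's actual mesh code: the element_box() name, the fixed per-layer thickness, and the vertex ordering are all hypothetical.

```python
def element_box(x, y, width, height, layer, thickness=10.0):
    """Return the 8 corner vertices (x, y, z) of the box for one element.

    x, y, width, and height are the element's rendered rectangle in page
    coordinates (y grows downward, as on screen); layer is its nesting
    depth, with body at 0. The thickness value is an arbitrary assumption.
    """
    z_bottom = layer * thickness    # the box rests on top of its parent's layer
    z_top = z_bottom + thickness
    corners = [
        (x, y),                     # top-left
        (x + width, y),             # top-right
        (x + width, y + height),    # bottom-right
        (x, y + height),            # bottom-left
    ]
    # Bottom face first, then the top face, each in the same corner order.
    return [(cx, cy, z) for z in (z_bottom, z_top) for (cx, cy) in corners]
```

Because only the Z coordinate depends on nesting, a child whose rectangle lies inside its parent's lands directly on top of it, while an absolutely positioned child would keep its layer height but could sit anywhere in X and Y.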
In practice, because Tilt grabs the internal representation of the page
without being aware of the screen height, those pages that are more than
one screenful tall appear to be extremely long in the Y direction and take some panning to inspect. Also, although the Z-ordering of elements usually makes the relationship between them clear, there are some peculiar cases where elements seem to float above their parents with nothing in between, or are physically larger than the elements beneath them.
That is probably just the magic of HTML at work. After all, elements can be positioned absolutely rather than relatively, and that should logically interfere with the apparent "stacking" of boxes in Tilt. Still, the binary version of the extension offers only a minimalist interface. Screenshots from Porof's blog entries on mozilla.com show that a richer UI is in the works, which ought to make inspecting the DOM easier. A July 27 entry, for example, shows a "thumbnail" navigator that offers a whole-page overview, as well as a DOM tree navigator and control over the thickness and spacing of elements' boxes.
I ran Tilt in Firefox 5.0 on a quad-core Phenom machine with NVIDIA
600GT graphics; 3D performance was adequate for twisting and rotating
the visualization stack — if not exactly snappy. In particular,
zooming in and out produced some noticeable lag, as did generating the
initial 3D mesh view for notoriously complex pages like those served up by
your favorite social networking sites. Tilt does not re-fetch or re-render
the page contents, so all of the lag is attributable to creating the 3D
mesh itself.
I am certain that building the mesh is a tricky proposition (and Porof has
discussed its challenges in his Tilt blog
posts); to me the only takeaway from the speed issues is a lingering doubt
about the viability of WebGL on systems that do not support full hardware
acceleration. Inspecting a web page in 3D is not a speed-sensitive task,
but editing 3D content or playing live games would be. WebGL is
derived from OpenGL ES 2.0, so it builds on a well-established standard, and
Firefox has supported it since 4.0. However, currently only the Nvidia
binary OpenGL drivers support
WebGL hardware acceleration on Linux using Firefox 4 and 5, which leaves
out a significant number of users. Firefox 6 changes the way the browser
detects the video card driver and thus "whitelists" more OpenGL
drivers.
Inspection versus modification
At the moment Tilt is limited to displaying the DOM frozen at a single moment in time (more specifically, as it was before the extension was activated). That allows the user to visualize the depth of, and relationships between, page elements, which can make for decent static analysis. But to make Tilt more useful for developers, the team is working on exposing an HTML and CSS editor component and making Tilt cope with dynamic content.
That planned enhancement of the extension has two distinct parts: making the 3D mesh itself modifiable on-the-fly, and integrating an HTML editor. Making the mesh modifiable (as opposed to a static snapshot of the page) has other benefits as well; it would be able to show CSS transformations and animations, and potentially could be used to make the 3D visualization interactive. Seeing how the DOM responds to interactivity would be valuable to developers (plus, the ability to navigate between pages in 3D view would just plain look cool).
An HTML editor inside the extension would also make Tilt more useful for debugging, as it would allow live updating of the DOM without the multi-step process currently required: reloading the page and then re-enabling the Tilt extension. Porof discusses this work in the July 27 blog post referenced above, saying that the current HTML display component (lifted from the Ace editor) will need to be replaced, and that a less memory-intensive method for drawing the WebGL content will need to be developed.
There are other possibilities further out, such as visually distinguishing between elements with absolute and relative positioning, and the ability to "zoom in" to a specific DOM element and restrict display to that element and its children alone. Both of those features imply additional UI design — browsing the Tilt videos on YouTube, it is clear that the team iterated through several different looks before settling on the current one, and making large quantities of small HTML element blocks easy to scan visually is not simple.
Apparently users have also asked for the ability to export the 3D mesh
from the extension to a file, which would open the door to all kinds of new
cross-page analysis opportunities. Tilt has already attracted
attention from web developers, who have begun to propose their own ideas, such
as "editing" page contents by moving and restacking the blocks in the
3D visualization itself.
That last suggestion would clearly demand significantly more work, so we should probably not expect to see it anytime soon. But Tilt's rapid progress is encouraging. At the moment, simple read-only inspection is all that the binary XPI provides, but that alone can be a useful debugging tool. It is similar to the structure-revealing functionality of the Web Developer extension with its outline tools, but the addition of a third dimension automatically brings some problems right to the forefront. Making that technique interactive for the user can only make it more valuable.