
A report from the Google Test Automation Conference

May 1, 2013

This article was contributed by David Röthlisberger

The focus of 2013's Google Test Automation Conference, held April 23 and 24 in New York City, was "Testing Media and Mobile". A major theme was WebDriver, which is an API for automating web browsers. Mozilla presented its work on WebDriver support in Gecko and extensions to WebDriver to allow automated testing of FirefoxOS beyond just the Gecko-powered content layer. Google talked about WebDriver support in Chrome/Chromium, including Chrome on Android. Others demonstrated FOSS software that re-purposes the WebDriver API for testing native mobile applications on Android and iOS. WebDriver seems to be gathering a lot of momentum in the world of automated user interface (UI) testing.

WebDriver

Selenium is a widely used tool for web browser automation, primarily employed to drive UI-level tests for web pages and rich web applications. Selenium controls the browser by sending mouse and keyboard events, and gathers results by querying the state of particular document object model (DOM) elements. So, for example, a test script for an online contact-management application might click an "add contact" button (which loads a dialog via Ajax), then check that certain DOM elements have appeared and that they contain particular text. DOM elements are targeted with locators, such as XPath expressions. Selenium also comes with an IDE (implemented as a Firefox plugin) that "records" test scripts by watching the mouse and keyboard.

[Ken Kania]

This type of testing is not a replacement for unit testing or isolated component testing; but as Ken Kania pointed out in his talk [video], you still want some guarantees that what the user sees in the UI is correct.

Selenium initially relied on JavaScript running in the browser to simulate user interactions, but since version 2.0 (released in 2011) Selenium has incorporated WebDriver. WebDriver fires events natively at the operating-system level, overcoming many limitations of the JavaScript-based approach.

WebDriver has three parts: a protocol (using JSON over HTTP, and now a W3C working draft); a "local" implementation, which is the library used by the automated test scripts (implementations exist for a variety of programming languages); and a "remote" implementation, called the "browser driver", which is implemented either as a browser plugin (Firefox, Safari) or as a separate executable that communicates with the browser's own debugging and inspection interfaces (Internet Explorer, Chrome, Opera). The Chrome and Opera implementations are officially supported by the respective browser vendors. There are also WebDriver implementations for mobile browsers. These are typically implemented as an app that uses a WebView object — i.e. the WebDriver app contains a browser in-process. These implementations extend the WebDriver protocol with mobile-specific APIs (finger gestures, screen orientation, etc.).
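To give a feel for the wire protocol, here is a rough sketch of the JSON-over-HTTP requests the "local" end sends to the browser driver. The URL structure follows the wire-protocol draft, but the session and element ids below are made-up values for illustration:

```python
import json

def new_session():
    # Ask the remote end (the browser driver) to start a browser session.
    return ("POST", "/session",
            json.dumps({"desiredCapabilities": {"browserName": "chrome"}}))

def click_element(session_id, element_id):
    # Click a previously located element within that session.
    return ("POST",
            "/session/%s/element/%s/click" % (session_id, element_id),
            json.dumps({}))

method, path, body = click_element("abc123", "42")
```

Because the protocol is just HTTP and JSON, a "local" implementation can be written in any language with an HTTP client, which is why so many language bindings exist.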

A WebDriver test script in Python might look like this (adapted from a slide in Kania's talk):

    from selenium import webdriver

    driver = webdriver.Chrome('chromedriver.exe')
    driver.get('http://www.google.com/xhtml')
    search_box = driver.find_element_by_name('q')
    search_box.send_keys('ChromeDriver')
    search_box.submit()
    assert 'ChromeDriver' in driver.title
    driver.quit()

This script starts up a WebDriver "browser driver" targeting the Chrome browser (apparently on Windows). It then has the browser make an HTTP GET request, waits for the response (which contains an HTML form), and fills in and submits the form. Finally, the script does the actual test, which is to assert that the submitted text is present in the title of the HTML page sent as a response to the form submission. Real-world test frameworks would decouple the details of which particular browser is being tested (in this case, Chrome) from the test script itself, but this example gives the idea.

ChromeDriver

Kania, who is from the Google Chromium team, talked about Google's work on ChromeDriver (WebDriver for Chrome). ChromeDriver has been available for the desktop browser on Linux, OS X, and Windows for over a year; it is available in the Chromium source repository.

Recently ChromeDriver was re-architected as "ChromeDriver2" to support more platforms. Currently the only addition is Android, still at alpha quality; if you want support for Chrome on Chrome OS or iOS, "you have to be patient... or contribute", Kania said.

ChromeDriver is a stand-alone executable that speaks the WebDriver protocol on one side and communicates with the browser on the other. Since ChromeDriver2, it uses the Chrome DevTools protocol to control the browser (see Kania's ChromeDriver architecture slide). Kania noted that Chrome is a multi-process browser; the main browser process already uses this DevTools protocol to communicate with the renderer processes.
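The DevTools protocol is also JSON-based. As a sketch, a command to load a page looks roughly like this ("Page.navigate" is a real DevTools-protocol method, but the framing here is simplified for illustration):

```python
import json

def devtools_command(command_id, method, params=None):
    # A DevTools-protocol message as ChromeDriver might send it over the
    # browser's debugging channel: an id (to match up the eventual
    # response), a method name, and that method's parameters.
    return json.dumps({"id": command_id,
                       "method": method,
                       "params": params or {}})

msg = devtools_command(1, "Page.navigate", {"url": "http://example.com"})
```

So ChromeDriver2's job is largely one of translation: WebDriver commands in, DevTools commands out.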

Michael Klepikov, from Google's "make the web faster" team, talked [video] about Google's work to make the profiling information from Chrome Developer Tools programmatically available to WebDriver scripts. This slide shows the specific API to enable this profiling. Klepikov demonstrated a test script that launches Chrome, goes to news.google.com, searches for "GTAC 2013", and switches to web (as opposed to news) results, all via the UI; it (presumably) also asserts that certain results are displayed. He demonstrated this test both on the desktop Chrome browser (video), and on Chrome running on a real Android phone (video). When the test completes, it prints the URL of a web server on localhost (powered by WebPagetest) serving an interactive display of the timing measurements, which can also be viewed within the Chrome Developer Tools interface.

Marionette: FirefoxOS meets WebDriver

Malini Das and David Burns from Mozilla presented [video] the work done on automated testing for FirefoxOS. The FirefoxOS runtime engine is based on Gecko (Firefox-the-browser's rendering engine), so the phone's user interface, as well as the bundled apps and third-party apps, are written using HTML5, JavaScript, and CSS.

[David Burns and Malini Das]

The existing Firefox WebDriver implementation (a browser plugin) works for the desktop browser, but not for FirefoxOS. So Mozilla has built WebDriver support directly into Gecko, under the name Marionette. Why WebDriver? The Firefox test suite contains thousands of Selenium WebDriver-based tests, and there are thousands of web developers already familiar with WebDriver.

Marionette allows full control of the phone (including telephony functions), and not just within an individual application. Marionette scripts can even control multiple devices at once (to test a call from one device to another, say). Mozilla has written client-side libraries (what the WebDriver spec calls the "local" side: the automated test script) for JavaScript and Python.

WebDriver isn't fully specified for gesture-based input, so Mozilla has added its own extensions. Gestures are defined by chaining together primitive actions (a "pinch" could be implemented as press-move-release with one finger, plus press-hold-release with another). Another Mozilla extension to WebDriver is the setContext function, which switches between the "content" and the privileged "chrome" contexts. Third-party application developers only need to test at the "content" level, but testing FirefoxOS itself often requires access to privileged functions.
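The idea of composing gestures from primitive actions can be illustrated with a toy chain-recording class. This class and its method names are invented for the sketch; the real Marionette client API differs in its details:

```python
class FingerActions(object):
    """Records a chain of primitive touch actions for one finger."""

    def __init__(self):
        self.chain = []

    def press(self, x, y):
        self.chain.append(("press", x, y))
        return self  # returning self allows method chaining

    def move(self, x, y):
        self.chain.append(("move", x, y))
        return self

    def release(self):
        self.chain.append(("release",))
        return self

# A pinch: one finger presses, holds, and releases, while the other
# presses, moves inward toward it, then releases.
anchor = FingerActions().press(100, 100).release()
moving = FingerActions().press(300, 300).move(150, 150).release()
```

In a real client the two chains would be submitted together so that the device replays both fingers' actions concurrently.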

Marionette supports (non-UI) unit tests too, and can run the hundreds of thousands of tests that already exist for Firefox. Marionette can run these tests against Firefox on the desktop, the Boot to Gecko simulator, FirefoxOS device emulators, and real FirefoxOS devices. David demonstrated some geolocation unit tests running on an emulator, though there wasn't much to see other than the emulator booting up, plus some text output on a console. Unfortunately the demonstration of an actual UI test running on a real phone encountered technical difficulties. "Let me demo how quickly the phone resets", David said to much laughter.

Of course there are security concerns with allowing a phone to be controlled remotely. For now, Marionette is only available in engineering builds; Mozilla is working on a way to deploy it for third-party developers.

Appium: automation for native apps

Jonathan Lipps from Sauce Labs started his talk [video] with a complaint: while "mobile is taking over the world", test automation for mobile apps is extremely underdeveloped.

Lipps is the lead developer of Appium, a test automation framework for native (and hybrid) mobile apps; it is released under the Apache 2.0 license. Appium consists of a server that listens for WebDriver commands, and translates them into implementation-specific commands for the particular device under test (Android or iPhone, real device or emulator). Re-using the WebDriver protocol (originally intended for in-browser web applications) means that users can write tests in any existing WebDriver-based framework.

Appium extends WebDriver with mobile-specific features by hijacking WebDriver's executeScript command (you pass in a string starting with "mobile:" instead of actual JavaScript). Presumably a more official mechanism will be worked out eventually.
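The convention can be sketched with a stub driver (the interception actually happens on the Appium server side; the stub and the command names here are for illustration only): a script string starting with "mobile:" is treated as a mobile command rather than being evaluated as JavaScript.

```python
class StubDriver(object):
    """Stand-in for a WebDriver client, to show Appium's convention."""

    def execute_script(self, script, *args):
        if script.startswith("mobile:"):
            # Intercepted: dispatched as a mobile-specific command.
            command = script[len("mobile:"):].strip()
            return ("mobile-command", command, args)
        # Otherwise, would be evaluated as JavaScript in a web view.
        return ("javascript", script, args)

driver = StubDriver()
result = driver.execute_script("mobile: swipe",
                               {"startX": 0.8, "endX": 0.2, "duration": 800})
```

The appeal of this hack is that it needs no changes to existing WebDriver client libraries: executeScript is already part of the protocol.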

Behind the scenes, Appium uses the operating system's official automation frameworks: UI Automation and Instruments on iOS; uiautomator on Android. Appium also has an optional backend for the community project Selendroid which, unlike uiautomator, works on versions of Android prior to Jelly Bean.

Lipps demonstrated [video] Appium's brand-new support for FirefoxOS by running a test script against the FirefoxOS simulator. The test launched the Contacts app, asserted that Lipps's name wasn't present in the list of contacts, clicked the "add contact" button, added him as a contact, and finally asserted that the new contact was present in the list view. Lipps had added FirefoxOS support during the conference, since Burns and Das's talk the previous day, which really shows the advantages of open standards like WebDriver.

Android uiautomator

Guang Zhu (朱光) and Adam Momtaz from Google covered the Android uiautomator framework in some detail [video]. Much like Selenium WebDriver tests, a test for a mobile application needs a way to control the mobile device and a way to receive feedback on the state of the device; Zhu provided a survey of the approaches taken by existing Android automation frameworks (slide 1, slide 2, slide 3). uiautomator uses the InputManager system service for control, and the Accessibility service for feedback. This means it will only work if the application's widgets meet accessibility guidelines — which will hopefully encourage more developers to make their apps accessible.

Set-top box testing with GStreamer and OpenCV

[David Röthlisberger] I (David Röthlisberger) demonstrated [video] how YouView uses GStreamer and OpenCV for video capture and image processing to run black-box automated tests against YouView's set-top box product. Using GStreamer's gst-launch command-line utility, I showed how easy it is to capture video and insert image-processing elements into the GStreamer media pipeline. (Incidentally, the ability to create a prototype so quickly using GStreamer's command-line tools was crucial to obtaining management buy-in for this project at YouView.)

Using these technologies, YouView built stb-tester, released publicly under the LGPL v2.1+. stb-tester scripts are written in Python, and have two primitive operations available: press to send an infrared signal to the system under test (pretending to be a human user pressing buttons on a remote control), and wait_for_match to search for a given image in the system-under-test's video output.
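A minimal stb-tester-style script built on these two primitives might look like the following sketch. The press/wait_for_match primitives are faked here so the example runs without hardware; the real press sends an infrared signal, and the real wait_for_match searches the captured video output for a template image:

```python
class FakeStb(object):
    """Stand-in for stb-tester's API, for illustration only."""

    def __init__(self, transitions, start="home"):
        self.transitions = transitions  # (screen, key) -> next screen
        self.screen = start

    def press(self, key):
        # Real implementation: emit the infrared code for `key`.
        self.screen = self.transitions.get((self.screen, key), self.screen)

    def wait_for_match(self, image):
        # Real implementation: poll captured video frames with OpenCV
        # template matching until `image` appears or a timeout expires.
        if self.screen != image:
            raise AssertionError("Didn't find %s" % image)

stb = FakeStb({("home", "KEY_EPG"): "guide.png"})
stb.press("KEY_EPG")
stb.wait_for_match("guide.png")  # raises if the guide did not appear
```

Despite their simplicity, these two operations are enough to express surprisingly thorough black-box tests, because everything a human tester does with a remote control reduces to them.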

I also showed some examples of "model-based testing" built on top of stb-tester: First [video], a script that knows how to navigate within a menu arranged as two rows of thumbnail images (aka a double carousel), so instead of saying "press UP, then LEFT, etc.", the script says "navigate to entry X". Second [video], a state machine of YouView's "setup wizard", that allows test scripts to generate random walks through the possible user choices during the setup wizard.
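The carousel navigation above can be illustrated with a toy "navigate to entry X" helper for a grid of thumbnails (two rows, in YouView's case). The real YouView code is not public; this function and its key names are invented for the sketch:

```python
def path_to(current, target, columns):
    """Return the remote-control key presses needed to move from entry
    `current` to entry `target`, where entries are numbered left to
    right, top row first."""
    crow, ccol = divmod(current, columns)
    trow, tcol = divmod(target, columns)
    presses = []
    if trow > crow:
        presses += ["KEY_DOWN"] * (trow - crow)
    else:
        presses += ["KEY_UP"] * (crow - trow)
    if tcol > ccol:
        presses += ["KEY_RIGHT"] * (tcol - ccol)
    else:
        presses += ["KEY_LEFT"] * (ccol - tcol)
    return presses
```

For example, with four columns, moving from entry 0 (top-left) to entry 5 (second row, second column) takes one KEY_DOWN followed by one KEY_RIGHT. Encapsulating this logic once, instead of hard-coding key sequences into every test, is what makes the model-based approach pay off.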

In spite of the name, stb-tester can be used to test any consumer electronic device, not just a set-top box, as long as it outputs video and can be controlled via infrared. See stb-tester.com for documentation and introductory material.

Who's responsible for testing?

"Your developers — or worst case, test organization — produce tests."
Michael Klepikov [video]

Another major theme of the conference, which was more on the people and process side, was that the responsibility for testing is shifting from a separate test team to the developers themselves. In any organization following modern development practices, developers will write unit tests for their code; but end-to-end integration testing, performance testing, and security testing have traditionally been the domains of separate teams. Even Google has separate positions for "Software Engineer" and "Software Engineer in Test", and there seems to be a difference in status between those two positions — though judging from conversations with various Googlers, that difference is decreasing.

Claudio Criscione, who works on web security at Google, gave a talk [video] on an internal tool his team developed to find cross-site scripting vulnerabilities; he said they put a lot of effort into making the tool easy to use, so that development teams could run the security tests against their own applications, taking the load off the security team. Similarly, James Waldrop pointed out in his talk [video] that his performance testing team at Twitter can't possibly scale up to test all the systems being produced by the various development teams, so instead his team provides tools to make it easy for the developers to write and run performance tests themselves. Specifically, he presented Iago, which is a load generator developed at Twitter and released under the Apache 2.0 license.

Simon Stewart, who created WebDriver, leads the Selenium project, and now works at Facebook, said in his talk [video] that Facebook has no such thing as a "Software Engineer in Test", only "Software Engineers". Furthermore, Facebook has no test or QA departments. The developers are responsible for testing.

About the conference

The conference is in its 7th year (though it skipped a year in 2012). Attendance is free of charge, but by invitation and limited to around 200 technical participants; applications are announced each year on the Google Testing Blog. Live streaming was available to the public, along with a Google Moderator system for the remote audience to submit questions to the speakers.

It was a very polished conference — you can tell Google has done this before. It was held in Google's New York City office; meals were catered by Google's famous cafeteria. A sign-language interpreter was on site, and stenographers provided live transcripts of the talks. Videos and slides are available here.


Index entries for this article
GuestArticles: Röthlisberger, David
Conference: Google Test Automation Conference/2013



A report from the Google Test Automation Conference

Posted May 4, 2013 0:19 UTC (Sat) by giraffedata (guest, #1954) [Link] (4 responses)

Facebook has no test or QA departments. The developers are responsible for testing.

If they actually mean the same individual who designs and writes code test it, I would word that differently: Facebook doesn't test its code.

I don't consider the running of code that the coder does testing. It's just part of an iterative trial-and-error development process. To be worthy of being called "testing" I expect the work to be done by someone whose judgment isn't clouded by his expectation that the code works or fear that it doesn't. Or the fact that time spent testing takes away from time that could be spent doing something more fun like designing and coding.

Testing at Facebook

Posted May 4, 2013 11:48 UTC (Sat) by drothlis (guest, #89727) [Link] (3 responses)

> I don't consider the running of code that the coder does testing.

If you mean a developer running a program as he or she is developing it,
then I agree with you completely. Simon's talk was about *automated*
testing.

> time spent testing takes away from time that could be spent doing
> something more fun like designing and coding.

I imagine that Facebook looks for developers that don't think testing is
beneath them. Automated testing has its own engineering challenges that
are just as fun to design & code as production software is. :-)

Testing at Facebook

Posted May 4, 2013 12:07 UTC (Sat) by drothlis (guest, #89727) [Link]

BTW Simon talks about this at 26:42 [video] if you want some additional context.

He does mention a form of manual testing: Facebook employees run beta versions of their apps on their own phones (36:03 [video]). I think Facebook's product is particularly well suited to this, and it doesn't apply to all software projects -- you obviously can't dogfood your banking backend or aircraft control system at home.

Testing at Facebook

Posted May 4, 2013 18:26 UTC (Sat) by giraffedata (guest, #1954) [Link] (1 responses)

Simon's talk was about *automated* testing.

And in that context, I assumed the testing he was talking about was developing the test cases for the automated test machinery to run. And that's something I would want done by someone other than the author of the code to consider it a real test.

I imagine that Facebook looks for developers that don't think testing is beneath them.

They would also have to look for developers who enjoy testing as much as designing and coding.

Of course, they might let some talented, affordable designers and coders go to competitors in so doing.

Testing at Facebook

Posted May 7, 2013 17:15 UTC (Tue) by drothlis (guest, #89727) [Link]

> developing the test cases [... is] something I would want done by
> someone other than the author of the code to consider it a real test.

That's a very important point. Some possible solutions (or at least
mitigations):

+ Code review: Tests should be given just as much, or more, scrutiny
during code reviews as the code-under-test itself.

+ Pair programming: Pair up on a given feature, with one developer
writing the tests and the other developer writing the implementation;
swap roles for the next feature/iteration/sprint.

+ Integration tests: End-to-end integration tests will, by their nature,
cover more than a single developer's work, so the developer writing
the test is, by definition, not testing only his or her own code.
The same is true for isolated component tests, depending on the size
of the component.


Copyright © 2013, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds