The focus of 2013's
Google Test Automation Conference,
held April 23 and 24 in New York City, was "Testing Media and Mobile". A major
theme was WebDriver, which is an API for automating web browsers. Mozilla presented its work on WebDriver
support in Gecko and extensions to WebDriver to allow automated testing
of FirefoxOS beyond just the Gecko-powered content layer. Google talked about
WebDriver support in Chrome/Chromium, including Chrome on Android. Others
demonstrated FOSS software that re-purposes the WebDriver API for testing native
mobile applications on Android and iOS. WebDriver seems to be
gathering a lot of momentum in the world of automated user interface (UI)
testing.
Selenium is a commonly used tool for web
browser automation that is primarily used to automate UI-level tests for web
pages and rich web applications. Selenium controls the browser by
sending mouse and keyboard events, and gathers results by querying the
state of particular document object model (DOM) elements. So, for example, a
test script for an
online contact-management application might click an "add contact"
button (that loads a dialog via
check that certain DOM elements have appeared, and that they contain
particular text. You target DOM elements using XPath locators. Selenium also comes with an
IDE (implemented as a Firefox plugin) that "records" test scripts by
watching the mouse and keyboard.
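As a rough illustration of the check-the-DOM half of such a test, the sketch below uses Python's standard-library ElementTree as a stand-in for a browser DOM. The markup, the element ids, and the find_by_xpath helper are invented for this example; none of this is Selenium's own API, and ElementTree understands only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical page state after the "add contact" button has been clicked.
PAGE = """
<html>
  <body>
    <div id="add-contact-dialog">
      <h2>New contact</h2>
      <input name="full-name" />
    </div>
  </body>
</html>
"""

def find_by_xpath(root, locator):
    """Return the first element matching an XPath locator, or None.

    ElementTree supports only a subset of XPath; browsers support more.
    """
    return root.find(locator)

root = ET.fromstring(PAGE)

# Locate the dialog and its heading with XPath locators.
dialog = find_by_xpath(root, ".//div[@id='add-contact-dialog']")
heading = find_by_xpath(root, ".//div[@id='add-contact-dialog']/h2")

# The "test": the dialog has appeared and contains the expected text.
dialog_appeared = dialog is not None
heading_text = heading.text if heading is not None else None
print(dialog_appeared, heading_text)
```

A real Selenium script would ask the browser for these elements rather than parsing a string, but the shape of the assertion is the same.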
This type of testing is not a replacement for unit testing or isolated
component testing; but as Ken Kania pointed out in his
talk [video], you
still want some guarantees that what the user sees in the UI is
user interactions; but since version 2.0 (released in 2011) Selenium
incorporates WebDriver. WebDriver fires events natively at the OS level,
WebDriver has three parts: a protocol (using JSON
over HTTP and now a
W3C working draft); a
"local" implementation, which is a library used by the automated test scripts
(implementations exist for a variety of programming languages); and a
"remote" implementation, called the "browser driver" that is implemented as a
Safari) or as a
separate executable that communicates with the browser's own debugging and inspection interfaces
The Chrome and Opera implementations are officially supported by the
respective browser vendor.
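To give a feel for the "JSON over HTTP" part, here is a minimal sketch of the requests a "local" implementation might send, based on the JSON wire protocol used by Selenium 2 (the W3C draft may differ in details). The host, port, session id, and page URL are placeholders chosen for the example, and the sketch only builds the requests; it does not send them.

```python
import json

BASE_URL = "http://localhost:9515"  # assumption: where a browser driver listens

def build_request(method, path, body=None):
    """Describe one protocol command as (HTTP method, URL, JSON payload)."""
    payload = json.dumps(body, sort_keys=True) if body is not None else None
    return method, BASE_URL + path, payload

session_id = "SESSION-ID"  # placeholder; the remote side assigns the real one

commands = [
    # Start a session, asking for a particular browser.
    build_request("POST", "/session",
                  {"desiredCapabilities": {"browserName": "chrome"}}),
    # Navigate to a page.
    build_request("POST", "/session/%s/url" % session_id,
                  {"url": "http://example.com/"}),
    # Locate an element by its "name" attribute.
    build_request("POST", "/session/%s/element" % session_id,
                  {"using": "name", "value": "q"}),
    # End the session.
    build_request("DELETE", "/session/%s" % session_id),
]

for method, url, payload in commands:
    print(method, url, payload or "")
```

Because the protocol is plain HTTP and JSON, a "local" library can be written in any language that has an HTTP client, which is presumably why so many of them exist.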
There are also WebDriver implementations for mobile browsers. These
are typically implemented as an app that uses a
object — i.e. the WebDriver app contains a browser in-process. These
implementations extend the WebDriver protocol with mobile-specific APIs
(finger gestures, screen orientation, etc.).
A WebDriver test script in Python might look like this (taken from
Kania's talk):
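driver = webdriver.Chrome('chromedriver.exe')
driver.get('http://www.google.com')   # assumed URL; the page-load step is missing from this excerpt
search_box = driver.find_element_by_name('q')
search_box.send_keys('ChromeDriver')  # reconstructed from the description below: fill in the form...
search_box.submit()                   # ...and submit it
self.assertTrue('ChromeDriver' in driver.title)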
This script starts up a WebDriver "browser driver" targeting the
Chrome browser (apparently on Windows). It then has the browser make an HTTP
GET request, waits for the response (which contains an HTML form), and
fills in and submits the form. Finally, the script does the actual
test, which is to assert that the submitted text is present in the title of
the HTML page sent as a response to the form submission. Real-world test
frameworks would decouple the details of which particular browser is
being tested (in this case, Chrome) from the test script itself, but this
gives the idea.
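One way that decoupling could look is sketched below. To keep it runnable without a browser, a FakeDriver class stands in for the real WebDriver object; it, the DRIVER_FACTORIES registry, and the example URL are all invented for illustration. In a real framework the factories would construct actual browser drivers.

```python
import unittest

class FakeDriver:
    """Stand-in for a WebDriver object so the sketch runs without a browser."""
    def __init__(self, browser_name):
        self.browser_name = browser_name
        self.title = ""

    def get(self, url):
        # A real driver would load the page; the fake just invents a title.
        self.title = "Results for ChromeDriver"

    def quit(self):
        pass

# Hypothetical registry: the only place that knows about specific browsers.
DRIVER_FACTORIES = {
    "chrome": lambda: FakeDriver("chrome"),
    "firefox": lambda: FakeDriver("firefox"),
}

def make_search_test(browser_name):
    """Build a test case bound to one browser, chosen outside the test body."""
    class SearchTest(unittest.TestCase):
        def setUp(self):
            self.driver = DRIVER_FACTORIES[browser_name]()

        def tearDown(self):
            self.driver.quit()

        def test_title_contains_query(self):
            self.driver.get("http://example.com/search?q=ChromeDriver")
            self.assertIn("ChromeDriver", self.driver.title)
    return SearchTest

def run_for(browser_name):
    """Run the same test body against one named browser."""
    case = make_search_test(browser_name)
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(case)
    return unittest.TextTestRunner(verbosity=0).run(suite)

results = {name: run_for(name).wasSuccessful() for name in sorted(DRIVER_FACTORIES)}
```

The test body never names a browser, so adding a new target means adding one entry to the registry rather than editing every test.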
Kania, who is from the Google Chromium team, talked about
Google's work on ChromeDriver (WebDriver for Chrome). ChromeDriver has
been available for the desktop browser on Linux, OS X, and Windows for
over a year; it is
in the Chromium source repository.
Recently ChromeDriver was re-architected into "ChromeDriver2" to support
more platforms. Currently, the only addition is Android, and that is still at
alpha-level quality; if you want support for Chrome on Chrome OS or iOS,
"you have to be patient... or contribute", Kania said.
ChromeDriver is a stand-alone executable that speaks the WebDriver protocol on one
side, and communicates with the browser on the other side. Since
ChromeDriver2, it uses the Chrome DevTools
protocol to control the browser
(see Kania's ChromeDriver architecture
Kania noted that Chrome is a multi-process browser, and the main
browser process already uses this DevTools protocol to communicate with
the renderer processes.
Michael Klepikov, from Google's "make the web faster" team,
talked [video] about
Google's work to make the profiling information from Chrome Developer
Tools programmatically available to WebDriver scripts.
shows the specific API to enable this profiling. Klepikov demonstrated a test
script that launches Chrome, goes to news.google.com, searches for "GTAC
2013", and switches to web (as opposed to news) results, all via the UI;
it (presumably) also asserts that certain results are displayed. He
demonstrated this test both on the desktop Chrome browser and
on Chrome running on a real Android phone. When
the test completes, it prints the URL of a web server on localhost
(powered by WebPagetest) that provides an
interactive display of the timing measurements, which can also be viewed
within the Chrome Developer Tools interface.
Marionette: FirefoxOS meets WebDriver
Malini Das and David Burns from Mozilla
presented [video] the
work done on automated testing for FirefoxOS. The FirefoxOS runtime
engine is based on Gecko (Firefox-the-browser's rendering engine), so the
phone's user interface, as well as the bundled apps and third-party
The existing Firefox WebDriver implementation (a browser plugin) works
for the desktop browser, but not for FirefoxOS. So Mozilla has built
WebDriver support directly into Gecko, under the name
Marionette. Why WebDriver? The Firefox test suite contains thousands of Selenium
WebDriver-based tests, and there are thousands of web developers
already familiar with WebDriver.
Marionette allows full control of the phone (including telephony
functions), and not just within an individual application. Marionette
scripts can even control multiple devices at once (to test a call from
one device to another, say). Mozilla has written client-side libraries
(what the WebDriver spec calls the "local" side: the automated test
WebDriver isn't fully specified for gesture-based input, so Mozilla has
added its own extensions. You can define gestures by chaining
together actions (a "pinch" could be implemented as press-move-release
with one finger, plus press-hold-release on another finger).
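The chaining idea can be sketched with a small builder in which every method returns the object itself. The GestureChain class, its method names, and the coordinates are hypothetical, made up for this illustration; they are not Marionette's actual API.

```python
class GestureChain:
    """Hypothetical builder that records one finger's actions in order."""
    def __init__(self, finger):
        self.finger = finger
        self.steps = []

    def press(self, x, y):
        self.steps.append(("press", x, y))
        return self  # returning self is what makes chaining possible

    def move(self, x, y):
        self.steps.append(("move", x, y))
        return self

    def hold(self, seconds):
        self.steps.append(("hold", seconds))
        return self

    def release(self):
        self.steps.append(("release",))
        return self

def pinch(center, start_offset, end_offset, duration=0.5):
    """A pinch as described above: one finger moves in, the other holds."""
    cx, cy = center
    moving = (GestureChain(finger=1)
              .press(cx + start_offset, cy)
              .move(cx + end_offset, cy)
              .release())
    holding = (GestureChain(finger=2)
               .press(cx, cy)
               .hold(duration)
               .release())
    return [moving, holding]

gesture = pinch(center=(160, 240), start_offset=100, end_offset=20)
for chain in gesture:
    print(chain.finger, chain.steps)
```

Recording the steps instead of performing them immediately means the whole gesture can be sent to the device in one go, with both fingers' chains played back together.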
Another Mozilla extension to WebDriver is the setContext function, to
switch between the "content" and the privileged "chrome" contexts. Third
party application developers only need to test at the "content" level,
but testing FirefoxOS itself often requires access to privileged
Marionette supports (non-UI) unit tests too, and can run the hundreds of
thousands of tests that already exist for Firefox. Marionette can run
these tests against Firefox on the desktop, the
Boot to Gecko
simulator, FirefoxOS device emulators, and real FirefoxOS devices.
Burns demonstrated some geolocation unit tests running on an emulator, though
there wasn't much to see other than the emulator booting up, plus some
text output on a console. Unfortunately the demonstration of an actual UI test
running on a real phone encountered technical difficulties. "Let me demo
how quickly the phone resets", Burns said to much laughter.
Of course there are security concerns with allowing a phone to be
controlled remotely. For now, Marionette is only available in
engineering builds; Mozilla is working on a way to deploy it for third-party
Appium: automation for native apps
Jonathan Lipps from Sauce Labs started his
talk [video] with the observation that
while "mobile is taking over the world", test automation for mobile
apps is extremely underdeveloped.
Lipps is the lead developer of Appium,
a test automation framework for native (and hybrid) mobile apps; it is
released under the Apache 2.0 license. Appium consists of a server that
listens for WebDriver commands, and translates them into
implementation-specific commands for the particular device under test
(Android or iPhone, real device or emulator). Re-using the WebDriver
protocol (originally intended for in-browser web applications) means
that users can write tests in any existing WebDriver-based framework.
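The translation layer can be pictured roughly as follows. This is only a sketch of the general idea, not Appium's code: the backend functions, the command names, and the strings they produce are all invented for the example.

```python
# Hypothetical backends: each turns a generic command into a
# platform-specific instruction string (invented for this sketch).
def android_backend(command, params):
    return "android:%s(%s)" % (command, params.get("id", ""))

def ios_backend(command, params):
    return "ios:%s(%s)" % (command, params.get("id", ""))

BACKENDS = {"Android": android_backend, "iOS": ios_backend}

class TranslatingServer:
    """Accepts WebDriver-style commands and forwards them to one backend."""
    def __init__(self, platform):
        self.backend = BACKENDS[platform]

    def handle(self, command, params):
        return self.backend(command, params)

def run_script(server):
    """The same 'test script' works whatever device sits behind the server."""
    return [server.handle("click", {"id": "add-contact"}),
            server.handle("getText", {"id": "contact-list"})]

android_log = run_script(TranslatingServer("Android"))
ios_log = run_script(TranslatingServer("iOS"))
print(android_log)
print(ios_log)
```

The test script only ever sees the generic commands, which is what lets existing WebDriver-based frameworks drive it unchanged.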
Appium extends WebDriver with mobile-specific features by hijacking
WebDriver's executeScript command (you pass in a string starting with
official mechanism will be worked out eventually.
Behind the scenes, Appium uses the operating system's official
on Android. Appium also has an optional backend for the community
project Selendroid which,
unlike uiautomator, works on versions of Android prior to Jelly Bean.
Lipps demonstrated Appium's brand-new support for FirefoxOS by running a test script
against the FirefoxOS simulator. The test launched the
Contacts app, asserted that Lipps's name wasn't present in the list
of contacts, clicked the "add contact" button, added him as a
contact, and finally asserted that the new contact was present in the
list view. Lipps had added FirefoxOS support during the
conference, since Burns and Das's talk the previous day, which really
shows the advantages of open standards like WebDriver.
Guang Zhu (朱光) and Adam Momtaz from Google
covered [video] in
some detail the Android uiautomator
framework. Much like Selenium WebDriver tests, a test for a mobile
application needs a way to control the mobile device, and a way to
receive feedback on the state of the device; Zhu provided a
survey of the approaches taken by existing Android automation
frameworks.
uiautomator uses the InputManager system service for control, and the
Accessibility service for feedback. This means it will only work if the
application's widgets meet accessibility guidelines — which will hopefully encourage more developers to make their apps accessible.
Set-top box testing with GStreamer and OpenCV
I (David Röthlisberger) presented
how YouView uses
GStreamer and OpenCV for video capture and image processing to
run black-box automated tests against YouView's set-top box product.
Using GStreamer's gst-launch command-line utility, I showed
how easy it is to capture video and insert image-processing elements
into the GStreamer media pipeline. (Incidentally, the ability to create
a prototype so quickly using GStreamer's command-line tools was crucial
to obtaining management buy-in for this project at YouView.)
Using these technologies, YouView built
stb-tester, released publicly under the LGPL v2.1+.
stb-tester scripts are written in Python, and have two primitive
operations available: press to send an infrared signal to the system
under test (pretending to be a human user pressing buttons on a remote
control), and wait_for_match to search for a given image in the
system-under-test's video output.
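The two primitives can be imitated in a few lines of plain Python. The sketch below is a toy, not stb-tester's implementation: the FakeSetTopBox class, the tiny integer "frames", and the exact-match search are assumptions made to keep the example self-contained, whereas the real tool works on captured video and has to tolerate noise. Returning None on failure is also a simplification of this sketch.

```python
def find_template(frame, template):
    """Return (row, col) of the first exact occurrence of template, or None.

    Both arguments are lists of rows of pixel values. Real image matching
    has to tolerate noise; exact comparison keeps this sketch short.
    """
    t_rows, t_cols = len(template), len(template[0])
    for r in range(len(frame) - t_rows + 1):
        for c in range(len(frame[0]) - t_cols + 1):
            if all(frame[r + i][c:c + t_cols] == template[i]
                   for i in range(t_rows)):
                return (r, c)
    return None

class FakeSetTopBox:
    """Hypothetical device: pressing MENU draws a 2x2 'logo' on screen."""
    def __init__(self):
        self.screen = [[0] * 6 for _ in range(4)]

    def press(self, key):
        if key == "MENU":
            self.screen[1][2:4] = [7, 7]
            self.screen[2][2:4] = [7, 9]

    def frames(self, count):
        for _ in range(count):
            yield self.screen

def wait_for_match(device, template, max_frames=10):
    """Look at up to max_frames frames; return the match position or None."""
    for frame in device.frames(max_frames):
        position = find_template(frame, template)
        if position is not None:
            return position
    return None

LOGO = [[7, 7], [7, 9]]
box = FakeSetTopBox()
before = wait_for_match(box, LOGO)  # nothing on screen yet
box.press("MENU")
after = wait_for_match(box, LOGO)   # the logo has appeared
print(before, after)
```

A test script is then just a sequence of key presses, each followed by a wait for the image that should appear as a result.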
I also showed some examples of "model-based testing" built on top of
stb-tester: First [video],
a script that knows how to navigate within a menu
arranged as two rows of thumbnail images (aka a double carousel), so instead of saying "press UP, then LEFT, etc.", the
script says "navigate to entry X". Second, a
state machine of YouView's "setup wizard" that allows test scripts to
generate random walks through the possible user choices during the setup
wizard.
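A random walk over such a state machine might be generated along these lines. The states and choices in the WIZARD table are invented for the example and do not describe YouView's actual setup wizard; the seed is fixed so that a walk that exposes a bug can be replayed.

```python
import random

# Hypothetical setup-wizard model: state -> {user choice: next state}.
WIZARD = {
    "welcome": {"OK": "language"},
    "language": {"English": "network", "Back": "welcome"},
    "network": {"Wired": "scan", "Wireless": "wifi_password", "Back": "language"},
    "wifi_password": {"OK": "scan", "Back": "network"},
    "scan": {"OK": "done"},
    "done": {},
}

def random_walk(model, start, end, rng, max_steps=50):
    """Return ([(state, choice), ...], final_state), stopping at end."""
    path, state = [], start
    for _ in range(max_steps):
        if state == end:
            break
        # Sort the choices so the walk depends only on the seed.
        choice = rng.choice(sorted(model[state]))
        path.append((state, choice))
        state = model[state][choice]
    return path, state

rng = random.Random(2013)  # fixed seed so a failing walk can be replayed
path, final_state = random_walk(WIZARD, "welcome", "done", rng)
for state, choice in path:
    print("in %-13s choose %s" % (state, choice))
```

In a real test each (state, choice) pair would be turned into key presses on the device, followed by a check that the screen for the next state has appeared.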
In spite of the name, stb-tester can be used to test any consumer
electronic device, not just a set-top box, as long as it outputs video
and can be controlled via infrared. See
stb-tester.com for documentation and
Who's responsible for testing?
"Your developers — or worst case, test organization — produce tests."
Another major theme of the conference, which was more on the people and
process side, was that the responsibility for testing is shifting from a
separate test team to the developers themselves. In any organization
following modern development practices, developers will write unit tests
for their code; but end-to-end integration testing, performance testing,
and security testing have traditionally been the domains of separate
teams. Even Google has separate positions for "Software Engineer" and
"Software Engineer in Test", and there seems to be a difference in
status between those two positions — though judging from conversations
with various Googlers, that difference is decreasing.
Claudio Criscione, who works on web security at Google, gave a
talk [video] on an
internal tool his team developed to find cross-site scripting
vulnerabilities; he said
they put a lot of effort into making the tool easy to use, so that
development teams could run the security tests against their own
applications, taking the load off the security team.
Similarly, James Waldrop pointed out in his
talk [video] that his
performance testing team at Twitter can't possibly scale up to test all
the systems being produced by the various development teams, so instead
his team provides tools to make it easy for the developers to write
and run performance tests themselves. Specifically, he presented
Iago, which is a load generator
developed at Twitter and released under the Apache 2.0 license.
Simon Stewart, who created WebDriver, leads the Selenium project,
and now works at Facebook,
said in his
talk [video] that
Facebook has no such thing as a "Software Engineer in Test", only
"Software Engineers". Furthermore, Facebook has no test or QA
departments. The developers are responsible for testing.
About the conference
GTAC is in its 7th year (though it skipped a year in 2012). Attendance is
free of charge, but by invitation and limited to around 200 technical
applications are announced each year on the
Google Testing Blog. Live streaming
was available to the public, along with a
Google Moderator system
for the remote audience to submit questions to the speakers.
It was a very polished conference — you can tell Google has done this
before. It was held in Google's New York City office; meals were catered
by Google's famous cafeteria. A sign-language interpreter was on site,
and stenographers provided live transcripts of the talks. Videos and
slides are available online.