On the final day of the Tizen Developer Conference in San Francisco, Samsung distributed developer devices to registered attendees. Although the devices are by no means fully functional consumer products, they did provide the first opportunity for most developers to see the new platform's mobile phone incarnation outside of an emulator.
Physically, the device is shaped like a largish mobile phone, with a 4.5-inch diagonal (720 by 1280) touchscreen, three buttons, dual cameras, audio jack, and mini USB port. Inside, the platform is based on Samsung's Exynos 4210 system-on-chip, which is also found in Samsung's production phones. The processors are ARM: a dual-core Cortex A9 CPU, paired with a Mali 400 GPU. 1GB of RAM is available, plus 16GB of storage and a microSD card slot. Additional hardware includes WiFi, Bluetooth, GPS, and NFC, plus a SIM card slot.
As a development aid, the size of the device works to its advantage. Although it is large for a phone, the on-screen keyboard and other touch widgets are much easier to hit than those on a pocket-sized device. On the other hand, the microSD and SIM card slots can only be accessed by taking apart the outer shell, which means rapid swapping of cards will be difficult. There may be developers who lament the lack of a hardware keyboard, too, although the device is intended to be accessed through the Tizen SDK over a USB connection.
Thus far it is unclear if the SIM slot is supported in software; the SIM cards I tried are not recognized even for non-modem operations such as contacts import, and no one appears to have gotten it up and running. Bluetooth is provided by a Broadcom 4330, which also supplies the WiFi and is FM radio-capable, but out of the box Bluetooth and FM are not enabled. Similarly, the NFC chip is a PN544 from NXP, which is the most widespread NFC chip supplier, but it is not yet used by the software stack. A standard complement of accelerometer, digital compass, and three-axis gyroscope sensor is also included.
The installed software is the Tizen 1.0 "Larkspur" release, which officially debuted on April 30. The code represents a merger of Intel's MeeGo work with the Samsung Linux Platform (SLP), which was derived from LiMo. Although use of the Enlightenment Foundation Libraries (EFL) in the graphics stack is the most widely-publicized piece of SLP, there are others — including SLP's 3G telephony stack. At the conference, the Tizen project made it clear that oFono (an Intel project pre-dating MeeGo) was due to be ported over as Tizen's telephony stack; this may explain the SIM compatibility issue, as the device is alleged to be the same hardware used in Samsung's Android-powered Galaxy S2, and the modem driver may not have been ported to SLP.
Other parts of the Tizen 1.0 stack are interesting, too. The display server is X, although here again at the conference it was explicitly said that a port to Wayland is in the works. The effort consists of completing the port of EFL to Wayland, and of optimizing the speed of the platform's web runtime. Input is handled by the SLP Input Service Framework and by the XGesturesExtension originally authored by Canonical. The latter has since evolved into the uTouch framework, which makes the platform an unusual blend of technologies of different ages.
The underpinnings are certainly worth exploring, but most developers will probably be more interested in exploring the HTML5-based application API. The device includes a small set of reference applications — browser, phone, messages, clock, contacts, calendar, image gallery and audio player — plus a comprehensive system settings utility.
The demo applications bear many similarities to their MeeGo counterparts; perhaps the phone application more than others, and the media applications a bit less so, but all will feel familiar. Stylistically, most use flat colors, with sharp edges and corners, subtle highlighting via box outlines, and what desktop Linux vendors have come to call symbolic, monochrome icons. Most provide top tabs for navigating between the application's pages and place function buttons at the bottom of the screen. Where vertical scrolling is required, the scroll bar widget is a thin, translucent overlay that appears only while scrolling, and fades out. It does appear that the outermost edge of the screen is sensitive to scroll events, which can be tricky on mobile devices. On the other hand, none of the demo applications seems to show off multi-touch gesture recognition.
Several of the applications show off a nice assortment of data entry widgets. For example, the alarm clock setting screen features a drop-down selector that scrolls through the hour or minute options horizontally, plus a separate radio-button selector for choosing the notification type, which unrolls a bit like an unfolding map. On the whole, the transitions are smooth, and navigation between pages and elements is simple.
That said, a few of the demo applications use a completely different widget style. The calculator uses rounded buttons with a slight 3D embossed effect, gradients, and a faux-leather background. The stopwatch and timer screens in the clock emulate an LCD screen and a different faux-leather look, respectively, and feature their own UI widget styles. Were this a commercial product, the differences in look and feel would be cause for concern; in a demo suite, though, they simply show competing options for GUI styling.
Samsung's J.D. Choi showed off several advanced applications in his conference keynote, utilizing the platform's OpenGL support, but there is no official word when they — or a system update enabling missing pieces like Bluetooth, NFC, and SIM support — might land.
Of course, demo applications cannot be the end of the story. Tizen seeded the devices to developers at the conference in order to jumpstart independent application development. But it will be quite some time before anyone should expect to see output. The Mer project has started to dig into the lower levels of the system, including the u-boot bootloader, and is working on booting an OS from the microSD card. Thomas Perl has succeeded in building and running Python and PyGame on the device, and more and more developers are popping in to ask practical questions on the mailing list and on IRC.
The preferred method for writing applications is the Tizen SDK, which for the moment is only available for Windows and 32-bit Ubuntu systems, but enterprising hackers have found ways around that limitation already. We will provide a more in-depth look at the SDK and assorted development tools in another story soon. In the meantime, it is good to finally see Tizen running on a physical hardware device. Yes, the experience is much the same as that provided by the SDK emulator and reviewed around the web, but a modern development device makes all the difference in the world. There may be no third-party applications to install just yet, but that was also the situation shortly after the release of the Nokia N950 last year, and today there are hundreds. Which just goes to show that if you want software, the simplest thing you can do is give the open source community tools and a platform on which to build.
[ The author would like to thank the Tizen project and the Linux Foundation for support to attend the conference. ]
Mandriva SA, the company behind the Mandriva distribution, has announced it will return control of the distribution "to the community." But exactly how that will play out in practice remains unclear, since the company was unable to convince the Mageia distribution to participate in the new effort — but, conversely, is cooperating with Mageia on future products.
Mandriva made the community hand-over announcement on May 17, saying that it had decided to "transfer the responsibility of the Mandriva Linux distribution to an independent entity." The announcement outlines only a rough plan, with the formation of a governing body that will include representatives from Mandriva SA, but will not be under the company's direct control. The company will also continue to contribute its engineering resources to Mandriva development. The announcement then states that the details of the organization's governance model, processes, and other infrastructure will be fleshed out over the next few months, a process to be handled by a still-in-formation workgroup of community members.
There have been several forks of the Mandriva distribution, but by far the largest is Mageia. Yet the Mageia board announced on May 21 that it had decided not to join the new Mandriva workgroup — at least, not as a group. The announcement enumerates five reasons for the decision.
First, the Mageia board feels that the Mageia.org organization already meets the needs that Mandriva SA gave for forming a new entity, and felt that the company should have joined it instead. Second, Mageia has "invested a lot of time and energy" to define Mageia as it is. Third, there is a lack of information about the future direction of the proposed Mandriva community entity. Fourth, the Mageia project does not have enough resources to take on a new project in addition to its existing work. Finally, because Mageia is already free software, code sharing between the projects can already happen without establishing any formal arrangement.
Some of the listed reasons are perplexing. Clearly the first two indicate that the project thought Mandriva SA should simply adopt Mageia as its community distribution as-is. That would pose technical challenges, since the two projects have diverged in key areas since the original split (package managers, for example), as well as trademark issues. The last two are easily justifiable — patches do already flow back and forth between the distributions, and one rarely hears of a distribution project with too much time and developer resources on its hands.
But the middle reason is a puzzle. It sounds as if the Mageia board rejected an offer to take a seat (or seats) on the Mandriva community governance entity, when the stated offer was a place in the workgroup that will define the entity's standards and practices. If the Mageia board was concerned about the future direction of the new project or entity, surely participating in the workgroup would be the best way to influence that direction for the better. Comments on the Mandriva SA announcement indicate that at least one other Mandriva fork, ROSA, is joining the workgroup.
The other wrinkle in the Mandriva makeover story is Mandriva SA's May 20 product announcement. In the post, the company outlines new product plans and how they relate to the evolving situation with the base distribution. The desktop product will be based on the new community-managed version of Mandriva, as will its OEM and education offerings. Pulse2, the company's corporate system deployment and management product, will continue to be developed and contributions will be made back to the Mandriva community. The company's server offering, however, will be based on Mageia.
Mageia confirmed the arrangement in the blog post linked to above, saying that it arose out of talks with Mandriva SA. It is not clear whether the decision involves any sort of development effort on Mageia's (or individual Mageia developers') part, but the project did take pains to emphasize that it had no bearing on the future development of Mageia itself. Several commenters on the Mandriva announcement asked why the company would choose to build its server product on a different code base than its desktop product, particularly in light of the divergence between Mandriva and Mageia and Mageia's 18-month life cycle (which is brief for a server distribution).
So far, there has not been an elaboration on the business server arrangement from either party. But that could simply be lack of time; Mageia released the final version of Mageia 2 on May 22. We previewed the release in April. Mandriva, meanwhile, has started to flesh out the beginnings of its community development plan on its wiki, and ROSA has rolled out its latest release. Hopefully, with the new release out the door at Mageia, and the Mandriva workgroup taking shape, we will soon hear more details.
The new X.org multitouch features allow for multitouch support in applications. We now have a software stack, uTouch, built on top of this multitouch support that can provide for practically any gesture scenario imaginable.
A "gesture" is normally thought of as a two-dimensional movement made by the user on some sort of input device—a two-finger pinch, for example, or a three-finger downward drag. Teaching a computer to recognize these movements requires a lower-level description, though; in uTouch, this description consists of values like the number of touches, movement thresholds, and timeout values. An application may register a "gesture subscription" describing a specific gesture and be notified when that gesture is recognized by the uTouch subsystem. Those notifications take the form of a sequence of events describing the gesture motion over time.
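The lower-level description above can be pictured as a small record of recognition parameters. The following is a hypothetical Python sketch of the information a gesture subscription carries; the class and field names are invented for illustration and do not match the real uTouch-Geis C API.

```python
from dataclasses import dataclass

# Hypothetical sketch of what a uTouch "gesture subscription" describes;
# names and types here are illustrative, not the real uTouch-Geis API.
@dataclass
class GestureSubscription:
    primitive: str        # "drag", "pinch", "rotate", or "tap"
    num_touches: int      # how many touches make up the gesture
    threshold: float      # movement threshold before recognition
    timeout_ms: int       # time window in which the threshold must be met

# e.g. a two-touch swipe: a drag covering 100 pixels within half a second
swipe = GestureSubscription("drag", num_touches=2, threshold=100.0, timeout_ms=500)
```

Once registered, a subscription like this would cause the application to receive a stream of events describing the gesture's motion over time.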
Key to understanding how uTouch works is knowledge of all the typical gesture use cases. First, we have the concept of gesture primitives: drag, pinch (including both "pinch" and "spread"), rotate, and tap. These primitives make up the foundation of all intuitive gestures. They can be strung together as needed for more complex gestures, such as a double tap. Stroke gestures, such as drawing an ‘M’ to open the mail client, may be recognized as a specific long gesture sequence, or as a sequence of drag gestures. Note, however, that uTouch does not have stroke gesture detection facilities built-in.
Second, there are two fundamental object interaction types: single motion, single interpretation gestures and direct object manipulation. The former involves gestures like a two-touch swipe to go backward and forward through browser history, while the latter involves gestures like a three-touch drag to move an application window around the desktop.
The single motion, single interpretation gestures require thresholds and/or timeouts. For example, the colloquially implied difference between a swipe and a drag is that a swipe must be a quick motion in a given direction, whereas a drag may be any motion that manifests in a displacement in space. To put it in uTouch gesture subscription terms, a swipe is a drag primitive gesture with a displacement threshold that must be crossed within a specific amount of time. For example, when implementing browser history gestures a two-touch swipe may be implemented with a threshold of 100 pixels over half of a second. In contrast, direct object manipulation usually implies a zero threshold. For example, as soon as three touches begin on a window, the window should be movable.
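The swipe-versus-drag distinction boils down to a displacement threshold paired with a timeout. Here is an illustrative Python sketch of that classification rule, using the 100-pixel, half-second figures from the browser history example; it is not uTouch's actual code.

```python
import math

# Classify a finished motion as a "swipe" if it crossed a displacement
# threshold (pixels) within a timeout (seconds); otherwise it is a plain
# drag. The default values mirror the example in the text.
def is_swipe(start, end, elapsed_s, threshold_px=100.0, timeout_s=0.5):
    displacement = math.hypot(end[0] - start[0], end[1] - start[1])
    return displacement >= threshold_px and elapsed_s <= timeout_s

print(is_swipe((0, 0), (150, 0), 0.3))   # quick 150-pixel motion -> True
print(is_swipe((0, 0), (150, 0), 0.9))   # same motion, too slow -> False
```

Direct object manipulation, by contrast, corresponds to a zero threshold: recognition is immediate as soon as the required number of touches lands.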
Most simple gesture interactions may be handled through gesture subscriptions consisting of the required gesture primitives and the object interaction types. However, there are times when an application needs to have further control over gesture recognition. For example, a bezel drag gesture occurs when the user begins a drag from the bezel of the screen and moves inward. This gesture must be distinguished from the user initiating a touch at the edge of the screen. The problem lies in the fact that both the bezel drag and the direct touch near the edge of the screen look indistinguishable at the beginning of the gesture. The distinguishing aspect is that the bezel drag is perpendicular to the bezel and has a non-zero initial velocity as seen by the touchscreen, whereas the direct touch near the edge of the screen will likely not have an initial velocity and/or may not be moving perpendicular to the bezel. To cater for a client that cares about one of these gestures but not the other, uTouch requires the client to accept or reject every gesture. When a gesture is rejected, the touches may be replayed to the X server, which allows for the mixing of gestures and raw multitouch in the same application.
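The bezel-drag distinction described above can be sketched as a simple heuristic: a touch that begins at the screen edge counts as a bezel drag only if it arrives already moving inward. The edge width, velocity threshold, and screen width below are invented for this sketch and are not uTouch parameters.

```python
# Illustrative heuristic: a touch starting within EDGE_PX of the screen
# edge is a bezel drag only if its initial velocity points inward,
# perpendicular to the bezel; a stationary edge touch is a direct touch.
EDGE_PX = 5
MIN_INWARD_SPEED = 50.0      # pixels per second, toward the screen center

def is_bezel_drag(x, vx, screen_width=1280):
    if x <= EDGE_PX:                     # touch began at the left bezel
        return vx > MIN_INWARD_SPEED     # moving right, into the screen
    if x >= screen_width - EDGE_PX:      # touch began at the right bezel
        return vx < -MIN_INWARD_SPEED    # moving left, into the screen
    return False

print(is_bezel_drag(2, 120.0))    # fast inward motion from the edge -> True
print(is_bezel_drag(2, 0.0))      # stationary touch near the edge -> False
```

In uTouch's model a client making this kind of decision would accept the gesture in the first case and reject it in the second, letting the touches be replayed to the X server.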
Another facet of uTouch, as hinted above, is that, by default, it operates through "touch grab" semantics. When used on top of X.org, uTouch gestures are recognized from touches received through touch grabs. One benefit of this approach is the ability to mix gestures and raw multitouch in the same application. However, it also allows for priority handling of gestures. For example, system gestures may be handled by a client listening to touches through a grab on the root window. When gestures are not recognized or are rejected by the uTouch client, the touches are replayed to the next touch grab or selecting client. Thus, global gestures, application gestures, and raw multitouch events are all possible when using uTouch.
The last major feature of uTouch is the ability to recognize multiple simultaneous gestures in the same area. For example, imagine a game where the user pinches bugs on the screen to squash them. The screen is one large gesture input area, but the user may use both hands to pinch bugs. In order to facilitate this interaction mode, whenever new touches begin within the gesture area they are combinatorially matched with other touches that begin within a "glue" time period. In our game example there is a two-touch pinch gesture subscription. If four touches begin in the game area within the glue time period, six combinations of potential gestures will be matched. As touch events are delivered, the state of each matched gesture will be updated and then checked against the threshold and timeout for the gesture subscription. If a gesture meets the threshold and timeout criteria, it will be delivered to the client. The client can then attempt to match up the touches of the gesture against its context to determine whether to accept or reject each gesture. In the example below, there will be four pinch gestures sent to the client:
(Bug icons licensed under LGPL)
There will be potential pinch gestures for: AB, CD, AD, and BC (AC and BD, by virtue of moving in the same direction, are not considered to be potential pinches). The application must determine which gestures make sense. One method would be to hit test the initial centroid of each gesture against the bugs on the screen. All gestures that hit a bug are accepted. Note that uTouch automatically rejects overlapping gestures, so as soon as AB and CD are accepted, AD and BC will be implicitly rejected.
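The bug-squashing example can be sketched end to end in a few lines of Python: four touches begin within the glue period, same-direction pairs are discarded as non-pinches, and the surviving candidates are hit-tested against the bugs. The touch positions, directions, and bug locations below are invented for illustration.

```python
from itertools import combinations

# Four touches A-D begin within the glue period; their positions,
# motion directions, and the bug locations are made up for this sketch.
touch_pos = {"A": (10, 10), "B": (30, 10), "C": (100, 100), "D": (120, 100)}
direction = {"A": (1, 0), "B": (-1, 0), "C": (1, 0), "D": (-1, 0)}
bugs = [((20, 10), 15), ((110, 100), 15)]     # (center, radius) pairs

def moving_together(a, b):                     # same direction: not a pinch
    da, db = direction[a], direction[b]
    return da[0] * db[0] + da[1] * db[1] > 0

def centroid(pair):
    (x1, y1), (x2, y2) = touch_pos[pair[0]], touch_pos[pair[1]]
    return ((x1 + x2) / 2, (y1 + y2) / 2)

def hits_bug(p):
    return any((p[0] - cx) ** 2 + (p[1] - cy) ** 2 <= r * r
               for (cx, cy), r in bugs)

# All six pairs are matched; AC and BD move together, leaving four pinches.
candidates = [p for p in combinations("ABCD", 2) if not moving_together(*p)]
print(candidates)

# Accepting a gesture implicitly rejects any candidate sharing its touches.
accepted, used = [], set()
for pair in candidates:
    if not used & set(pair) and hits_bug(centroid(pair)):
        accepted.append(pair)
        used |= set(pair)
print(accepted)                                # AB and CD survive
```

As in the text, accepting AB and CD leaves AD and BC implicitly rejected, since each shares a touch with an accepted gesture.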
There is a twist to this complex logic, however. Gesture events are received serially. The client may need to know if more gestures are possible for a set of touches. For example, if both one-touch and two-touch drag gestures are subscribed, a two-touch drag will cause two one-touch drag gestures and a two-touch drag gesture. If the uTouch client receives a one-touch drag first, it may not realize that a two-touch drag is coming for the touch as well. To handle this issue, a gesture property is provided to denote the finish of gesture construction for all of its touches. When a gesture has finished construction, the client knows that it has received all possible gestures containing the same touches. Thus, in the one- and two-touch drag example the one-touch gesture will not emit the gesture construction property until at least the two-touch gesture begin event has been sent to the client.
The uTouch stack was designed to be flexible and provide for all possible gesture use cases. However, it is recognized that not all clients will care about multiple simultaneous gestures. There are plans to create a gesture subscription option that precludes the ability to have multiple simultaneous gestures. This will effectively push some policy into the recognizer, such as a preference for gestures with more touches. This will be particularly useful when subscribing to gestures on an indirect device, like a touchpad, where multiple simultaneous gestures are likely not wanted.
Lastly, uTouch is a complete gesture stack that surpasses the functionality of all available consumer platforms. uTouch works well with both touchscreens and touchpads, and supports both gestures and raw touch events in the same window or region of an application. In contrast, Windows only supports touchscreens and either gestures or raw touch events, but not both, in a given window. OS X supports touchpads but not touchscreens. Mobile platforms are limited to touchscreen support and, due to their modal task design, to gestures in a single application at a time. In contrast to each of these platforms, uTouch has been designed from the ground up to support all device types and all known use cases, including multiple applications and windows at the same time.
uTouch consists primarily of three components: uTouch-Frame, uTouch-Grail, and uTouch-Geis. Each of these will be described briefly below.
uTouch-Frame groups touches into units that are easier for uTouch-Grail to operate on. Gestures are recognized per-device and per-window, so touches are grouped into units representing pairs of devices and windows. This is also where all backends for each window system are implemented. uTouch-Frame events are platform independent.
Some window systems, like X11, also have the concept of touch sequence acceptance and rejection. This functionality is provided through uTouch-Frame as well.
Touch sequence acceptance and rejection is a core aspect of the uTouch stack when used for system-level gestures. Imagine a finger painting application listening for raw touch events (not gestures) is open on a desktop environment where three-touch swipes are used to switch between applications. When the user performs such a swipe, uTouch accepts the touch sequences on behalf of the window manager and switches applications. This prevents the painting application from handling (or even seeing) the touches. In contrast, when the user performs a three-touch tap, uTouch rejects the touch sequences because they do not match a known gesture. The painting application then receives the rejected touch sequences.
uTouch-Grail is the gesture recognizer of the uTouch project. It takes the per-device, per-window touch frames from uTouch-Frame and analyzes them for potential gestures.
Grail events are generated by frame events. Rather than duplicate the uTouch-Frame data, grail events contain gesture data and a reference to the frame event that generated it. This allows for uTouch clients to see the full touch data comprising a gesture.
Grail gesture events are composed of a set of touches, a uniform set of gesture properties, and a list of recognized gesture primitives. Again, the supported primitives are drag, pinch, rotate, and tap. The drag, pinch, and rotate properties are encapsulated by a 2D affine transformation; for more detail on 2D affine transformations, see the Wikipedia article on transformation matrices.
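The idea that one affine transformation can encode all three motion primitives can be illustrated in a few lines: compose rotation (rotate), uniform scale (pinch), and translation (drag) into a single 3x3 matrix, then read the components back out. This is a generic linear-algebra sketch, not grail's internal representation.

```python
import math

# Build a 2D affine matrix from rotation angle, uniform scale, and
# translation, then recover each component: the translation column is the
# drag, the column length is the pinch scale, and atan2 gives the rotation.
def affine(angle, scale, tx, ty):
    c, s = math.cos(angle) * scale, math.sin(angle) * scale
    return [[c, -s, tx],
            [s,  c, ty],
            [0,  0, 1]]

m = affine(math.pi / 4, 2.0, 5.0, -3.0)
drag = (m[0][2], m[1][2])                  # translation component
pinch = math.hypot(m[0][0], m[1][0])       # uniform scale factor
rotate = math.atan2(m[1][0], m[0][0])      # rotation angle in radians

print(drag, round(pinch, 6), round(rotate, 6))
```

A client that only cares about, say, pinch can therefore ignore the other components of the same transformation.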
During operation, a pool of recently-begun touches is maintained. In the current implementation this pool includes any touches that have begun within the past 60 milliseconds of "glue" time. When a new touch begins, it is combined in all possible combinations with touches in this pool in order to create potential gestures matching any active subscriptions.
A new gesture instance is created for each combination of touches. Each instance has an event queue, and new instances have one begin event describing the original state of the touches. The events are queued until any gesture primitive is recognized. When frame events are processed, any changes to touches in a gesture instance generate a new grail event. The new touch state is analyzed, and subscription thresholds and timeouts are analyzed to determine if any of the subscription gesture primitives have been recognized. For example, the default rotate threshold is 1/50th of a revolution, and the default rotate timeout is one half second. If the threshold is met before the timeout expires, the rotate gesture primitive is recognized.
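The rotate example above reduces to a threshold-and-timeout check. Here is an illustrative Python version of that rule, using the default values quoted in the text; the function and constant names are invented for the sketch.

```python
import math

# The default rotate rule quoted above: the primitive is recognized once
# the accumulated rotation reaches 1/50th of a revolution before the
# half-second timeout expires.
ROTATE_THRESHOLD = 2 * math.pi / 50      # radians: 1/50th of a revolution
ROTATE_TIMEOUT_S = 0.5

def rotate_recognized(accumulated_angle, elapsed_s):
    return abs(accumulated_angle) >= ROTATE_THRESHOLD and elapsed_s < ROTATE_TIMEOUT_S

print(rotate_recognized(0.2, 0.3))    # enough rotation, in time -> True
print(rotate_recognized(0.05, 0.3))   # below the threshold -> False
```

If the timeout expires first, the queued events for that instance are simply never flushed to the client as a rotate gesture.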
When a gesture primitive has been recognized, the grail event queue is flushed to the client. The client must process the gesture events and make a decision on whether to accept or reject each gesture.
uTouch-Geis is the C API layer for the uTouch implementation. uTouch originally began as a private X.org server extension; it has since been moved out of the X.org server and into the client side of the X11 system, which required a complete rewrite of uTouch-Frame and uTouch-Grail. However, we have managed to maintain API and ABI compatibility through uTouch-Geis, albeit with a few behavioral changes. uTouch-Geis has two API versions: version 1, a simpler interface, and version 2, an advanced interface. Although both are currently supported, the first version is deprecated in favor of the more flexible second version.
uTouch-Geis also makes gesture event control simpler by wrapping much of the X.org interaction behind an event loop abstraction. The uTouch stack requires careful management of touch grabs and timers. Any client may use uTouch-Frame and uTouch-Grail directly, but uTouch-Geis vastly simplifies incorporating gestures into an application. See the uTouch-Geis API documentation for more information.
We also have begun work on a gesture recognition system for the Chromium web browser. There are many potential gesture interactions that we hope to leverage in the browser. An initial implementation was proposed, but a rearchitecture of the gesture plumbing in Chromium required us to refactor it. We hope to merge an implementation into Chromium in the next few months.
Over the past two years the uTouch team has been working hard to bring multitouch gestures to the Linux desktop. We now have a complete stack that rivals, and in many ways surpasses, what is possible on other platforms. We look forward to further integration of uTouch gestures in desktop environments and applications, and we encourage everyone to take a look at what our stack has to offer.
Due to the US Memorial Day holiday on May 28, and a day off for the LWN crew, next week's edition will be published on Friday, June 1. Whether you are celebrating on Monday or not, we hope you have a nice 28th of May.
Page editor: Jonathan Corbet
Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds