
Development

What is Open Graph?

April 28, 2010

This article was contributed by Nathan Willis

Social networking site Facebook's latest API update, dubbed "Open Graph," is generating diverse debate: about its genuine "open"-ness, its impact on other web services, and (like other Facebook initiatives before it) its approach to user privacy and security. Those issues can get conflated if not properly addressed, however, so it is vital to take a close look at what Open Graph is in order to make a solid critique.

Bird's eye view

Broadly speaking, Facebook's value is in its network effect: there is only one Facebook, and a large number of people are using it, so if you are a content producer, you have access to the entire audience at a single entry point. Up until now, content producers have primarily reached out to this audience with "Facebook Pages," which, despite the vague name, are a very specific site feature. Any business, book, album, or product that wanted to connect with fans had to create a Facebook Page on the site, then collect "fans," essentially creating a duplicate of its off-network identity that was confined within Facebook itself. Page owners then marketed their product to fans in traditional ways: sending messages, posting updates, soliciting feedback, and so on.

Open Graph is designed to duplicate that marketing process but free the content producer from having to re-create the in-network Facebook Page for each product. The producer can instead add semantic tags to any web page that identify it as a particular entity in Facebook's database, and add widgets to the page with which Facebook-using visitors can interact (including adding themselves as fans).

The producer can also add widgets to the web page that extract data from Facebook's aggregated database, such as "recommendations" based on common fan-bases, news items posted by fans, and other activity. To extract that data, Facebook introduced an API alongside the Open Graph tagging syntax itself. The Graph API can only be used to request data from Facebook's servers — and although it is regarded as an improvement on its predecessor APIs, it is also the source of many critics' privacy concerns. It exposes more information about user accounts than many feel is appropriate, and Facebook silently changed its privacy policy to opt everyone in to the API without asking.

The other major criticism of the system is that it stores all of the user-to-product relationship data inside the Facebook silo itself, which creates a single point of failure, and stores it permanently, which affects user privacy. Finally, in spite of prepending the word "Open" to its name, Open Graph was not developed in the open, and at present there does not appear to be anyone outside of Facebook using it for anything. That, of course, does not mean that there is no room for others, including the open source software community, to make use of Open Graph in one form or another.

Tagging content

Open Graph Protocol (OGP) is Facebook's term for the semantic metadata content owners can add to their sites to enable Facebook's servers to reference them. In its current implementation, OGP is a Resource Description Framework in attributes (RDFa) ontology. RDF is the "semantic web" technology created by the W3C; RDFa is a technique for embedding it in a specific way, within special, structured HTML attributes. The ontology is the formal description of the objects, classes, and relationships being modeled. OGP defines RDFa properties in the "og:" namespace, which site creators place in <meta> tags inside the HTML document's <head> section.

A web page can be associated with only one "object," and the official schema defines a set of 38 object types, covering common activities, businesses, places, celebrities, and entertainment products. Several object properties are defined in the schema as well, from location information to contact information, plus general attributes such as "image" and "url."

For example, the movie "Primer" on the Internet Movie Database (IMDB) might include the following tags:

    <meta property="og:title" content="Primer"/>
    <meta property="og:type" content="movie"/>
    <meta property="og:url" content="http://www.imdb.com/title/tt0390384/"/>
    <meta property="og:site_name" content="IMDb"/>
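
As a rough illustration of how a third-party program might read such tags, the following sketch uses only the Python standard library. The class and function names (OGPParser, extract_ogp) are hypothetical, and the sample markup simply repeats the tags shown above; this is one possible approach under those assumptions, not Facebook's own parser.

```python
from html.parser import HTMLParser


class OGPParser(HTMLParser):
    """Collect <meta property="og:..."> tags from an HTML document.

    Hypothetical helper for illustration; not part of any Facebook library.
    """

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <meta .../> also arrive here.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Strip the "og:" prefix and keep the first value seen.
            self.properties.setdefault(prop[3:], content)


def extract_ogp(html):
    """Return a dict of OGP properties found in the given HTML text."""
    parser = OGPParser()
    parser.feed(html)
    return parser.properties


SAMPLE = """<html><head>
<meta property="og:title" content="Primer"/>
<meta property="og:type" content="movie"/>
<meta property="og:url" content="http://www.imdb.com/title/tt0390384/"/>
<meta property="og:site_name" content="IMDb"/>
</head><body></body></html>"""

if __name__ == "__main__":
    for name, value in extract_ogp(SAMPLE).items():
        print(name, "=", value)
```

Because the properties live in plain <meta> tags, nothing in this sketch depends on Facebook's servers; any application can read them.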

Precisely what constitutes an "object" in OGP is not strictly defined — is the DVD release of "Primer" the same object as the theatrical release described on IMDB, for example? Are multiple editions of the same book separate objects? OGP does not currently take a position; Facebook's documentation is focused on the "user X likes product Y" relationship, which is not specific. Other OGP adopters may find the specification needs enhancement in this area.

In practice, Facebook's partner sites also make use of the Facebook Markup Language (FBML) namespace, which is not part of OGP itself, and would include properties such as "fb:admins" and "fb:app_id" that tie directly in to Facebook services.

As of today, OGP has only an informal specification, with examples but not a complete reference, hosted at opengraphprotocol.org. The site says that the specification has been released under the Open Web Foundation Agreement version 0.9, which specifies a royalty-free grant of copyright and patent non-assertion. The site also states that much of the syntax for OGP's properties mimics that found in the hCard microformat specification and the Dublin Core metadata ontology.

OGP's status as an "open" specification was challenged by Chris Messina, who directed his question to Facebook's Dave Recordon, in particular asking who (if anyone) outside of Facebook had participated in the specification's design. On the Open Web Foundation mailing list, Recordon eventually stated that OGP was created by Facebook engineers, with only "feedback" from others.

The API

Like OGP, the Graph API is simple in and of itself. It is RESTful and returns objects in JSON format. All Facebook objects, from people to pages, groups, applications, and events, are accessed the same way, with an https://graph.facebook.com/ID URL, where ID is the official page ID already found in Facebook URLs. Appending /CONNECTION_TYPE to the URL allows the application to query relationships, such as /friends, /photos, /notes, /attending, or /likes, with the valid relationship types varying depending on the object. The API supports basic object introspection, and uses OAuth 2.0 for authentication.
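
The URL scheme described above can be sketched in a few lines of Python. The helper name graph_url and its optional params argument are assumptions made for illustration; the sketch only builds URLs and makes no network request, so it says nothing about authentication or about what the server actually returns.

```python
from urllib.parse import quote, urlencode

# Base URL as given in the article.
GRAPH_BASE = "https://graph.facebook.com"


def graph_url(object_id, connection=None, params=None):
    """Build a Graph API URL for an object or one of its connections.

    object_id  -- the ID (or name) that identifies the object
    connection -- optional relationship, e.g. "friends" or "likes"
    params     -- optional dict of query parameters (hypothetical here)
    """
    path = "/" + quote(str(object_id), safe="")
    if connection:
        path += "/" + quote(connection, safe="")
    url = GRAPH_BASE + path
    if params:
        url += "?" + urlencode(params)
    return url


if __name__ == "__main__":
    # "ID" is a placeholder, exactly as in the article's URL pattern.
    print(graph_url("ID"))
    print(graph_url("ID", "likes"))
```

Keeping URL construction separate from the request itself makes it easy to see that every object type shares one addressing scheme; only the connection name changes.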

The privacy concerns start with the fact that Facebook altered its privacy policy when Graph API launched, and set all of its users' permissions to allow sites using Graph API to retrieve their personal information. But there are other privacy concerns as well. Graph API can not only access users' information, it can also be used to publish new content: writing to a user's wall, creating comments, and responding to event invitations. The new API also makes additional data views accessible that were never available in previous APIs, such as the full list of events that a user is attending.

The button

Bloggers have given considerable attention to the final piece of the puzzle, the "Like this" button that Open Graph-using sites are supposed to add to enable visitors to show their support for the page. As one might expect given OGP and Graph API, all the Like button does is send a Graph API call to Facebook's servers reporting the user ID of the visitor and the metadata about the page culled from the OGP properties in the page header.

To the content publisher, the result is exactly the same as the old Facebook Page fan-collecting process, but with less development overhead. IMDB can automatically collect "likes" from Facebook users by generating OGP headers for every page, rather than having to build separate in-network pages for each entity inside Facebook itself.

But there is more to it than that, at least in theory. OGP's "objects" are supposed to refer to real-world entities, such as Ben Affleck, not to pages, such as http://www.imdb.com/name/nm0000255/. Thus the Facebook server can parse property="og:title" content="Ben Affleck" on both IMDB and on Netflix, recognize that both sites refer to the same person, and react accordingly with advertising, recommendations based on other users' behavior, and so on.
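
How Facebook actually matches pages to entities is not documented; the following is only a naive sketch of the idea, grouping pages whose OGP type and title agree. The function name group_by_entity, the "actor" type value, and the second and third URLs are assumptions for illustration, not real data.

```python
from collections import defaultdict


def group_by_entity(pages):
    """Group page URLs by the (type, title) pair in their OGP properties.

    pages -- iterable of (url, properties) pairs, where properties is a
             dict such as {"title": "...", "type": "..."}
    """
    groups = defaultdict(list)
    for url, props in pages:
        title = props.get("title")
        if not title:
            continue  # pages without a title cannot be matched
        # Normalize case and whitespace so trivial differences still match.
        key = (props.get("type", ""), title.strip().casefold())
        groups[key].append(url)
    return dict(groups)


SAMPLE_PAGES = [
    ("http://www.imdb.com/name/nm0000255/",
     {"title": "Ben Affleck", "type": "actor"}),
    # Hypothetical URLs standing in for pages on other sites.
    ("http://movies.example.com/people/ben-affleck",
     {"title": "Ben Affleck ", "type": "actor"}),
    ("http://movies.example.com/films/primer",
     {"title": "Primer", "type": "movie"}),
]

if __name__ == "__main__":
    for key, urls in group_by_entity(SAMPLE_PAGES).items():
        print(key, urls)
```

Matching on title alone is fragile (two different people can share a name), which is one reason the loosely defined notion of an "object" matters in practice.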

It is not yet clear how external sites using Open Graph will have access to this aggregation of content, or if it will be limited to Facebook itself. Sites will, however, be able to use more complicated Graph API widgets to build on the behavior of their site visitors — the Recent Activity and Recommendations plugins are Facebook's existing examples. It is likely that it was this advanced usage that prompted Facebook to quietly do away with its previous policy of caching user behavior for a maximum of 24 hours — tracking long-term and large-scale behavior patterns requires more than that. That reversal was announced, with less fanfare, alongside Open Graph itself. Facebook is also offering more detailed user tracking to page owners as part of its Insights program, and has expanded the amount of user information it offers in this way.

Apart from the aggregation and user-tracking questions, the Like button's implementation has critics. ReadWriteWeb notes that a Like button can be written to link to a totally different page than the one on which it is placed, and advocates that users save and use a special JavaScript "bookmarklet" instead.

The rest of the Internet

Many advocates of open web technology decried Open Graph primarily on the grounds that if the "Like this" button becomes ubiquitous, Facebook will be the de facto owner of all "like this"/"share this"-style conversations and behavior. That prospect should certainly worry businesses that rely on network effects for generating their own ad revenue or sales, such as Digg or Hotels.com. What else it threatens is less clear.

Almost immediately, a project named OpenLike appeared, setting out to replicate the Open Graph "Like this" button, but redirecting the API to another server. At present it appears to do little (if anything) more than the multi-service offerings of ShareThis.com or its competitors. Starting such a project is an understandable reaction, but regardless of who gets the API hit, if it is aggregated on the server, it is not necessarily more open or free than at Facebook. Making the data publicly accessible has its own privacy problems.

Alex Iskold at ReadWriteWeb believes that Open Graph's biggest effect will be that large sites will begin marking up their pages with semantic information, and nothing prevents other web or software developers from making good use of the RDFa tags used. This is probably true; the initial list of partner sites is small (Microsoft, Pandora, and Yelp), but the OGP documentation alludes to many more, including reference sites like IMDB.

Many open source applications harvest information from reference sites like IMDB, and at present have to do so via screen-scraping code that requires an update every time the site markup or layout changes. Consider all of the collections-management and media center applications that scrape cover images and screenshots from IMDB, for example. If the site does implement OGP, the applications' task becomes much simpler overnight.

That is not to say that OGP is perfect; far from it. It is not the only semantic web project by a long shot — it overlaps with Friend-of-a-Friend (FOAF), hCard, and Activity Streams in some respects, such as event and group descriptions, and competes with GoodRelations on product descriptions. In addition to that, WHATWG is working on an alternative semantic markup scheme called Microdata that encodes metadata in HTML attributes rather than in <meta> tags. This technique makes it possible to mark up multiple items in a single page, rather than limiting each page to a single object as in OGP.

Furthermore, if OGP is to become a useful ontology, Facebook will have to genuinely give up control over it, even when its customers do not like the direction the specification moves; whether Facebook is amenable to that is an open question. Certainly the opportunity is there for other big players (such as Google) to leverage a large pool of OGP semantics against their own hot properties (such as Gmail, Google Reader, and Google Maps) in ways that Facebook would not. Hopefully open source will not be late to the battle.

To others, the principal source of worry is that Open Graph morphs users' Facebook accounts into their general online identity on multiple sites. Clearly commercial services will resist this encroachment; Twitter, eBay, and PayPal have too much at stake in their in-house identity systems to cede that to another company. But a lot of smaller publishing outlets and individual blogs may be tempted to tie in to Facebook solely to increase hit counts. Here the biggest open source challenger is Mozilla, which has been working on lifting identity management out of the server and keeping it within the browser. On April 28, Mozilla announced the first release of its Account Manager add-on, which automates account creation for users and for sites.

There are certainly efforts to build decentralized "Facebook-like" social networking services in an open manner, such as the BuddyPress plugin for WordPress. StatusNet, the company behind the popular Identi.ca microblogging site, has created OStatus as a general-purpose activity-and-status-update extension of its original microblogging protocol that can exchange Facebook-like general traffic. The big challenge these and other open projects face is gaining a significant foothold with the user community, starting from scratch.

The privacy-sensitive have probably stopped using Facebook altogether, and of course, everyone is free to keep an eye on their own personal information's security by monitoring the site's privacy policy. But those concerned about the continued openness of the web itself would be wise to turn their attention to the data that will suddenly become available to them, rather than the button attracting all of the attention. The power of a network like the one Open Graph is designed to create is in the links between the elements, to be sure; with the data OGP adoption could make available, the race is on to do something better with it.

Brief items

Quotes of the week

Anyone who thinks that Perl 6 is fundamentally based on traditional compiler construction techniques taught in universities frankly has no clue as to what a fundamental paradigm shift Perl 6 represents to language design and implementation. It's this fundamental change that ultimately gives Perl 6 its power, but it's also why Perl 6 development is not a trivial exercise that can be done by a few dedicated undergraduates.
-- Patrick Michaud

I am stepping down as the [GCC] spu backend maintainer since Sony removed GNU/Linux (OtherOS) support from their newer PS3 firmware. The main reason is I will no longer have access to a machine to support the target. But really this is also a step backwards for free software support from Sony.
-- Andrew Pinski

I love Emacs, and I have come to really like Bazaar, but I can not help but feel that by helping the Emacs development team move to Bazaar I have actually done both of these communities a disservice. Bazaar has received nothing but bad publicity from the switch, and the Emacs development group appears to have been hampered more than helped--despite the fact that Emacs was switching from crufty CVS.
-- Jason Earl

Doxer 10.04.0 available

Doxer is a documentation system based around a wiki-style markup parser and a Drupal module for online service. It also has tools for extracting documentation from source code. Version 10.04 is the first release of Doxer as a standalone package.

The Fennec browser for Android

There is now a "pre-alpha" version of the Fennec browser available for Android 2.0+ platforms. It comes with a number of caveats - when a Firefox-based browser has explicit warnings about memory use one should pay attention - but it should be an interesting thing for adventurous users to test.

LLVM 2.7 released

Version 2.7 of the LLVM compiler is out. There's a lot of new stuff, including improved C++ and beginning Objective-C support in clang, support for the MicroBlaze architecture, "major progress" in the MC internal assembler (described briefly in this article), static analysis improvements, better virtual machine support, and more. Details can be found in the release notes.

Notmuch release 0.3 now available

Version 0.3 of the "notmuch" email client is available. "The major theme of this release is a huge number of improvements to the emacs interface to notmuch. There's now a lovely new 'welcome screen' that provides a search bar, recent searches, and saved searches. It looks nice, it's extremely convenient to use, and we think it makes a great model for what a 'search-based email interface' should look like. (So we're hoping that someone will imitate it in an upcoming graphical interface to notmuch)." Note that this release had a bug or two, so there is a 0.3.1 update with fixes available.

Qt Mobility 1.0 released

Version 1.0.0 of the Qt Mobility package has been released. Qt Mobility provides a set of APIs aimed at mobile applications; they cover areas like connectivity, sensors, contact management, location, and more. This code will eventually find its way into a MeeGo release. See this white paper [PDF] for more information.

sighttpd 1.0.0 released

Sighttpd is an HTTP server focused on the distribution of realtime streaming data in an embedded context. It is thus useful for applications like camera controllers. The 1.0.0 release has just been announced: "This release includes a new configuration file syntax allowing a single sighttpd instance to provide multiple streams. Examples are provided for streaming Ogg Theora, Motion JPEG and MPEG-4 AVC (H.264) video."

x264 adds Blu-ray support

The developers of the x264 video encoding library have announced that the library is now capable of creating video in the Blu-ray format. It is, they say, the first free Blu-ray encoder. "With x264's powerful compression, as demonstrated by the incredibly popular BD-Rebuilder Blu-ray backup software, it's quite possible to author Blu-ray disks on DVD9s (dual-layer DVDs) or even DVD5s (single-layer DVDs) with a reasonable level of quality. With a free software encoder and less need for an expensive Blu-ray burner, we are one step closer to putting HD optical media creation in the hands of the everyday user."

Open source from the White House

The White House - the seat of the US presidency - has announced that it is releasing some of its improvements to the Drupal content management system. "By releasing some of our code, we get the benefit of more people reviewing and improving it. In fact, the majority of the code for WhiteHouse.gov is already open source as part of the Drupal project. The code we're releasing today adds to Drupal's functionality in three key ways." It is nice to see that the president's office cares about such things.

Newsletters and articles

Development newsletters from the last week

Page editor: Jonathan Corbet


Copyright © 2010, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds