Perl 5.16 and beyond

March 21, 2012

This article was contributed by Dave Rolsky

Perl 5's yearly release rhythm is well established. Because major releases come out every single year, a major release no longer introduces a slew of new features. Instead, it consists of a smaller set of features and bug fixes. Upgrading to a new major version is easier than it's ever been. So why upgrade at all? What's changed in Perl 5 recently, and where is it going?

Syntax extensions outside the core

One of the most important changes has been a push toward making the Perl 5 core more extensible. This work started with the Perl 5.12.0 release, and has continued since then. Many of the internal APIs have been cleaned up and documented. It has also become much easier to write extensions that work like Perl builtins, and even to extend the syntax in entirely new ways. Extensions can modify the "optree" while Perl is compiling your code. The optree is the Perl interpreter's internal representation of a piece of code. It's what the interpreter executes when it runs your program. When extensions are compiled down to ops, they can be as fast as an implementation in the core would be.
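To make the optree less abstract, here is a minimal sketch that dumps the ops for a small subroutine using B::Concise, which ships with Perl. The subroutine and the variable names are made up for illustration, and the exact ops printed vary between Perl versions.

```perl
use strict;
use warnings;
use B::Concise ();

# A tiny subroutine whose compiled form we want to inspect (illustrative only).
my $add = sub { my ( $x, $y ) = @_; return $x + $y };

# Send the dump to a string instead of STDOUT.
my $optree_dump = '';
B::Concise::walk_output( \$optree_dump );

# '-exec' lists the ops in execution order rather than as a tree.
my $walker = B::Concise::compile( '-exec', $add );
$walker->();

print $optree_dump;
```

Each line of the dump is one op. A syntax extension that compiles down to ops ends up in this same structure, next to the ops generated for ordinary Perl code.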

The smart match feature that was added in 5.10.0 provides a good example. Smart match includes the "given" and "when" builtins, as well as the "~~" operator. In code, smart match looks like this:

    $value ~~ @array
    @array ~~ qr/foo/

The behavior of ~~ varies based on the type of both operands. The first example checks to see if $value is a member of @array. The second example checks to see if any member of @array matches the regular expression. The Perl 5 Porters eventually realized that this feature's behavior is much too complicated. Unfortunately, this realization happened after the feature shipped, so we can't simply remove or change it out from under existing users.
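As a rough illustration of what those two expressions ask, the sketch below spells them out with grep. It assumes plain string values; the real operator has many more cases depending on operand types, which is exactly the complexity described above. The sample data is invented.

```perl
use strict;
use warnings;

my @array = qw(foo bar baz);
my $value = 'bar';

# Roughly what `$value ~~ @array` asks when the elements are plain strings:
# is $value equal to at least one element?
my $is_member = grep { $_ eq $value } @array;

# Roughly what `@array ~~ qr/foo/` asks:
# does at least one element match the pattern?
my $any_match = grep { $_ =~ qr/foo/ } @array;

print 'member: ', ( $is_member ? 'yes' : 'no' ), "\n";
print 'match:  ', ( $any_match ? 'yes' : 'no' ), "\n";
```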

What we need is a way to change the behavior in future releases while still providing access to the old behavior. We could, of course, just do this in the Perl core's C code. If the "enable old behavior" flag is on, we use the old code path, otherwise we use the new path. While this is feasible, it's not desirable. It clutters the core, and the more features that go down this path, the messier the core gets.

Jesse Luehrs's smartmatch module lets us alter the behavior of the smart match feature in a lexical scope by loading a module. The smartmatch::engine::core module implements the original behavior as introduced in 5.10.0. If we include "use smartmatch 'core'" in our own code, the smart match operator and builtins use the old behavior. Meanwhile, another module can include "use smartmatch 'sane'" and get a different behavior.

Because this module takes advantage of hooks in the Perl interpreter for syntax extensions, it can actually insert itself into the compiled Perl optree. This means that the extension can be as fast as the core implementation without being part of the core C code. When the smartmatch behavior in the core is changed, this module can be shipped with that release. We can even arrange for code that includes "use v5.14" to use the old smart match behavior, while code that includes "use v5.18" gets the new behavior.
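The lexical scoping itself can be sketched with Perl's documented user-pragma mechanism, the %^H hints hash. This is not how the smartmatch module is implemented (it hooks into the optree); the match_style package, its hint key, and engine_in_effect are hypothetical names used only to show how two scopes in one program can select different behaviors.

```perl
use strict;
use warnings;

# Hypothetical pragma, defined inline so the example is self-contained.
BEGIN {
    package match_style;
    $INC{'match_style.pm'} = __FILE__;    # pretend the module is already loaded

    sub import {
        my ( $class, $engine ) = @_;
        # %^H is scoped to the block currently being compiled.
        $^H{'match_style/engine'} = $engine;
    }

    sub engine_in_effect {
        # Element 10 of caller() is the hints hash at the call site.
        my $hints = ( caller(0) )[10] || {};
        return $hints->{'match_style/engine'} // 'default';
    }
}

my @seen;
{
    use match_style 'core';
    push @seen, match_style::engine_in_effect();
}
{
    use match_style 'sane';
    push @seen, match_style::engine_in_effect();
}
push @seen, match_style::engine_in_effect();    # outside both scopes

print "@seen\n";
```

Each block sees only the engine it asked for, and code outside both blocks is unaffected. That is the property that lets old and new behavior coexist in one program.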

This extensibility also makes it easy to prototype new features as CPAN modules. Florian Ragwitz's List::Gather module on CPAN adds gather and take "builtins" which compile down to operations in the parsed program. This syntax provides a nice way to create an array from an iterating operator:

    use List::Gather;
    
    my @list = gather {
        while (<$fh>) {
            next    if /^\s*$/;
            next    if /^\s*#/;
            last    if /^(?:__END__|__DATA__)$/;
            take $_ if some_predicate($_);
        }
    
        take @defaults unless gathered;
    };

This syntax could easily be included in the core simply by bundling the List::Gather module in the core distribution. All we need to do is make "use v5.18" load List::Gather behind the scenes. There's no need to alter the core at all.

What's new in Perl 5.16.0

Perl 5.16.0 doesn't have any huge features, but it does have a collection of small features and bug fixes that will make Perl 5 better. In 5.16.0, we now have a __SUB__ token. This token returns a reference to the current subroutine, letting us write cleaner recursive closures:

    use feature 'current_sub';

    my $factorial = sub {
        my $val = shift;
        if ( $val > 1 ) {
            # __SUB__ is a reference to the anonymous sub that is running
            return $val * __SUB__->( $val - 1 );
        }
        else {
            # Base case: 0! and 1! are both 1
            return 1;
        }
    };

    print $factorial->(5), "\n";

The 5.16.0 release also adds support for the Unicode 6.1 standard, in addition to several other Unicode improvements. We now have much better support for Unicode characters in symbol names (packages, methods, etc.). The new fc operator and \F escape implement the Unicode foldcase operation, which does proper case-folding for all languages.
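A small sketch of why fold case matters for caseless comparison, assuming Perl 5.16 or later. The sample strings are my own; the German sharp s is written as an escape so the example does not depend on the source file's encoding.

```perl
use strict;
use warnings;
use feature qw(fc unicode_strings);    # fc needs Perl 5.16 or later

my $german = "Stra\x{DF}e";    # U+00DF is LATIN SMALL LETTER SHARP S
my $upper  = "STRASSE";

# lc() leaves the sharp s alone, so the lowercased strings differ.
my $lc_equal = lc($german) eq lc($upper) ? 1 : 0;

# fc() folds the sharp s to "ss", so the folded strings agree.
my $fc_equal = fc($german) eq fc($upper) ? 1 : 0;

# \F ... \E applies the same folding inside an interpolated string.
my $folded = "\F$german\E";

print "lc comparison matches: $lc_equal\n";
print "fc comparison matches: $fc_equal\n";
print "folded form: $folded\n";
```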

As part of the previously mentioned push to make the interpreter internals more extensible, 5.16.0 documents a number of functions for manipulating "pads". A pad (or scratchpad) is the data structure that stores lexical variables for each subroutine. This API was already in use by some modules, like List::Gather, but documenting it means that module authors can rely on the API's stability.

There has also been a lot of work on the core documentation. Perl 5.16.0 ships with a new object-oriented programming tutorial, and the object-oriented reference documentation has been rewritten from scratch with expanded coverage.

In addition to these changes, Perl 5.16.0 has many other bug fixes, performance improvements, documentation improvements, and core module updates.

A detour through the Perl 5 ecosystem

Just talking about the core when talking about the state of Perl 5 misses much of what makes Perl Perl. Perl's greatest strength has always been the Comprehensive Perl Archive Network (CPAN) and the larger Perl community. This is a huge topic, so I'll only hit a few highlights.

If you haven't looked at Perl recently, you may not have heard of Moose. Moose borrows from the Common Lisp Object System (CLOS), Perl 6, and many other languages. Moose provides a clean declarative API for declaring classes and roles. This eliminates huge amounts of boilerplate that Perl's native object system requires, allowing developers to focus on what their class does, rather than how it does it. Moose also provides a self-hosted metamodel, which means that it can be extended by writing classes and roles using Moose itself. Moose has been adopted by many CPAN authors for their own modules, and there are dozens of Moose extensions available.
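For a sense of the boilerplate in question, here is a sketch of a small class written by hand with Perl's native object system. The Point class and its attributes are invented for illustration.

```perl
use strict;
use warnings;

package Point;

# Hand-written constructor: argument handling, defaults, and blessing.
sub new {
    my ( $class, %args ) = @_;
    die "x is required\n" unless defined $args{x};
    my $self = { x => $args{x}, y => $args{y} // 0 };
    return bless $self, $class;
}

# Hand-written read/write accessors, one per attribute.
sub x { my $self = shift; $self->{x} = shift if @_; return $self->{x} }
sub y { my $self = shift; $self->{y} = shift if @_; return $self->{y} }

package main;

my $point = Point->new( x => 3 );
$point->y(4);
printf "%d,%d\n", $point->x, $point->y;
```

With Moose, the constructor and accessors are typically replaced by one "has" declaration per attribute, and type checks and required attributes are declared there instead of being coded by hand.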

Perl 5 has a number of excellent web frameworks available. Catalyst, Dancer, and Mojolicious are all mature, well-supported, and widely used. All of these frameworks build on top of the PSGI spec and Plack tools, which are inspired by Python's WSGI and Ruby's Rack, respectively. Any application or framework that implements the PSGI spec can easily be deployed using FastCGI, standalone servers like Starman, mod_perl, or even plain old CGI. This makes writing and deploying Perl web applications easier than it's ever been.
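At its simplest, a PSGI application is a code reference that takes an environment hash and returns a three-element array reference: status, headers, and body. The sketch below calls such an application directly with a hand-built environment, so it runs without Plack installed. The path and the response text are invented.

```perl
use strict;
use warnings;

# A minimal PSGI application: environment in, [status, headers, body] out.
my $app = sub {
    my ($env) = @_;
    my $path = $env->{PATH_INFO} // '/';
    return [
        200,
        [ 'Content-Type' => 'text/plain' ],
        ["Hello from $path\n"],
    ];
};

# A server would normally build the environment; here it is faked by hand.
my $response = $app->( { REQUEST_METHOD => 'GET', PATH_INFO => '/demo' } );

my ( $status, $headers, $body ) = @$response;
print "$status\n", @$body;
```

Because the interface is only a code reference, the same application can be handed to any PSGI-compatible server. Saved in a file whose last expression is $app, it can be run with the plackup tool that ships with Plack.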

Perl also has a number of object-relational mapping (ORM) modules, the most popular of which is DBIx::Class. It has been under development since 2005, and has seen wide adoption throughout the community, attracting a number of core contributors, as well as inspiring dozens of extensions.

The services built around the CPAN archive are also exciting. The MetaCPAN API provides a web API to a database of every distribution ever uploaded to the CPAN archive. This API is open and freely usable, so anyone can build tools on top of it. There is already a new CPAN search site, also called MetaCPAN, that uses this API.

The CPAN Testers service collects test reports on every distribution uploaded to CPAN. Clients test the distributions on many different platforms and Perl versions. The service received its twenty millionth test report on March 7, 2012, and is currently receiving nearly 1 million reports a month.

The community is also busy organizing events. There are YAPCs (Yet Another Perl Conferences) this summer in the US, Germany, Brazil, and Japan, as well as many smaller workshops scheduled or in the works.

Future plans for Perl 5

What happens after Perl 5.16.0? What will be in 5.18.0 or 5.20.0? Perl 5 development is volunteer driven, and I cannot commit anyone else's time. That said, here are some ideas that have been floated for future releases.

The project I'm most excited about is work on a MOP for the Perl 5 core. A MOP is a metaobject protocol: an API for inspecting and manipulating classes, methods, and attributes. This is what Moose provides, but Moose does this as an extension. The goal is to create an API that can be implemented by the Perl 5 core and extended by modules like Moose. Putting this in the core has the potential to make modules like Moose much faster. However, the MOP is not just for Moose. It will be flexible enough to support multiple object systems, and will be usable as a minimal object-oriented system all on its own, without extension.

As a bonus, this work will also include a few new bits of syntax. In particular, classes and methods will finally be distinct from packages and subroutines, and there will also be core support for roles, named method parameters, and attribute declaration.

I already mentioned the work on making the core more extensible. That work is an ongoing effort that opens up the possibility of breaking backward compatibility in a saner way, as with the smartmatch module example. This in turn frees up core developers to fix old design mistakes and introduce new features that might otherwise break old code.

Unicode work is also progressing. The 5.18.0 release will (hopefully) include support for set operations on Unicode character classes in regular expressions, as well as Unicode-related performance improvements.

While it's hard to predict the specifics of the future, I'm excited to see the activity and effort going into Perl 5 core development these days. The new release schedule, along with a move to Git, seems to have attracted some new contributors. Looking at the activity in the core and the community, Perl 5 is on a healthy path toward the future.


Moose

Posted Mar 22, 2012 2:27 UTC (Thu) by dskoll (subscriber, #1630)

Moose is nice, but... it's a bit of a memory pig. :(

Starting perl on Linux/x86 yields a process with VSZ around 3MB. Executing use Moose; expands that to about 9.5MB.

That might not seem like a lot, but throw in a few more CPAN modules and your memory footprint quickly gets huge.

Moose

Posted Mar 22, 2012 2:42 UTC (Thu) by autarch (subscriber, #22025)

This is one reason I'm excited about the possibility of core MOP. That will move some of what Moose does into the core, which will presumably be both faster and more efficient.

Moose

Posted Mar 23, 2012 13:45 UTC (Fri) by jnareb (subscriber, #46500)

There are various lighter-weight (but more limited) versions of Moose (or rather Moose-compatible object systems): Mouse, Moo, and Mo, and there is a build-time equivalent, Mite.

Moose

Posted Mar 23, 2012 23:05 UTC (Fri) by dskoll (subscriber, #1630)

Well, yeah, but let's say you want to use CPAN modules X, Y, Z and W. If X wants Moose, Y wants Mouse, Z wants Moo and W wants Mo, you end up with Mess. :(

Moose

Posted Mar 27, 2012 20:13 UTC (Tue) by jnareb (subscriber, #46500)

> Well, yeah, but let's say you want to use CPAN modules X, Y, Z and W. If X wants Moose, Y wants Mouse, Z wants Moo and W wants Mo, you end up with Mess. :(

Well, there are the Any::Moose and Any::Mo modules. For example, if you use Any::Mo you are limited to what Mo can do, but you can use any of Mo, Moo, Mouse, or Moose, each of which can implement this subset of OOP.

Moose

Posted Mar 29, 2012 3:11 UTC (Thu) by harbud (guest, #83808)

No, the problem is I cannot control what X, Y, Z, and W use. X might use Moose specifically and not Any::Moose, Y might use Mouse, and so on.

Perl 5.16 and beyond

Posted Mar 23, 2012 16:21 UTC (Fri) by b7j0c (guest, #27559)

putting MOP in the core would be a great move, please do!

Perl 5.16 and beyond

Posted Mar 25, 2012 11:21 UTC (Sun) by kleptog (subscriber, #1183)

I sometimes wonder where Python would be now if it had such an extensible core. The idea that you might be able to, per module, define whether you want Python 2 or Python 3 string semantics. from __future__ does some of the same work, but doesn't go far enough.

Perl 5.16 and beyond

Posted Mar 27, 2012 10:37 UTC (Tue) by man_ls (guest, #15091)

It looks like it would be a sane solution for the Python 2->3 transition mess. Not only string semantics, but to provide complete compatibility for Python 2 modules in Python 3 programs. It would be a nice way forward.

Re: What if Python had such an extensible core

Posted Jun 3, 2012 8:47 UTC (Sun) by gps (subscriber, #45638)

I don't see how this would help bridge the Python 2 <-> 3 boundary within a single program.

The real issue with doing that is the fundamental string type change; you can't magically pass data across such an interface boundary without deciding which way the types convert. Python is so dynamic that in most situations there is no definition carried around of where something came from such that a boundary crossing could be determined and enforced.