Annotations, Attributes, Traits

At last year's LPW, Stevan Little demonstrated a draft implementation of a new class system for Perl 5 (talk video can be found here). Here is how the proposed syntax works:

class Point {
    has $x ( is => 'rw' ) = 0;
    has $y ( is => 'rw' ) = 0;

    method clear {
        ($x, $y) = (0, 0);
    }
}

One of the questions that came up during the discussion afterwards was what the best syntax would be for declaring the rich sets of additional metadata associated with class members and methods. This post is my take on the options that we have.

What's the problem

I believe Perl 5 needs to have a way to apply metadata annotations to different entities (i.e. subroutines, variables, packages, types), which meets the following criteria:

Extensible

Most programming languages provide a syntax (see below) that allows developers to apply arbitrary metadata to entities. I would like to see Perl go beyond that and make all the different syntaxes for declaring various types of metadata (keywords, prototypes, types) available for developers to use.

Consistent

I think this is the biggest problem with the current state of metadata syntax in Perl. The sugar syntax for Moose attributes is nice, but it cannot currently be used for other types of entities - e.g. subroutines or lexical variables. The method keyword provided by Devel::Declare-based modules behaves almost like the built-in sub keyword, but not quite. Consider the three most popular implementations of the method keyword - Method::Signatures, MooseX::Method::Signatures, and Method::Signatures::Simple:

method NAME;               # a "forward" declaration - broken in all three
method NAME : ATTRS BLOCK  # with attributes - only works in Method::Signatures
$methodref = method BLOCK; # anonymous method - broken in MooseX::Method::Signatures

Also, if you use signatures with a method keyword, you cannot use the same signatures with the sub keyword, for no reason other than the limitations of the Perl parser. Similarly, some modules allow you to do funky things such as multi method ..., but not multi sub .... Until Perl has a way to resolve these inconsistencies, the new syntax will look experimental to many people and adoption levels will remain marginal.

Concise

We want syntax that is kind to both the eyes and the fingers, and that we can be proud of when standing next to other programming languages.

How different languages do it

At different stages in their evolution, most languages have reached a point where they needed a way to declare arbitrary additional information about a given entity.

Perl 5 attributes

Perl 5 added attributes in version 5.6. Attributes can be attached to a variable or a subroutine. Some examples of how attributes are used:

my %hash : Tie(Tied::Class) = ...; # Tie::Attributes 
sub calculate : Memoized { ... } # Attributes::Memoize
sub edit : Chained(/) PathPart(edit) Args(0) { ... } # Catalyst

Due to some limitations in their implementation, attributes do not enjoy a stellar reputation in the Perl world:

  • Attributes are normally declared at compile-time and cannot be manipulated during runtime.
  • Attributes are difficult to introspect.
  • Declaring attributes is tricky and may involve polluting your symbol table.
  • Attributes provided by different modules do not play well together.
  • The attribute argument cannot access symbols from its context (e.g. my $lazy_var :Builder(\&_builder_sub) does not work).
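
To make the first few points concrete, here is a minimal sketch of how a package accepts a custom subroutine attribute in core Perl. MODIFY_CODE_ATTRIBUTES is the hook documented in the attributes pragma; the %ATTRS registry and the attributes_of helper are hypothetical names invented for this illustration, and the Memoized attribute is only recorded here, not implemented.

```perl
use strict;
use warnings;

# Hypothetical registry: stringified code reference => list of attribute names.
# A package variable is used because the hook below runs at compile time.
our %ATTRS;

# perl calls this hook while compiling every attributed sub in this package.
# The hook has to be installed in the package's own symbol table, which is
# the "pollution" mentioned in the list above.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $coderef, @attrs) = @_;
    push @{ $ATTRS{$coderef} }, grep { $_ eq 'Memoized' } @attrs;
    # Whatever is returned is reported by perl as an invalid attribute.
    return grep { $_ ne 'Memoized' } @attrs;
}

# Hypothetical introspection helper: core Perl offers nothing this direct
# for custom attributes, so the registry has to be maintained by hand.
sub attributes_of {
    my ($coderef) = @_;
    return @{ $ATTRS{$coderef} || [] };
}

sub calculate : Memoized { return 42 }

my @found = attributes_of(\&calculate);
```

If two modules both want to install MODIFY_CODE_ATTRIBUTES into the same package, one of them has to know how to delegate to the other, which is one way the "do not play well together" problem shows up in practice.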

Perl 6 traits

Perl 6 comes with a powerful annotation syntax out of the box. Traits can be applied to classes, functions and variables:

constant $pi is Approximated = 3;
my $key is Persistent(:file<.key>);
sub fib is cached {...}

Java annotations

Metadata annotations were introduced in J2SE 5.0.

@Test
public static void edit() { ... }

C# attributes

C# has supported attributes since the first release of .NET, which predates Java's annotations:

[Test]
public static void edit() { ... }

Attributes can be added to assemblies, types, methods, parameters, members and variables.

Python decorators

Python calls them decorators and borrows Java's syntax. Python decorators can be used to annotate a class, function or method:

@Test
def edit():
    ...

Decorators were first introduced in Python 2.4. A fascinating overview of the different proposals for decorator syntax can be found in PEP 318.

Other languages

Unfortunately this is where my knowledge ends, so feel free to add more examples in the comments ...

But what are annotations anyway?

Let us backtrack for a moment and consider what a metadata annotation is. Moose's powerful attributes (not to be confused with Perl's built-in attributes) are a very good starting example:

has 'name', is => 'ro', isa => 'Str', required => 1;

With other types of constructs, identifying annotations is less clear:

# annotate function 'bar' as exportable - classic implementation
use base 'Exporter';
our @EXPORT = qw(bar);
sub bar { ... }

# attributes-based implementation
use Perl6::Export::Attr;
sub bar :Export(:DEFAULT) { ... }

# and a Moose-style implementation might look like this:
function 'bar', export => 1, body => sub { ... };

# a few possible syntaxes to annotate something as read-only:
readonly my $var            # call a function on declaration
my $var :readonly           # use attributes
use readonly var => ...     # a pragma
readonly qw($var)           # call a function on symbol name
has 'var', is => 'readonly' # Moose-style properties

So, while there may be different ways to associate metadata with an object, ultimately it can always be expressed as a list of properties applied to that entity, e.g.:

# pseudocode
function ( name => 'bar', export => 1, body => sub { ... } );
variable ( name => 'var', type => 'scalar', read_only => 1, value => ... );

This is basically how Moose's metaobject protocol stores information internally. You could use Moose sugar keywords to create objects, or raw Class::MOP methods, or you could create them using Perl's built-in syntax, but it all ends up looking the same underneath. Hence my three rules of metadata:

  1. Everything is metadata
  2. There is more than one way to apply metadata
  3. But all metadata is the same on the inside
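
A toy illustration of the three rules, using only core Perl. It is a sketch of the idea and not how Moose or Class::MOP are implemented; the %META registry and the function and export keywords are hypothetical names made up for this example.

```perl
use strict;
use warnings;

# Hypothetical registry: entity name => hash of properties.
my %META;

# Front end 1: a Moose-style declaration that takes every property at once.
sub function {
    my ($name, %props) = @_;
    $META{$name} = { name => $name, %props };
    return;
}

# Front end 2: an Exporter-style declaration that flags entities
# which have already been declared.
sub export {
    my (@names) = @_;
    $META{$_}{export} = 1 for @names;
    return;
}

# Two different surface syntaxes ...
function 'bar', export => 1, body => sub { 'bar' };

function 'baz', body => sub { 'baz' };
export 'baz';

# ... that leave records of the same shape behind.
my $bar_shape = join ',', sort keys %{ $META{bar} };
my $baz_shape = join ',', sort keys %{ $META{baz} };
```

Both records end up with the same keys (body, export, name), regardless of which declaration style produced them.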

If you start thinking in metadata, you realize that most syntax is just sugar for metadata application, and you start appreciating that powerful and elegant metadata manipulation tools in a programming language are the most important requirement for writing powerful and elegant software. That is why I am very keen for Perl to be great with metadata. I think it is already 90 percent there, but there is still some work to be done. My experiments in creating an interface for MooseX::Params provide a good illustration of what is in my opinion one of the better approaches.

The evolution of MooseX::Params

Moose's attribute syntax is quite powerful and suffers from none of the limitations of Perl's built-in attributes. This is why the original implementation of MooseX::Params used that to define methods on steroids:

method 'foo', args => [ ... ], build_args => 1, returns => 'Str', body => sub { ... };

But the problem with this syntax is that it becomes very verbose very quickly. In a programming language that counts the number of characters in commonly used built-in keywords, the above statement contains a few arrows too many (and commas, and quotes ...). A number of modules exist on CPAN to address Moose's verbosity (MooseX::Has::Options, MooseX::Has::Sugar), but eventually MooseX::Params chose to start using function attributes:

sub foo : Args(bar, baz) BuildArgs Returns(Str) { ... }

Even then people complained that having to type colon-Args for every simple subroutine is quite a nuisance. So the latest version on git attempts (semi-successfully) to hijack the prototype slot for the signature:

sub foo (bar, baz) : BuildArgs Returns(Str) { ... }

What we have done here is use the language's built-in syntax to fill a metadata slot. There is nothing particularly innovative about this - in an ordinary program, most metadata slots are filled in not using the extended metadata syntax (attributes, traits, decorators or whatever) but by built-in language syntax. This Java method declaration:

@Test
public static String foo ( String bar, int baz ) { ... }

is just a concise way to write this:

method (
    name       => 'foo',
    scope      => 'public',
    static     => 1,
    returns    => 'String',
    parameters => [ bar => ..., baz => ... ],
    test       => 1,
    body       => sub { ... },
);

All programming languages we described above provide special syntax slots for commonly used types of built-in metadata, and a catch-all syntax for any additional metadata. What makes Perl interesting is that at least some of the special built-in metadata slots are actually available for third party modules to use.

Hookable syntax

Let us look in more detail at those special syntax slots. The first example, which we already mentioned, is the prototype slot. Core Perl uses it for prototype declaration. MooseX::Params and the signatures pragma use it for signature declaration. Web::Simple uses it for route specification:

sub foo (*&)         # prototype
sub foo ($bar, $baz) # signatures pragma signature
sub foo (bar, baz)   # MooseX::Params signature
sub foo (/foo)       # Web::Simple route
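
As far as core Perl is concerned, the prototype slot is just a string stored on the subroutine, which is part of what makes it attractive to reuse. The sketch below sticks to a valid core prototype and reads it back with the built-in prototype function; apply_block is a hypothetical name, and the way the modules above actually intercept the slot is not shown.

```perl
use strict;
use warnings;

# A core prototype: a code block followed by a list of values.
sub apply_block (&@) {
    my ($code, @items) = @_;
    return map { $code->($_) } @items;
}

# The built-in prototype() returns the contents of the slot as a string.
my $slot = prototype \&apply_block;

# Because of the & in the prototype, a bare block is accepted as the
# first argument, without the sub keyword or a trailing comma.
my @doubled = apply_block { $_[0] * 2 } 1, 2, 3;
```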

A less obvious example of how built-in syntax can be used to modify metadata information is the assignment operator. By using tie or Variable::Magic, for example, we can theoretically hijack the assignment process and make it produce various side-effects, including changing the metadata information for the affected variable. Albeit by different means, the proposed new class syntax uses the assignment operator to set the default value of the class member, which is different from its actual value:

has $foo = 'bar'; # set 'foo' to 'bar' unless a different 
                  # value is passed to the constructor
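
The tie approach mentioned above can be sketched with core Perl alone. The example is only an assumption about what "changing metadata on assignment" could look like: CountingScalar is a hypothetical class that records how many times the variable has been written to, and it is unrelated to how the proposed class syntax implements defaults.

```perl
use strict;
use warnings;

{
    package CountingScalar;    # hypothetical class name

    # Called by tie(); the object holds the value and its metadata.
    sub TIESCALAR {
        my ($class) = @_;
        return bless { value => undef, writes => 0 }, $class;
    }

    # Called on every read of the tied variable.
    sub FETCH {
        my ($self) = @_;
        return $self->{value};
    }

    # Called on every assignment: store the value and update the metadata.
    sub STORE {
        my ($self, $value) = @_;
        $self->{writes}++;
        $self->{value} = $value;
        return;
    }
}

tie my $foo, 'CountingScalar';

$foo = 'bar';    # looks like a plain assignment, but runs STORE
$foo = 'baz';

my $writes = tied($foo)->{writes};
```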

Perl has another convenient metadata slot which due to its limitations has remained largely unused:

my Foo $bar = Foo->new;

The type slot allows you to declare that the given variable is meant to contain an object that is an instance of the specified class. Unfortunately, this slot is tricky for third-party modules to make good use of. Currently the types pragma uses it to provide compile-time hints to the optimizer, MooseX::Lexical::Types provides limited support for Moose type validation, and Lexical::Types uses it to automatically bless lexicals into objects.

my int $foo; # optimization with the types pragma
my Int $foo; # validation with MooseX::Lexical::Types
my Int $foo; # automatically blessed with Lexical::Types

It is possible that future perls will make it even easier for modules to access the information from this slot and put it to better use (another possible application that comes to mind is datatype declarations for database columns in an ORM).
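
For reference, this is how the type slot behaves in core Perl without any of the modules above. Foo is a hypothetical class; the point of the sketch is that the class name is accepted by the parser but, as far as I know, nothing validates later assignments.

```perl
use strict;
use warnings;

{
    package Foo;    # hypothetical class; it must exist at compile time
    sub new { return bless {}, shift }
}

# The class name is parsed as part of the declaration.
my Foo $bar = Foo->new;
my $was_foo = ref($bar) eq 'Foo';

# Core perl accepts this without complaint: the declared type is
# not checked against the value being assigned.
$bar = 42;
```

This gap between what the syntax suggests and what perl enforces is the room that modules such as MooseX::Lexical::Types try to fill.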

A syntax for Perl classes

And finally we are back to where we started from. I believe Perl's existing syntax is powerful enough to provide succinct and extensible metadata declarations. Given your run-of-the-mill Moose attribute:

has 'foo', is => 'ro', isa => 'Str', lazy => 1, default => 'bar';

I would love to be able to write the above declaration as:

has Str $foo : ro lazy = 'bar';

And have methods that do:

method foo (Str $bar, Int $baz) : build_args returns(Str) { ... } 

It is perlish, nice and clean.

Stay tuned for the next post, where we will delve deeper into more complicated metadata, functions with multiple bodies, and challenges to metadata declaration in general.
