Annotations, Attributes, Traits - Part II

This post is a continuation of Annotations, Attributes, Traits and explores the different options for applying metadata that are available to Perl, and what is needed to make these options available for third-party modules to use.

Context

Let us start with a list of the contexts where we may have to set metadata:

  1. Variable declaration
  2. Subroutine declaration
  3. Package declaration
  4. Moose attribute declaration

The last item requires some explanation. The attribute syntax that Moose implemented has become pervasive in modern Perl modules, and is now used in many new libraries, whether related to Moose or not. Even though the new syntax proposed by Stevan implements attributes as variables, there are many other uses of attributes and it is safe to assume that they are here to stay for the time being (e.g. HTML::FormHandler, DBIx::Class::Candy, Parse::FixedRecord).

There is also Moose type declaration (subtype, enum and the like), which I have chosen to ignore in order to make the discussion more manageable. I do believe it will benefit from most of the proposals here.

Also, we will omit package declarations for the time being, as we will be talking about them later.

A note on attribute case

All examples with perl attributes in this post use lowercase syntax. The attributes documentation currently advises against this practice, and perl will issue a warning when it encounters a non-core attribute starting with a lowercase letter. The main reason for this is to avoid conflicts with new core attributes that may be introduced in the future. My personal opinion is that:

  1. Lowercase names with underscores are way more perlish and look much better in most code, and if attributes are to gain wider usage I believe this syntax should definitely be permitted.

  2. AFAIK, core perl has not added new attributes since 5.6 in which attributes were initially introduced.

  3. Even if new attributes need to be added, at this stage it probably could (and should) be done in a module that requires explicit loading or via the feature pragma, so conflicts should not be an issue.
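A minimal sketch of what accepting a lowercase attribute looks like today, assuming the package handles its own attributes through the MODIFY_CODE_ATTRIBUTES hook from the attributes documentation. The %ATTRS registry and the memoize attribute name are made up for illustration, and the attribute is only recorded, not acted upon. The no warnings 'reserved' line is meant to switch off the warning category that covers lowercase names:

```perl
use strict;
use warnings;

our %ATTRS;    # hypothetical registry: stringified code ref => attribute names

# perl calls this hook for every sub declared with attributes in this
# package; returning an empty list means all attributes were accepted.
sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @attrs) = @_;
    push @{ $ATTRS{$code} }, @attrs;
    return;
}

{
    # lowercase attribute names are reported under the 'reserved' category
    no warnings 'reserved';
    sub slow_square : memoize { return $_[0] ** 2 }
}

my @attrs = @{ $ATTRS{ \&slow_square } || [] };
print "slow_square carries: @attrs\n";
```

The attribute name reaches the handler as a plain string, so what it means is entirely up to the module that accepts it.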

Setting metadata

We will now attempt to list all the different ways to set metadata in all the contexts listed above. Some of them we already discussed in the previous post.

Keywords

This is a way of setting metadata by changing the keyword for declaring the respective entity:

# variables
my $var;
our $var;
state $var;
has $var;

# subroutines
sub doit { ... }
method doit { ... }
action doit { ... }

# attributes
has name => ...
has_field name => ...
has_class name => ...
column name => ...
test name => ...

Keyword modifiers

This method involves using a keyword before the main declaration keyword:

# variables
const my $var;

# subroutines
my sub doit { ... }
multi method doit { ... }

# attributes
class has 'name' ...

Type

The use of the type slot (between the declaration keyword and the entity name) is currently limited to variables:

my Str $var;

It could possibly be implemented for subroutines too to indicate the return value, like Java does (if the subroutine ignores context) - but AFAIK Perl 6 does not do it and there must be a good reason for that (e.g. because functions can return multiple values, which they can't in Java):

# method returning a string
method Str doit { ... }

In order to make types work with attributes, the attribute keyword must have a way to check if its first argument is a type name or the attribute name (which is probably not possible now):

column Varchar 'name' ...

Assignment

The assignment operator as part of the declaration is only available to variables:

my $var = 1;  # set value
has $var = 1; # set default value

It could theoretically be used for attributes with a similar meaning:

column name = 1; # default column value

There could, of course, be other, more interesting uses. Just today I had a look at Tie::Array::CSV, prompted by an entry on IronMan:

tie my @file, 'Tie::Array::CSV', 'filename';

With a clever use of the assignment operator this already nice syntax could be transformed into:

my @file : csv = 'filename';

Signature

The signature slot is only available to subroutines:

sub doit (@args) { ... }

Its various potential uses were discussed in the previous post.

Code

The code block slot is currently only available to subroutines:

sub doit { ... }

It could theoretically be used with variables and attributes to assign e.g. a builder:

# declare a local variable whose value is the result of an expensive 
# calculation, but defer the calculation until the first time the value
# is requested
my $var :lazy { ... }; # shortcut for: my $var :lazy = sub { ... }
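Something close to the :lazy idea can already be approximated with tie, at the cost of a much clunkier declaration. This is only a sketch under that assumption: Lazy::Scalar is a hypothetical class written for this example, not an existing module.

```perl
use strict;
use warnings;

package Lazy::Scalar;    # hypothetical helper, defined only for this sketch

sub TIESCALAR {
    my ($class, $builder) = @_;
    return bless { builder => $builder }, $class;
}

sub FETCH {
    my ($self) = @_;
    # run the builder on the first read only, then keep the result
    $self->{value} = delete( $self->{builder} )->() if $self->{builder};
    return $self->{value};
}

sub STORE {
    my ($self, $value) = @_;
    delete $self->{builder};    # an explicit assignment cancels the builder
    $self->{value} = $value;
}

package main;

my $calls = 0;
tie my $var, 'Lazy::Scalar', sub { $calls++; return 6 * 7 };

print "builder calls before first read: $calls\n";
print "value: $var\n";
print "builder calls after first read: $calls\n";
```

The builder is passed as an ordinary tie argument here, which is exactly the boilerplate that a code block slot on the declaration would remove.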

Attributes

Perl attributes are available to both variables and subroutines, and could be made available to Moose attributes via Devel::Declare:

# variables
my $var : ro = 1;
has $var : ro lazy build;

# subroutines
sub doit ($foo, %bar) : build_args check_args returns(Str) { ... }

# attributes
column name : data_type(VARCHAR) required;

Perhaps the most controversial point in these posts will be whether attributes should be used over simple lists as the primary syntax for metadata declaration. The attributes syntax tends to be much more concise with common shorter declarations, especially when combined with some of the other patterns described above:

has Str $name : ro required = 'John Smith'; 

vs.

has Str $name ( is => 'ro', required => 1 ) = 'John Smith';

When we add some complexity and choose not to use extra syntactic sugar, the scales tilt in favour of the list syntax, but I still prefer the attributes syntax as it adheres to the Perl motto of making easy things easy and hard things possible:

has $names : ro required
    isa( 'HashRef' )
    default( sub { { first => 'John', family => 'Smith' } } )
    traits( qw(Hash) )
    handles( get_names => 'elements', add_name => 'set' );

vs:

has $names (
    is       => 'ro',
    required => 1,
    isa      => 'HashRef',
    default  => sub { { first => 'John', family => 'Smith' } },
    traits   => [qw(Hash)],
    handles  => { get_names => 'elements', add_name => 'set' },
);

We can prettify the first declaration a little bit if we allow spaces between attribute names and the opening bracket:

has $names : ro required
    isa     ( 'HashRef' )
    default ( sub { { first => 'John', family => 'Smith' } } )
    traits  ( qw(Hash) )
    handles ( get_names => 'elements', add_name => 'set' );

Another point to keep in mind is that in order to provide a consistent syntax for metadata declaration, we will need to implement the list syntax for subroutines too, which may not be an easy task, and will conflict with signatures:

sub doit ($foo, %bar) ( build_args => 1, check_args => 1 ) { ... }

Metadata at a distance

The last way to set metadata is by code that is located in a different place from where the entity is declared. Many popular modules make use of this pattern:

# variables
readonly($var);

# subroutines
set_prototype('doit', '*$%');
memoize('doit');
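For subroutines, this pattern can be tried out with core modules alone. The sketch below assumes Scalar::Util's set_prototype (which takes a code reference rather than a name) and Memoize's memoize; the square sub and the $calls counter exist only for the demonstration.

```perl
use strict;
use warnings;
use Scalar::Util qw(set_prototype);
use Memoize qw(memoize);

my $calls = 0;    # counts how often the real body runs
sub square { $calls++; return $_[0] ** 2 }

# both calls below act on a sub that was declared somewhere else
set_prototype( \&square, '$' );
my $proto = prototype( \&square );

memoize('square');
my @results = map { scalar square($_) } 4, 4, 4;

print "prototype: $proto\n";
print "results: @results, body ran $calls time(s)\n";
```

Neither call is visible at the point where square is declared, which is what sets this pattern apart from all the previous ones.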

In many cases, there exists a more concise syntax to achieve the same goal at the time of declaration:

const my $var = ...;
sub doit : prototype('*$%') { ... }
sub doit : memoize { ... }

Sometimes, however, there isn't. One specific case is subroutines with multiple bodies. Consider a Catalyst action that implements a form via Catalyst::Controller::HTML::FormFu:

sub edit :Local :Args(1) :Form { ... }
sub edit_FORM_ERROR { ... }
sub edit_SUBMITTED_AND_VALID { ... }

The underlying metadata syntax is:

subroutine(
    name => 'edit',
    body => \&edit,
    on_form_error => \&edit_FORM_ERROR,
    on_form_submitted_and_valid => \&edit_SUBMITTED_AND_VALID,
    ...
);

The above could be more elegantly expressed as:

action edit : ... { ... }
form_error edit { ... }
form_submitted_and_valid edit { ... }

Or even:

action edit : ... { ... }
form edit : error  { ... }
form edit : submitted_and_valid { ... }

In a similar vein, it is open to debate whether method modifiers implement the 'keyword' pattern (i.e. they declare a completely new code entity) or the 'metadata at a distance' pattern (i.e. they modify an existing code entity).

TODO

This is all nice and pretty, but unfortunately some of the changes required to make the above examples work may involve meddling with the perl core. Here is what I believe can and cannot be implemented via external modules today:

Generic mechanism to allow application of metadata to entities

In the first place there needs to be a mechanism where code can hook into the entity creation event in order to do its magic. For example, let's say I want to have a testing module that can automatically create custom fixtures based on the sub signature, and make them available as lexical variables.

sub test_cart (MyApp::Fixture::ShoppingCart::TwoApples $cart) { 
    ... # use the $cart here
}

There is currently no way for my module to hook into the subroutine creation event (the event captured by Sub::Mutate::when_sub_bodied) and request arbitrary code to be executed at that point - or at least no way that is obvious to a non-XS-savvy developer. The signatures pragma does this, but in a way that does not seem reusable by other modules.

Another problem is that it is currently difficult to introspect all the different types of metadata applied during the declaration of a given entity. In a fancy variable declaration:

my Str $var : foo bar;

the foo attribute has no way to tell that the bar attribute has been specified too - and vice versa, and neither knows that there is a type restriction as well.

I think Moose has solved this problem quite elegantly, and has a solid interface for extensions to inspect and modify attribute properties, set defaults, and add new properties. This approach could be extended to handle different kinds of entities:

# this is how modules like Attribute::Handlers and Attribute::Lexical
# currently work
sub MyAttribute ($package, $symbol, $referent, $attr, $data) { ... }

# instead we could have a function 'with meta', i.e. it gets passed 
# a method object that contains all the information above, plus 
# information about the other attributes, the prototype, etc.
sub MyAttribute ($metamethod) {
    if ( $metamethod->has_attribute('OtherAttribute') ) {
        do_this();
    } else {
        do_that();
    }
}

All of the proposals below depend on such a protocol being available.
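A rough approximation of the 'with meta' idea can be built on the existing MODIFY_CODE_ATTRIBUTES hook. This is a sketch only: Sketch::MetaSub, the %META registry, and the Cached and Returns attribute names are all hypothetical, and the meta object knows nothing about prototypes or types.

```perl
use strict;
use warnings;

package Sketch::MetaSub;    # hypothetical meta object for one subroutine

sub new {
    my ($class, %args) = @_;
    return bless { attributes => {}, %args }, $class;
}

sub add_attribute {
    my ($self, $spec) = @_;
    # split "Name(data)" into the name and the raw, unparsed data
    my ($name, $data) = $spec =~ /\A(\w+)(?:\((.*)\))?\z/s;
    $self->{attributes}{$name} = $data;
}

sub has_attribute  { exists $_[0]{attributes}{ $_[1] } }
sub attribute_data { $_[0]{attributes}{ $_[1] } }

package main;

our %META;    # hypothetical registry: stringified code ref => meta object

sub MODIFY_CODE_ATTRIBUTES {
    my ($package, $code, @specs) = @_;
    my $meta = $META{$code}
        ||= Sketch::MetaSub->new( package => $package, body => $code );
    $meta->add_attribute($_) for @specs;
    return;    # empty list: every attribute is accepted
}

sub doit : Cached Returns(Str) { return 'done' }

my $meta = $META{ \&doit };
if ( $meta->has_attribute('Cached') && $meta->has_attribute('Returns') ) {
    print "doit is cached and returns ", $meta->attribute_data('Returns'), "\n";
}
```

Because the meta object collects every attribute of the declaration, code acting on one attribute can ask about the others, which individual per-attribute handlers cannot do.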

Keywords

Perl already has a very good system for introducing new keywords, both for variable and subroutine declaration, with Devel::Declare. Changing the way built-in functions work is more difficult, but apparently not impossible (see the signatures pragma for example).

Keyword modifiers

Any subroutine can act as a keyword modifier for variables. Ditto for Moose attributes (e.g. if has in non-void context returns the meta object rather than installing it), although I could not find any modules actually doing that.

Keyword modifiers for plain perl subs are not currently possible, although I have a suspicion that this should be doable with Devel::Declare.

Type

Lexical::Types makes the type slot usable to an extent. While some jumping through hoops will be required, it is not impossible to come up with a solution that allows using MooseX::Types-style type names on variable declaration. Anything more complicated will unfortunately need changes to the Perl parser.

Another caveat is that types are currently only available to scalar variables. The docs say:

Currently, only scalar variables can be declared with a specific class 
qualifier in a "my", "our" or "state" declaration. The semantics may be 
extended for other types of variables in future.

So core changes will also be required in order to do something like:

my Str @array_of_strings;

Assignment

Capturing assignment is relatively straightforward with tie, and I believe also possible with Variable::Magic. Assignment support for Moose-style attributes could be implemented with Devel::Declare.
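A minimal sketch of the tie approach, assuming the core Tie::StdScalar base class. Sketch::Watched and its @LOG array are hypothetical names used only to show that every assignment passes through STORE:

```perl
use strict;
use warnings;

package Sketch::Watched;    # hypothetical tie class for this sketch
use Tie::Scalar;            # provides Tie::StdScalar
use parent -norequire, 'Tie::StdScalar';

our @LOG;                   # every value assigned to a watched variable

sub STORE {
    my ($self, $value) = @_;
    push @LOG, $value;      # the hook: metadata code could act here
    return $self->SUPER::STORE($value);
}

package main;

tie my $var, 'Sketch::Watched';
$var = 1;
$var = 'two';

print "assignments seen: @Sketch::Watched::LOG\n";
print "current value: $var\n";
```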

Signature

The prototype slot could be set to anything with warnings::illegalproto disabled, and then introspected with the prototype function. There are two caveats, however: first, the prototype cannot span more than one line; and second, prototype() will return the prototype as a string with all whitespace stripped. Thus it is currently not possible to have a signature implementation that can use more than one line for longer declarations, or for which whitespace is significant.
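The whitespace caveat is easy to observe. The sketch below assumes that the signatures feature is not enabled in the enclosing scope, so that perl stores the parenthesised text as a prototype; doit and its parameter names are placeholders.

```perl
use strict;
use warnings;
no warnings 'illegalproto';    # allow arbitrary text in the prototype slot

# without the signatures feature this text is kept as the prototype
sub doit ( $foo, %bar ) { return }

my $proto = prototype( \&doit );
print "stored prototype: $proto\n";
```

The string that comes back no longer contains the spaces from the declaration, so a signature parser built on this slot cannot rely on them.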

Attributes

Attributes are available to both subroutines and variables, and can be made available to Moose-style attributes via Devel::Declare.

There are two limitations in the current attribute implementation that I find particularly frustrating:

First, the attribute data (whatever is in the brackets after the attribute name) cannot access the current context. This means that you cannot refer to a variable or subroutine defined in the scope where the attribute is used:

my @traits = qw(Hash MergeHashRef);
my %hash : traits(\@traits) ... # Nope!
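What the handler actually receives can be seen with a plain MODIFY_HASH_ATTRIBUTES hook. In this sketch the @SEEN array is a hypothetical helper, and the attribute is spelled Traits (with a capital) only to stay clear of the lowercase warning:

```perl
use strict;
use warnings;

our @SEEN;    # hypothetical log of what the hook was given

# hook for attributes on hash variables declared in this package
sub MODIFY_HASH_ATTRIBUTES {
    my ($package, $hashref, @attrs) = @_;
    push @SEEN, @attrs;
    return;    # empty list: accept everything
}

my @traits = qw(Hash MergeHashRef);
my %hash : Traits(\@traits);

print "handler received: $SEEN[0]\n";
```

The handler is given the source text of the attribute as a string, not a reference to @traits, so it has nothing to dereference.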

Ideally the code that manages attribute application should be able to capture the environment at the point where the attribute is used via a mechanism similar to Parse::Perl::current_environment(), and then make it available to the attribute handler:

sub MyAttributeHandler ($data, $env, ... ) {
    # parse_perl compiles $data in $env and returns a code ref, so call it
    my @data = Parse::Perl::parse_perl($env, $data)->();
    ...
}

Second, the Perl parser currently does not allow spaces between the attribute name and the opening bracket after it. Allowing them would make for much nicer looking attribute specifications:

has $ssn : rw
  clearer   (clear_ssn)
  predicate (has_ssn);

Conclusion

Despite their reputation, attributes seem a reasonable way to implement a consistent mechanism for metadata application in Perl. Combined with other syntax features, they meet all three criteria - extensible, consistent and concise.
