My DarkPAN Setup

chromatic wrote about setting up a private CPAN to store your code. I am a big fan of this approach, and I think that managing code as distributions, rather than as a pile of files in a version control repository (the predominant practice in most shops), is the way to go. Here is how we have set things up at $work:

  1. A CPAN::Mini mirror is set up on one of the company's machines. An Apache web server makes it accessible at cpan.company.com via password-protected HTTP (we use mod_auth_kerb to authenticate against our Kerberos-based SSO solution). All clients can now set cpan.company.com as their main mirror with o conf urllist, and set their username and password with o conf username and o conf password respectively.
  2. We make the local mirror injectable with CPAN::Mini::Inject, and additionally configure CPAN::Mini::Inject::Server so that we can inject from remote machines. CPAN::Mini::Inject::Server is also served by a password-protected Apache, at e.g. cpaninject.company.com.
  3. Developers who need to push distributions into the mirror have CPAN::Mini::Inject::Remote and Dist::Zilla::Plugin::Inject installed. All distributions are managed by Dist::Zilla, so they only need to do dzil release in order to get their stuff into the mirror and make it available to everyone else. We use a company-wide Dist::Zilla plugin bundle which contains all server settings for Dist::Zilla::Plugin::Inject, so it is zero configuration for them - they just need to install it and are ready to upload.
  4. I am still working on choosing an appropriate solution to serve the docs for our Perl code (perldoc.company.com). There is a lot of choice out there, and I have yet to see what will work best for us...
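
As a rough sketch of the client side of step 1, this is what a cpan shell session could look like. The username jdoe and the password secret are placeholders, and the http:// scheme is an assumption; use whatever your mirror is actually served over.

```
cpan> o conf urllist unshift http://cpan.company.com/
cpan> o conf username jdoe
cpan> o conf password secret
cpan> o conf commit
```

The unshift puts the company mirror first in urllist, and o conf commit writes the settings to the user's CPAN configuration so that they persist across sessions.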

What do we gain from working this way?

  • Code is reusable across applications. We don't have one monolithic product; rather, we have a number of smaller programs which reuse a lot of functionality. Structuring code in distributions makes this a breeze.
  • Management of dependencies when deploying is extremely easy.
  • And last but not least, we get to work with the awesome Dist::Zilla :-)

This is, however, still not a widespread approach, and we have encountered many issues while implementing it:

  • There are a lot of ways to solve this problem. Other than the tools outlined above, there are CPAN::Site, CPANPLUS::Shell::Default::Plugins::CustomSource, MyCPAN::App::DPAN, and now CPAN::Dark, and probably others I don't know about. And then you could go with par, ppm, deb, rpm, git submodules, and more. Each of these approaches has its pros and cons, and one has to go through a lot of trial and error in order to figure out what works for them. It would be good if there were one solution that does most of what people want and is easy to set up and use (like what Moose does for OO and Dist::Zilla does for distribution building). And then there are even more solutions if you want to serve POD locally or put some kind of web interface on your CPAN. What I would love is to be able to just grab a distro with the code for the current metacpan.org interface and have it seamlessly integrate with my mirror, so that I can search it, browse it, read docs, etc.
  • I could not find much information on what author names I could use when deploying to a private CPAN, or what namespaces are safe to use for local distributions. Whatever exists as standards is spread out in different places, hard to find, and often discouraging. For example, what if I want to use other author names instead of the reserved LOCAL?
  • Most code with an architecture that uses something like Module::Pluggable to load stuff (think DBIx::Class resultsources and Catalyst components) is very painful to deploy as an installable distribution. If a module is removed from a distribution (e.g. you have deleted a table and its associated result class), an upgrade via CPAN does not remove the old file from disk, so the old module will still be picked up. This is a long-standing issue for CPAN and I am not sure how it can be resolved. Currently we deal with this by using load_classes with explicit resultsource names in our DBIx::Class schemas, and we never install Catalyst distributions (Catalyst is not installation-friendly to begin with, which is kind of OK since Catalyst apps are never used as dependencies themselves).
  • Using a passthrough mirror (the approach that CPAN::Dark relies on) works with cpanminus, but I cannot see a way to configure it via the standard CPAN or CPANPLUS shell. Similarly, CPANPLUS's custom sources work only with the cpanp shell. It would be great if there were a standard for querying more than one location for a module, and all clients supported it. We do not really need to keep a full CPAN mirror on our servers; we only need a place to keep our private code.
  • And last but not least, there is little support for using password-protected mirrors. The standard shell supports only HTTP authentication, and stores the username and password in plain text. If private CPANs are to gain wider acceptance in corporate environments, support for more authentication protocols and more secure ways to store credentials would probably be essential.
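
To make the stale-module problem concrete, here is a minimal, core-only Perl sketch. It is not DBIx::Class or Module::Pluggable code: a temporary directory stands in for the installed library path, and the MyApp::Schema::Result namespace, the module names, and the helpers write_module, discover and explicit are all made up for illustration.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use File::Find ();
use File::Path qw(make_path);
use File::Spec;
use File::Temp qw(tempdir);

# Assumption: a temporary directory stands in for site_perl.
my $lib = tempdir(CLEANUP => 1);
my $ns  = 'MyApp::Schema::Result';    # hypothetical namespace

# Write a tiny module file for one class of the namespace under $lib.
sub write_module {
    my ($name) = @_;
    my $dir = File::Spec->catdir($lib, split /::/, $ns);
    make_path($dir);
    my $file = File::Spec->catfile($dir, "$name.pm");
    open my $fh, '>', $file or die "Cannot write $file: $!";
    print {$fh} "package ${ns}::$name;\n1;\n";
    close $fh or die "Cannot close $file: $!";
    return;
}

# Release 1 shipped Artist and Legacy; release 2 ships only Artist.
# Installing release 2 overwrites Artist.pm but leaves Legacy.pm on disk,
# so after the upgrade both files are still present.
write_module($_) for qw(Artist Legacy);

# Discovery by scanning the directory, roughly what plugin finders do.
sub discover {
    my $dir = File::Spec->catdir($lib, split /::/, $ns);
    my @found;
    File::Find::find(
        sub { push @found, $1 if /\A(\w+)\.pm\z/ },
        $dir,
    );
    return sort @found;
}

# Explicit list maintained inside the distribution itself. In a real
# DBIx::Class schema this corresponds to something like
#   __PACKAGE__->load_classes(qw/Artist/);
sub explicit { return qw(Artist) }

print "scanned:  @{[ discover() ]}\n";
print "explicit: @{[ explicit() ]}\n";
```

The scan still reports the leftover Legacy module, while the explicit list names only what the current release ships. The cost is that the list has to be updated by hand whenever a class is added or removed.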

In an ideal world, distributions would be the default way to manage Perl code, even for the newbie programmer. It should be simple and expected to set up your own passthrough mirror on shared hosting for the code you write, and the standard way to deploy stuff should be through cpan and local::lib, even for the smallest of applications.

Posted in Better CPAN