A few separate conversations in the span of the past hour left me with this question: when does it make sense for a piece of software to be packaged? What makes someone say something like "oh, we should get that into Fedora" - when does the rationale behind it hold? Here are my half-formed thoughts, in the spirit of RERO, so that those who know more than I do about the topic can nucleate around the idea and rip it apart. (Go! Go! Go!)

At the most basic level, as I (a non-packager) understand it, packaging software makes it easier for someone else to deploy it. Repeatedly, identically, easily, in an automated manner. That's it. That's the big deal. So the question of "do I want this software packaged?" becomes "would it be in my interest to make it easy for other people to deploy this software?"

This means that answering the question "should it be packaged?" isn't actually about looking at the software, per se. Instead, you look at how the situation of a group of people would change if the deployability of the software changed.

For end-user desktop applications, which is what most computer users spend their days in front of, the answer is almost always yes. In fact, I can't think of any exceptions. Those applications are made usable via being installed on an individual's computer. The easier it is to deploy, the easier it is to get someone using it. Users want their software to be easy to install. Developers of desktop applications want users to install and use their software. Of course Firefox and Pidgin and bash should be packaged.

The same goes for development tools and languages. They, too, are deployed via individual instances; the more instances there are, the easier it is for developers to use them, the more developers will be using them, and the more powerful the upstream community becomes. Python and Eclipse will always, always be packaged.

These are the easy cases. For desktop applications and individually deployed tools, both end-users and developers (and the businesses they work within) benefit from easier deployment, and therefore packaging makes sense.

It starts getting interesting when you look at multi-user hosted services. Some still fall into the same case as above, where end-users and developers and their ecosystems benefit from easier deployment. For instance, Moodle developers have a strong incentive to get their work packaged. For them, more working individual Moodle deployments are a good thing - they make the community more powerful. The more Moodle deployments schools want to spin up, the more consulting will be purchased from Moodle companies. And for schools and their IT admins and teachers, the "end users" of this software, having it be easy to spin up a working instance is a huge bonus; they frequently want to try out the software, experiment with different modules in a sandbox before deploying it live for a class, and so forth. When you're trying to decentralize functionality, packaging makes sense.

However, sometimes a project's influence does not depend on the number of independent instances of its software. Take OpenHatch, a website for aggregating getting-started tasks and mentors for FOSS novices. The influence of OpenHatch therefore depends on how much information they can centralize, rather than how much functionality they can decentralize. OpenHatch is also an open source software project - they custom-developed the platform they run on - but the point of open-sourcing OpenHatch is to make it possible for volunteers to contribute to the central site. Packaging OpenHatch and making it easy to spin up other deployments would not make much sense. When you're trying to centralize information, packaging doesn't make sense...

...to you. But it may to other people who want to decentralize your functionality later on. Take Remora, the software developed by Mozilla in order to aggregate Firefox add-ons. This was also an example of "power via centralization of information" - Remora developers wanted to get more people to addons.mozilla.org, and something like mozilla-addons.somewhere-else.org would have hurt their project! The turning point came when another group wanted to centralize a different set of information the same way; Sugar Labs needed a way to host its activities, and therefore broke out Remora for generalization. But it's important to note that it was the software's second user, not its first developer, that had the impetus for generalization and ease of deployment.

I could see OpenHatch running into a similar situation someday - surely something else will come up where somebody wants to aggregate tasks and match mentors and doers! - but as long as nothing specific comes up, the OpenHatch project is likely to remain unpackaged, because it simply doesn't need to be right now.

Now here's the really interesting part: what if you want to centralize functionality? Most "software as a service" (SaaS) providers fall into this category. Look at Limesurvey, an open source survey platform. It's got Limeservice, a hosted version with a basic free plan that heavy users can pay to upgrade for more responses. I've used Limeservice in the past because it's too darn hard for me to set up Limesurvey on my own. Packaging Limesurvey would make it easier to deploy, which might decrease usage of the central service - so even if I as a user would love to be able to yum install limesurvey, if I put myself in Limeservice's shoes, I can see why they might not be jumping to get it packaged! (Limeservice's prices are also low enough that it's not that painful for me as a user - I'm not trying to badmouth Limeservice here, it's a pretty nifty service and a great piece of software, and I'm glad to use it for my research projects.)

Still, it's possible for someone to package Limesurvey just because they want to. With open source, you can always decentralize functionality, and that seems to be the trend. Decentralized functionality, centralized information. It's mostly a question of:

  • Who would benefit from having the functionality decentralized?
  • How much work would it take to do?
  • Are the folks who'd benefit willing to put up the resources to do that work? (This is harder for some than others, and depends on the packager just as much as the package; for instance, for me, a non-packager, even an easy package is a tremendous amount of work to handle - but I also know other people who eat bundled libraries for breakfast and would see the same thing as a trivial task.)

I think these ideas are rough and could certainly use more development, but they're food for thought that I wanted to put out there.