On the continuing decline of the GPL

Our most popular CAOS blog post of the year, by some margin, was this one, from early June, looking at the trend towards permissive licensing, and the decline in the usage of the GNU GPL family of licenses.

Prompted by this post by Bruce Byfield, I thought it might be interesting to bring that post up to date with a look at the latest figures.

NB: I am relying on the current set of figures published by Black Duck Software for this post, combined with our previous posts on the topic. I am aware that some people are distrustful of Black Duck’s figures given the lack of transparency on the methodology for collecting them. Since I previously went to a lot of effort to analyze data collected and published by FLOSSmole to find that it confirmed the trend suggested by Black Duck’s figures, I am confident that the trends are an accurate reflection of the situation.

The figures indicate that not only has the usage of the GNU GPL family of licenses (GPL2+3, LGPL2+3, AGPL) continued to decline since June, but that the decline has accelerated. The GPL family now accounts for about 57% of all open source software, compared to 61% in June.

As you can see from the chart below, if the current rate of decline continues, we project that the GPL family of licenses will account for only 50% of all open source software by September 2012.

That is still a significant proportion of course, but would be down from 70% in June 2008. Our projection also suggests that permissive licenses (specifically in this case, MIT/Apache/BSD/Ms-PL) will account for close to 30% of all open source software by September 2012, up from 15% in June 2009 (we don’t have a figure for June 2008 unfortunately).

Of course, there is no guarantee that the current rate of decline will continue – as the chart indicates the rate of decline slowed between June 2009 and June 2011, and it may well do so again. Or it could accelerate further.
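For readers who want to sanity-check a projection of this kind, here is a minimal sketch of a straight-line extrapolation. It uses only the rounded percentages quoted in this post (61% in June 2011, 57% in December 2011), and the function name is purely illustrative. It is not the method behind the chart, whose underlying data and trend-line fit are not reproduced here, so the result should be read as a rough cross-check rather than a replication.

```python
def months_until_share(share_then, share_now, months_between, target_share):
    """Months from 'now' until a straight-line trend reaches target_share.

    Assumes the change between the two observations continues at a
    constant rate, which is exactly the caveat raised in the text.
    """
    rate_per_month = (share_now - share_then) / months_between
    if rate_per_month == 0:
        raise ValueError("no change between observations; trend never reaches target")
    return (target_share - share_now) / rate_per_month


# Rounded figures quoted in this post (percent of all open source projects).
months = months_until_share(share_then=61.0, share_now=57.0,
                            months_between=6, target_share=50.0)
print(f"{months:.1f} months after December 2011")
```

On these rounded inputs the trend reaches 50% a little under a year after December 2011, which is in the same range as the September 2012 projection, though not identical to it. A small change in either input moves the date by months, which is one more reason to treat any single projected date with caution.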

Interestingly, however, while the more rapid rate of decline prior to June 2009 was clearly driven by the declining use of the GPLv2 in particular, Black Duck’s data suggests that the usage of the GPL family declined at a faster rate between June 2011 and December 2011 (6.7%) than the usage of the GPLv2 specifically (6.2%).

UPDATE – It has been rightly noted that this decline relates to the proportion of all open source software, while the number of projects using the GPL family has increased in real terms. Using Black Duck’s figures we can calculate that in fact the number of projects using the GPL family of licenses grew 15% between June 2009 and December 2011, from 105,822 to 121,928. However, in the same time period the total number of open source projects grew 31% in real terms, while the number of projects using permissive licenses grew 117%. – UPDATE
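The arithmetic behind that update can be checked in a few lines. This is a minimal sketch using only the project counts and the 31% total growth figure quoted above; the helper name is illustrative. It also shows why absolute growth and a falling share are not in conflict: a share shrinks whenever its numerator grows more slowly than the total.

```python
def percent_growth(start, end):
    """Percentage change from start to end."""
    return (end - start) / start * 100


# GPL family project counts quoted above (June 2009 -> December 2011).
gpl_growth = percent_growth(105_822, 121_928)
print(f"GPL family projects grew {gpl_growth:.1f}%")

# Total open source projects grew 31% over the same period (figure from the text).
total_growth = 31.0

# Relative change in the GPL family's share of all projects:
# the new share equals the old share times (1 + gpl growth) / (1 + total growth).
share_change = ((1 + gpl_growth / 100) / (1 + total_growth / 100) - 1) * 100
print(f"GPL family share changed by {share_change:.1f}% in relative terms")
```

So a roughly 15% rise in the number of GPL family projects, set against a 31% rise in all projects, works out to a relative fall in share of about 12% over the period.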

As indicated in June, we believe there are some wider trends that need to be discussed in relation to license usage, particularly with regard to vendor engagement with open source projects and a decline in the number of vendors engaging with strong copyleft licensed software.

The analysis indicated that the previous dominance of strong copyleft licenses was achieved and maintained to a significant degree due to vendor-led open source projects, and that the ongoing shift away from projects controlled by a single vendor toward community projects was in part driving a shift towards more permissive non-copyleft licenses.

We will update this analysis over the next few days with a look at the latest trends regarding the engagement of vendors with open source projects, and venture funding for open source-related vendors, providing some additional context for the trends related to licensing.

FLOSSmole data confirms declining GPL usage

Last week we published a post looking at some statistics suggesting a decline in the usage of the GNU GPL.

The post sparked some interesting debate, not least about the validity of Black Duck Software’s numbers, which we had used to compare usage of the various FLOSS licenses over recent years.

While we have no specific reason to doubt Black Duck’s figures, Bradley M Kuhn, in particular, suggested that Black Duck’s data should be “ignored by serious researchers” since the company doesn’t disclose enough detail about its data collection methods.

He added that “AFAICT, FLOSSmole is the only project attempting to generate this kind of data and analysis thereof in a scientifically verifiable way”.

You can probably guess where this is going…

Started in 2004, FLOSSmole* collects data on open source software projects. FLOSSmole’s data is freely available via Google Code.

In order to test Black Duck’s data we downloaded FLOSSmole data from four sources for which both current (May 2011) and historical (October 2008) data was available: Rubyforge, Freshmeat, ObjectWeb and the Free Software Foundation.

We then sorted each data set and generated subtotals for each license type, checking the data manually to make sure we had combined all the relevant data (data tagged GPL2, GPLv2 and GNU GPLv2 for example).

Given the wide variety of ways in which the various GNU General Public Licenses have been tagged across the four data sources (a huge number of Freshmeat projects are tagged simply “General Public License” with no version number) it also made sense to group the licenses together into the GPL family (including LGPL and AGPL).
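The grouping step described above was done by sorting and checking the data manually, but the idea can be sketched in code. The following is an illustrative sketch only: the matching rule, the function name and the sample tags are assumptions made for the example, not the FLOSSmole schema or the procedure we actually followed.

```python
from collections import Counter

# Assumed matching rule: any tag containing one of these markers is counted
# as GPL family. "gpl" also matches LGPL and AGPL, and "general public
# license" also matches the Lesser and Affero variants.
GPL_FAMILY_MARKERS = ("gpl", "general public license")


def license_family(tag):
    """Map a free-form license tag to a coarse family label."""
    normalized = tag.strip().lower()
    if not normalized:
        return "unknown"  # project has no license details
    if any(marker in normalized for marker in GPL_FAMILY_MARKERS):
        return "GPL family"
    return "other"


# Hypothetical sample tags, mixing the spellings mentioned in the text.
sample_tags = ["GPL2", "GPLv2", "GNU GPLv2", "General Public License",
               "LGPL", "AGPL", "BSD", "MIT", ""]

counts = Counter(license_family(tag) for tag in sample_tags)
for family, count in counts.most_common():
    print(f"{family}: {count} ({count / len(sample_tags):.1%})")
```

A substring rule like this is deliberately crude, and it would still need the same manual check of the results that we applied, since real-world tags include dual licenses and misspellings that a simple match will misclassify.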

The results show that the GPL family of licenses accounted for 70.77% of all 53,914 projects in the sample in October 2008. In May 2011 that figure had declined to 59.31% of 54,800.

As a reminder, the figures from Black Duck showed the proportion of projects using the GPL family of licenses had declined from 70% in June 2008 to 61% today. So the FLOSSmole figures actually show a more rapid decline in GPL usage than Black Duck’s.

One important point to note is that a significant number of projects (5,775) in the 2011 Freshmeat data do not have license details. Removing these projects from the sample would result in the GPL family of licenses representing 66.3% of 49,025 projects in 2011.

Either way, the FLOSSmole results confirm a decline in GPL usage.
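The adjusted 2011 figure can be reproduced from the numbers quoted above. This is a minimal sketch: the GPL family project count is not stated in the post, so it is estimated here from the published percentage, and the variable names are illustrative.

```python
# Figures quoted in the text for the May 2011 sample.
total_2011 = 54_800          # all projects in the sample
gpl_share_2011 = 59.31       # GPL family share, percent of all projects
unlicensed_2011 = 5_775      # Freshmeat projects with no license details

# Estimated GPL family project count (derived from the percentage,
# so it is approximate rather than an exact count).
gpl_projects = total_2011 * gpl_share_2011 / 100

# Drop the projects without license details from the denominator.
licensed_total = total_2011 - unlicensed_2011
adjusted_share = gpl_projects / licensed_total * 100

print(f"Licensed projects: {licensed_total:,}")
print(f"Adjusted GPL family share: {adjusted_share:.1f}%")
```

Excluding the unlicensed projects narrows the drop from the October 2008 figure of 70.77%, but does not remove it.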

UPDATE: Just to be clear, the figures for ‘GPL family’ above include both LGPL and AGPL as well. FLOSSmole’s figures show both increased from 2008-2011, from 6.22% to 7.21% and 0.11% to 0.36% respectively.

2ND UPDATE: Of course, the % of total projects is only one way to measure adoption, and some people will argue it’s not a particularly good one. Certainly we’re not going to get carried away with the fact that the % of projects hosted by the Free Software Foundation using the GPL family has declined from 81.2% to 76.7%. Although it is kind of interesting.

*Howison, J., Conklin, M., & Crowston, K. (2006). FLOSSmole: A collaborative repository for FLOSS research data and analyses. International Journal of Information Technology and Web Engineering, 1(3), 17–26.

451 CAOS Links 2011.06.10

Yet more Apache OpenOffice fall-out. Bacula Systems raises $5m. And more.

# As the proposal to incubate OpenOffice.org at Apache went live, controversy about the proposal continued. The Free Software Foundation unsurprisingly voiced its support for the LGPL-licensed LibreOffice project, while Keith Curtis outlined his opposition to the plan.

# Bacula Systems raised $5m from KM Capital Partners and from the Swiss Canton of Vaud.

# Joe Brockmeier explained how Microsoft’s patent loss could be bad news for open source.

# Computacenter raised the prospect of legal action against open source support supplier Sirius for complaining to Parliament about its “Microsoft bias”.

# Jahia announced the commercial release of its Jahia 6.5 content management system.

# Couchbase announced the general availability of Membase Server 1.7.

# Talend announced Talend Cloud, its cloud-enabled integration platform.

# Stefano Maffulli considered the implications of the declining adoption of copyleft licenses.

# Ian Skerrett introduced some key findings from the 2011 Eclipse community survey.

And the best open source license is …

UPDATE: The final vote is in and a winner has been declared, with Matt Asay and his arguments for the GPL taking the prize. You can see the debate or follow links to the other judges’ votes and thoughts here.

This is my assessment as a judge of the recent open source license debate held by the FOSS Learning Centre. We’ll have to begin with some qualifications and definitions, starting with the fact that there is no ‘best’ open source software license. Still, a star-studded open source software panel provided a lively, informative debate on the merits of some top open source licenses. For that, I congratulate and thank the panelists, Mike Milinkovich from the Eclipse Foundation arguing for the Eclipse Public License, Matt Asay of Alfresco arguing in favor of the GPL and David Maxwell from Coverity arguing for BSD. All three put forth some of the most important attributes and shortcomings of the three open source licenses, as well as other, related open source licenses. However, using a complex, proprietary formula awarding points for goodness and minuses for badness, I was able to deem a winner: Mike Milinkovich and the EPL. Perhaps fitting that the license that can best be described as the middle of the spectrum should be the winner. Here’s why:

Matt Asay kicked off the discussion, which became more of a debate as it developed, with a consistent message about GPL’s dominance among open source software projects, which is 70% or more based on most accounts (and considering GPLv2 and GPLv3). He also referred to monetization and the fact that GPL serves as the basis for successful support and services models, such as Red Hat. However, Matt did not initially mention the strategic and defensive benefits of GPL, which is often chosen because it mitigates the threat of a fork that someone can make proprietary. I was also hoping for him to address how GPL can deliver benefits of open source without having to share as in the spirit of the license, based on whether and how the software is distributed. Nevertheless, Matt made his most compelling arguments around the fact that GPL is the primary open source model and the license that developers understand and trust most. He furthered his argument later by agreeing EPL may be better for lawyers, but GPL is better for developers. Matt reinforced these ideas with his reference to large companies using GPL software, such as Google or TiVo, that get it to vast numbers of users.

Mike Milinkovich spoke second with some background on EPL, its origin as a ‘legal document’ and how it links open source software to commercial products. He also hit on the fact that EPL covers patent rights, which is certainly important to vendors and developers. He later referred to the meaninglessness of Matt’s 70% GPL figure, based on the idea that software in a repository is something different from software in use (where other licenses do have greater representation). However, our research indicates that the most popular open source licenses among hosted code are consistent with the most popular open source licenses among code in use, with GPL, BSD and EPL all among the top. Mike also referred to commercialization and money, which is certainly important to commercial open source, but did not give equal mention to community until later. Still, Mike earned back a point when he referred to monetization of open source software among traditional vendors and organizations beyond VC-funded, open source startups, where we are seeing significant growth for open source software. While I would have liked to have heard an argument in favor of EPL based on compatibility, Mike also made a good case for EPL in government (another consistent theme of the discussion), where code would belong to the public with commercial opportunity on top.

David Maxwell signaled a more rebuttal-type response and gave it in his arguments for the BSD license, which he introduced as the oldest license given its roots in Unix and the ’80s. David scored a point for simplicity and straightforwardness when he read the actual license, something his peers would’ve had a hard time doing. David did somewhat jump the gun, though, on rebutting with his counterpoints about GPL’s strict copyleft requirements, which he called ‘enforcement-based.’ Still, David recovered with an argument for BSD based on its emulation, which he credited for other popular licenses such as the Apache License and Artistic License.

The debate portion was followed by some good discussion of business models, open core and proliferation with questions from the live and Web audiences. So why does my vote for the winner go to Mike and the EPL? While it was certainly close on my card and all three made compelling arguments, Mike and his portrayal of the EPL were the most realistic and pragmatic to today’s open source software in the enterprise. Communities, copyleft and the sharing that allows developers and projects to sustain effective, productive open source efforts must be balanced with commercial interests, endeavors and aspiration. Neither open source communities nor open source commercialization would be nearly as significant without one another, and Mike’s arguments and statements seemed most closely attuned to that.

Thanks again to the panelists, participants and FOSS Learning Centre for putting on the event. Please get involved in the discussion and watch the debate, comment here or elsewhere.

U.S. court confirms open source license legitimacy

There was a major open source legal development this week and surprisingly, it did not involve the string of BusyBox lawsuits, which included a settlement from mobile and telecom giant Verizon in March 2008. Instead, the latest open source victory involves a federal appeals court ruling that basically upholds the idea and enforcement of ‘copyleft.’

The ruling, which centered on the Artistic License, made it clear that regardless of whether software is open source or proprietary, its creators have a right to attach requirements and conditions that govern its use and distribution. So to those who have argued that the GPL or other open source licenses might be thrown out of court, there is now more concrete proof to the contrary. Open source software and its licensing are not some strange legal realm. Instead, GPL and other open source licenses base much of their meaning on existing, accepted laws, particularly U.S. copyright law and, with GPLv3, international copyright law.

During the BusyBox GPL enforcement cases over the last year, there have been calls for actual courtroom hearings rather than settlement. The thinking is this would go further toward solidifying the legality and legitimacy of the GPL and open source licensing in general. However, I still believe that the settlements, particularly from the likes of Verizon, do as much to bolster open source licensing. Now it appears open source supporters can have it both ways given the string of BusyBox settlements and the recent ruling that reinforces one of the basic tenets of open source, copyleft, in U.S. legal books.