On the rise and fall of the GNU GPL

Back in 2011 we caused something of a stir, to say the least, when we covered the trend towards permissive licensing at the expense of reciprocal copyleft licenses.

Since some people were dubious of Black Duck’s statistics, to put it mildly, we also validated our initial findings, at Bradley M Kuhn’s suggestion, using a selection of data from FLOSSmole, which confirmed the rate of decline in the proportion of projects using the GPL family of licenses between October 2008 and May 2011.

Returning to Black Duck’s figures, we later projected that, if the rate of decline continued, the GPL family of licenses (including the LGPL and AGPL) would account for only 50% of all open source software by September 2012.

As 2012 draws to a close it seems like a good time to revisit that projection and check the latest statistics.

I will preface this with an admission that yes, we know these figures only provide a very limited perspective on the open source projects in question. A more rounded study would look at other aspects such as how many lines of code a project has, how often it is downloaded, its popularity in terms of number of users or developers, how often the project is being updated, how many of the developers are employed by a single vendor, and what proportion of the codebase is contributed by developers other than the core committers. Since that would involve checking all these for more than 300,000 projects I’m going to pass on that.

Additionally, while all that is true, it does not mean that there is no value in examining the proportion of projects using a certain license. I am more interested in what the data does tell us than in what it doesn’t.

Data sources:
We analysed two distinct data sources for our previous analysis: Black Duck’s license data and a selection of data collected by FLOSSmole. Specifically we chose data from Rubyforge, Freecode (fka Freshmeat), ObjectWeb and the Free Software Foundation because those were the only sets for which historical (October 2008) data was available in mid-2011. For this update we have to use FLOSSmole’s data from September 2012 since the November 2012 dataset for the Free Software Foundation is incomplete. It is not possible to get a picture of GPLv2 traction using this FLOSSmole data since the majority of projects on Freecode are labelled “GPL” with no version number. In addition, for this update we have also looked at FLOSSmole data from Google Code, comparing datasets for November 2011 and November 2012, to get a sense of the trends on a newer project hosting site.

Black Duck’s data
According to Black Duck’s data the proportion of projects using the GNU GPL family of licenses declined from 70% in June 2008 to 53.24% today. The first thing to note therefore is that the rate of decline seen a year ago did not continue, and that the GNU GPL family of licenses continues to account for more than 50% of all open source software. The rate of the decline of the GNU GPLv2 has actually accelerated over the past year, however, and its usage is now almost the same as the combination of permissive licenses (I went with MIT/Apache/BSD/Ms-PL, you can argue about that last one if you like, but I’ve got to stick with it for consistency) at around 32%.

FLOSSmole’s data
Also in the interests of consistency I should clarify that we made a slight error in our previous calculations relating to the data from FLOSSmole. When we looked at the FLOSSmole data in June 2011 we reported a decline from 70.77% in October 2008 to 59.31% in May 2011. In calculating the data for this update I identified an error: the figure for 2011 should have been 62.8%. So less of a decline, but a decline nonetheless. The figures show that despite the total number of projects increasing from 54,000 in 2011 to 57,069 in September 2012, the proportion of projects using the GNU GPL family of licenses has remained steady at 62.8%. However, the proportion of projects using permissive licenses has grown, from 10.9% in 2008 to 13.4% in 2011 and 13.7% in September 2012.

Google Code data
The data from Google Code involves a much larger data set: 237,810 projects in 2011 and 300,465 in 2012. It also presents something of a problem since one of the choices on Google Code is dual-licensing using the Artistic License/GPL. Including these projects in the GNU GPL family count we see that the proportion of projects hosted on Google Code using the GNU GPL family of licenses declined from 54.7% in November 2011 to 52.7% in November 2012. Interestingly, though, the proportion of projects using permissive licenses also fell, from 38% in 2011 to 37.1% today. As a side note, the use of “other open source licenses” grew from 2.0% in 2011 to 4.3% in 2012.
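For readers who want to turn the Google Code percentages back into rough project counts, here is a minimal Python sketch. The totals and shares are the ones quoted above for the November 2011 and November 2012 datasets. The variable and function names are only illustrative, and because the published percentages are rounded, the implied counts are approximations rather than figures taken from FLOSSmole.

```python
# Figures quoted above for Google Code (FLOSSmole data).
# Shares are the GNU GPL family, including Artistic License/GPL dual-licensing.
snapshots = {
    2011: {"projects": 237_810, "gpl_share": 54.7},
    2012: {"projects": 300_465, "gpl_share": 52.7},
}


def implied_count(projects: int, share_percent: float) -> int:
    """Approximate number of projects implied by a rounded percentage share."""
    return round(projects * share_percent / 100)


def percent_growth(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100


gpl_2011 = implied_count(**{"projects": snapshots[2011]["projects"],
                            "share_percent": snapshots[2011]["gpl_share"]})
gpl_2012 = implied_count(**{"projects": snapshots[2012]["projects"],
                            "share_percent": snapshots[2012]["gpl_share"]})

total_growth = percent_growth(snapshots[2011]["projects"],
                              snapshots[2012]["projects"])
gpl_growth = percent_growth(gpl_2011, gpl_2012)

print(f"Implied GPL family projects: {gpl_2011:,} (2011), {gpl_2012:,} (2012)")
print(f"Total projects grew {total_growth:.1f}%; "
      f"implied GPL family count grew {gpl_growth:.1f}%")
```

On these numbers the implied GPL family count rises in absolute terms while its share falls, simply because the total number of hosted projects grew faster.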

What does it all mean? You can read as much or as little into the statistics as you wish. Since I am fed up with being accused of being a shill for providing analysis of the numbers I won’t bother to do so on this occasion – you are perfectly free to figure it out for yourselves.

Here’s everything in a single chart:

On the continuing decline of the GPL

Our most popular CAOS blog post of the year, by some margin, was this one, from early June, looking at the trend towards permissive licensing, and the decline in the usage of the GNU GPL family of licenses.

Prompted by this post by Bruce Byfield, I thought it might be interesting to bring that post up to date with a look at the latest figures.

NB: I am relying on the current set of figures published by Black Duck Software for this post, combined with our previous posts on the topic. I am aware that some people are distrustful of Black Duck’s figures given the lack of transparency on the methodology for collecting them. Since I previously went to a lot of effort to analyze data collected and published by FLOSSmole to find that it confirmed the trend suggested by Black Duck’s figures, I am confident that the trends are an accurate reflection of the situation.

The figures indicate that not only has the usage of the GNU GPL family of licenses (GPL2+3, LGPL2+3, AGPL) continued to decline since June, but that the decline has accelerated. The GPL family now accounts for about 57% of all open source software, compared to 61% in June.

As you can see from the chart below, if the current rate of decline continues, we project that the GPL family of licenses will account for only 50% of all open source software by September 2012.

That is still a significant proportion of course, but would be down from 70% in June 2008. Our projection also suggests that permissive licenses (specifically in this case, MIT/Apache/BSD/Ms-PL) will account for close to 30% of all open source software by September 2012, up from 15% in June 2009 (we don’t have a figure for June 2008 unfortunately).

Of course, there is no guarantee that the current rate of decline will continue – as the chart indicates the rate of decline slowed between June 2009 and June 2011, and it may well do so again. Or it could accelerate further.

Interestingly, however, while the more rapid rate of decline prior to June 2009 was clearly driven by the declining use of the GPLv2 in particular, Black Duck’s data suggests that the usage of the GPL family declined at a faster rate between June 2011 and December 2011 (6.7%) than the usage of the GPLv2 specifically (6.2%).

UPDATE – It has been rightly noted that this decline relates to the proportion of all open source software, while the number of projects using the GPL family has increased in real terms. Using Black Duck’s figures we can calculate that in fact the number of projects using the GPL family of licenses grew 15% between June 2009 and December 2011, from 105,822 to 121,928. However, in the same time period the total number of open source projects grew 31% in real terms, while the number of projects using permissive licenses grew 117%. – UPDATE
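The arithmetic in the update can be checked with a minimal Python sketch. The two project counts and the 31% total growth are the figures quoted above; the function and variable names are only illustrative. Because the 31% figure is rounded, the resulting share ratio is an approximation.

```python
def percent_growth(start: float, end: float) -> float:
    """Percentage change from start to end."""
    return (end - start) / start * 100


# GPL family project counts quoted above (June 2009 and December 2011).
gpl_growth = percent_growth(105_822, 121_928)

# Growth in the total number of open source projects over the same
# period, as quoted above (a rounded figure).
total_growth = 31.0

# A category's share of the whole changes by the ratio of its own
# growth factor to the growth factor of the whole.
share_ratio = (1 + gpl_growth / 100) / (1 + total_growth / 100)

print(f"GPL family project count grew {gpl_growth:.1f}%")
print(f"GPL family share is roughly {share_ratio:.2f}x its June 2009 level")
```

A ratio below 1 is how both statements hold at once: the project count grows in real terms, but more slowly than the total, so the proportion shrinks.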

As indicated in June, we believe there are some wider trends that need to be discussed in relation to license usage, particularly with regard to vendor engagement with open source projects and a decline in the number of vendors engaging with strong copyleft licensed software.

The analysis indicated that the previous dominance of strong copyleft licenses was achieved and maintained to a significant degree due to vendor-led open source projects, and that the ongoing shift away from projects controlled by a single vendor toward community projects was in part driving a shift towards more permissive non-copyleft licenses.

We will update this analysis over the next few days with a look at the latest trends regarding the engagement of vendors with open source projects, and venture funding for open source-related vendors, providing some additional context for the trends related to licensing.

Facebook opens up, but misses opening

Facebook took the first steps in what the company calls an ongoing experiment with open source software and developers. With its Facebook Open Platform, the social networking company is opening its API infrastructure, FQL and FBML parsers with implementations, common methods and tags, samples and dummy data, showing developers its development platform, tools and examples.

To the dismay of some open source figures, Facebook chose the Common Public Attribution License (CPAL) for most of the software (some is licensed under the similar Mozilla Public License). Facebook says it considered its license options, but felt MPL and CPAL were a good middle-of-the-spectrum approach. They are OSI-approved as open source, but are not as strongly copyleft or as community-oriented as the GNU General Public License or related licenses. Facebook also cited CPAL, which requires attribution, as a better match for network deployment and how software works today. This makes sense, but it also makes me wonder why not the Affero GPL?

Here’s the thinking. Google set off a fairly heated debate in the open source community with its anti-AGPL stance and discouragement of the license on its Google Code project hosting. Some of the resistance to Google’s AGPL aversion even equated to projects moving away from Google. I find it interesting that the predominant view on AGPL, and particularly that of Google, is a fairly negative one. In fact, there may be opportunity in AGPL. In the case of Facebook, I think it may have had an even bigger opportunity with AGPL given where Google is on the matter.

Sure, AGPL, which brings GPL code and modification sharing requirements to the SaaS or networked software model, may have been a bit riskier. Facebook says it considered AGPL, but encountered concern and confusion among developers over viral effects of the license. It’s interesting how viral can be both good, as in GPL development and community that is truly open, transparent and inclusive, or bad, as in extending requirements to code not intended or desired to be shared.

MPL and CPAL are logical choices, and Facebook should be commended for opening and providing the code under OSI-approved, open source licenses. It also deserves credit for a fairly clear and straightforward presentation of its licensing and terms. However, it borders on handicapping its open source strategy with a contribution agreement. The bigger opportunity, again, may have been AGPL. While Google defends its use of open source under what many consider a loophole and continues receiving criticism for its AGPL opposition, Facebook could have stood apart on the matter by embracing the AGPL. The license is similar to CPAL in that it is better suited to network deployment, but it is also in the GPL family, lending credibility and community benefits. This is not the last open source move from Facebook, and the opportunity may still be there, but the chance to really get in Google’s face does not come around often.