451 CAOS Theory
A blog for the enterprise open source community

Time for your cloud gut check

April 25, 2011 @ 2:15 pm ET

It may be hard for Amazon, any of its users, critics or competitors to find a silver lining in the recent cloud outage that took major sites offline for significant periods over the last week (OK, the critics and competitors are finding plenty), but I see a real upside for all: this has been our latest cloud computing gut check.

Just as we have seen in the case of open source software forks, dissents and competition, these challenges all represent a form of open source discipline that keeps code, communities and vendors ‘honest’ in the sense that they must respond to developer and user demands and must also steer a successful path both organizationally and commercially. So while there is no doubt pain and loss from the Amazon outage, it is also a reminder that what does not kill your cloud computing deployment will only make it stronger.

It’s true that the outage illustrates that users and providers are still figuring out cloud computing, and that there is still much to learn. It was interesting to see some companies actually send out press releases about how well they and their teams kept their cloud-based environments going through the outage. Indeed, as highlighted recently by our own Tier 1 analysts Jason Verge and Doug Toombs, a number of heavy Amazon cloud users, including Netflix and Zynga, were able to largely absorb the blow of the outage and keep their clouds aloft. We can probably assume this kind of thing could happen with a private cloud, and if we don’t, we should. Still, the point is that technology, and a team able to leverage it effectively, emerged as a critical differentiator during the Amazon cloud outage.

I believe the technology, tasks, procedures and preparedness that separate the winners from the losers in this outage center on ‘devops,’ a term we refer to often that involves the crossing of development, operations and other professionals in modern IT environments that both leverage and provide cloud computing services. Discussion of devops often centers on efficient use of cloud computing resources by both providers and users. Even when we consider ‘no-ops’ or, more accurately, ‘auto-ops,’ whereby systems and operations are abstracted for developers and users, there is a definite need for knowledge, skill, experience and process when confronting cloud crashes, particularly on the operations side. Devops also represents a more holistic view of software in its environment(s), which is critical to crisis management and recovery for both Amazon and its users. Certainly Amazon and its partners are working hard to restore all of their cloud services to full functionality, but it is very interesting and encouraging to see customers and users adding their know-how and talent to offset downed servers and avoid downtime. It makes it clear why a large organization such as Facebook would benefit from opening its own datacenters and practices.

From the perspective of Amazon and other providers, this week’s cloud stubbed toe also highlights how communication and reaction are perhaps as critical as the technical work of identifying what’s wrong and fixing it. Open source software also provides lessons here, indicating that vendors and providers are best served by transparency and openness. What the message boards and Twitterverse are telling us now is that users will accept some degree of downtime and difficulty, but they want straight information on how long and how severely they will be down. Just as vendors face a challenge in fairly yet effectively pricing and charging for cloud computing, it may be difficult to provide guidance on recovery from an outage, but the same rules of PR crisis management apply: don’t over-promise and don’t under-deliver.

So just as a fork, leadership crisis or large, proprietary competitor is supposed to wreck an open source project or vendor, the latest cloud crash will finally stifle all this cloud hype, bluster and momentum, right? Not quite. I would argue that just as a good fork, feud or megavendor foray into open source software is actually a strengthening, disciplinary measure, the latest cloud coughing will serve as a necessary gut check on cloud computing, thus helping us avoid a cloud bubble.

Categories: Software
