Microsoft Sidekick Debacle & the Cloud: Lessons Learned

When Microsoft's storage service for Sidekick users broke down, cloud computing questions sprang up -- both fair and unfair.

An article on CRN blamed the outage on the fact that Microsoft is working on another project and pulled engineers from Danger onto it. Frankly, this is, or should be, irrelevant from a user perspective. A cloud provider is running a service and has to be committed to operational excellence, despite any other distractions or competing priorities. Otherwise, it forces the customer to examine the internals of the cloud service. From the customer's perspective this is impractical, since everyone has limited time to devote to these things--a problem that will only get worse as the use of cloud services rapidly multiplies.

Moreover, most cloud providers don't want a horde of customers insisting on auditing the service--the support required for customer audits is not scalable. Finally, a customer shouldn't have to examine the inner workings of the cloud service. One doesn't question how the local electric utility schedules its generator maintenance; why should it be necessary for a cloud service? Customers should not have to do detailed evaluations of a cloud service: it's the job of the service provider to ensure that appropriate operational processes are in place.

Whatever the reason for the data loss, it calls into question the tenet that cloud computing enables a better level of discipline and expertise to be devoted to a service offering. If a customer can't depend on a cloud provider to perform at a higher level than the customer could on its own, why should it turn to the cloud?

Likely Outcomes of this Incident

Microsoft evaluates its practices throughout its cloud offerings: I guarantee that one outcome of this incident is that an edict came down from on high: "Make sure no other system is vulnerable to this problem!" There are undoubtedly a bunch of operations groups at Microsoft digging through backup practices to ensure redundant data is stored and that reliable backups are being performed. Also undoubted is the response of these groups: "how come we're being stuck with a ton of extra work because they screwed up?" Fellas, that's just the way organizations work.

Other cloud providers use this as a "teaching moment": While these cloud companies are wiping their hands across their foreheads in relief, thinking "there but for God's grace go I," senior management is regarding this incident as an inexpensive way to learn an important lesson, and is taking it as an opportunity to do a low-risk drill. Of course, if other Microsoft operations groups resent having to do work because of this incident, imagine how ops groups in other companies feel!

Microsoft's credibility suffers a short-term hit: Some people will generalize this situation to all of Microsoft's offerings, and be more cautious about using them. Let me be clear: I don't believe this situation represents Microsoft's typical operations practices. Hotmail is a far larger service, and I don't recall hearing anything like this happening with it. Nevertheless, Microsoft's overall cloud reputation will be tarnished for a while.

The best thing for Microsoft would be to treat this as a crisis management event and follow the established playbook: early apologies, full transparency, frequent updates. That still won't prevent people from re-evaluating their opinions, at least in the short term, but it will help those opinions return to their long-term levels more quickly.

Cloud computing in general suffers a short-term hit: Any time one market participant suffers a significant blow, the concern spreads to others. All cloud providers are going to be questioned about their competence regarding storage practices. It's inevitable and unavoidable. Rather than resisting it, they should take it as an opportunity to proclaim how seriously they take this topic and to describe at length the extensive, redundant, and highly structured processes they have in place to avoid issues like this one. This information won't stop people from querying the provider, but it shows responsiveness and provides the opportunity to pick up share.

Long-term, this is a minor bump in the road: Of course this is a significant incident, and of course a very difficult situation for those affected by it, but in the long run it will be looked back on as a minor one. Cloud computing is gaining momentum, driven by an appreciation of its strengths and cost efficiencies, and a problem, even one as serious as this, will not hinder its progress for long.

Bernard Golden is CEO of consulting firm HyperStratus, which specializes in virtualization, cloud computing and related issues. He is also the author of "Virtualization for Dummies," the best-selling book on virtualization to date.
