
Everything is broken

Ned Bellavance


Everything is Broken…

On more than one occasion I have uttered the phrase, “Why can’t this just work?”  Usually it comes after battling some piece of software that the marketing fluff described as “simple” and “easy-to-use” but that turns out to be incredibly complex and completely undocumented.  I want my technology to just work, but I also want it to be cutting-edge, infinitely configurable, and fully documented.  Those who are familiar with the Project Management Triangle may realize that having all three is impossible.  To which I say, what about in n dimensions?

Seriously though, I have noticed that with the speed of innovation, especially in the cloud, most things that are released are at least partly broken.  And that’s not just true of beta or preview features; generally available features and functionality are buggy and partly undocumented too.  Major releases of software have always had some bugs, which is why “It ain’t done till SP1” was a mantra among the Microsoft cognoscenti.

[Image: An Agile Peacock.  Caption: Yeah, I’ve got kids]

Weren’t Agile software development and continuous integration supposed to rescue us from this tide of buggy flotsam?  In the bad old days of waterfall software development, release cycles were measured in years.  The marketing team often had to advertise features that didn’t exist yet, or weren’t fully baked.  Then it was up to the product team to make sure the feature was in the final release, which resulted in poor regression and unit testing and last-minute code commits finding their way into the gold release.  Teams would work in isolation for years and then push out a half-baked product, which would ultimately fail to meet customer expectations, and we would all collectively hold our breath until the first service pack came out.

Agile and CI/CD were supposed to rescue us from this quagmire, and in some ways they did.  But instead of putting us on solid ground, they have left us firmly established on a swamp, where we need to keep building up just so we don’t sink into the muck.  Cloud companies are in an arms race of features, and Agile software development allows them to pump those features out at an alarming rate.  Every week I read about the latest features now available in Azure or AWS or Office 365 or some other cloud service.  These features are usually released in some kind of preview, which is good, because those previewing them help the development team find bugs.  But that also means that if you want to stay on the cutting edge, you are basically an unpaid member of the QA team for company X.


It seems that all this started back when Google Labs was pumping out new ideas every few months.  Each new idea would be released in Beta onto an unsuspecting world.  The Beta label gave that service special provenance: if a user complained that a feature was broken or unusable, then Google could shrug and say that it was still in beta.  Gmail, the most widely used email platform in the world, was in beta for five years, two of them as a beta open to the public.  The beta label didn’t stop people from using it for business-critical applications, but it did give Google some type of plausible deniability.  Gmail set the standard for leaving projects in beta for as long as you please, and we all got comfortable using a service that was in some ways inherently broken.

Welcome to the QA Team!

As someone who had a Gmail account from early on, I benefited from the seemingly limitless storage and also suffered from the occasionally buggy interface and half-baked features.  In essence I was an extension of the existing Gmail QA team, and I worked as a tester in exchange for early access to a beta product.  That model has proliferated across the industry, with beta programs for VMware, Microsoft, Citrix, and more.  You have the privilege of running buggy software and providing feedback and bug reports, all at no cost to the vendor.  There are some tangible benefits to this approach, especially with regard to how a product matures, which features end up in the final release, and the overall stability of the system at general availability.

A perfect example of this approach is Microsoft’s most recent release of Windows 10 and Server 2016.  In the past, Microsoft developed its operating systems and applications in a closed room, with very little input and testing from the larger community.  That allowed Microsoft to tightly control the flow of information and features being developed, so it could have a big bang release announcement.  But it also led to a mentality that Microsoft’s software wasn’t ready for production until the first service pack was released.  That approach began to shift a little with Windows Vista and Server 2008 (aka Longhorn), where there was an early release program you could sign up for to test the operating system and provide some feedback.  There was only one preview release, and when Vista finally came out… well, I think we all remember how successful that was.  Windows 7 and Windows 8 had an increased number of previews and additional feedback from customers, as did Server 2012.  Those releases were moderately more successful, and while things didn’t really hit their stride till Windows 8.1 and Server 2012 R2, their predecessors were definitely more usable on Day 1.  With Windows 10 and Server 2016, Microsoft started the early release program long before the official release.  In fact, the first technical preview of Server 2016 dropped in May of 2015, a full 17 months before the product was released for general availability.

Open Source all the Things!

What’s the next step?  Open Source Software, of course.  Although we are accustomed to being QA testers for organizations, the code is still locked up in their purview.  We can find the bugs and report them, but we lack the ability to examine the root cause and suggest a fix.  In the world of open source solutions, you could just file a bug report on GitHub.  However, if you are feeling suitably motivated, you can fork the project and fix the bug, then submit a pull request to incorporate your fix back into the master branch.  Lots of applications and operating systems already have an open source version, and they have been enjoying the benefits for years.  Other organizations are coming around to the idea; even former bulwarks of conservatism (read: Microsoft) have started open sourcing select projects, such as PowerShell.  Will they ever go full OSS?  It seems unlikely.  Microsoft has too many enterprise customers that would distrust an open source operating system, and Microsoft itself is probably not ready to let everyone see the Windows kernel laid bare in all its tangled glory.  Nevertheless, Microsoft sees the value in customer feedback and input, and the way it is embracing open source and technical previews helps demonstrate this.

Everything is Broken… and that’s OK!

In the end, everything is broken, and it will never all be fixed.  The rapid pace of innovation in the cloud and on premises has created a whack-a-mole situation where problems just keep popping up like vicious little gophers, and sometimes we all feel a little like Bill Murray in Caddyshack.  Some days I just want to blow up the whole dang thing.  But it’s OK.  The rapid change and innovation have also brought opportunity with them.  Those of us in IT consulting have a job because we understand technology and can keep up with the rapid pace.  It’s our job to stress about all the things that are broken, so that to the end users it all seems to “just work”.  If technology is magic, then we are magicians, and I guess that’s OK too.
