The Vista problem: a candid internal view

One of the puzzles that really interests me is why delivering Vista, the new incarnation of Windows, has proved so traumatic for Microsoft. It’s interesting because it raises the question of whether these code-monsters have now grown so large and complex that they are beyond the capacity of any single organisation, even one as smart as Microsoft, to manage. Here’s an extensive excerpt from a fascinatingly candid blog post by a Microsoft insider. Apologies for the length, but it has already been removed once after posting (though the author says he came under no company pressure)…

Vista. The term stirs the imagination to conceive of beautiful possibilities just around the corner. And “just around the corner” is what Windows Vista has been, and has remained, for the past two years. In this time, Vista has suffered a series of high-profile delays, including most recently the announcement that it would be delayed until 2007. The largest software project in mankind’s history now threatens to also be the longest.

[…]

Admittedly, this essay would be easier written for Slashdot, where taut lines divide the world crisply into black and white. “Vista is a bloated piece of crap,” my furry little penguin would opine, “written by the bumbling serfs of an evil capitalistic megalomaniac.” But that’d be dead wrong. The truth is far more nuanced than that. Deeper than that. More subtle than that.

I managed developer teams in Windows for five years, and have only begun to reflect on the experience now that I have recently switched teams. Through a series of conversations with other leaders who have similarly left The Collective, several root causes have emerged as lasting characterizations of what’s really wrong in The Empire.

[…]

The Usual Suspects

Ask any developer in Windows why Vista is plagued by delays, and they’ll say that the code is way too complicated, and that the pace of coding has been tremendously slowed down by overbearing process. These claims have already been covered in other popular literature. A quick recap for those of you just joining the broadcast:

Windows code is too complicated. It’s not the components themselves, it’s their interdependencies. An architectural diagram of Windows would suggest there are more than 50 dependency layers (never mind that there also exist circular dependencies). After working in Windows for five years, you understand only, say, two of them. Add to this the fact that building Windows on a dual-proc dev box takes nearly 24 hours, and you’ll be slow enough to drive Miss Daisy.

Windows process has gone thermonuclear. Imagine that each little email you send, asking someone else to fill out a spreadsheet, comment on a report, or sign off on a decision, is a little neutron shooting about in space. Your innocent-seeming little neutron now causes your heretofore mostly-harmless neighbors to release neutrons of their own. Now imagine there are 9000 of you, all jammed into a tight little space called Redmond. It’s Windows Gone Thermonuclear, a phenomenon by which process engenders further process, eventually becoming a self-sustaining buzz of fervent destructive activity.

Let’s see if, quantitatively, there’s any truth to the perception that the code velocity (net lines shipped per developer-year) of Windows has slowed, or is slow relative to the industry. Vista is said to have over 50 million lines of code, whereas XP was said to have around 40 million. There are about two thousand software developers in Windows today. Assuming there are 5 years between when XP shipped and when Vista ships, those quick on the draw with calculators will discover that, on average, the typical Windows developer has produced one thousand new lines of shipped code per year during Vista. Only a thousand lines a year. (Yes, developers don’t just write new code, they also fix old code. Yes, some of those Windows developers were partly busy shipping 64-bit XP. Yes, many of them also worked on hotfixes. Work with me here.)
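The back-of-envelope arithmetic above can be reproduced in a few lines. This is only a minimal sketch using the round figures quoted in the paragraph; the function name `code_velocity` and the constant names are illustrative and do not come from the original post.

```python
def code_velocity(lines_after, lines_before, developers, years):
    """Net new shipped lines per developer per year."""
    return (lines_after - lines_before) / (developers * years)


# Round figures quoted in the post (approximations, not measured data).
VISTA_LINES = 50_000_000
XP_LINES = 40_000_000
WINDOWS_DEVELOPERS = 2_000
YEARS_BETWEEN_RELEASES = 5

velocity = code_velocity(
    VISTA_LINES, XP_LINES, WINDOWS_DEVELOPERS, YEARS_BETWEEN_RELEASES
)
print(f"{velocity:.0f} net new lines per developer-year")
```

Ten million net new lines spread over ten thousand developer-years gives the thousand lines a year cited above. The result is only as good as the inputs, which the post itself presents as rough.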

Lest those of you who wrote 5,000 lines of code last weekend pass a kidney stone at the thought of Windows developers writing only a thousand lines of code a year, realize that the average software developer in the US only produces around (brace yourself) 6,200 lines a year. So Windows is in bad shape, but only by a constant factor, not by an order of magnitude. And if it makes you feel any better, realize that the average US developer has fallen in KLOC productivity since 1999, when they produced about 9,000 lines a year. So Windows isn’t alone in this.
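The claim that Windows lags by less than an order of magnitude can be checked directly. A small sketch, taking the post's thousand-lines-a-year Windows estimate and its quoted US averages as given; the variable names are illustrative, and none of the figures are independently verified here.

```python
# Figures as quoted in the post; treated as rough estimates.
WINDOWS_LOC_PER_YEAR = 1_000
US_AVERAGE_LOC_PER_YEAR = 6_200
US_AVERAGE_LOC_1999 = 9_000

# How far Windows trails the current US average.
gap = US_AVERAGE_LOC_PER_YEAR / WINDOWS_LOC_PER_YEAR

# Fractional drop in the US average since 1999.
industry_decline = 1 - US_AVERAGE_LOC_PER_YEAR / US_AVERAGE_LOC_1999

print(f"Windows trails the US average by a factor of {gap:.1f}")
print(f"Less than an order of magnitude (10x): {gap < 10}")
print(f"Decline in the US average since 1999: {industry_decline:.0%}")
```

On these numbers the gap is a factor of about 6.2, and the industry average itself has dropped by roughly 31 percent since 1999.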

The oft-cited, oft-watercooler-discussed dual phenomena of Windows code complexity and Windows process burden seem to have dramatically affected its overall code velocity. But code can be simplified and re-architected (and indeed this is being done by a collection of veteran architects in Windows, none of whom, incidentally, look anything like Colonel Sanders). Process can be streamlined where inefficient, eliminated where unnecessary.

But that’s not where it ends. There are deeper causes of Windows’ propensity for slippage.

Cultured to Slip

Deep in the bowels of Windows, there remains the whiff of a bygone culture of belittlement and aggression. Windows can be a scary place to tell the truth.

When a vice president in Windows asks you whether your team will ship on time, they might as well have asked you whether they look fat in their new Armani suit. The answer to the question is deeply meaningful to them. It’s certainly true in some sense that they genuinely want to know. But in a very important other sense, in a sense that you’ll come to regret night after night if you get it wrong, there’s really only one answer you can give.

After months of hearing of how a certain influential team in Windows was going to cause the Vista release to slip, I, full of abstract self-righteous misgivings as a stockholder, had at last the chance to speak with two of the team’s key managers, asking them how they could be so, please-excuse-the-term, I-don’t-mean-its-value-laden-connotation, ignorant as to proper estimation of software schedules. Turns out they’re actually great project managers. They knew months in advance that the schedule would never work. So they told their VP. And he, possibly influenced by one too many instances where engineering re-routes power to the warp core, thus completing the heretofore impossible six-hour task in a mere three, summarily sent the managers back to “figure out how to make it work.” The managers re-estimated, nipped and tucked, liposuctioned, did everything short of a lobotomy — and still did not have a schedule that fit. The VP was not pleased. “You’re smart people. Find a way!” This went back and forth for weeks, whereupon the intrepid managers finally understood how to get past the dilemma. They simply stopped telling the truth. “Sure, everything fits. We cut and cut, and here we are. Vista by August or bust. You got it, boss.”

Every once in a while, Truth still pipes up in meetings. When this happens, more often than not, Truth is simply bent over an authoritative knee and soundly spanked into silence.

The Joy of Cooking

Bundled with a tendency towards truth-intolerance, Windows also sometimes struggles with poor organizational decision-making. The good news is that the senior leaders already know this and have been taking active steps to change the situation.

There are too many cooks in the kitchen. Too many vice presidents, in reporting structures too narrow. When I was in Windows, I reported to Alec, who reported to Peter, to Bill, Rick, Will, Jim, Steve, and Bill. Remember that there were two layers of people under me as well, making a total path depth of 11 people from Bill Gates down to any developer on my team.

This isn’t necessarily bad, except sometimes the cooks flash-mob one corner of the kitchen. I once sat in a schedule review meeting with at least six VPs and ten general managers. When that many people have a say, things get confusing. Not to mention, since so many bosses are in the room, there are often negotiations between project managers prior to such meetings to make sure that no one ends up looking bad. “Bob, I’m giving you a heads-up that I’m going to say that your team’s component, which we depend on, was late.” “That’s fine, Sandy, but please be clear that the unforeseen delays were caused by a third party, not my team.”

Micromanagement, though not pervasive, is nevertheless evident. Senior vice presidents sometimes review UI designs of individual features, a nod to Steve Jobs that would in better days have betokened a true honor but for its randomizing effects. Give me a cathedral, give me a bazaar — really, either would be great. Just not this middle world in which some decisions are made freely while others are made by edict, with no apparent logic separating each from the other but the seeming curiosity of someone in charge.

In general, Windows suffers from a proclivity for action control, not results control. Instead of clearly stating desired outcomes, there’s a penchant for telling people exactly what steps they must take. By doing so, we risk creating a generation of McDevs. (For more on action control vs. results control, read Kenneth Merchant’s seminal work on the subject — all $150 of it, apparently).

Uncontrolled? Or Uncontrollable?

Despite all this, we shouldn’t forget that Windows Vista remains the largest concerted software project in human history. The types of software management issues being dealt with by Windows leaders are hard problems, problems that no other company has solved successfully. The solutions to these challenges are certainly not trivial.

An interesting question, however, is whether or not Windows Vista ever had a chance to ship on time to begin with. Is Vista merely uncontrolled? Or is it fundamentally uncontrollable? There is a critical difference.

It’s rumored that VPs in Windows were offered big bonuses contingent on shipping Vista by the much-publicized August 2006 date. Chris Jones even declared in writing that he wouldn’t take a bonus if Vista slips past August. If this is true, if folks like Brian Valentine held division-wide meetings where August 2006 was declared as the drop-dead ship date, if general managers were consistently told of the fiscal importance of hitting August, if everyone down to individual developers was told to sign on the dotted line to commit to the date, and to speak up if they had any doubts of hitting it — mind you, every last one of those things happened — and yet, and yet, the August date was slipped, one has to wonder whether it was merely illusory, given the collective failure of such unified human will, that Vista was ever controllable in the first place.

Are Vista-scale software projects essentially uncontrollable by nature? Or has Microsoft been beset by one too many broken windows? Talk amongst yourselves…

The answer to that question — Are Vista-scale software projects essentially uncontrollable by nature? — may well be “yes — if they’re done within a single organisation”. That’s why Steve Weber may well be right about Open Source: that it’s a better way of making unimaginably complex products.