Why are the Microsoft Office file formats so complicated?

An amazing post by Joel Spolsky, and an excellent example of why blogging is such a useful augmentation of our collective intelligence.

Last week, Microsoft published the binary file formats for Office. These formats appear to be almost completely insane. The Excel 97-2003 file format is a 349 page PDF file.
If you started reading these documents with the hope of spending a weekend writing some spiffy code that imports Word documents into your blog system, or creates Excel-formatted spreadsheets with your personal finance data, the complexity and length of the spec probably cured you of that desire pretty darn quickly. A normal programmer would conclude that Office’s binary file formats:

* are deliberately obfuscated
* are the product of a demented Borg mind
* were created by insanely bad programmers
* and are impossible to read or create correctly.

He then goes on to explain, carefully and lucidly, why that ‘normal programmer’ would be wrong on all four counts. Wonderful stuff.