Cory’s inner geek

Cory Doctorow is one of the most interesting people I know. He’s just written a fascinating essay in Locus Online detailing three geeky spinoffs from his creative work. The first is a system for matching (i) institutions that would like a free copy of one of his books with (ii) donors who are willing to give one away. The second is his adaptation of Twitter hashtagging to extract more value from the text files in which he makes research notes when he’s working on a book. The third is an adaptation of the version-control systems commonplace in software development to track the evolution of his books through successive drafts. Here’s how he formulates the problem for which this is a solution:

I know a lot of archivists and one of their most common laments is the disappearance of the distinct draft manuscript in the digital age. Pre-digital, authors would create a series of drafts for their work, often bearing hand-written notations tracking the thinking behind each revision. By comparing these drafts, archivists and scholars could glean insights into the author’s mental state and creative process.

But in the digital era, many authors work from a single file, modifying it incrementally for each revision. There are no distinct, individual drafts, merely an eternally changing scroll that is forever in flux. When the book is finished, all the intermediate steps that the manuscript went through disappear.

It occurred to Cory that there was no rational reason why this had to be so. After all, computers are terrific at remembering insane amounts of trivial information. So he wrote to a programmer friend of his, Thomas Gideon. Cory takes up the story:

Thomas loved the idea and ran with it, creating a script that made use of the free and open-source version-control system “Git” (the system used to maintain the Linux kernel), checking in my prose at 15-minute intervals, noting, with each check-in, the current time-zone on my system clock (where am I?), the weather there, as fetched from Google (what’s it like?) and the headlines from my last three Boing Boing posts (what am I thinking?). Future versions will support plug-ins to capture even richer metadata: say, the last three tweets I twittered, and the last three songs my music player played for me.

He called it “Flashbake”, a neologism from my first novel, Down and Out in the Magic Kingdom. I was honored.

It’s an incredibly rich — even narcissistic — amount of detail to capture about the writing process, but there’s no reason not to capture it. It doesn’t cost any more to capture all this stuff every 15 minutes than it would to capture a daily file-change snapshot at midnight without any additional detail. And since Git — and other source repositories — is designed to let you summarize many changes at a time (say, all the changes between version 1 and version 2 of a product), it’s easy to ignore the metadata if it’s getting in the way.
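For the curious, here is a rough sketch of the kind of thing Cory describes: commit the manuscript to Git at intervals and fold the surrounding context into the commit message. It is not Flashbake's actual code, which I haven't reproduced. The function names and the manuscript filename are my own inventions, and the weather and headlines are simply passed in as arguments, since how to fetch them is left out here.

```python
import subprocess
import time


def build_commit_message(timezone, weather, headlines):
    """Fold the context metadata into a plain-text commit message."""
    lines = [
        f"Time zone: {timezone}",
        f"Weather: {weather}",
        "Recent headlines:",
    ]
    # Keep only the three most recent headlines, as in Cory's description.
    lines.extend(f"  - {headline}" for headline in headlines[:3])
    return "\n".join(lines)


def snapshot(repo_dir, manuscript, message):
    """Stage the manuscript and commit it. Returns True if a commit was made."""
    subprocess.run(["git", "add", "--", manuscript], cwd=repo_dir, check=True)
    # `git diff --cached --quiet` exits with 0 when nothing is staged,
    # so unchanged text does not produce an empty commit.
    staged = subprocess.run(["git", "diff", "--cached", "--quiet"], cwd=repo_dir)
    if staged.returncode == 0:
        return False
    subprocess.run(
        ["git", "commit", "--quiet", "-m", message], cwd=repo_dir, check=True
    )
    return True


def run_forever(repo_dir, manuscript, interval_seconds=15 * 60):
    """Take a snapshot every 15 minutes (hypothetical driver loop)."""
    while True:
        message = build_commit_message(
            timezone=time.strftime("%Z"),      # read from the system clock
            weather="unknown (not fetched)",   # placeholder, not a real lookup
            headlines=[],                      # placeholder, not a real feed
        )
        snapshot(repo_dir, manuscript, message)
        time.sleep(interval_seconds)
```

Calling `run_forever(".", "novel.txt")` inside an existing Git repository would start the loop (the filename is made up). If two drafts were tagged, say as `draft-1` and `draft-2` (again, hypothetical names), then `git diff draft-1 draft-2` would show everything that changed between them in one go, which is the "summarize many changes at a time" point Cory makes above.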

Wonderful stuff. I don’t think Cory has ever written a boring piece in his entire life.