Mr. Natural Says…

Fans of R. Crumb will remember Mr. Natural and his words to live by: Always use the right tool for the job. It's a good thing to keep in mind when you compare ODF and XMLRS.

Those of us who are, to use an arcane bit of speech, “of a certain age,” will remember the initial heyday of underground comics, epitomized by the work of R. Crumb in come-and-go series such as Zap Comix and Despair. Crumb created a netherworld of often seedy, usually disturbed, and always disturbing characters, such as Snoid, Flakey Foont, Shuman the Human, Whiteman, Herb Housetop, and most famously, Mr. Natural.

This blog entry isn’t actually about R. Crumb (who now, rather incredibly, appears in The New Yorker on an occasional basis), or even about Mr. Natural, but I will use as my jumping-off point a bit of argot that R. Crumb popularized and which you can still find rattling around the Internet. No, not “Keep on Truckin’,” but this:

Mr. Natural sez, “Always use the right tool for the job!”

Indeed, these are words of wisdom to live by. But how do we know what “the right tool for the job” may be, in a given case? This question does (finally) lead me to the Topic of the Day, which is this: ODF or XML Reference Schema (XMLRS): which is the right tool for your job?

Personally, I’m rather frustrated over this question for two reasons. First, the specifications themselves are lengthy and complex (more on this later), making any sort of side-by-side comparison a challenging task. And second, I’m a lawyer, not a software engineer, so I’m definitely not the right tool for this job. So far, I’m not aware of anyone who has undertaken this task, and I really wish that someone with credibility would do so.

Until that happens, I’m left with deductive reasoning, but I think that this process may take us surprisingly far down the road, so let’s give it a shot, shall we? But first, we should introduce our combatants.

In this corner, we have the existing champion (it is, after all, an already adopted standard), the OASIS OpenDocument Format, or ODF, weighing in at a hefty, but trim, 706 pages. And in that corner, the challenger and candidate Ecma specification: the Microsoft XML Reference Schema, weighing in at over 1900 pages.

What, if anything, might we deduce from the fact that the challenger is more than 2 ½ times as long as the current champion? Let’s come back to that, because really what we should start with is the job that we need a tool for, rather than the tool itself, right?

So what is the job? Or, stated another way, what job was each tool created to do?

We have a lot more to go on here, because both sides have been very clear about what they were trying to achieve. In the case of ODF, the purpose was to create a standard to which multiple products, both proprietary and open source, could be built and, just as significantly, would be built. And so it has happened.

In the case of the challenger, the purpose is explicitly to create a standard that will serve the needs of Microsoft, over and out. (Should I have said “the needs of Office users” instead of “the needs of Microsoft”? No. Adopting ODF, and solving backward compatibility issues with conversion features outside the format, would have served the needs of Office users, because it would have given them freedom of choice: there are already other office productivity products that support ODF. It will be years, if ever, before there are similar products supporting XMLRS.)

So in the one case, the purpose is to create a standard designed to be supported by many products, while in the other the goal is to create a standard that is designed to support only a single product.

In short, we not only have two tools, but those tools were created to do two different jobs. Or, to be a bit more objective, we have two tools that are targeted at more or less the same job, but which have been highly optimized, in each case, to do a rather different job. And many would be the decisions, large and small, that you would expect to be made to optimize the tool to best do one job as compared to the other.

How much does that matter? That’s a harder answer to nail down with certainty, because now we get into another dimension of what “the job” is. In the case of ODF, the job is not only to enable long-term access to data, but also to enable the creation of the most interoperable, innovative, and competitive products and marketplace. In the case of XMLRS, the job is to make it easy for Office users to transfer information and not, one would have to assume, to make it particularly easy for anyone else to build products suited to their own unique needs or goals. Or, perhaps, to build any products at all.

One has to ask: will there be any incentive for other applications to be developed to accept Office data in any event, when Microsoft will presumably always have the upper hand on where the standard goes next, will always be a step ahead in product development, and will have no incentive to make competition easy?

I’ll leave that one for you to answer, because now I’d like to finally take a look at the very different lengths of those two specifications again. True, Office today offers more features than ODF describes, so some portion of that extra mass of pages is certainly to be expected. How much? Well, certainly not 1200 plus pages over and above the 706-page length of the entire ODF standard. So no matter how you slice it, there’s a lot of extra detail there.

So what would that extra detail be about? We already know what a large part of it is for, and that is to guarantee backwards compatibility — back through multiple generations of Office. Does that matter?

The answer is yes, and the reason is that any time you create a standard, you need to strike a balance: between technical excellence and accommodating legacy needs, and between standardizing at a high enough level to avoid becoming overly constraining and at a low enough level to prevent a proliferation of proprietary extensions that would defeat the original purpose of standardizing at all. If you strike that balance properly, let’s say that the length would be X (or, for purposes of this example, let’s say, oh, maybe 706 pages).

So here, we might expect, lies the reason for most or all of those extra pages after we’ve described embedded audio and video and such. One person who is more technically astute than I am and whose judgment I respect phrased it this way:

“Look, you can write a specification to make anything backwardly compatible. But at what cost? All programs grow warts, and if you want to accommodate all those warts, you have to write all kinds of strange code. You end up with a complex hodgepodge of ancient languages and dialects, none of which are relevant to the way that you’d want to do things today.

“The result? Anyone from outside Microsoft who wanted to use MSXML would need to learn all this arcane stuff that just wouldn’t be needed if they were using ODF instead. It wouldn’t be there because it was necessary to make good documents, or to make a good standard. It would just be there to handle all those old versions, and to work around all those warts.”

That’s what happens when you make the standard to suit the application, rather than make the applications to suit the standard. I expect that this is indeed where a lot of those extra 1200 plus pages come from, and I’d hate to have to struggle through them, let alone work with them. Would you want to, especially if you could use the ODF specification and get a proprietary or open source converter to bridge the gap between an application that supports ODF and an Office document?

At the end of the day, I guess it comes out like this. You don’t really have to speculate whether Microsoft has evil designs, and you don’t even have to do a technical comparison to see what else might be under the hood (although I still hope someone will do so). You can simply decide what the job is that you want to do, and then pick the best tool for that job.

If the job is to stay a Microsoft Office user, in a Microsoft world, with prices and features determined by Microsoft, then I would heartily endorse XMLRS as the right tool for you — because that’s just the job (and the world) it was designed to serve.

But if you’d like more choices, competitive prices, richer innovation, market-driven features, and variety — if that’s your job instead — then boy, do I have a tool for you!
