With all the hype for EPUB 3 production, lost in the frenzy seems to be the work we did on the Z39.98 Authoring and Interchange Format.
But is it any less valuable for being overshadowed by its distribution counterpart?
I’ve already written a short defence of the format in the DAISY forums, but it feels like we’ve been unnaturally silent on the format since its delivery.
A big reason for that silence is undoubtedly because many of us who were involved in the NISO standard were also pulled into EPUB 3, and getting accessibility traction in a mainstream format like EPUB invariably takes precedence. As wonderful as I might think an XML authoring format for parallel production use might be, it’ll never be a more effective means of getting books to those who want them than will getting accessible production at the source, whatever internal production grammar is used.
But I want to address the question of why an organization might opt for a format like we defined in Z39.98 for internal use as a master format.
The first myth to dispel, if it actually exists, is that there was ever any competition between Z39.98 and EPUB. Both have histories that stretch back into the 1990s: DAISY’s is a linear one through a number of revisions, while EPUB’s evolved out of the Open eBook format (but is still essentially linear). Although EPUB up to and including version 2 can be seen as a minimally accessible text format (i.e., no bells and whistles), it wasn’t until version 3 that EPUB caught up to, and passed, the DAISY format in terms of what it can accommodate. (Noting here that EPUB 3 is a format, and can be made as inaccessible as any other format when done sloppily.)
But EPUB’s evolution toward fully embracing accessibility can be traced to developments that occurred in 2010, nearly two full years after we’d started the Z39.86 revision. No one had a crystal ball that could see this change coming, or things might have happened differently.
We were actually beginning to wrap up the new XML framework right when the EPUB revision took off, and that, coupled with the specification approval process, was why it would take another year and a half before we would be able to unveil Z39.98. (I’ll explain the two different numbers shortly; the changes are not just typos.)
But my point is that although both specifications came out in 2012, they weren’t under active development at the same time, and their mandates were never the same.
The Z39.98 specification, as it became numbered, was initiated to handle the problem many accessible content producers faced in creating semantically rich content that could be used in multiple output streams. I’d bumped into problems making DTBook content work effectively for braille, e-text and synthetic speech production when I’d started at CNIB. The vocabulary was simply not as rich as I needed for specialized content like plays and cookbooks, and developing and teaching generalized markup techniques to staff was not fun. DTBook was very HTML-ish, having evolved out of its XHTML roots in earlier versions of the standard.
The other problem I had with DTBook was that it still made use of DTDs in an age that was seeing XML production increasingly migrate to schema validation using RelaxNG and Schematron. Also, even though you could extend DTBook, the extension points did not provide great levels of granularity. It would have taken a lot of Schematron rules on top of DTBook to influence production in a consistent manner.
But I’m not here to carp about DTBook, simply to note that it was time for a refresh, and that was our ambition. By starting a new revision of the DAISY standard, the goal was to split the text production component that had been bound up in DTBook from the output talking book distribution format. The new XML format was going to be “Part A” of the new Z39.86 specification and the text/audio distribution format “Part B”. (We’ll never know exactly what “Part B” would have looked like, as EPUB 3 ultimately superseded any work on it, and is now endorsed as the distribution format of the future.)
This new XML format was also going to be designed to facilitate the exchange of content between accessible production agencies. CNIB was not alone in having a different internal format for production, with DAISY being one output. Although it’s always possible to exchange output formats, the semantically rich masters are typically what make the most sense to exchange, whether directly or through an intermediary like the TIGAR project. But how do you do that when everyone has a different internal format?
The brilliant idea that Markus Gylling had was to define a common framework in which to build content models, one that would ensure a certain level of consistency between documents while giving each producer full flexibility to define new content models and refine existing ones to meet their particular needs. Z39.98, as a result, does not define a content model, but a framework in which to build content models, while maintaining a common structure and means of processing the content.
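A rough sketch of that idea, with entirely hypothetical element names and namespace (nothing here is taken from the Z39.98 spec itself): two producers define different content models, but because both follow the same framework conventions, a single generic processor can walk either document.

```python
# A sketch of the "common framework, custom content models" idea behind
# Z39.98, using xml.etree from the standard library. The element names
# and the namespace URI are hypothetical, not taken from the spec.
import xml.etree.ElementTree as ET

NS = "{http://example.org/hypothetical-framework}"

# Producer A's content model: cookbooks.
cookbook = """
<document xmlns="http://example.org/hypothetical-framework">
  <head><title>Soups</title></head>
  <body>
    <recipe><ingredientList><ingredient>Leek</ingredient></ingredientList></recipe>
  </body>
</document>"""

# Producer B's content model: plays.
play = """
<document xmlns="http://example.org/hypothetical-framework">
  <head><title>Hamlet</title></head>
  <body>
    <scene><speech><speaker>Hamlet</speaker></speech></scene>
  </body>
</document>"""

def title_of(xml_text):
    """A generic processor: it relies only on the shared skeleton
    (document/head/title), not on either producer's custom elements."""
    root = ET.fromstring(xml_text)
    return root.find(NS + "head/" + NS + "title").text

print(title_of(cookbook))  # -> Soups
print(title_of(play))      # -> Hamlet
```

The point is that the exchange problem gets easier: a receiving agency can always process the common structure, and only needs to learn the custom content model when it wants the deeper semantics.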
With a new XML specification that was no longer a direct linear evolution of the full talking book standard, a new number was needed. That’s why what started out as a Z39.86 revision culminated in a Z39.98 specification. Z39.86 remains the talking book format of old for those continuing to use it.
But that brief history lesson aside, it’s logical enough to ask at this point: since both Z39.98 and EPUB 3 can be used for XML production, which should you use?
Or, put another way, if EPUB 3 had been available at the same time that I implemented Z39.98 at CNIB, would I have still opted for Z39.98?
The answer is an unequivocal yes.
XHTML5, the text content grammar that underpins EPUB 3, may include some new semantic tags, but its element set is still driven by formatting. Although the epub:type attribute is a useful addition that cleanly layers semantics onto the structural markup, it doesn’t necessarily make production more effective, as I’ll get to.
XHTML5 can certainly be used for single source production (I’ve seen it done), but that’s true of any markup grammar, so it’s not specifically an argument in its favour. The question you have to ask when choosing/designing any master storage format is not whether you can use the format for production, but how effective the format is to author, quality check, transform and re-use later.
When it comes to accessible production, markup is done by hand; industry tools like InDesign are simply unrealistic for the task. We work in a chop-and-scan world, after all, where manual content inspection is the norm.
In this kind of production environment, the ability to create rich markup grammars is really what sets Z39.98 apart for me. I don’t want to explain to the people tagging content how to use attributes to layer semantics into tags. I don’t want to explain to them which tags can take the semantics and which ones can’t. I’ve done it before, and it’s not intuitive for the taggers. It’s infinitely easier to use a semantically driven grammar where the tag names themselves indicate their correct usage.
I don’t buy into the theory that too many tags are hard, in case that’s not clear. There are no real savings between semantics inflected through attributes and a tag name carrying the semantic, only greater ambiguity in the former. That’s not to suggest that HTML has it wrong in not creating a multitude of elements for special uses, but it’s a design decision I find works against manual production, especially when production is not specifically geared at web display. If your content authors are not markup experts, it’s simpler for them to do one step in looking up the element name to use (or view its allowed children) than to have to first pick a generic element appropriate for the situation and then find the semantic for it.
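A toy illustration of the two tagging styles (this is not real EPUB or Z39.98 markup, just a minimal sketch): semantics carried by the element name itself versus semantics layered onto a generic element through an attribute.

```python
# Comparing the two styles: a semantic tag name versus a generic element
# with the semantic supplied via an epub:type-style attribute.
import xml.etree.ElementTree as ET

# Style 1: the tag itself says what the content is -- one lookup.
semantic = ET.fromstring("<dedication>For my parents</dedication>")
print(semantic.tag)  # -> dedication

# Style 2: first pick an appropriate generic element, then attach (and,
# when processing, find and read) the attribute that carries the
# semantic -- two decisions for the tagger, two places to check later.
generic = ET.fromstring(
    '<section xmlns:epub="http://www.idpf.org/2007/ops" '
    'epub:type="dedication">For my parents</section>')
print(generic.tag)  # -> section
print(generic.get("{http://www.idpf.org/2007/ops}type"))  # -> dedication
```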
Another reason I prefer Z39.98 to XHTML is because braille production support for markup is still limited. Feeding XHTML into a program like Duxbury isn’t going to make you a happy, well-adjusted individual in the long run. It’s much simpler to apply formatting to element names in an import than it is to jump through the hoops involved in trying to match on attributes when each element’s attribute node is treated as a literal value. One misstep even in the order of attributes or the semantics within them and your matching falls apart (beware IDs!).
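A minimal sketch of that fragility (the tag names are made up, and this stands in for any tool that compares raw attribute text rather than parsing it): two semantically identical elements look different to a literal matcher the moment attribute order or an id varies, while a match on the element name alone is order-proof.

```python
# Why matching on a literal attribute string is fragile: these two
# start tags are equivalent XML, but a literal comparison sees them
# as different. Matching on the element name alone is not affected.
a = '<p class="note" id="n1">'
b = '<p id="n1" class="note">'

# Literal matching: breaks as soon as attribute order (or an id) varies.
print(a == b)  # -> False

# Name-based matching: both are just <p>.
name = lambda tag: tag[1:].split()[0]
print(name(a) == name(b))  # -> True
```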
It’s also simple to add information necessary for braille production into Z39.98 content and still be able to validate the output, as the framework simplifies the prototyping of new grammars. With a base master format, all that is necessary is to copy the schema and extend/restrict it as necessary. The modular nature of the schemas makes this a breeze to do. I can take a Z39.98 file and contort it all kinds of ways to get a braille-ready import file.
But I’m starting to drag this post out, which wasn’t my intention, so I’ll try to wrap it up.
There are benefits and drawbacks to any markup language you use, but trying to figure out where your pain points are, and who’s going to suffer the most, has to be factored in. For the kind of accessible text production we were doing at CNIB, I’d take a structure-optimized markup grammar like the one we built using the Z39.98 framework over dealing with the ambiguities of a generalized markup laced with semantics any day of the week.
I’m not saying it’s the choice for everyone, but as I said earlier there’s rarely ever a one-size-fits-all solution when it comes to data production. Z39.98 fills a useful niche in the world of markup grammars, and implementing it is a decision I don’t regret (and I’ve heard no complaints from those still there).