EPUB Annotations

Annotations are kind of a weird thing to me, at least from a format perspective. Being able to create them is an integral part of the reading experience for many readers, no doubt, but technically they have nothing to do with the structure of an ebook itself. They’re more like a layer that lives on top of the format.

Seen in that light, it’s hard to argue that EPUB itself has to define an annotation framework. Leave it to the reading system developers to figure out annotations in EPUB or any other format they might support, right?

Of course, therein lies a big problem. Leave it to the vendors and you get proprietary implementations that can’t travel with your content across devices and apps. You also can’t distribute annotations separately from the content. That’s effectively the world we live in now; another brick in the wall of the walled gardens.

So, it’s not surprising that the IDPF has been working on a framework for annotations in EPUB, based on the W3C Open Annotation work. It walks an interesting line between presentation and storage, which is what I’m going to look at today.

Lost from HTML 5.0

If you’ve ever taken a read of the EPUB 3 Content Documents spec, you’ve undoubtedly seen the warnings about HTML5 features being experimental, and to use them with caution. Caveat emptor and all that…

Did you ever skip on over to the HTML 5.0 spec and have a look at what features those were? Did you use them with caution? (Everyone follows specs to the letter of the law, right?)

If not, as the 5.0 revision winds down you might have missed the various features that have recently been pushed out. If you liked the details/summary elements, for example, you’re waiting for HTML 5.1 now for official status.

So where does that leave the world of EPUB 3?

IDPF forums still active

Hm, time really flew the last few months between EDUPUB and a few other things. Can you tell my January was a bit on the slow side?

Anyway, to get back in the swing of things, I just wanted to note that http://www.idpf.org/forums is still active for asking EPUB questions. When the EPUBZone microsite was added, the forum link disappeared from the main menu, and the link from the microsite goes directly to the EPUB3 forum, making it appear the other three might have been retired. But all the forums are alive and well, if a little harder to get to right now.

I’ve reported the problem, as have others, so hopefully full access will be restored from the IDPF site soon. In the meantime, though, don’t stay away on the assumption that they’re no longer frequented.

TTS … Today, Tomorrow, Someday?

I have another long-standing interest in text-to-speech rendering from my time at CNIB, where the two main outputs we were generating were XML for braille full-text production and synthetically voiced DAISY 2.02 back matter components.

The reason we were TTS’ing back matter was that spending time reading indexes and bibliographies is an enormous waste of human resources — it’s a lag on getting books out to readers and would result in a precipitous drop in total output.

Very few people ever read the back matter, too, at least in general circulating libraries like we had. TTS meant that we didn’t have to deny readers information that otherwise would have been omitted.

But to the point of this post, when I first saw the enhancements in EPUB 3 to improve text-to-speech playback, and a means of distributing high-quality text for rendering on the client side, I had stars in my eyes. Here was a way to bring high-quality voicing without huge audio downloads. But two plus years on, how close are we to realizing the potential?

Legal E-PUB?

I suppose it’s natural that I wonder every so often how well EPUB could be used to represent acts, regulations, case law and all the other (boring to everyone else) documents that are critical to the field. I did get my first “real” data job in legal publishing, after all.

And, yes, I realize how sad it must sound that I spend my free time thinking about translating legal documents to EPUB.

But whatever your sympathies, I spent a little time looking at the problems again by tinkering with some legislation I pulled off the web, so thought I’d recap what I was finding.

MathML Support in EPUB

Can you have an ebook format for education that doesn’t support math? (Note: it’s a rhetorical question!)

Math is pervasive, despite the common perception that it’s only for STEM. Even novels have been known to include equations from time to time.

And yet consumer-end MathML support in digital works is still lacking, despite two decades of work in W3C. If you want the whole story, Peter Krautzberger has written a fantastic article on the history of MathML and support in browsers that I’d highly recommend reading. I’m not going to attempt anything similar today, as there’s nothing of value a math neophyte like me could add.

Rather, what has me thinking about MathML is the upcoming second EDUPUB conference, and the lack of consistent rendering, and voicing, of math in EPUB reading systems. In particular, how do we effect real change?

Semantically Structuring HTML

I don’t know what has me thinking about the use of the epub:type attribute to structure HTML markup today, except for the obvious sad fact that I like thinking about markup issues.

The attribute is increasingly being used to build a kind of semantic scaffolding to prop up the generic markup that is HTML — going beyond simple semantic inflection of structures into semantic markup models where there are required parent and child relationships.

It’s not a new idea, but can it succeed in EPUB?

Hyperlink to non-standard resource …

If you’ve seen this message from epubcheck, you know where this post is going.

It’s easy to get tripped up on EPUB’s web-like, but not quite web, quirks, as what is valid to do in a web page isn’t always valid to do in an EPUB. This is particularly true when it comes to linking to resources, as you have to follow EPUB’s core media type requirements.

If epubcheck has spewed the “hyperlink to non-standard resource” message at you, it’s because you can only have internal links go to XHTML or SVG documents, at least without a fallback.
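
To make that rule a little more concrete, here is a rough Python sketch of the kind of check involved. It is not epubcheck’s actual logic, just an illustration under some simplifying assumptions: the manifest has already been parsed into a list of dictionaries (the `id`, `href`, `media_type` and `fallback` keys are names I made up for this example), and every href is already relative to the package document.

```python
import posixpath
from urllib.parse import urldefrag, urlparse

# Media types that can be hyperlink targets without needing a fallback
# (XHTML and SVG content documents).
CONTENT_DOCUMENT_TYPES = {"application/xhtml+xml", "image/svg+xml"}


def has_content_document_fallback(item_id, manifest_by_id):
    """Follow a manifest fallback chain looking for an XHTML or SVG item."""
    seen = set()
    while item_id is not None and item_id not in seen:
        seen.add(item_id)  # guard against circular fallback chains
        item = manifest_by_id.get(item_id)
        if item is None:
            return False
        if item["media_type"] in CONTENT_DOCUMENT_TYPES:
            return True
        item_id = item.get("fallback")
    return False


def find_nonstandard_links(hrefs, manifest):
    """Return the internal hrefs that point at a non-XHTML/SVG resource
    with no fallback chain ending in an XHTML or SVG document.

    `manifest` is a list of dicts with the hypothetical keys
    id, href, media_type and (optionally) fallback.
    """
    by_id = {item["id"]: item for item in manifest}
    by_href = {item["href"]: item for item in manifest}
    flagged = []
    for href in hrefs:
        if urlparse(href).scheme:  # external link such as http://...
            continue
        path, _fragment = urldefrag(href)
        item = by_href.get(posixpath.normpath(path))
        if item is None:  # not in the manifest at all: a different problem
            continue
        if not has_content_document_fallback(item["id"], by_id):
            flagged.append(href)
    return flagged


manifest = [
    {"id": "ch1", "href": "ch1.xhtml", "media_type": "application/xhtml+xml"},
    {"id": "pdf", "href": "extras/table.pdf", "media_type": "application/pdf"},
    {"id": "data", "href": "extras/data.csv", "media_type": "text/csv",
     "fallback": "ch1"},
]
links = ["ch1.xhtml#s1", "extras/table.pdf", "extras/data.csv",
         "http://www.idpf.org/forums"]
print(find_nonstandard_links(links, manifest))  # -> ['extras/table.pdf']
```

The PDF link is flagged because nothing in its (nonexistent) fallback chain is an XHTML or SVG document, while the CSV passes only because it falls back to `ch1.xhtml`.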

But don’t fret. There are some easy, and not-so-easy, workarounds to this problem, which is what I’m going to look at.

Who’s afraid of the world wide web?

There are certain aspects of EPUB 3 that are underspecified by design.

The navigation document I detailed in the last post is one example. While the rules for structuring the markup are delineated in the specification, the specification itself is not out to mandate the presentation.

Scripting is another example, but in a slightly different way. JavaScript is already well defined, so the flexibility doesn’t come from under-specification of the technology, but from the freedom reading systems have to restrict what a script can do. Although this flexibility was given with the best of intentions, content creators are now finding themselves at the mercy of the lowest common denominator of support.

Tables of Contents Revisited

Alas… I don’t think I can say I learned while working on the best practices book that no matter how thoroughly you think you’ve reviewed your prose, some flubs will make it through.

Only because I learned that lesson a long time ago when I first got into publishing. I’ve seen notes from an author to an editor in a final print run, after all. (Thankfully, the editor wasn’t me!)

It doesn’t surprise me, then, that some errata have been noted, or that I’m the one reporting some of them. You can only hope that your mistakes are small, and the errata haven’t been all that critical to date.

That said, when you make an error, there’s no point running from it, and the two bits of errata that most bother me are both in the navigation chapter. I just gave a brief synopsis of one of the issues in the IDPF forums, so I figured I’d take a little more time to outline them both here.

And maybe take a look at a couple of additional clarifications we tried to make during the 3.0.1 revision, as grasping the in-spine and out-of-spine uses is proving to be generally confusing.

