The Art of Indexing

If you don’t follow IDPF specification development closely (I can forgive you if you don’t, though I can’t imagine why not), you may have missed that a new baby is on its way: indexes. The first working draft is now available on the IDPF site, and, all going well, it will probably become a recommendation around the same time as the 3.0.1 revision specifications.

I’m not going to write about the indexes spec today, though; well, at least I’ll be trying not to. More on my mind is the general misunderstanding of indexing in a digital world, one where engineers equate an art with a simple science. Indexing is not just keyword listing, after all, despite that being a pervasive belief in the ebook world. Re-read that last sentence a few times before going on, if you need to. It’s important to get.

Not Quite Ready for Prime Time: aria-describedat

One of the late additions to the EPUB 3.0.1 revision was the very newly minted aria-describedat attribute. As you might guess from the title of this post, while it’s technically available now, using it may not have the desired effect for the obvious reason that it’s going to take time before it finds support in reading systems.

Semantic Overload

So, does EPUB 3.0.1 adding both RDFa and microdata attributes officially qualify it for semantic overload? Are there just too many ways to express semantics? After all, epub:type, ARIA role, RDFa, microdata and even microformats can all be used.

The answer is probably yes and no, with any fault you might be inclined to find lying at the feet of the W3C where these things are ground out. One of HTML’s drawbacks has been the lack of a standardized way of expressing meaningful information about the structure and content of documents. It’s led to the current proliferation of mechanisms, each of which serves a useful function, but the sum of which invariably makes for confusion.

Semantic Intention

A recent mailing list thread had me thinking today about the old problem of where semantics exist. They aren’t hard-coded in your (X)HTML tags, for example, which probably sounds like a strange thing to say. But it’s true… to an extent. Markup languages define how you can structure documents with a lot of precision, but can only weakly enforce what the tags themselves mean.

Epubcheck 3.0 for .Net

I was motivated to figure out if I could add EPUB 3.0 validation to a program I’d written in C#, so did some tinkering over the weekend to see what I could discover. I thought I’d jot down a few notes about getting epubcheck to work as a library inside of a C# program, in case it’s of any use to anyone thinking their only option is to shell out to the command line.

Nav to NCX

I’ve been porting over the content I used to have on the cdata site today, so for a taste of programming (which it seems like I rarely get to do anymore) I took the XSLT style sheet I wrote ages ago, which converts an EPUB 3 navigation document into an NCX, and turned it into a live form.

Feel free to use it, but I guarantee nothing…

It’s also reachable under the Odds ‘n Ends section.
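
For anyone curious what a nav-to-NCX conversion roughly involves, here is a minimal Python sketch. It is not the XSLT style sheet mentioned above, just an illustration under some loud assumptions: the navigation document is well-formed XHTML with no HTML-only entities, only the toc nav is converted, and list items without a link are skipped with their children promoted to the parent. The function name `nav_to_ncx` and its `uid` and `title` parameters are made up for this example.

```python
import itertools
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
OPS = "http://www.idpf.org/2007/ops"
NCX = "http://www.daisy.org/z3986/2005/ncx/"


def q(ns, tag):
    """Build a namespace-qualified name in ElementTree's {uri}tag form."""
    return "{%s}%s" % (ns, tag)


def nav_to_ncx(nav_xhtml, uid="urn:uuid:placeholder", title="Untitled"):
    """Convert the toc nav of an EPUB 3 navigation document to an NCX string.

    uid and title are hypothetical parameters; a real conversion would
    normally take them from the package document.
    """
    root = ET.fromstring(nav_xhtml)
    # Find the first nav element whose epub:type includes "toc".
    toc = next(
        (n for n in root.iter(q(XHTML, "nav"))
         if "toc" in n.get(q(OPS, "type"), "").split()),
        None,
    )
    if toc is None:
        raise ValueError("no toc nav found")

    # Serialize NCX elements in the default namespace (global side effect).
    ET.register_namespace("", NCX)
    ncx = ET.Element(q(NCX, "ncx"), {"version": "2005-1"})
    head = ET.SubElement(ncx, q(NCX, "head"))
    ET.SubElement(head, q(NCX, "meta"), {"name": "dtb:uid", "content": uid})
    doc_title = ET.SubElement(ncx, q(NCX, "docTitle"))
    ET.SubElement(doc_title, q(NCX, "text")).text = title
    nav_map = ET.SubElement(ncx, q(NCX, "navMap"))

    counter = itertools.count(1)

    def walk(ol, parent):
        for li in ol.findall(q(XHTML, "li")):
            link = li.find(q(XHTML, "a"))
            target = parent
            if link is not None:
                n = next(counter)
                point = ET.SubElement(
                    parent, q(NCX, "navPoint"),
                    {"id": "navPoint-%d" % n, "playOrder": str(n)},
                )
                label = ET.SubElement(point, q(NCX, "navLabel"))
                # Flatten inline markup inside the link to plain text.
                ET.SubElement(label, q(NCX, "text")).text = (
                    "".join(link.itertext()).strip()
                )
                ET.SubElement(point, q(NCX, "content"),
                              {"src": link.get("href", "")})
                target = point
            # Nested lists become nested navPoints.
            sub = li.find(q(XHTML, "ol"))
            if sub is not None:
                walk(sub, target)

    top = toc.find(q(XHTML, "ol"))
    if top is not None:
        walk(top, nav_map)
    return ET.tostring(ncx, encoding="unicode")
```

Calling `nav_to_ncx(open("nav.xhtml", encoding="utf-8").read())` (the file name is only an example) returns the NCX as a string. Page lists, landmarks and the other things a fuller conversion would need to handle are deliberately left out.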

Babies are toughest of all…

So much for thinking I’d actually write anything what with the 3.0.1 revision, the accessibility metadata project and the arrival of the baby.

So much for thinking I’d get any sleep after the baby arrived, too… boy was I naive thinking prepping the house would be the hardest work! But it’s nice to have a real baby now and not have to sadly refer to the specs as my babies anymore. ;)

Must try and get back in the habit of thinking again, if only to fight off the delirium that comes from a lack of sleep.

Here are a few pics of little Amelia Sadie.



Navigating EPUB CFIs – Part 1


It’s not a topic I grew up dreaming of writing about one day, but it seems like a useful one to cover, since questions about how to read CFIs come up from time to time (see this recent thread on CFIs in the IDPF forums). I squeezed a quick explanation of them into the Best Practices book, but I had to keep it short, so maybe I can do better justice to them here.

I will say that the specification can be much more daunting to read than CFIs actually are, but that’s not an indictment of the specification. It’s just that CFIs are one of the more technically complex features that were included in EPUB 3.

But let’s start at the beginning: what is a “CFI”?


DIAGRAM Musings

Taking a stroll back into Z39.98 territory has me thinking about the Digital Image And Graphic Resources for Accessible Materials (DIAGRAM) content model, and, more specifically, whether it could be reformulated now that longdesc appears to have a new lease on life.

Which isn’t to suggest that there’s necessarily anything wrong with the content model we have, or that it’s no longer useful, but working on the A11Y Metadata Project to get accessibility metadata into, and seeing LRMI already ahead of the game, it seems natural enough that the content model be expressible as native (X)HTML5 for cases when that is necessary. Accommodating more than one way to render information is never a bad thing, after all.