Navigating EPUB CFIs – Part 2

Once more unto the breach, dear friends, once more… I promise only once more!

Okay, so I’m trying to steel myself up to write the second half of the CFI post I swore to myself I’d come back to after the first part. If I don’t get this done by year end, it may never see the light of day.

The assumption at this point is that you understand the basic referencing mechanism in play, as this time I’m only going to look at the mechanisms available once you’ve reached your destination. If you don’t have the foggiest what all the slashes and even and odd numbers mean, have another go at the first part.

So what are some of the fun things you can do once you reach an element…

Character offsets

Perhaps the obvious one is the ability to reference into the text content of an element. When a CFI path ends with a colon followed by an integer, that integer refers to the character offset. Here’s an example from the specification:


The odd “/1” step indicates that we’ve reached the first section of character data in para05, and “:3” indicates that the text offset is 3 characters in. That’s (almost) all there is to character offsets.
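If you’d rather see that as code, here’s a minimal Python sketch of peeling the offset off the end of a path. The function name split_character_offset and the truncated path in the usage line are my own inventions for illustration, and the regex deliberately ignores assertions and escaped brackets, so treat it as a toy rather than a real parser.

```python
import re

# Hypothetical helper: handles only paths ending in "/N:M" or "/N[id]:M".
_OFFSET = re.compile(r"(.*/(\d+)(?:\[[^\]]*\])?):(\d+)")


def split_character_offset(cfi_path):
    """Return (path, step_is_odd, offset), or None if there is no offset."""
    match = _OFFSET.fullmatch(cfi_path)
    if match is None:
        return None
    path = match.group(1)
    step = int(match.group(2))
    offset = int(match.group(3))
    # Odd steps address character data; even steps address elements.
    return path, step % 2 == 1, offset


# The path here is an assumed, truncated example, not a full CFI.
print(split_character_offset("/4/10[para05]/1:3"))
```

The boolean in the middle just records whether the final step is odd, which is what tells you the offset counts into character data rather than into an element.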

Why is this cool? Well, think about the limitation of referencing by element IDs. Instead of only providing a link to an element, character offsets let you go that next level down and reference the text. It doesn’t matter how the author structured their content anymore. It also means that you can link index entries and other references to precise locations.

So why did I say “almost” above? Because the img element’s alt text is not rendered by default, but can be reached by a CFI. Although you’ll typically find character offsets attached to odd numbered steps, for this reason you will sometimes find them attached to even numbered steps. For example:


But that’s the only case where you’ll find a character offset attached to an even numbered step.

(Note that I didn’t, and am not going to, specify full CFI paths; they’ll be truncated to just the interesting part for the rest of this post.)

Text Location Assertions

There are a couple of additional interesting ways you can augment character offsets. The first of these is text location assertions, which allow you to specify the text you expect to appear before or after the location.

As the text in a publication can change over time, using text location assertions provides a means of determining whether the location originally pointed to has shifted. Provided a unique enough string of characters before and/or after the point, the reading system could self-correct the location.

Text location assertions are included after the character offset in brackets, like this:


In the above, the ‘b’ stands in for the text before the referenced location and ‘a’ for text after. If I were referencing into the middle of the line of Henry V I started off the post quoting, it might look like this:

/1:10[Once more,unto the breach]

If the text following the reference isn’t relevant (or there is none), simply omit it and the comma:

/1:10[Once more]

But if only the text following is relevant, you have to start the assertion with a comma:

/1:10[,unto the breach]
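The three shapes above are easy to capture in a few lines of Python. This is only a sketch: text_assertion is a name I made up, and it assumes the strings you pass in have already had any special characters escaped.

```python
def text_assertion(before="", after=""):
    """Build the bracketed assertion that follows a character offset.

    Hypothetical helper; assumes both strings are already escaped.
    """
    if not before and not after:
        return ""  # nothing to assert, so no brackets at all
    if after:
        # The comma is needed whenever there is text after the location,
        # even when the text before is empty.
        return f"[{before},{after}]"
    return f"[{before}]"


print("/1:10" + text_assertion("Once more", "unto the breach"))
print("/1:10" + text_assertion("Once more"))
print("/1:10" + text_assertion(after="unto the breach"))
```

The only real decision in there is that the comma belongs to the "after" text: leave out the after text and the comma goes with it.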

If you noticed there are no delimiters around the text, like quotes, you’re catching on to the one additional bit you need to be aware of. Since the text before or after the location can contain characters that have special meanings, you have to escape these characters using circumflexes (^).

Say I wanted to reference the space after the comma in the quote. In that case, I’d need to escape the first comma that is part of the text as follows:


You can review the escaping requirements in the character escaping section of the specification. In a nutshell, you need to watch out for brackets, parentheses, semicolons, commas and circumflexes. (You also have to encode the percent sign as %25, but the long, boring details relating to IRIs are explained in the spec.)
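For the programmatically inclined, circumflex escaping might be sketched in Python like this. The function names are mine, and the set of special characters is just the list from the paragraph above; check the specification for the authoritative set before relying on it. The percent-encoding step is left out entirely.

```python
# Characters taken from the list in this post; the spec is the authority.
SPECIAL_CHARS = set("[]()^,;")


def escape_cfi_text(text):
    """Prefix every special character with a circumflex."""
    return "".join("^" + ch if ch in SPECIAL_CHARS else ch for ch in text)


def unescape_cfi_text(text):
    """Reverse escape_cfi_text: drop each circumflex, keep what follows it."""
    result = []
    chars = iter(text)
    for ch in chars:
        if ch == "^":
            ch = next(chars, "")  # take the escaped character literally
        result.append(ch)
    return "".join(result)


escaped = escape_cfi_text("unto the breach, dear friends")
print(escaped)
print(unescape_cfi_text(escaped))
```

Because the circumflex is itself in the special set, a literal ^ in the text comes out as ^^ and survives the round trip.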

Side Bias

I’m not going to spend a lot of time on side bias, as it only has relevance where the reference occurs at a dynamic page break (supposedly also a line break, but I can’t think of how this would matter). The gist of side bias is that it allows you to specify which side of the reference to show at that break point (i.e., the content on the page before or the content on the page after).

A side bias assertion also goes inside of brackets and takes the form “s=a|b”, where the pipe indicates either ‘a’ (after) or ‘b’ (before) is the value. In other words, if I wanted the text following my character offset to display when the reference falls at a page boundary, the expression would look like this:


The expression starts with a semicolon because text location assertions also go inside the same set of brackets, and are expected first. You can omit a text location assertion, but you can’t omit the semicolon separator. It’s just one of the parsing requirements of CFIs (for the technically inclined, ‘[s=a]’ could be an actual, if unlikely, text location assertion, which is why the semicolon is required).

But back to the topic at hand, if the location referenced fell at the break between pages one and two, the reading system would be expected to show page two given the above CFI.

Similarly, if I wanted the text leading up to that reference to be visible I’d specify:


Now the reading system would be expected to render page one.

That’s really all there is to them.
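As a quick illustration of the semicolon rule, here’s a small Python sketch that tacks a side bias onto a path. The name with_side_bias and the "/1:10" path are assumptions for the sake of the example, and any assertion text passed in is assumed to be escaped already.

```python
def with_side_bias(path_with_offset, side, assertion=""):
    """Append a side bias, plus an optional text location assertion.

    Hypothetical helper; `assertion` is the raw text that goes inside
    the brackets (for example "Once more"), already escaped.
    """
    if side not in ("a", "b"):
        raise ValueError("side must be 'a' (after) or 'b' (before)")
    # The semicolon is always written, even when the assertion is empty.
    return f"{path_with_offset}[{assertion};s={side}]"


print(with_side_bias("/1:10", "a"))
print(with_side_bias("/1:10", "b", "Once more"))
```

Note that the empty-assertion case still produces the leading semicolon, which is exactly the point made above about why it can’t be dropped.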

Spatial Offsets

While pointing to text locations is fun and all, it’s hardly the coolest technique in the CFI toolbox. (Maybe I’m biased, though, as I’ve seen so much text data in my life indexing into the content has lost all its lustre.)

Moving along the scale of cooler tricks you can accomplish with CFIs, then, we land at spatial offsets. As the name suggests, these allow you to reference a spatial location (x and y coordinates) within an image or video. This time, we append an at sign (@) followed by the coordinates (x:y). For example, to specify the centre of an image:


And the answer to the question you might be asking yourself is: no, the image does not have to be 100 pixels wide by 100 pixels tall for the above CFI to reference the centre point. The x and y coordinates are specified as a percentage of the image width and height. If you really need pixel-precise points, you’ll have to work out the appropriate fraction. I’m not sure how often that kind of precision will be necessary, though.
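Working out that fraction is simple arithmetic. Here’s a Python sketch, assuming you know the pixel dimensions of the image; spatial_offset is a made-up name, and the formatting rounds to six significant digits, which is my choice rather than anything the spec demands.

```python
def spatial_offset(x_px, y_px, width_px, height_px):
    """Convert a pixel position into an "@x:y" offset on a 0-100 scale.

    Hypothetical helper for illustration only.
    """
    if not (0 <= x_px <= width_px and 0 <= y_px <= height_px):
        raise ValueError("point lies outside the image")
    x = x_px / width_px * 100
    y = y_px / height_px * 100
    # ":g" drops trailing zeros, so 50.0 is written as 50.
    return f"@{x:g}:{y:g}"


# The centre of a 640 by 480 image
print(spatial_offset(320, 240, 640, 480))
```

So the centre of a 640 by 480 image comes out the same as the centre of a 100 by 100 one, which is the whole idea behind using percentages.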

If you’re wondering how to handle referencing video images that are constantly changing, we’ll see how that’s done after first looking at how to specify an offset in the timeline.

(Remember, since spatial offsets reference elements, they’ll always be even numbered. Same for the following offset references.)

Temporal Offsets

Temporal offsets allow you to specify the time position within a resource like an audio or video clip.

The syntax uses a tilde (~) at the end of the path followed by a number that indicates the offset in seconds (you can specify fractions of a second, too). For example, we could reference a point 3 minutes and 25.23 seconds into a fictitious video clip like this:
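Since the offset is plain seconds, the only work is converting from minutes. A small Python sketch, with temporal_offset as an invented name; it takes the seconds as a string and uses Decimal so the fractional part isn’t mangled by floating point.

```python
from decimal import Decimal


def temporal_offset(minutes, seconds):
    """Return a "~seconds" offset from minutes plus (fractional) seconds.

    Hypothetical helper; `seconds` is a string such as "25.23".
    """
    total = Decimal(minutes) * 60 + Decimal(seconds)
    if total < 0:
        raise ValueError("offset cannot be negative")
    # normalize() trims trailing zeros; ":f" avoids exponent notation.
    return f"~{total.normalize():f}"


# 3 minutes and 25.23 seconds is 205.23 seconds
print(temporal_offset(3, "25.23"))
```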


Mixed Temporal and Spatial Offsets

With temporal offsets under our belt, we can now return to the problem of how to specify a position within an image when the image on the screen is constantly changing.

Naturally enough, the solution is to combine temporal and spatial offsets into one CFI. By specifying how far into a video timeline to go first, you can then reference a location in the current image using a spatial offset. For example, we can combine the previous two examples to find the centre of the image 3 minutes and 25.23 seconds into the video like this:


The one point to note here is that the order has to be temporal offset first followed by spatial offset. If you were to reverse the above, it’s likely that the CFI would be resolved to whatever image is displayed at the 00:00:00 mark (probably the poster for HTML5 video) with the offset being junked.
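To make the ordering rule concrete, here’s a Python sketch that reads the tail end of a CFI and accepts only "temporal, then spatial". The name parse_offsets is hypothetical, and rejecting the reversed order outright is my own design choice for the sketch; a real reading system may well behave as described above and quietly ignore the misplaced part instead.

```python
import re

_NUMBER = r"\d+(?:\.\d+)?"
# Optional "~seconds" first, then optional "@x:y"; no other order matches.
_OFFSETS = re.compile(
    rf"(?:~(?P<t>{_NUMBER}))?(?:@(?P<x>{_NUMBER}):(?P<y>{_NUMBER}))?"
)


def parse_offsets(suffix):
    """Return (seconds, (x, y)); either part is None when absent."""
    match = _OFFSETS.fullmatch(suffix)
    if match is None:
        raise ValueError(f"not a valid temporal/spatial offset: {suffix!r}")
    seconds = float(match["t"]) if match["t"] is not None else None
    point = None
    if match["x"] is not None:
        point = (float(match["x"]), float(match["y"]))
    return seconds, point


print(parse_offsets("~205.23@50:50"))
```

Feeding it "@50:50~205.23" raises a ValueError, because nothing in the pattern allows a tilde after the spatial part.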

Phew! Made it to the end. That’s more than you’ll ever want to know about CFIs, I’m sure.

Where do you go from here…

Probably not very far, unfortunately. Unless you’re a developer who wants to implement this functionality, that is.

Right now there’s very limited support for even basic CFI paths, so you’re not going to be able to write links using the above techniques anytime soon. The functionality exists more to facilitate bookmarking and annotating content, but as those are reading system functions (at least at this time), it means there’s not much you can do but look at the pretty things you could do with reverence and awe.

4 Replies to “Navigating EPUB CFIs – Part 2”

  1. Hi,

    It seems CFI wouldn’t catch up with real-world implementations, and it is a perfect example of reinventing the wheel. If your data model is XML, which is required by EPUB 3 for Content Documents, why not use XPath?


  2. I’m not sure I see how xpath would solve the needs that CFIs do.

    You can only reference nodes with xpath, which is crude. CFIs are designed to allow greater granularity (pointing to text positions/ranges, into images and temporal media, etc.), as the functionality expected to be built with them needs precision (annotations, highlights, bookmarks, etc.).

    Also, xpath fails to address referencing into the container to the canonical instance (part one of this blog post), since a document can appear in the spine, or be referenced, more than once, but the annotation/highlighting/bookmark/whatever-is-being-anchored-by-the-CFI has to apply to only the instance referenced by the user.

    As I recall, there was a lot of discussion during the development of the spec about the downside of having to create a referencing mechanism uniquely for epub, but at the time there weren’t any other viable solutions. It’s possible CFIs could be superseded in the future in the interests of better alignment with the OWP, but it would have to be by a technology that could at least provide the bulk of the functionality.

    The same is true for the use with OA. It was recognized that the OA model needed a method for referencing into zip containers, but the epub implementation specification wasn’t the place to develop it, so we persisted what is defined to work for the format. That could change in the future, though OA itself doesn’t use xpath for its selectors, either.

    1. Hi Matt, and thank you for the elaboration. No question, the CFI spec is there to use and discuss. But my question is about real-world adoption of CFI. Thank you for the excellent blog and quick response!

