Always wanted to put more than one rendering of your publication in a single EPUB file but couldn’t figure out how to? Never fear, here comes EPUB Multiple-Rendition Publications 1.0 to save the day!
Now that the spec is finally wending its way to recommendation, I figured it was a good time to get back to the blog and have a look at how it works.
Adding more than one rendition to the EPUB container has been possible for as long as I’ve been aware of the format but (almost) no one was taking advantage of the feature. Back in the EPUB 2.0 days, there wasn’t even a restriction on what other content you could toss in. Want to add a PDF with your EPUB, okeydoke. Docbook, too? Why not!
Those were heady days, but a lot of the openness didn’t make a whole lot of sense unless you consider OCF as a standalone packaging format. If you’re using it to send EPUB content to a consumer, sending PDFs and Docbook and other formats to their reading system falls out of scope fast.
I’m not here to argue which view of the container does or doesn’t make sense, but during the development of EPUB 3.0 it was decided to start toning down the free-for-all nature and require that the EPUB container only contain one or more EPUB versions of the same publication. (Okay, I don’t see the harm in other formats if you really want to do it for some reason, but c’est la vie.)
What remains is the ability to optimize a single publication for different renderings: French and English versions, for example, or a fixed-layout version with a reflowable one.
All good, but the reason for the history lesson is that EPUB 3.0 still didn’t quite go far enough. It placed parameters on what you could put in the container but didn’t standardize how to differentiate or select what was in the container. That’s where the Advanced/Hybrid Layouts working group stepped in to fill the void, and voila, we now have the multiple-renditions specification.
If you take a read of that specification, you’ll see that there are three distinct parts to it:
- the first deals with metadata for the publication;
- the second deals with selecting between the different renditions in the container;
- the third deals with how to automatically jump someone from one rendition to the same point in another.
That kind of nice symmetry lends itself well to a blog breakdown, amazingly enough. But I’m going to take things a little out of order, as the selection attributes and mapping documents are the key new technologies to be aware of. I’ll leave metadata to the end as only wonks like me read that stuff. :)
I’m not going to spend time introducing the container.xml file. In a nutshell, it’s an XML file that consists of one or more rootfile elements that tell a reading system where in the container to find the package files for the various renditions.
For example, say we included two renditions in a container, one a fixed-layout rendition and the other a reflowable. Prior to the multiple renditions specification, this is what your container file would look like:
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package-fxl.opf" media-type="application/oebps-package+xml"/>
    <rootfile full-path="EPUB/package-flw.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
Other than a little suffix jiggery-pokery I added to differentiate the package file names (which in itself might not be apparent to all humans, let alone machines), there’s nothing to differentiate one rootfile element from the other.
If a reading system (or any processing tool) wants to figure out what makes one rendition different from the other, it would have to read both the referenced package documents and do a comparison. But that doesn’t guarantee it will correctly figure out the difference, and it’s not terribly efficient, either.
And what happens if you want to express a more subtle difference, like that one rendition is optimized for colour screens and another black and white, or that one is for larger screens and another for phones?
To solve this problem of selecting from the available renditions, the specification introduces four attributes that you can attach to the rootfile element:
- rendition:layout indicates if the given rendition is fixed layout or reflowable using the values “pre-paginated” or “reflowable”, respectively.
- rendition:media takes a media query that expresses the nature of the rendition (e.g., orientation, max-width, color).
- rendition:language expresses the primary language of the rendition (e.g., a bilingual dictionary might have two languages expressed in the package metadata, but only one is the primary language of the rendition).
- rendition:accessMode is an accessibility feature that describes the nature of the content (e.g., is it image-only with no text, text-based, aural). It takes one of the values “visual”, “textual”, “auditory” or “tactile”.
There’s also a fifth attribute called rendition:label, but as you might guess from the name it’s for labelling renditions. I’ll return to it later.
Setting the attributes
Going back to the earlier example, we can now add the rendition:layout attribute to each rootfile element to distinguish them in a machine-readable way (note the rendition namespace declaration on the container element; you have to set it to use the attributes):
<container ... xmlns:rendition="http://www.idpf.org/2013/rendition">
  <rootfiles>
    <rootfile full-path="EPUB/package-fxl.opf" rendition:layout="pre-paginated" .../>
    <rootfile full-path="EPUB/package-flw.opf" rendition:layout="reflowable" .../>
  </rootfiles>
</container>
That’s really all there is to using rendition selection attributes.
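For the tool builders in the audience, here is a minimal sketch of how a processor might pull those attributes back out of container.xml. It uses Python’s standard ElementTree parser; the function name `read_renditions` and the sample container are my own inventions for illustration, not anything the specification defines.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
RENDITION_NS = "http://www.idpf.org/2013/rendition"

# Hypothetical sample container, modelled on the example above.
SAMPLE = """
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
           xmlns:rendition="http://www.idpf.org/2013/rendition">
  <rootfiles>
    <rootfile full-path="EPUB/package-fxl.opf"
              media-type="application/oebps-package+xml"
              rendition:layout="pre-paginated"/>
    <rootfile full-path="EPUB/package-flw.opf"
              media-type="application/oebps-package+xml"
              rendition:layout="reflowable"/>
  </rootfiles>
</container>
"""


def read_renditions(container_xml: str) -> List[Dict[str, str]]:
    """Return one dict per rootfile: its path plus any rendition:* attributes."""
    root = ET.fromstring(container_xml)
    prefix = "{" + RENDITION_NS + "}"
    renditions = []
    for rootfile in root.iter("{" + OCF_NS + "}rootfile"):
        info = {"full-path": rootfile.get("full-path", "")}
        for name, value in rootfile.attrib.items():
            # Keep only attributes in the rendition namespace, minus the prefix.
            if name.startswith(prefix):
                info[name[len(prefix):]] = value
        renditions.append(info)
    return renditions


if __name__ == "__main__":
    for rendition in read_renditions(SAMPLE):
        print(rendition)
```

The result is a plain list in container order, so the first entry is always the default rendition.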
The language attribute is equally straightforward, but, if you love examples, here’s how you could specify English and French renditions of a publication:
<rootfile full-path="EPUB/package-en.opf" rendition:language="en" .../>
<rootfile full-path="EPUB/package-fr.opf" rendition:language="fr" .../>
The accessMode attribute provides an accessibility boost. Indicating that a rendition is fixed layout, for example, doesn’t indicate if it is image-based (text is just pixels in an image) or XHTML-based (text content can be read by assistive technologies):
<rootfile full-path="EPUB/package-fxl-v.opf" rendition:layout="pre-paginated" rendition:accessMode="visual" .../>
<rootfile full-path="EPUB/package-fxl-t.opf" rendition:layout="pre-paginated" rendition:accessMode="textual" .../>
You could also use the attribute to distinguish a textual rendition, where braille would be generated on the fly, from a tactile rendition optimized for reading on a refreshable braille device:
<rootfile full-path="EPUB/package.opf" rendition:accessMode="textual" .../>
<rootfile full-path="EPUB/package-brl.opf" rendition:accessMode="tactile" .../>
There are interested parties also working on how to distinguish different tactile renditions, potentially using the language code, but at this time the work is still preliminary.
If you’re more of a minimalist in your markup, note that you don’t even have to specify any attributes on the default rendition (the first rootfile element):
<rootfiles>
  <rootfile full-path="EPUB/package-fxl.opf" .../>
  <rootfile full-path="EPUB/package-flw.opf" rendition:layout="reflowable" .../>
</rootfiles>
The processing rules for picking the appropriate rendition say that if you reach the default rendition without making a match, use it.
I’m not really a fan of minimalism, however, as it’s possible that a reading system might not make a perfect match but get an approximate match, in which case it might have to decide without any information about the default rendition whether a partial match is best. (But that’s more of a niche case, only really applicable if you have three or more renditions in the container with multiple attributes, and especially when making use of complex multi-part media queries.)
You’re not limited to using only one attribute, either, as I just let slip in that last aside. Say you have two fixed-layout renditions and a reflowable one, with one of the fixed-layout renditions optimized for portrait and the other for landscape display. Here’s how you could express those:
<rootfile full-path="EPUB/package-flw.opf" rendition:layout="reflowable" .../>
<rootfile full-path="EPUB/package-fxl-l.opf" rendition:layout="pre-paginated" rendition:media="(orientation: landscape)" .../>
<rootfile full-path="EPUB/package-fxl-p.opf" rendition:layout="pre-paginated" rendition:media="(orientation: portrait)" .../>
If you’re going to use media queries, however, one quirk to be aware of is that you can only use the media type “all” with them. For example, if you want to write a media query using “not” to target all non-colour viewports, you have to specify the media type:
<rootfile full-path="EPUB/package-flw.opf" rendition:media="not all and (color)".../>
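To make the fallback rule mentioned earlier a bit more concrete (use the default rendition if you reach it without a match), here is a rough Python sketch of a selection routine. Treat it as my reading of the processing model rather than a reference implementation: I’m assuming renditions are checked from the last rootfile back toward the first, that every attribute declared on a rendition has to be satisfied, and that an unset preference never matches. The names `select_rendition`, `matches` and the `media_matches` callback are hypothetical, and real media query evaluation is left to whatever the reading system already has.

```python
from typing import Callable, Dict, List

Rendition = Dict[str, str]

# Attributes compared directly against a user preference of the same name.
# Assumption: simple string equality (real language matching is fuzzier).
PREFERENCE_ATTRIBUTES = ("layout", "language", "accessMode")


def matches(rendition: Rendition, prefs: Dict[str, str],
            media_matches: Callable[[str], bool]) -> bool:
    """True if every selection attribute declared on the rendition is satisfied."""
    for name in PREFERENCE_ATTRIBUTES:
        if name in rendition and prefs.get(name) != rendition[name]:
            return False
    # Media queries are delegated to the caller-supplied evaluator.
    if "media" in rendition and not media_matches(rendition["media"]):
        return False
    return True


def select_rendition(renditions: List[Rendition], prefs: Dict[str, str],
                     media_matches: Callable[[str], bool]) -> Rendition:
    """Pick a rendition; renditions are in container order, first is the default."""
    # Check the non-default renditions from last to first.
    for rendition in reversed(renditions[1:]):
        if matches(rendition, prefs, media_matches):
            return rendition
    # Nothing matched before reaching the default, so use it.
    return renditions[0]


if __name__ == "__main__":
    renditions = [
        {"full-path": "EPUB/package-flw.opf", "layout": "reflowable"},
        {"full-path": "EPUB/package-fxl-l.opf", "layout": "pre-paginated",
         "media": "(orientation: landscape)"},
        {"full-path": "EPUB/package-fxl-p.opf", "layout": "pre-paginated",
         "media": "(orientation: portrait)"},
    ]
    # Pretend the device is currently in landscape.
    in_landscape = lambda query: query == "(orientation: landscape)"
    print(select_rendition(renditions, {"layout": "pre-paginated"}, in_landscape))
```

Note that the default rendition is never tested at all in this sketch, which is exactly why leaving its attributes off is legal.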
The next question to look at is when and how these attributes are used by a reading system. Naturally, they provide the ability to select a rendition when you first open a publication, but they also allow the reading system to react to changes in state, and allow users to manually change whenever they desire. So who governs all this switching?
First thing to know is that there’s no requirement in the specification that a reading system try to pick an appropriate rendition for you, or that it automatically swap out one for another. How selection happens is going to be reading system-dependent, so it’s not a question I can answer. A reading system might only decide on a rendition automatically when you first load the publication. Or, it might change renditions each time a significant event such as a change in orientation occurs, or it might never change renditions for those kinds of events, or it might give the user the option to turn on or off what kind of changes trigger swapping.
The only thing that a reading system is expected to do is provide you the ability to select from any of the available renditions. That’s where the rendition:label attribute I mentioned earlier comes in. If you’re going to include more than one rendition, it’ll be wise to include a label, even if it’s as basic as stating in human terms that one rendition is fixed layout and the other reflowable:
<rootfile full-path="EPUB/package.opf" rendition:layout="reflowable" rendition:label="Reflowable" .../>
<rootfile full-path="EPUB/package-fxl.opf" rendition:layout="pre-paginated" rendition:label="Fixed-Layout" .../>
If you’re wondering where the reading system gets the information it needs to make some of these determinations, that’s also left to reading system developers to decide. The only attribute that is truly user-independent is media. With that one, you’re basically telling the reading system the appropriate rendition based on the capabilities of the device or some state it is in. Is it in landscape mode? Check. Does it have a max-width of X? Check. Any reading system can evaluate these to true or false, both when the publication opens and if you change the state somehow (e.g., switch from portrait into landscape).
The other three attributes are more dependent on user preference, which is always a nebulous thing. Whether I want to see a fixed layout version or reflowable could just be a reading preference, but could also change depending on the device I’m using or a reading disability I might have. The reading system might get my personal preference right for the general case, but fail when device characteristics also start to factor in.
Language preference is similarly nebulous and will depend on the nature of the work being read. If I’m trying to learn French, I may not want the English rendition, even though I normally read in English.
My point is only that there is only so much that a reading system can get right, and many of the selection criteria require knowing information about preferences that aren’t fixed in stone.
That’s why the specification only attempts to address how a reading system can pick renditions based on what were perceived to be common use cases for having multiple renditions in containers. It remains to be seen how the specification will evolve in the real world.
In terms of support now, which is always a question of interest, the only reading system I know that has some support is Readium, which you can test out in an experimental build. If you go to the settings menu in that demo you’ll find a tab for selecting renditions. It was done, as I understand, more as a proof of concept, but gives an idea of how selection could occur. When it will become part of the main build I don’t know.
Mapping documents are the companion to being able to select from different renditions, dealing instead with how you keep your place in the text. Say I turn my tablet from portrait to landscape mode. It’s great that there’s a rendition optimized for landscape mode, but what a nightmare if I get thrown back to page one and have to find my way back to where I was.
The mapping document was designed to allow the content producer to avoid such unfortunate reading experiences for users by mapping points in the publication across all the renditions. If a new rendition is loaded, in other words, the reading system can take the current location, look it up in the mapping document (or find the closest point listed) and then load the corresponding point in the new rendition.
I don’t want to spend a lot of time on mapping documents, however, only because it’s unlikely anyone will spend time trying to craft these by hand. In all likelihood, their creation will require automated tools.
But, for thoroughness, the mapping document gets listed in the container.xml file using the new link element mechanism that was added during the EPUB 3.0.1 revision:
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package-fxl.opf" media-type="application/oebps-package+xml"/>
    <rootfile full-path="EPUB/package-flw.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
  <links>
    <link href="mapping.html" rel="mapping" media-type="application/xhtml+xml"/>
  </links>
</container>
It doesn’t matter where you place the mapping document or what you call it. The only thing you need to know is not to list it in the manifest of any of the renditions in the container. Like the files in the META-INF directory, it is universal.
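Since the location and name are up to you, a processor has to discover the mapping document through that link element. Here is a small sketch of the lookup in Python; `find_mapping_document` and the sample markup are hypothetical names of mine, and I’m assuming the only thing that identifies the file is rel="mapping".

```python
import xml.etree.ElementTree as ET
from typing import Optional

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"

# Hypothetical sample container with a single mapping link.
SAMPLE = """
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="EPUB/package-fxl.opf" media-type="application/oebps-package+xml"/>
    <rootfile full-path="EPUB/package-flw.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
  <links>
    <link href="mapping.html" rel="mapping" media-type="application/xhtml+xml"/>
  </links>
</container>
"""


def find_mapping_document(container_xml: str) -> Optional[str]:
    """Return the href of the first link with rel="mapping", or None."""
    root = ET.fromstring(container_xml)
    # iter() searches at any depth, so the position of the link element
    # inside the container does not matter to this sketch.
    for link in root.iter("{" + OCF_NS + "}link"):
        if "mapping" in link.get("rel", "").split():
            return link.get("href")
    return None


if __name__ == "__main__":
    print(find_mapping_document(SAMPLE))
```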
The mapping document itself is represented as a very stripped-down XHTML file that uses a nav element to contain the mapping list. Here’s a little piece of an example from the specification:
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <head>
    <meta charset="utf-8"/>
  </head>
  <body>
    <nav epub:type="resource-map">
      <ul>
        <li><a href="../../en/en.opf#epubcfi(/6/2)"/></li>
        <li><a href="../../fr/fr.opf#epubcfi(/6/4)"/></li>
        <li><a href="../../de/de.opf#epubcfi(/6/2)"/></li>
        <li><a href="../../es/es.opf#epubcfi(/6/6)"/></li>
        <li><a href="../../it/it.opf#epubcfi(/6/2)"/></li>
      </ul>
      ...
    </nav>
  </body>
</html>
As you probably noticed, the a tags all make use of EPUB canonical fragment identifiers. That’s the one big reason why I don’t see mapping documents being written by hand, as they’re beastly things to calculate. You don’t have to use EPUB CFIs, but they’re encouraged in the specification for technical reasons that have to do with the reading system being able to effectively sort and use the points.
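To give a feel for how a reading system might use the mapping document when swapping renditions, here is a simplified Python sketch. It assumes each ul in the nav groups the equivalent points across renditions, and it only handles exact matches on the fragment; finding the closest listed point would mean actually comparing CFIs, which I’m skipping. The function names and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

XHTML_NS = "http://www.w3.org/1999/xhtml"

# Hypothetical two-rendition mapping document with two mapped points.
SAMPLE = """
<html xmlns="http://www.w3.org/1999/xhtml" xmlns:epub="http://www.idpf.org/2007/ops">
  <body>
    <nav epub:type="resource-map">
      <ul>
        <li><a href="en/en.opf#epubcfi(/6/2)"/></li>
        <li><a href="fr/fr.opf#epubcfi(/6/4)"/></li>
      </ul>
      <ul>
        <li><a href="en/en.opf#epubcfi(/6/4)"/></li>
        <li><a href="fr/fr.opf#epubcfi(/6/8)"/></li>
      </ul>
    </nav>
  </body>
</html>
"""


def build_mapping(mapping_xhtml: str) -> List[Dict[str, str]]:
    """Return one dict per ul, mapping package path to its fragment."""
    root = ET.fromstring(mapping_xhtml)
    groups = []
    for ul in root.iter("{" + XHTML_NS + "}ul"):
        group = {}
        for a in ul.iter("{" + XHTML_NS + "}a"):
            # Split "path/to.opf#fragment" into its two halves.
            package, _, fragment = a.get("href", "").partition("#")
            group[package] = fragment
        groups.append(group)
    return groups


def switch_location(groups: List[Dict[str, str]], from_package: str,
                    from_fragment: str, to_package: str) -> Optional[str]:
    """Find the point equivalent to from_fragment in the target rendition."""
    for group in groups:
        if group.get(from_package) == from_fragment:
            return group.get(to_package)
    return None  # No exact match; a real system would look for the nearest point.


if __name__ == "__main__":
    groups = build_mapping(SAMPLE)
    print(switch_location(groups, "en/en.opf", "epubcfi(/6/4)", "fr/fr.opf"))
```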
But that’s probably enough on mapping documents for the faint of heart. For more detail, you can dig into the relevant section of the specification.
For a section that turned out pretty minimal in the end, let me attest that the problem of publication metadata can be quite a pain in the rear end. At one time, we had pretty big ambitions in terms of minimizing the repetition of metadata across renditions, since most renditions will share a lot of the same metadata, but the practical reality of EPUB production and consumption that we were bound to, and by, led to a much watered-down final product.
But before launching into a long explanation of the universe, for those getting tired of reading, all that is required for multiple renditions is to include a release identifier.
If you’re at all familiar with metadata in the package file, you already know that the release identifier for any given rendition is the unique identifier plus the last modified time. This value was designed to allow you to compare two publications and determine if they contain the same content, and, if so, which one is newer and which older. Great stuff, but written at a time when everyone considered that an EPUB publication would contain only one rendition of the content.
It doesn’t make sense to look at any single rendition and assume that you can effectively compare two different publications to determine if they are the same, as different renditions can (and some would argue always should) have different identifiers. What then if a rendition is removed, for example, or what if the order of the renditions changes? If you look into the first rendition listed in the container.xml file, you may or may not see the same identifier even though the overall publication is the same.
Long story short, EPUB needed a release identifier that wasn’t dependent on looking into the available renditions. To do that, we simply needed to translate package metadata up to the metadata.xml file in the META-INF directory, which is exactly what we did. The multiple-renditions specification makes it valid to use package metadata elements in that file, and requires a release identifier.
So, in markup, your most basic metadata.xml file will look something like this:
<metadata xmlns="http://www.idpf.org/2013/metadata" xmlns:dc="http://purl.org/dc/elements/1.1/" unique-identifier="uid" version="3.0">
  <dc:identifier id="uid">urn:uuid:f134e1db-a879-4660-954b-876bd3b78caa</dc:identifier>
  <meta property="dcterms:modified">2015-08-06T15:45:33Z</meta>
</metadata>
You’ll notice that the unique-identifier and version attributes from the package element are specified on the root metadata tag. Since there’s no package element in this file, it was the only place we could put them. Otherwise, it’s just what you’d expect to find in a package file.
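If you want to see the release identifier actually assembled, here is a short Python sketch that reads a metadata.xml like the one above and joins the unique identifier and the modified time with an “@”, the same way the package-level release identifier is formed. The function name `release_identifier` is my own, and the sketch assumes both pieces of metadata are present (it will raise StopIteration if either is missing).

```python
import xml.etree.ElementTree as ET

META_NS = "http://www.idpf.org/2013/metadata"
DC_NS = "http://purl.org/dc/elements/1.1/"

SAMPLE = """
<metadata xmlns="http://www.idpf.org/2013/metadata"
          xmlns:dc="http://purl.org/dc/elements/1.1/"
          unique-identifier="uid" version="3.0">
  <dc:identifier id="uid">urn:uuid:f134e1db-a879-4660-954b-876bd3b78caa</dc:identifier>
  <meta property="dcterms:modified">2015-08-06T15:45:33Z</meta>
</metadata>
"""


def release_identifier(metadata_xml: str) -> str:
    """Build "<unique identifier>@<last modified>" from a metadata.xml string."""
    root = ET.fromstring(metadata_xml)
    # The root's unique-identifier attribute points at a dc:identifier by id.
    uid_ref = root.get("unique-identifier")
    uid = next(
        (el.text or "").strip()
        for el in root.iter("{" + DC_NS + "}identifier")
        if el.get("id") == uid_ref
    )
    modified = next(
        (el.text or "").strip()
        for el in root.iter("{" + META_NS + "}meta")
        if el.get("property") == "dcterms:modified"
    )
    return uid + "@" + modified


if __name__ == "__main__":
    print(release_identifier(SAMPLE))
```

Two copies of a publication with the same identifier half are the same publication, and the timestamp half tells you which copy is newer.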
So why not also add the author(s), title(s), language(s) here and get away from having to repeat the metadata across all the renditions? Well, as I started to mention above, it’s not a trivial task.
For one, the AHL group wasn’t in a position to radically change EPUB. Every package document has required metadata that must be specified in it (identifier, title and language). For compatibility with all known reading systems, current and past, the first rendition in the container also needs to include a complete set of metadata for use in bookshelves, etc.
So, given that we can’t take metadata out of the default rendition, and we have potential required overlap, we then had to look at how metadata could reasonably be inherited to form an accurate picture of the publication. The result was a lot of gear grinding that only ended with such a complicated, and fragile, set of rules that we decided it was better not to pursue inheritance.
If you’re really interested in the details, consider something as simple as the title. If you define it in the metadata.xml file, what does it mean when you find a title in the package document? Do you treat them as two different titles to be joined? Does one override the other?
The only way to solve the problem of whether to inherit or override was to start making rules like if two elements have the same ID, one cancels out the other. If you’re combining titles, on the other hand, which one goes first? If, when combining, the metadata in metadata.xml takes priority, what effect might that have elsewhere? Also, when the default rendition has to include a complete set of metadata, what is the likelihood that errors will slip through the cracks leading to metadata “duplication” where it wasn’t intended?
For that reason, and probably other reasons I’m forgetting, you can include any other metadata you want in the publication-level metadata but there’s no expectation that the reading system use it to generate a picture of any given rendition. Not the most elegant solution, but the most practical given the constraints we faced. Maybe a happier EPUB 4.0 day will unshackle us from the burden of backward compatibility, but until then…