EPUB Annotations

Annotations are kind of a weird thing to me, at least from a format perspective. Being able to create them is an integral part of the reading experience for many readers, no doubt, but technically they have nothing to do with the structure of an ebook itself. They’re more like a layer that lives on top of the format.

Seen in that light, it’s hard to argue that EPUB itself has to define an annotation framework. Leave it to the reading system developers to figure out annotations in EPUB or any other format they might support, right?

Of course, therein lies a big problem. Leave it to the vendors and you get proprietary implementations that can’t travel with your content across devices and apps. You also can’t distribute annotations separately from the content. That’s effectively the world we live in now: another brick in the wall of the walled gardens.

So, it’s not surprising that the IDPF has been working on a framework for annotations in EPUB, based on the W3C Open Annotation work. It walks an interesting line between presentation and storage, which is what I’m going to look at today.

I have to confess from the outset that I’m not a big writer of annotations. I hated writing on my books in university, and even if e-annotations are non-destructive there’s just something aesthetically displeasing about a book littered with markers, no matter how much you might try to minimize them.

But despite my proclivities, there are a number of interesting things going on in the new specification to satisfy the technophile in me. (Okay, and I’ve been helping edit it, too, so I kind of have an inside track.)

In and Out

One of the interesting aspects of the specification, as I was alluding to, is that it’s not prescriptive about what happens under the hood in reading systems — the focus is more on the exchange of annotations.

You’re not going to find a whole whack of reading system requirements on how to ingest, store, merge, sort, render or do whatever else needs to be done to maintain these things and show them in ebooks. Instead, the focus of the specification is on a format for representing collections of annotations.

In other words, the specification defines a way to import/export and exchange annotations in a standard way, but what the reading system does between import and export is its business.

Now, that doesn’t mean the developer can go all batsh#t crazy, as you still have to be able to export a collection compliant with the specification, but it does leave a lot of room for artistic interpretation.

All that’s really required is that HTML5 fragments be displayable, since those represent each annotation. Everything else could be translated coming in and going out.

JSON-LD Collections

But to get into the annotations themselves, the one big new change to the EPUB universe that they introduce is the use of JSON-LD for structuring the collections.

Not another XML format, in other words, which I’m sure will please some people and rankle others. It’s also yet another format, which will do the same. On the bright side for the rest of us, it’s a format you’re probably never going to see unless you build reading systems or are in the business of selling annotations.

I’m not in the camp that believes JSON is the perfect replacement for XML. It’s yet another structured format that, in most cases, is hard to write effectively by hand, fails in ways just as annoying as XML and can be a massive PITA to debug.

But today’s not a day for rants. JSON-LD is lightweight enough to be useful (i.e., it’s not RDF), and the bodies of the annotations are embedded HTML5 fragments, so the JSON part is mostly trappings.

The minimal core collection only looks like this, after all:

{
  "@context": "http://www.idpf.org/epub/oa/1.0/context.json",
  "@id": "http://example.org/epub/annotations.json",
  "@type": "epub:AnnotationCollection",
  "annotations": [
      /* annotation definitions here */
   ]
}

The properties will need some explaining, but it isn’t terribly complex stuff: a plain JSON object with a few properties. The annotations array is where you define each individual annotation, which we’ll see later.

Even if you aren’t terribly familiar with JSON, the key/value pairings ("key": "value") aren’t that hard to grasp. The @ symbols at the start of the names might seem complex and spooky (why not on annotations?), but they just indicate that the property is a core keyword in JSON-LD versus a property defined specifically for the format (see the JSON-LD core keywords).

To speed things along a bit, you don’t have to worry about @context unless you’re interested in the inner workings of linked data. It’s just a reference to the default context for EPUB annotations, which defines fun things like default prefix mappings and property name shorthands.

The @id property probably speaks for itself: it’s the unique identifier, like for EPUB publications. The only thing to be aware of is that the value has to be universally unique to avoid collisions with other annotation collections.

And the @type property is another of those things you just have to specify. It identifies that the document contains a collection of EPUB annotations (Open Annotation isn’t specific to EPUB).

You can add a lot more metadata than shown above, like a title for the collection, who authored/published it, when it was last modified and so on: the kind of standard fare you specify for EPUB publications in the package document. Appendix A of the specification has piles of useful examples; see the third for an interesting set of metadata properties.
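
To give a feel for how little the core collection demands, here is a minimal Python sketch that loads a collection and checks for the four properties in the example above. The helper name check_collection and the idea of returning a list of problems are my own assumptions for illustration; the specification doesn't define any such API, and a real validator would check far more.

```python
import json

# The four properties present in the minimal core collection example.
REQUIRED = ("@context", "@id", "@type", "annotations")

def check_collection(text):
    """Return a list of problems in a collection document (hypothetical helper)."""
    doc = json.loads(text)
    problems = [f"missing {name}" for name in REQUIRED if name not in doc]
    if "@type" in doc and doc["@type"] != "epub:AnnotationCollection":
        problems.append("@type is not epub:AnnotationCollection")
    if not isinstance(doc.get("annotations", []), list):
        problems.append("annotations is not an array")
    return problems

sample = """{
  "@context": "http://www.idpf.org/epub/oa/1.0/context.json",
  "@id": "http://example.org/epub/annotations.json",
  "@type": "epub:AnnotationCollection",
  "annotations": []
}"""
print(check_collection(sample))
```

Note that the sample uses an empty annotations array rather than the comment placeholder from the example, since comments aren't valid JSON.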

Annotations

I have a feeling this blog post could get really bogged down in details if I try to get into too many specifics, so I’m going to aim at only a cursory explanation of the annotations in the collection.

Similar to the main collection, each annotation is a JSON object in the annotations array, and consists of metadata properties, a body and a target (what the annotation is attached to in the document).

Here’s a basic example from the specification:

{
   "@id": "urn:uuid:E7E3799F-3CD5-4F69-87C6-5478B22873D6",
   "@type": "oa:Annotation",
   "hasTarget": {
      "@type": "oa:SpecificResource",
      "hasSource": {
         "@id": "http://www.example.org/ebooks/A1B0D67E-2E81-4DF5/v2.epub", 
         "@type": "dctypes:Text"
      }
   },
   "hasBody": {
       "@type": "dctypes:Text",
       "format": "application/xhtml+xml",
       "chars": "<div xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'>I love Alice in Wonderland</div>",
       "language": "en"
   },
   "motivatedBy": "oa:commenting"
}

Once again, there’s a required @id and @type. The identifiers for annotations also have to be universally unique, and the type of object is another default value that you just have to be sure to add. Things get a lot more interesting with the hasTarget and hasBody properties.

But as the order of the properties isn’t important, I’ll jump to the trailing, and simpler to explain, motivatedBy property before looking at the meatier hasBody and hasTarget properties.

In a nutshell, it identifies the nature of the annotation. In most cases you’ll probably just plug in oa:commenting for general comments, but the Open Annotation spec defines a number of other types of motivations for annotations, such as questioning, replying and highlighting.

There’s even an “editing” value if you decide annotations are a great way to work on manuscripts.

Annotation Body

Getting to the heart of an annotation, let’s look at hasBody first:

   "hasBody": {
       "@type": "dctypes:Text",
       "format": "application/xhtml+xml",
       "chars": "<div xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'>I love Alice in Wonderland</div>",
       "language": "en"
   }

While you might expect a simple string of HTML5 markup to be the value of the property, you can see in this example that there are actually several pieces of information you have to specify. The @type here is set to dctypes:Text by default. This value has fooled some people into believing the annotations themselves can only be text (e.g., no images, video, markup), but that’s not the case. The property is more of a hint to what is contained in the chars property — a string of text data.

EPUB limits the annotation body to an HTML5 fragment, but Open Annotation is more liberal, so the chars property could contain image data, for example, and the @type would then be dctypes:Image. You don’t need to worry about what other types of data an annotation can hold at this time, though, since no others are allowed.

The format property more directly identifies the nature of the content of the chars property; in the case of EPUB annotations, it’s the XHTML media type. Again, no other type is allowed in version 1.0, but these properties are necessary for future expansion, so you just have to get in the habit of setting them.

The chars property contains the HTML5 fragment. Note that it has to be an element fragment of HTML5 Flow content, so you can’t have a bare text string (i.e., it would have to be wrapped in a span or div).

The HTML fragment can reference other resources, like audio and video clips.

You only need to be aware that script elements are not allowed in the body. The Open Annotation working group still has to develop a security model for scripted annotations, and the EPUB working group has no ambition to try and jump the gun on that work. Scripted annotations are probably a bit of a niche case, too, but I’ve had to eat my words before.
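
The two body rules above (a single element rather than bare text, and no script elements) are easy enough to check mechanically. Here is a rough Python sketch using the standard library XML parser; the function name body_problems is hypothetical, and the check only covers well-formedness and scripts, not whether the markup is valid HTML5 Flow content.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

def body_problems(chars):
    """Check an annotation body string (hypothetical helper, not from the spec)."""
    try:
        # Parsing fails on bare text and on more than one top-level element.
        root = ET.fromstring(chars)
    except ET.ParseError:
        return ["body is not a single well-formed element"]
    # iter() walks the root element and every descendant.
    for element in root.iter():
        if element.tag in ("{%s}script" % XHTML_NS, "script"):
            return ["script elements are not allowed"]
    return []

body = "<div xml:lang='en' xmlns='http://www.w3.org/1999/xhtml'>I love Alice in Wonderland</div>"
print(body_problems(body))
```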

The language property is another that speaks for itself, so let’s move on.

Annotation Targets

Targeting where the annotation applies is where complexity starts to creep in. There are two parts to the process: identifying the publication and then identifying the location within it.

Both aspects are expressed in a hasTarget object:

"hasTarget": {
  "@type": "oa:SpecificResource",
  "hasSource": {
    "@type": "dctypes:Text",
    "uniqueIdentifier": "urn:isbn:123456789x",
    "dc:identifier": "urn:uuid:A1B0D67E-2E81-4DF5-9E67-A64CBE366809",
    "dcterms:modified": "2011-01-01T12:00:00Z"
  },
  "hasSelector": {
    "@type": "oa:FragmentSelector",
    "value": "epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)"
  }
}

(I’m going to disregard @type, as it’s another example of a required value that doesn’t change.)

Sources

The hasSource object is where the real fun begins.

Ignoring its @type property, which is another required value, the next three properties should look familiar if you know EPUB package metadata. We’ve got a unique identifier and a dcterms:modified date, which together are the release identifier for an EPUB rendition, plus a second identifier for redundancy checking.

As you’d expect, these values can be matched against the EPUB metadata by a reading system to determine whether the annotation applies to it or not. In this case, you could match the annotation up to a specific release of the publication regardless of which vendor it comes out of.

Not terribly complicated stuff, but maybe you don’t want to be that specific. In that case, maybe you drop the modified date to allow the annotation to match any release with the ISBN. Or if you want to match any version regardless of its identifiers, you could specify only its title:

  "hasSource": {
    "@type": "dctypes:Text",
    "dc:title": "The Odyssey"
  },

If you don’t want to leave the specificity to chance, you can also add a specificityLevel property:

"hasTarget": {
  "@type": "oa:SpecificResource",
  "specificityLevel": "publication",
  "hasSource": {
    
  },
  
}

In the above case, even if I include a unique identifier and last modified date in hasSource, the specificity level is a signal to the reading system that it can ignore the last modified date and apply the annotation to all publications with the specified unique identifier.
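
To make the matching logic concrete, here is a small Python sketch of how a reading system might compare a target against a publication. Everything about it is an assumption on my part: the function name source_matches, the package dict holding metadata already read from the package document, and the rule that only properties present in hasSource get compared. The one behaviour taken from the text above is that a "publication" specificity level means the modified date is ignored.

```python
def source_matches(target, package):
    """Decide whether an annotation target applies to a publication.

    `package` is a hypothetical dict of metadata the reading system has
    already read from the package document, keyed like hasSource.
    """
    source = target.get("hasSource", {})
    keys = ["uniqueIdentifier", "dc:identifier", "dcterms:modified", "dc:title"]
    if target.get("specificityLevel") == "publication":
        # The annotation says any release will do, so skip the date.
        keys.remove("dcterms:modified")
    # Compare only the properties the annotation actually specifies.
    compared = [key for key in keys if key in source]
    # Assumption: a source with nothing to compare matches nothing.
    return bool(compared) and all(source[k] == package.get(k) for k in compared)

package = {
    "uniqueIdentifier": "urn:isbn:123456789x",
    "dcterms:modified": "2012-06-01T00:00:00Z",
    "dc:title": "The Odyssey",
}
target = {
    "hasSource": {
        "uniqueIdentifier": "urn:isbn:123456789x",
        "dcterms:modified": "2011-01-01T12:00:00Z",
    }
}
print(source_matches(target, package))
```

With the dates differing, the target above doesn't match; adding "specificityLevel": "publication" to it makes the same target match.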

There are two properties for identifying the source that I’m going to skim by: @id and originURL. Mostly because I’m still not fully sure how they would work in the real world, or how useful they are now, at least.

The @id specifies the URI identifier for the book, but how many have those? (More a linked data world need.) The originURL I’m less sure I understand at this point. What is a product information page URL really going to do, for example? I’m hoping we can clarify these a little further in an upcoming release. I don’t take my lack of attention to those debates as indicative of the properties not having value, only that the prose could be fleshed out.

Locations

Identifying the location is a little more routine, as it just involves an EPUB CFI:

  "hasSelector": {
    "@type": "oa:FragmentSelector",
    "value": "epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)"
  }

(See my earlier posts on CFIs if you don’t understand them: part one covers the path basics and part two some of the more advanced ways you can specify locations.)
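
If it helps to see how the selector value breaks down, here is a deliberately simplified Python sketch that splits a CFI like the one above into the part that walks the package document, the part that walks the content document, and the trailing character offset. The function name split_cfi is hypothetical, and the sketch assumes a single "!" and at most one ":"; it makes no attempt to handle ranges or escaped characters.

```python
import re

def split_cfi(value):
    """Split an epubcfi(...) string into its parts (simplified, hypothetical helper)."""
    match = re.fullmatch(r"epubcfi\((.*)\)", value)
    if match is None:
        raise ValueError("not an EPUB CFI")
    path = match.group(1)
    # A trailing ":n" is a character offset within the final node.
    path, _, offset = path.partition(":")
    # "!" marks the step from the package document into the content document.
    package_part, _, content_part = path.partition("!")
    return package_part, content_part, int(offset) if offset else None

print(split_cfi("epubcfi(/6/4[chap01ref]!/4[body01]/10[para05]/3:10)"))
# → ('/6/4[chap01ref]', '/4[body01]/10[para05]/3', 10)
```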

One of the limitations of CFIs that annotations expose is the lack of specificity when you have multiple renditions of the content. EPUB CFIs assume only one rendition of the content, but what if I bundle both fixed layout and reflowable renditions of the content? By default, the CFI applies to whichever one is listed in the first rootfile element in the container.xml file.

Open Annotation has a mechanism to work around this, fortunately, although it’s only useful for annotations. If you are in a situation where you have to target a rendition other than the default, you can specify the path to the OPF file in the hasState property:

  "hasTarget": { …  },
  "hasState": {
    "@type": "epub:RenditionState",
    "opfPath": "/opfs/content.opf"
  }
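
Here is how that fallback might look in a reading system, as a minimal Python sketch: use the opfPath from the state object when there is one, otherwise take the first rootfile in container.xml. The function name package_path and the choice to pass the state object in directly are my assumptions, and the sample container is made up for the example.

```python
import xml.etree.ElementTree as ET

CONTAINER_NS = {"c": "urn:oasis:names:tc:opendocument:xmlns:container"}

def package_path(container_xml, has_state=None):
    """Pick the package document a CFI should be resolved against (hypothetical helper)."""
    if has_state and has_state.get("opfPath"):
        # The annotation explicitly names a rendition.
        return has_state["opfPath"]
    root = ET.fromstring(container_xml)
    # find() returns the first match in document order: the default rendition.
    first = root.find("c:rootfiles/c:rootfile", CONTAINER_NS)
    return first.get("full-path")

container = """<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="reflow/package.opf" media-type="application/oebps-package+xml"/>
    <rootfile full-path="opfs/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>"""

print(package_path(container))
print(package_path(container, {"@type": "epub:RenditionState", "opfPath": "/opfs/content.opf"}))
```

The sketch returns the opfPath string untouched; how a leading slash should be reconciled with the container's relative paths is something I'd check against the specification rather than guess.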

There are limitations to CFIs beyond the complexity of authoring, like the effects of a scripted DOM, so they’re not the most useful solution in all cases. The Open Annotation specification includes its own content selectors, but how to integrate these with EPUB remains an issue to be solved in a future release or version.

Annotation Audience

The last thing I’ll look at as far as annotations go is who the annotation is intended for.

I mentioned teacher’s guides at the outset, but other than putting “teacher guide” in the title of the collection, how do you go about adding this information for machine processing?

Enter the audience property:

"audience" : [
   {
      "@type" : "schema:EducationalAudience",
      "schema:educationalRole" : "teacher"
   },  
   {
      "@type" : "schema:Audience",
      "schema:audienceType" : "scientist"
   }
],

(The property name is just a shorthand for "dc:audience", if you’re into the initial context stuff I touched on back in the collections section.)

Here the schema.org EducationalAudience type is used with the educationalRole property to identify that the annotation is intended for a teacher. (See Appendix D.1.1 of the EDUPUB profile for a list of recommended values to use with this property.)

The second declaration uses the more general Audience type with the audienceType property to additionally indicate that it is for scientists.

I can’t say that audiences are widely identified in reading systems, but one designed specifically for schools might identify classes of users. In that case, the annotations could be controlled to the type of reader. EDUPUB will probably push growth in this area of identification.
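
As a sketch of what "controlled to the type of reader" could mean in practice, here is a short Python filter that keeps only the annotations aimed at a given educational role. The function name for_role is hypothetical, and so are two assumptions baked into it: that the audience property sits on each annotation, and that an annotation with no audience is meant for everyone.

```python
def for_role(annotations, role):
    """Keep annotations aimed at `role`, plus those with no audience (hypothetical helper)."""
    kept = []
    for annotation in annotations:
        audiences = annotation.get("audience", [])
        roles = {a.get("schema:educationalRole") for a in audiences}
        # Assumption: no audience declared means the annotation is for everyone.
        if not audiences or role in roles:
            kept.append(annotation)
    return kept

notes = [
    {"@id": "a", "audience": [
        {"@type": "schema:EducationalAudience", "schema:educationalRole": "teacher"}]},
    {"@id": "b"},
    {"@id": "c", "audience": [
        {"@type": "schema:Audience", "schema:audienceType": "scientist"}]},
]
print([note["@id"] for note in for_role(notes, "teacher")])
# → ['a', 'b']
```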

You can get there from here

The last question I’ll look at (I promise!) is how do these annotations move around, since I’m claiming one of the primary functions of the format is interchange.

Fortunately for your reading attention span, the answer is pretty straightforward.

If you want to distribute annotations with an EPUB, you only have to include the annotation document and any local resources it references in the EPUB container. The annotation media type is "application/ld+json;profile=http://www.idpf.org/epub/oa/1.0/". (This media type will be made valid in a future update to epubcheck, but is invalid right now.)

If you want to distribute annotations, or move them around, outside of an EPUB, you just zip them up. There are some requirements, like including the annotation document in the root of the container with the name "annotation.json", but again no rocket science involved.
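
For the standalone case, a minimal Python sketch of the zipping step might look like this. It only covers the one requirement mentioned above (the annotation document at the root, named annotation.json); the function name pack_annotations and the resources dict of extra files are my own assumptions, and the specification's other packaging requirements aren't handled.

```python
import io
import json
import zipfile

def pack_annotations(collection, resources=None):
    """Return zip bytes with the collection at the root as annotation.json (hypothetical helper)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("annotation.json", json.dumps(collection, indent=2))
        # Any local resources the annotations reference travel alongside.
        for name, data in (resources or {}).items():
            archive.writestr(name, data)
    return buffer.getvalue()

collection = {
    "@context": "http://www.idpf.org/epub/oa/1.0/context.json",
    "@id": "http://example.org/epub/annotations.json",
    "@type": "epub:AnnotationCollection",
    "annotations": [],
}
data = pack_annotations(collection)
print(zipfile.ZipFile(io.BytesIO(data)).namelist())
# → ['annotation.json']
```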


I probably should just admit to myself that I’m writing tutorials and not short posts, as this turned into kind of a three hour tour.

But hopefully this will make the spec a little easier to read and understand when you go looking for more information, as there’s still lots more you can do than is explained above (annotating annotations, for one).

And this is another case of a specification in progress. If you don’t like something, want a missing piece added, etc., there’s still plenty of time to comment on the draft.

May all your annotations be briefer than this post!

One Reply to “EPUB Annotations”

  1. Hi Matt,

    I believe if IDPF decided to follow W3C OA route, the EPUB Annotations spec should strive to adopt any of its features which are already there in W3C OA. I believe EPUB Annotations selectors should be more closely aligned with W3C work on OA selector or use XPath model to have any chance of good implementation.

    Pavel.
