A question I get asked a lot is why numbered headings matter in EPUB. HTML5 defined a fancy new algorithm that can correct your headings after all, right? (Albeit in weird and wonderful ways if you don’t follow every nuance!)
The answer to the HTML5 question I’ll get to, but I can never seem to stress enough that getting headings right in your documents is a key component of their overall accessibility. There are two primary ways that readers will move through your ebook at the markup level (or DOM/accessibility tree level, to be specific).
One method is the table of contents, but while it provides the structure of the document, constantly having to open it to move a section or two ahead or back is not the most user-friendly approach to navigation.
Navigation by table of contents has also suffered in EPUBs for two reasons: 1) the complete table of contents was often not provided in EPUB 2 NCX documents; and 2) when opening the table of contents, you’re typically presented with the first entry each time (i.e., the table of contents doesn’t get set to the current location in the ebook, leaving the reader to manually move through all the entries to find where they are).
EPUB 3’s new navigation document should help with the first problem, as it provides the means to include the full structure without having to have it all rendered visually. The second problem remains to be solved, but that fix has to come from the reading system side.
The other key means of navigation is by heading.
Where the table of contents requires readers to move in and out of the content to get around, which is useful when skimming for a location, headings allow them to quickly move forward and back without leaving the content. Headings are not numbered h1 through h6 just so you can pick how big you want them to look, after all! Each represents a level of your document hierarchy, and applying them properly allows a reader using an accessible device to move through the levels in a consistent manner. Accessible devices typically have hotkey combinations that allow the reader to move across equivalent heading levels or up and down through them, so long as the headings have been consistently applied.
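To make that kind of navigation concrete, here is a minimal sketch in Python of what “jump to the next heading at this level” amounts to. It is not how any particular assistive technology is implemented; `HeadingCollector`, `collect_headings`, `next_heading` and the sample markup are all names and content invented for illustration.

```python
from html.parser import HTMLParser


class HeadingCollector(HTMLParser):
    """Collect (level, text) pairs for h1-h6 in document order."""

    def __init__(self):
        super().__init__()
        self.headings = []
        self._level = None  # level of the heading currently open, if any
        self._text = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1] in "123456":
            self._level = int(tag[1])
            self._text = []

    def handle_data(self, data):
        if self._level is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if self._level is not None and tag == f"h{self._level}":
            self.headings.append((self._level, "".join(self._text).strip()))
            self._level = None


def collect_headings(markup):
    parser = HeadingCollector()
    parser.feed(markup)
    parser.close()
    return parser.headings


def next_heading(headings, current, level):
    """Index of the next heading at `level` after index `current`, or None."""
    for i in range(current + 1, len(headings)):
        if headings[i][0] == level:
            return i
    return None


# Hypothetical sample content, not taken from a real book.
SAMPLE = """
<section><h1>Chapter 1</h1>
  <section><h2>Section 1.1</h2></section>
  <section><h2>Section 1.2</h2></section>
</section>
<section><h1>Chapter 2</h1></section>
"""

headings = collect_headings(SAMPLE)
chapter_jump = next_heading(headings, 0, 1)  # from Chapter 1 to the next h1
section_jump = next_heading(headings, 1, 2)  # from Section 1.1 to the next h2
```

With consistently applied levels, the h1 jump skips straight past both subsections to Chapter 2, while the h2 jump steps from one subsection to the next.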
But numbered headings are far from perfect:
- there are only six of them, which might seem overly generous to the novel writers of the world, but having worked with legislation, I can tell you that six is often hardly enough. Textbooks also push this artificial limit.
- once you hit six, you can keep styling the headings differently for visual readers, but anyone using an AT loses any further ability to distinguish the document hierarchy.
- they make reuse of content a pain, as it typically requires modifying heading numbers on the fly to get the content to retain a consistent hierarchy.
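The renumbering chore in that last point, and the six-level ceiling in the first two, can both be seen in a small sketch. This is only an illustration using a regular expression over well-formed markup; `shift_headings` and the sample fragments are hypothetical, and a production workflow would more likely do this in XSLT or a proper parser.

```python
import re

# Matches the start of an opening or closing h1-h6 tag, e.g. "<h2" or "</h2".
HEADING_TAG = re.compile(r"<(/?)h([1-6])\b", re.IGNORECASE)


def shift_headings(markup, offset):
    """Return `markup` with every numbered heading moved `offset` levels deeper."""

    def repl(match):
        level = int(match.group(2)) + offset
        if not 1 <= level <= 6:
            # There is no h7: the hierarchy cannot be expressed any deeper.
            raise ValueError(f"h{level} does not exist")
        return f"<{match.group(1)}h{level}"

    return HEADING_TAG.sub(repl, markup)


# A fragment written as a standalone document, reused two levels down.
fragment = '<h1>Intro</h1><h2 class="x">Scope</h2>'
reused = shift_headings(fragment, 2)
```

Dropping the same fragment in five levels down would raise the `ValueError`, which is exactly where content like legislation and textbooks runs out of road.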
Generic headings that are tied to the document structure would seem to be the perfect answer, no?
Unfortunately, a combination of a complex implementation and a compressed timeline has made HTML5 generic headings, and the outlining algorithm, little more than a pain point. The idea that each structurally-significant section would have a single heading was a good one, don’t get me wrong, but it’s the other contortions that were required that make it worth steering clear of. It’s not really even HTML5’s fault, except for trying to make sense of the nonsensical. Of quick note, the details most people miss include:
- headings floating around in the content imply new sections; structure isn’t only defined by the structural elements section, article, nav and aside. If you think adding a subtitle following the title in the next lower heading tag is harmless, you’ve actually just created a new subsection of your document for all the content in that section.
- the hgroup element was supposed to solve the problem of multiple headings causing new sections by creating a container for the group, but it’s been under attack pretty much since it was devised. At this time, it carries no semantics and has been moved to a separate specification, so don’t assume you’re doing the right thing by using hgroup in your document.
- numbered headings get auto-corrected in ways that are not always obvious. Figuring out when headings will be superordinate or subordinate can be tricky.
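As a rough illustration of how headings alone imply sections, the sketch below nests a flat run of headings by rank. It is a deliberate simplification and not the HTML5 outlining algorithm itself: it ignores section, article, nav, aside and sectioning roots entirely. `build_outline`, `render` and the sample titles are invented for the example.

```python
def build_outline(headings):
    """Nest (level, title) pairs into a tree using heading rank alone."""
    root = {"title": None, "level": 0, "children": []}
    stack = [root]
    for level, title in headings:
        # Close every open section whose heading is of equal or lower rank.
        while stack[-1]["level"] >= level:
            stack.pop()
        node = {"title": title, "level": level, "children": []}
        stack[-1]["children"].append(node)
        stack.append(node)
    return root["children"]


def render(nodes, depth=0):
    """Flatten the tree into indented lines, two spaces per nesting level."""
    lines = []
    for node in nodes:
        lines.append("  " * depth + node["title"])
        lines.extend(render(node["children"], depth + 1))
    return lines


# A title followed by a "harmless" subtitle in the next lower heading.
subtitle_case = render(build_outline([
    (1, "The Title"),
    (2, "A Subtitle"),
    (2, "First Real Section"),
]))

# An h3 followed by an h2, both after an h1.
mixed_case = render(build_outline([(1, "A"), (3, "B"), (2, "C")]))
```

In the first case the subtitle ends up as a subsection in its own right, a sibling of the first real section. In the second, the h3 and the h2 land at the same depth under the h1, which is the sort of result that is not obvious from reading the markup.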
It’s not just the complexity of the implementation that causes problems, though; otherwise it would just be a matter of education. The other problem is that by not implementing the generic headings in a new way, the algorithm conflicts with how ATs handle HTML content. With only the newest JAWS having some basic support for the algorithm, that leaves the vast majority of AT users in a situation where the new heading model will be a serious impediment, even if you follow it very strictly:
- if every heading is an h1, the table of contents becomes the only effective path through the book as there is no way to move forward or back except one h1 at a time
- if headings are used in sectioning roots like blockquote and figure, readers also have to traverse those points as they try to move through the content
But I don’t want to bemoan this. The outline algorithm and everything associated with it is now labelled at risk for removal from HTML5, so if that’s not enough reason to stop thinking about it in the near term, I don’t know what is.
Keep using numbered headings in your books to reflect the nesting level of content, keep them consistently numbered, keep to one heading per section element, get in the habit of always including a section element with them so you’re never surprised by what you’ve created, and avoid headings in content elements like figures and blockquotes. Do these few things and you’ll not only remain accessible but your content should work fine whenever a working algorithm is implemented.
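Some of these habits are easy to check mechanically. The following is a minimal sketch of such a check, assuming well-formed XHTML; `HeadingLinter`, `lint_headings` and the sample document are hypothetical. It covers only three of the rules above (a heading always sits in a section, one heading per section, no headings in figures or blockquotes) and says nothing about whether the numbering itself is consistent.

```python
from html.parser import HTMLParser

TRACKED = {"section", "figure", "blockquote"}
HEADINGS = {"h1", "h2", "h3", "h4", "h5", "h6"}


class HeadingLinter(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []     # open tracked elements as [tag, heading_count]
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if tag in TRACKED:
            self.stack.append([tag, 0])
        elif tag in HEADINGS:
            self._check(tag)

    def handle_endtag(self, tag):
        if tag in TRACKED and self.stack and self.stack[-1][0] == tag:
            self.stack.pop()

    def _check(self, tag):
        line = self.getpos()[0]
        if any(name in ("figure", "blockquote") for name, _ in self.stack):
            self.problems.append(f"line {line}: {tag} inside figure or blockquote")
            return
        if not self.stack:
            self.problems.append(f"line {line}: {tag} has no enclosing section")
            return
        # Only sections are open here, so the innermost one owns this heading.
        self.stack[-1][1] += 1
        if self.stack[-1][1] > 1:
            self.problems.append(f"line {line}: {tag} is an extra heading in its section")


def lint_headings(markup):
    linter = HeadingLinter()
    linter.feed(markup)
    linter.close()
    return linter.problems


# Hypothetical chapter with two of the mistakes discussed above.
DOC = """<section>
  <h1>Chapter 1</h1>
  <h2>A subtitle</h2>
  <figure><h3>Figure title</h3></figure>
  <section>
    <h2>Section 1.1</h2>
  </section>
</section>"""

problems = lint_headings(DOC)
```

For the sample, `problems` flags the subtitle as an extra heading in its section and the figure title as a heading inside a figure, while the nested section’s h2 passes.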
And if you’re publishing through an XML workflow, don’t shy away from using generic headings internally, or following the practices defined by the outlining algorithm. But for distribution, do a little dance with your headings to follow the preceding guidelines to make them more accessible…
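One possible shape for that little dance, sketched in Python rather than whatever your workflow actually uses: walk the section nesting and rename each generic h1 to the numbered heading that matches its depth. `number_headings` and the sample are hypothetical, the sample leaves out the XHTML namespace to keep the tag names short, and capping at h6 is an assumption about how you would want to handle deeper nesting.

```python
import xml.etree.ElementTree as ET


def number_headings(element, depth=0):
    """Rename generic h1 headings in place to match section nesting depth."""
    if element.tag == "section":
        depth += 1
    for child in element:
        if child.tag == "h1" and depth > 0:
            # Anything nested deeper than six sections collapses to h6.
            child.tag = f"h{min(depth, 6)}"
        else:
            number_headings(child, depth)


# Internal form: every section carries a generic h1.
SOURCE = (
    "<body><section><h1>Part</h1>"
    "<section><h1>Chapter</h1>"
    "<section><h1>Topic</h1></section>"
    "</section></section></body>"
)

root = ET.fromstring(SOURCE)
number_headings(root)
distribution = ET.tostring(root, encoding="unicode")
```

After the call, the three headings come out as h1, h2 and h3, so the distributed file carries the hierarchy in the heading numbers, where today’s ATs look for it.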