I suppose it’s natural that I wonder every so often how well EPUB could be used to represent acts, regulations, case law and all the other (boring to everyone else) documents that are critical to the field. I did get my first “real” data job in legal publishing, after all.
And, yes, I realize how sad it must sound that I spend my free time thinking about translating legal documents to EPUB.
But whatever your sympathies, I spent a little time looking at the problems again by tinkering with some legislation I pulled off the web, so I thought I’d recap what I found.
Legislation, at least in Canada, all lives on the web at this point, so why would anyone care about an EPUB version?
If you need to make this case, portability would be one good reason. Do you want to depend on an internet connection to get at information?
More interesting is the ability to package up sets of related acts and/or subsections of acts that are pertinent to a specific field. Who says an EPUB has to contain only a single act? All the legislation you could ever need on a little tablet. Hm…
Another notch up the coolness ladder is the ability to bookmark and add your own annotations to these portable digital versions.
Or, if you’re a publisher, what if you gave the acts away and built your business model on selling professional annotations? (If it won’t fly for legislation, what about case law?)
If you’re thinking the format is only for novels, a little more imagination will open a world of opportunities.
I’m going to make this a short post, as I haven’t had a lot of time to synthesize my thoughts, or really done enough research to make more than broad-sweeping generalizations. Plus converting the document has sucked much of the life out of me.
In a nutshell, I grabbed a copy of the Income Tax Act off the government site and did a quick conversion to see what I’d unearth. Here then is a small list of things that came to light:
- Navigation within the body is still awkward. The table of contents is useful, but the source files were using unordered lists to structure the entire document. The oddity of a piece of legislation as an unordered list aside, there is a correlation to ordered lists, at least once you reach the numbered sections. The pain in translating to HTML lists has always been the decimal numbering used to inject new subsections (a 12.1 slotted in between 12 and 13, for example).
- Chunking massive documents like the ITA is interesting. Not every section decomposes to a file size that would meet the typical 300KB-per-file EPUB size restrictions, but then none were extreme in size, either.
- Speaking of chunking, it makes for some weird page breaking when the document is dynamically repaginated. When reading a piece of legislation, you don’t typically expect to find a half page or more of whitespace followed by the next section. A scrolled interface would work better.
- Semantics are lacking (naturally enough, I suppose). EPUB does have parts and divisions, but no paragraphs and subparagraphs. But then one could argue that these are semantics for semantics’ sake.
- One nice addition of HTML5 + EPUB 3 semantics is the ability to indicate that margin notes are outside the logical reading order. Although they typically read like headings, they’re in the margin in print for a reason. Enter:
- It’s also nice to be able to use a similar combination to mark revision notes as outside the reading order.
- I spotted some basic math formulas inserted as plain text. If there were better support for MathML, as I complained in the last post, it would be nice to make real equations of these.
- It was surprising that no hyperlinking was added to the various section cross-references, but that’s more an indictment of the production than of the ability to do so.
- If HTML5’s outline algorithm were more than just a myth, the use of sections with generic headings would be a way around the collision between the deep nesting of sections in legislation and there being only six heading levels in HTML. I didn’t see any deep nesting in this document, but it just popped to mind.
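On the missing cross-reference links: as a rough illustration of how little it would take, here is a minimal Python sketch that wraps plain-text references such as "section 12.1" in anchors. The `sec-` id scheme, the `link_section_refs` name and the reference pattern are all assumptions made up for this example, not anything taken from the government source files or from my conversion.

```python
import re

# Hypothetical id scheme: section "12.1" is assumed to carry id="sec-12.1".
# A real conversion would have to use whatever ids it actually generated.
SECTION_REF = re.compile(r"\b(?:sub)?section (\d+(?:\.\d+)*)")


def link_section_refs(text: str) -> str:
    """Wrap 'section N' / 'subsection N.M' references in same-file anchors."""

    def to_anchor(match: re.Match) -> str:
        number = match.group(1)  # e.g. "12.1"; a trailing "(1)" is left unlinked
        return f'<a href="#sec-{number}">{match.group(0)}</a>'

    return SECTION_REF.sub(to_anchor, text)


if __name__ == "__main__":
    print(link_section_refs("as defined in section 12.1 of this Act"))
```

In a chunked EPUB the target would usually sit in a different file, so the `href` would also need the file name of the chunk that holds the section. This sketch ignores that and assumes everything is in one document.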
If you want to download a copy of the EPUB to give it a look and/or rifle through the source, feel free. I only reviewed the first five chunks. After that, all you’ll find is some global cleanup.
If it’s not clear from the file, the document is unfit for any use but looking at it as an EPUB.
I’ll probably return to legislation and case law in EPUB in the future, but if anyone has been playing around with the format, please feel free to share your thoughts/pain points in the comments. I’m always interested to hear.