Note: This document is an informative reproduction of OCF 2.0.1 in HTML.
In the case of any discrepancies, the original DOC file is the authoritative source.

Open Container Format (OCF) 2.0.1 v1.0.1

RECOMMENDED SPECIFICATION September 4, 2010

This version
http://www.idpf.org/doc_library/epub/OCF_2.0.1_draft.doc
Latest version
http://www.idpf.org/doc_library/epub/OCF_2.0_latest.doc
Previous version
http://www.idpf.org/ocf/ocf1.0/download/ocf10.htm
Diffs to previous version
http://www.idpf.org/doc_library/epub/OCF_2.0.1_diffs_to_2.0.doc

Copyright 2010 by International Digital Publishing Forum.

All rights reserved. This work is protected under Title 17 of the United States Code. Reproduction and dissemination of this work with changes is prohibited except with the written permission of the International Digital Publishing Forum.

1 Overview

This specification, the Open Container Format (OCF), is one third of a triumvirate of modular specifications that make up the EPUB publication format. EPUB enables the creation and transport of reflowable digital books and other types of content as single-file digital publications that are interoperable between disparate EPUB-compliant reading devices and applications. EPUB encompasses a content markup standard (Open Publication Structure, OPS), a packaging standard (Open Packaging Format, OPF), and this specification, a container standard.

1.1 Purpose and Scope

This specification defines the Open Container Format (OCF). OCF is a general-purpose container technology. This specification describes the general-purpose container technology in the context of encapsulating EPUB publications and OPTIONAL alternate renditions thereof. It is however anticipated that the general-purpose container technology described herein may ultimately be used in other bundling applications.

As a general container format, OCF collects a related set of files into a single-file container. OCF can be used to collect files in various document formats and for various classes of applications. The single-file container enables easy transport of, management of, and random access to, the collection.

OCF defines the rules for how to represent an abstract collection of files (the abstract container) as a physical representation within a ZIP archive (the physical container). The rules for ZIP containers build upon and are backward compatible with the ZIP technologies used by Open Document Format (ODF) 1.0.

OCF is the REQUIRED single-file container technology for EPUB publications. OCF MAY play a role in the following workflows:

1.2 Definitions

ASCII

American Standard Code for Information Interchange: a 7-bit character encoding based on the English alphabet (ANSI X3.4-1986). When used in this document, ASCII refers to the printable graphic characters in the range 33 (decimal) through 126 (decimal) and the nonprintable space character 32 (decimal).

Content Provider

A publisher, author, individual, or other information source that provides a publication to distribution or sales channels or directly to one or more EPUB Reading Systems using OCF as described in this specification.

EPUB

The publication format as defined by the OCF 2.0.1, OPF 2.0.1 and OPS 2.0.1 specifications.

EPUB Publication

A collection of OPS Documents, an OPF Package file, and other files, typically in a variety of media types, including structured text and graphics, packaged in an OCF container that constitute a cohesive unit for publication, as defined by the EPUB standards.

EPUB Reading System (or Reading System)

A combination of hardware and/or software that accepts EPUB Publications and makes them available to consumers of the content. Great variety is possible in the architecture of Reading Systems. A Reading System MAY be implemented entirely on one device, or it MAY be split among several computers. In particular, a reading device that is a component of a Reading System need not directly accept OCF-Packaged EPUB Publications, but all Reading Systems MUST do so. Reading Systems MAY include additional processing functions, such as compression, indexing, encryption, rights management, and distribution.

IRI

Internationalized Resource Identifier (http://www.ietf.org/rfc/rfc3987.txt).

OCF

The Open Container Format defined by this specification.

OCF Container

A container file that is compliant with the format defined in this specification.

ODF

Open Document Format (http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf).

OPF

Open Packaging Format (http://www.idpf.org/doc_library/epub/OPF_2.0.1_draft.htm).

OPF Package

An XML document that describes the OPS contents of an EPUB Publication providing metadata, manifest, reading-order and navigation information for the publication.

OPS

Open Publication Structure (http://www.idpf.org/doc_library/epub/OPS_2.0.1_draft.htm).

OPS Document

An XML document that conforms to the OPS 2.0.1 specification generally containing the textual content of an EPUB Publication.

MIME

Multipurpose Internet Mail Extensions (http://www.isi.edu/in-notes/rfc2045.txt). MIME media types provide a standard methodology for specifying the content type of objects.

RFC

Literally "Request For Comments", but more generally a document published by the Internet Engineering Task Force (IETF). See http://www.ietf.org/rfc.html.

Reading System

See EPUB Reading System.

Relax NG

A schema language for XML (http://www.relaxng.org/).

Rootfile

The top-level file of a rendition of a publication; either the root from which all other components can be found or the lone file encapsulating the rendition. The EPUB rootfile is the OPF Package file. A PDF file containing the PDF rendition could also be a rootfile.

XML

Extensible Markup Language (http://www.w3.org/TR/2006/REC-xml-20060816/).

ZIP

A de facto industry standard bundling and compression format (http://www.pkware.com/business_and_developers/developer/appnote).

1.3 Relationship to Other Specifications

This specification combines subsets and applications of other specifications. Together, these facilitate the construction, organization, presentation, and unambiguous interchange of electronic documents:

  1. The XML 1.0 Extensible Markup Language specification (Fourth Edition) (http://www.w3.org/TR/2006/REC-xml-20060816/); and
  2. The OPF 2.0.1 Open Packaging Format specification (http://www.idpf.org/doc_library/epub/OPF_2.0.1_draft.htm); and
  3. The OPS 2.0.1 Open Publication Structure specification (http://www.idpf.org/doc_library/epub/OPS_2.0.1_draft.htm); and
  4. The XML 1.0 namespace specification (Second Edition) (http://www.w3.org/TR/2006/REC-xml-names-20060816/); and
  5. The Unicode Standard, Version 4.0. Reading, Mass.: Addison-Wesley, 2003, as updated from time to time by the publication of new versions. (See http://www.unicode.org/unicode/standard/versions for the latest version and additional information on versions of the standard and of the Unicode Character Database); and
  6. Particular MIME media types (http://www.ietf.org/rfc/rfc4288.txt and http://www.iana.org/assignments/media-types/index.html); and
  7. Open Document Format for Office Applications (Open Document) v1.0 (http://www.oasis-open.org/committees/download.php/12572/OpenDocument-v1.0-os.pdf); and
  8. ZIP format (http://www.pkware.com/business_and_developers/developer/appnote); and
  9. XML-Signature Syntax and Processing (http://www.w3.org/TR/2002/REC-xmldsig-core-20020212); and
  10. XML Encryption Syntax and Processing (http://www.w3.org/TR/2002/REC-xmlenc-core-20021210); and
  11. Web Content Accessibility Guidelines 1.0 (http://www.w3.org/TR/WCAG10/).

EPUB Reading Systems MAY support XML 1.1, but this feature is deprecated in version 2.0.1 (in favor of XML 1.0). Support for XML 1.1 will be removed in the next version of the specification.

1.4 Conformance

The keywords "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "RECOMMENDED", "MAY", and "OPTIONAL" in this document MUST be interpreted as described in RFC 2119 (http://www.ietf.org/rfc/rfc2119.txt).

This section defines conformance requirements for OCF.

1.4.1 Conforming Containers

The term Conforming OCF Abstract Container indicates an OCF Abstract Container (See Section 2.2) that conforms to all of the relevant conformance criteria defined in this specification. The term Conforming OCF ZIP Container indicates a ZIP archive that conforms to the relevant ZIP container conformance criteria (See Section 4) and whose contents are a Conforming OCF Abstract Container.

In addition to other conformance criteria defined in this specification, a Conforming OCF Abstract Container MUST meet the following conditions:

  • All XML files MUST be well-formed (as defined in XML 1.0) and thus include a correct XML declaration (e.g. <?xml version="1.0"?>)
  • All XML files MUST be compatible with the XML 1.0 specification (http://www.w3.org/TR/2006/REC-xml-20060816/) and the Namespaces in XML specification (http://www.w3.org/TR/2006/REC-xml-names-20060816/)
  • All XML files MUST be encoded in UTF-8 or UTF-16
  • All XML files MUST conform to the relevant XML specification for any MIME type specified for the file
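
(This sketch is informative.)

The first three conditions above can be approximated with a small check. The following Python sketch is only an illustration under stated assumptions: the function name check_xml_bytes is hypothetical, UTF-16 is recognized solely by its byte order mark, and conformance to a media-type-specific schema (the last condition) is not attempted.

```python
import codecs
import re
import xml.etree.ElementTree as ET

# Matches the encoding pseudo-attribute of an XML declaration
# written in an ASCII-compatible encoding.
_DECLARED_ENCODING = re.compile(
    rb'<\?xml[^>]*?encoding\s*=\s*["\']([A-Za-z][A-Za-z0-9._-]*)["\']'
)

def check_xml_bytes(data):
    """Return a list of problems found in the bytes of one XML file."""
    problems = []
    if data.startswith((codecs.BOM_UTF16_LE, codecs.BOM_UTF16_BE)):
        # Simplification: UTF-16 is detected by its byte order mark only.
        encoding = "utf-16"
    else:
        body = data[len(codecs.BOM_UTF8):] if data.startswith(codecs.BOM_UTF8) else data
        match = _DECLARED_ENCODING.match(body)
        # With no declared encoding and no BOM, XML 1.0 assumes UTF-8.
        encoding = match.group(1).decode("ascii").lower() if match else "utf-8"
    if encoding not in ("utf-8", "utf-16"):
        problems.append("encoding is %s, expected UTF-8 or UTF-16" % encoding)
    try:
        ET.fromstring(data)  # raises ParseError when not well-formed
    except (ET.ParseError, ValueError, LookupError) as error:
        problems.append("not well-formed: %s" % error)
    return problems
```

An empty list means the sketch found nothing to report; it does not establish full conformance.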

1.4.2 Conforming Reading Systems

The term Conforming EPUB Reading System indicates a Reading System that supports all of the mandatory features defined by this specification and the OPF and OPS specifications.

An EPUB Reading System that does not support all of the features defined in this specification and the OPF and OPS specifications MUST NOT claim to be a Conforming EPUB Reading System and SHOULD provide readily available documentation of the subset of features it supports.

An EPUB Reading System SHOULD provide readily available documentation of the accessibility features it supports. This documentation SHOULD conform to the relevant version of the W3C's Web Content Accessibility Guidelines. (See Section 1.5)

1.5 Accessibility

E-books MAY provide an accessible reading experience for users with disabilities provided authors and publishers conform to accepted industry standards for the creation of accessible electronic materials. EPUB publications packaged or delivered using OCF SHOULD conform to the accessibility standards set forth by the relevant IDPF Working Groups to ensure that the broadest possible set of users will have access to books delivered in this format. This includes adherence to the W3C's Web Content Accessibility Guidelines 1.0 (http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/) or, if it is released while the Working Group is active, the Web Content Accessibility Guidelines 2.0 (the current draft is available at http://www.w3.org/TR/WCAG20/). EPUB publications packaged or delivered using OCF MUST NOT interfere with any features intended to deliver accessible content, regardless of how that content is rendered.

In addition, recommendations from the W3C HTML 4.0 Guidelines for Mobile Access (http://www.w3.org/TR/NOTE-html40-mobile/) and the W3C Web Accessibility Initiative's proposed User Agent Guidelines (http://www.w3.org/TR/WD-WAI-USERAGENT/) SHOULD be reviewed and applied by OCF implementers to ensure that Reading Systems will be in conformance with accessibility requirements.

1.6 Future Directions

It is the intent of the contributors to this specification that subsequent versions of this specification continue in the directions established by the 2.0.1 release. Specifically:

Future versions of the OCF specification MAY include:

2 OCF Overview

2.1 OCF: A General Container Technology

OCF is purposely designed as a general container technology that can be used by other file formats, not just EPUB. In particular, OCF is purposely designed to be upwardly compatible with the container technology used in ODF 1.0 such that a future version of ODF might use OCF.

2.2 Abstract Container vs. Physical Container

An Abstract Container defines a file system model for the contents of the container. The file system model MUST have a single common root directory for all of the contents of the container. The special files REQUIRED by OCF MUST be included within the META-INF directory that is a direct child of the root directory. All (non-remote) electronic assets for embedded publications MUST be located within the directory tree headed by the container's root directory.

A Physical Container holds the physical manifestation of an abstract container. This specification defines how an abstract container MUST be mapped to the following two physical container technologies:

  • File System Container
  • ZIP Container

Publications MUST render the same no matter whether using a File System Container or a ZIP Container. In both cases, the EPUB Reading System ultimately opens the rootfile for the Publication, from which it can determine how to render the Publication.

2.3 Examples

(This section is informative.)

This section includes an example of a single rendition and a multiple rendition container. See Section 3.5.1 for normative descriptions.

2.3.1 Example of a simple Publication, Abstract Container, and ZIP Container

To illustrate the concepts from the previous section, let's assume we have an EPUB Publication of Dickens' Great Expectations which consists of an OPF 2.0.1 package file (Great Expectations.opf) and a large number of OPS 2.0.1 files, one for the cover page (e.g., cover.html) and one for each chapter (e.g., chapter01.html). The contents of the publication might be as follows:

OPF/OPS Publication:
Great Expectations.opf
cover.html
chapters/
   chapter01.html
   chapter02.html
   … other OPS files for the remaining chapters …

The contents of the Abstract Container includes all of the assets from the Publication, plus a small number of files defined by OCF within the META-INF directory. Note that container.xml is REQUIRED in all circumstances. See Section 3 for descriptions of the files within the META-INF directory.

Abstract Container:
META-INF/
   container.xml
   [manifest.xml]
   [metadata.xml]
   [signatures.xml]
   [encryption.xml]
   [rights.xml]
OEBPS/
   Great Expectations.opf
   cover.html
   chapters/
      chapter01.html
      chapter02.html
      … other OPS files for the remaining chapters …

When the above abstract container is mapped to a File System Container, the directory structure within the file system exactly matches the OCF's Abstract Container directory structure shown above:

File System Container:
…some directory within the file system…/
   META-INF/
      container.xml
      [manifest.xml]
      [metadata.xml]
      [signatures.xml]
      [encryption.xml]
      [rights.xml]
   OEBPS/
      Great Expectations.opf
      cover.html
      chapters/
         chapter01.html
         chapter02.html
         … other OPS files for the remaining chapters …

When the above Abstract Container is stored within a ZIP container, the contents of the ZIP archive will match the directory structure shown above, but MUST also contain a mimetype file as the first file in the ZIP archive to aid in the easy identification of the media type of the container. [See section 3.4]

ZIP Container:
mimetype
META-INF/
   container.xml
   [manifest.xml]
   [metadata.xml]
   [signatures.xml]
   [encryption.xml]
   [rights.xml]
OEBPS/
   Great Expectations.opf
   cover.html
   chapters/
      chapter01.html
      chapter02.html
      … other OPS files for the remaining chapters …

The corresponding META-INF/container.xml file might appear as follows:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/Great Expectations.opf"
      media-type="application/oebps-package+xml" />
   </rootfiles>
</container>

N.B. The use of the specific namespace string urn:oasis:names:tc:opendocument:xmlns:container should be considered provisional until approved by an OASIS technical committee.
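
(This sketch is informative.)

As a rough illustration of the ZIP Container layout shown above, the following Python sketch writes a minimal container into memory with the standard zipfile module. The helper name build_container and the placeholder OPF content are assumptions made for illustration; they are not defined by OCF.

```python
import io
import zipfile

CONTAINER_XML = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/Great Expectations.opf"
      media-type="application/oebps-package+xml" />
   </rootfiles>
</container>
"""

def build_container(files):
    """Return the bytes of a ZIP container holding `files` (path -> bytes)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        # The mimetype file is written first and stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML,
                         compress_type=zipfile.ZIP_DEFLATED)
        for path, data in files.items():
            archive.writestr(path, data, compress_type=zipfile.ZIP_DEFLATED)
    return buffer.getvalue()

# Placeholder package content; a real OPF Package file would go here.
epub_bytes = build_container({"OEBPS/Great Expectations.opf": b"<package/>"})
```

Writing the mimetype entry before any other entry is what makes it the first file in the archive; the order of the remaining entries is not significant in this sketch.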

2.3.2 Single-publication containers, but with alternate renditions

In some circumstances, an OCF container might hold multiple renditions of the same publication. An example is a container that has OPS/OPF documents as the primary rendition for viewing but includes an alternate PDF for printing. To avoid name conflicts, it is RECOMMENDED that each rendition be placed within its own subdirectory and that multiple <rootfile> elements be defined within container.xml. Here is an example:

Abstract Container:
META-INF/
   container.xml Note: includes multiple <rootfile> elements
   [manifest.xml]
   [metadata.xml]
   [signatures.xml]
   [encryption.xml]
   [rights.xml]
OEBPS/
   Great Expectations.opf
   cover.html
   chapters/
      chapter01.html
      chapter02.html 
      … other OPS files for the remaining chapters …
PDF/
   Great Expectations.pdf

The corresponding META-INF/container.xml file might appear as follows:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/Great Expectations.opf"
      media-type="application/oebps-package+xml" />
      <rootfile full-path="PDF/Great Expectations.pdf"
      media-type="application/pdf" />
   </rootfiles>
</container>

3 OCF Container Contents

3.1 File and directory structure

The virtual file system for the OCF Abstract Container MUST have a single common root directory for all of the contents of the container.

The following file names in the root directory are reserved:

  • mimetype
  • META-INF

The mimetype file is discussed in Section 4. The META-INF/ directory contains the reserved files used by OCF. These reserved files are described in the following sections. All other files used by the publication rendition(s) within the Abstract Container MAY be in any location descendant from the root directory, except that they cannot use the name mimetype at the root level or reside within the META-INF directory.

It is RECOMMENDED that the contents of individual publications be stored within dedicated sub-directories to minimize potential file name collisions in the event that multiple renditions are used or that multiple publications per container are supported in future versions of this Specification.

3.2 Relative IRIs for referencing other components

Files within the Abstract Container reference each other via Relative IRI References (http://www.ietf.org/rfc/rfc3987.txt and http://www.ietf.org/rfc/rfc3986.txt), no matter what is used for the physical container (e.g., File System Container or ZIP Container). For example, if a file named chapter1.html references an image file named image1.jpg that is located in the same directory, then chapter1.html might contain the following as part of its content:

<img src="image1.jpg" alt="" />

For Relative IRI References, the Base IRI (see RFC3986) is determined by the relevant language specifications for the given file formats. For example, the CSS specification defines how relative IRI references work in the context of CSS style sheets and property declarations. Note that some language specifications reference RFCs that preceded RFC3987, in which case the earlier RFC applies for content in that particular language.

Unlike most language specifications, the Base IRIs for all files within the META-INF/ directory use the root folder for the Abstract Container as the default Base IRI. For example, if META-INF/container.xml has the following content:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/Great Expectations.opf"
      media-type="application/oebps-package+xml" />
   </rootfiles>
</container>

the path OEBPS/Great Expectations.opf is relative to the root directory for the Abstract Container and not relative to the META-INF/ directory.
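
(This sketch is informative.)

The difference between the two Base IRI rules can be sketched in Python as follows. This is a simplified illustration, not a full RFC 3986 resolver: the function name resolve_in_container is hypothetical, only plain relative paths are handled (no query, fragment, or percent-decoding), and files outside META-INF/ are assumed to use their own location as the base, which is what most language specifications prescribe.

```python
import posixpath

def resolve_in_container(referencing_file, reference):
    """Resolve a relative path reference to a Path Name in the Abstract Container."""
    if referencing_file.startswith("META-INF/"):
        # Files in META-INF/ resolve against the container root.
        base_dir = ""
    else:
        # Assumption: other files resolve against their own directory.
        base_dir = posixpath.dirname(referencing_file)
    return posixpath.normpath(posixpath.join(base_dir, reference))

from_container_xml = resolve_in_container(
    "META-INF/container.xml", "OEBPS/Great Expectations.opf")
from_chapter = resolve_in_container(
    "OEBPS/chapters/chapter01.html", "image1.jpg")
```

Here from_container_xml stays relative to the root, while from_chapter is resolved next to the referencing chapter file.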

3.3 File Names

The term File Name represents the name of any type of file, either a directory or an ordinary file within a directory within an Abstract Container. For a given directory within the Abstract Container, the Path Name is a string holding all directory names in the full path concatenated together with a / character separating the directory names. For a given file within the Abstract Container, the Path Name is the string holding all directory names concatenated together with a / character separating the directory names, followed by a / character and then the name of the file. The File Name restrictions described below are designed to allow directory names and file names to be used without modification on most commonly used operating systems. This specification does not specify how an EPUB Reading System that is unable to represent OCF conforming File Names would compensate for this incompatibility.

The following statements apply to Conforming OCF Content:

Note that some commercial ZIP tools do not support the full Unicode range and may only support the ASCII range for File Names. Content creators who want to use ZIP tools that have these restrictions MAY find it is best to restrict their File Names to the ASCII range. If the names of files cannot be preserved during the unzipping process, it will be necessary to compensate for any name translation which took place when the files are referenced by URI from within the content.

3.4 Container media type identification

It is frequently necessary for applications to determine the media type of a file. This is usually accomplished by looking at the file extension of the file. This gives applications a quick way to determine the type of the file without looking inside the file. OCF Container files SHOULD use an extension .epub to identify to processing applications that they are OCF Containers.

In order to translate a file extension into a media type, typically a processing agent will register the relationship between file extension and media type with the operating system. Applications that are interested in OCF Container files SHOULD register the media type of application/epub+zip as corresponding to the file extension of .epub.

Unfortunately, the identification of files through the use of file extensions is notoriously unreliable. As a result, it is desirable to have a more robust way of identifying files independent of their file names or extensions. One mechanism that has evolved for doing this is to require the placement of specific information at specific file offsets. A processing agent can then check a fixed location to determine if the file is an OCF Container.

The method that has evolved for doing this in ZIP archives is the inclusion of an uncompressed, unencrypted file called mimetype as the first file in the ZIP archive. The contents of this file are the media type of the file. OCF Containers MUST place the ASCII string application/epub+zip in the mimetype file as the first file in the ZIP archive. See Section 4 for more detail on this mechanism.
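
(This sketch is informative.)

A processing agent's fixed-offset check might look like the following Python sketch. The offsets are an assumption derived from the ZIP local file header layout (a 30-byte fixed header followed by the file name), and they hold only when the mimetype entry carries no extra field; the function name looks_like_ocf_zip and the placeholder container.xml content are hypothetical.

```python
import io
import zipfile

MEDIA_TYPE = b"application/epub+zip"

def looks_like_ocf_zip(data):
    """Check the fixed byte offsets at the start of a ZIP archive."""
    return (
        data[0:4] == b"PK\x03\x04"          # ZIP local file header signature
        and data[30:38] == b"mimetype"      # name of the first entry
        and data[38:38 + len(MEDIA_TYPE)] == MEDIA_TYPE  # its stored content
    )

# Build a small sample archive in memory to exercise the check.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("mimetype", MEDIA_TYPE, compress_type=zipfile.ZIP_STORED)
    archive.writestr("META-INF/container.xml", b"<container/>",
                     compress_type=zipfile.ZIP_DEFLATED)
sample = buffer.getvalue()
is_ocf = looks_like_ocf_zip(sample)
```

The check reads only the first few dozen bytes, so it works without parsing the ZIP directory and regardless of the file's name or extension.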

3.5 META-INF

All valid OCF Containers MUST include a directory called META-INF at the root level of the container file system. This directory contains the files specified below that describe the contents, metadata, signatures, encryption, rights and other information about the contained publication.

The semantics of the following files that MAY be present at the META-INF/ level are specified. All other files found at the META-INF/ level MUST be ignored by conformant OCF Reading Systems.

3.5.1 Container META-INF/container.xml (Required)

(This is normative.)

All valid OCF Containers MUST include a file called container.xml within the META-INF directory at the root level of the container file system. The container.xml file MUST identify the MIME type of, and path to, the rootfile for the OPF/OPS version of the publication and any OPTIONAL alternate renditions included within the container.

The container.xml file MUST NOT be encrypted.

The container.xml file contains XML that uses the urn:oasis:names:tc:opendocument:xmlns:container namespace for all of its elements and attributes. The version="1.0" attribute MUST be included for all containers that conform to this version of the specification.

A RELAX NG OCF schema describing the <container> element that MUST be the root element of container.xml can be found in Appendix A.

The <rootfiles> element MUST contain at least one <rootfile> element that has a media-type of application/oebps-package+xml. Only one <rootfile> element with a media-type of application/oebps-package+xml SHOULD be included. The file referenced by the first <rootfile> element that has a media-type of application/oebps-package+xml will be considered the EPUB rootfile. The EPUB rootfile (the OPF package file) MUST NOT be encrypted.

Each <rootfile> element specifies the rootfile of a single rendition of the contained publication. A rootfile often includes an enumeration of the other files needed by the rendition. In the case of EPUB, the rootfile will be the OPF Package file for the OPS rendition of the publication, whose <manifest> element enumerates the other files used by the OPS rendition. In other cases, the rootfile MAY be the only file needed by the rendition.

(This example is informative.)

The following example shows a sample container.xml for an EPUB Publication with the root file OEBPS/My Crazy Life.opf (the OPF package file):

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/My Crazy Life.opf"
      media-type="application/oebps-package+xml" />
   </rootfiles>
</container>

(This example is informative.)

The following example adds an alternate PDF version of the Publication:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/My Crazy Life.opf"
      media-type="application/oebps-package+xml" />
      <rootfile full-path="PDF/My Crazy Life.pdf"
      media-type="application/pdf" />
   </rootfiles>
</container>
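
(This sketch is informative.)

The rule that the first <rootfile> element with a media-type of application/oebps-package+xml is the EPUB rootfile can be sketched in Python as follows. The function name find_epub_rootfile is hypothetical, and the sample deliberately lists the PDF rendition first to show that document order alone does not decide the result.

```python
import xml.etree.ElementTree as ET

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OPF_MEDIA_TYPE = "application/oebps-package+xml"

def find_epub_rootfile(container_xml):
    """Return the full-path of the first OPF <rootfile>, or None if absent."""
    root = ET.fromstring(container_xml)
    path = "{%s}rootfiles/{%s}rootfile" % (OCF_NS, OCF_NS)
    for rootfile in root.iterfind(path):  # iterates in document order
        if rootfile.get("media-type") == OPF_MEDIA_TYPE:
            return rootfile.get("full-path")
    return None

SAMPLE = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="PDF/My Crazy Life.pdf"
      media-type="application/pdf" />
      <rootfile full-path="OEBPS/My Crazy Life.opf"
      media-type="application/oebps-package+xml" />
   </rootfiles>
</container>"""

epub_rootfile = find_epub_rootfile(SAMPLE)
```

A return value of None would indicate a container.xml that lacks the required OPF rootfile.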

(This is normative.)

The <manifest> element contained within the OPF root package file specifies the one and only manifest used for OPS processing; all items referenced in this manifest MUST be included in the ZIP archive. Ancillary manifest information contained in the ZIP archive or in the OPTIONAL manifest.xml file MUST NOT be used for OPS processing purposes. Any extra files in the ZIP archive (i.e., files within the ZIP archive that are not listed within the package file's <manifest> element, such as META-INF files or alternate derived renditions of the publication) MUST NOT be used in the processing of the OPS publication.

The values of the full-path attributes MUST contain a path component (as defined by RFC3986) which MUST only take the form of a path-rootless (as defined by RFC3986). The path components are relative to the root of the container in which they are used.

Conforming OCF User Agents MUST ignore unrecognized elements (and their contents) and unrecognized attributes within a container.xml file, including unrecognized elements and unrecognized attributes from other namespaces.

Conforming container.xml files MUST be valid according to the RELAX NG OCF schema with the <container> element as the root element after removing all elements (and child nodes of these elements) and attributes from other namespaces.

(This example is informative.)

For example:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container" xmlns:foo="..." xmlns:foozle="...">
   <foo:bar />
   <rootfiles foozle:identifier="bar">
      ... 
   </rootfiles>
</container>

is conformant, but:

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <foo />
   <rootfiles>
      ... 
   </rootfiles>
</container>

is non-conformant due to the non-namespace-qualified use of the <foo> element.

<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles identifier="bar">
      ... 
   </rootfiles>
</container>

is also non-conformant due to the non-namespace-qualified use of the identifier attribute on the <rootfiles> element.
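
(This sketch is informative.)

The pre-validation step of removing elements and attributes from other namespaces can be sketched in Python as follows. The function name strip_foreign and the example.com namespace URIs are assumptions for illustration; the sketch performs only the removal, not the subsequent RELAX NG validation.

```python
import xml.etree.ElementTree as ET

OCF_NS = "urn:oasis:names:tc:opendocument:xmlns:container"
OCF_PREFIX = "{" + OCF_NS + "}"

def strip_foreign(element):
    """Remove, in place, elements and attributes from non-OCF namespaces."""
    for name in list(element.attrib):
        # ElementTree spells namespace-qualified names as "{uri}local";
        # unqualified attributes have no namespace and are kept.
        if name.startswith("{") and not name.startswith(OCF_PREFIX):
            del element.attrib[name]
    for child in list(element):
        if child.tag.startswith(OCF_PREFIX):
            strip_foreign(child)
        else:
            element.remove(child)  # drops the element and its child nodes
    return element

SAMPLE = """<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
           xmlns:foo="http://example.com/foo" xmlns:foozle="http://example.com/foozle">
   <foo:bar />
   <rootfiles foozle:identifier="bar">
      <rootfile full-path="OEBPS/book.opf" media-type="application/oebps-package+xml" />
   </rootfiles>
</container>"""

cleaned = strip_foreign(ET.fromstring(SAMPLE))
```

Note that an unprefixed <foo /> element would survive this step, because it inherits the OCF default namespace; it is the later schema validation that rejects it.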

3.5.2 Manifest META-INF/manifest.xml (Optional)

An OPTIONAL file with the reserved name manifest.xml within the META-INF directory at the root level of the container may appear in a valid OCF container. If present, the file's content MUST be as defined in the ODF 1.0 manifest schema (http://www.oasis-open.org/committees/download.php/12570/OpenDocument-manifest-schema-v1.0-os.rng).

The manifest.xml file, if present, MUST NOT be encrypted.

3.5.3 Metadata META-INF/metadata.xml (Optional)

A file with the reserved name metadata.xml within the META-INF directory at the root level of the container file system may appear in a valid OCF container. This file, if present, MUST be used for container-level metadata. In version 2.0.1 of OCF, no such container-level metadata is specified. It is in this file that future innovation and extension SHOULD occur.

If the META-INF/metadata.xml file exists, its contents MUST be valid XML with namespace-qualified elements to avoid collision with future versions of OCF that MAY specify a particular grammar and namespace for elements and attributes within this file.

The metadata.xml file, if present, MUST NOT be encrypted.

3.5.4 Digital Signatures META-INF/signatures.xml (Optional)

An OPTIONAL signatures.xml file within the META-INF directory at the root level of the container file system holds digital signatures of the container and its contents. This file is an XML document whose root element is <signatures>. The <signatures> element contains child elements of type <Signature> as defined by XML-Signature Syntax and Processing (http://www.w3.org/TR/2002/REC-xmldsig-core-20020212). Signatures can be applied to the publication and any alternate renditions as a whole or to parts of the publication and renditions. XML Signature can specify the signing of any kind of data, not just XML.

The signatures.xml file MUST NOT be encrypted.

When the signatures.xml file is not present, the OCF container provides no information indicating any part of the container is digitally signed at the container level. It is however possible that digital signing exists within any optional alternate contained renditions.

A RELAX NG OCF schema describing the <signatures> element that MUST be the root element of signatures.xml can be found in Appendix A.

When an OCF agent creates a signature of data in a container, it SHOULD add the new signature as the last child <Signature> element of the <signatures> element in the signatures.xml file.

Each <Signature> in the signatures.xml file identifies by IRI the data to which the signature applies, using the XML Signature <Manifest> element and its <Reference> sub-elements. Individual contained files MAY be signed separately or together. Separately signing each file creates a digest value for the resource that can be validated independently. This approach MAY make a Signature element larger. If files are signed together, the set of signed files can be listed in a single XML Signature <Manifest> element and referenced by one or more <Signature> elements.
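
(This sketch is informative.)

The per-resource digest value mentioned above can be illustrated with a short Python sketch. It assumes the SHA-1 digest method and that no transforms apply to the resource; when a transform such as canonicalization is listed, the digest is computed over the transformed output instead. The function name sha1_digest_value is hypothetical.

```python
import base64
import hashlib

def sha1_digest_value(resource_bytes):
    """Return the base64-encoded SHA-1 digest of a resource's raw bytes."""
    digest = hashlib.sha1(resource_bytes).digest()
    return base64.b64encode(digest).decode("ascii")

# Placeholder resource content standing in for a file from the container.
value = sha1_digest_value(b"<html/>")
```

A validator can recompute this value from the file in the container and compare it with the one recorded in the signature, independently of any other signed file.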

Any or all files in the container can be signed in their entirety with the exception of the signatures.xml file since that file will contain the computed signature information. Whether and how the signatures.xml file SHOULD be signed depends on the objective of the signer.

  • If the signer wants to allow signatures to be added or removed from the container without invalidating the signer's signature, the signatures.xml file SHOULD NOT be signed.
  • If the signer wants any addition or removal of a signature to invalidate the signer's signature, the Enveloped Signature transform (defined in Section 6.6.4 of XML Signature) can be used to sign the entire preexisting signature file excluding the <Signature> being created. This transform would sign all previous signatures, and it would become invalid if a subsequent signature were added to the package.
  • If the signer wants the removal of an existing signature to invalidate the signer's signature but also wants to allow the addition of signatures, an XPath transform can be used to sign just the existing signatures. (This is only a suggestion. The particular XPath transform is not a part of the OCF specification.)

XML Signature does not associate any semantics with a signature; however, an agent MAY include semantic information, for example, by adding information to the Signature element that describes the signature. XML Signature describes how additional information can be added to a signature (for example, by using the SignatureProperties element).

(This example is informative.)

The following XML expression shows the content of an example signatures.xml file, and is based on the examples found in Section 2 of XML-Signature Syntax and Processing. It contains one signature, and the signature applies to two resources, OEBFPS/book.html and OEBFPS/images/cover.jpeg, in the container.

<signatures xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <Signature Id="MyFirstSignature" xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
         <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
         <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
         <Reference URI="#Manifest1">
            <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
            <DigestValue>j6lwx3rvEPO0vKtMup4NbeVu8nk=</DigestValue>
         </Reference>
      </SignedInfo>
      <SignatureValue>MC0CFFrVLtRlk=...</SignatureValue>
      <KeyInfo>
         <KeyValue>
            <DSAKeyValue>
               <P>...</P><Q>...</Q><G>...</G><Y>...</Y>
            </DSAKeyValue>
         </KeyValue>
      </KeyInfo>
      <Object>
         <Manifest Id="Manifest1">
            <Reference URI="OEBFPS/book.html">
               <Transforms>
                  <Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
               </Transforms>
            </Reference>
            <Reference URI="OEBFPS/images/cover.jpeg"/>
         </Manifest>
      </Object>
   </Signature>
</signatures>
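As an informative illustration of how a reading system might discover which resources a signature covers, the following Python sketch collects the <Reference> URIs listed in each signature's <Manifest>. The function name signed_resources and the trimmed sample document are assumptions made for this sketch. It does not validate digests or signature values.

```python
import xml.etree.ElementTree as ET

DSIG = "{http://www.w3.org/2000/09/xmldsig#}"


def signed_resources(signatures_xml):
    """Map each Signature Id to the resource URIs listed in its Manifest."""
    root = ET.fromstring(signatures_xml)
    result = {}
    for sig in root.iter(DSIG + "Signature"):
        uris = []
        for manifest in sig.iter(DSIG + "Manifest"):
            # Only direct children of <Manifest> name container resources;
            # the <Reference> inside <SignedInfo> points at the Manifest itself.
            for ref in manifest.findall(DSIG + "Reference"):
                uris.append(ref.get("URI"))
        result[sig.get("Id")] = uris
    return result


# Trimmed sample: SignatureValue and KeyInfo are omitted for brevity.
sample = """<signatures xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <Signature Id="MyFirstSignature" xmlns="http://www.w3.org/2000/09/xmldsig#">
      <SignedInfo>
         <Reference URI="#Manifest1"/>
      </SignedInfo>
      <Object>
         <Manifest Id="Manifest1">
            <Reference URI="OEBFPS/book.html"/>
            <Reference URI="OEBFPS/images/cover.jpeg"/>
         </Manifest>
      </Object>
   </Signature>
</signatures>"""

print(signed_resources(sample))
# {'MyFirstSignature': ['OEBFPS/book.html', 'OEBFPS/images/cover.jpeg']}
```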

3.5.5 Encryption META-INF/encryption.xml (Optional)

An OPTIONAL encryption.xml file within the META-INF directory at the root level of the container file system holds all encryption information on the contents of the container. This file is an XML document whose root element is <encryption>. The <encryption> element contains child elements of type <EncryptedKey> and <EncryptedData> as defined by XML Encryption Syntax and Processing (http://www.w3.org/TR/2002/REC-xmlenc-core-20021210). Each EncryptedKey element describes how one or more container files are encrypted. Consequently, if any resource within the container is encrypted, encryption.xml MUST be present to indicate that the resource is encrypted and provide information on how it is encrypted.

An <EncryptedKey> element describes each encryption key used in the container, while an <EncryptedData> element describes each encrypted file. Each <EncryptedData> element refers to an <EncryptedKey> element, as described in XML Encryption.

A RELAX NG OCF schema describing the <encryption> element that MUST be the root element of encryption.xml can be found in Appendix A.

When the encryption.xml file is not present, the OCF container provides no information indicating any part of the container is encrypted.

OCF encrypts individual files independently, trading off some security for improved performance, allowing the container contents to be incrementally decrypted. Encryption in this way still exposes the directory structure and file naming of the whole package.

OCF uses XML Encryption to provide a framework for encryption, allowing a variety of algorithms to be used. XML Encryption specifies a process for encrypting arbitrary data and representing the result in XML. Even though an OCF container MAY contain non-XML data, XML Encryption can be used to encrypt all data in an OCF container. OCF encryption supports only encryption of whole files. The encryption.xml file, if present, MUST NOT be encrypted.

Encrypted data replaces unencrypted data in an OCF container. For example, if an image named photo.jpeg is encrypted, the contents of the photo.jpeg resource SHOULD be replaced by its encrypted contents. When stored in a ZIP container, streams of data MUST be compressed before they are encrypted; Flate compression MUST be used. Within the ZIP directory, encrypted files SHOULD be stored rather than Flate-compressed.
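The ordering described above (compress first, then encrypt, then store the result uncompressed in the ZIP directory) can be sketched in Python as follows. This is an illustration under stated assumptions: toy_cipher is a stand-in that is NOT real encryption, the function names are hypothetical, and a raw Deflate stream is assumed because the text does not say which framing is intended.

```python
import io
import zipfile
import zlib


def deflate_raw(data):
    # wbits=-15 yields a raw Deflate stream (assumed framing, see lead-in).
    comp = zlib.compressobj(9, zlib.DEFLATED, -15)
    return comp.compress(data) + comp.flush()


def inflate_raw(data):
    return zlib.decompress(data, -15)


def toy_cipher(data, key=0x5A):
    # Placeholder only: a single-byte XOR offers no security. A real agent
    # would apply the algorithm named in encryption.xml instead.
    return bytes(b ^ key for b in data)


def add_encrypted(zf, name, plaintext, encrypt):
    # 1. compress, 2. encrypt, 3. store without further ZIP compression.
    ciphertext = encrypt(deflate_raw(plaintext))
    zf.writestr(name, ciphertext, compress_type=zipfile.ZIP_STORED)


original = b"<html><body>As You Like It</body></html>" * 50
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    add_encrypted(zf, "OEBPS/book.html", original, toy_cipher)

with zipfile.ZipFile(buf) as zf:
    info = zf.getinfo("OEBPS/book.html")
    stored = zf.read("OEBPS/book.html")

# Reading reverses the steps: decrypt first, then decompress.
recovered = inflate_raw(toy_cipher(stored))
print(info.compress_type == zipfile.ZIP_STORED, recovered == original)
```

Compressing before encrypting matters because well-encrypted data is effectively incompressible, which is also why the encrypted entry is stored rather than compressed again.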

It MAY be desired to obfuscate the storage of embedded fonts referenced by an EPUB Publication to tie them to the parent publication and make them more difficult to extract for unrestricted use. In these cases, encryption.xml SHOULD be used to provide requisite font decoding information according to the Font Mangling informational document found at http://www.idpf.org/doc_library/informationaldocs/FontManglingSpec_2.0.1_draft.htm.

The following files MUST NOT be encrypted (regardless of whether default or specific encryption is requested):

  • mimetype
  • META-INF/container.xml
  • META-INF/manifest.xml
  • META-INF/metadata.xml
  • META-INF/signatures.xml
  • META-INF/encryption.xml
  • META-INF/rights.xml
  • EPUB rootfile (the OPF Package file)
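A checker for the list above might look like the following Python sketch. It reports any <EncryptedData> entry in encryption.xml whose <CipherReference> names a file that is never to be encrypted. The function name forbidden_targets is hypothetical, the rootfile paths are assumed to be supplied by the caller (for example, read from container.xml), and URIs are compared literally without percent-decoding.

```python
import xml.etree.ElementTree as ET

ENC = "{http://www.w3.org/2001/04/xmlenc#}"

NEVER_ENCRYPTED = frozenset({
    "mimetype",
    "META-INF/container.xml",
    "META-INF/manifest.xml",
    "META-INF/metadata.xml",
    "META-INF/signatures.xml",
    "META-INF/encryption.xml",
    "META-INF/rights.xml",
})


def forbidden_targets(encryption_xml, rootfile_paths=()):
    """Return the sorted protected paths that encryption.xml marks as encrypted."""
    protected = NEVER_ENCRYPTED | set(rootfile_paths)
    root = ET.fromstring(encryption_xml)
    found = set()
    for data in root.iter(ENC + "EncryptedData"):
        for ref in data.iter(ENC + "CipherReference"):
            if ref.get("URI") in protected:
                found.add(ref.get("URI"))
    return sorted(found)


# Hypothetical encryption.xml that wrongly lists the OPF rootfile.
sample = """<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#">
   <enc:EncryptedData Id="ED1">
      <enc:CipherData><enc:CipherReference URI="OEBPS/book.html"/></enc:CipherData>
   </enc:EncryptedData>
   <enc:EncryptedData Id="ED2">
      <enc:CipherData><enc:CipherReference URI="OEBPS/book.opf"/></enc:CipherData>
   </enc:EncryptedData>
</encryption>"""

print(forbidden_targets(sample, rootfile_paths=["OEBPS/book.opf"]))
# ['OEBPS/book.opf']
```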

Signed resources MAY subsequently be encrypted by using the Decryption Transform for XML Signature. This feature enables an application such as an OCF agent to distinguish data that was encrypted before signing from data that was encrypted after signing. Only data that was encrypted after signing MUST be decrypted before computing the digest used to validate the signature.

(This example is informative.)

In the following example, adapted from Section 2.2.1 of XML Encryption Syntax and Processing, the resource image.jpeg is encrypted using a symmetric key algorithm (AES) and the symmetric key is further encrypted using an asymmetric key algorithm (RSA) with a key of John Smith.

<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container"
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#"
            xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
   <enc:EncryptedKey Id="EK">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-1_5"/>
      <ds:KeyInfo>
         <ds:KeyName>John Smith</ds:KeyName>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherValue>xyzabc</enc:CipherValue>
      </enc:CipherData>
   </enc:EncryptedKey>
   <enc:EncryptedData Id="ED1">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod URI="#EK" Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference URI="image.jpeg"/>
      </enc:CipherData>
   </enc:EncryptedData>
</encryption>

3.5.6 Rights Management META-INF/rights.xml (Optional)

An OPTIONAL file with the name rights.xml within the META-INF directory at the root level of the container file system is a reserved name in a valid OCF container. This location is reserved for digital rights management (DRM) information for trusted exchange of Publications among rights holders, intermediaries, and users. In version 2.0.1 of OCF, there is not a REQUIRED format for DRM information, but a future version of this specification MAY specify a particular format for DRM information.

If the META-INF/rights.xml file exists, it MUST be a well-formed XML document that conforms to XML Namespaces, and its contents SHOULD be valid XML with namespace-qualified elements to avoid collision with future versions of OCF that MAY specify a particular format for this file.

The rights.xml file MUST NOT be encrypted.

When the rights.xml file is not present, the OCF container provides no information indicating any part of the container is rights governed.

4 ZIP Container

OCF's ZIP Container supports the ZIP format as specified by the application note at http://www.pkware.com/business_and_developers/developer/appnote/, but with the following constraints and clarifications:

Here are some details about particular fields in the ZIP archive:

The first file in the ZIP Container MUST be a file by the ASCII name of mimetype which holds the MIME type for the ZIP Container (i.e., application/epub+zip as an ASCII string; no padding, white-space or case change). The file MUST be neither compressed nor encrypted and there MUST NOT be an extra field in its ZIP header. If this is done, then the ZIP Container offers convenient magic number support as described in RFC 2048 and the following will hold true:
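As a minimal sketch of the mimetype requirements, the following Python fragment writes a container whose first entry is an uncompressed mimetype file and then checks the leading bytes. The use of Python's standard zipfile module and the function names are assumptions of this illustration, not part of OCF. The byte offsets follow from the ZIP local file header, whose fixed part is 30 bytes long: with no extra field, the 8-byte file name starts at offset 30 and the file content at offset 38.

```python
import io
import zipfile

MIMETYPE = b"application/epub+zip"


def write_container(files):
    """Build a ZIP Container in memory with mimetype as the first entry."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        # Stored (not compressed); a fresh ZipInfo carries no extra field.
        zf.writestr(zipfile.ZipInfo("mimetype"), MIMETYPE,
                    compress_type=zipfile.ZIP_STORED)
        for name, data in files.items():
            zf.writestr(name, data)
    return buf.getvalue()


def has_epub_magic(data):
    """Check the local file header signature, entry name, and MIME type."""
    return (
        data[0:4] == b"PK\x03\x04"
        and data[30:38] == b"mimetype"
        and data[38:38 + len(MIMETYPE)] == MIMETYPE
    )


container = write_container({"META-INF/container.xml": b"<container/>"})
print(has_epub_magic(container))  # True
```

A file-type sniffer can therefore identify the container from its first 58 bytes without parsing the ZIP directory.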

APPENDIX A: RELAX NG OCF Schema

<?xml version="1.0" encoding="UTF-8"?>
<choice xmlns="http://relaxng.org/ns/structure/1.0"
        ns="urn:oasis:names:tc:opendocument:xmlns:container">
   <element name="container">
      <attribute name="version">
         <value>1.0</value>
      </attribute>
      <element name="rootfiles">
         <oneOrMore>
            <element name="rootfile">
               <attribute name="full-path">
                  <text/>
               </attribute>
               <attribute name="media-type">
                  <text/>
               </attribute>
            </element>
         </oneOrMore>
      </element>
   </element>
   <element name="signatures">
      <oneOrMore>
         <element name="Signature" ns="http://www.w3.org/2000/09/xmldsig#">
            <externalRef href="http://www.w3.org/Signature/2002/07/xmldsig-core-schema.rng"/>
         </element>
      </oneOrMore>
   </element>
   <element name="encryption">
      <oneOrMore>
         <choice>
            <element name="EncryptedData" ns="http://www.w3.org/2001/04/xmlenc#">
               <externalRef href="http://www.w3.org/Encryption/2002/07/xenc-schema.rng"/>
            </element>
            <element name="EncryptedKey" ns="http://www.w3.org/2001/04/xmlenc#">
               <externalRef href="http://www.w3.org/Encryption/2002/07/xenc-schema.rng"/>
            </element>
         </choice>
      </oneOrMore>
   </element>
</choice>

APPENDIX B: Example

The following example demonstrates the use of this OCF format to contain a signed and encrypted EPUB publication with an alternate PDF rendition within a ZIP Container.

Ordered list of files in the ZIP Container:
mimetype
META-INF/container.xml
META-INF/signatures.xml
META-INF/encryption.xml
OEBPS/As You Like It.opf
OEBPS/book.html
OEBPS/images/cover.png
PDF/As You Like It.pdf
The mimetype file:
application/epub+zip
The META-INF/container.xml file:
<?xml version="1.0"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <rootfiles>
      <rootfile full-path="OEBPS/As You Like It.opf"
      media-type="application/oebps-package+xml" />
      <rootfile full-path="PDF/As You Like It.pdf"
      media-type="application/pdf" />
   </rootfiles>
</container>
The META-INF/signatures.xml file:
<?xml version="1.0"?>
<signatures xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
   <Signature Id="AsYouLikeItSignature" xmlns="http://www.w3.org/2000/09/xmldsig#">
   
      <!-- SignedInfo is the information that is actually signed. In this case -->
      <!-- the SHA1 algorithm is used to sign the canonical form of the XML -->
      <!-- documents enumerated in the Object element below -->
      <SignedInfo>
         <CanonicalizationMethod Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
         <SignatureMethod Algorithm="http://www.w3.org/2000/09/xmldsig#dsa-sha1"/>
         <Reference URI="#AsYouLikeIt">
            <DigestMethod Algorithm="http://www.w3.org/2000/09/xmldsig#sha1"/>
            <DigestValue>j6lwx3rvEPO0vKtMup4NbeVu8nk=</DigestValue>
         </Reference>
      </SignedInfo>
      
      <!-- The signed value of the digest above using the DSA algorithm -->
      <SignatureValue>MC0CFFrVLtRlk=...</SignatureValue>
      
      <!-- The key to use to validate the signature -->
      <KeyInfo>
         <KeyValue>
            <DSAKeyValue>
               <P>...</P><Q>...</Q><G>...</G><Y>...</Y>
            </DSAKeyValue>
         </KeyValue>
      </KeyInfo>
      
      
      <!-- The list of documents to sign. Note that the canonical form of XML -->
      <!-- documents is signed while the binary form of the other documents -->
      <!-- is used -->
      <Object>
         <Manifest Id="AsYouLikeIt">
            <Reference URI="OEBPS/As You Like It.opf">
               <Transforms>
                  <Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
               </Transforms>
            </Reference>
            <Reference URI="OEBPS/book.html">
               <Transforms>
                  <Transform Algorithm="http://www.w3.org/TR/2001/REC-xml-c14n-20010315"/>
               </Transforms>
            </Reference>
            <Reference URI="OEBPS/images/cover.png" />
            <Reference URI="PDF/As You Like It.pdf" />
         </Manifest>
      </Object>
   </Signature>
</signatures>
The META-INF/encryption.xml file:
<?xml version="1.0"?>
<encryption xmlns="urn:oasis:names:tc:opendocument:xmlns:container" 
            xmlns:enc="http://www.w3.org/2001/04/xmlenc#"
            xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
   
   <!-- The RSA encrypted AES-128 symmetric key used to encrypt the data -->
   <enc:EncryptedKey Id="EK">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#rsa-1_5"/>
      <ds:KeyInfo>
         <ds:KeyName>John Smith</ds:KeyName>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherValue>xyzabc...</enc:CipherValue>
      </enc:CipherData>
   </enc:EncryptedKey>
   
   
   <!-- Each EncryptedData block identifies a single document that has been -->
   <!-- encrypted using the AES-128 algorithm. The data remains stored in its -->
   <!-- encrypted form in the original file within the container. -->
   <enc:EncryptedData Id="ED1">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod URI="#EK" Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference URI="OEBPS/book.html"/>
      </enc:CipherData>
   </enc:EncryptedData>
   
   <enc:EncryptedData Id="ED2">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod URI="#EK"  Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference URI="OEBPS/images/cover.png"/>
      </enc:CipherData>
   </enc:EncryptedData>
   
   <enc:EncryptedData Id="ED3">
      <enc:EncryptionMethod Algorithm="http://www.w3.org/2001/04/xmlenc#kw-aes128"/>
      <ds:KeyInfo>
         <ds:RetrievalMethod URI="#EK" Type="http://www.w3.org/2001/04/xmlenc#EncryptedKey"/>
      </ds:KeyInfo>
      <enc:CipherData>
         <enc:CipherReference URI="PDF/As You Like It.pdf"/>
      </enc:CipherData>
   </enc:EncryptedData>
</encryption>
The OEBPS/As You Like It.opf file:
<?xml version="1.0"?>
<package version="2.0" xmlns="http://www.idpf.org/2007/opf" unique-identifier="Package-ID">
   <metadata xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:opf="http://www.idpf.org/2007/opf">
      <dc:identifier id="Package-ID">ebook:guid-6B2DF0030656ED9D8</dc:identifier>
      <dc:title>As You Like It</dc:title>
      <dc:creator opf:role="aut">William Shakespeare</dc:creator>
      <dc:identifier>0-7410-1455-6</dc:identifier>
      <dc:subject></dc:subject>
      <dc:type></dc:type>
      <dc:date opf:event="publication">3/24/2000</dc:date>
      <dc:date opf:event="copyright">1/1/9999</dc:date>
      <dc:identifier opf:scheme="ISBN">0-7410-1455-6</dc:identifier>
      <dc:publisher>Project Gutenberg</dc:publisher>
      <dc:language>en</dc:language>
   </metadata>
   <manifest>
      <item id="4915" href="book.html" media-type="application/xhtml+xml"/>
      <item id="7184" href="images/cover.png" media-type="image/png" />
      <item id="ncx" href="toc.ncx" media-type="application/x-dtbncx+xml" />
   </manifest>
   <spine toc="ncx">
      <itemref idref="4915"/>
   </spine>
</package>
The OEBPS/book.html file:

This file would be binary and be encrypted. Its decrypted contents might look something like:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
                      "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
   <head>
      ...
   </head>
   <body>
      ...
      <img src="images/cover.png" alt="Cover image: a picture of the Bard of Avon" />
      ...
   </body>
</html>
The OEBPS/toc.ncx file:

This file contains the navigation control file (NCX) for the publication as defined by OPF.

The OEBPS/images/cover.png file:

This file contains the encrypted binary bytes of the cover.png file.

The PDF/As You Like It.pdf file:

This file contains the encrypted binary bytes of the PDF file.

APPENDIX C: CONTRIBUTORS

This specification has been developed through a cooperative effort, bringing together publishers, vendors, software developers, and experts in the relevant standards.

Version 2.0.1 of this specification was prepared by the International Digital Publishing Forum's EPUB Maintenance Working Group. Active members of the working group at the time of publication of revision 2.0.1 were:

Version 1.0 of this specification was prepared by the International Digital Publishing Forum's Unified OEBPS Container Format Working Group. Active members of the working group at the time of publication of revision 1.0 were: