The Apache FOP Project

Apache™ FOP: PDF/A (ISO 19005)

Overview

PDF/A is a standard which turns PDF into an "electronic document file format for long-term preservation". PDF/A-1 is the first part of the standard and is documented in ISO 19005-1:2005(E). PDF/A-2 and PDF/A-3 have since been published as later parts of the standard (ISO 19005-2 and ISO 19005-3).

Design documentation on PDF/A can be found on FOP's Wiki on the PDFAConformanceNotes page.

Implementation Status

PDF/A-1b is implemented to the degree that FOP supports the creation of the elements described in ISO 19005-1.

Tests have been performed against jHove and Adobe Acrobat 7.0.7 (Preflight function). FOP does not validate completely against Apago's PDF Appraiser. The reasons are unknown because no full license was available to obtain a detailed error report.

PDF/A-1a is based on PDF/A-1b and adds accessibility features (such as Tagged PDF). This format is available within the limitations described on the Accessibility page.

PDF/A-2 supports new features added with PDF 1.5, 1.6 and 1.7

PDF/A-3 allows embedding of arbitrary file formats

PDF/A-1b, PDF/A-2b and PDF/A-3b do not require accessibility to be enabled

PDF/A-1a, PDF/A-2a and PDF/A-3a require accessibility to be enabled

PDF/A-2u and PDF/A-3u require Unicode to be used and accessibility to be enabled
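
The rules in the list above can be captured in a small lookup helper, which may be handy for checking a configuration before handing it to FOP. This is only a sketch: the class `PdfaModeRequirements` and its methods are hypothetical names and not part of FOP, and the helper encodes nothing beyond what the list states.

```java
/** Hypothetical helper (not part of FOP) that mirrors the list above. */
final class PdfaModeRequirements {

    private PdfaModeRequirements() { }

    /** Returns true if the list above says accessibility must be enabled for this mode. */
    static boolean requiresAccessibility(String mode) {
        char level = conformanceLevel(mode);
        // Levels "a" and "u" are listed as requiring accessibility; level "b" is not.
        return level == 'a' || level == 'u';
    }

    /** Returns true if the list above says Unicode must be used for this mode. */
    static boolean requiresUnicode(String mode) {
        // Only the "u" levels are listed with a Unicode requirement.
        return conformanceLevel(mode) == 'u';
    }

    /** Extracts the trailing level letter, accepting only the modes named in the list. */
    private static char conformanceLevel(String mode) {
        if (mode == null || !mode.matches("PDF/A-(1[ab]|[23][abu])")) {
            throw new IllegalArgumentException("Not a mode from the list: " + mode);
        }
        return mode.charAt(mode.length() - 1);
    }
}
```

For example, `PdfaModeRequirements.requiresAccessibility("PDF/A-1a")` yields `true`, which corresponds to setting the accessibility option alongside the PDF/A mode in the configuration.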

Modes

Usage (fop.xconf)

Add a section to the PDF renderer configuration that sets the PDF/A mode and the PDF version.

<fop version="1.0">
  <accessibility>true</accessibility>
  <renderers>
    <renderer mime="application/pdf">
      <pdf-a-mode>PDF/A-1a</pdf-a-mode>
      <version>1.4</version>
    </renderer>
  </renderers>
</fop>

Usage (command line)

To activate PDF/A-1b from the command-line, specify "-pdfprofile PDF/A-1b" as a parameter. If there is a violation of one of the validation rules for PDF/A, an error message is presented and the processing stops.

PDF/A-1a is enabled by specifying "-pdfprofile PDF/A-1a".
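
If FOP is launched from another Java program rather than typed at a shell, the same parameter can be assembled as an argument list. The sketch below assumes a launcher named `fop` is on the PATH and uses the standard `-fo` and `-pdf` options for the input and output files; the class `FopCommandLine` and the file names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper that assembles a FOP command line with a PDF/A profile. */
final class FopCommandLine {

    private FopCommandLine() { }

    /** Builds the argument list; nothing is executed here. */
    static List<String> withPdfProfile(String profile, String foFile, String pdfFile) {
        List<String> command = new ArrayList<>();
        command.add("fop");          // assumption: launcher script is on the PATH
        command.add("-pdfprofile");  // the parameter described above
        command.add(profile);        // e.g. "PDF/A-1b" or "PDF/A-1a"
        command.add("-fo");          // XSL-FO input file
        command.add(foFile);
        command.add("-pdf");         // PDF output file
        command.add(pdfFile);
        return command;
    }
}
```

The resulting list can be passed to `new ProcessBuilder(command).inheritIO().start()`, so that any PDF/A validation error message appears on the console of the calling process.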

Usage (embedded)

When FOP is embedded in another Java application you can set a special option on the renderer options in the user agent to activate the PDF/A-1b profile. Here's an example:

userAgent.getRendererOptions().put("pdf-a-mode", "PDF/A-1b");
Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent);
[..]

If one of the validation rules of PDF/A is violated, a PDFConformanceException (descendant of RuntimeException) is thrown.

For PDF/A-1a, just use the string "PDF/A-1a" instead of "PDF/A-1b".

PDF/A in Action

There are a number of things that must be looked after if you activate a PDF/A profile. If you receive a PDFConformanceException, have a look at the following list (not necessarily comprehensive):

There are additional requirements if you want to enable PDF/A-1a (Tagged PDF). In particular, you must specify the natural language and provide alternative descriptions for images. Please refer to the Accessibility page for details.

PDF profile compatibility

The PDF profiles "PDF/X-3:2003" and "PDF/A-1b" (or "PDF/A-1a") are compatible and can both be activated at the same time.
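
In embedded use, activating both profiles amounts to putting two entries into the renderer options map. The sketch below builds such a map with plain Java collections. The key `"pdf-a-mode"` is the one used in the embedded example; the key `"pdf-x-mode"` is an assumption here and should be checked against the PDF/X documentation. The class name `CombinedPdfProfiles` is hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper that collects renderer options for both profiles. */
final class CombinedPdfProfiles {

    private CombinedPdfProfiles() { }

    /** Returns renderer options that request PDF/A-1b and PDF/X-3:2003 together. */
    static Map<String, Object> rendererOptions() {
        Map<String, Object> options = new LinkedHashMap<>();
        options.put("pdf-a-mode", "PDF/A-1b");
        // Assumption: "pdf-x-mode" is the option key for the PDF/X profile.
        options.put("pdf-x-mode", "PDF/X-3:2003");
        return options;
    }
}
```

These entries could then be copied into the user agent with `userAgent.getRendererOptions().putAll(CombinedPdfProfiles.rendererOptions())` before the `Fop` instance is created.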

Interoperability

There has been some confusion about the namespace for the PDF/A indicator in the XMP metadata. At least three variants have been seen in the wild:

| Namespace | Status |
| --- | --- |
| http://www.aiim.org/pdfa/ns/id.html | obsolete, from an early draft of ISO 19005-1, used by Adobe Acrobat 7.x |
| http://www.aiim.org/pdfa/ns/id | obsolete, found in the original ISO 19005-1:2005 document |
| http://www.aiim.org/pdfa/ns/id/ | correct, found in the technical corrigendum 1 of ISO 19005-1:2005 |

If you get an error validating a PDF/A file in Adobe Acrobat 7.x it doesn't mean that FOP did something wrong. It's Acrobat that is at fault. This is fixed in Adobe Acrobat 8.x which uses the correct namespace as described in the technical corrigendum 1.
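
When diagnosing a validation failure, it can help to check which of the three namespace variants a tool expects or a file contains. The following sketch classifies a namespace URI by exact string comparison, since the variants differ only in their final characters. The class `PdfaIdNamespace` and its enum are hypothetical and not part of FOP.

```java
/** Hypothetical classifier for the PDF/A indicator namespace variants listed above. */
final class PdfaIdNamespace {

    enum Status { CORRECT, OBSOLETE, UNKNOWN }

    private PdfaIdNamespace() { }

    /** Compares the URI exactly; the trailing slash is significant. */
    static Status classify(String uri) {
        if (uri == null) {
            return Status.UNKNOWN;
        }
        switch (uri) {
            case "http://www.aiim.org/pdfa/ns/id/":
                return Status.CORRECT;   // technical corrigendum 1
            case "http://www.aiim.org/pdfa/ns/id.html":  // early draft, Acrobat 7.x
            case "http://www.aiim.org/pdfa/ns/id":       // original ISO 19005-1:2005
                return Status.OBSOLETE;
            default:
                return Status.UNKNOWN;
        }
    }
}
```

The URI is deliberately not trimmed or normalized, because normalizing away the trailing slash would turn the correct variant into an obsolete one.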

Metadata example

See this page for more info

[..]
</fo:layout-master-set>
<fo:declarations>
  <x:xmpmeta xmlns:x="adobe:ns:meta/">
    <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
      <rdf:Description rdf:about="" xmlns:dc="http://purl.org/dc/elements/1.1/">
        <dc:title><rdf:Alt><rdf:li xml:lang="x-default">title</rdf:li></rdf:Alt></dc:title>
        <dc:creator><rdf:Seq><rdf:li>Document author</rdf:li></rdf:Seq></dc:creator>
        <dc:description><rdf:Alt><rdf:li xml:lang="x-default">Document subject</rdf:li></rdf:Alt></dc:description>
      </rdf:Description>
    </rdf:RDF>
  </x:xmpmeta>
</fo:declarations>