The Apache FOP Project

Apache™ FOP: PDF/A (ISO 19005)

Overview

PDF/A is a standard which turns PDF into an "electronic document file format for long-term preservation". PDF/A-1 is the first part of the standard and is documented in ISO 19005-1:2005(E). Work on PDF/A-2 is in progress at AIIM.

Design documentation on PDF/A can be found on FOP's Wiki on the PDFAConformanceNotes page.

Implementation Status

PDF/A-1b is implemented to the degree that FOP supports the creation of the elements described in ISO 19005-1.

Tests have been performed against jHove and Adobe Acrobat 7.0.7 (Preflight function). FOP does not validate completely against Apago's PDF Appraiser. The reasons are unknown, because a full license is required to obtain a detailed error report.

PDF/A-1a is not yet implemented, mostly because it requires tagged PDF, which is not yet available in FOP.

Usage (command line)

To activate PDF/A-1b from the command line, specify "-pdfprofile PDF/A-1b" as a parameter. If one of the PDF/A validation rules is violated, an error message is displayed and processing stops.

Usage (embedded)

When FOP is embedded in another Java application, you can activate the PDF/A-1b profile by setting an option in the user agent's renderer options. Here's an example:

    // Activate the PDF/A-1b profile through the renderer options
    userAgent.getRendererOptions().put("pdf-a-mode", "PDF/A-1b");
    Fop fop = fopFactory.newFop(MimeConstants.MIME_PDF, userAgent);
    [..]

If one of the validation rules of PDF/A is violated, a PDFConformanceException (a descendant of RuntimeException) is thrown.

PDF/A in Action

There are a number of things that must be looked after if you activate a PDF/A profile. If you receive a PDFConformanceException, have a look at the following list (not necessarily comprehensive):

PDF profile compatibility

The PDF profiles "PDF/X-3:2003" and "PDF/A-1b" are compatible and can both be activated at the same time.

Interoperability

There has been some confusion about the namespace for the PDF/A indicator in the XMP metadata. At least three variants have been seen in the wild:

| Namespace URI | Status |
|---|---|
| http://www.aiim.org/pdfa/ns/id.html | obsolete, from an early draft of ISO-19005-1, used by Adobe Acrobat 7.x |
| http://www.aiim.org/pdfa/ns/id | obsolete, found in the original ISO 19005-1:2005 document |
| http://www.aiim.org/pdfa/ns/id/ | correct, found in the technical corrigendum 1 of ISO 19005-1:2005 |

If you get an error validating a PDF/A file in Adobe Acrobat 7.x it doesn't mean that FOP did something wrong. It's Acrobat that is at fault. This is fixed in Adobe Acrobat 8.x which uses the correct namespace as described in the technical corrigendum 1.
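
When a validator and FOP disagree, it can help to check which of the three namespace variants a given file's XMP packet actually declares. The following is a minimal, standalone sketch of such a check and is not part of FOP. The class name PdfaNamespaceCheck, its method names, and the sample XMP fragment are all hypothetical, and the sketch assumes the namespace is bound to the conventional "pdfaid" prefix (XML allows any prefix, so a robust tool would use a real XML parser instead of a regular expression).

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of FOP: classifies the PDF/A identification
// namespace declared in an XMP packet against the three known variants.
final class PdfaNamespaceCheck {

    enum Variant { EARLY_DRAFT, ORIGINAL_2005, CORRIGENDUM_1, UNKNOWN }

    // Assumption: the namespace is declared with the conventional "pdfaid" prefix.
    private static final Pattern PDFAID_NS =
            Pattern.compile("xmlns:pdfaid\\s*=\\s*([\"'])(.*?)\\1");

    // Maps a namespace URI to the variant it corresponds to.
    static Variant classify(String uri) {
        if (uri == null) {
            return Variant.UNKNOWN;
        }
        switch (uri) {
            case "http://www.aiim.org/pdfa/ns/id.html":
                return Variant.EARLY_DRAFT;    // obsolete, used by Acrobat 7.x
            case "http://www.aiim.org/pdfa/ns/id":
                return Variant.ORIGINAL_2005;  // obsolete, original ISO 19005-1:2005
            case "http://www.aiim.org/pdfa/ns/id/":
                return Variant.CORRIGENDUM_1;  // correct, technical corrigendum 1
            default:
                return Variant.UNKNOWN;
        }
    }

    // Extracts the URI bound to the "pdfaid" prefix, if the packet declares one.
    static Optional<String> findNamespace(String xmp) {
        Matcher m = PDFAID_NS.matcher(xmp);
        return m.find() ? Optional.of(m.group(2)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Made-up XMP fragment for illustration only.
        String xmp = "<rdf:Description rdf:about=\"\" "
                + "xmlns:pdfaid=\"http://www.aiim.org/pdfa/ns/id/\">"
                + "<pdfaid:part>1</pdfaid:part>"
                + "<pdfaid:conformance>B</pdfaid:conformance>"
                + "</rdf:Description>";
        String uri = findNamespace(xmp).orElse(null);
        System.out.println(uri + " -> " + classify(uri));
    }
}
```

A file that reports EARLY_DRAFT or ORIGINAL_2005 was written against an obsolete namespace, while a file that reports CORRIGENDUM_1 may still be rejected by a validator that expects one of the obsolete variants, as described above for Acrobat 7.x.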