The Apache FOP Project

The Apache™ FOP Project

Apache™ FOP: Accessibility

Overview

This page describes the accessibility features of Apache™ FOP. Section 508 defines accessibility in the context of electronic documents for the USA but other countries have similar requirements.

Accessibility features are available only for the PDF output format and there are some implementation limitations. Also, certain actions must be undertaken by the content creator to ensure that FOP can create a truly accessible document.

Enabling accessibility

There are 3 ways to enable accessibility:

  1. Command line: The command line option -a turns on accessibility:

    fop -a -fo mydocument.fo -pdf mydocument.pdf
    
  2. Embedding:

    userAgent.setAccessibility(true);
    
  3. Optional setting in fop.xconf file:

    <fop version="1.0">
      <accessibility>true</accessibility>
      ...
    </fop>
    

When accessibility is enabled, additional information relating to the logical structure of the document is added to the PDF. That information allows the PDF viewer (or a text-to-speech application) to retrieve the natural reading order of the document.

The processing of the logical structure is memory-hungry. You may need to adjust the Java heap size in order to process larger files.

Changes to your XSL-FO input files

Apache FOP cannot automatically generate accessible PDFs. Some of the work can only be performed by the content provider. Following are some changes that may be necessary to your XSL-FO content in order to generate really accessible documents:

Customized Tagging

The PDF Reference defines a set of standard Structure Types to tag content. For example, ‘P’ is used for identifying paragraphs, ‘H1’ to ‘H6’ for headers, ‘L’ for lists, ‘Div’ for block-level groups of elements, etc. This standard set is aimed at improving interoperability between applications producing or consuming PDF.

FOP provides a default mapping of Formatting Objects to elements from that standard set. For example, fo:page-sequence is mapped to ‘Part’, fo:block is mapped to ‘P’, fo:list-block to ‘L’, etc.

You may want to customize that mapping to improve the accuracy of the tagging or deal with particular FO constructs. For example, you may want to make use of the ‘H1’ to ‘H6’ tags to make the hierarchical structure of the document appear in the PDF. This is achieved by using the role XSL-FO property:

...
<fo:block role="H1" font-weight="bold">I. A Level 1 Heading</fo:block>
<fo:block>This is the first paragraph of the first section...</fo:block>
...

If a non-standard structure type is specified, FOP will issue a warning and fall back to the default tag associated to the Formatting Object.

Treating Content as Artifact

If your document has content that is not meant to appear in the structure tree, you can wrap it in an fo:wrapper element whose role property has been set to ‘artifact’. For example:

<fo:block>blah... blah...
  <fo:wrapper role="artifact">Funny graphical thing without logical meaning</fo:wrapper>
  blah... blah... </fo:block>

This special value for the role property can also be applied to fo:static-content elements; indeed such elements are often used to contain page header information, that is there purely as a reading aid for sighted people and is meaningless when read out loud by a screen reader.

<fo:static-content flow-name="xsl-region-before" role="artifact">
  <fo:block>Page <fo:page-number/></fo:block>
</fo:static-content>

Scope of Header Table Cells

In XSL-FO, tables are inherently defined row by row. The fo:table-header element can be used to define ‘header rows’, in which each cell is a header of the corresponding column (like TH cells in HTML). The cell is said to have a column scope.

There is no way, however, to define ‘row headers’: cells that have a row scope. Of course it is possible to style a cell to make it look like a header (for example, by using a bolder font), but that won’t be reflected in the structure of the document.

When creating accessible documents, it is desirable to have that information, as it can be used by a screen reader to help the user build a mental representation of the table.

For that purpose, FOP defines the fox:header extension property. If an fo:table-column element has this property set to true, then the corresponding cells will receive the TH structure type, and the Scope structure attribute will have one of the following value:

fox:header | ---------|----------- Value: | true | false | Initial: | false | Inherited: | no | Applies to: | fo:table-column |

If for some reason a cell inside a header column is not meant to be a header cell, the role property can be used to override the default behavior and set the structure type to TD.

For example, the following table:

<fo:table width="100%" table-layout="fixed">
  <fo:table-column xmlns:fox="http://xmlgraphics.apache.org/fop/extensions" fox:header="true" column-width="proportional-column-width(1)"/>
  <fo:table-column column-width="proportional-column-width(1)"/>
  <fo:table-column column-width="proportional-column-width(1)"/>
  <fo:table-header font-weight="bold">
    <fo:table-row>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Header Scope = Both</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Header Scope = Column</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Header Scope = Column</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-header>
  <fo:table-body>
    <fo:table-row>
      <fo:table-cell border="1pt solid black" padding-left="1pt" font-weight="bold">
        <fo:block>Header Scope = Row</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 1.1</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 1.2</fo:block>
      </fo:table-cell>
    </fo:table-row>
    <fo:table-row>
      <fo:table-cell border="1pt solid black" padding-left="1pt" font-weight="bold">
        <fo:block>Header Scope = Row</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 2.1</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 2.2</fo:block>
      </fo:table-cell>
    </fo:table-row>
    <fo:table-row>
      <fo:table-cell border="1pt solid black" padding-left="1pt" role="TD">
        <fo:block>Non-header</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 3.1</fo:block>
      </fo:table-cell>
      <fo:table-cell border="1pt solid black" padding-left="1pt">
        <fo:block>Cell 3.2</fo:block>
      </fo:table-cell>
    </fo:table-row>
  </fo:table-body>
</fo:table>

will be rendered into this:

Header Scope = Both Header Scope = Column Header Scope = Column
Header Scope = Row Cell 1.1 Cell 1.2
Header Scope = Row Cell 2.1 Cell 2.2
Non-header Cell 3.1 Cell 3.2

Testing

Accessible PDFs can be tested, for example, using Adobe Acrobat Professional. Its Accessibility Check feature creates a report indicating any deficiencies with a PDF document. Alternatively, you can just let a screen reader read the document aloud.

Limitations

Accessibility support in Apache FOP is relatively new, so there are certain limitations. Please help us identify and close any gaps.

Many resources providing guidance about creating accessible documents can be found on the web. Here are a few links, along with additional resources around the topic: