
Testing PDFs

[Narrator:] In previous chapters we saw how to create accessible PDFs using different software tools. So we have an existing PDF. Maybe we created it ourselves or maybe we received it from somebody else.

How can we be sure that it is really accessible? There are two basic approaches to test the level of accessibility of PDF files. Following the manual approach, we can use a screen reader and read the document like a person with visual disabilities would. For this, we need to learn what a screen reader is and how it works. For more information, watch the chapter dedicated to screen readers.

When following the automated approach, we can use software tools that analyse the document by testing different accessibility criteria for us. Both approaches have their advantages and disadvantages.

The manual check

When using a screen reader, we have to test the document for several criteria. Does the navigation work? Can we reach the different chapters via the bookmarks? Can we navigate from heading to heading to reach the chapters and subchapters? Can we use the table of contents if we created one? Can we read all the text or is some hidden as part of a graphic? Can we recognise lists and table elements? Do all non-text elements have alternative descriptions so that people with visual disabilities can get information on their content? Does the reading sequence make sense or do the elements seem to appear in a random order? Do all elements have sufficient size and colour contrast? Are the letters of the fonts easy to recognise? Are the text sizes large enough? Does the document have metadata? Do we have a title and a description so that a person can get an idea of what kind of information this document has to offer? Was the language set, so that a screen reader can start reading using the correct voice? What about the author?

Reading a PDF

Let's try it. We will manually test the document we created before. We have opened the document using Acrobat Reader running on Windows. Let's do some basic tests. Can we select, copy and paste text, or was the text stored in the form of a graphic? We select text and copy it into Notepad. Looks good.

Do the images have alternative text? If we move the mouse over an image, the alternative text is shown in a small pop-up. This is the text that is spoken by a screen reader.

Can we see bookmarks in the left-hand pane of the window? You might have to open the left-hand pane and select the bookmark icon to see the bookmarks. We see the structure of the document presented in a tree hierarchy. The different levels can be expanded and collapsed. By clicking on a bookmark, we can navigate directly to the relevant chapter or subchapter.

When we open the File menu and select the Properties menu item, the Document Properties pop-up will open. Here we can find the meta information of the document: the title, the author, a subject and the keywords. In the lower part of the window, we can check if the document was stored using tagging. This does not indicate that the tags are complete or are used in the correct way, but at least we know that the document offers tagging information. Clicking the Security tab, we can check if the document is password protected. Please note that accessible documents should not use password protection as some screen readers have problems with it. If we activate the Advanced tab, we can see if a language is defined and if it matches the content of the document. If no language is defined, then the screen reader has to guess one and the user might have to switch the document language manually.
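Some of the properties shown in the Document Properties window can also be spotted roughly in the file itself. The following is a minimal sketch, not a validator. It searches the raw bytes of a PDF for the /Marked entry (the tagging marker), the /Lang entry (the document language) and the /Encrypt entry (security). It assumes that these entries are stored uncompressed, which is often not the case in newer PDFs, so a missing hit proves nothing. The function name quick_pdf_hints and the sample bytes are made up for this illustration.

```python
import re

def quick_pdf_hints(data: bytes) -> dict:
    """Crude heuristic: look for accessibility-related keys in raw PDF bytes.

    Assumption: the document catalog is not stored in a compressed object
    stream. If it is, every hint is reported as absent.
    """
    lang = re.search(rb"/Lang\s*\(([^)]*)\)", data)
    return {
        # "/Marked true" inside /MarkInfo signals that the file claims to be tagged
        "claims_tagging": re.search(rb"/Marked\s+true", data) is not None,
        # Matches entries such as "/Lang (en-GB)"; hex strings are not handled
        "language": lang.group(1).decode("latin-1") if lang else None,
        # An /Encrypt entry means some form of protection is applied
        "encrypted": b"/Encrypt" in data,
    }

# Hypothetical fragment of a document catalog, for demonstration only
sample = b"<< /Type /Catalog /MarkInfo << /Marked true >> /Lang (en-GB) >>"
hints = quick_pdf_hints(sample)
print(hints)
```

To try this on a real file, the bytes can be read with open(path, "rb").read() and passed to the function. As with the Document Properties window, a positive result only tells us that tagging information is claimed, not that the tags are complete or correct.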

We have now tested a range of different accessibility criteria, but we cannot test them all without additional tools or a screen reader. So let's open the NVDA screen reader to start reading the document.

[Screen reader:] Welcome to NVDA dialog Welcome to NVDA! Use Cap..., Auto..., Show..., OK button.

[Narrator:] The NVDA screen reader welcomes us and informs us about the front window of the current application. Let's start reading.

[Screen reader:] alice demo.pdf Adobe Reader DC. Preview. Alice's Adventure in Wonderland collapsed. Heading level 1, page 1. Alice's Adventure in Wonderland. Alice's Adventure in Wonderland, commonly shortened to Alice in Wonderland, is an 1865 novel written by English mathematician Charles Lutwidge Dodgson under the pseudonym Lewis Carroll. It tells of a girl named Alice falling through a rabbit hole into a fantasy world. More information at link Website of Alice. Graphic Alice, the Rabbit and the Mad Hatter at the Tea party. Heading level 2. Publication timeline. The following list is a timeline of major publication events related to Alice's Adventure in Wonderland.

[Narrator:] We stop here for a moment. Have you noticed how the headings, links and image descriptions are spoken? The screen reader uses the structure tags of the document to offer us not only the content, but its role in the document. Let's check how the list is presented.

[Screen reader:] List with four items. 1865. First UK edition, the second printing. 1865. First US edition, the first printing of above. One…

[Narrator:] The screen reader informs us not only about the presence of the list, but also about the number of elements inside it. This even works if there are nested lists inside a list, so a person with visual disabilities can get a clear picture of the position of the different information elements.

[Screen reader:] List of characters. Heading level 2. Heading level 2. List of characters. Table with 6 rows and 3 columns. Row 1, column 1, Position. Column 2, Name. Column 3, Description. Row 2, Position. Column 1-1. Name, column 2, Alice. Description, column 3, A mid-Victorian era child, Alice, unintentionally goes on an underground adventure after accidentally falling down a rabbit hole into Wonderland. Row 3, Position. Column 1-2, Name, The Caterpillar. Description, column 3, The Caterpillar is a hookah-smoking caterpillar…

[Narrator:] The NVDA screen reader introduces the table by describing its layout. How many columns does the table have? How many rows does it have? When entering the table body, each data item will be preceded by its position and the name of the heading. So we can get a clear understanding of where we are and what this information describes. This information can be extremely useful if not reading the document sequentially. Imagine the user did a search for a specific word. If this word were part of the table data, the user would immediately be informed that the search hit was found in a table at a specific position.

But how can we navigate inside the document? Let's navigate from heading to heading by pressing the H key on our keyboard.

[Screen reader:] Synopsis, heading level 2. Chapter One. Down the Rabbit Hole, heading level 3. Chapter Two. The Pool of Tears, heading level 3.

[Narrator:] We can do the same backwards by pressing Shift + H.

[Screen reader:] Chapter One. Down the Rabbit Hole, heading level 3. Synopsis, heading level 2. List of characters, heading level 2. Page 1, Publication timeline, heading level 2. Alice's Adventure in Wonderland, heading level 1.

[Narrator:] Following this strategy, we can navigate from heading to heading, from link to link, from graphic to graphic, from list to list, from table to table, and so forth. Without tagging information, a user would need to read the document sequentially without any possibility to navigate inside the document. This gets more annoying with bigger documents. By reading the document with a screen reader we notice the sequence in which the different elements are spoken. Any anomalies, where the optical presentation differs from the screen reader's presentation, need to be fixed so that the logical structure is preserved for a screen reader user. We can also navigate using the bookmarks.

[Screen reader:] File tools button. Home button. Toolbar. Start. Preview. Alice's Adventure in Wonderland collapsed. Expanded. Publication timeline. List of characters. Synopsis expanded. Chapter One. Down the Rabbit Hole.
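Navigation by element type, as with the H and Shift + H keys earlier, can be thought of as a search through the tagged elements in reading order. The sketch below illustrates that idea only. It is not how NVDA is implemented, and the function name next_of_type and the simplified element list are assumptions made for this example.

```python
def next_of_type(elements, start, wanted, backwards=False):
    """Return the index of the next element of type `wanted`, or None."""
    step = -1 if backwards else 1
    i = start + step
    while 0 <= i < len(elements):
        if elements[i][0] == wanted:
            return i
        i += step
    return None

# (type, text) pairs in reading order, loosely based on the demo document
doc = [
    ("heading", "Alice's Adventure in Wonderland"),
    ("paragraph", "It tells of a girl named Alice falling through a rabbit hole."),
    ("graphic", "Alice, the Rabbit and the Mad Hatter at the Tea party"),
    ("heading", "Publication timeline"),
    ("list", "List with four items"),
    ("heading", "List of characters"),
    ("table", "Table with 6 rows and 3 columns"),
]

pos = next_of_type(doc, 0, "heading")                    # like pressing H
back = next_of_type(doc, pos, "heading", backwards=True)  # like Shift + H
print(doc[pos][1], "/", doc[back][1])
```

If the elements carry no type information, as in an untagged document, such a search has nothing to match against, which is why the user is left with sequential reading.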

[Narrator:] Do not forget to check if the different document elements have sufficient colour contrast. To learn how to check the colour contrast, please refer to the chapter "Fonts, sizes and colours".
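As a rough illustration of what a colour contrast check computes, the sketch below follows the contrast ratio definition of the Web Content Accessibility Guidelines (WCAG 2.x): the relative luminance of each colour is calculated, and the ratio is (lighter + 0.05) / (darker + 0.05). WCAG level AA asks for at least 4.5:1 for normal-sized text. The function names are chosen for this example, and the sketch is no substitute for a dedicated contrast tool.

```python
def relative_luminance(rgb):
    """WCAG 2.x relative luminance of an sRGB colour given as 0-255 integers."""
    def linear(channel):
        c = channel / 255
        # Piecewise sRGB linearisation as given in the WCAG 2.x definition
        return c / 12.92 if c <= 0.03928 else ((c + 0.055) / 1.055) ** 2.4
    r, g, b = (linear(v) for v in rgb)
    return 0.2126 * r + 0.7152 * g + 0.0722 * b

def contrast_ratio(foreground, background):
    """Contrast ratio between two colours; the order of arguments is irrelevant."""
    lighter, darker = sorted(
        (relative_luminance(foreground), relative_luminance(background)),
        reverse=True,
    )
    return (lighter + 0.05) / (darker + 0.05)

black, white, grey = (0, 0, 0), (255, 255, 255), (118, 118, 118)
ratio = contrast_ratio(black, white)
print(round(ratio, 2), contrast_ratio(grey, white) >= 4.5)
```

Black on white gives the highest possible ratio, while light grey text on a white background quickly drops below the 4.5:1 threshold.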

The automated check

The manual check is time-consuming. First we have to know how to operate a screen reader, which can be a demanding task. We have to check our document against a list of criteria. Depending on the size of the document and the number of elements, this method can be prone to errors. Do all images have an alternative description or did we miss one? Wouldn't it be useful to have a program that helps us to find all the accessibility issues with one mouse click? The Adobe Acrobat program offers us this. Let's see how this works.

We start the Adobe Acrobat program and open the document we created before. Compared with Acrobat Reader, Acrobat shows us much more information about the document. For example, we can see the tagging information directly in the left-hand window pane. If your Acrobat program does not show the tagging icon, you can activate it with the right mouse button on the Tags menu entry. We can now expand the different tag levels to see all the different elements. If you are working with web pages on a code level, most of the elements will look familiar to you.

We have headings. Here, a heading level 1, and here, a heading level 2. We have images, and lists with list items. We have tables, with table rows. And, inside the table rows, we have table headers and table data. The naming and organisation of the tags looks similar to those of web pages.

While we are clicking the different tags, note how Acrobat highlights the position of the tagged elements in the document by drawing a frame around them. The sequence of the tags in the tree presents the reading sequence. A screen reader uses the Tags tree to navigate the document. Elements that have no corresponding tags will be ignored by a screen reader.
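The idea that the order of the tags in the tree determines the reading sequence can be illustrated with a small sketch. It models a tag tree as nested tuples and walks it depth-first, which is the order in which the tags appear when the tree is fully expanded. The tag names follow common PDF structure types (H1, H2, Figure, L, LI). The data structure and the function reading_order are assumptions made for this example and do not reflect how Acrobat stores tags internally.

```python
def reading_order(tag, depth=0):
    """Walk a (name, content, children) tag tree depth-first, yielding lines."""
    name, content, children = tag
    label = f"<{name}> {content}" if content else f"<{name}>"
    yield "  " * depth + label
    for child in children:
        yield from reading_order(child, depth + 1)

# Simplified, hypothetical tag tree loosely based on the demo document
tree = ("Document", "", [
    ("H1", "Alice's Adventure in Wonderland", []),
    ("Figure", "Alt: Alice, the Rabbit and the Mad Hatter at the Tea party", []),
    ("H2", "Publication timeline", []),
    ("L", "", [
        ("LI", "1865. First UK edition, the second printing.", []),
        ("LI", "1865. First US edition, the first printing of above.", []),
    ]),
])

order = list(reading_order(tree))
print("\n".join(order))
```

Moving a tag to a different place in the tree changes the position at which its content is read, regardless of where the element appears on the page. Content that is missing from the tree never shows up in the output at all.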

It is possible to follow a list of accessibility criteria to determine the accessibility level of this document, as we did in Adobe Reader, but we would like to save some time. In the right-hand window, Acrobat offers a set of tools to modify and improve the document. For our needs, we will use the Accessibility tool. When we select the Accessibility tool, we see a set of functions. We would like to start a Full Check. The dialogue box now offers a set of options that allow us to limit the number of criteria. For simplicity, we want to check for all criteria and click Start Checking. The result will be displayed in the left-hand window pane. We can expand the different criteria groups and see the results. A green checkmark indicates that the test passed successfully. A cross on a red background indicates a failed criterion. And a question mark on a blue background reminds us that not everything can be checked automatically and that some criteria need to be checked manually.

Here we see that the correct reading order and the colour contrast cannot be checked by the program. We need to do this manually. An exclamation mark on a yellow background reports a problem with an optional success criterion. In our example, the program was not able to find an alternative description for the table. Strangely enough, the report claims that we have no document title. If we check this in the Document Properties window, we can see that there is a title. We can resolve this problem by clicking on the error with the right mouse button and selecting Fix. Done! This is a bug in Acrobat, which may have been fixed in a current version of the program. The accessibility checker gave us a good evaluation for our example document.

But how does the accessibility checker behave when the document is not accessible? We will try this using the same document created via a printer driver. This document has no tagging and, therefore, no document structure. After we open the document, we can immediately see that there is no tagging when we open the Tags panel on the left-hand side of the window.

There is a lot of information missing here. Let's start the accessibility checker using the Accessibility tool Full Check. We can see that most of the tests have failed. The program was not able to find tagging information, there is no primary language, no alternative texts, no table or list structures, and so forth. This document is obviously not accessible. These problems need to be fixed.

Repairing PDF

Before we show how to repair and improve the accessibility of an existing PDF, let's think for a moment about this. If the PDF is modified in Acrobat, then the corrections will be stored in the PDF. At a later stage, content may be added to the document. Text could be added and images may be replaced. We would then need to open the document in our original file format and update it, resulting in the export of a new PDF. At this point, we would run into a problem. All the work we had invested to improve the accessibility of the previous PDF would then be lost. We would have to do the same work again in Acrobat using the new PDF. Therefore, fixing an accessibility problem in the authoring program is always the preferred method as then those changes will be incorporated into future document versions.

PAC 3

Adobe Acrobat is commercial software, which not everybody is able or willing to buy. There are other options, and one of these is the PDF Accessibility Checker, or PAC for short. PAC is a freeware program running on Windows operating systems. It supports both experts and end users in conducting PDF accessibility evaluations. You can find the link to download PAC under the Documents tab on the platform. For our demo, we will use the current version 3 of the program.

We start the program and open our demo file. The program will immediately start checking the document and then give a brief summary of the result. PAC uses a stricter set of criteria to test for accessibility than Word or Acrobat. It uses the criteria defined by the PDF Universal Access standard for benchmarking, whereas Word and Acrobat follow the Web Content Accessibility Guidelines standardised by the World Wide Web Consortium. You can get a detailed analysis of the criteria by pressing the Results in Detail button. Here we can see all the criteria that failed, with explanations. We can expand the criteria to find out where in the document our problem is and possibly how to fix it. Not all of these problems can be fixed by you. Some need to be fixed in the program that created the document. Here is an example of table headers with associated subcells.

Another very useful function is the screen reader preview. If you would like to use a screen reader, but the purely keyboard-oriented interface makes it too complicated, then this function is for you. The screen reader preview shows the text, including the document structure, as it would be read by a screen reader. It includes alternative texts as well as list and table structures.

The Logical Structure button shows the tags tree. The elements of the tree can be expanded and collapsed. We can see the properties of each tag. Here we can see the alternative text of the image.

In case we get lost and do not know where in the document a tag is positioned, we just need to select the Page View tab. Clicking a tag in the tree will show us the position of the element in the document.

We do not have time to explain every function of this program in detail, but you can see how useful this free tool can be to evaluate existing PDFs.

Other tools for checking accessibility

There are more tools for checking accessibility. As we cannot introduce them all, here is a short list, which might be of interest to you.

pdfaPilot is callas's solution for the conversion of PDF or native documents into PDF/A files for long-term archiving. The PDF/UA validation is built into version 6 of callas pdfaPilot Desktop. A free trial version of pdfaPilot can be downloaded. The PDF/UA validation remains available for free even after the trial period has ended.

The CommonLook PDF Validator is a free tool for testing document accessibility. It works as a plug-in within Adobe Acrobat. You can download this tool after registration. If you want to repair a PDF, you will need the commercial CommonLook PDF software.

There are also online tools that can help you to evaluate your PDFs for accessibility. Please note that with all of these online tools you need to upload your PDF to a website, which might raise some privacy issues. If your document contains confidential information, then an online solution might not be the best choice for you.

The Tingtun Checker is an online tool that helps you to check web pages and PDF documents for accessibility. This project was co-funded by the European Commission as part of the European Internet Inclusion Initiative.

Another online tool is PAVE, which will not only analyse your PDF document, but will also try to correct it. By following the recommended steps, you can create a document structure and add alternative image descriptions.

The World Wide Web Consortium maintains a list of accessibility evaluation tools. You can access this list via the link under the Documents tab on the platform.

Where to continue?

You have been introduced to a set of programs that can be used to analyse existing PDFs for accessibility.

Depending on your personal interests you could continue with the following chapter: Repairing a PDF in Adobe Acrobat.

[Automated voice:] Accessibility. For more information visit: op.europa.eu/en/web/accessibility.
