From Word to PDF

[Narrator:] Microsoft Word is currently the most widely used word processor on the market. It is licensed as a stand-alone product or as a component of the Microsoft Office suite. The program has a long history dating back to its first release in 1983. Documents created with Word use the file suffix ".doc" for legacy Word documents or ".docx" for the newer XML-based Word format. Word offers some great features for creating fully accessible documents. In this chapter you will learn how to create accessible Word documents and how to create accessible PDF documents from Word.

Text

This is an example document. We can see that this document contains information like headings, which present the structure of our document. The text uses links, which can guide the user to a website. We can see an image, which needs additional information. Furthermore, we can see a list and a table.

Let's make a start. First we create structure in our document using headings. This here is the first heading and therefore a heading of level 1. We mark it as a "Heading 1" using Word’s built-in styles. Please note that you should use Word’s built-in styles to mark headings. This guarantees that a heading is recognised as a heading. If you do not like the visual appearance, then feel free to modify the styles to format your text with different fonts, sizes or colours.

This here looks like a heading of level 2. And another heading 2. We assume that these here are headings of level 3. Each heading is marked with its appropriate heading level. This enables Word to detect a content hierarchy.

Please note that the heading levels should be properly nested. So a heading level 2 is always preceded by a heading level 1, and a heading level 3 is always preceded by a heading level 2, and so on. This will enable the user to unfold and fold the document structure in the form of bookmarks in the resulting PDF. If the headings are not properly nested, the PDF reader has to guess a hierarchy, which often results in a bad user experience. So our headings look nice.
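The nesting rule can also be checked mechanically. The following is a minimal sketch, assuming the heading levels of a document have already been collected into a plain list of numbers; the function name find_heading_skips is hypothetical and not part of Word.

```python
def find_heading_skips(levels):
    """Return the positions of headings that break the nesting rule.

    `levels` is a list of heading levels in document order, for example
    [1, 2, 2, 3]. A heading may be at most one level deeper than the
    heading before it, and the first heading must be level 1.
    """
    problems = []
    previous = 0  # 0 means "no heading seen yet"
    for index, level in enumerate(levels):
        if level > previous + 1:
            problems.append(index)
        previous = level
    return problems


# Hypothetical outlines: the first is properly nested, the second
# jumps from level 1 straight to level 3.
print(find_heading_skips([1, 2, 2, 3]))  # prints []
print(find_heading_skips([1, 3, 2]))     # prints [1]
```

Moving back up the hierarchy, for example from level 3 to level 1, is allowed; only jumps to a deeper level that skip a step are reported.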

We continue with a link in the text. This looks like a weblink. Let's mark it as a hyperlink in Word, so that the resulting PDF will have a real clickable link. We select the link text and open the context-sensitive menu with the right mouse button to activate the menu Hyperlink. We now have the possibility to enter the text to be displayed and the web address to be followed. Here we enter the text... and here the web address. Finally we confirm our choice by clicking the OK button. Now this looks like a real link.

Here at the bottom of the page we see a list. This list looks somewhat broken. The indentation of the second line is not correct. In fact, this list was created by adding a leading minus character to each line. Word has no idea that this should be a list. So we should repair this and create a real list. We select the lines and press the list button in the toolbar. Note how the second line of each list item is now properly formatted.
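Hand-typed lists like this one can be spotted automatically. The following is a minimal sketch, assuming the paragraphs of a document are available as plain strings; the function name and the set of marker characters are assumptions chosen for illustration.

```python
import re

# Markers that people commonly type by hand: "-", "*" or "•",
# followed by at least one space and then some text.
TYPED_BULLET = re.compile(r"^\s*[-*•]\s+\S")


def find_typed_list_items(paragraphs):
    """Return the positions of paragraphs that start with a typed bullet."""
    return [i for i, text in enumerate(paragraphs) if TYPED_BULLET.match(text)]


# Hypothetical paragraph texts taken from a document.
paragraphs = ["My shopping list", "- tea", "- cake", "A well-known story"]
print(find_typed_list_items(paragraphs))  # prints [1, 2]
```

The check only looks at the text. It cannot tell whether a paragraph is already formatted as a real list, so the result is a list of candidates to review, not a list of errors.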

At the end of the page, we have an issue with the page break. The table with a single row at the end of the page does not look nice. So we would like the start of the table to be moved to the next page. Often users format text using empty lines like this. There are two reasons why you should not do this. Firstly, some screen readers read empty lines as such. Imagine a screen reader repeatedly saying "empty line, empty line, empty line"... This would be very annoying. Secondly, imagine a long document with many pages where every page break was created by empty lines. Now somebody decides to enhance the document with additional paragraphs at the beginning of the document. All the page breaks would then need to be readjusted to repair the layout of the document. So we undo this and create real page breaks. We position the cursor in the paragraph that should be the first of our next page. We open the context-sensitive menu with the right mouse button and select Paragraph… In the dialogue box, we select the tab Line and Page Breaks and activate the checkbox Page break before. Voilà, this looks much better. Now we can add any content before and the line will always be the first line of the following page.
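Runs of empty lines used as spacing can be found in the same way. The following is a minimal sketch, assuming the paragraphs are available as plain strings; the function name and the default threshold of two empty paragraphs are assumptions.

```python
def find_empty_runs(paragraphs, min_run=2):
    """Return (start, length) for each run of consecutive empty paragraphs.

    Only runs of at least `min_run` empty paragraphs are reported,
    because a single empty paragraph is rarely used to force a page break.
    """
    runs = []
    start = None
    for index, text in enumerate(paragraphs):
        if text.strip() == "":
            if start is None:
                start = index
        else:
            if start is not None and index - start >= min_run:
                runs.append((start, index - start))
            start = None
    # A run may also reach the end of the document.
    if start is not None and len(paragraphs) - start >= min_run:
        runs.append((start, len(paragraphs) - start))
    return runs


# Hypothetical document: three empty paragraphs push the table down.
paragraphs = ["Last paragraph", "", "", "", "Table caption", "", "Next text"]
print(find_empty_runs(paragraphs))  # prints [(1, 3)]
```

Each reported run is a place where a real page break, or the Page break before setting, is probably the better choice.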

Images

Next we take care of the image. A screen reader can detect images, but not their content unless we describe them with an alternative text. To add an alternative text, move the mouse cursor over the image and open the context-sensitive menu with the right mouse button to select the function Format Picture. Inside the task pane, select the option Layout and properties. We can now open the Alt Text form to enter an image title and a description. For the title we type "The tea party". A screen reader uses the verbose alternative description, so we type "Alice, the Rabbit and the Mad Hatter at the tea party". Now we can close the task pane.
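Missing alternative text can also be looked for in the saved file. The following is a minimal sketch, assuming the document is stored in the ".docx" format and that each picture carries a wp:docPr element whose descr attribute holds the description; the function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the drawing elements in a ".docx" document part.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def pictures_without_alt_text(document_xml):
    """Return the names of drawings whose description is missing or empty."""
    root = ET.fromstring(document_xml)
    missing = []
    for props in root.iter(f"{{{WP_NS}}}docPr"):
        if not props.get("descr", "").strip():
            missing.append(props.get("name", "(unnamed)"))
    return missing


# Simplified, made-up fragment; a real document part has far more structure.
SAMPLE = f"""<doc xmlns:wp="{WP_NS}">
  <wp:docPr id="1" name="Picture 1"
            descr="Alice, the Rabbit and the Mad Hatter at the tea party"/>
  <wp:docPr id="2" name="Picture 2"/>
</doc>"""

print(pictures_without_alt_text(SAMPLE))  # prints ['Picture 2']
```

For a real file, the XML would typically be read with zipfile.ZipFile(path).read("word/document.xml"), since a ".docx" file is a ZIP archive. The sketch only tells us that a description exists, not whether it is a good one.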

An image has a position in the text. It can be in-line with the text or it can flow, so the text can wrap around it. Screen readers prefer a fixed in-line position, so that they know when to speak the text of the image. To bring an image in-line, click the image and open the Layout Options dialogue. Click the icon for In Line with text. The image has now moved to the left. We can repair this by clicking the Center button in the toolbar. The image layout is now identical to the one before, but the image has a position inside the text.

Tables

Let’s continue with the table. For a proper table, we select one of the pre-defined table styles. The first row in our table is a table header, which describes the type of content in all our table data cells. To mark it as a header, we select in the menu bar Table Tools, Layout, and Repeat header rows. Note how the table header is now repeated on page breaks.

There is an alternative way to use the same function. Position your cursor over the table, open the context-sensitive menu with the right mouse button and select Table properties. In the tab Row you will find a checkbox labelled Repeat as header row at the top of each page. Use whichever way you prefer. Both options will fulfil the same task.

One remark related to tables. Tables should be used for table-based data only. You should never use tables for layout purposes. For example, some people misuse tables to create a multi-column text layout. This should never be done. Instead, use the Word functions designed to create a multi-column layout.

Try to avoid building tables with a complicated structure. The more complicated the structure of your table, the more complicated it will be for a screen reader user to navigate the table.

More

Before saving the document, we should enhance it with meta information. This meta information can help a user to understand what your document is about. Select File to open the Info panel. Here you can add the document title, some descriptive tags and comments to your document. Even though Word offers a rich toolset to create accessible documents, it cannot do everything. For instance, you have to check the colour contrast of the document content manually.

To learn how to check the colour contrast, please refer to the chapter "Fonts, sizes and colours".

The integrated accessibility checker

In the previous sections we have seen a set of procedures showing how to create an accessible Word file. Of course we can create a list of all those procedures and check them one by one. The chance of forgetting something increases with the number of pages and different content elements in our document. If we have many images, then the probability is high that we will forget to add an alternative text to one of them. Wouldn't it be nice to have an accessibility checker that supports us in creating an accessible document? In fact, this function exists in Microsoft Word.

To start the Accessibility Checker, select File, Inspect document, Check Accessibility. The Accessibility Checker task pane will open and inform you about the current accessibility status of your document. In our case, we get an error message that Word was not able to run the Accessibility Checker because of file type incompatibility. This is confusing as we are using the native Word file format for our document. When we look closer, we see that our file uses the “.doc” file suffix. Please remember that the “.doc” file suffix was used for the legacy Word file format and it cannot store any accessibility information. Therefore, you should never store your documents using the legacy “.doc” file format. To fix this, we save our document using the newer “.docx” file format. We select File, Save As. In the file selector dialogue, we choose the Word document (*.docx) option and click Save. A warning appears that informs us about the incompatibility of the new data format with legacy versions of the program, which is no problem for us as we are only using up-to-date versions of the program.
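If many files need to be reviewed, the file suffix can be checked before opening them. The following is a minimal sketch; the function name and the file names are hypothetical.

```python
from pathlib import Path


def needs_docx_conversion(filename):
    """Return True if the file uses the legacy ".doc" suffix."""
    return Path(filename).suffix.lower() == ".doc"


# Hypothetical file names.
for name in ["tea-party.doc", "tea-party.docx", "REPORT.DOC"]:
    print(name, needs_docx_conversion(name))
```

The check looks at the suffix only. It does not inspect the file content, so a renamed file would not be detected.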

As you can see, the Accessibility Checker can now analyse the document. The inspection results are divided into errors and warnings. Let's look at the error messages. The error indicates that an alternative text for a table has not been set. Clicking on the object name guides us directly to the defective content element. At the bottom of the Accessibility Checker task pane, Word offers us additional information on why this fix is important and how we should fix it.

In this case, we need an alternative text for the table. A table can have an alternative text. The alternative text should explain the purpose of the table and what kind of data to expect. A screen reader user can then decide if they want to read the content of the table or if they want to skip the table and continue with the content that follows. To add alternative text to a table, position the cursor over the table and open the context-sensitive menu with the right mouse button. Select Table properties, Alt Text. For our purpose, we decide to enter the text “My favourite characters” as a title and “Table of characters and their description” as a description. Finally, press the OK button to accept the data. Once we have fixed this, the error message disappears.

The warning indicates that an object is not in-line. Again, we can click the object name and Word supports us in fixing the problem. But why was this marked as a warning and not as an error? A screen reader cannot determine the position of a floating object on the page. So it is uncertain when the information of this object will be spoken. Will it be at the beginning of the page... or will it be at the end? It is up to the screen reader to decide when to speak this information. Therefore, the preferred method is to put an element in-line with the text, so that it has a defined position.

Imagine a page with one very tall but narrow graphic. Next to this image flows the text describing the graphic. If the graphic was in-line with the text, the image would be alone on the page, while the describing text would be pushed onto the next page. Even though this approach is preferable for accessibility, it would break the layout of the document. In this case, a decision needs to be taken. Is the timing of when the alternative text for this image is spoken so important that we have to accept a broken layout?
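To get an overview of how many objects float, the saved file can be inspected. The following is a minimal sketch, assuming the ".docx" format, where an in-line drawing is stored as a wp:inline element and a floating one as a wp:anchor element; the function name and the sample XML are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace of the drawing elements in a ".docx" document part.
WP_NS = "http://schemas.openxmlformats.org/drawingml/2006/wordprocessingDrawing"


def count_drawing_positions(document_xml):
    """Count in-line and floating drawings in the XML of a document part."""
    root = ET.fromstring(document_xml)
    inline = sum(1 for _ in root.iter(f"{{{WP_NS}}}inline"))
    floating = sum(1 for _ in root.iter(f"{{{WP_NS}}}anchor"))
    return {"inline": inline, "floating": floating}


# Simplified, made-up fragment with one in-line and two floating drawings.
SAMPLE = f"""<doc xmlns:wp="{WP_NS}">
  <wp:inline/>
  <wp:anchor/>
  <wp:anchor/>
</doc>"""

print(count_drawing_positions(SAMPLE))  # prints {'inline': 1, 'floating': 2}
```

A floating drawing is not automatically a mistake. As described above, it may be a deliberate layout decision, so the count is a prompt for review only.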

Saving the document

As we have seen before, there are multiple ways to create a PDF. So it is important to know how to save a PDF correctly from Word. We select File, Save As, and select PDF as the file type in the file selector box. As we want to make sure that the PDF uses tagging, we open the Options… dialogue box. We activate the checkbox labelled Create bookmarks using and choose the option Headings.

Please refer to the Word manual in case you prefer to create bookmarks independently of headings.

Additionally, we activate the checkboxes for Document properties and Document structure tags for accessibility. This enables Word to save all the document structure information we have created before in the form of tags inside the PDF. Word will now convert your Word document to an accessible PDF.
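The same three options can also be set when the export is automated on Windows. The following is a minimal sketch, assuming the export is driven through Word's Document.ExportAsFixedFormat method; the helper name pdf_export_options and the file name are hypothetical, and the automation call itself is shown only as a comment because it needs Windows, an installed copy of Word and the third-party pywin32 package.

```python
# Numeric values of the Word constants wdExportFormatPDF and
# wdExportCreateHeadingBookmarks.
WD_EXPORT_FORMAT_PDF = 17
WD_EXPORT_CREATE_HEADING_BOOKMARKS = 1


def pdf_export_options(output_path):
    """Collect export arguments that mirror the choices in the Options dialogue."""
    return {
        "OutputFileName": str(output_path),
        "ExportFormat": WD_EXPORT_FORMAT_PDF,
        # Checkbox "Create bookmarks using", option Headings
        "CreateBookmarks": WD_EXPORT_CREATE_HEADING_BOOKMARKS,
        # Checkbox "Document properties"
        "IncludeDocProps": True,
        # Checkbox "Document structure tags for accessibility"
        "DocStructureTags": True,
    }


options = pdf_export_options("tea-party.pdf")
print(options)

# Untested assumption of how the options might be passed on with pywin32:
#   import win32com.client
#   word = win32com.client.Dispatch("Word.Application")
#   document = word.Documents.Open(r"C:\path\to\tea-party.docx")
#   document.ExportAsFixedFormat(**options)
```

Keeping the options in one place makes it easy to see that bookmarks, document properties and structure tags are all switched on before any file is converted.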

Creating an accessible document

The demo was shown using Word 2016 on Windows. Please note that Word’s accessibility features have existed since Word 2010, so they are also available in older versions of Word from 2010 onwards.

At the time this course was created, Word for Mac had the same accessibility features as its Windows counterpart, with one exception: Word for Mac cannot save a tagged PDF.

As a workaround, Microsoft offers to send your document to an online server to convert it to an accessible PDF. This raises some privacy issues.

Beyond that, there are technical issues limiting the use of this functionality. If you are using specific fonts that are available only on your machine, the server will not be able to embed them into your PDF.

Where to continue?

In this chapter you have learned how to create an accessible Word document and how to save an accessible PDF document from it. Starting with a properly structured Word document as a basis, it would also be possible to publish it in other formats, although this might be more complex and require other tools and more technical knowledge.

Depending on your personal interests, you could continue with one of the following chapters.

  • Adobe InDesign
  • Other authoring tools
  • Testing PDFs

[Automated voice:] Accessibility. For more information visit: op.europa.eu/en/web/accessibility.
