PDF file structure - four parts

Since Adobe Systems created PDF in 1993, the file format has been developed for 18 years and has become one of the most important formats in people's daily life and work. However, few people know the basics of PDF, let alone what a PDF file consists of. This article walks through the PDF file structure to help you understand how a PDF is put together.

Generally speaking, a PDF document is composed of four main parts: a one-line header, a body, a cross-reference table and a trailer.
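
As a rough illustration of how the four parts fit together, the following Python sketch assembles a tiny one-page PDF in memory. The function name build_minimal_pdf and the three sample objects are invented for this example, and the result is only meant to show the layout; a strict validator may expect more than this.

```python
def build_minimal_pdf() -> bytes:
    """Assemble a tiny PDF: header, body, cross-reference table, trailer."""
    # Part 1: the one-line header.
    header = b"%PDF-1.4\n"

    # Part 2: the body, here three objects (catalog, page tree, one blank page).
    objects = [
        b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n",
        b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n",
        b"3 0 obj\n<< /Type /Page /Parent 2 0 R /Resources << >> "
        b"/MediaBox [0 0 200 200] >>\nendobj\n",
    ]

    out = bytearray(header)
    offsets = []
    for obj in objects:
        offsets.append(len(out))  # byte offset where this object starts
        out += obj

    # Part 3: the cross-reference table, one 20-byte entry per object,
    # plus entry 0, which is always marked free.
    xref_offset = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"
    for off in offsets:
        out += b"%010d 00000 n \n" % off

    # Part 4: the trailer, with a pointer back to the table and the end marker.
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_offset
    return bytes(out)


pdf = build_minimal_pdf()
print(pdf.decode("ascii"))
```

Printing the result shows the four parts in the order they appear in the file: header first, then the objects, then the xref table, then the trailer ending in %%EOF.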

The four parts of a PDF: header, body, cross-reference table and trailer

PDF Header: The first line of a PDF file specifies the version of the PDF format that the file follows. The header is the topmost portion of the document and reveals this basic information about the file. For example, "%PDF-1.4" means that the file follows version 1.4 of the PDF format. To read a PDF, you need a reader that supports that version or a later one; PDF 1.4 corresponds to Adobe Acrobat 5.0, so you need Acrobat 5.0 or newer to view a %PDF-1.4 file reliably.
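
A minimal sketch of reading the version from the header is shown below. The helper name read_pdf_version is hypothetical, and the sketch assumes the header sits at the very first byte of the file, which is the usual case but not something every real-world file guarantees.

```python
import re


def read_pdf_version(data: bytes):
    """Return (major, minor) from a %PDF-M.m header, or None if it is absent."""
    match = re.match(rb"%PDF-(\d)\.(\d)", data)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


sample = b"%PDF-1.4\n1 0 obj\n<< >>\nendobj\n"
print(read_pdf_version(sample))  # (1, 4)
```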

PDF Body: The body of a PDF file consists of objects that make up the contents of the document. These objects include image data, fonts, annotations, text streams and so on. The body can also hold objects that are not visible on the page, such as those that carry interactive features (for example animation) or describe the logical structure of the document. PDF also offers security features: the content of a document can be protected from unauthorized viewing, printing, editing or modifying. Numeric values in the body are of two types, integers and real numbers.
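
To make the idea of body objects more concrete, here is a small sketch that lists the objects in a hand-written body fragment. The sample bytes and the helper list_objects are assumptions made for illustration. Scanning with a regular expression like this is naive: it would be fooled by binary stream data and cannot see objects stored in compressed form, so treat it as a teaching aid rather than a parser.

```python
import re

# A hand-written body fragment with three objects (illustrative only).
SAMPLE_BODY = (
    b"1 0 obj\n<< /Type /Catalog /Pages 2 0 R >>\nendobj\n"
    b"2 0 obj\n<< /Type /Pages /Kids [3 0 R] /Count 1 >>\nendobj\n"
    b"3 0 obj\n<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792.5] >>\nendobj\n"
)

# "N G obj ... endobj": object number, generation number, then the content.
OBJ_RE = re.compile(rb"(\d+)\s+(\d+)\s+obj\b(.*?)\bendobj", re.DOTALL)
TYPE_RE = re.compile(rb"/Type\s*/(\w+)")


def list_objects(body: bytes):
    """Return (object number, generation, /Type name or None) per object."""
    found = []
    for m in OBJ_RE.finditer(body):
        type_match = TYPE_RE.search(m.group(3))
        type_name = type_match.group(1).decode("ascii") if type_match else None
        found.append((int(m.group(1)), int(m.group(2)), type_name))
    return found


for number, generation, type_name in list_objects(SAMPLE_BODY):
    print(number, generation, type_name)
```

Note that the sample MediaBox mixes both numeric types mentioned above: 612 is an integer and 792.5 is a real number.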

The Cross-Reference Table (also called the xref table): The cross-reference table records where every object in the file is located, as a byte offset from the start of the file. A PDF reader uses it to jump straight to the object it needs, such as a particular page, without reading the whole file. When a PDF file is updated, the changes are typically appended to the end of the file together with a new cross-reference section, so the updates can be traced through the cross-reference information.
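
The sketch below parses a classic, plain-text cross-reference table into a dictionary. The function name parse_xref_section and the sample bytes are hypothetical. It assumes the traditional table layout and does not handle the compressed cross-reference streams that newer PDF versions may use instead.

```python
def parse_xref_section(data: bytes, xref_offset: int):
    """Parse a plain-text xref table that starts at xref_offset.

    Returns {object number: (byte offset, generation, in_use)}.
    """
    lines = data[xref_offset:].splitlines()
    if not lines or lines[0].strip() != b"xref":
        raise ValueError("no xref keyword at the given offset")

    entries = {}
    i = 1
    while i < len(lines):
        parts = lines[i].split()
        # A subsection header is "first_object_number entry_count".
        if len(parts) != 2 or not (parts[0].isdigit() and parts[1].isdigit()):
            break  # reached the trailer or something unexpected
        first, count = int(parts[0]), int(parts[1])
        for n in range(count):
            offset, generation, flag = lines[i + 1 + n].split()
            # "n" marks an object in use, "f" marks a free entry.
            entries[first + n] = (int(offset), int(generation), flag == b"n")
        i += 1 + count
    return entries


SAMPLE = (
    b"xref\n"
    b"0 3\n"
    b"0000000000 65535 f \n"
    b"0000000009 00000 n \n"
    b"0000000058 00000 n \n"
    b"trailer\n"
)

for number, entry in parse_xref_section(SAMPLE, 0).items():
    print(number, entry)
```

In this sample, object 1 is recorded as starting at byte 9 and object 2 at byte 58; entry 0 is the free entry that heads every table.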

The Trailer: The trailer tells a reader where to find the cross-reference table and always ends with "%%EOF" to mark the end of the PDF file. The "%%EOF" line is required: if it is missing, the PDF file is incomplete and may not be processed correctly. This is different from PostScript files. If the last few lines of a PostScript file are missing, you can still print most of the pages. With a PDF file, you may lose everything, because a reader starts at the end of the file: it reads the trailer to locate the cross-reference table and, through it, every object, including each page.
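
The following sketch shows why a truncated file is a problem: it looks at the tail of the data for the startxref keyword, the offset that follows it, and the closing %%EOF marker. The helper read_startxref and the sample bytes are made up for this example, and the sketch assumes nothing follows %%EOF except whitespace, which some real files violate.

```python
import re


def read_startxref(data: bytes):
    """Return the xref offset given before the final %%EOF, or None if the
    end of the file is missing or damaged."""
    tail = data[-1024:]  # the end of the file is all that is needed
    match = re.search(rb"startxref\s+(\d+)\s+%%EOF\s*$", tail)
    return int(match.group(1)) if match else None


complete = b"trailer\n<< /Size 4 /Root 1 0 R >>\nstartxref\n178\n%%EOF\n"
truncated = complete[:-6]  # drop the final "%%EOF\n"

print(read_startxref(complete))   # 178
print(read_startxref(truncated))  # None
```

With the marker cut off, the sketch can no longer confirm where the cross-reference table begins, which mirrors the situation a reader faces with an incomplete file.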

Once you have discovered the 7 wonderful PDF features, you will see why PDF is so popular and why so many people like to create PDFs from other file formats.