Altova MapForce 2024 Enterprise Edition

MapForce PDF Extractor

Overview

This topic explains how to run the PDF Extractor and gives an overview of its interface.

How to run PDF Extractor

To launch the PDF Extractor, you can choose one of the following options:

•You can run the PDF Extractor directly from MapForce. This option is useful when you want to create a new PDF extraction template directly in MapForce.

•You can also start the PDF Extractor as a standalone program by running the Altova MapForce PDF Extractor executable from the Start menu or from the MapForce installation directory.

For more information about how to create a PDF extraction template, see Create a New Template.

GUI overview

The screenshot below illustrates the interface of the PDF Extractor. The interface is organized into five distinct parts:

•the top part, which contains the menu and toolbar commands,

•the Schema Pane (top left part), which enables you to define the structure of your PDF document and extraction rules,

•the PDF View Pane (top right part), in which you can see your PDF file and use visual prompts to define extraction rules,

•the Properties Pane (bottom left part), which enables you to define various properties and calculate expressions,

•the Output Pane (bottom right part), which shows what the structure and data of your PDF document will look like, based on the properties and layout you have defined.

Note that the structure in the Output Pane is represented as an XML tree.
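To give a rough idea of what an XML tree looks like when it is read by a program, the following Python sketch parses a small sample and prints one indented line per element. The element names (Invoice, Header, Number, Items, Item) and their values are hypothetical and are not actual PDF Extractor output; the real structure depends on what you define in the Schema Pane.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: element names and values are illustrative only,
# not actual PDF Extractor output.
SAMPLE = """
<Invoice>
  <Header>
    <Number>1001</Number>
  </Header>
  <Items>
    <Item>Widget</Item>
    <Item>Gadget</Item>
  </Items>
</Invoice>
"""


def outline(element, depth=0):
    """Return one indented line per element, parents before children."""
    text = (element.text or "").strip()
    line = "  " * depth + element.tag + (f": {text}" if text else "")
    lines = [line]
    for child in element:
        lines.extend(outline(child, depth + 1))
    return lines


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE.strip())
    print("\n".join(outline(root)))
```

Each level of indentation in the printed outline corresponds to one level of nesting in the tree.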

The PDF Extractor allows you to work on multiple templates at a time. Each template has its own separate window. In the screenshot above, there is only one template called PDFExtractor1. All PDF extraction templates that you create in the PDF Extractor are saved with a .pxt extension.
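Because every template is saved with a .pxt extension, saved templates are easy to locate on disk. The following Python sketch lists the .pxt files in a folder. The function name find_templates and the folder used in the example are assumptions made for illustration; they are not part of MapForce or the PDF Extractor.

```python
from pathlib import Path


def find_templates(folder):
    """Return the .pxt files directly inside folder, sorted by path."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pxt"
    )


if __name__ == "__main__":
    # "." is a placeholder; point this at the folder that holds your templates.
    for template in find_templates("."):
        print(template.name)
```

The sketch only checks the file extension; it does not open the templates or verify their contents.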

Important

Note that you can extract data only from electronically created PDF documents. Scanned documents are not supported.