Tag Archive for: PDF

Text Search for Precise PDF Data Extraction


PDF documents are used at many stages of modern business workflows, often serving as the format of choice for invoices, reports, legal contracts, and other critical documents. While PDFs are ideal for preserving content integrity and a particular visual layout, their structure makes automated data extraction challenging. For organizations engaged in data integration and ETL, unlocking information contained in PDFs is a necessity—and this is where the MapForce PDF Extractor comes in.

The MapForce PDF Extractor includes multiple tools for visually defining extraction rules to map PDF data to other formats. One that is particularly useful for zeroing in on specific content is text search. Here’s how it works – including a video demo. 


Extract Data for PDF Mapping


MapForce, Altova’s award-winning data mapping tool, includes support for PDF input in data integration and ETL workflows. The MapForce PDF Extractor makes it easy to define rules for extracting PDF data in a structured format to make it available for mapping to other popular formats like Excel, XML, JSON, databases, and more.

Let’s take a look at how it works.


AI Integration & PDF Data Mapping in Version 2024


Version 2024 of Altova Software introduces brand new AI Assistants in multiple products as well as long-awaited support for PDF data integration in MapForce. Other features include Markdown editing support, split output preview for business report creation, support for new XBRL standards, and much more.

Let’s take a look at the highlights.


Using TrueType Fonts for StyleVision PDF Generation


StyleVision is Altova’s visual stylesheet designer for publishing XML and database data in PDF and other formats. Limitations in the design of the Apache FOP processor cause TrueType fonts to be unavailable for PDF generation. This tip for creating TrueType font metrics files, along with the downloadable scripts, provides StyleVision users with a workaround for this issue.
A metrics file is created by calling the Java application TTFReader. TTFReader is included with Apache FOP, so if you have Apache FOP installed, TTFReader is already installed, too.
If you are comfortable calling a Java application, you can call TTFReader yourself to create the metrics file for each TrueType font. Alternatively, we provide a set of command files that ease this task. With these command files, you can create metrics files for all the TrueType fonts installed on your computer.
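
For readers who prefer to script the TTFReader calls themselves, here is a minimal Python sketch of the idea. It is not the command files mentioned above. It assumes the class name `org.apache.fop.fonts.apps.TTFReader` from the Apache FOP font documentation and a classpath string that you supply; the JAR files that belong on the classpath depend on your FOP version, so check your own installation. The function names and the one-metrics-file-per-font naming scheme are illustrative choices.

```python
import subprocess
from pathlib import Path

# Assumption: fully qualified TTFReader class name as given in the Apache FOP docs.
TTF_READER_CLASS = "org.apache.fop.fonts.apps.TTFReader"


def build_ttfreader_commands(font_dir, out_dir, classpath):
    """Build one `java ... TTFReader <font.ttf> <metrics.xml>` command per font.

    font_dir  -- folder containing .ttf files
    out_dir   -- folder where the metrics XML files should be written
    classpath -- Java classpath containing the FOP JARs (installation-specific)
    """
    commands = []
    for font in sorted(Path(font_dir).iterdir()):
        # Match the extension case-insensitively (.ttf, .TTF, ...).
        if font.is_file() and font.suffix.lower() == ".ttf":
            metrics = Path(out_dir) / (font.stem + ".xml")
            commands.append(
                ["java", "-cp", classpath, TTF_READER_CLASS, str(font), str(metrics)]
            )
    return commands


def run_commands(commands):
    """Run each command in turn; raises CalledProcessError if Java reports a failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)
```

To use it, pass your font folder, an existing output folder, and your FOP classpath to `build_ttfreader_commands`, inspect the returned commands, and then hand them to `run_commands`. Building the commands separately from running them makes it easy to review exactly what will be executed before any files are written.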
Please visit https://www.altova.com/technote12.html to view the complete tip with screenshots and a link to download the command files.
If you’re not already a StyleVision user, you may download a free, 30-day trial here: https://www.altova.com/download.html
