ETL Archives

How to Create Batch Data Mapping Projects

May 15, 2025/in Data Integration, ETL /by Erin Cavanaugh

A common requirement in data processing is batch data mapping, especially in the context of data transformation and integration. It involves converting data in batches rather than processing individual data points one at a time. Batch data mapping is often required in data integration or ETL scenarios where input from multiple sources needs to be aligned or transformed together. Two common scenarios are batch to batch and batch to one.

In our batch data processing video series, we walk you through implementing these projects step-by-step using visual tools in MapForce.
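To make the two scenarios concrete outside of a visual tool, here is a minimal Python sketch. It assumes that "batch to batch" means each input file produces its own output, and "batch to one" means all inputs are merged into a single output. The field names, the `map_record` helper, and the sample file names are hypothetical, and this is not how MapForce implements the mapping internally.

```python
import csv
import io


def map_record(row):
    # Hypothetical mapping: rename the fields and tidy up the name.
    return {"customer": row["name"].strip().title(), "total": row["amount"]}


def read_rows(text):
    # Parse one CSV document held in memory.
    return csv.DictReader(io.StringIO(text))


def batch_to_batch(inputs):
    """Map every input document to its own output document."""
    return {name: [map_record(r) for r in read_rows(text)]
            for name, text in inputs.items()}


def batch_to_one(inputs):
    """Map every input document and merge the results into one output."""
    merged = []
    for name in sorted(inputs):  # sorted for a predictable order
        merged.extend(map_record(r) for r in read_rows(inputs[name]))
    return merged


# Hypothetical sample batch: two small CSV documents.
inputs = {
    "jan.csv": "name,amount\n alice ,10\nbob,20\n",
    "feb.csv": "name,amount\ncarol,30\n",
}

per_file = batch_to_batch(inputs)
combined = batch_to_one(inputs)
```

The same record-level mapping is reused in both cases; only the grouping of the output differs.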

Exporting Products from Shopify as CSV

March 28, 2025/in Data Integration, ETL /by Erin Cavanaugh

Shopify is a massively popular ecommerce platform in wide use by retail businesses large and small. Though Shopify provides easy-to-use tools for setting up and running an online storefront, managing the vast amount of data behind the scenes, such as product catalogs, customer information, order records, and inventory, can quickly become complex.

Businesses often need to integrate Shopify data with backend databases, ERP systems, CRMs, data warehouses, or other platforms to streamline operations, perform deeper analytics, or support automation workflows.

This is where a data mapping tool with Shopify support becomes essential, allowing businesses to transform, map, and move data between Shopify and other systems efficiently and accurately.

Let’s look at an example of a common scenario – extracting product data from Shopify to a CSV file – using visual tools in MapForce.
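For readers who want to see the shape of this transformation in code, the following Python sketch flattens nested product records into CSV rows, one row per variant. The `products` list is a hypothetical, heavily simplified stand-in for data already retrieved from a store; it is not the full Shopify payload, and the column names are assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical, simplified product records (not the full Shopify schema).
products = [
    {"id": 1, "title": "T-Shirt",
     "variants": [{"sku": "TS-S", "price": "19.00"},
                  {"sku": "TS-M", "price": "19.00"}]},
    {"id": 2, "title": "Mug",
     "variants": [{"sku": "MUG-1", "price": "8.50"}]},
]


def products_to_csv(products):
    """Write one CSV row per product variant and return the CSV text."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["product_id", "title", "sku", "price"],
        lineterminator="\n",
    )
    writer.writeheader()
    for product in products:
        for variant in product["variants"]:
            writer.writerow({
                "product_id": product["id"],
                "title": product["title"],
                "sku": variant["sku"],
                "price": variant["price"],
            })
    return buffer.getvalue()


csv_text = products_to_csv(products)
```

Repeating the parent product fields on every variant row is a common way to fit a nested structure into a flat file.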

Image representing extracting data from a Shopify store for data integration

Tags: MapForce, Shopify

What is ETL?

February 11, 2025/in Data Integration, ETL /by Erin Cavanaugh

ETL—extract, transform, load—is the backbone of modern data integration. While most technical professionals understand the basics, the real challenge lies in designing efficient, scalable ETL processes that handle complex data transformations while maintaining performance and accuracy.

In our latest video series, we break down how ETL works, common challenges with defining ETL workflows, and how graphical tools like Altova MapForce can help. We’ll walk through demos of real-world scenarios such as transforming and loading CSV reports to a SQL database, as well as implementing scalable automation.

Whether you’re optimizing an existing process or researching new ETL tools, this series covers all the bases.

Colorful diagram representing ETL processing

Tags: MapForce

ETL Tutorial: Video

February 4, 2025/in Data Integration, Database, ETL /by Erin Cavanaugh

ETL processes span a wide spectrum of complexity, from straightforward tasks like a one-to-one mapping of an API payload to a database, to highly intricate scenarios requiring extensive data filtering, transformation, and manipulation.

Altova MapForce can tackle this full range of ETL tasks.

Diagram representing Extract Transform Load

This video tutorial explores a common ETL scenario:

Extract CSV data received in multiple reports
Transform and filter the data
Load the transformed data to a target SQL database

This particular transformation is somewhat complex because the CSV data is in a wide format, with separate columns corresponding to each of several years. Part of our transformation will be to melt or pivot the data to the long format more in line with how data is stored in a relational database.

This way, each year becomes a value in a single column, and its corresponding data is moved to a new column, resulting in more rows but fewer columns. This long format is also more readily consumed by common analytical and BI tools down the line.

In addition, we will filter unwanted data and round up long decimals before writing the data to the database.
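As a rough illustration of the wide-to-long step described above, here is a small Python sketch that melts year columns into rows, drops empty cells, and rounds the values. The sample data, the `region` column, and the choice of two decimal places are assumptions made for this example. Python's `round` rounds to the nearest value, which may differ from the rounding used in the video.

```python
import csv
import io

# Hypothetical wide-format report: one column per year.
WIDE_CSV = """region,2021,2022,2023
North,10.12345,11.5,
South,7.0,8.98765,9.25
"""


def melt(rows, id_field, decimals=2):
    """Turn each year column into its own row (wide to long)."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_field:
                continue
            if value is None or value.strip() == "":
                continue  # filter: skip cells with no data
            long_rows.append({
                id_field: row[id_field],
                "year": int(column),
                "value": round(float(value), decimals),
            })
    return long_rows


long_rows = melt(csv.DictReader(io.StringIO(WIDE_CSV)), "region")
```

The result has three columns (`region`, `year`, `value`) and one row per region and year, which maps directly onto a relational table.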

Though making this type of transformation seems like a complicated problem, it’s easy using MapForce ETL tools that include drag-and-drop data mapping, dynamic node names, and built-in functions.

The example in this video is a CSV to database ETL scenario, but MapForce supports a wide variety of additional data formats including XML, JSON, PDF, Excel, EDI, and XBRL. All popular SQL and NoSQL databases are also supported as the source or target of any data mapping.

MapForce is available for a free, 30-day trial. No account or credit card is needed – so you can just get to work trying this ETL functionality for yourself.

Related: Check out the previous video in our series, ETL Basics.

Tags: data integration, data mapping, MapForce

ETL Basics: CSV to Database in MapForce

January 28, 2025/in Data Integration, ETL /by Erin Cavanaugh

ETL processes are increasingly required in modern enterprises as organizations receive data in diverse formats that must be transformed and loaded into target databases or business systems. ETL projects range from simple to highly complex, depending on the specific requirements.

A common example of a straightforward ETL process involves extracting CSV data from incoming files, mapping the data structure, applying basic transformations to align with the target schema, deduping records, and then finally loading the processed data into a SQL database.
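The steps in this paragraph can be sketched in a few lines of Python using only the standard library. The example below assumes an in-memory SQLite database as the target, a hypothetical `customers` table, and deduplication on the `id` column; a real project would substitute its own schema, key, and database driver.

```python
import csv
import io
import sqlite3

# Hypothetical incoming CSV with one duplicated record.
CSV_TEXT = """id,email,signup
1, Ann@Example.com ,2025-01-03
2,bob@example.com,2025-01-04
1, Ann@Example.com ,2025-01-03
"""


def load(csv_text, conn):
    """Extract rows, normalize and dedupe them, then load them into SQL."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS customers "
        "(id INTEGER PRIMARY KEY, email TEXT, signup TEXT)"
    )
    seen = set()
    rows = []
    for row in csv.DictReader(io.StringIO(csv_text)):
        key = int(row["id"])
        if key in seen:
            continue  # dedupe: keep the first record for each id
        seen.add(key)
        # Basic transformation: trim whitespace and lowercase the email.
        rows.append((key, row["email"].strip().lower(), row["signup"]))
    with conn:  # commit on success, roll back on error
        conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    return len(rows)


conn = sqlite3.connect(":memory:")
loaded = load(CSV_TEXT, conn)
```

Deduplicating before the insert keeps the load to a single transaction and avoids primary key violations in the target table.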

Whether an ETL project is basic with just a one-to-one mapping or more complex with sophisticated data processing requirements, developers need tools that can handle the scope of complexity without a huge learning curve – or price tag. That’s where MapForce comes in.

As part of our series on defining ETL pipelines, this video walks you through the process of extracting data from CSV documents, transforming it using data processing functions, and then configuring how it will be written to the target system.

Though this example focuses on CSV, it’s easy to define data mapping projects in MapForce for any combination of data formats. Benefits of MapForce as an ETL tool include:

Graphical, drag-and-drop data mapping
Extensible library of data processing functions
Support for all major SQL and NoSQL databases
Support for CSV, XML, JSON, PDF, XBRL, and other data sources
Instant output with affordable ETL automation

Watch the video now:

To continue learning about defining more complex ETL pipelines, watch the next video in the series.

Text Search for Precise PDF Data Extraction

December 3, 2024/in Data Integration, Database, ETL /by Erin Cavanaugh

PDF documents are used at many stages of modern business workflows, often serving as the format of choice for invoices, reports, legal contracts, and other critical documents. While PDFs are ideal for preserving content integrity and a particular visual layout, their structure makes automated data extraction challenging. For organizations engaged in data integration and ETL, unlocking information contained in PDFs is a necessity—and this is where the MapForce PDF Extractor comes in.

The MapForce PDF Extractor includes multiple tools for visually defining extraction rules to map PDF data to other formats. One that is particularly useful for zeroing in on specific content is text search. Here’s how it works – including a video demo.

Cartoon image of computer monitor with PDF charts peeling off the screen

Tags: MapForce, PDF, PDF Extractor

Extract Data for PDF Mapping

November 6, 2023/in Data Integration, ETL /by Erin Cavanaugh

MapForce, Altova’s award-winning data mapping tool, includes support for PDF input in data integration and ETL workflows. The MapForce PDF Extractor makes it easy to define rules for extracting PDF data in a structured format to make it available for mapping to other popular formats like Excel, XML, JSON, databases, and more.

Let’s take a look at how it works.

AI-based support request sentiment analysis using MapForce and GPT-4

July 17, 2023/in Data Integration, Database, ETL /by Alexander Falk

Automated sentiment analysis of text, such as user reviews, has historically been a challenge. Because of the myriad intricacies of natural language, systems faced difficulties in analyzing context and nuances. This required an inordinate amount of manual work to overcome.

One of the many useful capabilities of modern AI systems that are based on large language models (LLMs) such as OpenAI’s GPT-4 is that they are very good at sentiment analysis of natural text inputs. We can use that capability to build a very efficient database solution in MapForce that, for example, goes through all the new incoming records in a support database and automatically determines whether a particular support request or other customer feedback is positive, negative, constitutes a bug report, or should be considered as a feature request.
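The overall flow (select unlabeled records, classify each one, write the label back) can be sketched in Python as follows. The `requests` table, its columns, and the `keyword_classifier` function are hypothetical. The classifier is a trivial keyword-based stand-in so the example runs offline; it is not an LLM and does not represent how GPT-4 or MapForce performs the analysis. In practice, the `classify` argument would be a function that calls the model.

```python
import sqlite3

LABELS = ("positive", "negative", "bug report", "feature request")


def keyword_classifier(text):
    """Offline stand-in for a model call; NOT a real sentiment model."""
    lowered = text.lower()
    if "crash" in lowered or "error" in lowered:
        return "bug report"
    if "please add" in lowered or "would be nice" in lowered:
        return "feature request"
    if "love" in lowered or "thanks" in lowered:
        return "positive"
    return "negative"


def label_new_requests(conn, classify):
    """Classify every record that has no label yet and store the result."""
    pending = conn.execute(
        "SELECT id, body FROM requests WHERE label IS NULL"
    ).fetchall()
    for request_id, body in pending:
        label = classify(body)
        if label not in LABELS:
            raise ValueError(f"unexpected label: {label!r}")
        conn.execute(
            "UPDATE requests SET label = ? WHERE id = ?", (label, request_id)
        )
    conn.commit()
    return len(pending)


# Hypothetical support database with three new records.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE requests (id INTEGER PRIMARY KEY, body TEXT, label TEXT)"
)
conn.executemany(
    "INSERT INTO requests (body) VALUES (?)",
    [("The app crashes on export",),
     ("Please add dark mode",),
     ("Love the new release, thanks!",)],
)
count = label_new_requests(conn, keyword_classifier)
```

Validating the returned label against a fixed set guards against free-form model output ending up in the database.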