Post Process
The Post Process section contains additional post-processing options for the result of the selected method:
• The Minimum Extent option specifies a threshold distance below which the split results are considered small fragments.
• The Small Fragments parameter specifies how to proceed with small fragments. The following values are available:
o Discard: Small fragments will not be included in the sequence of the splitter (default option).
o Merge with previous: A small fragment will be merged with the first preceding non-small fragment.
o Merge with following: A small fragment will be merged with the first succeeding non-small fragment.
o Split at center: The region between two non-small fragments will be split evenly; the initial and final small fragments will be merged with the first and last non-small fragments, respectively.
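The four Small Fragments values can be illustrated with a short sketch. This is not the product's implementation; it is one reading of the descriptions above, written under several assumptions: fragments are modeled as (start, end) intervals in points along the split axis and arrive sorted, the function name post_process and the mode strings are hypothetical, and small fragments with no neighbor to merge into are simply dropped (the text above does not say what happens in that case).

```python
from typing import List, Tuple

Fragment = Tuple[float, float]  # (start, end) along the split axis, in points


def post_process(fragments: List[Fragment], min_extent: float,
                 mode: str = "discard") -> List[Fragment]:
    """Illustrative model of the Small Fragments options (hypothetical API)."""
    # A fragment is "small" when its extent is below the threshold.
    is_big = [(end - start) >= min_extent for start, end in fragments]
    result: List[List[float]] = []

    if mode == "discard":
        result = [[s, e] for (s, e), big in zip(fragments, is_big) if big]

    elif mode == "merge_previous":
        for (s, e), big in zip(fragments, is_big):
            if big:
                result.append([s, e])
            elif result:
                # Extend the closest preceding non-small fragment.
                result[-1][1] = e
            # Assumption: a small fragment with no predecessor is dropped.

    elif mode == "merge_following":
        pending_start = None
        for (s, e), big in zip(fragments, is_big):
            if big:
                result.append([s if pending_start is None else pending_start, e])
                pending_start = None
            elif pending_start is None:
                pending_start = s
        # Assumption: trailing small fragments with no successor are dropped.

    elif mode == "split_center":
        result = [[s, e] for (s, e), big in zip(fragments, is_big) if big]
        if result:
            # Initial and final small fragments join the first and last big ones.
            result[0][0] = fragments[0][0]
            result[-1][1] = fragments[-1][1]
            # The region between two big fragments is split at its midpoint.
            for a, b in zip(result, result[1:]):
                mid = (a[1] + b[0]) / 2
                a[1] = b[0] = mid
    else:
        raise ValueError(f"unknown mode: {mode}")

    return [(s, e) for s, e in result]


# Two 100pt rows surrounded by 10pt fragments, with a 30pt threshold.
sample = [(0, 10), (10, 110), (110, 120), (120, 220), (220, 230)]
for mode in ("discard", "merge_previous", "merge_following", "split_center"):
    print(mode, post_process(sample, 30, mode))
```

With the sample data, Discard keeps only the two 100pt rows, the two merge modes grow each row by the adjacent small fragment, and Split at center divides the 10pt region between the rows at its midpoint.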
Example
You can exclude unwanted fragments from processing in several ways. For example, if every page of your PDF document has the same number of snippets that you want to eliminate, you can use the Skip Initial and Skip Final properties (see Example 1 below). If the number of unwanted snippets varies from page to page, use the Minimum Extent property instead.
To determine which value to supply for the Minimum Extent property, measure the height of the fragment that you want to exclude from processing. Follow the steps below:
1. Select a rectangle that covers the height of the unwanted fragment (screenshot below).
2. Check the measurements in the status bar (screenshot below). The value 26.84pt represents the height of the fragment.
3. Based on the measurements shown in the status bar, it is safe to set the Minimum Extent property to 30pt. With the Small Fragments property set to Discard, all fragments smaller than 30pt are excluded from processing. To avoid unpredictable results, make sure that the height of every fragment you plan to include in the split results is greater than the value of the Minimum Extent property. In our example, the rows we want to split are taller than the header row, so the Minimum Extent value affects only the snippets we want to discard.
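The condition described in step 3 can be written as a quick sanity check. The sketch below is only an illustration: the function name threshold_separates is hypothetical, the 26.84pt value comes from the example above, and the row heights are made-up placeholder values, since the example does not state them.

```python
from typing import Sequence


def threshold_separates(min_extent: float,
                        unwanted_heights: Sequence[float],
                        wanted_heights: Sequence[float]) -> bool:
    """Return True if every unwanted fragment is below the threshold
    and every wanted fragment is above it."""
    return max(unwanted_heights) < min_extent < min(wanted_heights)


header_height = 26.84        # measured in the status bar (from the example)
row_heights = [42.0, 45.5]   # hypothetical heights of the rows to keep

print(threshold_separates(30.0, [header_height], row_heights))  # True
```

If the check returns False, the chosen Minimum Extent value would either keep an unwanted fragment or treat a wanted row as a small fragment.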