Find Text
The Find Text method locates a specific piece of text and identifies a split position relative to that text.
Search for text in PDF View pane
The PDF Extractor provides text-searching functionality that you can use in the GUI as well as at runtime. For details, see Search Functionality.
Text Selection mode
The Text Selection mode is closely related to the text-searching functionality in the PDF Extractor. This mode enables you to select a piece of text in the PDF View pane and assign the text together with its properties (e.g., font face and size) to the Match structure of a Split object, Location/Boundary Assignment, or Group/Filter object. You can also turn the selected text into a region and create a Text Capture, a Split object, or a Merge Source object out of it. For details, see Selection Modes.
Properties
The properties of the Find Text method are described in the table below.
Property | Description |
---|---|
Match
| The Match structure is a group of properties that specify matching criteria and enable you to filter search results.
This property specifies a search term to look for in the PDF file. This field must contain a value.
Allow Any Whitespace If selected, this option will allow any whitespace, including new lines and whitespace between words. If the check box is not selected, only single spaces will be taken into account.
Anchoring This property specifies the behavior of the left and right boundaries of a search term:
•None: Matches can start and end anywhere.
•Whole Words Only: Matches must start and end at a word boundary.
•Starting/Ending at Word Boundary: There must be a word boundary at the start/end of each match.
Note that the Anchoring property applies only to the left and right boundaries of a search term and not to any word breaks inside the text.
Case Sensitivity This option specifies whether a text search is case-sensitive.
Format Filtering The Format Filtering section contains a number of font-related settings that enable you to filter search results.
Match font This option enables you to search for text of a particular font (e.g., Arial-ItalicMT). When you select this option, you will be prompted to select a font face from the drop-down list.
Font weight This option enables you to search for normal and bold text as well as text without any weight constraints.
Font style This option allows you to search for normal and italic text as well as text without any style constraints.
Match font size To search for text of a particular size, enable the Match Font Size option, type the cell height (e.g., 12pt) and, optionally, the tolerance of the cell height (e.g., 2pt). The Cell Height property refers to the overall vertical space for each line of text, which includes character height and line spacing.
Match text rotation To search for text that is displayed at an angle, enable the Match Text Rotation option and specify the angle (in degrees counterclockwise) of the text to match. For example, typing the value 90 would match text going upwards at 90 degrees. You can also specify the tolerance of the angle (e.g., 5 degrees).
|
Coordinate Selection | This option indicates the location of a split position.
Starting edge This option places a split position at the topmost point of a search term.
Ending edge This option places a split position at the bottommost point of a search term.
Center This option places a split position in the middle of a search term.
Location/boundary finders The Coordinate Selection option is also available for location and boundary finders. A horizontal location/boundary finder has horizontal orientation. Therefore, the Starting Edge property will place a split position at the leftmost point of a search term, and the Ending Edge property will place a split position at the rightmost point of a search term. A vertical location/boundary finder has vertical orientation. Therefore, the Starting Edge and Ending Edge properties will have the same meaning as for the Split object.
|
Displace | This option enables you to move the identified split position by a certain distance (e.g., 10pt).
|
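The Python sketch below is an illustration only and is not the PDF Extractor's implementation or API. It shows one way to read the Match options (Allow Any Whitespace, Anchoring, Case Sensitivity) as a regular expression, and how Coordinate Selection and Displace could combine into a split position. The names `build_pattern` and `split_position`, the option values (`"whole_words"`, `"starting"`, and so on), and the assumption that a positive displacement moves the position from the starting edge towards the ending edge are all hypothetical.

```python
import re

ANCHORING_MODES = {"none", "whole_words", "start", "end"}


def build_pattern(term, allow_any_whitespace=False, anchoring="none",
                  case_sensitive=True):
    """Build a regex that mimics the Match options (hypothetical helper)."""
    if not term or not term.strip():
        raise ValueError("the search term must contain a value")
    if anchoring not in ANCHORING_MODES:
        raise ValueError(f"unknown anchoring mode: {anchoring}")
    # Escape each word so characters such as '.' or '(' are matched literally.
    words = [re.escape(word) for word in term.split()]
    # Either any run of whitespace (including new lines) or exactly one space.
    separator = r"\s+" if allow_any_whitespace else " "
    body = separator.join(words)
    # Anchoring affects only the outer edges of the term, not inner word breaks.
    left = r"\b" if anchoring in ("whole_words", "start") else ""
    right = r"\b" if anchoring in ("whole_words", "end") else ""
    flags = 0 if case_sensitive else re.IGNORECASE
    return re.compile(left + body + right, flags)


def split_position(start_edge, end_edge, selection="starting", displace=0.0):
    """Pick a coordinate (in points) on the matched text and shift it.

    Assumption: start_edge and end_edge are the top/bottom (or left/right)
    coordinates of the match, and displace is added to the chosen coordinate.
    """
    if selection == "starting":
        base = start_edge
    elif selection == "ending":
        base = end_edge
    elif selection == "center":
        base = (start_edge + end_edge) / 2
    else:
        raise ValueError(f"unknown coordinate selection: {selection}")
    return base + displace


if __name__ == "__main__":
    pattern = build_pattern("Total Amount", allow_any_whitespace=True,
                            case_sensitive=False)
    print(bool(pattern.search("TOTAL\n   amount due")))
    print(split_position(100.0, 112.0, "center", displace=10.0))
```

In this sketch, word boundaries rely on the regular-expression `\b` assertion, which only approximates whatever definition of a word boundary the PDF Extractor applies, so search terms that begin or end with punctuation may behave differently in the product.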