Expression Syntax
This topic covers the following syntax-related aspects: tokens, grammar, supported operations, built-in names used in computations, and expression functions. For details, see the subsections below.
Tokens
The expression you enter for a specific property is processed into a sequence of tokens that must conform to the structure described in this subsection. Whitespace between tokens is insignificant and is ignored.
Identifiers
Built-in identifiers are used to reference built-in objects. Built-in identifiers start with an uppercase or lowercase letter or an underscore (e.g., Top refers to the top of a region). User identifiers are enclosed in square brackets [ ] and can contain any Unicode characters (e.g., [Sep1] may refer to the leftmost side of the table column called Description).
Single-character tokens
Single-character tokens are combined with other tokens to create complex expressions. The single-character tokens are the following: the parentheses ( and ), the braces { and }, the comma (,), the colon (:), and the period (.).
Number literals
Number literals represent numeric values. Number literals do not have a sign (this is handled as a unary operator) and must have at least one digit before and after the decimal point.
String literals
String literals represent sequences of characters enclosed in double quotation marks. String literals can contain any Unicode characters and backslash-escaped special characters (e.g., the \n character represents a line break).
Color literals
Color literals represent color values. Color literals take on the same form as in HTML and CSS and use three or six hexadecimal digits (e.g., #FFF is a hexadecimal color code that represents white).
Operators
Operators are symbols that are used to perform various operations such as comparing values, performing mathematical calculations, concatenating strings, etc. The following operators are supported: +, -, *, /, &, |, =, !, <, >, ^, %.
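The token classes described above can be sketched as a small regular-expression tokenizer. This is an illustration in Python, not the PDF Extractor's actual lexer. The names TOKEN_RE and tokenize, the token class labels, and several details are assumptions: built-in identifiers are assumed to continue with letters, digits, or underscores; the decimal point in a number literal is assumed to be optional (as in 50); and each operator character is emitted as its own token rather than being combined into multi-character operators.

```python
import re

# Hypothetical token classes, one alternative per class described above.
TOKEN_RE = re.compile(r"""
    (?P<ws>\s+)                              # whitespace: insignificant, skipped
  | (?P<user_id>\[[^\]]*\])                  # user identifier enclosed in [ ]
  | (?P<color>\#(?:[0-9A-Fa-f]{6}|[0-9A-Fa-f]{3})(?![0-9A-Fa-f]))  # three or six hex digits
  | (?P<number>\d+(?:\.\d+)?)                # unsigned; digits on both sides of the point
  | (?P<string>"(?:\\.|[^"\\])*")            # double-quoted, with backslash escapes
  | (?P<builtin_id>[A-Za-z_][A-Za-z0-9_]*)   # starts with a letter or underscore
  | (?P<punct>[(){},:.])                     # single-character tokens
  | (?P<op>[-+*/&|=!<>^%])                   # operator characters
""", re.VERBOSE)


def tokenize(text):
    """Return a list of (kind, lexeme) pairs, skipping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character {text[pos]!r} at position {pos}")
        if match.lastgroup != "ws":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens


print(tokenize("Left + 50pt"))
# [('builtin_id', 'Left'), ('op', '+'), ('number', '50'), ('builtin_id', 'pt')]
```

Note that 50pt yields two tokens, a number followed by an identifier, which matches the description of a number literal with an identifier suffix.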
Grammar
This subsection describes the following aspects of grammar: binary and unary expressions.
Binary Expressions
A binary expression is an expression that contains two operands and an operator that specifies an action (e.g., multiplication) to be performed on the operands.
Multiplicative (*, /), additive (+, -), and logical (&&, ||, !) operators are left-associative operators, which means that operators of the same operator precedence are processed from left to right (e.g., the expression 10-4-3 is evaluated as (10-4)-3). Operators of different precedence are processed in order of precedence (e.g., in the expression 5+2*4, the multiplication is calculated before the addition, because multiplication has a higher precedence than addition). Left-associative operators can be chained.
Equality (==, !=, =), relational (<, >, <=, >=), and general (operators not included in other classes) operators are not associative; therefore, chaining such operators without parentheses results in a parse error.
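The difference between left-associative and non-associative operators can be illustrated with a minimal precedence-climbing parser. This is a sketch in Python under stated assumptions, not the PDF Extractor's parser: it works on a pre-split token list, handles only integer operands and a subset of the operators, omits parentheses, and assumes that comparison operators bind less tightly than arithmetic operators. The names LEFT_ASSOC, NON_ASSOC, and parse are hypothetical.

```python
# Hypothetical precedence table: a higher number binds more tightly.
LEFT_ASSOC = {"*": 2, "/": 2, "+": 1, "-": 1}
# Operators that must not be chained without parentheses.
NON_ASSOC = {"==", "!=", "<", ">"}


def parse(tokens):
    """Parse a flat token list into nested (operator, left, right) tuples."""
    pos = 0

    def operand():
        nonlocal pos
        if pos >= len(tokens) or isinstance(tokens[pos], str):
            raise SyntaxError("operand expected")
        value = tokens[pos]
        pos += 1
        return value

    def arithmetic(min_prec):
        nonlocal pos
        left = operand()
        # Left associativity: the right operand may only contain operators
        # that bind more tightly, so equal-precedence operators group leftward.
        while pos < len(tokens) and LEFT_ASSOC.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            right = arithmetic(LEFT_ASSOC[op] + 1)
            left = (op, left, right)
        return left

    tree = arithmetic(1)
    if pos < len(tokens) and tokens[pos] in NON_ASSOC:
        op = tokens[pos]
        pos += 1
        tree = (op, tree, arithmetic(1))
        # A second non-associative operator at this level is a parse error.
        if pos < len(tokens) and tokens[pos] in NON_ASSOC:
            raise SyntaxError("non-associative operators cannot be chained")
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return tree


print(parse([10, "-", 4, "-", 3]))   # ('-', ('-', 10, 4), 3)
print(parse([5, "+", 2, "*", 4]))    # ('+', 5, ('*', 2, 4))
```

In this sketch, parse([1, "<", 2, "<", 3]) raises SyntaxError, which mirrors the parse error described above for chained non-associative operators.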
Unary Expressions
A unary expression is an expression that contains only one operand and a unary operator that acts upon this single operand. The supported unary expressions are described below.
•A built-in reference expression is an identifier, an identifier followed by a sequence of member selectors, or an identifier followed by a tuple (the latter form is a function call).
•A user reference expression is the same as a built-in reference except that no function calls are possible.
•A unary prefix operator expression is an operator that acts on a single operand and is placed before the operand. Some common unary prefix operators are +, -, !, ++, --.
•A string literal expression: see description in String Literals in the Tokens subsection above.
•A number expression is a number literal with an optional suffix, which can be any built-in identifier. This expression type can be useful, for example, for distance literals.
•A color literal expression: see description in Color Literals in the Tokens subsection above.
•Tuple expressions can be of the following types: (i) an empty tuple that contains no elements and is represented as (); (ii) a singleton tuple that contains a single element (e.g., (50,)); (iii) all other tuples, which contain at least two values (e.g., (5pt, "Hello") is a tuple with a distance as its first value and a string as its second value).
•A structure expression is similar to a tuple expression, with the difference being that tuple members have indexes to identify them, but structure members have names instead of indexes. As opposed to a tuple, a single member structure is still a structure, as it has a name. However, an empty structure is equivalent to an empty tuple. Example of a structure: { X: 30pt, Y: -20pt } is a structure with members X and Y, both of which are distances.
Supported operations
This subsection describes the operations supported for numbers, Booleans, locations, distances, and rectangles.
Numbers
Numbers can be added, subtracted, negated, multiplied, divided, and compared. The PDF Extractor supports the odd and even functions to check whether a number is an odd or even integer. The result will be undefined if a number is not an integer.
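The behavior of the odd and even functions can be sketched as follows. This Python illustration is an assumption about the semantics described above, not the PDF Extractor's implementation: the function names come from the text, but the signatures are hypothetical, and None is used here to stand in for the "undefined" result.

```python
def _as_integer(value):
    """Return value as an int if it is integral, otherwise None."""
    if float(value).is_integer():
        return int(value)
    return None


def odd(value):
    """True if value is an odd integer, False if even, None if not an integer."""
    n = _as_integer(value)
    return None if n is None else n % 2 == 1


def even(value):
    """True if value is an even integer, False if odd, None if not an integer."""
    n = _as_integer(value)
    return None if n is None else n % 2 == 0


print(odd(3), even(3))   # True False
print(odd(2.5))          # None
```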
Booleans
Booleans support equality comparisons (==, !=), logical conjunction (AND, written as &&), and logical disjunction (OR, written as ||).
Locations
A location specifies the position of an edge on the page. Location operations enable you to place an edge on the page. Typically, you do this by referencing an edge (e.g., Left) or by adding a distance to an edge (e.g., Left + 50pt). Locations specified in the same cardinal direction can be subtracted to give a result location (e.g., (Top + 500pt) - (Top + 200pt) = Top + 300pt).
Distances
A distance specifies how far apart two locations on the page are. Distances are expressed in the units pt, in, cm, mm, and pc, with the same meaning as in CSS. You can add and subtract distances as well as multiply a distance by a number. It is also possible to divide one distance by another to get their ratio.
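The distance arithmetic described above can be modeled with a small value type. The following Python sketch is illustrative only. The Distance class, its of constructor, and the choice of points as the internal unit are assumptions. The conversion factors follow the CSS definitions of absolute lengths (1in = 72pt = 6pc = 2.54cm).

```python
from dataclasses import dataclass

# Points per unit, following the CSS absolute length definitions.
POINTS_PER_UNIT = {
    "pt": 1.0,
    "pc": 12.0,
    "in": 72.0,
    "cm": 72.0 / 2.54,
    "mm": 72.0 / 25.4,
}


@dataclass(frozen=True)
class Distance:
    """Hypothetical distance value, stored internally in points."""
    points: float

    @classmethod
    def of(cls, value, unit):
        return cls(value * POINTS_PER_UNIT[unit])

    def __add__(self, other):
        return Distance(self.points + other.points)

    def __sub__(self, other):
        return Distance(self.points - other.points)

    def __mul__(self, factor):
        # Multiplying a distance by a plain number gives a distance.
        return Distance(self.points * factor)

    def __truediv__(self, other):
        # Dividing two distances gives a unitless ratio.
        return self.points / other.points


total = Distance.of(1, "in") + Distance.of(36, "pt")
print(total.points)                                   # 108.0
print(Distance.of(1, "in") / Distance.of(1, "pc"))    # 6.0
```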
Operations with rectangles
The PDF Extractor supports the following functions that enable you to manipulate rectangles:
The inflate function changes the size of a rectangle. If you specify negative numbers, the function shrinks the size of a rectangle by the given distance. If you specify positive numbers, the function expands the rectangle. This function could be useful if, for example, a page has a frame that you want to exclude from processing. You can modify the size of the rectangle manually, or you can use the inflate function (example below).
Syntax: inflate(rectangle, horizontal distance, vertical distance) -> rectangle
Example: The expression inflate(PageRect, -1cm, -1cm) set in the Region property sets the rectangle 1 cm away from both horizontal edges of the page and 1 cm away from both vertical edges of the page (see screenshot below). For information about PageRect, see Built-in Names below.
The offset function moves the rectangle by the specified horizontal and vertical distance.
Syntax: offset(rectangle, horizontal distance, vertical distance) -> rectangle
The clip function takes two rectangles as its two arguments and returns a rectangle that is the overlap area of the two input rectangles.
Syntax: clip(rectangle, rectangle) -> rectangle
The contains function checks if the specified locations are inside the bounds of the rectangle.
Syntax: contains(rectangle, horizontal location, vertical location) -> boolean
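The four rectangle functions can be sketched in Python as follows. This is an illustration of the described behavior, not the PDF Extractor's implementation. The Rect type and its field names are hypothetical. Coordinates are assumed to be plain numbers in points, with the origin at the top-left corner and the vertical axis growing downward. The treatment of non-overlapping rectangles in clip (returning None) and of points on the boundary in contains (counted as inside) is not specified in the text and is assumed here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rect:
    """Hypothetical rectangle defined by its four edges."""
    left: float
    top: float
    right: float
    bottom: float


def inflate(rect, dx, dy):
    # Move every edge outward by the given distance; negative values shrink.
    return Rect(rect.left - dx, rect.top - dy, rect.right + dx, rect.bottom + dy)


def offset(rect, dx, dy):
    # Move the whole rectangle without changing its size.
    return Rect(rect.left + dx, rect.top + dy, rect.right + dx, rect.bottom + dy)


def clip(a, b):
    # Return the overlap of two rectangles, or None if they do not overlap.
    left, top = max(a.left, b.left), max(a.top, b.top)
    right, bottom = min(a.right, b.right), min(a.bottom, b.bottom)
    if left > right or top > bottom:
        return None
    return Rect(left, top, right, bottom)


def contains(rect, x, y):
    # Check whether the point (x, y) lies inside the rectangle's bounds.
    return rect.left <= x <= rect.right and rect.top <= y <= rect.bottom


page = Rect(0, 0, 600, 800)        # stand-in for PageRect
print(inflate(page, -10, -10))     # Rect(left=10, top=10, right=590, bottom=790)
print(contains(page, 100, 100))    # True
```

As in the inflate example above, a negative distance moves each edge inward, so the width and height each shrink by twice the given distance.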
Built-in Names
The PDF Extractor supports the following built-in names that can be used in computations:
•Left: The left edge of the current page. Left is an alias of PageRect.Left.
•Top: The top edge of the current page. Top is an alias of PageRect.Top.
•Right: The right edge of the current page. Right is an alias of PageRect.Right.
•Bottom: The bottom edge of the current page. Bottom is an alias of PageRect.Bottom.
•PageRect: A rectangle defining the boundaries of the current page. The PageRect built-in name is useful in combination with rectangle-processing functions such as inflate (see Operations with Rectangles above).
•PageNumber: The page number of the current page.