In chapter 1 we briefly touched upon location path expressions, noting their similarity to filesystem paths. In this chapter we delve deeper. Several examples are provided to facilitate understanding of this fundamental type of XPath expression.
As with filesystem paths, location paths can be absolute or relative.
An absolute location path is always evaluated from the document root.
A relative location path is always evaluated from a context node.
An absolute location path begins with a '/' to signify that it is starting from the document root. The document root is the top level of the XML document's node hierarchy and contains all other nodes in the XML document, including the root element (which is the top level of the XML document's element hierarchy).
An absolute location path can also begin with '//'; however, '/' and '//' mean different things. The following section includes two examples which illustrate the difference.
Absolute Location Path Examples
Download example file: company_1.xml
/
This XPath expression selects the document root. The document root is the top of the XML document's node hierarchy; it contains all other nodes in the XML document.
/comment()
This XPath expression selects all comment nodes which are children of the document root. In this case there is only one comment, '<!-- Edited with XML Spy-->'.
/company
This XPath expression selects the 'company' element node which is a child of the document root. In this XML document 'company' is the root element. The root element is the top level of the element hierarchy in the XML document, i.e. it contains all other elements and attributes in the XML document. The document root is not to be confused with the root element.
/company/office
This XPath expression selects all 'office' elements which are children of the 'company' element, which in turn is a child element of the document root.
/company/office[2]
This XPath expression selects the second 'office' element which is a child of the 'company' element, which in turn is a child element of the document root. The square brackets indicate a predicate, which is used to filter sequences (in this case the sequence of 'office' nodes). '[2]' is the short form of '[position() = 2]'.
/company/office/@location
This XPath expression selects the 'location' attribute of all 'office' elements which are children of the 'company' element, which in turn is a child element of the document root. '@' is an abbreviated form of the 'attribute::' axis specifier.
/company/office[@location = 'Boston']
This XPath expression selects the 'office' element(s) which have an attribute named 'location' with a value equal to 'Boston'.
/*
This XPath expression selects all element children of the document root (in a well-formed document, only the root element, here 'company'). '*' is an element wildcard.
//*
This XPath expression selects all element descendants of the document root, i.e. every element in the document. '//' is the abbreviated form of '/descendant-or-self::node()/'.
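The absolute paths above can be tried out with a short Python sketch. It uses the standard library's xml.etree.ElementTree module, which supports only a subset of XPath and evaluates paths relative to an element, so each absolute path is rewritten relative to the 'company' root element. The XML below is a hypothetical stand-in for company_1.xml (the employee names and ages are assumptions), not the actual file.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for company_1.xml; the real file is not reproduced here.
XML = """<company>
  <office location="Boston">
    <employee><first_name>John</first_name><age>25</age></employee>
  </office>
  <office location="Vienna">
    <employee><first_name>Andy</first_name><age>34</age></employee>
  </office>
</company>"""

# fromstring() returns the root element ('company'), not the document root.
company = ET.fromstring(XML)

offices = company.findall("office")                      # /company/office
second = company.findall("office[2]")                    # /company/office[2]
boston = company.findall("office[@location='Boston']")   # /company/office[@location = 'Boston']

# ElementTree cannot select attribute nodes, so /company/office/@location
# is emulated by reading the attribute from each selected element.
locations = [office.get("location") for office in offices]

# //* : every element in the document, in document order (iter() includes the root element).
all_elements = [element.tag for element in company.iter()]

print(locations)
print(all_elements)
```

A full XPath processor would accept the absolute expressions as written; the rewriting here is only a consequence of ElementTree's limited path support.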
A relative location path is always evaluated from a context node.
A context node can be thought of as the node that the XPath processor is 'currently processing'.
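The context node can change within an XPath query: each step is evaluated once for every node selected by the previous step, with that node as the new context node. For example:
office/employee/first_name[. = 'John']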
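To make the changing context node visible, the following Python sketch evaluates the expression once in a single call and once step by step, where each loop variable plays the role of the context node for the next step. It assumes Python 3.7 or later (for the "[.='text']" predicate in xml.etree.ElementTree) and a hypothetical XML document in which 'company' is the initial context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; element names follow the expression above.
company = ET.fromstring(
    "<company>"
    "<office><employee><first_name>John</first_name></employee>"
    "<employee><first_name>Andy</first_name></employee></office>"
    "<office><employee><first_name>Mary</first_name></employee></office>"
    "</company>"
)

# Whole path in one call, with 'company' as the context node.
direct = company.findall("office/employee/first_name[.='John']")

# The same path one step at a time: each selected node becomes
# the context node for the step that follows.
stepwise = []
for office in company.findall("office"):                # context: company
    for employee in office.findall("employee"):         # context: office
        for first_name in employee.findall("first_name"):  # context: employee
            if first_name.text == "John":                # predicate [. = 'John']
                stepwise.append(first_name)

print([element.text for element in direct])
```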
Relative Location Path Examples
In the following examples the context node is an 'employee' element in the first 'office' element.
.
This XPath expression selects the context node. A dot, i.e. '.', is an abbreviated form of 'self::node()'.
..
This XPath expression selects the parent element of the context node. '..' is an abbreviated form of 'parent::node()'.
../..
This XPath expression selects the parent element of the parent element of the context node.
first_name
This XPath expression selects all 'first_name' elements which are children of the context node.
*
This XPath expression selects all child element nodes of the context node.
age/text()
This XPath expression selects the text node children of the 'age' child element of the context node.
../following-sibling::office/@location
This XPath expression first navigates to the parent element of the context node (which is the first 'office' element child of the 'company' element). The next step navigates to the sibling 'office' element(s) which follow (in document order) the first 'office' element. The third step accesses the 'location' attribute of each following sibling 'office' element.
ancestor::*
This XPath expression uses the 'ancestor::' axis specifier and the '*' wildcard to select all ancestor elements (parent, grandparent, etc.) of the context node.
ancestor-or-self::*
This XPath expression uses the 'ancestor-or-self::' axis specifier and the '*' wildcard to select the context node and all ancestor elements (parent, grandparent, etc.) of the context node.
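The relative paths above can be approximated in Python as a rough sketch. xml.etree.ElementTree does not keep parent links and does not support the ancestor or following-sibling axes, so those are emulated here with a parent map; the helper name 'ancestors' and the sample XML are assumptions made for illustration, not part of any library or of company_1.xml.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for company_1.xml.
company = ET.fromstring(
    "<company>"
    "<office location='Boston'>"
    "<employee><first_name>John</first_name><age>25</age></employee>"
    "</office>"
    "<office location='Vienna'>"
    "<employee><first_name>Andy</first_name><age>34</age></employee>"
    "</office>"
    "</company>"
)

# ElementTree elements have no parent pointer, so build a child -> parent map.
parent_of = {child: parent for parent in company.iter() for child in parent}

# Context node: the first 'employee' of the first 'office'.
context = company.find("office/employee")


def ancestors(node):
    """Emulate ancestor::* by walking the parent map, nearest ancestor first."""
    found = []
    while node in parent_of:
        node = parent_of[node]
        found.append(node)
    return found


self_node = context.findall(".")             # .
parent = parent_of[context]                  # ..
grandparent = parent_of[parent]              # ../..
first_names = context.findall("first_name")  # first_name
children = context.findall("*")              # *
age_text = context.findtext("age")           # age/text()

# ../following-sibling::office/@location
siblings = list(grandparent)
following = siblings[siblings.index(parent) + 1:]
following_locations = [s.get("location") for s in following if s.tag == "office"]

ancestor_tags = [a.tag for a in ancestors(context)]      # ancestor::*
ancestor_or_self_tags = [context.tag] + ancestor_tags    # ancestor-or-self::*

print(following_locations, ancestor_tags)
```

The ancestor lists are built nearest-first, which mirrors how positions are counted along a reverse axis.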
A location path contains one or more steps.
A step consists of an axis, a node test and zero or more predicates:
axis::node_test[predicate]
child::office[@location='Vienna']
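The anatomy of a step can be illustrated with a small Python sketch that splits a step string into its axis, node test and predicates. The function name 'split_step' and the regular expression are illustrative assumptions, not a real XPath parser: nested brackets, brackets inside string literals, and abbreviations such as '@', '.' and '..' are not handled. When no axis is written, the sketch falls back to 'child'.

```python
import re

# axis:: (optional), then the node test, then any number of [predicate] blocks.
STEP_RE = re.compile(
    r"^(?:(?P<axis>[a-z-]+)::)?(?P<test>[^\[\]]+)(?P<predicates>(?:\[[^\]]*\])*)$"
)


def split_step(step):
    """Split one location step into (axis, node_test, [predicates])."""
    match = STEP_RE.match(step)
    if match is None:
        raise ValueError(f"cannot parse step: {step!r}")
    axis = match.group("axis") or "child"  # no axis written: child axis
    predicates = re.findall(r"\[([^\]]*)\]", match.group("predicates"))
    return axis, match.group("test"), predicates


print(split_step("child::office[@location='Vienna']"))
print(split_step("employee[age >= 30][4]"))
```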
The axis is the first part of a location step; it determines the direction in which to navigate relative to the context node. There are 13 axes, each of which is either a forward axis or a reverse axis.
Nodes on a forward axis are ordered in document order, and nodes on a reverse axis in reverse document order. This ordering determines the positions seen by a predicate, so on a reverse axis '[1]' selects the node nearest to the context node.
A double colon :: is used to separate the axis specifier from the node test.
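Forward Axis | child, descendant, descendant-or-self, attribute, self, following, following-sibling, namespace |
Reverse Axis | parent, ancestor, ancestor-or-self, preceding, preceding-sibling |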
child::office
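The difference between a forward and a reverse axis can be sketched in Python by emulating 'following-sibling::office' and 'preceding-sibling::office' with plain list slicing, since xml.etree.ElementTree supports neither axis. The four office locations are hypothetical sample data, and the third office is taken as the context node.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data with four sibling 'office' elements.
company = ET.fromstring(
    "<company>"
    "<office location='Boston'/>"
    "<office location='Vienna'/>"
    "<office location='London'/>"
    "<office location='Paris'/>"
    "</company>"
)

offices = list(company)
context = offices[2]  # the 'London' office is the context node
index = offices.index(context)

# Forward axis: siblings after the context node, in document order.
following = [o.get("location") for o in offices[index + 1:]]

# Reverse axis: siblings before the context node, counted in reverse
# document order, so the nearest sibling comes first.
preceding = [o.get("location") for o in reversed(offices[:index])]

# Emulates preceding-sibling::office[1]: the nearest preceding sibling.
nearest_preceding = preceding[0]

print(following, preceding, nearest_preceding)
```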
Every axis has a principal node type. For the 'attribute' and 'namespace' axes the principal node type is 'attribute' and 'namespace' respectively; for all other axes the principal node type is 'element'.
The default axis (i.e. the axis used if no other axis has been specified) is the child axis.
Several of the examples that we have encountered have not explicitly specified an axis, e.g. 'child::'. This is because an abbreviated syntax exists for some axes and axis/node-test combinations, e.g. an XPath expression to select all child elements named 'office' from the context node can be written as 'child::office' or simply as 'office' in the abbreviated syntax.
A node test appears after the axis specifier in a location path step.
There are three types of node test: a test by name, a test by node kind and a test by node type.
A name can be any XML name, for example:
office
@location
The kinds of node that can be tested for include element, attribute, text, comment, processing-instruction and document nodes, for example:
//attribute()
//element()
//*
A node can be tested against any built-in XML Schema datatype as well as any user-defined simple or complex type, for example:
//element(*, xs:date)
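As a rough illustration of name and kind tests, the Python sketch below contrasts a name test, the '*' wildcard and an emulated '//attribute()'. xml.etree.ElementTree has no attribute nodes and no kind tests, so attributes are collected from each element's dictionary; type tests such as 'element(*, xs:date)' need a schema-aware XPath 2.0 processor and are not shown. The sample XML, including the 'id' attribute, is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample data; the 'id' attribute is an assumption for illustration.
company = ET.fromstring(
    "<company>"
    "<office location='Boston'>"
    "<employee id='e1'><first_name>John</first_name></employee>"
    "</office>"
    "<office location='Vienna'/>"
    "</company>"
)

# Name test 'office': only child elements with that name.
by_name = [e.tag for e in company.findall("office")]

# Wildcard '*': all child elements, whatever their name.
by_wildcard = [e.tag for e in company.findall("office/*")]

# //* : every element in the document (iter() includes the root element).
all_elements = [e.tag for e in company.iter()]

# //attribute() emulated: gather every attribute of every element.
all_attributes = [
    (name, value) for e in company.iter() for name, value in e.attrib.items()
]

print(by_name, by_wildcard)
print(all_attributes)
```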
Predicates are used to filter nodes in a location path step. They appear within square brackets i.e. '[' and ']' after the node test.
//first_name[. = 'Andy']
//employee[age > 18 and age < 30]
//employee[age >= 30][4]/first_name
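The three predicate examples can be emulated in Python as a sketch. xml.etree.ElementTree supports the "[.='text']" predicate (Python 3.7 or later) but not numeric comparisons, so those are written as ordinary list filters. Note that in '//employee[age >= 30][4]' the position is counted among the matching 'employee' children of each parent, not across the whole document, which is why the code loops over parents. The employee names and ages are hypothetical sample data.

```python
import xml.etree.ElementTree as ET

# Build a hypothetical document: one office with six employees.
company = ET.Element("company")
office = ET.SubElement(company, "office", location="Boston")
staff = [("Andy", 25), ("John", 31), ("Mary", 45),
         ("Paul", 38), ("Anna", 52), ("Luke", 29)]
for name, age in staff:
    employee = ET.SubElement(office, "employee")
    ET.SubElement(employee, "first_name").text = name
    ET.SubElement(employee, "age").text = str(age)

# //first_name[. = 'Andy']
andy = company.findall(".//first_name[.='Andy']")

# //employee[age > 18 and age < 30], emulated with a Python filter.
young = [e for e in company.iter("employee")
         if 18 < int(e.findtext("age")) < 30]

# //employee[age >= 30][4]/first_name
# The first predicate filters by age; the second keeps the fourth
# remaining employee among the children of each parent element.
fourth = []
for parent in company.iter():
    senior = [e for e in parent.findall("employee")
              if int(e.findtext("age")) >= 30]
    if len(senior) >= 4:
        fourth.extend(senior[3].findall("first_name"))

print([e.text for e in andy])
print([e.findtext("first_name") for e in young])
print([e.text for e in fourth])
```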
Because of their frequent use, an abbreviated syntax exists for the following axis specifiers and axis/node-test combinations:
Abbreviation | Full form | Description |
(none) | child:: | If no axis is specified the 'child::' axis is assumed. 'child::' is the default axis. |
@ | attribute:: | '@' is the abbreviated form of the 'attribute::' axis. |
. | self::node() | '.' is the abbreviated form of the 'self::' axis and the 'node()' node test. |
// | /descendant-or-self::node()/ | '//' is the abbreviated form of the 'descendant-or-self::' axis and the 'node()' node test, written as a complete step. |
.. | parent::node() | '..' is the abbreviated form of the 'parent::' axis and the 'node()' node test. |
[4] | [position() = 4] | A predicate which contains only an integer value, e.g. '[4]', is the abbreviated form of '[position() = 4]'. |
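The abbreviations in the table can be applied mechanically, as the following Python sketch suggests. The functions 'expand_step' and 'expand_path' are hypothetical helpers written for this illustration; they work on plain text, leave the contents of predicates abbreviated (apart from integer-only predicates) and assume that predicates contain no '/' characters or nested brackets. This is a toy under those assumptions, not a real XPath parser.

```python
import re


def expand_step(step):
    """Expand a single abbreviated step into its unabbreviated form."""
    # [4] is short for [position() = 4]
    step = re.sub(r"\[(\d+)\]", r"[position() = \1]", step)
    if step == ".":
        return "self::node()"
    if step == "..":
        return "parent::node()"
    if step.startswith("@"):
        return "attribute::" + step[1:]
    if "::" in step.split("[")[0]:
        return step  # the axis is already written out
    return "child::" + step  # no axis given: child axis


def expand_path(path):
    """Expand every step of a location path (toy version, see assumptions)."""
    # '//' is short for '/descendant-or-self::node()/'
    path = path.replace("//", "/descendant-or-self::node()/")
    return "/".join(expand_step(s) if s else s for s in path.split("/"))


print(expand_path("/company/office[2]/@location"))
print(expand_path("//employee"))
print(expand_path("../following-sibling::office/@location"))
```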