XPath 3.1 became a Candidate Recommendation on 18 December 2014 and a W3C Recommendation on 21 March 2017.
XPath 3.1 conforms to the following specifications:
The most significant addition to XPath 3.1 is support for arrays and maps.
Prior to XPath 3.1, the only supported complex data structures were sequences and element structures. Arrays were introduced in XPath 3.1 because sequences are always flat: it is impossible to construct hierarchical sequences (e.g. a sequence of sequences). Arrays do not have this limitation, i.e. it is possible to construct multidimensional arrays (an array of arrays, etc.). Maps store data in key/value pairs and provide a quick way to look up a value by its key.
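The difference between flat sequences and nestable arrays can be modelled outside XPath. The following Python sketch is an illustration only, not XPath: it assumes that tuples stand in for sequences and lists stand in for arrays, and the helper name 'seq' is hypothetical.

```python
def seq(*items):
    """Model an XPath sequence: any nested sequence is flattened on construction."""
    flat = []
    for item in items:
        if isinstance(item, tuple):   # a tuple stands in for a sequence
            flat.extend(item)
        else:                         # atomic values and arrays (lists) are single items
            flat.append(item)
    return tuple(flat)

# Attempting to nest sequences yields one flat sequence of six items.
flat_sequence = seq(1, seq(2, 3), seq(4, seq(5, 6)))

# Lists (arrays) keep their nesting: three members, two of which are arrays.
nested_array = [1, [2, 3], [4, [5, 6]]]

print(flat_sequence)      # (1, 2, 3, 4, 5, 6)
print(len(nested_array))  # 3
```

An array placed inside a sequence stays intact, which is why arrays can carry hierarchy through a sequence.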
In addition to arrays and maps, XPath 3.1 also introduces support for new operators and built-in functions.
Arrays are a collection of values associated with positions.
Each entry in an array is referred to as a 'member'.
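The first member of an array has an index of 1; this simply means that it is located at position 1.
Arrays can also be nested, i.e. an array can contain members which are themselves arrays.
There are two ways to construct arrays: using a square bracket constructor or using a curly bracket constructor. In the square bracket constructor each comma-separated expression becomes one member; in the curly bracket constructor each item of the enclosed sequence becomes one member.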
Square bracket array constructor
[ 5, 10, 6, 2 ]
Curly bracket array constructor
array{5, 10, 6, 2}
array{5, 10, 6, 2}(2)
10
XPath 3.1 contains a number of built-in utility functions for arrays:
The 'array:size' function returns the number of members in the array.
array:size(['Austria', 'Belgium', 'Canada'])
3
array:size(['Austria', 'Belgium', 'Canada', ['Denmark', 'Estonia']])
4
The 'array:get' function returns the member at a given position: the first argument supplies the array and the second argument is the index of the member to be retrieved from the supplied array.
array:get(['Austria', 'Belgium', 'Canada', ['Denmark', 'Estonia']], 2)
'Belgium'
The 'array:append' function returns an array containing all members in the array supplied as the first argument plus an additional member consisting of the item supplied as the second argument.
array:append([10, 5, 8, 7], 4)
[10, 5, 8, 7, 4]
The 'array:subarray' function returns all members from the array supplied as the first argument, starting from a position supplied as the second argument up to the specified length supplied as an optional third argument.
array:subarray(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'], 2)
['Russia', 'Germany', 'France', 'Sweden']
array:subarray(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'], 3, 2)
['Germany', 'France']
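Because array positions start at 1, the position arithmetic of 'array:get' and 'array:subarray' is easy to get wrong when coming from zero-based languages. The following Python sketch models it with lists; the function names 'array_get' and 'array_subarray' are hypothetical, and the range checks are a simplified reading of the rules, not the processor's actual error codes.

```python
def array_get(array, position):
    """Return the member at a 1-based position."""
    if not 1 <= position <= len(array):
        raise IndexError("index out of range")
    return array[position - 1]

def array_subarray(array, start, length=None):
    """Return 'length' members from 1-based 'start'; default is the rest."""
    if length is None:
        length = len(array) - start + 1
    if start < 1 or length < 0 or start + length > len(array) + 1:
        raise IndexError("index out of range")
    return array[start - 1 : start - 1 + length]

countries = ['Estonia', 'Russia', 'Germany', 'France', 'Sweden']
print(array_get(countries, 2))          # Russia
print(array_subarray(countries, 2))     # ['Russia', 'Germany', 'France', 'Sweden']
print(array_subarray(countries, 3, 2))  # ['Germany', 'France']
```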
The 'array:remove' function returns an array containing all the members of the array supplied as the first argument except for the member at the position supplied as the second argument.
array:remove(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'], 3)
['Estonia', 'Russia', 'France', 'Sweden']
The 'array:insert-before' function returns an array containing all the members of the array supplied as the first argument with one additional member, supplied as the third argument, inserted at the position supplied as the second argument.
array:insert-before(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'], 3, 'Italy')
['Estonia', 'Russia', 'Italy', 'Germany', 'France', 'Sweden']
The 'array:head' function returns the first member of the supplied array.
array:head(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'])
'Estonia'
The 'array:tail' function returns an array containing all members of the supplied array except for the first member.
array:tail(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'])
['Russia', 'Germany', 'France', 'Sweden']
The 'array:reverse' function returns an array containing all members of the supplied array in reverse order.
array:reverse(['Estonia', 'Russia', 'Germany', 'France', 'Sweden'])
['Sweden','France','Germany','Russia','Estonia']
The 'array:join' function concatenates the contents of multiple arrays into one array.
array:join((['Australia', 'New Zealand'] , ['Brazil', 'Argentina']))
['Australia', 'New Zealand', 'Brazil', 'Argentina']
The 'array:for-each' function applies the function supplied as the second argument to each member of the array supplied as the first argument.
array:for-each(['Australia', 'New Zealand', 'Brazil', 'Argentina'] , lower-case(?))
['australia', 'new zealand', 'brazil', 'argentina']
The 'array:filter' function returns an array containing those members of the array supplied as the first argument for which the function supplied as the second argument returns 'true'.
array:filter(['Australia', 'New Zealand', 'Brazil', 'Argentina'] , contains(?, 'zil'))
['Brazil']
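The 'array:fold-left' function takes three arguments. The first argument is an array, the second argument is the base value and the third argument is a function. 'array:fold-left' evaluates the supplied function cumulatively on successive members of the supplied array, working from left to right: the accumulated value is passed as the first argument and the current member as the second. The example below is therefore evaluated as ((0 - 1) - 2) - 3.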
array:fold-left([1 , 2 , 3], 0, function($a, $b){$a - $b})
-6
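The 'array:fold-right' function takes three arguments. The first argument is an array, the second argument is the base value and the third argument is a function. 'array:fold-right' evaluates the supplied function cumulatively on successive members of the supplied array, working from right to left: the current member is passed as the first argument and the accumulated value as the second. The example below is therefore evaluated as 1 - (2 - (3 - 0)).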
array:fold-right([1 , 2 , 3], 0, function($a, $b){$a - $b})
2
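The two folds differ only in the direction of evaluation and in which argument carries the accumulated value. The following Python sketch reproduces both results; the function names 'array_fold_left' and 'array_fold_right' are hypothetical stand-ins, not part of any XPath library.

```python
from functools import reduce

def array_fold_left(array, zero, f):
    """Combine members left to right: f(f(f(zero, m1), m2), m3)."""
    return reduce(f, array, zero)

def array_fold_right(array, zero, f):
    """Combine members right to left: f(m1, f(m2, f(m3, zero)))."""
    result = zero
    for member in reversed(array):
        result = f(member, result)
    return result

def subtract(a, b):
    return a - b

left = array_fold_left([1, 2, 3], 0, subtract)    # ((0 - 1) - 2) - 3
right = array_fold_right([1, 2, 3], 0, subtract)  # 1 - (2 - (3 - 0))
print(left, right)  # -6 2
```

With a commutative and associative function such as addition both folds return the same value; subtraction is used here precisely because it exposes the difference.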
The 'array:for-each-pair' function returns an array which is obtained by evaluating the function supplied as the third argument for each pair of members at the same position in the arrays supplied as the first and second arguments.
array:for-each-pair([ 1, 2, 3, 4] , [ 5, 10, 20, 40] , function($a, $b) {$a * $b})
[5, 20, 60, 160]
The 'array:sort' function returns an array containing all the members of the supplied array sorted according to a sort key function. In the final XPath 3.1 Recommendation 'array:sort' has three signatures: the first takes only the array; the second adds a collation as the second argument; the third adds a function as the third argument, which computes the sort key for each member. To supply a sort key function with the default collation, pass an empty sequence '()' as the second argument. The one-argument signature is equivalent to calling the three-argument signature with the default collation and the 'fn:data#1' function, i.e. it sorts the members according to their typed value.
array:sort([8 , 4 , 1 , 2])
[1 , 2 , 4 , 8]
array:reverse(array:sort([8 , 4 , 1 , 2]))
[8 , 4 , 2 , 1]
array:sort(['d' , 'b' , 'a' , 'c'])
['a' , 'b' , 'c', 'd']
array:sort(['d' , 'b' , 'a' , 2 , 'c'])
Error: cannot compare xs:integer with xs:string
array:sort(['d' , 'b' , 'a' , 2 , 'c'], (), string(?))
[2, 'a', 'b', 'c', 'd']
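Sorting by a key function means that the members are compared by their computed keys while the original members are returned. Python's built-in 'sorted' behaves analogously, so the mixed-type example can be sketched as follows; this is Python's comparison behaviour used as a model, not XPath itself.

```python
members = ['d', 'b', 'a', 2, 'c']

# Without a key, an integer has to be compared with a string, which fails.
try:
    sorted(members)
    mixed_sort_failed = False
except TypeError:
    mixed_sort_failed = True

# With a key, every member is compared by its string form; the integer 2
# is still returned as an integer.
by_string = sorted(members, key=str)

print(mixed_sort_failed)  # True
print(by_string)          # [2, 'a', 'b', 'c', 'd']
```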
The 'array:flatten' function returns a sequence in which every array found in the input sequence is replaced by its members. Nested arrays are flattened recursively.
array:flatten((1, 2, [3, 4]))
(1, 2, 3, 4)
Array Examples
[/company/office/employee/last_name]
[('Smith', 'Jones', 'Brown', 'Davis', 'Mason')]
array{/company/office/employee/last_name}
['Smith', 'Jones', 'Brown', 'Davis', 'Mason']
[/company/office/employee/last_name](1)
('Smith', 'Jones', 'Brown', 'Davis', 'Mason')
array{/company/office/employee/last_name}(1)
'Smith'
[/company/office/employee/last_name](2)
Error: Index out of Range
array{/company/office/employee/last_name}(2)
'Jones'
array:for-each(array{/company/office/employee}, upper-case(?))
['JOHN SMITH 25', 'JOHN JONES 30', 'MARY BROWN 30', 'PETER DAVIS 34', 'MARK MASON 44']
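The examples above turn on how the two constructors treat a sequence. The following Python sketch models that difference, assuming a tuple stands in for the sequence returned by the path expression and plain strings stand in for the element nodes; the names 'square_array' and 'curly_array' are hypothetical.

```python
def square_array(*expressions):
    """[e1, e2, ...]: each comma-separated expression becomes one member."""
    return list(expressions)

def curly_array(sequence):
    """array{ seq }: each item of the sequence becomes one member."""
    return list(sequence)

# Assumed result of /company/office/employee/last_name, shown as strings.
last_names = ('Smith', 'Jones', 'Brown', 'Davis', 'Mason')

one_member = square_array(last_names)   # a single member holding the whole sequence
five_members = curly_array(last_names)  # one member per item

print(len(one_member), len(five_members))  # 1 5
print(one_member[0])                       # the whole sequence
print(five_members[1])                     # Jones
```

This is why position 2 is out of range for the square bracket version but returns 'Jones' for the curly bracket version.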
Maps contain a collection of key / value pairs known as entries.
A map is constructed using the 'map' keyword followed by an opening curly bracket '{' and ending with a closing curly bracket '}'. Each key / value pair consists of the key followed by a colon ':' followed by the associated value. Entries are separated by a comma ','.
Maps can also be nested, i.e. a map can contain another map.
map {'USA' : 'United States', 'CHN' : 'China', 'GER' : 'Germany'}
XPath 3.1 contains a number of built-in utility functions for maps:
The 'map:size' function returns the number of entries in the supplied map.
map:size( map { 'USA' : 'United States', 'CHN' : 'China', 'GER' : 'Germany' })
3
The 'map:get' function returns the value associated with a key: the first argument is the supplied map and the second argument is the key for which the associated value is to be retrieved.
map:get( map { 'USA' : 'United States', 'CHN' : 'China', 'GER' : 'Germany' }, 'CHN' )
'China'
The 'map:put' function returns a new map containing all entries of the map supplied as the first argument, plus an entry for the key supplied as the second argument with the value supplied as the third argument. If that key already exists in the supplied map, its associated value is replaced; otherwise an additional key / value pair is added.
map:put(map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , 'a', 'andy')
map{'a' : 'andy', 'b' : 'bob', 'c' : 'carl'}
map:put(map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , 'd', 'dave')
map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl', 'd' : 'dave'}
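Note that 'map:put' returns a new map and leaves the supplied map unchanged. A Python sketch of that behaviour, using a dictionary as a stand-in for the map and a hypothetical helper named 'map_put':

```python
def map_put(m, key, value):
    """Return a new map with 'key' set to 'value'; the input map is untouched."""
    return {**m, key: value}

names = {'a': 'anton', 'b': 'bob', 'c': 'carl'}

replaced = map_put(names, 'a', 'andy')  # existing key: value replaced
extended = map_put(names, 'd', 'dave')  # new key: entry added

print(replaced)  # {'a': 'andy', 'b': 'bob', 'c': 'carl'}
print(extended)  # {'a': 'anton', 'b': 'bob', 'c': 'carl', 'd': 'dave'}
print(names)     # {'a': 'anton', 'b': 'bob', 'c': 'carl'}
```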
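The 'map:merge' function returns a new map containing the entries from a sequence of existing maps. If the same key occurs in more than one of the supplied maps, the final XPath 3.1 Recommendation keeps the value from the first map by default ('use-first'). An optional second argument supplies a map of options; the 'duplicates' option changes this behaviour, e.g. 'use-last' keeps the value from the last map.
map:merge((map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , map{'d' : 'dave', 'e' : 'earl' }, map{ 'a' : 'anna', 'd' : 'diane' }))
map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl', 'd' : 'dave', 'e' : 'earl'}
map:merge((map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , map{'d' : 'dave', 'e' : 'earl' }, map{ 'a' : 'anna', 'd' : 'diane' }), map{'duplicates' : 'use-last'})
map{'a' : 'anna', 'b' : 'bob', 'c' : 'carl', 'd' : 'diane', 'e' : 'earl'}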
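When several maps share a key, the result of a merge depends on which occurrence wins. The following Python sketch models the two simplest policies; the policy names follow the 'duplicates' option of the XPath 3.1 Recommendation, while the function 'map_merge' itself is a hypothetical stand-in and not the processor's implementation.

```python
def map_merge(maps, duplicates):
    """Merge maps in order; 'duplicates' is 'use-first' or 'use-last'."""
    if duplicates not in ('use-first', 'use-last'):
        raise ValueError("unsupported policy in this sketch")
    merged = {}
    for m in maps:
        for key, value in m.items():
            if key in merged and duplicates == 'use-first':
                continue  # keep the value that was seen first
            merged[key] = value
    return merged

maps = [
    {'a': 'anton', 'b': 'bob', 'c': 'carl'},
    {'d': 'dave', 'e': 'earl'},
    {'a': 'anna', 'd': 'diane'},
]

first_wins = map_merge(maps, 'use-first')
last_wins = map_merge(maps, 'use-last')
print(first_wins['a'], first_wins['d'])  # anton dave
print(last_wins['a'], last_wins['d'])    # anna diane
```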
The 'map:keys' function returns a sequence of all keys in the supplied map. The order of the keys is implementation-dependent.
map:keys(map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'})
('a', 'b', 'c')
The 'map:contains' function tests whether the map supplied as the first argument contains an entry for the key which is specified as the second argument to the function.
map:contains(map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , 'a')
true
map:contains(map{'a' : 'anton', 'b' : 'bob', 'c' : 'carl'} , 'd')
false
The 'map:entry' function returns a map with one entry (key / value pair). The key is supplied as the first argument and the value as the second argument.
map:entry( 1 , 'John Smith' )
map{ 1 : 'John Smith'}
The 'map:remove' function returns a map containing all entries of the supplied map except for the entry whose key is supplied as the second argument.
map:remove(map{1 : 'Estonia', 2 : 'Russia', 3 : 'Germany', 4 : 'France', 5 : 'Sweden'} , 4)
map{1 : 'Estonia', 2 : 'Russia', 3 : 'Germany', 5 : 'Sweden'}
The 'map:for-each' function returns a sequence of items by applying the function supplied as the second argument to each entry of the map supplied as the first argument.
map:for-each(map{1 : 'Estonia', 2 : 'Russia', 3 : 'Germany', 4 : 'France', 5 : 'Sweden'}, function($a, $b){upper-case($b) } )
('ESTONIA' , 'RUSSIA' , 'GERMANY' , 'FRANCE' , 'SWEDEN')
Map Examples
map{ 'Boston' : /company/office[@location='Boston']/employee/last_name/string() , 'Vienna' : /company/office[@location='Vienna']/employee/last_name/string() }
map{'Boston' : ('Smith', 'Jones'), 'Vienna' : ('Brown', 'Davis', 'Mason')}
map:size(map{ 'Boston' : /company/office[@location='Boston']/employee/last_name/string() , 'Vienna' : /company/office[@location='Vienna']/employee/last_name/string() })
2
map:get(map{ 'Boston' : data(/company/office[@location='Boston']/employee/last_name) , 'Vienna' : data(/company/office[@location='Vienna']/employee/last_name) } , 'Vienna')
('Brown', 'Davis', 'Mason')
map{ 'Boston' : data(/company/office[@location='Boston']/employee/last_name) , 'Vienna' : data(/company/office[@location='Vienna']/employee/last_name) }('Vienna')
('Brown', 'Davis', 'Mason')
map:merge(for $i in //employee return map:entry($i/@id, $i/last_name))
map {1B:Smith, 1V:Brown, 2B:Jones, 2V:Davis, 3V:Mason}
JSON stands for JavaScript Object Notation. It is a widely used data interchange format. The following two built-in functions of XPath 3.1 read JSON data.
The 'parse-json' function parses a string supplied in JSON format and typically returns a map or array. The 'parse-json' function has two signatures: a one-argument signature in which only the string to be parsed is supplied, and a two-argument signature where the first argument is the supplied string and the second argument is a map of options which control how the string is parsed.
parse-json('{"employees" : {"employee" : {"id": 1, "name": "Chris", "age": 41}}}')
map {"employees" : map {"employee" : map {"age" : 41, "id" : 1 , "name" : 'Chris'}}}
The 'json-doc' function reads an external resource containing JSON and returns the result of parsing the resource using the 'parse-json' function. The 'json-doc' function has two signatures: the one-argument version supplies the URL from which the JSON should be read, and the two-argument version additionally supplies a map of options to be used when parsing the JSON data.
json-doc('http://blahblah/employees.json')
map {"employees" : map {"employee" : map {"age" : 41, "id" : 1 , "name" : 'Chris'}}}
In addition to the new map, array and JSON features, XPath 3.1 also contains new built-in functions and operators.
The following list contains the new operators in XPath 3.1:
The lookup operator i.e. '?' is used to retrieve array members located at a specified position in an array, or map values associated with a specific key in a map. The lookup operator is followed by a specifier.
The specifier can be the integer position of an array member, the name of a key (in a map), a wildcard '*', or any other parenthesized expression.
map{1: 'a', 2: 'b', 3: 'c'}?2
'b'
map{1: 'a' , 2: 'b' , 3: 'c'}?*
('a' , 'b' , 'c')
let $x := parse-json('{"employees" : {"employee" : {"id": 1, "name": "Chris", "age": 41}}}') return $x?employees?employee?age
41
array{'x' , 'y' , 'z'}?1
'x'
array{'x' , 'y' , 'z'}?*
('x' , 'y' , 'z')
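The lookup examples above can be modelled with ordinary container access. The following Python sketch assumes dictionaries stand in for maps and lists for arrays; the function name 'lookup' is hypothetical, the order of values returned for a map wildcard is not guaranteed in XPath, and a missing key raises an error here, which is a simplification of this sketch.

```python
import json

def lookup(container, specifier):
    """Model '?': 1-based position for arrays, key for maps, '*' for all values."""
    if isinstance(container, list):
        if specifier == '*':
            return tuple(container)
        return container[specifier - 1]
    if specifier == '*':
        return tuple(container.values())
    return container[specifier]

doc = json.loads('{"employees" : {"employee" : {"id": 1, "name": "Chris", "age": 41}}}')

# Equivalent of $x?employees?employee?age
age = lookup(lookup(lookup(doc, 'employees'), 'employee'), 'age')

print(age)                                    # 41
print(lookup({1: 'a', 2: 'b', 3: 'c'}, 2))    # b
print(lookup(['x', 'y', 'z'], 1))             # x
print(lookup(['x', 'y', 'z'], '*'))           # ('x', 'y', 'z')
```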
The arrow operator, i.e. '=>', applies a function to a value. The value is used as the first argument to the function. The arrow is followed by the function to be called. Chaining arrows avoids nested function calls; the two expressions below are equivalent.
'hello goodbye hallo gutentag' => tokenize() => sort() => string-join(' ')
'goodbye gutentag hallo hello'
string-join(sort(tokenize('hello goodbye hallo gutentag')), ' ' )
'goodbye gutentag hallo hello'
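The arrow operator is a form of left-to-right function chaining. The following Python sketch models it with a hypothetical helper named 'pipe'; Python's 'str.split', 'sorted' and 'str.join' are used as rough stand-ins for 'tokenize', 'sort' and 'string-join'.

```python
from functools import reduce

def pipe(value, *steps):
    """Feed 'value' through each step; each step receives the previous result."""
    return reduce(lambda acc, step: step(acc), steps, value)

chained = pipe(
    'hello goodbye hallo gutentag',
    str.split,                      # tokenize()
    sorted,                         # sort()
    lambda words: ' '.join(words),  # string-join(' ')
)

# The same computation written as nested calls, read inside out.
nested = ' '.join(sorted('hello goodbye hallo gutentag'.split()))

print(chained)  # goodbye gutentag hallo hello
```

The chained form lists the steps in the order they are applied, which is the readability benefit the arrow operator provides.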
The following list contains some of the interesting new additions to the built-in function library of XPath 3.1
The 'contains-token' function is a boolean function which determines whether any of the strings supplied in the first argument, when tokenized at whitespace boundaries, contains the token supplied as the second argument. The 'contains-token' function also has a three-argument variation where the third argument is the collation.
contains-token(('hello', 'hello world'), 'world' )
true
contains-token(('hello', 'helloworld'), 'world' )
false
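The second example returns false because the match is made against whole tokens, not substrings. The following Python sketch models this with a hypothetical function named 'contains_token'; it assumes plain string equality in place of a collation.

```python
def contains_token(strings, token):
    """True if any string, split at whitespace, has a token equal to 'token'."""
    token = token.strip()  # surrounding whitespace in the token is ignored
    return any(token in s.split() for s in strings)

print(contains_token(('hello', 'hello world'), 'world'))  # True
print(contains_token(('hello', 'helloworld'), 'world'))   # False
```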
The 'parse-ietf-date' function returns an 'xs:dateTime' value from a date supplied in IETF format. Many HTML pages on the web use the IETF date format, so this function is useful when retrieving data from HTML sources.
parse-ietf-date('Thu, 12 Mar 2015 09:40:00 GMT')
xs:dateTime('2015-03-12T09:40:00Z')
The 'random-number-generator' function generates a random number. The function has two signatures: a zero argument signature, and a one argument signature. The one argument signature supplies a seed for calculating the random number.
The 'random-number-generator' function returns a map containing three entries: 'number', 'next' and 'permute'.
The associated value of 'number' is the random number, an 'xs:double' value greater than or equal to zero (0.0e0) and less than one (1.0e0).
The associated value of 'next' is a zero-arity function that can be called to return another random number generator.
The associated value of 'permute' is a function with arity 1 (one argument), which takes an arbitrary sequence as its argument and returns a random permutation of that sequence.
random-number-generator()('number')
xs:double(0.11688232421875)
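The returned map is a value, not a stateful object: a further random number is obtained by calling 'next' to get a new generator. The following Python sketch models that structure with a dictionary. The function name 'random_number_generator', the seed value 42 and the use of Python's 'random' module are assumptions of this sketch and say nothing about the algorithm an XPath processor uses.

```python
import random

def random_number_generator(seed=None):
    """Return a dict with 'number', 'next' and 'permute' entries."""
    rng = random.Random(seed)
    number = rng.random()             # 0.0 <= number < 1.0
    next_seed = rng.getrandbits(64)   # fixed seed for the following generator
    permute_seed = rng.getrandbits(64)

    def permute(sequence):
        items = list(sequence)
        random.Random(permute_seed).shuffle(items)
        return tuple(items)

    return {
        'number': number,
        'next': lambda: random_number_generator(next_seed),
        'permute': permute,
    }

gen = random_number_generator(42)
second = gen['next']()  # a new generator carrying the next number

print(0.0 <= gen['number'] < 1.0)          # True
print(sorted(gen['permute']((1, 2, 3))))   # [1, 2, 3]
```

Calling 'next' twice on the same generator yields the same result in this sketch, which mirrors the fact that each generator is an immutable value.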
The 'sort' function sorts the supplied sequence of items.
The one-argument version of the 'sort' function simply supplies a sequence of items to be sorted.
In the final XPath 3.1 Recommendation, the two-argument version of the 'sort' function adds a collation as the second argument, and the three-argument version adds a function as the third argument, which is applied to each item in the sequence. The sequence is sorted based on the results of applying the function to the items in the sequence. To use the default collation with a function, pass an empty sequence '()' as the second argument.
fn:sort((-2, 3, 1))
(-2, 1, 3)
fn:sort((-2, 3, 1), (), fn:abs#1)
(1, -2, 3)