Given a string value and a collation, generates an internal value called a collation key, with the property that the matching and ordering of collation keys reflects the matching and ordering of strings under the specified collation.
fn:collation-key($key as xs:string) as xs:base64Binary
fn:collation-key($key as xs:string, $collation as xs:string) as xs:base64Binary
Calling the one-argument version of this function is equivalent to calling the two-argument version supplying the default collation as the second argument.
The function returns an implementation-dependent value with the property that, for any two strings $K1 and $K2:
collation-key($K1, $C) eq collation-key($K2, $C) if and only if compare($K1, $K2, $C) eq 0
collation-key($K1, $C) lt collation-key($K2, $C) if and only if compare($K1, $K2, $C) lt 0
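The two equivalences can be checked directly for a given pair of strings. The following XQuery 3.1 sketch assumes a processor that recognizes the UCA collation URI used in the examples of this section and can generate collation keys for it; the sample strings are illustrative only.

```xquery
xquery version "3.1";

(: Assumption: the processor supports this UCA collation and can build keys for it. :)
declare variable $C as xs:string :=
  'http://www.w3.org/2013/collation/UCA?strength=primary';

let $K1 := "apple"
let $K2 := "Banana"
return (
  (: equality of keys agrees with equality under the collation :)
  (collation-key($K1, $C) eq collation-key($K2, $C))
    eq (compare($K1, $K2, $C) eq 0),
  (: ordering of keys agrees with ordering under the collation :)
  (collation-key($K1, $C) lt collation-key($K2, $C))
    eq (compare($K1, $K2, $C) lt 0)
)
```

If the stated property holds, both items of the result are true, whatever the actual key values are.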
The collation used by this function is determined according to the rules in . Collation keys are defined as xs:base64Binary values to ensure unambiguous and context-free comparison semantics.
An implementation is free to generate a collation key in any convenient way provided that it always generates the same collation key for two strings that are equal under the collation, and different collation keys for strings that are not equal. This holds only within a single execution scope; an implementation is under no obligation to generate the same collation keys during a subsequent unrelated query or transformation.
It is possible to define collations that do not have the ability to generate collation keys. Supplying such a collation will cause the function to fail. The ability to generate collation keys is an implementation-defined property of the collation.
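A caller that cannot know in advance whether a collation supports key generation may want to guard the call. The following is a minimal sketch using the XQuery try/catch expression; the helper name local:try-collation-key is hypothetical, and the wildcard catch is an assumption made for brevity (it hides every error, not only the one caused by an unsuitable collation).

```xquery
xquery version "3.1";

(: Hypothetical helper: returns the collation key, or the empty sequence if
   key generation fails for any reason (for example, an unsuitable collation). :)
declare function local:try-collation-key(
  $s as xs:string,
  $collation as xs:string
) as xs:base64Binary? {
  try {
    collation-key($s, $collation)
  } catch * {
    ()
  }
};

local:try-collation-key(
  "A",
  'http://www.w3.org/2013/collation/UCA?strength=primary'
)
```

Returning the empty sequence lets the caller fall back to another normalization, such as fn:lower-case, when no key is available.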
let $C := 'http://www.w3.org/2013/collation/UCA?strength=primary'
The expression map:merge((map{collation-key("A", $C):1}, map{collation-key("a", $C):2}), map{"duplicates":"use-last"})(collation-key("A", $C)) returns 2.
The expression let $M := map{collation-key("A", $C):1, collation-key("B", $C):2} return $M(collation-key("a", $C)) returns 1.
As the above examples illustrate, when the collation-key function is used to add entries to a map, it must also be used when retrieving entries from the map. This process can be made less error-prone by encapsulating the map within a function: function($k) {$M(collation-key($k, $collation))}.
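The encapsulation described above can be written out in full as follows. This is a sketch only: the variable name $lookup is hypothetical, and the collation URI is the one used in the preceding examples, which the processor is assumed to support.

```xquery
xquery version "3.1";

declare variable $collation as xs:string :=
  'http://www.w3.org/2013/collation/UCA?strength=primary';

(: Keys are stored as collation keys, never as the raw strings. :)
let $M := map {
  collation-key("A", $collation): 1,
  collation-key("B", $collation): 2
}
(: The wrapper applies the same key function on retrieval, so callers
   cannot forget to do so. :)
let $lookup := function($k as xs:string) {
  $M(collation-key($k, $collation))
}
return ($lookup("a"), $lookup("b"))
```

Because the collation has primary strength, the lower-case lookups are expected to find the entries stored under "A" and "B".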
An error is raised if the specified collation does not support the generation of collation keys.
The function is provided primarily for use with maps. If a map is required where codepoint equality is inappropriate for comparing keys, then a common technique is to normalize the key so that equality matching becomes feasible. There are many ways keys can be normalized, for example by use of functions such as fn:upper-case, fn:lower-case, fn:normalize-space, or fn:normalize-unicode, but this function provides a way of normalizing them according to the rules of a specified collation. For example, if the collation ignores accents, then the function will generate the same collation key for two input strings that differ only in their use of accents.
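As an illustration of normalizing by collation, the following sketch groups strings that are equal under a primary-strength UCA collation. The word list is invented for the example, and the processor is assumed to support the collation URI shown.

```xquery
xquery version "3.1";

declare variable $C as xs:string :=
  'http://www.w3.org/2013/collation/UCA?strength=primary';

let $words := ("Apple", "apple", "APPLE", "pear")
(: One single-entry map per word; entries with equal collation keys are
   combined into one entry holding all of the matching words. :)
let $groups := map:merge(
  (for $w in $words
   return map { collation-key($w, $C): $w }),
  map { "duplicates": "combine" }
)
return map:size($groups)
```

With a collation that ignores case, the three spellings of "apple" share one key, so the result should be 2.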
The result of the function is defined to be an xs:base64Binary value. Binary values are chosen because they have unambiguous and context-free comparison semantics, because the value space is unbounded, and because the ordering rules are such that between any two values in the ordered value space, an arbitrary number of further values can be interpolated. The choice between xs:base64Binary and xs:hexBinary is arbitrary; the only operation that behaves differently between the two binary data types is conversion to/from a string, and this operation is not one that is normally required for effective use of collation keys.
For collations based on the Unicode Collation Algorithm, an algorithm for computing collation keys is provided in . Implementations are not required to use this algorithm.
This specification does not mandate that collation keys should retain ordering. This is partly because the primary use case is for maps, where only equality comparisons are required, and partly to allow the use of binary data types (which are currently unordered types) for the result. The specification may be revised in a future release to specify that ordering is preserved.
The fact that collation keys are ordered can be exploited in XQuery, whose order by clause does not allow the collation to be selected dynamically. This restriction can be circumvented by rewriting the clause order by $e/@key collation "URI" as order by fn:collation-key($e/@key, $collation), where $collation allows the collation to be chosen dynamically.
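A complete query using this rewrite might look like the sketch below. The element data is invented for illustration, and the default value of the external variable is an assumption; a caller would normally supply the collation URI at run time. Every item is assumed to carry a key attribute, since the $key argument must be a single string.

```xquery
xquery version "3.1";

(: The collation is supplied by the caller; the default is only an example. :)
declare variable $collation as xs:string external :=
  'http://www.w3.org/2013/collation/UCA?strength=primary';

let $items := (<e key="banana"/>, <e key="Apple"/>, <e key="cherry"/>)
for $e in $items
(: Sorting on the key reproduces the ordering of the chosen collation. :)
order by collation-key($e/@key, $collation)
return string($e/@key)
```

Under the default collation shown, the expected order is Apple, banana, cherry, which differs from codepoint order only in that case is ignored.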
Note that xs:base64Binary becomes an ordered type in XPath 3.1, making binary collation keys possible.