This function takes as input an XML document represented as a string, and returns the document node at the root of an XDM tree representing the parsed document.
fn:parse-xml($arg as xs:string?) as document-node(element(*))?
If $arg is the empty sequence, the function returns the empty sequence.
The precise process used to construct the XDM instance is implementation-defined. In particular, it is implementation-defined whether DTD and/or schema validation is invoked, and it is implementation-defined whether an XML 1.0 or XML 1.1 parser is used.
The static base URI property from the static context of the fn:parse-xml function call is used both as the base URI used by the XML parser to resolve relative entity references within the document, and as the base URI of the document node that is returned.
The document URI of the returned node is absent.
The function is not deterministic: that is, if the function is called twice with the same arguments, it is implementation-dependent whether the same node is returned on both occasions.
The expression fn:parse-xml("<alpha>abcd</alpha>") returns a newly created document node, having an alpha element as its only child; the alpha element in turn is the parent of a text node whose string value is "abcd".
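The tree shape described in this example can be checked outside XPath. The following is a minimal sketch using Python's standard xml.dom.minidom module as a stand-in; the DOM it builds is not an XDM instance, so it only illustrates the document/element/text nesting, not the behavior of fn:parse-xml itself.

```python
from xml.dom import minidom

# Parse the same lexical XML used in the example above.
# Assumption: a DOM tree is used here only as an analogy for the XDM tree.
doc = minidom.parseString("<alpha>abcd</alpha>")

# The document node has exactly one child: the alpha element.
root = doc.documentElement

# The alpha element is the parent of a single text node.
text_node = root.firstChild
```

Here doc plays the role of the returned document node, root is its only child, and text_node carries the string "abcd".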
A dynamic error is raised if the content of $arg is not a well-formed and namespace-well-formed XML document.
A dynamic error is raised if DTD-based validation is carried out and the content of $arg is not valid against its DTD.
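As a rough illustration of the well-formedness conditions, the sketch below uses Python's standard xml.etree.ElementTree parser, which is namespace-aware and non-validating. The helper name is_well_formed is hypothetical, and the Python exception stands in for the dynamic error that an XPath processor would raise; DTD validation is not covered.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if text parses as a (namespace-)well-formed XML document.

    Hypothetical helper: ET.ParseError plays the role of the dynamic error.
    """
    try:
        ET.fromstring(text)
    except ET.ParseError:
        return False
    return True
```

Inputs such as "<alpha>abcd</beta>" (mismatched tags), "<p:alpha/>" (undeclared namespace prefix), and "<a/><b/>" (two top-level elements) are all rejected by this check.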
Since the XML document is presented to the parser as a string, rather than as a sequence of octets, the encoding specified within the XML declaration has no meaning. If the XML parser accepts input only in the form of a sequence of octets, then the processor must ensure that the string is encoded as octets in a way that is consistent with the rules used by the XML parser to detect the encoding.
The primary use case for this function is to handle input documents that contain nested XML documents embedded within CDATA sections. Since the contents of the CDATA section are exposed as text, the receiving query or stylesheet may pass this text to the fn:parse-xml function to create a tree representation of the nested document.
Similarly, nested XML within comments is sometimes encountered, and lexical XML is sometimes returned by extension functions, for example, functions that access web services or read from databases.
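The nested-document use case can be sketched in Python with the standard xml.etree.ElementTree module. The element names envelope, payload, order, and item are invented for illustration; the second ET.fromstring call plays the role that fn:parse-xml would play in a query or stylesheet.

```python
import xml.etree.ElementTree as ET

# Outer document carrying a nested XML document inside a CDATA section.
# All element names here are hypothetical.
outer_text = (
    "<envelope><payload>"
    "<![CDATA[<order id='42'><item>widget</item></order>]]>"
    "</payload></envelope>"
)

# First parse: the CDATA content is exposed as plain text.
outer = ET.fromstring(outer_text)
nested_text = outer.findtext("payload")

# Second parse: turn that text into a tree of its own.
inner = ET.fromstring(nested_text)
```

After the first parse the nested markup is only a string; it becomes navigable (for example, inner.findtext("item")) only after the second parse.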
A use case arises in XSLT where there is a need to preprocess an input document before parsing. For example, an application might wish to edit the document to remove its DOCTYPE declaration. This can be done by reading the raw text using the fn:unparsed-text function, editing the resulting string, and then passing it to the fn:parse-xml function.
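The read, edit, and parse sequence can be sketched in Python as follows. This is a simplified illustration, not a robust solution: the helper strip_doctype is hypothetical, the raw text is given as a literal rather than read from a file, and the regular expression assumes a DOCTYPE declaration whose internal subset (if any) contains no "]" character.

```python
import re
import xml.etree.ElementTree as ET

# Matches a DOCTYPE declaration with an optional internal subset.
# Assumption: no "]" appears inside the internal subset itself.
DOCTYPE_RE = re.compile(r"<!DOCTYPE[^\[>]*(?:\[.*?\])?\s*>", re.DOTALL)


def strip_doctype(raw: str) -> str:
    """Remove the first DOCTYPE declaration from lexical XML (hypothetical helper)."""
    return DOCTYPE_RE.sub("", raw, count=1)


# Raw text as it might be obtained before parsing.
raw = '<?xml version="1.0"?>\n<!DOCTYPE note SYSTEM "note.dtd">\n<note>hi</note>'

# Edit the string, then parse the edited text.
stripped = strip_doctype(raw)
root = ET.fromstring(stripped)
```

A production version would need a real tokenizer for the internal subset, since quoted literals and comments can contain characters that defeat a regular expression.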