Executing Python Scripts through callback from RaptorXML¶
Creating Python Callback Scripts¶
RaptorXML invokes Python callback scripts after a job has finished. These are ordinary Python files that define one or more of the RaptorXML Python API entry-point functions and are passed to RaptorXML (see Passing a Python Callback Script to RaptorXML). The overall structure of a Python callback script is shown below. Note how the entry-point function is defined.
# 1 imports
import os
from altova import xml, xsd, xbrl
# 2 entry point
def on_xsi_finished(job, instance):
    filename = os.path.join(job.output_dir, 'script_out.txt')
    job.append_output_filename(filename)
    f = open(filename, 'w')
    # 3 do something with the instance object, write output to f
    f.close()
# 4 other entry points, helper classes or functions
CodeBlock-1
...
CodeBlock-N
Description of the Python script structure shown above:
1. Imports Python’s built-in os module and some of the RaptorXML-specific modules from the altova package.
2. The entry-point Python function on_xsi_finished (see Supported Python API Callbacks).
3. Your application logic goes here: do something with the instance object and write output to f.
4. Additional blocks of code, each containing function definitions or other code.
Note
Please keep in mind that the altova.xbrl.* modules are only available in RaptorXML+XBRL.
- The line def on_xsi_finished(job,instance) declares the entry-point Python function. This is the function that is invoked after RaptorXML+XBRL Server has executed the command valxml-withxsd (xsi).
- The job and instance arguments are provided by RaptorXML+XBRL Server.
- The filename variable is constructed by joining job.output_dir and the name of the file. For HTTP requests, job.output_dir is the temporary job output directory on the server. For command-line invocation, it is the working directory.
- The job.append_output_filename function appends a filename to the job output.
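The points above can be combined into a slightly more defensive callback. The following is a minimal sketch, not a reference implementation: it assumes only the job.output_dir and job.append_output_filename members already described, reuses the script_out.txt name from the example, and makes no assumption about the attributes of the instance object. The check for None relies on the rule that the second argument is None when validation failed.

```python
import os


def on_xsi_finished(job, instance):
    """Entry point called by RaptorXML after the valxml-withxsd (xsi) command."""
    # Build the output path inside the job's output directory.
    filename = os.path.join(job.output_dir, 'script_out.txt')
    # The with-statement closes the file even if writing raises an exception.
    with open(filename, 'w', encoding='utf-8') as f:
        if instance is None:
            # The second argument is None when validation failed.
            f.write('validation failed: no instance available\n')
        else:
            f.write('validation succeeded\n')
    # Register the file so that it is reported as part of the job output.
    job.append_output_filename(filename)
```

Registering the file only after it has been written means a failed write never leaves a dangling entry in the job output.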
Passing a Python Callback Script to RaptorXML¶
RaptorXML command line¶
Python scripts for callback invocation are passed to RaptorXML+XBRL by giving the script’s URL as the value of the --script option.
The --script option is supported for the following commands:
- valxml-withxsd (xsi)
- valdtd (dtd)
- valxsd (xsd)
- valxbrltaxonomy (dts)
- valxbrl (xbrl)
- valinlinexbrl (ixbrl)
These commands can be used on the command line interface or via the HTTP interface. Here are examples of usage with the different commands:
raptorxmlxbrl xsi --script=xml.py --script-api-version=2.8.6 --streaming=false test.xml
raptorxmlxbrl xsd --script=xsd.py --script-api-version=2.8.6 test.xsd
raptorxmlxbrl dts --script=dts.py --script-api-version=2.8.6 test.xsd
raptorxmlxbrl xbrl --script=xbrl.py --script-api-version=2.8.6 test.xbrl
raptorxmlxbrl ixbrl --script=inlinexbrl.py --script-api-version=2.8.6 test.htm
Note
When using the --script option with the valxml-withxsd command, make sure to specify --streaming=false. Otherwise the script will not be executed and a warning is issued.
Note
The --script-api-version=2.8.6 option is optional and defaults to the latest RaptorXML Python API version.
When your script depends on an exact version of the API (e.g. after upgrades, when RaptorXML+XBRL might change the default), specify the version explicitly.
RaptorXML+XBRL Server¶
A Python callback script is passed with the script option in the JSON job description of the following commands:
- valxml-withxsd (xsi)
- valdtd (dtd)
- valxsd (xsd)
- valxbrltaxonomy (dts)
- valxbrl (xbrl)
- valinlinexbrl (ixbrl)
{
...
"script": "myscript.py"
...
}
Secure Python Script execution on RaptorXML+XBRL Server¶
When a Python callback script is specified in a command sent via HTTP to RaptorXML+XBRL Server, the script will only work if it is located in the trusted directory (see Server Setup).
The trusted directory is specified in the server.script-root-dir setting of the server configuration file etc/server_config.xml, and a trusted directory must be specified if you wish to use Python callback scripts.
The script is executed from the trusted directory (or any of its sub-directories). Specifying a Python script from any other directory will result in an error.
Make sure that all Python scripts to be used are saved in this directory.
All output generated by the server for HTTP job requests is written to the job output directory (which is a sub-directory of the output-root-directory). This security restriction does not apply to Python scripts executed as callbacks on the command line, which can write to any location.
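The trusted-directory rule can be checked on the client side before a job is submitted. The sketch below is an illustration only and is not the check the server itself performs: the helper name is_within_trusted_dir is hypothetical, and the caller is assumed to know the value of the server.script-root-dir setting.

```python
import os


def is_within_trusted_dir(script_path, trusted_dir):
    """Return True if script_path lies in trusted_dir or one of its sub-directories."""
    # Resolve symlinks and relative segments such as '..' before comparing.
    script_real = os.path.realpath(script_path)
    trusted_real = os.path.realpath(trusted_dir)
    try:
        common = os.path.commonpath([script_real, trusted_real])
    except ValueError:
        # Raised on Windows when the two paths are on different drives.
        return False
    return common == trusted_real
```

Comparing resolved paths rather than raw strings prevents a path such as trusted/../other/script.py from being accepted by accident.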
RaptorXML Python API Entry-point functions¶
The commands that allow access to the Python interface are validation commands, and the Python script is executed regardless of the validation outcome.
After validation has completed, RaptorXML+XBRL Server calls a specific function, depending on which command was executed.
The called function (see table below) must therefore be defined in the Python script.
Its first parameter is the job object; the remaining parameters depend on which command was executed (see table). Most entry points take two parameters, while on_ixbrl_finished takes three.
The second parameter will be None if the validation failed.
Command | Function called by RaptorXML+XBRL Server |
---|---|
valxml-withxsd (xsi) | on_xsi_finished( job, xml-instance ) |
valdtd (dtd) | on_dtd_finished( job, dtd ) (since v2.1) |
valxsd (xsd) | on_xsd_finished( job, schema ) |
valxbrltaxonomy (dts) | on_dts_finished( job, dts ) |
valxbrl (xbrl) | on_xbrl_finished( job, xbrl-instance ) |
valinlinexbrl (ixbrl) | on_ixbrl_finished( job, document-set, target-documents ) |
Passing arguments to the Python Callback Scripts¶
RaptorXML calls the entry-point function for the executed command with the two arguments described above.
Additional arguments can be passed to the script with one or more --script-param options:
raptorxmlxbrl xsd --script=xsd.py --script-param="key1:value1" --script-param="key2:value2" test.xsd
In the entry-point function, these arguments can be accessed through the job.script_params dictionary.
v = job.script_params['key1'] # v will receive value1 as string
For RaptorXML Server, the additional parameters are passed through the script-param array in the JSON job description like this:
{
...
"script-param": [{"key1": value1}, {"key2":value2}, ...]
...
}
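Because a parameter may be omitted when the job is submitted, a callback will often want a fallback value. The sketch below assumes that job.script_params behaves like a regular Python dictionary of strings, as described above; the helper get_param and the key name missing are hypothetical.

```python
def get_param(job, key, default=None):
    """Hypothetical helper: read one script parameter with a fallback value."""
    # job.script_params is assumed to behave like a regular dict of strings.
    return job.script_params.get(key, default)


def on_xsd_finished(job, schema):
    # 'key1' matches the command-line example; 'missing' is never passed.
    value = get_param(job, 'key1', default='')
    other = get_param(job, 'missing', default='fallback')
    print(value, other)
```

Since all values arrive as strings, numeric or boolean parameters need an explicit conversion such as int(value) inside the script.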
Executing Python Scripts through raptorxml-python¶
RaptorXML Server comes with a custom Python interpreter.
It acts as a drop-in replacement for a standard python3 interpreter and includes complete support for all RaptorXML Python API modules.
The custom Python interpreter has the same name as the command line tool with -python appended (e.g. raptorxml-python).
To execute a Python script with raptorxml-python, simply pass its name as an argument:
raptorxml-python myscript.py
Importing RaptorXML Python API modules from raptorxml-python¶
During Python API callback script invocation, the API version is specified with the --script-api-version option.
With raptorxml-python, all parameters are processed directly by the Python interpreter and no API-version-specific initialization occurs.
Instead, the API version is selected by importing the specific version of the RaptorXML Python API modules directly.
import altova_api.v2.xml as xml
import altova_api.v2.xbrl as xbrl
...
Note
For most user scripts raptorxml-python behaves exactly like raptorxml script. These import statements also work in RaptorXML Python API callback scripts.
During an interactive session, it is sometimes convenient to import all RaptorXML Python API modules at once with a single import statement:
from altova_api.v2 import *
Caution
Please note that import * is generally not recommended for production code as it can cause unwanted side-effects, e.g. by hiding built-in and previously imported symbols.
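A script that may also be opened with a standard python3 interpreter (for example by an editor or a linter) can guard the versioned import so that it fails gracefully. This is only a sketch: the flag name HAVE_RAPTORXML_API is hypothetical, and the module path is the one shown in the import statements above.

```python
try:
    # Available when run with raptorxml-python or as a callback script.
    import altova_api.v2.xml as xml
    HAVE_RAPTORXML_API = True
except ImportError:
    # A plain python3 interpreter does not ship the RaptorXML modules.
    xml = None
    HAVE_RAPTORXML_API = False

if __name__ == '__main__':
    print('RaptorXML Python API available:', HAVE_RAPTORXML_API)
```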
Extending the custom RaptorXML Python interpreter with pip¶
RaptorXML Server can be extended with 3rd-party packages using pip:
raptorxml-python -m pip install pyodbc
Note
Depending on your platform and installation location, you might need administrator privileges to install Python extension packages into RaptorXML Server.
The installed modules are available to any Python script executed with RaptorXML, regardless of the invocation method.
Caution
Altova GmbH does not provide support for user installed 3rd-party modules.
Danger
RaptorXML Server already comes with some extension modules pre-installed. These modules are required for RaptorXML Server and must not be changed, upgraded or uninstalled.
This command lists all Python packages that are installed in RaptorXML Server 2024:
raptorxml-python -m pip list
annotated-types (0.5.0)
asn1crypto (1.5.1)
autocommand (2.2.2)
cffi (1.15.1)
cheroot (10.0.0)
CherryPy (18.8.0)
cryptography (41.0.3)
Cython (0.29.36)
Genshi (0.7.7)
inflect (7.0.0)
jaraco.collections (4.3.0)
jaraco.context (4.3.0)
jaraco.functools (3.9.0)
jaraco.text (3.11.1)
jws (0.1.3)
more-itertools (10.1.0)
pip (23.2.1)
portend (3.2.0)
pycparser (2.21)
pydantic (2.3.0)
pydantic_core (2.6.3)
PyJWT (2.8.0)
pytz (2023.3.post1)
setuptools (65.5.0)
six (1.16.0)
tempora (5.5.0)
typing_extensions (4.8.0)
ws4py (0.5.1)
zc.lockfile (3.0.post1)
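Before installing or upgrading a package, it can help to confirm that it is not already present, since the pre-installed distributions must not be changed. The following is a minimal sketch using the standard importlib.metadata module (Python 3.8 or later), intended to be run with raptorxml-python; the helper names are hypothetical, and names are normalized in the way pip compares them.

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a distribution name the way pip compares names (PEP 503)."""
    return re.sub(r'[-_.]+', '-', name).lower()


def installed_packages():
    """Return {normalized name: version} for every installed distribution."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get('Name')
        if name:  # skip entries with broken metadata
            result[normalize(name)] = dist.version
    return result


def is_installed(name):
    """True if a distribution with this name is already present."""
    return normalize(name) in installed_packages()


if __name__ == '__main__':
    for pkg_name, version in sorted(installed_packages().items()):
        print(pkg_name, version)
```

Calling is_installed('pyodbc') before running pip shows whether the install would add a new package or touch an existing one.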
Python extension packages with native code¶
The raptorxml-python -m pip install command can build Python extension packages that contain native code.
All required libraries and header files are included in the RaptorXML Server distribution.
Some 3rd-party extension packages might have additional build dependencies which you have to provide yourself. For example, on Ubuntu Linux you need to install the unixodbc-dev platform package before you can install the pyodbc module into RaptorXML Server:
sudo apt-get install unixodbc-dev
sudo raptorxml-python -m pip install pyodbc
Native extension packages should be built with the same compiler that was used to build RaptorXML Server:
RaptorXML Server Release | Windows | MacOSX | Linux |
---|---|---|---|
v2024 | VS2022 | clang >= 13.0 | gcc >= 10.0 |
Python extension packages which install new scripts or executables: white-space in install path¶
Some 3rd-party extension modules install scripts or executables (e.g. jupyter).
These fail to execute if the RaptorXML Server install path contains whitespace (e.g. on Windows c:\Program Files\Altova\...).
Such extension modules have to be installed using a path without whitespace.
On Windows platforms this can be achieved by using the short path form for all parts that contain whitespace:
C:\PROGRA~1\Altova\RaptorXMLXBRLServer2024\bin\RaptorXMLXBRL-python.exe -m pip install jupyter
Tip
On Windows, the short name without whitespace for a folder can be obtained using dir /X.
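To find out whether an installation is affected, the interpreter path can be inspected from within raptorxml-python. This is a small sketch; the helper name has_whitespace is hypothetical, and sys.executable is the standard attribute holding the path of the running interpreter.

```python
import sys


def has_whitespace(path):
    """True if any character in the path is whitespace."""
    return any(ch.isspace() for ch in path)


if __name__ == '__main__':
    if has_whitespace(sys.executable):
        print('Interpreter path contains whitespace:', sys.executable)
    else:
        print('Interpreter path is free of whitespace:', sys.executable)
```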
Example Scripts¶
Example scripts are hosted on GitHub.