Compare CSV Data
DiffDog Server can compare data in comma-separated, semicolon-separated, and tab- separated files (CSV, TSV files) against each other as well as against data in DBs. Files that meet the following criteria are eligible for comparison:
•Each line (row) must contain the same number of fields throughout the file. The file extension is not important.
•The field separators must be one of the following: comma, semicolon, or tab.
Since a CSV file is in fact tabular data, you can compare it with data from a database table, view, or custom recordset.
You can compare CSV files in the same two ways as you compare DB data:
•Set up the comparison job in the client or server config file by using [datasource] and [datadiff].
•On Windows, create the comparison job in DiffDog Enterprise and save it as a .dbdif files, then run the job on the server.
You can use either method to also compare CSV data against DB data. Both methods are described in more detail below.
Comparison in the config file
A CSV comparison can be set up in either the client config file or server config file as follows.
1.Add a data source that represents the "left" side of comparison to the configuration file. For CSV files, a typical data source would look like this.
[datasource:left_data]
type=csv
path=C:\left.csv
separator=comma
key-columns-by-ordinal=1
has-header-row=false
2.Add the second data source (the "right" side of the comparison).
[datasource:right_data]
type=csv
path=C:\right.csv
separator=comma
key-columns-by-ordinal=1
has-header-row=false
3.Add a data comparison job where you indicate what is the left and right data source, the columns to be compared, and other parameters. The map option lets you specify which columns take part in the comparison (the first and second columns, in this example).
[datadiff:reports]
left=left_data
right=right_data
map=1=>1,2=>2
output-mode=xml
In this example, the output of the CSV comparison is reported as XML, as shown by the output-mode=xml option. If the right side of the comparison is a database, you can also use output-mode=sql, and in this way you can obtain SQL scripts to merge data from the left to the right side. By default, the report is displayed at the command line, but you can optionally redirect it to a file, by adding the client-output-path option: client-output-path=/home/report.xml
4.Call the executable and run the data-diff command. In the code listing below, the configuration file is supplied using the --c option and the job name is supplied as the argument:
diffdogcmdlclient data-diff --c=client_config.ini reports
You can also set or override the output format for the comparison report at the command line, by adding the --output-mode option, for example:
diffdogcmdlclient data-diff --c=client_config.ini reports --output-mode=xml
Running comparisons created in DiffDog Enterprise (Windows only)
This option uses a comparison file (in .dbdif format) created with DiffDog Enterprise. To run the .dbdif file, DiffDog Server must run on Windows. This is because the .dbdif files created with DiffDog store paths using Windows syntax.
To run the comparison, do the following:
1.Create the .dbdif file in DiffDog Enterprise. See the DiffDog user manual for instructions about how to do this.
2.On the server side, use either the server or client executable to call the run command:
diffdogserver run <dbdiff-file>
diffdogcmdlclient run <dbdiff-file>
Note the following points:
•For the comparison to run successfully, all the CSV file paths that were added on the desktop machine must be valid on the server machine.
•If database connections are involved, the server machine must be configured and capable of handling the connection. For example, if the .dbdif file includes a connection that requires on ODBC driver from the database vendor, then that driver must be installed on the server machine. See Supported Databases.