The current working directory from which you invoke the Data Validator script runValidation.sh (runValidation.bat for Windows) must be the bin subdirectory containing the script (that is, XDB_HOME/bin).
For example, if the xDB Replication Server is installed into its default directory location, then issue the following command before invoking the Data Validator:
cd /opt/PostgreSQL/EnterpriseDB-xDBReplicationServer/bin
Similarly for Windows hosts, issue the following:
cd C:\Program Files\edb\EnterpriseDB-xDBReplicationServer\bin
The general command format for invoking the Data Validator is the following:
schema_name is the name of the schema in the source database containing the tables to be validated. The choices for option are listed later in this section within the Options subsection.
For Windows hosts, the command format is the following:
The following option displays the Data Validator version:
On Linux the version is displayed as follows:
On Windows the version is displayed as follows:
The following option displays the help information.
./runValidation.sh { –h | --help }
This is shown by the following:
The general syntax for all options except for --version and --help is shown by the following:
For clarity, the preceding syntax diagram shows only the single-character form of the option. The Options subsection lists both the single-character and multi-character forms of the options.
Specification of any database connection option (-sdbms through -tpw listed in the preceding syntax diagram) overrides the corresponding parameter in the datavalidator.properties file. Installation and Configuration for information on the datavalidator.properties file.
Options
-ss, --source-schema schema
The schema of the source database containing the tables to be compared against the target database.
-ts, --target-schema schema
The schema of the target database containing the tables to be compared against the source database. If omitted, the schema of the target database is the same schema as specified for the source database with the -ss option.
-it, --include-tables table_1 [,table_2 ] ...
The tables within the source schema that are to be included for comparison. If omitted, all tables within the source schema are compared against tables in the target schema with the exception of those tables excluded from comparison using the -et option. Note: There must be no white space between the comma and table names.
-et, --exclude-tables table_1 [,table_2 ] ...
The tables within the source schema that are to be excluded from comparison. If omitted, only those tables specified with the -it option are included for comparison. If both the -it and -et options are omitted, all source schema tables are included for comparison. Note: There must be no white space between the comma and table names.
-srs, --skip-rowsonlyin-source { true | false }
When true is specified, the logging of differences for rows that exist only in the source database table are skipped. The default is false.
-srt, --skip-rowsonlyin-target { true | false }
When true is specified, the logging of differences for rows that exist only in the target database table are skipped. The default is false.
-srb, --skip-rowsin-both { true | false }
When true is specified, the logging of differences for rows that exist both in the source and target database tables with the same primary key, but with different non-primary key values are skipped. The default is false.
-ld, --logging-dir log_directory_path
Directory path to where the Data Validator log and diff files are to be created and stored. If log_directory_path does not exist, Data Validator attempts to create it. If a full directory path is not specified log_directory_path is created or assumed to be located relative to the XDB_HOME/bin subdirectory where the runValidation.sh script is invoked. (That is, the logs directory is XDB_HOME/bin/log_directory_path.) Be sure the operating system account used to invoke the runValidation.sh script has the privileges to create the directory if it does not already exist, or to create files in the specified directory if it does already exist. If omitted, the default is the XDB_HOME/bin/logs directory.
-ds, --display-summary { true | false }
Specify true to display only the Data Validator summary. This omits the source and target database connection information as well as the detailed breakdown of the results by source database table. Specify false to display all of the Data Validator results. The type and amount of information that is displayed at the command line console when the Data Validator is invoked is the same information that is also stored in the log file for that run. If omitted, the default is false (that is, all of the Data Validator results is displayed).
-sdbms, --source-dbms database_type
The type of the source database server. Supported types are oracle, enterprisedb, sqlserver, sybase, and mysql.
-sh, --source-host host
The IP address or server name of the host on which the source database server is running.
-sp, --source-port port
The port number on which the source database server is listening for connections.
-sdb, --source-database dbname
The database name of the source database.
-su, --source-user user
The database user name for connecting to the source database.
-spw, --source-password password
The password of the source database user in unencrypted form.
-tdbms, --target-dbms database_type
The type of the target database server. Supported types are enterprisedb and oracle.
-th, --target-host host
The IP address or server name of the host on which the target database server is running.
-tp, --target-port port
The port number on which the target database server is listening for connections.
-tdb, --target-database dbname
The database name of the target database.
-tu, --target-user user
The database user name for connecting to the target database.
-tpw, --target-password password
The password of the target database user in unencrypted form.
-bs, --batch-size row_count
The -bs option specifies the number of rows to group in a batch to be used for comparison across the source and target database tables. For example, if a table contains 1000 rows, then a -bs setting of 100 requires 10 batch iterations to complete the comparison across the source and target databases. The Data Validator reads 100 rows, both from the source and target tables, and adds them in source and target buffers. The validation thread then reads the 100 rows from the source and target buffers and performs the comparison. It will then move to read and prepare the next 100 rows for comparison and so on. Note that the actual database round trips required to bring in 100 rows from the database depends on the -fs option for the fetch size. For example, an -fs setting of 100 needs just one round trip whereas an -fs setting of 10 requires 10 database round trips.
-fs, --fetch-size row_count
Performing data validation for tables that are quite large in size may cause the Data Validator to terminate with an out of heap space error when using the default fetch size of 5000 rows. Use the -fs option to specify a smaller fetch size to help avoid the out of heap space issue. The result set iteration will bring in as many rows as represented by the row_count value in a single database round trip.
Examples
The following examples use an Oracle source database and an Advanced Server target database to compare the tables in schema EDB on Oracle against the tables in schema public in Advanced Server.
The following lists the tables in schema EDB along with the content of tables DEPT and EMP in the Oracle source database:
The following lists the tables in schema public along with the content of tables dept and emp in the Advanced Server edb database:
Note the following differences:
The Oracle EDB schema contains one additional table named ORATAB that does not exist in the Advanced Server public schema.
The Oracle DEPT table contains one extra row with DEPTNO 50 that does not exist in the Advanced Server dept table.
The rows in the EMP table with EMPNO values 9001 and 9002 have column values that differ between the Oracle and Advanced Server tables
In this example, the JOBHIST table contains identical rows for both the Oracle and Advanced Server tables.
The content of the datavalidator.properties file is set as follows:
The following example compares all tables in the Oracle EDB schema against the Advanced Server public schema.
The Data Validator log files are created in directory /home/user/datavalidator_logs as specified with the -ld option. The operating system account used to invoke the runValidation.sh script has write access to the /home/user directory so the Data Validator can create the datavalidator_logs subdirectory.
The Data Validator output indicates the following:
There is one error in the DEPT table (the missing row).
There are two errors in the EMP table (the two rows with mismatching column values)
The JOBHIST table contains no errors.
The ORATAB table does not exist on the target database.
The following shows the files created in the Data Validator logs directory:
The log file contains the same content as displayed when the Data Validator is invoked. The diff file compares the differences where errors were detected.
The following is the diff file as displayed in a text editor:
Figure 9-1: Data Validator diff file
The following example includes only tables dept and emp with the -it option when comparing the Oracle EDB schema against the Advanced Server public schema.
The following example excludes tables ORATAB and jobhist with the -et option when comparing the Oracle EDB schema against the Advanced Server public schema. The -ds true option results in the display of only the Data Validator summary.
For this run, the corresponding log file contains only the Data Validator summary, omitting the source and target database connection information along with the error breakdown by table.