Optimizing Snapshot Replication v6.2
This section discusses configuration options for improving snapshot replication performance.
Note
The options described in this section apply to the publication server only and are set in the publication server configuration file unless otherwise specified.
copyViaDBLinkOra
When the copyViaDBLinkOra
option is set to true, the Oracle database link API, dblink_ora
, is used instead of JDBC COPY to populate Advanced Server subscription tables from an Oracle publication during snapshot replication.
Oracle database link provides an additional performance improvement over JDBC COPY.
Note
The Oracle database link API feature is not available with PostgreSQL, therefore the copyViaDBLinkOra option is not applicable to PostgreSQL subscription tables.
Note
Prior to using dblink_ora
with xDB Replication Server, there are a number of required configuration steps that must be performed in Advanced Server. For Advanced Server versions 9.3 or earlier, see the readme text file, README-dblink_ora_setup.txt
located in the POSTGRES_INSTALL_HOME/doc/contrib
directory for directions. For Advanced Server versions 9.4 or later, see Chapter dblink_ora in the Database Compatibility for Oracle Developer’s Guide for directions.
copyViaDBLinkOra={true | false}
The default value is false.
useFastCopy
Set the useFastCopy
option to true
to skip Write-Ahead Log (WAL) logging during COPY operations in order to optimize data transfer speed.
The archive_mode
configuration parameter in the postgresql.conf
file of the target Postgres database server must be off (thereby disabling archiving of WAL data) in order to use the useFastCopy
option.
useFastCopy={true | false}
The default value is false.
cpBatchSize
Use the cpBatchSize
option to set the batch size (in Megabytes) that is used in the JDBC COPY operation during a snapshot. Increase the value of this option for large publication tables.
This option is influential when Postgres is the subscription database since the JDBC COPY
operation is used to load Postgres subscription tables.
This option has no effect when Oracle or SQL Server is the subscription database. To tune loading of Oracle or SQL Server tables alter the batchSize
option.
cpBatchSize=n
The default value for n
is 8
.
batchSize
The batchSize
option controls the number of INSERT
statements in a JDBC batch.
This option is particularly significant when Oracle or SQL Server is the subscription database since tables of these database types are loaded using JDBC batches of INSERT statements.
For a Postgres subscription database, tables are loaded using JDBC COPY
, however, if the COPY operation fails for some reason, then table loading is retried using JDBC batches of INSERT statements as in the case of Oracle and SQL Server.
batchSize=n
The default value for n is 100.
skipAnalyze
Set the skipAnalyze
option to true if you want to skip execution of the ANALYZE command after loading Postgres subscription tables. The ANALYZE
command gathers statistical information on the table contents. These statistics are used by the query planner.
skipAnalyze={true | false}
The default value is false.
snapshotParallelLoadCount
Note
To apply this option to a single-master replication system, it must be set for the subscription server within the subscription server configuration file. To apply this option to a multi-master replication system, it must be set for the publication server within the publication server configuration file.
The snapshotParallelLoadCount
option controls the number of threads used to perform snapshot data replication in parallel mode. The default behavior is to use a single thread. However, if the target system architecture contains multi-CPUs/cores
you can specify a value greater than 1, normally equal to the CPU/core
count, to fully utilize the system resources.
snapshotParallelLoadCount=n
The default value is 1.
lobBatchSize
If a table contains a column with a data type typically used for large objects such as BYTEA, BLOB, or CLOB, there is a greater possibility that a heap space error may occur because of a potentially large amount of data (hundreds of megabytes) brought into memory. In order to minimize the possibility of this error, a snapshot replication loads tables containing a large object data type, one row at a time using a single INSERT statement per batch.
If however, the large object data type column is known to contain relatively small amounts of data, you can increase the speed of a snapshot replication by increasing the value of the lobBatchSize
option to allow a greater number of rows (specified by n) in each batch.
lobBatchSize=n
The default value is 1.