Optimizing snapshot replication v7
You can set various configuration options to improve snapshot replication performance.
These options apply only to the publication server. You set them in the publication server configuration file unless otherwise specified.
copyViaDBLinkOra
When the copyViaDBLinkOra
option is set to true, the Oracle database link API, dblink_ora
, is used instead of JDBC COPY
to populate EDB Postgres Advanced Server subscription tables from an Oracle publication during snapshot replication.
Oracle database link provides an additional performance improvement over JDBC COPY
.
Note
The Oracle database link API feature isn't available with PostgreSQL, therefore the copyViaDBLinkOra option doesn't apply to PostgreSQL subscription tables.
Note
Prior to using dblink_ora
with Replication Server, you must perform some required configuration steps in EDB Postgres Advanced Server. For EDB Postgres Advanced Server versions 9.3 or earlier, see README-dblink_ora_setup.txt
located in the POSTGRES_INSTALL_HOME/doc/contrib
directory for instructions. For EDB Postgres Advanced Server versions 9.4 or later, see dblink_ora in the Database Compatibility for Oracle Developer’s Guide.
copyViaDBLinkOra={true | false}
The default value is false.
useFastCopy
Set the useFastCopy
option to true
to skip write-ahead log (WAL) logging during COPY
operations to optimize data transfer speed.
The archive_mode
configuration parameter in the postgresql.conf
file of the target Postgres database server must be off (thereby disabling archiving of WAL data) to use the useFastCopy
option.
useFastCopy={true | false}
The default value is false.
cpBatchSize
Use the cpBatchSize
option to set the batch size (in MB) to use in the JDBC COPY
operation during a snapshot. Increase the value of this option for large publication tables.
This option is influential when Postgres is the subscription database since the JDBC COPY
operation is used to load Postgres subscription tables.
This option has no effect when Oracle or SQL Server is the subscription database. To tune loading of Oracle or SQL Server tables, change the batchSize
option.
cpBatchSize=n
The default value for n is 8.
batchSize
The batchSize
option controls the number of INSERT
statements in a JDBC batch.
This option is particularly important when Oracle or SQL Server is the subscription database since tables of these database types are loaded using JDBC batches of INSERT
statements.
For a Postgres subscription database, tables are loaded using JDBC COPY
. However, if the COPY
operation fails, then table loading is retried using JDBC batches of INSERT
statements as in the case of Oracle and SQL Server.
batchSize=n
The default value for n is 100.
skipAnalyze
Set the skipAnalyze
option to true if you want to skip executing the ANALYZE
command after loading Postgres subscription tables. The ANALYZE
command gathers statistical information on the table contents. These statistics are used by the query planner.
skipAnalyze={true | false}
The default value is false.
snapshotParallelLoadCount
Note
To apply this option to a single-master replication system, you must set it for the subscription server in the subscription server configuration file. To apply this option to a multi-master replication system, set it for the publication server in the publication server configuration file.
The snapshotParallelLoadCount
option controls the number of threads used to perform snapshot data replication in parallel mode. The default behavior is to use a single thread. However, if the target system architecture contains multi-CPUs/cores
, you can specify a value greater than 1, normally equal to the CPU/core
count, to fully use the system resources.
snapshotParallelLoadCount=n
The default value is 1.
lobBatchSize
If a table contains a column with a data type typically used for large objects such as BYTEA, BLOB, or CLOB, a heap space error is more likely to occur because of a potentially large amount of data (hundreds of MB) brought into memory. To minimize the chance of this error, a snapshot replication loads tables containing a large object data type, one row at a time using a single INSERT
statement per batch.
If, however, the large object data type column is known to contain relatively small amounts of data, you can increase the speed of a snapshot replication by increasing the value of the lobBatchSize
option to allow a greater number of rows (specified by n) in each batch.
lobBatchSize=n
The default value is 1.