Troubleshooting Areas v6.2
The following topics provide information on specific problem areas you may encounter.
Java Runtime Errors
If errors are encountered regarding the Java Runtime Environment such as the Java program cannot be found or Java heap space errors, check the parameters set in the xDB Startup Configuration file xdbReplicationServer-xx.config
. See xDB Replication Configuration File for information on the xDB Startup Configuration file.
The following is an example of the content of the xDB Startup Configuration file:
If you make any changes to the parameters in the xDB Startup Configuration file, be sure to restart the publication server and subscription server after making the modifications.
Starting the Publication Server or Subscription Server
Note
The subscription server only applies to single-master replication systems.
If you cannot start the publication server or the subscription server perform the following steps:
Step 1: Check the pubserver.log
and subserver.log
files for errors.
Step 2: Check the log file of the database server running the controller database for errors.
Step 3: Verify that the user name and password in the xDB Replication Configuration file on the hosts running the publication server and subscription server match a database user name and password in the database server running the controller database that the publication server and subscription server are attempting to access.
Step 4: If the controller database is a Postgres database, verify that the pg_hba.conf
file of its Postgres database server has entries that allow access to the controller database from the IP addresses of the hosts running the publication server and subscription server by the user name in the xDB Replication Configuration file.
Deleting the Control Schema and Control Schema Objects
The control schema completely describes the replication system. The control schema and its control schema objects must be complete and correct in order for replication to occur properly. In addition, the configuration and maintenance operations performed through the xDB Replication Console or the xDB Replication Server CLI cannot be accomplished properly unless the control schema is complete and correct.
There may be occasions where the control schema becomes corrupted. Either one or more control schema tables containing metadata are inadvertently deleted, or the data within the control schema tables becomes corrupted. Typically, corruption occurs in the form of the first case – one or more control schema tables were deleted, or the entire control schema and its contents were deleted manually using an SQL utility rather than through the operation of the xDB Replication Console or xDB Replication Server CLI.
In these situations, there may be no other choice than to remove all of the remaining control schema objects using the database management system’s deletion functions, which effectively deletes all replication systems managed by the control schema.
The same control schema deletion procedure must be performed in all publication databases that share the same control schema information as the current controller database given in the xDB Replication Configuration file.
From the viewpoint of the xDB Replication Console replication tree, a publication server that connects to the controller database has subordinate to it, the publication databases sharing the same control schema information.
In the following example, the SMR publication database edb as well as the three MMR primary node databases mdnnode
, MMRnode_a
, and MMRnode_b
are all managed by the same publication server, which connects to the controller database designated in the xDB Replication Configuration file. Thus, all publication databases edb
, mdnnode
, MMRnode_a
, and MMRnode_b
contain what should be the same control schema information.
The control schema must be removed from all four publication databases if it is determined that the control schema is corrupted in any of the four publication databases.
Finally, the subscription databases of SMR systems contain a control schema object, which must be deleted as well.
In the preceding example, subscription database subdb
contains a control schema object that may have to be deleted if control schema deletion is performed on the publication database.
The instructions in this section describe how to completely remove all control schema objects created by the xDB Replication Server product leaving just your original publication tables and any replicated subscription tables or publication tables of multi-master system nodes. Hence, the definition and framework for all existing single-master and multi-master replication systems are deleted. In effect, this simulates the situation when you have installed the xDB Replication Server product for the first time.
After you have performed this deletion process, single-master replication systems must then be recreated following the directions in sections Creating a Publication onward. A multi-master replication system must be recreated following the directions in sections Creating a Publication onward.
Warning: Do not attempt this if any replication systems are running in production. All replication systems will become inoperable. This section describes what to look for in order to tell if the control schema is not complete, and if so, what must be deleted to completely remove the replication system. This section does not discuss the internal contents of the control schema objects. If all of the control schema objects are present, then review the checklist in Section Common Problem Checklist before proceeding with deletion of the control schema as it is fairly unlikely that the content of a control schema table becomes corrupted.
If you decide that you must delete all of the control schema objects, follow the steps as discussed in the following:
Step 1: Stop the publication server.
Step 2: Stop the subscription server.
Step 3: Look for the control schema objects contained within a publication database. In the example used in this section, pubuser is the publication database user name. The publication consists of two tables – dept and emp.
For Oracle only: See Oracle Control Schema Objects for a list of Oracle control schema objects.
For SQL Server only: See SQL Control Schema Objects for a list of SQL Server control schema objects.
For Postgres only: See PostgreSQL Control Schema Objects for a list of Postgres control schema objects.
Step 4: If the schema that is supposed to contain the control schema objects (the publication database user name for Oracle, or the control schema you created or selected when configuring a SQL Server publication database along with _edb_replicator_pub
, _edb_replicator_sub
, and _edb_scheduler
, or _edb_replicator_pub, _edb_replicator_sub
, and _edb_scheduler
for Postgres) is missing, or there are missing database objects under the control schema, then you may need to complete the process of removing all remaining control schema objects.
If you decide to undergo this procedure, you must remove the control schema objects from all publication databases. You must also remove all subscription metadata objects from the subscription databases. Proceed with Step 7 and repeat Step 7 for all publication databases. Then proceed with Step 8 and repeat Step 8 for all subscription databases.
If the control schema objects look intact, repeat Step 3 for all other publication databases. If the control schema objects of all publication databases appear intact, then proceed with Step 5.
Step 5: For single-master replication systems, the subscription database contains a single control schema object in the form of a table named rrep_txset_health
. See Subscription Metadata Object for a listing of this control schema object for each type of subscription database.
For each subscription database, verify the presence of this subscription metadata object.
Step 6: If at this point, all control schemas and control schema objects appear intact in all publication databases and all subscription databases, then chances are that the problem lies elsewhere. Do not go proceed with any further steps in this section. Instead, recheck the checklist in Section Common Problem Checklist.
If it was determined that incomplete control schema objects exist, and you decide to go ahead with the deletion process, proceed with Step 7.
Step 7: Repeat this step for every publication database to delete its control schema and control schema objects.
For Oracle only: If the publication user name still exists, then log onto SQL*Plus or any other Oracle database administration utility and drop all control schema objects owned by the publication user. Alternatively, you can drop the publication database user along with its database objects using the cascade option, but the publication database user must be recreated and privileges reassigned if you intend to rebuild your replication systems. See Section Preparing the Publication Database for directions on creating the publication database user. The following example illustrates use of the cascade option:
For SQL Server only: If any of the control schema objects listed in Step 3 still exist, then log onto the SQL Server command line program, sqlcmd, or SQL Server Management Studio and drop these objects. The following example assumes some of the control schema objects were created under schema pubuser
. The other control schema objects are created under _edb_replicator_pub, _edb_replicator_sub,
and _edb_scheduler
. The publication tables are dept
and emp
located in schema edb
.
The following example shows how to delete the jobs in the msdb
database:
The control schema objects under the _edb_replicator_pub
schema are dropped as shown by the following:
For SQL Server 2008 only: Drop the following control schema objects when the publication database is SQL Server 2008:
For SQL Server 2012, 2014 only: Drop the following control schema objects when the publication database is SQL Server 2012 or 2014:
Drop the _edb_replicator_pub
control schema:
1> USE edb; 2> GO
Changed database context to
edb
.1> DROP SCHEMA _edb_replicator_pub; 2> GO
The control schema objects under the_edb_replicator_sub
schema as well as the schema itself are dropped as shown by the following.
Note
(For SQL Server 2012, 2014): When the publication database is SQL Server 2012 or 2014, the first table in the following list, rrep_common_seq
, does not exist. Therefore do not issue the first DROP TABLE
_edb_replicator_sub.rrep_common_seq
command.
The control schema objects under the _edb_scheduler
schema as well as the schema itself are dropped as shown by the following:
The control schema objects under the pubuser
schema are dropped as shown by the following:
For Postgres only: If any of the schemas _edb_replicator_pub, _edb_replicator_sub,
or_edb_scheduler
still exist in the publication database, drop the schema and all of its database objects. The following example shows a connection established in psql to the publication database edb. The DROP SCHEMA CASCADE
statement is then used to drop the schemas.
For synchronization replication with the trigger-based method, in the schema containing the publication tables, drop the triggers and trigger functions associated with the publication tables:
Step 8: Repeat this step for every subscription database to delete its control schema and control schema object.
For single-master replication systems, the subscription database contains a single control schema object in the form of a table named rrep_txset_health. Delete this table in all subscription databases. For SQL Server and Postgres subscription databases, delete the parent schema _edb_replicator_sub as well.
For Oracle subscription databases, the parent schema is not generated by xDB Replication Server, so it your decision as to whether to keep or delete the parent schema.
For Oracle only: The RREP_TXSET_HEALTH
table is created in the subscription database user’s schema. Drop this table.
For SQL Server only: The rrep_txset_health
table is created in the schema named _edb_replicator_sub. Drop this table and schema.
For Postgres only: The rrep_txset_health
table is created in the schema named _edb_replicator_sub
. Drop this table and schema.
Step 9: In the xDB Replication Configuration file, delete the lines containing the following parameters: user, password, host, port, database
, and type
.
Keep the lines with the following parameters: admin_user, admin_password
, and license_key
(if it exists).
See xDB Replication Configuration File for information on the xDB Replication Configuration file. See Post Installation Host Environment for the file system location of the xDB Replication Configuration file.
The absence of these parameters prevents the publication server and subscription server from attempting to connect to this database upon publication and subscription server startup.
The xDB Replication Configuration file should appear as follows without the controller database connection and authentication information:
Step 10: Start the publication server.
Step 11: Start the subscription server.
Step 12: In the replication tree you should see the following:
Figure 10-5: Replication tree after removal of all control schema objects
All the nodes under the SMR and MMR type nodes beneath the Publication Server node, and under the Subscription Server node no longer appear.
Step 13: You will need to recreate the replication system as described in sections Creating a Publication onward for a single-master replication system. See sections Creating a Publication onward for a multi-master replication system.
Dropping Replication Slots for Log-Based Synchronization Replication
As described in Section Logical Replication Slots logical replication slots are used for the log-based method of synchronization replication. While a log-based replication system is in use, these replication slots remain connected to the Postgres databases. When the replication system is removed, these replication slots are also deleted.
There may be circumstances where it is desired to drop a Postgres database used in a replication system, but the replication system could not be removed according to the normal procedure of using the xDB Replication Console or the xDB Replication Server CLI.
In such cases, it is assumed that the replication system has somehow become corrupted, and it is simply desired to delete the replication system components including some of the databases used in the replication system.
When the log-based method is used, certain additional procedures may be required to remove the replication slots before dropping the databases. Postgres does not permit a database to be dropped if a replication slot is connected to it. The following describes how the replication slots can be removed in order to drop a database.
Warning: Do not attempt this if any replication systems are running in production. All replication systems will become inoperable.
Replication slots can be displayed by the following query on the database server containing the databases to be dropped:
The active column indicates whether or not the replication slot is active. To deactivate an active replication slot, first stop the publication server. If the active column of the replication slot now displays f for false then you can remove the replication slot.
If the replication slot is still active, then you can deactivate it by terminating the process shown in the active_pid
column with the following command:
The following now shows that replication slot xdb_79910_5
for database MMRnode
has been deactivated:
Drop the replication slot with the following command by specifying the slot name:
Now, the dropped replication slot does not appear when the pg_replication_slots
directory is queried:
The database can now be successfully dropped:
In addition, replication origins can be displayed with the following command:
The following command can be used to remove a replication origin:
The following shows this replication origin has been removed:
For additional information on logical decoding functions see Section 9.26.6 Replication Functions` under Section 9.26 System Administration Functions in the PostgreSQL Core Documentation located at:
https://www.postgresql.org/docs/current/static/functions-admin.html
After performing this process, it is unlikely that removal of the entire replication system can be done with the xDB Replication Console or the xDB Replication Server CLI. Complete removal of the remaining replication system components must be done manually. Part of this process is removing the control schema and control schema objects from the publication databases. See Section Dropping Replication Slots for Log-Based Synchronization Replication for information on this procedure.