Replication sets v5

Suggest edits

A replication set is a group of tables that a PGD node can subscribe to. You can use replication sets to create more complex replication topologies than regular symmetric multi-master where each node is an exact copy of the other nodes.

Every PGD group creates a replication set with the same name as the group. This replication set is the default replication set, which is used for all user tables and DDL replication. All nodes are subscribed to it. In other words, by default all user tables are replicated between all nodes.

Using replication sets

You can create replication sets using create_replication_set(), specifying whether to include insert, update, delete, or truncate actions. One option lets you add existing tables to the set, and a second option defines whether to add tables when they are created.

You can also manually define the tables to add or remove from a replication set.

Tables included in the replication set are maintained when the node joins the cluster and afterwards.

Once the node is joined, you can still remove tables from the replication set, but you must add new tables using a resync operation.

By default, a newly defined replication set doesn't replicate DDL or PGD administration function calls. Use replication_set_add_ddl_filter to define the commands to replicate.

PGD creates replication set definitions on all nodes. Each node can then be defined to publish or subscribe to each replication set using alter_node_replication_sets.

You can use functions to alter these definitions later or to drop the replication set.

Note

Don't use the default replication set for selective replication. Don't drop or modify the default replication set on any of the PGD nodes in the cluster as it is also used by default for DDL replication and administration function calls.

Behavior of partitioned tables

PGD supports partitioned tables transparently, meaning that you can add a partitioned table to a replication set. Changes that involve any of the partitions are replicated downstream.

Note

When partitions are replicated through a partitioned table, the statements executed directly on a partition are replicated as they were executed on the parent table. The exception is the TRUNCATE command, which always replicates with the list of affected tables or partitions.

You can add individual partitions to the replication set, in which case they are replicated like regular tables (to the table of the same name as the partition on the downstream). This has some performance advantages if the partitioning definition is the same on both provider and subscriber, as the partitioning logic doesn't have to be executed.

Note

If a root partitioned table is part of any replication set, memberships of individual partitions are ignored. only the membership of that root table is taken into account.

Behavior with foreign keys

A foreign key constraint ensures that each row in the referencing table matches a row in the referenced table. Therefore, if the referencing table is a member of a replication set, the referenced table must also be a member of the same replication set.

The current version of PGD doesn't automatically check for or enforce this condition. When adding a table to a replication set, the database administrator must make sure that all the tables referenced by foreign keys are also added.

You can use the following query to list all the foreign keys and replication sets that don't satisfy this requirement. The referencing table is a member of the replication set, while the referenced table isn't:

SELECT t1.relname,
       t1.nspname,
       fk.conname,
       t1.set_name
  FROM bdr.tables AS t1
  JOIN pg_catalog.pg_constraint AS fk
    ON fk.conrelid = t1.relid
   AND fk.contype = 'f'
 WHERE NOT EXISTS (
  SELECT *
    FROM bdr.tables AS t2
   WHERE t2.relid = fk.confrelid
     AND t2.set_name = t1.set_name
);

The output of this query looks like the following:

 relname | nspname |  conname  | set_name
---------+---------+-----------+----------
 t2      | public  | t2_x_fkey | s2
(1 row)

This means that table t2 is a member of replication set s2, but the table referenced by the foreign key t2_x_fkey isn't.

The TRUNCATE CASCADE command takes into account the replication set membership before replicating the command. For example:

TRUNCATE table1 CASCADE;

This becomes a TRUNCATE without cascade on all the tables that are part of the replication set only:

TRUNCATE table1, referencing_table1, referencing_table2 ...

Replication set management

Management of replication sets.

With the exception of bdr.alter_node_replication_sets, the following functions are considered to be DDL. DDL replication and global locking apply to them, if that's currently active. See DDL replication.

bdr.create_replication_set

This function creates a replication set.

Replication of this command is affected by DDL replication configuration including DDL filtering settings.

Synopsis

bdr.create_replication_set(set_name name,
                           replicate_insert boolean DEFAULT true,
                           replicate_update boolean DEFAULT true,
                           replicate_delete boolean DEFAULT true,
                           replicate_truncate boolean DEFAULT true,
                           autoadd_tables boolean DEFAULT false,
                           autoadd_existing boolean DEFAULT true)

Parameters

set_name — Name of the new replication set. Must be unique across the PGD group.
replicate_insert — Indicates whether to replicate inserts into tables in this replication set.
replicate_update — Indicates whether to replicate updates of tables in this replication set.
replicate_delete — Indicates whether to replicate deletes from tables in this replication set.
replicate_truncate — Indicates whether to replicate truncates of tables in this replication set.
autoadd_tables — Indicates whether to replicate newly created (future) tables to this replication set
autoadd_existing — Indicates whether to add all existing user tables to this replication set. This parameter has an effect only if autoadd_tables is set to true.

Notes

By default, new replication sets don't replicate DDL or PGD administration function calls. See ddl filters for how to set up DDL replication for replication sets. A preexisting DDL filter is set up for the default group replication set that replicates all DDL and admin function calls. It's created when the group is created but can be dropped in case you don't want the PGD group default replication set to replicate DDL or the PGD administration function calls.

This function uses the same replication mechanism as DDL statements. This means that the replication is affected by the ddl filters configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

bdr.alter_replication_set

This function modifies the options of an existing replication set.

Replication of this command is affected by DDL replication configuration, including DDL filtering settings.

Synopsis

bdr.alter_replication_set(set_name name,
                          replicate_insert boolean DEFAULT NULL,
                          replicate_update boolean DEFAULT NULL,
                          replicate_delete boolean DEFAULT NULL,
                          replicate_truncate boolean DEFAULT NULL,
                          autoadd_tables boolean DEFAULT NULL)

Parameters

set_name — Name of an existing replication set.
replicate_insert — Indicates whether to replicate inserts into tables in this replication set.
replicate_update — Indicates whether to replicate updates of tables in this replication set.
replicate_delete — Indicates whether to replicate deletes from tables in this replication set.
replicate_truncate — Indicates whether to replicate truncates of tables in this replication set.
autoadd_tables — Indicates whether to add newly created (future) tables to this replication set.

Any of the options that are set to NULL (the default) remain the same as before.

Notes

This function uses the same replication mechanism as DDL statements. This means the replication is affected by the ddl filters configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

bdr.drop_replication_set

This function removes an existing replication set.

Replication of this command is affected by DDL replication configuration, including DDL filtering settings.

Synopsis

bdr.drop_replication_set(set_name name)

Parameters

set_name — Name of an existing replication set.

Notes

This function uses the same replication mechanism as DDL statements. This means the replication is affected by the ddl filters configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

Warning

Don't drop a replication set that's being used by at least another node, because doing so stops replication on that node. If that happens, unsubscribe the affected node from that replication set. For the same reason, don't drop a replication set with a join operation in progress when the node being joined is a member of that replication set. Replication set membership is checked only at the beginning of the join. This happens because the information on replication set usage is local to each node, so that you can configure it on a node before it joins the group.

You can manage replication set subscription for a node using alter_node_replication_sets.

bdr.alter_node_replication_sets

This function changes the replication sets a node publishes and is subscribed to.

Synopsis

bdr.alter_node_replication_sets(node_name name,
                                set_names text[])

Parameters

node_name — The node to modify. Currently has to be local node.
set_names — Array of replication sets to replicate to the specified node. An empty array results in the use of the group default replication set.

Notes

This function is executed only on the local node and isn't replicated in any manner.

The replication sets listed aren't checked for existence, since this function is designed to execute before the node joins. Be careful to specify replication set names correctly to avoid errors.

This allows for calling the function not only on the node that's part of the PGD group but also on a node that hasn't joined any group yet. This approach limits the data synchronized during the join. However, the schema is always fully synchronized without regard to the replication sets setting. All tables are copied across, not just the ones specified in the replication set. You can drop unwanted tables by referring to the bdr.tables catalog table. These might be removed automatically in later versions of PGD. This is currently true even if the ddl filters configuration otherwise prevent replication of DDL.

The replication sets that the node subscribes to after this call are published by the other nodes for actually replicating the changes from those nodes to the node where this function is executed.

Replication set membership

You can add tables to or remove them from one or more replication sets. This affects replication only of changes (DML) in those tables. Schema changes (DDL) are handled by DDL replication set filters (see DDL replication filtering).

The replication uses the table membership in replication sets with the node replication sets configuration to determine the actions to replicate to which node. The decision is done using the union of all the memberships and replication set options. Suppose that a table is a member of replication set A that replicates only INSERT actions and replication set B that replicates only UPDATE actions. Both INSERT and UPDATE act8ions are replicated if the target node is also subscribed to both replication set A and B.

bdr.replication_set_add_table

This function adds a table to a replication set.

This adds a table to a replication set and starts replicating changes from this moment (or rather transaction commit). Any existing data the table might have on a node isn't synchronized.

Replication of this command is affected by DDL replication configuration, including DDL filtering settings.

Synopsis

bdr.replication_set_add_table(relation regclass,
                              set_name name DEFAULT NULL,
                              columns text[] DEFAULT NULL,
                              row_filter text DEFAULT NULL)

Parameters

relation — Name or Oid of a table.
set_name — Name of the replication set. If NULL (the default), then the PGD group default replication set is used.
columns — Reserved for future use (currently does nothing and must be NULL).
row_filter — SQL expression to be used for filtering the replicated rows. If this expression isn't defined (that is, set to NULL, the default) then all rows are sent.

The row_filter specifies an expression producing a Boolean result, with NULLs. Expressions evaluating to True or Unknown replicate the row. A False value doesn't replicate the row. Expressions can't contain subqueries or refer to variables other than columns of the current row being replicated. You can't reference system columns.

row_filter executes on the origin node, not on the target node. This puts an additional CPU overhead on replication for this specific table but completely avoids sending data for filtered rows. Hence network bandwidth is reduced and overhead on the target node is applied.

row_filter never removes TRUNCATE commands for a specific table. You can filter away TRUNCATE commands at the replication set level.

You can replicate just some columns of a table. See Replicating between nodes with differences.

Notes

This function uses the same replication mechanism as DDL statements. This means that the replication is affected by the ddl filters configuration.

The function takes a DML global lock on the relation that's being added to the replication set if the row_filter isn't NULL. Otherwise it takes just a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

bdr.replication_set_remove_table

This function removes a table from the replication set.

Replication of this command is affected by DDL replication configuration, including DDL filtering settings.

Synopsis

bdr.replication_set_remove_table(relation regclass,
                                 set_name name DEFAULT NULL)

Parameters

relation — Name or Oid of a table.
set_name — Name of the replication set. If NULL (the default), then the PGD group default replication set is used.

Notes

This function uses the same replication mechanism as DDL statements. This means the replication is affected by the ddl filters configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

Listing replication sets

You can list existing replication sets with the following query:

SELECT set_name
FROM bdr.replication_sets;

You can use this query to list all the tables in a given replication set:

SELECT nspname, relname
FROM bdr.tables
WHERE set_name = 'myrepset';

In Behavior with foreign keys, we show a query that lists all the foreign keys whose referenced table isn't included in the same replication set as the referencing table.

Use the following SQL to show those replication sets that the current node publishes and subscribes from:

 SELECT node_id,
        node_name,
        pub_repsets,
        sub_repsets
   FROM bdr.local_node_summary;

This code produces output like this:

  node_id   | node_name |    pub_repsets     |   sub_repsets
------------+-----------+----------------------------------------
 1834550102 | s01db01   | {bdrglobal,bdrs01} | {bdrglobal,bdrs01}
(1 row)

To execute the same query against all nodes in the cluster, you can use the following query. This approach gets the replication sets associated with all nodes at the same time.

WITH node_repsets AS (
  SELECT jsonb_array_elements(
    bdr.run_on_all_nodes($$
        SELECT
          node_id,
          node_name,
          pub_repsets,
          sub_repsets
		FROM bdr.local_node_summary;
    $$)::jsonb
  ) AS j
)
SELECT j->'response'->'command_tuples'->0->>'node_id' AS node_id,
       j->'response'->'command_tuples'->0->>'node_name' AS node_name,
       j->'response'->'command_tuples'->0->>'pub_repsets' AS pub_repsets,
       j->'response'->'command_tuples'->0->>'sub_repsets' AS sub_repsets
FROM node_repsets;

This shows, for example:

  node_id   | node_name |    pub_repsets     |    sub_repsets
------------+-----------+----------------------------------------
 933864801  | s02db01   | {bdrglobal,bdrs02} | {bdrglobal,bdrs02}
 1834550102 | s01db01   | {bdrglobal,bdrs01} | {bdrglobal,bdrs01}
 3898940082 | s01db02   | {bdrglobal,bdrs01} | {bdrglobal,bdrs01}
 1102086297 | s02db02   | {bdrglobal,bdrs02} | {bdrglobal,bdrs02}
(4 rows)

DDL replication filtering

By default, the replication of all supported DDL happens by way of the default PGD group replication set. This is achieved with a DDL filter with the same name as the PGD group. This filter is added to the default PGD group replication set when the PGD group is created.

You can adjust this by changing the DDL replication filters for all existing replication sets. These filters are independent of table membership in the replication sets. Just like data changes, each DDL statement is replicated only once, even if it's matched by multiple filters on multiple replication sets.

You can list existing DDL filters with the following query, which shows for each filter the regular expression applied to the command tag and to the role name:

SELECT * FROM bdr.ddl_replication;

You can use the following functions to manipulate DDL filters. They are considered to be DDL and are therefore subject to DDL replication and global locking.

bdr.replication_set_add_ddl_filter

This function adds a DDL filter to a replication set.

Any DDL that matches the given filter is replicated to any node that's subscribed to that set. This also affects replication of PGD admin functions.

This doesn't prevent execution of DDL on any node. It only alters whether DDL is replicated to other nodes. Suppose two nodes have a replication filter between them that excludes all index commands. Index commands can still be executed freely by directly connecting to each node and executing the desired DDL on that node.

The DDL filter can specify a command_tag and role_name to allow replication of only some DDL statements. The command_tag is the same as those used by EVENT TRIGGERs for regular PostgreSQL commands. A typical example might be to create a filter that prevents additional index commands on a logical standby from being replicated to all other nodes.

You can filter the PGD admin functions used by using a tagname matching the qualified function name. For example, bdr.replication_set_add_table is the command tag for the function of the same name. In this case, this tag allows all PGD functions to be filtered using bdr.*.

The role_name is used for matching against the current role that is executing the command. Both command_tag and role_name are evaluated as regular expressions, which are case sensitive.

Synopsis

bdr.replication_set_add_ddl_filter(set_name name,
                                   ddl_filter_name text,
                                   command_tag text,
                                   role_name text DEFAULT NULL,
                                   base_relation_name text DEFAULT NULL,
                                   query_match text DEFAULT NULL,
                                   exclusive boolean DEFAULT FALSE)

Parameters

set_name — name of the replication set; if NULL then the PGD group default replication set is used
ddl_filter_name — name of the DDL filter; this must be unique across the whole PGD group
command_tag — regular expression for matching command tags; NULL means match everything
role_name — regular expression for matching role name; NULL means match all roles
base_relation_name — reserved for future use, must be NULL
query_match — regular expression for matching the query; NULL means match all queries
exclusive — if true, other matched filters are not taken into consideration (that is, only the exclusive filter is applied), when multiple exclusive filters match, we throw an error. This is useful for routing specific commands to specific replication set, while keeping the default replication through the main replication set.

Notes

This function uses the same replication mechanism as DDL statements. This means that the replication is affected by the ddl filters configuration. This also means that replication of changes to ddl filter configuration is affected by the existing ddl filter configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.

To view the defined replication filters, use the view bdr.ddl_replication.

Examples

To include only PGD admin functions, define a filter like this:

SELECT bdr.replication_set_add_ddl_filter('mygroup', 'mygroup_admin', $$bdr\..*$$);

To exclude everything apart from index DDL:

SELECT bdr.replication_set_add_ddl_filter('mygroup', 'index_filter',
				'^(?!(CREATE INDEX|DROP INDEX|ALTER INDEX)).*');

To include all operations on tables and indexes but exclude all others, add two filters: one for tables, one for indexes. This shows that multiple filters provide the union of all allowed DDL commands:

SELECT bdr.replication_set_add_ddl_filter('bdrgroup','index_filter', '^((?!INDEX).)*$');
SELECT bdr.replication_set_add_ddl_filter('bdrgroup','table_filter', '^((?!TABLE).)*$');

bdr.replication_set_remove_ddl_filter

This function removes the DDL filter from a replication set.

Replication of this command is affected by DDL replication configuration, including DDL filtering settings themselves.

Synopsis

bdr.replication_set_remove_ddl_filter(set_name name,
                                      ddl_filter_name text)

Parameters

set_name — Name of the replication set. If NULL then the PGD group default replication set is used.
ddl_filter_name — Name of the DDL filter to remove.

Notes

This function uses the same replication mechanism as DDL statements. This means that the replication is affected by the ddl filters configuration. This also means that replication of changes to the DDL filter configuration is affected by the existing DDL filter configuration.

The function takes a DDL global lock.

This function is transactional. You can roll back the effects with the ROLLBACK of the transaction. The changes are visible to the current transaction.