Replication sets v3.7
Replication sets provide a mechanism to control which tables in the database will be replicated and which actions on those tables will be replicated.
Each replicated set can specify individually if INSERTs
, UPDATEs
,
DELETEs
and TRUNCATEs
on the set are replicated. Every table can be in
multiple replication sets and every subscriber can subscribe to multiple
replication sets as well. The resulting set of tables and actions replicated
is the union of the sets the table is in. The tables are not replicated until
they are added into a replication set.
There are three preexisting replication sets, named "default",
"default_insert_only" and "ddl_sql". The "default" replication set is defined
to replicate all changes to tables in it. The "default_insert_only" replication
set only replicates INSERTs and is meant for tables that don't have primary key
(see Restrictions section for details). The "ddl_sql" replication
set is defined to replicate schema changes specified by the
pglogical.replicate_ddl_command
.
Note: Table are not added automatically to the "default" replication set,
the name "default" just means it exists by default. This behavior can be
changed using pglogical.alter_replication_set
.
Behavior of partitioned tables
From PostgreSQL 11 onwards, pglogical supports partitioned tables transparently. This means that a partitioned table can be added to a replication set and changes to any of the partitions will be replicated downstream.
The partitioning definition on the subscription side can be set up differently to the one on the provider. This means that one can also replicate a partitioned table to a single table, or a single table to a partitioned table, or a partitioned tabled to a differently'partitioned table (repartitioning).
It's also possible to add individual partitions to the replication set, in which case they will be replicated like regular tables (to the table of the same name as the partition on the downstream). This has some performance advantages in case the partitioning definition is same on both provider and subscriber, as the partitioning logic does not have to be executed.
Note: If the root-partitioned table is part of any replication set, memberships of individual partitions are ignored and only the membership of said root table will be taken into account.
Older versions of PostgreSQL
In PostgreSQL 10 and older, pglogical only allows the replication of partitions directly to other partitions. Which means the partitioned table itself cannot be added to a replication set and can't be target of replication on the subscriber either (one can't replicate a normal table to a partitioned table).
Replication set manipulation interfaces
The following functions are provided for managing the replication sets:
pglogical.create_replication_set
This function creates a new replication set.
Synopsis
Parameters
set_name
- name of the set, must be uniquereplicate_insert
- specifies ifINSERT
is replicated; default truereplicate_update
- specifies ifUPDATE
is replicated; default truereplicate_delete
- specifies ifDELETE
is replicated; default truereplicate_truncate
- specifies ifTRUNCATE
is replicated; default trueautoadd_tables
- specifies if newly created tables should be automatically added to the new replication set; default falseautoadd_sequences
- specifies if newly created sequences should be automatically added to the new replication set; default falseautoadd_existing
- this in combination withautoadd_tables
orautoadd_sequences
specifies if any existing tables and sequences should be added as well
The autoadd options will ignore tables that are in
information_schema
or pg_catalog
schemas or are part of an extension.
The autoadd options will also allow automatic removal of tables from the replication set. So there will be no dependency check on replication membership when the table which is part of the autoadd replication set is being dropped.
If you want to replicate tables which are part of some extension, you still have to add them manually.
pglogical.alter_replication_set
This function changes the parameters of the existing replication set.
Synopsis
Parameters
set_name
- name of the existing replication setreplicate_insert
- specifies ifINSERT
is replicatedreplicate_update
- specifies ifUPDATE
is replicatedreplicate_delete
- specifies ifDELETE
is replicatedreplicate_truncate
- specifies ifTRUNCATE
is replicatedautoadd_tables
- specifies if newly created tables should be automatically added to the new replication setautoadd_sequences
- specifies if newly created sequences should be automatically added to the new replication set
If any of these replication set parameters is NULL (which is the default value if nothing else is specified), the current setting for that parameter will remain unchanged.
pglogical.drop_replication_set
Removes the replication set.
Synopsis
Parameters
set_name
- name of the existing replication set
pglogical.replication_set_add_table
Adds a table to a specified existing replication set, optionally requesting resynchronization by subscribers.
Synopsis
Parameters
set_name
- name of the existing replication setrelation
- name or OID of the table to be added to the setsynchronize_data
- if true, the table data is synchronized on all subscribers which are subscribed to given replication set; default falsecolumns
- list of columns to replicate. Normally when all columns should be replicated, this will be set to NULL which is the default.row_filter
- row filtering expression; default NULL (no filtering). See Row Filtering On Provider for more info.
WARNING: Use caution when synchronizing data with a valid row filter.
Using synchronize_data=true
with a valid row_filter
is like a one-time operation for a table.
Executing it again with a modified row_filter
won't synchronize data to subscriber. Subscribers
may need to call pglogical.alter_subscription_resynchronize_table()
to fix it.
Also, note that if synchronize_data
is enabled, a synchronization request is
scheduled on each subscriber and actioned asynchronously. Adding to the
replication set does not wait for synchronization to complete.
To wait until the resync has completed, first, on the provider, run:
To ensure each subscriber has received the request, then on each subscriber run:
NOTE: There is currently no function to alter the row filter or columns of
a table's replication set membership (RM#5960). However, you can use a single
transaction to remove the table from the replication set and then re-add it with
the desired row filter and column filter. Make sure to set synchronize_data :=
false
. This provides a seamless transition from the old to the new membership
and will not skip or lose any rows from concurrent transactions.
pglogical.replication_set_add_all_tables
Adds all tables in given schemas.
Synopsis
Only existing tables are added; any tables created later will not be added automatically. To see how to automatically add tables to the correct replication set at creation time, see Automatic assignment of replication sets for new tables.
Parameters
set_name
- name of the existing replication setschema_names
- array of names name of existing schemas from which tables should be addedsynchronize_data
- if true, the table data is synchronized on all subscribers which are subscribed to the given replication set; default false
pglogical.replication_set_remove_table
Removes a table from a specified existing replication set.
Synopsis
Parameters
set_name
- name of the existing replication setrelation
- name or OID of the table to be removed from the set
pglogical.replication_set_add_sequence
Adds a sequence to a replication set.
Synopsis
Parameters
set_name
- name of the existing replication setrelation
- name or OID of the sequence to be added to the setsynchronize_data
- if true, the sequence value will be synchronized immediately; default false
pglogical.replication_set_add_all_sequences
Adds all sequences from the given schemas.
Synopsis
Only existing sequences are added; any sequences created later will not be added automatically.
Parameters
set_name
- name of the existing replication setschema_names
- array of names of existing schemas from which tables should be addedsynchronize_data
- if true, the sequence value will be synchronized immediately; default false
pglogical.replication_set_remove_sequence
Remove a sequence from a replication set.
Synopsis
Parameters
set_name
- name of the existing replication setrelation
- name or OID of the sequence to be removed from the set
You can view the information about which table is in which set by querying the
pglogical.tables
view.
Automatic assignment of replication sets for new tables
The event trigger facility can be used for describing rules which define replication sets for newly created tables.
Example:
The above example will put all new tables created in schema config
into
replication set configuration
and all other new tables which are not created
by extensions will go to the default
replication set.
Additional functions
pglogical.synchronize_sequence
Push sequence state to all subscribers.
Synopsis
Unlike the subscription and table synchronization function, this function should be run on the provider. It forces an update of the tracked sequence state which will be consumed by all subscribers (replication set filtering still applies) once they replicate the transaction in which this function has been executed.
Parameters
relation
- name of existing sequence, optionally qualified
Row Filtering on Provider
On the provider side, row filtering can be done by specifying the row_filter
parameter for the pglogical.replication_set_add_table
function. The
row_filter
is a normal PostgreSQL expression with the same limitations as a
CHECK
constraint.
You can see which row filters are active in the pglogical.tables
view.
The table's column(s) are exposed to the row filter as simple identifiers; there's no qualifier or namespace.
Unlike a CHECK constraint's body, the row-filter is passed as a string which is parsed and checked by pglogical. So to avoid quoting issues you should use PostgreSQL's dollar-quoting, like this:
A simple row_filter
would look something like row_filter := 'id > 0'
which
would replicate only those rows where values of column id
are greater than zero.
This will not affect any already-committed rows pending
replication, or any already-replicated rows.
Important: Caveats apply when re-synchronizing tables with row filters
using replication_set_add_table
. See pglogical.replication_set_add_table
.
Writing safer row filters
Be very cautious when writing row filter expressions, and keep them as simple as possible. If a row-filter expression raises an error during replication, it is generally necessary to drop and re-create the subscription, resynchronizing all tables, not just the table with the problem row-filter. So row filters should be simple and defensively written. A non-exhaustive list of rules for writing filters is that they:
Should be simple expressions wherever possible. Try to use only built-in PostgreSQL operators and
IMMUTABLE
functions if you can.Must avoid using any expression that could raise an
ERROR
at runtime, such as casting fromtext
to a more strictly validated data type. They must tolerate any value that the table's constraints permit to appear in the table.May use
VOLATILE
orSTABLE
functions, but any functions must obey the same constraints as the filter expression itself.E.g. you can call
random()
but nottxid_current()
ormy_audit_log_function()
.May call user-defined functions written in SQL, Pl/PgSQL, or (with care) C. Use of other languages is untested and not recommended. PL/PgSQL functions must not use
EXCEPTION
blocks, and may have other as-yet-undiscovered issues so their use is not recommended. Stick to SQL where possible.Should not attempt to access any tables. Only the column values should be used.
Direct use of subqueries in the row-filter expression is blocked.
It's possible to call a user-defined function within the filter, and that can access table contents. This is not recommended and may be subject to surprising behaviour. The function must only access tables in
pg_catalog.*
or tables marked with theuser_catalog_table=true
attribute. Accessing other tables will not raise an error, but may cause undefined behaviour, errors, or crashes.Must never attempt any write operation or anything that assigns a transaction-id. Similar to queries on a read-replica. Attempting writes will break replication.
May safely use columns of the filtered table that are not part of the replication set's column list. Filtering happens on the provider side so non-replicated columns will have their values accessible. This lets you do things like pre-compute complex filter criteria in triggers.
Should not rely on session state, since the
row_filter
is running inside the replication session. Session specific expressions such asCURRENT_USER
will have values of the replication session and not the session which did the writes. The same is true for GUCs etc.
Changing row filters
To change a row-filter expression on a table, use a single transaction to remove the
table from the replication set, then add it again with the new row filter expression.
Do not specify data sync and make sure to explicitly repeat the set of replicated
columns. You can check the pglogical.tables
view for the old column set and row filter.
See pglogical.replication_set_add_table
.