BDR Support v2
LiveCompare can be used against BDR nodes, as well as non-BDR nodes.
Setting logical_replication_mode = bdr
will make the tool assume that all
databases being compared belong to the same BDR cluster. Then you can specify
node names as connections, and replication sets to filter tables.
For example, consider you are able to connect to any node in the BDR cluster.
Let's call this Initial Connection
. By initially connection to this node,
LiveCompare is able to check BDR metadata and retrieve connection information
from all other nodes.
Now consider you want to compare 3 BDR nodes. As LiveCompare is able to connect
to any node starting from the Initial Connection
, you do not need to define
dsn
or any connection information for the data connections. You just need to
define node_name
. LiveCompare searches in BDR metadata about the connection
information for that node, and then connects to the node.
Please note that, for LiveCompare to be able to connect to all other nodes by
fetching BDR metadata, it is required that LiveCompare is able to connect to
them using the same DSN from BDR view bdr.node_summary
, field
interface_connstr
. In this case it is recommended to run LiveCompare on the
same machine as the Initial Connection
, as postgres
user. If that's not
possible, then please define the dsn
attribute in all data connections.
You can also specify replication sets as table filters. LiveCompare will use
BDR metadata to build the table list, considering only tables that belong to the
replication set(s) you defined in the replication_sets
setting.
For example, you can create an .ini
file to compare 3 BDR nodes:
It is also possible to tell LiveCompare to compare all active nodes in the BDR cluster. For that purpose just do the following:
- In
General Settings
, enableall_bdr_nodes = on
; - Specify an
Initial Connection
; - Additional data connections are not required.
For example:
When all_bdr_nodes = on
, LiveCompare uses the Initial Connection
to fetch
the list of all BDR nodes. Additional data connections are not required;
although if set, will be appended to the list of data connections. For example,
it would be possible to compare a whole BDR cluster against a single Postgres
connection, useful in migration projects:
Settings node_name
and replication_sets
are supported for the following
technologies:
- BDR 1, 2, 3 and 4;
- pglogical 2 and 3.
Please note that to enable pglogical metadata fetch instead of BDR, just set
logical_replication_mode = pglogical
instead of
logical_replication_mode = bdr
.
BDR Witness nodes
Using replication sets in BDR, it's possible to configure specific tables to be included in the BDR replication, and also specify which nodes should receive data from such tables, by configuring the node to subscribe to the replication set the table belongs to. This allows for different architectures such as BDR Sharding and the use of BDR Witness nodes.
A BDR Witness is a regular BDR node which doesn't replicate any DML from other nodes. The purpose of the Witness is to provide quorum in Raft Consensus voting (for more details on the BDR Witness node, check BDR documentation). Depending on how replication sets were configured, the Witness may or may not replicate DDL. Which means that there are 2 types of BDR Witnesses:
- A completely empty node, without any data nor tables; or
- A node that replicates DDL from other nodes, hence having empty tables.
In the first case, even if the BDR Witness is included in the comparison (either
manually under [Connections]
or using all_bdr_nodes = on
), as the Witness
doesn't have any tables, the following message will be logged:
In the second case, on the other hand, the table exists on the BDR Witness. However, it would not be correct to report data missing on the Witness as divergences. So, for each table, LiveCompare checks the following information on each node included in the comparison:
- The replication sets that the node subscribes;
- The replication sets that the table is associated with;
- The replication sets, if any, the user defined in filter
replication_sets
underTable Filter
.
If the intersection among all 3 lists of replication sets is empty, which is the case for the BDR Witness, then LiveCompare will log this:
In both cases, the comparison for that specific table proceeds on the nodes where the table exists, and the table is replicated according to the replication sets configuration.
Differences in a BDR cluster
LiveCompare will make changes to the local node only; it is important that corrective changes do not get replicated to other nodes.
When logical_replication_mode = bdr
, LiveCompare will initially check if a
replication origin called bdr_local_only_origin
already exists (the name of
the replication origin can be configured by adjusting the setting
difference_fix_replication_origin
). If a replication origin called
bdr_local_only_origin
does not exist yet, then LiveCompare creates it on all
BDR connections.
IMPORTANT: Please note that BDR 3.6.18 introduced the new pre-created
bdr_local_only_origin
replication origin to be used for applying local-only
transactions. So if LiveCompare is connected to BDR 3.6.18, it won't create this
replication origin.
LiveCompare will generate apply scripts considering the following:
- Set the current transaction to use the replication origin
bdr_local_only_origin
, so any DML executed will havexmin
associated tobdr_local_only_origin
; - Set the current transaction datetime to be far in the past, so if there are any BDR conflicts with real DML being executed on the database, LiveCompare DML always loses the conflict.
After applying LiveCompare fix script to a BDR node, it
will be possible to get exactly which rows were inserted or updated by
LiveCompare using the following query (replace mytable
with the name of any
table):
(Note that deleted rows are no longer visible.)
Please note that LiveCompare requires at least a PostgreSQL user with
bdr_superuser
privileges in order to properly fetch metadata.
All steps above involving replication origins only applied to output script, if
the PostgreSQL user has bdr_superuser
or PostgreSQL superuser privileges.
Otherwise, LiveCompare will generate fixes without associating any replication
origin (transaction replication is still disabled using
SET LOCAL bdr.xact_replication = off
). However, it is recommended to use a
replication origin when applying the DML scripts, because otherwise LiveCompare
will have the same precedence as a regular user application regarding conflict
resolution. Also, as there will not be any replication origin associated to the
fix, the query above to list all rows fixed by LiveCompare can not be used.
Between BDR 3.6.18 and BDR 3.7.0, the following functions are used:
bdr.difference_fix_origin_create()
: Executed by LiveCompare to create the replication origin specified indifference_fix_replication_origin
(by default set tobdr_local_only_origin
), if this replication origin does not exist;bdr.difference_fix_session_setup()
: Included in the generated DML script so the transaction is associated with the replication origin specified indifference_fix_replication_origin
;bdr.difference_fix_xact_set_avoid_conflict()
: Included in the generated DML script so the transaction is set far in the past (2010-01-01
), so the fix transaction applied by LiveCompare always loses a conflict, if any.
The functions above require a bdr_superuser
rather than a PostgreSQL
superuser. Starting from BDR 3.7.0, those functions are deprecated. LiveCompare
then will, if running as a PostgreSQL superuser, use the following functions
instead, to perform the same actions as above:
pg_replication_origin_create(origin_name)
;pg_replication_origin_session_setup()
;pg_replication_origin_xact_setup()
.
If a PostgreSQL superuser is not being used, then LiveCompare will include only the following in the generated DML transaction:
Conflicts in BDR
LiveCompare has an execution mode called conflicts
. This execution mode is
specific for BDR clusters. It will only work in BDR 3.6, BDR 3.7 or BDR 4
clusters.
While compare
mode is used to compare all content of tables as a whole,
conflicts
mode will focus just in tuples/tables that are related to existing
conflicts that are registered in bdr.apply_log
, in case of BDR 3.6, or in
bdr.conflict_history
, in case of BDR 3.7 and BDR 4.
Having said that, conflicts
execution mode is expected to run much faster than
compare
mode, because it will just inspect specific tuples from specific
tables. At the same time, it's not as complete as compare
mode, because of the
same reason.
The main objective of this execution mode is to check that the automatic conflict resolution which is being done by BDR is consistent among nodes, i.e., after BDR resolving conflicts the cluster is in a consistent state.
Although, for the general use case, automatic conflict resolution ensures
cluster consistency, there are a few known cases where automatic conflict
resolution can result in divergent tuples among nodes. So the conflicts
execution mode from LiveCompare can help checking and ensuring consistency, with
a good balance between time vs result.
Conflict example
Imagine on node3
we execute the following query:
We can see the following conflict in bdr.conflict_history
:
Which means that when the DELETE
arrived from node2
to node3
, there was no
row with b = 3
in table tbl
. However, the INSERT
might have arrived from
node1
to node3
later, which then added the row with b = 3
to node3
. So
this is the current situation on node3
:
While on nodes node1
and node2
, this is what we see:
The BDR cluster is divergent.
Now in order to detect and fix such divergence, we could execute LiveCompare in
compare
mode, but depending on the size of the comparison set (imagine table
tbl
is very large), that can take a long time, even hours.
This is exactly the situation where conflicts
mode can be helpful. In this
case, the delete_missing
conflict is visible only from node3
, but
LiveCompare is able to extract the PK values from the conflict logged rows
(key_tuple
, local_tuple
, remote_tuple
and apply_tuple
) and perform an
automatic cluster-wide comparison only on the affected table, already filtering
by the PK values. The comparison will then check the current row version in all
nodes in the cluster.
So we create a check.ini
file to set all_bdr_nodes = on
, i.e., to tell
LiveCompare to compare all nodes in the cluster:
To run LiveCompare in conflicts
mode:
After the execution, in the console output, you will see something like this:
Inside folder ./lc_session_X/
(being X
the number of the current comparison
session), LiveCompare will write the file conflicts_DAY.out
(replacing DAY
in the name of the file with the current day), showing the main information
about all divergent conflicts.
If you connect to database liveoutput
, you will be able to see more details
about the conflicts, for example using this query:
You will see something like this:
The is_conflict_divergent = true
means that LiveCompare has compared the
conflict and found the nodes to be currently divergent in the tables and rows
reported by the conflict. View livecompare.vw_conflicts
shows information
about all conflicts, including the non-divergent ones.
LiveCompare will also automatically generate DML script
./lc_session_X/apply_on_the_node3_DAY.sql
(replacing DAY
in the name of the
file with the current day):
LiveCompare is suggesting to DELETE
the row where b = 3
from node3
,
because on the other 2 nodes the row does not exist. By default, LiveCompare
suggest the DML to fix based on the majority of the nodes.
If you run this DML script against node3
:
You will get the BDR cluster consistent again.
As the --conflicts
mode comparison is much faster than a full --compare
, it
is highly recommended to schedule a --conflicts
comparison session more often,
to ensure conflict resolution is providing cluster-wide consistency.
Please note that, in order to be able to see the data in bdr.conflict_history
in BDR 3.7 or bdr.apply_log
in BDR 3.6, you should run LiveCompare with an
user that is bdr_superuser
or is a PostgreSQL superuser.
Conflicts Filter
It's also possible to tell LiveCompare to filter the conflicts by any of the
columns in either bdr.conflicts_history
or bdr.apply_log
. For example:
Mixing technologies
Please note that metadata for node_name
and replication_sets
are fetched in
the Initial Connection
. So it should be a pglogical- and/or BDR-enabled
database.
The list of tables is built in the first data connection. So the
replication_sets
condition should be valid in the first connection.
It is possible to perform mixed technology comparisons, for example:
- BDR 1 node versus BDR 3 node;
- BDR 4 node versus vanilla Postgres instance;
- Vanilla Postgres instance versus pglogical node.