Configuring EDB Pgpool-II v4

Suggest edits

The configuration files are created in the /etc/sysconfig/edb/pgpool<x.y> directory, where <x.y> is the EDB Pgpool-II release version. By default, .sample is appended to the configuration file name. remove .sample from the configuration file after copying the file to create your custom configuration.

Note

The configuration options for Pgpool-II are extensive. Consider the options listed here as a starting point. For more information about configuring and using Pgpool-II, consult the project website.

Commonly used parameters

The table lists pgpool.conf parameters to use when implementing connection pooling:

Parameter name	Description
`listen_addresses`	Hostname or IP address used by EDB Pgpool-II to listen for connections. The default is `localhost`. Change to '*' for all addresses.
`port`	Port for EDB Pgpool-II connections. The default is `9999`.
`pcp_port`	Port for PCP connections. The default is `9898`.
`backend_hostname0`	Hostname or IP address for backend 0. You can specify '' if the backend and EDB Pgpool-II are running on the same host.
`backend_port0`	Port number for backend 0.
`backend_weight0`	Weight for backend 0 (only in load balancing mode). Specify `1` for each backend if you want to balance the load equally or decimal values (`.9`, `.1`, etc.) to weigh the load toward specific backends.
`backend_data_directory0`	Data directory for backend 0.
`enable_pool_hba`	Set to `on` to use `pool_hba.conf` for client authentication.
`num_init_children`	Number of pools. Default is `32`.
`max_pool`	Number of connections per pool. Default is `4`.
`connection_cache`	Set to `on` to enable connection pooling.
`pool_conn_dbname`	Database name to which EDB Pgpool-II connects. By default, EDB Pgpool-II connects with Postgres. As of v4.3, the deprecated `pool_conn_dbname` parameter is removed.
`sr_check_user`	User name to perform streaming replication check. Required as of EDB Pgpool-II v4.3.
`sr_check_password`	Password of the `sr_check_user` user to perform the streaming replication checks. Required as of EDB Pgpool-II v4.3.

The following table lists pgpool.conf parameters to use when implementing replication and load balancing:

Parameter name	Description
`Allow_sql_comments`	If on, ignore SQL comments. Changes to this parameter require reloading the `pgpool.conf` file.
`load_balance_mode`	Set to `on` to activate load balancing mode. If `load_balance_mode` is on and `replicate_select` is off, `SELECT` statements are sent to one backend. The parameter backend_weight<N>.z determines the proportion of `SELECT` statements each backend receives.
`ignore_leading_white_space`	Ignore leading white spaces of each query. Certain APIs such as DBI/DBD::Pg for Perl add white space that you can't control. Default is `on`.

Configuring connection pooling

EDB Pgpool-II provides a set of child processes that maintain cached connections to one or more database servers. When a client connects, EDB Pgpool-II attempts to reuse a connection from its pool, thus avoiding the overhead of opening and closing client connections.

You can reuse a connection in the pool only if the target database and the connection user match a prior connection that is currently in the pool. The pgpool.conf file specifies the connection pooling configuration options (such as the number of child processes and the maximum number of cached connections per child).

To configure connection pooling with one database server:

Configure the pg_hba.conf file on the Pgpool-II host to permit connections between the clients and the server.

Copy the pgpool.conf.sample file to pgpool.conf, modify the file, set the connection_cache parameter to on, and specify connection properties for your database server.

The following example shows how to connect with the EDB Postgres Advanced Server:

connection_cache = on

backend_hostname0 = 'localhost'

backend_port0 = 5444

backend_weight0 = 1

backend_data_directory0 = '/var/lib/edb/as14/data'

sr_check_user = 'enterprisedb'

sr_check_password = 'enterprisedb_password'

The following example shows how to connect with the PostgreSQL Server:

connection_cache = on

backend_hostname0 = 'localhost'

backend_port0 = 5432

backend_weight0 = 1

backend_data_directory0 = '/var/lib/pgsql/14/data'

sr_check_user = 'enterprisedb'

sr_check_password = 'enterprisedb_password'

Note

In the pgpool.conf file, connection parameters have an appended digit that specifies a cluster node identifier. Database node 0 specifies values for the primary node.

Optionally, configure EDB Pgpool-II client authentication.
Optionally, configure the PCP administrative interface.
Start EDB Pgpool-II:
```
systemctl start edb-pgpool-<x.y>.service
```
<x.y> is the EDB Pgpool release version.
Run the following platform-specific command to connect to Pgpool43:
On EDB Postgres Advanced Server for CentOS 7:
```
./psql -d edb -p 9999 -U enterprisedb -h /tmp
```
On EDB Postgres Advanced Server for Debian:
```
./psql -d edb -p 9999 -U enterprisedb
```
On PostgreSQL Server for CentOS 7:
```
./psql -d postgres -p 9999 -U postgres -h /tmp
```
On PostgreSQL Server for Debian:
```
./psql -d postgres -p 9999 -U postgres
```

Configuring load balancing

EDB supports replication scenarios that use EDB Pgpool-II load balancing with PostgreSQL streaming replication or Slony replication. The supported replication methods ensure that database updates made by client applications apply to multiple backend servers. For detailed information about the benefits of each replication method and configuration instructions, see the project documentation for each utility.

When load balancing is enabled, EDB Pgpool-II distributes some types of SELECT statements to backend servers, allowing multiple database servers and hosts to share the processing load of SELECT statements issued by client applications.

When configuring EDB Pgpool-II load balancing, the initial database environments in all backend servers must be identical:

Tables must have the same name, definition, and row content.
Schemas must exist in each backend application database.
Roles and privileges on each backend server must be configured to ensure the result set of SQL statements are identical on all servers.

If you use password authentication, assign the same password to an associated user name on each database server. Use the same user name/password pair to connect EDB Pgpool-II to each backend connection.

In a replication scenario, each backend is uniquely identified by the hostname (or IP address) and the port number on which the database server instance is listening for connections. Make sure that the pool_hba.conf and pg_hba.conf files allow a connection between that server and the host on which EDB Pgpool-II is running.

The following example shows how to implement EDB Pgpool-II load balancing with two servers (the primary and replica nodes) in a streaming replication scenario. Configuring EDB Pgpool-II load balancing for a Slony replication scenario is similar. See the Slony documentation for information about configuring Slony replication.

Configuring the primary node of the replication scenario

Open an SSH session with the primary node of the replication scenario, and modify the pg_hba.conf file (located in the /var/lib/edb/as14/data directory). Add connection information for the replication user. (In the example that follows, edbrepuser resides on a standby node with an IP address of 107.178.217.178):

host replication edbrepuser 107.178.217.178/32 md5

The connection information must specify the address of the replication scenario's standby node and your preferred authentication method.

Modify the postgresql.conf file (located in /var/lib/edb/as14/data), adding the following replication parameter and values to the end of the file:

 wal_level = replica
 max_wal_senders = 10
 checkpoint_segments = 8
 wal_keep_segments = 0

Save the configuration file, and restart the server:

To restart on RHEL/CentOS 7 and RHEL/Rocky Linux/AlmaLinux 8 platforms:

systemctl restart edb-as-14

To restart on Debian 10.x or Ubuntu 18.04 | 20.04 platforms:

/usr/edb/as14/bin/epas_ctlcluster 14 main restart

Use the sudo su - command to assume the identity of the enterprisedb database superuser:

 sudo su - enterprisedb

Then, start a psql session, connecting to the edb database:

psql -d edb

At the psql command line, create a user with the replication attribute:

 CREATE ROLE edbrepuser WITH REPLICATION LOGIN PASSWORD 'password';

Configuring the standby node of the replication scenario

Open an SSH session with the standby server and assume the identity of the database superuser (enterprisedb):

sudo su - enterprisedb

With your choice of editor, create a .pgpass file in the home directory of the enterprisedb user. The .pgpass file holds the password of the replication user in plain-text form. If you're using a .pgpass file, make sure that only trusted users have access to it:

Add an entry that specifies connection information for the replication user:

*:5444:*:edbrepuser:password

The server enforces restrictive permissions on the .pgpass file. Use the following command to set the file permissions:

chmod 600 .pgpass

Relinquish the identity of the database superuser:

exit

Then, assume superuser privileges:

sudo su -

Use your platform-specific command to stop the database server before replacing the data directory on the standby node with the data directory of the primary node.

Then, delete the data directory on the standby node:

rm -rf /var/lib/edb/as14/data

After deleting the existing data directory, use the pg_basebackup utility to copy the data directory of the primary node to the standby:

pg_basebackup --format=p --label=standby --host=146.148.46.44 --username=edbrepuser --password --wal-method=stream -R

The call to pg_basebackup specifies the IP address of the primary node and the name of the replication user created on the primary node.

Including the -R option creates the standby.signal file and appends connection settings to postgresql.auto.conf in the output directory (or into the base archive file when using tar format) to ease setting up a standby server.

For more information about the options available with the pg_basebackup utility, see the PostgreSQL core documentation.

When prompted by pg_basebackup, provide the password associated with the replication user.

After copying the data directory, change ownership of the directory to the database superuser (enterprisedb):

chown -R enterprisedb /var/lib/edb/as14/data

Modify the postgresql.conf file (located in /var/lib/edb/as14/data), specifying the following values at the end of the file:

wal_level = replica
hot_standby = on

The data file has been copied from the primary node and contains the replication parameters specified previously.

Then, restart the server. At this point, the primary node is replicating data to the standby node.

Configuring EDB Pgpool-II load balancing

In the pgpool.conf file, modify the parameter settings to specify that load balancing is enabled:

load_balance_mode = on

Then, specify the connections settings for the primary database node in the parameter set that ends with a 0. For example:

backend_hostname0 = '146.148.46.44'
backend_port0 = 5444
backend_weight0 = 1
backend_data_directory0 = '/var/lib/edb/as14/data'

Then, specify the connections settings for each node to which queries are distributed. Increment the number that follows the parameter name for each node, and provide connection details:

backend_hostname1 = '107.178.217.178'
backend_port1 = 5444
backend_weight1 = 1
backend_data_directory1 = '/var/lib/edb/as14/data'

Use the backend_weight parameter to specify how to distribute queries distributed among the nodes. Specify a value of 1 to indicate that you want (qualified) queries to be equally distributed across the nodes of the replication scenario.

Restart EDB Pgpool-II

systemctl restart edb-pgpool-<x.y>.service

<x.y> is the EDB Pgpool release version.

Configuring client authentication

When EDB Pgpool-II is enabled, client applications connect to EDB Pgpool-II, which acts as a middleman for a Postgres server. A connecting client application is first authenticated with the EDB Pgpool-II server and then with the Postgres server.

Parameter settings in the pool_hba.conf configuration file determine the EDB Pgpool-II authentication properties. The pool_hba.conf file is similar in format and function to the Postgres pg_hba.conf configuration file. See the Pgpool-II documentation for detailed information about pool_hba.conf entries.

To enable EDB Pgpool-II authentication:

Copy the pool_hba.conf.sample file to pool_hba.conf.
Modify the pool_hba.conf file, specifying authentication information for servers or users you want to connect. Entries must follow the same format used in the pg_hba.conf file.
Modify the pgpool.conf file, setting the enable_pool_hba parameter to on.
Restart EDB Pgpool-II to reload the EDB Pgpool-II configuration files.

Note

When authenticating with the database server, use the user names and passwords specified in the pool_hba.conf file. You must also specify these user names and passwords in the database server's pg_hba.conf file.

Configuring PCP

PCP is an administrative interface for EDB Pgpool-II that allows you to retrieve information about database nodes, EDB Pgpool-II child processes, and other information. Issue PCP commands from the Linux command line.

pcp.conf is the password configuration file for the PCP client. Before using PCP commands, modify the pcp.conf file, providing the user names and passwords you provide when invoking a PCP command. The user names in the pcp.conf file are independent of the database server user names and passwords.

Use the following steps to configure PCP:

Copy the pcp.conf.sample file to pcp.conf.
Add an entry to the pcp.conf file in the following form:
```
username:md5_password
```
username is a PCP user name.
md5_password is the PCP password in md5 format.
You can use the pg_md5 program to generate the encrypted password from the clear-text form:
```
$ pg_md5 mypassword

34819d7beeabb9260a5c854bc85b3e44
```
For example, the entry in the pcp.conf file for a PCP user named pcpuser with the password of mypassword is:
```
# USERID:MD5PASSWD
pcpuser:34819d7beeabb9260a5c854bc85b3e44
```
Restart the EDB Pgpool service.
When issuing a PCP command, specify the PCP user name and the unencrypted form of the password:
```
$ pcp_node_info 5 localhost 9898 pcpuser mypassword 0
localhost 5444 1 1.000000
```

After configuring PCP, you can use the following PCP commands to control EDB Pgpool-II and retrieve information.

PCP command	Description
`pcp_common_options`	Common options used in PCP commands
`pcp_node_count`	Displays the total number of database nodes
`pcp_node_info`	Displays the information on the given node ID
`pcp_health_check_stats`	Displays health check statistics data on given node ID
`pcp_watchdog_info`	Displays the watchdog status of the EDB Pgpool-II
`pcp_proc_count`	Displays the list of EDB Pgpool-II children process IDs
`pcp_proc_info`	Displays the information on the given EDB Pgpool-II child process ID
`pcp_pool_status`	Displays the parameter values as defined in `pgpool.conf`
`pcp_detach_node`	Detaches the given node from EDB Pgpool-II, forcing existing connections to EDB Pgpool-II to be disconnected
`pcp_attach_node`	Attaches the given node to EDB Pgpool-II
`pcp_promote_node`	Promotes the given node as new main to EDB Pgpool-II
`pcp_stop_pgpool`	Terminates the EDB Pgpool-II process
`pcp_reload_config`	Reloads EDB Pgpool-II config file
`pcp_recovery_node`	Attaches the given backend node with recovery

Note

pcp_health_check_stats and pcp_reload_config commands are available from EDB Pgpool version 4.2 onwards.

To view more information about PCP command options, see the Pgpool project site.

Configuring number of connections and pooling

EDB Pgpool-II has some configuration to tune the pooling and connection processing. Depending on this configuration, you must also set the Postgres configuration for max_connections to ensure all connections can be accepted as required. Furthermore, the cloud architecture works with active/active instances, which needs to spread num_init_children over all EDB Pgpool instances (divide the normally used value by the number of active instances).

max_pool: Generally, advised to set max_pool to 1. Alternatively, for applications with many reconnects, you can set max_pool to the number of distinct combinations of users, databases, and connection options for the application connections. All but one connection in the pool are stale connections, which consume a connection slot from Postgres without adding to the performance. We therefore recommend that you don't configure max_pool beyond 4 to preserve a healthy ratio between active and stale connections. As an example, for an application that constantly reconnects and uses two distinct users, both connecting to their own database, set it to 2. If both users can connect to both databases, set it to 4. Increasing max_pool requires that you tune down num_init_children in EDB Pgpool or tune up max_connections in Postgres.

num_init_children: We recommend setting num_init_children to the number of connections that could be running active in parallel, but the divide value by the number of active EDB Pgpool-II instances (one with the on-premise architecture and all instances for the cloud architecture). As an example: in an architecture with three EDB Pgpool instances, to allow the application to have 100 active connections in parallel, set num_init_children to 100 for the on-premises architecture, and set num_init_children to 33 for the cloud architecture. Increasing num_init_children generally requires that you tune up max_connections in Postgres.

listen_backlog_multiplier: Can be set to multiply the number of open connections (as perceived by the application) with the number of active connections (num_init_children). As an example, when the application might open 500 connections, of which 100 should be active in parallel, with the on-premises architecture, set num_init_children to 100 and listen_backlog_multiplier to 4. This setup can process 100 connections active in parallel, and another 400 (listen_backlog_multiplier x num_init_children) connections are queued before connections are blocked. The application perceives a total of 500 open connections, and Postgres processes the load of 100 connections maximum at all times. Increasing listen_backlog_multiplier causes the application to perceive more connections but doesn't increase the number of parallel active connections (which is determined by num_init_children).

max_connections: We recommend setting max_connections in Postgres higher than [number of active pgpool instances] x [max_pool] x [num_init_children] + [superuser_reserved_connections] (Postgres). As an example: in the on-premises setup with three instances active/passive, max_pool set to 2, num_init_children set to 100, and superuser_reserved_connections (Postgres) set to 5, set Postgres max_connections equal to or higher than [1 x 2 x 100+5], which is 205 connections, or higher. A similar setup in the cloud setup runs with three active instances, max_pool set to 2, num_init_children set to 33, and superuser_reserved_connections (Postgres) set to 5. In this case set Postgres max_connections equal or higher than [3x 2 x 33+5] which is 203 or higher. Configuring below the advised setting can cause issues opening new connections and, in combination with max_pool, can cause unexpected behavior (low or no active connections but still connection issues due to stale pooled connections using connection slots from Postgres.

EDB Pgpool-II host setup

After modifying the parameter settings that implement EDB Pgpool-II functionality for your installation, start the EDB Pgpool-II service.

When EDB Pgpool-II starts, it records its process ID in a pgpool.conf file whose name is determined by the pid_file_name configuration parameter. The initial value of the pid_file_name parameter in the sample file is:

pid_file_name = /var/run/edb/pgpool<x.y>/edb-pgpool-<x.y>.pid

<x.y> is the EDB Pgpool release version.

Note

The operating system might remove the contents of the /var/run directory (including the pgpool directory) during a reboot. Don't use the /var/run/edb/pgpool directory as the location for the pgpool.pid file. Modify the pid_file_name parameter to specify a safer directory location.