Managing a PEM agent v9
Managing job notifications
In the PEM console, you can configure whether an email notification is sent when a job succeeds or fails. This applies to system-generated jobs listed under scheduled tasks and to custom agent jobs. You can configure these email notification settings at the following three levels to send email notifications to the specified user group. The levels are listed in order of precedence, highest first.
- Job level
- Agent level
- PEM server level (default)
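The order of precedence can be pictured as a lookup that returns the settings of the first level that defines them. The following Python sketch is illustrative only. The NotificationSettings type and the resolve_notification function are hypothetical names, not part of PEM, and the sketch assumes that a level without its own settings defers to the next level.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class NotificationSettings:
    """Hypothetical container for the notification settings of one level."""
    email_group: str
    on_success: bool
    on_failure: bool


def resolve_notification(
    job: Optional[NotificationSettings],
    agent: Optional[NotificationSettings],
    server: NotificationSettings,
) -> NotificationSettings:
    """Return the settings of the highest-precedence level that defines them.

    Order of precedence: job level, then agent level, then the PEM server
    level, which acts as the default.
    """
    for level in (job, agent):
        if level is not None:
            return level
    return server


# Example values are made up for illustration.
server_default = NotificationSettings("dba-team", on_success=False, on_failure=True)
agent_override = NotificationSettings("oncall", on_success=True, on_failure=True)

# No job-level settings are defined, so the agent-level settings apply.
effective = resolve_notification(None, agent_override, server_default)
print(effective.email_group)
```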
Configuring job notifications at job level
You can configure email notification settings at job level only for a custom agent job in one of the following ways:
- For a new agent job, you can configure the email notification settings in the Notifications tab of the Create Agent Job wizard while creating the job.
- For an existing custom job, you can edit the properties of the job and configure the notification settings.
Use the Notifications tab to configure the email notification settings at job level:
- Use the Send the notifications field to specify when you want to send the email notifications.
- Use the Email group field to specify the email group to send the email notification to.
Configuring job notifications at agent level
Select the agent in the tree view, right-click, and select Properties. In the Properties dialog box, select the Job Notifications tab.
Use the Job notifications tab to configure the email notification settings at agent level:
- Use the Override default configuration? switch to specify if you want the agent level job notification settings to override the default job notification settings. Select Yes to enable the rest of the settings on this dialog box to define when and to whom to send the job notifications.
- Use the Email on job completion? switch to specify whether to send the job notification when the job completes successfully.
- Use the Email on a job failure? switch to specify whether to send the job notification when the job fails.
- Use the Email group field to specify the email group to send the job notification to.
Configuring job notifications at server level
You can use the Server Configuration dialog box to provide information about your email notification configuration at PEM server level. To open the Server Configuration dialog box, select Management > Server Configuration.
Four server configuration parameters specify information about your job notification preferences at PEM server level:
- Use the job_failure_notification switch to specify if you want to send an email notification after each job failure.
- Use the job_notification_email_group parameter to specify the email group to send the email notification to.
- Use the job_retention_time parameter to specify the number of days to retain nonrecurring scheduled tasks in the system.
- Use the job_status_change_notification switch to specify if you want to send an email notification after each job status change, that is, failure, success, or interrupted.
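As a rough illustration of how the two notification switches relate, the following Python sketch decides whether a server-level email would be sent for a given job status. The should_notify function and the status strings are hypothetical. The sketch assumes that the status-change switch covers failure, success, and interrupted, and that the failure switch covers failure only. How PEM combines the two settings internally isn't stated here.

```python
def should_notify(
    status: str,
    job_failure_notification: bool,
    job_status_change_notification: bool,
) -> bool:
    """Decide whether a server-level email would be sent for a job status.

    Assumption: the status-change switch applies to every listed status,
    and the failure switch applies to failures only.
    """
    statuses = {"failure", "success", "interrupted"}
    if job_status_change_notification and status in statuses:
        return True
    return job_failure_notification and status == "failure"


# Only the failure switch is on: a success is silent, a failure is reported.
print(should_notify("success", True, False))
print(should_notify("failure", True, False))
```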
Managing PEM scheduled jobs
You can create a PEM scheduled job to perform a set of steps you define in the specified sequence. These steps can contain SQL code or a batch/shell script that you can run on a server that's bound with the agent. You can schedule these jobs to suit your business requirements. For example, you can create a job for taking a backup of a particular database server and schedule it to run on a specific date and time of every month.
To create or manage a PEM scheduled job, use the PEM tree to browse to the PEM agent for which you want to create the job. The tree displays a Jobs node, under which currently defined jobs are listed. To add a job, right-click the Jobs node and select Create Job from the context menu.
Use the tabs on the Create - Agent Job dialog box to define the steps and schedule that make up a PEM scheduled job.
Use the General tab to provide general information about a job:
- Provide a name for the job in the Name field.
- Set the Enabled switch to Yes to enable a job. Set it to No to disable a job.
- Use the Comment field to store notes about the job.
Use the Steps tab to define and manage the steps that the job performs. Select Add (+) to add a step. Then, select the compose icon, located at the left side of the header, to open the Step Definition dialog box.
Use the Step Definition dialog box to define the step:
Provide a name for the step in the Name field. Steps are performed in alphanumeric order by name.
Use the Enabled switch to include the step when executing the job (True) or to disable the step (False).
Use the Kind switch to indicate if the job step invokes SQL code (SQL) or a batch script (Batch).
- If you select SQL, use the Code tab to provide SQL code for the step.
- If you select Batch, use the Code tab to provide the batch script to execute during the step.
Use the On error list to specify the behavior of pgAgent if it encounters an error while executing the step. Select from:
- Fail — Stop the job if you encounter an error while processing this step.
- Success — Mark the step as completing successfully and continue.
- Ignore — Ignore the error and continue.
If you selected SQL as your input for the Kind switch:
- Use the Server field to specify the server that's bound with the agent for which you are creating the PEM scheduled job.
- Use the Database field to specify the database that's associated with the server that you selected.
Use the Comment field to provide a comment about the step.
Use the context-sensitive field on the Step Definition dialog box Code tab to provide the SQL code or batch script to execute during the step:
If the step invokes SQL code, provide one or more SQL statements in the SQL query field.
If the step invokes a batch script, provide the script in the Code field. If you're running on a Windows server, use standard batch file syntax. On a Linux server, you can use any shell script, provided that you specify a suitable interpreter on the first line (such as #!/bin/sh). Along with the defined inline code, you can also provide the path of any batch script, shell script, or SQL file on the filesystem.
To invoke a script on a Linux system, you must modify the entry for the batch_script_user parameter in the agent.cfg file and specify the user who runs the script. You can specify either a nonroot user or root for this parameter. If you don't specify a user or the specified user doesn't exist, then the script doesn't execute. Restart the agent after modifying the file.
To invoke a script on a Windows system, set the registry entry for AllowBatchJobSteps to true and restart the PEM agent. PEM registry entries are located in HKEY_LOCAL_MACHINE\Software\Wow6432Node\EnterpriseDB\PEM\agent.
After providing all the information required by the step, select Save.
Select Add (+) to add each step, or select the Schedules tab to define the job schedule.
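The step ordering and the On error choices described earlier can be modeled with a short Python sketch. This is a simplified simulation, not the agent's implementation. The run_job function and the outcome labels are hypothetical, and alphanumeric order is assumed to behave like plain string ordering.

```python
from typing import Callable, List, Tuple

# (step name, On error setting, callable that raises an exception on error)
Step = Tuple[str, str, Callable[[], None]]


def run_job(steps: List[Step]) -> List[Tuple[str, str]]:
    """Run steps in order of name and return (name, outcome) pairs."""
    results = []
    # Assumption: alphanumeric order is modeled as plain string ordering.
    for name, on_error, action in sorted(steps, key=lambda s: s[0]):
        try:
            action()
            results.append((name, "succeeded"))
        except Exception:
            if on_error == "Fail":
                results.append((name, "failed"))
                break  # Fail: stop the job at this step
            if on_error == "Success":
                results.append((name, "succeeded"))  # Success: mark as completed
            else:
                results.append((name, "ignored"))  # Ignore: continue with next step
    return results


def ok() -> None:
    pass


def boom() -> None:
    raise RuntimeError("simulated step error")


steps = [
    ("step10", "Fail", ok),
    ("step2", "Ignore", boom),
    ("step1", "Success", boom),
]
print(run_job(steps))
```

Under the string-ordering assumption, step10 runs before step2. Zero-padded names such as step02 and step10 keep the intended sequence.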
On the Schedules tab, select Add (+) to add a schedule for the job. Then select the compose icon located at the left side of the header to open the Schedule Definition dialog box.
Use the Schedules definition tab to specify the days and times for the job to execute.
- Provide a name for the schedule in the Name field.
- Use the Enabled switch to indicate for pgAgent to use the schedule (Yes) or to disable the schedule (No).
- Use the calendar selector in the Start field to specify the starting date and time for the schedule.
- Use the calendar selector in the End field to specify the ending date and time for the schedule.
- Use the Comment field to provide a comment about the schedule.
Select the Repeat tab to define the days when the schedule executes in a cron-style format. The job executes on each date or time element selected on the Repeat tab.
In each field, select a value to add it to the list of selected values for the field. To clear the values from a field, select the X located at the right side of the field.
- Use the fields in the Days box to specify the days when the job executes:
- Use the Week Days field to select the days when the job executes.
- Use the Month Days field to select the numeric days when the job executes. Select Last Day to perform the job on the last day of the month, regardless of the date.
- Use the Months field to select the months when the job executes.
- Use the fields in the Times box to specify the times when the job executes in hours and minutes.
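To make the cron-style matching concrete, the following Python sketch checks whether a point in time satisfies a set of Repeat selections. It is a model under stated assumptions, not the scheduler's code. The schedule_matches function is hypothetical, an empty selection is assumed to place no restriction on its field, all fields are assumed to be combined with AND, and the Last Day option isn't modeled.

```python
from datetime import datetime
from typing import Set


def matches(selected: Set[int], value: int) -> bool:
    """Assumption: an empty selection places no restriction on the field."""
    return not selected or value in selected


def schedule_matches(
    when: datetime,
    week_days: Set[int],   # 0 = Monday ... 6 = Sunday (Python convention)
    month_days: Set[int],  # 1-31
    months: Set[int],      # 1-12
    hours: Set[int],       # 0-23
    minutes: Set[int],     # 0-59
) -> bool:
    """Return True if every field with a selection matches the given time."""
    return (
        matches(week_days, when.weekday())
        and matches(month_days, when.day)
        and matches(months, when.month)
        and matches(hours, when.hour)
        and matches(minutes, when.minute)
    )


# A monthly backup on the first day of each month at 02:30.
monthly = dict(week_days=set(), month_days={1}, months=set(), hours={2}, minutes={30})
print(schedule_matches(datetime(2024, 3, 1, 2, 30), **monthly))
print(schedule_matches(datetime(2024, 3, 2, 2, 30), **monthly))
```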
Select the Exceptions tab to specify any days when you don't want the schedule to execute. For example, you might not want to run jobs on holidays.
Select Add (+) to add a row to the exception table. Then:
- In the Date column, open a calendar selector and select a date when you don't want the job to execute. Specify **<Any>** to indicate that you don't want the job to execute on any day at the time selected.
- In the Time column, open a time selector and specify a time when you don't want the job to execute. Specify **<Any>** to indicate that you don't want the job to execute at any time on the day selected.
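The effect of exception rows, including the <Any> choice, can be sketched in Python as follows. The is_excluded function is a hypothetical helper, None stands in for <Any>, and times are assumed to be compared at minute precision.

```python
from datetime import date, datetime, time
from typing import List, Optional, Tuple

# None stands in for the <Any> choice in the Date or Time column.
ExceptionRow = Tuple[Optional[date], Optional[time]]


def is_excluded(when: datetime, exceptions: List[ExceptionRow]) -> bool:
    """Return True if any exception row covers the given date and time."""
    # Assumption: exception times are compared at minute precision.
    when_time = when.time().replace(second=0, microsecond=0)
    for exc_date, exc_time in exceptions:
        date_hit = exc_date is None or exc_date == when.date()
        time_hit = exc_time is None or exc_time == when_time
        if date_hit and time_hit:
            return True
    return False


exceptions = [
    (date(2024, 12, 25), None),  # no runs at any time on this holiday
    (None, time(3, 0)),          # no runs at 03:00 on any day
]
print(is_excluded(datetime(2024, 12, 25, 14, 0), exceptions))
print(is_excluded(datetime(2024, 6, 1, 4, 0), exceptions))
```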
Select the Notifications tab to configure the email notification settings at job level:
- Use the Send the notifications field to specify when you want to send the email notifications.
- Use the Email group field to specify the email group to send the email notification to.
When you finish defining the schedule, you can use the SQL tab to review the code that creates or modifies your job.
Select Save to save the job definition.
After you save a job, the job is listed under the Jobs node of the PEM tree of the server on which it was defined. The Properties tab in the PEM console displays a high-level overview of the selected job, and the Statistics tab shows the details of each run of the job. To modify an existing job or to review detailed information about a job, right-click a job name, and select Properties from the context menu.
Agent privileges
By default, the PEM agent is installed with root privileges for the operating system host and superuser privileges for the database server. These privileges allow the PEM agent to invoke unrestricted probes on the monitored host and database server about system usage, retrieving and returning the information to the PEM server.
PEM functionality lessens as the privileges of the PEM agent decrease. For complete functionality, run the PEM agent as root. If the PEM agent is run under the database server's service account, PEM probes don't have complete access to the statistical information used to generate reports, and functionality is limited to the capabilities of that account. If the PEM agent is run under another lesser-privileged account, functionality is limited even further.
Feature Name | Works with root User | Works with non-root User | Works with remote PEM Agent |
---|---|---|---|
Audit Manager | yes | The Audit Log Manager may be unable to apply requested modifications if the service cannot be restarted. The user running PEM Agent may be different from the user who owns the data directory of the database server, so the user running PEM Agent may not be able to change the configuration or restart the services of the database server. | no |
Capacity Manager | yes | yes | yes NOTE: There will be no correlation between the database server and operating system metrics. |
Log Manager | yes | The Log Manager may be unable to apply requested modifications if the service cannot be restarted. The user running PEM Agent may be different from the user who owns the data directory of the database server, so the user running PEM Agent may not be able to change the configuration or restart the services of the database server. | no |
Manage Alerts | yes | yes | yes NOTE: When Run alert script on the database server is selected, the script runs on the machine where the bound PEM Agent is running, not on the actual database server machine. |
Manage Charts | yes | yes | yes |
Manage Dashboards | yes | Some dashboards may not be able to show complete data. For example, columns such as swap usage, CPU usage, IO read, and IO write will be displayed as 0 in the session activity dashboard. | Some dashboards may not be able to show complete data. For example, the operating system information of the database server will not be displayed because it is not available. |
Manage Probes | yes | Some of the PEM probes will not return information, and some of the functionalities may be affected. For details about probe functionality, see Agent privileges. | Some of the PEM probes will not return information, and some of the functionalities may be affected. |
Postgres Expert | yes | The Postgres Expert will be able to access the configuration expert and schema expert, but not the security expert. | The Expert will provide partial information as operating system information is not available. |
Postgres Log Analysis Expert | yes | The Postgres Log Analysis Expert may not be able to do the analysis as it is dependent on the logs imported by log manager, which will not work as required. | The Postgres Log Analysis Expert will not be able to do the analysis as it is dependent on the logs imported by log manager, which will not work as required. |
Scheduled Tasks | yes | On Linux, shell scripts run if the user is the same as batch_script_user in agent.cfg. | Scheduled tasks will work only for the database server; scripts will run on the remote Agent. |
Tuning Wizard | yes | The Tuning Wizard will be unable to run if the service cannot be restarted. The user running PEM Agent may be different from the user who owns the data directory of the database server, so the user running PEM Agent may not be able to change the configuration or restart the services of the database server. | no |
System Reports | yes | yes | yes |
Core Usage Reports | yes | yes | The Core Usage report will not show complete information. For example, the platform, number of cores, and total RAM will not be displayed. |
Managing BART | yes | BART and the BART scanner may not be able to start/reload. | no NOTE: BART requires passwordless authentication between the two machines where the database server and BART are installed. |
If you limit the operating system privileges of the PEM agent, some of the PEM probes don't return information, and the following functionality might be affected.
Note
The list isn't comprehensive but provides an overview of the type of functionality that's limited.
Probe or action | Operating system | PEM functionality affected |
---|---|---|
Data And Logfile Analysis | Linux/ Windows | The Postgres Expert can't access complete information. |
Session Information | Linux | The per-process statistics are incomplete. |
PG HBA | Linux/ Windows | The Postgres Expert can't access complete information. |
Service restart functionality | Linux/ Windows | The Audit Log Manager, Server Log Manager, Log Analysis Expert, and PEM might not be able to apply requested modifications. |
Package Deployment | Linux/ Windows | PEM can't run downloaded installation modules. |
Batch Task | Windows | PEM can't run scheduled batch jobs in Windows. |
Collect data from server (root access required) | Linux/ Windows | Columns such as swap usage, CPU usage, IO read, IO write appear as 0 in the session activity dashboard. |
If you restrict the database privileges of the PEM agent, the following PEM functionality might be affected:
Probe | Operating system | PEM functionality affected |
---|---|---|
Audit Log Collection | Linux/Windows | PEM receives empty data from the PEM database. |
Server Log Collection | Linux/Windows | PEM can't collect server log information. |
Database Statistics | Linux/Windows | The Database/Server Analysis dashboards contain incomplete information. |
Session Waits/System Waits | Linux/Windows | The Session/System Waits dashboards contain incomplete information. |
Locks Information | Linux/Windows | The Database/Server Analysis dashboards contain incomplete information. |
Streaming Replication | Linux/Windows | The Streaming Replication dashboard doesn't display information. |
Slony Replication | Linux/Windows | Slony-related charts on the Database Analysis dashboard don't display information. |
Tablespace Size | Linux/Windows | The Server Analysis dashboard doesn't display complete information. |
xDB Replication | Linux/Windows | PEM can't send xDB alerts and traps. |
If the probe is querying the operating system without enough privileges, the probe might return a permission denied error. If the probe is querying the database without enough privileges, the probe might return a permission denied error or display the returned data in a PEM chart or graph as an empty value.
When a probe fails, an entry is written to the log file that contains the name of the probe, the reason the probe failed, and a hint that helps you resolve the problem.
You can view probe-related errors that occurred on the server in the Probe Log dashboard or review error messages in the PEM worker log files. On Linux, the default location of the log file is:
/var/log/pem/worker.log
On Windows, log information is available in the Event Viewer.
Agent configuration
A number of configurable parameters and registry entries control the behavior of the PEM agent. You might need to modify the PEM agent's parameter settings to enable some PEM functionality. After modifying values in the PEM agent configuration file, restart the PEM agent to apply any changes.
With the exception of the PEM_MAXCONN parameter, we strongly recommend against modifying any of these configuration parameters or registry entries without first consulting EDB support experts unless you need the modifications to enable PEM functionality.
On Linux systems, PEM configuration options are stored in the agent.cfg file, located in /usr/edb/pem/agent/etc. The agent.cfg file contains the following entries.
Parameter name | Description | Default value |
---|---|---|
pem_host | The IP address or hostname of the PEM server. | 127.0.0.1. |
pem_port | The database server port to which the agent connects to communicate with the PEM server. | Port 5432. |
pem_agent | A unique identifier assigned to the PEM agent. | The first agent is '1', the second agent is '2', and so on. |
agent_ssl_key | The complete path to the PEM agent's key file. | /root/.pem/agent.key |
agent_ssl_crt | The complete path to the PEM agent's certificate file. | /root/.pem/agent.crt |
agent_flag_dir | Used for HA support. Specifies the directory path checked for requests to take over monitoring another server. Requests are made in the form of a file in the specified flag directory. | Not set by default. |
log_level | Specifies the type of event to write to the PEM log files: one of debug2, debug, info, warning, or error. These are in descending order of logging verbosity; debug2 logs everything possible, and error only logs errors. | warning |
log_location | Specifies the location of the PEM worker log file. | /var/log/pem/worker.log |
agent_log_location | Specifies the location of the PEM agent log file. | /var/log/pem/agent.log |
long_wait | The maximum length of time (in seconds) for the PEM agent to wait before attempting to connect to the PEM server if an initial connection attempt fails. | 30 seconds |
short_wait | The minimum length of time (in seconds) for the PEM agent to wait before checking which probes are next in the queue waiting to run. | 10 seconds |
alert_threads | The number of alert threads to be spawned by the agent. For more information, see About alert threads. | Set to 1 for the agent that resides on the host of the PEM server, 0 for all other agents. |
enable_smtp | When set to true, the SMTP email feature is enabled. If set to true on multiple PEM agents, agents at version 7.13 or earlier might send more duplicate emails, whereas agents at version 7.14 or later might send fewer duplicate emails. | true for PEM server host, false for all others. |
enable_snmp | When set to true, the SNMP trap feature is enabled. If set to true on multiple PEM agents, agents at version 7.13 or earlier might send more duplicate traps, whereas agents at version 7.14 or later might send fewer duplicate traps. | true for PEM server host, false for all others. |
enable_nagios | When set to true, Nagios alerting is enabled. | true for PEM server host, false for all others. |
enable_webhook | When set to true, Webhook alerting is enabled. | true for PEM server host, false for all others. |
max_webhook_retries | Set maximum number of times pemAgent retries to call webhooks on failure. | Default 3. |
connect_timeout | The max time in seconds (a decimal integer string) for the agent to wait for a connection. | Not set by default. Set to 0 to indicate for the agent to wait indefinitely. |
allow_server_restart | If set to TRUE, the agent can restart the database server that it monitors. Some PEM features might be enabled/disabled, depending on the value of this parameter. | False |
max_connections | The maximum number of probe connections used by the connection throttler. | 0 (an unlimited number) |
connection_lifetime | Use ConnectionLifetime (or connection_lifetime) to specify the minimum number of seconds an open but idle connection is retained. This parameter is ignored if the value specified in MaxConnections is reached and a new connection to a different database is required to satisfy a waiting request. | By default, set to 0 (a connection is dropped when the connection is idle after the agent's processing loop). |
allow_batch_probes | If set to TRUE, the user can create batch probes using the custom probes feature. | false |
heartbeat_connection | When set to TRUE, a dedicated connection is used for sending the heartbeats. | false |
batch_script_dir | Provide the path where script file (for alerting) is stored. | /tmp |
connection_custom_setup | Use to provide SQL code to invoke when a new connection with a monitored server is made. | Not set by default. |
ca_file | Provide the path where the CA certificate resides. | Not set by default. |
batch_script_user | Provide the name of the user to use for executing the batch/shell scripts. | None |
webhook_ssl_key | The complete path to the webhook's SSL client key file. | |
webhook_ssl_crt | The complete path to the webhook's SSL client certificate file. | |
webhook_ssl_crl | The complete path of the CRL file to validate webhook server certificate. | |
webhook_ssl_ca_crt | The complete path to the webhook's SSL ca certificate file. | |
allow_insecure_webhooks | When set to true, allow webhooks to call with insecure flag. | false |
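Before restarting the agent, it can be useful to check which values a configuration file actually sets. The following Python sketch parses text in a key=value layout and flags an empty batch_script_user. The file layout, the [PEM/agent] section header, and the sample contents are assumptions made for illustration. Compare them with your own agent.cfg before relying on this.

```python
from typing import Dict


def parse_cfg(text: str) -> Dict[str, str]:
    """Collect key=value pairs, skipping blanks, comments, and [section] lines."""
    settings: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", ";", "[")):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


# Hypothetical sample contents; a real agent.cfg may differ.
sample = """\
[PEM/agent]
pem_host=127.0.0.1
pem_port=5432
log_level=warning
batch_script_user=
"""

cfg = parse_cfg(sample)
if not cfg.get("batch_script_user"):
    # Batch/shell steps don't execute on Linux when no user is specified.
    print("batch_script_user is not set")
```

To check a real file, read it with open() and pass the contents to parse_cfg.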
On 64-bit Windows systems, PEM registry entries are located in:
HKEY_LOCAL_MACHINE\Software\Wow6432Node\EnterpriseDB\PEM\agent
The registry contains the entries shown in the table.
Parameter name | Description | Default value |
---|---|---|
PEM_HOST | The IP address or hostname of the PEM server. | 127.0.0.1. |
PEM_PORT | The database server port to which the agent connects to communicate with the PEM server. | Port 5432. |
AgentID | A unique identifier assigned to the PEM agent. | The first agent is '1', the second agent is '2', and so on. |
AgentKeyPath | The complete path to the PEM agent's key file. | %APPDATA%\Roaming\pem\agent.key |
AgentCrtPath | The complete path to the PEM agent's certificate file. | %APPDATA%\Roaming\pem\agent.crt |
AgentFlagDir | Used for HA support. Specifies the directory path checked for requests to take over monitoring another server. Requests are made in the form of a file in the specified flag directory. | Not set by default. |
LogLevel | Specifies the type of event to write to the PEM log files: one of debug2, debug, info, warning, or error. These are in descending order of logging verbosity; debug2 logs everything possible, and error only logs errors. | warning |
LongWait | The maximum length of time (in seconds) that the PEM agent waits before attempting to connect to the PEM server if an initial connection attempt fails. | 30 seconds |
shortWait | The minimum length of time in seconds that the PEM agent waits before checking which probes are next in the queue (waiting to run). | 10 seconds |
AlertThreads | The number of alert threads for the agent to spawn. For more information, see About alert threads. | Set to 1 for the agent that resides on the host of the PEM server, 0 for all other agents. |
EnableSMTP | When set to true, the SMTP email feature is enabled. | true for PEM server host, false for all others. |
EnableSNMP | When set to true, the SNMP trap feature is enabled. | true for PEM server host, false for all others. |
EnableWebhook | When set to true, Webhook alerting is enabled. | true for PEM server host, false for all others. |
MaxWebhookRetries | Set maximum number of times for pemAgent to retry to call webhooks on failure. | Default 3. |
ConnectTimeout | The max time in seconds (a decimal integer string) that the agent waits for a connection. | Not set by default. If set to 0, the agent waits indefinitely. |
AllowServerRestart | If set to TRUE, the agent can restart the database server that it monitors. Some PEM features might be enabled/disabled, depending on the value of this parameter. | true |
MaxConnections | The maximum number of probe connections used by the connection throttler. | 0 (an unlimited number) |
ConnectionLifetime | Use ConnectionLifetime (or connection_lifetime) to specify the minimum number of seconds an open but idle connection is retained. This parameter is ignored if the value specified in MaxConnections is reached and a new connection to a different database is required to satisfy a waiting request. | By default, set to 0 (a connection is dropped when the connection is idle after the agent's processing loop). |
AllowBatchProbes | If set to TRUE, the user can create batch probes using the custom probes feature. | false |
HeartbeatConnection | When set to TRUE, a dedicated connection is used for sending the heartbeats. | false |
BatchScriptDir | Provide the path to store the script file for alerting. | /tmp |
ConnectionCustomSetup | Use to provide SQL code to invoke when a new connection with a monitored server is made. | Not set by default. |
ca_file | Provide the path where the CA certificate resides. | Not set by default. |
AllowBatchJobSteps | If set to true, the batch/shell scripts are executed using the Administrator user account. | None |
WebhookSSLKey | The complete path to the webhook's SSL client key file. | |
WebhookSSLCrt | The complete path to the webhook's SSL client certificate file. | |
WebhookSSLCrl | The complete path of the CRL file to validate webhook server certificate. | |
WebhookSSLCaCrt | The complete path to the webhook's SSL ca certificate file. | |
AllowInsecureWebhooks | When set to true, allow webhooks to call with insecure flag. | false |
Note
If you add or remove any of the parameters in the agent.cfg file, you must restart the agent to apply the changes.
About alert threads
The number of alert threads spawned by an agent is determined by the alert_threads or AlertThreads parameter. In general, we recommend setting this parameter to 1 on the agent that resides on the PEM server and 0 for all other agents. However, on PEM server instances with very large numbers of alerts (caused by many monitored servers, many enabled alerts, or high alert frequency), it may be necessary to increase this parameter if alerts are not being evaluated at the configured frequency. In this situation, we recommend setting this parameter to around 8 on the agent that resides on the PEM server and 0 for all other agents and tuning up or down accordingly.
When tuning this parameter, it is important to understand that any agent can process any alert, so in general it is unnecessary to have a non-zero number of alert threads on more than one agent. The capacity of the PEM instance to process alerts is determined by the total number of alert threads across all agents. Increasing the number of threads on a specific agent does not give any additional performance for alerts pertaining to servers monitored by that agent.
Each alert thread opens a connection to the PEM server backend, so allocating more threads than necessary does result in additional memory and CPU usage on the PEM server.
Agent properties
The PEM Agent Properties dialog box provides information about the PEM agent from which the dialog box was opened. To open the dialog box, right-click an agent name in the PEM client tree and select Properties from the context menu.
Use the PEM Agent Properties dialog box to review or modify information about the PEM agent:
The Description field displays a modifiable description of the PEM agent. This description is displayed in the tree of the PEM client.
You can use groups to organize your servers and agents in the PEM client tree. Use the Group list to select the group in which the agent is displayed.
Use the Team field to specify the name of the group role that can access servers monitored by the agent. The servers monitored by this agent are visible in the PEM client tree to connected team members. This is a convenience feature. The Team field doesn't provide true isolation. Don't use it for security purposes.
The Heartbeat interval fields display the length of time that elapses between reports from the PEM agent to the PEM server. Use the selectors next to the Minutes or Seconds fields to modify the interval.
Use the Job Notifications tab to configure the email notification settings at agent level:
- Use the Override default configuration? switch to specify if you want the agent level job notification settings to override the default job notification settings. Select Yes to enable the rest of the settings on this dialog box to define when and to whom to send the job notifications.
- Use the Email on job completion? switch to specify whether to send the job notification when the job completes successfully.
- Use the Email on a job failure? switch to specify whether to send the job notification when the job fails.
- Use the Email group field to specify the email group to send the job notification to.
The Agent Configurations tab displays all the current configurations and capabilities of an agent.
- The Parameter column displays a list of parameters.
- The Value column displays the current value of the corresponding parameter.
- The Category column displays the category of the corresponding parameter. It can be either configuration or capability.