1. What is pg_statsinfo?
  2. Description
    1. Statistics Snapshot
    2. Server Log Filter
    3. Server Log Accumulation
    4. Alert Function
    5. Command Line Operations
    6. Automatic Repository Maintenance
  3. Installation
    1. Rquirement
    2. Instlation
  4. Admininistrative Operations and Settings Detailed
    1. Starting and Stopping pg_statsinfo
    2. Taking Snapshots and Deleting Old Snapshots
    3. Distributing Server Log
    4. Server Log Accumulation Into Repository
    5. Alert Function
    6. Getting Reports by Command Line Operation
    7. Automatic Maintenance
    8. Configuration File
  5. Uninstallation
  6. Restrictions
  7. Q&A
  8. Changes From pg_statsinfo 2.5
  9. Detailed Information
    1. Sharing Repository Database
    2. Warm Standby
    3. Fall-Back Mode
    4. Internals
  10. See Also

pg_statsinfo 3.0

What is pg_statsinfo ?

pg_statsinfo is a monitoring tool to record activities and statistics of PostgreSQL server in the form of time series of snapshots. You can examine the snapshots on graphical representations by using pg_stats_reporter.

Description

pg_statsinfo periodically gathers activities and statistics of one or more PostgreSQL servers and packing them as a snapshot. Snapshots are stored into repository database on another or same PostgreSQL server. Besides, it picks up some activities from PostgreSQL's CSV format log files and generates corresponding plain logfiles including its distinctive messages.

Two or more PostgreSQL instances can share single repository database.

You can check for server health and activities in easy-to-grasp graphical representation by using pg_stats_reporter. It shows various information as interacitive tables and graphs.

Components of pg_statsinfo are typically placed as the picture below. Each pg_statsinfo collects the information of the Database Server where it resides on and sends snapshots to the Repository Server. pg_stats_reporter runs behind web server so that the Users can get graphical reports on web browser on the Client.

pic 1: Example system configuration with pg_statsinfo

pic 2: Functional components in pg_statsinfo

Statistics Snapshot

pg_statsinfo periodically gathers various statistics and stores them as snapshots into a repository database. The repository database may riside on the monitored PostgreSQL server (DB cluster) and one repository server can store snapshots from multiple servers. Snapshots are generated for every 10 minutes as default and when commanded manually.

Every snapshots holds the following statistics information:

The required storage for every snapshot depends on the numbers of objects in the monitored database. It occupies about 600 - 800kB in typical cases. pg_statsinfo takes snapshots for every 10 minutes by default, so the required storage for all snapshots in a day from every monitored database is roughly estimated to be 90 - 120MB.

You can see the structure of the tables in pg_statsinfo's repository database in this document. (MS Excel document in Jpanaese).

Server Log Filter

pg_statsinfo distributes server log entries according to message levels. CSV log, plain text log, and syslog can have their own threshold levels.

Server Log Accumulation

pg_statsinfo can store server logs into repository database besides log files.

Alert Function

pg_statsinfo has an alert function which checks for some properties being out of notmal range. It writes alert logs into text log file and repository if set up to do so, but not to CSV log file.

The default alert function checks for the following properties:

(*1) Table correlation is monitored only for clustered tables, which are the tables having clustering index.

You can see setup instrunctions here.

Command line operations

pg_statsinfo has some functions which could be done in command line. Besides, you can get text formatted reports by command line operations.
* You can see the reference in this section.

Reporting in comnand line

You can get a report for the snapshots within the specified period in text format. Following informations also can be shown.

The comprehensive list of report items is shown in "report item list of pg_statsinfo v3.0". (.xls in Japanese)
Report items are equivalent with pg_stats_reporter.
If you would like to see graphical reports, please try pg_stats_reporter.

Administraitve operations

pg_statsinfo also provide administrative operations on command-line interface. Command descriptions are below.


Automatic repository maintenance

pg_statsinfo accumulates snapshots in repository database as it works, so it is necessary to delete stale snapshots. pg_statsinfo has a feature to do that once a day automatically. This feature does following jobs.

It is defaulted to be turned on and will be executed on preset settings.

The snapshots, stored logs and server logs are keeping to increase while this feature is turned off. Manual maintenance should be done properly in the case.

Setup reference of this feature is here.

Installation

Requirement

PostgreSQL versions
PostgreSQL 8.4, 9.0, 9.1, 9.2, 9.3, 9.4 (--with-libxml is required. Standard RPM's are built with this option.)
OS
RHEL 5.x, 6.x, CentOS 5.x, 6.x

Installation

Installing using RPM

Example steps to installing pg_statsinfo using rpm follow.
$ su
# rpm -ivh pg_statsinfo-3.0.0-1/pg92.rhel6.x86_64.rpm

Installing from source

You can build and install pg_statsinfo by following steps using PGXS. Setting up of repository database will be done automatically at first run.

$ cd pg_statsinfo
$ tar xzvf pg_statsinfo-3.0.0.tar.gz 
$ cd pg_statsinfo-3.0.0
$ make USE_PGXS=1
$ su
# make USE_PGXS=1 install

Configuration

This section describes the configuration of pg_statsinfo.

Configuration in postgresql.conf

This section shows the minimal setting to run pg_statsinfo and setting for ordinary case. In these configurations, pg_statsinfo stores snapshots into the 'postgres' database on the same instance to monotored instance. Detailed explanation for all setting parameters is shown in this section.

# minimal configuration
shared_preload_libraries = 'pg_statsinfo'       # preload pg_statsinfo libraries
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # pg_statsinfo requires this log_filename setting
# recommended configuration
shared_preload_libraries = 'pg_statsinfo'       # preload pg_statsinfo libraries
custom_variable_classes = 'pg_statsinfo'        # not required for PostgreSQL 9.2 or later

pg_statsinfo.snapshot_interval = 30min          # snapshot interval 
pg_statsinfo.enable_maintenance = 'on'          # enable automatic maintenance ('on' or 'off')
pg_statsinfo.maintenance_time = '00:02:00'      # delete old snapshots every day at this time.
pg_statsinfo.repository_keepday = 7             # keep snapshots for this period.
pg_statsinfo.repolog_keepday = 7                # keep logs in repository for this period.

log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log' # pg_statsinfo requires this log_filename setting
log_min_messages = 'log'
pg_statsinfo.syslog_min_messages = 'error'
pg_statsinfo.textlog_line_prefix = '%t %p %c-%l %x %q(%u, %d, %r, %a) '
pg_statsinfo.syslog_line_prefix = '%t %p %c-%l %x %q(%u, %d, %r, %a) '

track_functions = 'all'
log_checkpoints = on
log_autovacuum_min_duration = 0
#pg_statsinfo.long_lock_threashold = 30s        # threshold for getting long lock infomation
Notice for tacitly changed parameters.
log_destination
Forcibly set to 'csvlog' and stderr is omitted.
logging_collector
Forcibly set to on.

Configration of pg_hba.conf

Setup to allow the owner of the PostgreSQL process to log in the PostgreSQL server from localhost without password. "ident" is recommended method for authentication. In order to do that, add the following line to pg_hba.conf when "OS-user = DB-superuser = postgres" which is the most common case. Note that only the first line that matches the condition will be in effect. The "ident" authentication method with TYPE=local would be convenient on linux.

# TYPE  DATABASE        USER            CIDR-ADDRESS            METHOD [for UNIX]
local   all             postgres                                ident

Configuration for involving query statistics

For PostgreSQL 8.4 or later, you can have snapshots involving query statistics using pg_stat_statements. pg_statsinfo automatically detects pg_stat_statements and use it. Installation of pg_stat_statements would be done in the following steps after adding it to shared_preload_libraries in postgresql.conf.

# If you use PostgreSQL 9.1 or higher,
$ psql -d postgres -c "CREATE EXTENSION pg_stat_statements"

# If you use PostgreSQL earlier than 9.1,
$ psql -d postgres -f $PGSHARE/contrib/pg_stat_statements.sql

You can set the following parameters in postgresql.conf as needed.

Explanation for these parameters is seen in the Configuration section.

That's all. Have fun.

Admininistrative operations and settings detailed

This section explains about maintenance operations and detailed discription of configuration parameters for pg_statsinfo.

Starting and Stopping pg_statsinfo

No specific operation is required to run pg_statsinfo on PostgreSQL startup. Just start the server.

$ pg_ctl start [OPTIONS]

Likewise, pg_statsinfo stops gracefully along with PostgreSQL's shutdown. Shutting down in other than smart mode might cause some error messages, but they do no harm. Just ignore them.

$ pg_ctl stop -m smart [OPTIONS]

pg_statsinfo can stop individually on running PostgreSQL server by the following command.

$ pg_statsinfo --stop [OPTIONS]

Then it starts by the following command.

$ pg_statsinfo --start [OPTIONS]

Note: pg_statsinfo should be preloaded or it won't start by any means.

Taking snapshots and deleting old snapshots

Automatic snapshots

Pg_statsinfo takes snapshots periodically with the interval determined by pg_statsinfo.snapshot_interval in postgresql.conf.

example: seting snapshot interval time to 30 minutes

pg_statsinfo.snapshot_interval = 30min 

Manual snapshots

Alongside the automatic snapshots, manual snapshots can be taken any time by the following command. The function statsinfo.snapshot takes the label for the snapshot as parameter (text DEFAULT NULL).

$ psql -d postgres -c "SELECT statsinfo.snapshot('comment')"

Deleting snapshots automaticaly

Old snapshots are automatically deleted by the automatic repository maintenance function if tuned on.

Detailed explanation for setting up of the function is shown here.

Deleting snapshots manualy

Snapshots deletion can be executed at any time by the function statsinfo.maintenance(timestamptz). This function deletes all snapshots older than the specified timestamp.

Example: Deleting snapshots older than 2014-02-01 07:00:00.

$ psql -d postgres -c "SELECT statsinfo.maintenance('2014-02-01 07:00:00'::timestamptz);"

Cleaning up log files automatically

Log files are cleaned up along with automatic snapshot deletion.

Detailed explanation for setting up of the function is shown here.

Distributing server log

pg_statsinfo has the function to capture, manipulate, filter and distribute the server logs from PostgreSQL CSV log files.
The types of log files that pg_statsinfo handles and methods of filtering is described below.

Types of log files

CSV log files (*.csv) (named like postgresql-2013-10-01_000000.csv)
CSV log file is the source of the PostgreSQL log messages pg_statsinfo processes. (Detailed explanation for CSV log fie is seen Here)
Text log files (specifically that with the name of "pg_statsinfo.log")
The log file written by pg_statsinfo is called text log file.
This has the following characteristics.
Following parameters detemine the first three characteristics above. (details are Here)
Behavior at the time of log rotation
The file pg_statsinfo.log is renamed to have the same base name to the source CSV log file followed by the extension ".log" on rotating log files. Then the new text log file is created.

PostgreSQL may create a file with the new name mentioned above, which possibly containes stderr messages originate outside PostgreSQL processes. The existing and non-empty "console log" file will be renamed before the rotation.
Following is an example how the rotation looks like.
$ ls -l $PGDATA/pg_log
-rw------- 1 postgres postgres 433644 Oct  1 23:59 postgresql-2013-10-01_000000.csv
-rw------- 1 postgres postgres 322167 Oct  1 23:59 postgresql-2013-10-01_000000.log
-rw------- 1 postgres postgres 425449 Oct  2 23:59 postgresql-2013-10-02_000000.csv
-rw------- 1 postgres postgres 321695 Oct  2 23:59 postgresql-2013-10-02_000000.log
-rw------- 1 postgres postgres 255424 Oct  3 13:40 postgresql-2013-10-03_000000.csv
-rw------- 1 postgres postgres      0 Oct  3 00:00 postgresql-2013-10-03_000000.log
-rw------- 1 postgres postgres 190786 Oct  3 13:40 pg_statsinfo.log

postgresql-2013-10-01_000000.csv ... CSV log that has been rotated
postgresql-2013-10-01_000000.log ... Text log that has been rotated (processed log based on the CSV log above)
postgresql-2013-10-02_000000.csv ... CSV log that has been rotated
postgresql-2013-10-02_000000.log ... Text log that has been rotated (processed log based on the CSV log above)
postgresql-2013-10-03_000000.csv ... Latest CSV log
postgresql-2013-10-03_000000.log ... Console log
pg_statsinfo.log ................... Latest Text log (processed log based on the latest CSV log)

Filtering types

Filter by message level
Log messages are filtered out if its log level is lower than a threshold which is defined by the following configuration parameter in postgresql.conf.
For more information about configuration, refer the Configuration File section.
e.g. Following setup let only log messages whose level is greater than or equal to "warning" be emitted.
pg_statsinfo.textlog_min_messages = warning
Filter by user name
Log messages with particular user names can be filtered out.
The excluding user names are specified by the following cofiguration parameter in postgresql.conf.
For more information about configuration, refer the Configuration File section.
e.g. Output only log messages with the user name other than 'postgres'.
pg_statsinfo.textlog_nologging_users = 'postgres'
Change message level
Error level can be reassigned in the log messages with particular sql states.
The replacement rule is defined by the following parameters in postgresql.conf. The value for each parameter, if any, should be a list of the sqlcodes which determines the log messages whose error level should be changed to be the level that the parameter name suggests.
For more information about configuration, refer the Configuration File section.
e.g. Changing the message level to 'INFO' for logs with the sqlstate of '42P01'.
pg_statsinfo.adjust_log_level = on
pg_statsinfo.adjust_log_info = '42P01'
Note: This configuration is shared between log file and repository-accumulated logs. They cannot have individual settings.

Automatic cleanup of log files

pg_statsinfo automatically moves or removes old log files.
This feature is enabled by default.
Click here for more of the automatic maintenance feature.

Server log accumulation into repository

pg_statsinfo also can accumulate logs into repository database.
The same types of filtering to text log file can be used with partially different settings.

Types of filtering

Filter by message level
Log messages can be filtered out in the same manner to text log file.
The threshold is specified by the folloiwing parameter in postgresql.conf.
Filter by user name
User name filtering of log messages is also available.
The excluding user names are specified by the following parameter in postgresql.conf.
Change the message level
Error level reassignment is also available as text log file. As mentioned in the section, this feature shares the settings with text log files.

Deleting old log entries in the repository database

Old log entries in the repository database is deleted by the automatic maintenance feature.
This feature is enabled by default.

Click here to see how to set up the automatic maintenance.

Alert Function

Alert function is always turned on and provided with a ready-made alert function.

Alert settings using default alert function

The default alert function checks some properties against the thresholds defined in statsrepo.alert table described below. "-1" for the value indicates to turn of the alert for the property.

The alert function is executed just after every snapshot after the first one.

Example: Set ROLLBACKs threshold to 3000 times per second for a snapshot interval.
# UPDATE statsrepo.alert SET commit_tps = 3000 WHERE instid = <instance ID\>
Example: Disabling alert for number of rollbacks per second.
# UPDATE statsrepo.alert SET rollback_tps = -1 WHERE instid = <instance ID\>
Exmaples: Disabling all alerts. Note that the alert function itself is kept being called periodically.
# UPDATE statsrepo.alert SET enable_alert = false WHERE instid = <instance ID>

Definition of alert settings table

Alert settings table stores various settings for the defaultly-provided alert function.
Its name is "statsrepo.alert" and columns are defined as following.

column name data type default value description
instid bigint (no default) Instance ID of the instanece to be monitored
rollback_tps bigint 100 Alert threshold: ROLLBACK per second during a snapshot interval
commit_tps bigint 1000 Alert threshold: COMMIT per second during a snapshot interval
garbage_size bigint 20000 Alert threshold: Table dead space size in megabytes in a snapshot
garbage_percent integer 30 Alert threshold: Table dead space ratio in %
response_avg bigint 10 Alert threshold: Query average response time in seconds for a snapshot interval
response_worst bigint 60 Alert threshold: Query longest response in seconds for a snapshot interval
fragment_percent integer 70 Alert threshold: The absolute value of pg_stats.correlation
backend_max integer 100 Alert threshold: Maximum number of backend for a snapshot interval
disk_remain_percent integer 20 Alert threshold: Available disk space for tablespaces in percent(%)
loadavg_1min real 7.0 Alert threshold: Load average for 1 minutes
loadavg_5min real 6.0 Alert threshold: Load average for 5 minutes
loadavg_15min real 5.0 Alert threshold: Load average for 15 minutes
swap_size integer 1000000 Alert threshold: Disk swap usage in kilobytes
rep_flush_delay integer 100 Alert threshold: Replication delay in megabytes of WAL amount
rep_replay_delay integer 200 Alert threshold: Replication replay delay in megabytes of WAL amount
enable_alert boolean true Enables or disabeles all alerts

Don't delete the existing tuples in order to temporarily disable alerts.

Getting reports by command line operation

Generating a report

$ pg_statsinfo -r REPORTID [-i INSTANCEID] [-b SNAPID] [-e SNAPID] [-B DATE] [-E DATE] [-o FILENAME] [connection-options]

The following example shows a basic usage using the repository database 'postgres' at locahost:5432 accessing as the user 'postgres', which generates the report that is,

$ pg_statsinfo  -r All -h localhost -d postgres -p 5432 -U postgres
Available options are described below,
-r, --report=REPORTID
Generates a report of the type specified by REPORTID.
The following REPORTID are available.
More details are shown in Items of a report in pg_statsinfo v3.0(Both filename and contents are only in Japanese).
REPORTID will be completed if it matches only one ID by case insensitive suffix comparison.
-i, --instid=INSTANCEID
Generates a report for the instance specified by INSTANCEID, which defaults to all monitoring instances.
-b, --beginid=SNAPID
Generates a report for the range begins with SNAPID, which defaults to the oldest snapshot for the instance. This option is not allowed to be used with -B or -E.
-e, --endid=SNAPID
Generates a report for the range ends with SNAPID, which defaults to the latest snapshot for the instance. This option is not allowed to be used with -B or -E.
-B, --begindate=DATE
Generates a report for the period starts at DATE in the format of 'YYYY-MM-DD HH:MI:SS', which defaults to the timestamp of the oldest snapshot for the instance. This option is not allowed to be used with -b or -e.
-E, --enddate=DATE
Generates a report for the period ends at DATE in the format of 'YYYY-MM-DD HH:MI:SS', which defaults to the timestamp of the latest snapshot for the instance. This option is not allowed to be used with -b or -e.
-o, --output=FILENAME
Writes the generated report to FILENAME instaed of stdout. This will overwrites existing file.

Listing snapshots

$ pg_statsinfo -l [-i INSTANCEID] [connection-options]

The following example shows the commnad to list the all snapshots stored in the repository database 'postgres' at localhost:5432 accessing as the user 'postgres'.

$ pg_statsinfo -l -h localhost -d postgres -p 5432 -U postgres
-l, --list
Shows the list of snapshots.
-i, --instid=INSTANCEID
Shows the snapshot list for the instance INSTANCEID, defaults to all instances stored.

Showing repository database size

$ pg_statsinfo -s [connection-options]

The following example is the command to show the size of the repository database 'postgres' at localhost:5432 accessing as the user 'postgres'.

$ pg_statsinfo -s -h localhost -d postgres -p 5432 -U postgres
-s, --size
Shows the repository database size.

Taking a snapshot manually

$ pg_statsinfo -S COMMENT [connection-options]

The following example is the command to tell to take a snapshot of the monitored instance into the repository database 'postgres' at localhost:5432 accessing as the user 'postgres', providing it with the comment 'COMMENT'.

$ pg_statsinfo -S 'COMMENT' -h localhost -d postgres -p 5432 -U postgres
-S, --snapshot=COMMENT
Taking a snapshot manually with the comment 'COMMENT'.

Deleting snapshot manually

$ pg_statsinfo -D SNAPID [connection-options]

The following expmale is the command to tell to delete a snapshot specified by SNAPID in the repository database 'postgres' at localhost:5432 accessing as the user 'postgres'.

$ pg_statsinfo -D 123 -h localhost -d postgres -p 5432 -U postgres
-D, --delete=SNAPID
Deletes a snapshot designated by SNAPID.

Stoping pg_statsinfo agent

$ pg_statsinfo --stop [connection-options]

The following example is the command to tell pg_statsinfo agent to stop by connecting the monitored instance via the database 'postgres' of the user 'postgres' at localhost:5432.

$ pg_statsinfo --stop -h localhost -d postgres -p 5432 -U postgres
--stop
Stops pg_statsinfo agent.

Starting pg_statsinfo agent.

$ pg_statsinfo --start [connection-options]

The following example is the command to tell pg_statsinfo agent to start by connecting the monitored instance via the database 'postgres' of the user 'postgres' at localhost:5432.

$ pg_statsinfo --start -h localhost -d postgres -p 5432 -U postgres
--start
Starts pg_statsinfo agent.

Common connection options

This section describes about the connection options of pg_statsinfo which are common among all subcommands. These options are shared with PostgreSQL and has the same meanings. The connection target will be a repository database when the command manipulates snapshots or the monitored database when the command tells to do something to the agent.

-d, --dbname=DBNAME
Name of database to connect to. The default is $PGDATABASE or the same name as the connection user name if it is not set.
-h, --host=HOSTNAME
Name of the host to coonect to. If this begins with a slash, it specifies Unix-domain communication rather than TCP/IP communication. The default behavior when the host is not specified is to connect to a Unix-domain socket in /tmp.
-p, --port=PORT
Port number to connect to at the server host.
-U, --username=USERNAME
PostgreSQL user name to connect as.
-w, --no-password
Never prompt for password.
-W, --password
Force password prompt (should happen automatically)

Automatic maintenance

Repository database will piled high with snapshots and log directory will face the same situation if older data were left being there. pg_statsinfo deletes them by itself so as not to fill up storage space. This maintenance is executed once every day and it is set up in postgresq.conf as follows,

ex 1: Removing snapshots aged more than 7 days at every 0:02 am.

pg_statsinfo.enable_maintenance = 'snapshot'
pg_statsinfo.maintenance_time = '00:02:00'
pg_statsinfo.repository_keepday = 7

ex 2: Removing logs on the repository aged more than 7 days at every 0:02 am.

pg_statsinfo.enable_maintenance = 'repolog'
pg_statsinfo.maintenance_time = '00:02:00'
pg_statsinfo.repository_keepday = 7

ex 3: Archiving and remove CSV logs before the day before at every 0:02 am.

pg_statsinfo.enable_maintenance = 'log'
pg_statsinfo.maintenance_time = '00:02:00'
pg_statsinfo.log_maintenance_command = '/archive_remove_pglog.sh %l'

Where archive_remove_pglog.sh does, as the name suggests, will archive and remove a log file.

ex 4: Removing both snapshots and logs on the repository aged more than 7 days and archiving and deleting CSV logs before 7 days before.

pg_statsinfo.enable_maintenance = 'on'
pg_statsinfo.maintenance_time = '00:02:00'
pg_statsinfo.repository_keepday = 7
pg_statsinfo.repolog_keepday = 7
pg_statsinfo.log_maintenance_command = '/archive_pglog.sh %l'
Note: All monitored instance sharing one repository database execute this maintenance for the same repository so the following setting results in that snapshots aged more than 3 days won't survive after maintenance process of all instance is finished.

<Monitoring instance1>
pg_statsinfo.enable_maintenance = 'snapshot' 
pg_statsinfo.repository_keepday = 7
<Monitoring instance2>
pg_statsinfo.enable_maintenance = 'snapshot' 
pg_statsinfo.repository_keepday = 5
<Monitoring instance3>
pg_statsinfo.enable_maintenance = 'snapshot' 
pg_statsinfo.repository_keepday = 3

Configuration File

Configuration parameters and their meanings are described below.

pg_statsinfo reads configuration paramters written in postgresql.conf for the monitored instance. Reloding configuration file of PostgreSQL also affects pg_statsinfo.

Required parameters
Name Setting Description
shared_preload_libraries 'pg_statsinfo' Preloading libraries. This is a parameter not of pg_statsinfo's own but needed in order to run pg_statsinfo.
log_filename 'postgresql-%Y-%m-%d_%H%M%S.log' This is also a PostgreSQL's parameter and must be set for pg_statsinfo so that it runs properly. Log files must be ordered by creation time using alphabetical comparson of file names, so the variable parts "%Y", "%m", "%d", "%H", "%M" and "%S" are all should occur in this order in this format string.
track_counts on Enables collection of statistics on database activity. pg_statsinfo depends on the statistics enabled by this parameter.
track_activities on Enables the collection of information on the currently executing command of each session. pg_statsinfo depends on the information enabled by this parameter.
log_min_messages debug5 ∼ log Message level threshould for CSV logs. pg_statsinfo uses this as the source of log distribution, so it must be below or equal to the value of both pg_statsinfo.syslog_min_messages and pg_statsinfo.textlog_min_messages.
log_destination must have 'csvlog' and can have 'syslog' or 'eventlog' optionally pg_statsinfo needs server logs to be emitted to CSV logs, 'stderr' will be silently removed by pg_statsinfo even if it is occured.
logging_collector on pg_statsinfo always forces this tuned on on startup.

Following are PostgreSQL parameters affect the behavior of pg_statsinfo and pg_statsinfo's dedicated parameters. Changes will be in effect after reloading configuration file.

optional parameters
Name Default Setting Description
track_functions none Enables tracking of function call counts and time used. Setting this to 'pl' or 'all' let pg_statsinfo collect call statitics of functions.
track_io_timing off Enable timing of database I/O calls. Enabling this let pg_statsinfo collect I/O statictics. Note that enabling this may cause significant overhead on some platforms.
(This parameter is available for PostgreSQL 9.2 or later.)
log_checkpoints off Causes checkpoints to be logged in the server log. Enableing this let pg_statsinfo collect checkpoint activities.
log_autovacuum_min_duration -1 Causes autovacuums to be logged in the server log. Enabling this let pg_statsinfo collect autovacuum activities. Setting to 0 ∼ 1min is recommended.
log_directory 'pg_log' Directory location for csvlog and textlog files. pg_statsinfo reads this to know the location of log files.
log_rotation_age 1d Rotates logs in this duration.
log_rotation_size 10MB Rotates logs if the size of the current CSV log file exceeds this size.
syslog_facility 'LOCAL0' syslog facility when syslog is enabled.
syslog_ident 'postgres' syslog indent when syslog is enabled
custom_variable_classes - Required for PostgreSQL 9.1 or earlier. The value must be "pg_statsinfo." if required.
pg_statsinfo.textlog_min_messages warning Minimum message level for textlog (*1).
pg_statsinfo.syslog_min_messages disable Minimum message level for syslog (*1).
pg_statsinfo.textlog_filename 'pg_statsinfo.log' Textlog filename. Should not be empty.
pg_statsinfo.textlog_line_prefix '%t %p ' A printf-style string that is output at the beginning of each textlog line. (*2)
pg_statsinfo.syslog_line_prefix '%t %p ' A printf-style string that is output at the beginning of each syslog line. (*2) Note that timestamp and process ID in syslog are them of pg_statsinfo daemon, not of original ones. You need to add %t and %p to preserve the original values.
pg_statsinfo.textlog_permission 0600 Permission mode for textlog file.
pg_statsinfo.textlog_nologging_users - Exclude log lines of these users separated by commas from text log .
pg_statsinfo.repolog_min_messages warning Minimum message levels for repository log (*1).
pg_statsinfo.repolog_nologging_users - Exclude log lines of these users separated by commas from repository log.
pg_statsinfo.repolog_buffer 10000 Since repository logs are sent to repository every 10 seconds by default, pg_statsinfo needs to buffer logs for the intervals. Additionaly, this buffer is expected to absorb a transient burst of log entries which might retard storing them. Log entries which are run over this buffer are simply dropped off.
pg_statsinfo.repolog_interval 10s Repository logs are written at intervals of this value.
pg_statsinfo.sampling_interval 5s Sampling is a process collecting some additional informations such like session states several times for a snapshot interval. This value should be a fraction of snapshot interval (*3)
pg_statsinfo.snapshot_interval 10min snapshot interval (*3)
pg_statsinfo.excluded_dbnames 'template0, template1' Exclude databases listed here from monitoring.
pg_statsinfo.excluded_schemas 'pg_catalog, pg_toast, information_schema' Exclude schemas listed here from monitoring.
pg_statsinfo.repository_server 'dbname=postgres' Connection string to connect the repository (*4). Password prompt must be avoided.
pg_statsinfo.adjust_log_level off Enables or disables log level altering feature.
pg_statsinfo.adjust_log_info - Change loglevel into INFO for logged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.adjust_log_notice - Change loglevel into NOTICE for logged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.adjust_log_warning - Change loglevel into WARNING for logged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.adjust_log_error - Change loglevel into ERROR for loged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.adjust_log_log - Change loglevel into LOG for logged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.adjust_log_fatal - Change loglevel into FATAL for logged entries with SQLSTATEs listed here. (*5)
pg_statsinfo.enable_maintenance 'on' Enable or disable auto maintenance features. Multiple items should be comma separated.
  • 'snapshot': Enables only snapshot maintenance.
  • 'repolog': Enables only repository log maintenance.
  • 'log': Enables only log file maintenance.
  • 'snapshot, log': Enables both snapshot and log file maintenance.
  • 'on': Enables all maintenance features.
  • 'off': Disables everything.
pg_statsinfo.maintenance_time '00:02:00' Time to do automatic maintenance.
pg_statsinfo.repository_keepday 7 Snapshots are preserved for this period.
pg_statsinfo.repolog_keepday 7 Repository logs are preserved for this period.
pg_statsinfo.log_maintenance_command <PGHOME>/bin/archive_pglog.sh %l Command path to be executed to do log file maintenance. Default value is a ready-made shell script which archives old log files into compressed files then removes them.
pg_statsinfo.long_lock_threashold 30s Time to wait before record prolonged locks.
pg_statsinfo.stat_statements_max 30 Maximum number of entries of pg_stat_statements to be recorded for every snapshot.
pg_statsinfo.stat_statements_exclude_users - Exclude queries of these users separated by commas from snapshots.
pg_statsinfo.controlfile_fsync_interval 1min Interval to sync pg_statsinfo's control file.
*1 : Message Levels
The following values are available for a message level. Messages with the specified level or more seviverer level are recorded in the logs. "disable" discards all log entries. Severity order is the same as log_min_messages except additional levels "disable", "alert" and "debug". That is,
disable > alert > panic > fatal > log > error > warning > notice > info > debug

*2 : Prefix Format
Same format with configuration parameter log_line_prefix. Note that log_line_prefix itself is ignored when pg_statsinfo is enabled.
*3 : Time Format
Available units are d (day), h (hour), min (minute), and s (second). If not suffixed by any unit, the value will be interpreted in seconds.
*4 : Connection String
It is a libpq-style connection string. For example, 'hostaddr=127.0.0.1 port=5432 dbname=mydb user=postgres'. See also PQconnectdb in "Database Connection Control Functions" for details. In addition, environment variables used by libpq will be in effect.
Password prompt should be avoided during connectiion to the repository. Use .pgpass file for providing passwords if needed.
*5 : SQLSTATE
An SQLSTATE is an ID looks like '42P01' and '42P02' included in every log entries. If you want to change log levels corresponding to the SQLSTATEs of 42P01 or 42P02 into INFO, set pg_statsinfo.adjust_log_info to '42P01,42P02'.

Reloading configuration for pg_statsinfo

pg_statsinfo shares the PostgreSQL's postgresql.conf as its configuration file. Simply reloading configuration on PostgreSQL where pg_statsinfo resides on also reloads its configurations.

$ pg_ctl reload

Abnormal termination of pg_statsinfo

pg_statsinfo doesn't involve PostgreSQL in its crash but it won't restart automatically. Manually start the agent will help.

Uninstallation

When you uninstall pg_statsinfo, you have to restart PostgreSQL server after removing 'pg_statsinfo' from shared_preload_libraries and all of pg_statsinfo.* parameters in postgresql.conf.

After that, drop all objects created for pg_statsinfo in the monitored instance. $PGSHARE/contrib/uninstall_pg_statsinfo.sql will do it. The following command will remove such objects for 'postgres' database.

$ psql -d postgres -f $PGSHARE/contrib/uninstall_pg_statsinfo.sql 

Repository database is uninstalled by running the script uninstall_pg_statsrepo.sql. Make sure that no other monitored instances are using the same repository database.

$ psql -d <repository> -f $PGSHARE/contrib/uninstall_pg_statsrepo.sql 

Restrictions

There are still some restrictions and limitations in pg_statsinfo.

Character encoding and lc_messages have to be the same among all databases on one monitored instance.
pg_statsinfo supports encodings and message locales that PostgreSQL supports, but all of database in an instance must use the same encoding and locale because pg_statsinfo would fail in parsing logs with mixed-encodings.
Restrictions for log_filename
pg_statsinfo expects log_filename that alphabetical sort makes them ordered in creation time.
Restrictions for pg_statsinfo.textlog_filename
This is mandatory and cannot have variable part.
Unavoidable error messages might occur in fast or immediate shutdown
Fast or immediate shutdown breaks pg_statsninfo's own connection unexpectedly. It does no harm in spite of some error messages.
Shutdown checkpoint cannot be recorded in repository if it shares monitored instance.
Shutodwn checkpoint cannot be recorded in this configuration because the repository database has been shutdown before pg_statsinfo detects it.
Simultaneous snapshot and maintenance would fail.
Repository maintenance takes exclusive locks which prevents snapshots from being recorded. The failed snapshot registering will be retried soon.
Limitation on multi-instance configuration
Monitored instance are distinguished by the combination of database system identifier, host name and waiting port number, so pg_statsinfo get confused if any two instances share the same values for them.
Note about auto maintenance with shared repository.
As described above, shortest setting will be in effect finally if multiple monitored instances sharing one repository database have different preserving periods for auto maintenance.
Limitaion for WAL statistics.
WAL amount is not available on standby server so you will see WAL statistics only on master side.
Rounding off of sub-integer values
Values represented in integer format may be rounded off to be zero for sub integer values.
Logs accumulated in repository might be retarded.
Manually deleteion of CSV log when repository log accumulation is heavily retarded by some reason could cause loss of log entries in the deleted CSV log file.
Connection to repository as a non-superuser.
If you want to connect to repository database as a user other than superuser, make sure that the user has required privileges. The recommended way to accomplish this is creating repository database specifying its owner to be that user. Following steps would do that.
$ createuser -DRSl -U <superuser> <connection user>
$ createdb -U <superuser> -O <connection user> <repository database>
log_timezone that differs from system timezone
pg_statsinfo uses system timezone which is specified by LC_TIME or other means as its working timezone. If it is different from log_timezone, log entries that pg_statsinfo generates have seemingly shifted timestamp in text log file, and log entries in repository will be sorted in wrong order.
Summertime and switching timezone
Switching of summertime and switching timezone to another one possibly rewinds the time representation, that is, name of the new log file. Note that pg_statsinfo always processes the CSV log file with the 'largest' name so the new log file with the 'older' name will be skipped and its contents will disappear.
timezone_abbreviations
Some abbreviations for timezone name duplicates among multiple timezones, say, CST is the abbreviation for '(American) Central Standard Time', 'Cuba Central Standard Time', 'Chinese Standard Time' and so.. pg_statsinfo depends on the abbreviation mappings on the repository database when interpreting the abbreviated timezone in the log entries to be stored so make sure that the abbreviation for the log_timezone on the monitored database is correctly understood by the repository server. You will see the instruction about timezone abbreviation setup here.

Q&A

Q1. How can I check if pg_statsinfo is working fine?

Take manual snapshot, then confirm that the snapshot be stored in repository. The following steps will do.

$psql -d postgres -c "SELECT statsinfo.snapshot('test')"
$psql -d postgres -c "SELECT * FROM statsrepo.snapshot WHERE COMMENT = 'test'"

Q2. How to utilize snapshots in repository?

The snapshots in repository can be of course inspected using sql queries but it needs too much labor for most cases. You can use pg_statsinfo's command line reporting feature to see them as simple reports in text format. For those who wants to see them in graphically interface, pg_stats_reporter provides sortable tables with pager and manipulative graphs. An operable sample report of pg_repoter is available here.

Q3. Auto maintenance seems not cleaning up snapshots.

Perhaps it might be misconfigured. Make sure your configuration is correct. Above all, enable_maintenance should be 'on' or include 'snapshot'.

Q4. How can I confirm that alert function is working?

Set commit alert threshold to 0, then taking a snapshot manually should give you a commit count alert message into server log.

# UPDATE statsrepo.alert SET enable_alert = true, commit_tps = 0;

Don't forget to restore the setting as before.

Q5. How can I upgrade statsinfo from 2.x to 3.x in place?

Sorry but you can't. You should uninstall older version first, then install new one. Furthurmore, the repository database schema of 3.0 is not compatible with 2.x. Please uninstall the old repository, and drop its schema before running new version.

Q6. Command line reporting won't give me anything.

Two or more snapshots in repository are needed to generate a report. If there wasn't sufficient snapshot for the time, you will get a report after some waiting.

Q7. Some report items seem to be missing in commind line reporting.

Some kinds of report needs additional settings and others have some restrictions.

Query Activity (Functions)
Needs track_functions to be set properly in postgresql.conf.
Make sure it is set to other than 'none'.
Query Activity (Statements)
Needs pg_stat_statement to be installed.
Autovacuum Activity
Needs log_autovacuum_min_duration to be non-negative value in postgresql.conf.
Checkpoint Activity
Needs log_checkpoints to be 'on' in postgresql.conf.
OS Resource Usage (IO Usage)
pg_statsinfo reads /proc/diskstats to get the information for device informations so this item doesn't contain the information of NFS mounted devices.
Long Transactions
Skipped if no long transaction information found in the snapshots for the poriod.
Notable Tables
Skipped if no information for such tables found in the snapshots for the period.
Lock Conflicts
Skipped if no lock conflict information found in the snapshots for the period.
Replication Activity
Skipped if no replication activity information found in the snapshots for the period.
Schema Information
Make sure the settings so that all target instances can be connected without prompting for password. See here for details.

Changes from pg_statsinfo 2.5

Following changes have been made after pg_statsinfo 2.5.


Detailed information

More advanced usages and internal structures are explained in the section.

Sharing one repository database among more than one monitored system.

As described above, multiple monitored instances can share single repository database. In order to build such configuration, the repository server should accept connections from the monitored instances without password prompts and the combination of PostgreSQL's system identifier, node name given by uname(2) and PostgreSQL's lisnen port number should be different from any other monitored instance.

Going good along with warm standby

pg_statsinfo can work good with warm standby. There are two typical configuration that can work with warm standby mode. See "pg_statsinfo: warm-standby" for details for each configuration.

  1. Two servers having common repositories outside the cluster.
  2. Two servers having their own repositories.

Fall-Back Mode

pg_statsinfo checks for the validity of repository database at startup and reloading configuration file and it enters fall-back mode if any problem found in the repository. Fall-back mode of pg_statsinfo is a running state that enbales only the functions available without repository access.

Followings are the lists of what is checked for in repository sanity check, and what is disabled on fall-back mode.

Check items for repository sanity check

Turned-off functions in fallback mode

These functions disabled in fallback mode resume working after restoration. All snapshots, alerts during fallback mode are lost forever. On the contrary, log accumulation will be continued at the point of fallback.

Recoverying from fallback mode

pg_statsinfo recovers automatically from fallback mode due to connection error. Otherwise, reloading configuration after removing the cause is needed to recover from fallback.

Usually you can see the cause in log file.
Followings are the common error messages for fallback.

# Connection failure
ERROR:  pg_statsinfo: could not connect to database with "host=192.168.0.1 user=postgres": timeout expired
LOG:  pg_statsinfo: pg_statsinfo is starting in fallback mode

# Wrong version number of repository database schema
ERROR:  pg_statsinfo: incompatible statsrepo schema: version mismatch
LOG:  pg_statsinfo: pg_statsinfo is starting in fallback mode

# XML feature unavailable
ERROR:  pg_statsinfo: repository server must support XML feature. you need to rebuild PostgreSQL using "./configure --with-libxml"
LOG:  pg_statsinfo: pg_statsinfo is starting in fallback mode

The ways to revoer from common issues are shown below.

Connection failure to repository database
Version number mismatch of repository schema
Missing XML feature

Monitoring using SystemTap (Experimental)

Systemtap is available to monitor PostgreSQL on RHEL6, Fedora13 and later.

Prerequisites

Preparation

After the above steps done, pg_statsinfo takes snapshots including Systemtap information.

Note

This is an experimental feature requires --enable-debug and puts some burdain on running server so please refrain from using this on commercial systems.

Internals

pg_statsinfo consists of a library loaded on PostgreSQL and a daemon process. Since the daemon is executed implicitly by the library at server startup, users don't have to execute the daemon explicitly. See also "pg_statsinfo: internal" for details.

See Also

pg_ctl, psql, Server Configuration, The Statistics Collector, System Catalogs, pg_stat_statements, pg_stats_reporter