Introduction to the pg_dump command

pg_dump is a PostgreSQL backup utility. The pg_dump command dumps a single PostgreSQL database into a script or archive file, and the backup is consistent even if the database is in use while the dump runs.

Forbes cites leveraging data as the key to smarter business decisions. As a result, more companies are relying on data to drive innovation and competitiveness.

Protecting that information is critical to ensuring leaders have reliable and updated data to inform their decision-making. The pg_dump utility is a PostgreSQL tool for backing up PostgreSQL databases.

In this tutorial, we’ll discuss the basics of the tool and how to use it to start backing up your databases.  

How To Use pg_dump

The pg_dump utility is a command-line tool to export the data from a PostgreSQL database into a file. It can extract a whole database or selected objects such as schemas, individual tables, and views. Cluster-wide objects such as roles and tablespaces are not part of a pg_dump export; pg_dumpall handles those.

The tool needs read access to every object it dumps. In practice, a full database backup is usually run as the PostgreSQL superuser, while an ordinary user can dump the tables that user is allowed to read. The utility does not block other processes during the backup: other sessions can keep reading and writing while pg_dump works from a consistent snapshot taken when the dump started.

The resulting file is useful in several ways:

  • Restore the database to its state at the time of the dump for disaster recovery
  • Create database archives for long-term retention
  • Use the data for a test and development environment
  • Meet requirements for compliance audits
  • Transfer data between databases

Using the psql tool with pg_dump

psql is a command-line interface to PostgreSQL. It allows you to run SQL commands to query the database, change data, explore schema, modify database objects, bulk load data, and more.

pg_dump itself only exports. It can write a dump to a single file or, with the directory format, to a specified directory. Loading a dump back into a database is the job of psql (for plain-text SQL dumps) or pg_restore (for archive dumps).

Administrators use psql to restore the plain-text dump files that pg_dump creates. The syntax for the psql command is as follows:

psql [option...] [dbname [username]]
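
To make the link between psql and pg_dump concrete, here is a minimal sketch of loading a plain-text dump with psql. The function name restore_plain and its arguments are assumptions for illustration; the psql options (-X, --set ON_ERROR_STOP=on, -d, -f) are standard ones.

```shell
# Hypothetical helper: load the plain-text SQL dump $2 into database $1.
# The target database must already exist unless the dump was made with -C.
restore_plain() {
    db=$1
    script=$2
    # -X skips ~/.psqlrc; ON_ERROR_STOP makes psql stop at the first error
    # instead of continuing with a half-restored database.
    psql -X --set ON_ERROR_STOP=on -d "$db" -f "$script"
}
```

For example, restore_plain testdatabase backup.sql would replay backup.sql against testdatabase.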

Supported backup types

pg_dump always takes what is commonly called a hot backup. It does not perform cold backups.

  • Hot: This is the only mode pg_dump supports. A hot backup captures the data as it exists in the database when the backup is started. While the data is being written to the dump file, the database remains operational.
  • Cold: A cold backup stops the database server and copies its data files at the file-system level. While the copy is running, the database cannot be used. This is done with operating system tools, not with pg_dump.

These backup terms are often used around pg_dump, so it helps to know how each one applies to the tool:

  • Full: By default, pg_dump backs up one entire database, including its schema and data. It does not include server configuration files or cluster-wide objects such as roles and tablespaces.
  • Incremental: pg_dump does not support incremental backups. Every run produces a complete dump. Backing up only what changed since an earlier backup requires a base backup combined with continuous archiving of the write-ahead log (WAL).
  • Logical: Every pg_dump backup is a logical backup. It contains the SQL commands or archived data needed to recreate the database objects, not a copy of the data files or the transaction log.
  • Archive: The custom, directory, and tar output formats are archive formats that are restored with pg_restore. They do not require Postgres to be run with WAL archiving enabled.

Backing up PostgreSQL databases using pg_dump

The output of the utility is either a plain-text SQL script that recreates the database when replayed, or an archive file that pg_restore uses to restore the data at a later time.

The utility supports saving dumps in four formats, selected with the -F option: plain text SQL (p, the default); custom (c), which is compressed by default; directory (d), which writes one file per table into a directory; and tar (t).

The custom and directory formats are the most flexible choices if you have special requirements, because pg_restore can select and reorder the items to restore from them.
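
The format choice can be sketched as a small wrapper around the -F option. The function name dump_as and the file names in the usage note are assumptions made for this example; the format letters and the -f option are pg_dump's own.

```shell
# Hypothetical helper: dump database $1 in format $2 to path $3.
# Format letters are the ones pg_dump accepts for -F:
#   p = plain SQL script, c = custom archive, d = directory, t = tar
dump_as() {
    db=$1
    format=$2
    out=$3
    case $format in
        p|c|d|t) ;;
        *) echo "unknown format: $format" >&2; return 2 ;;
    esac
    # -f names the output file (or the output directory for format d).
    pg_dump -F "$format" -f "$out" "$db"
}
```

For example, dump_as postgres c backup.dump would write a custom-format archive for pg_restore, while dump_as postgres p backup.sql would write a plain SQL script for psql.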

How to back up a Single Database using the pg_dump Command

A standard command will follow this format:

pg_dump [connection-option…] [option…] [dbname]

Example:

The following command shows how to perform a database backup on localhost.

pg_dump -h localhost -p 5432 -U postgres postgres > backup.sql

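The connection-option values specify the hostname, port, and user used to connect to the database. The password is not passed as an option; pg_dump prompts for it when the server requires one, or reads it from the PGPASSWORD environment variable or a ~/.pgpass file.

The dbname is the name of the database you want to dump. If omitted, pg_dump uses the PGDATABASE environment variable or, if that is not set, the user name specified for the connection. pg_dump never dumps more than one database per run.

The -U switch specifies the user name to connect as.

The -h switch specifies the host name of the server. The -p switch specifies the PostgreSQL port number.

The > is shell redirection. It sends the dump, which pg_dump writes to standard output by default, to the backup file. You can specify an absolute path or a relative path.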

The tool has several additional parameters useful when creating a database backup. 
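
As an illustration of those additional parameters, the sketch below wraps three commonly used ones: -s (schema only), -a (data only), and -t (a single table). The function name dump_part and the "table:NAME" convention are assumptions for this example, not part of pg_dump.

```shell
# Hypothetical helper: dump only part of database $1 into file $3.
# $2 selects the part:
#   schema      -> -s, object definitions without rows
#   data        -> -a, rows without object definitions
#   table:NAME  -> -t NAME, one table only
dump_part() {
    db=$1
    part=$2
    out=$3
    case $part in
        schema)  pg_dump -s -f "$out" "$db" ;;
        data)    pg_dump -a -f "$out" "$db" ;;
        table:*) pg_dump -t "${part#table:}" -f "$out" "$db" ;;
        *) echo "unknown part: $part" >&2; return 2 ;;
    esac
}
```

For example, dump_part postgres schema schema.sql would save only the table and view definitions, which is handy for setting up an empty test database.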

How to use pg_dump for remote databases

The command for a remote server is as follows: 

pg_dump -h <host_address> -p <port> -U <username> -d <database>

<host_address> is the IP address or hostname of the server

<port> is the port number

<username> is the database user’s name

<database> is the database name

Example: 

pg_dump -h 192.168.1.10 -p 5432 -U postgres -d testdatabase > /path/to/dump.dmp

The above command creates a plain text dump file at the location /path/to/dump.dmp. It passes port 5432 explicitly. Because 5432 is also the usual default PostgreSQL port, -p can be omitted when the remote server listens there.
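
Remote dumps are often run unattended, where nobody can type a password. One common approach, sketched here, is a ~/.pgpass file combined with the -w option, which tells pg_dump never to prompt. The function names pgpass_entry and remote_dump are assumptions for this example; the field order of the entry is the one PostgreSQL defines.

```shell
# Hypothetical helper: print one ~/.pgpass entry.
# PostgreSQL expects the fields as host:port:database:user:password.
# A literal ":" or "\" inside a field would need a backslash escape,
# which this sketch does not handle.
pgpass_entry() {
    printf '%s:%s:%s:%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Hypothetical helper: dump a remote database without prompting.
# Arguments: host, port, user, database, output file.
remote_dump() {
    # -w fails instead of prompting if no password is available.
    pg_dump -h "$1" -p "$2" -U "$3" -d "$4" -w -f "$5"
}
```

The entry would be appended to ~/.pgpass, and the file needs permissions of 0600 (chmod 600 ~/.pgpass) or PostgreSQL ignores it.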

Using pg_dumpall to back up all databases

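pg_dumpall writes all databases in a cluster into a single plain-text SQL script. Unlike pg_dump, it also includes global objects that are shared across databases, such as roles and tablespaces. It does not create a separate dump file for each database; the output goes to standard output or to the one file named with -f.

Example

An example pg_dumpall command is:

pg_dumpall -U <username> -h <host_address> -f /my/backup/dir/dumpfile.dmp

<username> is the database user’s name, usually a superuser because pg_dumpall reads every database

<host_address> is the IP address or hostname of the server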
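
pg_dumpall writes one SQL script for the whole cluster. If you would rather have one archive per database, a common pattern is to save only the global objects with pg_dumpall --globals-only and then run pg_dump for each database. The sketch below assumes a function name (backup_cluster), output file names, and a connection through the postgres database; none of these come from PostgreSQL itself.

```shell
# Hypothetical helper: back up a whole cluster into directory $1 as
# globals.sql (roles, tablespaces) plus one custom-format archive per database.
backup_cluster() {
    dir=$1
    pg_dumpall --globals-only -f "$dir/globals.sql" || return 1
    # -At prints bare database names, one per line.
    # Note: this simple loop breaks on database names containing whitespace.
    for db in $(psql -At -c "SELECT datname FROM pg_database WHERE NOT datistemplate AND datallowconn" postgres); do
        pg_dump -Fc -f "$dir/$db.dump" "$db" || return 1
    done
}
```

Connection options such as -h and -U are left out here; they can be supplied through the PGHOST and PGUSER environment variables.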

How to restore databases 

The pg_restore utility restores the contents of an archive dump file (custom, directory, or tar format) to a PostgreSQL server. Plain-text SQL dumps are restored with psql instead. The pg_restore utility does not require the database to be shut down while restoring the data. The syntax of pg_restore is as follows:

pg_restore [connection-option…] [option…] [filename]

If the filename is omitted, pg_restore reads the archive from standard input. Commonly used options are as follows:

-U, --username=username The user name to connect as.

-f, --file=filename The output file for the generated SQL script, or for the listing when used with -l.

-d, --dbname=dbname Connects to the named database and restores directly into it.

-t, --table=table Restores only the named table. If no selection option is provided, everything in the archive is restored.

Example:

pg_restore -v -d testdatabase /home/postgres/data
pg_restore -v -f restore.sql /home/postgres/data

The first command restores the archive at /home/postgres/data into the existing database testdatabase with verbose output. The second writes the equivalent SQL script to restore.sql instead of connecting to a database.
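
Two pg_restore tasks come up often enough to be worth a sketch: restoring over a database that already contains older copies of the objects, and checking what an archive holds before restoring it. The function names restore_archive and list_archive are assumptions for this example; --clean, --if-exists, -d, and -l are pg_restore options.

```shell
# Hypothetical helper: restore archive $2 into the existing database $1,
# dropping any objects that are already there before recreating them.
restore_archive() {
    db=$1
    archive=$2
    # --clean drops objects first; --if-exists avoids errors for objects
    # that are not present in the target database yet.
    pg_restore --clean --if-exists -d "$db" "$archive"
}

# Hypothetical helper: print the table of contents of archive $1
# without restoring anything.
list_archive() {
    pg_restore -l "$1"
}
```

Running list_archive first is a cheap way to confirm that a dump file is readable and contains the tables you expect.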

Restoring all databases from a pg_dumpall file

PostgreSQL does not ship a pg_restoreall command. Because pg_dumpall writes a plain-text SQL script, you restore all the data from it with psql. Two points are worth knowing.

  • Full: The script restores everything, including roles, tablespaces, and the structure and data of every database.
  • Incremental: Not available. The script always restores a complete copy, not only the data changed since an earlier backup.

The syntax is as follows:

psql [connection-option…] -f filename dbname

The dbname is only the database used for the initial connection, typically postgres.

For example:

psql -U postgres -f /home/postgres/data/db1 postgres

Adding pg_dump to your data management strategy

The pg_dump tool is invaluable to a data management strategy. From data recovery to audit compliance, pg_dump is the key to reliable data backup.

The tool is highly configurable to empower you to save and restore information at any level of granularity within your system. 

If adding pg_dump to your data backup plan is on your roadmap, our team of data experts can help.

Published on
July 4, 2022
