PostgreSQL Logical Replication

PostgreSQL is one of the most advanced open source databases with a lot of great features. Replication of the data is one of them. There are two types of data replication methods possible in the Postgresql.

  1. Physical replication mechanism

In the physical replication, data of a cluster is copied from the server to another standby system. Physical replication transfers the write-ahead log to the standby server.

Standby system is used in three ways:

  1. Hot standby – keeps standby database updated with primary and is used to execute read data queries
  2. Wet standby – keeps standby database updated and it does not let clients connect to it
  3. Cold standby – keeps standby database updated and it does not start until a primary server has failed
  4. Logical replication mechanism
https://lh3.googleusercontent.com/BdtD_QqTQFfsM-pZAvDYhMK4JaRc3XM8i55chjwocoUdsORykQYVvleVjN95fGhk4EDk1JY2dbBmJH2druXA03czSTwLF3or_vLIcNGe6Fsbal5_Ch5SorKyqd6RzCKqXevMhFkX

Logical replication is a method of replicating data objects and their changes, based upon their replication identity or primary key. 

In the logical replication, write-ahead logs are not physically transferred to the standby server. Logical replication allows replicating at the level of tables, columns or rows. It allows fine-grained control over both data replication and security. 

Logical replication uses a publish-and-subscribe model. It can consolidate multiple databases into a single database. There can be multiple publishers which got subscribed by a single subscriber. There can be multiple subscribers to create multiple standby databases. 

Snapshots are used to get data from the publisher and send to the subscriber. In this way, real-time replication can be achieved. In logical replication, you need to create a Publication in the primary database (called as ‘publisher’) and create a subscription in the standby database (called as ‘subscriber’).

Publication

By default, all operation types are replicated but publications can choose to limit the changes into the replicated data.  Any combination of INSERT, UPDATE, DELETE, and TRUNCATE are published. While creating the publication, we need to define the key for a table called ‘replica identity’.

This table is used while updating and deleting rows from the subscribers’ table, and by default, it is the primary key of the table. If the table does not have a unique or a primary key, the complete row will be considered as the replica identity.

CREATE PUBLICATION command is used to create a publication. Publication can be dropped using the command ‘DROP PUBLICATION’. Publication can be altered by using the ‘ALTER PUBLICATION’ command.

Subscription

The Subscription is added using the ‘ CREATE SUBSCRIPTION’ command and it defines the connection to another database and a set of publications (one or more) to which it wants to subscribe.

The subscriber database behaves in the same way as any other PostgreSQL instance and can be used as a publisher for other databases by defining its own publications.

Subscription can be stopped/resumed at any time by using the ALTER SUBSCRIPTION command and removed by using the  DROP SUBSCRIPTION command.

The schema definitions statements executed in the publisher are not replicated, and the published tables must exist on the subscriber. The tables are matched between the publisher and the subscriber using the fully qualified table name. The columns are also matched by name. The order of columns in the subscriber table does not need to match that of the publisher.

Logical replication behaves similarly to normal DML operations in that the data will be updated even if it was changed locally on the subscriber node. If incoming data violates any constraints, the replication will stop and this is called as ‘conflict’ which needs to be resolved manually.

When is logical replication useful?

It is very important to know when to use the Logical Replication, otherwise, you will not get much benefit, if your use case does not match. So, here are some use cases on when to use Logical Replication:

  1. If you want to consolidate multiple databases into a single database for analytical purposes.
  2. If your requirement is to replicate data between different major versions of PostgreSQL.
  3. If you want to send incremental changes in a single database or a subset of a database to other databases.
  4. If giving access to replicated data to different groups of users.
  5. If sharing a subset of the database between multiple databases.

Conclusion

Logical replication is very simple to implement and you can easily create multiple standby database backups with real-time data. This also helps to improve the performance of the application where continuous reading and writing to the database is required.

Written by: Nagesh Kshirsagar
Edited by: Thailambal PN

You may also like