Engine Enable HA

Overview

Master Mirror

When the Master node is enabled with high availability, there are primary and standby instances. Clients can only connect to the primary master and execute commands on it, while the standby master maintains data consistency with the primary master through Write-Ahead Log (WAL) streaming replication. When the primary master fails, the standby master does not automatically switch to become the primary master. Administrators can switch the standby master to become the new primary master by running the gpactivatestandby tool. For detailed information, please refer to Overview of Master Mirroring.

Segment Mirror

The engine database stores data in multiple segment instances, each of which is a PostgreSQL instance. Data is distributed across segment nodes according to the distribution strategy defined in the table creation statement. If high availability is not enabled, the database must be manually recovered after a segment node fails before it can be started.

When high availability is enabled, each segment has a secondary node, referred to as a mirror node. Each segment instance consists of a pair of primary and mirror, with the mirror segment maintaining data consistency with the primary segment through streaming replication based on write-ahead logs (WAL).

For more detailed information, please refer to Overview of Segment Mirroring.

Enable HA

This section describes how to enable HA. The example assumes that HENGSHI SENSE is installed in the /opt/hengshi directory on a host named host1, that it runs as the user hengshi, that the engine has 2 segment instances, and that HA is not yet enabled. A second host, named host2, serves as the mirror host.

Install and Initialize the Mirror Host

If HENGSHI SENSE is already installed on the mirror host, you can proceed directly to the next step: Enable HA for Segment.

  1. Create the user that runs the product, named hengshi in the example. This step requires root privileges.

    bash
    grep hengshi /etc/passwd > /dev/null || sudo useradd -m hengshi
  2. Configure passwordless SSH login between host1 and host2.

  3. Create the installation path /opt/hengshi.

    bash
    sudo mkdir -p /opt/hengshi && sudo chown hengshi:hengshi /opt/hengshi
  4. Install HENGSHI SENSE.

    bash
    sudo su - hengshi             # Switch to the product running user
    cd ~/pkgs/hengshi-sense-[version]           # Switch to the extraction target directory
    ./hs_install -p /opt/hengshi    # Execute the installation
  5. Initialize the OS with sudo privileges.

    bash
    sudo su - hengshi             # Switch to the product running user
    cd /opt/hengshi                 # Enter the installation target directory
    bin/hengshi-sense-bin init-os all  # Initialize the OS

Note

The HENGSHI SENSE service does not need to be started here.
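
Step 2 above does not list commands. As a minimal sketch, assuming the OpenSSH client tools (ssh-keygen, ssh-copy-id) are available and the hengshi user exists on both hosts, passwordless login could be set up roughly as follows. The helper names run and setup_passwordless, the DRY_RUN switch, and the default key path are illustrative assumptions, not part of the product.

```shell
# Sketch only: key-based SSH from the current host to a peer host.
# KEY_FILE is an assumed default; change it if your key lives elsewhere.
KEY_FILE="${KEY_FILE:-$HOME/.ssh/id_rsa}"

# run: execute a command, or only print it when DRY_RUN=1.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# setup_passwordless USER HOST
setup_passwordless() {
  local user="$1" host="$2"
  # Create a key pair with an empty passphrase only if none exists yet.
  if [ ! -f "$KEY_FILE" ]; then
    run ssh-keygen -t rsa -b 4096 -N "" -f "$KEY_FILE"
  fi
  # Install the public key on the peer (asks for the password once).
  run ssh-copy-id -i "$KEY_FILE.pub" "$user@$host"
  # Verify: BatchMode makes ssh fail instead of prompting for a password.
  run ssh -o BatchMode=yes "$user@$host" true
}
```

Running `DRY_RUN=1 setup_passwordless hengshi host2` on host1 prints the commands without executing them. Repeat on host2 with host1 as the target so that login works in both directions.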

Enable HA for Segment

Mirror segment instances can be deployed in different ways across cluster hosts based on different configurations.

  • Group method, the default deployment method. All mirrors of the primary segments on one host are placed together on one other host. When a host fails, the number of active segments on the host that holds its mirrors doubles.
  • Spread method. The mirrors of the primary segments on one host are distributed across different hosts, so that at most one mirror on each host is promoted to primary when a single host fails. This prevents a sudden increase in load on any one of the remaining hosts. Spread mirroring requires the number of cluster hosts to be greater than the number of segments on each host.

Detailed instructions for both deployment methods can be found in Overview of Segment Mirroring. This article covers group mirroring only and explains how to enable HA with the gpaddmirrors tool. The steps are as follows.

  1. Create the configuration file. The format of the configuration file is:
text
contentID|address|port|data_dir

Field Description:

  • contentID: The content ID of the mirror segment, which is the same as the content ID of its primary segment. For more details, refer to the content column in the gp_segment_configuration reference.
  • address: The hostname or IP of the node.
  • port: The listening port of the mirror segment, incremented based on the port base of the existing node.
  • data_dir: The data directory of the mirror segment.

The following example assumes the configuration file is named mirrors.txt and has this content:

0|host2|26432|/opt/hengshi/engine-cluster/mirror/SegDataDir0
1|host2|26433|/opt/hengshi/engine-cluster/mirror/SegDataDir1
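
For clusters with more segments, writing these lines by hand is error-prone. The sketch below generates them, assuming all mirrors go to a single host, content IDs are contiguous starting at 0, and ports count up from a base port, as in the example above. The function name gen_mirror_config is hypothetical. Check the actual content IDs in gp_segment_configuration before using the result.

```shell
# gen_mirror_config MIRROR_HOST BASE_PORT DATA_DIR_PREFIX SEGMENT_COUNT
# Prints one "contentID|address|port|data_dir" line per segment,
# with content IDs starting at 0 and ports counting up from BASE_PORT.
gen_mirror_config() {
  local host="$1" base_port="$2" dir_prefix="$3" count="$4"
  local id=0
  while [ "$id" -lt "$count" ]; do
    printf '%s|%s|%s|%s%s\n' "$id" "$host" "$((base_port + id))" "$dir_prefix" "$id"
    id=$((id + 1))
  done
}
```

Calling `gen_mirror_config host2 26432 /opt/hengshi/engine-cluster/mirror/SegDataDir 2 > mirrors.txt` reproduces the two-line example file.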
  2. Run gpaddmirrors to enable segment mirroring.
bash
source /opt/hengshi/engine-cluster/export-cluster.sh
gpaddmirrors -a -i mirrors.txt
  3. After the mirrors are added successfully, the following messages are displayed.
text
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Process results...
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-******************************************************************
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Mirror segments have been added; data synchronization is in progress.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Data synchronization will continue in the background.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-Use gpstate -s to check the resynchronization progress.
20200313:15:52:54:007684 gpaddmirrors:host1:hengshi-[INFO]:-******************************************************************
  4. Verify the synchronization status of the mirrors. Run gpstate -m. Output like the following indicates that the mirrors have been synchronized.
text
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-Starting gpstate with args: -m
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.2.1 build dev'
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.2.1 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit compiled on Dec 23 2019 17:10:46'
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-Obtaining Segment details from master...
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--Current GPDB mirror list and status
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--Type = Group
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   Mirror   Datadir                                        Port    Status    Data Status
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   host2     /opt/hengshi/engine-cluster/mirror/SegDataDir0   26432   Passive   Synchronized
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:-   host2     /opt/hengshi/engine-cluster/mirror/SegDataDir1   26433   Passive   Synchronized
20200313:15:54:12:007841 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------

Tip

  1. The status during verification may be Failed, for example because data is still being synchronized or the mirror has only just started. Check again later.
  2. Synchronizing a large amount of segment data puts significant load on Greenplum, so enable high availability during periods of low business load.
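
Because the status can temporarily be Failed, it can help to check the gpstate -m output with a script rather than by eye. The sketch below assumes the row layout shown in the sample output above (port in the fifth whitespace-separated field, data status last). The function name all_mirrors_synchronized is hypothetical, and other Greenplum versions may format this output differently.

```shell
# all_mirrors_synchronized: reads `gpstate -m` output on stdin.
# Exits 0 only if at least one mirror row is present and every
# mirror row ends with "Synchronized".
all_mirrors_synchronized() {
  awk '
    # Mirror rows are the only lines whose fifth field is a port number.
    $5 ~ /^[0-9]+$/ {
      rows++
      if ($NF != "Synchronized") bad++
    }
    END { exit !(rows > 0 && bad == 0) }
  '
}
```

A typical call is `gpstate -m | all_mirrors_synchronized && echo "mirrors in sync"`.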

Enable HA for Master

Enabling HA for the master is relatively simple: the gpinitstandby command is all that is needed.

  1. Enable the standby master.

    Run gpinitstandby to create the standby master, as in the example below.

bash
gpinitstandby -s host2

Output like the following indicates that the operation succeeded.

text
20200313:16:09:55:008076 gpinitstandby:host1:hengshi-[INFO]:-Validating environment and parameters for standby initialization...
20200313:16:09:55:008076 gpinitstandby:host1:hengshi-[INFO]:-Checking for data directory /opt/hengshi/engine-cluster/data/SegDataDir-1 on host2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:------------------------------------------------------
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master initialization parameters
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:------------------------------------------------------
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master hostname               = bdp2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master data directory         = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum master port                   = 15432
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master hostname       = host2
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master port           = 15432
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum standby master data directory = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:09:56:008076 gpinitstandby:host1:hengshi-[INFO]:-Greenplum update system catalog         = On
Do you want to continue with standby master initialization? Yy|Nn (default=N):
> y
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Syncing Greenplum Database extensions to standby
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-The packages on host2 are consistent.
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Adding standby master to catalog...
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Database catalog updated successfully.
20200313:16:09:57:008076 gpinitstandby:host1:hengshi-[INFO]:-Updating pg_hba.conf file...
20200313:16:09:58:008076 gpinitstandby:host1:hengshi-[INFO]:-pg_hba.conf files updated successfully.
20200313:16:09:59:008076 gpinitstandby:host1:hengshi-[INFO]:-Starting standby master
20200313:16:09:59:008076 gpinitstandby:host1:hengshi-[INFO]:-Checking if standby master is running on host: host2  in directory: /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:10:00:008076 gpinitstandby:host1:hengshi-[INFO]:-Cleaning up pg_hba.conf backup files...
20200313:16:10:01:008076 gpinitstandby:host1:hengshi-[INFO]:-Backup files of pg_hba.conf cleaned up successfully.
20200313:16:10:01:008076 gpinitstandby:host1:hengshi-[INFO]:-Successfully created standby master on host2
  2. Use gpstate to check the status of the standby master.
bash
gpstate -f

The command prints output like the following.

text
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Starting gpstate with args: -f
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-local Greenplum Version: 'postgres (Greenplum Database) 6.2.1 build dev'
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-master Greenplum Version: 'PostgreSQL 9.4.24 (Greenplum Database 6.2.1 build dev) on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-39), 64-bit compiled on Dec 23 2019 17:10:46'
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Obtaining Segment details from master...
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-Standby master details
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-----------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby address          = host2
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby data directory   = /opt/hengshi/engine-cluster/data/SegDataDir-1
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby port             = 15432
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby PID              = 3050
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:-   Standby status           = Standby host passive
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--pg_stat_replication
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--WAL Sender State: streaming
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Sync state: sync
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Sent Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Flush Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--Replay Location: 0/C000000
20200313:16:13:07:008235 gpstate:host1:hengshi-[INFO]:--------------------------------------------------------------
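
The two lines that matter most in this output are the standby status and the WAL sender state. As a sketch, the check can be automated as follows, assuming the exact strings from the sample output above ("Standby host passive" and "streaming"). The function name standby_is_healthy is hypothetical, and other Greenplum versions may word these lines differently.

```shell
# standby_is_healthy: reads `gpstate -f` output on stdin.
# Exits 0 only if the standby reports "Standby host passive"
# and the WAL sender state is "streaming".
standby_is_healthy() {
  awk '
    /Standby status/ && /Standby host passive/ { passive = 1 }
    /WAL Sender State:/ && /streaming/         { streaming = 1 }
    END { exit !(passive && streaming) }
  '
}
```

A typical call is `gpstate -f | standby_is_healthy || echo "standby master needs attention"`.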

Engine HA FAQ

Master HA Common Issues

  1. How to switch to standby master when the primary master fails?

    When the primary master fails, the cluster does not switch to the standby master automatically, so you must switch manually. On the standby master node host2, run the following commands.

bash
source /opt/hengshi/engine-cluster/export-cluster.sh               # If this file is not present, copy it from the original primary master
gpactivatestandby -d /opt/hengshi/engine-cluster/data/SegDataDir-1
  2. How to restore the original primary master?

    After the original primary master fails, first switch the standby master to become the new primary master (referred to here as the backup master). Then manually add the original primary master back as the standby master and switch the roles back, as follows.

    a. Back up the data directory of the original primary master.

bash
mv /opt/hengshi/engine-cluster/data/SegDataDir-1 /opt/hengshi/engine-cluster/data/SegDataDir-1-backup

b. On the backup master node (host2), run the following command to add the original primary master (host1) as the standby master.

bash
gpinitstandby -s host1

c. On the backup master node (host2), stop the master.

bash
gpstop -m

d. On the original primary master node (host1), run the switch command.

bash
gpactivatestandby -d /opt/hengshi/engine-cluster/data/SegDataDir-1

e. On the backup master node (host2), back up the data directory.

bash
mv /opt/hengshi/engine-cluster/data/SegDataDir-1 /opt/hengshi/engine-cluster/data/SegDataDir-1-backup

f. On the original primary master node (host1), add the backup master (host2) as the standby master.

bash
gpinitstandby -s host2
  3. How to restart the standby master if it has stopped? Run the following command to start the standby master.
bash
gpinitstandby -n
  4. How to remove the standby master? Run the following command to remove the standby master.
bash
gpinitstandby -r

Segment HA Common Issues

  1. How to restore the original roles after a primary segment and its mirror have switched roles? Follow the instructions below.
    • Check the current synchronization status.
bash
gpstate -m

If the status is not Synchronized, wait for synchronization to complete.

  • Run the gprecoverseg tool with the -r option to return the segments to their preferred roles.
bash
gprecoverseg -r
  • Confirm their status.
bash
gpstate -e
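
The "wait for synchronization" step above can be scripted with a small polling helper. This is a generic sketch, not a Greenplum tool: wait_until is a hypothetical name, and the command passed to it is whatever check you use to decide that gpstate -m reports every mirror as Synchronized.

```shell
# wait_until ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Re-runs COMMAND until it exits 0, sleeping DELAY_SECONDS between
# attempts. Returns 0 on success, 1 if every attempt failed.
wait_until() {
  local attempts="$1" delay="$2"
  shift 2
  local i=1
  while [ "$i" -le "$attempts" ]; do
    if "$@"; then
      return 0
    fi
    # Sleep only when another attempt remains.
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  return 1
}
```

For example, `wait_until 30 10 check_sync && gprecoverseg -r` polls for up to about five minutes before rebalancing, where check_sync stands for your own synchronization check and is not defined here.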

HENGSHI SENSE Platform User Manual