Engine Upgrade

HENGSHI SENSE 3.0 and later ship with the built-in engine GP6 installed by default. Compared with GP5, GP6 offers higher query performance, online scaling without downtime, and support for the jsonb type.

The engine is not upgraded automatically during a version upgrade. When upgrading from an earlier version to 3.0 or later, the engine therefore stays on the previous GP5 unless it is manually upgraded to GP6 as described below.
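
For reference, a minimal way to confirm jsonb support once the engine is on GP6; the database name postgres is an assumption, matching the examples below:

    shell
    # works on GP6; GP5 has no jsonb type, so the same cast fails there
    psql postgres -c "select '{\"a\": 1}'::jsonb -> 'a';"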

Pre-Upgrade Environment Preparation

Ensure the environment meets the following requirements before upgrading:

  • During the upgrade, every engine machine, including Master and Segments, must have free space equal to at least twice the current node's data size: one share is needed for backing up the original data and the other for GP6 storage. Use the following command to check the current node's data size:
    shell
    HENGSHI_HOME=/opt/hengshi
    du ${HENGSHI_HOME}/engine-cluster -sch
  • The number of Segments should remain the same as before the upgrade.
  • Since the upgrade process relies on SSH, configure SSH passwordless login in advance. A sketch of the free-space and SSH checks follows this list.
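
A minimal sketch of these checks, run on each engine machine; segment-host-1 is a placeholder hostname:

    shell
    HENGSHI_HOME=/opt/hengshi
    # data size of this node versus free space on the filesystem holding it
    du ${HENGSHI_HOME}/engine-cluster -sch
    df -h ${HENGSHI_HOME}
    # passwordless SSH: should print the remote hostname without prompting for a password
    ssh segment-host-1 hostname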

Engine Upgrade Process

The following operations are performed on the engine's Master machine. During gpbackup, every Segment writes its data in parallel to the local directory specified by the --backup-dir parameter on its own machine. During gprestore, every Segment locates and loads the backup data files from that same local directory.
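
Before running gpbackup, it can help to confirm that the backup directory exists and has enough free space on every engine machine. A minimal sketch, assuming gpssh is provided by the engine environment and that /path/to/hostfile is a placeholder for a file listing all engine hosts:

    shell
    export HENGSHI_HOME=/opt/hengshi
    source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh  # assumption: puts gpssh and psql on PATH
    # create the backup directory and report its free space on every listed host
    gpssh -f /path/to/hostfile -e "mkdir -p ${HENGSHI_HOME}/gpbackup && df -h ${HENGSHI_HOME}/gpbackup"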

Backup GP5 Data

  1. Stop all services, then start only the engine.

    shell
    HENGSHI_HOME=/opt/hengshi
    ${HENGSHI_HOME}/bin/hengshi-sense-bin stop all
    ${HENGSHI_HOME}/bin/hengshi-sense-bin start engine
  2. Create a directory to extract the installation package into, for example hengshi-[version].

  3. Download the migration tool.

    shell
    cd hengshi-[version]
    wget https://download.hengshi.com/3rd/pivotal_greenplum_backup_restore-1.15.0-1.tar.gz
  4. Update the engine-related configurations.

    shell
    HENGSHI_HOME=/opt/hengshi
    cd ${HENGSHI_HOME}
    test -f conf/hengshi-sense-env.sh || cp conf/hengshi-sense-env.sh.sample conf/hengshi-sense-env.sh
    set_kv_config() {
        local config_file="$1"
        local param="$2"
        local val="$3"
        # append param=val only if the parameter is not already present (existing values are left unchanged)
        grep -E "^\s*${param}\s*=" "${config_file}" > /dev/null \
                    || sed -i "$ a ${param}=${val}" "${config_file}"
    }
    set_kv_config conf/hengshi-sense-env.sh HS_PG_DB postgres
    set_kv_config conf/hengshi-sense-env.sh HS_PG_USR postgres
    set_kv_config conf/hengshi-sense-env.sh HS_PG_PWD postgres
    set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_DB postgres
    set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_USR postgres
    set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_PWD postgres
  5. Export the GP5 data. Choose a directory to store the exported data, for example ${HENGSHI_HOME}/gpbackup; the free space in this directory must be larger than the current data size.

    shell
    export HENGSHI_HOME=/opt/hengshi
    cd hengshi-[version]
    tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/ # must be executed; extracts into the directory the current GP5 symlink points to
    bash #launch a new bash
    source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
    psql postgres -A -t -c "select 'drop view '|| viewname || ' cascade;' from pg_catalog.pg_views where schemaname NOT IN ('pg_catalog', 'information_schema', 'gp_toolkit') order by schemaname, viewname" > drop_views.sql
    cat drop_views.sql
    psql postgres -f drop_views.sql
    psql postgres -c "drop function if exists public.safe_to_number(text)"
    # backup
    gpbackup --dbname postgres --backup-dir ${HENGSHI_HOME}/gpbackup --compression-level 9
    exit #exit new bash

Tip

The example uses the database name postgres; specify the actual name when you run the commands. If there are multiple databases, back up each one separately and specify a different --backup-dir. The value of --compression-level ranges from 1 to 9; the higher the value, the better the compression ratio and the longer the backup takes. In self-testing, level 6 took about 1 hour for 100 GB of data and produced nearly 30 GB of backup data; this result is for reference only. For other parameters of the gpbackup command, refer to the gpbackup documentation. A sanity check of the backup output is sketched below.
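
This lists the backup sets gpbackup produced under the Master's subdirectory (SegDataDir-1, the same path used in the restore step below) and shows how much space they take:

    shell
    export HENGSHI_HOME=/opt/hengshi
    # each backup set is identified by a 14-digit timestamp directory
    find ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/ -maxdepth 2 | sort
    du -sh ${HENGSHI_HOME}/gpbackup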

Start GP6

  1. Stop GP5, start GP6.

    shell
    HENGSHI_HOME=/opt/hengshi
    cd hengshi-[version]
    cp -r lib/gpdb-6* ${HENGSHI_HOME}/lib
    cd ${HENGSHI_HOME}
    bin/hengshi-sense-bin stop engine
    mv engine-cluster engine-cluster.gp5.bak
    gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-* -dvr --color=never| head -n 1)
    gpdb_name=${gpdb_name##*/}
    cd ${HENGSHI_HOME}/lib
    rm -f gpdb
    ln -s ${gpdb_name} gpdb
    cd ${HENGSHI_HOME}
    bin/hengshi-sense-bin init engine
    bin/hengshi-sense-bin start engine
  2. After GP6 starts, import engine data.

    shell
    export HENGSHI_HOME=/opt/hengshi
    cd hengshi-[version]
    tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/ # must be executed; extracts into the directory the current GP6 symlink points to
    bash #launch a new bash
    source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
    psql postgres -c 'create role dwguest'
    # list all backup timestamps (14-digit directory names)
    find ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/ -maxdepth 2 | sort
    # restore with a timestamp
    gprestore --backup-dir ${HENGSHI_HOME}/gpbackup --timestamp xxxxxxxxxxxxxx
    exit #exit new bash

    Note:

    • If the import fails and you need to re-import, follow the steps below to re-initialize and restart the engine.
      shell
      cd ${HENGSHI_HOME}
      bin/hengshi-sense-bin stop engine
      rm -rf engine-cluster
      bin/hengshi-sense-bin init engine
      bin/hengshi-sense-bin start engine
    • When importing engine data using this method, global objects are not imported; these include tablespaces, databases, database-wide configuration parameter settings (GUCs), resource group definitions, resource queue definitions, roles, and GRANT assignments of roles to databases. Refer to Parallel Backup with gpbackup and gprestore. As a result, some roles or queues may be missing after the restore; use one of the following methods to resolve this.
      • Specify the --with-globals option. It may then report that some roles or queues already exist; either check and delete them before importing, or specify the --on-error-continue option to ignore them. Note that --on-error-continue ignores all errors, so use it with caution.
      • Create them manually. Open ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/YYYYMMDD/YYYYMMDDHHMMSS/gpbackup_YYYYMMDDHHMMSS_metadata.sql to see which roles, queues, and other global objects were created, then execute those statements manually. Roles and queues that already exist can be skipped. Any authorization statements for roles, queues, and so on must also be executed; check carefully so that nothing is missed.
    • If the safe_to_number function does not exist (it was dropped before the backup), recreate it manually:
    sql
    CREATE OR REPLACE FUNCTION SAFE_TO_NUMBER(text)
    RETURNS numeric IMMUTABLE STRICT AS
    $$
    BEGIN
      RETURN $1::numeric;
    EXCEPTION WHEN OTHERS THEN
      RETURN NULL;
    END
    $$ LANGUAGE plpgsql;
    • If the database name does not exist, specify the --create-db option to create it automatically. If it already exists, do not specify this option; otherwise an error is reported.
    • You can specify --metadata-only to import only metadata, including table definitions, but no data.
    • You can specify --data-only to import only data, without creating tables.
    • In self-testing at compression level 6, the restore took approximately 1.5 times as long as the backup.
    • For details on the gprestore command, refer to the gprestore documentation. A post-restore sanity check is sketched after these steps.
  3. After the upgrade is successful, clean up the old GP5 data.

    shell
    HENGSHI_HOME=/opt/hengshi
    cd ${HENGSHI_HOME}
    rm -rf engine-cluster.gp5.bak
    rm -rf lib/gpdb-5*
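
Before the cleanup in step 3, a quick sanity check that GP6 is serving the restored data can be useful; a minimal sketch, assuming the database name postgres and the backup layout shown above:

    shell
    export HENGSHI_HOME=/opt/hengshi
    source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
    psql postgres -c "select version();"  # should report Greenplum Database 6.x
    # count the restored user tables
    psql postgres -A -t -c "select count(*) from pg_catalog.pg_tables where schemaname not in ('pg_catalog', 'information_schema', 'gp_toolkit')"
    # review which global objects (roles, resource queues) the backup expected, per the note above
    grep -iE 'CREATE (ROLE|RESOURCE QUEUE)' ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/*/*/gpbackup_*_metadata.sql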

Rollback on Upgrade Failure

If issues arise during the upgrade process, follow the steps below to roll back. A sketch for restarting and verifying GP5 afterwards follows these steps.

  1. Stop all HENGSHI SENSE services.

    shell
    HENGSHI_HOME=/opt/hengshi
    ${HENGSHI_HOME}/bin/hengshi-sense-bin stop all
  2. Delete the GP6 data directory.

    shell
    HENGSHI_HOME=/opt/hengshi
    cd ${HENGSHI_HOME}
    test -d engine-cluster.gp5.bak && rm -rf engine-cluster
  3. Restore the GP5 engine data and re-point the gpdb symlink to GP5.

    shell
    HENGSHI_HOME=/opt/hengshi
    cd ${HENGSHI_HOME}
    mv engine-cluster.gp5.bak engine-cluster
    gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-5* -dvr --color=never| head -n 1)
    gpdb_name=${gpdb_name##*/}
    rm -f ${HENGSHI_HOME}/lib/gpdb
    cd ${HENGSHI_HOME}/lib
    ln -sf ${gpdb_name} ${HENGSHI_HOME}/lib/gpdb
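
After rolling back, a minimal way to bring GP5 back up and confirm the engine version, mirroring the start commands used earlier:

    shell
    HENGSHI_HOME=/opt/hengshi
    cd ${HENGSHI_HOME}
    bin/hengshi-sense-bin start engine
    source engine-cluster/export-cluster.sh
    psql postgres -c "select version();"  # should report a Greenplum Database 5.x build again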
