Engine Upgrade (GP5 -> GP6)

This article explains how to upgrade GP5 from the historical version to GP6.

Pre-upgrade Environment Preparation

Before upgrading, ensure the environment meets the following requirements:

During the upgrade process, all machines in the engine, including Master and Segment, must reserve 2 times the current node's data capacity as free space. One time the space is needed for backing up the original data, and another time for GP6 storage. Use the following command to check the current node's data size:
shell
```
HENGSHI_HOME=/opt/hengshi
du ${HENGSHI_HOME}/engine-cluster -sch
```
1
2
The number of Segments during the upgrade must be the same as the number of nodes before the upgrade.
Since the upgrade process relies on SSH, it is recommended to configure SSH passwordless login.

Engine Upgrade Process

The following operations are performed on the Master machine of the engine. During the gpbackup phase, all Segments will concurrently write data to the local directory specified by the backup-dir parameter on their respective machines. Similarly, during gprestore, all Segments will load the backup data files from their respective machines.

Backup GP5 Data

Stop all services and start only the engine.

shell

HENGSHI_HOME=/opt/hengshi
${HENGSHI_HOME}/bin/hengshi-sense-bin stop all
${HENGSHI_HOME}/bin/hengshi-sense-bin start engine

Create a directory for the installation package extraction, such as hengshi-[version].

Download the migration tool.

shell

cd hengshi-[version]
wget https://download.hengshi.com/3rd/pivotal_greenplum_backup_restore-1.15.0-1.tar.gz

Update engine-related configurations.

shell

HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
test -f conf/hengshi-sense-env.sh || cp conf/hengshi-sense-env.sh.sample conf/hengshi-sense-env.sh
set_kv_config() {
    local config_file="$1"
    local param="$2"
    local val="$3"
    # edit param=val if exist or insert new param=val
    grep -E "^\s*${param}\s*=" "${config_file}" > /dev/null \
                || sed -i "$ a ${param}=${val}" "${config_file}"
}
set_kv_config conf/hengshi-sense-env.sh HS_PG_DB postgres
set_kv_config conf/hengshi-sense-env.sh HS_PG_USR postgres
set_kv_config conf/hengshi-sense-env.sh HS_PG_PWD postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_DB postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_USR postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_PWD postgres

Export GP5 data and specify the directory to store the exported data, such as: ${HENGSHI_HOME}/gpbackup. The free space in this directory must be larger than the current data size.

shell

export HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/ #must execute, extract to the current GP5 symlink directory
bash #launch a new bash
source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
psql postgres -A -t -c "select 'drop view '|| viewname || ' cascade;' from pg_catalog.pg_views where schemaname NOT IN ('pg_catalog', 'information_schema', 'gp_toolkit') order by schemaname, viewname" > drop_views.sql
cat drop_views.sql
psql postgres -f drop_views.sql
psql postgres -c "drop function if exists public.safe_to_number(text)"
# backup
gpbackup --dbname postgres --backup-dir ${HENGSHI_HOME}/gpbackup --compression-level 9
exit #exit new bash

Tip

In the example, the database name is postgres. Specify the actual database name during operation. If there are multiple databases, each database needs to be backed up separately, and a different --backup-dir must be specified. The value of --compression-level ranges from 1 to 9. The higher the value, the greater the compression ratio, but the longer the time required. During testing, it was found that when the level is 6, 100G takes about 1 hour, and the backup data size is approximately 30G. This result is for reference only. Other parameters for the gpbackup command can be found at gpbackup.

Start GP6

Stop GP5 and start GP6.

shell

HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
cp -r lib/gpdb-6* ${HENGSHI_HOME}/lib
cd ${HENGSHI_HOME}
bin/hengshi-sense-bin stop engine
mv engine-cluster engine-cluster.gp5.bak
gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-* -dvr --color=never| head -n 1)
gpdb_name=${gpdb_name##*/}
cd ${HENGSHI_HOME}/lib
rm -f gpdb
ln -s ${gpdb_name} gpdb
cd ${HENGSHI_HOME}
bin/hengshi-sense-bin init engine
bin/hengshi-sense-bin start engine

After GP6 starts, import engine data.
shell
```
export HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/ #must execute, extract to the current GP6 symlink directory
bash #launch a new bash
source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
psql postgres -c 'create role dwguest'
# find all timestamp (14chars)
find ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/ -maxdepth 2 | sort
# restore with a timestamp
gprestore --backup-dir ${HENGSHI_HOME}/gpbackup --timestamp xxxxxxxxxxxxxx
exit #exit new bash
```
1
2
3
4
5
6
7
8
9
10
11
Notes:
- If issues occur during import and re-import is needed, you can execute the following steps to reinitialize and restart.
  bash
```
cd ${HENGSHI_HOME}
bin/hengshi-sense-bin stop engine
rm -rf engine-cluster
bin/hengshi-sense-bin init engine
bin/hengshi-sense-bin start engine
```
  1
  2
  3
  4
  5
- Using this method to import engine data will not include global objects, such as Tablespaces, Databases, Database-wide configuration parameter settings (GUCs), Resource group definitions, Resource queue definitions, Roles, GRANT assignments of roles to databases. Refer to Parallel Backup with gpbackup and gprestore. This may result in missing roles or queues. Refer to the following methods to resolve:
  - Specify the --with-globals option, but it may prompt that roles or queues already exist. You need to check and delete them before import or use the --on-error-continue option to ignore errors. However, this option will ignore all errors, so use it cautiously.
  - Manually create roles and queues. Open the file ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/YYYYMMDD/YYYYMMDDHHMMSS/gpbackup_YYYYMMDDHHMMSS_metadata.sql to see which roles and queues were created, then manually execute the creation commands. Existing roles and queues can be ignored. If there are authorization operations for roles or queues, execute the corresponding authorization commands. Carefully check to avoid omissions.
- If safe_to_number is missing, manually create it.
sql
```
CREATE OR REPLACE FUNCTION SAFE_TO_NUMBER(text)
RETURNS numeric IMMUTABLE STRICT AS
$
BEGIN
  RETURN $1::numeric;
EXCEPTION WHEN OTHERS THEN
  RETURN NULL;
END
$ LANGUAGE plpgsql;
```
1
2
3
4
5
6
7
8
9
- If the database name is missing, you can specify the --create-db option to automatically create the database. If it already exists, do not specify this option, otherwise an error will occur.
- You can specify --metadata-only to import only metadata, including table creation, but not data.
- You can specify --data-only to import only data, excluding table creation.
- Based on self-testing results, with a compression level of 6, the time taken is approximately 1.5 times the backup duration.
- For details about the gprestore command, refer to link.

Clean up data after a successful upgrade.

shell

HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
rm -rf engine-cluster.gp5.bak
rm -rf lib/gpdb-5*

Rollback After Upgrade Failure

If issues occur during the upgrade process, follow the steps below to perform a rollback.

Stop all HENGSHI SENSE services.

shell

HENGSHI_HOME=/opt/hengshi
${HENGSHI_HOME}/bin/hengshi-sense-bin stop all

Delete the GP6 data directory.

shell

HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
test -d engine-cluster.gp5.bak && rm -rf engine-cluster

Restore GP5-related engine data.

shell

HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
mv engine-cluster.gp5.bak engine-cluster
gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-5* -dvr --color=never| head -n 1)
gpdb_name=${gpdb_name##*/}
rm -f ${HENGSHI_HOME}/lib/gpdb
cd ${HENGSHI_HOME}/lib
ln -sf ${gpdb_name} ${HENGSHI_HOME}/lib/gpdb

User Manual

SDK

ChatBot

Integration

Data Reporting

Create Dataset

Dataset Management

Function List

Dashboard Creation

Chart Controls

Advanced Chart Calculations

Functionality

Display Controls

App Settings

Data Agent

Model Providers

Engine Upgrade (GP5 -> GP6) ​

Pre-upgrade Environment Preparation ​

Engine Upgrade Process ​

Backup GP5 Data ​

Start GP6 ​

Rollback After Upgrade Failure ​

Engine Upgrade (GP5 -> GP6)

Pre-upgrade Environment Preparation

Engine Upgrade Process

Backup GP5 Data

Start GP6

Rollback After Upgrade Failure