Engine Upgrade
HENGSHI SENSE 3.0 and later ships with the GP6 engine built in by default. Compared with GP5, GP6 offers higher query performance, non-stop scaling, and support for jsonb types.
The engine is not upgraded automatically during a version upgrade. When upgrading from an earlier version to 3.0 or later, the engine therefore stays on GP5 unless it is manually upgraded to GP6 as described below.
Pre-Upgrade Environment Preparation
Ensure the environment meets the following requirements before upgrading:
- During the upgrade process, every engine machine, Master and Segments alike, must have free space equal to twice the current node's data size: one copy for backing up the original data, the other for GP6 storage. Check the current node's data size with:

  ```shell
  HENGSHI_HOME=/opt/hengshi
  du ${HENGSHI_HOME}/engine-cluster -sch
  ```
- The number of Segments should remain the same as before the upgrade.
- Since the upgrade process relies on SSH, it is advisable to configure SSH password-less login.
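The "same number of Segments" requirement can be checked mechanically. A minimal sketch with placeholder counts (the `check_segments` helper and the values `4`/`4` are assumptions for illustration); on a live cluster the primary segment count can be read from the `gp_segment_configuration` catalog as shown in the comment:

```shell
# Sketch with placeholder counts. On a live cluster, read the primary
# segment count (excluding the master) with:
#   psql postgres -A -t -c "select count(*) from gp_segment_configuration where role = 'p' and content >= 0"
check_segments() {
    local before=$1 after=$2
    if [ "$before" -eq "$after" ]; then
        echo "segment count unchanged: $before"
    else
        echo "segment count mismatch: $before -> $after" >&2
        return 1
    fi
}
check_segments 4 4
```

Record the count before the upgrade and compare it again after initializing GP6.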
Engine Upgrade Process
The following operations are performed on the engine's Master machine. During the gpbackup phase, all Segments will write data in parallel to the local directory specified by the backup-dir parameter on their respective machines. During gprestore, all Segments will locate and load the backup data files on their respective machines.
Backup GP5 Data
Stop all services, then start only the engine.
```shell
HENGSHI_HOME=/opt/hengshi
${HENGSHI_HOME}/bin/hengshi-sense-bin stop all
${HENGSHI_HOME}/bin/hengshi-sense-bin start engine
```
Create a directory to extract the installation package into, such as hengshi-[version].
Download the migration tool.
```shell
cd hengshi-[version]
wget https://download.hengshi.com/3rd/pivotal_greenplum_backup_restore-1.15.0-1.tar.gz
```
Update the engine-related configurations.
```shell
HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
test -f conf/hengshi-sense-env.sh || cp conf/hengshi-sense-env.sh.sample conf/hengshi-sense-env.sh
set_kv_config() {
    local config_file="$1"
    local param="$2"
    local val="$3"
    # append param=val if it is not already present
    grep -E "^\s*${param}\s*=" "${config_file}" > /dev/null \
        || sed -i "$ a ${param}=${val}" "${config_file}"
}
set_kv_config conf/hengshi-sense-env.sh HS_PG_DB postgres
set_kv_config conf/hengshi-sense-env.sh HS_PG_USR postgres
set_kv_config conf/hengshi-sense-env.sh HS_PG_PWD postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_DB postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_USR postgres
set_kv_config conf/hengshi-sense-env.sh HS_ENGINE_PWD postgres
```
Export the GP5 data. Choose a directory to store the exported data, such as ${HENGSHI_HOME}/gpbackup; the free space in this directory must be larger than the current data size.
```shell
export HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/  # must be executed: extract into the current GP5 symlink directory
bash  # launch a new bash
source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
psql postgres -A -t -c "select 'drop view '|| viewname || ' cascade;' from pg_catalog.pg_views where schemaname NOT IN ('pg_catalog', 'information_schema', 'gp_toolkit') order by schemaname, viewname" > drop_views.sql
cat drop_views.sql
psql postgres -f drop_views.sql
psql postgres -c "drop function if exists public.safe_to_number(text)"
# backup
gpbackup --dbname postgres --backup-dir ${HENGSHI_HOME}/gpbackup --compression-level 9
exit  # exit the new bash
```
Tip
The example uses the database name postgres; substitute the actual name when you run the commands. If there are multiple databases, back up each one separately and specify a different --backup-dir for each. The value of --compression-level ranges from 1 to 9: the larger the value, the higher the compression ratio and the longer the backup takes. In self-testing at level 6, backing up about 100G took roughly 1 hour and produced nearly 30G of backup data; this result is for reference only. For other parameters of the gpbackup command, please refer to gpbackup.
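For the multiple-database case, the per-database backups can be scripted. A dry-run sketch that only prints the commands for review (the database names db_a and db_b and the per-database directory suffixes are placeholder assumptions; drop the echo to actually run gpbackup):

```shell
# Dry-run sketch: print one gpbackup invocation per database, each with
# its own --backup-dir. Database names below are placeholders.
backup_root=${HENGSHI_HOME:-/opt/hengshi}/gpbackup
for db in postgres db_a db_b; do
    echo "gpbackup --dbname ${db} --backup-dir ${backup_root}-${db} --compression-level 9"
done
```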
Start GP6
Stop GP5, then start GP6.
```shell
HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
cp -r lib/gpdb-6* ${HENGSHI_HOME}/lib
cd ${HENGSHI_HOME}
bin/hengshi-sense-bin stop engine
mv engine-cluster engine-cluster.gp5.bak
gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-* -dvr --color=never | head -n 1)
gpdb_name=${gpdb_name##*/}
cd ${HENGSHI_HOME}/lib
rm -f gpdb
ln -s ${gpdb_name} gpdb
cd ${HENGSHI_HOME}
bin/hengshi-sense-bin init engine
bin/hengshi-sense-bin start engine
```
After GP6 starts, import engine data.
```shell
export HENGSHI_HOME=/opt/hengshi
cd hengshi-[version]
tar -xf pivotal_greenplum_backup_restore-1.15.0-1.tar.gz -C ${HENGSHI_HOME}/lib/gpdb/gpdb/  # must be executed: extract into the current GP6 symlink directory
bash  # launch a new bash
source ${HENGSHI_HOME}/engine-cluster/export-cluster.sh
psql postgres -c 'create role dwguest'
# find all timestamps (14 chars)
find ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/ -maxdepth 2 | sort
# restore with a timestamp
gprestore --backup-dir ${HENGSHI_HOME}/gpbackup --timestamp xxxxxxxxxxxxxx
exit  # exit the new bash
```
Note:
- If there is an issue with the import and you need to re-import, follow the steps below to re-initialize and start:

  ```bash
  cd ${HENGSHI_HOME}
  bin/hengshi-sense-bin stop engine
  rm -rf engine-cluster
  bin/hengshi-sense-bin init engine
  bin/hengshi-sense-bin start engine
  ```
- When importing engine data with this method, global objects are not imported: Tablespaces, Databases, database-wide configuration parameter settings (GUCs), resource group definitions, resource queue definitions, Roles, and GRANT assignments of roles to databases. Refer to Parallel Backup with gpbackup and gprestore. As a result, roles or queues may be missing; resolve this with one of the following methods.
  - Specify the --with-globals option. It may then report that roles or queues already exist; check and delete them before importing, or specify the --on-error-continue option to skip them. Note that --on-error-continue ignores all errors, so use it with caution.
  - Create them manually. Open the ${HENGSHI_HOME}/gpbackup/SegDataDir-1/backups/YYYYMMDD/YYYYMMDDHHMMSS/gpbackup_YYYYMMDDHHMMSS_metadata.sql file to see which roles, queues, etc. were created, then execute those statements by hand. Existing roles and queues can be skipped. If there are authorization operations for roles, queues, etc., they also need to be executed. Check carefully so that nothing is missed.
- If the safe_to_number function does not exist, create it manually:

  ```sql
  CREATE OR REPLACE FUNCTION SAFE_TO_NUMBER(text) RETURNS numeric IMMUTABLE STRICT AS $$
  BEGIN
      RETURN $1::numeric;
  EXCEPTION WHEN OTHERS THEN
      RETURN NULL;
  END
  $$ LANGUAGE plpgsql;
  ```
- If the database does not exist, specify the --create-db option to create it automatically. If it already exists, do not specify this option, otherwise an error is reported.
- Specify --metadata-only to import only metadata, including table creation, but no data.
- Specify --data-only to import only data, without table creation.
- Based on self-testing, at compression-level 6 the restore takes approximately 1.5 times as long as the backup.
- For related instructions on the gprestore command, please refer to the gprestore documentation.
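When replaying global DDL by hand as described above, it can help to first filter the metadata file for role, resource queue, and GRANT statements. A sketch against a small stand-in file (the sample statements are invented for illustration; in a real run, point the grep at the gpbackup_YYYYMMDDHHMMSS_metadata.sql path shown above):

```shell
# Sketch: extract role/resource-queue DDL and GRANTs from backup metadata
# for manual review. A tiny sample file stands in for the real
# gpbackup_*_metadata.sql here.
meta=$(mktemp)
cat > "$meta" <<'EOF'
CREATE RESOURCE QUEUE reportq WITH (ACTIVE_STATEMENTS=5);
CREATE ROLE analyst;
CREATE TABLE public.t (a int);
ALTER ROLE analyst WITH LOGIN;
GRANT analyst TO dwguest;
EOF
grep -E '^(CREATE|ALTER) (ROLE|RESOURCE QUEUE)|^GRANT' "$meta"
rm -f "$meta"
```

The filter keeps the role, queue, and GRANT lines and drops ordinary table DDL, so the remaining statements can be reviewed and executed one by one.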
After the upgrade is successful, clean up the data.
```shell
HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
rm -rf engine-cluster.gp5.bak
rm -rf lib/gpdb-5*
```
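Since the cleanup is irreversible, one precaution is to confirm that the gpdb symlink already points at a GP6 build before deleting the GP5 backup. A self-contained sketch using a temporary directory (the version string 6.22.1 is a placeholder); on a real install the link to check is ${HENGSHI_HOME}/lib/gpdb:

```shell
# Sketch: only treat cleanup as safe when the gpdb symlink targets a
# gpdb-6* directory. A temp directory stands in for ${HENGSHI_HOME}/lib.
demo=$(mktemp -d)
mkdir "$demo/gpdb-6.22.1"       # placeholder GP6 build directory
ln -s gpdb-6.22.1 "$demo/gpdb"
target=$(readlink "$demo/gpdb")
case "$target" in
    gpdb-6*) echo "gpdb -> $target: safe to remove the GP5 backup" ;;
    *)       echo "gpdb -> $target: do NOT clean up yet" ;;
esac
rm -rf "$demo"
```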
Rollback on Upgrade Failure
When issues arise during the upgrade process, please follow the instructions below to perform a rollback.
Stop all HENGSHI SENSE services.
```shell
HENGSHI_HOME=/opt/hengshi
${HENGSHI_HOME}/bin/hengshi-sense-bin stop all
```
Delete the GP6 data directory.
```shell
HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
test -d engine-cluster.gp5.bak && rm -rf engine-cluster
```
Restore GP5 related engine data.
```shell
HENGSHI_HOME=/opt/hengshi
cd ${HENGSHI_HOME}
mv engine-cluster.gp5.bak engine-cluster
gpdb_name=$(ls ${HENGSHI_HOME}/lib/gpdb-5* -dvr --color=never | head -n 1)
gpdb_name=${gpdb_name##*/}
rm -f ${HENGSHI_HOME}/lib/gpdb
cd ${HENGSHI_HOME}/lib
ln -sf ${gpdb_name} ${HENGSHI_HOME}/lib/gpdb
```