Cluster Environment Version Upgrade
The version upgrade process in a cluster environment is similar to that in a standalone environment, following the workflow of stopping the old version service -> installing the new version -> updating dependencies -> starting the new version service. Detailed upgrade steps are as follows:
- Obtain the new version installation package and extract it on the installation machine.
- Stop the old version service. Switch to the runtime user, navigate to the installation directory, and run the stop command. In the example below, the runtime user is hengshi and the old version is installed in /opt/hengshi.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin stop all # Stop the old version service
- Install the new version. Before installation, refer to Cluster Configuration Information to configure the hosts and vars.yml files in cluster-conf. Ensure that the "hengshi-sense-cluster-[version]" directory is at the same level as the cluster-conf directory. Then execute the cluster installation command. Refer to the example below.
❯ ls
cluster-conf hengshi-sense-cluster-[version]
sudo su - hengshi # Switch to the product runtime user
cd hengshi-sense-cluster-[version] # Navigate to the extracted target directory
./hs_install -m cluster -c ../cluster-conf # Run the cluster installation; data from the previous version is backed up automatically
# In special cases where no backup is needed, confirm with HENGSHI technical support first, then add the parameter -s t to skip the backup
Installation has succeeded when every node in the installation output shows unreachable=0 and failed=0.
PLAY RECAP ****************************************************************
Node-A : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-C : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
- Update system dependencies. This step requires the user to have sudo permissions on every machine; the permissions can be revoked once it has completed successfully. Refer to the example below.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin init-os all # Initialize the OS
Tip
For offline environments, use bin/hengshi-sense-bin init-os all-offline to skip online dependency installation.
- Start the service. Navigate to the installation directory and execute the start command.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin start all # Start the new version service
The upgrade is complete when every node in the command output shows unreachable=0 and failed=0.
PLAY RECAP ***********************************************************************
Node-A : ok=4 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Node-C : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Rollback
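Whether a rollback is needed usually depends on whether the upgrade finished cleanly. As a minimal sketch, the success criterion from the upgrade steps (unreachable=0 and failed=0 for every node in the PLAY RECAP) could be checked with a small shell function. The name recap_ok and the idea of feeding it saved command output are assumptions for illustration; they are not part of the HENGSHI tooling.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads command output on stdin and returns 0 only if
# at least one recap line is present and every recap line reports
# unreachable=0 and failed=0.
recap_ok() {
  awk '
    # A recap line is any line carrying both counters.
    /unreachable=[0-9]+/ && /failed=[0-9]+/ {
      nodes++
      if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/) bad++
    }
    END {
      if (nodes > 0 && bad == 0) exit 0
      exit 1
    }
  '
}
```

For example, the output of a command could be saved to a file (here a hypothetical start.log) and checked with recap_ok < start.log. This assumes the recap is written to a stream that was captured in that file.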
A rollback restores the contents of the most recent backup archive in the backup directory (programs, configurations, and data) to the deployment location.
First, navigate to the backup directory and select the most recent archive for extraction. Be sure to confirm the timestamp of the archive. In the example, the archive in the hengshi-20210819163228 directory is selected for extraction, with a timestamp of 2021-08-19_16:32:28.
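The selection step above can be sketched in shell. This is only an illustration under one assumption: that backup directories are always named hengshi-YYYYMMDDhhmmss, as in the example. The function names latest_backup and backup_timestamp are hypothetical.

```shell
#!/usr/bin/env bash
# Print the name of the most recent hengshi-* directory under the given path.
# Assumes fixed-width numeric timestamps, so the sorted glob order is
# also chronological order.
latest_backup() {
  local d newest
  newest=""
  for d in "$1"/hengshi-[0-9]*/; do
    [ -d "$d" ] || continue   # skip the literal pattern when nothing matched
    d=${d%/}                  # drop the trailing slash
    newest=${d##*/}           # keep only the directory name; last match wins
  done
  [ -n "$newest" ] && printf '%s\n' "$newest"
}

# Convert a name such as hengshi-20210819163228 into 2021-08-19_16:32:28.
backup_timestamp() {
  local ts
  ts=${1#hengshi-}
  case $ts in
    ''|*[!0-9]*) return 1 ;;  # reject empty or non-numeric suffixes
  esac
  [ "${#ts}" -eq 14 ] || return 1
  printf '%s\n' "$ts" |
    sed 's/^\(....\)\(..\)\(..\)\(..\)\(..\)\(..\)$/\1-\2-\3_\4:\5:\6/'
}
```

A possible use is name=$(latest_backup /opt/hengshi/backup) && backup_timestamp "$name", which prints the timestamp to confirm before anything is extracted.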
- Prepare rollback files
Log in to the server where the backup data was generated, typically the machine hosting metadb, such as Node-A. Copy the backup archive to every other server in the cluster.
ssh Node-A
nodes=(Node-B Node-C) # All machines except the current server
for host in "${nodes[@]}"; do
  rsync -avP /opt/hengshi/backup/hengshi-20210819163228 "${host}":/opt/hengshi/backup/
done
- Restore rollback programs
Extract the backup data on each machine.
cd /opt/hengshi
bin/hengshi-sense-bin stop all
nodes=(Node-A Node-B Node-C) # Every server in the cluster
for host in "${nodes[@]}"; do
  ssh "${host}" "cd /opt/hengshi && tar -xf backup/hengshi-20210819163228/bin_conf_lib.tar.gz"
done
- Restore rollback data
On the metadb server, such as Node-A:
cd /opt/hengshi
mv pg_data pg_data.bak
bin/hengshi-sense-bin init metadb
bin/hengshi-sense-bin start metadb
bin/dbrestore.sh -m metadb -l backup/hengshi-20210819163228 -t metadb_backup.2021-08-19_16-33-14[.tar.gz]
bin/hengshi-sense-bin stop metadb
bin/hengshi-sense-bin start all
- Clean up old data
After the rollback, once business checks confirm that everything works, run the following cleanup:
nodes=(Node-A Node-B Node-C) # Every server in the cluster
for host in "${nodes[@]}"; do
  ssh "${host}" "cd /opt/hengshi && rm -rf pg_data.bak; cd /opt/hengshi/backup/hengshi-20210819163228 && rm -rf metadb_backup.2021-08-19_16-33-14"
done
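The per-node loops in the rollback steps keep going when one host fails, so a failure can scroll by unnoticed. As a hedged sketch, a small wrapper could run the same command on each host and report which hosts failed. The function name run_on_nodes is hypothetical, and the sketch assumes the current user can already reach every node over ssh without a password prompt.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run one command on each listed host over ssh.
# Usage: run_on_nodes '<remote command>' host1 host2 ...
# Returns 1 and lists the failing hosts on stderr if any host fails.
run_on_nodes() {
  local cmd host failed
  cmd=$1
  failed=""
  shift
  for host in "$@"; do
    # -n redirects ssh's stdin from /dev/null so it cannot consume the caller's input
    if ! ssh -n "$host" "$cmd"; then
      failed="$failed $host"
    fi
  done
  if [ -n "$failed" ]; then
    echo "failed on:$failed" >&2
    return 1
  fi
}
```

For instance, the extraction step could be written as run_on_nodes 'cd /opt/hengshi && tar -xf backup/hengshi-20210819163228/bin_conf_lib.tar.gz' Node-A Node-B Node-C, with the cleanup deferred until the call returns successfully.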