Cluster Environment Version Upgrade
Upgrading a cluster environment is similar to upgrading a standalone environment and follows the same workflow: stop the old version service -> install the new version -> update dependencies -> start the new version service. The detailed steps are as follows:
- Obtain the new version installation package and extract it on the installation machine.
- Stop the old version service. Switch to the runtime user, navigate to the installation directory, and run the stop command. In the example below, the runtime user is hengshi and the old version is installed in /opt/hengshi.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin stop all # Stop the old version service
- Install the new version. Before installation, refer to Cluster Configuration Information to configure the hosts and vars.yml files in cluster-conf. Ensure that the hengshi-sense-cluster-[version] directory is at the same level as the cluster-conf directory, then run the cluster installation command as shown in the example below.
❯ ls
cluster-conf hengshi-sense-cluster-[version]
sudo su - hengshi # Switch to the product runtime user
cd hengshi-sense-cluster-[version] # Navigate to the extracted target directory
# Run the cluster installation. Data from the previous version is backed up automatically.
# In special cases where no backup is needed, confirm with HENGSHI technical support first, then add the parameter -s t to skip the backup.
./hs_install -m cluster -c ../cluster-conf
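The layout requirement above can be checked before running hs_install. The following is a minimal sketch, not part of the HENGSHI tooling: check_layout is a hypothetical helper name, and it only tests that the files and directories named above exist, not that their contents are valid.

```shell
# check_layout BASE_DIR
# Hypothetical pre-flight helper: succeeds only if BASE_DIR contains
# cluster-conf/hosts, cluster-conf/vars.yml, and a sibling
# hengshi-sense-cluster-* directory.
check_layout() {
  local base="${1:-.}"
  local rc=0
  local f
  for f in hosts vars.yml; do
    if [ ! -f "${base}/cluster-conf/${f}" ]; then
      echo "missing: cluster-conf/${f}" >&2
      rc=1
    fi
  done
  # The extracted package directory must sit next to cluster-conf.
  if ! ls -d "${base}"/hengshi-sense-cluster-*/ >/dev/null 2>&1; then
    echo "missing: hengshi-sense-cluster-[version] directory" >&2
    rc=1
  fi
  return "${rc}"
}
```

Run it from the directory that holds both entries, for example check_layout . and continue with the installation only if it succeeds.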
Installation is successful when the status of all nodes in the installation prompt is [unreachable=0,failed=0].
PLAY RECAP ****************************************************************
Node-A : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-C : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
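The success condition can also be checked mechanically instead of by eye. The sketch below assumes the recap lines have the shape shown above (one line per node containing unreachable=N and failed=N); recap_ok is a hypothetical helper name, not a HENGSHI command.

```shell
# recap_ok
# Reads recap text on stdin. Succeeds only if at least one node line is
# present and every node line reports unreachable=0 and failed=0.
recap_ok() {
  awk '
    /unreachable=[0-9]+/ && /failed=[0-9]+/ {
      nodes++
      if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/) bad++
    }
    END { exit (nodes > 0 && bad == 0) ? 0 : 1 }
  '
}
```

One way to use it is to save the command output to a file of your choosing (for example with tee) and then run recap_ok on that file via stdin redirection. Requiring at least one node line means empty or truncated output is treated as a failure rather than a success.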
- Update system dependencies. During this operation, the user needs sudo permissions on each machine. After successful execution, sudo permissions can be revoked. Refer to the example below for operation.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin init-os all # Initialize OS
Tip
For offline environments, use bin/hengshi-sense-bin init-os all-offline to skip online dependency installation.
- Start the service. Navigate to the installation directory and execute the start command.
sudo su - hengshi # Switch to the product runtime user
cd /opt/hengshi # Navigate to the installation target directory
bin/hengshi-sense-bin start all # Start the new version service
The upgrade is complete when the status of every node in the command output is [unreachable=0,failed=0].
PLAY RECAP ***********************************************************************
Node-A : ok=4 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Node-C : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Rollback
A rollback mainly consists of restoring the contents of the most recent backup archive in the backup directory to the deployment location (programs, configurations, and data).
First, navigate to the backup directory and select the most recent archive for extraction. Be sure to confirm the timestamp of the archive. The examples below use the archive in the hengshi-20210819163228 directory, whose timestamp is 2021-08-19_16:32:28.
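Selecting the most recent backup and confirming its timestamp can be scripted. This is a minimal sketch that assumes every backup directory is named hengshi- followed by a 14-digit timestamp, as in hengshi-20210819163228; latest_backup and backup_timestamp are hypothetical helper names.

```shell
# latest_backup BACKUP_DIR
# Prints the name of the newest hengshi-<YYYYmmddHHMMSS> directory.
# The timestamp is fixed-width, so name order equals chronological order.
latest_backup() {
  local backup_dir="$1"
  local d newest=""
  for d in "${backup_dir}"/hengshi-[0-9]*/; do
    [ -d "${d}" ] || continue
    d="${d%/}"
    newest="${d##*/}"   # glob results are sorted, so the last match is the newest
  done
  [ -n "${newest}" ] && printf '%s\n' "${newest}"
}

# backup_timestamp NAME
# Converts hengshi-20210819163228 to 2021-08-19_16:32:28 for a visual check.
backup_timestamp() {
  printf '%s\n' "${1#hengshi-}" |
    sed -E 's/^([0-9]{4})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})([0-9]{2})$/\1-\2-\3_\4:\5:\6/'
}
```

For example, latest_backup /opt/hengshi/backup prints the directory name to use in the following steps, and passing that name to backup_timestamp shows the timestamp to confirm before extracting anything.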
- Prepare rollback files
Log in to the server where the backup data was generated, typically the machine hosting metadb, such as Node-A. Copy the backup archive to every other server in the cluster.
ssh Node-A
nodes=(Node-B Node-C) # All machines except the current server
for host in "${nodes[@]}"; do
  rsync -avP /opt/hengshi/backup/hengshi-20210819163228 "${host}:/opt/hengshi/backup/"
done
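Before extracting anything, it may be worth confirming that the copied program archive is present and readable. The sketch below assumes the backup directory contains bin_conf_lib.tar.gz, the archive used in the restore step; archive_ok is a hypothetical helper name.

```shell
# archive_ok BACKUP_SUBDIR
# Succeeds if BACKUP_SUBDIR holds a bin_conf_lib.tar.gz that tar can list.
# Listing (-t) reads the whole archive without writing any files.
archive_ok() {
  local dir="$1"
  local archive="${dir}/bin_conf_lib.tar.gz"
  [ -f "${archive}" ] && tar -tzf "${archive}" >/dev/null 2>&1
}
```

Run locally, this would be archive_ok /opt/hengshi/backup/hengshi-20210819163228. On the other servers, the same tar -tzf listing can be run over ssh for each host.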
- Restore rollback programs
Stop all services, then extract the program archive from the backup on each machine.
cd /opt/hengshi
bin/hengshi-sense-bin stop all
nodes=(Node-A Node-B Node-C) # Every server in the cluster
for host in "${nodes[@]}"; do
  # Use && so tar only runs if the cd into the installation directory succeeds
  ssh "${host}" "cd /opt/hengshi && tar -xf backup/hengshi-20210819163228/bin_conf_lib.tar.gz"
done
- Restore rollback data
On the metadb server, such as Node-A:
cd /opt/hengshi
mv pg_data pg_data.bak
bin/hengshi-sense-bin init metadb
bin/hengshi-sense-bin start metadb
bin/dbrestore.sh -m metadb -l backup/hengshi-20210819163228 -t metadb_backup.2021-08-19_16-33-14[.tar.gz]
bin/hengshi-sense-bin stop metadb
bin/hengshi-sense-bin start all
- Clean up old data
After the rollback, once business verification confirms there are no issues, run the following cleanup:
nodes=(Node-A Node-B Node-C) # Every server in the cluster
for host in "${nodes[@]}"; do
  # Chain with && so nothing is deleted unless each cd succeeds
  ssh "${host}" "cd /opt/hengshi && rm -rf pg_data.bak && cd backup/hengshi-20210819163228 && rm -rf metadb_backup.2021-08-19_16-33-14"
done