Cluster Installation

This document describes the process of installing HENGSHI SENSE in a cluster environment.

Before installation, confirm the network environment. If the environment is isolated and cannot connect to the internet, first follow the guidance in Installing Dependencies in Offline Environment to install the dependency packages, and then continue with this document. If the environment can connect to the internet, follow this document directly.

Preparation Work

Before the cluster installation, complete the following preparation work.

Environment Preparation

Please follow the steps below to prepare the environment.

1. Refer to the Installation Environment document to prepare the installation environment.
2. Ensure the installation devices meet the following conditions:
- Each device has the sudo command installed.
- Each device has a running user configured for passwordless SSH login.
- Each device's running user is configured with passwordless sudo permissions.
- Each device has a unique hostname.
- Firewalls between devices allow port access, ensuring internal network communication.
- The machine executing the cluster installation has ansible installed.

If you have completed the environment preparation in steps 1 and 2, you can skip the following instructions and proceed directly to Configure Users and Installation Directory. If you are unsure how to satisfy the conditions in step 2, refer to the following instructions for setup.

Install the sudo command. Run this command as the root user.
```shell
yum install -y sudo
```
Create an execution user on each device. The example uses the user hengshi. Run these commands as the root user.
```shell
useradd -m hengshi
passwd hengshi # Set the login password for hengshi
```
Configure passwordless sudo permissions for the execution user. Run this command as the root user.
```shell
visudo
```
Add the following line, then save and exit:
```
hengshi ALL=(ALL)       NOPASSWD: ALL
```
Ensure each device has a unique hostname. If hostnames are duplicated (for example, several devices are named localhost), change them by editing the hostname file directly.
```shell
sudo vim /etc/hostname
```
Ensure the machines can reach each other by hostname. Edit the /etc/hosts file on each machine and add the entries below. If the file contains local IP information such as 127.0.0.1, delete it and restart the server.
```
a.b.c.d1 ${Node-A-hostname}
a.b.c.d2 ${Node-B-hostname}
a.b.c.d3 ${Node-C-hostname}
```
Configure passwordless SSH login for the running user on every machine. Assume the cluster consists of three machines, Node-A, Node-B, and Node-C, and that the user hengshi runs on each of them.
Note
Node-A, Node-B, and Node-C are only examples in this document. Use the real hostnames of your servers in the actual configuration.
- During the process, enter the password for hengshi as prompted.
- When prompted with the message "Are you sure you want to continue connecting (yes/no)?", enter yes.
- Also run ssh-copy-id against the local machine itself. For example, on Node-A, execute ssh-copy-id hengshi@Node-A.
```shell
test -e ~/.ssh/id_rsa || { yes "" | ssh-keygen -t rsa -q -P ''; }
ssh-copy-id hengshi@localhost
ssh-copy-id hengshi@127.0.0.1
ssh-copy-id hengshi@${Node-A-hostname}
ssh-copy-id hengshi@${Node-A-ip}
ssh-copy-id hengshi@${Node-B-hostname}
ssh-copy-id hengshi@${Node-B-ip}
ssh-copy-id hengshi@${Node-C-hostname}
ssh-copy-id hengshi@${Node-C-ip}
```
Install ansible on the machine that runs the installation deployment.
```shell
sudo yum install -y epel-release
sudo yum install -y ansible
```
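The unique-hostname condition listed above can be checked before continuing. The following is a minimal sketch and not part of the HENGSHI SENSE tooling: the function name find_duplicate_hostnames and the node names in the usage comment are hypothetical, and the usage assumes passwordless SSH from the deployment machine already works.

```shell
# Print every hostname that appears more than once on stdin (one name per line).
# Prints nothing when all names are unique.
find_duplicate_hostnames() {
  sort | uniq -d
}

# Hypothetical usage from the deployment machine (node names are placeholders):
#   for h in node-a node-b node-c; do ssh "$h" hostname; done | find_duplicate_hostnames
```

Any name printed by the function is reported by more than one device and should be changed before installation.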

Configure Users and Installation Directory

Perform the following operations with sudo or root privileges.

The example shows how to configure the user and the installation directory on a cluster. The username is hengshi, and the installation directory is /opt/hengshi. Assume there are three nodes: A, B, and C. The loop below runs the same operations on each node over SSH.

```shell
for x in ${Node-A-hostname} ${Node-B-hostname} ${Node-C-hostname}; do
    # Create the hengshi user if it does not exist yet
    ssh "$x" "id -u hengshi > /dev/null 2>&1 || sudo useradd -m hengshi"
    # Create the installation directory and give hengshi ownership of it
    ssh "$x" "sudo mkdir -p /opt/hengshi && sudo chown hengshi:hengshi /opt/hengshi"
done
```

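ssh Login Confirmation

Assume there are three machines: A, B, and C. Execute the following code so that each node records the SSH host keys of every node, which avoids interactive host-key prompts later.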

```shell
nodes=(${Node-A-hostname} ${Node-B-hostname} ${Node-C-hostname})
for host in "${nodes[@]}"; do
  # On each node, refresh the known_hosts entry for every node in the cluster
  ssh "$host" "for x in ${nodes[*]}; do ssh-keygen -R \$x; ssh-keyscan -H \$x >> ~/.ssh/known_hosts; done"
done
```
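Once login confirmation is done, it can be useful to verify that passwordless SSH and passwordless sudo both work without any prompt. The sketch below is an illustration rather than an official HENGSHI SENSE check: the function name check_passwordless is hypothetical, and it only tests connections made from the machine where it runs.

```shell
# Return 0 only if every host given as an argument accepts a key-based SSH login
# and lets the remote user run sudo without a password prompt.
check_passwordless() {
  cp_failed=0
  for cp_host in "$@"; do
    # BatchMode=yes makes ssh fail instead of asking for a password;
    # sudo -n fails instead of prompting.
    if ssh -o BatchMode=yes -o ConnectTimeout=5 "$cp_host" 'sudo -n true' </dev/null; then
      echo "OK   $cp_host"
    else
      echo "FAIL $cp_host"
      cp_failed=1
    fi
  done
  return "$cp_failed"
}

# Hypothetical usage (node names are placeholders):
#   check_passwordless node-a node-b node-c
```

A FAIL line means that either the SSH key was not copied to that host or the sudoers entry for the user is missing there.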

SSHD Listening on Non-22 Port on the Server

The installation involves both the local machine and the machines configured in the HS_ENGINE_SEGMENTS variable. If SSH on any of them does not listen on port 22, configure the actual port for each host in the deployment user's ~/.ssh/config.

On the local machine, configure the port for both localhost and the name returned by the hostname command.

For example, suppose the local machine's hostname is localhost, and HS_ENGINE_SEGMENTS includes machines A, B, and C, all of which listen on port 122.

The .ssh/config file then needs to include the following configuration, and the file must be synchronized to .ssh/config on each machine.

```
Host localhost
  Port 122
Host ${Node-A-hostname}
  Port 122
Host ${Node-B-hostname}
  Port 122
Host ${Node-C-hostname}
  Port 122
```
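With many hosts, the stanzas can be generated instead of typed by hand. This is a small sketch under stated assumptions: gen_ssh_config is a hypothetical helper name, the node names in the usage comment are placeholders, and all hosts are assumed to share one SSH port.

```shell
# Print one "Host ... / Port ..." stanza per hostname.
# $1 is the SSH port; the remaining arguments are hostnames.
gen_ssh_config() {
  gsc_port=$1
  shift
  for gsc_host in "$@"; do
    printf 'Host %s\n  Port %s\n' "$gsc_host" "$gsc_port"
  done
}

# Hypothetical usage (node names are placeholders):
#   gen_ssh_config 122 localhost node-a node-b node-c >> ~/.ssh/config
#   chmod 600 ~/.ssh/config
```

On OpenSSH versions that support the -G option, running ssh -G with a hostname prints the effective configuration, including the port that will be used for that host.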

Configure Cluster Information

Set up the cluster information on the machine that will run the deployment commands.

Create a cluster configuration directory. It is recommended to place this directory at the same level as the installation package extraction directory for easier reuse during upgrades. Refer to the example below, where the installation package extraction directory is hengshi-sense-[version].
```shell
mkdir hengshi-sense-[version]/../cluster-conf
cd hengshi-sense-[version]
cp ansible/hosts.sample ../cluster-conf/hosts
cp ansible/vars.yml.sample ../cluster-conf/vars.yml
```

Configure hosts. Follow the instructions in the example below.

```
[metadb] # Internal metadata database
${Node-A-hostname}

#[metaslave] # metadb database replica (optional), can be used as a backup when the primary database is down
#${Node-B-hostname}

[engine] # Specify one node as master
${Node-A-hostname} master=true
${Node-B-hostname}
${Node-C-hostname}

# If the engine type is not doris, this configuration can be ignored. Commenting or clearing it will not affect the configuration.
# Note: The number of doris-fe nodes should be an odd number.
[doris-fe] # It is recommended to use IP information for configuration; hostname configuration may cause startup failures.
${Node-A-hostname} master=true
${Node-B-hostname}
${Node-C-hostname}

# If the engine type is not doris, this configuration can be ignored. Commenting or clearing it will not affect the configuration.
[doris-be] # It is recommended to use IP information for configuration; hostname configuration may cause startup failures.
${Node-A-hostname}
${Node-B-hostname}
${Node-C-hostname}

[minio]
${Node-A-hostname}

[redis]
${Node-A-hostname}

[hengshi]
${Node-A-hostname}
${Node-B-hostname}
${Node-C-hostname}
```
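The hosts file notes that the number of doris-fe nodes should be odd. A quick way to count the hosts in a group is sketched below. The helper names group_hosts and group_is_odd are hypothetical, and the parsing assumes the simple INI layout shown in the example (one host per line, group headers in square brackets).

```shell
# Print the hosts listed under one group of an INI-style inventory file.
# $1 is the group name, $2 the inventory file.
group_hosts() {
  awk -v group="[$1]" '
    /^[ \t]*(#|$)/ { next }                  # skip comments and blank lines
    /^\[/ { in_group = ($1 == group); next } # a new group header starts
    in_group { print $1 }                    # first field is the host
  ' "$2"
}

# Succeed only if the group contains an odd number of hosts.
group_is_odd() {
  [ $(( $(group_hosts "$1" "$2" | wc -l) % 2 )) -eq 1 ]
}

# Hypothetical usage:
#   group_is_odd doris-fe ../cluster-conf/hosts || echo "doris-fe needs an odd number of nodes"
```

The same group_hosts helper can be used to confirm that exactly one metadb, minio, or redis host is listed, if that matches the intended layout.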

Configure vars.yml. Follow the example below.

```yaml
temp_work_dir_root: "/tmp" # Temporary directory, usually does not need to be changed
install_path: "/opt/hengshi" # Installation target directory
hengshi_sense_port: 8081
metadb_port: 54320
engine_master_port: 15432
engine_segment_base_port: 25432
```
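When the default ports are changed, an accidental duplicate is easy to introduce. The sketch below lists port values that are assigned to more than one key. It is only an illustration: duplicate_ports is a hypothetical helper, it assumes a flat vars.yml whose port keys end in _port as in the example, and it catches exact duplicates only, not overlapping port ranges.

```shell
# Print any port number assigned to more than one "*_port" key in a flat vars.yml.
# Prints nothing when all ports are distinct.
duplicate_ports() {
  awk -F': *' '/^[a-z_]+_port:/ { sub(/[ \t]*#.*/, "", $2); print $2 }' "$1" | sort | uniq -d
}

# Hypothetical usage:
#   duplicate_ports ../cluster-conf/vars.yml
```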

Installation

Follow the instructions below to complete the installation process.

Set the environment variable ANSIBLE_PLAYBOOK.
```shell
export ANSIBLE_PLAYBOOK="ansible-playbook -v"
```
Switch to the user executing the installation. In the example, the username is hengshi.
```shell
sudo su - hengshi
```
Navigate to the target directory where the installation package has been extracted.
```shell
cd ~/pkgs/hengshi-sense-[version]
```

Execute the cluster installation command.

```shell
./hs_install -m cluster -c ../cluster-conf    # Execute cluster installation
```

During installation, progress messages are displayed. The installation is successful when the status of every node shows [unreachable=0 failed=0].

```shell
PLAY RECAP ****************************************************************
Node-A : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-B : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-C : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
```
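The success criterion above (every node reports unreachable=0 and failed=0) can also be checked by a script, which helps when the output is long. The following is a minimal sketch, assuming the recap format shown in this document; recap_ok is a hypothetical helper name, not part of the installer.

```shell
# Read ansible-playbook output on stdin. Succeed only if a PLAY RECAP section
# is present and every host line in it reports unreachable=0 and failed=0.
recap_ok() {
  awk '
    /^PLAY RECAP/ { in_recap = 1; next }
    in_recap && /unreachable=/ {
      hosts++
      if ($0 !~ /unreachable=0([^0-9]|$)/ || $0 !~ /failed=0([^0-9]|$)/) bad++
    }
    END { if (hosts > 0 && bad == 0) exit 0; exit 1 }
  '
}

# Hypothetical usage: keep a log and test the recap in one pass.
#   ./hs_install -m cluster -c ../cluster-conf | tee install.log | recap_ok && echo "install OK"
```

The same check applies to the other commands in this document whose output ends with a PLAY RECAP section.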

Configure the System

Before starting the service, read the Configuration File document and set up the relevant configuration.
If the built-in engine type is Doris, also read Doris Engine Configuration.

Start the Service

Follow the steps below to start the service.

Initialize the OS.

During initialization, ensure the executing user has sudo privileges. After initialization is complete, you can revoke them. Switch to the executing user, navigate to the installation directory, and execute the OS initialization command. In the example below, the executing user is hengshi and the installation directory is /opt/hengshi.

```shell
sudo su - hengshi
cd /opt/hengshi
bin/hengshi-sense-bin init-os all  # Initialize OS
```

Tip

In an offline environment, you can execute bin/hengshi-sense-bin init-os all-offline to skip dependency installation.

Check the prompt messages. When the status of all nodes displays [unreachable=0 failed=0], the OS initialization was successful.

TASK [deploy : init-os kernel] ********************************************************************************************************************************************************************************************************************************
changed: [Node-A]
changed: [Node-B]
changed: [Node-C]

TASK [deploy : init-os deps] **********************************************************************************************************************************************************************************************************************************
changed: [Node-A]
changed: [Node-B]
changed: [Node-C]

PLAY RECAP ****************************************************************************************************************************************************************************************************************************************************
Node-A              : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
Node-B              : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
Node-C              : ok=5    changed=3    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

Initialize HENGSHI SENSE.

Switch to the executing user, navigate to the installation directory, and execute the HENGSHI SENSE initialization command. Refer to the example below, where the executing user is hengshi and the installation directory is /opt/hengshi.

```shell
sudo su - hengshi
cd /opt/hengshi
bin/hengshi-sense-bin init all   # Initialize HENGSHI SENSE
```

Check the prompt messages. When the status of all nodes displays [unreachable=0 failed=0], it indicates that the HENGSHI SENSE initialization was successful.

TASK [operations : metadb init] *******************************************************************************************************************************************************************************************************************************
skipping: [Node-A]
skipping: [Node-B]
skipping: [Node-C]

TASK [operations : engine init] *******************************************************************************************************************************************************************************************************************************
skipping: [Node-A]
skipping: [Node-B]
skipping: [Node-C]

PLAY RECAP ****************************************************************************************************************************************************************************************************************************************************
Node-A              : ok=1    changed=0    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-B              : ok=1    changed=0    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-C              : ok=2    changed=1    unreachable=0    failed=0    skipped=1    rescued=0    ignored=0

Start the service before entering the license.
Before entering the license, the system does not support multi-node operation. You need to start the service for one instance (e.g., Node-A) first. After updating the license, you can start the service for all instances.

```shell
cd /opt/hengshi
bin/hengshi-sense-bin start metadb
bin/hengshi-sense-bin start engine
bin/hengshi-sense-bin start minio
bin/hengshi-sense-bin start redis
# Start the hengshi service on Node-A only
ansible-playbook ansible/site.yml -i ansible/hosts --tags start-hengshi -e "target=hengshi" --limit "Node-A"
```

Refer to Software License to enter the license.

After successful authorization, start HENGSHI SENSE normally.

Switch to the executing user, navigate to the installation directory, and execute the command to start HENGSHI SENSE. Refer to the example below, where the executing user is hengshi and the installation directory is /opt/hengshi.

```shell
sudo su - hengshi
cd /opt/hengshi                          # Navigate to the installation directory
bin/hengshi-sense-bin restart hengshi    # Restart HENGSHI SENSE service
```

Check the prompt messages. When the status of all nodes displays [unreachable=0 failed=0], it indicates that HENGSHI SENSE has started successfully.

PLAY RECAP ***********************************************************************
Node-A              : ok=4    changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-B              : ok=3    changed=2    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0
Node-C              : ok=3    changed=2    unreachable=0    failed=0    skipped=3    rescued=0    ignored=0

You can now access HENGSHI SENSE in a browser at the default address http://localhost:8081. If the page cannot be reached, check whether the service port set by HS_HENGSHI_PORT in the configuration file conf/hengshi-sense-env.sh is open to external connections.

Operations After Starting the Service

When the HENGSHI SENSE service is running, it is necessary to regularly back up data to prevent data loss and promptly clean up unused logs to free up storage space.

Regular Data Backup.
It is recommended to back up the metadb database daily. The backup can be stored on local or remote devices. Schedule backups during off-peak hours, such as midnight, to avoid affecting users. The following crontab entry backs up data to a remote device daily at midnight. For detailed parameter explanations, refer to Data Backup.
```sh
0 0 * * * /opt/hengshi/bin/dbbackup.sh -m metadb -l /BACKUP/PATH -h $REMOTE_IP -r /BACKUP/PATH
```
Regular Log Cleanup.
During operation, HENGSHI SENSE generates runtime logs, which need to be cleaned up regularly to free storage space. The following crontab example cleans up the rolling logs of the internal database: the first entry runs daily at midnight, and the second runs every five minutes.
```sh
0 0 * * * /opt/hengshi/bin/clean_engine.sh -t -r -c -g -p
*/5 * * * * /opt/hengshi/bin/clean_engine.sh -l
```
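The crontab entries above have to be added to the crontab of the user that runs HENGSHI SENSE. One way to do this without creating duplicates on repeated runs is sketched below. The helper name with_cron_entry is hypothetical, and the usage assumes a cron implementation whose crontab command accepts a new table on standard input.

```shell
# Read an existing crontab on stdin and print it with the given entry appended,
# unless an identical line is already present (so re-running is harmless).
with_cron_entry() {
  wce_current=$(cat)
  [ -n "$wce_current" ] && printf '%s\n' "$wce_current"
  printf '%s\n' "$wce_current" | grep -qxF -- "$1" || printf '%s\n' "$1"
}

# Hypothetical usage as the hengshi user:
#   crontab -l 2>/dev/null \
#     | with_cron_entry '*/5 * * * * /opt/hengshi/bin/clean_engine.sh -l' \
#     | crontab -
```

Running crontab -l afterwards shows the resulting table.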
Port Exposure Guidelines in Public Network Environments.
In public network environments, do not expose the HENGSHI service ports unless necessary, to reduce the risk of attacks that exploit component vulnerabilities. In special cases, the web service port can be accessed using the IP+port format.

Stop Service

Stop the cluster service using the following command:

```shell
bin/hengshi-sense-bin stop all
```

The service is successfully stopped when the status of each node in the prompt message is [unreachable=0 failed=0].

PLAY RECAP ****************************************************************
Node-A : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-B : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0
Node-C : ok=18   changed=3    unreachable=0    failed=0    skipped=2    rescued=0    ignored=0

Check Service Running Status

Run the following command to check the service running status.

```shell
bin/hengshi-sense-bin status all
```

The output shows the running status of HENGSHI modules such as metadb, engine, minio, and redis. "IS ACTIVE" indicates the module is running, "NOT ACTIVE" indicates the module has stopped, and "skipping" means the node does not have the corresponding module installed.

TASK [operations : metadb status msg] ******************************************
ok: [10.0.5.230] => {}

MSG:

['[metadb]: IS ACTIVE']


TASK [operations : engine status msg] ******************************************
ok: [10.0.5.42] => {}
ok: [10.0.5.81] => {}
ok: [10.0.5.32] => {}

MSG:

['[engine]: IS ACTIVE']

TASK [operations : redis status msg] *******************************************
ok: [10.0.5.173] => {}

MSG:

['[redis]: IS ACTIVE']


TASK [operations : minio status msg] *******************************************
ok: [10.0.5.173] => {}

MSG:

['[minio]: IS ACTIVE']

TASK [operations : syslog status msg] ******************************************
ok: [10.0.5.42] => {}

MSG:

['[syslog]: IS ACTIVE']
ok: [10.0.5.81] => {}

MSG:

['[syslog]: IS ACTIVE']
ok: [10.0.5.32] => {}

MSG:

['[syslog]: IS ACTIVE']

TASK [operations : apmserver status msg] ***************************************
ok: [10.0.5.42] => {}

MSG:

['[apmserver]: IS ACTIVE']
ok: [10.0.5.81] => {}

MSG:

['[apmserver]: IS ACTIVE']
ok: [10.0.5.32] => {}

MSG:

['[apmserver]: IS ACTIVE']

TASK [operations : hengshi sense status msg] ***********************************
ok: [10.0.5.42] => {}

MSG:

['[hengshi]: IS ACTIVE']
ok: [10.0.5.81] => {}

MSG:

['[hengshi]: IS ACTIVE']
ok: [10.0.5.32] => {}

MSG:

['[hengshi]: IS ACTIVE']

TASK [operations : monit status msg] *******************************************
ok: [10.0.5.42] => {}

MSG:

['[monit]: IS ACTIVE']
ok: [10.0.5.81] => {}

MSG:

['[monit]: IS ACTIVE']
ok: [10.0.5.32] => {}

MSG:

['[monit]: IS ACTIVE']
ok: [10.0.5.230] => {}

MSG:

['[monit]: IS ACTIVE']

PLAY RECAP *********************************************************************
10.0.5.173                 : ok=7    changed=3    unreachable=0    failed=0    skipped=14   rescued=0    ignored=0
10.0.5.230                 : ok=5    changed=2    unreachable=0    failed=0    skipped=16   rescued=0    ignored=0
10.0.5.32                  : ok=11   changed=5    unreachable=0    failed=0    skipped=10   rescued=0    ignored=0
10.0.5.42                  : ok=11   changed=5    unreachable=0    failed=0    skipped=10   rescued=0    ignored=0
10.0.5.81                  : ok=11   changed=5    unreachable=0    failed=0    skipped=10   rescued=0    ignored=0
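Because the status output is long, it can help to extract only the modules that are not running. The following is a minimal sketch: inactive_modules is a hypothetical helper name, and it assumes that stopped modules are printed in the same form as the sample output, with "NOT ACTIVE" in place of "IS ACTIVE".

```shell
# Read the output of "bin/hengshi-sense-bin status all" on stdin and print the
# name of each module reported as NOT ACTIVE (one per line, de-duplicated).
inactive_modules() {
  grep -o '\[[a-z]*]: NOT ACTIVE' | sed 's/^\[\(.*\)]: NOT ACTIVE$/\1/' | sort -u
}

# Hypothetical usage from the installation directory:
#   bin/hengshi-sense-bin status all | inactive_modules
```

An empty result means no module was reported as stopped. The function does not say which node a stopped module belongs to, so consult the full output for that detail.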
