Cluster Installation
This article describes the process of installing HENGSHI SENSE in a cluster environment.
Before installation, please confirm the network environment. If the environment is isolated and cannot connect to the internet, first install the dependencies by following Offline Environment Dependency Installation, then continue with this article. If the environment has internet access, proceed directly with the installation steps in this article.
Preparation
Please complete the following preparations before the cluster installation.
Environment Preparation
Please follow these steps to prepare the environment.
- First, refer to the Installation Environment document to prepare the installation environment.
- Please ensure that the installation device meets the following conditions.
- The sudo command is installed on each device.
- A running user with passwordless ssh login is established on each device.
- The running user on each device is configured with passwordless sudo permissions.
- Each device has a different hostname.
- Firewall ports between devices are open, and the devices can reach each other on the internal network.
- Ensure that the machine currently executing the cluster installation has ansible installed.
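As an optional aid, the per-device conditions above can be spot-checked with a short script. This is an illustrative sketch only (the `check` helper and its labels are not part of HENGSHI SENSE); run it on each device and review the output.

```shell
#!/usr/bin/env bash
# Illustrative pre-flight check for the conditions above.
# It reports each condition rather than enforcing it.
check() {  # usage: check <label> <command...>
  label="$1"; shift
  if "$@" >/dev/null 2>&1; then
    echo "OK   $label"
  else
    echo "FAIL $label"
  fi
}
check "sudo command is installed"     command -v sudo
check "ansible is installed"          command -v ansible
check "hostname is not 'localhost'"   test "$(hostname)" != "localhost"
check "passwordless sudo works"       sudo -n true
```

Passwordless ssh between nodes is easiest to verify directly, e.g. `ssh -o BatchMode=yes Node-B hostname` from Node-A, which errors out instead of prompting if keys are not in place.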
If you have completed the environment setup above, skip the following steps and proceed directly to Configure Users and Installation Directory to continue the installation. If you are unsure how to satisfy the conditions above, refer to the following steps.
- Install the `sudo` command. This command needs to be executed as the root user.

  ```shell
  yum install -y sudo
  ```
- Create an execution user on each device. In this example, the user `hengshi` is used. This operation needs to be executed as the root user.

  ```shell
  useradd -m hengshi
  passwd hengshi  # Set the login password for hengshi
  ```
- Set passwordless sudo permissions for the execution user. This operation needs to be executed as the root user. Run `visudo`, add the following line, then save and exit.

  ```shell
  visudo
  ```

  ```
  hengshi ALL=(ALL) NOPASSWD: ALL
  ```
- Ensure each device has a unique hostname. If any hostnames are identical (for example, `localhost`), change them by editing the hostname file directly.

  ```shell
  sudo vim /etc/hostname
  ```
- Ensure each machine can communicate using hostnames. Edit the `/etc/hosts` file; if it contains local loopback entries such as `127.0.0.1` mapped to the hostname, delete them and restart the server.

  ```
  a.b.c.d1 Node-A
  a.b.c.d2 Node-B
  a.b.c.d3 Node-C
  ```
- Configure the running user for passwordless ssh login on each machine. Assume the cluster consists of three machines, Node-A, Node-B, and Node-C, and the user `hengshi` runs on each machine.

  ```shell
  test -e ~/.ssh/id_rsa || { yes "" | ssh-keygen -t rsa -q -P ''; }
  ssh-copy-id hengshi@localhost
  ssh-copy-id hengshi@127.0.0.1
  ssh-copy-id hengshi@Node-A
  ssh-copy-id hengshi@Node-B
  ssh-copy-id hengshi@Node-C
  ```

  - Enter the `hengshi` password as prompted.
  - Enter `yes` when prompted with "Are you sure you want to continue connecting (yes/no)?"
  - Execute the `ssh-copy-id` operation for the local machine's own name as well; for example, on Node-A, execute `ssh-copy-id hengshi@Node-A`.
- Install `ansible` on the machine from which the installation and deployment are executed.

  ```shell
  sudo yum install -y epel-release
  sudo yum install -y ansible
  ```
Configure Users and Installation Directory
Please perform the following operations under sudo or root privileges.
The example below shows how to configure the user and installation directory on a cluster, with the username set to hengshi and the installation directory set to /opt/hengshi. The three nodes are Node-A, Node-B, and Node-C. Run the following loop to apply the operations to each node.
```shell
# Create the hengshi user, then set the installation directory and its permissions
for x in Node-A Node-B Node-C; do
    ssh $x "grep hengshi /etc/passwd > /dev/null || sudo useradd -m hengshi"
    ssh $x "sudo mkdir -p /opt/hengshi && sudo chown hengshi:hengshi /opt/hengshi"
done
```
SSH Login Confirmation
If there are three machines, Node-A, Node-B, and Node-C, execute the following code to confirm login.
```shell
nodes=(Node-A Node-B Node-C)
for host in ${nodes[@]}; do
    ssh $host "for x in ${nodes[@]}; do ssh-keygen -R \$x; ssh-keyscan -H \$x >> ~/.ssh/known_hosts; done"
done
```
sshd Listening on a Port Other Than 22
The machines involved in the installation include the local machine and the machines configured in the HS_ENGINE_SEGMENTS variable. If any of them run sshd on a port other than 22, configure the actual port for each host in the deployment user's ~/.ssh/config.
The local machine needs port entries both for `localhost` and for the name returned by the `hostname` command.
For example: The hostname of the local machine is configured as localhost, and HS_ENGINE_SEGMENTS=(Node-A Node-B Node-C), with corresponding listening ports all set to 122.
The following configuration needs to be included in the .ssh/config file and synchronized to the .ssh/config on each machine.
```
Host localhost
    Port 122
Host Node-A
    Port 122
Host Node-B
    Port 122
Host Node-C
    Port 122
```
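As an optional sanity check (assuming the OpenSSH client is available), `ssh -G` prints the configuration ssh would apply to a host without actually connecting, so you can confirm the per-host port took effect:

```shell
# Print the effective ssh configuration for Node-A without connecting.
# If the config above is in place, the output includes the line "port 122".
ssh -G Node-A | grep -i '^port '
```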
Set Cluster Information
Set up cluster information on the machine where the deployment command needs to be executed.
Create the cluster configuration directory. It is recommended that this directory be at the same level as the installation package extraction directory for easy reuse of configurations during upgrades. You can refer to the example below, where the installation package extraction directory is hengshi-sense-[version].
```shell
mkdir hengshi-sense-[version]/../cluster-conf
cd hengshi-sense-[version]
cp ansible/hosts.sample ../cluster-conf/hosts
cp ansible/vars.yml.sample ../cluster-conf/vars.yml
```
Configure hosts. Follow the instructions in the example.
```
[metadb]      # Internal meta database
Node-A

#[metaslave]  # metadb slave (optional), can be used as a standby in case the master goes down
#Node-B

[engine]      # Specify one node as master
Node-A master=true
Node-B
Node-C

# Note that the number of doris-fe nodes must be odd
[doris-fe]    # IP addresses are recommended; hostname configuration may cause startup failure
Node-A master=true
Node-B
Node-C

[doris-be]    # IP addresses are recommended; hostname configuration may cause startup failure
Node-A
Node-B
Node-C

[minio]
Node-A

[gateway]
Node-A

[zookeeper]   # Ensure that zkid is unique across machines
# Note that exactly three nodes must be configured (1, 2, or 4 nodes are not allowed)
Node-A zkid=1
Node-B zkid=2
Node-C zkid=3

[redis]
Node-A

[flink]
Node-A

[hengshi]
Node-A
Node-B
Node-C
```
Configure vars.yml. Configure vars.yml according to the instructions in the example below.

```yaml
temp_work_dir_root: "/tmp"    # Temporary directory; generally does not need to be changed
install_path: "/opt/hengshi"  # Installation target directory
gateway_port: 8080
hengshi_sense_port: 8081
metadb_port: 54320
zookeeper_client_port: 2181
engine_master_port: 15432
engine_segment_base_port: 25432
```
Installation
Follow the instructions below to complete the installation process.
- Set the environment variable ANSIBLE_PLAYBOOK.

  ```shell
  export ANSIBLE_PLAYBOOK="ansible-playbook -v"
  ```

- Switch to the user executing the installation; the example user name is hengshi.

  ```shell
  sudo su - hengshi
  ```

- Navigate to the directory where the installation package was unzipped.

  ```shell
  cd ~/pkgs/hengshi-sense-[version]
  ```

- Execute the cluster installation command.

  ```shell
  ./hs_install -m cluster -c ../cluster-conf  # Execute cluster installation
  ```

  The installation process will display prompt messages. When the status of each node is [unreachable=0, failed=0], the installation is successful.

  ```shell
  PLAY RECAP ****************************************************************
  Node-A : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
  Node-B : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
  Node-C : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
  ```
System Configuration
Before starting the service, please read the Configuration File to set up the relevant configurations. If the built-in engine type requires Doris, please read the Doris Engine Configuration.
Start Service
Please follow these steps to start the service.
Initialize the OS.
During initialization, the executing user must have sudo privileges; they can be revoked after initialization is complete. Switch to the executing user, navigate to the installation directory, and execute the OS initialization command. You can refer to the following example, where the executing user is hengshi and the installation directory is /opt/hengshi.
```shell
sudo su - hengshi
cd /opt/hengshi
bin/hengshi-sense-bin init-os all  # Initialize OS
```
Check the prompt messages, and when the status of each node shows [unreachable=0,failed=0], it indicates that the OS initialization is successful.
```sh
TASK [deploy : init-os kernel] *********************************************
changed: [Node-A]
changed: [Node-B]
changed: [Node-C]
TASK [deploy : init-os deps] ***********************************************
changed: [Node-A]
changed: [Node-B]
changed: [Node-C]
PLAY RECAP *****************************************************************
Node-A : ok=5 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Node-B : ok=5 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
Node-C : ok=5 changed=3 unreachable=0 failed=0 skipped=0 rescued=0 ignored=0
```
Initialize HENGSHI SENSE.
Switch to the executing user, navigate to the installation directory, and execute the HENGSHI SENSE initialization command. You can refer to the following example, where the executing user is hengshi and the installation directory is /opt/hengshi.
```sh
sudo su - hengshi
cd /opt/hengshi
bin/hengshi-sense-bin init all  # Initialize HENGSHI SENSE
```
Check the prompt messages, and when the status of each node shows [unreachable=0 failed=0], it indicates that the HENGSHI SENSE initialization is successful.
```sh
TASK [operations : metadb init] ********************************************
skipping: [Node-A]
skipping: [Node-B]
skipping: [Node-C]
TASK [operations : engine init] ********************************************
skipping: [Node-A]
skipping: [Node-B]
skipping: [Node-C]
PLAY RECAP *****************************************************************
Node-A : ok=1 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=1 changed=0 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-C : ok=2 changed=1 unreachable=0 failed=0 skipped=1 rescued=0 ignored=0
```
Start the service before entering the license. Until the license is entered, the system does not support multi-machine operation, so first start the services on a single instance (e.g., Node-A), then start the services on all instances after updating the license.
```shell
cd /opt/hengshi
bin/hengshi-sense-bin start metadb
bin/hengshi-sense-bin start engine
bin/hengshi-sense-bin start zookeeper
bin/hengshi-sense-bin start minio
bin/hengshi-sense-bin start redis
bin/hengshi-sense-bin start flink
ansible-playbook ansible/site.yml -i ansible/hosts --tags start-hengshi -e "target=hengshi" --limit "Node-A"
```
Refer to Software License to enter the license.
After successful authorization, start HENGSHI SENSE normally.
Switch to the executing user, navigate to the installation directory, and execute the HENGSHI SENSE startup command. Please refer to the following example, where the executing user is hengshi and the installation directory is /opt/hengshi.
```sh
sudo su - hengshi
cd /opt/hengshi                        # Enter the installation target directory
bin/hengshi-sense-bin restart hengshi  # Restart the hengshi service
```
Check the prompt messages, and when the status of each node shows [unreachable=0 failed=0], it indicates that the HENGSHI SENSE startup is successful.
```sh
PLAY RECAP *****************************************************************
Node-A : ok=4 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
Node-C : ok=3 changed=2 unreachable=0 failed=0 skipped=3 rescued=0 ignored=0
```
You can access the service address through a browser to use the HENGSHI SENSE service. If you cannot access it, please check if the service port HS_HENGSHI_PORT in the configuration file conf/hengshi-sense-env.sh is open to the public.
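If the page does not load, a quick local check can tell whether the service port is listening at all. This is a generic sketch, not a HENGSHI SENSE command, and it assumes a Linux host with the iproute2 `ss` tool; 8080 below is a placeholder for your actual port value.

```shell
# List listening TCP sockets and look for the service port (8080 is a
# placeholder; substitute the HS_HENGSHI_PORT value from conf/hengshi-sense-env.sh).
ss -ltn | grep ':8080' || echo "nothing listening on 8080"
```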
Operations After Starting the Service
While the HENGSHI SENSE service is running, back up data regularly to prevent data loss, and promptly clean up unnecessary logs to free up storage space.
Schedule Data Backup.
It is recommended to back up the database metadb daily, which can be backed up to local devices or remote devices. Scheduled backups are recommended to be performed during off-peak business hours, such as early morning, to avoid affecting user service usage. The following example is the execution command to back up data to a remote device at midnight every day. For detailed parameter explanations, please refer to Data Backup.
```sh
0 0 * * * /opt/hengshi/bin/dbbackup.sh -m metadb -l /BACKUP/PATH -h $REMOTE_IP -r /BACKUP/PATH
```
Schedule Log Cleanup.
During operation, HENGSHI SENSE generates runtime logs, which need to be cleaned up regularly to free up storage space. The following example is the command to clean up rolling logs of the internal database daily.
```sh
0 0 * * * /opt/hengshi/bin/clean_engine.sh -t -r -c -g -p
*/5 * * * * /opt/hengshi/bin/clean_engine.sh -l
```
Port Opening Considerations in Public Networks.
In public network environments, avoid exposing the entire HENGSHI SENSE service port in non-essential scenarios to prevent attacks due to component issues. In special cases, the web service port can be accessed via IP+port.
Stop Service
Stop the cluster service with the following command.
```shell
bin/hengshi-sense-bin stop all
```
When the status of each node in the prompt message is [unreachable=0 failed=0], it indicates that the service has been successfully stopped.
```shell
PLAY RECAP ****************************************************************
Node-A : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-B : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
Node-C : ok=18 changed=3 unreachable=0 failed=0 skipped=2 rescued=0 ignored=0
```
Check Service Running Status
Execute the following command to view the service running status.
```shell
bin/hengshi-sense-bin status all
```
In the display information, you can view the running information of HENGSHI modules such as metadb, engine, zookeeper, gateway, minio, redis, and flink. Among them, "IS ACTIVE" indicates that the corresponding module is running, "NOT ACTIVE" indicates that the corresponding module has stopped, and "skipping" indicates that the node does not have the corresponding module installed.
```sh
TASK [operations : metadb status msg] **************************************
ok: [Node-A] => {
    "msg": [
        "[metadb]: NOT ACTIVE"
    ]
}
skipping: [Node-B]
skipping: [Node-C]
TASK [operations : engine status msg] **************************************
ok: [Node-B] => {
    "msg": [
        "[engine]: NOT ACTIVE"
    ]
}
TASK [operations : zookeeper status msg] ***********************************
ok: [Node-A] => {
    "msg": [
        "[zookeeper]: NOT ACTIVE"
    ]
}
ok: [Node-B] => {
    "msg": [
        "[zookeeper]: NOT ACTIVE"
    ]
}
ok: [Node-C] => {
    "msg": [
        "[zookeeper]: NOT ACTIVE"
    ]
}
TASK [operations : gateway status msg] *************************************
ok: [Node-A] => {
    "msg": [
        "[gateway]: NOT ACTIVE"
    ]
}
skipping: [Node-B]
skipping: [Node-C]
TASK [operations : hengshi sense status msg] *******************************
ok: [Node-A] => {
    "msg": [
        "[syslog]: NOT ACTIVE",
        "[hengshi]: NOT ACTIVE",
        "[watchdog]: NOT ACTIVE"
    ]
}
ok: [Node-B] => {
    "msg": [
        "[syslog]: NOT ACTIVE",
        "[hengshi]: NOT ACTIVE",
        "[watchdog]: NOT ACTIVE"
    ]
}
skipping: [Node-C]
TASK [operations : redis status msg] ***************************************
ok: [Node-A] => {
    "msg": [
        "[redis]: NOT ACTIVE"
    ]
}
skipping: [Node-B]
skipping: [Node-C]
TASK [operations : minio status msg] ***************************************
ok: [Node-A] => {
    "msg": [
        "[minio]: NOT ACTIVE"
    ]
}
skipping: [Node-B]
skipping: [Node-C]
TASK [operations : flink status msg] ***************************************
ok: [Node-A] => {
    "msg": [
        "[flink]: NOT ACTIVE"
    ]
}
skipping: [Node-B]
skipping: [Node-C]
```