Kubernetes Deployment Guide
Pre-deployment Preparation
- Obtain the k8s deployment configuration file based on the version.
Installation Version | Deployment File | Component Dependencies
---|---|---
5.1.x | k8s-yaml | metadb, engine, hengshi, minio, redis
5.3.x | k8s-yaml | metadb, engine, hengshi, minio, redis, apm-server
5.4.x | k8s-yaml | metadb, engine, hengshi, minio, redis, apm-server
- Import offline images and modify the image address.
wget https://download.hengshi.com/releases/hengshi-sense-xxx.tar.gz
docker load -i hengshi-sense-xxx.tar.gz
Note
The image address for gpdb is different and needs to be replaced separately, e.g., image: gpdb:x.x.x.
All other components should be replaced with the imported offline image tag, e.g., image: hengshi-sense:5.0-20231103-dp-427c5f.
For k8s/helm environments, images need to be pushed to the image repository used by the cluster, such as registry, harbor, Alibaba Cloud image repository, or Tencent Cloud image repository.
- Replace the $(POD_NAMESPACE) variable in gpdb.yaml with the current namespace. For example, modify it to hengshi as shown below:
sed -i 's/$(POD_NAMESPACE)/hengshi/' gpdb.yaml
- Modify the pvc settings in the following files:
  - metadb.yaml
  - gpdb.yaml
  - redis.yaml
  - minio.yaml
  In each file:
  - Change storageClassName: xxx to the storage class of the current cluster.
  - Change storage: xxx to the storage size for each service.
  - For the doris engine, modify doris.yaml.
- Create the namespace to deploy into, e.g., hengshi.
kubectl create namespace hengshi
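The manifest edits described above can be scripted. The following is a minimal sketch rather than part of the deployment package: the helper name `prepare_manifest` is hypothetical, and it assumes each manifest contains the literal placeholders `$(POD_NAMESPACE)`, `storageClassName: xxx`, and `storage: xxx` exactly as written above.

```shell
# Hypothetical helper: fill in the namespace, storage class, and PVC size in one manifest.
# Usage: prepare_manifest FILE NAMESPACE STORAGE_CLASS STORAGE_SIZE
# The values must not contain "/" or "&", because they are pasted into sed expressions.
prepare_manifest() {
    pm_file=$1 pm_ns=$2 pm_class=$3 pm_size=$4
    # Write to a temporary file first so a failed sed never truncates the manifest.
    sed -e "s/\$(POD_NAMESPACE)/${pm_ns}/g" \
        -e "s/storageClassName: xxx/storageClassName: ${pm_class}/" \
        -e "s/storage: xxx/storage: ${pm_size}/" \
        "$pm_file" > "${pm_file}.tmp" && mv "${pm_file}.tmp" "$pm_file"
}
```

For example, `prepare_manifest gpdb.yaml hengshi standard 50Gi` would target the hengshi namespace; `standard` and `50Gi` are illustrative values only and should be replaced with the storage class and size that fit the cluster.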
engine
Deploying Engine
To modify the gpdb password, changes need to be made in two places:
- gpdb.yaml
GREENPLUM_PWD: hengshi202020
GREENPLUM_QUERY_PWD: query202020
GREENPLUM_ETL_PWD: etl202020
- configmap.yaml
HS_ENGINE_PWD: hengshi202020
ENGINE_QUERY_PASSWORD: query202020
ENGINE_ETL_PASSWORD: etl202020
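Because the passwords live in two files, it is easy to change one and forget the other. The sketch below compares them before deployment. The helper names are hypothetical, and it assumes the values appear as plain `KEY: value` lines as shown above and that the keys pair up in the order listed (GREENPLUM_PWD with HS_ENGINE_PWD, and so on).

```shell
# Hypothetical helper: print the value of the first "KEY: value" line in a file.
yaml_value() {
    # $1 = file, $2 = key; surrounding quotes are stripped from the value.
    sed -n "s/^[[:space:]]*$2:[[:space:]]*//p" "$1" | head -n 1 | tr -d "\"'"
}

# Compare each gpdb.yaml password with its assumed counterpart in configmap.yaml.
# Usage: check_engine_passwords GPDB_YAML CONFIGMAP_YAML
check_engine_passwords() {
    cp_gpdb=$1 cp_conf=$2 cp_status=0
    for cp_pair in GREENPLUM_PWD:HS_ENGINE_PWD \
                   GREENPLUM_QUERY_PWD:ENGINE_QUERY_PASSWORD \
                   GREENPLUM_ETL_PWD:ENGINE_ETL_PASSWORD; do
        cp_left=${cp_pair%%:*}
        cp_right=${cp_pair##*:}
        cp_a=$(yaml_value "$cp_gpdb" "$cp_left")
        cp_b=$(yaml_value "$cp_conf" "$cp_right")
        # A missing key counts as a mismatch rather than passing silently.
        if [ -z "$cp_a" ] || [ "$cp_a" != "$cp_b" ]; then
            echo "mismatch: $cp_left vs $cp_right" >&2
            cp_status=1
        fi
    done
    return "$cp_status"
}
```

Running `check_engine_passwords gpdb.yaml configmap.yaml` returns a non-zero status and names the offending pair when the two files disagree.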
Initialize and start the engine.
kubectl -n hengshi apply -f gpdb.yaml
kubectl -n hengshi exec -it master-0 -- /entrypoint.sh -m initsystem
kubectl -n hengshi exec -it master-0 -- /entrypoint.sh -m startsystem
Tip
Doris engine yaml: doris.yaml does not require initsystem and startsystem operations.
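The initsystem and startsystem commands run inside the master-0 pod, so the pod has to exist and be running before they are issued. The following sketch polls the pod phase. The helper name `wait_running`, the retry count, and the interval are assumptions, as is the idea that a Running phase is sufficient for the entrypoint script to be invoked.

```shell
# Hypothetical helper: poll until a pod's phase is Running.
# Usage: wait_running NAMESPACE POD [TRIES] [INTERVAL_SECONDS]
wait_running() {
    wr_ns=$1 wr_pod=$2 wr_tries=${3:-60} wr_interval=${4:-5}
    while [ "$wr_tries" -gt 0 ]; do
        # A pod that does not exist yet makes kubectl fail; treat that as "not running".
        wr_phase=$(kubectl -n "$wr_ns" get pod "$wr_pod" \
            -o jsonpath='{.status.phase}' 2>/dev/null) || wr_phase=""
        [ "$wr_phase" = "Running" ] && return 0
        wr_tries=$((wr_tries - 1))
        sleep "$wr_interval"
    done
    echo "timed out waiting for pod $wr_pod" >&2
    return 1
}
```

A possible use is `wait_running hengshi master-0 && kubectl -n hengshi exec -it master-0 -- /entrypoint.sh -m initsystem`, which only starts initialization once the pod is up.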
Deploy Remaining Components
Apply the following deployment YAML files.
kubectl -n hengshi apply -f configmap.yaml
kubectl -n hengshi apply -f service.yaml
kubectl -n hengshi apply -f metadb.yaml
kubectl -n hengshi apply -f minio.yaml
kubectl -n hengshi apply -f redis.yaml
kubectl -n hengshi apply -f hengshi.yaml
kubectl -n hengshi apply -f ingress.yaml # Optional: apply only if you expose the service through ingress
Tip
configmap.yaml is the configuration file for the HENGSHI service.
service.yaml is the service file for internal cluster communication and external exposure.
ingress.yaml deployment is optional based on your needs.
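When the files are applied one by one, a failure in the middle is easy to miss. The sketch below wraps the same commands in a loop that stops at the first error. The function name `apply_components` is hypothetical, and ingress.yaml is left out because it is optional.

```shell
# Hypothetical wrapper: apply the manifests in the order listed above,
# stopping at the first failure.
# Usage: apply_components [NAMESPACE]
apply_components() {
    ac_ns=${1:-hengshi}
    for ac_file in configmap.yaml service.yaml metadb.yaml \
                   minio.yaml redis.yaml hengshi.yaml; do
        kubectl -n "$ac_ns" apply -f "$ac_file" || {
            echo "failed to apply $ac_file" >&2
            return 1
        }
    done
    # List the pods so their status can be checked at a glance.
    kubectl -n "$ac_ns" get pods
}
```

Run it from the directory that contains the YAML files, for example `apply_components hengshi`.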
Exposing hengshi Services
hengshi provides example configurations for external access. You can choose one of them as needed.
nodePort
Expose the HENGSHI service through a NodePort service. This is the default: if ingress is not configured, the service is reachable from outside the cluster on the node port assigned to it.
In the following example, that is the node port mapped to port 8080.
apiVersion: v1
kind: Service
metadata:
  name: hengshi
spec:
  selector:
    hsapp: hengshi-sense
    hsrole: hengshi
  ports:
    - protocol: TCP
      name: "8080"
      port: 8080
      targetPort: 8080
    - protocol: TCP
      name: "5005"
      port: 5005
      targetPort: 5005
  type: NodePort
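The manifest above does not pin a nodePort value, so Kubernetes assigns one when the service is created. The sketch below looks up the assigned port with a kubectl jsonpath query and prints a URL. The function name `hengshi_url`, the `http` scheme, and the example node IP are assumptions.

```shell
# Hypothetical helper: print the URL for the node port that maps to service port 8080.
# Usage: hengshi_url NODE_IP [NAMESPACE] [SERVICE]
hengshi_url() {
    hu_ip=$1 hu_ns=${2:-hengshi} hu_svc=${3:-hengshi}
    hu_port=$(kubectl -n "$hu_ns" get service "$hu_svc" \
        -o jsonpath='{.spec.ports[?(@.port==8080)].nodePort}') || return 1
    [ -n "$hu_port" ] || { echo "no nodePort found for port 8080" >&2; return 1; }
    echo "http://${hu_ip}:${hu_port}"
}
```

For example, `hengshi_url 10.0.0.5` prints the address to open in a browser, where 10.0.0.5 stands in for the IP of any cluster node.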
ingress
Expose the HENGSHI service externally using the ingress method (optional).
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: hengshi-sense
  namespace: hengshi-sense
  annotations:
    ingress.kubernetes.io/force-ssl-redirect: "false"
    nginx.ingress.kubernetes.io/proxy-connect-timeout: "90"
    nginx.ingress.kubernetes.io/proxy-send-timeout: "90"
    nginx.ingress.kubernetes.io/proxy-read-timeout: "90"
spec:
  ingressClassName: nginx
  rules:
    - host: xxxx.hengshi.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: hengshi-sense
                port:
                  number: 8080
Tip
ingressClassName: <Please modify to the ingressClass of the current cluster>
host: <Domain Name>
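The two fields named in the tip can be filled in without opening an editor. This is a minimal sketch: the function name `set_ingress` is hypothetical, and it assumes the manifest contains one `ingressClassName:` line and one `- host:` line, as in the example above.

```shell
# Hypothetical helper: set ingressClassName and host in an ingress manifest.
# Usage: set_ingress FILE INGRESS_CLASS HOST
# The values must not contain "/" or "&", because they are pasted into sed expressions.
set_ingress() {
    si_file=$1 si_class=$2 si_host=$3
    # Keep the original indentation (captured as \1) and replace only the value.
    sed -e "s/^\([[:space:]]*ingressClassName:\).*/\1 ${si_class}/" \
        -e "s/^\([[:space:]]*- host:\).*/\1 ${si_host}/" \
        "$si_file" > "${si_file}.tmp" && mv "${si_file}.tmp" "$si_file"
}
```

The classes available in the current cluster can be listed with `kubectl get ingressclass`; a call such as `set_ingress ingress.yaml nginx bi.example.com` then writes the chosen class and an example domain into the file.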
Basic Maintenance Operations
Safely Stop Database Services
Run the following commands to stop metadb and engine.
kubectl -n hengshi exec -it metadb-0 -- /docker-entrypoint.sh stop metadb single
kubectl -n hengshi exec -it master-0 -- /entrypoint.sh -m stopsystem
Restart engine
Run the following command to restart the engine.
kubectl -n hengshi exec -it master-0 -- /entrypoint.sh gpstop -r
Log Cleanup
During operation, HENGSHI SENSE generates runtime logs, which must be cleaned regularly to free up storage space. The following example schedules cleanup of the rolling logs of the internal database.
kubectl -n hengshi exec -it master-0 -- /bin/bash
crontab -e # Write the following scheduled statements into the file, save and exit
0 0 * * * /opt/hengshi/bin/clean_engine.sh -t -r -c -g -p
*/5 * * * * /opt/hengshi/bin/clean_engine.sh -l
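Editing the crontab interactively is hard to repeat across environments. The sketch below adds the same two entries non-interactively and skips any entry that is already present, so it can be run more than once. The function name `with_cleanup_jobs` is hypothetical, and the usage line assumes the crontab in the pod accepts a new table on standard input (`crontab -`).

```shell
# Hypothetical helper: read an existing crontab on stdin and print it with the
# two cleanup entries appended, skipping entries that are already there.
with_cleanup_jobs() {
    wc_current=$(cat)
    [ -n "$wc_current" ] && printf '%s\n' "$wc_current"
    for wc_job in \
        '0 0 * * * /opt/hengshi/bin/clean_engine.sh -t -r -c -g -p' \
        '*/5 * * * * /opt/hengshi/bin/clean_engine.sh -l'; do
        # -x matches whole lines only, -F treats the entry as a fixed string.
        printf '%s\n' "$wc_current" | grep -qxF -e "$wc_job" || printf '%s\n' "$wc_job"
    done
    return 0
}
```

Inside the master-0 pod, it could be used as `crontab -l 2>/dev/null | with_cleanup_jobs | crontab -`.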
Scale Out Engine
- Modify StatefulSet/segment
kubectl -n hengshi edit StatefulSet/segment
- In the greenplum ConfigMap, set the SEGMENTS field to the appname of every segment after scaling out (e.g., four names when scaling from 2 to 4 segments).
- Set the replicas field of StatefulSet/segment to the total number of segments after scaling out.
apiVersion: v1
kind: ConfigMap
metadata:
  name: greenplum
data:
  MASTER: "master-0"
  SEGMENTS: | # List of 4 segments
    segment-0
    segment-1
    segment-2
    segment-3
...
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: segment
spec:
  replicas: 4 # For example, scaling out to 4 segments
- Then run the following command:
kubectl -n hengshi apply -f gpdb.yaml
- Wait until the status of all new and existing segment pods changes to Running.
- Create new_host_file, which lists only the new segments. For example, when scaling from 2 segments (0, 1) to 4 segments (0, 1, 2, 3), it lists segment-2 and segment-3.
kubectl -n hengshi exec -it master-0 -- /bin/bash
cd /opt/hsdata/ && mkdir expand && cd expand
cat <<EOF > new_host_file
segment-2
segment-3
EOF
- Execute the scale-out operation.
kubectl -n hengshi exec -it master-0 -- /bin/bash
cd /opt/hsdata/expand
psql postgres -c "create database expand"
gpexpand -f new_host_file -D expand
>y
>0 # This will generate a gpexpand_inputfile_yyyymmdd_xxxxxx file
gpexpand -i gpexpand_inputfile_yyyymmdd_xxxxxx -D expand
If the scale-out operation fails, you can refer to the following commands to roll back the engine.
kubectl -n hengshi exec -it master-0 -- /bin/bash
cd /opt/hsdata/expand
gpstart -aR
gpexpand -r -D expand
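For larger scale-outs, typing the contents of new_host_file by hand is error-prone. The sketch below generates the list from the old and new segment counts. The function name `new_segments` is hypothetical, and it assumes the segment pods are named segment-N as in the example above.

```shell
# Hypothetical helper: print the names of the segments added when scaling
# from OLD_COUNT to NEW_COUNT, one per line.
# Usage: new_segments OLD_COUNT NEW_COUNT
new_segments() {
    ns_i=$1 ns_new=$2
    # Existing segments are numbered 0..OLD_COUNT-1, so new ones start at OLD_COUNT.
    while [ "$ns_i" -lt "$ns_new" ]; do
        echo "segment-$ns_i"
        ns_i=$((ns_i + 1))
    done
}
```

For the scale-out from 2 to 4 segments described above, `new_segments 2 4 > new_host_file` writes segment-2 and segment-3 to the file.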
engine Data Migration
- Export old engine data
# dump db data
kubectl -n hengshi exec -it <OLD_ENGINE_POD_NAME> -- bash
source $HS_HOME/engine-cluster
pg_dumpall > /opt/hsdata/engine.back.sql
exit
- Copy data to the new machine
# cp db data
# Copy the old engine data to the local machine
kubectl -n hengshi cp <OLD_ENGINE_POD_NAME>:/opt/hsdata/engine.back.sql engine.back.sql
# Copy the data to the new engine environment
kubectl -n hengshi cp engine.back.sql master-0:/opt/hsdata/engine.back.sql
- Import data into the new environment
# load db data
kubectl -n hengshi exec -it master-0 -- bash
source $HS_HOME/engine-cluster
psql postgres < /opt/hsdata/engine.back.sql
rm /opt/hsdata/engine.back.sql
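Before importing, it can be worth confirming that the dump survived the two copy steps intact. The sketch below compares a checksum computed inside a pod with one computed on the local copy. The function name `same_dump` is hypothetical, and it assumes `sh` and `cksum` are available in the engine image.

```shell
# Hypothetical helper: succeed only if the dump inside a pod matches the local copy.
# Usage: same_dump POD [LOCAL_FILE] [NAMESPACE]
same_dump() {
    sd_pod=$1 sd_local=${2:-engine.back.sql} sd_ns=${3:-hengshi}
    # "cksum < file" prints only the CRC and the byte count, so the two outputs
    # can be compared directly even though the file paths differ.
    sd_remote=$(kubectl -n "$sd_ns" exec "$sd_pod" -- \
        sh -c 'cksum < /opt/hsdata/engine.back.sql') || return 1
    [ "$sd_remote" = "$(cksum < "$sd_local")" ]
}
```

It could be run once after copying the dump to the local machine (against the old engine pod) and once after copying it to master-0, importing only if both checks succeed.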
Deploy Single-Node Version (POC)
- Modify the configuration file for single-node setup
Ensure that configmap.yaml, hengshi.yaml, and other configuration files are in the same directory as config_to_single.sh before execution.
./config_to_single.sh
- Deploy the engine
Refer to the "Deploying Engine" section above.
- Deploy other components
Apply the following deployment YAML files.
kubectl -n hengshi apply -f configmap.yaml
kubectl -n hengshi apply -f service.yaml
kubectl -n hengshi apply -f metadb.yaml
kubectl -n hengshi apply -f minio.yaml
kubectl -n hengshi apply -f redis.yaml
kubectl -n hengshi apply -f hengshi.yaml
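Since config_to_single.sh expects the configuration files next to it, a quick pre-flight check can catch a wrong working directory before anything is modified. This is a minimal sketch: the function name `check_single_node_files` is hypothetical, and it checks only the files named explicitly above, not the unspecified "other configuration files".

```shell
# Hypothetical pre-flight check: report files missing next to config_to_single.sh.
# Usage: check_single_node_files [DIRECTORY]
check_single_node_files() {
    cs_dir=${1:-.} cs_status=0
    for cs_file in config_to_single.sh configmap.yaml hengshi.yaml; do
        if [ ! -f "$cs_dir/$cs_file" ]; then
            echo "missing: $cs_dir/$cs_file" >&2
            cs_status=1
        fi
    done
    return "$cs_status"
}
```

A possible use is `check_single_node_files . && ./config_to_single.sh`, which runs the conversion only when the named files are present.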