Connecting to Hive

Operation Steps

Please follow the steps below to connect to a Hive data source.

  1. Click "New Data Connection" in the upper right corner of the data connection page.

  2. Select the Hive data source from the list of data source types.

  3. Fill in the required parameters for the data source connection as prompted.

    Connection Configuration Fields

    Field: Description
    Name: The name of the connection. Required, and must be unique among the user's connections.
    Host Address: The address of the database. If the URL field is filled in, the value in the URL takes precedence.
    Port: The port of the database. If the URL field is filled in, the value in the URL takes precedence.
    Username: The username for the database.
    Password: The password for the database.
    Database: The name of the database.
    Schema: The schema of the database.
    Max Connections: The maximum number of connections in the connection pool.
    Prefer using database comment as dataset title: Whether to display the table comment, rather than the table name, as the dataset title.
    Hive Execution Engine: The Hive execution engine. Options are mr, tez, and spark.
    Hadoop Authentication Method: The Hadoop authentication method. Supports "simple", "kerberos", and "tbds". When "kerberos" or "tbds" is selected, the "Username" and "Password" fields above must contain the corresponding credentials from the "kerberos" or "tbds" system.
    realmA: Required when the Hadoop authentication method is kerberos.
    kdcA: Required when the Hadoop authentication method is kerberos.
    realmB: Required when the Hadoop authentication method is kerberos.
    kdcB: Required when the Hadoop authentication method is kerberos.
    server principal: Required when the Hadoop authentication method is kerberos.
    URL: The JDBC URL of the database.
    Transaction Isolation Level for Read Operations: Applies only to reading data; writing data still uses the default isolation level.
    Hierarchical loading of schema and tables: Off by default. When enabled, schemas and tables are loaded hierarchically: only schemas are loaded when connecting, and you need to click a schema to load the tables under it.
    Query Timeout (seconds): Default is 600. Increase the timeout when the data volume is large.
    Only show tables under the specified database/schema: When this option is selected and the Database field is not empty, only tables under the specified database are displayed.

  4. After filling in the parameters, click the Validate button to get the validation result. This checks the connectivity between HENGSHI SENSE and the configured data source; you cannot add the connection if validation fails.

  5. After validation passes, click Execute Preset Code to open the preset code for this data source, then click the execute button.

  6. Click the Add button to add the Hive connection.
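
The conditional requirements in the field table and the URL precedence rule can be sketched in Python. This is only an illustration under stated assumptions, not HENGSHI SENSE code: the function names `missing_fields` and `effective_endpoint` are hypothetical, the dictionary keys simply reuse the field labels from the table, and falling back to the Host Address and Port fields when the URL omits a host or port is an assumption.

```python
from urllib.parse import urlparse

# Fields the table marks as required when the authentication method is kerberos.
KERBEROS_FIELDS = ("realmA", "kdcA", "realmB", "kdcB", "server principal")


def missing_fields(config):
    """Return the required fields that are empty or absent in `config`.

    Hypothetical helper: `config` is a plain dict keyed by the field labels
    used in the table above.
    """
    required = ["Name"]
    auth = config.get("Hadoop Authentication Method", "simple")
    if auth in ("kerberos", "tbds"):
        # Username and Password must hold the kerberos/tbds credentials.
        required += ["Username", "Password"]
    if auth == "kerberos":
        required += list(KERBEROS_FIELDS)
    return [field for field in required if not config.get(field)]


def effective_endpoint(config):
    """Return (host, port), preferring the values embedded in the JDBC URL.

    Assumption: if the URL omits a host or port, the separate
    Host Address / Port fields are used instead.
    """
    host, port = config.get("Host Address"), config.get("Port")
    url = config.get("URL")
    if url:
        # Drop the "jdbc:" prefix so urlparse sees scheme://host:port/path.
        rest = url[len("jdbc:"):] if url.startswith("jdbc:") else url
        parsed = urlparse(rest)
        host = parsed.hostname or host
        port = parsed.port or port
    return host, port
```

For example, a configuration with Host Address `h1`, Port `10000`, and URL `jdbc:hive2://h2:10001/default` resolves to host `h2` and port `10001` in this sketch, because the URL is consulted first.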

Please Note

  1. Parameters marked with * are required; others are optional.
  2. You must execute the preset code when connecting to a data source; otherwise, some functions may be unavailable during data analysis. In addition, when upgrading to 4.4 from an earlier version, you need to execute the preset code for the data connections that already exist in the system.

Supported Versions

1.2.0, 1.2.1, 1.2.2, 2.0.0, 2.0.1, 2.1.0, 2.1.1, 2.2.0, 2.3.0, etc.

Data Connection Preview Support

All tables that can be listed by SHOW TABLES are supported.

SQL Support in SQL Datasets

All SELECT-related features are supported, and SELECT SQL statements must comply with the Hive syntax specification.

Unsupported Field Types

The following data types in Hive cannot be processed correctly:

  • BINARY
  • ARRAY
  • MAP
  • STRUCT
  • UNIONTYPE
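
Before building a dataset, it can help to check which columns of a table fall into the types listed above. The following is a minimal sketch, not part of HENGSHI SENSE: the function name `unsupported_columns` is hypothetical, and it assumes you already have (column name, Hive type) pairs, for example copied from the output of a DESCRIBE statement.

```python
# Type-name prefixes for the unsupported types listed above.
# Hive spells the union type as "uniontype<...>" in column definitions.
UNSUPPORTED_PREFIXES = ("binary", "array", "map", "struct", "uniontype")


def unsupported_columns(columns):
    """Return the names of columns whose Hive type is in the unsupported list.

    `columns` is an iterable of (name, hive_type) pairs, where hive_type is
    the type string as Hive reports it, e.g. "array<string>" or "bigint".
    """
    flagged = []
    for name, hive_type in columns:
        normalized = hive_type.strip().lower()
        # str.startswith accepts a tuple and matches any of its prefixes.
        if normalized.startswith(UNSUPPORTED_PREFIXES):
            flagged.append(name)
    return flagged
```

Columns flagged this way could be left out of the SELECT list of a SQL dataset, or converted to a supported type in the query if a suitable conversion exists for your data.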

User Manual for Hengshi Analysis Platform