Connecting to Cloudera Impala

Steps

Please follow the steps below to connect to the Cloudera Impala data source.

  1. On the Data Connection page, click "New Data Connection" in the upper right corner.

  2. From the list of data source types, select Cloudera Impala.

  3. Fill in the required parameters for the data source connection as prompted.

    Connection Configuration Fields

    Field | Description
    Name | The name of the connection. Required; must be unique among the user's connections.
    Host Address | The address of the database. If the URL field is filled in, the value in the URL takes precedence.
    Port | The port of the database. If the URL field is filled in, the value in the URL takes precedence.
    Username | The username for the database.
    Password | The password for the database.
    Database | The name of the database.
    Schema | The schema of the database.
    Max Connections | The maximum number of connections in the connection pool.
    Encoding | The encoding setting for the database connection.
    Prefer using database comment as dataset title | Whether the dataset title shows the table comment instead of the table name.
    Hadoop Authentication Method | The Hadoop authentication method. "simple" is simple authentication and requires no extra information; "Kerberos" requires the additional Kerberos fields.
    realmA | Required when the Hadoop authentication method is Kerberos.
    kdcA | Required when the Hadoop authentication method is Kerberos.
    realmB | Required when the Hadoop authentication method is Kerberos.
    kdcB | Required when the Hadoop authentication method is Kerberos.
    server principal | Required when the Hadoop authentication method is Kerberos.
    URL | The JDBC URL of the database.
    Additional JDBC Parameters | Additional JDBC parameters. They are appended only to the auto-generated JDBC URL, so writing the complete URL in the URL field is recommended instead.
    Hierarchical loading of schema and tables | Off by default. When enabled, schemas and tables are loaded hierarchically: only schemas are loaded when connecting, and the tables under a schema are loaded when you click that schema.
    Query Timeout (seconds) | Default is 600. If the data volume is large, increase the timeout as appropriate.
    Allow Write Operations | Allows this connection to be selected as an output connection in Data Integration and Batch Sync. You must have write permission on the database, and the connection must pass validation, before this option can be configured.
    Support uploading files to specified path | The name of the database that stores the tables generated when a local file dataset is created. You must have write permission on the database, and the connection must pass validation, before this option can be configured.
    Show only tables under specified database/schema | When this option is selected and the Database field is not empty, only tables under that database are displayed.

  4. After filling in the parameters, click the Validate button to get the validation result (this checks the connectivity between HENGSHI SENSE and the configured data connection; you cannot add the connection if validation fails).

  5. After validation passes, the "Allow Write Operations" and "Support uploading files to specified path" options become available and can be turned on as needed.

  6. Click "Execute Preset Code" to open the preset code for this data source, then click the Execute button.

  7. Click the Add button to add the Cloudera Impala connection.
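
The precedence between the URL field and the Host Address, Port, and Additional JDBC Parameters fields can be sketched as follows. This is only an illustration of the rule described in the field list, not HENGSHI SENSE's actual implementation. The function name `resolve_jdbc_url` is hypothetical, and the `jdbc:impala://host:port/database;Key=Value` layout assumes the Cloudera JDBC driver for Impala (with 21050 as Impala's usual JDBC port); the driver bundled with the product may use a different prefix.

```python
def resolve_jdbc_url(host="", port=21050, database="", url="", extra_params=""):
    """Return the JDBC URL to use, giving an explicitly filled URL precedence."""
    if url.strip():
        # A filled URL field wins: host, port, and additional parameters
        # are not applied to it.
        return url.strip()

    # Assumed Cloudera Impala JDBC syntax: jdbc:impala://host:port/database;Key=Value
    generated = f"jdbc:impala://{host}:{port}"
    if database:
        generated += f"/{database}"

    # Additional parameters are appended only to the auto-generated URL.
    extra = extra_params.strip().strip(";")
    if extra:
        generated += f";{extra}"
    return generated


# No URL given: the URL is generated and the extra parameters are appended.
print(resolve_jdbc_url(host="impala.example.com", database="sales",
                       extra_params="AuthMech=3"))

# URL given: it is returned unchanged and the other fields are ignored.
print(resolve_jdbc_url(host="ignored.example.com",
                       url="jdbc:impala://impala.example.com:21050/sales",
                       extra_params="AuthMech=3"))
```

Because the additional parameters never reach a hand-written URL, any property you need in that case has to be written into the URL field itself.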

Please note

  1. Parameters marked with * are required; others are optional.
  2. When connecting to a data source, you must execute the preset code. Otherwise, certain functions may be unavailable during data analysis. In addition, when upgrading to 4.4 from an earlier version, you must execute the preset code for the data connections that already exist in the system.

Supported Versions

Impala 2.5, 2.9, 2.10, and 3.2 or later.
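
As a rough aid, the version list can be checked programmatically before a connection is configured. The sketch below assumes that 2.5, 2.9, and 2.10 are supported as exact minor releases and that every release from 3.2 onward is supported; the helper names are hypothetical. Versions are compared as integer tuples because a numeric comparison would wrongly rank 2.10 below 2.9.

```python
# Assumed reading of the list: exact 2.x minors, plus anything from 3.2 upward.
EXACT_VERSIONS = {(2, 5), (2, 9), (2, 10)}
MIN_OPEN_ENDED = (3, 2)


def parse_version(text):
    """Turn a string such as '2.10.0-cdh5.13.0' into a (major, minor) tuple."""
    # Drop any vendor suffix after '-', then keep the first two components.
    major, minor = text.split("-")[0].split(".")[:2]
    return int(major), int(minor)


def is_supported(version_text):
    """Return True if the version matches the documented list."""
    version = parse_version(version_text)
    return version in EXACT_VERSIONS or version >= MIN_OPEN_ENDED


for candidate in ("2.9.0", "2.10.0-cdh5.13.0", "2.6.0", "3.4.0"):
    print(candidate, is_supported(candidate))
```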

Data Connection Preview Support

All tables that SHOW TABLES lists can be previewed.

SQL Support in SQL Datasets

All SELECT-related features are supported. The SELECT SQL statement must comply with the Impala SQL syntax specification.
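
If you want to catch non-SELECT statements before pasting them into a SQL dataset, a client-side pre-check along these lines may help. It is a hypothetical helper, not part of the product, and it only inspects the first keyword: a statement that passes can still be rejected by Impala. It treats a leading WITH clause as part of a SELECT statement, and its comment stripping is deliberately naive (it does not account for `--` inside string literals).

```python
import re


def looks_like_select(sql):
    """Rough pre-check: does the statement start with SELECT or WITH?"""
    # Drop /* ... */ block comments and -- line comments before inspecting.
    stripped = re.sub(r"/\*.*?\*/", " ", sql, flags=re.DOTALL)
    stripped = re.sub(r"--[^\n]*", " ", stripped)
    # Skip leading whitespace and opening parentheses, then read the first word.
    match = re.match(r"\s*\(*\s*([A-Za-z]+)", stripped)
    return bool(match) and match.group(1).upper() in {"SELECT", "WITH"}


print(looks_like_select("SELECT id, name FROM customers LIMIT 10"))
print(looks_like_select("INSERT INTO customers VALUES (1, 'a')"))
```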

Unsupported Field Types

The following data types in Impala cannot be processed correctly:

  • ARRAY
  • MAP
  • STRUCT
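
To see in advance which columns of a table fall into these types, the column list can be split by type name. The sketch below is a minimal illustration that assumes you already have `(column name, type)` pairs, for example copied from the output of Impala's DESCRIBE statement, where complex types appear as strings such as `array<string>` or `struct<city:string>`. The function name `split_columns` is hypothetical.

```python
# Type-name prefixes of the Impala complex types listed above.
COMPLEX_PREFIXES = ("array", "map", "struct")


def split_columns(columns):
    """Split (name, type) pairs into supported and unsupported column names."""
    supported, unsupported = [], []
    for name, type_name in columns:
        normalized = type_name.strip().lower()
        # str.startswith accepts a tuple and matches any of its prefixes.
        target = unsupported if normalized.startswith(COMPLEX_PREFIXES) else supported
        target.append(name)
    return supported, unsupported


columns = [
    ("id", "BIGINT"),
    ("tags", "array<string>"),
    ("attributes", "MAP<STRING,INT>"),
    ("address", "struct<city:string,zip:string>"),
    ("created_at", "timestamp"),
]
supported, unsupported = split_columns(columns)
print("supported:", supported)
print("unsupported:", unsupported)
```

One workaround for the unsupported columns is to expose the table through a view that selects only scalar columns, or that flattens the complex ones, and to connect to that view instead.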

User Manual for Hengshi Analysis Platform