Connecting to Cloudera Impala
Steps
Please follow the steps below to connect to the Cloudera Impala data source.

1. On the Data Connection page, click "New Data Connection" in the upper right corner.
2. In the data source types, select the Cloudera Impala data source.
3. Fill in the required parameters for the data source connection as prompted.

Connection Configuration Information Description

| Field | Description |
| --- | --- |
| Name | The name of the connection. Required, and must be unique for the user. |
| Host Address | The address of the database. If the URL field is filled, the value in the URL takes precedence. |
| Port | The port of the database. If the URL field is filled, the value in the URL takes precedence. |
| Username | The username for the database. |
| Password | The password for the database. |
| Database | The name of the database. |
| Schema | The schema of the database. |
| Max Connections | Maximum number of connections in the connection pool. |
| Encoding | Encoding settings for the database connection. |
| Prefer using database comment as dataset title | Whether to display the table name or the table comment as the title. |
| Hadoop Authentication Method | Hadoop authentication method. "simple" is simple authentication and requires no extra information; "Kerberos" requires additional information. |
| realmA | Required when the Hadoop authentication method is Kerberos. |
| kdcA | Required when the Hadoop authentication method is Kerberos. |
| realmB | Required when the Hadoop authentication method is Kerberos. |
| kdcB | Required when the Hadoop authentication method is Kerberos. |
| server principal | Required when the Hadoop authentication method is Kerberos. |
| URL | The JDBC URL of the database. |
| Additional JDBC Parameters | Additional JDBC parameters. It is recommended to write the complete URL in the URL field instead; this parameter is only appended to the auto-generated JDBC URL. |
| Hierarchical loading of schema and tables | Default is off. When enabled, schemas and tables are loaded hierarchically: only schemas are loaded during connection, and you need to click a schema to load the tables under it. |
| Query Timeout (seconds) | Default is 600. If the data volume is large, you can increase the timeout appropriately. |
| Allow Write Operations | Indicates that this connection can be selected as an output connection in Data Integration and Batch Sync. You must have write permission to the database and pass validation before configuring this parameter. |
| Support uploading files to specified path | The database in which tables generated when creating a local file dataset are stored. You must have write permission to the database and pass validation before configuring this parameter. |
| Show only tables under specified database/schema | When this option is selected and the Database field is not empty, only tables under that database are displayed. |

4. After filling in the parameters, click the Validate button to get the validation result. This checks the connectivity between HENGSHI SENSE and the configured data connection; you cannot add the connection if validation fails.
5. After validation passes, Allow Write Operations and Support uploading files to specified path are enabled and can optionally be turned on.
6. Click Execute Preset Code to pop up the preset code for this data source, then click the execute button.
7. Click the Add button to add the Cloudera Impala connection.
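The precedence between the URL field and the Host Address, Port, and Additional JDBC Parameters fields can be illustrated with a minimal sketch. The function name `effective_jdbc_url`, the `jdbc:impala://host:port/database` form of the auto-generated URL, and the semicolon-separated parameter format are assumptions made for illustration; they are not taken from HENGSHI SENSE itself.

```python
from typing import Optional


def effective_jdbc_url(
    host: str,
    port: int,
    database: str = "",
    url: Optional[str] = None,
    extra_params: str = "",
) -> str:
    """Return the JDBC URL these field values would resolve to (illustrative only)."""
    # An explicit URL takes precedence over Host Address and Port, and
    # Additional JDBC Parameters are not appended to it.
    if url:
        return url
    # Assumed shape of the auto-generated URL: jdbc:impala://host:port[/database]
    generated = f"jdbc:impala://{host}:{port}"
    if database:
        generated += f"/{database}"
    # Assumption: extra parameters are semicolon-separated key=value pairs.
    if extra_params:
        generated += ";" + extra_params.lstrip(";")
    return generated
```

For example, `effective_jdbc_url("impala.example.com", 21050, "default")` yields `jdbc:impala://impala.example.com:21050/default`, while passing a non-empty `url` returns that URL unchanged.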
Please note
- Parameters marked with * are required; others are optional.
- When connecting to a data source, you must execute the preset code. Failure to do so may result in certain functions being unavailable during data analysis. In addition, when upgrading from a version prior to 4.4 to 4.4, you need to execute the preset code for existing data connections in the system.
Supported Versions
2.5, 2.9, 2.10, 3.2 and later versions
Data Connection Preview Support
Supports all tables that can be listed by show tables.
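To see in advance which tables a preview would cover, the same statement can be run outside the product. The following is a rough sketch that assumes a Python DB-API 2.0 cursor connected to Impala (for instance one obtained from a client library such as impyla, which is an assumption and not something HENGSHI SENSE requires); the helper name `list_previewable_tables` is hypothetical.

```python
from typing import Any, List


def list_previewable_tables(cursor: Any) -> List[str]:
    """Return the table names reported by SHOW TABLES for the current database."""
    # `cursor` is assumed to follow the DB-API 2.0 interface (execute/fetchall).
    cursor.execute("SHOW TABLES")
    # Each result row is a sequence whose first element is the table name.
    return [row[0] for row in cursor.fetchall()]
```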
SQL Support in SQL Datasets
All SELECT-related features are supported. The SELECT SQL statement must comply with the Impala SQL syntax specification.
Unsupported Field Types
The following data types in Impala cannot be processed correctly:
- ARRAY
- MAP
- STRUCT
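
Columns of these types can be spotted before a table is used as a dataset. The sketch below assumes you already have (column name, type name) pairs, for example taken from the first two columns of a DESCRIBE result; the helper name `unsupported_columns` is hypothetical and not part of any product API.

```python
from typing import Iterable, List, Tuple

# Base names of the Impala complex types listed above.
UNSUPPORTED_BASE_TYPES = ("array", "map", "struct")


def unsupported_columns(columns: Iterable[Tuple[str, str]]) -> List[str]:
    """Return names of columns whose declared type is ARRAY, MAP, or STRUCT."""
    flagged = []
    for name, type_name in columns:
        # Complex types are written like array<int> or struct<a:int>;
        # compare only the part before the first '<', case-insensitively.
        base = type_name.split("<", 1)[0].strip().lower()
        if base in UNSUPPORTED_BASE_TYPES:
            flagged.append(name)
    return flagged
```

Columns flagged this way could be excluded from the SELECT list of a SQL dataset or flattened in a view first; which approach fits depends on the data.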