
Local File Dataset

A local file dataset is a dataset created by uploading local files. Supported file types are csv, xls, xlsx, and xlsm. During creation you can select rows and columns and convert rows to columns (row-column transformation).

Create Local File Dataset

Follow the steps below to create a local file dataset.

  1. In the dataset interface, click "New Dataset" and select "Local File."
  2. Upload the local file, either by dragging and dropping it or by selecting it for upload.
  3. Preview the file and select the data. You can set the table header, choose rows and columns, and apply a row-column transformation.
    • Set the table header. You can select any row as the table header. Once the header is set, everything above it is discarded and the rows below it form the dataset content.
    • Select rows and columns. Choose which rows and columns to include in the dataset.
    • Row-column transformation. Convert rows to columns and vice versa. Transformation is not supported when the file exceeds 1000 rows.
    • CSV options. For CSV files, you can also choose the column delimiter and the file encoding.
  4. Click "Next" to enter the data structure configuration page, where you can check the fields to include, set field aliases, and define field types.

    Note

    Unchecked fields are stored in the dataset as hidden fields. You can display them later by adjusting the field settings.

  5. Import the data, edit the dataset name, and select the output data source.
  6. The dataset is now created. You can view it and perform Dataset Management operations on it from the dataset page.
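
The preview operations in step 3 can be illustrated outside the product. The following is a minimal sketch in Python using only the standard library; it is not the platform's implementation, and every function name here (`read_rows`, `apply_header`, `select_columns`, `transpose`) is hypothetical. It assumes the 1000-row limit applies to the number of rows being transformed and that all rows have the same number of cells.

```python
import csv
import io

MAX_TRANSPOSE_ROWS = 1000  # limit stated in the manual; how it is enforced is an assumption


def read_rows(raw, encoding="utf-8", delimiter=","):
    """Decode CSV bytes with the chosen encoding and split on the chosen delimiter."""
    text = raw.decode(encoding)
    return list(csv.reader(io.StringIO(text), delimiter=delimiter))


def apply_header(rows, header_index):
    """Use rows[header_index] as the header; rows above it are discarded."""
    return rows[header_index], rows[header_index + 1:]


def select_columns(header, data, keep):
    """Keep only the named columns, in the order given."""
    idx = [header.index(name) for name in keep]
    return [header[i] for i in idx], [[row[i] for i in idx] for row in data]


def transpose(rows):
    """Swap rows and columns; refuse inputs above the row limit."""
    if len(rows) > MAX_TRANSPOSE_ROWS:
        raise ValueError("row-column transformation is not supported above 1000 rows")
    # zip(*rows) stops at the shortest row, so rows are assumed rectangular.
    return [list(col) for col in zip(*rows)]


# Example: a semicolon-delimited, GBK-encoded file with a title line above the header.
raw = "Sales report;2024\nregion;units;price\nNorth;10;2.5\nSouth;7;3.0\n".encode("gbk")
rows = read_rows(raw, encoding="gbk", delimiter=";")
header, data = apply_header(rows, 1)           # second line becomes the header
header, data = select_columns(header, data, ["region", "units"])
flipped = transpose([header] + data)           # columns become rows
```

In this sketch the title line above the header is dropped, only the `region` and `units` columns are kept, and the transformed result has one row per original column.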

Note

  1. If the uploaded file contains multiple sheets, all of them are displayed, and you can choose a specific sheet to preview and import. After one sheet is imported, you can go on to import the other sheets.
  2. To store the file data, you can choose the engine connection (the built-in data connection) or another data connection. Another data connection is available if it meets the conditions in item 3 and either the current app or dataset already contains datasets from that connection, or the user has permission to view it (whether the user created it or was authorized by others).
  3. Only data connections of specific types with specific settings can store uploaded file data. The currently supported types are MySQL, Apache Doris, StarRocks, SelectDB, PostgreSQL, Greenplum, Oracle, SAP HANA, SQL Server, Cloudera Impala, Amazon Redshift, Amazon Athena, Alibaba Hologres, Presto, ClickHouse, Dameng, TDSQL MySQL, TDSQL PostgreSQL, GBase 8a, GaussDB, OceanBase, and AnalyticDB MySQL. A data connection used as an output data source must have the "Support uploading files to a specified path" option selected when it is created, and its output destination must be set. This destination is shared, but the tables that hold the file data are not displayed when viewing the connection.

User Manual for Hengshi Analysis Platform