Data Agent
Overview
In HENGSHI SENSE, the Data Agent leverages the capabilities of large models to help users make full use of their data. Through a conversational interaction experience, it assists users with tasks ranging from instant analysis of business data to metric creation and dashboard generation. We will continue to integrate Agent capabilities into the product to enhance the efficiency of data analysts and data managers while simplifying workflows and complex tasks.
Installation and Configuration
Prerequisites
Please ensure the following steps are completed to make the Data Agent available:
- Installation and Startup: Complete the installation of the HENGSHI service by following the Installation and Startup Guide.
- AI Deployment: Complete the installation and deployment of related services by following the AI Deployment Documentation.
Configure Large Model
After the HENGSHI service starts, go to the "Feature Configuration" page in System Settings to configure the Data Agent, including the large model's address, key, and related settings.

User Guide
Before using the Data Agent, some data preparation is required to ensure that the Data Agent understands the unique business context, prioritizes accurate information, and provides consistent, reliable, and goal-aligned responses.
Preparing data for the Data Agent lays the foundation for a high-quality, practical, and context-aware experience. If the data is disorganized or ambiguous, the Data Agent may struggle to understand it accurately, producing responses that are superficial, off-target, or even misleading.
By investing effort in data preparation, the Data Agent can fully grasp the business context, accurately extract key information, and deliver responses that are not only stable and reliable but also highly aligned with your goals, maximizing the effectiveness of the Data Agent.
Note
AI behavior is inherently unpredictable. Even with the same input, AI does not always generate identical responses.
Writing Prompts for AI
Industry Terminology and Private Domain Knowledge
To enable the large model to perform at its best, the Data Agent Console under System Settings provides a prompt configuration feature. In the User/System Prompt, you can use natural language to give the large model information such as your company's industry background, business logic, analytical direction, and specific instructions. The Data Agent uses these directives to learn your organization's internal language habits, professional terminology, and analytical focus, so it can accurately interpret the specialized terms and analytical expectations in your field and improve response quality and relevance.
Prompts can help Data Agent respond based on your industry, strategic goals, terminology, or operational logic, ensuring users receive more accurate and relevant data analysis. For example:
- "Big Promotion" refers to the period from October 11 to November 11 each year.
- When users mention product-related questions, please retrieve both the product name and product ID.

Dataset Analysis Rules
In the Knowledge Management of the dataset, you can use natural language to describe in detail the purpose of the dataset, implicit rules (such as filter conditions), synonyms, and the corresponding fields and metrics for specific business terms, guiding the Data Agent on how to perform certain types of analysis. For example:
- "Small orders" refer to orders where the total quantity under the same order number is less than or equal to 2.
- "Fiscal year" refers to the period starting from December 1 of the previous year to November 30 of the current year. For example, the 2025 fiscal year refers to 2024/12/1 to 2025/11/30, and the 2024 fiscal year refers to 2023/12/1 to 2024/11/30.
- When asked about AAA, also list metrics such as BBB, CCC, and DDD.

Note
Understanding the best practices of prompt engineering is crucial. AI can be sensitive to the prompts it receives, and how a prompt is constructed affects the AI's understanding and output. Good prompts share the following characteristics (see the example after this list):
- Clear and specific
- Use analogies and descriptive language
- Avoid ambiguity
- Use markdown to write in a structured and thematic manner
- Break down complex instructions into simple steps whenever possible
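For instance, a structured markdown prompt might look like the following sketch; the "Big Promotion" and product-ID rules come from the examples above, while the other lines are hypothetical placeholders:

```markdown
## Business Background
- We are an e-commerce company; the default analysis currency is USD (hypothetical example).

## Terminology
- "Big Promotion" refers to the period from October 11 to November 11 each year.

## Analysis Guidelines
- When users mention product-related questions, retrieve both the product name and product ID.
- Unless otherwise specified, default the time range to the last 30 days (hypothetical example).
```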
Connect Agent with Enterprise Knowledge Base, External APIs, etc.
Data Agent extends its capabilities in a modular way through "Tools." Any target system that exposes an accessible HTTP/HTTPS interface (REST, GraphQL, HTTP-based RPC, etc.) can be integrated: internal enterprise knowledge bases, full-text/vector search, internal microservices, third-party SaaS/platform APIs, search engines, RPA services, and more. The Agent can therefore not only understand and respond but also invoke your internal and external systems in real time to retrieve or write data, continuously extending its capability boundary.
Below is an example of integrating Tavily Web Search by registering a search-type Tool in "System Settings → Global JS":
```javascript
// Ensure the Agent runtime environment has injected createTool
if (typeof window.heisenberg?.createTool === 'function') {
  const apiKey = '<tavily_api_key>'; // It is recommended to use proxy service management instead of hardcoding
  window.heisenberg.createTool({
    // Tool name: English + underscores for reliable LLM reference
    name: 'web_search',
    // Tool description: helps the model determine when to invoke the tool
    description: 'Use Tavily to search the latest internet information and return answers with sources',
    // Parameters: fully extensible as needed — any control options you want the LLM to generate/pass can be included
    parameters: window.heisenberg.zod.object({
      plan: window.heisenberg.zod.string().describe('The purpose and key information extraction for invoking this tool'),
      query: window.heisenberg.zod.string().describe('Search content/query keywords'),
      max_results: window.heisenberg.zod.number().min(1).max(20).default(5).describe('Number of results needed'),
      // Additional parameters like language, time_range, site, freshness, etc., can be added
    }),
    // Execution logic: integration is possible as long as HTTP requests can be made (fetch/axios/custom SDKs are all viable)
    execute: async (args) => {
      try {
        const { query, max_results } = args;
        const resp = await fetch('https://api.tavily.com/search', {
          method: 'POST',
          headers: {
            Authorization: `Bearer ${apiKey}`,
            'Content-Type': 'application/json',
          },
          body: JSON.stringify({
            query,
            max_results,
            search_depth: 'basic',
            include_answer: true,
            include_raw_content: true,
            include_images: false,
            time_range: null,
          }),
        });
        const data = await resp.json().catch(() => ({}));
        if (!resp.ok) {
          return `Search failed: ${data?.error || resp.status} ${resp.statusText}`;
        }
        return `Search results:\n\n${data.answer || '(No direct answer)'}\n\nSources:\n${JSON.stringify(data.results || [], null, 2)}`;
      } catch (err) {
        return `Error occurred during search: ${err?.message || String(err)}`;
      }
    },
  });
}
```

After saving and refreshing, the Data Agent can search the internet through this Tool. Similarly, as long as the target capability can be accessed via HTTP requests, it can be encapsulated into a Tool and dynamically invoked by the Agent, enabling rapid expansion to internal enterprise systems or any third-party services.

General integration recommendations:
- Identity and Security: Prefer managing credentials through a proxy service to avoid exposing tokens in plain text in global JS; a minimal sketch follows this list.
- Response Format: Extract the raw response into concise text or structured JSON, then concatenate it into a string to reduce irrelevant noise.
- Minimal Permissions: Implement authentication, IP/rate limiting, and access auditing on the server side to prevent abuse.
- Parameter Design: Expose only necessary and controllable parameters to the LLM to reduce unauthorized or invalid calls.
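As a sketch of the proxy recommendation above: instead of embedding the key, the Tool can call an internal relay that holds the real token server-side. The `/internal/tavily-proxy/search` endpoint and its payload shape are hypothetical, not a built-in HENGSHI service:

```javascript
// Hypothetical sketch: route the search through an internal proxy so the
// real Tavily key never appears in global JS. The proxy URL and payload
// shape below are assumptions, not a built-in HENGSHI endpoint.
if (typeof window.heisenberg?.createTool === 'function') {
  window.heisenberg.createTool({
    name: 'web_search_via_proxy',
    description: 'Search the internet through the company proxy service',
    parameters: window.heisenberg.zod.object({
      query: window.heisenberg.zod.string().describe('Search content/query keywords'),
    }),
    execute: async ({ query }) => {
      // The proxy attaches the API key, enforces rate limits, and audits access.
      const resp = await fetch('/internal/tavily-proxy/search', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        credentials: 'include', // reuse the user's existing session for authentication
        body: JSON.stringify({ query }),
      });
      if (!resp.ok) return `Search failed: ${resp.status} ${resp.statusText}`;
      return JSON.stringify(await resp.json(), null, 2);
    },
  });
}
```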
Tips
- You can register multiple Tools (search, knowledge base retrieval, report generation, process triggering, ticket creation, etc.), and the Agent will autonomously select and combine them in the reasoning chain, forming a continuously evolving "capability surface."
- Tools extended through global JS are currently only available on the web version and cannot be used in API Agent mode.
Preparing Data for AI
Data Vectorization
To efficiently and accurately locate the most relevant information within massive data assets, it is recommended to perform "vectorization" on the data. Vectorization converts text information such as field/atomic metric names, descriptions, and field values into computable semantic vectors, which are then written into a vector database. This enables the Data Agent to perform semantic-based retrieval and recall, rather than relying solely on keyword matching.
Benefits of Vectorization:
- Higher Relevance: Understands synonyms, industry terms, and context, reducing missed and incorrect matches.
- Faster Response: Narrows the search scope, reducing the context-filling cost of large models.
- Greater Scalability: Supports semantic associations and knowledge linking across datasets, adapting to multi-language scenarios.
- Continuous Optimization: Enhances Q&A quality over time by combining human review results with "intelligent learning" tasks.
Steps to Operate:
- Navigate to the target dataset page and click "Vectorize" in the action bar.
- Check the progress in System Settings - Task Management - Execution Plan, and enable scheduled tasks as needed to improve recall stability and coverage.

Data Management
Effective data management is the foundation for Data Agent to correctly understand business semantics and metric definitions. By standardizing naming conventions, completing field/metric descriptions, setting appropriate data types, and hiding or cleaning irrelevant objects, you can significantly improve query relevance and response speed, reduce large model context costs, and minimize misunderstandings. It is recommended to perform self-checks according to the following checklist before publishing datasets and during routine maintenance. Using this in conjunction with "Data Vectorization" and "Intelligent Learning" will yield even better results.
- Dataset Naming: Ensure dataset names are concise and clearly reflect their purpose.
- Field Management: Ensure field names are concise and descriptive, avoiding special characters. Explain each field's purpose in detail in the Field Description, for example, "Use this field as the timeline by default." Field types should also match their intended use: fields that need to be summed should use numeric types, date fields should use date types, and so on.
- Metric Management: Ensure atomic metric names are concise and descriptive, avoiding special characters. Provide detailed explanations of metric purposes in the Atomic Metric Description.
- Field Hiding: For fields not involved in queries, it is recommended to hide them to reduce the number of tokens sent to the large model, improve response speed, and lower costs.
- Distinguishing Fields and Metrics: Ensure field names and metric names are not similar to avoid confusion. Fields not required for answering questions should be hidden, and unnecessary metrics should be deleted.
- Intelligent Learning: It is recommended to trigger the "Intelligent Learning" task to convert general examples into dataset-specific examples. After execution, manually review the learning results and perform additions, deletions, or modifications to enhance the assistant's capabilities.
Enhancing Understanding of Complex Calculations
Predefine reusable business metrics on the data side and expose them in the form of Metrics to achieve higher accuracy, stability, and interpretability in query scenarios.
Practical Recommendations:
- Provide unified definitions for industry-specific metrics (e.g., financial risk control, advertising campaigns, e-commerce conversion) and maintain synonym mappings in the Knowledge Management of the dataset.
- Establish mappings between "business terms → metrics" for easily confused concepts (e.g., "conversion rate," "ROI," "repurchase rate") to avoid free-form field combinations in models.
- Prioritize using "metrics" to carry definitions rather than temporary calculation expressions in single conversations; for critical metrics, establish versioning and change logs to prevent definition drift.
Example (ROI):
- Advertising/E-commerce: ROI = GMV ÷ Advertising Cost. Clearly specify in the metric description whether coupons are included, whether refunds and shipping costs are deducted, whether platform service fees are included, the statistical basis ("payment time/order time"), and the time window (e.g., day/week/month).
- Manufacturing/Projects: ROI = (Revenue − Cost) ÷ Cost, with the window being the full project cycle or financial period.
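A quick worked example of the two definitions (the figures are invented for illustration):

```javascript
// Illustrative figures only.
// Advertising/E-commerce: ROI = GMV ÷ Advertising Cost
const adRoi = 500000 / 100000;                  // 5.0 — each unit of ad spend returns 5 units of GMV
// Manufacturing/Projects: ROI = (Revenue − Cost) ÷ Cost
const projectRoi = (1200000 - 800000) / 800000; // 0.5 — a 50% return over the project cycle
```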
Usage Scenarios
The agent mode of Data Agent has the following characteristics:
- No restrictions on conversation sources
Data Agent will autonomously determine user intent based on the input content, decompose user needs, and perform mixed searches within the user's authorized data scope across the dataset marketplace, app marketplace, and app creation. It will then analyze and query data from the target data sources to provide answers.
- Complex problem decomposition
Data Agent supports not only routine data queries but also multiple questions entered at once, especially when there are reasoning relationships between them. Data Agent will perform one or more data queries depending on the complexity of the request.
- Context awareness
Data Agent can read login account information and seamlessly understand indicative pronouns in user input (e.g., "my department" and other user attributes). Additionally, it can read the information of the page the user is currently browsing. When the user is on specific pages such as a data package, dataset, or dashboard, the Agent will directly interact based on the current page's information when handling data queries and other needs.
With these capabilities, Data Agent can transform into multiple roles such as a visual creation assistant, metric creation assistant, or analyst assistant.
Intelligent Query
After the Data Agent upgrade, Intelligent Query is no longer limited to a specific data range, and you no longer need to manually select a range before querying. The agent's tasks may involve finding content, performing ad-hoc analysis, or providing insights.

Visualization Creation
Data Agent allows users to start creating a dashboard from scratch on the dashboard list page or directly edit an existing dashboard, covering chart creation, filter addition, rich-text data reports, dashboard layout adjustments, color changes, and batch operations on controls.

Intelligent Interpretation
To help business users perform data analysis, regular reviews, and data interpretation with Data Agent, we have added the "Intelligent Interpretation" configuration and shortcut button. On the dashboard page, an "Intelligent Interpretation" button appears in Data Agent. Clicking it makes Data Agent follow the pre-configured interpretation logic to perform real-time data queries, anomaly detection, decomposition, and drill-down, ultimately producing an interpretation report.

In dashboard editing mode, click the dropdown menu in the upper-right corner to open the "Intelligent Interpretation Configuration." Here you can configure fixed interpretation logic based on your business needs, or click a button to let AI analyze the dashboard structure and data and generate an interpretation logic template. Each chart on the dashboard also supports its own interpretation logic configuration; click the "Intelligent Interpretation" button in the upper-right corner of the chart control to invoke Data Agent and send the interpretation command.

The Intelligent Interpretation feature leverages artificial intelligence technology to perform automated analysis on user-specified data ranges. Its core capabilities and boundaries are as follows:
- Data Query and Extraction: Quickly locate and extract relevant information from data sources based on user instructions or built-in analysis logic.
- Data Summarization and Induction: Integrate, summarize, and condense query results across multiple dimensions to reveal key facts, patterns, and current states in the data.
- Generate Descriptive Reports: Output analysis results in the form of structured reports or concise text summaries, helping users understand "what happened in the past" and "what the current situation is."
Note that Intelligent Interpretation does not perform predictive inference. This feature strictly analyzes existing and historical data, and its output is a description and summary of established facts. It cannot predict future data trends, business outcomes, or any probabilistic events that have not yet occurred.
Note
Complex reports and complex tables are not supported by Intelligent Interpretation.
Expression Writing
Data Agent, based on its understanding of HQL, can assist users in writing complex expressions and creating metrics.

Integrating ChatBI
HENGSHI SENSE offers multiple integration methods, allowing you to choose the one that best suits your needs:
IFRAME Integration
Use an iframe to integrate ChatBI into your existing system, enabling seamless connection with the HENGSHI SENSE BI PaaS platform. The iframe approach is simple and easy to use: you directly reuse HENGSHI ChatBI's conversation components, styles, and features without additional development in your system.
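A minimal embedding sketch (the `src` path below is a placeholder; use the ChatBI page URL of your own HENGSHI SENSE deployment):

```javascript
// Minimal iframe embedding sketch. The src path is a placeholder; substitute
// the ChatBI page URL of your own HENGSHI SENSE deployment.
const frame = document.createElement('iframe');
frame.src = 'https://your-hengshi-host/chatbi'; // placeholder URL
frame.style.width = '100%';
frame.style.height = '640px';
frame.style.border = 'none';
document.getElementById('chatbi-container').appendChild(frame);
```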
SDK Integration
By integrating ChatBI into your existing system through the SDK, you can implement more complex business logic and achieve finer control, such as customizing the UI. The SDK provides a wealth of configuration options to meet personalized needs. Depending on your development team's tech stack, choose the appropriate SDK integration method. We offer two JavaScript SDKs: Native JS SDK and React JS SDK.
How to choose which SDK to use?
The difference is that Native JS is pure JavaScript and does not depend on any framework, while React JS is built on the React framework and requires React to be installed first.
The Native JS SDK provides UI, functionality, and integration similar to iframe. It directly uses the HENGSHI ChatBI conversation components, styles, and features, but allows for custom API requests, request interception, etc., through JavaScript control and SDK initialization parameters.
The React JS SDK, on the other hand, only provides the Completion UI component and the useProvider hook, making it suitable for use in your own React projects.
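As a rough sketch of how the React JS SDK pieces fit together — only the `Completion` component and `useProvider` hook are named in this document; the package name and option names below are assumptions, so consult the SDK reference for the actual signatures:

```javascript
// Hypothetical sketch: the package name and props are assumptions; only the
// Completion component and useProvider hook are documented here.
import React from 'react';
import { Completion, useProvider } from 'hengshi-chatbi-react'; // package name assumed

function ChatPanel() {
  // useProvider presumably wires the component to your HENGSHI instance;
  // the option name below is an illustrative placeholder.
  const provider = useProvider({ baseUrl: 'https://your-hengshi-host' });
  return <Completion provider={provider} />;
}

export default ChatPanel;
```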
API Integration
Integrate ChatBI capabilities into your Feishu, DingTalk, WeCom, or Dify workflow through the Backend API to achieve customized business logic. For the Dify workflow tool, refer to the attachment HENGSHI AI Workflow Tool v1.0.1.zip.
Enterprise Instant Messaging Tool Data Q&A Bot
With the Enterprise Instant Messaging Tool Data Q&A Bot, you can create an intelligent data Q&A bot that links to relevant data in HENGSHI ChatBI, enabling intelligent data Q&A inside instant messaging tools. Currently supported enterprise instant messaging tools include WeCom, Lark, and DingTalk.
FAQ
How to Troubleshoot Model Connection Failure?
There are various reasons for connection failure. It is recommended to troubleshoot by following these steps:
Check Request Address
Ensure the model address is correct, as different vendors provide different model addresses. Please refer to the documentation provided by the vendor you purchased from.
We can provide preliminary troubleshooting guidance:
- Model addresses from various vendors typically end with `<host>/chat/completions` rather than just the domain name, for example `https://api.openai.com/v1/chat/completions`.
- If your model vendor is Azure OpenAI, the model address has the structure `https://<your-tenant>.openai.azure.com/openai/deployments/<your-model>/chat/completions`, where `<your-tenant>` is your tenant name and `<your-model>` is your model name. You need to log in to the Azure OpenAI platform to check. For more detailed steps, please refer to Connect to Azure OpenAI.
- If your model vendor is Tongyi Qianwen, there are two types of model addresses: one compatible with the OpenAI format, `https://dashscope.aliyuncs.com/compatible-mode/v1/chat/completions`, and another specific to Tongyi Qianwen, `https://dashscope.aliyuncs.com/api/v1/services/aigc/text-generation/generation`. When using the OpenAI-compatible format (indicated by `compatible-mode` in the URL), select `OpenAI` or `OpenAI-API-compatible` as the provider in the HENGSHI Intelligent Query Assistant model configuration.
- If your model is privately deployed, ensure the model address is correct, the model service is running, and the model exposes an HTTP interface compatible with the OpenAI API.
Check the Key
- Large model interfaces provided by various model vendors usually require a key for access. Please ensure that the key you provide is correct and has permission to access the model.
- If your company has deployed its own model, a key may not be required. Please confirm this with your company's developers or engineering team.
Check Model Name
- Most model providers generally offer multiple models. Please select the appropriate model based on your needs and ensure that the model name you provide is correct and that you have access to the model.
- If your company uses a self-deployed model, the model name may not be required. Please confirm this with your company's developers or engineering team.
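If the cause is still unclear after the checks above, a minimal request sent directly to the endpoint can separate address, key, and model-name problems. A sketch assuming an OpenAI-compatible endpoint; substitute your own address, key, and model name:

```javascript
// Minimal connectivity check for an OpenAI-compatible endpoint; run it in the
// browser console or Node 18+ (both support fetch and top-level await).
// Replace the address, key, and model name with your own configured values.
const resp = await fetch('https://api.openai.com/v1/chat/completions', {
  method: 'POST',
  headers: {
    Authorization: 'Bearer <your_api_key>',
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: '<your_model_name>',
    messages: [{ role: 'user', content: 'ping' }],
  }),
});
// 401/403 usually indicates a key problem; 404 usually indicates a wrong
// address or model name; a 200 response means the three settings are consistent.
console.log(resp.status, await resp.text());
```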
How to Troubleshoot Query Failures and Errors?
Diagnosing failures and errors involves multiple aspects. When you encounter an issue, collect the following information and contact a support engineer:
- Click the three-dot menu below the dialog card, select "Execution Log," and then click "Copy Full Log."

- Press the F12 key on your keyboard or right-click and select "Inspect" to open the browser console. Navigate to "Network" - "Fetch/XHR."

Reproduce the error by querying again, then right-click the failed network request and select "Copy" - "Copy Response."

- Go to "System Settings" - "Intelligent Operations" - "System Debugging," set "Unified Settings" to "DEBUG," enable "Real-Time Debugging," reproduce the error by querying again, and then click "Export Logs."

How to Fill in the Vector Database Address?
Follow the AI Assistant Deployment Documentation to complete the installation and deployment of related services. No manual input is required.
Are Other Vector Models Supported?
Not currently. If you have this requirement, please contact a support engineer.
What are the differences between the Data Agent Sidebar and ChatBI?
| Capability | Data Agent Sidebar | ChatBI |
|---|---|---|
| Intelligent querying with specified data sources | ✅ | ✅ |
| Intelligent querying without data source limitations | ✅ | ❌ |
| One-click dashboard creation from conversation charts | ❌ | ✅ |
| Visualized assisted creation | ✅ | ❌ |
| Metric-assisted creation | ✅ | ❌ |
| Intelligent interpretation | ✅ | ❌ |
What are the differences between Agent mode, Workflow mode, and API mode?
| Capability | Agent Mode | Agent API Mode | Workflow and Workflow API Mode |
|---|---|---|---|
| Intelligent querying for specified data sources | ✅ | ✅ | ✅ |
| Intelligent querying without data source limitations | ✅ | ✅ | ❌ |
| Visualized assisted creation | ✅ | ❌ | ❌ |
| Metric-assisted creation | ✅ | ❌ | ❌ |
| Intelligent interpretation | ✅ | ❌ | ❌ |