AI Model Configuration
Getting Started with Configuration
Request Address
Whether the model is provided by a cloud service provider or deployed privately, an HTTP service address must be provided, either an IP address or a domain name.
Cloud Service Providers
The request addresses provided by cloud service providers usually end with completions, for example:
- DeepSeek:
https://api.deepseek.com/chat/completions
- OpenAI:
https://api.openai.com/v1/chat/completions
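As a sketch, a request to such an address follows the same chat-completions shape; the model name deepseek-chat is taken from DeepSeek's public documentation, and the key is whatever DeepSeek issued to you:

```bash
# Minimal chat request to DeepSeek's endpoint; deepseek-chat is the
# general-purpose model name from DeepSeek's documentation.
curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
    "model": "deepseek-chat",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```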
For Microsoft Azure, the request address depends on your own deployment; refer to this step to obtain it.
Private Deployment Model
The address of a privately deployed model depends on the specific implementation. If the service is provided via Ollama on the intranet, the address is typically http://localhost:11434/api/generate.
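A minimal sketch of a request against that Ollama address, assuming a model named llama3 has been pulled locally (substitute whichever model you actually serve):

```bash
# Single non-streamed completion from a locally served Ollama model.
# "llama3" is an assumed model name; use whatever model you have pulled.
curl http://localhost:11434/api/generate \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3",
    "prompt": "Hello!",
    "stream": false
  }'
```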
Key
Models provided by cloud service providers usually require an API Key.
For privately deployed models, whether a key is required depends on the specific implementation. If Ollama is used to provide the service on the intranet, the key can be left empty.
Note
The API Key must be obtained from the model provider by the user and stored securely.
Other Providers
The page also lists support for the following model providers, though we currently cannot guarantee the effectiveness of these models:
- DeepSeek
- OpenAI
- 01AI
- Baichuan AI
- Bedrock
- Groq Cloud
- MiniMax
- MistralAI
- MoonShot AI
- Nvidia
- TogetherAI
- Tongyi Qianwen
Note
The effectiveness of large models is influenced by the model providers, and HENGSHI SENSE cannot guarantee the performance of all models. If you find the performance unsatisfactory, please contact support@hengshi.com or the model provider promptly.
OpenAI-API-Compatible
If you need to use a model other than those listed above, select the OpenAI-API-Compatible option. Any model compatible with the OpenAI API format can be used.
Taking Doubao AI as an example, you can configure it as shown in the sketch below.
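This is a minimal sketch, assuming Doubao is accessed through Volcano Engine Ark's OpenAI-compatible endpoint; the request address and the ep-... model ID below are illustrative placeholders, so take the actual values from your Doubao console:

```bash
# Hypothetical values: the request address and "ep-..." model ID come
# from your Doubao (Volcano Engine Ark) console, not from this document.
curl https://ark.cn-beijing.volces.com/api/v3/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $ARK_API_KEY" \
  -d '{
    "model": "ep-xxxxxxxxxxxxxx",
    "messages": [{"role": "user", "content": "Hello!"}]
  }'
```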
OpenAI API Format
If you need to deploy a private model, ensure that the HTTP service request address and response format are consistent with the OpenAI API. The request and response formats are as follows:
Request:

```bash
curl https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-4o",
    "messages": [
      {
        "role": "developer",
        "content": "You are a helpful assistant."
      },
      {
        "role": "user",
        "content": "Hello!"
      }
    ]
  }'
```
Response:

```json
{
  "id": "chatcmpl-123",
  "object": "chat.completion",
  "created": 1677652288,
  "model": "gpt-4o",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [{
    "index": 0,
    "message": {
      "role": "assistant",
      "content": "\n\nHello there, how may I assist you today?"
    },
    "logprobs": null,
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 9,
    "completion_tokens": 12,
    "total_tokens": 21,
    "completion_tokens_details": {
      "reasoning_tokens": 0,
      "accepted_prediction_tokens": 0,
      "rejected_prediction_tokens": 0
    }
  }
}
```
Test Model Connection
After configuring the model's API Key, click the Test Model Connection button to check whether the model connection works properly. If the connection succeeds, the content returned by the model interface is displayed.
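If you want to reproduce the check outside the product, a minimal sketch is to send a trivial message yourself and confirm the service answers; how the button implements the test internally is not documented here, so treat this only as an equivalent manual check (the address and model are the earlier example values):

```bash
# Manual check: expect HTTP status 200 and a JSON body whose "choices"
# array contains an assistant message.
curl -s -w "\nHTTP status: %{http_code}\n" \
  https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "ping"}]}'
```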
Response Speed
The output speed of a large model is the combined result of hardware, model complexity, input and output length, optimization techniques, and the system environment. Generally speaking, privately deployed small models tend to respond more slowly and perform worse than models provided by cloud service providers. A rough way to measure response time is sketched below.
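One simple sketch is to time a fixed request with curl's built-in timer; the address and model are just the earlier example values, so substitute your own endpoint when comparing:

```bash
# Rough latency probe: wall-clock time for one fixed request.
# Run it several times and against each candidate address to compare.
curl -o /dev/null -s -w "total: %{time_total}s\n" \
  https://api.openai.com/v1/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{"model": "gpt-4o", "messages": [{"role": "user", "content": "Hello!"}]}'
```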