Create Chat Completion

POST /v1/chat/completions

Creates a chat completion response for the given conversation.

Authentication

Authorization Bearer

Pass your API key as a Bearer token in the Authorization header.

Request Body

messages array required

Conversation messages.

messages.role string required

Message role, such as system, user, assistant, or tool.

messages.content string | object[] required

Message content. Multimodal models may accept structured content parts.

messages.name string

Optional participant name.

messages.tool_call_id string

Tool call ID for tool messages.
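As a rough illustration of the fields above, the following Python sketch builds entries for the messages array and leaves out optional fields that are not set. The helper name make_message and the sample texts are hypothetical and not part of the API; the type check only mirrors the documented string | object[] type of content.

```python
def make_message(role, content, name=None, tool_call_id=None):
    """Build one entry of the messages array, omitting unset optional fields."""
    # content is documented as a string or an array of structured parts.
    if not isinstance(content, (str, list)):
        raise TypeError("content must be a string or a list of content parts")
    message = {"role": role, "content": content}
    if name is not None:
        message["name"] = name
    if tool_call_id is not None:
        message["tool_call_id"] = tool_call_id
    return message


# Hypothetical conversation used only for illustration.
messages = [
    make_message("system", "You are a concise assistant."),
    make_message("user", "Summarize this in one sentence.", name="alice"),
]
```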

model string required

Model ID.

temperature number

Controls randomness in generated output. Higher values make responses more varied; lower values make responses more deterministic.

top_p number

Controls nucleus sampling by limiting token choices to a cumulative probability mass. Lower values make output more focused; higher values allow more variety.

max_completion_tokens integer

Maximum number of tokens the model may generate in the response.

top_k integer

Top-k sampling cutoff: sampling is limited to the k most likely tokens.
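The sampling parameters are easier to follow on a toy distribution. The sketch below is a conceptual illustration of how temperature, top_k, and top_p are commonly defined; it is not the service's implementation, and all function names are hypothetical.

```python
import math


def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities; temperature must be > 0 in this sketch."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_filter(probs, k):
    """Return the indices of the k most likely tokens."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return ranked[:k]


def top_p_filter(probs, top_p):
    """Return the smallest set of most likely tokens whose mass reaches top_p."""
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, mass = [], 0.0
    for i in ranked:
        kept.append(i)
        mass += probs[i]
        if mass >= top_p:
            break
    return kept


probs = [0.5, 0.3, 0.15, 0.05]
print(top_k_filter(probs, 2))    # [0, 1]
print(top_p_filter(probs, 0.7))  # [0, 1]
```

A lower temperature concentrates probability on the most likely token, which is why low values give more deterministic output.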

frequency_penalty number

Frequency-based repetition penalty.

presence_penalty number

Presence-based novelty penalty.

context_length integer

Context window length.

stream boolean

Whether to stream the response.

stream_options object

Streaming response configuration.

stream_options.include_usage boolean

Include usage information in the streaming response.
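The following sketch shows one way a client might request a stream and reassemble the text. This page does not document the wire format, so the parser assumes OpenAI-style server-sent events: "data: {...}" lines, a "data: [DONE]" terminator, and text under choices[].delta.content. Treat those details, and both function names, as assumptions to verify against real responses.

```python
import json


def build_stream_payload(model, messages, include_usage=True):
    """Build a request body that asks for a streamed response."""
    return {
        "model": model,
        "messages": messages,
        "stream": True,
        "stream_options": {"include_usage": include_usage},
    }


def collect_stream(lines):
    """Join streamed text and capture usage, under the assumed SSE framing."""
    parts, usage = [], None
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank keep-alive lines and other fields
        data = line[len("data:"):].strip()
        if data == "[DONE]":  # assumed end-of-stream marker
            break
        chunk = json.loads(data)
        if chunk.get("usage"):
            usage = chunk["usage"]
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}  # assumed chunk shape
            if delta.get("content"):
                parts.append(delta["content"])
    return "".join(parts), usage
```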

tools array

Tool definitions.

tools.type string

Tool type, usually function.

tools.function object

Function definition.

tool_choice object

Tool selection policy.

tool_choice.type string

Tool choice type, such as function.

tool_choice.function object

Function selection payload.

parallel_tool_calls boolean

Whether the model may make tool calls in parallel.
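A request that defines one tool and forces its use might look like the sketch below. The inner shape of tools.function (name, description, parameters as JSON Schema) and of tool_choice.function (a name field) is not spelled out on this page; it follows the common OpenAI-compatible convention and is an assumption here. The get_weather tool is hypothetical.

```python
# Hypothetical tool definition; the fields inside "function" are assumed.
weather_tool = {
    "type": "function",
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}


def force_tool(name):
    """Build a tool_choice object that selects one function by name (assumed shape)."""
    return {"type": "function", "function": {"name": name}}


payload = {
    "model": "glm-5.1",
    "messages": [{"role": "user", "content": "What is the weather in Paris?"}],
    "tools": [weather_tool],
    "tool_choice": force_tool("get_weather"),
    "parallel_tool_calls": False,
}

# Guard against forcing a tool that was never defined in the request.
defined = {tool["function"]["name"] for tool in payload["tools"]}
assert payload["tool_choice"]["function"]["name"] in defined
```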

previous_response_id string

Previous response ID.

conversation object

Conversation state object or ID.

prompt_cache_key string

Prompt cache key.

store boolean

Whether to store the response.

truncation string

Input truncation strategy.

include array

Additional response fields to include.

metadata object

Application metadata.

extra_body object

Additional request body fields.

provider_options object

Upstream configuration.

schema_params object

Structured output schema parameters.

network_search boolean

Whether to enable network search.

reasoning object

Canonical reasoning/thinking control configuration.

reasoning.enabled boolean

Enable reasoning when the selected model supports it.

reasoning.effort string

Reasoning effort. Supports xhigh, high, medium, low, minimal, and none.

reasoning.max_tokens integer

Maximum reasoning tokens. Used for execution when supplied with effort.

reasoning.exclude boolean

Reason internally without returning reasoning text.
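A small client-side helper can catch an unsupported effort value before the request is sent. This is a minimal sketch: build_reasoning is a hypothetical name, and the check on max_tokens being non-negative is an assumption rather than a documented rule.

```python
# Effort values listed in the reasoning.effort description.
ALLOWED_EFFORTS = ("xhigh", "high", "medium", "low", "minimal", "none")


def build_reasoning(enabled=None, effort=None, max_tokens=None, exclude=None):
    """Build the reasoning object, omitting fields that were not supplied."""
    if effort is not None and effort not in ALLOWED_EFFORTS:
        raise ValueError(f"effort must be one of {ALLOWED_EFFORTS}")
    if max_tokens is not None and max_tokens < 0:  # assumed constraint
        raise ValueError("max_tokens must not be negative")
    fields = {
        "enabled": enabled,
        "effort": effort,
        "max_tokens": max_tokens,
        "exclude": exclude,
    }
    return {key: value for key, value in fields.items() if value is not None}


reasoning = build_reasoning(enabled=True, effort="low", exclude=True)
```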

response_format object

Requested response format for structured model output.

logprobs boolean

Whether to return log probabilities for generated output tokens.

top_logprobs integer

Number of most likely tokens to return at each generated token position; requires logprobs=true.

stop array

Stop sequences that terminate generation; the verified contract uses an array of up to four strings.
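The two constraints stated above (top_logprobs requires logprobs=true, and stop holds at most four strings) can be checked locally before sending. The function below is a hypothetical helper, a sketch rather than the validation the server actually performs.

```python
def validate_output_controls(payload):
    """Return a list of problems found in logprobs, top_logprobs, and stop."""
    problems = []
    if payload.get("top_logprobs") is not None and payload.get("logprobs") is not True:
        problems.append("top_logprobs requires logprobs=true")
    stop = payload.get("stop")
    if stop is not None:
        if not isinstance(stop, list) or not all(isinstance(s, str) for s in stop):
            problems.append("stop must be an array of strings")
        elif len(stop) > 4:
            problems.append("stop accepts at most four strings")
    return problems
```

An empty list means the payload passed these particular checks; it says nothing about the other fields.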

verbosity string

Controls the level of detail in the model response.

seed integer

Best-effort sampling seed for reproducible outputs.

n integer

Number of response choices to generate.

logit_bias object

Adjusts the likelihood of specified token IDs.

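Response Format

response_format controls the format of the assistant's output:

{"type":"text"} returns plain text.
{"type":"json_object"} returns a valid JSON object.
{"type":"json_schema","json_schema":{...}} returns JSON that conforms to the provided schema. The json_schema object contains name, an optional strict, and schema.

If a structured response is refused, choices[].message.refusal contains the refusal message, and choices[].message.content may be null.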
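The json_schema variant and the refusal case can be sketched as follows. The helper names and the sample schema are hypothetical; the code relies only on the fields described in this section (name, strict, schema, refusal, content).

```python
import json


def json_schema_format(name, schema, strict=None):
    """Build a response_format object of type json_schema."""
    body = {"name": name, "schema": schema}
    if strict is not None:  # strict is optional
        body["strict"] = strict
    return {"type": "json_schema", "json_schema": body}


def read_structured(response):
    """Return (parsed_json, refusal) for the first choice."""
    message = response["choices"][0]["message"]
    refusal = message.get("refusal")
    if refusal:
        return None, refusal  # content may be null when refused
    content = message.get("content")
    parsed = json.loads(content) if content is not None else None
    return parsed, None


# Hypothetical schema used only for illustration.
city_format = json_schema_format(
    "city_answer",
    {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]},
    strict=True,
)
```

Checking refusal before parsing content avoids calling json.loads on a null value.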

Response

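id string

Unique completion ID.

object string

chat.completion for non-streaming responses.

created integer

Unix timestamp.

model string

Model used for this request.

choices object[]

Generated results.

choices.index integer

Index of the result.

choices.message object

The assistant message.

choices.logprobs object | null

Log probability information for generated tokens. null when log probabilities were not requested.

choices.finish_reason string

stop, length, or tool_calls.

usage object

Token usage and billing information.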

usage.prompt_tokens integer

Input token count when returned.

usage.completion_tokens integer

Output token count when returned.

usage.total_tokens integer

Total token count when returned.

usage.input_tokens integer

Input token count for providers that use input/output names.

usage.output_tokens integer

Output token count for providers that use input/output names.

usage.generated_images integer

Generated image count when returned.

usage.audio_duration_seconds number

Audio duration used for billing when returned.

usage.video_duration_seconds number

Video duration used for billing when returned.

usage.provider_request_id string

Provider request ID used for upstream correlation.
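Because token counts may arrive under either prompt/completion or input/output names, client code may want to normalize them. The sketch below uses a hypothetical helper; deriving a missing total as input plus output is an assumption, not documented behavior.

```python
def normalize_usage(usage):
    """Map either naming scheme onto input/output/total token counts."""
    prompt = usage.get("prompt_tokens", usage.get("input_tokens"))
    completion = usage.get("completion_tokens", usage.get("output_tokens"))
    total = usage.get("total_tokens")
    # Assumption: when total_tokens is absent, fall back to input + output.
    if total is None and prompt is not None and completion is not None:
        total = prompt + completion
    return {"input": prompt, "output": completion, "total": total}
```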

Example Request

curl -X POST https://api.token360.ai/v1/chat/completions \
  -H "Authorization: Bearer sk-your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "glm-5.1",
    "messages": [
      {"role": "user", "content": "Reply with the exact text: Hello from Token360."}
    ],
    "temperature": 0.1,
    "max_completion_tokens": 20
  }'
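For readers working in Python, the same call can be sketched with the standard library alone. The URL, headers, and body come from the curl command; build_request is a hypothetical helper, and the network call itself is left commented out so the snippet runs without credentials.

```python
import json
import urllib.request

API_URL = "https://api.token360.ai/v1/chat/completions"


def build_request(api_key, payload):
    """Prepare the POST request with the Bearer token and a JSON body."""
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


payload = {
    "model": "glm-5.1",
    "messages": [
        {"role": "user", "content": "Reply with the exact text: Hello from Token360."}
    ],
    "temperature": 0.1,
    "max_completion_tokens": 20,
}
request = build_request("sk-your-api-key", payload)

# To actually send it (requires a valid key and network access):
# with urllib.request.urlopen(request) as response:
#     body = json.load(response)
```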

Example Response

{
  "id": "your-chat-completion-id",
  "object": "chat.completion",
  "created": 1776351874,
  "model": "glm-5.1",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "Hello from Token360."
      },
      "finish_reason": "stop"
    }
  ],
  "usage": {
    "prompt_tokens": 57,
    "completion_tokens": 7,
    "total_tokens": 64
  }
}
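Reading the reply out of a response like this one takes only a few lines. The sketch below embeds the sample body as a string and uses a hypothetical helper, first_reply, to pull out the text and finish reason of the first choice.

```python
import json

# The sample response body from this page.
SAMPLE = """
{
  "id": "your-chat-completion-id",
  "object": "chat.completion",
  "created": 1776351874,
  "model": "glm-5.1",
  "choices": [
    {
      "index": 0,
      "message": {"role": "assistant", "content": "Hello from Token360."},
      "finish_reason": "stop"
    }
  ],
  "usage": {"prompt_tokens": 57, "completion_tokens": 7, "total_tokens": 64}
}
"""


def first_reply(response):
    """Return (content, finish_reason) for the first choice."""
    choice = response["choices"][0]
    return choice["message"]["content"], choice["finish_reason"]


response = json.loads(SAMPLE)
text, finish_reason = first_reply(response)
```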
