Create chat completion
/v1/chat/completionsCreate a chat completion response for the given conversation.
Authentication
Authorization Bearer
API key as bearer token in Authorization header.
Request Body
modelstringdefault:glm-5.1requiredThe text model code to call.
Example: "glm-5.1"
messagesMessage[]requiredConversation messages used as the model prompt.
messages.roleenum<string>default:userrequiredRole of the message author.
Available options: system user assistant tool
messages.contentstringrequiredText content for system, user, assistant, and tool messages.
messages.tool_callsobject[]Tool calls generated by an assistant message.
messages.tool_call_idstringrequired for tool messagesTool call ID that a tool message responds to.
temperaturenumberdefault:1Sampling temperature. Lower values are more deterministic.
Example: 0.7
max_tokensintegerMaximum number of generated tokens.
streambooleandefault:falseReturn Server-Sent Events when true.
top_pnumberdefault:1Nucleus sampling value.
stopstring | string[]Stop sequence or sequences.
thinkingobjectReasoning control, such as {"type":"enabled"}, {"type":"disabled"}, or {"type":"auto"}.
response_formatobjectStructured output format, such as json_object or json_schema.
toolsobject[]Tool definitions the model may call.
tool_choicestring | objectdefault:autoTool selection strategy.
Response
idstringUnique completion ID.
objectstringAlways chat.completion for non-streaming responses.
createdintegerUnix timestamp.
modelstringModel used for the request.
choicesobject[]Generated choices.
choices.indexintegerChoice index.
choices.messageobjectAssistant message.
choices.finish_reasonstringstop, length, or tool_calls.
usageobjectToken usage and cost details.
Previous
Error Handling
Next
Create image