API Reference
Chat Completions
POST https://api.platform.preferredai.jp/v1/chat/completions
This endpoint generates responses from generative AI models based on the provided chat messages.
Currently, only PLaMo Prime models are supported.
| Model ID | Context Length | Maximum Output Tokens | Reasoning Capability |
|---|---|---|---|
| plamo-2.2-prime | 32,768 tokens | 4,096 tokens | Not supported |
| plamo-3.0-prime-beta | 65,536 tokens | 20,000 tokens | Supported |
Earlier Prime models have been upgraded to plamo-2.2-prime as of 2026/01/28.
While plamo-3.0-prime-beta is available as a separate model, it is currently offered only to enterprise customers participating in the monitoring program. Those interested should submit an inquiry/application via this form.
Sample Usage
For curl, set your PLaMo API key in an environment variable named PLAMO_API_KEY, and for Python libraries, set it in an environment variable named OPENAI_API_KEY.
Request
curl:

```shell
$ curl \
  -H "Authorization: Bearer ${PLAMO_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.platform.preferredai.jp/v1/chat/completions" \
  -d @- << EOF
{
  "messages": [
    {
      "role": "system",
      "content": "You are a travel advisor."
    },
    {
      "role": "user",
      "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."
    }
  ],
  "model": "plamo-2.2-prime"
}
EOF
```

Python (LangChain):

```python
import os

from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
    model="plamo-2.2-prime",
    streaming=True,
    # other parameters...
)

messages = [
    {"role": "system", "content": "You are a travel advisor."},
    {"role": "user", "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."},
]

for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```

Python (OpenAI SDK):

```python
import os

from openai import OpenAI

client = OpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
    # other parameters...
)

completion = client.chat.completions.create(
    model="plamo-2.2-prime",
    messages=[
        {"role": "system", "content": "You are a travel advisor."},
        {"role": "user", "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."},
    ],
    stream=True,
)

for chunk in completion:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Response
Non-streaming:

```json
{
  "id": "chat-eb6da3a371c546c8a8c4629794328c5b",
  "object": "chat.completion",
  "created": 1733220118,
  "model": "plamo-2.2-prime",
  "choices": [
    {
      "index": 0,
      "message": {
        "role": "assistant",
        "content": "The quadratic formula for solving second-degree equations is as follows:...",
        "tool_calls": []
      },
      "logprobs": null,
      "finish_reason": "stop",
      "stop_reason": null
    }
  ],
  "usage": {
    "prompt_tokens": 169,
    "total_tokens": 394,
    "completion_tokens": 225
  },
  "prompt_logprobs": null
}
```

Streaming:

```
data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":143,"completion_tokens":0}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"PL"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":144,"completion_tokens":1}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"a"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":145,"completion_tokens":2}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"Mo"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":146,"completion_tokens":3}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":148,"completion_tokens":5}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[],"usage":{"prompt_tokens":143,"total_tokens":148,"completion_tokens":5}}

data: [DONE]
```

For the /chat/completions endpoint, models with reasoning capability return both reasoning and reasoning_content. Non-streaming responses include reasoning and reasoning_content within each choice.message, while streaming responses (when stream: true) include reasoning and reasoning_content incrementally in each chunk's choice.delta.
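The snippet below sketches how these incremental fields might be read from a streamed chunk. The SSE line is a made-up example in the shape shown above; reasoning fields are only returned by models that support reasoning, and may be absent from most chunks.

```python
import json

# Made-up SSE line in the shape of the streaming samples above
sse_line = (
    'data: {"id":"chat-x","object":"chat.completion.chunk",'
    '"choices":[{"index":0,"delta":{"reasoning_content":"thinking...",'
    '"content":"PLaMo"},"logprobs":null,"finish_reason":null}]}'
)

# Strip the "data: " SSE prefix and parse the JSON payload
chunk = json.loads(sse_line[len("data: "):])
delta = chunk["choices"][0]["delta"]

# Both fields may be absent on a given chunk, so use .get()
reasoning = delta.get("reasoning_content")
content = delta.get("content")
print(reasoning, content)
```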
Request Parameters
Please include the following parameters in the request body in JSON format.
| Parameter | Type | Required | Default Value | Description |
|---|---|---|---|---|
| model | string | required | - | Specifies the ID of the model to use. |
| messages | array of MessageObject | required | - | Sets the list of chat messages. Each message must include a role and content. |
| frequency_penalty | float | optional | 0.0 | Controls the frequency of token repetition. Valid values range from -2.0 to 2.0. |
| max_tokens | integer or null | optional | 4096 | Stops generation once the specified number of tokens have been produced. Depending on the resulting content, generation may terminate earlier. Cannot be set to exceed the model's maximum output token limit. |
| n | integer | optional | 1 | Specifies the number of times to generate content in a single request. Valid values are 1 or 2. |
| presence_penalty | float | optional | 0.0 | Applies a penalty for tokens that have already appeared in the output. Valid values range from -2.0 to 2.0. |
| seed | integer or null | optional | null | Specifies the random seed for token generation. Depending on internal state, different results may occur even when using the same seed value. |
| safety_identifier | string or null | optional | null | An identifier for user identification. This may be used by the PLaMo API side to detect anomalies or other issues. Since it may be persisted on the server, do not include any personal information or sensitive data. |
| stop | array of strings, string, or null | optional | null | Stops generation when any of the specified words is encountered. |
| stream | boolean | optional | false | Set to 'true' when using streaming responses. |
| stream_options | StreamOptionObject or null | optional | null | Specify options when enabling streaming. See StreamOptionObject below for details. |
| temperature | float | optional | 0.3 | Specifies the sampling temperature. Valid values range from 0 to 2.0. |
| tools | array of ChatCompletionToolsObject | optional | null | Defines the functions to be used for Function Calling. |
| tool_choice | ChatCompletionToolChoiceObject or "none" or "auto" or "required" | optional | null | Specifies the function to be invoked. |
| top_p | float | optional | 1.0 | Specifies nucleus (top-p) sampling: only tokens within the top cumulative probability mass are considered during generation. Valid values range from 0 to 1.0. |
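As a sketch of how the tools and tool_choice parameters fit together, the following builds a Function Calling request body. The function name get_weather and its schema are made-up illustrations, and the parameters field is written as a JSON Schema object in the style of OpenAI-compatible APIs.

```python
import json

# Illustrative request body for Function Calling; "get_weather" and its
# "city" argument are invented for this example.
request_body = {
    "model": "plamo-2.2-prime",
    "messages": [
        {"role": "user", "content": "What's the weather in Kanazawa?"}
    ],
    "tools": [
        {
            "type": "function",
            "function": {
                "name": "get_weather",
                "description": "Look up the current weather for a city.",
                # Arguments defined in JSON Schema
                "parameters": {
                    "type": "object",
                    "properties": {"city": {"type": "string"}},
                    "required": ["city"],
                },
            },
        }
    ],
    # Force the model to call get_weather instead of answering directly
    "tool_choice": {"type": "function", "function": {"name": "get_weather"}},
}

print(json.dumps(request_body, indent=2))
```

This body can then be POSTed to /v1/chat/completions like any other request.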
Response (when not using streaming)
| Field | Type | Description |
|---|---|---|
| id | string | A unique ID assigned for each message. |
| choices | array of ChoiceObject | Returns chat-style generation results. |
| created | integer | Returns the UNIX timestamp when the response was generated. |
| model | string | Returns the model ID. |
| object | string | Always returns 'chat.completion'. |
| usage | UsageObject | Returns an object containing usage statistics. |
Response Format (When Streaming is Used)
The response is returned via server-sent events and concludes with a data: [DONE] event.
| Field | Type | Description |
|---|---|---|
| id | string | A unique ID assigned for each message. |
| choices | array of StreamChoiceObject | Returns chat-style generation results. |
| created | integer | Unix timestamp of when the generation was completed. |
| model | string | Returns the model ID. |
| object | string | Always returns 'chat.completion.chunk'. |
| usage | UsageObject | Returns an object containing usage statistics. |
Important Note about max_tokens
Currently, the API returns an error when the sum of input tokens and max_tokens exceeds the model's context length. This check is performed when the request is received, so the error occurs even if the actual output would have been short enough to fit within the context length.
In particular, when max_tokens is not explicitly set it defaults to 4096, so the error can occur even though the input alone is within the context limit. In such cases, explicitly set max_tokens so that input tokens + max_tokens does not exceed the model's context length.
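One way to avoid this error is to clamp max_tokens based on the input token count, which can be obtained from the Tokenize API described below. A minimal sketch; safe_max_tokens is an illustrative helper, not part of the API:

```python
def safe_max_tokens(input_tokens: int, context_length: int,
                    desired: int = 4096) -> int:
    """Clamp max_tokens so input_tokens + max_tokens fits the context."""
    available = context_length - input_tokens
    if available <= 0:
        raise ValueError("the input already fills the context window")
    return min(desired, available)

# plamo-2.2-prime has a 32,768-token context and a 4,096-token output cap
print(safe_max_tokens(30000, 32768))  # 2768: below the 4096 default
print(safe_max_tokens(1000, 32768))   # 4096: the default fits as-is
```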
Tokenize API
POST https://api.platform.preferredai.jp/v1/tokenize
Converts a given string into tokens and returns both the token sequence and the total number of tokens.
Note that in Chat Completions, the input is converted to a chat format and various prompts are inserted automatically, so the token count returned here may not strictly match the number of tokens actually processed by the model.
Also, since PLaMo uses the same tokenizer as this publicly available model, you can embed equivalent functionality directly into your own system instead of calling the API.
Example Usage
Request
```shell
$ curl \
  -H "Authorization: Bearer ${PLAMO_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.platform.preferredai.jp/v1/tokenize" \
  -d '{"prompt": "Hello", "model": "plamo-2.2-prime"}'
```

Response
```json
{
  "count": 4,
  "max_model_len": 16384,
  "tokens": [
    1,
    1,
    35913,
    37607
  ]
}
```

Request Parameters
Specify the following parameters in JSON format within the request body.
| Parameter | Type | Required | Default Value | Description |
|---|---|---|---|---|
| model | string | yes | - | Specify the ID of the model to be used. |
| prompt | string | Either messages or prompt must be specified | - | Specify the string whose tokens you want to count. |
| messages | array of MessageObject | Either messages or prompt must be specified | - | Specify the messages whose tokens you want to count, in the same format as the chat messages used by the Chat Completions endpoint. |
| add_special_tokens | boolean | no | true | Specify whether to add special tokens like BOS. |
| add_generation_prompt | boolean | no | true | Specify whether to apply internal generation prompts during tokenization. This option is only valid when messages are specified. |
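For instance, a tokenize request body using messages instead of prompt might look like the following; the message contents are illustrative.

```python
import json

# Illustrative tokenize request body; exactly one of "prompt" or
# "messages" should be supplied.
payload = {
    "model": "plamo-2.2-prime",
    "messages": [
        {"role": "system", "content": "You are a travel advisor."},
        {"role": "user", "content": "Recommend a route in Kanazawa."},
    ],
    "add_special_tokens": True,      # include special tokens such as BOS
    "add_generation_prompt": True,   # count the internal generation prompt too
}

print(json.dumps(payload))
```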
Response
| Field | Type | Description |
|---|---|---|
| count | integer | Total number of tokens. |
| max_model_len | integer | Maximum number of tokens the model can process. |
| tokens | array of integer | The actual token sequence. |
Models API
Retrieves data including available models.
Example
Request
For curl, set your PLaMo API key in the PLAMO_API_KEY environment variable, and for Python libraries, set it in the OPENAI_API_KEY environment variable.
```shell
$ curl \
  -H "Authorization: Bearer ${PLAMO_API_KEY}" \
  -H "Content-Type: application/json" \
  "https://api.platform.preferredai.jp/v1/models"
```

```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
)

models = client.models.list()
print(models)

model_name = models.data[0].id
model = client.models.retrieve(model_name)
print(model)
```

Response
```json
{
  "data": [
    {
      "id": "plamo-2.2-prime",
      "created": 1732978800,
      "object": "model",
      "owned_by": "system"
    }
  ],
  "object": "list"
}
```

Index Models
GET https://api.platform.preferredai.jp/v1/models
Retrieves a list of available models.
Response
| Field | Type | Description |
|---|---|---|
| object | string | Always contains the value list. |
| data | array of ModelObject | Returns an array of model objects. |
Get Model Details
GET https://api.platform.preferredai.jp/v1/models/{model}
Fetches detailed information about a specified model.
Path Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | yes | Specifies the ID of the model to retrieve. |
Response
Returns a ModelObject.
Object List
MessageObject
Represents a chat message, primarily consisting of System, User, and Assistant components.
| Parameter | Type | Required | Default Value | Description |
|---|---|---|---|---|
| content | string | yes | Specifies the content of the message. | |
| role | string | yes | Indicates the role of the message; must be one of 'system', 'user', or 'assistant'. | |
| name | string | no | A name used for further identification when the same role is referenced. | |
| reasoning | string | no | Sets the reasoning provided in the response. | |
| reasoning_content | string | no | Sets the reasoning_content provided in the response. |
StreamOptionObject
Specifies options when using streaming responses.
| Parameter | Type | Required | Default Value | Description |
|---|---|---|---|---|
| include_usage | boolean | no | false | Specifies whether to return usage statistics just before the streaming response completes. |
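For example, the final chunk carrying usage statistics (with an empty choices array, as in the streaming sample above) could be handled as follows; the chunk literal is a made-up example in that shape.

```python
# Made-up final streaming chunk in the shape shown in the streaming sample
final_chunk = {
    "id": "chat-x",
    "object": "chat.completion.chunk",
    "choices": [],
    "usage": {"prompt_tokens": 143, "completion_tokens": 5, "total_tokens": 148},
}

# A chunk with no choices but a usage object marks the usage report
if not final_chunk["choices"] and final_chunk.get("usage"):
    u = final_chunk["usage"]
    print(f"prompt={u['prompt_tokens']} completion={u['completion_tokens']}")
```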
ChatCompletionToolsObject
Defines the functions to be used with the Function Calling feature.
| Field | Type | Description |
|---|---|---|
| type | string | Must be set to 'function'. |
| function | ChatCompletionToolsFunctionObject | Defines the actual function to be used with Function Calling. |
ChatCompletionToolsFunctionObject
Defines the actual function implementation.
| Field | Type | Description |
|---|---|---|
| name | string | The name of the function. |
| description | string | A description of the function. |
| parameters | string | Definition of arguments that can be passed to the function, expressed in JSON Schema. For details on JSON Schema, see: https://json-schema.org/understanding-json-schema |
ChatCompletionToolChoiceObject
Specifies which function to use.
| Field | Type | Description |
|---|---|---|
| type | string | Must be set to 'function'. |
| function | ChatCompletionToolChoiceFunctionObject | Specifies the function to use. |
ChatCompletionToolChoiceFunctionObject
Specifies the function details.
| Field | Type | Description |
|---|---|---|
| name | string | Specifies the name of the function specified in the 'tools' parameter. |
ChoiceObject
Represents the result of generating a response in chat format.
| Field | Type | Description |
|---|---|---|
| index | integer | Returns the index of the object when multiple ChoiceObjects are available. |
| finish_reason | 'stop' or 'length' | Returns 'stop' if the specified word was encountered or if generation naturally concluded; returns 'length' if the maximum token limit was exceeded. |
| logprobs | null | Currently not supported. |
| message | ChatMessageObject | Returns the content of the generated message. |
| stop_reason | integer, string, or null | Currently not supported. |
ChatMessageObject
Represents the content of a generated message.
| Field | Type | Description |
|---|---|---|
| content | string | Returns the message content. |
| reasoning | string or null | Returns the model's reasoning process. This field may only be present for models supporting reasoning. |
| reasoning_content | string or null | Contains the same content as reasoning. |
| role | string | Returns the role of the message, which can be one of 'system', 'user', or 'assistant'. |
| tool_calls | array of ChatMessageToolCallsObject | Information about the Function Calling invocations generated by the model. |
ChatMessageToolCallsObject
Indicates the result of a function call.
| Field | Type | Description |
|---|---|---|
| id | string | A unique identifier. |
| type | string | Must be 'function'. |
| function | ChatMessageToolCallsFunctionObject | Information about the Function called via Function Calling. |
ChatMessageToolCallsFunctionObject
Information about the called Function.
| Field | Type | Description |
|---|---|---|
| name | string | Indicates the name of the Function. |
| arguments | string | Function arguments specified in JSON format. |
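Since arguments is delivered as a JSON string rather than an object, it must be parsed before use. A sketch with a made-up tool_calls entry (the get_weather name and id are illustrative):

```python
import json

# Made-up tool_calls entry in the shape of ChatMessageToolCallsObject
tool_call = {
    "id": "call-123",
    "type": "function",
    "function": {
        "name": "get_weather",
        # arguments arrive as a JSON-encoded string
        "arguments": '{"city": "Kanazawa"}',
    },
}

# Decode the argument string into a dict before dispatching the call
args = json.loads(tool_call["function"]["arguments"])
print(tool_call["function"]["name"], args["city"])
```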
StreamChoiceObject
Represents the result of generated chat messages in a stream format.
| Field | Type | Description |
|---|---|---|
| index | integer | Returns the index of the object when multiple ChoiceObject instances exist. |
| finish_reason | 'stop' or 'length' | Returns stop if the specified word is encountered or generation ends naturally, and length if the maximum generated token count is exceeded. |
| logprobs | null | Currently not supported. |
| delta | ChatMessageDeltaObject | Returns the content of the additionally generated message. |
| stop_reason | integer or string or null | Currently not supported. |
ChatMessageDeltaObject
Represents the content of a message in a stream format.
| Field | Type | Description |
|---|---|---|
| content | string | Returns the content of the message. The content field may be absent in cases such as when a role assignment is emitted. |
| reasoning | string or null | Returns the model's reasoning process. This field may be present only for models that support reasoning. |
| reasoning_content | string or null | Contains the same content as reasoning. |
| role | string | Returns the role of the message, which is one of system, user, or assistant. This field is set only when a new role is assigned. |
UsageObject
Indicates token generation results and related information.
| Field | Type | Description |
|---|---|---|
| completion_tokens | integer | Returns the number of generated tokens. |
| prompt_tokens | integer | Returns the number of input tokens. |
| total_tokens | integer | Returns the sum of prompt_tokens and completion_tokens. |
ModelObject
Indicates information about a model.
| Field | Type | Description |
|---|---|---|
| id | string | A string that identifies the model. It can be used as an API parameter. |
| created | integer | The time when the model was created. |
| object | string | The type of this object. Currently, this field is always model. |
| owned_by | string | The owner of the model. Currently, this field is always system. |
Old API Endpoints
The API endpoints under https://platform.preferredai.jp/api/completion/v1 have been migrated. The new endpoint is https://api.platform.preferredai.jp/v1.
The old API endpoints remain available, but they are deprecated. Please use the new API endpoints.
