API Reference

Chat Completions

POST https://api.platform.preferredai.jp/v1/chat/completions

This endpoint generates responses from generative AI models based on the provided chat messages.

Currently, only the PLaMo Prime model is supported.

| Model ID | Context Length | Maximum Output Tokens | Reasoning Capability |
| --- | --- | --- | --- |
| plamo-2.2-prime | 32,768 tokens | 4,096 tokens | Not supported |
| plamo-3.0-prime-beta | 65,536 tokens | 20,000 tokens | Supported |

Earlier PLaMo Prime models have been upgraded to plamo-2.2-prime as of 2026/01/28.

plamo-3.0-prime-beta is available as a separate model, but it is currently offered only to enterprise customers in the monitoring program. If you are interested, please submit an inquiry/application via this form.

Sample Usage

For curl, set your PLaMo API key in an environment variable named PLAMO_API_KEY, and for Python libraries, set it in an environment variable named OPENAI_API_KEY.

Request

```bash
$ curl \
    -H "Authorization: Bearer ${PLAMO_API_KEY}" \
    -H "Content-Type: application/json" \
    "https://api.platform.preferredai.jp/v1/chat/completions" \
    -d @- << EOF
{
  "messages": [
    {
      "role": "system",
      "content": "You are a travel advisor."
    },
    {
      "role": "user",
      "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."
    }
  ],
  "model": "plamo-2.2-prime"
}
EOF
```
```python
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
    model="plamo-2.2-prime",
    streaming=True,
    # other parameters...
)

messages = [
    {"role": "system", "content": "You are a travel advisor."},
    {"role": "user", "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."},
]

for chunk in llm.stream(messages):
    print(chunk.content, end="", flush=True)
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
    # other parameters...
)

completion = client.chat.completions.create(
    model="plamo-2.2-prime",
    messages=[
        {"role": "system", "content": "You are a travel advisor."},
        {"role": "user", "content": "Please recommend an optimal sightseeing route in Kanazawa from morning until evening."},
    ],
    stream=True,
)

for chunk in completion:
    if chunk.choices and chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)
```

Response

```json
{
    "id": "chat-eb6da3a371c546c8a8c4629794328c5b",
    "object": "chat.completion",
    "created": 1733220118,
    "model": "plamo-2.2-prime",
    "choices": [
        {
            "index": 0,
            "message": {
                "role": "assistant",
                "content": "Here is a recommended sightseeing route in Kanazawa from morning until evening:...",
                "tool_calls": []
            },
            "logprobs": null,
            "finish_reason": "stop",
            "stop_reason": null
        }
    ],
    "usage": {
        "prompt_tokens": 169,
        "total_tokens": 394,
        "completion_tokens": 225
    },
    "prompt_logprobs": null
}
```
```text
data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"role":"assistant"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":143,"completion_tokens":0}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"PL"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":144,"completion_tokens":1}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"a"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":145,"completion_tokens":2}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":"Mo"},"logprobs":null,"finish_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":146,"completion_tokens":3}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[{"index":0,"delta":{"content":""},"logprobs":null,"finish_reason":"stop","stop_reason":null}],"usage":{"prompt_tokens":143,"total_tokens":148,"completion_tokens":5}}

data: {"id":"chat-e05656fa4721494a93d54410881f00c8","object":"chat.completion.chunk","created":1734345308,"model":"plamo-2.2-prime","choices":[],"usage":{"prompt_tokens":143,"total_tokens":148,"completion_tokens":5}}

data: [DONE]
```

For the /chat/completions endpoint, models with reasoning capability return both reasoning and reasoning_content. Non-streaming responses include them in each choice.message, while streaming responses (when stream: true) include them incrementally in each chunk's choice.delta.
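Streamed chunks can be reassembled client-side. The sketch below is a minimal example (not part of the API) that parses `data:` lines from the event stream and accumulates delta.content and delta.reasoning_content:

```python
import json

def accumulate_stream(sse_lines):
    """Reassemble content and reasoning_content from chat.completion.chunk
    server-sent events."""
    content, reasoning = [], []
    for line in sse_lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            delta = choice.get("delta", {})
            # content / reasoning_content may be absent, e.g. in role-only deltas
            if delta.get("content"):
                content.append(delta["content"])
            if delta.get("reasoning_content"):
                reasoning.append(delta["reasoning_content"])
    return "".join(content), "".join(reasoning)

# Abbreviated chunks mirroring the streaming example above
sse = [
    'data: {"choices":[{"index":0,"delta":{"role":"assistant"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"PL"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"a"}}]}',
    'data: {"choices":[{"index":0,"delta":{"content":"Mo"}}]}',
    "data: [DONE]",
]
print(accumulate_stream(sse))  # → ('PLaMo', '')
```

A production client would read these lines incrementally from the HTTP response instead of from a list.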

Request Parameters

Please include the following parameters in the request body in JSON format.

| Parameter | Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| model | string | required | — | Specifies the ID of the model to use. |
| messages | array of MessageObject | required | — | Sets the list of chat messages. Each message must include a role and content. |
| frequency_penalty | float | optional | 0.0 | Controls the frequency of token repetition. Valid values range from -2.0 to 2.0. |
| max_tokens | integer or null | optional | 4096 | Stops generation once the specified number of tokens has been produced. Depending on the generated content, generation may terminate earlier. Cannot exceed the model's maximum output token limit. |
| n | integer | optional | 1 | Specifies the number of completions to generate in a single request. Valid values are 1 or 2. |
| presence_penalty | float | optional | 0.0 | Applies a penalty to tokens that have already appeared in the output. Valid values range from -2.0 to 2.0. |
| seed | integer or null | optional | null | Specifies the random seed for token generation. Depending on internal state, results may differ even with the same seed value. |
| safety_identifier | string or null | optional | null | An identifier for user identification, which the PLaMo API may use to detect anomalies or other issues. Since it may be persisted on the server, do not include personal information or sensitive data. |
| stop | string, array of strings, or null | optional | null | Stops generation when any of the specified words is encountered. |
| stream | boolean | optional | false | Set to true to use streaming responses. |
| stream_options | StreamOptionObject or null | optional | null | Specifies options when streaming is enabled. See StreamOptionObject below for details. |
| temperature | float | optional | 0.3 | Specifies the sampling temperature. Valid values range from 0 to 2.0. |
| tools | array of ChatCompletionToolsObject | optional | null | Defines the functions to be used for Function Calling. |
| tool_choice | ChatCompletionToolChoiceObject, "none", "auto", or "required" | optional | null | Specifies the function to be invoked. |
| top_p | float | optional | 1.0 | Controls how much of the token probability mass is considered during sampling. Valid values range from 0 to 1.0. |
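As an illustration of the tools and tool_choice parameters, the sketch below assembles a request body for Function Calling. The get_weather function and its schema are hypothetical examples, not part of the API:

```python
import json

def build_tool_request(model, messages, tools, tool_choice="auto"):
    """Assemble a /chat/completions request body that enables Function Calling."""
    return {
        "model": model,
        "messages": messages,
        "tools": tools,
        "tool_choice": tool_choice,
    }

# Hypothetical tool definition following ChatCompletionToolsObject:
weather_tool = {
    "type": "function",  # must be 'function'
    "function": {
        "name": "get_weather",
        "description": "Look up the current weather for a city.",
        # arguments are described in JSON Schema
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}

body = build_tool_request(
    "plamo-2.2-prime",
    [{"role": "user", "content": "What's the weather in Kanazawa?"}],
    [weather_tool],
)
print(json.dumps(body, indent=2))
```

This dictionary would be sent as the JSON request body, for example via the OpenAI client or curl as shown above.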

Response (when not using streaming)

| Field | Type | Description |
| --- | --- | --- |
| id | string | A unique ID assigned to each message. |
| choices | array of ChoiceObject | Returns chat-style generation results. |
| created | integer | Unix timestamp of when the generation was created. |
| model | string | Returns the model ID. |
| object | string | Always returns 'chat.completion'. |
| usage | UsageObject | Returns an object containing usage statistics. |

Response Format (When Streaming is Used)

The response is returned using server-sent events and concludes with a data: [DONE] event.

| Field | Type | Description |
| --- | --- | --- |
| id | string | A unique ID assigned to each message. |
| choices | array of StreamChoiceObject | Returns chat-style generation results. |
| created | integer | Unix timestamp of when the generation was completed. |
| model | string | Returns the model ID. |
| object | string | Always returns 'chat.completion.chunk'. |
| usage | UsageObject | Returns an object containing usage statistics. |

Important Note about max_tokens

Currently, the API returns an error when input tokens + max_tokens exceeds the model's context length. The error is raised while the input is processed, even if the actual output would have contained fewer tokens than the context length allows.

In particular, if max_tokens is not explicitly set, it defaults to 4096, so an error can occur even when the input alone is within the context limit. In such cases, explicitly set max_tokens so that input tokens + max_tokens does not exceed the model's context length.
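Based on the rule above, one way to pick a safe max_tokens value is to derive it from the prompt's token count (e.g. as reported by the Tokenize API) and the model's limits. A minimal sketch:

```python
def safe_max_tokens(prompt_tokens, context_length, max_output_tokens):
    """Largest max_tokens value that keeps prompt_tokens + max_tokens within
    the model's context length, capped at the model's maximum output tokens."""
    remaining = context_length - prompt_tokens
    return max(0, min(max_output_tokens, remaining))

# plamo-2.2-prime: 32,768-token context, 4,096 maximum output tokens
print(safe_max_tokens(30000, 32768, 4096))  # → 2768
print(safe_max_tokens(10000, 32768, 4096))  # → 4096
```

A result of 0 means the prompt itself already fills the context window and must be shortened.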

Tokenize API

POST https://api.platform.preferredai.jp/v1/tokenize

Converts a given string into tokens and returns both the token sequence and the total number of tokens.

Note that the Chat Completions endpoint converts the input into a chat format, automatically inserting various prompts. The token count returned here therefore may not exactly match the number of tokens actually processed for the final output.

Also, since PLaMo uses the same tokenizer as this publicly available model, you can integrate equivalent functionality directly into your own system instead of calling the API.

Example Usage

Request

```bash
$ curl \
    -H "Authorization: Bearer ${PLAMO_API_KEY}" \
    -H "Content-Type: application/json" \
    "https://api.platform.preferredai.jp/v1/tokenize" \
    -d '{"prompt": "Hello", "model": "plamo-2.2-prime"}'
```

Response

```json
{
    "count": 4,
    "max_model_len": 16384,
    "tokens": [
        1,
        1,
        35913,
        37607
    ]
}
```

Request Parameters

Specify the following parameters in JSON format within the request body.

| Parameter | Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| model | string | yes | — | Specifies the ID of the model to use. |
| prompt | string | Either messages or prompt must be specified | — | Specifies the string whose tokens you want to count. |
| messages | array of MessageObject | Either messages or prompt must be specified | — | Specifies the messages whose tokens you want to count, in the same format as chat messages used by the Chat Completions endpoint. |
| add_special_tokens | boolean | no | true | Specifies whether to add special tokens such as BOS. |
| add_generation_prompt | boolean | no | true | Specifies whether to apply internal generation prompts during tokenization. Only valid when messages is specified. |

Response

| Field | Type | Description |
| --- | --- | --- |
| count | integer | Total number of tokens. |
| max_model_len | integer | Maximum number of tokens the model can process. |
| tokens | array of integers | The actual token sequence. |
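As an illustration, the sketch below builds a tokenize request using only Python's standard library. The helper name is hypothetical; it simply assembles the URL, headers, and body described above:

```python
import json
import os
import urllib.request

TOKENIZE_URL = "https://api.platform.preferredai.jp/v1/tokenize"

def build_tokenize_request(model, prompt=None, messages=None,
                           add_special_tokens=True):
    """Build an urllib Request for the tokenize endpoint. Exactly one of
    prompt or messages must be given, per the parameter table above."""
    if (prompt is None) == (messages is None):
        raise ValueError("specify exactly one of prompt or messages")
    body = {"model": model, "add_special_tokens": add_special_tokens}
    if prompt is not None:
        body["prompt"] = prompt
    else:
        body["messages"] = messages
    return urllib.request.Request(
        TOKENIZE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {os.environ.get('PLAMO_API_KEY', '')}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

req = build_tokenize_request("plamo-2.2-prime", prompt="Hello")
print(json.loads(req.data))
# urllib.request.urlopen(req) would send the request and return the
# JSON response shown above (count, max_model_len, tokens).
```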

Models API

Retrieves information about the available models.

Example

Request

For curl, set your PLaMo API key in the PLAMO_API_KEY environment variable, and for Python libraries, set it in the OPENAI_API_KEY environment variable.

```bash
$ curl \
    -H "Authorization: Bearer ${PLAMO_API_KEY}" \
    -H "Content-Type: application/json" \
    "https://api.platform.preferredai.jp/v1/models"
```
```python
from openai import OpenAI

client = OpenAI(
    base_url="https://api.platform.preferredai.jp/v1",
)

models = client.models.list()
print(models)

model_name = models.data[0].id
model = client.models.retrieve(model_name)
print(model)
```

Response

```json
{
    "data": [
        {
            "id": "plamo-2.2-prime",
            "created": 1732978800,
            "object": "model",
            "owned_by": "system"
        }
    ],
    "object": "list"
}
```

Index Models

GET https://api.platform.preferredai.jp/v1/models

Retrieves a list of available models.

Response

| Field | Type | Description |
| --- | --- | --- |
| object | string | Always contains the value 'list'. |
| data | array of ModelObject | Returns an array of model objects. |

Get Model Details

GET https://api.platform.preferredai.jp/v1/models/{model}

Fetches detailed information about a specified model.

Path Parameters

| Parameter | Type | Required | Description |
| --- | --- | --- | --- |
| model | string | yes | Specifies the ID of the model to retrieve. |

Response

Returns a ModelObject.

Object List

MessageObject

Represents a chat message, whose role is one of system, user, or assistant.

| Parameter | Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| content | string | yes | — | Specifies the content of the message. |
| role | string | yes | — | Indicates the role of the message; must be one of 'system', 'user', or 'assistant'. |
| name | string | no | — | A name used for further identification when the same role appears multiple times. |
| reasoning | string | no | — | Sets the reasoning provided in the response. |
| reasoning_content | string | no | — | Sets the reasoning_content provided in the response. |

StreamOptionObject

Specifies options when using streaming responses.

| Parameter | Type | Required | Default Value | Description |
| --- | --- | --- | --- | --- |
| include_usage | boolean | no | — | Specifies whether to return usage statistics just before the streaming response completes. |

ChatCompletionToolsObject

Defines the functions to be used with the Function Calling feature.

| Field | Type | Description |
| --- | --- | --- |
| type | string | Must be set to 'function'. |
| function | ChatCompletionToolsFunctionObject | Defines the actual function to be used with Function Calling. |

ChatCompletionToolsFunctionObject

Defines the actual function implementation.

| Field | Type | Description |
| --- | --- | --- |
| name | string | The name of the function. |
| description | string | A description of the function. |
| parameters | string | Definition of the arguments that can be passed to the function, expressed in JSON Schema. For details on JSON Schema, see: https://json-schema.org/understanding-json-schema |

ChatCompletionToolChoiceObject

Specifies which function to use.

| Field | Type | Description |
| --- | --- | --- |
| type | string | Must be set to 'function'. |
| function | ChatCompletionToolChoiceFunctionObject | Specifies the function to use. |

ChatCompletionToolChoiceFunctionObject

Specifies the function details.

| Field | Type | Description |
| --- | --- | --- |
| name | string | Specifies the name of a function defined in the 'tools' parameter. |

ChoiceObject

Represents the result of generating a response in chat format.

| Field | Type | Description |
| --- | --- | --- |
| index | integer | Returns the index of the object when multiple ChoiceObjects are returned. |
| finish_reason | 'stop' or 'length' | Returns 'stop' if a specified stop word was encountered or generation concluded naturally; returns 'length' if the maximum token limit was exceeded. |
| logprobs | null | Currently not supported. |
| message | ChatMessageObject | Returns the content of the generated message. |
| stop_reason | integer, string, or null | Currently not supported. |

ChatMessageObject

Represents the content of a generated message.

| Field | Type | Description |
| --- | --- | --- |
| content | string | Returns the message content. |
| reasoning | string or null | Returns the model's reasoning process. This field may be present only for models supporting reasoning. |
| reasoning_content | string or null | Contains the same content as reasoning. |
| role | string | Returns the role of the message, which is one of 'system', 'user', or 'assistant'. |
| tool_calls | array of ChatMessageToolCallsObject | Information about Function Calling generated by the model. |

ChatMessageToolCallsObject

Indicates the result of a function call.

| Field | Type | Description |
| --- | --- | --- |
| id | string | A unique identifier. |
| type | string | Must be 'function'. |
| function | ChatMessageToolCallsFunctionObject | Information about the function called via Function Calling. |

ChatMessageToolCallsFunctionObject

Information about the called Function.

| Field | Type | Description |
| --- | --- | --- |
| name | string | Indicates the name of the function. |
| arguments | string | The function arguments, specified in JSON format. |
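Since arguments is a JSON-encoded string, it must be decoded before use. A minimal sketch, using a hypothetical get_weather tool call as sample data:

```python
import json

def parse_tool_call(tool_call):
    """Return (function name, decoded arguments) from a tool call object."""
    fn = tool_call["function"]
    return fn["name"], json.loads(fn["arguments"])

# Hypothetical tool call in the shape described above
sample = {
    "id": "call_0",
    "type": "function",
    "function": {
        "name": "get_weather",
        "arguments": "{\"city\": \"Kanazawa\"}",
    },
}

name, args = parse_tool_call(sample)
print(name, args)  # → get_weather {'city': 'Kanazawa'}
```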

StreamChoiceObject

Represents the result of generated chat messages in a stream format.

| Field | Type | Description |
| --- | --- | --- |
| index | integer | Returns the index of the object when multiple StreamChoiceObject instances exist. |
| finish_reason | 'stop' or 'length' | Returns 'stop' if the specified word is encountered or generation ends naturally, and 'length' if the maximum generated token count is exceeded. |
| logprobs | null | Currently not supported. |
| delta | ChatMessageDeltaObject | Returns the content of the additionally generated message. |
| stop_reason | integer, string, or null | Currently not supported. |

ChatMessageDeltaObject

Represents the content of a message in a stream format.

| Field | Type | Description |
| --- | --- | --- |
| content | string | Returns the content of the message. This field may be absent, for example when only a role assignment is emitted. |
| reasoning | string or null | Returns the model's reasoning process. This field may be present only for models that support reasoning. |
| reasoning_content | string or null | Contains the same content as reasoning. |
| role | string | Returns the role of the message, one of 'system', 'user', or 'assistant'. Set only when a new role is assigned. |

UsageObject

Indicates token generation results and related information.

| Field | Type | Description |
| --- | --- | --- |
| prompt_tokens | integer | Returns the number of input tokens. |
| completion_tokens | integer | Returns the number of generated tokens. |
| total_tokens | integer | Returns the total of input and generated tokens. |

ModelObject

Indicates information about a model.

| Field | Type | Description |
| --- | --- | --- |
| id | string | A string that identifies the model. It can be used as an API parameter. |
| created | integer | The time when the model was created. |
| object | string | The type of this object. Currently always 'model'. |
| owned_by | string | The owner of the model. Currently always 'system'. |

Old API Endpoints

The API endpoints under https://platform.preferredai.jp/api/completion/v1 have been migrated. The new endpoint is https://api.platform.preferredai.jp/v1.

The old API endpoints remain available but are deprecated. Please use the new API endpoints.