Completions (Legacy)

Completions

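Given a prompt, the model returns one or more predicted completions, and can also return the probabilities of alternative tokens at each position. Most developers should use the Chat Completions API instead to take advantage of the best and newest models.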

Note: this is a legacy endpoint.

1. Create completion

POST 
https://api.zhizengzeng.com/v1/completions

Creates a completion for the provided prompt and parameters.

Example request:

curl https://api.zhizengzeng.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say this is a test",
    "max_tokens": 7,
    "temperature": 0
  }'

Response:

{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}
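The fields most callers need from this response are the generated text, the reason generation stopped, and the token usage. The following is a minimal sketch of reading them with the standard library; the `summarize` helper is a hypothetical name introduced for illustration, and the sample body is abridged from the response shown above.

```python
import json

# Abridged copy of the sample response. A raw string keeps the \n escapes
# as JSON escapes instead of turning them into literal newlines.
RAW_RESPONSE = r'''
{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "model": "gpt-3.5-turbo-instruct",
  "choices": [
    {"text": "\n\nThis is indeed a test", "index": 0,
     "logprobs": null, "finish_reason": "length"}
  ],
  "usage": {"prompt_tokens": 5, "completion_tokens": 7, "total_tokens": 12}
}
'''

def summarize(body: str) -> dict:
    """Extract the text, truncation flag, and token total from a response body."""
    data = json.loads(body)
    choice = data["choices"][0]
    usage = data["usage"]
    return {
        "text": choice["text"].strip(),
        # "length" means generation stopped because max_tokens was reached.
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": usage["prompt_tokens"] + usage["completion_tokens"],
    }

summary = summarize(RAW_RESPONSE)
print(summary)
```

Here `finish_reason` is `"length"` because the request capped output at 7 tokens, so the text may be cut off mid-sentence.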

Request body (parameter details)

model (string, required)
ID of the model to use. For details on which models work with this endpoint, see the model endpoint compatibility table.

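prompt (string or array, required)
The prompt(s) to generate completions for, encoded as a string, an array of strings, an array of tokens, or an array of token arrays.
Note that <|endoftext|> is the document separator that the model sees during training, so if a prompt is not specified the model generates as if from the beginning of a new document.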

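best_of (integer or null, optional, Defaults to 1)
Generates best_of completions server-side and returns the "best" one (the one with the highest log probability per token). Results cannot be streamed.
When used with n, best_of controls the number of candidate completions and n specifies how many to return; best_of must be greater than n.
Note: because this parameter generates many completions, it can quickly consume your token quota. Use it carefully and make sure you have reasonable settings for max_tokens and stop.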
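The ranking rule described for best_of can be illustrated locally. The sketch below is not the server's implementation; it only assumes that "highest log probability per token" means the highest mean token log probability. The candidate data and the helper names `mean_logprob` and `pick_best` are hypothetical.

```python
def mean_logprob(token_logprobs):
    """Average log probability per token of one candidate."""
    return sum(token_logprobs) / len(token_logprobs)

def pick_best(candidates, n=1):
    """Return the n candidates with the highest mean token log probability."""
    ranked = sorted(
        candidates,
        key=lambda c: mean_logprob(c["token_logprobs"]),
        reverse=True,
    )
    return ranked[:n]

# Three made-up candidates, as if best_of=3 had been requested.
candidates = [
    {"text": "A", "token_logprobs": [-0.1, -0.3]},    # mean -0.2
    {"text": "B", "token_logprobs": [-1.0, -2.0]},    # mean -1.5
    {"text": "C", "token_logprobs": [-0.05, -0.05]},  # mean -0.05
]

best = pick_best(candidates, n=2)
print([c["text"] for c in best])
```

With best_of=3 and n=2, three candidates are generated and billed, but only the two top-ranked ones are returned.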

echo (boolean or null, optional, Defaults to false)
Echoes back the prompt in addition to the completion.

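frequency_penalty (number, optional, Defaults to 0)

Number between -2.0 and 2.0. Positive values penalize new tokens based on their existing frequency in the text so far, decreasing the model's likelihood of repeating the same line verbatim.

See more information about frequency and presence penalties.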

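logit_bias (map, optional, Defaults to null)

Modifies the likelihood of specified tokens appearing in the completion.

Accepts a JSON object that maps tokens (specified by their token ID in the tokenizer) to an associated bias value from -100 to 100. The bias is added to the logits generated by the model before sampling. The exact effect varies per model, but values between -1 and 1 should decrease or increase the likelihood of selection, while values like -100 or 100 should result in a ban or exclusive selection of the relevant token.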
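To make the effect of the bias concrete, the sketch below adds bias values to a toy set of logits and then applies a softmax. The token IDs and logit values are invented for illustration, and `apply_logit_bias` is a hypothetical helper; real logits are never exposed by the API.

```python
import math

def softmax(logits):
    """Convert a {token_id: logit} map into probabilities."""
    peak = max(logits.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(v - peak) for tok, v in logits.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}

def apply_logit_bias(logits, bias):
    """Add the per-token bias to the logits before sampling."""
    return {tok: v + bias.get(tok, 0.0) for tok, v in logits.items()}

# Hypothetical token IDs with made-up logits.
logits = {101: 2.0, 102: 1.0, 103: 0.5}

# Ban token 101 and nudge token 103 upward.
biased = apply_logit_bias(logits, {101: -100, 103: 1})
probs = softmax(biased)
print(probs)
```

In an actual request the map is sent as JSON, so the token IDs appear as string keys, for example `{"50256": -100}`.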

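logprobs (integer or null, optional, Defaults to null)

Includes the log probabilities of the logprobs most likely output tokens, as well as the chosen token. For example, if logprobs is 5, the API returns a list of the 5 most likely tokens at each position. The log probability of the sampled token is always returned, so there may be up to logprobs+1 elements in the response. The maximum value for logprobs is 5.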
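Log probabilities are natural logarithms, so `math.exp` converts them back to ordinary probabilities. The sketch below assumes the legacy Completions layout for the `logprobs` object (`tokens`, `token_logprobs`, `top_logprobs`); the numbers are made up, and `token_probabilities` is a hypothetical helper.

```python
import math

# Made-up logprobs object in the assumed legacy layout.
logprobs = {
    "tokens": ["This", " is"],
    "token_logprobs": [-0.22, -0.01],
    "top_logprobs": [
        {"This": -0.22, "Hello": -1.9},
        {" is": -0.01, " was": -4.8},
    ],
}

def token_probabilities(lp):
    """Pair each sampled token with its probability."""
    return [
        (token, math.exp(value))
        for token, value in zip(lp["tokens"], lp["token_logprobs"])
    ]

probs = token_probabilities(logprobs)
for token, p in probs:
    print(repr(token), round(p, 3))
```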

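max_tokens (integer or null, optional, Defaults to 16)

The maximum number of tokens that can be generated in the completion.

The total length of the prompt tokens plus the generated tokens is limited by the model's context length.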
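The context-length constraint is simple arithmetic: prompt tokens plus max_tokens must fit in the model's window. A minimal sketch of clamping a requested value follows; `fit_max_tokens` is a hypothetical helper, and the 4096-token window is an assumption that should be checked against the model actually used.

```python
def fit_max_tokens(prompt_tokens, requested, context_length):
    """Clamp the requested max_tokens so prompt + output fits the context window."""
    available = context_length - prompt_tokens
    if available <= 0:
        raise ValueError("prompt already fills the context window")
    return min(requested, available)

CONTEXT_LENGTH = 4096  # assumed window size; verify for your model

# A 4000-token prompt leaves room for only 96 generated tokens.
budget = fit_max_tokens(prompt_tokens=4000, requested=256, context_length=CONTEXT_LENGTH)
print(budget)
```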

n (integer, optional, Defaults to 1)

How many completions to generate for each prompt.

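presence_penalty (number, optional, Defaults to 0)

Number between -2.0 and 2.0. Positive values penalize new tokens based on whether they have appeared in the text so far, increasing the model's likelihood of talking about new topics.

See more information about frequency and presence penalties.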
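The two penalties can be sketched as an adjustment to each token's logit: the frequency penalty scales with how many times the token has already appeared, while the presence penalty is a one-time deduction once it has appeared at all. This follows the formula given in OpenAI's description of the penalties; the toy logits and the `penalize` helper are hypothetical.

```python
from collections import Counter

def penalize(logits, generated, frequency_penalty=0.0, presence_penalty=0.0):
    """Lower the logits of tokens that already appear in the generated text."""
    counts = Counter(generated)
    return {
        token: value
        - counts[token] * frequency_penalty                    # grows with each repeat
        - (1.0 if counts[token] > 0 else 0.0) * presence_penalty  # flat, once present
        for token, value in logits.items()
    }

# Made-up logits; "cat" was generated twice, "dog" once, "fish" never.
logits = {"cat": 1.0, "dog": 1.0, "fish": 1.0}
adjusted = penalize(
    logits,
    generated=["cat", "cat", "dog"],
    frequency_penalty=0.5,
    presence_penalty=0.25,
)
print(adjusted)
```

Tokens that have not appeared yet keep their original logit, which is why positive penalties push the model toward new wording and new topics.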

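seed (integer or null, optional, Defaults to null)
This feature is in Beta. If specified, the system makes a best effort to sample deterministically, so that repeated requests with the same seed and parameters should return the same result. Determinism is not guaranteed; refer to the system_fingerprint response field to monitor changes in the backend.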
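One way to act on this advice is to compare system_fingerprint before comparing outputs, so that a backend change is not mistaken for sampling noise. The sketch below works on already-parsed response dictionaries; `compare_runs`, its return labels, and the second fingerprint value are hypothetical.

```python
def compare_runs(first, second):
    """Classify two responses produced with the same seed and parameters."""
    if first.get("system_fingerprint") != second.get("system_fingerprint"):
        # Different backend configuration: differing outputs are expected.
        return "backend_changed"
    texts_a = [choice["text"] for choice in first["choices"]]
    texts_b = [choice["text"] for choice in second["choices"]]
    return "match" if texts_a == texts_b else "nondeterministic"

run_a = {"system_fingerprint": "fp_44709d6fcb",
         "choices": [{"text": "\n\nThis is indeed a test"}]}
run_b = {"system_fingerprint": "fp_44709d6fcb",
         "choices": [{"text": "\n\nThis is indeed a test"}]}
run_c = {"system_fingerprint": "fp_0000000000",  # made-up fingerprint
         "choices": [{"text": "\n\nThis is a test"}]}

print(compare_runs(run_a, run_b))
print(compare_runs(run_a, run_c))
```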

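stop (string or array, optional, Defaults to null)

Up to 4 sequences at which the API stops generating further tokens. The returned text does not contain the stop sequence.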
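The effect of stop sequences can be reproduced locally on a finished string: the text is cut at the earliest occurrence of any sequence, and the sequence itself is left out. This is only an illustration of the behavior, not how the server implements it; `truncate_at_stop` is a hypothetical helper.

```python
def truncate_at_stop(text, stop):
    """Cut text at the earliest stop sequence, excluding the sequence itself."""
    sequences = [stop] if isinstance(stop, str) else list(stop)
    if len(sequences) > 4:
        raise ValueError("at most 4 stop sequences are allowed")
    cut = len(text)
    for sequence in sequences:
        if not sequence:
            continue  # ignore empty strings, which would match at index 0
        index = text.find(sequence)
        if index != -1:
            cut = min(cut, index)
    return text[:cut]

result = truncate_at_stop("Q: hi\nA: hello\nQ: next", ["\nQ:", "###"])
print(repr(result))
```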

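stream (boolean, optional, Defaults to false)

If set, partial results are sent back incrementally, as in ChatGPT. Tokens are sent as data-only server-sent events as they become available, and the stream is terminated by a data: [DONE] message. See the OpenAI Cookbook for example code.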
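When the stream is consumed without an SDK, each event arrives as a line starting with `data:` followed by a JSON chunk, and the final line is `data: [DONE]`. The sketch below parses such lines from an in-memory list; the sample events are abridged and invented, and `iter_sse_text` is a hypothetical helper.

```python
import json

def iter_sse_text(lines):
    """Yield the text of each streamed chunk until the [DONE] sentinel."""
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and other fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break
        chunk = json.loads(payload)
        yield chunk["choices"][0]["text"]

# Abridged, made-up events in the format described above.
sample_stream = [
    'data: {"choices": [{"text": "This", "index": 0}]}',
    "",
    'data: {"choices": [{"text": " is", "index": 0}]}',
    "",
    "data: [DONE]",
]

text = "".join(iter_sse_text(sample_stream))
print(text)
```

In practice the lines would come from an HTTP response read incrementally rather than from a list.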

suffix (string or null, optional, Defaults to null)
The suffix that comes after a completion of inserted text.
This parameter is only supported for gpt-3.5-turbo-instruct.

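temperature (number or null, optional, Defaults to 1)

The sampling temperature to use, between 0 and 2.

Higher values such as 0.8 make the output more random, while lower values such as 0.2 make it more focused and deterministic.

We generally recommend adjusting either temperature or top_p, but not both.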
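Temperature is commonly described as dividing the logits by the temperature before the softmax, which is why low values concentrate probability on the top token. The sketch below illustrates that standard formulation on toy logits; treating a temperature of 0 as a greedy pick is an assumption made here, not something the text above specifies.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Turn logits into probabilities after scaling by 1/temperature."""
    if temperature <= 0:
        # Assumed convention: temperature 0 means always pick the top token.
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [value / temperature for value in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(value - peak) for value in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up values
sharp = softmax_with_temperature(logits, 0.2)
flat = softmax_with_temperature(logits, 2.0)
print([round(p, 3) for p in sharp])
print([round(p, 3) for p in flat])
```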

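top_p (number or null, optional, Defaults to 1)

An alternative to sampling with temperature, called nucleus sampling, where the model considers only the tokens that make up the top_p probability mass. So 0.1 means only the tokens comprising the top 10% probability mass are considered.
We generally recommend adjusting either this value or temperature, but not both.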
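Nucleus sampling can be sketched as: sort tokens by probability, keep the smallest prefix whose cumulative probability reaches top_p, and renormalize. The distribution below is invented, and `nucleus_filter` is a hypothetical helper showing the filtering step only, not the sampling itself.

```python
def nucleus_filter(probs, top_p):
    """Keep the most likely tokens whose cumulative mass reaches top_p."""
    ranked = sorted(probs.items(), key=lambda item: item[1], reverse=True)
    kept, cumulative = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        cumulative += p
        if cumulative >= top_p:
            break
    total = sum(p for _, p in kept)
    # Renormalize so the surviving tokens sum to 1.
    return {token: p / total for token, p in kept}

probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}  # made-up distribution
narrow = nucleus_filter(probs, 0.1)
wide = nucleus_filter(probs, 0.9)
print(narrow)
print(sorted(wide))
```

With top_p set to 0.1 only the single most likely token survives, while 0.9 keeps the three most likely tokens and drops the long tail.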

user (string, optional)

A unique identifier representing your end user, which can help OpenAI monitor and detect abuse.

Example 1. No streaming

Python:

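from openai import OpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
# base_url points the client at the endpoint documented above.
client = OpenAI(base_url="https://api.zhizengzeng.com/v1")

completion = client.completions.create(
  model="gpt-3.5-turbo-instruct",
  prompt="Say this is a test",
  max_tokens=7,
  temperature=0
)
print(completion.choices[0].text)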

Node.js:

import OpenAI from "openai";

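// The API key is read from the OPENAI_API_KEY environment variable.
const openai = new OpenAI({ baseURL: "https://api.zhizengzeng.com/v1" });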

async function main() {
  const completion = await openai.completions.create({
    model: "gpt-3.5-turbo-instruct",
    prompt: "Say this is a test.",
    max_tokens: 7,
    temperature: 0,
  });

  console.log(completion);
}
main();

Response:

{
  "id": "cmpl-uqkvlQyYK7bGYrRHQ0eXlWi7",
  "object": "text_completion",
  "created": 1589478378,
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb",
  "choices": [
    {
      "text": "\n\nThis is indeed a test",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ],
  "usage": {
    "prompt_tokens": 5,
    "completion_tokens": 7,
    "total_tokens": 12
  }
}

Example 2. Streaming

curl:

curl https://api.zhizengzeng.com/v1/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $OPENAI_API_KEY" \
  -d '{
    "model": "gpt-3.5-turbo-instruct",
    "prompt": "Say this is a test",
    "max_tokens": 7,
    "temperature": 0,
    "stream": true
  }'

Python:

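from openai import OpenAI

# The API key is read from the OPENAI_API_KEY environment variable.
# base_url points the client at the endpoint documented above.
client = OpenAI(base_url="https://api.zhizengzeng.com/v1")

for chunk in client.completions.create(
  model="gpt-3.5-turbo-instruct",
  prompt="Say this is a test",
  max_tokens=7,
  temperature=0,
  stream=True
):
  # Each chunk carries the next piece of text; print without extra newlines.
  print(chunk.choices[0].text, end="", flush=True)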

Node.js:

import OpenAI from "openai";

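// The API key is read from the OPENAI_API_KEY environment variable.
const openai = new OpenAI({ baseURL: "https://api.zhizengzeng.com/v1" });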

async function main() {
  const stream = await openai.completions.create({
    model: "gpt-3.5-turbo-instruct",
    prompt: "Say this is a test.",
    stream: true,
  });

  for await (const chunk of stream) {
    console.log(chunk.choices[0].text)
  }
}
main();

Response (one streamed chunk):

{
  "id": "cmpl-7iA7iJjj8V2zOkCGvWF2hAkDWBQZe",
  "object": "text_completion",
  "created": 1690759702,
  "choices": [
    {
      "text": "This",
      "index": 0,
      "logprobs": null,
      "finish_reason": null
    }
  ],
  "model": "gpt-3.5-turbo-instruct",
  "system_fingerprint": "fp_44709d6fcb"
}