AI image generation with OpenAI API

OpenAI exposes image generation through the Image API (POST /images/generations). The official openai npm package wraps it as client.images.generate. This post walks through the main request parameters and how to save generated images from Node.js. The examples use gpt-image-2, OpenAI’s latest GPT Image model. GPT Image models always return base64-encoded image data in data[].b64_json. Use output_format for the on-disk file type and put artistic direction in the prompt. For text generation with the same package, see the Chat Completions API and Responses API posts. Image generation is also available through Responses API tools, but this post focuses on the dedicated Image API endpoint.

The running scenario: generate marketing hero images for a fictional todo app.

Prerequisites

  • OpenAI account
  • Generated API key
  • Enabled billing
  • Node.js version 26
  • openai package installed (npm i openai)

Client setup

Create a client with your API key (read from the environment in production).

import OpenAI from 'openai';
const client = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

The same SDK can target other hosts that implement a compatible API by setting baseURL and apiKey:

const client = new OpenAI({
  apiKey: process.env.LLM_API_KEY,
  baseURL: 'https://your-gateway.example/v1',
});

Azure OpenAI uses the SDK's AzureOpenAI class instead. Confirm that your provider supports the Image API and the model you pass as model.
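
The two configurations above can be folded into one helper so that switching hosts is an environment change rather than a code change. This is a minimal sketch, not part of the SDK: buildClientOptions is a made-up name, and LLM_BASE_URL is a hypothetical variable introduced here alongside the LLM_API_KEY and OPENAI_API_KEY names used above.

```javascript
// Build the options object for `new OpenAI(options)` from an env-like object.
// LLM_BASE_URL is a hypothetical variable name used only in this sketch.
function buildClientOptions(env) {
  const apiKey = env.LLM_API_KEY || env.OPENAI_API_KEY;
  if (!apiKey) {
    throw new Error('Set LLM_API_KEY or OPENAI_API_KEY');
  }
  const options = { apiKey };
  // Override the host only when a gateway URL is provided.
  if (env.LLM_BASE_URL) {
    options.baseURL = env.LLM_BASE_URL;
  }
  return options;
}
```

With this in place, the client would be created as new OpenAI(buildClientOptions(process.env)).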

Basic integration

Call client.images.generate with model and prompt. The examples use gpt-image-2; older models include gpt-image-1.5, gpt-image-1, and gpt-image-1-mini. Pin a snapshot (for example gpt-image-2-2026-04-21) when you need stable behavior across deploys. The prompt describes what to generate, and GPT Image models accept up to about 32,000 characters, so be specific about subject, layout, colors, and style. Because the response carries base64 in data[].b64_json, decode it and write the file yourself.

import { writeFile } from 'node:fs/promises';

const prompt = `
  Minimal flat illustration for a productivity app landing page. 
  Show a todo dashboard with a checklist, calendar widget, and soft pastel palette. 
  No text labels on screen elements.
`.trim();

const result = await client.images.generate({
  model: 'gpt-image-2',
  prompt,
});

await writeFile('hero.png', Buffer.from(result.data[0].b64_json, 'base64'));
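
The prompt advice above (name the subject, layout, colors, and style, and stay under roughly 32,000 characters) can be sketched as a small builder. The function name, the field names, and the exact limit constant are assumptions made for this illustration; the API only sees the final string.

```javascript
// Approximate limit taken from the text above; not an exact API constant.
const MAX_PROMPT_CHARS = 32000;

// Join the named parts of an art brief into one prompt string.
function buildPrompt({ subject, layout, colors, style }) {
  const prompt = [subject, layout, colors, style]
    .map((part) => (part ?? '').trim())
    .filter(Boolean) // skip parts that were missing or blank
    .join(' ');
  if (prompt.length === 0) {
    throw new Error('Prompt is empty');
  }
  // String length counts UTF-16 code units, which is close enough for a guard.
  if (prompt.length > MAX_PROMPT_CHARS) {
    throw new Error(`Prompt has ${prompt.length} characters; the limit is about ${MAX_PROMPT_CHARS}`);
  }
  return prompt;
}
```

The returned string can be passed as prompt to client.images.generate in place of the template literal.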

n

Use n to generate multiple images in one request (default 1, maximum 10). Loop over result.data to save each image.

import { writeFile } from 'node:fs/promises';

const result = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, variant layout, soft pastel colors',
  n: 2,
});

for (const [index, item] of result.data.entries()) {
  await writeFile(
    `hero-${index}.png`,
    Buffer.from(item.b64_json, 'base64'),
  );
}
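
If several call sites save images, the decode step in the loop above can be separated from the disk write, which also makes it easy to check for a missing payload. A minimal sketch; toImageFiles and its prefix and extension arguments are hypothetical helpers, not part of the openai package.

```javascript
// Turn an Image API response into { filename, bytes } pairs without touching disk.
// `prefix` and `extension` are chosen by the caller.
function toImageFiles(result, prefix = 'hero', extension = 'png') {
  const items = result?.data ?? [];
  if (items.length === 0) {
    throw new Error('Response contained no images');
  }
  return items.map((item, index) => {
    if (!item.b64_json) {
      throw new Error(`Image ${index} has no b64_json payload`);
    }
    return {
      filename: `${prefix}-${index}.${extension}`,
      bytes: Buffer.from(item.b64_json, 'base64'),
    };
  });
}
```

Each pair can then be written with writeFile(filename, bytes) inside the same kind of for...of loop.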

Size

Control dimensions with size. Common presets are 1024x1024 (square), 1536x1024 (landscape), and 1024x1536 (portrait). auto lets the model pick based on the prompt. gpt-image-2 also accepts custom WIDTHxHEIGHT strings when width and height are multiples of 16, the aspect ratio is between 1:3 and 3:1, and total pixels stay within the documented limits.

const result = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, portrait orientation, soft pastel colors',
  size: '1024x1536',
});
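
The custom-size rules above can be checked before a request is sent. This sketch treats the 1:3 and 3:1 bounds as inclusive, which is an assumption, and it takes the pixel limit as an optional argument because the exact figure is not stated here; isValidCustomSize is a hypothetical helper.

```javascript
// Check a custom WIDTHxHEIGHT string against the rules described above.
// `maxPixels` defaults to no limit; pass the documented value once you have it.
function isValidCustomSize(size, maxPixels = Infinity) {
  const match = /^(\d+)x(\d+)$/.exec(size);
  if (!match) return false;
  const width = Number(match[1]);
  const height = Number(match[2]);
  if (width === 0 || height === 0) return false;
  // Both dimensions must be multiples of 16.
  if (width % 16 !== 0 || height % 16 !== 0) return false;
  // Aspect ratio between 1:3 and 3:1, treated as inclusive in this sketch.
  const ratio = width / height;
  if (ratio < 1 / 3 || ratio > 3) return false;
  return width * height <= maxPixels;
}
```

The helper only covers explicit dimensions; the auto value is valid for the API but is not a WIDTHxHEIGHT string, so it returns false here.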

Quality

Set rendering quality with quality. Use low for fast drafts and iterations, then medium or high for final assets. Default is auto.

const draft = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, soft pastel colors',
  quality: 'low',
});

const final = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, soft pastel colors, polished details',
  quality: 'high',
  size: '1024x1536',
});
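
To keep the draft-then-final workflow consistent across a codebase, the quality choice can live in one lookup. A small sketch; the stage names (draft, review, final) and the requestForStage helper are hypothetical, and only the quality values come from the API.

```javascript
// Map a workflow stage to a quality value. Stage names are labels for this sketch.
const QUALITY_BY_STAGE = { draft: 'low', review: 'medium', final: 'high' };

// Build request parameters for a stage; `overrides` can add size and so on.
function requestForStage(stage, prompt, overrides = {}) {
  if (!Object.hasOwn(QUALITY_BY_STAGE, stage)) {
    throw new Error(`Unknown stage: ${stage}`);
  }
  return {
    model: 'gpt-image-2',
    prompt,
    quality: QUALITY_BY_STAGE[stage],
    ...overrides,
  };
}
```

The result is passed straight to client.images.generate, for example requestForStage('final', prompt, { size: '1024x1536' }).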

Output format

GPT Image models return base64 in the JSON response. Use output_format to control the encoded file type: png (default), jpeg, or webp.

import { writeFile } from 'node:fs/promises';

const result = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, soft pastel colors',
  output_format: 'jpeg',
});

await writeFile('hero.jpg', Buffer.from(result.data[0].b64_json, 'base64'));
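
Since output_format decides what the decoded bytes are, the file extension (and the Content-Type, if you serve the image) should follow from it rather than be typed by hand. A minimal sketch with a hypothetical fileInfoFor helper:

```javascript
// File extension and MIME type for each output_format value named above.
const FORMAT_INFO = {
  png: { extension: 'png', mime: 'image/png' },
  jpeg: { extension: 'jpg', mime: 'image/jpeg' },
  webp: { extension: 'webp', mime: 'image/webp' },
};

// Defaults to png because that is the default when output_format is omitted.
function fileInfoFor(outputFormat = 'png') {
  if (!Object.hasOwn(FORMAT_INFO, outputFormat)) {
    throw new Error(`Unsupported output_format: ${outputFormat}`);
  }
  return FORMAT_INFO[outputFormat];
}
```

For the request above, fileInfoFor('jpeg').extension gives the jpg used in the hero.jpg filename.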

Compression

When output_format is jpeg or webp, set output_compression from 0 to 100 to trade file size for quality. JPEG is often faster than PNG when latency matters.

output_formatjpegwebp 时,设置 output_compression(0 到 100)以权衡文件大小和质量。在对延迟敏感的情况下,JPEG 通常比 PNG 更快。

const result = await client.images.generate({
  model: 'gpt-image-2',
  prompt: 'Minimal flat illustration of a todo app dashboard, soft pastel colors',
  output_format: 'webp',
  output_compression: 50,
});
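
Because output_compression only applies to jpeg and webp, a small guard can catch a mismatched request before it reaches the API. This is a sketch with a hypothetical withCompression helper; it treats the level as an integer, which is an assumption made here.

```javascript
// Add output_compression to a request only when it can take effect.
function withCompression(params, level) {
  const format = params.output_format ?? 'png'; // png is the default format
  if (format !== 'jpeg' && format !== 'webp') {
    throw new Error(`output_compression needs jpeg or webp, got ${format}`);
  }
  if (!Number.isInteger(level) || level < 0 || level > 100) {
    throw new Error('output_compression must be an integer from 0 to 100');
  }
  // Return a copy so the caller's params object is left unchanged.
  return { ...params, output_compression: level };
}
```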

Background

Use background: 'transparent' with png or webp on models that support it when you need a cutout asset. gpt-image-2 does not support transparent backgrounds; use gpt-image-1.5 or an earlier GPT Image model for that workflow, or bake the background into the prompt.

const result = await client.images.generate({
  model: 'gpt-image-1.5',
  prompt: 'Flat icon of a checkmark, no background, centered',
  output_format: 'png',
  background: 'transparent',
});
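
The constraints above (gpt-image-2 cannot produce transparency, and transparency needs png or webp) can be checked locally. A minimal sketch that encodes only what this post states; checkBackground is a hypothetical helper, and the prefix match on the model name is an assumption meant to cover dated snapshots.

```javascript
// Reject transparent-background requests that the text above says cannot work.
function checkBackground({ model, background, output_format = 'png' }) {
  if (background !== 'transparent') return;
  // Prefix match so snapshots such as gpt-image-2-2026-04-21 are covered too.
  if (model.startsWith('gpt-image-2')) {
    throw new Error(`${model} does not support transparent backgrounds`);
  }
  if (output_format !== 'png' && output_format !== 'webp') {
    throw new Error('Transparent backgrounds need png or webp output');
  }
}
```

Calling checkBackground(params) just before client.images.generate(params) turns a failed or wasted request into an immediate local error.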

Production notes

  • Cost: scales with quality and size. See OpenAI pricing before generating at scale.

  • Moderation: moderation defaults to 'auto'. On GPT Image models, set it to 'low' when you need less restrictive filtering.

  • Errors: handle image_generation_user_error (for example moderation_blocked) by changing the prompt or inputs; do not blindly retry.

  • Latency: complex prompts can take up to about two minutes.

  • Storage: decode and persist files yourself. GPT Image responses are base64 in JSON.

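
The error-handling note above can be sketched as a retry predicate. It assumes the thrown error exposes status, type, and code fields, as the openai package's APIError does; other clients or gateways may shape errors differently, and shouldRetry is a hypothetical helper.

```javascript
// Decide whether a failed image request is worth retrying unchanged.
function shouldRetry(error) {
  // User errors (for example moderation_blocked) need a changed prompt or input.
  if (error?.type === 'image_generation_user_error') return false;
  if (error?.code === 'moderation_blocked') return false;
  // Rate limits and server-side failures are usually transient.
  if (error?.status === 429) return true;
  if (typeof error?.status === 'number' && error.status >= 500) return true;
  // Anything else (bad request, auth failure, unknown shape) is not retried.
  return false;
}
```

The openai package already retries some failures on its own (see its maxRetries option), so a check like this matters mostly for retry loops you add around it.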