What I learned generating OG images for articles with Playwright and zero API cost

The conclusion first: for a batch of up to a few hundred static articles, generating OG images by screenshotting HTML templates with Playwright costs nothing, gives you full CSS control, and requires no external API keys. The trade-offs are real: it’s slow per image, it’s not suitable for on-demand generation, and it has a hidden dependency on network availability during the build step. But for my use case, those trade-offs don’t hurt. Here’s how the script works, what broke, and what I’d do differently.

Why I avoided image generation APIs

My three directory sites (aiappdex.com, findindiegame.com, ossfind.com) are fully static Astro 5 SSG builds. Articles publish automatically through a GitHub Actions pipeline. The pipeline already handles Dev.to, Hashnode, and Bluesky distribution, plus YouTube thumbnail generation with ffmpeg. I didn’t want to add a billed API dependency to this stack.

The options I considered:

  • Cloudinary with remote transformations: works for on-demand generation, but requires a paid plan for custom fonts, and the transformation URL syntax is fragile and easy to URL-encode incorrectly.
  • @vercel/og (Satori-based): excellent for Next.js and Vercel serverless functions, but my sites are static pages on Cloudflare Pages, so there’s no Edge runtime to serve dynamic OG images from.
  • node-canvas: full control, zero cost, but native C++ binding compilation in GitHub Actions runners is a recurring pain point. It works, but it adds a non-trivial setup step to CI.
  • Pillow (Python image library): draws to a bitmap directly. Fine for simple layouts, but anything involving custom fonts, gradients, or CSS flexbox behavior is either impossible or requires dozens of manual coordinate calculations.

The Playwright approach: build an HTML string with CSS, pass it to a headless browser, and screenshot it. The browser handles fonts, gradients, flexbox, and every other CSS feature I want to use. No API key. No external service. Just a 160-line Python script and Playwright installed in the runner.

How the HTML template and accent color system work

The script builds a full HTML document as a string, fills in the article title, date, and tags, and hands it to Playwright. The template has a dark card layout with the Inter typeface loaded from the Google Fonts CDN. The one non-obvious piece is the accent color selection. Each article has tags like ["webdev", "astro", "tutorial", "githubactions"]. The script matches these against five regex rules to pick an accent color.

Rules are checked in order; the first match wins. An article tagged ["ai", "webdev"] would pick purple, because ai matches the first rule before webdev matches the second. The accent color is inserted into the HTML at three points: the background radial gradient, the brand mark block, and the tag pill borders. This gives each article a visually distinct color family without requiring any per-article design decision.
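The first-match-wins lookup can be sketched roughly as below. Only two details come from the text above: an ai rule sits ahead of a webdev rule, and ai maps to purple. Every pattern, every hex value, the remaining three rules, and the names ACCENT_RULES, DEFAULT_ACCENT, and pick_accent are placeholders invented for illustration, not the script’s actual table.

```python
import re

# Hypothetical rule table: order matters, the first matching rule wins.
# Only "ai before webdev" and "ai -> purple" come from the article;
# all patterns and hex values below are illustrative assumptions.
ACCENT_RULES = [
    (re.compile(r"^(ai|llm|machinelearning)$"), "#a855f7"),   # purple
    (re.compile(r"^(webdev|astro|javascript)$"), "#38bdf8"),  # assumed color
    (re.compile(r"^(gamedev|indiegame)$"), "#f97316"),        # assumed rule
    (re.compile(r"^(opensource|github.*)$"), "#22c55e"),      # assumed rule
    (re.compile(r"^(tutorial|beginners)$"), "#eab308"),       # assumed rule
]
DEFAULT_ACCENT = "#94a3b8"  # assumed fallback when no rule matches


def pick_accent(tags):
    """Return the color of the first rule that matches any of the tags."""
    for pattern, color in ACCENT_RULES:
        if any(pattern.match(tag.lower()) for tag in tags):
            return color
    return DEFAULT_ACCENT
```

Note that the outer loop runs over rules, not tags, so the result depends on rule order rather than on the order in which an article lists its tags.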

Font size also adjusts dynamically: titles over 70 characters render at 54px; shorter titles render at 64px. This is a heuristic that keeps long titles from overflowing the card boundary.
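As a minimal sketch, the size heuristic is a single threshold check. The 70-character cutoff and the two pixel sizes come from the text above; the function names and the inline-style markup in title_markup are assumptions made for illustration.

```python
import html

LONG_TITLE_THRESHOLD = 70  # characters, per the heuristic described above


def title_font_size(title):
    """Return the title font size in px: 54 for long titles, 64 otherwise."""
    return 54 if len(title) > LONG_TITLE_THRESHOLD else 64


def title_markup(title):
    """Hypothetical helper: render the title element with its inline size."""
    size = title_font_size(title)
    # Escape the title so characters like & and < cannot break the template.
    return f'<h1 style="font-size:{size}px">{html.escape(title)}</h1>'
```

A title of exactly 70 characters still gets the larger size, since the rule applies only to titles over 70.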

The key implementation: wait_until="networkidle"

The core Playwright call is:

page.set_content(html, wait_until="networkidle")

The wait_until="networkidle" argument was the critical discovery. Without it, the screenshot can fire before Google Fonts has loaded and applied Inter. The result: the fallback system-ui font renders instead, which looks noticeably different and varies with the runner’s OS default.

networkidle tells Playwright to wait until there have been no network connections for at least 500ms. In practice this means the Google Fonts CDN request completes and Inter loads before the screenshot fires. This adds roughly 300–500ms per image. The template includes <link rel="preconnect" ...> to minimize latency.
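For context, here is a sketch of the kind of document head that the networkidle wait is waiting on. The article does not show the real template, so everything here is an assumption: the two preconnect hosts and the css2 URL follow the standard Google Fonts embed snippet, while the font weights and the name build_document are made up for this example.

```python
# Assumed stylesheet URL in the Google Fonts css2 format; weights are a guess.
FONT_CSS_URL = (
    "https://fonts.googleapis.com/css2?family=Inter:wght@400;700&display=swap"
)


def build_document(body_html, font_css_url=FONT_CSS_URL):
    """Wrap card markup in a full HTML document that loads Inter remotely."""
    # Doubled braces in the <style> line are literal braces in an f-string.
    return f"""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<link rel="preconnect" href="https://fonts.googleapis.com">
<link rel="preconnect" href="https://fonts.gstatic.com" crossorigin>
<link rel="stylesheet" href="{font_css_url}">
<style>body {{ margin: 0; font-family: 'Inter', system-ui, sans-serif; }}</style>
</head>
<body>{body_html}</body>
</html>"""
```

The returned string is what would be handed to page.set_content(..., wait_until="networkidle"); the stylesheet and font requests are the network activity that has to settle before the screenshot.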

The browser instance stays open across all articles

One implementation detail that matters for batch performance: the script opens a single browser instance and reuses the same page object across all articles, calling set_content() in a loop rather than navigating to a URL. This is faster than opening a new browser per article because Playwright browser startup time is around 500ms. For 22 articles, that’s ~11 seconds saved.

The clip parameter on screenshot() is necessary even though the viewport is already set to 1200x630. Without it, screenshots include a 1px bottom border artifact on some versions of Chromium. The clip forces the exact pixel region I want.
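Put together, the batch loop might look something like the sketch below: one browser launch, one page, set_content() per article, and an explicit clip on every screenshot. The Playwright calls are the library’s sync Python API; the {slug: html} input shape, the output naming, and the names output_path and render_all are assumptions, not the article’s actual script.

```python
from pathlib import Path

WIDTH, HEIGHT = 1200, 630
# Explicit clip region matching the viewport, as described above.
CLIP = {"x": 0, "y": 0, "width": WIDTH, "height": HEIGHT}


def output_path(out_dir, slug):
    """Hypothetical naming scheme: one PNG per article slug."""
    return Path(out_dir) / f"{slug}.png"


def render_all(pages, out_dir):
    """Render a {slug: html} mapping to PNG files with one browser and one page."""
    # Imported here so the helpers above work without Playwright installed.
    from playwright.sync_api import sync_playwright

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    written = []
    with sync_playwright() as p:
        browser = p.chromium.launch()  # launched once for the whole batch
        page = browser.new_page(viewport={"width": WIDTH, "height": HEIGHT})
        for slug, html_doc in pages.items():
            # Replace the page content in place instead of navigating to a URL.
            page.set_content(html_doc, wait_until="networkidle")
            target = output_path(out_dir, slug)
            page.screenshot(path=str(target), clip=CLIP)
            written.append(target)
        browser.close()
    return written
```

Calling render_all({"my-post": html_doc}, "og") would write og/my-post.png, assuming Playwright and its Chromium build are installed in the runner.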

Two image formats from one pipeline

The same GitHub Actions job runs two separate scripts: generate-og.py for the standard 1200×630 OG image, and generate-summary.py for a 1080×1350 portrait image optimized for Bluesky’s visual post format.