Programmable Whitelist-based Configs: Embedding Rye in Go

Config spec feature creep

Configuration starts simple: a few keys and values. Then you want to group values and nest them, and then somebody asks for simple expressions, conditionals, variables ...

Any sufficiently complicated configuration system contains an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a programming language. (Greenspun’s Tenth Rule, applied to config)

You gradually end up with a programming language that was never designed as one. YAML got templating layered on top, Nginx added if, and Terraform invented HCL. Helm is a particularly bad case: two languages and two escaping contexts, paired with indentation rules that sometimes conflict, plus a preprocessing step.

{{- if eq .Values.env "production" }}
  - path: /admin
    backend: admin-svc
{{- end }}

HCL avoids much of Helm’s complexity and duality, but it still exposes a fixed set of language constructs to your config, which may be too much or too little. For example, there is no if, so a conditional block has to be written as a ternary inside for_each:

dynamic "route" {
  for_each = var.env == "production" ? [1] : []
  content {
    path = "/admin"
  }
}

Bring a full language in

Another approach is to embed a full scripting language such as Lua or Python. Now the config can do everything, and then some.

if os.getenv("ENV") == "production" then
  add_route("/admin", admin_handler)
end

A full language usually also brings os.execute(), io.open(), and require(). We can remove os, io, and require before handing the interpreter untrusted code, but that is just denylisting. A denylist only blocks the dangers you already know about; there are always unknown unknowns, and you can’t block what you don’t know you should.
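The gap between the two strategies can be sketched in plain Go, without any scripting engine. Everything here is an assumption made for illustration: the `Capability` type, the `fullRuntime` table, and the choice of `dofile` as the entry the denylist author forgot. The names echo Lua's standard library but this is not a real Lua binding.

```go
package main

import (
	"fmt"
	"sort"
)

// Capability is a hypothetical stand-in for a function the host exposes to scripts.
type Capability func() string

// fullRuntime models what a general-purpose language ships with by default.
// The names echo Lua's standard library but are illustrative only.
func fullRuntime() map[string]Capability {
	return map[string]Capability{
		"os.getenv":  func() string { return "read an environment variable" },
		"os.execute": func() string { return "run a shell command" },
		"io.open":    func() string { return "open a file" },
		"require":    func() string { return "load a module" },
		"dofile":     func() string { return "load and run a file" },
		"add_route":  func() string { return "register a route" },
	}
}

// denylist copies everything, then removes the names we know to be dangerous.
func denylist(all map[string]Capability, banned []string) map[string]Capability {
	out := make(map[string]Capability, len(all))
	for name, fn := range all {
		out[name] = fn
	}
	for _, name := range banned {
		delete(out, name)
	}
	return out
}

// allowlist starts empty and copies only the names we explicitly grant.
func allowlist(all map[string]Capability, granted []string) map[string]Capability {
	out := make(map[string]Capability, len(granted))
	for _, name := range granted {
		if fn, ok := all[name]; ok {
			out[name] = fn
		}
	}
	return out
}

// names returns the sorted keys so the output is stable.
func names(m map[string]Capability) []string {
	out := make([]string, 0, len(m))
	for name := range m {
		out = append(out, name)
	}
	sort.Strings(out)
	return out
}

func main() {
	all := fullRuntime()
	// The denylist author remembered three names but not dofile.
	fmt.Println("denylist: ", names(denylist(all, []string{"os.execute", "io.open", "require"})))
	fmt.Println("allowlist:", names(allowlist(all, []string{"os.getenv", "add_route"})))
}
```

In this sketch the denylisted runtime still exposes `dofile`, because nobody thought to ban it, while the allowlisted runtime contains only the two granted names no matter what else the full runtime grows later.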

Allowlist with Rye

Rye takes the opposite approach: start with no language features at all, then explicitly allowlist capabilities. Here’s a quick preview:

// Go side: grant exactly two operations
evaldo.RegisterBuiltinsFilter(ps, []string{"_++", "os/cwd?"})

On the Go side, we register just two builtin functions, _++ and os/cwd? (cwd? is the current-working-directory builtin, defined inside the os context).

; Config side: use them
docs: os/cwd? ++ "/docs"

This is the entire vocabulary. Nothing else was ever granted, so no other word is defined. If the config tries to read a file with Read %my-secrets, it fails with Error: Word Read not found.

What is Rye

Rye is a general-purpose language written in pure Go (no CGO), and you can import it like any Go library. Rye is homoiconic, and every active word is just a function, added at the library level. No if, fn, or loop behaviour is hardcoded into the evaluator.
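To make "every active word is just a function" concrete, here is a toy prefix evaluator in Go. It is not Rye's evaluator and does not use Rye's API; `Value`, `Word`, `Block`, `Builtin`, and `Evaluator` are all hypothetical names invented for this sketch. What it tries to show is that `if` can live in the same lookup table as any other word, so an embedder that never registers it ends up with a language that has no conditionals.

```go
package main

import (
	"errors"
	"fmt"
)

// Value is anything the toy evaluator can hold: literals, words, or blocks.
type Value any

// Word is a name to be looked up; Block is code represented as plain data.
type Word string
type Block []Value

// Builtin is a registered function together with the number of arguments it consumes.
type Builtin struct {
	Arity int
	Fn    func(ev *Evaluator, args []Value) (Value, error)
}

// Evaluator knows no syntax of its own; it only knows the words it was given.
type Evaluator struct {
	words map[string]Builtin
}

// Eval runs every expression in the block and returns the last result.
func (ev *Evaluator) Eval(b Block) (Value, error) {
	var last Value
	for i := 0; i < len(b); {
		v, next, err := ev.evalAt(b, i)
		if err != nil {
			return nil, err
		}
		last, i = v, next
	}
	return last, nil
}

// evalAt evaluates one expression starting at index i and reports where it ended.
func (ev *Evaluator) evalAt(b Block, i int) (Value, int, error) {
	if i >= len(b) {
		return nil, i, errors.New("unexpected end of block")
	}
	w, isWord := b[i].(Word)
	if !isWord {
		return b[i], i + 1, nil // literals and nested blocks evaluate to themselves
	}
	bi, ok := ev.words[string(w)]
	if !ok {
		return nil, i, fmt.Errorf("word %s not found", w)
	}
	i++
	args := make([]Value, 0, bi.Arity)
	for n := 0; n < bi.Arity; n++ {
		arg, next, err := ev.evalAt(b, i)
		if err != nil {
			return nil, i, err
		}
		args = append(args, arg)
		i = next
	}
	v, err := bi.Fn(ev, args)
	return v, i, err
}

// ifWord is an ordinary builtin: it receives a bool and an unevaluated block.
var ifWord = Builtin{Arity: 2, Fn: func(ev *Evaluator, args []Value) (Value, error) {
	cond, okCond := args[0].(bool)
	body, okBody := args[1].(Block)
	if !okCond || !okBody {
		return nil, errors.New("if expects a bool and a block")
	}
	if !cond {
		return nil, nil
	}
	return ev.Eval(body)
}}

// greetWord takes no arguments and returns a constant.
var greetWord = Builtin{Arity: 0, Fn: func(*Evaluator, []Value) (Value, error) {
	return "hello", nil
}}

func main() {
	program := Block{Word("if"), true, Block{Word("greet")}}

	granted := &Evaluator{words: map[string]Builtin{"if": ifWord, "greet": greetWord}}
	v, err := granted.Eval(program)
	fmt.Println(v, err)

	withheld := &Evaluator{words: map[string]Builtin{"greet": greetWord}}
	v, err = withheld.Eval(program)
	fmt.Println(v, err)
}
```

The same program succeeds in the first evaluator and fails with a "not found" error in the second. The only difference between them is the contents of the word table, which is the property the allowlist approach relies on.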

What about Starlark

Starlark was built for exactly this. It is a mature solution and it brings a lot to the table, so the comparison here is about concepts only. Starlark gives you if, for, and def unconditionally; you can’t take them away. Rye has no reserved forms: words like if are just functions you choose to register. Starlark’s modules are also closer to all-or-nothing, while Rye lets you grant _+ but not _*.

Example: a Markdown-serving web server

We will build a Go web server that reads Markdown, converts it to HTML, and serves it over HTTP. Rye is used for the config file.

Step 1 - The minimal server (~50 lines + validation)

package main

import (
	"fmt"
	"html/template"
	"log"
	"net/http"
	"os"
	"path/filepath"
	"strings"

	"github.com/refaktor/rye/env"
	"github.com/refaktor/rye/evaldo"
	"github.com/refaktor/rye/loader"
	"github.com/yuin/goldmark"
)

// ... (some code omitted)

func main() {
	raw, err := os.ReadFile("config.rye")
	if err != nil {
		log.Fatalf("failed to read config: %v", err)
	}
	ps := env.NewProgramState()
	blk := loader.LoadString(string(raw), false, ps)
    
	// ... (some logic omitted)

	http.HandleFunc("/", func(w http.ResponseWriter, r *http.Request) {
		slug := strings.TrimPrefix(r.URL.Path, "/")
		path, err := safeMarkdownPath(dir, slug)
		if err != nil {
			http.Error(w, err.Error(), http.StatusBadRequest)
			return
		}
		md, err := os.ReadFile(path)
		if err != nil {
			http.NotFound(w, r)
			return
		}
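		// Render Markdown to HTML; a conversion failure is a server-side error.
		var buf strings.Builder
		if err := goldmark.Convert(md, &buf); err != nil {
			http.Error(w, "failed to render markdown", http.StatusInternalServerError)
			return
		}
		if err := tpl.Execute(w, template.HTML(buf.String())); err != nil {
			log.Printf("failed to write response: %v", err)
		}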
	})
	fmt.Printf("Serving on port %s\n", port)
	// ListenAndServe only returns on failure, so surface the error instead of dropping it.
	log.Fatal(http.ListenAndServe(":"+port, nil))
}
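The handler relies on safeMarkdownPath, whose body is not part of the listing above. As a rough idea of what such a helper could look like, here is a standalone sketch. It is a guess, not the article's code: the `.md` suffix, the fallback to `index` for an empty slug, and the error text are all assumptions. The check is purely lexical, so it does not protect against symlinks inside the docs directory.

```go
package main

import (
	"errors"
	"fmt"
	"path/filepath"
	"strings"
)

// safeMarkdownPath maps a URL slug to a Markdown file that must live inside dir.
// Hypothetical reconstruction: the real helper is not shown in the article.
func safeMarkdownPath(dir, slug string) (string, error) {
	if slug == "" {
		slug = "index" // assumption: the root URL serves index.md
	}
	base, err := filepath.Abs(dir)
	if err != nil {
		return "", err
	}
	// Join cleans the result, so "a/../../b" collapses before we inspect it.
	full := filepath.Join(base, filepath.FromSlash(slug)+".md")
	rel, err := filepath.Rel(base, full)
	if err != nil {
		return "", err
	}
	// A relative path that starts with ".." has climbed out of the docs directory.
	if rel == ".." || strings.HasPrefix(rel, ".."+string(filepath.Separator)) {
		return "", errors.New("path escapes docs directory")
	}
	return full, nil
}

func main() {
	for _, slug := range []string{"guide/intro", "", "../secrets", "a/../../b"} {
		p, err := safeMarkdownPath("content", slug)
		if err != nil {
			fmt.Printf("%q rejected: %v\n", slug, err)
			continue
		}
		fmt.Printf("%q -> %s\n", slug, p)
	}
}
```

Running it accepts the first two slugs and rejects the last two. Resolving against an absolute base and then asking for the relative path back is one of several ways to do this check; it was chosen here because it needs only the standard library.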

And the config:

port: "3000"
docs-dir: "content"

The config looks like YAML, but it is ordinary Rye code. Rye does not care how much whitespace or how many line breaks you use, but every token, parentheses included, must be separated from its neighbours by whitespace.