Show HN: Building a web server in assembly to give my life (a lack of) meaning
ymawky: web server in ARM assembly
This is ymawky (yuh maw kee), a web server written entirely by hand in ARM64 assembly. It is syscall-only, uses no libc, and forks one process per connection. It is developed for macOS, and I’ve tried to make it as portable as possible, but you will likely still need to make some (hopefully minor) tweaks to get it to run on Linux or other Unix systems. See Implementation Notes for more details.
Building
Requires Xcode Command Line Tools; install them with xcode-select --install. ymawky only runs on Apple Silicon (arm64). Run make to build.
Ensure there is a www/ directory next to the ymawky executable. That’s the document root where ymawky searches for files. GET with an empty filename (GET /) will search for www/index.html, so you might want to make sure there’s an index.html as well.
ymawky will try to serve static error pages when a client’s request results in an error, e.g. 404. It searches for these pages at err/(code).html, so ensure err/ exists alongside ymawky and www/. See Configuration to modify the default file and docroot.
Running
- ./ymawky starts the web server on 127.0.0.1:8080.
- ./ymawky [port] starts the web server on 127.0.0.1:[port].
- ./ymawky [literally-any-character-other-than-0-9] starts the web server on 127.0.0.1:8080 in debug mode. Debug mode disables forking and makes ymawky handle only one request. (I needed to do this because lldb wasn’t letting me debug the children, ugh.)
Unfortunately, while custom ports are supported, custom addresses are not. As of right now, ymawky can only run on 127.0.0.1. This is solely because I haven’t implemented it, but if you’d like to consider this a safety feature, then I guess it could be intentional.
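The argument handling described above can be sketched in a few lines. This is a Python illustration, not the actual assembly: the name `parse_arg` is hypothetical, and treating an argument as a port only when it consists entirely of ASCII digits is an assumption (the text does not say what happens with mixed input such as `80a`, or with ports above 65535).

```python
DEFAULT_PORT = 8080


def parse_arg(argv):
    """Return (port, debug) for a ymawky-style command line.

    `argv` is the argument list without the program name.
    """
    if not argv:
        # No argument: default port, normal forking mode.
        return DEFAULT_PORT, False
    arg = argv[0]
    # Assumption: only a run of ASCII digits counts as a port number.
    if arg.isascii() and arg.isdigit():
        return int(arg), False
    # Anything else selects debug mode on the default port.
    return DEFAULT_PORT, True
```

For example, `parse_arg(["9000"])` selects port 9000, while `parse_arg(["d"])` selects debug mode on port 8080.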
To see ymawky in action, start it with ./ymawky. Then open your web browser of choice (or use curl) and visit 127.0.0.1:8080/ or 127.0.0.1:8080/pretty/index.html. If you passed a custom port, use that instead of 8080. Bask in the warmth of assembly.
What can it do?
ymawky is a static-file web server. It doesn’t support server-side code that generates content on the fly, or more advanced URL parsing such as /search?query=term. That’s not to say it’s non-functional, though. It supports the HTTP methods GET, PUT, DELETE, OPTIONS, and HEAD, and has basic protection from slowloris-like denial-of-service attacks. It decodes % hex encoding: for example, %20 decodes to a space in filenames, and %61 decodes to ‘a’. It also has smart path traversal detection and prevention, blocking .. from traversing paths while still allowing multiple periods when they’re part of a filename:
- GET /../../../etc/passwd -> 403 Forbidden
- GET /ohwell…txt -> 200 OK
- GET /../src/ymawky.S -> 403 Forbidden
- GET /hehe..txt -> 200 OK
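The distinction between a `..` path segment and periods inside a filename can be illustrated with a short Python sketch. It is not a transcription of the assembly: the function names are hypothetical, and decoding the % escapes before checking for traversal is an assumption about the order of operations.

```python
HEX_DIGITS = b"0123456789abcdefABCDEF"


def percent_decode(path: bytes) -> bytes:
    """Decode %XX escapes; raise ValueError on a malformed escape."""
    out = bytearray()
    i = 0
    while i < len(path):
        if path[i] == ord("%"):
            pair = path[i + 1:i + 3]
            if len(pair) != 2 or any(c not in HEX_DIGITS for c in pair):
                raise ValueError("malformed % escape")
            out.append(int(pair, 16))
            i += 3
        else:
            out.append(path[i])
            i += 1
    return bytes(out)


def has_traversal(decoded: bytes) -> bool:
    """True if any '/'-separated segment is exactly '..'."""
    return any(segment == b".." for segment in decoded.split(b"/"))
```

Comparing whole segments, rather than searching for the substring `..`, is what lets /hehe..txt through while /../src/ymawky.S is rejected. Decoding first means an encoded segment such as %2e%2e is caught as well.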
Other features:
- Automatically prepends www/ to requested files: GET /index.html retrieves www/index.html, and an empty GET / defaults to www/index.html.
- PUT requests support uploads of up to 1 GiB, though this can be configured for larger files.
- PUT is atomic: the body is written to a temporary file that is then renamed, so concurrent PUT requests never leave partially written files.
- Content-Length: parsing and verification in PUT requests.
- MIME type detection, with the corresponding Content-Type in the response header.
- Accepts Range: bytes= ranges in GET requests, supporting full ranges (bytes=X-N), suffix ranges (bytes=-N), and open-ended ranges (bytes=X-). Video scrubbing is well supported.
- Basic HTTP version parsing. Requests need to specify HTTP/1.1 or HTTP/1.0, and an HTTP/1.1 request must include a Host: field in its header. Currently, ymawky doesn’t do anything with Host, but per RFC 9112 Section 3.2, the header must be sent.
- Serves custom HTML pages for error codes such as 404 or 500. Look in the err/ directory for an example.
- If the requested resource is a directory, lists all files and subdirectories in it. Note that this excludes www/ (or whatever your docroot is): GET / will always search for index.html if no file is given.
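To make the three Range: forms concrete, here is a minimal Python sketch that resolves a single range against a file size. The function name `parse_range` is hypothetical, and rejecting multi-range requests and clamping an oversized end offset are assumptions. How ymawky itself responds to malformed or unsatisfiable ranges is not stated here, so the sketch just returns None.

```python
def _is_number(text: str) -> bool:
    return text.isascii() and text.isdigit()


def parse_range(header_value: str, size: int):
    """Resolve one 'bytes=' range against a file of `size` bytes.

    Returns (first, last) as inclusive byte offsets, or None if the
    range is malformed or cannot be satisfied.
    """
    unit, sep, spec = header_value.strip().partition("=")
    if unit != "bytes" or not sep or "," in spec:
        return None  # assumption: multi-range requests are not handled
    first_s, dash, last_s = spec.partition("-")
    if not dash:
        return None
    if first_s == "":
        # Suffix range, bytes=-N: the last N bytes of the file.
        if not _is_number(last_s):
            return None
        n = int(last_s)
        if n == 0 or size == 0:
            return None
        return max(size - n, 0), size - 1
    if not _is_number(first_s):
        return None
    first = int(first_s)
    if first >= size:
        return None
    if last_s == "":
        # Open-ended range, bytes=X-: from X to the end of the file.
        return first, size - 1
    if not _is_number(last_s):
        return None
    last = int(last_s)
    if last < first:
        return None
    # Full range, bytes=X-N, clamped to the end of the file.
    return first, min(last, size - 1)
```

A video player that seeks typically sends an open-ended range such as bytes=900-, which is why supporting these forms makes scrubbing work.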
“Safety”
This is a web server written entirely by hand in ARM64 assembly as a fun project. It’s probably got a lot of vulnerabilities I’m unaware of. However, I did do my best to make it safer. Here are some safety precautions ymawky takes:
- Rejects paths >= PATH_MAX (4096 bytes).
- Rejects any paths that include path traversal, such as /../..
- Rejects any requests that do not contain a path within 16 bytes.
- Confined to www/. Any path requested gets www/ prepended to it.
- Rejects any path containing symlinks, with O_NOFOLLOW_ANY.
- PUT writes to a temporary file, www/.ymawky_tmp_<pid>. Upon successfully receiving the whole file, this temporary file is then renamed to the requested filename. This prevents partial or corrupted PUT requests from overwriting existing files.
- Rejects any requests whose path starts with www/.ymawky_tmp_. This prevents someone from GETing a temporary file, and prevents someone from sending PUT /.ymawky_tmp_4533 or something.
- Must receive data within 10 seconds. If it’s slower, the connection will close. If the entire header is not received within 10 seconds total, the connection will be closed. This is to prevent slowloris-like attacks.
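The temporary-file-then-rename pattern behind the PUT precautions above looks roughly like this in Python. It is a sketch under stated assumptions, not the server's code: `atomic_put` and `is_temp_path` are hypothetical names, the body arrives as one bytes object rather than a stream, and the traversal and symlink checks are left out for brevity.

```python
import os

TMP_PREFIX = ".ymawky_tmp_"


def is_temp_path(request_path: str) -> bool:
    """True if the request targets a reserved temporary-file name."""
    return request_path.lstrip("/").startswith(TMP_PREFIX)


def atomic_put(docroot: str, request_path: str, body: bytes) -> None:
    """Write `body` to a per-process temp file, then rename it into place."""
    if is_temp_path(request_path):
        raise PermissionError("temporary-file names are reserved")
    # NOTE: a real server must validate request_path (traversal, symlinks)
    # before this point; that is omitted here.
    final_path = os.path.join(docroot, request_path.lstrip("/"))
    tmp_path = os.path.join(docroot, f"{TMP_PREFIX}{os.getpid()}")
    try:
        with open(tmp_path, "wb") as tmp_file:
            tmp_file.write(body)
        # Renaming within one filesystem replaces the target in one step,
        # so readers see either the old file or the complete new one.
        os.replace(tmp_path, final_path)
    except BaseException:
        # A failed upload must not leave the temp file behind.
        try:
            os.unlink(tmp_path)
        except FileNotFoundError:
            pass
        raise
```

Because each process gets its own temporary name (one process per connection), two concurrent uploads never write to the same temporary file; whichever rename happens last wins.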
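The two 10-second limits, one per read and one for the whole header, can be combined as in the following Python sketch. How ymawky enforces them at the syscall level is not described here, so this only illustrates the policy; `read_header` and the 8192-byte `max_size` cap are assumptions.

```python
import socket
import time

IDLE_TIMEOUT = 10.0     # longest wait for any single read, in seconds
HEADER_DEADLINE = 10.0  # longest total time allowed for the header


def read_header(conn, idle_timeout=IDLE_TIMEOUT,
                deadline=HEADER_DEADLINE, max_size=8192):
    """Read until the blank line that ends an HTTP header.

    Returns the bytes read, or None if the client was too slow,
    closed the connection, or sent more than `max_size` bytes.
    """
    buf = b""
    give_up_at = time.monotonic() + deadline
    while b"\r\n\r\n" not in buf:
        remaining = give_up_at - time.monotonic()
        if remaining <= 0 or len(buf) >= max_size:
            return None
        # Each read waits for whichever limit expires first.
        conn.settimeout(min(idle_timeout, remaining))
        try:
            chunk = conn.recv(max_size - len(buf))
        except socket.timeout:
            return None
        if not chunk:
            return None  # peer closed the connection
        buf += chunk
    return buf
```

The total deadline is the part that matters against slowloris: a client that trickles one byte every few seconds never trips a per-read timeout, but it still runs out of overall time.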