Deploying Vector High-Performance Observability Data Pipeline on Ubuntu 24.04
Vector is a high-performance observability data pipeline from Datadog that collects, transforms, and routes logs, metrics, and traces across heterogeneous backends. This guide deploys Vector with Docker Compose, using Traefik to provide automatic HTTPS for the GraphQL API and the HTTP ingest endpoint, and builds a working sources → transforms → sinks pipeline. By the end, you’ll have Vector accepting JSON over HTTPS and forwarding it to three sinks: the container console, a local file, and a remote HTTP endpoint.
Set Up the Directory Structure and Configuration
- Create the project directory structure:
$ mkdir -p ~/vector/{config,data}
$ cd ~/vector
- Create the environment file:
$ nano .env
DOMAIN=vector.example.com
LETSENCRYPT_EMAIL=admin@example.com
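Docker Compose substitutes the values from `.env` wherever the manifest references `${DOMAIN}` and `${LETSENCRYPT_EMAIL}`, so a missing or empty value tends to surface only later as an unmatched router or a failed certificate request. As a minimal sketch, the helper below confirms both keys are present and non-empty. The name `check_env_file` is hypothetical and not part of Docker or Vector.

```shell
#!/bin/sh
# check_env_file is a hypothetical helper, not part of Docker or Vector.
# It returns 0 when every required key has a non-empty value in the file.
check_env_file() {
    env_file="$1"
    if [ ! -f "$env_file" ]; then
        echo "missing file: $env_file" >&2
        return 1
    fi
    for key in DOMAIN LETSENCRYPT_EMAIL; do
        # Match lines such as KEY=value where value has at least one character.
        if ! grep -Eq "^${key}=.+" "$env_file"; then
            echo "missing or empty: $key" >&2
            return 1
        fi
    done
    return 0
}
```

With the function loaded in your shell, `check_env_file .env && echo ".env looks complete"` run from `~/vector` reports whether the file is ready.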
- Create the Vector pipeline configuration:
$ nano config/vector.yaml
api:
  enabled: true
  address: "0.0.0.0:8686"

sources:
  demo_logs:
    type: "demo_logs"
    format: "syslog"
    interval: 1.0
  http_input:
    type: "http_server"
    address: "0.0.0.0:8080"
    decoding:
      codec: "json"

transforms:
  parse_logs:
    type: "remap"
    inputs:
      - "demo_logs"
      - "http_input"
    source: |
      .processed_at = now()
      .pipeline = "vector-demo"

sinks:
  console_output:
    type: "console"
    inputs:
      - "parse_logs"
    encoding:
      codec: "json"
  file_output:
    type: "file"
    inputs:
      - "parse_logs"
    path: "/var/lib/vector/logs-%Y-%m-%d.log"
    encoding:
      codec: "json"
  http_output:
    type: "http"
    inputs:
      - "parse_logs"
    uri: "https://httpbin.org/post"
    encoding:
      codec: "json"
    batch:
      max_bytes: 1048576
      timeout_secs: 10
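Before starting the stack, it can help to check the configuration for syntax and topology errors. The sketch below assumes Docker is installed and that the image's entrypoint is the `vector` binary, so the arguments after the image name are passed straight to it. `validate_vector_config` is a hypothetical wrapper name. The `--no-environment` flag asks `vector validate` to skip checks that need the runtime environment, such as sink health checks.

```shell
#!/bin/sh
# validate_vector_config is a hypothetical wrapper around `vector validate`.
validate_vector_config() {
    config_file="$1"
    # Fail early with a clear message rather than letting Docker create
    # a directory in place of a missing bind-mount source.
    if [ ! -f "$config_file" ]; then
        echo "config not found: $config_file" >&2
        return 1
    fi
    # Resolve to an absolute path because bind mounts need one.
    config_abs="$(cd "$(dirname "$config_file")" && pwd)/$(basename "$config_file")"
    # Assumption: the image entrypoint is the vector binary.
    docker run --rm \
        -v "$config_abs:/etc/vector/vector.yaml:ro" \
        timberio/vector:0.44.0-alpine \
        validate --no-environment /etc/vector/vector.yaml
}
```

From `~/vector`, `validate_vector_config config/vector.yaml` should exit non-zero if the file is missing or Vector rejects the configuration.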
Deploy with Docker Compose
- Create the Docker Compose manifest:
$ nano docker-compose.yaml
services:
  traefik:
    image: traefik:v3.6
    container_name: traefik
    command:
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--entrypoints.web.http.redirections.entrypoint.to=websecure"
      - "--entrypoints.web.http.redirections.entrypoint.scheme=https"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
      - "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
      - "--certificatesresolvers.letsencrypt.acme.email=${LETSENCRYPT_EMAIL}"
      - "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80"
      - "443:443"
    volumes:
      - "./letsencrypt:/letsencrypt"
      - "/var/run/docker.sock:/var/run/docker.sock:ro"
    restart: unless-stopped

  vector:
    image: timberio/vector:0.44.0-alpine
    container_name: vector
    expose:
      - "8080"
      - "8686"
    volumes:
      - "./config/vector.yaml:/etc/vector/vector.yaml:ro"
      - "./data:/var/lib/vector"
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.vector-api.rule=Host(`${DOMAIN}`) && (PathPrefix(`/playground`) || PathPrefix(`/graphql`) || PathPrefix(`/health`))"
      - "traefik.http.routers.vector-api.entrypoints=websecure"
      - "traefik.http.routers.vector-api.tls.certresolver=letsencrypt"
      - "traefik.http.routers.vector-api.service=vector-api"
      - "traefik.http.services.vector-api.loadbalancer.server.port=8686"
      - "traefik.http.routers.vector-ingest.rule=Host(`${DOMAIN}`) && PathPrefix(`/ingest`)"
      - "traefik.http.routers.vector-ingest.entrypoints=websecure"
      - "traefik.http.routers.vector-ingest.tls.certresolver=letsencrypt"
      - "traefik.http.routers.vector-ingest.service=vector-ingest"
      - "traefik.http.services.vector-ingest.loadbalancer.server.port=8080"
      - "traefik.http.middlewares.strip-ingest.stripprefix.prefixes=/ingest"
      - "traefik.http.routers.vector-ingest.middlewares=strip-ingest"
    restart: unless-stopped
- Start the services:
$ docker compose up -d
- Verify the services are running:
$ docker compose ps
$ docker compose logs vector
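Containers can show as running before Vector's API is ready or before Traefik has obtained a certificate, so a single check right after startup may fail. A small retry loop makes the check repeatable. `retry` below is a hypothetical helper, not a Docker or Vector command.

```shell
#!/bin/sh
# retry ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
# Runs COMMAND until it succeeds or ATTEMPTS is exhausted.
retry() {
    attempts="$1"
    delay="$2"
    shift 2
    n=1
    while [ "$n" -le "$attempts" ]; do
        if "$@"; then
            return 0
        fi
        # Sleep only when another attempt is still to come.
        if [ "$n" -lt "$attempts" ]; then
            sleep "$delay"
        fi
        n=$((n + 1))
    done
    return 1
}
```

For example, `retry 30 2 curl -fsS https://vector.example.com/health` polls the health route for up to about a minute, assuming the `vector-api` router labels from the manifest and the domain set in `.env`.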
Verify the Pipeline
- POST a JSON log to the ingest endpoint:
$ curl -X POST https://vector.example.com/ingest \
    -H "Content-Type: application/json" \
    -d '{"level":"error","service":"api","message":"Database connection timeout","user_id":12345}'
- Confirm the file sink wrote the event:
$ ls -lh data/
$ grep "Database connection timeout" data/logs-*.log
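Because the `remap` transform sets `.pipeline = "vector-demo"` on every event, that field is a convenient marker for confirming that lines in the file actually passed through the transform. The sketch below assumes the `json` codec writes one compact object per line, with no space after the colon. `count_pipeline_events` is a hypothetical helper.

```shell
#!/bin/sh
# count_pipeline_events is a hypothetical helper that prints how many
# lines in a log file carry the marker field added by the remap transform.
count_pipeline_events() {
    log_file="$1"
    # grep -c prints the number of matching lines. It exits 1 when the
    # count is zero, so mask that status to keep the helper usable in
    # scripts that run with `set -e`.
    grep -c '"pipeline":"vector-demo"' "$log_file" || true
}
```

A call such as `count_pipeline_events "data/logs-$(date -u +%Y-%m-%d).log"` targets today's file, assuming the container clock is on UTC. The count should keep growing while the `demo_logs` source is enabled.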
- Stream the live console sink:
$ docker compose logs -f vector
Next Steps
Vector is running with HTTPS ingest and three sinks active. From here you can:
- Add sources for files, Kafka, syslog, journald, or Kubernetes logs.
- Route to production sinks (Loki, Elasticsearch, S3, Datadog, Splunk).
- Use VRL (Vector Remap Language) for richer transforms and enrichment.
For the full guide with additional tips, visit the original article on Vultr Docs.