100 Days of DevOps, Day 1: Linux User Management and AWS Key Pairs

Doing the work and being able to explain the work are two different skills. I’ve had the first one for 8 years. I’m building the second one now. I’m a Cloud Platform Engineer. AWS, Kubernetes, Terraform, Linux. Regulated environments, healthcare, production systems. Real experience. Almost zero public documentation of it. That’s the gap I’m closing, starting from Day 1.

The platform is KodeKloud. Each session gives you tasks across multiple tools. I’m posting the Linux and AWS tasks here. Here’s what I built and what actually matters about each one.

Task 1 (Linux): Create a User with a Non-Interactive Shell

The task was to create a user that can own processes but cannot log in interactively. This is what you do for service accounts.

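# SSH into the private server via the jump server (user and hostname are placeholders)
ssh user@hostname
# Switch to root
sudo su -
# Create the user with a non-interactive shell (options first, login name last)
useradd -s /sbin/nologin username
# Verify the user was added; the anchored pattern matches only this exact login name
grep '^username:' /etc/passwd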

Key points: /sbin/nologin prevents the user from getting a shell session, so the account can own processes but cannot log in. Always verify by grepping /etc/passwd; the last field confirms the shell. I’ve been doing this in production environments for years. I still verify every time. Not because I’m unsure. Because in a regulated environment, you don’t assume, you confirm.
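One caveat with a plain grep is that it matches substrings, so a search for username would also match username2. Here is a minimal sketch of a stricter check, assuming the standard seven-field passwd format. The function names login_shell and is_noninteractive are hypothetical helpers, not part of any tool, and the list of non-interactive shells is an assumption you may need to adjust for your distribution.

```shell
# Hypothetical helper: print the login shell (7th field) for an exact user name.
# $1 = user name, $2 = passwd-format file (defaults to /etc/passwd)
login_shell() {
  awk -F: -v u="$1" '$1 == u { print $7 }' "${2:-/etc/passwd}"
}

# Hypothetical helper: succeed only if the user exists and has a
# non-interactive shell. The shell list below is an assumption.
is_noninteractive() {
  case "$(login_shell "$1" "$2")" in
    /sbin/nologin|/usr/sbin/nologin|/bin/false) return 0 ;;
    *) return 1 ;;
  esac
}
```

Used as is_noninteractive username && echo "confirmed", this fails both when the shell is interactive and when the user does not exist, which is the behaviour you want from a verification step.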

Task 2 (AWS): Create an EC2 Key Pair via CLI

Goal: Generate and register an RSA key pair in AWS EC2 for SSH access to instances.

# Create the key pair and save the private key locally
aws ec2 create-key-pair \
  --key-name my-key-pair \
  --key-type rsa \
  --key-format pem \
  --query "KeyMaterial" \
  --output text > my-key-pair.pem

# Verify the key pair exists in AWS
aws ec2 describe-key-pairs --key-names my-key-pair

Key points: The private key is returned only once, at creation. Save it immediately, because AWS keeps only the public key and cannot give you the private key again. I’ve seen this cause real problems in production environments where the key wasn’t backed up properly. Always run chmod 400 my-key-pair.pem after saving it and before first use. SSH will refuse to use a key file with open permissions. OpenSSH does print a warning that the permissions are too open, but it is easy to miss when all you notice is the final Permission denied line.
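Redirecting with > creates the file with default permissions (typically 644 under a 022 umask), and it stays that way until chmod runs. Here is a minimal sketch that closes that window by creating the file under a restrictive umask. save_private_key is a hypothetical helper name, and the sketch assumes a POSIX shell and that the target file does not already exist.

```shell
# Hypothetical helper: read key material from stdin and write it to the
# path in $1 so that a newly created file is never readable by group or others.
save_private_key() {
  # The subshell keeps the umask change from leaking into the caller.
  # Under umask 077 a new file is created as 600; chmod then makes it read-only.
  ( umask 077 && cat > "$1" ) && chmod 400 "$1"
}
```

To use it, pipe the CLI output into the helper instead of redirecting: aws ec2 create-key-pair --key-name my-key-pair --key-type rsa --key-format pem --query "KeyMaterial" --output text | save_private_key my-key-pair.pem. If the aws call fails, an empty file is still created, so check that the file is non-empty before relying on it.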

What Day 1 Taught Me That 8 Years Didn’t

Nothing here was technically new to me. That’s not the point. The point is that explaining something clearly, step by step, with the reasoning, is a skill completely separate from being able to do it. Most engineers build the doing skill and ignore the explaining skill. I was one of them.

Day 2 is already done. If you’re running your own DevOps challenge or thinking about starting one, I’d ask you this: would you rather keep building experience silently and have nothing to show for it at the end, or build it loudly and have 100 posts that prove you did the work?