Great Stack to Doesn't Work Bonus: SQL vs NoSQL: Which One in 2026?
The honest decision framework, not another flame war. The SQL vs NoSQL debate has been running for 15 years and it still generates more heat than light. Here’s the framework that actually helps you decide.
The Real Question
It’s not “SQL or NoSQL.” It’s: what does your access pattern look like?
If your application is mostly reading and writing related data through well-defined queries (orders with line items, users with addresses, products with categories), relational databases are purpose-built for this. JOINs are not expensive when the join columns are indexed. Transactions are not slow when they’re scoped correctly. PostgreSQL handles 50 million rows comfortably on a single node.
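To make the "indexed JOIN, scoped transaction" point concrete, here is a minimal sketch. It assumes SQLite (from Python's standard library) as a stand-in for PostgreSQL so that it runs anywhere; the table names, the `place_order` helper, and the sample data are all hypothetical.

```python
import sqlite3

# Assumption: in-memory SQLite stands in for PostgreSQL; the schema is illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer TEXT NOT NULL);
    CREATE TABLE line_items (
        id INTEGER PRIMARY KEY,
        order_id INTEGER NOT NULL REFERENCES orders(id),
        sku TEXT NOT NULL,
        qty INTEGER NOT NULL CHECK (qty > 0)
    );
    -- Index the join column so the JOIN is an index lookup, not a scan.
    CREATE INDEX idx_line_items_order ON line_items(order_id);
""")


def place_order(conn, customer, items):
    """Insert one order and its line items as a single, narrowly scoped transaction."""
    with conn:  # commits on success, rolls back if any statement raises
        cur = conn.execute("INSERT INTO orders (customer) VALUES (?)", (customer,))
        order_id = cur.lastrowid
        conn.executemany(
            "INSERT INTO line_items (order_id, sku, qty) VALUES (?, ?, ?)",
            [(order_id, sku, qty) for sku, qty in items],
        )
    return order_id


def order_quantities(conn):
    """Total quantity per order, computed with a JOIN on the indexed column."""
    return conn.execute(
        """
        SELECT o.id, o.customer, SUM(li.qty)
        FROM orders AS o
        JOIN line_items AS li ON li.order_id = o.id
        GROUP BY o.id, o.customer
        ORDER BY o.id
        """
    ).fetchall()


place_order(conn, "alice", [("widget", 2), ("gadget", 1)])
try:
    # qty=0 violates the CHECK constraint, so the whole order rolls back.
    place_order(conn, "bob", [("widget", 1), ("gadget", 0)])
except sqlite3.IntegrityError:
    pass

print(order_quantities(conn))  # prints [(1, 'alice', 3)]
```

The transaction covers only the statements that must succeed or fail together, which is what "scoped correctly" means here: the failed second order leaves no partial rows behind.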
If your application reads and writes self-contained documents, mostly by primary key, and you rarely need cross-document queries (user profiles, product catalogs, content management), a document database simplifies your code. No ORM mapping hell. No migration files for adding a field.
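A rough sketch of what "no migration files for adding a field" looks like in practice. `DocumentStore` is a hypothetical in-memory stand-in for a document collection, not a real driver API, and the profile fields are invented for illustration.

```python
import copy


class DocumentStore:
    """Hypothetical in-memory stand-in for a document collection keyed by _id."""

    def __init__(self):
        self._docs = {}

    def put(self, doc):
        # The whole document is written in one operation; no per-table mapping.
        self._docs[doc["_id"]] = copy.deepcopy(doc)

    def get(self, doc_id):
        doc = self._docs.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None


profiles = DocumentStore()
profiles.put({"_id": "u1", "name": "Ada", "addresses": [{"city": "London"}]})

# A later version of the app starts writing a new field. Older documents
# simply lack it, so readers have to supply a default.
profiles.put({"_id": "u2", "name": "Lin", "addresses": [], "theme": "dark"})


def theme_of(store, user_id, default="light"):
    """Read the optional 'theme' field, tolerating documents written before it existed."""
    doc = store.get(user_id)
    return doc.get("theme", default) if doc else default


print(theme_of(profiles, "u1"), theme_of(profiles, "u2"))  # prints: light dark
```

The trade-off is visible in `theme_of`: the migration file disappears, but the knowledge of which fields may be missing moves into application code.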
If your application writes massive volumes and reads by partition key with eventual consistency (time-series data, IoT telemetry, activity feeds at scale), wide-column stores like Cassandra were built for this specific workload.
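The "read by partition key" access pattern can be modeled in a few lines. This is an in-memory illustration only: `TelemetryTable`, the per-device-per-day bucketing, and the sample readings are assumptions, and replication and eventual consistency are not modeled.

```python
import bisect
from collections import defaultdict
from datetime import datetime, timezone


def partition_key(device_id, ts):
    """Hypothetical key: one partition per device per UTC day, to bound partition size."""
    return (device_id, ts.astimezone(timezone.utc).strftime("%Y-%m-%d"))


class TelemetryTable:
    """In-memory model of a wide-column table: partition key -> rows sorted by timestamp."""

    def __init__(self):
        self._partitions = defaultdict(list)

    def write(self, device_id, ts, value):
        # Each write touches exactly one partition; rows stay ordered by timestamp.
        bisect.insort(self._partitions[partition_key(device_id, ts)], (ts, value))

    def read_day(self, device_id, day):
        # The reader names the partition up front; nothing scans across partitions.
        return list(self._partitions.get((device_id, day), []))


def at(hour):
    """Helper for sample timestamps on an arbitrary example day."""
    return datetime(2026, 1, 15, hour, 0, tzinfo=timezone.utc)


table = TelemetryTable()
table.write("sensor-7", at(9), 21.5)
table.write("sensor-7", at(8), 21.1)
table.write("sensor-9", at(8), 19.0)

rows = table.read_day("sensor-7", "2026-01-15")
print([value for _, value in rows])  # prints [21.1, 21.5]
```

Any query that cannot name its partition (for example, "all readings above 21 across all devices") has no efficient path in this layout, which is why the workload has to fit the model.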
The 2026 Reality
PostgreSQL has eaten NoSQL’s lunch in many areas. JSONB support means you can store and query semi-structured data inside PostgreSQL, with GIN indexes to speed up lookups. You get the document model’s flexibility without giving up transactions, JOINs, and a 30-year ecosystem. For 80% of startups and mid-size companies, PostgreSQL is the only database you need.
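For readers who have not used JSONB, a minimal sketch of the pattern: a JSONB column, a GIN index, and a containment query using the `@>` operator. The `events` table and its columns are hypothetical, the `%s` placeholder assumes a psycopg-style driver, and the snippet only builds the statements rather than running them against a server.

```python
import json

# PostgreSQL DDL for a JSONB column with a GIN index (names are hypothetical).
DDL = """
CREATE TABLE events (
    id   bigserial PRIMARY KEY,
    data jsonb NOT NULL
);
CREATE INDEX idx_events_data ON events USING GIN (data);
"""


def containment_query(filter_doc):
    """Build a parameterized query using the JSONB containment operator @>.

    The default GIN operator class for jsonb supports @>, so the index
    created above can serve this predicate.
    """
    sql = "SELECT id, data FROM events WHERE data @> %s::jsonb"
    return sql, (json.dumps(filter_doc, sort_keys=True),)


sql, params = containment_query({"type": "signup", "plan": "pro"})
print(sql)
print(params)  # prints ('{"plan": "pro", "type": "signup"}',)
```

With a psycopg connection, the pair would be passed as `cursor.execute(sql, params)`. The same table can still carry ordinary columns, foreign keys, and transactions alongside the JSONB data.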
MongoDB has gotten more relational: multi-document ACID transactions (since 4.0 on replica sets, 4.2 on sharded clusters), schema validation, and aggregation pipelines that look suspiciously like SQL. It’s converging toward what PostgreSQL already does, but from a different starting point.
DynamoDB dominates serverless. If you’re in AWS and your access pattern is simple key-value with known query patterns, DynamoDB’s pricing model (pay-per-request) and operational simplicity are hard to beat. But the moment you need ad-hoc queries or flexible access patterns, you’re fighting the database.
Cassandra is for a specific scale problem. If you don’t need to write millions of rows per second across multiple data centers with tunable consistency, you don’t need Cassandra. The operational overhead is significant.
Decision Tree
Start with PostgreSQL unless you have a specific reason not to. Then:
- Need flexible schema with primarily key-based access? → Consider MongoDB
- Need massive write throughput with geographic distribution? → Consider Cassandra
- Need serverless, pay-per-request, AWS-native? → Consider DynamoDB
- Need time-series at scale? → Consider TimescaleDB (PostgreSQL extension) or InfluxDB
- Need graph queries (social networks, recommendation engines)? → Consider Neo4j or PostgreSQL’s recursive CTEs
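The list above can be written down as a small function, which is a handy way to force a team to state its requirements explicitly. This is only a sketch: the `Needs` flags and the `recommend` helper are hypothetical names that mirror the questions in the list, not an established tool.

```python
from dataclasses import dataclass


@dataclass
class Needs:
    """Hypothetical requirement flags mirroring the questions in the list."""
    flexible_schema_key_access: bool = False
    massive_geo_distributed_writes: bool = False
    serverless_aws_native: bool = False
    time_series_at_scale: bool = False
    graph_queries: bool = False


def recommend(needs):
    """Return the databases worth evaluating; PostgreSQL is the default."""
    rules = [
        (needs.flexible_schema_key_access, "MongoDB"),
        (needs.massive_geo_distributed_writes, "Cassandra"),
        (needs.serverless_aws_native, "DynamoDB"),
        (needs.time_series_at_scale, "TimescaleDB or InfluxDB"),
        (needs.graph_queries, "Neo4j or PostgreSQL recursive CTEs"),
    ]
    matches = [db for flag, db in rules if flag]
    return matches or ["PostgreSQL"]


print(recommend(Needs()))                    # prints ['PostgreSQL']
print(recommend(Needs(graph_queries=True)))  # prints ['Neo4j or PostgreSQL recursive CTEs']
```

Note that "we might need to scale" is deliberately not one of the flags.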
The worst decision is choosing NoSQL because “we might need to scale.” Scale is not a database choice. It’s an architecture problem. Most applications will never outgrow a single well-configured PostgreSQL instance. And the ones that do will need to re-architect regardless of their database.
The One Thing Nobody Tells You
The database you choose determines your debugging story. When something goes wrong with PostgreSQL, you have EXPLAIN ANALYZE, pg_stat_statements, decades of community answers, and a query planner that shows you exactly what it’s doing. When something goes wrong with Cassandra, you’re reading GC logs and compaction stats. When DynamoDB throttles your reads, the usual fixes are to add capacity or redesign your partition key. When MongoDB’s aggregation pipeline is slow, the explain output is a nested JSON document that takes 20 minutes to parse.
Choose the database whose failure mode you’re most equipped to handle. Because it will fail, and your ability to debug it determines your recovery time.
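To show what "a planner that tells you what it's doing" buys you, here is a small sketch of the workflow: ask for the plan, add an index, ask again. It assumes SQLite's `EXPLAIN QUERY PLAN` as a runnable stand-in for PostgreSQL's `EXPLAIN ANALYZE` (the output format differs, and SQLite's version does not execute the query); the `orders` table and index name are hypothetical.

```python
import sqlite3

# Assumption: SQLite's EXPLAIN QUERY PLAN stands in for PostgreSQL's EXPLAIN ANALYZE.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (customer_id, total) VALUES (?, ?)",
    [(i % 100, float(i)) for i in range(1000)],
)

QUERY = "SELECT * FROM orders WHERE customer_id = ?"


def plan(conn):
    """Return the planner's description of how it would run QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, (42,)).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


before = plan(conn)  # expected to describe a full table scan
conn.execute("CREATE INDEX idx_orders_customer ON orders(customer_id)")
after = plan(conn)   # expected to mention idx_orders_customer

print("before:", before)
print("after: ", after)
```

The exact wording of the plan varies by SQLite version, so the comments describe what to look for rather than the literal text. The point is the loop itself: a slow query becomes a question the database can answer directly.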
Over to You
SQL or NoSQL for your current project, and why? Has anyone actually migrated from one to the other mid-project? How did it go?