
Development Checklist

The following checklist, along with the Operations Checklist for Self-Managed Deployments, provides recommendations to help you avoid issues in your production MongoDB deployment.

Data Durability

  • Ensure that your replica set includes at least three data-bearing voting members and that your write operations use w: majority write concern. Three data-bearing voting members are required for replica-set wide data durability. A minimal write concern example follows this list.
  • Ensure that all instances use journaling.
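
A minimal sketch of a majority-acknowledged write in mongosh, assuming a hypothetical orders collection; the document fields are placeholders:

  // The insert is acknowledged only after a majority of data-bearing
  // voting members have applied the write.
  db.orders.insertOne(
    { orderId: 1234, status: "pending" },
    { writeConcern: { w: "majority" } }
  )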

Schema Design

Data in MongoDB has a dynamic schema. Collections do not enforce document structure. This facilitates iterative development and polymorphism. Nevertheless, collections often hold documents with highly homogeneous structures. For more information, see Data Modeling.

  • Determine the set of collections that you will need and the indexes required to support your queries. With the exception of the _id index, you must create all indexes explicitly: MongoDB does not automatically create any indexes other than _id. An index creation sketch follows this list.
  • Ensure that your schema design supports your deployment type: if you are planning to use sharded clusters for horizontal scaling, design your schema to include a strong shard key. While you can change your shard key later, it is important to carefully consider your shard key choice to avoid scalability and performance issues.
  • Ensure that your schema design does not rely on indexed arrays that grow in length without bound. Typically, best performance can be achieved when such indexed arrays have fewer than 1000 elements. One way to bound array growth is sketched after this list.
  • Consider the document size limits when designing your schema. The BSON Document Size limit is 16MB per document. If you require larger documents, use GridFS. A query for spotting oversized documents follows this list.
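
A minimal sketch of creating an index explicitly to support a known query shape, assuming a hypothetical users collection queried by email and sorted by creation date:

  // Compound index supporting both the equality filter and the sort.
  db.users.createIndex({ email: 1, createdAt: -1 })

  // This query can then use the index for the filter and the sort.
  db.users.find({ email: "alice@example.com" }).sort({ createdAt: -1 })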
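
One common way to keep an indexed array bounded, sketched with a hypothetical sensors collection: $push with $each and $slice retains only the most recent 1000 elements.

  db.sensors.updateOne(
    { _id: "sensor-42" },      // placeholder document id
    { $push: { readings: { $each: [{ t: new Date(), value: 20.1 }], $slice: -1000 } } }
  )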
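
To spot documents approaching the 16MB limit, an aggregation using the $bsonSize operator (available in MongoDB 4.4 and later) can report per-document sizes; the articles collection here is hypothetical.

  // List the five largest documents by BSON size, in bytes.
  db.articles.aggregate([
    { $project: { size: { $bsonSize: "$$ROOT" } } },
    { $sort: { size: -1 } },
    { $limit: 5 }
  ])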

Replication

  • Use an odd number of voting members to ensure that elections proceed successfully. You can have up to 7 voting members. If you have an even number of voting members, and constraints, such as cost, prohibit adding another secondary to be a voting member, you can add an arbiter to ensure an odd number of votes (see the sketch after this list). For additional considerations when using an arbiter for a 3-member replica set (P-S-A), see Replica Set Arbiter.
  • Ensure that your secondaries remain up-to-date by using monitoring tools and by specifying appropriate write concern. A replication lag check is sketched after this list.
  • Do not use secondary reads to scale overall read throughput. See Can I use more replica nodes to scale? for an overview of read scaling. For information about secondary reads, see Read Preference. An explicit read preference example follows this list.
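
A minimal sketch of adding an arbiter to restore an odd number of votes; the hostname and port are placeholders.

  // The arbiter votes in elections but holds no data.
  rs.addArb("arbiter.example.net:27017")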
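
One way to check replication lag from mongosh, as a sketch:

  // Prints, for each secondary, how far it is behind the primary's oplog.
  rs.printSecondaryReplicationInfo()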
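
Reads that deliberately target a secondary (for example, analytics that tolerate slightly stale data) can set an explicit read preference; the orders collection is hypothetical. This routes individual reads, it does not make secondary reads a general scaling strategy.

  db.orders.find({ status: "shipped" }).readPref("secondary")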

Sharding

  • Ensure that your shard key distributes the load evenly on your shards. See Shard Keys for more information. A shard key sketch follows this list.
  • Use targeted operations for workloads that need to scale with the number of shards. The difference between a targeted and a broadcast query is sketched after this list.
  • Secondaries no longer return orphaned data unless using read concern "available" (which is the default read concern for reads against secondaries when not associated with causally consistent sessions).
    All members of the shard replica set maintain chunk metadata, allowing them to filter out orphans when not using "available". As such, non-targeted or broadcast queries that are not using "available" can be safely run on any member and will not return orphaned data.
    The "available" read concern can return orphaned documents from secondary members since it does not check for updated chunk metadata. However, if the return of orphaned documents is immaterial to an application, the "available" read concern provides the lowest latency reads possible among the various read concerns. An example of opting in to "available" follows this list.
  • Pre-split and manually balance chunks when inserting large data sets into a new non-hashed sharded collection. Pre-splitting and manually balancing enables the insert load to be distributed among the shards, increasing performance for the initial load. A pre-split sketch follows this list.
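
A sketch of sharding a hypothetical shop.orders collection on a compound key chosen to spread load across customers while keeping per-customer queries targeted:

  sh.enableSharding("shop")                               // enable sharding for the database
  sh.shardCollection("shop.orders", { customerId: 1, orderId: 1 })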
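
Assuming the shard key above ({ customerId: 1, orderId: 1 }), the difference between a targeted and a broadcast operation looks like this:

  // Targeted: the filter includes the shard key prefix, so mongos routes
  // the query to a single shard.
  db.orders.find({ customerId: 42, status: "open" })

  // Broadcast: no shard key in the filter, so mongos must contact every shard.
  db.orders.find({ status: "open" })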
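
A sketch of explicitly opting in to the "available" read concern on a secondary read, for workloads where orphaned documents are acceptable in exchange for the lowest latency (the collection name is hypothetical):

  db.orders.find({ status: "open" }).readConcern("available").readPref("secondary")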
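
A rough sketch of pre-splitting before a bulk load; the namespace, split points, and shard name are all placeholders, and real split points should reflect the distribution of your data:

  sh.stopBalancer()                                       // pause automatic chunk migrations
  sh.splitAt("shop.orders", { customerId: 1000, orderId: 0 })
  sh.splitAt("shop.orders", { customerId: 2000, orderId: 0 })
  sh.moveChunk("shop.orders", { customerId: 1000, orderId: 0 }, "shard01")
  sh.startBalancer()                                      // resume balancing after the load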

Drivers

  • Make use of connection pooling. Most MongoDB drivers support connection pooling. Adjust the connection pool size to suit your use case, beginning at 110-115% of the typical number of concurrent database requests. A pool-size configuration sketch follows this list.
  • Ensure that your applications handle transient write and read errors during replica set elections.
  • Ensure that your applications handle failed requests and retry them if applicable. Drivers do not automatically retry failed requests.
  • Use exponential backoff logic for database request retries (see the retry sketch after this list).
  • Use cursor.maxTimeMS() for reads and wtimeout for writes if you need to cap execution time for database operations. Examples of both follow this list.
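
A minimal Node.js driver sketch of setting the pool size; the connection string, host, and the assumption of roughly 100 typical concurrent requests are placeholders:

  const { MongoClient } = require("mongodb");

  // Start near 110-115% of typical concurrency (~100 here), then tune.
  const client = new MongoClient("mongodb://db0.example.net:27017/?replicaSet=rs0", {
    maxPoolSize: 115,
    minPoolSize: 10,
  });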
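
A sketch of an application-level retry wrapper with exponential backoff; the attempt count and base delay are illustrative values, not recommendations:

  // Retries a failed database request with delays of 100ms, 200ms, 400ms, ...
  async function withRetry(operation, maxAttempts = 5) {
    for (let attempt = 0; attempt < maxAttempts; attempt++) {
      try {
        return await operation();
      } catch (err) {
        if (attempt === maxAttempts - 1) throw err;   // give up after the final attempt
        const delayMs = 100 * 2 ** attempt;
        await new Promise((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }

  // Usage, assuming the client from the pooling sketch above:
  // await withRetry(() => client.db("shop").collection("orders").insertOne(order));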
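
Minimal mongosh sketches of both caps, using a hypothetical orders collection:

  // Abort the read if it runs longer than 2 seconds on the server.
  db.orders.find({ status: "open" }).maxTimeMS(2000)

  // Stop waiting for write concern acknowledgement after 5 seconds.
  db.orders.updateOne(
    { orderId: 1234 },
    { $set: { status: "shipped" } },
    { writeConcern: { w: "majority", wtimeout: 5000 } }
  )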