FAQ: MongoDB Storage

This document addresses common questions regarding MongoDB's storage system.

Storage Engine Fundamentals

What is a storage engine?

A storage engine is the part of a database that is responsible for managing how data is stored, both in memory and on disk. Many databases support multiple storage engines, where different engines perform better for specific workloads. For example, one storage engine might offer better performance for read-heavy workloads, and another might support a higher throughput for write operations.

Can you mix storage engines in a replica set?

Yes. You can have replica set members that use different storage engines (WiredTiger and in-memory).

Note

Starting in version 4.2, MongoDB removes the deprecated MMAPv1 storage engine.

Storage Recommendations

How many collections and indexes can be in a cluster?

Cluster performance might degrade once the combined number of collections and indexes exceeds 100,000. In addition, many large collections have a greater impact on performance than smaller collections.
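
As a rough way to see how close a deployment is to that threshold, the sketch below sums collections and their indexes. The helper name `countCollectionsAndIndexes` and its input shape (an array of objects carrying the `nindexes` field that db.collection.stats() reports) are assumptions made for illustration.

```javascript
// Hypothetical helper: counts each collection plus its indexes.
// Each element is expected to carry an `nindexes` field, as found in
// the output of db.collection.stats().
function countCollectionsAndIndexes(collectionStats) {
  let total = 0;
  for (const s of collectionStats) {
    total += 1 + (s.nindexes || 0); // the collection itself plus its indexes
  }
  return total;
}

// In mongosh, the input could be gathered for the current database with:
//   const stats = db.getCollectionNames().map((c) => db.getCollection(c).stats());
//   countCollectionsAndIndexes(stats);
```

The result covers only the collections passed in, so a cluster-wide figure would need the same count repeated for every database.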

WiredTiger Storage Engine

Can I upgrade an existing deployment to WiredTiger?

Yes. See:

How much compression does WiredTiger provide?

The ratio of compressed data to uncompressed data depends on your data and the compression library used. By default, collection data in WiredTiger uses Snappy block compression; zlib and zstd compression are also available. Index data uses prefix compression by default.

To what size should I set the WiredTiger internal cache?

With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.

Starting in MongoDB 3.4, the default WiredTiger internal cache size is the larger of either:

- 50% of (RAM - 1 GB), or
- 256 MB.

For example, on a system with a total of 4 GB of RAM, the WiredTiger cache will use 1.5 GB of RAM (0.5 * (4 GB - 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB - 1 GB) = 128 MB < 256 MB).
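
The rule above can be written as a small function. This is a minimal sketch of the arithmetic only; the name `defaultWiredTigerCacheMB` is hypothetical and all values are in megabytes.

```javascript
// Default WiredTiger internal cache size: the larger of
// 50% of (RAM - 1 GB) and 256 MB. All values are in megabytes.
function defaultWiredTigerCacheMB(totalRamMB) {
  const halfOfRamMinusOneGB = 0.5 * (totalRamMB - 1024);
  return Math.max(halfOfRamMinusOneGB, 256);
}

console.log(defaultWiredTigerCacheMB(4096)); // 1536 (1.5 GB)
console.log(defaultWiredTigerCacheMB(1280)); // 256 (128 MB falls below the floor)
```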

Note

In some instances, such as when running in a container, the database can have memory constraints that are lower than the total system memory. In such instances, this memory limit, rather than the total system memory, is used as the maximum RAM available.

To see the memory limit, see hostInfo.system.memLimitMB.
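
To make the note concrete, the following sketch picks the RAM figure that applies to the process from a hostInfo document. The helper name `availableMemoryMB` is hypothetical, and the code assumes `system.memSizeMB` and `system.memLimitMB` arrive as plain numbers.

```javascript
// Hypothetical helper: returns the memory figure that applies to mongod.
// `hostInfo` is the document returned by db.hostInfo() in mongosh.
function availableMemoryMB(hostInfo) {
  const { memSizeMB, memLimitMB } = hostInfo.system;
  // memLimitMB may be absent on versions that do not report it.
  if (memLimitMB === undefined) {
    return memSizeMB;
  }
  return Math.min(memSizeMB, memLimitMB);
}

// In mongosh:
//   availableMemoryMB(db.hostInfo());
```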

By default, WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes. Compression defaults are configurable at a global level and can also be set on a per-collection and per-index basis during collection and index creation.
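
As an illustration of a per-collection setting, the sketch below builds createCollection options that select a block compressor through the WiredTiger `configString`. The helper name `blockCompressorOptions` and the collection name `events` are hypothetical.

```javascript
// Hypothetical helper: builds createCollection options that select a
// WiredTiger block compressor for a single collection.
function blockCompressorOptions(compressor) {
  const allowed = ["none", "snappy", "zlib", "zstd"];
  if (!allowed.includes(compressor)) {
    throw new Error("Unsupported block compressor: " + compressor);
  }
  return {
    storageEngine: {
      wiredTiger: { configString: "block_compressor=" + compressor },
    },
  };
}

// In mongosh:
//   db.createCollection("events", blockCompressorOptions("zstd"));
```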

Different representations are used for data in the WiredTiger internal cache versus the on-disk format:

- Data in the filesystem cache is the same as the on-disk format, including benefits of any compression for data files. The filesystem cache is used by the operating system to reduce disk I/O.
- Indexes loaded in the WiredTiger internal cache have a different data representation from the on-disk format, but can still take advantage of index prefix compression to reduce RAM usage. Index prefix compression deduplicates common prefixes from indexed fields.
- Collection data in the WiredTiger internal cache is uncompressed and uses a different representation from the on-disk format. Block compression can provide significant on-disk storage savings, but data must be uncompressed to be manipulated by the server.

Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes.

To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.

Note

The storage.wiredTiger.engineConfig.cacheSizeGB setting limits the size of the WiredTiger internal cache. The operating system will use the available free memory for filesystem cache, which allows the compressed MongoDB data files to stay in memory. In addition, the operating system will use any free RAM to buffer file system blocks and file system cache.

To accommodate the additional consumers of RAM, you may have to decrease the WiredTiger internal cache size.

The default WiredTiger internal cache size value assumes that there is a single mongod instance per machine. If a single machine contains multiple MongoDB instances, then you should decrease the setting to accommodate the other mongod instances.

If you run mongod in a container (e.g. lxc, cgroups, Docker, etc.) that does not have access to all of the RAM available in a system, you must set storage.wiredTiger.engineConfig.cacheSizeGB to a value less than the amount of RAM available in the container. The exact amount depends on the other processes running in the container. See memLimitMB.

To view statistics on the cache and eviction rate, see the wiredTiger.cache field returned from the serverStatus command.
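
For instance, one figure that can be derived from wiredTiger.cache is how full the cache currently is. The sketch below is illustrative only: the helper name `cacheFillRatio` is hypothetical, and it assumes the two statistics it reads are present and arrive as plain numbers.

```javascript
// Hypothetical helper: fraction of the configured WiredTiger cache in use.
// `serverStatus` is the document returned by db.serverStatus() in mongosh.
function cacheFillRatio(serverStatus) {
  const cache = serverStatus.wiredTiger.cache;
  const used = cache["bytes currently in the cache"];
  const max = cache["maximum bytes configured"];
  return used / max;
}

// In mongosh:
//   cacheFillRatio(db.serverStatus());
```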

How much memory does MongoDB allocate per connection?

Each connection uses up to 1 megabyte of RAM.

To optimize memory use for connections, ensure that you:

- Monitor the number of open connections to your deployment. Too many open connections result in excessive use of RAM and reduce available memory for the working set.
- Close connection pools when they are no longer needed. A connection pool is a cache of open, ready-to-use database connections maintained by the driver. Closing unneeded pools makes additional memory resources available.
- Manage the size of your connection pool. The maxPoolSize connection string option specifies the maximum number of open connections in the pool. By default, you can have up to 100 open connections in the pool. Lowering the maxPoolSize reduces the maximum amount of RAM used for connections.

Tip

To configure your connection pool, see Connection Pool Configuration Settings.
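
A back-of-the-envelope upper bound follows from the figures above (up to 1 MB per connection, 100 connections per pool by default). The function name and the `clientProcesses` parameter are hypothetical, and the estimate assumes one pool per client process while ignoring any other connections to the server.

```javascript
// Upper bound on server RAM used by client connections, in megabytes,
// at up to 1 MB per connection. `clientProcesses` is a hypothetical
// input: the number of application processes, each with its own pool.
function maxConnectionMemoryMB(clientProcesses, maxPoolSize = 100) {
  const mbPerConnection = 1;
  return clientProcesses * maxPoolSize * mbPerConnection;
}

console.log(maxConnectionMemoryMB(20));     // 2000
console.log(maxConnectionMemoryMB(20, 25)); // 500
```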

How frequently does WiredTiger write to disk?

Checkpoints

Starting in version 3.6, MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds. In earlier versions, MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of journal data has been written, whichever occurs first.

Journal Data

WiredTiger syncs the buffered journal records to disk upon any of the following conditions:

- For replica set members (primary and secondary members):
  - If there are operations waiting for oplog entries. Operations that can wait for oplog entries include:
    - forward scanning queries against the oplog
    - read operations performed as part of causally consistent sessions
  - Additionally, for secondary members, after every batch application of the oplog entries.
- If a write operation includes or implies a write concern of j: true.
  Note: Write concern "majority" implies j: true if writeConcernMajorityJournalDefault is true.
- Every 100 milliseconds (see storage.journal.commitIntervalMs).
- When WiredTiger creates a new journal file. Because MongoDB uses a journal file size limit of 100 MB, WiredTiger creates a new journal file approximately every 100 MB of data.
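
The write concern condition can be sketched as a small predicate. The helper name `impliesJournalSync` is hypothetical, and treating an explicitly specified `j` value as taking precedence over the "majority" default is an assumption of this sketch.

```javascript
// Hypothetical helper: does this write concern include or imply j: true?
// Assumption: an explicitly specified `j` takes precedence; otherwise
// w: "majority" implies j: true only when
// writeConcernMajorityJournalDefault is true.
function impliesJournalSync(writeConcern, writeConcernMajorityJournalDefault) {
  if (writeConcern.j !== undefined) {
    return writeConcern.j === true;
  }
  return writeConcern.w === "majority" && writeConcernMajorityJournalDefault === true;
}

// In mongosh, a write that explicitly requests journaling looks like:
//   db.orders.insertOne({ item: "abc" }, { writeConcern: { w: 1, j: true } });
```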

How do I reclaim disk space in WiredTiger?

The WiredTiger storage engine maintains lists of empty records in data files as it deletes documents. This space can be reused by WiredTiger, but will not be returned to the operating system except under very specific circumstances.

The amount of empty space available for reuse by WiredTiger is reflected in the output of db.collection.stats() under the heading wiredTiger.block-manager.file bytes available for reuse.

To allow the WiredTiger storage engine to release this empty space to the operating system, you can defragment your data file. This can be achieved using the compact command. For more information on its behavior and other considerations, see compact.
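
A minimal sketch of reading the reusable-space statistic before deciding whether to compact; the helper name `reusableBytes` is hypothetical, and the `orders` collection is used only as an example.

```javascript
// Hypothetical helper: reads the reusable-space statistic from the
// document returned by db.collection.stats().
function reusableBytes(collStats) {
  return collStats.wiredTiger["block-manager"]["file bytes available for reuse"];
}

// In mongosh:
//   reusableBytes(db.orders.stats());
//   db.runCommand({ compact: "orders" });
```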

Data Storage Diagnostics

How can I check the size of a collection?

To view the statistics for a collection, including the data size, use the db.collection.stats() method from within mongosh. The following example issues db.collection.stats() for the orders collection:

db.orders.stats();

MongoDB also provides the following methods to return specific sizes for the collection:

- db.collection.dataSize() to return the uncompressed data size in bytes for the collection.
- db.collection.storageSize() to return the size in bytes of the collection on disk storage. If collection data is compressed (which is the default for WiredTiger), the storage size reflects the compressed size and may be smaller than the value returned by db.collection.dataSize().
- db.collection.totalIndexSize() to return the index sizes in bytes for the collection. If an index uses prefix compression (which is the default for WiredTiger), the returned size reflects the compressed size.
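
Comparing the first two values gives a rough picture of how well a collection compresses. The helper below is a sketch with a hypothetical name; the result is approximate, since the on-disk size can also include space that is free for reuse.

```javascript
// Rough compression ratio: uncompressed data size divided by on-disk size.
// Returns null when the on-disk size is zero (for example, an empty collection).
function compressionRatio(dataSizeBytes, storageSizeBytes) {
  if (storageSizeBytes === 0) {
    return null;
  }
  return dataSizeBytes / storageSizeBytes;
}

// In mongosh:
//   compressionRatio(db.orders.dataSize(), db.orders.storageSize());
```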

The following script prints the statistics for each database:

// Print storage statistics for every database in the deployment.
db.adminCommand("listDatabases").databases.forEach(function (d) {
   const mdb = db.getSiblingDB(d.name);
   printjson(mdb.stats());
});

The following script prints the statistics for each collection in each database:

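// Print statistics for every collection in every database.
db.adminCommand("listDatabases").databases.forEach(function (d) {
   const mdb = db.getSiblingDB(d.name);
   mdb.getCollectionNames().forEach(function (c) {
      // getCollection() also works for names that clash with database methods.
      printjson(mdb.getCollection(c).stats());
   });
});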

How can I check the size of the individual indexes for a collection?

To view the size of the data allocated for each index, use the db.collection.stats() method and check the indexSizes field in the returned document.

If an index uses prefix compression (which is the default for WiredTiger), the returned size for that index reflects the compressed size.
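
When a collection has many indexes, it can help to list them largest first. The sketch below does that from the indexSizes field; the helper name `indexesBySize` is hypothetical.

```javascript
// Hypothetical helper: lists index names and sizes, largest first, from
// the indexSizes field of a db.collection.stats() document.
function indexesBySize(collStats) {
  return Object.entries(collStats.indexSizes)
    .map(([name, bytes]) => ({ name, bytes }))
    .sort((a, b) => b.bytes - a.bytes);
}

// In mongosh:
//   printjson(indexesBySize(db.orders.stats()));
```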

How can I get information on the storage use of a database?

The db.stats() method in mongosh returns the current state of the "active" database. For the description of the returned fields, see dbStats Output.
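
As a sketch of how the returned document might be condensed, the helper below converts three of its size fields to megabytes. The name `summarizeDbStats` is hypothetical, and the code assumes the `dataSize`, `storageSize`, and `indexSize` fields are reported in bytes as plain numbers.

```javascript
// Hypothetical helper: condenses a db.stats() document into megabytes.
function summarizeDbStats(dbStats) {
  const MB = 1024 * 1024;
  return {
    db: dbStats.db,
    dataMB: dbStats.dataSize / MB,
    storageMB: dbStats.storageSize / MB,
    indexMB: dbStats.indexSize / MB,
  };
}

// In mongosh:
//   printjson(summarizeDbStats(db.stats()));
```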
