WiredTiger Storage Engine


Starting in MongoDB 3.2, the WiredTiger storage engine is the default storage engine. For existing deployments, if you do not specify the --storageEngine or the storage.engine setting, the version 3.2+ mongod instance can automatically determine the storage engine used to create the data files in the --dbpath or storage.dbPath. See Default Storage Engine Change.
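For reference, a minimal sketch of a mongod configuration file that selects WiredTiger explicitly; the dbPath shown is only an illustrative placeholder:

```yaml
# mongod.conf -- minimal sketch; adjust paths for your deployment
storage:
  dbPath: /var/lib/mongodb   # example data directory (storage.dbPath / --dbpath)
  engine: wiredTiger         # optional in 3.2+, where WiredTiger is already the default
```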

Document Level Concurrency

WiredTiger uses document-level concurrency control for write operations. As a result, multiple clients can modify different documents of a collection at the same time.

For most read and write operations, WiredTiger uses optimistic concurrency control. WiredTiger uses only intent locks at the global, database and collection levels. When the storage engine detects conflicts between two operations, one will incur a write conflict causing MongoDB to transparently retry that operation.

Some global operations, typically short-lived operations involving multiple databases, still require a global “instance-wide” lock. Some other operations, such as collMod, still require an exclusive database lock.

Snapshots and Checkpoints

WiredTiger uses MultiVersion Concurrency Control (MVCC). At the start of an operation, WiredTiger provides a point-in-time snapshot of the data to the operation. A snapshot presents a consistent view of the in-memory data.

When writing to disk, WiredTiger writes all the data in a snapshot to disk in a consistent way across all data files. The now-durable data act as a checkpoint in the data files. The checkpoint ensures that the data files are consistent up to and including the last checkpoint; i.e. checkpoints can act as recovery points.

Starting in version 3.6, MongoDB configures WiredTiger to create checkpoints (i.e. write the snapshot data to disk) at intervals of 60 seconds. In earlier versions, MongoDB sets checkpoints to occur in WiredTiger on user data at an interval of 60 seconds or when 2 GB of journal data has been written, whichever occurs first.

During the write of a new checkpoint, the previous checkpoint is still valid. As such, even if MongoDB terminates or encounters an error while writing a new checkpoint, upon restart, MongoDB can recover from the last valid checkpoint.

The new checkpoint becomes accessible and permanent when WiredTiger’s metadata table is atomically updated to reference the new checkpoint. Once the new checkpoint is accessible, WiredTiger frees pages from the old checkpoints.

Using WiredTiger, even without journaling, MongoDB can recover from the last checkpoint; however, to recover changes made after the last checkpoint, run with journaling.

Note

Starting in MongoDB 4.0, you cannot specify --nojournal option or storage.journal.enabled: false for replica set members that use the WiredTiger storage engine.

Journal

WiredTiger uses a write-ahead log (i.e. journal) in combination with checkpoints to ensure data durability.

The WiredTiger journal persists all data modifications between checkpoints. If MongoDB exits between checkpoints, it uses the journal to replay all data modified since the last checkpoint. For information on the frequency with which MongoDB writes the journal data to disk, see Journaling Process.

The WiredTiger journal is compressed using the snappy compression library. To specify a different compression algorithm or no compression, use the storage.wiredTiger.engineConfig.journalCompressor setting. For details on changing the journal compressor, see Change WiredTiger Journal Compressor.
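As a hedged example, assuming a YAML configuration file, the journal compressor can be changed with this setting (zlib shown here is one alternative; snappy is the default):

```yaml
storage:
  wiredTiger:
    engineConfig:
      journalCompressor: zlib   # default: snappy; set to none to disable journal compression
```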

Note

If a log record is less than or equal to 128 bytes (the minimum log record size for WiredTiger), WiredTiger does not compress that record.

You can disable journaling for standalone instances by setting storage.journal.enabled to false, which can reduce the overhead of maintaining the journal. For standalone instances, not using the journal means that, when MongoDB exits unexpectedly, you will lose all data modifications prior to the last checkpoint.
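A minimal sketch of that configuration for a standalone instance (this is not permitted for WiredTiger replica set members, per the note below):

```yaml
storage:
  journal:
    enabled: false   # standalone only; an unclean shutdown loses all changes since the last checkpoint
```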

Note

Starting in MongoDB 4.0, you cannot specify --nojournal option or storage.journal.enabled: false for replica set members that use the WiredTiger storage engine.

Compression

With WiredTiger, MongoDB supports compression for all collections and indexes. Compression minimizes storage use at the expense of additional CPU.

By default, WiredTiger uses block compression with the snappy compression library for all collections and prefix compression for all indexes.

For collections, the zlib block compression library is also available.

To specify an alternate compression algorithm or no compression, use the storage.wiredTiger.collectionConfig.blockCompressor setting.

For indexes, to disable prefix compression, use the storage.wiredTiger.indexConfig.prefixCompression setting.
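For illustration, both global compression settings expressed together in a configuration file; the values shown are examples rather than recommendations:

```yaml
storage:
  wiredTiger:
    collectionConfig:
      blockCompressor: zlib    # default: snappy; set to none to disable block compression
    indexConfig:
      prefixCompression: false # default: true (prefix compression enabled)
```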

Compression settings are also configurable on a per-collection and per-index basis during collection and index creation. See Specify Storage Engine Options and db.collection.createIndex() storageEngine option.
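The following mongo shell sketch shows the per-collection and per-index form; the collection name, field name, and configString values are hypothetical examples of WiredTiger storage options:

```javascript
// Create a collection whose on-disk blocks use zlib compression instead of the global default.
db.createCollection("logs", {
  storageEngine: { wiredTiger: { configString: "block_compressor=zlib" } }
})

// Create an index with prefix compression disabled for this index only.
db.logs.createIndex(
  { timestamp: 1 },
  { storageEngine: { wiredTiger: { configString: "prefix_compression=false" } } }
)
```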

For most workloads, the default compression settings balance storage efficiency and processing requirements.

The WiredTiger journal is also compressed by default. For information on journal compression, see Journal.

Memory Use

With WiredTiger, MongoDB utilizes both the WiredTiger internal cache and the filesystem cache.

Starting in MongoDB 3.4, the default WiredTiger internal cache size is the larger of either 50% of (RAM - 1 GB), or 256 MB.

For example, on a system with a total of 4GB of RAM the WiredTiger cache will use 1.5GB of RAM (0.5 * (4 GB - 1 GB) = 1.5 GB). Conversely, a system with a total of 1.25 GB of RAM will allocate 256 MB to the WiredTiger cache because that is more than half of the total RAM minus one gigabyte (0.5 * (1.25 GB - 1 GB) = 128 MB < 256 MB).

Note

In some instances, such as when running in a container, the database can have memory constraints that are lower than the total system memory. In such instances, this memory limit, rather than the total system memory, is used as the maximum RAM available.

To see the memory limit, see hostInfo.system.memLimitMB.

By default, WiredTiger uses Snappy block compression for all collections and prefix compression for all indexes. Compression defaults are configurable at a global level and can also be set on a per-collection and per-index basis during collection and index creation.

Different representations are used for data in the WiredTiger internal cache versus the on-disk format: data in the filesystem cache is the same as the on-disk format, including the benefits of any compression for data files; indexes loaded in the WiredTiger internal cache have a different data representation from the on-disk format, but can still take advantage of index prefix compression to reduce RAM usage; and collection data in the WiredTiger internal cache is uncompressed and uses a different representation from the on-disk format. Block compression can provide significant on-disk storage savings, but data must be uncompressed to be manipulated by the server.

Via the filesystem cache, MongoDB automatically uses all free memory that is not used by the WiredTiger cache or by other processes.

To adjust the size of the WiredTiger internal cache, see storage.wiredTiger.engineConfig.cacheSizeGB and --wiredTigerCacheSizeGB. Avoid increasing the WiredTiger internal cache size above its default value.
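If you do need to set the cache size explicitly, for example to stay within a container's memory limit, a hedged configuration sketch follows; the value is arbitrary and should be derived from the sizing guidance above:

```yaml
storage:
  wiredTiger:
    engineConfig:
      cacheSizeGB: 2   # example value; avoid exceeding the default sizing described above
```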