Monitoring for MongoDB

Monitoring is a critical component of all database administration. A firm grasp of MongoDB's reporting will allow you to assess the state of your database and maintain your deployment without crisis. Additionally, a sense of MongoDB's normal operational parameters will allow you to diagnose problems before they escalate to failures.

This document presents an overview of the available monitoring utilities and the reporting statistics available in MongoDB. It also introduces diagnostic strategies and suggestions for monitoring replica sets and sharded clusters.

Monitoring Strategies

MongoDB provides various methods for collecting data about the state of a running MongoDB instance:

  • Starting in version 4.0, MongoDB offers free Cloud monitoring for standalones and replica sets.
  • MongoDB distributes a set of utilities that provides real-time reporting of database activities.
  • MongoDB provides various database commands that return statistics regarding the current database state with greater fidelity.
  • MongoDB Atlas is a cloud-hosted database-as-a-service for running, monitoring, and maintaining MongoDB deployments.
  • MongoDB Cloud Manager is a hosted service that monitors running MongoDB deployments to collect data and provide visualization and alerts based on that data.
  • MongoDB Ops Manager is an on-premise solution available in MongoDB Enterprise Advanced that monitors running MongoDB deployments to collect data and provide visualization and alerts based on that data.

Each strategy can help answer different questions and is useful in different contexts. These methods are complementary.

MongoDB Reporting Tools

This section provides an overview of the reporting methods distributed with MongoDB. It also offers examples of the kinds of questions that each method is best suited to help you address.

Free Monitoring

New in version 4.0.

MongoDB offers free Cloud monitoring for standalones or replica sets.

By default, you can enable or disable free monitoring during runtime using db.enableFreeMonitoring() and db.disableFreeMonitoring().

Free monitoring provides up to 24 hours of data. For more details, see Free Monitoring.

Utilities

The MongoDB distribution includes a number of utilities that quickly return statistics about instances' performance and activity. Typically, these are most useful for diagnosing issues and assessing normal operation.

mongostat

mongostat captures and returns the counts of database operations by type (e.g. insert, query, update, delete). These counts report on the load distribution on the server.

Use mongostat to understand the distribution of operation types and to inform capacity planning. See the mongostat reference page for details.

mongotop

mongotop tracks and reports the current read and write activity of a MongoDB instance, and reports these statistics on a per-collection basis.

Use mongotop to check if your database activity and use match your expectations. See the mongotop reference page for details.

HTTP Console

Changed in version 3.6.

MongoDB 3.6 removes the deprecated HTTP interface and REST API to MongoDB.

Commands

MongoDB includes a number of commands that report on the state of the database.

These data may provide a finer level of granularity than the utilities discussed above. Consider using their output in scripts and programs to develop custom alerts, or to modify the behavior of your application in response to the activity of your instance. The db.currentOp() method is another useful tool for identifying the database instance's in-progress operations.

serverStatus

The serverStatus command, or db.serverStatus() from the shell, returns a general overview of the status of the database, detailing disk usage, memory use, connections, journaling, and index access. The command returns quickly and does not impact MongoDB performance.

serverStatus outputs an account of the state of a MongoDB instance. This command is rarely run directly. In most cases, the data is more meaningful when aggregated, as one would see with monitoring tools including MongoDB Cloud Manager and Ops Manager. Nevertheless, all administrators should be familiar with the data provided by serverStatus.
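
As a minimal sketch of using this output in a script for custom alerts, the function below reads a few fields from a serverStatus-shaped document. The connections.current, connections.available, and opcounters fields are part of the serverStatus output; the function name summarizeServerStatus, the 0.8 warning ratio, and the sample values are assumptions made for illustration only.

```javascript
// Summarize connection usage and operation counters from a
// serverStatus-shaped document. The function name and the default
// warning ratio are hypothetical; adjust them to your deployment.
function summarizeServerStatus(status, connectionWarnRatio = 0.8) {
  const { current, available } = status.connections;
  const total = current + available;
  // Fraction of the connection limit that is currently in use.
  const usedRatio = total > 0 ? current / total : 0;
  return {
    connectionsUsedRatio: usedRatio,
    connectionWarning: usedRatio >= connectionWarnRatio,
    opcounters: { ...status.opcounters },
  };
}

// Sample document with made-up values. In mongosh you could pass
// db.serverStatus() instead.
const sampleStatus = {
  connections: { current: 90, available: 10 },
  opcounters: { insert: 5, query: 20, update: 3, delete: 1 },
};
console.log(summarizeServerStatus(sampleStatus));
```

Keeping the logic in a plain function over the returned document makes it easy to exercise with sample data before pointing it at a live instance.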

dbStats

The dbStats command, or db.stats() from the shell, returns a document that addresses storage use and data volumes. The dbStats output reflects the amount of storage used, the quantity of data contained in the database, and object, collection, and index counters.

Use this data to monitor the state and storage capacity of a specific database. This output also allows you to compare use between databases and to determine the average document size in a database.
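
The comparison described above can be sketched as follows, assuming each input is a dbStats-shaped document with the db, objects, and dataSize fields. The helper names averageDocumentSize and compareDatabases, and the sample figures, are hypothetical.

```javascript
// Average document size in bytes: total data size divided by the
// number of objects. Returns 0 for an empty database.
function averageDocumentSize(stats) {
  return stats.objects > 0 ? stats.dataSize / stats.objects : 0;
}

// Build a per-database report, largest data size first.
function compareDatabases(statsList) {
  return statsList
    .map((s) => ({
      db: s.db,
      dataSize: s.dataSize,
      avgDocSize: averageDocumentSize(s),
    }))
    .sort((a, b) => b.dataSize - a.dataSize);
}

// Made-up sample values. In mongosh, each element could come from
// db.getSiblingDB(name).stats().
const sampleStats = [
  { db: "app", objects: 1000, dataSize: 512000 },
  { db: "logs", objects: 4000, dataSize: 1024000 },
];
console.log(compareDatabases(sampleStats));
```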

collStats

The collStats command, or db.collection.stats() from the shell, provides statistics that resemble dbStats at the collection level, including a count of the objects in the collection, the size of the collection, the amount of disk space used by the collection, and information about its indexes.

replSetGetStatus

The replSetGetStatus command (rs.status() from the shell) returns an overview of your replica set's status. The replSetGetStatus document details the state and configuration of the replica set and statistics about its members.

Use this data to ensure that replication is properly configured, and to check the connections between the current host and the other members of the replica set.

Hosted (SaaS) Monitoring Tools

These are monitoring tools provided as a hosted service, usually through a paid subscription.

Name | Notes
MongoDB Cloud Manager | MongoDB Cloud Manager is a cloud-based suite of services for managing MongoDB deployments. MongoDB Cloud Manager provides monitoring, backup, and automation functionality. For an on-premise solution, see also Ops Manager, available in MongoDB Enterprise Advanced.
VividCortex | VividCortex provides deep insights into MongoDB production database workload and query performance in one-second resolution. Track latency, throughput, errors, and more to ensure scalability and exceptional performance of your application on MongoDB.
Scout | Several plugins, including MongoDB Monitoring, MongoDB Slow Queries, and MongoDB Replica Set Monitoring.
Server Density | Dashboard for MongoDB, MongoDB-specific alerts, replication failover timeline, and iPhone, iPad, and Android mobile apps.
Application Performance Management | IBM has an Application Performance Management SaaS offering that includes a monitor for MongoDB and other applications and middleware.
New Relic | New Relic offers full support for application performance management. In addition, New Relic Plugins and Insights enable you to view monitoring metrics from Cloud Manager in New Relic.
Datadog | Infrastructure monitoring to visualize the performance of your MongoDB deployments.
SPM Performance Monitoring | Monitoring, anomaly detection, and alerting. SPM monitors all key MongoDB metrics together with infrastructure, including Docker, and other application metrics, e.g. Node.js, Java, NGINX, Apache, HAProxy, or Elasticsearch. SPM provides correlation of metrics and logs.
Pandora FMS | Pandora FMS provides the PandoraFMS-mongodb-monitoring plugin to monitor MongoDB.

Process Logging

During normal operation, mongod and mongos instances report a live account of all server activity and operations to either standard output or a log file. The following runtime settings control these options.

  • quiet. Limits the amount of information written to the log or output.
  • verbosity. Increases the amount of information written to the log or output. You can also modify the logging verbosity during runtime with the logLevel parameter or the db.setLogLevel() method in the shell.
  • path. Enables logging to a file, rather than the standard output. You must specify the full path to the log file when adjusting this setting.
  • logAppend. Adds information to a log file instead of overwriting the file.

Note

You can specify these configuration options as command-line arguments to mongod or mongos.

For example:

mongod -v --logpath /var/log/mongodb/server1.log --logappend

This command starts a mongod instance in verbose mode, appending data to the log file at /var/log/mongodb/server1.log.

The following database commands also affect logging:

Log Redaction

Available in MongoDB Enterprise only

A mongod running with security.redactClientLogData redacts messages associated with any given log event before logging, leaving only metadata, source files, or line numbers related to the event. security.redactClientLogData prevents potentially sensitive information from entering the system log at the cost of diagnostic detail.

For example, the following operation inserts a document into a mongod running without log redaction. The mongod has systemLog.component.command.verbosity set to 1:

db.clients.insertOne( { "name" : "Joe", "PII" : "Sensitive Information" } )

This operation produces the following log event:

2017-06-09T13:35:23.446-04:00 I COMMAND  [conn1] command internal.clients
   appName: "MongoDB Shell"
   command: insert {
      insert: "clients",
      documents: [ {
            _id: ObjectId('593adc5b99001b7d119d0c97'),
            name: "Joe",
            PII: "Sensitive Information"
         } ],
      ordered: true
   }
   ...

A mongod running with security.redactClientLogData performing the same insert operation produces the following log event:

2017-06-09T13:45:18.599-04:00 I COMMAND  [conn1] command internal.clients
   appName: "MongoDB Shell"
   command: insert {
      insert: "###", documents: [ {
         _id: "###", name: "###", PII: "###"
      } ],
      ordered: "###"
   }

Use redactClientLogData in conjunction with Encryption at Rest and TLS/SSL (Transport Encryption) to assist compliance with regulatory requirements.

Diagnosing Performance Issues

As you develop and operate applications with MongoDB, you may want to analyze the performance of the application and its database. MongoDB Performance discusses some of the operational factors that can influence performance.

Replication and Monitoring

Beyond the basic monitoring requirements for any MongoDB instance, for replica sets, administrators must monitor replication lag. "Replication lag" refers to the amount of time that it takes to copy (i.e. replicate) a write operation on the primary to a secondary. Some small delay period may be acceptable, but significant problems emerge as replication lag grows, including:

  • Growing cache pressure on the primary.
  • Operations that occurred during the period of lag are not replicated to one or more secondaries. If you're using replication to ensure data persistence, exceptionally long delays may impact the integrity of your data set.
  • If the replication lag exceeds the length of the operation log (oplog) then MongoDB will have to perform an initial sync on the secondary, copying all data from the primary and rebuilding all indexes. [1] This is uncommon under normal circumstances, but if you configure the oplog to be smaller than the default, the issue can arise.

    Note

    The size of the oplog is only configurable during the first run using the --oplogSize argument to the mongod command, or preferably, the oplogSizeMB setting in the MongoDB configuration file. If you do not specify this on the command line before running with the --replSet option, mongod will create a default sized oplog.

    By default, the oplog is 5 percent of total available disk space on 64-bit systems. For more information about changing the oplog size, see Change the Size of the Oplog.

Flow Control

Starting in MongoDB 4.2, administrators can limit the rate at which the primary applies its writes with the goal of keeping the majority committed lag under a configurable maximum value, flowControlTargetLagSeconds.

By default, flow control is enabled.

Note

For flow control to engage, the replica set/sharded cluster must have: featureCompatibilityVersion (FCV) of 4.2 and read concern majority enabled. That is, enabled flow control has no effect if FCV is not 4.2 or if read concern majority is disabled.

See also: Check the Replication Lag.

Replica Set Status

Replication issues are most often the result of network connectivity issues between members, or the result of a primary that does not have the resources to support application and replication traffic. To check the status of a replica, use the replSetGetStatus command or the following helper in the shell:

rs.status()

The replSetGetStatus reference provides a more in-depth overview of this output. In general, watch the value of optimeDate, and pay particular attention to the time difference between the primary and the secondary members.
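
One way to watch that time difference is to subtract each secondary's optimeDate from the primary's. The sketch below assumes a replSetGetStatus-shaped document whose members carry name, stateStr, and optimeDate fields. The function name replicationLagSeconds, the host names, and the timestamps are made up for illustration.

```javascript
// Compute the approximate lag of each secondary behind the primary,
// in seconds, from a replSetGetStatus-shaped document.
// Returns null when no member currently reports itself as PRIMARY.
function replicationLagSeconds(status) {
  const primary = status.members.find((m) => m.stateStr === "PRIMARY");
  if (!primary) return null;
  return status.members
    .filter((m) => m.stateStr === "SECONDARY")
    .map((m) => ({
      name: m.name,
      // Subtracting two Date objects yields milliseconds.
      lagSeconds: (primary.optimeDate - m.optimeDate) / 1000,
    }));
}

// Hypothetical sample. In mongosh you could pass rs.status() instead.
const sampleReplStatus = {
  members: [
    { name: "db1.example.net:27017", stateStr: "PRIMARY", optimeDate: new Date("2020-01-01T00:00:10Z") },
    { name: "db2.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2020-01-01T00:00:08Z") },
    { name: "db3.example.net:27017", stateStr: "SECONDARY", optimeDate: new Date("2020-01-01T00:00:10Z") },
  ],
};
console.log(replicationLagSeconds(sampleReplStatus));
```

The result is only as fresh as the status document it was computed from, so a monitoring script would need to call it periodically.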

[1] Starting in MongoDB 4.0, the oplog can grow past its configured size limit to avoid deleting the majority commit point.

Free Monitoring

Note

Starting in version 4.0, MongoDB offers free monitoring for standalones and replica sets. For more information, see Free Monitoring.

Slow Application of Oplog Entries

Starting in version 4.2 (also available starting in 4.0.6), secondary members of a replica set now log oplog entries that take longer than the slow operation threshold to apply. These slow oplog messages:

  • Are logged for the secondaries in the diagnostic log.
  • Are logged under the REPL component with the text applied op: <oplog entry> took <num>ms.
  • Do not depend on the log levels (either at the system or component level).
  • Do not depend on the profiling level.
  • May be affected by slowOpSampleRate, depending on your MongoDB version:

    • In MongoDB 4.2 and earlier, these slow oplog entries are not affected by the slowOpSampleRate. MongoDB logs all slow oplog entries regardless of the sample rate.
    • In MongoDB 4.4 and later, these slow oplog entries are affected by the slowOpSampleRate.

The profiler does not capture slow oplog entries.
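
Because these messages follow the text pattern applied op: <oplog entry> took <num>ms, a script can pull the duration out of a log message with a regular expression. This is a rough sketch: the function name parseSlowOplogMessage is hypothetical, and the sample message is built by hand from the pattern above rather than copied from a real log.

```javascript
// Extract the oplog entry text and the apply duration from a message
// containing "applied op: <oplog entry> took <num>ms".
// Returns null when the message does not match the pattern.
function parseSlowOplogMessage(message) {
  const match = /applied op: (.*) took (\d+)ms/.exec(message);
  if (!match) return null;
  return { entry: match[1], millis: Number(match[2]) };
}

// Hand-written sample message, for illustration only.
const sampleMessage = 'applied op: { op: "i", ns: "test.clients" } took 153ms';
console.log(parseSlowOplogMessage(sampleMessage));
```

The pattern is intentionally loose about what precedes and follows the matched text, since the surrounding log line layout differs between MongoDB versions.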

Sharding and Monitoring

In most cases, the components of sharded clusters benefit from the same monitoring and analysis as all other MongoDB instances. In addition, clusters require further monitoring to ensure that data is effectively distributed among nodes and that sharding operations are functioning appropriately.

Tip

See the Sharding documentation for more information.

Config Servers

The config database maintains a map identifying which documents are on which shards. The cluster updates this map as chunks move between shards. When a configuration server becomes inaccessible, certain sharding operations become unavailable, such as moving chunks and starting mongos instances. However, clusters remain accessible from already-running mongos instances.

Because inaccessible configuration servers can seriously impact the availability of a sharded cluster, you should monitor your configuration servers to ensure that the cluster remains well balanced and that mongos instances can restart.

MongoDB Cloud Manager and Ops Manager monitor config servers and can create notifications if a config server becomes inaccessible. See the MongoDB Cloud Manager documentation and Ops Manager documentation for more information.

Balancing and Chunk Distribution

The most effective sharded cluster deployments evenly balance chunks among the shards. To facilitate this, MongoDB has a background balancer process that distributes data to ensure that chunks are always optimally distributed among the shards.

Issue the db.printShardingStatus() or sh.status() command to the mongos from within mongosh. This returns an overview of the entire cluster including the database name, and a list of the chunks.

Stale Locks

To check the lock status of the database, connect to a mongos instance using mongosh. Issue the following command sequence to switch to the config database and display all outstanding locks on the shard database:

use config
db.locks.find()

The balancing process takes a special "balancer" lock that prevents other balancing activity from transpiring. In the config database, use the following command to view the "balancer" lock:

db.locks.find( { _id : "balancer" } )

Changed in version 3.4.

Starting in 3.4, the primary of the CSRS config server holds the "balancer" lock, using a process ID named "ConfigServer". This lock is never released. To determine if the balancer is running, see Check if Balancer is Running.

Storage Node Watchdog

Note
  • Starting in MongoDB 4.2, the Storage Node Watchdog is available in both the Community and MongoDB Enterprise editions.
  • In earlier versions (3.2.16+, 3.4.7+, 3.6.0+, 4.0.0+), the Storage Node Watchdog is only available in MongoDB Enterprise edition.

The Storage Node Watchdog monitors the following MongoDB directories to detect filesystem unresponsiveness:

By default, the Storage Node Watchdog is disabled. You can only enable the Storage Node Watchdog on a mongod at startup time by setting the watchdogPeriodSeconds parameter to an integer greater than or equal to 60. However, once enabled, you can pause and restart the Storage Node Watchdog during runtime. See the watchdogPeriodSeconds parameter for details.

If any of the filesystems containing the monitored directories become unresponsive, the Storage Node Watchdog terminates the mongod and exits with a status code of 61. If the mongod is the primary of a replica set, the termination initiates a failover, allowing another member to become primary.

Once a mongod has terminated, it may not be possible to cleanly restart it on the same machine.

Note
Symlinks

If any of its monitored directories is a symlink to other volumes, the Storage Node Watchdog does not monitor the symlink target.

For example, if the mongod uses storage.directoryPerDB: true (or --directoryperdb) and symlinks a database directory to another volume, the Storage Node Watchdog does not follow the symlink to monitor the target.

The maximum time the Storage Node Watchdog can take to detect an unresponsive filesystem and terminate is nearly twice the value of watchdogPeriodSeconds.