This document describes how to use Health Managers to monitor and manage sharded cluster health issues.
Overview
A Health Manager runs health checks on a Health Manager facet at a specified intensity level. Health Manager checks run at specified time intervals. A Health Manager can be configured to move a failing mongos out of a cluster automatically. Progress Monitor ensures that Health Manager checks do not become stuck or unresponsive.
Health Manager Facets
The following table shows the available Health Manager facets:
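| Facet | What the Health Manager Monitors |
|---|---|
| configServer | Cluster health issues related to connectivity to the config server. |
| dns | Cluster health issues related to DNS availability and functionality. |
| ldap | Cluster health issues related to LDAP availability and functionality. |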
Health Manager Intensity Levels
The following table shows the available Health Manager intensity levels:
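| Intensity Level | Description |
|---|---|
| critical | The Health Manager on this facet is enabled and can move the failing mongos out of the cluster. When a failure is detected, the Health Manager waits the amount of time specified by activeFaultDurationSecs before stopping and moving the mongos out of the cluster automatically. |
| non-critical | The Health Manager on this facet is enabled, but the mongos remains in the cluster when a failure is detected. |
| off | The Health Manager on this facet is disabled, and no health checks run on this facet. |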
Active Fault Duration
When a failure is detected and the Health Manager intensity level is set to critical, the Health Manager waits the amount of time specified by activeFaultDurationSecs before stopping and moving the mongos out of the cluster automatically.
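The way the intensity level and activeFaultDurationSecs interact can be modeled with a small function. This is only an illustrative sketch of the behavior described in this document, not the server's implementation. The function name mongosAction, its return labels, and the choice to treat the boundary as inclusive are all assumptions made for the example.

```javascript
// Hypothetical model of the documented behavior; not part of MongoDB.
// intensity: "critical" | "non-critical" | "off"
// faultAgeSecs: seconds the fault has been active since it was detected
// activeFaultDurationSecs: configured wait before a critical fault removes the mongos
function mongosAction(intensity, faultAgeSecs, activeFaultDurationSecs) {
  if (intensity === "off") {
    return "no-check"; // health checks are disabled for this facet
  }
  if (intensity === "non-critical") {
    return "stay"; // the mongos remains in the cluster
  }
  if (intensity === "critical") {
    // Assumption: the mongos is removed once the wait has fully elapsed.
    return faultAgeSecs >= activeFaultDurationSecs ? "remove" : "wait";
  }
  throw new Error(`unknown intensity level: ${intensity}`);
}

console.log(mongosAction("critical", 120, 300)); // wait
console.log(mongosAction("critical", 300, 300)); // remove
```

With activeFaultDurationSecs set to 300, a critical fault that has been active for two minutes is still inside the waiting period in this model.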
Progress Monitor
Progress Monitor runs tests to ensure that Health Manager checks do not become stuck or unresponsive. Progress Monitor runs these tests at intervals specified by interval. If a health check begins but does not complete within the timeout given by deadline, Progress Monitor stops the mongos and removes it from the cluster.
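progressMonitor Fields
| Field | Description | Units |
|---|---|---|
| interval | How often Progress Monitor tests that Health Manager checks are not stuck or unresponsive. | Milliseconds |
| deadline | Timeout before the mongos is failed if a health check does not complete. | Seconds |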
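Because interval is given in milliseconds and deadline in seconds, it is easy to mix up the two units. The sketch below converts both to milliseconds and estimates how many Progress Monitor test runs fit inside one deadline window. The helper name testsPerDeadline is hypothetical and the result is plain arithmetic, not a value reported by the server.

```javascript
// Hypothetical helper illustrating the unit difference between the two fields.
// progressMonitor.interval is in milliseconds; progressMonitor.deadline is in seconds.
function testsPerDeadline(progressMonitor) {
  const deadlineMs = progressMonitor.deadline * 1000; // seconds -> milliseconds
  return Math.floor(deadlineMs / progressMonitor.interval);
}

const monitorSettings = { interval: 1000, deadline: 300 };
console.log(testsPerDeadline(monitorSettings)); // 300
```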
Examples
The following examples show how Health Managers can be configured. For information on Health Manager parameters, see Health Manager Parameters.
Intensity
For example, to set the dns Health Manager facet to the critical intensity level, issue the following at startup:
mongos --setParameter 'healthMonitoringIntensities={ values:[ { type:"dns", intensity: "critical"} ] }'
Or if using the setParameter command in a mongosh session that is connected to a running mongos:
db.adminCommand(
  {
    setParameter: 1,
    healthMonitoringIntensities: { values: [ { type: "dns", intensity: "critical" } ] }
  }
)
Parameters set with setParameter do not persist across restarts. See the setParameter page for details.
To make this setting persistent, set healthMonitoringIntensities in your mongos config file using the setParameter option as in the following example:
setParameter:
  healthMonitoringIntensities: "{ values:[ { type:\"dns\", intensity: \"critical\"} ] }"
healthMonitoringIntensities accepts an array of documents, values. Each document in values takes two fields:
- type, the Health Manager facet
- intensity, the intensity level
See healthMonitoringIntensities for details.
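When several facets are configured at once, it can help to build the values array programmatically and reject misspelled names before sending the command. The following is a minimal sketch under that assumption: buildIntensities is a hypothetical helper, not a MongoDB API, and it accepts only the facet and intensity names listed in this document.

```javascript
// Facet and intensity names as listed in this document.
const FACETS = ["configServer", "dns", "ldap"];
const INTENSITIES = ["critical", "non-critical", "off"];

// Hypothetical helper: turns { dns: "critical" } into the
// { values: [ { type, intensity } ] } shape that healthMonitoringIntensities takes.
function buildIntensities(settings) {
  const values = Object.entries(settings).map(([type, intensity]) => {
    if (!FACETS.includes(type)) {
      throw new Error(`unknown facet: ${type}`);
    }
    if (!INTENSITIES.includes(intensity)) {
      throw new Error(`unknown intensity level: ${intensity}`);
    }
    return { type, intensity };
  });
  return { values };
}

console.log(JSON.stringify(buildIntensities({ dns: "critical", ldap: "non-critical" })));
// {"values":[{"type":"dns","intensity":"critical"},{"type":"ldap","intensity":"non-critical"}]}
```

In a mongosh session, the returned document could be passed as the healthMonitoringIntensities value of the setParameter command.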
Intervals
For example, to set the ldap Health Manager facet to run health checks every 30 seconds, issue the following at startup:
mongos --setParameter 'healthMonitoringIntervals={ values:[ { type:"ldap", interval: "30000"} ] }'
Or if using the setParameter command in a mongosh session that is connected to a running mongos:
db.adminCommand(
  {
    setParameter: 1,
    healthMonitoringIntervals: { values: [ { type: "ldap", interval: "30000" } ] }
  }
)
Parameters set with setParameter do not persist across restarts. See the setParameter page for details.
To make this setting persistent, set healthMonitoringIntervals in your mongos config file using the setParameter option as in the following example:
setParameter:
  healthMonitoringIntervals: "{ values: [{type: \"ldap\", interval: 200}] }"
healthMonitoringIntervals accepts an array of documents, values. Each document in values takes two fields:
- type, the Health Manager facet
- interval, the time interval it runs at, in milliseconds
See healthMonitoringIntervals for details.
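Since interval is expressed in milliseconds, a small conversion step avoids off-by-a-thousand mistakes when you think in seconds. The sketch below is a hypothetical helper (buildIntervals is not a MongoDB API). It assumes numeric interval values, as in the configuration file example, rather than the quoted form shown in the command-line example.

```javascript
// Hypothetical helper: takes per-facet intervals in seconds and returns the
// { values: [ { type, interval } ] } shape, with interval in milliseconds.
function buildIntervals(secondsByFacet) {
  const values = Object.entries(secondsByFacet).map(([type, seconds]) => {
    if (!Number.isFinite(seconds) || seconds <= 0) {
      throw new Error(`interval for ${type} must be a positive number of seconds`);
    }
    return { type, interval: Math.round(seconds * 1000) }; // seconds -> milliseconds
  });
  return { values };
}

console.log(JSON.stringify(buildIntervals({ ldap: 30 })));
// {"values":[{"type":"ldap","interval":30000}]}
```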
Active Fault Duration
For example, to set the duration from failure to crash to five minutes, issue the following at startup:
mongos --setParameter activeFaultDurationSecs=300
Or if using the setParameter command in a mongosh session that is connected to a running mongos:
db.adminCommand(
{
setParameter: 1,
activeFaultDurationSecs: 300
}
)
Parameters set with setParameter do not persist across restarts. See the setParameter page for details.
To make this setting persistent, set activeFaultDurationSecs in your mongos config file using the setParameter option as in the following example:
setParameter:
  activeFaultDurationSecs: 300
See activeFaultDurationSecs for details.
Progress Monitor
Progress Monitor runs tests to ensure that Health Manager checks do not become stuck or unresponsive. Progress Monitor runs these tests at intervals specified by interval. If a health check begins but does not complete within the timeout given by deadline, Progress Monitor stops the mongos and removes it from the cluster.
To set the interval to 1000 milliseconds and the deadline to 300 seconds, issue the following at startup:
mongos --setParameter 'progressMonitor={"interval": 1000, "deadline": 300}'
Or if using the setParameter command in a mongosh session that is connected to a running mongos:
db.adminCommand(
  {
    setParameter: 1,
    progressMonitor: { interval: 1000, deadline: 300 }
  }
)
Parameters set with setParameter do not persist across restarts. See the setParameter page for details.
To make this setting persistent, set progressMonitor in your mongos config file using the setParameter option as in the following example:
setParameter:
  progressMonitor: "{ interval: 1000, deadline: 300 }"
See progressMonitor for details.
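The startup examples in this document all follow the same pattern: the parameter name, an equals sign, and the value, wrapped in single quotes after --setParameter. As a rough sketch, the string can be generated from a JavaScript value so that the command-line form and the mongosh form stay in sync. The helper name toSetParameterFlag is hypothetical, and rendering document values with JSON.stringify is an assumption modeled on the progressMonitor startup example.

```javascript
// Hypothetical helper: renders a parameter as a --setParameter argument.
// Documents are rendered as JSON; scalars are rendered with String().
function toSetParameterFlag(name, value) {
  const rendered =
    typeof value === "object" && value !== null ? JSON.stringify(value) : String(value);
  if (rendered.includes("'")) {
    // A single quote would end the shell quoting used below.
    throw new Error("value must not contain single quotes");
  }
  return `--setParameter '${name}=${rendered}'`;
}

console.log(toSetParameterFlag("progressMonitor", { interval: 1000, deadline: 300 }));
// --setParameter 'progressMonitor={"interval":1000,"deadline":300}'
console.log(toSetParameterFlag("activeFaultDurationSecs", 300));
// --setParameter 'activeFaultDurationSecs=300'
```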