
db.collection.analyzeShardKey() (mongosh method)

Definition

db.collection.analyzeShardKey(key, opts)
Calculates metrics for evaluating a shard key for an unsharded or sharded collection. Metrics are based on sampled queries. You can use configureQueryAnalyzer to configure query sampling on a collection.

Compatibility

This method is available in deployments hosted in the following environments:

  • MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud

Important

This command is not supported in M0 and Flex clusters. For more information, see Unsupported Commands.

  • MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
  • MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Syntax

db.collection.analyzeShardKey() has this syntax:

db.collection.analyzeShardKey(
  <shardKey>,
  {
    keyCharacteristics: <bool>,
    readWriteDistribution: <bool>,
    sampleRate: <double>,
    sampleSize: <int>
  }
)

Fields

Field | Type | Necessity | Description

key | document | Required

Shard key to analyze. This can be a candidate shard key for an unsharded or sharded collection, or the current shard key for a sharded collection.

There is no default value.

opts.keyCharacteristics | boolean | Optional

Whether the metrics about the characteristics of the shard key are calculated. For details, see keyCharacteristics.

Defaults to true.

opts.readWriteDistribution | boolean | Optional

Whether the metrics about the read and write distribution are calculated. For details, see readWriteDistribution.

Defaults to true.

opts.sampleRate | double | Optional

The proportion of the documents in the collection to sample when calculating the metrics about the characteristics of the shard key. If you set sampleRate, you cannot set sampleSize.

Must be greater than 0, up to and including 1.

There is no default value.

opts.sampleSize | integer | Optional

The number of documents to sample when calculating the metrics about the characteristics of the shard key. If you set sampleSize, you cannot set sampleRate.

If neither sampleSize nor sampleRate is specified, the sample size defaults to the value set by analyzeShardKeyCharacteristicsDefaultSampleSize.
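
The constraints on sampleRate and sampleSize described above can be checked before the method is called. The following is a minimal sketch, not part of mongosh: buildAnalyzeOpts is a hypothetical helper name, and the requirement that sampleSize be positive is an assumption (this section only states that it is an integer).

```javascript
// Hypothetical helper (not part of mongosh): validates and builds the
// opts document passed to db.collection.analyzeShardKey().
function buildAnalyzeOpts({
  keyCharacteristics = true,      // documented default
  readWriteDistribution = true,   // documented default
  sampleRate,
  sampleSize,
} = {}) {
  // sampleRate and sampleSize are mutually exclusive.
  if (sampleRate !== undefined && sampleSize !== undefined) {
    throw new Error("Set either sampleRate or sampleSize, not both");
  }
  // sampleRate must be greater than 0, up to and including 1.
  if (sampleRate !== undefined && !(sampleRate > 0 && sampleRate <= 1)) {
    throw new Error("sampleRate must be greater than 0, up to and including 1");
  }
  // Assumption: a usable sample size is a positive integer.
  if (sampleSize !== undefined && !(Number.isInteger(sampleSize) && sampleSize > 0)) {
    throw new Error("sampleSize must be a positive integer");
  }
  const opts = { keyCharacteristics, readWriteDistribution };
  if (sampleRate !== undefined) opts.sampleRate = sampleRate;
  if (sampleSize !== undefined) opts.sampleSize = sampleSize;
  return opts;
}

console.log(buildAnalyzeOpts({ sampleSize: 10000, readWriteDistribution: false }));

// Possible use in mongosh, assuming a collection named post:
// db.post.analyzeShardKey({ userId: 1 }, buildAnalyzeOpts({ sampleSize: 10000 }))
```

When neither sampling option is passed, the helper leaves both out of the document so that the server-side default applies.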

Behavior

For behavior, see analyzeShardKey Behavior.

Access Control

For details, see analyzeShardKey Access Control.

Output

For sample output, see analyzeShardKey Output.

Examples

Consider a simplified version of a social media app. The collection we are trying to shard is the post collection.

Documents in the post collection have the following schema:

{
  userId: <uuid>,
  firstName: <string>,
  lastName: <string>,
  body: <string>, // a field that can be modified
  date: <date>    // a field that can be modified
}

Background Information

  • The app has 1500 users.
  • There are 30 last names and 45 first names, some more common than others.
  • There are three celebrity users.
  • Each user follows exactly five other users and has a very high probability of following at least one celebrity user.

Sample Workload

  • Each user publishes about two posts a day at random times. They edit each post once, right after it is posted.
  • Each user logs in every six hours to read their own profile and the posts from the past 24 hours by the users they follow. They also reply under a random post from the past three hours.
  • For every user, the app removes posts that are more than three days old at midnight.

Workload Query Patterns

This workload has the following query patterns:

  • find command with filter { userId: , firstName: , lastName: }
  • find command with filter { $or: [{ userId: , firstName: , lastName: , date: { $gte: } }, ] }
  • findAndModify command with filter { userId: , firstName: , lastName: , date: } to update the body and date fields.
  • update command with multi: false and filter { userId: , firstName: , lastName: , date: { $gte: , $lt: } } to update the body and date fields.
  • delete command with multi: true and filter { userId: , firstName: , lastName: , date: { $lt: } }
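
When comparing these patterns against a candidate shard key, it can help to check which filters pin every shard key field to a single value, since such filters are candidates for single-shard targeting. The sketch below is a deliberate simplification and does not reproduce MongoDB's routing logic. hasEqualityOnShardKey is a hypothetical helper, and the filter values are placeholders invented for illustration.

```javascript
// Hypothetical helper: returns true when the filter has a top-level equality
// match on every field of the candidate shard key. Simplified on purpose:
// it does not look inside $or, $and, or other logical operators.
function hasEqualityOnShardKey(filter, shardKey) {
  return Object.keys(shardKey).every((field) => {
    if (!(field in filter)) return false;
    const value = filter[field];
    // A plain object with a "$"-prefixed key is treated as an operator
    // expression such as { $gte: ... }, which is not an equality match.
    const isOperatorExpr =
      value !== null &&
      typeof value === "object" &&
      Object.getPrototypeOf(value) === Object.prototype &&
      Object.keys(value).some((k) => k.startsWith("$"));
    return !isOperatorExpr;
  });
}

// Placeholder values, invented for illustration only.
const byUser = { userId: "u1", firstName: "Ann", lastName: "Smith" };
const byUserAndRange = { ...byUser, date: { $lt: new Date("2024-01-04") } };
const orFilter = { $or: [byUser] };

console.log(hasEqualityOnShardKey(byUser, { userId: 1 }));                  // true
console.log(hasEqualityOnShardKey(byUserAndRange, { userId: 1 }));          // true
console.log(hasEqualityOnShardKey(byUserAndRange, { userId: 1, date: 1 })); // false
console.log(hasEqualityOnShardKey(orFilter, { userId: 1 }));                // false
```

A false result only means that this simple check found no top-level equality match. It does not by itself say how many shards the query reaches.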

Below are example metrics returned by the db.collection.analyzeShardKey() method for some candidate shard keys, with sampled queries collected from seven days of workload.

Note

Before you run the db.collection.analyzeShardKey() method, read the Supporting Indexes section. If you require supporting indexes for the shard key you are analyzing, use the db.collection.createIndex() method to create the indexes.

{ _id: 1 } keyCharacteristics

This example uses the db.collection.analyzeShardKey() method to provide metrics on the { _id: 1 } shard key on the social.post collection.

The following code block uses db.collection.configureQueryAnalyzer() to turn on query sampling:

use social
db.post.configureQueryAnalyzer(
  {
    mode: "full",
    samplesPerSecond: 5
  }
)

After db.collection.configureQueryAnalyzer() collects query samples, the following code block uses the db.collection.analyzeShardKey() method to sample 10,000 documents and calculate results:

use social
db.post.analyzeShardKey(
  { _id: 1 },
  {
    keyCharacteristics: true,
    readWriteDistribution: false,
    sampleSize: 10000
  }
)

{ lastName: 1 } keyCharacteristics

This db.collection.analyzeShardKey() call provides metrics on the { lastName: 1 } shard key on the social.post collection:

use social
db.post.analyzeShardKey(
  { lastName: 1 },
  {
    keyCharacteristics: true,
    readWriteDistribution: false
  }
)

The output for this example resembles the following:

{
  "keyCharacteristics": {
    "numDocsTotal" : 9039,
    "avgDocSizeBytes" : 153,
    "numDocsSampled" : 9039,
    "isUnique" : false,
    "numDistinctValues" : 30,
    "mostCommonValues" : [
      {
        "value" : {
          "lastName" : "Smith"
        },
        "frequency" : 1013
      },
      {
        "value" : {
          "lastName" : "Johnson"
        },
        "frequency" : 984
      },
      {
        "value" : {
          "lastName" : "Jones"
        },
        "frequency" : 962
      },
      {
        "value" : {
          "lastName" : "Brown"
        },
        "frequency" : 925
      },
      {
        "value" : {
          "lastName" : "Davies"
        },
        "frequency" : 852
      }
    ],
    "monotonicity" : {
      "recordIdCorrelationCoefficient" : 0.0771959161,
      "type" : "not monotonic"
    },
  }
}

{ userId: 1 } keyCharacteristics

This db.collection.analyzeShardKey() call provides metrics on the { userId: 1 } shard key on the social.post collection:

use social
db.post.analyzeShardKey(
  { userId: 1 },
  {
    keyCharacteristics: true,
    readWriteDistribution: false
  }
)

The output for this example resembles the following:

{
  "keyCharacteristics": {
    "numDocsTotal" : 9039,
    "avgDocSizeBytes" : 162,
    "numDocsSampled" : 9039,
    "isUnique" : false,
    "numDistinctValues" : 1495,
    "mostCommonValues" : [
      {
        "value" : {
          "userId" : UUID("aadc3943-9402-4072-aae6-ad551359c596")
        },
        "frequency" : 15
      },
      {
        "value" : {
          "userId" : UUID("681abd2b-7a27-490c-b712-e544346f8d07")
        },
        "frequency" : 14
      },
      {
        "value" : {
          "userId" : UUID("714cb722-aa27-420a-8d63-0d5db962390d")
        },
        "frequency" : 14
      },
      {
        "value" : {
          "userId" : UUID("019a4118-b0d3-41d5-9c0a-764338b7e9d1")
        },
        "frequency" : 14
      },
      {
        "value" : {
          "userId" : UUID("b9c9fbea-3c12-41aa-bc69-eb316047a790")
        },
        "frequency" : 14
      }
    ],
    "monotonicity" : {
      "recordIdCorrelationCoefficient" : -0.0032039729,
      "type" : "not monotonic"
    },
  }
}
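
The two keyCharacteristics outputs above can be compared with a small amount of arithmetic. The sketch below is one possible way to do that. summarizeKeyCharacteristics is a hypothetical helper, the two derived ratios are illustrative measures rather than fields returned by the method, and the input documents are trimmed copies of the example outputs.

```javascript
// Hypothetical helper: derives two simple ratios from a keyCharacteristics
// document. These ratios are not part of the analyzeShardKey output.
function summarizeKeyCharacteristics(kc) {
  const topFrequency = Math.max(...kc.mostCommonValues.map((v) => v.frequency));
  return {
    // Distinct values per sampled document; closer to 1 means higher cardinality.
    distinctRatio: kc.numDistinctValues / kc.numDocsSampled,
    // Share of sampled documents that hold the single most common value.
    topValueShare: topFrequency / kc.numDocsSampled,
  };
}

// Trimmed copies of the example outputs (only the fields used here).
const lastNameKey = {
  numDocsSampled: 9039,
  numDistinctValues: 30,
  mostCommonValues: [1013, 984, 962, 925, 852].map((frequency) => ({ frequency })),
};
const userIdKey = {
  numDocsSampled: 9039,
  numDistinctValues: 1495,
  mostCommonValues: [15, 14, 14, 14, 14].map((frequency) => ({ frequency })),
};

console.log("lastName:", summarizeKeyCharacteristics(lastNameKey));
console.log("userId:", summarizeKeyCharacteristics(userIdKey));
```

In this sample data, { userId: 1 } has many more distinct values than { lastName: 1 }, and its most common value covers a far smaller share of the sampled documents.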

{ userId: 1 } readWriteDistribution

This db.collection.analyzeShardKey() call provides read and write distribution metrics for the { userId: 1 } shard key on the social.post collection:

use social
db.post.analyzeShardKey(
  { userId: 1 },
  {
    keyCharacteristics: false,
    readWriteDistribution: true
  }
)

The output for this example resembles the following:

{
  "readDistribution" : {
    "sampleSize" : {
      "total" : 61363,
      "find" : 61363,
      "aggregate" : 0,
      "count" : 0,
      "distinct" : 0
    },
    "percentageOfSingleShardReads" : 50.0008148233,
    "percentageOfMultiShardReads" : 49.9991851768,
    "percentageOfScatterGatherReads" : 0,
    "numReadsByRange" : [
      688,
      775,
      737,
      776,
      652,
      671,
      1332,
      1407,
      535,
      428,
      985,
      573,
      1496,
      ...
    ],
  },
  "writeDistribution" : {
    "sampleSize" : {
      "total" : 49638,
      "update" : 30680,
      "delete" : 7500,
      "findAndModify" : 11458
    },
    "percentageOfSingleShardWrites" : 100,
    "percentageOfMultiShardWrites" : 0,
    "percentageOfScatterGatherWrites" : 0,
    "numWritesByRange" : [
      389,
      601,
      430,
      454,
      462,
      421,
      668,
      833,
      493,
      300,
      683,
      460,
      ...
    ],
    "percentageOfShardKeyUpdates" : 0,
    "percentageOfSingleWritesWithoutShardKey" : 0,
    "percentageOfMultiWritesWithoutShardKey" : 0
  }
}
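
The numReadsByRange and numWritesByRange arrays can be hard to judge by eye. One rough way to summarize them is the ratio of the busiest range to the mean across ranges, sketched below. rangeImbalance is a hypothetical helper and this ratio is an illustrative measure, not a metric returned by the method. The arrays hold only the entries shown in the output above, which is truncated, so the printed values describe the visible entries only.

```javascript
// Hypothetical helper: ratio of the busiest range to the mean across ranges.
// A value of 1 means perfectly even; larger values mean a hotter range.
function rangeImbalance(countsByRange) {
  if (countsByRange.length === 0) {
    throw new Error("countsByRange must not be empty");
  }
  const total = countsByRange.reduce((sum, n) => sum + n, 0);
  if (total === 0) return 0; // no sampled operations in any range
  const mean = total / countsByRange.length;
  return Math.max(...countsByRange) / mean;
}

// Only the entries visible in the example output; the real arrays are longer.
const shownReads = [688, 775, 737, 776, 652, 671, 1332, 1407, 535, 428, 985, 573, 1496];
const shownWrites = [389, 601, 430, 454, 462, 421, 668, 833, 493, 300, 683, 460];

console.log("reads:", rangeImbalance(shownReads).toFixed(2));
console.log("writes:", rangeImbalance(shownWrites).toFixed(2));
```

To assess a real result, pass the complete arrays from the method's return value instead of the truncated lists used here.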

Learn More