`analyzeShardKey` (database command)

Definition

analyzeShardKey: New in version 7.0.

Calculates metrics for evaluating a shard key for an unsharded or sharded collection. Metrics are based on sampled queries. You can use configureQueryAnalyzer to configure query sampling on a collection.

Compatibility

This command is available in deployments hosted in the following environments:

MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud

Note

This command is supported in all MongoDB Atlas clusters. For information on Atlas support for all commands, see Unsupported Commands.

MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Syntax

analyzeShardKey has this syntax:

db.collection.analyzeShardKey(
   <shardKey>,
   {
     keyCharacteristics: <bool>,
     readWriteDistribution: <bool>,
     sampleRate: <double>,
     sampleSize: <int>
   }
 )

Command Fields

Field	Type	Necessity	Description
`shardKey`	document	Required	Shard key to analyze. This can be a candidate shard key for an unsharded or sharded collection, or the current shard key for a sharded collection. There is no default value.
`keyCharacteristics`	boolean	Optional	Whether or not the metrics about the characteristics of the shard key are calculated. For details, see keyCharacteristics. Defaults to `true`.
`readWriteDistribution`	boolean	Optional	Whether or not the metrics about the read and write distribution are calculated. For details, see readWriteDistribution. Defaults to `true`. To return read and write distribution metrics for a collection using `analyzeShardKey`, you must configure the query analyzer to sample the queries run on the collection. Otherwise, `analyzeShardKey` returns the read and write distribution metrics as `0` values. To configure the query analyzer, see configureQueryAnalyzer (database command).
`sampleRate`	double	Optional	The proportion of the documents in the collection to sample when calculating the metrics about the characteristics of the shard key. If you set `sampleRate`, you cannot set `sampleSize`. Must be greater than `0`, up to and including `1`. There is no default value.
`sampleSize`	integer	Optional	The number of documents to sample when calculating the metrics about the characteristics of the shard key. If you set `sampleSize`, you cannot set `sampleRate`. If neither `sampleSize` nor `sampleRate` is specified, the sample size defaults to the value set by `analyzeShardKeyCharacteristicsDefaultSampleSize`.
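
The constraints in the field table can be checked on the client before sending the command. The following is a minimal sketch, not part of MongoDB: `validateAnalyzeShardKeyOptions` is a hypothetical helper name, and the requirement that `sampleSize` be a positive integer is an assumption (the table only says "integer").

```javascript
// Hypothetical helper: checks analyzeShardKey options against the rules in the
// field table. It does not contact a server.
function validateAnalyzeShardKeyOptions(options = {}) {
  const errors = [];
  const { sampleRate, sampleSize } = options;

  // sampleRate and sampleSize are mutually exclusive.
  if (sampleRate !== undefined && sampleSize !== undefined) {
    errors.push("sampleRate and sampleSize cannot both be set");
  }
  // sampleRate must be in the interval (0, 1].
  if (sampleRate !== undefined && !(sampleRate > 0 && sampleRate <= 1)) {
    errors.push("sampleRate must be greater than 0, up to and including 1");
  }
  // Assumption: a usable sampleSize is a positive integer.
  if (sampleSize !== undefined && !(Number.isInteger(sampleSize) && sampleSize > 0)) {
    errors.push("sampleSize must be a positive integer");
  }
  return errors;
}

console.log(validateAnalyzeShardKeyOptions({ sampleRate: 0.5 }));
console.log(validateAnalyzeShardKeyOptions({ sampleRate: 1, sampleSize: 10000 }));
```

An empty array means the options satisfy the documented rules; the server remains the authority on what it accepts.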

Behavior

analyzeShardKey returns different metrics depending on the keyCharacteristics and readWriteDistribution values you specify when you run the method.

Metrics About Shard Key Characteristics

keyCharacteristics consists of the metrics about the cardinality, frequency, and monotonicity of the shard key. These metrics are only returned when keyCharacteristics is true.

The metrics are calculated when analyzeShardKey is run based on documents sampled from the collection. The calculation requires the shard key to have a supporting index. If there is no supporting index, no metrics are returned.

You can configure sampling with the sampleRate and sampleSize fields. Both are optional, but only one can be specified. When both sampleRate and sampleSize are unspecified, MongoDB uses the value of the analyzeShardKeyCharacteristicsDefaultSampleSize parameter, which has a default value of 10 million.

To calculate metrics based on all documents in the collection, set the sampleRate to 1.

Metrics About the Read and Write Distribution

readWriteDistribution contains metrics about the query routing patterns and the hotness of shard key ranges. These metrics are based on sampled queries.

To configure query sampling for a collection, use the configureQueryAnalyzer command. The read and write distribution metrics are only returned if readWriteDistribution is true. The metrics are calculated when analyzeShardKey is run and the metrics use the sampled read and write queries. If there are no sampled queries, read and write distribution metrics aren't returned.

If there are no sampled read queries, the command returns writeDistribution but omits readDistribution.
If there are no sampled write queries, the command returns readDistribution but omits writeDistribution.

To return read and write distribution metrics for a collection using analyzeShardKey, you must configure the query analyzer to sample the queries run on the collection. Otherwise, analyzeShardKey returns the read and write distribution metrics as 0 values. To configure the query analyzer, see configureQueryAnalyzer (database command).

`keyCharacteristics` Value	`readWriteDistribution` Value	Results Returned
`true`	`false`	`analyzeShardKey` returns `keyCharacteristics` metrics and omits `readWriteDistribution` metrics. If the shard key doesn't have a supporting index, `analyzeShardKey` returns an `IllegalOperation` error.
`false`	`true`	`analyzeShardKey` returns `readWriteDistribution` metrics and omits `keyCharacteristics` metrics.
`true`	`true`	`analyzeShardKey` returns both `readWriteDistribution` metrics and `keyCharacteristics` metrics. If the shard key doesn't have a supporting index, `analyzeShardKey` returns `readWriteDistribution` metrics and omits `keyCharacteristics` metrics.
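
The table can be read as a small decision function. The sketch below only restates the three rows; `expectedResult` is a hypothetical name, and the case where both flags are `false` is marked as not covered because the table does not describe it.

```javascript
// Hypothetical helper that encodes the result table as a lookup.
function expectedResult(keyCharacteristics, readWriteDistribution, hasSupportingIndex) {
  if (keyCharacteristics && !readWriteDistribution) {
    // Without a supporting index this combination is an error.
    return hasSupportingIndex
      ? { returns: ["keyCharacteristics"] }
      : { error: "IllegalOperation" };
  }
  if (!keyCharacteristics && readWriteDistribution) {
    return { returns: ["readWriteDistribution"] };
  }
  if (keyCharacteristics && readWriteDistribution) {
    // Without a supporting index, keyCharacteristics is silently omitted.
    return hasSupportingIndex
      ? { returns: ["keyCharacteristics", "readWriteDistribution"] }
      : { returns: ["readWriteDistribution"] };
  }
  // Both flags false: the table does not describe this combination.
  return { notCovered: true };
}

console.log(expectedResult(true, false, false));
console.log(expectedResult(true, true, false));
```

The practical difference is in the missing-index case: asking only for `keyCharacteristics` fails, while asking for both degrades to `readWriteDistribution` alone.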

Non-Blocking Behavior

analyzeShardKey does not block reads or writes to the collection.

Query Sampling

The quality of the metrics about the read and write distribution is determined by how representative the workload is when query sampling occurs. For some applications, returning representative metrics may require leaving query sampling on for several days.

Supporting Indexes

The supporting index required by analyzeShardKey is different from the supporting index required by the shardCollection command.

This table shows the supporting indexes for the same shard key for both analyzeShardKey and shardCollection:

Command	Shard Key	Supporting Indexes
`analyzeShardKey`	`{ a.x: 1, b: "hashed" }`	`{ a.x: 1, b: 1, ... }` `{ a.x: "hashed", b: 1, ... }` `{ a.x: 1, b: "hashed", ... }` `{ a.x: "hashed", b: "hashed", ...}`
`shardCollection`	`{ a.x: 1, b: "hashed" }`	`{ a.x: 1, b: "hashed", ... }`

This allows you to analyze a shard key that may not yet have a supporting index required for sharding it.

Both analyzeShardKey and shardCollection have the following index requirements:

Index has a simple collation
Index is not multi-key
Index is not sparse
Index is not partial

To create supporting indexes, use the db.collection.createIndex() method.
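
The difference between the two rows of the supporting-index table can be sketched as a prefix comparison. This is an interpretation inferred from the table, not the server's actual check: `indexSupportsShardKey` is a hypothetical helper, and it ignores the collation, multi-key, sparse, and partial requirements listed above.

```javascript
// Hypothetical check inferred from the supporting-index table.
// An index supports the shard key when its leading fields match the shard
// key's fields in order. For shardCollection the type of each field
// (1 or "hashed") must match too; for analyzeShardKey only field names matter.
function indexSupportsShardKey(indexKey, shardKey, { command }) {
  const indexFields = Object.entries(indexKey);
  const shardFields = Object.entries(shardKey);
  if (indexFields.length < shardFields.length) return false;

  return shardFields.every(([field, type], i) => {
    const [indexField, indexType] = indexFields[i];
    if (indexField !== field) return false;
    return command === "analyzeShardKey" ? true : indexType === type;
  });
}

const shardKey = { "a.x": 1, b: "hashed" };
const candidateIndex = { "a.x": "hashed", b: 1 };

console.log(indexSupportsShardKey(candidateIndex, shardKey, { command: "analyzeShardKey" }));
console.log(indexSupportsShardKey(candidateIndex, shardKey, { command: "shardCollection" }));
```

Under this reading, the candidate index is enough to analyze the key but not to shard on it, which matches the table.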

Read Preference

To minimize the performance impact, run analyzeShardKey with the secondary or secondaryPreferred read preference. On a sharded cluster, mongos automatically sets the read preference to secondaryPreferred if not specified.

Limitations

You cannot run analyzeShardKey on Atlas flex clusters.
You cannot run analyzeShardKey on standalone deployments.
You cannot run analyzeShardKey directly against a --shardsvr replica set. When running on a sharded cluster, analyzeShardKey must run against a mongos.
You cannot run analyzeShardKey against time series collections.
You cannot run analyzeShardKey against collections with Queryable Encryption.

Access Control

analyzeShardKey requires one of the following:

enableSharding privilege action against the collection being analyzed.
clusterManager role against the cluster.

Output

analyzeShardKey returns information regarding keyCharacteristics and readWriteDistribution.

keyCharacteristics provides metrics about the cardinality, frequency, and monotonicity of the shard key.
readWriteDistribution provides metrics about query routing patterns and the hotness of shard key ranges.

`keyCharacteristics`

This is the structure of the keyCharacteristics document that is returned when keyCharacteristics is set to true:

{
   keyCharacteristics: {
      numDocsTotal: <integer>,
      numOrphanDocs: <integer>,
      avgDocSizeBytes: <integer>,
      numDocsSampled: <integer>,
      isUnique: <bool>,
      numDistinctValues: <integer>,
      mostCommonValues: [
        { value: <shardkeyValue>, frequency: <integer> },
        ...
      ],
      monotonicity: {
        recordIdCorrelationCoefficient: <double>,
        type: "monotonic"|"not monotonic"|"unknown"
      }
   }
}

Field	Type	Description	Usage
`numDocsTotal`	integer	The number of documents in the collection.
`numOrphanDocs`	integer	The number of orphan documents.	Orphan documents are not excluded from metrics calculation for performance reasons. If `numOrphanDocs` is large relative to `numDocsTotal`, consider waiting to run the command until the number of orphan documents is very small compared to the total number of documents in the collection.
`avgDocSizeBytes`	integer	The average size of documents in the collection, in bytes.	If `numDocsTotal` is comparable to `numDocsSampled`, you can estimate the size of the largest chunks by multiplying the `frequency` of each `mostCommonValues` entry by `avgDocSizeBytes`.
`numDocsSampled`	integer	The number of sampled documents.
`numDistinctValues`	integer	The number of distinct shard key values.	Choose a shard key with a large `numDistinctValues` since the number of distinct shard key values is the maximum number of chunks that the balancer can create.
`isUnique`	boolean	Indicates whether the shard key is unique. This is only set to `true` if there is a unique index for the shard key.	If the shard key is unique, then the number of distinct values is equal to the number of documents.
`mostCommonValues`	array of documents	An array of the value and `frequency` (number of documents) of each of the most common shard key values.	The frequency of a shard key value is the minimum number of documents in the chunk containing that value. If the frequency is large, then the chunk can become a bottleneck for storage, reads and writes. Choose a shard key where the frequency for each most common value is low relative to `numDocsSampled`. The number of most common shard key values can be configured by setting `analyzeShardKeyNumMostCommonValues`, which defaults to `5`. To avoid exceeding the 16MB BSON size limit for the response, each value is set to "truncated" if its size exceeds 15MB / `analyzeShardKeyNumMostCommonValues`.
`mostCommonValues[n].value`	document	The shard key value.
`mostCommonValues[n].frequency`	integer	The number of documents for a given shard key value.	Choose a shard key where the frequency for each most common value is low relative to `numDocsSampled`.
`monotonicity.recordIdCorrelationCoefficient`	double	Only set if the monotonicity is known.	This is set to `"unknown"` when one of the following is true: The shard key does not have a supporting index per the `shardCollection` definition. The collection is clustered. The shard key is a hashed compound shard key where the hashed field is not the first field. The monotonicity check can return an incorrect result if the collection has gone through chunk migrations. Chunk migration deletes documents from the donor shard and re-inserts them on the recipient shard. There is no guarantee that the insertion order from the client is preserved. You can configure the threshold for the correlation coefficient with `analyzeShardKeyMonotonicityCorrelationCoefficientThreshold`.
`monotonicity.type`	string	Can be one of: `"monotonic"`, `"not monotonic"`, `"unknown"`	Avoid a shard key with type `"monotonic"` unless you do not expect to insert new documents often. If a collection is sharded on a shard key that is monotonically increasing or decreasing, new documents will be inserted onto the shard that owns the `MaxKey` or `MinKey` chunk. That shard can become the bottleneck for inserts and the data will likely be unbalanced most of the time since the balancer will need to compete with the inserts that come in.
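
Two of the calculations mentioned in the table are simple enough to sketch: the chunk-size estimate (`frequency` multiplied by `avgDocSizeBytes`) and the truncation limit (15MB divided by `analyzeShardKeyNumMostCommonValues`). The helper names are hypothetical, the sample figures are borrowed from the `{ lastName: 1 }` example on this page, and treating 1MB as 1024 * 1024 bytes is an assumption.

```javascript
// Estimate the bytes held by the chunk containing each most common value.
// Only meaningful when numDocsSampled is comparable to numDocsTotal.
function estimateLargestChunkBytes(keyCharacteristics) {
  const { avgDocSizeBytes, mostCommonValues } = keyCharacteristics;
  return mostCommonValues.map(({ value, frequency }) => ({
    value,
    estimatedBytes: frequency * avgDocSizeBytes,
  }));
}

// Size above which a value would be reported as "truncated".
// Assumption: 1 MB is taken as 1024 * 1024 bytes here.
function truncationThresholdBytes(numMostCommonValues = 5) {
  return (15 * 1024 * 1024) / numMostCommonValues;
}

const sample = {
  avgDocSizeBytes: 153,
  mostCommonValues: [
    { value: { lastName: "Smith" }, frequency: 1013 },
    { value: { lastName: "Johnson" }, frequency: 984 },
  ],
};

console.log(estimateLargestChunkBytes(sample));
console.log(truncationThresholdBytes());
```

With these figures the chunk containing "Smith" holds at least 1013 * 153 = 154989 bytes, which is small in absolute terms; what matters for a real collection is how that estimate scales with data volume.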

`readWriteDistribution`

This is the structure of the document that is returned when readWriteDistribution is set to true:

{
   readDistribution: {
     sampleSize: {
       total: <integer>,
       find: <integer>,
       aggregate: <integer>,
       count: <integer>,
       distinct: <integer>
     },
     percentageOfSingleShardReads: <double>,
     percentageOfMultiShardReads: <double>,
     percentageOfScatterGatherReads: <double>,
     numReadsByRange: [
       <integer>,
       ...
     ]
   },
   writeDistribution: {
     sampleSize: {
       total: <integer>,
       update: <integer>,
       delete: <integer>,
       findAndModify: <integer>
     },
     percentageOfSingleShardWrites: <double>,
     percentageOfMultiShardWrites: <double>,
     percentageOfScatterGatherWrites: <double>,
     numWritesByRange: [
       <integer>,
       ...
     ],
     percentageOfShardKeyUpdates: <double>,
     percentageOfSingleWritesWithoutShardKey: <double>,
     percentageOfMultiWritesWithoutShardKey: <double>
   }
}

To return read and write distribution metrics for a collection using analyzeShardKey, you must configure the query analyzer to sample the queries run on the collection. Otherwise, analyzeShardKey returns the read and write distribution metrics as 0 values. To configure the query analyzer, see configureQueryAnalyzer (database command).

`readDistribution` Fields

Field	Type	Description	Usage
`sampleSize.total`	integer	Total number of sampled read queries.
`sampleSize.find`	integer	Total number of sampled `find` queries.
`sampleSize.aggregate`	integer	Total number of sampled `aggregate` queries.
`sampleSize.count`	integer	Total number of sampled `count` queries.
`sampleSize.distinct`	integer	Total number of sampled `distinct` queries.
`percentageOfSingleShardReads`	double	Percentage of reads that target a single shard, regardless of how the data is distributed.
`percentageOfMultiShardReads`	double	Percentage of reads that target multiple shards.	This category includes the reads that may target only a single shard if the data is distributed such that the values targeted by the read fall under a single shard. If the queries operate on a large amount of data, then targeting multiple shards instead of one may result in a decrease in latency due to the parallel query execution.
`percentageOfScatterGatherReads`	double	Percentage of reads that are scatter-gather, regardless of how the data is distributed.	Avoid a shard key with a high value for this metric. While scatter-gather queries are low-impact on the shards that do not have the target data, they still have some performance impact. On a cluster with a large number of shards, scatter-gather queries perform significantly worse than queries that target a single shard.
`numReadsByRange`	array of integers	Array of numbers representing the number of times that each range sorted from `MinKey` to `MaxKey` is targeted.	Avoid a shard key where the distribution of `numReadsByRange` is very skewed since that implies that there is likely to be one or more hot shards for reads. Choose a shard key where the sum of `numReadsByRange` is similar to `sampleSize.total`. The number of ranges can be configured using the `analyzeShardKeyNumRanges` parameter, which defaults to `100`. The value is `100` because the goal is to find a shard key that scales up to 100 shards.
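
The two `numReadsByRange` checks in the table (skew, and the sum compared with `sampleSize.total`) can be sketched as follows. `summarizeRanges` is a hypothetical helper, the max-to-mean ratio is just one possible skew measure chosen for illustration, and the input arrays are made-up toy data. The function assumes a non-empty array.

```javascript
// Hypothetical summary of numReadsByRange (works the same for numWritesByRange).
function summarizeRanges(countsByRange, sampleSizeTotal) {
  const sum = countsByRange.reduce((acc, n) => acc + n, 0);
  const max = Math.max(...countsByRange);
  const mean = sum / countsByRange.length;
  return {
    sum,
    // 1 means perfectly even; larger values mean a hotter busiest range.
    maxToMeanRatio: max / mean,
    // Close to 1 when the sum is similar to sampleSize.total.
    sumToSampleSizeRatio: sum / sampleSizeTotal,
  };
}

// Toy data: an even distribution and a skewed one, both from 40 sampled reads.
console.log(summarizeRanges([10, 10, 10, 10], 40));
console.log(summarizeRanges([37, 1, 1, 1], 40));
```

A `sumToSampleSizeRatio` well above 1 would suggest that many reads target several ranges each, which fits the guidance to keep the sum close to `sampleSize.total`.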

`writeDistribution` Fields

Field	Type	Description	Usage
`sampleSize.total`	integer	Total number of sampled write queries.
`sampleSize.update`	integer	Total number of sampled `update` queries.
`sampleSize.delete`	integer	Total number of sampled `delete` queries.
`sampleSize.findAndModify`	integer	Total number of sampled `findAndModify` queries.
`percentageOfSingleShardWrites`	double	Percentage of writes that target a single shard, regardless of how the data is distributed.
`percentageOfMultiShardWrites`	double	Percentage of writes that target multiple shards.	This category includes the writes that may target only a single shard if the data is distributed such that the values targeted by the write fall under a single shard.
`percentageOfScatterGatherWrites`	double	Percentage of writes that are scatter-gather, regardless of how the data is distributed.	Avoid a shard key with a high value for this metric because it is generally more performant for a write to target a single shard.
`numWritesByRange`	array of integers	Array of numbers representing the number of times that each range sorted from `MinKey` to `MaxKey` is targeted.	Avoid a shard key where the distribution of `numWritesByRange` is very skewed since that implies that there is likely to be one or more hot shards for writes. Choose a shard key where the sum of `numWritesByRange` is similar to `sampleSize.total`. The number of ranges can be configured using the `analyzeShardKeyNumRanges` parameter, which defaults to `100`. The value is `100` because the goal is to find a shard key that scales up to 100 shards.
`percentageOfShardKeyUpdates`	double	Percentage of write queries that update a document's shard key value.	Avoid a shard key with a high `percentageOfShardKeyUpdates`. Updates to a document's shard key value may cause the document to move to a different shard, which requires executing an internal transaction on the shard that the query targets. For details on changing a document's shard key value, see Change a Shard Key. Updates are currently only supported as retryable writes or in a transaction, and have a batch size limit of `1`.
`percentageOfSingleWritesWithoutShardKey`	double	The percentage of write queries that are `multi=false` and not targetable to a single shard.	Avoid a shard key with a high value for this metric. Performing this type of write is expensive because it can involve running internal transactions.
`percentageOfMultiWritesWithoutShardKey`	double	The percentage of write queries that are `multi=true` and not targetable to a single shard.	Avoid a shard key with a high value for this metric.
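
The "avoid a high value" guidance in the table can be turned into a quick review of a `writeDistribution` document. This is an illustrative sketch only: `flagWriteMetrics` is a hypothetical helper, and the 10 percent default threshold is an arbitrary placeholder, not a MongoDB recommendation.

```javascript
// Hypothetical review of a writeDistribution document. The default threshold
// is an arbitrary illustration; pick one that suits your workload.
function flagWriteMetrics(writeDistribution, thresholdPercent = 10) {
  // Metrics the field table says to keep low.
  const metricsToKeepLow = [
    "percentageOfScatterGatherWrites",
    "percentageOfShardKeyUpdates",
    "percentageOfSingleWritesWithoutShardKey",
    "percentageOfMultiWritesWithoutShardKey",
  ];
  // Missing metrics are treated as 0.
  return metricsToKeepLow.filter(
    (name) => (writeDistribution[name] ?? 0) > thresholdPercent
  );
}

// Made-up figures for illustration.
console.log(
  flagWriteMetrics({
    percentageOfScatterGatherWrites: 0,
    percentageOfShardKeyUpdates: 25,
    percentageOfSingleWritesWithoutShardKey: 3,
    percentageOfMultiWritesWithoutShardKey: 40,
  })
);
```

The function returns the names of the metrics that exceed the threshold, so an empty array means none of the four stood out.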

Examples

Consider a simplified version of a social media app. The collection we are trying to shard is the post collection.

Documents in the post collection have the following schema:

{
   userId: <uuid>,
   firstName: <string>,
   lastName: <string>,
   body: <string>,  // the field that can be modified.
   date: <date>,    // the field that can be modified.
}

Background Information

The app has 1500 users.
There are 30 last names and 45 first names, some more common than others.
There are three celebrity users.
Each user follows exactly five other users and has a very high probability of following at least one celebrity user.

Sample Workload

Each user posts about two posts a day at random times. They edit each post once, right after it is posted.
Each user logs in every six hours to read their own profile and posts by the users they follow from the past 24 hours. They also reply under a random post from the past three hours.
For every user, the app removes posts that are more than three days old at midnight.

Workload Query Patterns

This workload has the following query patterns:

find command with filter { userId: , firstName: , lastName: }
find command with filter { $or: [{ userId: , firstName: , lastName:, date: { $gte: }, ] }
findAndModify command with filter { userId: , firstName: , lastName: , date: } to update the body and date field.
update command with multi: false and filter { userId: , firstName: , lastName: , date: { $gte: , $lt: } } to update the body and date field.
delete command with multi: true and filter { userId: , firstName: , lastName: , date: { $lt: } }

Below are example metrics returned by the analyzeShardKey command for some candidate shard keys, with sampled queries collected from seven days of workload.

Note

Before you run analyzeShardKey commands, read the Supporting Indexes section earlier on this page. If you require supporting indexes for the shard key you are analyzing, use the db.collection.createIndex() method to create the indexes.

`{ _id: 1 } keyCharacteristics`

This example uses the analyzeShardKey command to provide metrics on the { _id: 1 } shard key on the social.post collection.

The following code block uses db.collection.configureQueryAnalyzer() to turn on query sampling:

use social
db.post.configureQueryAnalyzer(
   {
      mode: "full",
      samplesPerSecond: 5
   }
)

After db.collection.configureQueryAnalyzer() collects query samples, the following code block uses the analyzeShardKey command to sample 10,000 documents and calculate results:

use social
db.post.analyzeShardKey(
   { _id: 1 },
   {
      keyCharacteristics: true,
      readWriteDistribution: false,
      sampleSize: 10000
   }
)

`{ lastName: 1 } keyCharacteristics`

This analyzeShardKey command provides metrics on the { lastName: 1 } shard key on the social.post collection:

use social
db.post.analyzeShardKey(
   { lastName: 1 },
   {
      keyCharacteristics: true,
      readWriteDistribution: false
   }
)

The output for this example resembles the following:

{
   "keyCharacteristics": {
     "numDocsTotal" : 9039,
     "avgDocSizeBytes" : 153,
     "numDocsSampled" : 9039,
     "isUnique" : false,
     "numDistinctValues" : 30,
     "mostCommonValues" : [
         {
           "value" : {
               "lastName" : "Smith"
           },
           "frequency" : 1013
         },
         {
           "value" : {
               "lastName" : "Johnson"
           },
           "frequency" : 984
         },
         {
           "value" : {
               "lastName" : "Jones"
           },
           "frequency" : 962
         },
         {
           "value" : {
               "lastName" : "Brown"
           },
           "frequency" : 925
         },
         {
           "value" : {
               "lastName" : "Davies"
           },
           "frequency" : 852
         }
     ],
     "monotonicity" : {
       "recordIdCorrelationCoefficient" : 0.0771959161,
       "type" : "not monotonic"
     }
   }
}

`{ userId: 1 } keyCharacteristics`

This analyzeShardKey command provides metrics on the { userId: 1 } shard key on the social.post collection:

use social
db.post.analyzeShardKey(
   { userId: 1 },
   {
      keyCharacteristics: true,
      readWriteDistribution: false
   }
)

The output for this example resembles the following:

{
  "keyCharacteristics": {
    "numDocsTotal" : 9039,
    "avgDocSizeBytes" : 162,
    "numDocsSampled" : 9039,
    "isUnique" : false,
    "numDistinctValues" : 1495,
    "mostCommonValues" : [
      {
        "value" : {
          "userId" : UUID("aadc3943-9402-4072-aae6-ad551359c596")
        },
        "frequency" : 15
      },
     {
       "value" : {
         "userId" : UUID("681abd2b-7a27-490c-b712-e544346f8d07")
       },
       "frequency" : 14
     },
     {
       "value" : {
         "userId" : UUID("714cb722-aa27-420a-8d63-0d5db962390d")
       },
       "frequency" : 14
     },
     {
       "value" : {
         "userId" : UUID("019a4118-b0d3-41d5-9c0a-764338b7e9d1")
       },
       "frequency" : 14
     },
     {
       "value" : {
         "userId" : UUID("b9c9fbea-3c12-41aa-bc69-eb316047a790")
       },
       "frequency" : 14
     }
    ],
    "monotonicity" : {
      "recordIdCorrelationCoefficient" : -0.0032039729,
      "type" : "not monotonic"
    }
  }
}
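
To compare the `{ lastName: 1 }` and `{ userId: 1 }` candidates, one option is to look at how much of the sample the single most common value accounts for. The sketch below uses figures copied from the two example outputs above (only the top value of each is kept); `topValueShare` is a hypothetical helper, not part of the command's output.

```javascript
// Share of sampled documents that carry the single most common shard key value.
function topValueShare(keyCharacteristics) {
  const top = Math.max(
    ...keyCharacteristics.mostCommonValues.map((v) => v.frequency)
  );
  return top / keyCharacteristics.numDocsSampled;
}

// Figures copied from the example outputs (top value only).
const lastNameKey = {
  numDocsSampled: 9039,
  numDistinctValues: 30,
  mostCommonValues: [{ frequency: 1013 }],
};
const userIdKey = {
  numDocsSampled: 9039,
  numDistinctValues: 1495,
  mostCommonValues: [{ frequency: 15 }],
};

console.log((topValueShare(lastNameKey) * 100).toFixed(1) + "%");
console.log((topValueShare(userIdKey) * 100).toFixed(1) + "%");
```

For `lastName` the top value covers roughly 11.2% of the sampled documents, against about 0.2% for `userId`. Together with 30 versus 1495 distinct values, this points to `userId` as the candidate with better cardinality and frequency.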

`{ userId: 1 } readWriteDistribution`

use social
db.post.analyzeShardKey(
   { userId: 1 },
   {
      keyCharacteristics: false,
      readWriteDistribution: true
   }
)

The output for this example resembles the following:

{
   "readDistribution" : {
     "sampleSize" : {
       "total" : 61363,
       "find" : 61363,
       "aggregate" : 0,
       "count" : 0,
       "distinct" : 0
     },
     "percentageOfSingleShardReads" : 50.0008148233,
     "percentageOfMultiShardReads" : 49.9991851768,
     "percentageOfScatterGatherReads" : 0,
     "numReadsByRange" : [
       688,
       775,
       737,
       776,
       652,
       671,
       1332,
       1407,
       535,
       428,
       985,
       573,
       1496,
       ...
       ],
     },
   "writeDistribution" : {
     "sampleSize" : {
       "total" : 49638,
       "update" : 30680,
       "delete" : 7500,
       "findAndModify" : 11458
     },
     "percentageOfSingleShardWrites" : 100,
     "percentageOfMultiShardWrites" : 0,
     "percentageOfScatterGatherWrites" : 0,
     "numWritesByRange" : [
       389,
       601,
       430,
       454,
       462,
       421,
       668,
       833,
       493,
       300,
       683,
       460,
       ...
      ],
      "percentageOfShardKeyUpdates" : 0,
      "percentageOfSingleWritesWithoutShardKey" : 0,
      "percentageOfMultiWritesWithoutShardKey" : 0
    }
}
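
In the example output, the per-command sample sizes add up to `sampleSize.total`, and the single-shard, multi-shard, and scatter-gather percentages add up to 100 within rounding. The sketch below checks both properties. `checkDistribution` is a hypothetical helper, and the assumption that these sums hold for every result is drawn only from this example.

```javascript
// Hypothetical consistency checks for one side of a readWriteDistribution result.
// kind is "Reads" or "Writes" and selects the percentageOf...<kind> fields.
function checkDistribution(dist, kind) {
  const { total, ...perCommand } = dist.sampleSize;
  const commandSum = Object.values(perCommand).reduce((acc, n) => acc + n, 0);
  const percentSum =
    dist[`percentageOfSingleShard${kind}`] +
    dist[`percentageOfMultiShard${kind}`] +
    dist[`percentageOfScatterGather${kind}`];
  return {
    sampleSizesAddUp: commandSum === total,
    // Allow for rounding in the reported percentages.
    percentagesAddUp: Math.abs(percentSum - 100) < 1e-6,
  };
}

// Figures copied from the example output above.
const readDistribution = {
  sampleSize: { total: 61363, find: 61363, aggregate: 0, count: 0, distinct: 0 },
  percentageOfSingleShardReads: 50.0008148233,
  percentageOfMultiShardReads: 49.9991851768,
  percentageOfScatterGatherReads: 0,
};
const writeDistribution = {
  sampleSize: { total: 49638, update: 30680, delete: 7500, findAndModify: 11458 },
  percentageOfSingleShardWrites: 100,
  percentageOfMultiShardWrites: 0,
  percentageOfScatterGatherWrites: 0,
};

console.log(checkDistribution(readDistribution, "Reads"));
console.log(checkDistribution(writeDistribution, "Writes"));
```

Read against the field tables, this output shows every sampled write targeting a single shard and no scatter-gather reads, while about half of the reads target multiple shards.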
