$shardedDataDistribution (aggregation stage)

Definition

$shardedDataDistribution

New in version 6.0.3.

Returns information on the distribution of data in sharded collections.

Note

This aggregation stage is only available on mongos.

This aggregation stage must be run on the admin database. The user must have the shardedDataDistribution privilege action.

Syntax

The $shardedDataDistribution stage has the following syntax:

db.aggregate( [
   { $shardedDataDistribution: { } }
] )

Output Fields

The $shardedDataDistribution stage outputs an array of documents for each sharded collection in the database. These documents contain the following fields:

| Field Name | Data Type | Description |
| --- | --- | --- |
| ns | string | Namespace of the sharded collection. |
| shards | array | Shards in the collection with the data distribution information for each shard. |
| shards.numOrphanedDocs | integer | Number of orphaned documents in the shard. For time series collections, numOrphanedDocs contains the number of orphaned measurement buckets in the shard. |
| shards.numOwnedDocuments | integer | Number of documents owned by the shard. For time series collections, numOwnedDocuments contains the number of measurement buckets in the shard. |
| shards.ownedSizeBytes | integer | Size in bytes of documents owned by the shard when uncompressed. |
| shards.orphanedSizeBytes | integer | Size in bytes of orphaned documents in the shard when uncompressed. |

Starting in MongoDB 8.0, $shardedDataDistribution only returns output for a collection's primary shard if the primary shard has chunks or orphaned documents.
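
The output fields can be combined on the client to see how evenly a collection is spread across shards. The following is a minimal sketch, not part of the stage itself: summarizeDistribution is a hypothetical helper name, and it assumes each shard entry also carries the shardName field that appears in the example output on this page.

```javascript
// Summarize one $shardedDataDistribution output document.
// Assumption: `doc` has the fields listed in the table above, plus
// `shardName` on each entry of `shards`.
function summarizeDistribution(doc) {
  const totalOwnedBytes = doc.shards.reduce((sum, s) => sum + s.ownedSizeBytes, 0);
  const totalOwnedDocs = doc.shards.reduce((sum, s) => sum + s.numOwnedDocuments, 0);
  return {
    ns: doc.ns,
    totalOwnedBytes,
    totalOwnedDocs,
    shards: doc.shards.map((s) => ({
      shardName: s.shardName,
      // Fraction of the collection's owned bytes held by this shard
      // (0 when the collection holds no owned data at all).
      ownedShare: totalOwnedBytes === 0 ? 0 : s.ownedSizeBytes / totalOwnedBytes,
    })),
  };
}

// Illustrative input shaped like the stage's output.
const sampleDoc = {
  ns: "test.names",
  shards: [
    { shardName: "shard-1", numOrphanedDocs: 0, numOwnedDocuments: 6, ownedSizeBytes: 366, orphanedSizeBytes: 0 },
    { shardName: "shard-2", numOrphanedDocs: 0, numOwnedDocuments: 6, ownedSizeBytes: 366, orphanedSizeBytes: 0 },
  ],
};

const summary = summarizeDistribution(sampleDoc);
console.log(summary.totalOwnedBytes); // 732
```

The sizes are uncompressed byte counts, so the shares describe logical data volume rather than on-disk usage.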

Behavior

After an unclean shutdown of a mongod using the WiredTiger storage engine, size and count statistics reported by $shardedDataDistribution may be inaccurate.

The amount of drift depends on the number of insert, update, or delete operations performed between the last checkpoint and the unclean shutdown. Checkpoints usually occur every 60 seconds. However, mongod instances running with non-default --syncdelay settings may have more or less frequent checkpoints.

Run validate on each collection on the mongod to restore statistics after an unclean shutdown.
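
One way to script that step with the Node.js driver is sketched below. It is an illustration under assumptions, not a prescribed procedure: validateAllCollections is a hypothetical helper, and db is assumed to be a driver Db handle connected directly to the affected mongod rather than to mongos. Validation can be expensive on large collections, so treat this as a starting point.

```javascript
// Run the validate command on every regular collection in one database.
// Assumption: `db` is a Node.js driver Db handle for the affected mongod.
async function validateAllCollections(db) {
  const infos = await db.listCollections({}, { nameOnly: true }).toArray();
  const results = [];
  for (const info of infos) {
    // Skip views and other entries that are not regular collections.
    if (info.type && info.type !== "collection") continue;
    const reply = await db.command({ validate: info.name });
    results.push({ name: info.name, valid: reply.valid });
  }
  return results;
}
```

Call the helper once per database that holds sharded data, and repeat it on each mongod that shut down uncleanly.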

Examples

MongoDB Shell

Return All Sharded Data Distribution Metrics

To return all sharded data distribution metrics, run the following:

db.aggregate([
   { $shardedDataDistribution: { } }
])

Example output:

[
  {
    "ns": "test.names",
    "shards": [
      {
        "shardName": "shard-1",
        "numOrphanedDocs": 0,
        "numOwnedDocuments": 6,
        "ownedSizeBytes": 366,
        "orphanedSizeBytes": 0
      },
      {
        "shardName": "shard-2",
        "numOrphanedDocs": 0,
        "numOwnedDocuments": 6,
        "ownedSizeBytes": 366,
        "orphanedSizeBytes": 0
      }
    ]
  }
]

Return Metrics for a Specific Shard

To return sharded data distribution metrics for a specific shard, run the following:

db.aggregate([
   { $shardedDataDistribution: { } },
   { $match: { "shards.shardName": "<name of the shard>" } }
])
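
Because shards is an array, the $match above keeps every document in which at least one array element names that shard, and each kept document still lists all of its shards. If only the entry for the requested shard is wanted, one option is to unwind the array before matching. The sketch below builds such a pipeline; perShardPipeline is a hypothetical helper name, and "shard-1" is taken from the example output above.

```javascript
// Build a pipeline that returns one document per namespace, containing
// only the entry for the requested shard.
function perShardPipeline(shardName) {
  return [
    { $shardedDataDistribution: {} },
    // Produce one document per (namespace, shard) pair.
    { $unwind: "$shards" },
    // Keep only the documents for the requested shard.
    { $match: { "shards.shardName": shardName } },
  ];
}

console.log(JSON.stringify(perShardPipeline("shard-1"), null, 2));
```

In the shell, the resulting array can be passed to db.aggregate() while connected to the admin database. After $unwind, shards is a single embedded document rather than an array.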

Return Metrics for a Namespace

To return sharded data distribution metrics for a namespace, run the following:

db.aggregate([
   { $shardedDataDistribution: { } },
   { $match: { "ns": "<database>.<collection>" } }
])

Confirm No Orphaned Documents Remain

Starting in MongoDB 6.0.3, you can run an aggregation using the $shardedDataDistribution stage to confirm no orphaned documents remain:

db.aggregate([
   { $shardedDataDistribution: { } },
   { $match: { "ns": "<database>.<collection>" } }
])

$shardedDataDistribution has output similar to the following:

[
  {
    "ns": "test.names",
    "shards": [
      {
        "shardName": "shard-1",
        "numOrphanedDocs": 0,
        "numOwnedDocuments": 6,
        "ownedSizeBytes": 366,
        "orphanedSizeBytes": 0
      },
      {
        "shardName": "shard-2",
        "numOrphanedDocs": 0,
        "numOwnedDocuments": 6,
        "ownedSizeBytes": 366,
        "orphanedSizeBytes": 0
      }
    ]
  }
]

Ensure that "numOrphanedDocs" is 0 for each shard in the cluster.
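
On clusters with many collections or shards, that check is easier to script than to read by eye. The following sketch assumes the stage's output has already been collected into an array (for example with toArray()); findShardsWithOrphans is a hypothetical helper name.

```javascript
// Return every (namespace, shard) entry that still reports orphaned documents.
// An empty result means numOrphanedDocs is 0 everywhere in `output`.
function findShardsWithOrphans(output) {
  const offenders = [];
  for (const doc of output) {
    for (const shard of doc.shards) {
      if (shard.numOrphanedDocs > 0) {
        offenders.push({
          ns: doc.ns,
          shardName: shard.shardName,
          numOrphanedDocs: shard.numOrphanedDocs,
        });
      }
    }
  }
  return offenders;
}

// Illustrative data shaped like the output above.
const orphanCheckInput = [
  {
    ns: "test.names",
    shards: [
      { shardName: "shard-1", numOrphanedDocs: 0, numOwnedDocuments: 6, ownedSizeBytes: 366, orphanedSizeBytes: 0 },
      { shardName: "shard-2", numOrphanedDocs: 0, numOwnedDocuments: 6, ownedSizeBytes: 366, orphanedSizeBytes: 0 },
    ],
  },
];

console.log(findShardsWithOrphans(orphanCheckInput).length === 0); // true
```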

Node.js

To use the MongoDB Node.js driver to add a $shardedDataDistribution stage to an aggregation pipeline, use the $shardedDataDistribution operator in a pipeline object.

Return All Sharded Data Distribution Metrics

The following example creates a pipeline stage that returns information about the distribution of data in sharded collections. The example then runs the aggregation pipeline:

// Snippet from inside a function; `client` is a connected MongoClient.
const pipeline = [{ $shardedDataDistribution: {} }];

// The stage must run against the admin database.
const adminDb = client.db("admin");
const cursor = adminDb.aggregate(pipeline);
return cursor;

Return Metrics for a Specific Shard

The following example returns information about the distribution of data for a specific shard:

// Snippet from inside a function; `client` is a connected MongoClient.
const pipeline = [
  { $shardedDataDistribution: {} },
  { $match: { "shards.shardName": "atlas-kn29y8-shard-0" } }
];

const adminDb = client.db("admin");
const cursor = adminDb.aggregate(pipeline);
return cursor;

Return Metrics for a Namespace

The following example returns information about the distribution of data for a specific namespace:

// Snippet from inside a function; `client` is a connected MongoClient.
const pipeline = [
  { $shardedDataDistribution: {} },
  { $match: { "ns": "sample_mflix.movies" } }
];

const adminDb = client.db("admin");
const cursor = adminDb.aggregate(pipeline);
return cursor;
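
The snippets above return a cursor and leave consuming it to the caller. The sketch below wraps the same calls in a function that reads the results into an array. It is illustrative only: getDistribution is a hypothetical name, and client is assumed to be a MongoClient that is already connected to a mongos with a suitably privileged user.

```javascript
// Fetch $shardedDataDistribution results as an array of documents.
// Assumption: `client` is a connected MongoClient; `ns` is optional and,
// when given, restricts the output to one "<database>.<collection>" namespace.
async function getDistribution(client, ns) {
  const pipeline = [{ $shardedDataDistribution: {} }];
  if (ns !== undefined) {
    pipeline.push({ $match: { ns } });
  }
  // The stage must run against the admin database.
  return client.db("admin").aggregate(pipeline).toArray();
}
```

For example, await getDistribution(client, "sample_mflix.movies") resolves to the documents for that namespace, or to an empty array if the collection is not sharded.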