
$collStats (aggregation stage)

Definition

$collStats

Returns statistics regarding a collection or view.

The $collStats stage has the following prototype form:

{
  $collStats:
    {
      latencyStats: { histograms: <boolean> },
      storageStats: { scale: <number> },
      count: {},
      queryExecStats: {}
    }
}

The $collStats stage accepts an argument document with the following optional fields:

Field Name: Description
latencyStats: Adds latency statistics to the return document.
latencyStats.histograms: If true, adds latency histogram information to the embedded documents in latencyStats.
storageStats: Adds storage statistics to the return document.
  • Specify an empty document (i.e. storageStats: {}) to use the default scale factor of 1 for the various size data. A scale factor of 1 displays the returned sizes in bytes.
  • Specify the scale factor (i.e. storageStats: { scale: <number> }) to use the specified scale factor for the various size data. For example, to display kilobytes rather than bytes, specify a scale value of 1024.
    If you specify a non-integer scale factor, MongoDB uses the integer part of the specified factor. For example, if you specify a scale factor of 1023.999, MongoDB uses 1023 as the scale factor.
    The scale factor does not affect those sizes that specify the unit of measurement in the field name, such as "bytes currently in the cache".
count: Adds the total number of documents in the collection to the return document. The count is based on the collection's metadata, which provides a fast but sometimes inaccurate count for sharded clusters. See count Field.
queryExecStats: Adds query execution statistics to the return document.
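
The scale factor rule described for storageStats can be sketched in plain JavaScript. This is an illustration only: effectiveScale and scaleBytes are hypothetical helper names, and flooring the scaled size is an assumption of this sketch rather than documented server behavior.

```javascript
// Hypothetical helper: mirrors the documented rule that only the integer
// part of a non-integer scale factor is used.
function effectiveScale(scale) {
  return Math.trunc(scale);
}

// Illustration only: divide a byte count by the effective factor.
// Flooring the result is an assumption of this sketch.
function scaleBytes(bytes, scale) {
  return Math.floor(bytes / effectiveScale(scale));
}

console.log(effectiveScale(1023.999));    // 1023
console.log(scaleBytes(608500363, 1024)); // 594238 (kilobytes)
```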

For a collection in a replica set or a non-sharded collection in a cluster, $collStats outputs a single document. For a sharded collection, $collStats outputs one document per shard. The output document includes the following fields:

Field Name: Description
ns: The namespace of the requested collection or view.
shard: The name of the shard the output document corresponds to. Only present when $collStats runs on a sharded cluster. Both sharded and non-sharded collections produce this field.
host: The hostname and port of the mongod process which produced the output document.
localTime: The current time on the MongoDB server, expressed as UTC milliseconds since the UNIX epoch.
latencyStats: Statistics related to request latency for a collection or view. See latencyStats Document for details on this document. Only present when the latencyStats: {} option is specified.
storageStats: Statistics related to a collection's storage engine. See storageStats Document for details on this document. The various size data is scaled by the specified factor (with the exception of those sizes that specify the unit of measurement in the field name). Only present when the storageStats option is specified. Returns an error if applied to a view.
count: The total number of documents in the collection. This data is also available in storageStats.count. The count is based on the collection's metadata, which provides a fast but sometimes inaccurate count for sharded clusters. Only present when the count: {} option is specified. Returns an error if applied to a view.
queryExecStats: Statistics related to query execution for the collection. Only present when the queryExecStats: {} option is specified. Returns an error if applied to a view.

Behavior

$collStats must be the first stage in an aggregation pipeline, or else the pipeline returns an error.
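
As a rough illustration of the ordering rule, the following sketch contrasts a pipeline that places $collStats first with one that does not. The function collStatsIsFirstOrAbsent is a hypothetical client-side check written for this example; it is not part of the shell or any driver, and the server remains the authority on whether a pipeline is valid.

```javascript
// Valid: $collStats is the first stage; a later stage reshapes its output.
const validPipeline = [
  { $collStats: { count: {} } },
  { $project: { ns: 1, count: 1 } }
];

// Invalid: $collStats follows another stage, so the pipeline returns an error.
const invalidPipeline = [
  { $match: {} },
  { $collStats: { count: {} } }
];

// Hypothetical client-side check (not part of any driver API):
// true when $collStats is absent or appears only at index 0.
function collStatsIsFirstOrAbsent(pipeline) {
  return pipeline.every((stage, i) => !("$collStats" in stage) || i === 0);
}

console.log(collStatsIsFirstOrAbsent(validPipeline));   // true
console.log(collStatsIsFirstOrAbsent(invalidPipeline)); // false
```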

Accuracy After Unexpected Shutdown

After an unclean shutdown of a mongod using the WiredTiger storage engine, size and count statistics reported by $collStats may be inaccurate.

The amount of drift depends on the number of insert, update, or delete operations performed between the last checkpoint and the unclean shutdown. Checkpoints usually occur every 60 seconds. However, mongod instances running with non-default --syncdelay settings may have more or less frequent checkpoints.

Run validate on each collection on the mongod to restore statistics after an unclean shutdown.
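
One way to run validate across every collection of a database from mongosh is sketched below. It assumes `db` is the shell's current database object; validateAllCollections is a hypothetical helper name, and the error handling shown is just one possible choice.

```javascript
// Sketch for mongosh: `db` is the shell's database object.
// validateAllCollections is a hypothetical helper name.
function validateAllCollections(db) {
  const results = {};
  for (const name of db.getCollectionNames()) {
    try {
      results[name] = db.getCollection(name).validate();
    } catch (err) {
      // Record the failure and keep going, for example when a
      // namespace cannot be validated (such as a view).
      results[name] = { valid: false, error: String(err) };
    }
  }
  return results;
}
```

In mongosh, calling validateAllCollections(db) returns an object keyed by collection name. Repeat the call for each database on the mongod.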


Redaction

When using Queryable Encryption, $collStats output redacts certain information for encrypted collections:

  • The output omits "queryExecStats"
  • The output omits "latencyStats"
  • The output redacts "wiredTiger", if present, to include only the url field.

Transactions

$collStats is not allowed in transactions.

Output

latencyStats Document

The latencyStats embedded document exists in the output only if you specify the latencyStats option.

Field Name: Description
reads: Latency statistics for read requests.
writes: Latency statistics for write requests.
commands: Latency statistics for database commands.
transactions: Latency statistics for database transactions.

Each of these fields contains an embedded document with the following fields:

Field Name: Description
latency: The total latency, in microseconds.
ops: The total number of operations performed on the collection since startup.
histogram: An array of embedded documents, each representing a latency range. Each document covers twice the previous document's range. For lower values between 2048 microseconds and roughly 1 second, the histogram includes half-steps.

This field only exists given the latencyStats: { histograms: true } option. Empty ranges with a zero count are omitted from the output.

Each document has the following fields:

Field Name: Description
micros: The inclusive lower bound of the current latency range, in microseconds. The document's range spans from this micros value, inclusive, up to the lower bound of the next range, exclusive.
count: The number of operations whose latency falls within this range.

For example, if $collStats returns the following histogram:

histogram: [
{ micros: Long(0), count: Long(10) },
{ micros: Long(2), count: Long(1) },
{ micros: Long(4096), count: Long(1) },
{ micros: Long(16384), count: Long(1000) },
{ micros: Long(49152), count: Long(100) }
]
This indicates that there were [1]:
  • 10 operations taking less than 2 microseconds
  • 1 operation in the range [2, 4) microseconds
  • 1 operation in the range [4096, 6144) microseconds
  • 1000 operations in the range [16384, 24576) microseconds
  • 100 operations in the range [49152, 65536) microseconds
[1]
  • The [ symbol notation on this page means the value is inclusive.
  • The ) symbol notation on this page means the value is exclusive.
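
The mapping from each micros value to its latency range can be sketched as follows. This is an illustration under stated assumptions: bucket bounds are taken to be 0, then powers of two, with 1.5x half-steps starting at 2048 microseconds; the cutoff HALF_STEP_MAX (2^20 microseconds, "roughly 1 second") is a guess, and the function names are hypothetical. The shell returns Long values; plain numbers are used here for brevity.

```javascript
// Assumption: half-steps apply to power-of-two bounds in
// [2048, 2^20) microseconds. The exact cutoff is not stated on this page.
const HALF_STEP_MIN = 2048;
const HALF_STEP_MAX = 1048576;

function isPowerOfTwo(n) {
  let p = 1;
  while (p < n) p *= 2;
  return p === n;
}

// Exclusive upper bound of the bucket whose inclusive lower bound is `micros`.
function upperBound(micros) {
  if (micros === 0) return 2;
  // A non-power-of-two bound is a half-step (1.5 * 2^k); the next bound is 2^(k+1).
  if (!isPowerOfTwo(micros)) return (micros / 3) * 4;
  const hasHalfStep = micros >= HALF_STEP_MIN && micros < HALF_STEP_MAX;
  return hasHalfStep ? micros * 1.5 : micros * 2;
}

function describeHistogram(histogram) {
  return histogram.map(({ micros, count }) => ({
    range: `[${micros}, ${upperBound(micros)})`,
    count
  }));
}

const histogram = [
  { micros: 0, count: 10 },
  { micros: 2, count: 1 },
  { micros: 4096, count: 1 },
  { micros: 16384, count: 1000 },
  { micros: 49152, count: 100 }
];

console.log(describeHistogram(histogram).map((b) => b.range));
// [ '[0, 2)', '[2, 4)', '[4096, 6144)', '[16384, 24576)', '[49152, 65536)' ]
```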

High-Latency $lookup Operations

Some high-latency $lookup operations may not generate a slow query log for the foreign collection. This can occur because slow query logs correspond with operations that are reported in the database profiler, whereas latency metrics increment only when a collection lock is acquired.

If the $lookup query on a shard can perform a local read, the $lookup doesn't record a separate operation for querying the foreign collection. A local read refers to when the query on the foreign collection targets only the same shard where the current operation is being executed. As a result, the $lookup operation increases the $collStats latency metrics and operation counts, but does not generate a slow query log for the foreign collection.

storageStats Document

The storageStats embedded document exists in the output only if you specify the storageStats option.

The contents of this document are dependent on the storage engine in use. See Output for a reference on this document.

storageStats Output on Time Series Collections

When you run $collStats on a time series collection with the storageStats: {} option, the output includes time series data.

To learn more about the fields returned in the timeseries: {} document, see bucketCatalog.

Performing $collStats with the storageStats option on a view results in an error.

count Field

The count field exists in the output only if you specify the count option.

Note

The count is based on the collection's metadata, which provides a fast but sometimes inaccurate count for sharded clusters.

The total number of documents in the collection is also available as storageStats.count when storageStats: {} is specified. For more information, see storageStats Document.

queryExecStats Document

The queryExecStats embedded document exists in the output only if you specify the queryExecStats option. It includes an embedded collectionScans document with the following fields:

Field Name: Description
total: The total number of queries that performed a collection scan.
nonTailable: The number of queries that performed a collection scan, but didn't use a tailable cursor.

Examples

MongoDB Shell

latencyStats

If you run $collStats with the latencyStats: { histograms: true } option on a matrices collection:

db.matrices.aggregate( [ { $collStats: { latencyStats: { histograms: true } } } ] )

The query returns a result similar to the following:

{ "ns" : "test.matrices",
"host" : "mongo.example.net:27017",
"localTime" : ISODate("2017-10-06T19:43:56.599Z"),
"latencyStats" :
{ "reads" :
{ "histogram" : [
{ "micros" : Long(16),
"count" : Long(3) },
{ "micros" : Long(32),
"count" : Long(1) },
{ "micros" : Long(128),
"count" : Long(1) } ],
"latency" : Long(264),
"ops" : Long(5) },
"writes" :
{ "histogram" : [
{ "micros" : Long(32),
"count" : Long(1) },
{ "micros" : Long(64),
"count" : Long(3) },
{ "micros" : Long(24576),
"count" : Long(1) } ],
"latency" : Long(27659),
"ops" : Long(5) },
"commands" :
{ "histogram" : [
{
"micros" : Long(196608),
"count" : Long(1)
}
],
"latency" : Long(0),
"ops" : Long(0) },
"transactions" : {
"histogram" : [ ],
"latency" : Long(0),
"ops" : Long(0)
}
}
}
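
Because latency is a running total in microseconds and ops is a running count, dividing one by the other gives a mean latency per operation. The sketch below uses values copied from the sample output above, written as plain numbers instead of Long values; averageLatencyMicros is a hypothetical helper name.

```javascript
// Values copied from the sample output, written as plain numbers.
const latencyStats = {
  reads: { latency: 264, ops: 5 },
  writes: { latency: 27659, ops: 5 },
  transactions: { latency: 0, ops: 0 }
};

// Hypothetical helper: mean microseconds per operation,
// or null when no operations have been recorded.
function averageLatencyMicros({ latency, ops }) {
  return ops === 0 ? null : latency / ops;
}

console.log(averageLatencyMicros(latencyStats.reads));        // 52.8
console.log(averageLatencyMicros(latencyStats.writes));       // 5531.8
console.log(averageLatencyMicros(latencyStats.transactions)); // null
```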

storageStats

If you run $collStats with the storageStats: {} option on a matrices collection using the WiredTiger Storage Engine:

db.matrices.aggregate( [ { $collStats: { storageStats: { } } } ] )

The query returns a result similar to the following:

  {
localTime : 2020-03-06T01:44:57.437Z,
storageStats: {
size: 608500363,
count: 1104369,
avgObjectSize: 550,
storageSize: 4096,
freeStorageSize: 2490380,
capped: false,
wiredTiger : {
...
},
nindexes : 2,
indexDetails : {
...
},
indexBuilds: [
_id_1_abc_1
],
totalIndexSize: 260337664,
indexSizes: {
_id_ : 9891840,
_id_1_abc_1 : 250445824
},
totalSize: 613216256,
scaleFactor : 1
},
host : 'mongo.example.net:27017',
ns : 'test.matrices'
}

storageStats on Time Series Collections

The following command runs $collStats with the storageStats: {} option on a weather time series collection using the WiredTiger Storage Engine, and filters for only time series data:

db.weather.aggregate( [ { $collStats: { storageStats: { } } } ] ).toArray()[0].storageStats.timeseries

The query returns a result similar to the following, which includes time series data for internal diagnostic use:

{
bucketsNs: 'test.weather',
bucketCount: 12,
avgBucketSize: 300,
numActiveBuckets: 1,
numBucketInserts: 12,
numBucketUpdates: 0,
numBucketsOpenedDueToMetadata: 1,
numBucketsClosedDueToCount: 0,
numBucketsClosedDueToSchemaChange: 0,
numBucketsClosedDueToSize: 0,
numBucketsClosedDueToTimeForward: 11,
numBucketsClosedDueToMemoryThreshold: 0,
numCommits: 12,
numMeasurementsGroupCommitted: 0,
numWaits: 0,
numMeasurementsCommitted: 12,
avgNumMeasurementsPerCommit: 1,
numBucketsClosedDueToReopening: 0,
numBucketsArchivedDueToMemoryThreshold: 0,
numBucketsArchivedDueToTimeBackward: 0,
numBucketsReopened: 0,
numBucketsKeptOpenDueToLargeMeasurements: 0,
numBucketsClosedDueToCachePressure: 0,
numBucketsFrozen: 0,
numCompressedBucketsConvertedToUnsorted: 0,
numBucketsFetched: 0,
numBucketsQueried: 0,
numBucketFetchesFailed: 0,
numBucketQueriesFailed: 1,
numBucketReopeningsFailed: 0,
numDuplicateBucketsReopened: 0
}

Note

In-progress Indexes

The returned storageStats includes information about indexes being built.

count

If you run $collStats with the count: {} option on a matrices collection:

db.matrices.aggregate( [ { $collStats: { count: { } } } ] )

The query returns a result similar to the following:

{
"ns" : "test.matrices",
"host" : "mongo.example.net:27017",
"localTime" : ISODate("2017-10-06T19:43:56.599Z"),
"count" : 1103869
}

queryExecStats

If you run $collStats with the queryExecStats: {} option on a matrices collection:

db.matrices.aggregate( [ { $collStats: { queryExecStats: { } } } ] )

The query returns a result similar to the following:

{
"ns": "test.matrices",
"host": "mongo.example.net:27017",
"localTime": ISODate("2020-06-03T14:23:29.711Z"),
"queryExecStats": {
"collectionScans": {
"total": Long(33),
"nonTailable": Long(31)
}
}
}
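
If every counted collection scan either used a tailable cursor or did not (an assumption of this sketch, inferred from the field descriptions rather than stated outright), the number of scans that used a tailable cursor is the difference between the two fields. Plain numbers stand in for the Long values in the sample output.

```javascript
// Values copied from the sample output, written as plain numbers.
const collectionScans = { total: 33, nonTailable: 31 };

// Assumption: total = tailable scans + non-tailable scans.
const tailableScans = collectionScans.total - collectionScans.nonTailable;

console.log(tailableScans); // 2
```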

$collStats on Sharded Collections

$collStats outputs one document per shard when run on sharded collections. Each output document contains a shard field with the name of the shard the document corresponds to.

For example, if you run $collStats with the count: {} option on a sharded collection named matrices:

db.matrices.aggregate( [ { $collStats: { count: { } } } ] )

The query returns a result similar to the following:

{
"ns" : "test.matrices",
"shard" : "s1",
"host" : "s1-mongo1.example.net:27017",
"localTime" : ISODate("2017-10-06T15:14:21.258Z"),
"count" : 661705
}
{
"ns" : "test.matrices",
"shard" : "s2",
"host" : "s2-mongo1.example.net:27017",
"localTime" : ISODate("2017-10-06T15:14:21.258Z"),
"count" : 442164
}

Node.js

To use the MongoDB Node.js driver to add a $collStats stage to an aggregation pipeline, use the $collStats operator in a pipeline object.

The following examples demonstrate how to use the options available for the $collStats stage.

latencyStats

The following example creates and runs a $collStats pipeline stage with the latencyStats option:

const pipeline = [
  {
    $collStats: {
      latencyStats: { histograms: true }
    }
  }
];

const cursor = collection.aggregate(pipeline);
return cursor;

storageStats

The following example creates and runs a $collStats pipeline stage with the storageStats option:

const pipeline = [{ $collStats: { storageStats: {} } }]; 

const cursor = collection.aggregate(pipeline);
return cursor;

storageStats on Time Series Collections

The following example creates and runs a $collStats pipeline stage with the storageStats option on a time series collection and filters for only time series data:

const pipeline = [{ $collStats: { storageStats: {} } }];

const cursor = collection.aggregate(pipeline);
// Read the results into an array; assumes this code runs inside an async function.
const resultsTimeSeries = await cursor.toArray();
const timeSeriesStats = resultsTimeSeries[0].storageStats.timeseries;

return timeSeriesStats;

count

The following example runs a $collStats pipeline stage with the count option on a collection:

const pipeline = [{ $collStats: { count: {} } }]; 

const cursor = collection.aggregate(pipeline);
return cursor;

queryExecStats

The following example creates and runs a $collStats pipeline stage with the queryExecStats option:

const pipeline = [{ $collStats: { queryExecStats: {} } }]; 

const cursor = collection.aggregate(pipeline);
return cursor;

Note

Sharded Collections

$collStats outputs one document per shard when run on sharded collections. Each output document contains a shard field with the name of the shard the document corresponds to.
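
Since a sharded collection yields one document per shard, a collection-wide total needs the per-shard counts to be added together. The sketch below shows two possible approaches: a $group stage appended to the pipeline, and a client-side sum over documents already read from the cursor. The sample documents reuse the per-shard counts from the mongosh example on this page, and totalCount is a hypothetical helper name.

```javascript
// Server-side option: append a $group stage that sums the per-shard counts.
const pipeline = [
  { $collStats: { count: {} } },
  { $group: { _id: "$ns", count: { $sum: "$count" } } }
];

// Client-side option: sum documents already read from the cursor.
function totalCount(shardDocs) {
  return shardDocs.reduce((sum, doc) => sum + doc.count, 0);
}

// Sample per-shard documents, trimmed to the fields used here.
const shardDocs = [
  { ns: "test.matrices", shard: "s1", count: 661705 },
  { ns: "test.matrices", shard: "s2", count: 442164 }
];

console.log(totalCount(shardDocs)); // 1103869
```

The total is still derived from collection metadata, so it carries the same accuracy caveat as the individual counts.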