$sampleRate (aggregation)

Definition

$sampleRate

New in version 4.4.2.

Matches a random selection of input documents. The number of documents selected approximates the sample rate expressed as a percentage of the total number of documents.

The $sampleRate operator has the following syntax:

{ $sampleRate: <non-negative float> }

Behavior

The selection process uses a uniform random distribution. The sample rate is a floating point number between 0 and 1, inclusive, which represents the probability that a given document will be selected as it passes through the pipeline.

For example, a sample rate of 0.33 selects roughly one document in three.

This expression:

{ $match: { $sampleRate: 0.33 } }

is equivalent to using the $rand operator as follows:

{ $match: { $expr: { $lt: [ { $rand: {} }, 0.33 ] } } }

Repeated runs on the same data produce different outcomes because the selection process is non-deterministic. In general, smaller datasets show more variability in the number of documents selected on each run. As collection size increases, the number of documents chosen approaches the expected value for a uniform random distribution.
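
The per-document selection described above can be sketched outside MongoDB in plain JavaScript. This is an illustration only, not the server's implementation: the function names `simulateSampleRate` and `relativeSpread` are hypothetical, and `Math.random` stands in for the server's random source.

```javascript
// Count how many of n documents pass a per-document random test at the given rate.
// `rand` is injectable (an assumption made for this sketch) so the logic can be
// checked deterministically; by default it uses Math.random, which returns [0, 1).
function simulateSampleRate(n, rate, rand = Math.random) {
  let selected = 0;
  for (let i = 0; i < n; i++) {
    // Same comparison as { $lt: [ { $rand: {} }, rate ] }
    if (rand() < rate) selected++;
  }
  return selected;
}

// Spread of the selected count over several runs (max - min),
// relative to the expected count n * rate.
function relativeSpread(n, rate, runs = 20) {
  const counts = Array.from({ length: runs }, () => simulateSampleRate(n, rate));
  return (Math.max(...counts) - Math.min(...counts)) / (n * rate);
}

// Results differ on every run; the relative spread for the larger n
// is typically much smaller than for the smaller n.
console.log("N = 100:   ", relativeSpread(100, 0.33).toFixed(3));
console.log("N = 100000:", relativeSpread(100000, 0.33).toFixed(3));
```

Because each document is tested independently, the absolute count varies more as the collection grows, but the variation relative to the expected count shrinks.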

Note

If an exact number of documents is required from each run, use the $sample operator instead of $sampleRate.
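
As a minimal sketch of that alternative, the pipeline below uses the `$sample` stage with a fixed `size`. The variable name `exactSamplePipeline` and the size of 33 are hypothetical choices for illustration, picked to mirror a 0.33 rate on a 100-document collection.

```javascript
// $sample takes a fixed number of documents instead of a probability.
// The size of 33 is an illustrative assumption, not a value from the original page.
const exactSamplePipeline = [
  { $sample: { size: 33 } },
  { $count: "numMatches" }
];

// In the mongo shell, the pipeline would be run like this:
// db.collection.aggregate(exactSamplePipeline)
```

Unlike $sampleRate, the count here should not vary between runs, provided the collection holds at least `size` documents.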

Examples

This code creates a small collection with 100 documents.

N = 100
bulk = db.collection.initializeUnorderedBulkOp()
for ( i = 0; i < N; i++) { bulk.insert( {_id: i, r: 0} ) }
bulk.execute()

The $sampleRate operator can be used in a pipeline to select random documents from the collection. This example uses $sampleRate to select about one third of the documents.

db.collection.aggregate(
   [
     { $match: { $sampleRate: 0.33 } },
     { $count: "numMatches" }
   ]
)

This is the output from 5 runs on the sample collection:

{ "numMatches" : 38 }
{ "numMatches" : 36 }
{ "numMatches" : 29 }
{ "numMatches" : 29 }
{ "numMatches" : 28 }
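
To put these counts in context, the sketch below computes the expected count and standard deviation under a binomial model, assuming each document is selected independently with probability 0.33. The helper name `expectedSelection` is hypothetical and the code runs in plain JavaScript, outside MongoDB.

```javascript
// Expected count and standard deviation for n independent selections at probability p.
// Assumes a binomial model: each document is selected independently of the others.
function expectedSelection(n, p) {
  return { mean: n * p, stdDev: Math.sqrt(n * p * (1 - p)) };
}

const observed = [38, 36, 29, 29, 28]; // counts from the five runs above
const { mean, stdDev } = expectedSelection(100, 0.33);

// Distance of each observed count from the mean, in standard deviations.
const zScores = observed.map((count) => (count - mean) / stdDev);
console.log({ mean, stdDev, zScores });
```

For 100 documents at a rate of 0.33, the expected count is 33 with a standard deviation of roughly 4.7, so counts a few documents above or below 33 are unsurprising.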
