distinct (database command)

Definition

distinct

Finds the distinct values for a specified field across a single collection. distinct returns a document that contains an array of the distinct values. The returned document also contains an embedded document with query statistics and the query plan.

Tip

In mongosh, this command can also be run through the db.collection.distinct() helper method.

Helper methods are convenient for mongosh users, but they may not return the same level of information as database commands. In cases where the convenience is not needed or the additional return fields are required, use the database command.

Compatibility

This command is available in deployments hosted in the following environments:

  • MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud

Important

This command has limited support in M0 and Flex clusters. For more information, see Unsupported Commands.

  • MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
  • MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Syntax

The command has the following syntax:

db.runCommand(
   {
      distinct: "<collection>",
      key: "<field>",
      query: <query>,
      readConcern: <read concern document>,
      collation: <collation document>,
      comment: <any>,
      hint: <string or document>
   }
)

Command Fields

The command takes the following fields:

Field | Type | Description
distinct | string | The name of the collection to query for distinct values.
key | string | The field for which to return distinct values.
query | document | Optional. A query that specifies the documents from which to retrieve the distinct values.
readConcern | document | Optional. Specifies the read concern.

The readConcern option has the following syntax:

readConcern: { level: <value> }

Possible read concern levels are:

  • "local". This is the default read concern level for read operations against the primary and secondaries.
  • "available". Available for read operations against the primary and secondaries. "available" behaves the same as "local" against the primary and non-sharded secondaries. The query returns the instance's most recent data.
  • "majority". Available for replica sets that use the WiredTiger storage engine.
  • "linearizable". Available for read operations on the primary only.
  • "snapshot". Available for multi-document transactions and certain read operations outside of multi-document transactions.

For more information on the read concern levels, see Read Concern Levels.

collation | document | Optional. Specifies the collation to use for the operation.

Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

The collation option has the following syntax:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

When specifying collation, the locale field is mandatory; all other collation fields are optional. For descriptions of the fields, see Collation Document.

If the collation is unspecified but the collection has a default collation (see db.createCollection()), the operation uses the collation specified for the collection.

If no collation is specified for the collection or for the operation, MongoDB uses the simple binary comparison used in prior versions for string comparisons.

You cannot specify multiple collations for an operation. For example, you cannot specify different collations per field, or if performing a find with a sort, you cannot use one collation for the find and another for the sort.

comment | any | Optional. A user-provided comment to attach to this command. Once set, this comment appears alongside records of this command in the following locations:

A comment can be any valid BSON type (string, integer, object, array, etc).

hint | string or document | Optional. Specifies the index to use, either by name (a string) or by key pattern (a document). If specified, the query planner only considers plans using the hinted index. For more details, see Specify an Index.

New in version 7.1.
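
As a rough illustration of the field list above, the following plain JavaScript sketch assembles a distinct command document and adds the optional fields only when they are supplied. The helper name buildDistinctCommand is hypothetical; it is not part of mongosh or any driver. The resulting object is the kind of document you would pass to db.runCommand().

```javascript
// Hypothetical helper (not a MongoDB API): builds a distinct command document.
function buildDistinctCommand(collection, key, options = {}) {
  if (typeof collection !== "string" || typeof key !== "string") {
    throw new TypeError("collection and key must be strings");
  }
  // The command name (distinct) is kept as the first key of the document.
  const command = { distinct: collection, key: key };
  const optionalFields = ["query", "readConcern", "collation", "comment", "hint"];
  for (const field of optionalFields) {
    // Optional fields are copied only when the caller provided them.
    if (options[field] !== undefined) {
      command[field] = options[field];
    }
  }
  return command;
}

const command = buildDistinctCommand("inventory", "item.sku", {
  query: { dept: "A" },
  comment: "distinct SKUs for department A"
});
console.log(JSON.stringify(command));
// In mongosh, the document could then be run with: db.runCommand(command)
```

Leaving unset options out of the document, rather than sending them as null or undefined, keeps the command identical to one written by hand.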

Note

Results must not be larger than the maximum BSON size. If your results exceed the maximum BSON size, use the aggregation pipeline to retrieve distinct values using the $group operator, as described in Retrieve Distinct Values with the Aggregation Pipeline.
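
The $group alternative mentioned in this note might look like the following sketch. The helper name distinctPipeline is hypothetical, and the $unwind stage is an assumption added here so that array elements are grouped individually; edge cases such as null values or empty arrays may not match the distinct command exactly.

```javascript
// Hypothetical helper (not a MongoDB API): builds a pipeline that yields one
// result document per distinct value instead of a single large array.
function distinctPipeline(field, query = {}) {
  return [
    // Restrict the input documents, like the query field of distinct.
    { $match: query },
    // Expand array values so that each element is grouped on its own.
    { $unwind: "$" + field },
    // One output document per distinct value, stored in _id.
    { $group: { _id: "$" + field } }
  ];
}

const pipeline = distinctPipeline("sizes", { dept: "A" });
console.log(JSON.stringify(pipeline));
// In mongosh, the pipeline could be run with: db.inventory.aggregate(pipeline)
```

Because the aggregation returns its results through a cursor, each distinct value arrives as its own document rather than inside one array.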

MongoDB also provides the shell wrapper method db.collection.distinct() for the distinct command. Additionally, many MongoDB drivers provide a wrapper method. Refer to the specific driver documentation.

Behavior

In a sharded cluster, the distinct command may return orphaned documents.

For time series collections, the distinct command can't make efficient use of indexes. Instead, use a $group aggregation to group documents by distinct values. For details, see Time Series Limitations.

Array Fields

If the value of the specified field is an array, distinct considers each element of the array as a separate value.

For instance, if a field has as its value [ 1, [1], 1 ], then distinct considers 1, [1], and 1 as separate values.
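
The array rule can be imitated in plain JavaScript. This is only a sketch of the described behavior, not MongoDB's implementation: the function name distinctValues is hypothetical, it handles top-level fields only, and it uses JSON text as a simplified equality test.

```javascript
// Sketch of the array rule: the top-level array is expanded one level,
// while nested arrays such as [1] are kept as values in their own right.
function distinctValues(documents, field) {
  const seen = new Map();
  for (const doc of documents) {
    if (!(field in doc)) continue;
    const value = doc[field];
    const candidates = Array.isArray(value) ? value : [value];
    for (const candidate of candidates) {
      // JSON text serves as a simple equality key for this sketch.
      const keyText = JSON.stringify(candidate);
      if (!seen.has(keyText)) seen.set(keyText, candidate);
    }
  }
  return Array.from(seen.values());
}

console.log(JSON.stringify(distinctValues([{ a: [1, [1], 1] }], "a")));
// prints [1,[1]]
```

The three elements are examined separately, and the repeated 1 then collapses into a single entry, leaving 1 and [1] as the distinct values.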

Starting in MongoDB 6.0, the distinct command returns the same results for collections and views when using arrays.

For examples, see Return Distinct Values for an Array Field and Arrays in Collections and Views.

Index Use

When possible, distinct operations can use indexes.

Indexes can also cover distinct operations. See Run Covered Queries for more information on queries covered by indexes.

Transactions

To perform a distinct operation within a transaction:

Important

In most cases, a distributed transaction incurs a greater performance cost over single document writes, and the availability of distributed transactions should not be a replacement for effective schema design. For many scenarios, the denormalized data model (embedded documents and arrays) will continue to be optimal for your data and use cases. That is, for many scenarios, modeling your data appropriately will minimize the need for distributed transactions.

For additional transactions usage considerations (such as runtime limit and oplog size limit), see also Production Considerations.

Client Disconnection

If the client that issued distinct disconnects before the operation completes, MongoDB marks distinct for termination using killOp.

Replica Set Member State Restriction

To run on a replica set member, distinct operations require the member to be in PRIMARY or SECONDARY state. If the member is in another state, such as STARTUP2, the operation errors.

Index Filters and Collations

Starting in MongoDB 6.0, an index filter uses the collation previously set using the planCacheSetFilter command.

Starting in MongoDB 8.0, use query settings instead of adding index filters. Index filters are deprecated starting in MongoDB 8.0.

Query settings have more functionality than index filters. Also, index filters aren't persistent and you cannot easily create index filters for all cluster nodes. To add query settings and explore examples, see setQuerySettings.

Query Settings

New in version 8.0.

You can use query settings to set index hints, set operation rejection filters, and set other fields. The settings apply to the query shape on the entire cluster. The cluster retains the settings after shutdown.

The query optimizer uses the query settings as an additional input during query planning, which affects the plan selected to run the query. You can also use query settings to block a query shape.

To add query settings and explore examples, see setQuerySettings.

You can add query settings for find, distinct, and aggregate commands.

Query settings have more functionality and are preferred over deprecated index filters.

To remove query settings, use removeQuerySettings. To obtain the query settings, use a $querySettings stage in an aggregation pipeline.

Examples

The examples use the inventory collection that contains the following documents:

{ "_id": 1, "dept": "A", "item": { "sku": "111", "color": "red" }, "sizes": [ "S", "M" ] }
{ "_id": 2, "dept": "A", "item": { "sku": "111", "color": "blue" }, "sizes": [ "M", "L" ] }
{ "_id": 3, "dept": "B", "item": { "sku": "222", "color": "blue" }, "sizes": "S" }
{ "_id": 4, "dept": "A", "item": { "sku": "333", "color": "black" }, "sizes": [ "S" ] }

Return Distinct Values for a Field

The following example returns the distinct values for the field dept from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "dept" } )

The command returns a document with a field named values that contains the distinct dept values:

{
   "values" : [ "A", "B" ],
   "ok" : 1
}

Return Distinct Values for an Embedded Field

The following example returns the distinct values for the field sku, embedded in the item field, from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "item.sku" } )

The command returns a document with a field named values that contains the distinct sku values:

{
   "values" : [ "111", "222", "333" ],
   "ok" : 1
}

Tip

See Dot Notation for information on accessing fields within embedded documents.

Return Distinct Values for an Array Field

The following example returns the distinct values for the field sizes from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "sizes" } )

The command returns a document with a field named values that contains the distinct sizes values:

{
   "values" : [ "M", "S", "L" ],
   "ok" : 1
}

For information on distinct and array fields, see the Behavior section.

Arrays in Collections and Views

Starting in MongoDB 6.0, the distinct command returns the same results for collections and views when using arrays.

The following example creates a collection named sensor with an array of temperature values for each document:

db.sensor.insertMany( [
   { _id: 0, temperatures: [ { value: 1 }, { value: 4 } ] },
   { _id: 1, temperatures: [ { value: 2 }, { value: 8 } ] },
   { _id: 2, temperatures: [ { value: 3 }, { value: 12 } ] },
   { _id: 3, temperatures: [ { value: 1 }, { value: 4 } ] }
] )

The following example creates a view named sensorView from the sensor collection:

db.createView( "sensorView", "sensor", [] )

The following example uses distinct to return the unique values from the temperatures array in the sensor collection:

db.sensor.distinct( "temperatures.1.value" )

The 1 in temperatures.1.value is the zero-based index into the temperatures array, so the path selects the value field of the second element in each document.

Example output:

[ 4, 8, 12 ]
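
To see why the sensor documents give this output, the path lookup can be imitated in plain JavaScript. This is a simplified sketch, not the server's path-matching logic: the function name resolvePath is hypothetical, and it only treats a numeric path part as an array index.

```javascript
// Sketch only: resolves a dotted path where numeric parts index into arrays.
function resolvePath(doc, path) {
  let current = doc;
  for (const part of path.split(".")) {
    if (current === null || current === undefined) return undefined;
    current = Array.isArray(current) && /^\d+$/.test(part)
      ? current[Number(part)]
      : current[part];
  }
  return current;
}

// The same documents that the example inserts into the sensor collection.
const sensor = [
  { _id: 0, temperatures: [ { value: 1 }, { value: 4 } ] },
  { _id: 1, temperatures: [ { value: 2 }, { value: 8 } ] },
  { _id: 2, temperatures: [ { value: 3 }, { value: 12 } ] },
  { _id: 3, temperatures: [ { value: 1 }, { value: 4 } ] }
];

const values = [];
for (const doc of sensor) {
  const value = resolvePath(doc, "temperatures.1.value");
  // Keep each value once; documents 0 and 3 both contribute 4.
  if (value !== undefined && !values.includes(value)) values.push(value);
}
console.log(values); // prints [ 4, 8, 12 ]
```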

Example for sensorView:

db.sensorView.distinct( "temperatures.1.value" )

Example output:

  • [ 4, 8, 12 ] starting in MongoDB 6.0 (identical to the result returned from the sensor collection).
  • [] in MongoDB versions earlier than 6.0.

Specify Query with distinct

The following example returns the distinct values for the field sku, embedded in the item field, from the documents whose dept is equal to "A":

db.runCommand ( { distinct: "inventory", key: "item.sku", query: { dept: "A"} } )

The command returns a document with a field named values that contains the distinct sku values:

{
   "values" : [ "111", "333" ],
   "ok" : 1
}

Specify a Collation

Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

A collection myColl has the following documents:

{ _id: 1, category: "café", status: "A" }
{ _id: 2, category: "cafe", status: "a" }
{ _id: 3, category: "cafE", status: "a" }

The following distinct operation includes the collation option:

db.runCommand(
   {
      distinct: "myColl",
      key: "category",
      collation: { locale: "fr", strength: 1 }
   }
)

For descriptions of the collation fields, see Collation Document.
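
The effect of a strength 1 comparison on these category values can be approximated outside the database. The sketch below uses the standard JavaScript Intl.Collator with sensitivity "base", on the assumption that this roughly corresponds to collation strength 1 (differences in case and accents are ignored). It needs a runtime with ICU locale data, and it does not show which of the equivalent strings the server actually returns.

```javascript
// Assumption: sensitivity "base" approximates collation strength 1.
const collator = new Intl.Collator("fr", { sensitivity: "base" });

// The category values from the myColl documents.
const categories = ["café", "cafe", "cafE"];

// Keep one representative per group of strings that compare as equal.
const representatives = [];
for (const category of categories) {
  const known = representatives.some(
    (kept) => collator.compare(kept, category) === 0
  );
  if (!known) representatives.push(category);
}

console.log(representatives.length); // prints 1
```

Under this comparison the three spellings fall into a single group, whereas a simple binary comparison would treat them as three different values.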

Override Default Read Concern

To override the default read concern level of "local", use the readConcern option.

The following operation on a replica set specifies a read concern of "majority" to read the most recent copy of the data confirmed as having been written to a majority of the nodes.

Note

Regardless of the read concern level, the most recent data on a node may not reflect the most recent version of the data in the system.

db.runCommand(
   {
      distinct: "restaurants",
      key: "rating",
      query: { cuisine: "italian" },
      readConcern: { level: "majority" }
   }
)

To ensure that a single thread can read its own writes, use "majority" read concern and "majority" write concern against the primary of the replica set.

Specify an Index

You can specify an index name or pattern using the hint option.

To specify a hint based on an index name:

db.runCommand ( { distinct: "inventory", key: "dept", hint: "sizes" } )

To specify a hint based on an index pattern:

db.runCommand ( { distinct: "inventory", key: "dept", hint: { sizes: 1 } } )