distinct

Definition

distinct

Finds the distinct values for a specified field across a single collection. distinct returns a document that contains an array of the distinct values. The return document also contains an embedded document with query statistics and the query plan.

The command takes the following form:

{
  distinct: "<collection>",
  key: "<field>",
  query: <query>,
  readConcern: <read concern document>,
  collation: <collation document>,
  comment: <any>
}

The command contains the following fields:

Field | Type | Description
distinct | string | The name of the collection to query for distinct values.
key | string | The field for which to return distinct values.
query | document | Optional. A query that specifies the documents from which to retrieve the distinct values.
readConcern | document

Optional. Specifies the read concern.

Starting in MongoDB 3.6, the readConcern option has the following syntax: readConcern: { level: <value> }

Possible read concern levels are:

  • "local". This is the default read concern level for read operations against the primary and secondaries.
  • "available". Available for read operations against the primary and secondaries. "available" behaves the same as "local" against the primary and non-sharded secondaries. The query returns the instance's most recent data.
  • "majority". Available for replica sets that use the WiredTiger storage engine.
  • "linearizable". Available for read operations on the primary only.

For more information on the read concern levels, see Read Concern Levels.

collation | document

Optional.

Specifies the collation to use for the operation.

Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

The collation option has the following syntax:

collation: {
   locale: <string>,
   caseLevel: <boolean>,
   caseFirst: <string>,
   strength: <int>,
   numericOrdering: <boolean>,
   alternate: <string>,
   maxVariable: <string>,
   backwards: <boolean>
}

When specifying collation, the locale field is mandatory; all other collation fields are optional. For descriptions of the fields, see Collation Document.

If the collation is unspecified but the collection has a default collation (see db.createCollection()), the operation uses the collation specified for the collection.

If no collation is specified for the collection or for the operation, MongoDB uses the simple binary comparison used in prior versions for string comparisons.

You cannot specify multiple collations for an operation. For example, you cannot specify different collations per field, or if performing a find with a sort, you cannot use one collation for the find and another for the sort.

comment | any

Optional. A user-provided comment to attach to this command. Once set, this comment appears alongside records of this command in the following locations:

A comment can be any valid BSON type (string, integer, object, array, etc.).

New in version 4.4.

Note

Results must not be larger than the maximum BSON size. If your results exceed the maximum BSON size, use the aggregation pipeline to retrieve distinct values using the $group operator, as described in Retrieve Distinct Values with the Aggregation Pipeline.

MongoDB also provides the shell wrapper method db.collection.distinct() for the distinct command. Additionally, many MongoDB drivers provide a wrapper method. Refer to the specific driver documentation.
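The $group alternative mentioned in the note can be sketched as a small pipeline builder. This is only an illustration: distinctPipeline is a hypothetical helper, not part of MongoDB or its drivers, and the $unwind stage is an assumption added so that array elements are grouped individually, as distinct does. Edge cases such as null values or empty arrays may behave differently from the distinct command.

```javascript
// Hypothetical helper: builds an aggregation pipeline that approximates
// the distinct command for a single key. Not a MongoDB API.
function distinctPipeline(key, query = {}) {
  const path = "$" + key;
  return [
    { $match: query },          // plays the role of the `query` field of distinct
    { $unwind: path },          // assumption: split arrays so each element is grouped on its own
    { $group: { _id: path } }   // one output document per distinct value
  ];
}

const pipeline = distinctPipeline("item.sku", { dept: "A" });
console.log(JSON.stringify(pipeline, null, 2));
```

In mongosh, the result could be passed to db.inventory.aggregate(pipeline). Each distinct value then arrives as the _id of its own result document rather than inside one array, which is why the single-document size limit no longer applies to the result set as a whole.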

Behavior

In a sharded cluster, the distinct command may return orphaned documents.

Array Fields

If the value of the specified field is an array, distinct considers each element of the array as a separate value.

For instance, if a field has as its value [ 1, [1], 1 ], then distinct considers 1, [1], and 1 as separate values.

For an example, see Return Distinct Values for an Array Field.
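The array rule can be illustrated with a small in-memory sketch in plain JavaScript. It is not how the server implements distinct: distinctValues is a hypothetical function, it only handles top-level field names (no dot notation), and it compares values by their JSON form as a simplification. The order of values returned by the server may also differ from the first-seen order used here.

```javascript
// In-memory sketch of the array rule (hypothetical, not MongoDB's implementation).
function distinctValues(docs, key) {
  const seen = new Map(); // JSON form -> original value, in first-seen order
  for (const doc of docs) {
    if (!(key in doc)) continue; // documents without the field contribute nothing
    const value = doc[key];
    // An array contributes each of its elements; only the top level is unwound,
    // so a nested array such as [1] stays a value in its own right.
    const candidates = Array.isArray(value) ? value : [value];
    for (const v of candidates) {
      const id = JSON.stringify(v); // simplification: equality by JSON form
      if (!seen.has(id)) seen.set(id, v);
    }
  }
  return [...seen.values()];
}

const docs = [
  { _id: 1, sizes: ["S", "M"] },
  { _id: 2, sizes: ["M", "L"] },
  { _id: 3, sizes: "S" },
  { _id: 4, sizes: ["S"] }
];

console.log(distinctValues(docs, "sizes"));                // [ 'S', 'M', 'L' ]
console.log(distinctValues([{ f: [1, [1], 1] }], "f"));    // [ 1, [ 1 ] ]
```

The second call mirrors the [ 1, [1], 1 ] case: the two occurrences of 1 collapse into one value, while [1] is kept as a separate value.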

Index Use

When possible, distinct operations can use indexes.

Indexes can also cover distinct operations. See Covered Query for more information on queries covered by indexes.

Transactions

To perform a distinct operation within a transaction:

Important

In most cases, a multi-document transaction incurs a greater performance cost than single-document writes, and the availability of multi-document transactions should not be a replacement for effective schema design. For many scenarios, the denormalized data model (embedded documents and arrays) will continue to be optimal for your data and use cases. That is, for many scenarios, modeling your data appropriately will minimize the need for multi-document transactions.

For additional transactions usage considerations (such as runtime limit and oplog size limit), see also Production Considerations.

Client Disconnection

Starting in MongoDB 4.2, if the client that issued the distinct disconnects before the operation completes, MongoDB marks the distinct for termination (i.e. killOp on the operation).

Replica Set Member State Restriction

Starting in MongoDB 4.4, to run on a replica set member, distinct operations require the member to be in PRIMARY or SECONDARY state. If the member is in another state, such as STARTUP2, the operation errors.

In previous versions, the operations can also be run when the member is in STARTUP2. However, the operations wait until the member transitions to RECOVERING.

Examples

The examples use the inventory collection that contains the following documents:

{ "_id": 1, "dept": "A", "item": { "sku": "111", "color": "red" }, "sizes": [ "S", "M" ] }
{ "_id": 2, "dept": "A", "item": { "sku": "111", "color": "blue" }, "sizes": [ "M", "L" ] }
{ "_id": 3, "dept": "B", "item": { "sku": "222", "color": "blue" }, "sizes": "S" }
{ "_id": 4, "dept": "A", "item": { "sku": "333", "color": "black" }, "sizes": [ "S" ] }

Return Distinct Values for a Field

The following example returns the distinct values for the field dept from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "dept" } )

The command returns a document with a field named values that contains the distinct dept values:

{
   "values" : [ "A", "B" ],
   "ok" : 1
}

Return Distinct Values for an Embedded Field

The following example returns the distinct values for the field sku, embedded in the item field, from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "item.sku" } )

The command returns a document with a field named values that contains the distinct sku values:

{
  "values" : [ "111", "222", "333" ],
  "ok" : 1
}
Tip
See also:

Dot Notation for information on accessing fields within embedded documents

Return Distinct Values for an Array Field

The following example returns the distinct values for the field sizes from all documents in the inventory collection:

db.runCommand ( { distinct: "inventory", key: "sizes" } )

The command returns a document with a field named values that contains the distinct sizes values:

{
  "values" : [ "M", "S", "L" ],
  "ok" : 1
}

For information on distinct and array fields, see the Behavior section.

Specify Query with distinct

The following example returns the distinct values for the field sku, embedded in the item field, from the documents whose dept is equal to "A":

db.runCommand ( { distinct: "inventory", key: "item.sku", query: { dept: "A"} } )

The command returns a document with a field named values that contains the distinct sku values:

{
  "values" : [ "111", "333" ],
  "ok" : 1
}

Specify a Collation

Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.

A collection myColl has the following documents:

{ _id: 1, category: "café", status: "A" }
{ _id: 2, category: "cafe", status: "a" }
{ _id: 3, category: "cafE", status: "a" }

The following distinct operation includes the collation option:

db.runCommand(
   {
      distinct: "myColl",
      key: "category",
      collation: { locale: "fr", strength: 1 }
   }
)

For descriptions of the collation fields, see Collation Document.
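To get a feel for what strength: 1 means for the three category values, the comparison can be approximated outside the database with the standard Intl.Collator API. This is an analogy under stated assumptions, not MongoDB's implementation: the sensitivity "base" setting is used here as a stand-in for a strength of 1 (case and accent differences ignored), and distinctWithCollator is a hypothetical helper. Which of the equal strings the server reports is not stated in this page, so the sketch only counts the groups.

```javascript
// Stand-in for collation { locale: "fr", strength: 1 }:
// sensitivity "base" treats strings that differ only in case or accents as equal.
const collator = new Intl.Collator("fr", { sensitivity: "base" });

// Hypothetical helper: keeps the first value of each group of strings
// that the given collator considers equal.
function distinctWithCollator(values, coll) {
  const result = [];
  for (const v of values) {
    if (!result.some((kept) => coll.compare(kept, v) === 0)) result.push(v);
  }
  return result;
}

const categories = ["café", "cafe", "cafE"];

// Case and accents ignored: all three strings fall into one group.
console.log(distinctWithCollator(categories, collator).length); // 1

// For contrast, a collator that distinguishes case and accents keeps all three.
const strict = new Intl.Collator("fr", { sensitivity: "variant" });
console.log(distinctWithCollator(categories, strict).length); // 3
```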

Override Default Read Concern

To override the default read concern level of "local", use the readConcern option.

The following operation on a replica set specifies a Read Concern of "majority" to read the most recent copy of the data confirmed as having been written to a majority of the nodes.

Note

Regardless of the read concern level, the most recent data on a node may not reflect the most recent version of the data in the system.

db.runCommand(
   {
     distinct: "restaurants",
     key: "rating",
     query: { cuisine: "italian" },
     readConcern: { level: "majority" }
   }
)

To ensure that a single thread can read its own writes, use "majority" read concern and "majority" write concern against the primary of the replica set.
