Documents

MongoDB stores data records as BSON documents. BSON is a binary representation of JSON documents, though it contains more data types than JSON. For the BSON spec, see bsonspec.org. See also BSON Types.

A MongoDB document.

Compatibility

MongoDB stores records as documents for deployments hosted in the following environments:

  • MongoDB Atlas: The fully managed service for MongoDB deployments in the cloud
  • MongoDB Enterprise: The subscription-based, self-managed version of MongoDB
  • MongoDB Community: The source-available, free-to-use, and self-managed version of MongoDB

Document Structure

MongoDB documents are composed of field-and-value pairs and have the following structure:

{
field1: value1,
field2: value2,
field3: value3,
...
fieldN: valueN
}

The value of a field can be any of the BSON data types, including other documents, arrays, and arrays of documents. For example, the following document contains values of varying types:

var mydoc = {
_id: ObjectId("5099803df3f4948bd2f98391"),
name: { first: "Alan", last: "Turing" },
birth: new Date('Jun 23, 1912'),
death: new Date('Jun 07, 1954'),
contribs: [ "Turing machine", "Turing test", "Turingery" ],
views : Long(1250000)
}

The above fields have the following data types:

  • _id holds an ObjectId.
  • name holds an embedded document that contains the fields first and last.
  • birth and death hold values of the Date type.
  • contribs holds an array of strings.
  • views holds a value of the NumberLong type.

Field Names

Field names are strings.

Documents have the following restrictions on field names:

  • The field name _id is reserved for use as a primary key; its value must be unique in the collection, is immutable, and may be of any type other than an array or regex. If the _id contains subfields, the subfield names cannot begin with a ($) symbol.
  • Field names cannot contain the null character.
  • The server permits storage of field names that contain dots (.) and dollar signs ($).
  • MongoDB 5.0 adds improved support for the use of ($) and (.) in field names. There are some restrictions. See Field Name Considerations for more details.
  • Each field name must be unique within the document. You must not store documents with duplicate fields because MongoDB CRUD operations might behave unexpectedly if a document has duplicate fields.

The MongoDB Query Language doesn't support documents with duplicate field names:

  • Although some BSON builders may support creating a BSON document with duplicate field names, inserting these documents into MongoDB isn't supported even if the insert succeeds, or appears to succeed.
  • For example, inserting a BSON document with duplicate field names through a MongoDB driver may result in the driver silently dropping the duplicate values prior to insertion, or may result in an invalid document being inserted that contains duplicate fields. Querying those documents leads to inconsistent results.
  • Updating documents with duplicate field names isn't supported, even if the update succeeds or appears to succeed.

Starting in MongoDB 6.1, to see if a document has duplicate field names, use the validate command with the full field set to true. In any MongoDB version, use the $objectToArray aggregation operator to see if a document has duplicate field names.
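
As a rough illustration of the $objectToArray approach, the operator turns a document into a list of { k, v } entries, and a field name that appears in more than one entry is a duplicate. The sketch below runs in plain JavaScript without a server; the helper name findDuplicateFieldNames and the sample entries are assumptions made for this example, not part of MongoDB.

```javascript
// Hypothetical helper (not a MongoDB API): given entries in the { k, v }
// shape that $objectToArray produces, return every field name that
// occurs more than once.
function findDuplicateFieldNames(entries) {
  const counts = new Map();
  for (const { k } of entries) {
    counts.set(k, (counts.get(k) || 0) + 1);
  }
  return [...counts].filter(([, n]) => n > 1).map(([name]) => name);
}

// Sample entries (made up for illustration) for a document in which
// the field "a" was stored twice.
const entries = [
  { k: "_id", v: 1 },
  { k: "a", v: 1 },
  { k: "b", v: 2 },
  { k: "a", v: 3 },
];

console.log(findDuplicateFieldNames(entries)); // [ 'a' ]
```

An empty result means every field name in the entries is unique.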

Dot Notation

MongoDB uses the dot notation to access the elements of an array and to access the fields of an embedded document.

Arrays

To specify or access an element of an array by the zero-based index position, concatenate the array name with the dot (.) and zero-based index position, and enclose in quotes:

"<array>.<index>"

For example, given the following field in a document:

{
...
contribs: [ "Turing machine", "Turing test", "Turingery" ],
...
}

To specify the third element in the contribs array, use the dot notation "contribs.2".

For examples querying arrays, see:

Tip

  • $[] all positional operator for update operations,
  • $[<identifier>] filtered positional operator for update operations,
  • $ positional operator for update operations,
  • $ projection operator when array index position is unknown
  • Query an Array for dot notation examples with arrays.

Embedded Documents

To specify or access a field of an embedded document with dot notation, concatenate the embedded document name with the dot (.) and the field name, and enclose in quotes:

"<embedded document>.<field>"

For example, given the following field in a document:

{
...
name: { first: "Alan", last: "Turing" },
contact: { phone: { type: "cell", number: "111-222-3333" } },
...
}

  • To specify the field named last in the name field, use the dot notation "name.last".
  • To specify the number in the phone document in the contact field, use the dot notation "contact.phone.number".
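
To make the path syntax concrete, here is a minimal sketch of how a dotted path can be walked through a plain JavaScript object. The function getByPath is hypothetical and written only for this example; it is not how the server evaluates dot notation, and it ignores the extra matching rules MongoDB applies when a path crosses an array of embedded documents.

```javascript
// Hypothetical resolver: follow each dot-separated part of the path,
// treating numeric parts as array index positions.
function getByPath(doc, path) {
  let current = doc;
  for (const part of path.split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = current[part];
  }
  return current;
}

const doc = {
  name: { first: "Alan", last: "Turing" },
  contact: { phone: { type: "cell", number: "111-222-3333" } },
  contribs: ["Turing machine", "Turing test", "Turingery"],
};

console.log(getByPath(doc, "name.last"));            // Turing
console.log(getByPath(doc, "contact.phone.number")); // 111-222-3333
console.log(getByPath(doc, "contribs.2"));           // Turingery
```

The path must be quoted when used as a key, because an unquoted key containing a dot is not valid JavaScript object syntax.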

Warning

Partition fields cannot use field names that contain a dot (.).

For examples querying embedded documents, see:

Document Limitations

Documents have the following attributes:

Document Size Limit

The maximum BSON document size is 16 mebibytes.

The maximum document size helps ensure that a single document cannot use an excessive amount of RAM or, during transmission, an excessive amount of bandwidth. To store documents larger than the maximum size, MongoDB provides the GridFS API. For more information about GridFS, see mongofiles and the documentation for your driver.
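
For reference, 16 mebibytes is 16 × 1024 × 1024 bytes, slightly more than 16 million bytes. The following sketch only does that arithmetic; the constant and function names are assumptions for this example, and it takes a byte count as input rather than computing the BSON size of a document, which would require a BSON library.

```javascript
// 16 MiB expressed in bytes.
const MAX_BSON_DOCUMENT_BYTES = 16 * 1024 * 1024;

// Hypothetical check: would a payload of this many bytes fit in one document?
function fitsInSingleDocument(sizeInBytes) {
  return sizeInBytes <= MAX_BSON_DOCUMENT_BYTES;
}

console.log(MAX_BSON_DOCUMENT_BYTES);        // 16777216
console.log(fitsInSingleDocument(16000000)); // true
console.log(fitsInSingleDocument(17000000)); // false
```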

Document Field Order

Unlike JavaScript objects, the fields in a BSON document are ordered.

Field Order in Queries

For queries, the field order behavior is as follows:

  • When comparing documents, field ordering is significant. For example, when comparing documents with fields a and b in a query:

    • {a: 1, b: 1} is equal to {a: 1, b: 1}
    • {a: 1, b: 1} is not equal to {b: 1, a: 1}
  • For efficient query execution, the query engine may reorder fields during query processing. Among other cases, reordering fields may occur when processing these projection operators: $project, $addFields, $set, and $unset.

    • Field reordering may occur in intermediate results as well as the final results returned by a query.
    • Because some operations may reorder fields, you should not rely on specific field ordering in the results returned by a query that uses the projection operators listed earlier.
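
The order-sensitive comparison described above can be modeled in plain JavaScript by representing a document as an ordered list of [fieldName, value] pairs. This is only an illustration: the function name is made up for this example, and it handles primitive values only.

```javascript
// Hypothetical comparison: two documents are equal only if they contain
// the same fields, with the same values, in the same order.
function sameFieldsInSameOrder(a, b) {
  if (a.length !== b.length) return false;
  return a.every(([name, value], i) => name === b[i][0] && value === b[i][1]);
}

const ab = [["a", 1], ["b", 1]];
const ba = [["b", 1], ["a", 1]];

console.log(sameFieldsInSameOrder(ab, [["a", 1], ["b", 1]])); // true
console.log(sameFieldsInSameOrder(ab, ba));                   // false
```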

Field Order in Write Operations

For write operations, MongoDB preserves the order of the document fields except for the following cases:

  • The _id field is always the first field in the document.
  • Updates that include renaming of field names may result in the reordering of fields in the document.

The _id Field

In MongoDB, each document stored in a standard collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.

This also applies to documents inserted through update operations with upsert: true.

Note

In time series collections, documents do not require a unique _id field because MongoDB does not create an index on the _id field.

The _id field has the following behavior and constraints:

  • By default, MongoDB creates a unique index on the _id field during the creation of a collection.
  • The _id field is always the first field in the documents. If the server receives a document that does not have the _id field first, then the server will move the field to the beginning.
  • If the _id contains subfields, the subfield names cannot begin with a ($) symbol.
  • The _id field may contain values of any BSON data type, other than an array, regex, or undefined.

    Warning

    To ensure functioning replication, do not store values that are of the BSON regular expression type in the _id field.

The following are common options for storing values for _id:

  • Use an ObjectId.
  • Use a natural unique identifier, if available. This saves space and avoids an additional index.
  • Generate an auto-incrementing number.
  • Generate a UUID in your application code. For a more efficient storage of the UUID values in the collection and in the _id index, store the UUID as a value of the BSON BinData type.

    Index keys that are of the BinData type are more efficiently stored in the index if:

    • the binary subtype value is in the range of 0-7 or 128-135, and
    • the length of the byte array is: 0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, or 32.
  • Use your driver's BSON UUID facility to generate UUIDs. Be aware that driver implementations may implement UUID serialization and deserialization logic differently, which may not be fully compatible with other drivers. See your driver documentation for information concerning UUID interoperability.
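
The two BinData conditions listed above can be checked with a few lines of code. The sketch below assumes a Node.js runtime (for Buffer); the helper names and the sample UUID string are made up for this example. A UUID is 16 bytes long, and the BSON spec assigns UUIDs binary subtype 4, so a UUID stored as BinData satisfies both conditions.

```javascript
// Byte-array lengths listed above as efficiently stored.
const EFFICIENT_LENGTHS = new Set([0, 1, 2, 3, 4, 5, 6, 7, 8, 10, 12, 14, 16, 20, 24, 32]);

// Hypothetical check of the subtype and length conditions.
function isEfficientBinDataKey(subtype, byteLength) {
  const subtypeOk =
    (subtype >= 0 && subtype <= 7) || (subtype >= 128 && subtype <= 135);
  return subtypeOk && EFFICIENT_LENGTHS.has(byteLength);
}

// Hypothetical conversion: strip the hyphens from the textual UUID and
// decode the remaining 32 hex digits into raw bytes.
function uuidToBytes(uuid) {
  return Buffer.from(uuid.replace(/-/g, ""), "hex");
}

// Arbitrary sample UUID, used only for illustration.
const bytes = uuidToBytes("123e4567-e89b-12d3-a456-426614174000");

console.log(bytes.length);                           // 16
console.log(isEfficientBinDataKey(4, bytes.length)); // true
console.log(isEfficientBinDataKey(4, 9));            // false
```

In application code, prefer your driver's own UUID type over hand-rolled conversion, for the interoperability reasons noted above.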

Note

Most MongoDB driver clients include the _id field and generate an ObjectId before sending the insert operation to MongoDB. However, if the client sends a document without an _id field, the mongod adds the _id field and generates the ObjectId.

Other Uses of the Document Structure

In addition to defining data records, MongoDB uses the document structure throughout, including but not limited to: query filters, update specification documents, and index specification documents.

Query Filter Documents

Query filter documents specify the conditions that determine which records to select for read, update, and delete operations.

You can use <field>:<value> expressions to specify the equality condition and query operator expressions.

{
<field1>: <value1>,
<field2>: { <operator>: <value> },
...
}
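
As an illustration of the shape above, the following toy matcher evaluates a filter against a plain JavaScript object. It is a sketch under heavy simplifications, not MongoDB's query engine: the matches function is hypothetical, it supports only top-level fields with the $eq, $gt, and $lt operators, and it treats every object-valued condition as an operator expression. The field names status and qty are sample data.

```javascript
// Comparison functions for the few operators this sketch supports.
const comparators = {
  $eq: (a, b) => a === b,
  $gt: (a, b) => a > b,
  $lt: (a, b) => a < b,
};

// Hypothetical matcher: every <field>: <condition> pair must hold.
function matches(doc, filter) {
  return Object.entries(filter).every(([field, condition]) => {
    const value = doc[field];
    if (condition !== null && typeof condition === "object") {
      // { <operator>: <value> } form: every operator must be satisfied.
      return Object.entries(condition).every(([op, operand]) => {
        const compare = comparators[op];
        if (!compare) throw new Error(`Unsupported operator in this sketch: ${op}`);
        return compare(value, operand);
      });
    }
    // <field>: <value> form: plain equality.
    return value === condition;
  });
}

const filter = { status: "A", qty: { $lt: 30 } };

console.log(matches({ status: "A", qty: 25 }, filter)); // true
console.log(matches({ status: "A", qty: 45 }, filter)); // false
console.log(matches({ status: "D", qty: 25 }, filter)); // false
```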

For examples, see:

Update Specification Documents

Update specification documents use update operators to specify the data modifications to perform on specific fields during an update operation.

{
<operator1>: { <field1>: <value1>, ... },
<operator2>: { <field2>: <value2>, ... },
...
}
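
To show how such a document reads, here is a toy applier for two common update operators, $set and $inc, working on a plain JavaScript object. The applyUpdate function is hypothetical and handles only top-level fields; it is meant to illustrate the structure of an update specification, not to reproduce the server's update logic. The sample field names are assumptions.

```javascript
// Hypothetical applier: returns a modified copy of the document.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [operator, changes] of Object.entries(update)) {
    for (const [field, value] of Object.entries(changes)) {
      if (operator === "$set") {
        result[field] = value; // replace the field's value
      } else if (operator === "$inc") {
        result[field] = (result[field] || 0) + value; // add to the field
      } else {
        throw new Error(`Unsupported operator in this sketch: ${operator}`);
      }
    }
  }
  return result;
}

const update = { $set: { status: "shipped" }, $inc: { qty: -2 } };

console.log(applyUpdate({ _id: 1, status: "pending", qty: 10 }, update));
// { _id: 1, status: 'shipped', qty: 8 }
```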

For examples, see Update specifications.

Index Specification Documents

Index specification documents define the field to index and the index type:

{ <field1>: <type1>, <field2>: <type2>, ...  }