BSON Types

BSON is a binary serialization format used to store documents and make remote procedure calls in MongoDB. The BSON specification is located at bsonspec.org.

Each BSON type has both integer and string identifiers as listed in the following table:

Type                     Number  Alias                  Notes
Double                   1       "double"
String                   2       "string"
Object                   3       "object"
Array                    4       "array"
Binary data              5       "binData"
Undefined                6       "undefined"            Deprecated.
ObjectId                 7       "objectId"
Boolean                  8       "bool"
Date                     9       "date"
Null                     10      "null"
Regular Expression       11      "regex"
DBPointer                12      "dbPointer"            Deprecated.
JavaScript               13      "javascript"
Symbol                   14      "symbol"               Deprecated.
JavaScript with scope    15      "javascriptWithScope"  Deprecated.
32-bit integer           16      "int"
Timestamp                17      "timestamp"
64-bit integer           18      "long"
Decimal128               19      "decimal"
Min key                  -1      "minKey"
Max key                  127     "maxKey"

To determine a field's type, see Type Checking.

If you convert BSON to JSON, see the Extended JSON reference.

The following sections describe special considerations for particular BSON types.

Binary Data

A BSON binary binData value is a byte array. A binData value has a subtype that indicates how to interpret the binary data. The following table shows the subtypes:

Number  Description
0       Generic binary subtype
1       Function data
2       Binary (old)
3       UUID (old)
4       UUID
5       MD5
6       Encrypted BSON value
7       Compressed time series data. New in version 5.2.
8       Sensitive data, such as a key or secret. MongoDB does not log literal values for binary data with subtype 8. Instead, MongoDB logs a placeholder value of ###.
9       Vector data, which is densely packed arrays of numbers of the same type.
128     Custom data
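As a rough illustration of how the subtype travels with the bytes, the sketch below lays out a binary value the way the specification at bsonspec.org describes it: a little-endian int32 payload length, one subtype byte, then the payload. The helper name encodeBinaryValue and the sample UUID bytes are assumptions made for this example; it is not a driver API, and it ignores the legacy subtype 2, which the specification lays out differently.

```javascript
// Sketch only: encodeBinaryValue is a hypothetical helper, not a driver API.
// Layout assumed from bsonspec.org: int32 length (LE), subtype byte, payload.
function encodeBinaryValue(subtype, payload) {
  const header = Buffer.alloc(5);
  header.writeInt32LE(payload.length, 0); // number of payload bytes
  header.writeUInt8(subtype, 4);          // tells readers how to interpret the payload
  return Buffer.concat([header, payload]);
}

// Subtype 4 marks the 16 payload bytes as a UUID (sample bytes are made up).
const uuidBytes = Buffer.from("00112233445566778899aabbccddeeff", "hex");
const encoded = encodeBinaryValue(4, uuidBytes);

console.log(encoded.length);          // 5 header bytes + 16 payload bytes
console.log(encoded.toString("hex"));
```

In practice a driver performs this encoding for you; the point is only that the subtype is stored next to the bytes rather than inferred from them.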

ObjectId

ObjectIds are small, likely unique, fast to generate, and ordered. ObjectId values are 12 bytes in length, consisting of:

  • A 4-byte timestamp, representing the ObjectId's creation, measured in seconds since the Unix epoch.
  • A 5-byte random value generated once per client-side process. This random value is unique to the machine and process. If the process restarts or the primary node of the process changes, this value is re-generated.
  • A 3-byte incrementing counter per client-side process, initialized to a random value. The counter resets when a process restarts.

For timestamp and counter values, the most significant bytes appear first in the byte sequence (big-endian). This is unlike other BSON values, where the least significant bytes appear first (little-endian).
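The 12-byte layout can be sketched in plain Node.js by splitting a 24-character hex string into its three fields. The helper parseObjectIdHex and the sample hex value are assumptions made for illustration; drivers and mongosh expose their own ObjectId methods, which this sketch does not use.

```javascript
// Sketch only: parseObjectIdHex is a hypothetical helper, not a driver or mongosh API.
function parseObjectIdHex(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected 24 hexadecimal characters (12 bytes)");
  }
  const bytes = Buffer.from(hex, "hex");
  return {
    // Bytes 0-3: seconds since the Unix epoch, big-endian.
    timestampSeconds: bytes.readUInt32BE(0),
    // Bytes 4-8: per-process random value, kept as hex.
    randomHex: bytes.subarray(4, 9).toString("hex"),
    // Bytes 9-11: incrementing counter, big-endian.
    counter: bytes.readUIntBE(9, 3),
  };
}

// Made-up sample value whose first four bytes encode 2020-05-11T20:14:14Z.
const parsed = parseObjectIdHex("5eb9b216a1b2c3d4e5000001");
console.log(new Date(parsed.timestampSeconds * 1000).toISOString());
console.log(parsed.randomHex, parsed.counter);
```

Because the timestamp occupies the leading bytes in big-endian order, byte-wise comparison of two ObjectIds compares their creation seconds first.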

If an integer value is used to create an ObjectId, the integer replaces the timestamp.

In MongoDB, each document stored in a standard collection requires a unique _id field that acts as a primary key. If an inserted document omits the _id field, the MongoDB driver automatically generates an ObjectId for the _id field.

This also applies to documents inserted through update operations with upsert: true.

MongoDB clients should add an _id field with a unique ObjectId. Using ObjectIds for the _id field provides the following additional benefits:

  • You can access ObjectId creation time in mongosh using the ObjectId.getTimestamp() method.
  • ObjectIds are approximately ordered by creation time, but are not perfectly ordered. Sorting a collection on an _id field containing ObjectId values is roughly equivalent to sorting by creation time.

    Important

    While ObjectId values should increase over time, they are not necessarily monotonic. This is because they:

    • Only contain one second of temporal resolution, so ObjectId values created within the same second do not have a guaranteed ordering, and
    • Are generated by clients, which may have differing system clocks.
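The one-second resolution can be illustrated with hand-built, ObjectId-shaped hex strings. This sketch assumes ObjectId values compare byte by byte, which fixed-width lowercase hex strings mimic; makeIdHex and all sample values are hypothetical and exist only for this example.

```javascript
// Sketch only: makeIdHex is a hypothetical helper that assembles the three
// ObjectId fields into a 24-character hex string.
function makeIdHex(seconds, randomHex, counter) {
  const ts = seconds.toString(16).padStart(8, "0");
  const ctr = counter.toString(16).padStart(6, "0");
  return ts + randomHex + ctr;
}

// Two client processes create ids during the same second.
// Process A (random value ff...) creates its id first;
// process B (random value 00...) creates its id slightly later.
const first = makeIdHex(1589228054, "ffffffffff", 10);
const second = makeIdHex(1589228054, "0000000000", 11);

// Fixed-width lowercase hex sorts in the same order as the underlying bytes.
const sorted = [first, second].sort();
console.log(sorted[0] === second); // true: the id created later sorts first
```

Within one second, the per-process random value decides the order before the counter is ever compared, so creation order across processes is not preserved.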

Use the ObjectId() methods to set and retrieve ObjectId values.

Starting in MongoDB 5.0, mongosh replaces the legacy mongo shell. The ObjectId() methods work differently in mongosh than in the legacy mongo shell. For more information on the legacy methods, see Legacy mongo Shell.

String

BSON strings are UTF-8. In general, drivers for each programming language convert from the language's string format to UTF-8 when serializing and deserializing BSON. This makes it possible to store most international characters in BSON strings with ease. [1] In addition, MongoDB $regex queries support UTF-8 in the regex string.

[1] Given strings using UTF-8 character sets, using sort() on strings will be reasonably correct. However, because sort() internally uses the C++ strcmp API, which compares raw bytes rather than applying language-specific rules, the sort order may handle some characters incorrectly.
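To see why a byte-wise comparison can surprise readers, the sketch below sorts a few words by their UTF-8 bytes. It is written in the spirit of strcmp and is not the server's actual sort code; compareUtf8Bytes and the word list are assumptions made for this example.

```javascript
// Sketch only: compareUtf8Bytes is a hypothetical helper that orders strings
// by their UTF-8 bytes, similar in spirit to strcmp.
function compareUtf8Bytes(a, b) {
  return Buffer.compare(Buffer.from(a, "utf8"), Buffer.from(b, "utf8"));
}

const words = ["zebra", "Zulu", "apple", "éclair"];

// Copy before sorting so the original array is left untouched.
const byteOrder = [...words].sort(compareUtf8Bytes);
console.log(byteOrder); // [ 'Zulu', 'apple', 'zebra', 'éclair' ]
```

Uppercase "Z" (0x5A) sorts before lowercase "a" (0x61), and "é" (0xC3 0xA9) sorts after "z" (0x7A), which differs from the dictionary order most readers expect.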

Timestamps

BSON has a special timestamp type for internal MongoDB use that is not associated with the regular Date type. This internal timestamp type is a 64-bit value where:

  • the most significant 32 bits are a time_t value (seconds since the Unix epoch)
  • the least significant 32 bits are an incrementing ordinal for operations within a given second.

While the BSON format is little-endian, and therefore stores the least significant bits first, the mongod instance always compares the time_t value before the ordinal value on all platforms, regardless of endianness.
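The split between seconds and ordinal can be sketched with a single unsigned 64-bit BigInt. The helpers packTimestamp and unpackTimestamp and the sample values are assumptions made for illustration; they are not part of any MongoDB API.

```javascript
// Sketch only: packTimestamp and unpackTimestamp are hypothetical helpers.
function packTimestamp(seconds, ordinal) {
  // High 32 bits: time_t seconds. Low 32 bits: ordinal within that second.
  return (BigInt(seconds) << 32n) | BigInt(ordinal);
}

function unpackTimestamp(value) {
  return {
    seconds: Number(value >> 32n),
    ordinal: Number(value & 0xffffffffn),
  };
}

const tsA = packTimestamp(1589228054, 7);
const tsB = packTimestamp(1589228055, 1);

// Comparing the 64-bit values compares seconds first, then the ordinal.
console.log(tsA < tsB); // true: the later second wins despite the smaller ordinal

// Stored little-endian, the ordinal bytes come before the seconds bytes.
const wire = Buffer.alloc(8);
wire.writeBigUInt64LE(tsA);
console.log(wire.toString("hex")); // 0700000016b2b95e
```

The byte order on disk and the comparison order are independent: the value is compared as a number, not as a byte string.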

In replication, the oplog has a ts field. The values in this field reflect the operation time, which uses a BSON timestamp value.

Within a single mongod instance, timestamp values in the oplog are always unique.

Note

The BSON timestamp type is for internal MongoDB use. For most cases, in application development, you will want to use the BSON date type. See Date for more information.

Date

BSON Date is a 64-bit integer that represents the number of milliseconds since the Unix epoch (Jan 1, 1970). This results in a representable date range of about 290 million years into the past and future.

The official BSON specification refers to the BSON Date type as the UTC datetime.

BSON Date type is signed. [2] Negative values represent dates before 1970.
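The figure of roughly 290 million years follows from the size of a signed 64-bit millisecond counter, as this small Node.js sketch shows. The year length of 365.2425 days is an assumption (an average Gregorian year) chosen only for the estimate.

```javascript
// Sketch only: estimate how many years fit in a signed 64-bit millisecond count.
const maxMillis = 2n ** 63n - 1n;       // largest signed 64-bit value
const millisPerYear = 31556952000n;     // 365.2425 days, an assumed average year
const years = maxMillis / millisPerYear;
console.log(years);                     // roughly 292 million

// A negative millisecond count falls before the epoch.
const justBeforeEpoch = new Date(-1).toISOString();
console.log(justBeforeEpoch);           // 1969-12-31T23:59:59.999Z
```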

To construct a Date in mongosh, you can use the new Date() or ISODate() constructor.

Construct a Date With the New Date() Constructor

To construct a Date with the new Date() constructor, run the following command:

var mydate1 = new Date()

The mydate1 variable outputs a date and time wrapped as an ISODate:

mydate1
ISODate("2020-05-11T20:14:14.796Z")

Construct a Date With the ISODate() Constructor

To construct a Date using the ISODate() constructor, run the following command:

var mydate2 = ISODate()

The mydate2 variable stores a date and time wrapped as an ISODate:

mydate2
ISODate("2020-05-11T20:14:14.796Z")

Convert a Date to a String

To print the Date in a string format, use the toString() method:

mydate1.toString()
Mon May 11 2020 13:14:14 GMT-0700 (Pacific Daylight Time)

Return the Month Portion of a Date

You can also return the month portion of the Date value. Months are zero-indexed, so that January is month 0.

mydate1.getMonth()
4

[2] Prior to version 2.0, Date values were incorrectly interpreted as unsigned integers, which affected sorts, range queries, and indexes on Date fields. Because indexes are not recreated when upgrading, please re-index if you created an index on Date values with an earlier version, and dates before 1970 are relevant to your application.

decimal128 BSON Data Type

decimal128 is a 128-bit decimal representation for storing very large or very precise numbers, whenever rounding decimals is important. It was defined in August 2008 as part of the IEEE 754-2008 revision of the floating-point standard. When you need high precision when working with BSON data types, you should use decimal128.

decimal128 supports 34 decimal digits of precision (the significand), along with an exponent range of -6143 to +6144. The significand is not normalized in the decimal128 standard, allowing for multiple possible representations: 10 x 10^-1 = 1 x 10^0 = .1 x 10^1 = .01 x 10^2, etc. The ability to store maximum and minimum values on the order of 10^6144 and 10^-6143, respectively, gives the type a very wide range in addition to its precision.
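The idea that one value has several representations can be sketched by modeling a decimal as an integer coefficient and a base-10 exponent. This is not a decimal128 codec: the object shape and the sameValue helper are assumptions made for this example, and it ignores the 34-digit and exponent limits.

```javascript
// Sketch only: sameValue is a hypothetical helper that compares two decimals
// given as { coefficient: BigInt, exponent: number }.
function sameValue(x, y) {
  // Rescale both coefficients to the smaller exponent, then compare integers.
  const exp = Math.min(x.exponent, y.exponent);
  const scale = (d) => d.coefficient * 10n ** BigInt(d.exponent - exp);
  return scale(x) === scale(y);
}

const tenTenths = { coefficient: 10n, exponent: -1 };       // 10 x 10^-1
const one = { coefficient: 1n, exponent: 0 };               // 1 x 10^0
const elevenTenths = { coefficient: 11n, exponent: -1 };    // 11 x 10^-1

console.log(sameValue(tenTenths, one));     // true: different encodings, same value
console.log(sameValue(elevenTenths, one));  // false
```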

Use decimal128 With the Decimal128() Constructor

In MongoDB, you can store data in decimal128 format using the Decimal128() constructor. If you pass in the decimal value as a string, MongoDB stores the value in the database as follows:

Decimal128("9823.1297")

You can also pass in the decimal value as a double:

Decimal128.fromStringWithRounding("1234.99999999999")

You should also consider the usage and support your programming language has for decimal128. The following languages don't natively support this feature and require a plugin or additional package to get the functionality:

Use Cases

When you perform mathematical calculations programmatically, you can sometimes receive unexpected results. The following example in Node.js yields incorrect results:

> 0.1
0.1
> 0.2
0.2
> 0.1 * 0.2
0.020000000000000004
> 0.1 * 0.1
0.010000000000000002

Similarly, the following example in Java produces incorrect output:


class Main {
    public static void main(String[] args) {
        // 0.1 and 0.2 are doubles, so the product carries binary rounding error.
        System.out.println("0.1 * 0.2:");
        System.out.println(0.1 * 0.2);
    }
}

0.1 * 0.2:
0.020000000000000004

The same computations in Python, Ruby, Rust, and other languages produce the same results. This happens because binary floating-point numbers do not represent base 10 values well.

For example, the 0.1 used in the above examples is represented in binary as approximately 0.0001100110011001101, because one tenth has no finite binary expansion. Most of the time, this does not cause any significant issues. However, in applications such as finance or banking where precision is important, use decimal128 as your data type.
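The rounding can be inspected directly in Node.js, alongside a sketch of the coefficient-and-exponent arithmetic that a decimal type relies on to stay exact. The multiplyDecimal helper and its object shape are hypothetical and only illustrate the idea; they are not the Decimal128 API.

```javascript
// The nearest double to 0.1 is not exactly 0.1; print its binary digits.
const bits = (0.1).toString(2);
console.log(bits);
console.log(0.1 * 0.2 === 0.02); // false

// Sketch only: multiplyDecimal is a hypothetical helper that multiplies two
// decimals given as an integer coefficient and a base-10 exponent.
function multiplyDecimal(x, y) {
  return {
    coefficient: x.coefficient * y.coefficient,
    exponent: x.exponent + y.exponent,
  };
}

const product = multiplyDecimal(
  { coefficient: 1n, exponent: -1 }, // 0.1
  { coefficient: 2n, exponent: -1 }  // 0.2
);
console.log(product); // coefficient 2n, exponent -2: exactly 0.02
```

Because the coefficient stays an integer and the exponent is base 10, no binary rounding step is involved.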