Docs HomeMongoDB Manual

$binarySize (aggregation)

On this page本页内容

Definition定义

$binarySize

New in version 4.4. 4.4版新增。

Returns the size of a given string or binary data value's content in bytes.返回给定字符串或二进制数据值内容的大小(以字节为单位)。

$binarySize has the following syntax:具有以下语法:

{ $binarySize: <string or binData> }

The argument can be any valid expression as long as it resolves to either a string or binary data value. 参数可以是任何有效的表达式,只要它解析为字符串或二进制数据值即可。For more information on expressions, see Expressions.有关表达式的详细信息,请参阅表达式

Behavior行为

The argument for $binarySize must resolve to either:$binarySize的参数必须解析为:

  • A string,字符串,
  • A binary data value, or二进制数据值,或
  • null.

If the argument is a string or binary data value, the expression returns the size of the argument in bytes.如果参数是字符串或二进制数据值,则表达式将返回以字节为单位的参数大小。

If the argument is null, the expression returns null.如果参数为null,则表达式将返回null

If the argument resolves to any other data type, $binarySize errors.如果参数解析为任何其他数据类型,则$binarySize将出错。

String Size Calculation字符串大小计算

If the argument for $binarySize is a string, the operator counts the number of UTF-8 encoded bytes in a string where each character may use between one and four bytes.如果$binarySize的参数是字符串,则运算符计算字符串中UTF-8编码的字节数,其中每个字符可以使用一到四个字节。

For example, US-ASCII characters are encoded using one byte. Characters with diacritic markings and additional Latin alphabetical characters (i.e. Latin characters outside of the English alphabet) are encoded using two bytes. 例如,US-ASCII字符使用一个字节进行编码。带有变音符号标记的字符和附加的拉丁字母字符(即英语字母表之外的拉丁字符)使用两个字节进行编码。Chinese, Japanese and Korean characters typically require three bytes, and other planes of unicode (emoji, mathematical symbols, etc.) require four bytes.中文、日文和韩文字符通常需要三个字节,而unicode的其他平面(表情符号、数学符号等)需要四个字节。

Consider the following examples:考虑以下示例:

ExampleResultsNotes备注
{ $binarySize: "abcde" }
5Each character is encoded using one byte.每个字符都使用一个字节进行编码。
{ $binarySize: "Hello World!" }
12Each character is encoded using one byte.每个字符都使用一个字节进行编码。
{ $binarySize: "cafeteria" }
9Each character is encoded using one byte.每个字符都使用一个字节进行编码。
{ $binarySize: "cafétéria" }
11é is encoded using two bytes.使用两个字节进行编码。
{ $binarySize: "" }
0Empty strings return 0.空字符串返回0。
{ $binarySize: "$€λG" }
7 is encoded using three bytes. 使用三个字节进行编码。λ is encoded using two bytes.使用两个字节进行编码。
{ $binarySize: "寿司" }
6Each character is encoded using three bytes.每个字符使用三个字节进行编码。

Example实例

In mongosh, create a sample collection named images with the following documents:mongosh中,使用以下文档创建一个名为images的示例集合:

db.images.insertMany([
{ _id: 1, name: "cat.jpg", binary: new BinData(0, "OEJTfmD8twzaj/LPKLIVkA==")},
{ _id: 2, name: "big_ben.jpg", binary: new BinData(0, "aGVsZmRqYWZqYmxhaGJsYXJnYWZkYXJlcTU1NDE1Z2FmZCBmZGFmZGE=")},
{ _id: 3, name: "tea_set.jpg", binary: new BinData(0, "MyIRAFVEd2aImaq7zN3u/w==")},
{ _id: 4, name: "concert.jpg", binary: new BinData(0, "TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=")},
{ _id: 5, name: "empty.jpg", binary: new BinData(0, "") }
])

The following aggregation projects:以下聚合projects

  • The name fieldname字段
  • The imageSize field, which uses $binarySize to return the size of the document's binary field in bytes.imageSize字段,它使用$binarySize返回文档binary字段的大小(以字节为单位)。
db.images.aggregate([
{
$project: {
"name": "$name",
"imageSize": { $binarySize: "$binary" }
}
}
])

The operation returns the following result:该操作返回以下结果:

{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 }

Find Largest Binary Data查找最大的二进制数据

The following pipeline returns the image with the largest binary data size:以下管道返回具有最大二进制数据大小的图像:

db.images.aggregate([
// First Stage
{ $project: { name: "$name", imageSize: { $binarySize: "$binary" } } },
// Second Stage
{ $sort: { "imageSize" : -1 } },
// Third Stage
{ $limit: 1 }
])
First Stage第一阶段

The first stage of the pipeline projects:管道projects的第一阶段:

  • The name fieldname字段
  • The imageSize field, which uses $binarySize to return the size of the document's binary field in bytes.imageSize字段,它使用$binarySize返回文档二进制字段的大小(以字节为单位)。

This stage outputs the following documents to the next stage:本阶段向下一阶段输出以下文件:

{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 }
Second Stage第二阶段

The second stage sorts the documents by imageSize in descending order.第二阶段按imageSize降序对文档进行排序

This stage outputs the following documents to the next stage:本阶段向下一阶段输出以下文件:

{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 }
Third Stage第三阶段

The third stage limits the output documents to only return the document appearing first in the sort order:第三阶段限制输出文档仅返回排序顺序中第一个出现的文档:

{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
Tip

See also: 另请参阅: