$binarySize (aggregation)
On this page本页内容
Definition定义
$binarySizeNew in version 4.4.4.4版新增。Returns the size of a given string or binary data value's content in bytes.返回给定字符串或二进制数据值内容的大小(以字节为单位)。$binarySizehas the following syntax:具有以下语法:{ $binarySize: <string or binData> }The argument can be any valid expression as long as it resolves to either a string or binary data value.参数可以是任何有效的表达式,只要它解析为字符串或二进制数据值即可。For more information on expressions, see Expressions.有关表达式的详细信息,请参阅表达式。
Behavior行为
The argument for $binarySize must resolve to either:$binarySize的参数必须解析为:
A string,字符串,A binary data value, or二进制数据值,或- null.
If the argument is a string or binary data value, the expression returns the size of the argument in bytes.如果参数是字符串或二进制数据值,则表达式将返回以字节为单位的参数大小。
If the argument is 如果参数为null, the expression returns null.null,则表达式将返回null。
If the argument resolves to any other data type, 如果参数解析为任何其他数据类型,则$binarySize errors.$binarySize将出错。
String Size Calculation字符串大小计算
If the argument for 如果$binarySize is a string, the operator counts the number of UTF-8 encoded bytes in a string where each character may use between one and four bytes.$binarySize的参数是字符串,则运算符计算字符串中UTF-8编码的字节数,其中每个字符可以使用一到四个字节。
For example, US-ASCII characters are encoded using one byte. Characters with diacritic markings and additional Latin alphabetical characters (i.e. Latin characters outside of the English alphabet) are encoded using two bytes. 例如,US-ASCII字符使用一个字节进行编码。带有变音符号标记的字符和附加的拉丁字母字符(即英语字母表之外的拉丁字符)使用两个字节进行编码。Chinese, Japanese and Korean characters typically require three bytes, and other planes of unicode (emoji, mathematical symbols, etc.) require four bytes.中文、日文和韩文字符通常需要三个字节,而unicode的其他平面(表情符号、数学符号等)需要四个字节。
Consider the following examples:考虑以下示例:
| Example | Results | |
|---|---|---|
{ $binarySize: "abcde" }
| 5 | |
{ $binarySize: "Hello World!" }
| 12 | |
{ $binarySize: "cafeteria" }
| 9 | |
{ $binarySize: "cafétéria" }
| 11 | é |
{ $binarySize: "" }
| 0 | |
{ $binarySize: "$€λG" }
| 7 | €λ |
{ $binarySize: "寿司" }
| 6 |
Example实例
In 在mongosh, create a sample collection named images with the following documents:mongosh中,使用以下文档创建一个名为images的示例集合:
db.images.insertMany([
{ _id: 1, name: "cat.jpg", binary: new BinData(0, "OEJTfmD8twzaj/LPKLIVkA==")},
{ _id: 2, name: "big_ben.jpg", binary: new BinData(0, "aGVsZmRqYWZqYmxhaGJsYXJnYWZkYXJlcTU1NDE1Z2FmZCBmZGFmZGE=")},
{ _id: 3, name: "tea_set.jpg", binary: new BinData(0, "MyIRAFVEd2aImaq7zN3u/w==")},
{ _id: 4, name: "concert.jpg", binary: new BinData(0, "TWFuIGlzIGRpc3Rpbmd1aXNoZWQsIG5vdCBvbmx5IGJ5IGhpcyByZWFzb24sIGJ1dCBieSB0aGlzIHNpbmd1bGFyIHBhc3Npb24gZnJvbSBvdGhlciBhbmltYWxzLCB3aGljaCBpcyBhIGx1c3Qgb2YgdGhlIG1pbmQsIHRoYXQgYnkgYSBwZXJzZXZlcmFuY2Ugb2YgZGVsaWdodCBpbiB0aGUgY29udGludWVkIGFuZCBpbmRlZmF0aWdhYmxlIGdlbmVyYXRpb24gb2Yga25vd2xlZGdlLCBleGNlZWRzIHRoZSBzaG9ydCB2ZWhlbWVuY2Ugb2YgYW55IGNhcm5hbCBwbGVhc3VyZS4=")},
{ _id: 5, name: "empty.jpg", binary: new BinData(0, "") }
])
The following aggregation 以下聚合projects:projects:
Thenamefieldname字段TheimageSizefield, which uses$binarySizeto return the size of the document'sbinaryfield in bytes.imageSize字段,它使用$binarySize返回文档binary字段的大小(以字节为单位)。
db.images.aggregate([
{
$project: {
"name": "$name",
"imageSize": { $binarySize: "$binary" }
}
}
])
The operation returns the following result:该操作返回以下结果:
{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 }
Find Largest Binary Data查找最大的二进制数据
The following pipeline returns the image with the largest binary data size:以下管道返回具有最大二进制数据大小的图像:
db.images.aggregate([
// First Stage
{ $project: { name: "$name", imageSize: { $binarySize: "$binary" } } },
// Second Stage
{ $sort: { "imageSize" : -1 } },
// Third Stage
{ $limit: 1 }
])
First Stage第一阶段-
The first stage of the pipeline管道projects:projects的第一阶段:Thenamefieldname字段TheimageSizefield, which uses$binarySizeto return the size of the document'sbinaryfield in bytes.imageSize字段,它使用$binarySize返回文档二进制字段的大小(以字节为单位)。
This stage outputs the following documents to the next stage:本阶段向下一阶段输出以下文件:{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 } Second Stage第二阶段-
The second stage第二阶段按sortsthe documents byimageSizein descending order.imageSize降序对文档进行排序。This stage outputs the following documents to the next stage:本阶段向下一阶段输出以下文件:{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }
{ "_id" : 2, "name" : "big_ben.jpg", "imageSize" : 41 }
{ "_id" : 1, "name" : "cat.jpg", "imageSize" : 16 }
{ "_id" : 3, "name" : "teaset.jpg", "imageSize" : 16 }
{ "_id" : 5, "name" : "empty.jpg", "imageSize" : 0 } Third Stage第三阶段-
The third stage第三阶段限制输出文档仅返回排序顺序中第一个出现的文档:limitsthe output documents to only return the document appearing first in the sort order:{ "_id" : 4, "name" : "concert.jpg", "imageSize" : 269 }