$strLenBytes (aggregation)

~~On this page~~本页内容

~~Definition~~定义
~~Behavior~~行为
~~Example~~示例

Definition定义

$strLenBytes

~~Returns the number of UTF-8 encoded bytes in the specified string.~~返回指定字符串中UTF-8编码的字节数。

$strLenBytes ~~has the following operator expression syntax:~~具有以下运算符表达式语法：

{ $strLenBytes: <string expression> }

~~The argument can be any valid expression as long as it resolves to a string.~~ 参数可以是任何有效的表达式，只要它解析为字符串。~~For more information on expressions, see Expressions.~~有关表达式的详细信息，请参阅表达式。

~~If the argument resolves to a value of null or refers to a missing field, $strLenBytes returns an error.~~如果参数解析为null值或引用缺少的字段，$strLenBytes将返回错误。

Behavior行为

~~The $strLenBytes operator counts the number of UTF-8 encoded bytes in a string where each character may use between one and four bytes.~~$strLenBytes运算符计算字符串中UTF-8编码的字节数，其中每个字符可以使用一到四个字节。

~~For example, US-ASCII characters are encoded using one byte.~~ 例如，US-ASCII字符使用一个字节进行编码。~~Characters with diacritic markings and additional Latin alphabetical characters (i.e. Latin characters outside of the English alphabet) are encoded using two bytes.~~ 带有变音符号标记的字符和其他拉丁字母字符（即英语字母表以外的拉丁字符）使用两个字节进行编码。~~Chinese, Japanese and Korean characters typically require three bytes, and other planes of unicode (emoji, mathematical symbols, etc.) require four bytes.~~中文、日文和韩文字符通常需要三个字节，而其他unicode平面（表情符号、数学符号等）需要四个字节。

~~The $strLenBytes operator differs from $strLenCP operator which counts the code points in the specified string regardless of how many bytes each character uses.~~$strLenBytes运算符与$strLenCP运算符不同，后者统计指定字符串中的代码点，而不管每个字符使用多少字节。

Example	Results	Notes
{ $strLenBytes: "abcde" }	`5`	~~Each character is encoded using one byte.~~每个字符使用一个字节进行编码。
{ $strLenBytes: "Hello World!" }	`12`	~~Each character is encoded using one byte.~~每个字符使用一个字节进行编码。
{ $strLenBytes: "cafeteria" }	`9`	~~Each character is encoded using one byte.~~每个字符使用一个字节进行编码。
{ $strLenBytes: "cafétéria" }	`11`	`é` ~~is encoded using two bytes.~~使用两个字节进行编码。
{ $strLenBytes: "" }	`0`	~~Empty strings return 0.~~空字符串返回0。
{ $strLenBytes: "$€λG" }	`7`	`€` ~~is encoded using three bytes.~~ 使用三个字节进行编码。`λ` ~~is encoded using two bytes.~~使用两个字节进行编码。
{ $strLenBytes: "寿司" }	`6`	~~Each character is encoded using three bytes.~~每个字符使用三个字节进行编码。

Example示例

Single-Byte and Multibyte Character Set单字节和多字节字符集

~~A collection named food contains the following documents:~~名为food的集合包含以下文档：

{ "_id" : 1, "name" : "apple" }
{ "_id" : 2, "name" : "banana" }
{ "_id" : 3, "name" : "éclair" }
{ "_id" : 4, "name" : "hamburger" }
{ "_id" : 5, "name" : "jalapeño" }
{ "_id" : 6, "name" : "pizza" }
{ "_id" : 7, "name" : "tacos" }
{ "_id" : 8, "name" : "寿司" }

~~The following operation uses the $strLenBytes operator to calculate the length of each name value:~~以下操作使用$strLenBytes运算符计算每个name值的length：

db.food.aggregate(
  [
    {
      $project: {
        "name": 1,
        "length": { $strLenBytes: "$name" }
      }
    }
  ]
)

~~The operation returns the following results:~~该操作返回以下结果：

{ "_id" : 1, "name" : "apple", "length" : 5 }
{ "_id" : 2, "name" : "banana", "length" : 6 }
{ "_id" : 3, "name" : "éclair", "length" : 7 }
{ "_id" : 4, "name" : "hamburger", "length" : 9 }
{ "_id" : 5, "name" : "jalapeño", "length" : 9 }
{ "_id" : 6, "name" : "pizza", "length" : 5 }
{ "_id" : 7, "name" : "tacos", "length" : 5 }
{ "_id" : 8, "name" : "寿司", "length" : 6 }

~~The documents with _id: 3 and _id: 5 each contain a diacritic character (é and ñ respectively) that requires two bytes to encode.~~ _id:3和_id:5的文档各自包含一个需要两个字节进行编码的变音字符（分别é和ñ）。~~The document with _id: 8 contains two Japanese characters that are encoded using three bytes each.~~ _id:8的文档包含两个日语字符，每个字符使用三个字节进行编码。~~This makes the length greater than the number of characters in name for the documents with _id: 3, _id: 5 and _id: 8.~~这使得具有_id:3、_id:5和_id:8的文档的长度大于name中的字符数。

~~Tip~~提示

~~See also:~~ 参阅：

← $strcasecmp (aggregation)$strLenCP (aggregation) →