Collation
Collation allows users to specify language-specific rules for string comparison, such as rules for lettercase and accent marks.
You can specify collation for a collection or a view, an index, or specific operations that support collation.
Collation Document
A collation document has the following fields:
{
locale: <string>,
caseLevel: <boolean>,
caseFirst: <string>,
strength: <int>,
numericOrdering: <boolean>,
alternate: <string>,
maxVariable: <string>,
backwards: <boolean>
}
When specifying collation, the locale field is mandatory; all other collation fields are optional. For descriptions of the fields, see Collation Document.
Default collation parameter values vary depending on which locale you specify. For a complete list of default collation parameters and the locales they are associated with, see Collation Default Parameters.
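As a minimal sketch of the rule that locale is required while every other field is optional, the hypothetical helper below assembles a collation document in plain JavaScript and rejects input that omits locale. The name buildCollation is an assumption made for illustration and is not part of MongoDB or mongosh; the optional field names are copied from the collation document above.

```javascript
// Hypothetical helper, not a MongoDB API: builds a collation document
// and enforces that `locale` is present.
const OPTIONAL_FIELDS = [
  "caseLevel", "caseFirst", "strength", "numericOrdering",
  "alternate", "maxVariable", "backwards"
];

function buildCollation(locale, options = {}) {
  if (typeof locale !== "string" || locale.length === 0) {
    throw new Error("collation requires a locale string");
  }
  const collation = { locale };
  for (const field of OPTIONAL_FIELDS) {
    // Copy only the optional fields the caller actually set, so that
    // locale-dependent defaults apply to everything else.
    if (options[field] !== undefined) {
      collation[field] = options[field];
    }
  }
  return collation;
}
```

For example, buildCollation("fr", { strength: 2 }) returns { locale: "fr", strength: 2 }, which could then be passed wherever a collation document is accepted.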
Field | Type | Description
locale | string | The ICU locale. To specify simple binary comparison, use a locale value of "simple".
strength | integer | Optional. The level of comparison to perform.
caseLevel | boolean | Optional. Flag that determines whether to include case comparison at strength level 1 or 2. If true, include case comparison. If false, do not include case comparison at level 1 or 2. The default is false.
caseFirst | string | Optional. Determines the sort order of case differences.
numericOrdering | boolean | Optional. Flag that determines whether to compare numeric strings as numbers or as strings. If true, compare as numbers; for example, "10" is greater than "2". If false, compare as strings; for example, "10" is less than "2". The default is false.
alternate | string | Optional. Determines whether whitespace and punctuation are considered in comparisons. Possible values include "non-ignorable" and "shifted".
maxVariable | string | Optional. Determines which characters are considered ignorable when alternate: "shifted". Has no effect if alternate: "non-ignorable".
backwards | boolean | Optional. If true, compare from back to front. If false, compare from front to back. The default is false.
normalization | boolean | Optional. If true, check if fully normalized and perform normalization to compare text. If false, does not check. The default is false.
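The effect of numericOrdering and of a low strength level can be previewed outside the database. The sketch below uses JavaScript's built-in Intl.Collator as a stand-in. It is an analogy, not MongoDB's implementation, and the mapping of its numeric and sensitivity options onto numericOrdering and strength is an assumption made for illustration.

```javascript
// Analogy only: Intl.Collator options stand in for collation fields.
const asStrings = new Intl.Collator("en", { numeric: false });
const asNumbers = new Intl.Collator("en", { numeric: true });

// Like numericOrdering: false, "10" sorts before "2" (negative result).
const stringOrder = asStrings.compare("10", "2");
// Like numericOrdering: true, "10" sorts after "2" (positive result).
const numericOrder = asNumbers.compare("10", "2");

// sensitivity: "base" ignores case and accents, roughly like strength: 1.
const baseOnly = new Intl.Collator("en", { sensitivity: "base" });
const sameLetters = baseOnly.compare("cafe", "Café") === 0;
```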
Operations that Support Collation
You can specify collation for the following operations:
You cannot specify multiple collations for an operation. For example, you cannot specify different collations per field, and when performing a find with a sort, you cannot use one collation for the find and another for the sort.
Behavior
Locale Variants
Some collation locales have variants, which employ special language-specific rules. To specify a locale variant, use the following syntax:
{ "locale" : "<locale code>@collation=<variant>" }
For example, to use the unihan variant of the Chinese collation:
{ "locale" : "zh@collation=unihan" }
For a complete list of all collation locales and their variants, see Collation Locales.
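The variant syntax is plain string formatting, so it can be sketched in a few lines. The helper name localeWithVariant below is hypothetical and only illustrates the "<locale code>@collation=<variant>" pattern; it does not check that the variant exists for the given locale.

```javascript
// Hypothetical helper: formats a locale code and an optional variant
// using the "<locale code>@collation=<variant>" syntax.
function localeWithVariant(localeCode, variant) {
  return variant ? `${localeCode}@collation=${variant}` : localeCode;
}

// Builds the same collation document as the unihan example above.
const collation = { locale: localeWithVariant("zh", "unihan") };
```

Here collation.locale evaluates to "zh@collation=unihan"; omitting the variant returns the bare locale code.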
Collation and Views
You can specify a default collation for a view at creation time. If no collation is specified, the view's default collation is the "simple" binary comparison collator. That is, the view does not inherit the collection's default collation.
String comparisons on the view use the view's default collation. An operation that attempts to change or override a view's default collation will fail with an error.
If creating a view from another view, you cannot specify a collation that differs from the source view's collation.
If performing an aggregation that involves multiple views, such as with $lookup or $graphLookup, the views must have the same collation.
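A view's default collation is supplied when the view is created. The sketch below wraps a mongosh db.createView call in a function so that the database handle is explicit. The view name, the source collection name, and the pipeline are hypothetical placeholders, and the function name createFrenchView is an assumption made for illustration.

```javascript
// Sketch for mongosh; `db` is the current database handle.
// The view name, source collection, and pipeline are hypothetical.
function createFrenchView(db) {
  return db.createView(
    "frenchCategories",                 // hypothetical view name
    "myColl",                           // hypothetical source collection
    [ { $project: { category: 1 } } ],  // hypothetical pipeline
    { collation: { locale: "fr" } }     // default collation for the view
  );
}
```

Because the view's default collation cannot be overridden, later reads on such a view would compare strings with the "fr" collation.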
Collation and Index Use
To use an index for string comparisons, an operation must also specify the same collation. That is, an index with a collation cannot support an operation that performs string comparisons on the indexed fields if the operation specifies a different collation.
For example, the collection myColl has an index on a string field category with the collation locale "fr".
db.myColl.createIndex( { category: 1 }, { collation: { locale: "fr" } } )
The following query operation, which specifies the same collation as the index, can use the index:
db.myColl.find( { category: "cafe" } ).collation( { locale: "fr" } )
However, the following query operation, which by default uses the "simple" binary collator, cannot use the index:
db.myColl.find( { category: "cafe" } )
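One way to confirm which of the two queries uses the index is to inspect the output of explain(). The sketch below is a hypothetical helper that walks a winning plan looking for an IXSCAN stage. It assumes the classic plan shape with stage, inputStage, and inputStages fields; the exact explain output varies by server version, so treat it as illustrative only.

```javascript
// Returns true if any stage in an explain() plan tree is an index scan.
// Assumes the plan nests children under `inputStage` or `inputStages`.
function planUsesIndex(plan) {
  if (!plan || typeof plan !== "object") return false;
  if (plan.stage === "IXSCAN") return true;
  const children = [plan.inputStage, ...(plan.inputStages || [])];
  return children.some((child) => planUsesIndex(child));
}

// Assumed usage in mongosh:
// const out = db.myColl.find({ category: "cafe" })
//   .collation({ locale: "fr" }).explain();
// planUsesIndex(out.queryPlanner.winningPlan);
```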
For a compound index where the index prefix keys are not strings, arrays, or embedded documents, an operation that specifies a different collation can still use the index to support comparisons on the index prefix keys.
For example, the collection myColl has a compound index on the numeric fields score and price and the string field category; the index is created with the collation locale "fr" for string comparisons:
db.myColl.createIndex(
{ score: 1, price: 1, category: 1 },
{ collation: { locale: "fr" } } )
The following operations, which use "simple" binary collation for string comparisons, can use the index:
db.myColl.find( { score: 5 } ).sort( { price: 1 } )
db.myColl.find( { score: 5, price: { $gt: NumberDecimal( "10" ) } } ).sort( { price: 1 } )
The following operation, which uses "simple" binary collation for string comparisons on the indexed category field, can use the index to fulfill only the score: 5 portion of the query:
db.myColl.find( { score: 5, category: "cafe" } )
Collation and Unsupported Index Types
The following indexes only support simple binary comparison and do not support collation:
To create a text or 2d index on a collection that has a non-simple collation, you must explicitly specify {collation: {locale: "simple"} } when creating the index.
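As a sketch of that requirement, the function below creates a text index while overriding the collection's default collation with "simple". The parameter coll stands for a mongosh collection object, and the field name description and the function name createSimpleTextIndex are hypothetical.

```javascript
// Sketch for mongosh; `coll` is a collection whose default collation
// is not "simple". The field name `description` is hypothetical.
function createSimpleTextIndex(coll) {
  return coll.createIndex(
    { description: "text" },
    { collation: { locale: "simple" } } // override the collection default
  );
}
```

In mongosh this would be called as createSimpleTextIndex(db.myColl), assuming myColl was created with a non-simple default collation.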
Restrictions
numericOrdering
When specifying numericOrdering as true, the following restrictions apply:
Only contiguous non-negative integer substrings of digits are considered in the comparisons. numericOrdering does not support:
+
-
decimal separators, like decimal points and decimal commas
exponents
Only Unicode code points in the Number or Decimal Digit (Nd) category are treated as digits.
If a digit length exceeds 254 characters, the excess characters are treated as a separate number.
Consider a collection with the following number and decimal values stored as strings:
db.c.insertMany(
[
{ "n" : "1" },
{ "n" : "2" },
{ "n" : "2.1" },
{ "n" : "-2.1" },
{ "n" : "2.2" },
{ "n" : "2.10" },
{ "n" : "2.20" },
{ "n" : "-10" },
{ "n" : "10" },
{ "n" : "20" },
{ "n" : "20.1" }
]
)
The following find query uses a collation document containing the numericOrdering parameter:
db.c.find(
{ }, { _id: 0 }
).sort(
{ n: 1 }
).collation( {
locale: 'en_US',
numericOrdering: true
} )
The operation returns the following results:
[
{ n: '-2.1' },
{ n: '-10' },
{ n: '1' },
{ n: '2' },
{ n: '2.1' },
{ n: '2.2' },
{ n: '2.10' },
{ n: '2.20' },
{ n: '10' },
{ n: '20' },
{ n: '20.1' }
]
numericOrdering: true sorts the string values in ascending order as if they were numeric values, within the restrictions above.
The two negative values -2.1 and -10 are not sorted in the expected numeric order because they contain the unsupported - character.
The value 2.2 is sorted before the value 2.10 because the numericOrdering parameter does not support decimal values. The digits on each side of the decimal point are compared as separate integers, so the parts after the point compare as 2 and 10 rather than as the fractions 0.2 and 0.10.
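The ordering above can be reproduced with a deliberately simplified model of the digit-run rule. The comparator below is a sketch, not the ICU algorithm MongoDB uses: it treats only ASCII digits as digits (rather than every Nd code point), ignores the 254-character limit, and compares all other characters by code unit instead of by locale rules. The function names tokenize and compareNumericOrdering are hypothetical.

```javascript
// Simplified model, not ICU: split a string into runs of ASCII digits
// and single non-digit characters.
function tokenize(s) {
  return s.match(/\d+|\D/g) || [];
}

function compareNumericOrdering(a, b) {
  const ta = tokenize(a);
  const tb = tokenize(b);
  const n = Math.min(ta.length, tb.length);
  for (let i = 0; i < n; i++) {
    const x = ta[i];
    const y = tb[i];
    if (/^\d/.test(x) && /^\d/.test(y)) {
      // Two digit runs compare by integer value, so "2" sorts before "10".
      const dx = BigInt(x);
      const dy = BigInt(y);
      if (dx !== dy) return dx < dy ? -1 : 1;
    } else if (x !== y) {
      // "-" and "." are ordinary characters here, not numeric syntax.
      return x < y ? -1 : 1;
    }
  }
  // If one token list is a prefix of the other, the shorter sorts first.
  return ta.length - tb.length;
}

const values = ["1", "2", "2.1", "-2.1", "2.2", "2.10", "2.20",
                "-10", "10", "20", "20.1"];
const sorted = [...values].sort(compareNumericOrdering);
// sorted: -2.1, -10, 1, 2, 2.1, 2.2, 2.10, 2.20, 10, 20, 20.1
```

Under this model, -2.1 precedes -10 because the leading - characters tie and the digit runs 2 and 10 are then compared as integers, which matches the result the query returns.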