Bulk Write Operations
Overview
MongoDB provides clients the ability to perform write operations in bulk. Bulk write operations affect a single collection. MongoDB allows applications to determine the acceptable level of acknowledgement required for bulk write operations.
The db.collection.bulkWrite() method provides the ability to perform bulk insert, update, and delete operations.

MongoDB also supports bulk insert through the db.collection.insertMany() method.
Ordered vs Unordered Operations
Bulk write operations can be either ordered or unordered.

With an ordered list of operations, MongoDB executes the operations serially. If an error occurs during the processing of one of the write operations, MongoDB returns without processing any remaining write operations in the list. See Ordered Bulk Write Example.

With an unordered list of operations, MongoDB can execute the operations in parallel, but this behavior is not guaranteed. If an error occurs during the processing of one of the write operations, MongoDB continues to process the remaining write operations in the list. See Unordered Bulk Write Example.

Executing an ordered list of operations on a sharded collection will generally be slower than executing an unordered list, since with an ordered list each operation must wait for the previous operation to finish.
By default, bulkWrite() performs ordered operations. To specify unordered write operations, set ordered : false in the options document.

See Execution of Operations.
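The stop-on-error versus continue-on-error behavior can be sketched in plain JavaScript. This is a simulation for illustration only, not MongoDB's actual implementation:

```javascript
// Simulation of ordered vs. unordered bulk execution semantics.
// Each "operation" is modeled as a function that either succeeds or throws.
function runBulk(operations, { ordered = true } = {}) {
  const result = { nProcessed: 0, writeErrors: [] };
  for (let i = 0; i < operations.length; i++) {
    try {
      operations[i]();
      result.nProcessed++;
    } catch (err) {
      result.writeErrors.push({ index: i, errmsg: err.message });
      if (ordered) break; // ordered: stop at the first error
      // unordered: record the error and keep going
    }
  }
  return result;
}

const ops = [
  () => {},                                    // succeeds
  () => { throw new Error("duplicate key"); }, // fails
  () => {},                                    // succeeds
];

const orderedResult = runBulk(ops, { ordered: true });
// orderedResult.nProcessed === 1 (the third op never runs)

const unorderedResult = runBulk(ops, { ordered: false });
// unorderedResult.nProcessed === 2 (the third op still runs)
```

Either way, errors are reported back to the caller; the modes differ only in whether the remaining operations are attempted.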
bulkWrite() Methods

bulkWrite() supports the following write operations:

insertOne
updateOne
updateMany
replaceOne
deleteOne
deleteMany
Each write operation is passed to bulkWrite() as a document in an array.
Example
The example in this section uses the pizzas collection:
db.pizzas.insertMany( [
   { _id: 0, type: "pepperoni", size: "small", price: 4 },
   { _id: 1, type: "cheese", size: "medium", price: 7 },
   { _id: 2, type: "vegan", size: "large", price: 8 }
] )
The following bulkWrite() example runs these operations on the pizzas collection:

Adds two documents using insertOne.
Updates a document using updateOne.
Deletes a document using deleteOne.
Replaces a document using replaceOne.
try {
   db.pizzas.bulkWrite( [
      { insertOne: { document: { _id: 3, type: "beef", size: "medium", price: 6 } } },
      { insertOne: { document: { _id: 4, type: "sausage", size: "large", price: 10 } } },
      { updateOne: {
         filter: { type: "cheese" },
         update: { $set: { price: 8 } }
      } },
      { deleteOne: { filter: { type: "pepperoni" } } },
      { replaceOne: {
         filter: { type: "vegan" },
         replacement: { type: "tofu", size: "small", price: 4 }
      } }
   ] )
} catch( error ) {
   print( error )
}
Example output, which includes a summary of the completed operations:
{
   acknowledged: true,
   insertedCount: 2,
   insertedIds: { '0': 3, '1': 4 },
   matchedCount: 2,
   modifiedCount: 2,
   deletedCount: 1,
   upsertedCount: 0,
   upsertedIds: {}
}
For more examples, see bulkWrite() Examples.
Strategies for Bulk Inserts to a Sharded Collection
Large bulk insert operations, including initial data inserts or routine data import, can affect sharded cluster performance. For bulk inserts, consider the following strategies:
Pre-Split the Collection
If the sharded collection is empty, then the collection has only one initial chunk, which resides on a single shard. MongoDB must then take time to receive data, create splits, and distribute the split chunks to the available shards. To avoid this performance cost, you can pre-split the collection, as described in Split Chunks in a Sharded Cluster.
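As a hypothetical mongosh sketch (the namespace, shard key, and split points are illustrative, and the sketch assumes sharding is already enabled for the database and the collection uses a ranged shard key):

```javascript
// Shard an empty collection on a ranged key, then pre-split it so that
// chunks can be distributed across shards before the bulk insert begins.
sh.shardCollection( "test.pizzas", { price: 1 } )

// Illustrative split points; choose values that match your data's range.
sh.splitAt( "test.pizzas", { price: 5 } )
sh.splitAt( "test.pizzas", { price: 10 } )
```

These commands require a running sharded cluster, so they are shown as a sketch rather than a standalone runnable script.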
Unordered Writes to mongos

To improve write performance to sharded clusters, use bulkWrite() with the optional parameter ordered set to false. mongos can attempt to send the writes to multiple shards simultaneously. For empty collections, first pre-split the collection as described in Split Chunks in a Sharded Cluster.
Avoid Monotonic Throttling
If your shard key increases monotonically during an insert, then all inserted data goes to the last chunk in the collection, which will always end up on a single shard. Therefore, the insert capacity of the cluster will never exceed the insert capacity of that single shard.
If your insert volume is larger than what a single shard can process, and if you cannot avoid a monotonically increasing shard key, then consider the following modifications to your application:

Reverse the binary bits of the shard key. This preserves the information and avoids correlating insertion order with increasing sequence of values.
Swap the first and last 16-bit words to "shuffle" the inserts.
The following example, in C++, swaps the leading and trailing 16-bit words of generated BSON ObjectIds so they are no longer monotonically increasing.
#include <utility>  // std::swap

using namespace mongo;

OID make_an_id() {
   OID x = OID::gen();
   // getData() exposes the 12 ObjectId bytes; swap the leading and
   // trailing 16-bit words so generated ids no longer sort monotonically
   unsigned char *p = const_cast<unsigned char *>( x.getData() );
   std::swap( *reinterpret_cast<unsigned short *>( p ),
              *reinterpret_cast<unsigned short *>( p + 10 ) );
   return x;
}

void foo() {
   // create an object with the shuffled id
   BSONObj o = BSON( "_id" << make_an_id() << "x" << 3 << "name" << "jane" );
   // now we may insert o into a sharded collection
}
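The same word-swap idea can be expressed in plain JavaScript, operating on a 12-byte ObjectId-style value supplied as a byte array (a sketch for illustration; the byte-layout assumption is noted in the comments):

```javascript
// Swap the leading and trailing 16-bit words of a 12-byte id so that
// ids generated in sequence no longer sort monotonically.
// Assumes the id is supplied as an array of 12 byte values.
function shuffleId(bytes) {
  if (bytes.length !== 12) throw new Error("expected a 12-byte id");
  const out = bytes.slice(); // copy; do not mutate the input
  // bytes 0-1 form the leading 16-bit word, bytes 10-11 the trailing one
  [out[0], out[10]] = [out[10], out[0]];
  [out[1], out[11]] = [out[11], out[1]];
  return out;
}

const id = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11];
const shuffled = shuffleId(id);
// shuffled begins with bytes 10, 11 and ends with bytes 0, 1;
// the middle bytes are unchanged
```

Because the swap is its own inverse, applying shuffleId again recovers the original id if you ever need the natural ordering back.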
See also:

Shard Keys for information on choosing a shard key. Also see Shard Key Internals (in particular, Choose a Shard Key).