Perform Long-Running Snapshot Queries

Snapshot queries allow you to read data as it appeared at a single point in time in the recent past.

Starting in MongoDB 5.0, you can use read concern "snapshot" to query data on secondary nodes. This feature increases the versatility and resilience of your application's reads. You do not need to create a static copy of your data, move it out into a separate system, and manually isolate these long-running queries so that they do not interfere with your operational workload. Instead, you can perform long-running queries against a live, transactional database while reading from a consistent state of the data.

Using read concern "snapshot" on secondary nodes does not impact your application's write workload. Only application reads benefit from long-running queries being isolated to secondaries.

Use snapshot queries when you want to:

- Perform multiple related queries and ensure that each query reads data from the same point in time.
- Ensure that you read from a consistent state of the data from some point in the past.

Comparing Local and Snapshot Read Concerns

When MongoDB performs long-running queries using the default "local" read concern, the query results may contain data from writes that occur at the same time as the query. As a result, the query may return unexpected or inconsistent results.

To avoid this scenario, create a session and specify read concern "snapshot". With read concern "snapshot", MongoDB runs your query with snapshot isolation, meaning that your query reads data as it appeared at a single point in time in the recent past.

Examples

The examples on this page show how you can use snapshot queries to:

- Run Related Queries From the Same Point in Time
- Read from a Consistent State of the Data from Some Point in the Past

Run Related Queries From the Same Point in Time

Read concern "snapshot" lets you run multiple related queries within a session and ensure that each query reads data from the same point in time.

An animal shelter has a pets database that contains collections for each type of pet. The pets database has these collections:

- cats
- dogs

Each document in each collection contains an adoptable field, indicating whether the pet is available for adoption. For example, a document in the cats collection looks like this:

{
   "name": "Whiskers",
   "color": "white",
   "age": 10,
   "adoptable": true
}

You want to run a query to see the total number of pets available for adoption across all collections. To provide a consistent view of the data, you want to ensure that the data returned from each collection is from a single point in time.

To accomplish this goal, use read concern "snapshot" within a session. The following example uses the MongoDB C driver:

mongoc_client_session_t *cs = NULL;
mongoc_collection_t *cats_collection = NULL;
mongoc_collection_t *dogs_collection = NULL;
int64_t adoptable_pets_count = 0;
bson_error_t error;
mongoc_session_opt_t *session_opts;

cats_collection = mongoc_client_get_collection (client, "pets", "cats");
dogs_collection = mongoc_client_get_collection (client, "pets", "dogs");

/* Seed 'pets.cats' and 'pets.dogs' with example data */
if (!pet_setup (cats_collection, dogs_collection)) {
   goto cleanup;
}

/* start a snapshot session */
session_opts = mongoc_session_opts_new ();
mongoc_session_opts_set_snapshot (session_opts, true);
cs = mongoc_client_start_session (client, session_opts, &error);
mongoc_session_opts_destroy (session_opts);
if (!cs) {
   MONGOC_ERROR ("Could not start session: %s", error.message);
   goto cleanup;
}

/*
 * Perform the following aggregation pipeline, and accumulate the count in
 * `adoptable_pets_count`.
 *
 *  adoptablePetsCount = db.cats.aggregate(
 *      [ { "$match": { "adoptable": true } },
 *        { "$count": "adoptableCatsCount" } ], session=s
 *  ).next()["adoptableCatsCount"]
 *
 *  adoptablePetsCount += db.dogs.aggregate(
 *      [ { "$match": { "adoptable": true } },
 *        { "$count": "adoptableDogsCount" } ], session=s
 *  ).next()["adoptableDogsCount"]
 *
 * Remember in order to apply the client session to
 * this operation, you must append the client session to the options passed
 * to `mongoc_collection_aggregate`, i.e.,
 *
 * mongoc_client_session_append (cs, &opts, &error);
 * cursor = mongoc_collection_aggregate (
 *    collection, MONGOC_QUERY_NONE, pipeline, &opts, NULL);
 */
accumulate_adoptable_count (cs, cats_collection, &adoptable_pets_count);
accumulate_adoptable_count (cs, dogs_collection, &adoptable_pets_count);

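printf ("there are %" PRId64 " adoptable pets\n", adoptable_pets_count);

cleanup:
/* Release the session and collection handles on every exit path. */
if (cs) {
   mongoc_client_session_destroy (cs);
}
if (dogs_collection) {
   mongoc_collection_destroy (dogs_collection);
}
if (cats_collection) {
   mongoc_collection_destroy (cats_collection);
}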

The preceding code:

- Gets handles to the cats and dogs collections in the pets database from an existing client connection to the MongoDB deployment.
- Starts a session. The code calls mongoc_session_opts_set_snapshot() with true, so the session uses read concern "snapshot".
- Performs these actions for each collection in the pets database:
  - Uses $match to filter for documents where the adoptable field is true.
  - Uses $count to return a count of the filtered documents.
  - Adds the count from the database to the adoptable_pets_count variable.
- Prints the adoptable_pets_count variable.

All queries within the session read data as it appeared at the same point in time. As a result, the final count reflects a consistent snapshot of the data.
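
To see why reading both collections from one point in time matters, consider the following toy model in plain C. It is only a sketch and does not connect to MongoDB: the version_t and pet_t types, the adoptable_at() and count_adoptable_at() functions, the logical timestamps, and the pets Tom, Rex, and Fido are all made up for illustration. Each pet keeps a short history of versions, and a read at a given time uses the newest version committed at or before that time.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* One committed version of a pet document (hypothetical toy model). */
typedef struct {
   int ts;         /* logical commit time of this version */
   bool adoptable; /* value of the "adoptable" field in this version */
} version_t;

/* A pet document with its version history, ordered oldest to newest. */
typedef struct {
   const version_t *versions;
   size_t n_versions;
} pet_t;

/* Was the pet adoptable as of read_ts? Uses the newest version whose
 * commit time is <= read_ts; a pet with no such version is not counted. */
bool
adoptable_at (const pet_t *pet, int read_ts)
{
   bool adoptable = false;
   assert (pet != NULL);
   for (size_t i = 0; i < pet->n_versions; i++) {
      if (pet->versions[i].ts > read_ts) {
         break;
      }
      adoptable = pet->versions[i].adoptable;
   }
   return adoptable;
}

/* Count the pets in one collection that were adoptable as of read_ts. */
int
count_adoptable_at (const pet_t *pets, size_t n, int read_ts)
{
   int count = 0;
   assert (pets != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (adoptable_at (&pets[i], read_ts)) {
         count++;
      }
   }
   return count;
}

/* Made-up histories: Tom is adopted at time 2, and Fido becomes
 * adoptable at time 2. Whiskers and Rex never change. */
static const version_t whiskers_v[] = {{0, true}};
static const version_t tom_v[] = {{0, true}, {2, false}};
static const version_t rex_v[] = {{0, true}};
static const version_t fido_v[] = {{0, false}, {2, true}};

const pet_t cats[] = {{whiskers_v, 1}, {tom_v, 2}};
const pet_t dogs[] = {{rex_v, 1}, {fido_v, 2}};
```

With these made-up histories, counting cats and dogs both at time 1 gives 3, and counting both at time 3 also gives 3. Counting cats at time 1 and dogs at time 3, which is what two unrelated reads could do, gives 4, a total that never existed at any single moment.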

Note

If the session lasts longer than the WiredTiger history retention period (300 seconds, by default), the query fails with a SnapshotTooOld error. To learn how to configure snapshot retention and enable longer-running queries, see Configure Snapshot Retention.

Read from a Consistent State of the Data from Some Point in the Past

Read concern "snapshot" ensures that your query reads data as it appeared at some single point in time in the recent past.

An online shoe store has a sales collection that contains data for each item sold at the store. For example, a document in the sales collection looks like this:

{
   "shoeType": "boot",
   "price": 30,
   "saleDate": ISODate("2022-02-02T06:01:17.171Z")
}

Each day at midnight, a query runs to see how many pairs of shoes were sold that day. The daily sales query, shown here with the MongoDB C driver, looks like this:

mongoc_client_session_t *cs = NULL;
mongoc_collection_t *sales_collection = NULL;
bson_error_t error;
mongoc_session_opt_t *session_opts;
bson_t *pipeline = NULL;
bson_t opts = BSON_INITIALIZER;
mongoc_cursor_t *cursor = NULL;
const bson_t *doc = NULL;
bool ok = true;
bson_iter_t iter;
int64_t total_sales = 0;

sales_collection = mongoc_client_get_collection (client, "retail", "sales");

/* seed 'retail.sales' with example data */
if (!retail_setup (sales_collection)) {
   goto cleanup;
}

/* start a snapshot session */
session_opts = mongoc_session_opts_new ();
mongoc_session_opts_set_snapshot (session_opts, true);
cs = mongoc_client_start_session (client, session_opts, &error);
mongoc_session_opts_destroy (session_opts);
if (!cs) {
   MONGOC_ERROR ("Could not start session: %s", error.message);
   goto cleanup;
}

if (!mongoc_client_session_append (cs, &opts, &error)) {
   MONGOC_ERROR ("could not apply session options: %s", error.message);
   goto cleanup;
}

pipeline = BCON_NEW ("pipeline",
                     "[",
                     "{",
                     "$match",
                     "{",
                     "$expr",
                     "{",
                     "$gt",
                     "[",
                     "$saleDate",
                     "{",
                     "$dateSubtract",
                     "{",
                     "startDate",
                     "$$NOW",
                     "unit",
                     BCON_UTF8 ("day"),
                     "amount",
                     BCON_INT64 (1),
                     "}",
                     "}",
                     "]",
                     "}",
                     "}",
                     "}",
                     "{",
                     "$count",
                     BCON_UTF8 ("totalDailySales"),
                     "}",
                     "]");

cursor = mongoc_collection_aggregate (
   sales_collection, MONGOC_QUERY_NONE, pipeline, &opts, NULL);
bson_destroy (&opts);

ok = mongoc_cursor_next (cursor, &doc);

if (mongoc_cursor_error (cursor, &error)) {
   MONGOC_ERROR ("could not get totalDailySales: %s", error.message);
   goto cleanup;
}

if (!ok) {
   MONGOC_ERROR ("%s", "cursor has no results");
   goto cleanup;
}

ok = bson_iter_init_find (&iter, doc, "totalDailySales");
if (ok) {
   total_sales = bson_iter_as_int64 (&iter);
} else {
   MONGOC_ERROR ("%s", "missing key: 'totalDailySales'");
   goto cleanup;
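}

cleanup:
/* Free the cursor, pipeline, session, and collection on every exit path. */
if (cursor) {
   mongoc_cursor_destroy (cursor);
}
if (pipeline) {
   bson_destroy (pipeline);
}
if (cs) {
   mongoc_client_session_destroy (cs);
}
if (sales_collection) {
   mongoc_collection_destroy (sales_collection);
}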

The preceding query:

- Uses $match with $expr to specify a filter on the saleDate field.
  - $expr allows the use of aggregation expressions (such as $$NOW) in the $match stage.
- Uses the $gt operator and $dateSubtract expression to return documents where the saleDate is later than one day before the time the query is executed.
- Uses $count to return a count of the matching documents. The count is returned in the totalDailySales field, which the code reads into the total_sales variable.
- Specifies read concern "snapshot" to ensure that the query reads from a single point in time.
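
The date filter in the pipeline can be sketched in plain C as follows. This is an illustration only and does not use the driver: the function names sold_in_last_day() and count_daily_sales() are hypothetical, and times are simplified to whole seconds since the Unix epoch in UTC, with one day taken to be 86400 seconds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption for this sketch: one day is exactly 86400 seconds (UTC). */
#define SECONDS_PER_DAY ((int64_t) 86400)

/* Same comparison as { $gt: [ "$saleDate", <$$NOW minus one day> ] },
 * with both times given as seconds since the Unix epoch. */
bool
sold_in_last_day (int64_t sale_date_s, int64_t now_s)
{
   return sale_date_s > now_s - SECONDS_PER_DAY;
}

/* Counterpart of the $count stage: number of sales that pass the filter. */
int64_t
count_daily_sales (const int64_t *sale_dates_s, size_t n, int64_t now_s)
{
   int64_t total = 0;
   assert (sale_dates_s != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (sold_in_last_day (sale_dates_s[i], now_s)) {
         total++;
      }
   }
   return total;
}
```

Because the comparison is a strict "greater than", a sale made exactly 24 hours before the query time is excluded, while a sale made one second later is counted.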

The sales collection is quite large, and as a result this query may take a few minutes to run. Because the store is online, sales can occur at any time of day.

For example, consider if:

- The query begins executing at 12:00 AM.
- A customer buys three pairs of shoes at 12:02 AM.
- The query finishes executing at 12:04 AM.

If the query doesn't use read concern "snapshot", sales that occur between when the query starts and when it finishes can be included in the query count, despite not occurring on the day the report is for. This could result in inaccurate reports, with some sales being counted twice: once in this report and again in the next day's report.

By specifying read concern "snapshot", the query only returns data that was present in the database at a point in time shortly before the query started executing.
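
The timeline above can be replayed with a small stand-alone C sketch. It is a toy model, not driver code: the sale_t type, the pairs_visible_at() function, and the two sales before midnight are invented for illustration. A read pinned to minute 0 stands in for the snapshot query. A read that can observe writes up to minute 4 stands in for the worst case of a query without snapshot isolation.

```c
#include <assert.h>
#include <stddef.h>

/* One sale and the minute at which it was committed, relative to midnight
 * (negative values fall on the day the report covers). */
typedef struct {
   int pairs;
   int commit_minute;
} sale_t;

/* Total pairs in all sales committed at or before read_minute. */
int
pairs_visible_at (const sale_t *sales, size_t n, int read_minute)
{
   int total = 0;
   assert (sales != NULL || n == 0);
   for (size_t i = 0; i < n; i++) {
      if (sales[i].commit_minute <= read_minute) {
         total += sales[i].pairs;
      }
   }
   return total;
}

/* Two made-up sales before midnight, then the 12:02 AM purchase of
 * three pairs from the example. */
const sale_t example_sales[] = {{1, -300}, {2, -45}, {3, 2}};
const size_t example_sales_len =
   sizeof example_sales / sizeof example_sales[0];
```

Reading at minute 0 returns 3 pairs, which is the correct total for the previous day. Reading with visibility up to minute 4 returns 6, because the 12:02 AM purchase is picked up as well.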

Note

If the query takes longer than the WiredTiger history retention period (300 seconds, by default), the query fails with a SnapshotTooOld error. To learn how to configure snapshot retention and enable longer-running queries, see Configure Snapshot Retention.

Configure Snapshot Retention

By default, the WiredTiger storage engine retains history for 300 seconds. You can use a session with snapshot=true for a total of 300 seconds from the time of the first operation in the session to the last. If you use the session for a longer period of time, the session fails with a SnapshotTooOld error. Similarly, if you query data using read concern "snapshot" and your query lasts longer than 300 seconds, the query fails.
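
The time limit described above can be approximated with a client-side check like the following C sketch. It is not part of any driver API: the function snapshot_likely_too_old() and the constant DEFAULT_HISTORY_WINDOW_S are hypothetical names, and treating exactly 300 seconds as still inside the window is an assumption. The server, not the client, decides when to return SnapshotTooOld.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Default history window described in the text, in seconds. */
#define DEFAULT_HISTORY_WINDOW_S 300

/* Client-side estimate only: returns true when the time between the
 * session's first operation and now exceeds the history window, which is
 * when the text says to expect a SnapshotTooOld error. */
bool
snapshot_likely_too_old (int64_t first_op_s, int64_t now_s, int64_t window_s)
{
   assert (now_s >= first_op_s && window_s >= 0);
   return now_s - first_op_s > window_s;
}
```

For example, a session whose first operation ran 301 seconds ago is past the default 300-second window, but it would still fit inside a 600-second window.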

If your query or session runs for longer than 300 seconds, consider increasing the snapshot retention period. To increase the retention period, modify the minSnapshotHistoryWindowInSeconds parameter.

For example, this command sets the value of minSnapshotHistoryWindowInSeconds to 600 seconds:

db.adminCommand( { setParameter: 1, minSnapshotHistoryWindowInSeconds: 600 } )

Important

To modify minSnapshotHistoryWindowInSeconds for a MongoDB Atlas cluster, you must contact Atlas Support.

Disk Space and History

Increasing the value of minSnapshotHistoryWindowInSeconds increases disk usage because the server must maintain the history of older modified values within the specified time window. The amount of disk space used depends on your workload, with higher-volume workloads requiring more disk space.
