Model IoT Data

The Internet of Things (IoT) is a network of physical objects that are connected to the internet. Many of these devices, like sensors, generate data.

To store and retrieve this data efficiently, you can use the bucket pattern.

The Bucket Pattern

A common method to organize IoT data is to group the data into buckets. Bucketing organizes specific groups of data to help:

  • Discover historical trends,
  • Forecast future trends, and
  • Optimize storage usage.

Common parameters to group data by are:

  • time
  • data source (if you have multiple data sets)
  • customer
  • type of data (for example, transaction type in financial data)

Note

Starting in MongoDB 5.0, time series collections are the recommended collection type for time series data. Do not use the bucket pattern in conjunction with time series collections, as this can degrade performance.

Consider a collection that stores temperature data obtained from a sensor. The sensor records the temperature every minute and stores the data in a collection called temperatures:

// temperatures collection
{
  "_id": 1,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:00:00.000Z"),
  "temperature": 40
}
{
  "_id": 2,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:01:00.000Z"),
  "temperature": 40
}
{
  "_id": 3,
  "sensor_id": 12345,
  "timestamp": ISODate("2019-01-31T10:02:00.000Z"),
  "temperature": 41
}
...

This approach does not scale well in terms of data and index size. For example, if the application requires indexes on the sensor_id and timestamp fields, every incoming reading from the sensor must be indexed, so the indexes grow with each reading.

You can leverage the document model to bucket the data into documents that hold the measurements for a particular timespan. Consider the following updated schema, which buckets the readings taken every minute into hour-long groups:

{
  "_id": 1,
  "sensor_id": 12345,
  "start_date": ISODate("2019-01-31T10:00:00.000Z"),
  "end_date": ISODate("2019-01-31T10:59:59.000Z"),
  "measurements": [
    {
      "timestamp": ISODate("2019-01-31T10:00:00.000Z"),
      "temperature": 40
    },
    {
      "timestamp": ISODate("2019-01-31T10:01:00.000Z"),
      "temperature": 40
    },
    ...
    {
      "timestamp": ISODate("2019-01-31T10:42:00.000Z"),
      "temperature": 42
    }
  ],
  "transaction_count": 42,
  "sum_temperature": 1783
}

This updated schema improves scalability and mirrors how the application actually uses the data. A user likely wouldn't query for a specific temperature reading. Instead, a user would likely query for temperature behavior over the course of an hour or day. The bucket pattern facilitates those queries by grouping the data into uniform time periods.
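
The grouping step can be sketched in plain JavaScript. This is a minimal illustration that runs in memory, not a MongoDB feature: the helper name bucketByHour is hypothetical, buckets are assumed to align to UTC hours, and the field names follow the bucketed schema shown earlier.

```javascript
// Hypothetical helper: groups flat per-minute readings into hour-long buckets.
// Field names mirror the bucketed schema; nothing here talks to a database.
const HOUR_MS = 60 * 60 * 1000;

function bucketByHour(readings) {
  const buckets = new Map();
  for (const { sensor_id, timestamp, temperature } of readings) {
    // Truncate the timestamp to the start of its UTC hour.
    const startMs = Math.floor(timestamp.getTime() / HOUR_MS) * HOUR_MS;
    const key = `${sensor_id}:${startMs}`;
    if (!buckets.has(key)) {
      buckets.set(key, {
        sensor_id,
        start_date: new Date(startMs),
        end_date: new Date(startMs + HOUR_MS - 1000), // hh:59:59, as in the schema
        measurements: [],
        transaction_count: 0,
        sum_temperature: 0
      });
    }
    const bucket = buckets.get(key);
    bucket.measurements.push({ timestamp, temperature });
    bucket.transaction_count += 1;          // computed field: number of readings
    bucket.sum_temperature += temperature;  // computed field: running sum
  }
  return [...buckets.values()];
}

// Sample readings (illustrative values): three in the 10:00 hour, one in the 11:00 hour.
const sampleReadings = [
  { sensor_id: 12345, timestamp: new Date("2019-01-31T10:00:00.000Z"), temperature: 40 },
  { sensor_id: 12345, timestamp: new Date("2019-01-31T10:01:00.000Z"), temperature: 40 },
  { sensor_id: 12345, timestamp: new Date("2019-01-31T10:02:00.000Z"), temperature: 41 },
  { sensor_id: 12345, timestamp: new Date("2019-01-31T11:00:00.000Z"), temperature: 43 }
];
const hourlyBuckets = bucketByHour(sampleReadings);
console.log(hourlyBuckets.length); // 2
```

With one reading per minute, a sensor produces 1,440 flat documents per day but only 24 hourly buckets, which is where the reduction in document and index-entry counts comes from.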

Combine the Computed and Bucket Patterns

The example document contains two computed fields: transaction_count and sum_temperature. If the application frequently needs to retrieve the sum of temperatures for a given hour, maintaining a running total can save application resources. This Computed Pattern approach eliminates the need to calculate the sum each time the data is requested.
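
One way to keep those running totals current is to update them in the same write that appends the reading. The sketch below only builds the filter and update documents for such a write. buildBucketUpdate is a hypothetical helper, and it assumes buckets are keyed by sensor_id plus the start of the UTC hour. $push, $inc, and $setOnInsert are standard MongoDB update operators.

```javascript
// Hypothetical helper: builds the filter and update documents that add one
// reading to its hourly bucket and maintain the computed fields.
function buildBucketUpdate(reading) {
  const HOUR = 60 * 60 * 1000;
  // Assumption: buckets start on UTC hour boundaries.
  const startMs = Math.floor(reading.timestamp.getTime() / HOUR) * HOUR;
  return {
    filter: { sensor_id: reading.sensor_id, start_date: new Date(startMs) },
    update: {
      // Append the raw reading to the bucket.
      $push: {
        measurements: { timestamp: reading.timestamp, temperature: reading.temperature }
      },
      // Keep the running totals in step with the measurements array.
      $inc: { transaction_count: 1, sum_temperature: reading.temperature },
      // Only written when the bucket document is first created.
      $setOnInsert: { end_date: new Date(startMs + HOUR - 1000) }
    }
  };
}

const bucketWrite = buildBucketUpdate({
  sensor_id: 12345,
  timestamp: new Date("2019-01-31T10:42:00.000Z"),
  temperature: 42
});
console.log(JSON.stringify(bucketWrite.filter));
```

In the MongoDB shell, these documents could be passed as db.temperatures.updateOne(bucketWrite.filter, bucketWrite.update, { upsert: true }), so that the first reading of an hour creates the bucket and later readings extend it.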

The pre-aggregated sum_temperature and transaction_count values enable further computations, such as the average temperature (sum_temperature / transaction_count) for a particular bucket. Users are much more likely to query the application for the average temperature between 2:00 and 3:00 PM than for the specific temperature at 2:03 PM. Bucketing and pre-computing certain values allows the application to provide that information more readily.
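
As a sketch of how the pre-aggregated fields might be used, the function below computes an average across one or more buckets without touching the measurements arrays. The name averageTemperature and the sample values are illustrative assumptions, not part of the original schema.

```javascript
// Hypothetical helper: average temperature over any set of buckets,
// using only the computed fields.
function averageTemperature(buckets) {
  let count = 0;
  let sum = 0;
  for (const bucket of buckets) {
    count += bucket.transaction_count;
    sum += bucket.sum_temperature;
  }
  // No readings means there is no meaningful average.
  return count === 0 ? null : sum / count;
}

// Two illustrative hourly buckets with 60 readings each.
const dayBuckets = [
  { transaction_count: 60, sum_temperature: 2400 },
  { transaction_count: 60, sum_temperature: 2520 }
];
console.log(averageTemperature(dayBuckets)); // 41
```

Summing the totals and counts first, rather than averaging the per-bucket averages, keeps the result correct when buckets hold different numbers of readings.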

Time Representations in MongoDB

MongoDB stores times in UTC by default, and converts any local time representations into this form. Applications that must operate or report on some unmodified local time value may store the time zone alongside the UTC timestamp, and compute the original local time in their application logic.

Example

In the MongoDB shell, you can store both the current date and the current client's offset from UTC.

// Store the current date (saved as UTC) with the client's offset from UTC in minutes.
var now = new Date();
db.data.insertOne( { date: now,
                     offset: now.getTimezoneOffset() } );

You can reconstruct the original local time by applying the saved offset:

var record = db.data.findOne();
var localNow = new Date( record.date.getTime() -  ( record.offset * 60000 ) );
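
The same arithmetic can be checked outside the shell with a fixed value. In this sketch, toLocalWallClock is a hypothetical helper and the stored record is made up for illustration. It assumes the offset was captured with getTimezoneOffset(), which returns minutes and is positive for zones behind UTC.

```javascript
// Hypothetical helper: shifts a UTC instant by a stored getTimezoneOffset() value.
function toLocalWallClock(utcDate, offsetMinutes) {
  return new Date(utcDate.getTime() - offsetMinutes * 60000);
}

// Illustrative record: 10:00 UTC saved by a client five hours behind UTC.
const storedRecord = { date: new Date("2019-01-31T10:00:00.000Z"), offset: 300 };
const wallClock = toLocalWallClock(storedRecord.date, storedRecord.offset);

console.log(wallClock.toISOString()); // 2019-01-31T05:00:00.000Z
console.log(wallClock.getUTCHours()); // 5
```

The shifted Date carries the original local wall-clock values in its UTC fields, so read it with the getUTC* methods or toISOString() rather than the local-time getters, which would apply the current machine's time zone a second time.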