Aggregation with User Preference Data

Data Model

Consider a hypothetical sports club whose database contains a users collection. The collection tracks each user's join date and sport preferences, and stores this data in documents that resemble the following:

{
  _id : "jane",
  joined : ISODate("2011-03-02"),
  likes : ["golf", "racquetball"]
}
{
  _id : "joe",
  joined : ISODate("2012-07-02"),
  likes : ["tennis", "golf", "swimming"]
}

Normalize and Sort Documents

The following operation returns user names in upper case and in alphabetical order. The aggregation includes user names for all documents in the users collection. You might do this to normalize user names for processing.

db.users.aggregate(
  [
    { $project : { name:{$toUpper:"$_id"} , _id:0 } },
    { $sort : { name : 1 } }
  ]
)

All documents from the users collection pass through the pipeline, which consists of the following operations:

  • The $project operator:

    • creates a new field called name.
    • converts the value of the _id field to upper case with the $toUpper operator, then stores the result in the name field.
    • suppresses the _id field. $project passes the _id field through by default unless it is explicitly suppressed.
  • The $sort operator orders the results by the name field.

The results of the aggregation would resemble the following:

{
  "name" : "JANE"
},
{
  "name" : "JILL"
},
{
  "name" : "JOE"
}
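
To make the two stages concrete, the following is a minimal sketch that emulates them on an in-memory array in plain JavaScript. It is not how MongoDB executes the pipeline; the clubUsers array and the helper names projectUpperNames and sortByName are hypothetical and exist only for illustration.

```javascript
// Hypothetical in-memory stand-in for the users collection.
const clubUsers = [
  { _id: "joe", joined: new Date("2012-07-02"), likes: ["tennis", "golf", "swimming"] },
  { _id: "jane", joined: new Date("2011-03-02"), likes: ["golf", "racquetball"] },
];

// Emulates { $project: { name: { $toUpper: "$_id" }, _id: 0 } }:
// keep only an upper-cased copy of _id under the new field "name".
function projectUpperNames(docs) {
  return docs.map((doc) => ({ name: doc._id.toUpperCase() }));
}

// Emulates { $sort: { name: 1 } }: ascending sort on a copy of the array.
function sortByName(docs) {
  return [...docs].sort((a, b) => (a.name < b.name ? -1 : a.name > b.name ? 1 : 0));
}

const normalizedNames = sortByName(projectUpperNames(clubUsers));
console.log(normalizedNames); // [ { name: 'JANE' }, { name: 'JOE' } ]
```

Because the output documents contain only the name field, the emulation also shows the effect of suppressing _id in the $project stage.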

Return Usernames Ordered by Join Month

The following aggregation operation returns user names sorted by the month they joined. This kind of aggregation could help generate membership renewal notices.

db.users.aggregate(
  [
    { $project :
       {
         month_joined : { $month : "$joined" },
         name : "$_id",
         _id : 0
       }
    },
    { $sort : { month_joined : 1 } }
  ]
)

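The pipeline passes all documents in the users collection through the following operations:

  • The $project operator:

    • creates two new fields: month_joined and name.
    • suppresses the _id field. $project passes the _id field through by default unless it is explicitly suppressed.
  • The $month operator converts the values of the joined field to integer representations of the month. The $project operator then assigns those values to the month_joined field.
  • The $sort operator sorts the results by the month_joined field.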

The operation returns results that resemble the following:

{
  "month_joined" : 1,
  "name" : "ruth"
},
{
  "month_joined" : 1,
  "name" : "harold"
},
{
  "month_joined" : 1,
  "name" : "kate"
},
{
  "month_joined" : 2,
  "name" : "jill"
}
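
The month extraction can be sketched outside the database as follows. This is an illustrative emulation in plain JavaScript, assuming the joined values are Date objects; the members array and the helper names are hypothetical. $month returns a value from 1 to 12 evaluated in UTC by default, whereas getUTCMonth() is zero-based, so the sketch adds 1.

```javascript
// Hypothetical in-memory stand-in for the users collection.
const members = [
  { _id: "joe", joined: new Date("2012-07-02") },
  { _id: "jane", joined: new Date("2011-03-02") },
];

// Emulates the $project stage: month_joined from $month, name from _id, no _id.
function projectMonthJoined(docs) {
  return docs.map((doc) => ({
    month_joined: doc.joined.getUTCMonth() + 1, // getUTCMonth() is 0-based, $month is 1-based
    name: doc._id,
  }));
}

// Emulates { $sort: { month_joined: 1 } } on a copy of the array.
function sortByMonthJoined(docs) {
  return [...docs].sort((a, b) => a.month_joined - b.month_joined);
}

const membersByMonth = sortByMonthJoined(projectMonthJoined(members));
console.log(membersByMonth);
// [ { month_joined: 3, name: 'jane' }, { month_joined: 7, name: 'joe' } ]
```

Note that the sort key is the month only, so users who joined in the same month of different years end up next to each other.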

Return Total Number of Joins per Month

The following operation shows how many people joined each month of the year. You might use this aggregated data for recruiting and marketing strategies.

db.users.aggregate(
  [
    { $project : { month_joined : { $month : "$joined" } } } ,
    { $group : { _id : {month_joined:"$month_joined"} , number : { $sum : 1 } } },
    { $sort : { "_id.month_joined" : 1 } }
  ]
)

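The pipeline passes all documents in the users collection through the following operations:

  • The $project operator creates a new field called month_joined.
  • The $month operator converts the values of the joined field to integer representations of the month. The $project operator then assigns those values to the month_joined field.
  • The $group operator collects all documents with a given month_joined value and counts them. For each unique value, $group creates a new document with two fields:

    • _id, which contains a nested document with the month_joined field and its value.
    • number, which is a generated field. The $sum operator increments this field by 1 for every document that contains the given month_joined value.
  • The $sort operator sorts the documents created by $group according to the contents of the month_joined field.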

The result of this aggregation operation would resemble the following:

{
  "_id" : {
    "month_joined" : 1
  },
  "number" : 3
},
{
  "_id" : {
    "month_joined" : 2
  },
  "number" : 9
},
{
  "_id" : {
    "month_joined" : 3
  },
  "number" : 5
}
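
The grouping step amounts to counting documents per key. The sketch below emulates it in plain JavaScript with a Map; it is an illustration under stated assumptions, not MongoDB's implementation. The joinRecords array, the join dates given for ruth and harold, and the function name countJoinsPerMonth are hypothetical.

```javascript
// Hypothetical in-memory stand-in for the users collection.
// The dates for "ruth" and "harold" are invented for this sketch.
const joinRecords = [
  { _id: "jane", joined: new Date("2011-03-02") },
  { _id: "joe", joined: new Date("2012-07-02") },
  { _id: "ruth", joined: new Date("2012-01-14") },
  { _id: "harold", joined: new Date("2012-01-21") },
];

// Emulates the $project, $group and $sort stages in one pass.
function countJoinsPerMonth(docs) {
  const counts = new Map();
  for (const doc of docs) {
    const month = doc.joined.getUTCMonth() + 1; // same value $month would yield in UTC
    counts.set(month, (counts.get(month) || 0) + 1); // { $sum: 1 } per document
  }
  return [...counts.entries()]
    .map(([month, number]) => ({ _id: { month_joined: month }, number }))
    .sort((a, b) => a._id.month_joined - b._id.month_joined);
}

const joinsPerMonth = countJoinsPerMonth(joinRecords);
console.log(JSON.stringify(joinsPerMonth));
// [{"_id":{"month_joined":1},"number":2},{"_id":{"month_joined":3},"number":1},{"_id":{"month_joined":7},"number":1}]
```

Months in which nobody joined do not appear in the output, which matches the behavior of $group: it only emits a document for keys that occur in its input.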

Return the Five Most Common "Likes"

The following aggregation collects the five most "liked" activities in the data set. This type of analysis could help inform planning and future development.

db.users.aggregate(
  [
    { $unwind : "$likes" },
    { $group : { _id : "$likes" , number : { $sum : 1 } } },
    { $sort : { number : -1 } },
    { $limit : 5 }
  ]
)

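The pipeline begins with all documents in the users collection, and passes these documents through the following operations:

  • The $unwind operator separates each value in the likes array and creates a new version of the source document for every element in the array.
  • The $group operator collects all documents with the same value for the likes field and counts each grouping. For each unique value, $group creates a new document with two fields:

    • _id, which contains the likes value.
    • number, which is a generated field. The $sum operator increments this field by 1 for every document that contains the given likes value.
  • The $sort operator sorts these documents by the number field in descending order.
  • The $limit operator includes only the first 5 result documents.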

The results of the aggregation would resemble the following:

{
  "_id" : "golf",
  "number" : 33
},
{
  "_id" : "racquetball",
  "number" : 31
},
{
  "_id" : "swimming",
  "number" : 24
},
{
  "_id" : "handball",
  "number" : 19
},
{
  "_id" : "tennis",
  "number" : 18
}
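
The same unwind, group, sort, and limit sequence can be sketched in plain JavaScript. This is an in-memory illustration only; the userLikes array and the function name topLikes are hypothetical, and the sketch assumes that likes, when present, is always an array.

```javascript
// Hypothetical in-memory stand-in for the users collection.
const userLikes = [
  { _id: "jane", likes: ["golf", "racquetball"] },
  { _id: "joe", likes: ["tennis", "golf", "swimming"] },
];

// Emulates $unwind + $group + $sort + $limit.
function topLikes(docs, limit) {
  const counts = new Map();
  for (const doc of docs) {
    // $unwind: one unit of work per array element; documents without a
    // likes array contribute nothing (assumption: likes is an array if present).
    for (const like of doc.likes || []) {
      counts.set(like, (counts.get(like) || 0) + 1); // { $sum: 1 } per element
    }
  }
  return [...counts.entries()]
    .map(([like, number]) => ({ _id: like, number }))
    .sort((a, b) => b.number - a.number) // { $sort: { number: -1 } }
    .slice(0, limit); // { $limit: limit }
}

const mostLiked = topLikes(userLikes, 5);
console.log(mostLiked[0]); // { _id: 'golf', number: 2 }
```

With only two sample users there are four distinct activities, so the limit of 5 has no effect here. The relative order of activities with equal counts depends on the sort implementation and should not be relied on.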