How to Work with the Aggregation Framework in MongoDB?

In the previous article we discussed the aggregation methods in MongoDB. In this article we are going to discuss the aggregation framework, which is available from MongoDB version 2.2.0 onwards.

What is the aggregation framework in MongoDB?

The aggregation framework in MongoDB calculates aggregate values without the need for complex map-reduce operations. It is designed to be both performant and easy to use. The framework is written in C++ and runs natively on the server. Aggregation tasks are built around the concept of the aggregation pipeline: the framework passes documents through a sequence of operations, and each operation transforms the documents as they go through.

The aggregation framework lets you construct a server-side processing pipeline to be run on a collection. A rich set of operations is available for incorporation in the pipeline to achieve various kinds of collection transformations, ranging from simple multi-document calculations to complex projections and pivots. The framework fits nicely into the range of data manipulation tools available in MongoDB, from basic built-in functions like document counts, to map-reduce and JavaScript, to custom code and language-specific packages, including Hadoop.
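To make the pipeline idea concrete, here is a minimal sketch in plain JavaScript. It is not how MongoDB implements the framework; it only models the data flow. The helper name runPipeline, the stage functions and the sample documents are all assumptions made up for this illustration.

```javascript
// Illustration only: a pipeline modelled as an array of functions.
// runPipeline is a hypothetical helper, not a MongoDB API.
function runPipeline(docs, stages) {
  // Each stage receives the complete output of the stage before it.
  return stages.reduce((current, stage) => stage(current), docs);
}

// Made-up sample documents.
const docs = [
  { name: 'sam', balance: 100 },
  { name: 'john', balance: 250 },
  { name: 'sam', balance: 50 }
];

const result = runPipeline(docs, [
  // Stage 1: keep only sam's documents (the role a filtering stage plays).
  (input) => input.filter((d) => d.name === 'sam'),
  // Stage 2: sort by balance, highest first (the role a sorting stage plays).
  (input) => [...input].sort((a, b) => b.balance - a.balance)
]);

console.log(result);
```

The point of the sketch is that the second stage never sees john's document, because the first stage has already removed it. On the real server the same principle applies, only the stages are written as operator documents rather than functions.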

Why should we use the aggregation framework in MongoDB?

Aggregation is extremely important in any database. Imagine we have one table of users and one table of clicks, and we wish to aggregate the clicks per user, grouped by the pages where those clicks occurred.

We could perform this aggregation on the client side. But in that case we would lose the speed that server-side processing gives us. We would also need to pull every record needed for the aggregation over the network, even if we did not intend to use them all. Pulling out those records can generate a huge amount of traffic and prevent other users' requests from being served. This is why data aggregation within the database is so important.

Aggregation Operators:

Now that we are aware of the importance of aggregation, let us discuss the operators in the framework.

Each stage within an aggregation pipeline is independent of the others and takes the full results of the previous stage as its input. To explain better, let's take a basic example:


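db.accounts.aggregate([
    // Total balance per customer per branch
    {$group:{_id:{name:'$name',branch:'$branch'},balance:{$sum:'$balance'}}},
    // Highest totals first
    {$sort:{balance:-1}},
    // First (highest) customer in each branch
    {$group:{_id:'$_id.branch',name:{$first:'$_id.name'},balance:{$first:'$balance'}}}
])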
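To see what the three stages above do to the data, the following plain JavaScript sketch traces them by hand over an in-memory array. It is only an approximation of the stage semantics, not MongoDB code, and the sample accounts (including the branch names) are invented for the illustration.

```javascript
// Made-up sample data standing in for the accounts collection.
const accounts = [
  { name: 'sam', branch: 'north', balance: 100 },
  { name: 'sam', branch: 'north', balance: 300 },
  { name: 'john', branch: 'north', balance: 250 },
  { name: 'john', branch: 'south', balance: 500 }
];

// Stage 1 ($group): total balance per (name, branch) pair.
const totals = new Map();
for (const a of accounts) {
  const key = JSON.stringify([a.name, a.branch]);
  const entry = totals.get(key) ||
    { _id: { name: a.name, branch: a.branch }, balance: 0 };
  entry.balance += a.balance;
  totals.set(key, entry);
}

// Stage 2 ($sort): highest total first.
const sorted = [...totals.values()].sort((x, y) => y.balance - x.balance);

// Stage 3 ($group with $first): the first document seen for each branch
// is the customer with the highest total in that branch.
const topPerBranch = new Map();
for (const t of sorted) {
  if (!topPerBranch.has(t._id.branch)) {
    topPerBranch.set(t._id.branch,
      { _id: t._id.branch, name: t._id.name, balance: t.balance });
  }
}

console.log([...topPerBranch.values()]);
```

With this sample data the final stage leaves one document per branch: john with 500 in south and sam with 400 in north. The $sort in the middle is what makes $first meaningful, since $first simply takes whichever document reaches the group first.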

Matching operator:
The $match operator is very important for any aggregation query. It plays the role of WHERE in an SQL query, or of the query document passed to find() or findOne() in the mongo shell. In the example below, the first $match matches only documents where name is sam, while the second $match matches documents where name is either sam or john.


{$match:{name:'sam'}}

{$match:{$or:[
    {name:'sam'},
    {name: 'john'}
]}}
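The way these two filters select documents can be sketched with a tiny matcher in plain JavaScript. The matches function below is a hypothetical helper that understands only field equality and $or, a very small subset of what $match accepts, and the list of people is made up for the illustration.

```javascript
// Hypothetical helper: supports only field equality and $or.
function matches(doc, query) {
  return Object.entries(query).every(([key, value]) => {
    if (key === '$or') {
      // $or succeeds when at least one sub-query matches.
      return value.some((sub) => matches(doc, sub));
    }
    // Plain field: the document value must equal the query value.
    return doc[key] === value;
  });
}

// Made-up sample documents.
const people = [{ name: 'sam' }, { name: 'john' }, { name: 'alice' }];

const onlySam = people.filter((d) => matches(d, { name: 'sam' }));
const samOrJohn = people.filter((d) =>
  matches(d, { $or: [{ name: 'sam' }, { name: 'john' }] })
);

console.log(onlySam.map((d) => d.name));
console.log(samOrJohn.map((d) => d.name));
```

The first filter keeps only sam, while the second keeps both sam and john and drops alice.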

Grouping:
The $group operator is probably the most commonly used operator when aggregating. Within $group, MongoDB supports quite a few accumulator operators, including $max, $min, $avg and $sum.

Let’s take a look at getting the average balance for each customer in a bank. In the example below the grouping is done on the _id field: the value of _id is the expression to group by, in this case $name. Grouping by name is meaningful here because a person may have more than one account with our bank.


db.accounts.aggregate([
    {$group:{_id:'$name',avg:{$avg:'$balance'}}}
])
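As a rough picture of what this $group stage computes, here is the same averaging done by hand in plain JavaScript. This is a sketch of the semantics only, with invented sample balances, and not the server's implementation.

```javascript
// Made-up sample data: sam holds two accounts, john holds one.
const customerAccounts = [
  { name: 'sam', balance: 100 },
  { name: 'sam', balance: 300 },
  { name: 'john', balance: 250 }
];

// Keep a running sum and count per name, since an average needs both.
const groups = new Map();
for (const { name, balance } of customerAccounts) {
  const g = groups.get(name) || { sum: 0, count: 0 };
  g.sum += balance;
  g.count += 1;
  groups.set(name, g);
}

// Shape the output like the $group stage: one document per distinct _id.
const averages = [...groups].map(([name, g]) => ({
  _id: name,
  avg: g.sum / g.count
}));

console.log(averages);
```

Here sam's two accounts collapse into a single document with an average of 200, while john's single account gives an average of 250.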

Unwinding:
MongoDB favors embedding certain data in arrays inside a document. Unwinding this embedded data allows us to use it with the other operators. $unwind takes an array field and outputs one copy of the parent document for each element of the array. As an example:

We will take a document with an embedded array:


{
    _id:{},
    name: '',
    addresses: [
        {number:24},
        {number:25}
    ]
}

If we were to $unwind the addresses field with {$unwind:'$addresses'}, we would get this result:


[{
    _id:{},
    name: '',
    addresses: {number:24}
},{
    _id:{},
    name: '',
    addresses: {number:25}
}]

This makes it possible to apply operators such as $match, $sort and $limit to the individual elements of the embedded array.
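The effect of unwinding, and why it helps with sorting and limiting, can be sketched in plain JavaScript. The unwind function below is a hypothetical helper that mimics the behaviour for one array field: the array is replaced by a single element in each copy of the parent. The _id and name values are made up for the illustration.

```javascript
// Made-up document with an embedded array, like the one above.
const doc = {
  _id: 1,
  name: 'sam',
  addresses: [{ number: 24 }, { number: 25 }]
};

// Hypothetical helper mirroring $unwind for one array field.
function unwind(docs, field) {
  return docs.flatMap((d) =>
    // Copy the parent once per element, replacing the array with that element.
    (d[field] || []).map((element) => ({ ...d, [field]: element }))
  );
}

const unwound = unwind([doc], 'addresses');

// After unwinding, each address is its own document,
// so it can be sorted and limited like any other.
const highest = [...unwound]
  .sort((a, b) => b.addresses.number - a.addresses.number)
  .slice(0, 1);

console.log(unwound);
console.log(highest);
```

The one input document becomes two, one per address, and the sort plus slice (standing in for $sort and $limit) then picks the copy whose address number is 25.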

Putting it all together, we can build up complex aggregation queries by chaining multiple $match, $group, $project and other operators. Now try it yourself and see what you can accomplish with the framework.
