Introduction

In this post, we’re going to take a look at how to do MongoDB aggregation queries easily with the Aggregation Editor in Studio 3T (formerly MongoChef). We’re going to build a query based on the freely available housing data from the City of Chicago Data Portal.

Download our MongoDB GUI now if you haven’t already. It’s available for Windows, Mac, and Linux, and the Aggregation Editor feature is available on all three editions – Core, Pro, and Enterprise.

Prefer to watch? Check out our video tutorial.

Create the MongoDB Aggregation Query

Once we’ve opened up Studio 3T and connected to the database, we can select the collection we wish to query:

Select Collection for MongoDB Aggregation Query

We can open the Aggregation Editor by clicking the large ‘Aggregate‘ button in the main tool bar at the top. Then, select ‘Open Aggregation Screen‘ from the right-click context menu or press the ‘F4‘ shortcut key.

We now have an empty MongoDB aggregation query ready to be filled up, so let’s get cracking!

New Pipeline MongoDB Aggregation Query

Identify the Question We Want to Answer

The question we want to ask of our data is simple:

Which zip codes have the greatest number of senior housing units available?

To think how we’ll answer this and how we’ll form our query, let’s take a look at the data.

Click ‘Execute full pipeline‘ (executing an empty pipeline simply shows the contents of the collection).

Full Pipeline Results MongoDB Aggregation Query

If you prefer a JSON view of the data (and Studio 3T supports dynamically switching between table, tree and JSON views of your result data), it’s included below:

{ 
    "_id" : ObjectId("544f9533d4c6dc758c28fde4"), 
    "community_area" : {
        "name" : "Albany Park", 
        "number" : 14
    }, 
    "property" : {
        "type" : "Senior", 
        "name" : "Mayfair Commons"
    }, 
    "address" : "4444 W. Lawrence Ave.", 
    "zip_code" : "60630", 
    "phone_number" : "773-205-7862", 
    "management_company" : "Metroplex, Inc.", 
    "units" : 97, 
    "location" : {
        "x_coordinate" : 1145674.7538177613, 
        "y_coordinate" : 1931569.979044555, 
        "latitude" : 41.9682242321, 
        "longitude" : -87.7397474866, 
        "description" : "4444 W Lawrence Ave\n(41.968224232060564, -87.73974748655358)"
    }
}

OK, so we can see we have the fields we need. We can check "property.type" to see that it’s senior housing, and "zip_code" and "units" give us the zip code and number of available units there are, respectively.

To answer our question, we need to combine these into the right aggregation query. But first, let’s get familiar with MongoDB aggregation pipeline operators.

MongoDB Aggregation Pipeline Operators

A full list of the supported operators and their meaning is available here: http://docs.mongodb.org/manual/meta/aggregation-quick-reference/.

This link is always readily available by clicking ‘Operator Quick Reference’ in the app.

Additionally, here’s a list of the new aggregation pipeline operators that came out with MongoDB 3.4, which both IntelliShell and the Aggregation Editor fully support.

OPERATORDESCRIPTION
$bucketCategorizes incoming documents into groups, called buckets, based on a specified expression and bucket boundaries.
$bucketAutoCategorizes incoming documents into a specific number of groups, called buckets, based on a specified expression. Bucket boundaries are automatically determined in an attempt to evenly distribute the documents into the specified number of buckets.
$facetProcesses multiple aggregation pipelines within a single stage on the same set of input documents. Enables the creation of multi-faceted aggregations capable of characterizing data across multiple dimensions, or facets, in a single stage.
$sortByCountCategorizes or groups incoming documents by a specified expression to compute the count for each group. Output documents are sorted in descending order by the count.
$graphLookupPerforms a recursive search on a collection. To each output document, adds a new array field that contains the traversal results of the recursive search for that document.

Now we’re ready to create the first stage of our query where we’ll match against the senior property type

Add a New Stage

Click ‘Add New Stage‘ and you’ll see a new stage in the ‘Pipeline‘ tab.

New Stage MongoDB Aggregation Query

Double click the new stage to edit it (or simply select the ‘Stage 1‘ tab):

Match Operator MongoDB Aggregation Query

The screenshot above jumps ahead a little bit as the stage specification has already been filled, but let’s break down each piece in turn.

First, notice the ‘$match‘ in the combo box. It’s here where we select the stage’s ‘operator’.

A stage operator defines what the stage actually does. The ‘$match‘ operator takes the input set of documents and outputs only those that match the given criteria. It is essentially a filter.

For convenience, the specification of the Stage 1 ‘$match‘ operator is repeated below:

{
    "property.type": "Senior"
}

In the stage’s specification, we can see that we are matching against the "Senior" property type, meaning only documents with a value of "Senior" for the field "property.type" will be passed onto the (yet to be created) next stage of the MongoDB aggregation pipeline for further processing.

We can check the output of this and any other stage at any time by clicking ‘Show output from the selected stage‘. Similarly, we can see the input of any stage at any time by clicking ‘Show input to the selected stage‘. This is a really nice and convenient feature, as it makes keeping track of the precise form of the data we are working at each stage in the aggregation pipeline really easy.

We can see in the ‘Stage 1 output‘ tab that we have the results we need from this stage, and so let’s go on and create the next.

Grouping Results

We now need a way to group together the results from Stage 1 on zip code and then add up each of the available units figures. The ‘$group‘ operator is exactly what we need for this.

Group Operator MongoDB Aggregation Query

The Stage 2 ‘$group‘ operator specification is repeated below:

{
    _id: "$zip_code",
    totalUnits: { $sum: "$units" }
}

The specification of Stage 2 states that the output of this stage will be documents that have an “_id" with a distinct zip code as a value and so will group together documents input to this stage that have the same zip code, and a “totalUnits" field whose value is the sum of all the "units" field values from each of the documents in the group.

We can see the input to and output from tabs for this stage in the screenshot and can confirm that a reduction has taken place – of the 70 documents input to this stage, there were 36 distinct zip codes, and so the corresponding 36 documents are output from this stage.

Finding the Answer

As we want to know the zip codes that have the greatest number of senior housing units available, it would be convenient to sort the results from the greatest to the least total units available.

To do this, we’ll create a third stage using the ‘$sort‘ operator with the following specification, giving us exactly what we want:

{
    totalUnits: -1
}

Full Query and Results MongoDB Aggregation Query

Going back to the ‘Pipeline‘ tab we can see the result of the execution of the full query, as well as the full query itself, all in one one place. We can see we have the expected number of results from the full aggregation pipeline, and we can now answer our question – we have a list of the zip codes that have the greatest number of senior housing units available.

Wasn’t that easy? 🙂

Specify Query Options

Depending on your own particular query, you may wish to specify options such as to use a database cursor for the results (if the results are large), allow the query to write temporary intermediate results to disk or rather than run the query, explain aspects of the processing of the query.

These options can be set in the ‘Options‘ tab. Note that these options only became available in MongoDB 2.6, so if you are connected to a MongoDB 2.4 or earlier instance, the ‘Options‘ tab is not shown.

Options MongoDB Aggregation Query

Share MongoDB Aggregation Queries

The aggregation queries you have created can be saved to and loaded from file, so not only can you reload them in future sessions, but you can also share them with other colleagues and users.

Save aggregate queries directly to a file using Studio 3T's IntelliShell
You can also easily copy and paste aggregate queries to and from the clipboard.

Copy and paste aggregate queries to and from the clipboard using IntelliShell

There is also a preview of the raw MongoDB script of the aggregation query available by selecting ‘Show Query Preview‘ from the context menu. This can be rather handy if you simply wish to examine the raw underlying script, or make a quick copy to stick in an email, or perhaps combine in it in a larger, more complex query in Studio 3T’s IntelliShell.

Query Preview MongoDB Aggregation Query

Handy References

It can take a bit of time to master all the different operators available in the MongoDB aggregation pipeline, so links to the MongoDB Aggregation Pipeline Quick Reference and the Aggregation Section of the MongoDB Manual are always available within a click’s reach directly in the app itself via the ‘Operator Quick Reference‘ and ‘Aggregation Tutorial‘ links, respectively. It won’t be too long before you’re masterfully producing complex MongoDB aggregation queries of your own!

OK, that’s it for this post. As a helpful complement, I recommend reading our short guide on how to temporarily disable and enable aggregation pipeline stages. I hope you feel the same delight as we do about the amazing new features, convenience and boost to productivity the new Aggregation Screen in Studio 3T offers.

Please do check out Studio 3T, the best MongoDB client. A little example of the rich code completion and easy in-line editing experience it offers is shown below:

IntelliShell for MongoDB

 

Please also check out our Schema Explorer and IntelliShell tools at Studio3T.com, as I expect they’ll also help in making you a much more powerful and productive MongoDB user. As always, you can always find the latest tips and updates in our What’s New section.

We’re always very keen to hear about your experiences and ideas for Studio 3T. If you’d like to tell us about them please visit our feedback page or click the ‘Feedback‘ tool bar button in the app.

Editor’s Note: This post was originally published in September 2015.