Aggregation in MongoDB is a way of processing documents in a collection by passing them through stages in an aggregation pipeline. The stages in a pipeline can filter, sort, group, reshape, and modify documents that pass through the pipeline.
In this article, you’ll learn how to build a simple aggregation pipeline by defining stage operators using the Aggregation Editor. The Aggregation Editor is the pipeline editor in Studio 3T for checking and debugging the input and output of every stage in an aggregation pipeline.
If you haven’t already done so, download Studio 3T.
Open Aggregation Editor – F4
Run the entire pipeline – F5
Show Input to this Stage – F6
Show Output from this Stage – F7
Move Selected Stage Up – Shift + F8
Move Selected Stage Down – F8
Add New Stage – Shift + Ctrl + N (Shift + ⌘ + N)
Load query – Ctrl + O (⌘ + O)
Save query – Ctrl + S (⌘ + S)
Save query as – Shift + Ctrl + S (Shift + ⌘ + S)
Open Auto-Completion – Ctrl + Space (^ + Space)
Format code – Ctrl + Alt + L (⌥ + ⌘ + L)
Take the Free Academy 3T MongoDB 301 Aggregation Course and learn to use Aggregation in an interactive tutorial with quizzes to enhance your learning experience.
Basics
To open Aggregation Editor:
- Toolbar – Click on the Aggregate button
- Right-click – Right-click on a target collection and choose Open Aggregation Editor
- Shortcut – Press F4
The Aggregation Editor consists of the following sections: Pipeline, Stage editor, Pipeline output, Stage input/output, Query Code, and Explain.
Pipeline
Pipeline (top left) is where you can see all stages at a glance and add, duplicate, and move them as needed.
Stage editor
The Stage editor (top right) is where you write or edit the aggregate query. When you open a new Aggregation Editor tab, Stage 1 is added automatically, ready for you to enter its details and, if you like, give the stage a name of your choice alongside its number.
Pipeline output
The Pipeline output tab (bottom) is where you can view the output of the full pipeline.
Stage input/output
The Stage input/output tab (bottom) is where the inputs and outputs are displayed in their respective panels, Stage Input and Stage Output.
Query Code
The Query Code tab (bottom) translates aggregation queries – as they were last run – to JavaScript (Node.js), Java (2.x, 3.x, and 4.x driver API), Python, C#, PHP, Ruby, and the mongo shell language.
Aggregation queries translated to the mongo shell language can be directly opened in a separate IntelliShell tab.
Explain
The Explain tab visualizes the information provided by explain() – the steps MongoDB took to run the aggregation query – in a diagram format.
AI Helper
AI Helper is Studio 3T’s AI-powered assistant where you can type your aggregation query in natural language.
You’ll need an OpenAI API key to use AI Helper. You can generate an OpenAI API key by following this link: https://platform.openai.com/account/api-keys. Click the Configure button to open the Preferences dialog and paste your key in the OpenAI API Key box.
Options
The Options dialog is where disk use, custom collation, and index hint settings can be set.
Allow Disk Use lets aggregation operations write data to temporary files in the _tmp subdirectory of the dbPath directory.
Customizing your queries’ collation influences how searching and sorting are performed. Read more about collation here.
Index hints enable you to tune aggregation performance by specifying an index to use when loading the pipeline with documents. You can select a reverse order collection scan, select a particular index, or create an index specification document to define the first pass that aggregation makes to fill the pipeline.
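For orientation, the three settings in the Options dialog correspond to options that can be passed to aggregate() in the mongo shell. The following is a minimal sketch, assuming the housing collection used later in this article; the collation values and the reverse collection scan hint are illustrative choices, not recommendations.

```javascript
// A one-stage pipeline, just so there is something to run.
const pipeline = [{ $match: { "Property Type": "Senior" } }];

// Options that mirror the three settings in the Options dialog.
// The specific values here are assumptions chosen for illustration.
const options = {
  allowDiskUse: true,                       // allow stages to spill to temporary files
  collation: { locale: "en", strength: 2 }, // English rules, case-insensitive comparison
  hint: { $natural: -1 },                   // reverse order collection scan
};

// `db` only exists inside the mongo shell, so the call is guarded;
// run as a plain Node.js script, this block is skipped.
if (typeof db !== "undefined") {
  db.housing.aggregate(pipeline, options).forEach((doc) => printjson(doc));
}
```

To hint a particular index instead, the hint value can be an index name or an index specification document for an index that exists on the collection.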
A MongoDB aggregation example
To illustrate how Aggregation Editor works, we’ll go through a three-stage aggregation query example which uses:
- $match as Stage 1
- $group as Stage 2
- $sort as Stage 3
We’ll use the publicly available housing data from the City of Chicago Data Portal. You can download the zip file here, then import the JSON file into your MongoDB database.
Identify the question to answer
The question we want to ask of our data is simple:
Which zip code has the greatest number of senior housing units available?
To work out how to answer this and how to form our query, let’s take a look at the data.
Click on Run full pipeline – which looks like a play button – on the toolbar to view the data. Note that executing an empty pipeline simply shows the contents of the collection.
You can switch between Table, Tree and JSON views of your results.
We can see the fields that we need. We can check Property Type for the type of housing, while Zip Code and Units give us the zip code and the number of available units, respectively.
To answer our question, we need to combine these into the right aggregation query.
Add Stage 1: Match criteria with MongoDB $match
The first stage, with $match as the operator, is added for you when you open a new Aggregation Editor tab. The operator defines what the stage does.
The $match operator takes the input set of documents and outputs only those that match the given criteria. It is essentially a filter.
We want to keep only the senior housing units, so we’ve typed the query:
{ "Property Type": "Senior" }
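As a rough illustration of what this filter does, the sketch below applies the same specification to a few in-memory documents in plain JavaScript. The sample documents and the matchEquals helper are hypothetical and only handle simple equality; they are not part of Studio 3T or MongoDB.

```javascript
// Hypothetical sample documents. The field names follow the housing
// collection, but the values are made up for illustration.
const docs = [
  { "Property Type": "Senior", "Zip Code": "60601", Units: 10 },
  { "Property Type": "Multifamily", "Zip Code": "60601", Units: 40 },
  { "Property Type": "Senior", "Zip Code": "60602", Units: 25 },
];

// The Stage 1 specification, exactly as typed into the Stage editor.
const matchSpec = { "Property Type": "Senior" };

// Stand-in for an equality-only $match: keep a document only when every
// field in the spec equals the corresponding value in the document.
function matchEquals(documents, spec) {
  return documents.filter((doc) =>
    Object.entries(spec).every(([field, value]) => doc[field] === value)
  );
}

const seniors = matchEquals(docs, matchSpec);
console.log(seniors.length); // 2
```

The real $match operator supports far more than equality (ranges, regular expressions, logical operators), but the principle is the same: documents that do not satisfy the criteria never reach the next stage.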
Check stage inputs
On the Stage input/output tab, click Run – the play button – to view how many input documents went into the $match stage.
By clicking on Count Documents, we can see that there were 389 input documents, which is exactly how many documents there are in the housing collection.
Check stage outputs
We know that 389 documents went into the stage, but how many documents matched our specification, "Property Type": "Senior"?
By clicking on Run under Stage 1: Output and clicking on Count Documents, we can see that 89 documents have a value of Senior for the field Property Type.
The stage input and output checks are convenient features for keeping track of your data at each stage in the aggregation pipeline.
Now that we have the results we need from Stage 1 – a quick visual check of the column Property Type should do – we’re ready to pass them on to the next stage of our aggregation pipeline, the $group stage.
Add Stage 2: Group results with MongoDB $group
We now need a way to group the senior housing units from Stage 1 by zip code, and then calculate the sum of housing units for each zip code. The $group operator is exactly what we need for this.
To add a new stage:
- Pipeline – Click Add stage or right-click anywhere in the Pipeline section and choose Add New Stage
- Shortcut – Press Shift + Ctrl + N (Shift + ⌘ + N)
Choose $group from the Operator list in the Stage editor section and write the query:
{ _id: "$Zip Code", total: { $sum: "$Units" } }
This specification states that the output documents of this stage will contain:
- an _id with a distinct zip code as its value, which groups together the input documents that have the same zip code
- a total field whose value is the sum of all the Units field values from the documents in the group
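To make the grouping logic concrete, here is a minimal plain-JavaScript sketch of what this specification computes. The input documents and the groupAndSum helper are hypothetical, and the sketch assumes every Units value is a number.

```javascript
// Hypothetical Stage 1 output (values are made up for illustration).
const seniors = [
  { "Zip Code": "60601", Units: 10 },
  { "Zip Code": "60602", Units: 25 },
  { "Zip Code": "60601", Units: 30 },
];

// Stand-in for { _id: "$Zip Code", total: { $sum: "$Units" } }:
// accumulate a running total per distinct key value.
function groupAndSum(documents, keyField, sumField) {
  const totals = new Map();
  for (const doc of documents) {
    const key = doc[keyField];
    totals.set(key, (totals.get(key) ?? 0) + doc[sumField]);
  }
  // One output document per distinct key, shaped like $group output.
  return Array.from(totals, ([_id, total]) => ({ _id, total }));
}

const grouped = groupAndSum(seniors, "Zip Code", "Units");
console.log(grouped);
// [ { _id: '60601', total: 40 }, { _id: '60602', total: 25 } ]
```

Note that MongoDB does not guarantee any particular order for $group output, which is one reason a separate sort stage is useful.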
We can check the stage input, which we expect to be 89 documents. Nice!
The stage output returns 39 documents – meaning there were 39 unique zip codes – and only the fields we need, _id and total.
Add Stage 3: Sort results with MongoDB $sort
As we want to know the zip codes that have the greatest number of senior housing units available, it would be convenient to sort the results from the greatest to the least total units available.
To do this, we’ll add a third stage, choose the $sort operator from the dropdown, and write the following specification:
{ total: -1 }
The stage input and output should, of course, be the same, but the zip codes should now be arranged in descending order.
It looks like 60624 is the place to be (for Chicago-based retirees).
Run the full aggregation pipeline
The Pipeline tab displays all the stages we’ve built in our aggregation pipeline – Stages 1, 2 and 3 – in one view.
To run the full pipeline:
- Toolbar – Click on the Run button in the toolbar
- Pipeline – Right-click and choose Run the Entire Pipeline
- Shortcut – Press F5
The Pipeline output tab should populate with the same results as those found in the last stage output.
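For reference, the three stages built above can be written as a single pipeline in mongo shell syntax. This is a sketch that assumes the collection is named housing, as in this example; the guard around the call simply lets the same snippet load outside the shell without error.

```javascript
// The three stages from this example, in pipeline order.
const pipeline = [
  // Stage 1: keep only senior housing documents.
  { $match: { "Property Type": "Senior" } },
  // Stage 2: one document per zip code with the summed units.
  { $group: { _id: "$Zip Code", total: { $sum: "$Units" } } },
  // Stage 3: largest totals first.
  { $sort: { total: -1 } },
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.housing.aggregate(pipeline).forEach((doc) => printjson(doc));
}
```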
Add a stage before or after a selected stage
It is also possible to place an additional stage before or after any selected stage:
- Toolbar – Click on the down arrow next to the Add stage button and choose Add New Stage Before Selected Stage or Add New Stage After Selected Stage
- Pipeline – Right-click and choose Add New Stage Before Selected Stage or Add New Stage After Selected Stage
- Stage editor – Right-click and choose Add New Stage Before This Stage or Add New Stage After This Stage
Duplicate a stage
Choose the stage you want to clone and:
- Pipeline – Click on the Duplicate button for the selected stage or right-click and choose Duplicate Selected Stage
- Stage editor – Right-click and choose Duplicate This Stage
Move a stage
Select the stage to move in the Pipeline section and:
- Click on the up and down arrows
- Right-click and select Move Selected Stage Up or Move Selected Stage Down
- Shortcuts – Press Shift + F8 to move a selected stage up or F8 to move it down
Alternatively, in the Stage editor – right-click and choose Move This Stage Up or Move This Stage Down.
Enable or disable a stage
To temporarily enable or disable stages in your pipeline, check or uncheck the stage checkbox as needed.
Alternatively, right-click on a stage in the Pipeline section and choose Include Stage in Pipeline or Exclude Stage From Pipeline.
Delete a stage
Select the stage to be deleted in the Pipeline section and:
- Click on the Delete button
- Right-click and choose Delete Selected Stage
Alternatively, in the Stage editor – right-click and choose Delete This Stage.
Toggle between vertical and horizontal layouts
Click on the Window icon on the Stage input/output tab to show stage inputs and outputs horizontally or vertically.
Refresh results
Refresh results in the Pipeline Output, Stage Input, and Stage Output sections:
- Toolbar – Click on the Refresh icon in the respective toolbars
- Right-click anywhere in these sections and choose Refresh View
- Shortcut – Press Ctrl + R (⌘ + R)
Change databases, collections, and connections while building aggregation queries
In the toolbar, click on any database, collection, or connection to select a different option from the dropdown menu.
View the aggregation query in full mongo shell code
To see the full MongoDB aggregation query instead of viewing it line-by-line or as stages:
- Run the entire pipeline.
- Click on the Query Code tab.
- Choose MongoDB Shell from the dropdown.
Generate JavaScript, Java, Python, C#, PHP, and Ruby code from MongoDB aggregation queries
To view an aggregation query’s equivalent code:
- Run the entire pipeline.
- Click on the Query Code tab.
- Select the target language.
Create a view from an aggregation query
Views are a great shortcut to accessing the data you need without having to rerun the same queries.
Right-click anywhere in the Pipeline section or Stage editor and choose Save > Create view from this aggregate query.
In the Create View dialog, name the view and click OK.
Your view opens as a new tab, next to the Aggregation Editor tab. The view is also displayed in the Connection Tree, under the database where your collection is located, within a separate folder called Views.
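Outside the Aggregation Editor, a comparable view can be created from the mongo shell with db.createView(). The sketch below assumes the housing collection from the example above; the view name senior_units_by_zip is a hypothetical choice.

```javascript
// Hypothetical view name and the source collection from this example.
const viewName = "senior_units_by_zip";
const sourceCollection = "housing";

// The pipeline that defines the view's contents.
const pipeline = [
  { $match: { "Property Type": "Senior" } },
  { $group: { _id: "$Zip Code", total: { $sum: "$Units" } } },
  { $sort: { total: -1 } },
];

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  db.createView(viewName, sourceCollection, pipeline);
}
```

Once created, the view can be queried like a read-only collection; MongoDB runs the pipeline each time the view is read.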
Explain the full pipeline
The Explain tab features Visual Explain, which shows information on query plans and execution statistics normally provided by the explain() method.
- Run the entire pipeline.
- Click on the Explain tab.
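The raw information that Visual Explain draws on can also be requested in the mongo shell. The following is a minimal sketch, assuming the housing collection and the pipeline from the example above; the "executionStats" verbosity is one of several available modes.

```javascript
// The example pipeline from this article.
const pipeline = [
  { $match: { "Property Type": "Senior" } },
  { $group: { _id: "$Zip Code", total: { $sum: "$Units" } } },
  { $sort: { total: -1 } },
];

// Verbosity mode passed to explain(); chosen here for illustration.
const verbosity = "executionStats";

// `db` only exists inside the mongo shell, so the call is guarded.
if (typeof db !== "undefined") {
  printjson(db.housing.explain(verbosity).aggregate(pipeline));
}
```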
Save aggregate queries
To save your aggregation query so that you can use it throughout Studio 3T or as a JavaScript file:
- Pipeline section and Stage editor – Right-click and choose Save, and then choose Save query or Save file
- Shortcut – Save query – Ctrl + S (⌘ + S), Save file – Shift + Ctrl + S (Shift + ⌘ + S)
If you are using Studio 3T’s Team Sharing, you can save your aggregation query in a shared folder.
You and your team members can access the shared aggregation query from the My resources sidebar.
To store the connection, database, and collection details with your aggregation query, select the Save target details checkbox.
Open aggregate queries
To open aggregation queries previously saved as queries:
- Pipeline section and Stage editor – Right-click and choose Load > Load query
- Shortcut – Press Ctrl + O (⌘ + O)
To open aggregation queries previously saved as JavaScript files:
- Pipeline section and Stage editor – Right-click and choose Load > Load file
- Shortcut – Press Shift + Ctrl + O (Shift + ⌘ + O)
Copy and paste aggregate queries
The copy and paste function is extremely helpful, especially when working across Studio 3T’s various features (for example, going from SQL Query to the Query Editor).
To copy and paste an aggregation query:
- Pipeline section and Stage editor – Right-click and choose Copy Aggregate Query or Paste Aggregate Query
Even better than copying and pasting, why not share your aggregation query with other team members? See Save aggregate queries for how to share a query in the Aggregation Editor.