Try Studio 3T for Free
  1. Knowledge Base
  2. MongoDB Tutorials
  3. A MongoDB Aggregation Example with $match, $group & $sort

A MongoDB Aggregation Example with $match, $group & $sort

To build our MongoDB aggregation example, we will be using the Aggregation Editor, the stage-by-stage aggregation pipeline editor in Studio 3T.

Build accurate aggregation queries and make debugging easier by defining stage operators and checking inputs and outputs at each stage. Download it here, or if you have already done so, skip to the example.

Open Aggregation Editor – F4
Execute Full Pipeline – F5
Show Input to this Stage – F6
Show Output from this Stage – F7
Move Selected Stage Up - Shift + F8
Move Selected Stage Down - F8
Add New Stage - Shift + Ctrl + N (Shift + ⌘+ N)
Open Aggregate Query – Ctrl + O (⌘+ O)
Save Aggregate Query – Ctrl + S (⌘+ S)
Save Aggregate Query As – Shift + Ctrl + S (Shift + ⌘+ S)
Toggle query auto-completion – Ctrl + Space (^ + Space)
Format code – Ctrl + Alt + L (⌥ + ⌘ + L)

Basics

To open Aggregation Editor:

The Aggregation Editor, Studio 3T's MongoDB aggregate query builder

  • Toolbar – Click on the Aggregate button
  • Right-click – Right-click on a target collection and choose Open Aggregation Editor
  • Hotkey – Press F4

The Aggregation Editor has five main tabs: Pipeline, Stage, Query Code, Explain, and Options.

Pipeline tab

The Pipeline tab is the default tab upon opening Aggregation Editor.

The Pipeline Tab in Aggregation Editor

It has two main sections: Pipeline flow (top) is where you can see all stages at a glance and add, edit, duplicate, and move them as needed. Pipeline output (bottom) is where you can view the output of the full pipeline.

Stage tab(s)

Adding the first stage opens a Stage 1 tab next to the Pipeline tab. Each additional stage opens an appropriately-named, new tab.

The Stage Tab in Aggregation Editor

The Stage tab has two main sections. Stage editor (top) is where the query is written. Stage data (bottom) is where the inputs and outputs are displayed in their respective panels, Stage Input and Stage Output.

Query Code tab

Query Code translates aggregation queries – as they were last run in the Pipeline or Stage tabs – to JavaScript (Node.js), Java (2.x and 3.x driver API), Python, C#, PHP, Ruby, and the mongo shell language.

The Query Code with Aggregation Editor

Aggregation queries translated to the mongo shell language can be directly opened in a separate IntelliShell tab.

Explain tab

The Explain tab visualizes the information provided by explain() – the steps MongoDB took to execute the aggregation query – in a diagram format.

Explain Tab in Aggregation Editor

Options tab

The Options tab is where disk use and custom collation settings can be set.

Options Tab in Aggregation Editor

Allow Disk Use enables writing to temporary files, which will then allow aggregation operations to write data to the _tmp subdirectory in the dbPath directory.

Customizing your queries’ collation influences how searching and sorting is performed. Read more about collation here.

A MongoDB aggregation example

To illustrate how Aggregation Editor works, we’ll go through a three-stage aggregation query example which uses:

  • $match as Stage 1
  • $group as Stage 2
  • $sort as Stage 3

We’ll use the publicly-available housing data from the City of Chicago Data Portal, You can download the zip file here, then import the JSON file to your MongoDB database.

Identify the question to answer

The question we want to ask of our data is simple:

Which zip code has the greatest number of senior housing units available?

To think how we’ll answer this and how we’ll form our query, let’s take a look at the data.

Click on Execute full pipeline – which looks like a play button – on the toolbar to view the data. Note that executing an empty pipeline simply shows the contents of the collection.

Execute full pipeline to view the data

If you prefer a JSON view of the data, it’s included below. Studio 3T supports dynamically switching between TableTree and JSON views of your results, everywhere in the app.

{ 
    "_id" : ObjectId("5e4311926c8b851306f35d61"), 
    "Community Area Name" : "Humboldt Park", 
    "Community Area Number" : NumberInt(23), 
    "Property Type" : "Supportive Housing", 
    "Property Name" : "Humboldt Park Residences ", 
    "Address" : "1152 N. Christiana Ave.", 
    "Zip Code" : NumberInt(60647), 
    "Phone Number" : "773-276-5338 ext. 225", 
    "Management Company" : "LUCHA", 
    "Units" : NumberInt(69), 
    "X Coordinate" : NumberDecimal("1153731.4411072168"), 
    "Y Coordinate" : NumberDecimal("1907707.148117049"), 
    "Latitude" : NumberDecimal("41.9025857989"), 
    "Longitude" : NumberDecimal("-87.7107605283"), 
    "Location" : "(41.9025857989, -87.7107605283)", 
    "Historical Wards 2003-2015" : NumberInt(10), 
    "Wards" : NumberInt(49), 
    "Community Areas" : NumberInt(24), 
    "Zip Codes" : NumberInt(4299), 
    "Census Tracts" : NumberInt(150)
}

Now we can see that we have the fields we need.

We can check Property Type to see that it’s senior housing. Zip Code and Units give us the zip code and number of available units there are, respectively.

To answer our question, we need to combine these into the right aggregation query.

Add Stage 1: Match criteria with MongoDB $match

Click on Add a new stage within the Pipeline flow to add the first stage or click on the green plus icon in the toolbar.

Add a new stage in Aggregation Editor

This will open a Stage 1 tab next to the Pipeline tab.

Choose $match from the dropdown menu. It’s here where we select the stage’s “operator”, which defines what the stage actually does.

Stage 1: $match operator

The $match operator takes the input set of documents and outputs only those that match the given criteria. It is essentially a filter.

We want to filter out all senior housing units, so we’ve typed the query:

{
    "Property Type": "Senior"
}

Trigger query-autocompletion by pressing ^ + Space, or right-clicking anywhere in the Stage editor and choosing Open Auto-Completion.

Check stage inputs

Under Stage Input, click on Execute – the play button – to view how many input documents went into the $match stage.

By clicking on Count Documents, we can see that there were 389 input documents, which is exactly how many documents there are in the housing collection.

Check stage inputs with Aggregation Editor

Check stage outputs

We know that 389 documents went into the stage, but how many documents matched our specification, "Property Type": "Senior"?

By clicking on Execute under Stage Output and clicking on Count Documents, we can see that 89 documents have a value of Senior for the field Property Type.

Check stage outputs with Aggregation Editor

The stage input and output checks are convenient features for keeping track of your data at each stage in the aggregation pipeline.

Now that we have the results we need from Stage 1 – a quick visual check of the column Property Type should do – we’re ready to pass them on to next stage of our aggregation pipeline, the $group stage.

Add Stage 2: Group results with MongoDB $group

We now need a way to group the senior housing units from Stage 1 by zip code, and then calculate the sum of housing units for each zip code. The $group operator is exactly what we need for this.

To add a new stage:

There are multiple ways to add a new stage in Aggregation Editor

  • Toolbar – Click on the green plus icon
  • Pipeline tab – Right-click anywhere in the Pipeline flow section and choose Add New Stage
  • Hotkey – Press Shift + Ctrl + N (Shift + ⌘+ N)

Choose the $group operator from the dropdown and write the query:

{
    _id: "$Zip Code",
    total: { $sum: "$Units" }
}

This specification states that the output documents of this stage will contain:

  • an _id with a distinct zip code as a value and will group input documents together that have the same zip code
  • a total field whose value is the sum of all the Units field values from each of the documents in the group

We can check the stage input, which we expect to be 89 documents. Nice!

Group stage in the aggregation pipeline

The stage output returns 39 documents – meaning there were 39 unique zip codes – and only the fields we need,  _id and total.

Add Stage 3: Sort results with MongoDB $sort

As we want to know the zip codes that have the greatest number of senior housing units available, it would be convenient to sort the results from the greatest to the least total units available.

To do this, we’ll add a third stage, choose the $sort operator from the dropdown, and write the following specification:

{
    total: -1
}

The stage input and output should, of course, be the same, but the zip codes should now be arranged in descending order.

It looks like 60624 is the place to be (for Chicago-based retirees).

Adding the sort stage to the aggregation pipeline

Execute the full aggregation pipeline

The Pipeline tab displays all the stages we’ve built in our aggregation pipeline – Stages 1, 2 and 3 – in one view.

To execute the full pipeline:

  • Toolbar – Click on the play button in the toolbar
  • Pipeline or Stage tab – Right-click anywhere and choose Execute full pipeline
  • Hotkey –  Press F5

Execute the full MongoDB aggregation pipeline

The Pipeline output section should populate with the same results as those found in the last stage output.

Add a stage before or after a selected stage

It is also possible to place an additional stage before or after any selected stage:

Add a stage before or after a selected stage in the MongoDB aggregation example

  • Toolbar – Click on the down arrow next to the (+) icon and choose Add New Stage Before Selected Stage or Add New Stage After Selected Stage
  • Pipeline tab – Right-click anywhere in the Pipeline flow and choose Add New Stage Before Selected Stage or Add New Stage After Selected Stage

Duplicate a stage

Choose the stage you want to clone and:

Duplicate any selected stage easily in the Aggregation Editor

  • Toolbar – Click on the Duplicate (the selected stage) button
  • Pipeline tab – Right-click on your target stage and choose Duplicate Selected Stage
  • Stage tab – Select the stage to duplicate, right-click anywhere within the tab and choose Duplicate Selected Stage

Move a stage

Select the stage to move in the Pipeline tab, or click on the target Stage tab, and:

Move stages easily across the aggregation pipeline

  • Toolbar – Click on the up and down arrows
  • Pipeline tab – Right-click anywhere in the Pipeline flow section and select Move Selected Stage Up or Move Selected Stage Down
  • Stage tab – Choose the stage to move, right-click anywhere within the tab and choose Move This Stage Up or Move This Stage Down
  • Hotkeys – Press Shift + F8 to move a selected stage up or F8 to move it down

Enable or disable a stage

To temporarily enable or disable stages in your pipeline, simply check or uncheck the Include in the pipeline checkbox as needed:

Include or exclude a stage in the aggregation pipeline

Or right-click on a stage on the Pipeline tab and choose Include Stage in Pipeline or Exclude Stage in Pipeline.

Tab names of excluded stages are italicized

Tab names of excluded stages will be italicized (see Stage 3).

Delete a stage

Select the stage to be deleted in the Pipeline tab or its corresponding Stage tab and:

  • Toolbar – Click on the Delete button
  • Pipeline tab – Right-click anywhere in the Pipeline flow section and choose Delete Selected Stage

Toggle between vertical and horizontal layouts

Click on the Window icon on the top-right corner of the Stage data section to show stage inputs and outputs horizontally or vertically.

View stage inputs and outputs horizontally

Refresh results

Refresh results in the Pipeline output, Stage Input, and Stage Output sections:

Refresh the pipeline output

Refresh inputs and outputs

  • Toolbar – Click on the Refresh icon in the respective toolbars
  • Right-click anywhere in these sections and choose Refresh View
  • Hotkey – Press Ctrl + R (⌘ + R)

Change databases, collections, and connections while building aggregation queries

In the toolbar, click on any database, collection, or connection to select a different option from the dropdown menu.

Easily change databases, collections, and connections in Aggregation Editor

View the aggregation query in full mongo shell code

To see the full MongoDB aggregation query instead of viewing them line-by-line or tab-by-tab as stages:

  1. Execute the full pipeline.
  2. Click on the Query Code tab.
  3. Choose mongo shell from the dropdown.

The MongoDB aggregation example in full mongo shell code

All mongo shell code generated through Query Code can be opened directly in a separate IntelliShell tab by clicking on the Open IntelliShell icon.

Open MongoDB aggregation queries directly in IntelliShellRead more about IntelliShell, Studio 3T’s built-in mongo shell with auto-completion.

Generate JavaScript, Java, Python, C#, PHP, and Ruby code from MongoDB aggregation queries

To view an aggregation query’s equivalent code:

  1. Execute the full pipeline.
  2. Click on the Query Code tab.
  3. Select the target language.

Here’s our MongoDB aggregation query, in Python:

MongoDB aggregation query in Python

Read about Query Code in full here.

Create a view from an aggregation query

Views are a great shortcut to accessing the data you need without having to run the same queries.

Right-click anywhere in the Pipeline and Stage tabs and choose Create view from this aggregation query.

Create a view from an aggregation query

Name the view and click OK.

Create view from aggregation query

Your view should appear in the Connection Tree, under the database where your collection is located, within a separate folder called Views.

New view should appear in the Connection Tree

Explain the full pipeline

The Explain Tab features Visual Explain, which shows information on query plans and execution statistics normally provided by the explain() method.

  1. Execute the full pipeline.
  2. Click on the Explain tab.

Explain Tab in Aggregation Editor

Because of limitations in MongoDB, runtime statistics are not available for aggregation queries built through Aggregation Editor.

Save aggregate queries

To save your aggregation query as a JavaScript file:

  • Pipeline and Stage tabs – Right-click anywhere and choose Save Aggregate Query (As)
  • Hotkey – Save Aggregate Query – Ctrl + S (⌘+ S), Save Aggregate Query As – Shift + Ctrl + S (Shift + ⌘+ S)

Open aggregate queries

To open aggregation queries previously saved as JavaScript files:

  • Pipeline and Stage tabs – Right-click anywhere and choose Open Aggregate Query 
  • Hotkey – Press Ctrl + O (⌘+ O)

Copy and paste aggregate queries

The copy and paste function is extremely helpful, especially when working across Studio 3T’s various features (for example, going from SQL Query to the View Editor).

To copy and paste an aggregation query:

  • Pipeline and Stage tabs – Right-click anywhere and choose Copy Aggregate Query or Paste Aggregate Query
Updated on July 27, 2020

Was this article helpful?