The MongoDB $lookup aggregation stage
The aggregation pipeline stage $lookup
makes it possible to join data from an input collection (the collection you’re running the query on) and a lookup collection (the collection you want data from), as long as both collections are on the same database.
Output documents from the lookup collection are added as embedded documents in the input collection.
In this tutorial, we will use the customers
dataset as our input collection (download the .json file here) and the publicly-available Chicago affordable housing
dataset as our lookup collection (download the .json file here).
Next, import the two datasets following these instructions and make sure that they are on the same database.
We’ll use an equality match example to illustrate the $lookup
stage. An equality match requires that the input collection and the lookup collection have a field to match on.
(In our case, this is the address.zip_code
field found in customers
and the Zip Code
field in housing
.)
We want to show affordable housing options in the customers
dataset as a new embedded field address.zip_code.affordable_housing_options
, where there is a zip code match.
MongoDB $lookup function in Aggregation Editor
An aggregation pipeline is meant to have multiple stages.
If you’re building more complex, multi-stage aggregation queries, we recommend using Aggregation Editor. It lets you build queries stage by stage, check inputs and outputs, add, move, or delete stages as you go, and view your query in full mongo shell code.
Open the Aggregation Editor
Right-click on the input collection, and choose Open Aggregation Editor. In our example, the input collection is customers
.
Choose $lookup from the dropdown menu
Run the aggregation query
Type your command in the Stage Editor section, following the syntax described below.
Stage 1 with the MongoDB $lookup
will look like:
{ "from" : "housing", "localField" : "address.zip_code", "foreignField" : "Zip Code", "as" : "address.zip_code.affordable_housing_options" }
housing
is our lookup collectionaddress.zip_code
is the field name in our input collectionZip Code
is the field name in our lookup collection- We’d like to create a new field
address.zip_code.affordable_housing_options
in our input collection. This is where we’ll embed the output documents from our lookup collection.
Run the query by clicking the Run button in the toolbar.
Check your results
For a quick visual check, the best way is to use Table View.
Click anywhere on the Results tab and choose Show All Embedded Fields. This expands all of address
‘s embedded fields in the same view.
You should see our new field, affordable_housing_options
.
Remember: We only have affordable housing information for Chicago zip codes, so not all customers will have values for this new field.
Click on the column header to sort the visible results in descending order.
Scroll to the left and you should see Maria Kelly as the first result, with 2 elements.
Right-click on the entry and choose Document > View Document (JSON).
You should see two affordable housing options nested under the zip_code
field:
View the full mongo shell code
The Aggregation Editor provides a more convenient interface for building queries, but you can always view the full mongo shell code at any point in the pipeline.
Click on the Query Code tab and choose MongoDB Shell:
This will show you the full MongoDB query which you can also run in IntelliShell, Studio 3T’s built-in mongo shell.
Click on the Open in IntelliShell button.
This directly opens the query in a separate IntelliShell tab:
If you’re a pro at writing MongoDB queries, there’s also the hard(er) way of building aggregation queries: by manually writing them in the shell.
MongoDB $lookup function in mongo shell
If you’re running a single aggregation query that uses MongoDB $lookup
, you can do so directly in IntelliShell using the following syntax:
db.getCollection("insert-database-name-here").aggregate( [ { "$lookup" : { from: "name of lookup collection", localField: "field in the input collection", foreignField: "field in the lookup collection", as: "field name for the output documents" } } ] );
Right-click on your collection and choose Open IntelliShell
Right-click on your input collection and choose Open IntelliShell. In our example, it’s the customers
collection in the customerdata
database.
Run the aggregate query
In the Editor section, paste the $lookup
aggregation query.
Our query in this case is:
db.getCollection("customers").aggregate(
[
{
"$lookup" : {
"from" : "housing",
"localField" : "address.zip_code",
"foreignField" : "Zip Code",
"as" : "address.zip_code.affordable_housing_options"
}
}
]
);
Then, click on Run entire script in the toolbar.
Check your results
This will give you the same results as Check your results from Aggregation Editor.
As you will see in later examples, manually writing MongoDB aggregation queries can get quite cumbersome. Debugging, as you add more stages, will prove to be more difficult without a quick way to check inputs and outputs, which is a feature available in Aggregation Editor.
Next, we’ll add a second stage – $project
– to filter our documents to only show the fields we want.