Skip to content
Studio 3T - The professional GUI, IDE and client for MongoDB
  • Tools
    • Aggregation Editor
    • IntelliShell
    • Visual Query Builder
    • Export Wizard
    • Import Wizard
    • Query Code
    • SQL Query
    • Connect
    • Schema Explorer
    • Compare
    • SQL ⇔ MongoDB Migration
    • Data Masking
    • Task Scheduler
    • Reschema
    • More Tools and Features
  • Solutions
  • Resources
    • Knowledge Base
    • MongoDB Tutorials & Courses
    • Tool/Feature Documentation
    • Blog
    • Community
    • Testimonials
    • Whitepapers
    • Reports
  • Contact us
    • Contact
    • Sales Support
    • Feedback and Support
    • Careers
    • About Us
  • Store
    • Buy Now
    • Preferred Resellers
    • Team Pricing
  • Download
  • My 3T
search

Academy 3T

  • Explore our courses
    • MongoDB 101: Getting Started
    • MongoDB 201: Querying MongoDB Data
    • MongoDB 301: Aggregation
  • Get certified

Adding Lookup Data to the Aggregation Pipeline

MongoDB 301: Aggregation Adding Lookup Data to the Aggregation Pipeline

The first two sections of this course walked you through the process of building an aggregate statement that retrieves data from the customers collection in the sales database. In the first section, you used IntelliShell to create the statement and save it to the states_transactions.js file. In the second section, you imported the file into the Aggregation Editor, where you added two stages to the pipeline. Together, these sections provided you with a foundation for working with aggregate statements in both IntelliShell and the Aggregation Editor.

In this section, you’ll refine the statement even further, again importing it into the Aggregation Editor. This time, you’ll add three more stages and update several existing stages. In one of the new stages, you’ll perform a lookup operation that retrieves data from a second collection in the same database. You’ll then convert values in one of the lookup fields and then add a field that uses those values to calculate ratios. Finally, you’ll limit the results of the aggregate statement to the top five documents with the highest ratios.

By the end of this section, you will learn how to

  • Add lookup data to the aggregation pipeline
  • Convert string values in one of the lookup fields to integers
  • Add a computed ratio field based on the converted lookup field
  • Limit the number of returned documents

What you will need

  • Access to a MongoDB Atlas cluster
  • Access to the states_transactions.js file you updated in the second section
  • Ability to download a .json file from the internet

Working with advanced aggregate operators

The key to building an aggregate statement is to ensure that it includes the stages necessary to process the documents in the correct order and return the desired results. To help with these stages, MongoDB provides a set of aggregate operators that carry out different operations on the aggregation pipeline. Each stage within the pipeline is associated with a single aggregate operator that determines how the inputted documents will be processed and outputted to the next pipeline stage.

In the first two sections of this course, you worked with the following operators when building your aggregate statement:

  • $match. Filters out all documents from the pipeline except those that match the specified condition.
  • $group. Groups documents based on a specified expression and outputs a document for each grouping. The outputted documents typically include one or more computed fields that calculate the accumulated values.
  • $project. Restructures each document in the pipeline by adding, retaining, reshaping, or suppressing fields.
  • $replaceRoot. Replaces each document in the pipeline with the specified embedded document. 
  • $sort. Reorders the documents in the pipeline based on the specified sorting logic.

At this point, you should be fairly familiar with these operators, but if you have any questions about them, refer to the first two sections in this course or the MongoDB documentation.

$lookup

In this section of the course, you’ll learn how to use several other aggregate operators. The first of these is the $lookup operator, which joins documents in the primary collection with documents in a secondary collection. The primary collection is the one on which you base your aggregate statement. The secondary collection is the one that contains the data you’ll be incorporating into the aggregation pipeline based on matching values. To use the $lookup operator, the secondary collection must be non-sharded and be in the same database as the primary collection. 

When you run a stage based on the $lookup operator, MongoDB adds an array field to each document in the primary collection. The array field includes one element for each document in the secondary collection that matches the conditions set by the operator. For example, if only one document matches the specified condition, the array will include only one element, which will contain the matching document.

The most basic type of lookup operation is referred to as an equity match. In an equity match, the documents in the primary collection contain a field that matches a field in the secondary collection. Documents are considered matching between the two collections when the field values are equal. The following syntax shows the elements needed to perform an equity match:

$lookup:
{
  from: secondary_collection,
  localField: primary_collection_field,
  foreignField: secondary_collection_field,
  as: new_array_field_name
}

The $lookup operator takes four arguments, which you pass into the operator as a document with four fields:

  • The from argument specifies the name of the secondary collection, which must be in the same database.
  • The localField argument specifies the name of the field in the primary collection on which the match is based.
  • The foreignField argument specifies the name of the field in the secondary collection on which the match is based.
  • The as argument specifies the name of the array field that will be added to the documents in the primary collection, or more precisely, to the documents in the aggregation pipeline. The array will contain the matching documents from the secondary collection.

To demonstrate how this works, suppose you have two collections—invoices and products—and you’re building an aggregate statement based on the invoices collection.

For each document in the invoices collection, you want to retrieve product-specific information from the products collection. 

The product_sku field in the invoices collection matches the sku field in the products collection, so you can use these fields with the $lookup operator to perform an equity match, as shown in the following example:

db.invoices.aggregate([
  {
    $lookup:
    {
      from: "products",
      localField: "product_sku",
      foreignField: "sku",
      as: "product_info"
    }
  }
])

The example should be fairly straightforward:

  • The from argument specifies products as the secondary collection. 
  • The localField argument specifies the product_sku field in the invoices collection.
  • The foreignField argument specifies the sku field in the products collection.
  • The as argument assigns the name product_info to the new array field added to the pipeline.

If you were to run this statement, it would return the documents in the invoices collection, and each document would contain the product_info array field, which would include an element for each matching document (or zero elements if there were no matching documents). The following figure shows an example of what the results might look like when displayed in Tree View.

The first document in the results is expanded to show the product_info array and its elements. In this case, the array contains only one element, which means only one document in the products collection matched the specified criteria. Notice that the sku value in the product_info array has the same value as the product_sku value in the primary document. In the exercises to follow, you’ll get a closer look at the $lookup operator in action and the types of results you can expect.

$set

Another operator that we’ll cover in the exercises is the $set operator, which is an alias for the $addFields operator. The $set operator adds one or more fields to the aggregation pipeline, with the field values determined by the operator’s expression. If the name used for a new field is the same as an existing one, MongoDB overwrites the values in the existing field with the values defined in the operator’s expression. 

The following syntax shows the elements that make up a $set expression:

$set: { new_field_name: field_expression, ... }

When using the $set operator to create a field, you must provide a field name, followed by an expression that defines the field’s values. For example, the following aggregate statement adds a field named total to the pipeline and uses the $add operator to set its value:

db.invoices.aggregate([
  {
    $set: { total: { $add: [ "$subtotal", "$tax", "$shipping" ] } }
  }
])

The $add operator adds together the values from the $subtotal, $tax and $shipping fields in the invoices collection. The total field is then added to the pipeline, along with the other fields in the collection’s documents. The following figure shows what the output data might look like after the new field has been added to the pipeline.

$limit

The exercises in this section also introduce you to the $limit operator, which limits the number of documents passed to the next stage in the pipeline. If you add a $limit stage directly after a $sort stage, without any intervening stages, MongoDB can coalesce the two stages to reduce the number of items that need to be stored in memory.

To use the $limit operator, you need only specify the operator name and a positive integer as its expression. The integer determines the maximum number of documents to include in the pipeline, as shown in the following syntax:

$limit: positive_integer

For example, if you want to limit the number of documents in the invoices collection to 25, you can define that stage as follows:

db.invoices.aggregate([ { $limit: 25 } ])

Now only the top 25 documents will be returned, based on how the documents are sorted when inputted into the $limit stage.

See also
  • Aggregation Editor
  • The Beginner’s Guide to MongoDB Aggregation
  • Supported MongoDB Aggregation Operators and Stages
  • Aggregation Pipeline Quick Reference
  • db.collection.aggregate()

Lesson Content
0% Complete 0/4 Steps
Exercise 1: Adding lookup data to the aggregation pipeline
Exercise 2: Converting string values in one of the lookup fields to integers
Exercise 3: Adding a computed ratio field based on the converted lookup field
Exercise 4: Limiting the number of returned documents
Adding Lookup Data to the Aggregation Pipeline: Test your skills
Previous Lesson
Back to Course
Next Topic
  • Course Home Expand All
    Building a Basic Aggregation
    4 Topics | 1 Quiz
    Exercise 1: Filtering the documents in the aggregation pipeline
    Exercise 2: Grouping the documents in the aggregation pipeline
    Exercise 3: Sorting the documents in the aggregation pipeline
    Exercise 4: Adding processing options to the aggregation
    Building a Basic Aggregation: Test your skills
    Introducing the Aggregation Editor
    4 Topics | 1 Quiz
    Exercise 1: Importing an aggregate statement into the Aggregation Editor
    Exercise 2: Replace a field in the aggregation pipeline
    Exercise 3: Reorder the fields in the aggregation pipeline
    Exercise 4: Changing the sort order in the aggregation pipeline
    Introducing the Aggregation Editor: Test your skills
    Working with Arrays in the Aggregation Pipeline
    5 Topics | 1 Quiz
    Exercise 1: Using expression operators to filter input documents
    Exercise 2: Unwinding an array to create individual documents
    Exercise 3: Grouping array values and generating a document count for each group
    Exercise 4: Writing pipeline results to a new collection
    Working with Arrays in the Aggregation Pipeline: Test your skills
    MongoDB 301 Mid-Course Feedback
    Adding Lookup Data to the Aggregation Pipeline
    4 Topics | 1 Quiz
    Exercise 1: Adding lookup data to the aggregation pipeline
    Exercise 2: Converting string values in one of the lookup fields to integers
    Exercise 3: Adding a computed ratio field based on the converted lookup field
    Exercise 4: Limiting the number of returned documents
    Adding Lookup Data to the Aggregation Pipeline: Test your skills
    Working with Reschema for MongoDB
    4 Topics | 1 Quiz
    Exercise 1: Setting up a reschema unit that includes lookup data
    Exercise 2: Defining a target collection in the reschema unit
    Exercise 3: Adding and scheduling a task to create the target collection
    Exercise 4: Running an aggregate statement against the target collection
    Working with Reschema for MongoDB: Test your skills
    Reporting with Studio 3T Aggregations
    3 Topics | 1 Quiz
    Exercise 1: Creating a view based on an aggregation query
    Exercise 2: Exporting a collection as a .csv file for use by a third-party tool
    Exercise 3: Visualizing collection data in MongoDB Charts
    Reporting with Studio 3T Aggregations: Test your skills
    Course Extras
    Return to MongoDB 301: Aggregation
  • Studio 3T

    MongoDB Enterprise Certified Technology PartnerSince 2014, 3T has been helping thousands of MongoDB developers and administrators with their everyday jobs by providing the finest MongoDB tools on the market. We guarantee the best compatibility with current and legacy releases of MongoDB, continue to deliver new features with every new software release, and provide high quality support.

    Find us on FacebookFind us on TwitterFind us on YouTubeFind us on LinkedIn

    Education

    • Free MongoDB Tutorials
    • Connect to MongoDB
    • Connect to MongoDB Atlas
    • Import Data to MongoDB
    • Export MongoDB Data
    • Build Aggregation Queries
    • Query MongoDB with SQL
    • Migrate from SQL to MongoDB

    Resources

    • Feedback and Support
    • Sales Support
    • Knowledge Base
    • FAQ
    • Reports
    • White Papers
    • Testimonials
    • Discounts

    Company

    • About Us
    • Blog
    • Careers
    • Legal
    • Press
    • Privacy Policy
    • EULA

    © 2023 3T Software Labs Ltd. All rights reserved.

    • Privacy Policy
    • Cookie settings
    • Impressum

    We value your privacy

    With your consent, we and third-party providers use cookies and similar technologies on our website to analyse your use of our site for market research or advertising purposes ("analytics and marketing") and to provide you with additional functions (“functional”). This may result in the creation of pseudonymous usage profiles and the transfer of personal data to third countries, including the USA, which may have no adequate level of protection for the processing of personal data.

    By clicking “Accept all”, you consent to the storage of cookies and the processing of personal data for these purposes, including any transfers to third countries. By clicking on “Decline all”, you do not give your consent and we will only store cookies that are necessary for our website. You can customize the cookies we store on your device or change your selection at any time - thus also revoking your consent with effect for the future - under “Manage Cookies”, or “Cookie Settings” at the bottom of the page. You can find further information in our Privacy Policy.
    Accept all
    Decline all
    Manage cookies
    ✕

    Privacy Preference Center

    With your consent, we and third-party providers use cookies and similar technologies on our website to analyse your use of our site for market research or advertising purposes ("analytics and marketing") and to provide you with additional functions (“functional”). This may result in the creation of pseudonymous usage profiles and the transfer of personal data to third countries, including the USA, which may have no adequate level of protection for the processing of personal data. Please choose for which purposes you wish to give us your consent and store your preferences by clicking on “Accept selected”. You can find further information in our Privacy Policy.

    Accept all cookies

    Manage consent preferences

    Essential cookies are strictly necessary to provide an online service such as our website or a service on our website which you have requested. The website or service will not work without them.

    Performance cookies allow us to collect information such as number of visits and sources of traffic. This information is used in aggregate form to help us understand how our websites are being used, allowing us to improve both our website’s performance and your experience.

    Google Analytics

    Google Ads

    Bing Ads

    Facebook

    LinkedIn

    Quora

    Hotjar

    Functional cookies collect information about your preferences and choices and make using the website a lot easier and more relevant. Without these cookies, some of the site functionality may not work as intended.

    HubSpot

    Social media cookies are cookies used to share user behaviour information with a third-party social media platform. They may consequently effect how social media sites present you with information in the future.

    Accept selected