In this exercise, you’ll import the states_transactions.js file into the Aggregation Editor. The file contains the aggregate statement that you updated in the second section of this course. You’ll use this statement as the foundation for all four exercises in this section, modifying the statement as you work through them. In this exercise, you’ll add lookup data to the aggregation pipeline, disable a stage, and modify two existing stages.
To add lookup data to the pipeline
- Launch Studio 3T and connect to MongoDB Atlas.
- In the Connection Tree, expand the sales database node and, if necessary, expand the Collections node.
- Right-click the customers collection node, and then click Open Aggregation Editor. Studio 3T adds the Aggregation tab to the main window. The tab displays the Aggregation Editor, with the editor’s Pipeline tab active. At this point, the aggregation pipeline is empty.
- On the Aggregation Editor toolbar, click the Open button (folder icon) on the left side, navigate to the folder that contains the states_transactions.js file, and double-click the file. Studio 3T uses the aggregate statement in that file to automatically populate the pipeline.
db.customers.aggregate( [ { "$match" : { "dob" : { "$lt" : ISODate("1970-01-01T00:00:00.000+0000") } } }, { "$group" : { "_id" : "$address.state", "total" : { "$sum" : "$transactions" } } }, { "$project" : { "_id" : 0.0, "state" : "$_id", "total" : 1.0 } }, { "$replaceRoot" : { "newRoot" : { "state" : "$state", "total" : "$total" } } }, { "$sort" : { "state" : 1.0 } } ], { "allowDiskUse" : false, "collation" : { "locale" : "en_US" } } );
If you don’t have the states_transactions.js file, you can use the preceding statement instead. To add it to the Aggregation Editor, copy the statement to your clipboard and then paste it by clicking the Clipboard button at the right end of the Aggregation Editor toolbar.
Regardless of how you add the aggregate statement to the Aggregation Editor, the Pipeline tab should now list the aggregate statement’s five stages, as shown in the following figure. The listing for each stage includes the stage’s operator and its expression.
- On the Pipeline tab of the Aggregation Editor, right-click the $group pipeline stage, and then click Add New Stage After Selected Stage.
The Aggregation Editor adds a new tab to the right of the 2: $group tab and makes the new tab active. Initially, the new tab is named 3: $match because Studio 3T uses $match as its default operator when creating a new stage.
- On the new tab, select $lookup from the Operator drop-down list near the tab’s upper left corner. The $lookup operator lets you incorporate data from another collection into the current pipeline.
When you select the $lookup operator, the Aggregation Editor changes the name of the tab to 3: $lookup and adds placeholder text.
- In the editor window, replace the placeholder text and curly braces with the following code:
{ from: "population", localField: "_id", foreignField: "state", as: "state_info" }
The expression includes four arguments, separated by commas. Each argument includes a field name and the value assigned to that field:
- The from argument specifies that the lookup data should be retrieved from the population collection.
- The localField argument specifies that the _id field in the stage’s input documents (the state value that the $group stage produces from the customers collection) should be matched against values in the population collection.
- The foreignField argument specifies that the state field in the population collection should be matched against that _id value.
- The as argument specifies that the name state_info should be used for the array that will contain the data retrieved from the population collection.
In this case, the $lookup stage is performing an equi-join (a join based on equality). The join matches the _id field in the documents grouped from the customers collection against the state field in the population collection, making it possible to merge documents from the two collections and include the results in the aggregation pipeline.
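The equality match that $lookup performs can be approximated in plain JavaScript. This is only a rough sketch of the idea, not how MongoDB implements the stage: the lookup function is a hypothetical helper, and the sample documents and their numbers are invented for illustration.

```javascript
// Hypothetical sample data; the real collections hold more documents and fields.
const grouped = [
  { _id: "CA", total: 120 },
  { _id: "NY", total: 85 },
];
const population = [
  { state: "CA", population: 1000 },
  { state: "NY", population: 2000 },
  { state: "TX", population: 3000 },
];

// Approximation of an equality-match $lookup: for each input document,
// collect every foreign document whose foreignField value equals the
// input document's localField value, and store the matches in an array.
function lookup(input, foreign, { localField, foreignField, as }) {
  return input.map((doc) => ({
    ...doc,
    [as]: foreign.filter((f) => f[foreignField] === doc[localField]),
  }));
}

const joined = lookup(grouped, population, {
  localField: "_id",
  foreignField: "state",
  as: "state_info",
});

console.log(JSON.stringify(joined, null, 2));
```

Each output document keeps its original fields and gains a state_info array. In this sample every state has exactly one match, so each array holds a single embedded document, and the unmatched TX document never appears in the output.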
- In the Stage Input pane, click the Execute button. Studio 3T runs the pipeline for the first two stages and returns the data to that pane. This is the data that is used as input for the $lookup stage.
- In the Stage Output pane, click the Execute button. This will execute the pipeline up to and including the $lookup stage. The following figure shows part of the results, as they appear in Tree View. (The documents might be displayed in a different order on your system.)
The output documents now include the state_info array. For each primary document, the array includes one element, which is an embedded document. The embedded document includes the fields from the matching document in the population collection. Notice that the _id value in the primary document matches the state value in the array.
- Go to the 2: $group tab and replace the total field name with transactions. This will make it easier to understand the data as the results become more complex. The operator expression should now look like the following code:
{ "_id": "$address.state", "transactions": { "$sum": "$transactions" } }
- Go to the 4: $project tab and clear the Include in the pipeline check box. This stage is unnecessary because the $replaceRoot stage already defines the output fields. Eventually, you will delete the $project stage, but for now keep it disabled. Disabling a stage before deleting it is good practice when updating a pipeline, until you’re satisfied that you can safely delete the stage.
- Go to the 5: $replaceRoot tab and replace the existing operator expression with the following code:
{ newRoot: {state: "$_id", transactions: "$transactions", population: "$state_info.population"} }
The code changes the state value to $_id and changes the name of the total field to transactions. These changes are necessary because you updated the $group stage and disabled the $project stage. The newRoot expression also includes a third field, population, which adds the $state_info.population values to the output.
- In the Stage Input pane, click the Execute button. Studio 3T runs the pipeline for all stages up to but not including the $replaceRoot stage.
- In the Stage Output pane, click the Execute button. This will execute the pipeline up to and including the $replaceRoot stage. The following figure shows part of the results, as they appear in Tree View.
Notice that the state_info array and its embedded document have been replaced by the population array, which contains only one element, the population value.
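The population field is an array because a field path that passes through an array, such as $state_info.population, resolves to an array of the nested values. The plain JavaScript sketch below approximates that reshaping for a single document. The replaceRoot function is a hypothetical helper, and the sample document and its numbers are invented for illustration.

```javascript
// Hypothetical document as it might look after the $lookup stage.
const afterLookup = {
  _id: "CA",
  transactions: 120,
  state_info: [{ state: "CA", population: 1000 }],
};

// Approximates newRoot: { state: "$_id", transactions: "$transactions",
// population: "$state_info.population" } for one document.
function replaceRoot(doc) {
  return {
    state: doc._id,
    transactions: doc.transactions,
    // A path through an array yields an array of the nested values.
    population: doc.state_info.map((entry) => entry.population),
  };
}

const reshaped = replaceRoot(afterLookup);
console.log(reshaped);
```

The reshaped document contains only the three fields named in newRoot, and population is a one-element array rather than a plain number.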
- Click Save on the Aggregation Editor toolbar to save your changes.
- Leave the Aggregation tab open and the existing statement in place for the next exercise. You’ll be building on this statement by adding two more pipeline stages.
If you’re not ready to move on to the next exercise, you can save the statement for now and close Studio 3T. You can then reopen the states_transactions.js file when you’re ready to continue.
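For reference, the stages that remain active at the end of this exercise can be sketched as a plain JavaScript array. This is an approximation assembled from the expressions shown above, not the exact text that Studio 3T saves. It assumes a standard JavaScript Date in place of the shell’s ISODate helper and leaves out the disabled $project stage.

```javascript
// Active stages after this exercise, in pipeline order.
// Assumption: new Date(...) stands in for the shell's ISODate(...).
const pipeline = [
  { $match: { dob: { $lt: new Date("1970-01-01T00:00:00.000Z") } } },
  { $group: { _id: "$address.state", transactions: { $sum: "$transactions" } } },
  {
    $lookup: {
      from: "population",
      localField: "_id",
      foreignField: "state",
      as: "state_info",
    },
  },
  // The $project stage is disabled in the editor, so it is omitted here.
  {
    $replaceRoot: {
      newRoot: {
        state: "$_id",
        transactions: "$transactions",
        population: "$state_info.population",
      },
    },
  },
  { $sort: { state: 1 } },
];

// Options carried over from the original aggregate statement.
const options = { allowDiskUse: false, collation: { locale: "en_US" } };

// List the operator name of each stage to check the order.
const stageNames = pipeline.map((stage) => Object.keys(stage)[0]);
console.log(stageNames.join(" -> "));
```

Listing the stage names this way is a quick check that $lookup sits between $group and $replaceRoot, which is the order the later exercises build on.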