How to Filter Elements from MongoDB Arrays

Introduction

Often, we need to check documents that contain specific information inside arrays, and to project only some of the data stored inside them.

We need to be careful with how we build the queries when dealing with arrays. Oftentimes, we make the mistake in the query and fetch documents we are not looking for.

This is the case when we have two or more conditions and, as a result, we get documents in which one element of the array matches one condition and another document only matches the other condition.

Of course, what we want is that one element of the array matches both conditions at the same time.

Purpose

It is very common to project all the fields in the documents, but in some cases, our purpose is not only to select those documents that match our conditions but, beyond that, we want to project only those fields in the arrays that we need (or a particular user is allowed to check).

In this post, we are going to learn how to select only those documents in our dataset in which inside an array there are fields that satisfy our requirements. Also, we are going to learn how to project only the array’s fields we need.

To select or filter the documents that we need, we are going to use the $elemMatch operator alongside find().

Also, we can use $elemMatch inside the $match stage of the aggregation framework.

However, we will use the $filter function when comes the time to project those fields that we need.

A quick reminder: this operator runs in conjunction with the $project stage of the aggregation framework. Thus, if we need to select documents according to the values of the fields in arrays and to project only some array fields, and all of this in the same query, we need to use the aggregation framework.

Our MongoDB dataset

In this post, the dataset we are going to work with consists only in a collection of items called ‘articles’ and two documents – enough for our didactic purposes. As is already habitual in my previous posts, the database we are going to use is called ‘3tdb’.

The info we have in each document is:

  • Code of the article – ‘article.code’
  • Article description – ‘description’
  • Array with information regarding the purchases made to our providers – ‘purchase’
  • Array with information regarding the stock per country and warehouse – ‘stock’

This is the query to execute in our ‘3tdb’ database in order to store these two documents in our ‘articles’ collection:

$ mongo
use 3tdb
db.articles.insert([
{
	"_id" : 1,
	"description" : "DESCRIPTION ARTICLE AB",
	"article_code" : "AB",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("80.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("85.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("79.910000")
		}
	],
	"stock" : [
	    {
	        "country" : "01",
	        "warehouse" : {
	            "code" : "02",
	            "units" : 10
	        }
	    },
	    {
	        "country" : "02",
	        "warehouse" : {
	            "code" : "02",
	            "units" : 8
	        }
	    }
	]
},
{
	"_id" : 2,
	"description" : "DESCRIPTION ARTICLE AC",
	"article_code" : "AC",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("90.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("95.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("89.910000")
		}
	],
	"stock" : [
	    {
	        "country" : "01",
	        "warehouse" : {
	            "code" : "01",
	            "units" : 20
	        }
	    },
	    {
	        "country" : "02",
	        "warehouse" : {
	            "code" : "02",
	            "units" : 28
	        }
	    }
	]
}
]);

Result:

BulkWriteResult({
	"writeErrors" : [ ],
	"writeConcernErrors" : [ ],
	"nInserted" : 2,
	"nUpserted" : 0,
	"nMatched" : 0,
	"nModified" : 0,
	"nRemoved" : 0,
	"upserted" : [ ]
})

Our first try

Suppose we want to get those items with stock in the warehouse ‘02’ of the country ‘01’. If we do not have the necessary experience, our first try would be something like this:

db.articles.find({
    "stock.country" : "01",
    "stock.warehouse.code" : "02"
}).pretty();

Result:

{
	"_id" : 1,
	"description" : "DESCRIPTION ARTICLE AB",
	"article_code" : "AB",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("80.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("85.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("79.910000")
		}
	],
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "02",
				"units" : 10
			}
		},
		{
			"country" : "02",
			"warehouse" : {
				"code" : "02",
				"units" : 8
			}
		}
	]
}
{
	"_id" : 2,
	"description" : "DESCRIPTION ARTICLE AC",
	"article_code" : "AC",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("90.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("95.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("89.910000")
		}
	],
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "01",
				"units" : 20
			}
		},
		{
			"country" : "02",
			"warehouse" : {
				"code" : "02",
				"units" : 28
			}
		}
	]
}

It is very easy to build this query using the Visual Query Builder of the MongoDB GUI, Studio 3T .

Related reading: Documentation on the Visual Query Builder, Studio 3T’s drag-and-drop query builder

Studio 3T's Visual Query Builder, a MongoDB drag-and-drop query builder

The query has returned both documents.

If we look carefully at the document where _id field equals 1, our requirements are fulfilled in one of the subdocuments of the array, while in the document which _id field equals 2, only one condition is matched in each of the subdocuments of the array.

Our purpose is to obtain only those documents in which all of our conditions are fulfilled, at the same time, in the subdocument, or subdocuments, of the array. So, our query must not return the document which _id field equals 2.

In order to build this query in a proper way, we must use the $elemMatch operator.

How to use $elemMatch

The official MongoDB documentation for version 4.0 states:

The $elemMatch operator matches documents that contain an array field with at least one element that matches all the specified query criteria.

Syntax

This is the syntax of the $elemMatch operator:

<fieldName> : { $elemMatch : { <condition1>, <condition2>, … } }

How to use $elemMatch with find()

We can use the $elemMatch in the query part of the find() function, and/or in the projection part of it.

Let’s see right now how to use it in the query part with the goal to resolve our problem.

db.articles.find(
  { stock : { $elemMatch : { country : "01", "warehouse.code" : "02" } } }
).pretty();

Result:

{
	"_id" : 1,
	"description" : "DESCRIPTION ARTICLE AB",
	"article_code" : "AB",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("80.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("85.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("79.910000")
		}
	],
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "02",
				"units" : 10
			}
		},
		{
			"country" : "02",
			"warehouse" : {
				"code" : "02",
				"units" : 8
			}
		}
	]
}

As we expected, the query has returned the same document.

To make writing the aggregation query easier, you can use Studio 3T’s Aggregation Editor:

Studio 3T's Aggregation Editor simplifies building aggregation queries

How to use $filter

Filtering elements from arrays

We should query only the documents we need.

However, on several occasions, we do not want to project all the fields in the document, but only those that match our requirements.

For this, we need to filter out of the array only those elements that meet our criteria. Obviously, we must do this in the $project stage.

To achieve our goal (see: not to project elements that do not meet our requirements), we must use the $filter operator (new in 3.2 version) alongside the aggregation framework.

Let’s continue with the same example. We maintain the $match stage to get only the document we need and then we add the $project stage.

Inside this stage, we project all the fields we need and, through the $filter operator, we tell the database which fields inside the array we want to project too.

The $filter operator has three variables:

  • input: the array we are going to work with
  • as: reference name
  • cond: our requirements

So, right now this is how our query looks like:

db.articles.aggregate([
{
   "$match" : {
       "stock" : {
          "$elemMatch" : {
             "$and" : [
                { "country" : "01" },
                { "warehouse.code" : "02" }
             ]
          }
       },
   }
},
{
   "$project" : {
       "article_code" : 1, "description" : 1,
       "stock" : {
          "$filter" : {
             "input" : "$stock",
             "as" : "stock",
             "cond" : {
                "$and" : [
                   { "$eq" : [ "$$stock.country", "01" ] },
                   { "$eq" : [ "$$stock.warehouse.code", "02" ] }
                ]
             }
          }
       }
   }
}
]).pretty();

Let’s check whether the output responds to what we are looking for:

{
	"_id" : 1,
	"description" : "DESCRIPTION ARTICLE AB",
	"article_code" : "AB",
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "02",
				"units" : 10
			}
		}
	]
}

Ok, we got it!

In the next picture, we can see how the query looks like in Studio 3T:

The same MongoDB aggregation query in Studio 3T

$elemMatch in the find projection part

Maybe you’re wondering: why don’t we use the $elemMatch operator in the projection part of the find(), instead of using the $filter operator in the aggregation framework?

Let’s refer to the official documentation:

The $elemMatch operator projects the first matching element from an array based on a condition.

OK. In order to better understand what this means, we are going to insert a new document in our ‘3tdb.articles’ collection.

db.articles.insert(
{
	"_id" : 3,
	"description" : "DESCRIPTION ARTICLE AD",
	"article_code" : "AD",
	"purchase" : [
		{
			"company" : 1,
			"cost" : NumberDecimal("70.010000")
		},
		{
			"company" : 2,
			"cost" : NumberDecimal("75.820000")
		},
		{
			"company" : 3,
			"cost" : NumberDecimal("69.910000")
		}
	],
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "01",
				"units" : 5
			}
		},
		{
			"country" : "01",
			"warehouse" : {
				"code" : "02",
				"units" : 6
			}
		}
	]
});

And now, we are going to run our query using the $elemMatch operator in the projection part of the find() function.

db.articles.find(
  { _id : 3 },
  { stock : { $elemMatch : { country : "01" } } }
).pretty();

This is the document the query returns:

{
	"_id" : 3,
	"stock" : [
		{
			"country" : "01",
			"warehouse" : {
				"code" : "01",
				"units" : 5
			}
		}
	]
}

Thus, this is not what we are looking for and this is the reason we do not use it.

Conclusion

We’ve learned how to resolve one of the most common problems that commonly plagues those new to MongoDB, which is how to query an array when more than one condition must be fulfilled in the same element. We now also know how to use the $elemMatch operator.

Once we achieved our first goal, the second was to extract from the array only the elements we needed. Now we know and are able to use the $filter operator alongside the aggregation framework.

Finally, we’ve learned about the use of the $match operator in the project part of the find function.


Want to learn more about MongoDB arrays and aggregation? Here are other tutorials you might find helpful:

Updated on May 24, 2019

Was this article helpful?

Related Articles

Leave a Comment