Visual Explain | MongoDB Explain, Visualized

When querying against large MongoDB databases, it is ideal to equip yourself with a GUI that breaks down how queries are executed, so that you can take measures to optimize MongoDB performance.

Studio 3T visualizes the MongoDB explain method through Visual Explain, a performance tuning feature which shows the steps MongoDB took to execute a query, with accompanying runtime statistics for each stage. Try it today, or skip to the performance tuning use cases.

Performance tuning use cases

Know when to add an index

To illustrate how useful Visual Explain can be, let’s take a look at the titles collection, which is part of the Pubs database.

Download the titles collection here.

It contains 10,000 documents, each containing 12 fields: _id, title_id, title, type, TypeList, pub_id, price, advance, royalty, ytd_sales, notes, and pubdate.

To test how expensive a query can be, let’s run a simple query in Visual Query Builder.

Let’s sort the collection by price in descending order:

Sort the titles collection by price in descending order

db.getCollection("titles").find({}).sort(
    { 
        "price" : -1.0
    }
);

Now, let’s click on the Explain tab.

By default, Studio 3T will show brief mode. It shows the diagram of stages without runtime statistics:

Visual Explain shows the diagram in query planner mode by default

To show runtime statistics, click on the Run full explain button to trigger full mode:

What information can we gather from this?

It took MongoDB 20 milliseconds to run the collection scan

Visual Explain shows that the collection scan took 20ms

A collection scan means that the query looked at each document to get the results.

2. 10,000 documents were passed at each stage

Hover over the number to see exactly how many documents were passed from one stage to another.

3. It took MongoDB 21 milliseconds to run the Sort key generator stage and another 30 milliseconds to run the Sort stage.

Shows milliseconds added by the sort stages

4. The whole query took 71 milliseconds to run.

Shows the total time it took to run the query

Now, how can we optimize this query?

If you often find yourself often sorting price in descending order, then it would be best to create an index on the field, price.

Running the same query all the time? Optimize it by adding an index

Learn how to create an index in Studio 3T.

Now that the index is set up, let’s run the same query again (and this time, we expect it to be faster).

Indeed, MongoDB runs the same query much faster, at 30ms (versus 71ms).

Instead of using a collection scan which took 20ms, MongoDB chose an index scan which only took 10ms. The database detected that an appropriate index exists for the query, so it scanned a limited number of documents, instead of needlessly scanning each one.

By choosing the index scan, the database was also able to skip the sort stages, and instead went straight to fetching the documents, resulting in 41ms saved.

Spot uneven shard loads

MongoDB uses sharding to horizontally scale data, which it tries to evenly distribute across shards – but this is not always the case.

Visual Explain can come in handy in spotting imbalanced loads when you’re querying a sharded database.

Take for example the snapshot below of crimes, a collection of about 2,486,000 documents, hosted on a database with two shards.

To test, we run an empty query and discover that:

Shard 1 handled 1.6 million documents
Shard 2 handled 893K documents

A look at an uneven distribution of shard load

Now we know that the database isn’t distributing the workload evenly as expected. Shard 1 is handling almost twice the load.

We can then move on to the next steps of investigating the sharded cluster balancer or chunk size to fix a bigger issue, which Visual Explain helped us uncover.

Optimize queries by applying relevant filters first

When it comes to querying, the best practice is to apply operators that do the most filtering first.

Visual Explain is an ideal way to check if this is being done.

Using the same crimes collection from above, we’d like to know how many cases of theft occurred within Chicago’s public transport system, Chicago Transit Authority (CTA).

We’d also like to sort these cases in descending order by date, and we’d only like to see the four relevant fields _id, location.description, primary_type (of crime), and date.

Here’s the full query ran in Visual Query Builder:

db.getCollection("crimes").find(
    { 
        "location.description" : /.*cta.*/i, 
        "primary_type" : "Theft"
    }, 
    {
        "_id" : 1.0,  
        "location.description" : 1.0, 
        "primary_type" : 1.0, 
        "date" : 1.0
    }
).sort(
    { 
        "date" : -1.0
    }
);

Now let’s click on the Explain tab.

What we see is exactly what we want to happen: the database is applying the filter operator first.

From the original 2.5M documents, only 13K meet the filter criteria and are then passed on to the later stages.

The filter stage should be applied in the early stages of a query

If Visual Explain shows you that this isn’t the case, then the next step would be to debug your query.

A way to debug

Say, for example, that you wanted to know how many customers have priority support and live in Kansas, from your pool of 1,000 customers.

Download the Customers collection to follow along.

You already have a hunch that this can’t be a big number, but Visual Explain shows that 485 documents – 485 customers – meet the criteria:

Visual Explain can be useful in determining where queries could have gone wrong

This is clearly way too a large a number that could only point to an error in your query logic. (Do you spot the $ne operator? 🙂 )

While a simple example, it still illustrates the usefulness of having a quick and visual way of double-checking the soundness of your query.

MongoDB explain(), visualized

In previous versions of Studio 3T, it was already possible to see query plans and execution statistics through the Explain Tab, which displayed this information in tree view, like this:

The Explain tab gives an in-depth view into query performance.

Visual Explain is the evolution of the Explain Tab.

Instead of clicking through such a tree, the Visual Explain visualizes the same strategy undertaken by MongoDB in a simple-to-follow diagram.

Brief vs. Full Mode

By default, Visual Explain will show the explain diagram in brief mode without runtime statistics, like this:

Brief Mode doesn't display runtime statistics

It’s fast and always available for all queries, without delay.

Full mode, on the other hand, shows both the stages and runtime statistics:

Full mode displays stages and runtime statistics, but might cause a delay

It requires running the query, which might cause a delay depending on its size.

To trigger full mode, click the Run full explain button in the toolbar:

Click on the Run full explain button to show full mode

Because of limitations in MongoDB, full mode is not available for aggregation queries built through Aggregation Editor and SQL Query.

Hover over elements

You can hover over all elements in full mode to trigger additional help texts, for example:

Arrows

Hover over arrow elements to reveal more information

Difference between L-shaped, zigzag arrows, and straight arrows

L-shaped arrows indicate that the target collection or index is accessed sequentially.

Zigzag arrows indicate that the target collection or index is accessed in random order.

Straight arrows could indicate a number of events, such as processing of documents within the same stage, transfer of documents to the next stage, etc.

Runtime statistics

Hovering over runtime statistics displays information like the estimated number of bytes processed by the stage:

Hover over the statistic to see amount of data processed

Or the full document count (e.g. 1007 documents instead of 1.0K):

Reveal the full document count by hovering over the statistic

Or the time it took to execute an individual stage, or the whole query if you hover over the Result block:

Hover over the statistic to show how long it took to execute a stage, or the whole query

Query stages

Behind every MongoDB query is a query plan, and the Visual Explain displays how this plan was executed stage by stage.

View full description

To view a stage’s full description, double-click it or right-click and choose What’s This?.

Trigger the stage description by double-clicking on any stage

Runtime statistics

Visual Explain’s full mode displays actual runtime statistics, such as:

Stage output (number of documents produced)

Shows the number of documents produced by a stage.

Execution time

Shows the milliseconds it took to execute a stage

Tracks the number of milliseconds it took to execute a stage.

Amount of data processed

Displays the estimated amount of data processed to carry out the operation.

Learn how to use the information from Visual Explain to optimize MongoDB query performance.

View the original query plan in JSON

Missing the old Explain tab? You still have the option to view the query plan as a JSON document. Click on the View original JSON document button:

Click on the icon to view the query plan as a JSON document

View a specific stage’s query plan in JSON

The original JSON document still contains a lot of information not visible in the diagram, but they tend to be very nested and difficult to work with.

To spare the back-and-forth scrolling for those who are interested in this extra information, the Visual Explain makes it much easier to view a specific stage’s JSON fragment.

Right-click on a stage block.

Click on view original JSON fragment.

This will display only that specific stage’s JSON fragment, instead of the full query plan.

Displays the stage's specific JSON fragment

List of supported query stages

Aggregation operator

Represents a stage in an aggregation pipeline. All such stages read their input documents, apply operator-specific processing, and pass the result to the next stage.

Collection scan

Represents a collection scan query stage. Collection scan reads the documents from the collection one by one and passes them to the next stage. The documents are produced in no particular order. Collection scan is performed whenever the query engine decides that the easiest (or the only possible) way to get all the required documents is to read every document in the entire collection. Collection scan is a very I/O expensive operation, especially if the collection is large.

Fetch document

Represents a fetch document query stage. This stage reads a stream of ids produced by the previous stage. Each id then fetches a document with that id from the collection, then passes the document to the next stage. Although each individual document lookup is fast, this stage generates a lot of random I/O operations, which can become very I/O expensive, especially when multiplied by the number of documents.

Filter

Represents a document filter. Not a stage by itself, filtering is performed as a part of various stages. All the input documents are checked against some condition, and only those that match are passed to the next stage.

Index scan

Represents an index scan query stage. Index scan quickly locates and reads index records that match the query filter. Each matching record is passed to the next stage. The records are typically produced in order of the index (e.g. alphabetical). Since indexes contain only a handful of fields, this stage is typically followed by a fetch, which reads an entire document from the collection, using ids extracted from the index record. Index scan is only used when the query engine decides that the current query will benefit from going through the index, rather than running a full collection scan.

Union + deduplicate

Represents a union + deduplicate query stage. The purpose of this stage is to combine the results of its input stages, discarding identical documents that may be encountered more that once.

Limit

Represents a limit query stage. This stage reads the first N documents, passes them to the next stage, then terminates the process without examining the rest.

Projection

Represents a projection query stage. Projection transforms the input documents by removing some fields and possibly adding others. The transformed documents – which typically have just a small number of fields left – are passed to the next stage.

Shard merge sort

Represents a shard merge sort query stage. This stage combines the results obtained from several shards into one. All the shards’ results are assumed to be sorted, and the produced documents will also be sorted in the same order.

Sharding filter

Represents a sharding filter query stage. This stage reads the input documents and passes through only those which satisfy the sharding condition for this shard. This way, a shard never produces a document that shouldn’t belong to it.

Shard merge

Represents a shard merge query stage. This stage combines the results obtained from several shards into one.

Single shard

Represents a single shard query stage. This stage appears when all the documents to satisfy the query fall under the same shard, so it’s effectively a “merge” of one.

Skip

Represents a skip query stage. This stage reads the first N documents and discards them without producing anything, then passes the rest of the documents to the next stage.

Sort buffer

Represents a sort buffer. Not a stage by itself, sort buffer limits the number of documents to be produced by the sort stage. This allows to reduce the amount of memory required for sorting.

Sort key generator

Represents a sort key generator query stage. This is a preliminary stage to the sort stage that follows. From the input documents it extracts the fields required for sorting and passes the result to the sort stage.

Sort

Represents a sort query stage. Sorting rearranges the input documents and produces them in order. This stage can’t output anything before all the input documents have been processed, which potentially requires a lot of memory.

Subplan

Represents a subplan stage. This stage plans each clause of the $or query individually, and then
creates an overall query plan based on the winning plan from each clause.

Text match

Represents a text match query stage. This stage returns only the documents that match the full-text search
expression.

Text

Represents a text query stage. The purpose of this stage is to collect statistics on the full-text
search process itself.

Unknown

This block indicates that a query stage of this type is not known
to the application, and is displayed as “just some stage”.

Not supported

This block indicates that the format of the query plan is not supported and it is not parsed at all.

Interested in learning more about MongoDB performance tuning? Check out these related tutorials:

Visual Explain | MongoDB Explain, Visualized

Performance tuning use cases

Know when to add an index

Spot uneven shard loads

Optimize queries by applying relevant filters first

A way to debug

MongoDB explain(), visualized

Brief vs. Full Mode

Hover over elements

Arrows

Runtime statistics

Query stages

View full description

Runtime statistics

Stage output (number of documents produced)

Execution time

Amount of data processed

View the original query plan in JSON

View a specific stage’s query plan in JSON

List of supported query stages

About The Author

Kathryn Vargas

We value your privacy

Privacy Preference Center

Manage consent preferences