MongoDB 5: the new features

MongoDB is the most widely used NoSQL database in the world. Its steady growth is due in part to the continuous development of new features. Version 5, released in July 2021, introduced several notable additions. This article looks at the most relevant ones and at those most useful in day-to-day work.

Reading time: 4 minutes

MongoDB is the world's most widely used document-oriented NoSQL database and, in recent years, has become a viable alternative to relational databases. Suffice it to say that it ranks first among NoSQL databases and fifth among all databases.

What makes it attractive enough that many companies choose it over more traditional relational databases? The answer lies in its features. MongoDB is schemaless: it does not require a fixed data schema definition, which can shorten software development time. On top of that, the features added with each release extend its ability to handle increasingly complex data in different application contexts. If you are curious to learn how to use MongoDB efficiently through modeling patterns, we recommend the book Design with MongoDB: Best models for applications.

In this article we will analyze the most important new features of MongoDB 5, the version released on July 13, 2021.

Time Series Collections

MongoDB 5 introduces time series collections, which efficiently store sequences of measurements taken over a period of time. This feature makes MongoDB 5 a better fit for the Internet of Things (IoT) field. Compared with standard collections, this type of collection improves query efficiency and reduces disk usage for data and secondary indexes.

Time series collections behave like standard collections, so data is inserted and queried in the same way as for any other collection. Internally, MongoDB treats them as writable non-materialized views on internal collections that automatically organize time series data in an insertion-optimized storage format.

Queries on time series collections benefit from the optimized internal storage format, returning results faster.

Commands

Creating a Time Series Collection

A time series collection must be created explicitly with the db.createCollection() command. You cannot convert an existing collection into this type. Below is an example that creates a collection for time series.

db.createCollection(
    "weather24h",
    {
       timeseries: {
          timeField: "timestamp",
          metaField: "metadata",
          granularity: "hours"
       },
       expireAfterSeconds: 86400
    }
)

During creation you can specify the following parameters.

Parameter	Type	Description
timeseries.timeField	string	Required. The name of the field that contains the date in each document in the time series. Documents in a time series collection must have a valid BSON date as the value for the timeField.
timeseries.metaField	string	Optional. The name of the field that contains metadata in each time series document. The metadata in the specified field should be data used to label a unique set of documents and should rarely or never change. The name of the specified field cannot be _id or the same as timeseries.timeField. The field cannot be of type array.
timeseries.granularity	string	Optional. Possible values are "seconds", "minutes" and "hours". The default granularity is "seconds". To improve performance, choose the value that most closely matches the time interval between consecutive measurements you want to store. If you specify timeseries.metaField, consider the interval between consecutive measurements that share the same unique metaField value, i.e. those that come from the same source. Otherwise, consider the interval between all measurements that will be stored in the collection.
expireAfterSeconds	number	Optional. Enables automatic deletion of documents in the collection by specifying the number of seconds after which they expire. MongoDB deletes expired documents automatically in the background, much as it does for collections with a Time To Live (TTL) index.
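
As a rough illustration of the granularity guidance in the table, the sketch below maps the expected interval between two measurements from the same source to one of the three allowed values and builds the matching options object. The helper names pickGranularity and buildTimeseriesOptions, and the thresholds they use, are assumptions made for this example and not part of MongoDB; only the option names come from the table.

```javascript
// Hypothetical helper: choose a granularity from the typical number of
// seconds between two consecutive measurements of the same source.
// The thresholds are an assumption for illustration, not a MongoDB rule.
function pickGranularity(intervalSeconds) {
  if (intervalSeconds < 60) return "seconds";
  if (intervalSeconds < 3600) return "minutes";
  return "hours";
}

// Build the options object passed as second argument to db.createCollection().
function buildTimeseriesOptions(intervalSeconds, ttlSeconds) {
  const options = {
    timeseries: {
      timeField: "timestamp",
      metaField: "metadata",
      granularity: pickGranularity(intervalSeconds)
    }
  };
  // expireAfterSeconds is optional, so it is only set when requested.
  if (ttlSeconds !== undefined) options.expireAfterSeconds = ttlSeconds;
  return options;
}

// A sensor that reports every four hours, with data kept for one day.
const weatherOptions = buildTimeseriesOptions(4 * 3600, 86400);
// In the shell: db.createCollection("weather24h", weatherOptions)
```

Keeping the options in a plain object makes it easy to inspect them before the collection is created, which matters because the type of a collection cannot be changed afterwards.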

Insert measurements

Each inserted document must contain a single measurement. To insert one document, use the db.collection.insertOne() method. To insert several documents at once, use the insertMany() method as shown below.

db.weather.insertMany([{
   // metadata is an embedded document: the metaField cannot be an array
   "metadata": {"sensorId": 5578, "type": "temperature"},
   "timestamp": ISODate("2021-05-18T00:00:00.000Z"),
   "temp": 12
}, {
   "metadata": {"sensorId": 5578, "type": "temperature"},
   "timestamp": ISODate("2021-05-18T04:00:00.000Z"),
   "temp": 11
}])
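
For a single measurement, insertOne() can be wrapped in a small function. The following is a minimal sketch, assuming a shell-style db object with a weather collection; the function name recordTemperature is hypothetical. The metadata is written as an embedded document because, as noted in the parameter table, the metaField cannot be an array.

```javascript
// Hypothetical helper: store one temperature reading as one document.
// `db` is assumed to behave like the shell's database object.
function recordTemperature(db, sensorId, temp, when) {
  // The timeField must hold a valid date, so strings are rejected early.
  if (!(when instanceof Date) || Number.isNaN(when.getTime())) {
    throw new TypeError("when must be a valid Date");
  }
  return db.weather.insertOne({
    metadata: { sensorId: sensorId, type: "temperature" },
    timestamp: when,
    temp: temp
  });
}

// In the shell:
// recordTemperature(db, 5578, 12, new Date("2021-05-18T08:00:00.000Z"))
```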

Query and aggregation pipeline

To retrieve a document from a time series collection, you run queries with the standard syntax. For example, to retrieve the document with a certain timestamp, you perform the following query.

db.weather.findOne({
   "timestamp": ISODate("2021-05-18T04:00:00.000Z")
})

You can also use the aggregation pipeline to perform more complex queries. For example, if you want to calculate the average temperature measured during each day you would run the following pipeline.

db.weather.aggregate([
   {
      $project: {
         date: {
            $dateToParts: { date: "$timestamp" }
         },
         temp: 1
      }
   },
   {
      $group: {
         _id: {
            date: {
               year: "$date.year",
               month: "$date.month",
               day: "$date.day"
            }
         },
         avgTmp: { $avg: "$temp" }
      }
   }
])

The $dateToParts operator splits the timestamp into its components (year, month, day and so on) and stores them in the date field as an embedded document. This lets you group the measurements by year, month and day. To restrict the results to a specific day, add a $match stage to the pipeline.
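
The $match filter mentioned above could look like the following sketch, which builds a pipeline for the average temperature of one UTC day. The helper name averageForDay is hypothetical, and the field names timestamp and temp are those of the weather examples.

```javascript
// Hypothetical helper: pipeline for the average temperature of one UTC day.
// isoDay is a string such as "2021-05-18".
function averageForDay(isoDay) {
  const start = new Date(isoDay + "T00:00:00.000Z");
  const end = new Date(start.getTime() + 24 * 60 * 60 * 1000);
  return [
    // Keep only the measurements taken on the requested day.
    { $match: { timestamp: { $gte: start, $lt: end } } },
    // Average all remaining measurements into a single result document.
    { $group: { _id: isoDay, avgTmp: { $avg: "$temp" } } }
  ];
}

const pipeline = averageForDay("2021-05-18");
// In the shell: db.weather.aggregate(pipeline)
```

Placing $match first means the later stages only process the documents of the requested day.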

Aggregation pipeline

Several new features have been added to the aggregation pipeline. In addition to an improvement in some of the operators through the use of indexes, the following are the most significant changes from previous versions.

New operators

MongoDB 5 introduces the new aggregation pipeline operators listed below.

Operator	Description
$count	$count (aggregation accumulator) provides a count of all documents when used in the existing $group pipeline and in the new $setWindowFields stage of MongoDB 5.0.
$dateAdd	Increments a Date object by a specified number of time units.
$dateDiff	Returns the difference between two dates.
$dateSubtract	Decreases a Date object by a specified number of time units.
$dateTrunc	Truncates a date to a specified unit of time.
$getField	Returns the value of a specified field from a document. You can use $getField to retrieve the value of fields with names that contain dots (.) or begin with a dollar sign ($).
$sampleRate	Probabilistically selects documents from a pipeline at a given rate.
$setField	Adds, updates, or removes a specified field in a document. You can use $setField to add, update, or remove fields with names that contain dots (.) or begin with a dollar sign ($).
$rand	Generates a random float value between 0 and 1 each time it is executed. The new $sampleRate operator is based on $rand.
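
To give an idea of how some of these operators are written, here is a minimal sketch of two pipelines. It assumes documents with timestamp and temp fields, as in the weather examples; the variable names are hypothetical.

```javascript
// Sketch: per-day average temperature written with $dateTrunc.
const dailyAverage = [
  {
    $group: {
      // Truncate each timestamp to the start of its day.
      _id: { $dateTrunc: { date: "$timestamp", unit: "day" } },
      avgTmp: { $avg: "$temp" }
    }
  },
  {
    $project: {
      avgTmp: 1,
      // Start of the following day, computed with $dateAdd.
      nextDay: { $dateAdd: { startDate: "$_id", unit: "day", amount: 1 } }
    }
  }
];

// Sketch: keep roughly 10% of the documents with $sampleRate.
const tenPercentSample = [{ $match: { $sampleRate: 0.1 } }];

// In the shell:
// db.weather.aggregate(dailyAverage)
// db.weather.aggregate(tenPercentSample)
```

Compared with splitting the date into parts and regrouping them, $dateTrunc keeps the group key as a single date value.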

Window operator

MongoDB 5.0 introduces the $setWindowFields stage to perform operations on a specified range, called a window, of documents within a collection. The operation returns results based on the chosen window operator.

For example, you can use the $setWindowFields stage to produce the result of:

Difference in sales between two documents in a set.
Sales rankings.
Cumulative sales totals.
Analysis of complex time series information without exporting the data to an external database.

For example, you can calculate the cumulative quantity of cake sales for each state with the following command.

db.cakeSales.aggregate( [
   {
      $setWindowFields: {
         partitionBy: "$state",
         sortBy: { orderDate: 1 },
         output: {
            cumulativeQuantityForState: {
               $sum: "$quantity",
               window: {
                  documents: [ "unbounded", "current" ]
               }
            }
         }
      }
   }
] )

The partitionBy: "$state" parameter partitions the documents according to the value of the state field. Within each partition the documents are sorted by increasing values of the orderDate field (the oldest orderDate comes first). Finally, the stage adds a cumulativeQuantityForState field that holds the cumulative quantity for each state. The calculation uses the $sum operator over the document window defined by a lower bound (here unbounded, i.e. the start of the partition) and an upper bound (here current, i.e. the current document).
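
To make the window semantics concrete, the following plain JavaScript sketch emulates the same calculation in memory: partition by state, sort by orderDate, then keep a running sum. It illustrates what the stage computes, not how MongoDB implements it, and the sample documents are invented for this example.

```javascript
// Emulates partitionBy: "$state", sortBy: { orderDate: 1 } and a
// [ "unbounded", "current" ] document window with $sum on quantity.
function cumulativeQuantityByState(docs) {
  // Partition the documents by state.
  const partitions = new Map();
  for (const doc of docs) {
    if (!partitions.has(doc.state)) partitions.set(doc.state, []);
    partitions.get(doc.state).push(doc);
  }
  const result = [];
  for (const partition of partitions.values()) {
    // Oldest order first within each partition.
    partition.sort((a, b) => a.orderDate - b.orderDate);
    let runningTotal = 0;
    for (const doc of partition) {
      // The window spans from the partition start to the current document.
      runningTotal += doc.quantity;
      result.push({ ...doc, cumulativeQuantityForState: runningTotal });
    }
  }
  return result;
}

// Invented sample data.
const sales = [
  { state: "CA", orderDate: new Date("2021-01-11"), quantity: 41 },
  { state: "CA", orderDate: new Date("2019-05-18"), quantity: 162 },
  { state: "CA", orderDate: new Date("2020-05-18"), quantity: 120 },
  { state: "WA", orderDate: new Date("2021-01-08"), quantity: 145 },
  { state: "WA", orderDate: new Date("2019-05-18"), quantity: 134 }
];

const totals = cumulativeQuantityByState(sales).map(
  (d) => d.state + ":" + d.cumulativeQuantityForState
);
console.log(totals.join(" "));
// prints "CA:162 CA:282 CA:323 WA:134 WA:279"
```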

The description of the various parameters of this stage with examples can be found in the official documentation.

New Shell MongoDB: mongosh

As of MongoDB version 5, the mongo shell is deprecated and replaced by mongosh. The new shell offers several advantages over the previous version, including:

Improved syntax highlighting.
Improved command history.
Improved logging.

In its first release, mongosh supports only a subset of the mongo shell methods. The full list of currently supported methods is described in detail in the official documentation. To maintain backward compatibility, the methods that mongosh supports use the same syntax as the corresponding methods in the mongo shell.

Conclusions

MongoDB 5 is another step forward for this NoSQL database. With the introduction of time series collections, applying it to IoT becomes much easier. Beyond the features described above, there are other additions and changes that improve the performance and capabilities of the DBMS. You can find the full list of new features in the Release Notes.
